cloverleaf-larry/lib/cygwin-safe.sh
Bryan Johnson 9dd5821436 v0.7.5: OAuth CR-taint fix + mouse opt-in + CR-safety sweep
- Fix bash arithmetic crash on MobaXterm/Cygwin: $(date +%s) was
  returning CR-tainted values landing in $(( )) operands
- Mouse mode off by default; opt in via LARRY_MOUSE=1 or /mouse on
- Comprehensive CR-safety sweep across lib/*.sh and larry.sh — every
  command-substitution result, file read, and user input that feeds
  an arithmetic context, case dispatcher, or path/header is now
  CR-stripped at the source

New shared helper lib/cygwin-safe.sh defines three primitives:
  coerce_int VAL [DEFAULT]   — for arithmetic / integer-test operands
  strip_cr VAL               — for case patterns, regex tests, paths, headers
  read_clean VAR [PROMPT]    — read -r wrapper that strips CR pre-assign

Hardened call sites (14 files, 60+ patch points):
  - larry.sh:  status-line date/tput, 3 y/N approvals, auth menu, API key
  - lib/oauth.sh:  cmd_login + cmd_refresh date+%s captures
  - lib/nc-engine.sh:  5 y/N action prompts + find|wc arithmetic
  - lib/nc-msgs.sh:  parse_time_ms (4 date sites) + meta-TSV time + MSG_COUNT
  - lib/nc-regression.sh:  tr|wc count + hl7-diff ?-fallback arithmetic
  - lib/nc-smat-diff.sh:  A_COUNT/B_COUNT/DIFFS_TOTAL
  - lib/nc-insert-protocol.sh:  every awk-emitted line number → head/tail math
  - lib/journal.sh:  _next_seq wc -l arithmetic
  - lib/lessons.sh:  _next_id/_count + 2 y/N prompts
  - lib/hl7-sanitize.sh:  cmd_count + clear-table y/N
  - lib/ssh-helper.sh:  4 local+remote wc -c integer compares
  - lib/nc-find.sh, lib/nc-table.sh, lib/nc-document.sh, larry-rollback.sh

Reproduces the exact error Bryan hit:
  bash: ...: arithmetic syntax error: invalid arithmetic operator (error token is "")

lib/cygwin-safe.sh added to MANIFEST so it auto-syncs on next launch.

Co-Authored-By: Clover (Claude Opus 4.7) <noreply@anthropic.com>
2026-05-27 19:17:48 -07:00

110 lines
4.4 KiB
Bash
Executable File
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

#!/usr/bin/env bash
# cygwin-safe.sh — three primitives that defend Larry-Anywhere against the
# Cygwin/MobaXterm CR-taint pattern that crashed OAuth in v0.7.3.
#
# Pattern (full diagnosis in
# Deliverables/2026-05-27-cloverleaf-larry-oauth-arithmetic-fix.md and the
# v0.7.5 CR-safety sweep deliverable):
#
# On MobaXterm / Cygwin / Git-Bash-for-Windows, any of the following can
# return a string ending in a literal carriage return (\r):
# - `$(date +%s)`, `$(date ...)`, `$(cmd)` against a Cygwin-built binary
# - `read` of user input (depending on tty mode)
# - `cat`/`head`/`tail` against a CRLF-line-ended file
# - `$(<file)` when the file is CRLF
# - `wc -l < file`, `wc -c < file` (the count is fine, but `wc.exe` may
# still emit `\r\n` on the captured stdout)
# - `jq -r '.field' file.json` when file was created with CRLF
# - Heredoc lines that came through a Windows clipboard
#
# The CR is invisible in normal output but lethal when the string lands in:
# - bash arithmetic ($(( )), (( )), let, [ -gt N ]) → "invalid arithmetic
# operator (error token is "")"
# - case dispatchers → pattern matches LITERAL `/cmd\r`, not `/cmd`
# - regex tests → `[[ $x =~ ^[Yy]$ ]]` silently fails on `Y\r`
# - path construction → mkdir/stat fail with ENOENT on `dir\r/file`
# - HTTP headers → server rejects the malformed Authorization line
# - file compares → `[[ $a == $b ]]` silent false-negative
#
# This file is SOURCEABLE — every caller does:
# . "$LARRY_LIB_DIR/cygwin-safe.sh"
#
# Idempotent: re-sourcing is harmless (functions just get redefined identically).
# It defines functions only, runs no code on source, sets no `set -u/-e/-o pipefail`
# globally (those are the caller's responsibility — we must not change them).
# coerce_int VAL [DEFAULT] — return a clean decimal integer that is SAFE to
# drop into any bash arithmetic / integer-test context.
#
# Algorithm: strip every byte that isn't 0-9, then fall back to DEFAULT (or 0)
# if the result is empty. No printf %d (whose behaviour on CR taint varies by
# libc), no shell expansion in arithmetic context — nothing that can crash
# the caller.
#
# Use whenever the value will appear in:
# $((expr)) (( expr )) [ X -gt Y ] [[ X -lt Y ]] let X=...
coerce_int() {
local raw="${1:-}" default="${2:-0}"
local cleaned; cleaned=$(printf '%s' "$raw" | tr -cd '0-9')
printf '%s' "${cleaned:-$default}"
}
# strip_cr VAL — return VAL with every embedded carriage return removed.
#
# Use when the value will appear in:
# case "$X" in ...) ...; esac # pattern dispatchers
# [[ "$X" =~ ^[Yy]$ ]] # regex tests
# [[ "$X" == "literal" ]] # string compares
# "$prefix/$X" # path construction
# "-H Authorization: Bearer $X" # HTTP headers
#
# Cheaper than coerce_int — no subshell, pure bash parameter expansion.
strip_cr() {
local v="${1:-}"
# Strip ALL \r occurrences, not just trailing — embedded CRs (from CRLF
# multi-line input) are just as toxic for the consumers above.
printf '%s' "${v//$'\r'/}"
}
# read_clean VAR [PROMPT] — like `read -r VAR`, but every captured byte that
# is \r gets stripped before the assignment.
#
# Why a wrapper instead of post-processing the var: bash's `read` already
# strips a trailing newline, but on Cygwin/MobaXterm with a CRLF tty the
# \r BEFORE the \n stays in the variable. Doing `read X; X="${X//$'\r'/}"`
# at every call site is 2× the diff and easy to forget; this folds it.
#
# Reads from /dev/tty by default (same as the prevailing `read -r ans </dev/tty
# || ans=""` idiom across the codebase) so it works when stdin is piped.
# If /dev/tty is unavailable, falls back to plain stdin.
#
# Usage:
# read_clean answer "Proceed? [y/N]: "
# if [[ "$answer" =~ ^[Yy]$ ]]; then ... fi
#
# Returns the same exit code as the underlying `read` (1 on EOF).
read_clean() {
local _var="$1"; shift
local _prompt="${1:-}"
local _raw=""
if [ -r /dev/tty ]; then
if [ -n "$_prompt" ]; then
IFS= read -r -p "$_prompt" _raw </dev/tty
else
IFS= read -r _raw </dev/tty
fi
else
if [ -n "$_prompt" ]; then
IFS= read -r -p "$_prompt" _raw
else
IFS= read -r _raw
fi
fi
local _rc=$?
# Strip ALL CRs (paste of multi-line CRLF can introduce embedded ones).
_raw="${_raw//$'\r'/}"
# Assign through eval — printf-quote the value so it survives metacharacters.
printf -v "$_var" '%s' "$_raw"
return $_rc
}