cloverleaf-larry/CHANGELOG.md
Bryan Johnson 4a992d9668 v0.8.7: status line renders on MobaXterm — gate on turn count not data presence
Root cause: render_status_line suppressed the OAuth line whenever ctx_used,
5h_util, and 7d_util were ALL empty. On a rate-limited session ctx is never
recorded (the error path returns before _record_ctx_used) and pre-v0.8.5 the
unified-* headers weren't captured on errors — so all three stayed empty turn
after turn and the line never appeared on Bryan's work-box. NOT a positioning
bug: the line is a plain printf'd dim line (no scroll-region/cursor escapes)
and is not coupled to streaming or mouse mode.

Fix: suppress only before the first turn (_LARRY_TURNS==0); thereafter always
render — empty fields show "—" placeholders, reset date fills in once headers
populate. /status now renders on demand even pre-first-turn. CR-taint sweep:
coerce_int the reset-epoch arithmetic comparisons + strip_cr the oauth-status
color case (MobaXterm CRLF would otherwise crash/blank the line).

Verify: bash -n clean; 7/7 unit tests (turn-0 suppressed, turn>=1 placeholders,
reset date when populated, renders with LARRY_NO_STREAM=1 + mouse off, survives
CR-tainted epoch, LARRY_NO_STATUS=1 still disables).

Co-Authored-By: Clover (Claude Opus 4.7) <noreply@anthropic.com>
2026-05-27 21:41:03 -07:00

487 lines
28 KiB
Markdown

# Changelog
All notable changes to `cloverleaf-larry` / `larry-anywhere` are recorded here.
Versioning is loose-semver; bumps trigger the in-process self-update on every
running client via `LARRY_BASE_URL` + `MANIFEST`.
## v0.8.7 — 2026-05-27
Status-line render fix for MobaXterm/Cygwin (Clover). Symptom: the dim
between-turn status line (session context + rate-limit reset date) never
appeared on Bryan's MobaXterm work-box — the v0.7.1 status-line feature that
MEMORY.md flagged as a never-verified passive item.
**Root cause = suppress-when-empty gate, NOT terminal positioning.**
`render_status_line` (larry.sh) gated the OAuth arm on `ctx_used_tokens`,
`oauth_5h_utilization`, AND `oauth_7d_utilization` ALL being empty — returning
silently (rendering nothing) when so. Two facts made all three stay empty turn
after turn on Bryan's box: (1) `STATUS_ctx_used_tokens` is populated by
`_record_ctx_used`, which runs only AFTER a successful `agent_turn`; on a
`rate_limit_error` the turn returns early (larry.sh ~L3929), so ctx was never
recorded; (2) pre-v0.8.5 the `anthropic-ratelimit-unified-*` utilization
headers weren't captured on error responses. With every turn erroring, all
three gate fields were empty every turn, so the line was suppressed for the
whole session and never rendered. This was NOT a positioning bug: the status
line is a single plain `printf`'d dim line printed between turns — there is no
DECSTBM scroll-region reservation, no cursor save/restore, no absolute-row
positioning anywhere in the codebase, so MobaXterm's terminal emulation had
nothing to mis-honor. It was also NOT coupled to streaming or mouse mode.
- **Gate on turn count, not data presence.** `render_status_line` now suppresses
ONLY before the first turn has run (`_LARRY_TURNS == 0`, coerce_int-guarded);
thereafter it ALWAYS renders. Both auth-mode helpers already self-render a
`—` placeholder for any unpopulated field, so the line always shows session
context (model context window, turns, session cost) and the rate-limit reset
time fills in once a successful call — or, since v0.8.5, a captured error
response — populates the headers.
- **`/status` always renders on demand**, even before the first turn — an
explicit request bypasses the turn-0 gate. Lets Bryan verify the line renders
on MobaXterm without first completing a (possibly rate-limited) turn.
- **CR-taint hardening (same-pattern sweep).** The OAuth segment's reset-epoch
comparisons (`[ <epoch> -le <now> ]`) read `STATUS_oauth_{5h,7d}_reset_epoch`,
which come from `_header_value` (strips only the TRAILING CR). A CRLF response
on MobaXterm or a non-numeric token would have crashed the arithmetic test and
aborted the entire line. Both comparisons now `coerce_int` the epoch first;
the `STATUS_oauth_status` color-override `case` is now `strip_cr`-guarded so a
`rate_limited\r` value still matches its literal-glob arm.
- **Same-pattern sweep results:** audited every escape sequence in larry.sh +
lib/ — only color SGR, clear-screen (`\033[2J\033[H`), erase-line
(`\r\033[K`), and the opt-in mouse/bracketed-paste modes (off by default since
v0.7.5) are used; ZERO scroll-region / cursor-save-restore / absolute-cursor
sequences, so no other UI element is at risk of MobaXterm mis-rendering.
Confirmed no user-visible element is gated behind streaming (`used_stream`
only guards re-printing already-streamed text) or mouse mode.
- **Verification:** `bash -n` clean; 7/7 unit tests pass against the shipped
render functions — turn-0 suppressed; turn≥1 with empty data renders with
placeholders; reset date shown when populated; renders with `LARRY_NO_STREAM=1`
+ mouse off (Bryan's exact config); survives a CR-tainted reset epoch without
crashing; `LARRY_NO_STATUS=1` still fully disables.
## v0.8.6 — 2026-05-27
Work-box → Mac `headers.log` sync (tsk-2026-05-27-023, Clover headers-sync).
Closes the last gap in the rate-limit-diagnosis pipeline: the
`anthropic-ratelimit-*` headers captured on Bryan's MobaXterm work-box (where
the testing happens) never reached the Mac's memory daemon, so they could not
be analyzed. v0.8.6 pushes the work-box `headers.log` to a daemon-watched path
on the Mac automatically; the Mac daemon ingests it to memory Tier 4
(Hindsight) + Tier 7 (mem0).
- **New `lib/headers-sync.sh`** — incremental, offset-tracked, idempotent push
of `$LARRY_HOME/log/headers.log` to a per-host file on the Mac
(`~/.cloverleaf/headers-<workbox-hostname>.jsonl`, a daemon-watched dir).
Transport rides the EXISTING authenticated SSH ControlMaster
(`/ssh-setup <alias>`) — no new key, no second auth, the password is never in
argv/env. Only the new bytes since the last sync are sent (`dd skip=offset`
→ remote `cat >>`); a no-op when nothing is new; a re-seed (truncate + resend)
when the local file rotates/shrinks. Fully graceful: missing target, closed
master, or transport failure logs a warn to `$LARRY_HOME/log/sync.log` and
returns non-fatally — it can NEVER crash or wedge the larry session.
- **`/headers-sync on|off|status|target <alias>|now`** slash command. `target`
binds the Mac SSH alias; `on`/`off` toggle auto-sync (persisted to
`$LARRY_HOME/.env` as `LARRY_HEADERS_SYNC` / `LARRY_HEADERS_SYNC_TARGET`);
`status` shows enabled?, target, dest, last-sync time, bytes pushed, and
master state; `now` runs one incremental sync on demand. Registered in the
TAB-completion arrays and `/help`.
- **Auto-sync cadence: on larry exit.** The REPL EXIT/INT/TERM handler flushes
headers.log if auto-sync is enabled (cheap + incremental). On-demand
`/headers-sync now` is always available. (After-EVERY-turn cadence was
intentionally deferred to keep this change out of the turn/streaming loop
that v0.8.5 just reworked.)
- **Mac-daemon receive side** (`scripts/headers_log_ingest.py`, not part of the
larry bundle): now resolves `headers-*.jsonl` glob sources under the watched
dirs IN ADDITION to the fixed canonical `headers.log`, and processes ALL
sources with PER-SOURCE offsets — so the Mac's own stream and one or more
work-box streams are surfaced independently. Each fact carries a `source=`
label (the work-box hostname) so the memory layer can tell them apart.
- **Security (Vera PHI audit V7):** headers.log holds only `anthropic-*`
response headers (rate-limit metadata + org id) and HTTP status lines — NO
message body, NO PHI — so syncing is safe. The existing key/password-auth
ControlMaster transport is reused unchanged (not weakened).
## v0.8.5 — 2026-05-27
Diagnose-don't-assume rate-limit cluster fix (Clover #8). Symptom: a `hello`
turn threw `rate_limit_error` on a work-box with 90% of the Claude Max 5h quota
free — so NOT 5h-window exhaustion. Root cause = a short-window BURST rail
tripped by a stream→non-stream **double-send** per turn, with no backoff.
- **Rate-limit backoff + actionable message (ROOT).** A 429 no longer fails the
turn or fires an immediate re-send. `agent_turn` now retries with backoff that
HONORS the `retry-after` header (else exponential 2/4/8s capped at 30s;
`LARRY_RL_MAX_RETRIES`/`LARRY_RL_BACKOFF_MAX` tunable). The error message is now
ACTIONABLE: `_parse_response_headers` captures `retry-after` + which rail
tripped (`anthropic-ratelimit-{requests,input-tokens,output-tokens}-remaining:0`
or `unified-{5h,7d}` for OAuth) and `_humanize_rate_limit` renders e.g.
`rate limit: requests-per-minute exhausted (short-window burst, NOT your 5h
quota) — resets in 38s; retrying with backoff`. `headers.log` now captures the
full header block on ANY 429 (was: OAuth-mode + unified-* header only), tagged
`*** 429 retry-after=Ns rail=… ***`, so the next rate-limit is always
diagnosable.
- **Streaming parse failure no longer double-sends (burst trigger).** A
streaming 429/overload returns a plain JSON error body (not SSE);
`parse_stream_to_response` previously dropped those non-`data:` lines, produced
zero blocks, returned 1, and `agent_turn` blindly re-SENT the whole prompt
non-streaming — a SECOND full API call within the same second (the per-minute
burst). The parser now buffers the non-SSE body and, if it parses as a JSON
error, returns a distinct code so the caller surfaces it WITH backoff instead
of re-sending (single-send invariant: one logical attempt per turn). Also
auto-defaults `LARRY_NO_STREAM=1` on MobaXterm/Cygwin/MSYS (`_is_cygwin_like`)
where SSE parsing is fragile; an explicit `LARRY_NO_STREAM=0` still forces it on.
- **`ErrorPI` mangled error string fixed (CR-taint).** `— ErrorPI error:
rate_limit_error` was a carriage-return overprint: on MobaXterm the response
field `jq -r '.error.type'` carried a trailing `\r`, which (a) broke the
`case "$err_type" in rate_limit_error)` match → fell to the `%s — %s` default
(the stray ` — `), and (b) CR-returned the cursor so the terminal overprinted
"API error" → "ErrorPI". Fix: `strip_cr` on `err_type`/`err_msg` in
`_humanize_api_error`, and `err()`/`warn()`/`log()` now strip embedded CRs
defensively. (The v0.7.5 CR sweep missed the error-DISPLAY construction path.)
- **phi tier-5 notice fires once per session (was per-turn nag).** The
`tier-5 (presidio NER) disabled — sidecar not running` notice printed every
turn because `auto_detect_phi` runs inside `$(...)` command substitution and
the old `export _LARRY_PHI_TIER5_WARNED=1` flag died in the subshell. Now keyed
to a `$LARRY_HOME/.phi-notice-shown` file holding `SESSION_ID` — fires once per
session, survives the subshell, resets for a genuinely new session. Same-pattern
sweep caught the identical subshell-flag bug in `_auto_phi_b64_roundtrip`'s
python3-missing notice (`_LARRY_B64_PY3_WARNED`) — fixed the same way.
## v0.8.4 — 2026-05-27
- **Installer/updater now detects HTML-sign-in-page responses and fails loud
instead of silently corrupting.** Root cause (Clover #5's diagnosis,
`Deliverables/2026-05-27-cloverleaf-larry-stuck-update-and-tab-bug.md`): a
private/sign-in-gated Gitea answers an unauthenticated raw-file read with the
**HTML Sign-In page at HTTP 200** (303 → `/user/login`, followed by `curl -L`
to a 200 HTML page). `curl -fsSL` treats that as success, so the old
installer/auto-updater parsed the HTML as VERSION/MANIFEST/`larry.sh` content
— silently aborting, or overwriting real on-disk files with HTML soup. This
is exactly what stranded a work-box at v0.7.3 until the Gitea
`REQUIRE_SIGNIN_VIEW=false` flip.
- **New `lib/fetch-safe.sh`** — a content-validating fetch wrapper
(`fetch_validate URL DEST KIND [MAX_TIME]`). After every `curl`, BEFORE
trusting the bytes, it (a) detects the HTML-login trap (`<!DOCTYPE html` /
`<html` / `Sign In - Gitea` / `<title>Sign In` markers, or a `text/html`
`Content-Type` when a raw file was expected) and (b) validates the content
shape per file type: VERSION must match `^[0-9]+\.[0-9]+\.[0-9]+`, MANIFEST
must be a path-list with no HTML, `larry.sh` must start with
`#!/usr/bin/env bash`, other `.sh` must be non-HTML. On any failure it prints
an actionable error and returns non-zero **without overwriting the target**.
The bootstrap `install-larry.sh` (curl|bash, runs before any lib exists) and
`larry.sh`'s `self_update()` (runs before lib is sourced) each carry a
byte-identical inline copy; the canonical file is in MANIFEST and auto-syncs.
- **Every remote-content fetch hardened.** `install-larry.sh` `fetch()`;
`larry.sh` agent fetch, `sync_from_manifest` MANIFEST + per-file fetches, and
`_fetch_with_fallback` (Phase-B VERSION + larry.sh) all route through the
validator. No trusted-content fetch still uses raw `curl -fsSL`.
- **Optional `LARRY_GITEA_TOKEN` (alias `GITEA_TOKEN`) for authenticated
fetch.** When set, fetches add `Authorization: token <PAT>` so the
installer/updater works against a PRIVATE repo without the public-flip. The
token is never hardcoded and never logged. Documented in `--help` + MANUAL.md.
## v0.8.3 — 2026-05-27
- **Tab-completion trailing space no longer breaks command dispatch.** The
slash-command completer intentionally appends a trailing space after a
unique match (so arg-taking commands feel snappy), but the main_loop
dispatcher matched exact `case` globs, so `/quit ` (completed) missed the
`/quit)` arm and fell through to "unknown command". Latent since v0.6.6
when tab completion shipped. Fixed by rtrimming the dispatch key once at
the `case "$input"` boundary (`larry.sh`), which tolerates the completer's
space, a user-typed trailing space, and any CR remnant while preserving
interior `/load FILE` argument spacing. Added a shared `rtrim()` helper to
`lib/cygwin-safe.sh` (and the inline fallback) next to `strip_cr`.
## v0.8.2 — 2026-05-27
Microsoft Presidio sidecar for free-text NER. Closes V1 from Vera's audit —
the dominant real-world failure mode (patient names, addresses, un-keyworded
dates in prose chat). Opt-in install; larry runs in v0.8.1 mode on hosts
where Presidio isn't installed (MobaXterm/Cygwin per Bryan's accepted
tradeoff).
- **`lib/phi-presidio-sidecar.py`** — FastAPI service on
`127.0.0.1:$LARRY_PHI_PORT` (default `41189`). Wraps Presidio's
`AnalyzerEngine` + `AnonymizerEngine` over spaCy `en_core_web_sm`
(12MB model, ~9-second cold start). Two endpoints: `POST /redact`
takes `{"text": "..."}` and returns `{"redacted": "...", "entities":
[...], "latency_ms": N}`; `GET /health` for the launcher's readiness
probe. Three HL7-specific custom recognizers added (`HL7_MRN` for
6-12 digit numerics with patient/MRN/account context; `HL7_CARET_NAME`
for `SMITH^JOHN` outside Tier-3 line context; `HL7_PHONE_BARE` for
plain 10-digit phones). Confidence threshold for tier-5 tokenize is
0.3 (below that is too noisy).
- **`lib/phi-sidecar.sh`** — lifecycle launcher. Subcommands:
`start / stop / status / health / ensure`. `ensure` is idempotent
(no-op if already up); called from `larry.sh` main_loop startup,
backgrounded so it never blocks larry's first prompt. Waits up to
30 seconds for the sidecar to become healthy after `start`; surfaces
the log tail if startup fails. PID file at
`$LARRY_HOME/.phi-sidecar.pid`; log at `$LARRY_HOME/log/phi-sidecar.log`.
Honors `LARRY_PHI_VENV` env to use a dedicated virtualenv (which the
installer sets up at `$LARRY_HOME/phi-venv` when the user opts in).
- **`lib/phi-client.sh`** — bash wrapper around `/redact`. Sourceable
functions: `phi_client_available`, `phi_redact_text`, `phi_redact_entities`.
Also runs standalone as a CLI (`./phi-client.sh check / redact / entities`).
CR-safe (sources `cygwin-safe.sh` defensively); 5-second curl timeout
bounds any tier-5 stall.
- **Tier-5 integration in `larry.sh:auto_detect_phi`.** New stage AFTER
the existing tier-1/2/3/4 substitution and BEFORE the status summary.
Sources `phi-client.sh` lazily, probes `phi_client_available`, and on
success runs `phi_redact_entities` to get Presidio's per-entity output.
Each entity is tokenized through the SAME `hl7-sanitize.sh tokenize-value`
pipeline as tiers 1-4 (category prefixed `presidio_<TYPE>`) so token IDs
remain stable across surfaces and the `/tokens` listing stays unified.
Tier-5 honors `LARRY_AUTO_PHI=confirm` (prompts Y/n once per value) and
`strict` (aborts the turn if `tokenize-value` fails on a Presidio hit).
Critically, v0.8.2 removes the v0.7.3 early-return that exited
`auto_detect_phi` when tiers 1-4 found nothing — pure-prose input now
ALWAYS reaches tier-5.
- **Graceful degradation.** If the sidecar is unreachable (not installed,
not started, crashed), tier-5 silently no-ops with a one-time stderr
warning per session. Larry's REPL remains fully functional in v0.8.1
mode. `LARRY_AUTO_PHI=strict` does NOT abort on absent sidecar (the
strict mode escape is for HL7-shaped content where rule-pack would
have caught the leak; tier-5 is additive coverage).
- **`/phi-sidecar` slash command** — `start / stop / status / health /
ensure` exposed to the user. Slash-completion table and `_LARRY_SLASH_CMDS_DESC`
updated.
- **`install-larry.sh` install path.** On hosts with Python 3.9+ + pip,
the installer prompts before creating `$LARRY_HOME/phi-venv` and
installing `presidio_analyzer + presidio_anonymizer + fastapi +
uvicorn + spaCy en_core_web_sm` (~400MB on disk, ~250MB RAM resident).
On MobaXterm/Cygwin without python3, the installer skips the prompt
entirely and prints Bryan's accepted tradeoff (MobaXterm stays on
v0.8.1 + nudges). Re-runnable; idempotent.
- **MANIFEST.** Added three new lib files. They auto-sync to every
running client on next launch; clients without Python 3 won't run
the sidecar but the files are harmless to ship.
**Prototype validation (Bryan's Mac, Apple Silicon, Python 3.14).**
Cold start (model load): ~9 seconds with `en_core_web_sm` (vs ~82s with
the larger `en_core_web_lg` Presidio auto-downloads by default — we
explicitly pin `_sm` for the latency-sensitive REPL use case). Warm
analyzer latency: P50 20.6ms, P95 22.7ms over 20 sequential requests
on 100-word input. End-to-end HTTP round-trip (curl + json roundtrip):
P50 ~57ms warm; first request post-startup pays a ~150ms tokenizer
warmup tax then steady. Well under the 200ms-per-turn REPL budget.
Detection quality on the canonical "John Doe MRN 623000286" sample: 8
core entities caught (PERSON x2, DATE_TIME x2, PHONE_NUMBER, US_*),
plus the three custom HL7 recognizers add MRN + caret-name + bare-phone
coverage. Misclassifications (MRN as US_PASSPORT, "ED" as PERSON) are
within tolerance for the tokenize-everything-suspicious policy — the
auto-PHI lookup table sees them as `presidio_*` categories and the
operator can audit via `/tokens`.
**MobaXterm compatibility verdict.** Per Bryan's accepted tradeoff:
v0.8.2 ships Mac/Linux-only. MobaXterm/Cygwin stays on v0.8.1
(rule-pack + path-block + content-shape gating + strict mode + base64
round-trip + tool-result review gate). Test path: install-larry.sh
detects platform and skips the Presidio install on `windows-cygwin`
with a clear "v0.8.1 mode" note. No code in larry.sh is platform-gated
— tier-5 silently no-ops when the sidecar is absent, which IS the
MobaXterm path.
**Proactive same-pattern sweep.** Searched for other call sites where
free-text NER would help: tool-result surface already gets HL7-shape
sanitize (v0.8.1) and base64 round-trip (v0.8.1-c). Tier-5 is
user_input-only by design — tool-result free-text NER deferred to a
future patch (would require deciding on per-tool latency budgets;
Bryan to call when needed).
## v0.8.1 — 2026-05-27
Tool-result PHI gating expansion. Closes V2 / V12 and the V2 base64 sub-gap
from Vera's audit. No behavior change for users not on HL7-shaped data;
opt-in friction for the 8KB+ tool-result review gate.
- **Tool-name allow-list dropped; content-shape gating only.** The v0.7.3
tool-result auto-PHI gate ran only on `read_file (.hl7|.txt)`, `nc_msgs`,
`hl7_field`, `hl7_diff`. v0.8.1 runs `_auto_phi_looks_like_hl7` on
EVERY tool result. On hit → route through `lib/hl7-sanitize.sh`.
On miss → pass through unchanged. Closes V2: `bash_exec`/`ssh_exec`/
`grep_files`/`read_file` of `.log`/`.csv`/`.dat`/no-suffix files are
now all covered when their output is HL7-shaped. False-positive cost
is cheap (extra regex pass with zero behavioral impact on non-HL7).
- **Base64-wrapped HL7 round-trip.** New `_auto_phi_b64_roundtrip` helper.
Detects candidate base64 runs (length >= 200 chars, `[A-Za-z0-9+/=]`
only, length divisible by 4 — NOT entropy-based, per Pax §V2-sub:
HL7's repetitive prefixes survive base64 with LOW entropy). Speculatively
decodes each run; if decoded bytes look like HL7, routes through
`hl7-sanitize.sh` and re-encodes (`base64 -w0`) back into the result.
Catches `ssh_pull_smat` sampled mode TSV (server-side encoding kept
for binary-safe TSV transport; client-side unwrap handles the safety
concern). Requires `python3` (installed everywhere larry-anywhere
runs); skipped with a one-time stderr warning if unavailable.
- **Operator review gate for `bash_exec`/`ssh_exec`/`ssh_pull`/
`ssh_pull_smat` results.** When the tool produced HL7-shaped output OR
the result exceeds `LARRY_TOOL_RESULT_REVIEW_THRESHOLD` bytes
(default 8192), Larry prompts `[Y/n/i]` before passing the result
back to the model. `i` opens the result in `$PAGER` then re-prompts.
Default Y — zero friction by default. `N` substitutes a refusal JSON
so the model knows a result was withheld. Skipped when
`LARRY_AUTO_PHI=off` (consistent with the opt-out) OR running
non-interactively (no TTY — never blocks headless scripts).
Override with `LARRY_TOOL_RESULT_REVIEW=always` to gate every result.
Per Pax §V2/V12: closes the "operator wanted to see this themselves,
didn't want the model to see it" gap that's the actual common case.
**Proactive same-pattern sweep.** Searched the codebase for other call
sites where tool output bypasses content-shape gating: found only the
one in `agent_turn`. The v0.8.0-c strict-mode tool-result branch was
hardened in lockstep so it now triggers on the broader (content-only)
eligibility instead of the old name-allow-list.
Manifest unchanged.
## v0.8.0 — 2026-05-27
PHI-safety quick-wins pack — three independent zero-risk patches closing
four gap-classes Vera identified in the v0.7.5 static audit
(`Deliverables/2026-05-27-cloverleaf-larry-phi-leak-audit.md`) with Pax's
recommended mitigations
(`Deliverables/2026-05-27-cloverleaf-larry-phi-mitigation-research.md`).
No new dependencies, no behavior change for users not interacting with PHI.
- **`read_file`/`grep_files`/`glob_files`/`list_dir` path-block list
(closes V4 + V6 + V11).** Refuse — with a structured JSON error the
model must surface, NOT a silent "file not found" — any tool-side
attempt to read or enumerate under `$LARRY_HOME/log/` (auto-phi.log,
headers.log, oauth.log, session logs), `$LARRY_HOME/sanitize/`
(lookup.tsv — the desanitization key), `$LARRY_HOME/sessions/`,
`$LARRY_HOME/.oauth.json`, or `$LARRY_HOME/.env`. Block-list resolves
`$LARRY_HOME` at call time (not script-parse time) and runs against
both the literal path and its `realpath -m` canonical form, so symlink
detours don't bypass. The proactive same-pattern sweep (Bryan standing
rule, 2026-05-27) extended the block from `tool_read_file` alone to
also cover `tool_grep_files`, `tool_glob_files`, and `tool_list_dir`
— those tools would otherwise leak filenames or grep-matched content
out of the same protected dirs without any approval gate.
- **`/load <file>` HL7 pre-routing (closes V3).** When the loaded file's
content matches `_auto_phi_looks_like_hl7`, route it through
`lib/hl7-sanitize.sh` (the segment-aware tokenizer with the full PHI
field rule set: PID, PV1, NK1, GT1, IN1, OBR, OBX, DG1, ORC) BEFORE
the existing user_input auto-PHI pass. Closes the gap where smat dumps
loaded via `/load` only got the lighter per-word classifier, which
misses bare HL7 PID fields. Status line reports how many fields were
tokenized: `phi> /load: hl7-sanitize.sh tokenized N HL7 field(s) from
<file> before passing to auto-PHI`. Strict mode (see below) aborts the
`/load` if sanitize fails; default/confirm modes warn-and-continue.
- **`LARRY_AUTO_PHI=strict` fail-closed mode (closes V5).** New fourth
value alongside `off / on / confirm`. In strict mode, the auto-PHI
pipeline aborts the surrounding turn (no payload built, no API call)
when: (a) `lib/hl7-sanitize.sh` is missing/non-executable on HL7-shaped
user_input, (b) the sanitizer returns empty on HL7-shaped content,
(c) any single value's `tokenize-value` call fails inside the
detection loop. On the tool-result surface (which can't kill the
in-flight tool_use), strict mode substitutes the result with a
structured JSON refusal sentinel so the raw HL7 NEVER reaches the
model. Existing `off / on / confirm` semantics unchanged (still
fail-open per Bryan's "don't break tools" priority). Strict is the
opt-in tradeoff for HIPAA work where a silent leak is worse than a
broken turn. `/phi-auto strict` toggle and `/help` text updated.
Wired into both auto-PHI invocation sites: user input scan and the
tool-result HL7 sanitizer gate.
**Proactive same-pattern sweep (Bryan standing rule, 2026-05-27).**
Searched the codebase for other tools matching the pattern "reads
arbitrary path, returns content to model, no approval gate": found and
patched `tool_grep_files`, `tool_glob_files`, `tool_list_dir`
alongside `tool_read_file`. `bash_exec`/`ssh_exec` already require Y/N
operator approval — the operator is the gatekeeper there (a second gate
deferred to v0.8.1). No other matches.
Manifest unchanged (no new files in `lib/`).
## v0.7.5 — 2026-05-27
Three focused changes, one common cause: the Cygwin/MobaXterm CR-taint pattern
that crashed OAuth on Bryan's v0.7.3 work-box with the cryptic error
`bash: ...: arithmetic syntax error: invalid arithmetic operator (error token is "")`.
- **OAuth/arithmetic CR fix.** `lib/oauth.sh` now routes every operand entering
a bash arithmetic context (`fetched_at`, `expires_in`, `now`) through a
dedicated `coerce_int` helper that strips non-digits at the source. The
failure mode: `$(date +%s)` against a Cygwin pty where Windows-native
`date.exe` shadows Cygwin `date` can return a CR-tainted epoch like
`"1779999999\r"`, which crashes the very next `$((expires_at - now))`.
Diagnosis in `Deliverables/2026-05-27-cloverleaf-larry-oauth-arithmetic-fix.md`.
- **Mouse mode is opt-in.** REPL mouse handling now defaults to OFF and is
enabled via `LARRY_MOUSE=1` env var or `/mouse on` slash command. Several
terminals (notably MobaXterm and stripped tmux) were swallowing the mouse
ANSI sequences and printing literal `^[[?1000h` garbage when v0.7.0 turned
it on unconditionally. Diagnosis in
`Deliverables/2026-05-27-cloverleaf-larry-mouse-regression-fix.md`.
- **CR-safety sweep across `lib/*.sh` and top-level scripts.** Three new
primitives in `lib/cygwin-safe.sh` (sourced by every tool family member):
- `coerce_int VAL [DEFAULT]` — for arithmetic and integer-test operands
- `strip_cr VAL` — for case patterns, regex tests, paths, HTTP headers
- `read_clean VAR [PROMPT]``read -r` wrapper that strips CR pre-assign
Hardened call sites:
- `larry.sh` — status-line `date +%s` / `tput cols`, three y/N approval
prompts (write_file, bash_exec, first-run auth), API-key paste,
first-run auth menu
- `lib/oauth.sh``cmd_login` and `cmd_refresh` `date +%s` captures
- `lib/nc-engine.sh` — five y/N action prompts (stop/start/bounce, resend,
route-test, testxlate, tpstest) + `find ... | wc -l` arithmetic
- `lib/nc-msgs.sh``parse_time_ms` `date` captures (4 sites),
meta-TSV `tm` field, `MSG_COUNT` `wc -l`
- `lib/nc-regression.sh``tr | wc -c` count, hl7-diff `?`-fallback
arithmetic
- `lib/nc-smat-diff.sh``A_COUNT`/`B_COUNT`/`DIFFS_TOTAL`
- `lib/nc-insert-protocol.sh` — every awk-emitted line-number that feeds
`head -n $((N-1))` / `tail -n +$((N+1))` arithmetic
- `lib/journal.sh``_next_seq` `wc -l` arithmetic
- `lib/lessons.sh``_next_id`, `cmd_list`, `cmd_count` arithmetic +
two y/N prompts (clear all, clear since)
- `lib/hl7-sanitize.sh``cmd_count` arithmetic + clear-table y/N
- `lib/ssh-helper.sh` — local + remote `wc -c` integer compares (4 sites)
- `lib/nc-find.sh``wc -l` count for `%d` printf
- `lib/nc-table.sh``$(date +%s)` in backup-filename construction
- `lib/nc-document.sh` — two `wc -l | %d` printf sites
- `larry-rollback.sh` — Proceed? y/N prompt
Reproduction (now exercised by `cygwin-safe.sh`'s in-line tests):
```
now=$(printf '%s\r' 1779999999); echo $((now - 1)) # pre-fix: crashes
now=$(coerce_int "$(printf '%s\r' 1779999999)" 0); echo $((now - 1)) # fix: 1779999998
```
Added `lib/cygwin-safe.sh` to `MANIFEST` so it auto-syncs to every running
client on next launch.
## v0.7.4 — 2026-05-27
- Drop GitHub fallback from auto-update. Single-source Gitea
(`https://git.bjnoela.com/bryan/cloverleaf-larry.git`).
## v0.7.3 — 2026-05-26
- Automatic PHI detection (tiered detection + blacklist contexts).
## v0.7.2 — 2026-05-26
- Gitea becomes primary auto-update origin; GitHub demoted to fallback.
## v0.7.1 — 2026-05-26
- Status line moves to between-turn position (post-input, pre-response).
- Status line below prompt; automatic PHI detection; session-artifact upload.
## v0.7.0 — 2026-05-26
- HL7-aware tab completion + REPL mouse mode (later made opt-in in v0.7.5).