Three independent zero-risk patches closing V3/V4/V5/V6/V11 gaps from
Vera's static PHI-leak audit. Implemented per Pax's mitigation
recommendations. No new deps, no behavior change for users not handling PHI.
- tool_read_file / tool_grep_files / tool_glob_files / tool_list_dir now
refuse paths under $LARRY_HOME/{log,sanitize,sessions} and
$LARRY_HOME/{.oauth.json,.env} with a structured JSON error the model
must surface. Block-list evaluates at call time; comparison runs against
both the literal and realpath-canonicalized form of both PATH and
$LARRY_HOME. Closes V4 + V6 + V11 (de-sanitization key, OAuth tokens,
PHI clear-text audit log). The proactive same-pattern sweep extended
the block from read_file alone to grep_files/glob_files/list_dir.
- /load <file> pre-routes HL7-shaped content through lib/hl7-sanitize.sh
(segment-aware tokenizer) BEFORE the user_input auto-PHI pass. Closes
V3 — smat dumps loaded via /load no longer rely on the lighter per-word
classifier.
- LARRY_AUTO_PHI=strict (fourth value alongside off/on/confirm) is the
fail-closed mode. Aborts the turn when sanitizer is missing or returns
empty on HL7-shaped content, or when tokenize-value fails. On the
tool-result surface (can't kill an in-flight tool_use), substitutes
the result with a refusal sentinel so raw HL7 NEVER reaches the model.
Existing off/on/confirm semantics unchanged. /phi-auto strict toggle,
/help text, and tests updated. Closes V5.
Refs:
Deliverables/2026-05-27-cloverleaf-larry-phi-leak-audit.md (Vera)
Deliverables/2026-05-27-cloverleaf-larry-phi-mitigation-research.md (Pax)
Verification: bash -n clean; path-block unit-tested with 13 cases including
symlink resolution (file and dir), ../ traversal, nonexistent paths, and
the empty-LARRY_HOME edge case — all pass.
Co-Authored-By: Clover (Claude Opus 4.7) <noreply@anthropic.com>
148 lines
7.6 KiB
Markdown
148 lines
7.6 KiB
Markdown
# Changelog
|
|
|
|
All notable changes to `cloverleaf-larry` / `larry-anywhere` are recorded here.
|
|
Versioning is loose-semver; bumps trigger the in-process self-update on every
|
|
running client via `LARRY_BASE_URL` + `MANIFEST`.
|
|
|
|
## v0.8.0 — 2026-05-27
|
|
|
|
PHI-safety quick-wins pack — three independent zero-risk patches closing
|
|
four gap-classes Vera identified in the v0.7.5 static audit
|
|
(`Deliverables/2026-05-27-cloverleaf-larry-phi-leak-audit.md`) with Pax's
|
|
recommended mitigations
|
|
(`Deliverables/2026-05-27-cloverleaf-larry-phi-mitigation-research.md`).
|
|
No new dependencies, no behavior change for users not interacting with PHI.
|
|
|
|
- **`read_file`/`grep_files`/`glob_files`/`list_dir` path-block list
|
|
(closes V4 + V6 + V11).** Refuse — with a structured JSON error the
|
|
model must surface, NOT a silent "file not found" — any tool-side
|
|
attempt to read or enumerate under `$LARRY_HOME/log/` (auto-phi.log,
|
|
headers.log, oauth.log, session logs), `$LARRY_HOME/sanitize/`
|
|
(lookup.tsv — the desanitization key), `$LARRY_HOME/sessions/`,
|
|
`$LARRY_HOME/.oauth.json`, or `$LARRY_HOME/.env`. Block-list resolves
|
|
`$LARRY_HOME` at call time (not script-parse time) and runs against
|
|
both the literal path and its `realpath -m` canonical form, so symlink
|
|
detours don't bypass. The proactive same-pattern sweep (Bryan standing
|
|
rule, 2026-05-27) extended the block from `tool_read_file` alone to
|
|
also cover `tool_grep_files`, `tool_glob_files`, and `tool_list_dir`
|
|
— those tools would otherwise leak filenames or grep-matched content
|
|
out of the same protected dirs without any approval gate.
|
|
|
|
- **`/load <file>` HL7 pre-routing (closes V3).** When the loaded file's
|
|
content matches `_auto_phi_looks_like_hl7`, route it through
|
|
`lib/hl7-sanitize.sh` (the segment-aware tokenizer with the full PHI
|
|
field rule set: PID, PV1, NK1, GT1, IN1, OBR, OBX, DG1, ORC) BEFORE
|
|
the existing user_input auto-PHI pass. Closes the gap where smat dumps
|
|
loaded via `/load` only got the lighter per-word classifier, which
|
|
misses bare HL7 PID fields. Status line reports how many fields were
|
|
tokenized: `phi> /load: hl7-sanitize.sh tokenized N HL7 field(s) from
|
|
<file> before passing to auto-PHI`. Strict mode (see below) aborts the
|
|
`/load` if sanitize fails; default/confirm modes warn-and-continue.
|
|
|
|
- **`LARRY_AUTO_PHI=strict` fail-closed mode (closes V5).** New fourth
|
|
value alongside `off / on / confirm`. In strict mode, the auto-PHI
|
|
pipeline aborts the surrounding turn (no payload built, no API call)
|
|
when: (a) `lib/hl7-sanitize.sh` is missing/non-executable on HL7-shaped
|
|
user_input, (b) the sanitizer returns empty on HL7-shaped content,
|
|
(c) any single value's `tokenize-value` call fails inside the
|
|
detection loop. On the tool-result surface (which can't kill the
|
|
in-flight tool_use), strict mode substitutes the result with a
|
|
structured JSON refusal sentinel so the raw HL7 NEVER reaches the
|
|
model. Existing `off / on / confirm` semantics unchanged (still
|
|
fail-open per Bryan's "don't break tools" priority). Strict is the
|
|
opt-in tradeoff for HIPAA work where a silent leak is worse than a
|
|
broken turn. `/phi-auto strict` toggle and `/help` text updated.
|
|
Wired into both auto-PHI invocation sites: user input scan and the
|
|
tool-result HL7 sanitizer gate.
|
|
|
|
**Proactive same-pattern sweep (Bryan standing rule, 2026-05-27).**
|
|
Searched the codebase for other tools matching the pattern "reads
|
|
arbitrary path, returns content to model, no approval gate": found and
|
|
patched `tool_grep_files`, `tool_glob_files`, `tool_list_dir`
|
|
alongside `tool_read_file`. `bash_exec`/`ssh_exec` already require Y/N
|
|
operator approval — the operator is the gatekeeper there (a second gate
|
|
deferred to v0.8.1). No other matches.
|
|
|
|
Manifest unchanged (no new files in `lib/`).
|
|
|
|
## v0.7.5 — 2026-05-27
|
|
|
|
Three focused changes, one common cause: the Cygwin/MobaXterm CR-taint pattern
|
|
that crashed OAuth on Bryan's v0.7.3 work-box with the cryptic error
|
|
`bash: ...: arithmetic syntax error: invalid arithmetic operator (error token is "")`.
|
|
|
|
- **OAuth/arithmetic CR fix.** `lib/oauth.sh` now routes every operand entering
|
|
a bash arithmetic context (`fetched_at`, `expires_in`, `now`) through a
|
|
dedicated `coerce_int` helper that strips non-digits at the source. The
|
|
failure mode: `$(date +%s)` against a Cygwin pty where Windows-native
|
|
`date.exe` shadows Cygwin `date` can return a CR-tainted epoch like
|
|
`"1779999999\r"`, which crashes the very next `$((expires_at - now))`.
|
|
Diagnosis in `Deliverables/2026-05-27-cloverleaf-larry-oauth-arithmetic-fix.md`.
|
|
|
|
- **Mouse mode is opt-in.** REPL mouse handling now defaults to OFF and is
|
|
enabled via `LARRY_MOUSE=1` env var or `/mouse on` slash command. Several
|
|
terminals (notably MobaXterm and stripped tmux) were swallowing the mouse
|
|
ANSI sequences and printing literal `^[[?1000h` garbage when v0.7.0 turned
|
|
it on unconditionally. Diagnosis in
|
|
`Deliverables/2026-05-27-cloverleaf-larry-mouse-regression-fix.md`.
|
|
|
|
- **CR-safety sweep across `lib/*.sh` and top-level scripts.** Three new
|
|
primitives in `lib/cygwin-safe.sh` (sourced by every tool family member):
|
|
- `coerce_int VAL [DEFAULT]` — for arithmetic and integer-test operands
|
|
- `strip_cr VAL` — for case patterns, regex tests, paths, HTTP headers
|
|
- `read_clean VAR [PROMPT]` — `read -r` wrapper that strips CR pre-assign
|
|
Hardened call sites:
|
|
- `larry.sh` — status-line `date +%s` / `tput cols`, three y/N approval
|
|
prompts (write_file, bash_exec, first-run auth), API-key paste,
|
|
first-run auth menu
|
|
- `lib/oauth.sh` — `cmd_login` and `cmd_refresh` `date +%s` captures
|
|
- `lib/nc-engine.sh` — five y/N action prompts (stop/start/bounce, resend,
|
|
route-test, testxlate, tpstest) + `find ... | wc -l` arithmetic
|
|
- `lib/nc-msgs.sh` — `parse_time_ms` `date` captures (4 sites),
|
|
meta-TSV `tm` field, `MSG_COUNT` `wc -l`
|
|
- `lib/nc-regression.sh` — `tr | wc -c` count, hl7-diff `?`-fallback
|
|
arithmetic
|
|
- `lib/nc-smat-diff.sh` — `A_COUNT`/`B_COUNT`/`DIFFS_TOTAL`
|
|
- `lib/nc-insert-protocol.sh` — every awk-emitted line-number that feeds
|
|
`head -n $((N-1))` / `tail -n +$((N+1))` arithmetic
|
|
- `lib/journal.sh` — `_next_seq` `wc -l` arithmetic
|
|
- `lib/lessons.sh` — `_next_id`, `cmd_list`, `cmd_count` arithmetic +
|
|
two y/N prompts (clear all, clear since)
|
|
- `lib/hl7-sanitize.sh` — `cmd_count` arithmetic + clear-table y/N
|
|
- `lib/ssh-helper.sh` — local + remote `wc -c` integer compares (4 sites)
|
|
- `lib/nc-find.sh` — `wc -l` count for `%d` printf
|
|
- `lib/nc-table.sh` — `$(date +%s)` in backup-filename construction
|
|
- `lib/nc-document.sh` — two `wc -l | %d` printf sites
|
|
- `larry-rollback.sh` — Proceed? y/N prompt
|
|
|
|
Reproduction (now exercised by `cygwin-safe.sh`'s in-line tests):
|
|
```
|
|
now=$(printf '%s\r' 1779999999); echo $((now - 1)) # pre-fix: crashes
|
|
now=$(coerce_int "$(printf '%s\r' 1779999999)" 0); echo $((now - 1)) # fix: 1779999998
|
|
```
|
|
|
|
Added `lib/cygwin-safe.sh` to `MANIFEST` so it auto-syncs to every running
|
|
client on next launch.
|
|
|
|
## v0.7.4 — 2026-05-27
|
|
|
|
- Drop GitHub fallback from auto-update. Single-source Gitea
|
|
(`https://git.bjnoela.com/bryan/cloverleaf-larry.git`).
|
|
|
|
## v0.7.3 — 2026-05-26
|
|
|
|
- Automatic PHI detection (tiered detection + blacklist contexts).
|
|
|
|
## v0.7.2 — 2026-05-26
|
|
|
|
- Gitea becomes primary auto-update origin; GitHub demoted to fallback.
|
|
|
|
## v0.7.1 — 2026-05-26
|
|
|
|
- Status line moves to between-turn position (post-input, pre-response).
|
|
- Status line below prompt; automatic PHI detection; session-artifact upload.
|
|
|
|
## v0.7.0 — 2026-05-26
|
|
|
|
- HL7-aware tab completion + REPL mouse mode (later made opt-in in v0.7.5).
|