The only path that closes V1 (free-text PHI gap — the dominant real-world
failure mode per Vera). Opt-in install; larry runs in v0.8.1 mode on hosts
without Presidio (MobaXterm/Cygwin per Bryan's accepted tradeoff).
New files:
- lib/phi-presidio-sidecar.py — FastAPI service on 127.0.0.1:$LARRY_PHI_PORT
(default 41189). Presidio AnalyzerEngine + AnonymizerEngine over spaCy
en_core_web_sm + 3 HL7-specific custom recognizers (HL7_MRN, HL7_CARET_NAME,
HL7_PHONE_BARE). POST /redact and GET /health.
- lib/phi-sidecar.sh — lifecycle (start/stop/status/health/ensure). ensure
is idempotent; called backgrounded from main_loop so it never blocks the
first prompt. Honors LARRY_PHI_VENV.
- lib/phi-client.sh — bash client (phi_client_available / phi_redact_text /
phi_redact_entities). CR-safe; 5s timeout bounds tier-5 stall.
larry.sh:
- auto_detect_phi gains tier-5: after tiers 1-4, before status summary,
source phi-client.sh, run Presidio on a token-masked copy of the input,
tokenize each entity through hl7-sanitize.sh tokenize-value (category
presidio_<TYPE>) so token IDs stay stable. Honors confirm + strict modes.
Removed the v0.7.3 early-return that skipped past tier-5 when tiers 1-4
found nothing — pure prose now always reaches tier-5.
- Token-safe substitution: existing [[...]] tokens are pulled to sentinels,
tier-5 value is replaced, sentinels restored — prevents the token-within-
token corruption that naive literal-replace caused on already-tokenized
text. Acronym guard drops HL7/clinical jargon (SSN/MRN/DOB/ADT) Presidio
over-tags as ORGANIZATION.
- Graceful degradation: sidecar unreachable → tier-5 no-ops with a one-time
stderr warning. /phi-sidecar slash command + completion table.
install-larry.sh:
- Probes python3 3.9+; offers to create $LARRY_HOME/phi-venv and install
presidio + fastapi + uvicorn + en_core_web_sm. Skips silently (with a
v0.8.1-mode note) on Cygwin/MobaXterm without python3, and on
non-interactive pipe installs. Sets LARRY_PHI_VENV in the larry shim.
MANIFEST: three new lib files added for auto-sync.
Prototype validation (Bryan's Mac, Apple Silicon, Python 3.14):
cold start (en_core_web_sm): ~9s (vs ~82s if Presidio auto-grabs _lg;
we pin _sm for the REPL budget)
warm analyzer latency: P50 20.6ms / P95 22.7ms
end-to-end HTTP round-trip: ~57ms warm; ~150ms first-post-startup
All comfortably under the 200ms-per-turn budget.
MobaXterm verdict: v0.8.2 is Mac/Linux-only. MobaXterm stays on v0.8.1 +
nudges, per Bryan's explicit acceptance. install-larry.sh enforces this
by platform detection; larry.sh tier-5 silently no-ops when the sidecar
is absent (which IS the MobaXterm path — no code is platform-gated).
Verification: bash -n clean on larry.sh + all 3 new lib scripts; python3
ast.parse clean on the sidecar; end-to-end tier-5 tested live against the
sidecar (pure prose, rule-pack+tier-5 combined with no token corruption,
!nophi bypass); strict-mode fail-closed abort tested; CR-taint, path-block,
and base64 round-trip batteries re-run green.
Co-Authored-By: Clover (Claude Opus 4.7) <noreply@anthropic.com>
300 lines
16 KiB
Markdown
300 lines
16 KiB
Markdown
# Changelog
|
|
|
|
All notable changes to `cloverleaf-larry` / `larry-anywhere` are recorded here.
|
|
Versioning is loose-semver; bumps trigger the in-process self-update on every
|
|
running client via `LARRY_BASE_URL` + `MANIFEST`.
|
|
|
|
## v0.8.2 — 2026-05-27
|
|
|
|
Microsoft Presidio sidecar for free-text NER. Closes V1 from Vera's audit —
|
|
the dominant real-world failure mode (patient names, addresses, un-keyworded
|
|
dates in prose chat). Opt-in install; larry runs in v0.8.1 mode on hosts
|
|
where Presidio isn't installed (MobaXterm/Cygwin per Bryan's accepted
|
|
tradeoff).
|
|
|
|
- **`lib/phi-presidio-sidecar.py`** — FastAPI service on
|
|
`127.0.0.1:$LARRY_PHI_PORT` (default `41189`). Wraps Presidio's
|
|
`AnalyzerEngine` + `AnonymizerEngine` over spaCy `en_core_web_sm`
|
|
(12MB model, ~9-second cold start). Two endpoints: `POST /redact`
|
|
takes `{"text": "..."}` and returns `{"redacted": "...", "entities":
|
|
[...], "latency_ms": N}`; `GET /health` for the launcher's readiness
|
|
probe. Three HL7-specific custom recognizers added (`HL7_MRN` for
|
|
6-12 digit numerics with patient/MRN/account context; `HL7_CARET_NAME`
|
|
for `SMITH^JOHN` outside Tier-3 line context; `HL7_PHONE_BARE` for
|
|
plain 10-digit phones). Confidence threshold for tier-5 tokenize is
|
|
0.3 (below that is too noisy).
|
|
|
|
- **`lib/phi-sidecar.sh`** — lifecycle launcher. Subcommands:
|
|
`start / stop / status / health / ensure`. `ensure` is idempotent
|
|
(no-op if already up); called from `larry.sh` main_loop startup,
|
|
backgrounded so it never blocks larry's first prompt. Waits up to
|
|
30 seconds for the sidecar to become healthy after `start`; surfaces
|
|
the log tail if startup fails. PID file at
|
|
`$LARRY_HOME/.phi-sidecar.pid`; log at `$LARRY_HOME/log/phi-sidecar.log`.
|
|
Honors `LARRY_PHI_VENV` env to use a dedicated virtualenv (which the
|
|
installer sets up at `$LARRY_HOME/phi-venv` when the user opts in).
|
|
|
|
- **`lib/phi-client.sh`** — bash wrapper around `/redact`. Sourceable
|
|
functions: `phi_client_available`, `phi_redact_text`, `phi_redact_entities`.
|
|
Also runs standalone as a CLI (`./phi-client.sh check / redact / entities`).
|
|
CR-safe (sources `cygwin-safe.sh` defensively); 5-second curl timeout
|
|
bounds any tier-5 stall.
|
|
|
|
- **Tier-5 integration in `larry.sh:auto_detect_phi`.** New stage AFTER
|
|
the existing tier-1/2/3/4 substitution and BEFORE the status summary.
|
|
Sources `phi-client.sh` lazily, probes `phi_client_available`, and on
|
|
success runs `phi_redact_entities` to get Presidio's per-entity output.
|
|
Each entity is tokenized through the SAME `hl7-sanitize.sh tokenize-value`
|
|
pipeline as tiers 1-4 (category prefixed `presidio_<TYPE>`) so token IDs
|
|
remain stable across surfaces and the `/tokens` listing stays unified.
|
|
Tier-5 honors `LARRY_AUTO_PHI=confirm` (prompts Y/n once per value) and
|
|
`strict` (aborts the turn if `tokenize-value` fails on a Presidio hit).
|
|
Critically, v0.8.2 removes the v0.7.3 early-return that exited
|
|
`auto_detect_phi` when tiers 1-4 found nothing — pure-prose input now
|
|
ALWAYS reaches tier-5.
|
|
|
|
- **Graceful degradation.** If the sidecar is unreachable (not installed,
|
|
not started, crashed), tier-5 silently no-ops with a one-time stderr
|
|
warning per session. Larry's REPL remains fully functional in v0.8.1
|
|
mode. `LARRY_AUTO_PHI=strict` does NOT abort on absent sidecar (the
|
|
strict mode escape is for HL7-shaped content where rule-pack would
|
|
have caught the leak; tier-5 is additive coverage).
|
|
|
|
- **`/phi-sidecar` slash command** — `start / stop / status / health /
|
|
ensure` exposed to the user. Slash-completion table and `_LARRY_SLASH_CMDS_DESC`
|
|
updated.
|
|
|
|
- **`install-larry.sh` install path.** On hosts with Python 3.9+ + pip,
|
|
the installer prompts before creating `$LARRY_HOME/phi-venv` and
|
|
installing `presidio_analyzer + presidio_anonymizer + fastapi +
|
|
uvicorn + spaCy en_core_web_sm` (~400MB on disk, ~250MB RAM resident).
|
|
On MobaXterm/Cygwin without python3, the installer skips the prompt
|
|
entirely and prints Bryan's accepted tradeoff (MobaXterm stays on
|
|
v0.8.1 + nudges). Re-runnable; idempotent.
|
|
|
|
- **MANIFEST.** Added three new lib files. They auto-sync to every
|
|
running client on next launch; clients without Python 3 won't run
|
|
the sidecar but the files are harmless to ship.
|
|
|
|
**Prototype validation (Bryan's Mac, Apple Silicon, Python 3.14).**
|
|
Cold start (model load): ~9 seconds with `en_core_web_sm` (vs ~82s with
|
|
the larger `en_core_web_lg` Presidio auto-downloads by default — we
|
|
explicitly pin `_sm` for the latency-sensitive REPL use case). Warm
|
|
analyzer latency: P50 20.6ms, P95 22.7ms over 20 sequential requests
|
|
on 100-word input. End-to-end HTTP round-trip (curl + json roundtrip):
|
|
P50 ~57ms warm; first request post-startup pays a ~150ms tokenizer
|
|
warmup tax then steady. Well under the 200ms-per-turn REPL budget.
|
|
|
|
Detection quality on the canonical "John Doe MRN 623000286" sample: 8
|
|
core entities caught (PERSON x2, DATE_TIME x2, PHONE_NUMBER, US_*),
|
|
plus the three custom HL7 recognizers add MRN + caret-name + bare-phone
|
|
coverage. Misclassifications (MRN as US_PASSPORT, "ED" as PERSON) are
|
|
within tolerance for the tokenize-everything-suspicious policy — the
|
|
auto-PHI lookup table sees them as `presidio_*` categories and the
|
|
operator can audit via `/tokens`.
|
|
|
|
**MobaXterm compatibility verdict.** Per Bryan's accepted tradeoff:
|
|
v0.8.2 ships Mac/Linux-only. MobaXterm/Cygwin stays on v0.8.1
|
|
(rule-pack + path-block + content-shape gating + strict mode + base64
|
|
round-trip + tool-result review gate). Test path: install-larry.sh
|
|
detects platform and skips the Presidio install on `windows-cygwin`
|
|
with a clear "v0.8.1 mode" note. No code in larry.sh is platform-gated
|
|
— tier-5 silently no-ops when the sidecar is absent, which IS the
|
|
MobaXterm path.
|
|
|
|
**Proactive same-pattern sweep.** Searched for other call sites where
|
|
free-text NER would help: tool-result surface already gets HL7-shape
|
|
sanitize (v0.8.1) and base64 round-trip (v0.8.1-c). Tier-5 is
|
|
user_input-only by design — tool-result free-text NER deferred to a
|
|
future patch (would require deciding on per-tool latency budgets;
|
|
Bryan to call when needed).
|
|
|
|
## v0.8.1 — 2026-05-27
|
|
|
|
Tool-result PHI gating expansion. Closes V2 / V12 and the V2 base64 sub-gap
|
|
from Vera's audit. No behavior change for users not on HL7-shaped data;
|
|
opt-in friction for the 8KB+ tool-result review gate.
|
|
|
|
- **Tool-name allow-list dropped; content-shape gating only.** The v0.7.3
|
|
tool-result auto-PHI gate ran only on `read_file (.hl7|.txt)`, `nc_msgs`,
|
|
`hl7_field`, `hl7_diff`. v0.8.1 runs `_auto_phi_looks_like_hl7` on
|
|
EVERY tool result. On hit → route through `lib/hl7-sanitize.sh`.
|
|
On miss → pass through unchanged. Closes V2: `bash_exec`/`ssh_exec`/
|
|
`grep_files`/`read_file` of `.log`/`.csv`/`.dat`/no-suffix files are
|
|
now all covered when their output is HL7-shaped. False-positive cost
|
|
is cheap (extra regex pass with zero behavioral impact on non-HL7).
|
|
|
|
- **Base64-wrapped HL7 round-trip.** New `_auto_phi_b64_roundtrip` helper.
|
|
Detects candidate base64 runs (length >= 200 chars, `[A-Za-z0-9+/=]`
|
|
only, length divisible by 4 — NOT entropy-based, per Pax §V2-sub:
|
|
HL7's repetitive prefixes survive base64 with LOW entropy). Speculatively
|
|
decodes each run; if decoded bytes look like HL7, routes through
|
|
`hl7-sanitize.sh` and re-encodes (`base64 -w0`) back into the result.
|
|
Catches `ssh_pull_smat` sampled mode TSV (server-side encoding kept
|
|
for binary-safe TSV transport; client-side unwrap handles the safety
|
|
concern). Requires `python3` (installed everywhere larry-anywhere
|
|
runs); skipped with a one-time stderr warning if unavailable.
|
|
|
|
- **Operator review gate for `bash_exec`/`ssh_exec`/`ssh_pull`/
|
|
`ssh_pull_smat` results.** When the tool produced HL7-shaped output OR
|
|
the result exceeds `LARRY_TOOL_RESULT_REVIEW_THRESHOLD` bytes
|
|
(default 8192), Larry prompts `[Y/n/i]` before passing the result
|
|
back to the model. `i` opens the result in `$PAGER` then re-prompts.
|
|
Default Y — zero friction by default. `N` substitutes a refusal JSON
|
|
so the model knows a result was withheld. Skipped when
|
|
`LARRY_AUTO_PHI=off` (consistent with the opt-out) OR running
|
|
non-interactively (no TTY — never blocks headless scripts).
|
|
Override with `LARRY_TOOL_RESULT_REVIEW=always` to gate every result.
|
|
Per Pax §V2/V12: closes the "operator wanted to see this themselves,
|
|
didn't want the model to see it" gap that's the actual common case.
|
|
|
|
**Proactive same-pattern sweep.** Searched the codebase for other call
|
|
sites where tool output bypasses content-shape gating: found only the
|
|
one in `agent_turn`. The v0.8.0-c strict-mode tool-result branch was
|
|
hardened in lockstep so it now triggers on the broader (content-only)
|
|
eligibility instead of the old name-allow-list.
|
|
|
|
Manifest unchanged.
|
|
|
|
## v0.8.0 — 2026-05-27
|
|
|
|
PHI-safety quick-wins pack — three independent zero-risk patches closing
|
|
four gap-classes Vera identified in the v0.7.5 static audit
|
|
(`Deliverables/2026-05-27-cloverleaf-larry-phi-leak-audit.md`) with Pax's
|
|
recommended mitigations
|
|
(`Deliverables/2026-05-27-cloverleaf-larry-phi-mitigation-research.md`).
|
|
No new dependencies, no behavior change for users not interacting with PHI.
|
|
|
|
- **`read_file`/`grep_files`/`glob_files`/`list_dir` path-block list
|
|
(closes V4 + V6 + V11).** Refuse — with a structured JSON error the
|
|
model must surface, NOT a silent "file not found" — any tool-side
|
|
attempt to read or enumerate under `$LARRY_HOME/log/` (auto-phi.log,
|
|
headers.log, oauth.log, session logs), `$LARRY_HOME/sanitize/`
|
|
(lookup.tsv — the desanitization key), `$LARRY_HOME/sessions/`,
|
|
`$LARRY_HOME/.oauth.json`, or `$LARRY_HOME/.env`. Block-list resolves
|
|
`$LARRY_HOME` at call time (not script-parse time) and runs against
|
|
both the literal path and its `realpath -m` canonical form, so symlink
|
|
detours don't bypass. The proactive same-pattern sweep (Bryan standing
|
|
rule, 2026-05-27) extended the block from `tool_read_file` alone to
|
|
also cover `tool_grep_files`, `tool_glob_files`, and `tool_list_dir`
|
|
— those tools would otherwise leak filenames or grep-matched content
|
|
out of the same protected dirs without any approval gate.
|
|
|
|
- **`/load <file>` HL7 pre-routing (closes V3).** When the loaded file's
|
|
content matches `_auto_phi_looks_like_hl7`, route it through
|
|
`lib/hl7-sanitize.sh` (the segment-aware tokenizer with the full PHI
|
|
field rule set: PID, PV1, NK1, GT1, IN1, OBR, OBX, DG1, ORC) BEFORE
|
|
the existing user_input auto-PHI pass. Closes the gap where smat dumps
|
|
loaded via `/load` only got the lighter per-word classifier, which
|
|
misses bare HL7 PID fields. Status line reports how many fields were
|
|
tokenized: `phi> /load: hl7-sanitize.sh tokenized N HL7 field(s) from
|
|
<file> before passing to auto-PHI`. Strict mode (see below) aborts the
|
|
`/load` if sanitize fails; default/confirm modes warn-and-continue.
|
|
|
|
- **`LARRY_AUTO_PHI=strict` fail-closed mode (closes V5).** New fourth
|
|
value alongside `off / on / confirm`. In strict mode, the auto-PHI
|
|
pipeline aborts the surrounding turn (no payload built, no API call)
|
|
when: (a) `lib/hl7-sanitize.sh` is missing/non-executable on HL7-shaped
|
|
user_input, (b) the sanitizer returns empty on HL7-shaped content,
|
|
(c) any single value's `tokenize-value` call fails inside the
|
|
detection loop. On the tool-result surface (which can't kill the
|
|
in-flight tool_use), strict mode substitutes the result with a
|
|
structured JSON refusal sentinel so the raw HL7 NEVER reaches the
|
|
model. Existing `off / on / confirm` semantics unchanged (still
|
|
fail-open per Bryan's "don't break tools" priority). Strict is the
|
|
opt-in tradeoff for HIPAA work where a silent leak is worse than a
|
|
broken turn. `/phi-auto strict` toggle and `/help` text updated.
|
|
Wired into both auto-PHI invocation sites: user input scan and the
|
|
tool-result HL7 sanitizer gate.
|
|
|
|
**Proactive same-pattern sweep (Bryan standing rule, 2026-05-27).**
|
|
Searched the codebase for other tools matching the pattern "reads
|
|
arbitrary path, returns content to model, no approval gate": found and
|
|
patched `tool_grep_files`, `tool_glob_files`, `tool_list_dir`
|
|
alongside `tool_read_file`. `bash_exec`/`ssh_exec` already require Y/N
|
|
operator approval — the operator is the gatekeeper there (a second gate
|
|
deferred to v0.8.1). No other matches.
|
|
|
|
Manifest unchanged (no new files in `lib/`).
|
|
|
|
## v0.7.5 — 2026-05-27
|
|
|
|
Three focused changes, one common cause: the Cygwin/MobaXterm CR-taint pattern
|
|
that crashed OAuth on Bryan's v0.7.3 work-box with the cryptic error
|
|
`bash: ...: arithmetic syntax error: invalid arithmetic operator (error token is "")`.
|
|
|
|
- **OAuth/arithmetic CR fix.** `lib/oauth.sh` now routes every operand entering
|
|
a bash arithmetic context (`fetched_at`, `expires_in`, `now`) through a
|
|
dedicated `coerce_int` helper that strips non-digits at the source. The
|
|
failure mode: `$(date +%s)` against a Cygwin pty where Windows-native
|
|
`date.exe` shadows Cygwin `date` can return a CR-tainted epoch like
|
|
`"1779999999\r"`, which crashes the very next `$((expires_at - now))`.
|
|
Diagnosis in `Deliverables/2026-05-27-cloverleaf-larry-oauth-arithmetic-fix.md`.
|
|
|
|
- **Mouse mode is opt-in.** REPL mouse handling now defaults to OFF and is
|
|
enabled via `LARRY_MOUSE=1` env var or `/mouse on` slash command. Several
|
|
terminals (notably MobaXterm and stripped tmux) were swallowing the mouse
|
|
ANSI sequences and printing literal `^[[?1000h` garbage when v0.7.0 turned
|
|
it on unconditionally. Diagnosis in
|
|
`Deliverables/2026-05-27-cloverleaf-larry-mouse-regression-fix.md`.
|
|
|
|
- **CR-safety sweep across `lib/*.sh` and top-level scripts.** Three new
|
|
primitives in `lib/cygwin-safe.sh` (sourced by every tool family member):
|
|
- `coerce_int VAL [DEFAULT]` — for arithmetic and integer-test operands
|
|
- `strip_cr VAL` — for case patterns, regex tests, paths, HTTP headers
|
|
- `read_clean VAR [PROMPT]` — `read -r` wrapper that strips CR pre-assign
|
|
Hardened call sites:
|
|
- `larry.sh` — status-line `date +%s` / `tput cols`, three y/N approval
|
|
prompts (write_file, bash_exec, first-run auth), API-key paste,
|
|
first-run auth menu
|
|
- `lib/oauth.sh` — `cmd_login` and `cmd_refresh` `date +%s` captures
|
|
- `lib/nc-engine.sh` — five y/N action prompts (stop/start/bounce, resend,
|
|
route-test, testxlate, tpstest) + `find ... | wc -l` arithmetic
|
|
- `lib/nc-msgs.sh` — `parse_time_ms` `date` captures (4 sites),
|
|
meta-TSV `tm` field, `MSG_COUNT` `wc -l`
|
|
- `lib/nc-regression.sh` — `tr | wc -c` count, hl7-diff `?`-fallback
|
|
arithmetic
|
|
- `lib/nc-smat-diff.sh` — `A_COUNT`/`B_COUNT`/`DIFFS_TOTAL`
|
|
- `lib/nc-insert-protocol.sh` — every awk-emitted line-number that feeds
|
|
`head -n $((N-1))` / `tail -n +$((N+1))` arithmetic
|
|
- `lib/journal.sh` — `_next_seq` `wc -l` arithmetic
|
|
- `lib/lessons.sh` — `_next_id`, `cmd_list`, `cmd_count` arithmetic +
|
|
two y/N prompts (clear all, clear since)
|
|
- `lib/hl7-sanitize.sh` — `cmd_count` arithmetic + clear-table y/N
|
|
- `lib/ssh-helper.sh` — local + remote `wc -c` integer compares (4 sites)
|
|
- `lib/nc-find.sh` — `wc -l` count for `%d` printf
|
|
- `lib/nc-table.sh` — `$(date +%s)` in backup-filename construction
|
|
- `lib/nc-document.sh` — two `wc -l | %d` printf sites
|
|
- `larry-rollback.sh` — Proceed? y/N prompt
|
|
|
|
Reproduction (now exercised by `cygwin-safe.sh`'s in-line tests):
|
|
```
|
|
now=$(printf '%s\r' 1779999999); echo $((now - 1)) # pre-fix: crashes
|
|
now=$(coerce_int "$(printf '%s\r' 1779999999)" 0); echo $((now - 1)) # fix: 1779999998
|
|
```
|
|
|
|
Added `lib/cygwin-safe.sh` to `MANIFEST` so it auto-syncs to every running
|
|
client on next launch.
|
|
|
|
## v0.7.4 — 2026-05-27
|
|
|
|
- Drop GitHub fallback from auto-update. Single-source Gitea
|
|
(`https://git.bjnoela.com/bryan/cloverleaf-larry.git`).
|
|
|
|
## v0.7.3 — 2026-05-26
|
|
|
|
- Automatic PHI detection (tiered detection + blacklist contexts).
|
|
|
|
## v0.7.2 — 2026-05-26
|
|
|
|
- Gitea becomes primary auto-update origin; GitHub demoted to fallback.
|
|
|
|
## v0.7.1 — 2026-05-26
|
|
|
|
- Status line moves to between-turn position (post-input, pre-response).
|
|
- Status line below prompt; automatic PHI detection; session-artifact upload.
|
|
|
|
## v0.7.0 — 2026-05-26
|
|
|
|
- HL7-aware tab completion + REPL mouse mode (later made opt-in in v0.7.5).
|