cloverleaf-larry/CHANGELOG.md
bj ea9f4c2399 v0.9.0: broker mode is the DEFAULT — wire the remote kill-switch into every Cloverleaf-Larry
Phase 3 of the Larry remote kill-switch (Pax design; Mack's broker on .135 LAN
8181 / Tailscale 100.86.16.114:8181). Deployed Larry no longer holds a long-lived
sk-ant-… key: it holds a per-deployment enrollment secret, mints a short-lived
token from the broker, and routes every LLM call THROUGH the broker /v1/messages
(real key injected server-side). set-authorized <id> false => the deployment 401s
and dies, no box access required.

- LARRY_AUTH_MODE=broker is the DEFAULT (was apikey). Self-update flips existing
  installs to broker-mode too, so upgrading Gundersen delivers the kill-switch.
  Escape hatch (documented, not default): LARRY_AUTH_MODE=apikey (no kill-switch,
  never for PHI boxes).
- New lib/broker.sh: enroll+mint, fail-closed heartbeat, best-effort PHI wipe
  (reuses uninstall-larry.sh's shred/overwrite secure-delete + LARRY_HOME guard).
- Fail-closed preflight at launch + in-REPL heartbeat (default 60s, 3-miss budget):
  disabled => refuse to run (+ PHI wipe for profile:phi); unreachable past budget
  => refuse to run (NO wipe on a network blip — only an explicit disable wipes).
- call_api / call_api_stream broker branch: Bearer short-lived token, no x-api-key,
  token never on disk.
- install-larry.sh enrollment provisioning: LARRY_DEPLOYMENT_ID + LARRY_ENROLL_SECRET
  (+ LARRY_PROFILE/LARRY_BROKER_URL) baked 0600 + into the shim; box shows up in the
  dashboard ready to toggle.
- /auth reports broker state.

Reachability (flagged for Bryan): the broker is LAN + Tailscale only (no public
route). Egress-restricted boxes reach it over Tailscale (default URL = tailnet).
A box that can reach neither fail-closes = won't run (correct kill, useless work
state) — such a box MUST run Tailscale, or Bryan must stand up a hardened public
broker ingress.

Bug fixed in test: _broker_json_field jq `// empty` rendered literal false as
empty, mis-classifying a DISABLED deployment as an unreachable MISS (delaying
fail-close + skipping the PHI wipe). Fixed to `if has($k) then .[$k] else "" end`.
Verified end-to-end against the live broker: enroll -> mint -> proxied call ->
disable -> instant 401 + heartbeat fail-close + 5 PHI files shredded.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 23:10:09 -07:00

2002 lines
123 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters

This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.

# Changelog
All notable changes to `cloverleaf-larry` / `larry-anywhere` are recorded here.
Versioning is loose-semver; bumps trigger the in-process self-update on every
running client via `LARRY_BASE_URL` + `MANIFEST`.
## v0.9.0 — 2026-05-31
**Broker mode is the DEFAULT — the remote kill-switch is wired into every
Cloverleaf-Larry deployment, including self-update of existing installs.**
Phase 3 of the Larry remote kill-switch (Pax design brief
`Deliverables/2026-05-31-larry-remote-killswitch-design.md`; server broker by
Mack at `192.168.20.135:8181` / Tailscale `100.86.16.114:8181`). A deployed
Larry no longer holds a long-lived `sk-ant-…` key: it holds a per-deployment
enrollment secret, mints a short-lived token from the broker, and routes every
LLM call THROUGH the broker (`/v1/messages`), which injects the real key
server-side. Flip `set-authorized <id> false` in the broker and the deployment
401s and dies — no access to the box required.
### New / changed behavior
- **`LARRY_AUTH_MODE=broker` is the DEFAULT** (was `apikey`). Unset => broker.
Self-update flips existing installs to broker-mode too: upgrading Gundersen
delivers the kill-switch automatically, per Bryan's requirement. Escape hatch
(documented, not default): `LARRY_AUTH_MODE=apikey` keeps the legacy baked-key
rail (NO kill-switch) — never for PHI boxes.
- **New `lib/broker.sh`** — enroll+mint (`/enroll-mint`), fail-closed heartbeat
(`/authorized`), best-effort PHI wipe (reuses `uninstall-larry.sh`'s
shred/overwrite secure-delete + the same hard LARRY_HOME safety guard).
- **Fail-closed preflight** at launch (after self-update, before any model call):
`authorized:false` => for `profile:phi`, best-effort local PHI wipe, then refuse
to run; unreachable past `LARRY_HEARTBEAT_MAX_MISS` (default 3) => refuse to run
(NO wipe on a mere network blip — only an explicit disable wipes).
- **In-REPL heartbeat** every `LARRY_HEARTBEAT_INTERVAL` (default 60s): a
mid-session disable stops the next turn (and wipes PHI on a phi profile). The
per-call token re-mint in `call_api`/`call_api_stream` is the hard stop on top.
- **`call_api` / `call_api_stream` broker branch:** `Authorization: Bearer
<short-lived-token>`, NO `x-api-key`. Token never touches disk.
- **Enrollment provisioning** in `install-larry.sh`: pass `LARRY_DEPLOYMENT_ID`
+ `LARRY_ENROLL_SECRET` (+ `LARRY_PROFILE`, `LARRY_BROKER_URL`) at install →
stored 0600, baked into the `larry` shim so every launch is broker-mode and the
box shows up in the dashboard ready to toggle.
- **`/auth`** now reports broker state (deployment id, profile, heartbeat
cadence, token held) and confirms "no sk-ant-… key is stored — kill-switch ON."
### Reachability (the design tension, flagged for Bryan)
Broker mode means the client MUST reach the broker to function. The broker is
**LAN + Tailscale only — no public route.** On an egress-restricted box (the
Gundersen Cloudflare block that 28'd `git.bjnoela.com`), the client reaches the
broker over **Tailscale** (`LARRY_BROKER_URL=http://100.86.16.114:8181`, the
default). If a box can reach NEITHER LAN nor Tailscale, broker-mode fail-closes
= the agent will not run — a correct KILL state but a useless WORKING state. So
a real deployment on a locked-down network MUST run Tailscale (or Bryan must
stand up a hardened public broker ingress with its own auth). The installer and
README say this explicitly so no one ships a silently-bricked box.
### Bug fixed during integration test
- `_broker_json_field` used jq `// empty`, which renders a literal `false` as
empty — `authorized:false` would have mis-classified a DISABLED deployment as
an unreachable MISS (delaying fail-close past the miss budget and SKIPPING the
PHI wipe). Fixed to `if has($k) then .[$k] else "" end`. Verified: disable now
fail-closes immediately and the phi wipe fires.
## v0.8.34 — 2026-05-31
**Hardened `uninstall-larry.sh` into a first-class, healthcare-grade decommission
command — "update larry, then run one command."**
The v0.8.33 uninstaller removed the install footprint but did not meet the
operational decommission requirements for a PHI-handling deployment left on a
client box. This release makes `larry uninstall` (and the shipped
`uninstall-larry.sh` it delegates to) self-contained and idempotent: one run
stops everything, securely destroys cleartext PHI, scrubs creds, self-removes,
and tells you what you still must do at the source. The `larry uninstall`
early-dispatch in `larry.sh` is unchanged — it already delegates here.
### New / changed behavior
- **Stop everything (tolerant):** in addition to the precise pidfile kills (PHI
Presidio sidecar, reverse-SSH tunnel), it now `pgrep`+kills detached
`larry.sh` REPLs, `phi-presidio-sidecar`, and `larry-tunnel` keepalives by
command pattern. It NEVER kills itself, its parent, or any `uninstall-larry`
process (patterns + per-PID exclusion).
- **PHI hygiene (healthcare):** `log/auto-phi.log`, `sanitize/lookup.tsv`, and
`sessions/*.log.md` are SECURELY deleted FIRST — `shred -u -z -n 3` where
available; overwrite-then-rm fallback (zero + urandom passes) where `shred` is
absent (Windows/MobaXterm); truncate-then-rm if even `dd` is missing. It
prints HONESTLY per-file whether a real secure-delete was achieved, and warns
that overwrite-in-place is not guaranteed on SSD/CoW/Windows filesystems.
`find`-less hosts use a bash-glob fallback so session PHI is never skipped.
- **Shell-profile creds:** greps `.bashrc`/`.bash_profile`/`.profile` for
`ANTHROPIC_API_KEY|CLAUDE_CODE_OAUTH_TOKEN|LARRY_*|GITEA_TOKEN` and, by
default, backs up the rc (timestamped `.larry-uninstall.<ts>.bak`) then strips
only those lines. `--keep-rc` prints them for manual removal instead.
- **More shim/copy locations:** removes `~/larry`, `~/.local/bin/larry`,
`~/bin/larry`, `$LARRY_BIN_DIR/larry`, and a scp'd `~/larry-anywhere` — but
only OUR shims (auto-generated header or a launcher/symlink into `$LARRY_HOME`);
a foreign `larry` is left untouched.
- **Self-uninstall last:** when run from a standalone larry-anywhere checkout
(not from inside `$LARRY_HOME`), removes that checkout and the script itself —
only if real work was done this run and the dir is a full bundle.
- **Post-uninstall reminder:** prints the explicit "FINISH AT THE SOURCE" block —
REVOKE the Anthropic API key AND the Claude-Code OAuth grant (local removal
does not invalidate egressed copies), revoke any Gitea PAT / rotate SSH creds,
plus a BAA/PHI-disclosure reminder for healthcare deployments.
- **Safety guards:** refuses to operate if `LARRY_HOME` is empty/unset, `/`,
`$HOME`, a single-component-at-root path (`/.larry`), or anything that doesn't
look like a Larry dir — so a misconfigured env can never `rm -rf /`. Removal is
scoped strictly to the built target list (no glob expansion, no parent touch).
- New flags: `--keep-rc`, `--no-shred` (alongside the existing `--yes`,
`--dry-run`, `--keep-data`). DRY-RUN remains the default.
## v0.8.33 — 2026-05-29
**Two operator-requested features: a real uninstaller, and a deterministic-only
(no-API) mode.**
### 1. `larry uninstall` / `uninstall-larry.sh` — remove EVERYTHING the installer put down
There was no uninstall command before this. Now there is one, and it reverses
`install-larry.sh` exactly.
- **New script `uninstall-larry.sh`** (shipped by the installer into
`$LARRY_HOME`, manifest-synced) + a **`larry uninstall` subcommand** that
delegates to it. Runs BEFORE bootstrap/self-update/auth, so you can uninstall
even when the API/origin is unreachable (same early-dispatch pattern as
`larry tools`).
- **DRY-RUN BY DEFAULT.** `larry uninstall` (or `--dry-run`) only PREVIEWS the
full removal list and prints "would stop" for any running PHI sidecar / tunnel.
`larry uninstall --yes` actually deletes (with a `[y/N]` confirm unless `--yes`).
`--keep-data` removes program files but preserves user data
(sessions/journal/lessons/knowledge/sanitize/log + creds).
- **Footprint removed:** the whole `$LARRY_HOME` (shipped bundle + `bin/jq`
fallback + optional `phi-venv` + all runtime artifacts: sessions, journal,
lessons, knowledge, sanitize, log incl. `headers.log`, `.history`, `.origin`,
`.api-key`/`.env`/`.oauth.json`, `.ssh-creds`/`.ssh-sockets`/`.ssh-hosts.tsv`,
`known_hosts`, tunnel.*, every dotfile state marker) **and** the `larry` PATH
shim at `$LARRY_BIN_DIR/larry` (default `$HOME/bin/larry`).
- **SAFETY:** the shim is removed ONLY if it carries our auto-generated header
(`Auto-generated by install-larry.sh`) — an unrelated `larry` on PATH is left
alone. Background procs are stopped via THEIR OWN pidfiles only (never a
guessed PID). `rm -rf` is scoped strictly to the built target list — no globs,
no parent dirs. The installer NEVER edits your shell rc, so the uninstaller
doesn't either: it prints a one-line reminder to remove any PATH export you
added by hand. Cloverleaf sites / `$HCIROOT` are never touched.
### 2. `--no-api` — deterministic-only mode (ZERO LLM API calls, zero cost)
- **New flag `--no-api`** (env: `LARRY_NO_API=1`). In this mode larry makes
ZERO `/v1/messages` requests — no pay-as-you-go cost, ever. The interactive
REPL and ALL local/deterministic commands still work; a free-text prompt is
NOT sent to the model. Instead larry keyword-sniffs the prompt and points you
at the matching deterministic `larry tools <name>` command (the same
`lib/nc-*.sh` / `hl7-*.sh` tools the LLM would have called) — or at
`larry tools list` when nothing matches.
- **No key required:** the first-run auth prompt is skipped in `--no-api` mode
(no key is read because no key is used). The REPL prompt label shows
`you[no-api]>` instead of a model name.
- **Defense in depth:** `call_api` and `call_api_stream` HARD-REFUSE when
`LARRY_NO_API=1`, so even a stray codepath cannot silently bill a request. The
turn handler short-circuits upstream, so the guard normally never fires.
- **What's unavailable:** open-ended reasoning / free-form authoring (anything
that genuinely needs the model). Those surface a clear "unavailable in
--no-api mode" message rather than erroring obscurely or silently calling out.
Coverage reference: `Deliverables/2026-05-28-larry-deterministic-tool-coverage-plan.md`
— most of Bryan's NetConfig/HL7 operations already have deterministic tools.
## v0.8.32 — 2026-05-28
**★ CAPSTONE TOOL: `nc_provision_jumps` — point at a SITE and build the
cross-environment `server_jump` thread set for ALL inbound (root) threads,
routing the existing-env inbound feed (the ADT feed) into a NEW environment.**
Bryan's stated ultimate goal. PURE COMPOSITION of tools all validated in the
read-pass (v0.8.29) and write/mutate pass (v0.8.30) — it reimplements nothing:
`nc-inbound.sh` enumerates the inbound roots → `nc-parse.sh` reads each root's
`PROTOCOL.PORT` + `ENCODING``nc-make-jump.sh` generates the per-inbound
3-thread set + splice route → `nc-insert-protocol.sh` persists them → `journal.sh`
wraps the WHOLE batch in ONE session.
- **Invocation:** `nc-provision-jumps --site <SITE> --new-host <H> --new-port-base <BASE>`.
Flags: `--dry-run` (preview the FULL plan — every inbound + every block/route
that WOULD be created — write NOTHING), `--confirm yes` (required to write;
absent = plan-only refusal), `--filter REGEX` (limit which inbound roots),
`--scope tcp-listen|all`, `--netconfig` / `--new-netconfig` (explicit OLD /
server_jump NetConfigs; new defaults to the site NetConfig), `--hciroot`,
`--inbound-host`, `--process-jump`, `--format text|tsv`.
- **Per-inbound loop is name-driven** (tag = the inbound name, so it loops
cleanly): `jump_port = new_port_base + index`, collision-checked. For each root
it generates `linux_<name>_out` (OLD/site NetConfig), `windows_<name>_in` +
`windows_<name>_out` (NEW-env server_jump NetConfig), and splices a route onto
the OLD inbound's DATAXLATE → `linux_<name>_out` (existing routes preserved).
- **★ SINGLE-SESSION JOURNALING — the key safety property.** All 3N inserts +
N splice routes for the whole batch share ONE `LARRY_SESSION_ID`, so
`larry-rollback.sh --session <id>` undoes the ENTIRE provisioning run
byte-identical in one shot (not per-thread).
- **★ FAIL-SAFE — the batch is ALL-OR-NOTHING (robustness hardening).** Each
inbound's later steps are gated on its earlier steps: if ① (`linux_<x>_out`
insert) fails, ②/③/④ are skipped — so the ④ splice can never route to a
thread that was never created (no dangling DEST). On the FIRST hard failure
ANYWHERE in the batch the loop STOPS and auto-rolls-back the whole session via
the proven byte-identical `larry-rollback.sh --session <id> --yes`, then exits
6 with `provisioning failed at <inbound>; auto-rolled-back the batch`. If the
rollback tool can't run, it fails loud and prints the manual rollback command.
- **★ PRE-FLIGHT COLLISION CHECK against the EXISTING NetConfig (robustness
hardening).** Before ANY write (and surfaced in `--dry-run`) it scans the
target NetConfig — and the `--new-netconfig` when different — via `nc-parse`
(`protocol-summary`; never a hand-grep) for (a) any planned `jump_port`
already in use by an existing `PROTOCOL.PORT` (would create a duplicate
listener) and (b) any planned thread NAME already present. ANY collision
ABORTS before writing anything (exit 5), listing every conflict (inbound,
port/name, what it collides with). This complements the existing intra-batch
port/name guards.
- Splice route artifact path is now a proper `PLAN_ROUTE[]` array built at
plan time (was a fragile `${oldout%old_out.tcl}route_add.tcl` suffix-strip).
- **Regression handoff (capstone's SEPARATE, already-shipped half):** output ends
with a `roots: <csv>` line consumable directly as the `nc_regression` scope
(server / threads:<csv>, `route_test_cmd` via the exposed `nc_engine route-test`).
This tool does NOT run regression.
- **Wired into all 4 larry.sh surfaces** (manual-tools registry, `tool_nc_provision_jumps`
wrapper, `execute_tool` case, TOOLS_JSON schema).
- Verified on a COPY of the real 24-site integrator (`/tmp/clvf_jumps_test`, site
`epic` = 21 inbound roots; read fixture sha-confirmed untouched at
`fa129cfc…`): (a) full-site `--dry-run` listed all 21 roots → 63 new threads +
21 splice routes, wrote NOTHING (sha unchanged); (b) a real full-site provision
inserted 63 balanced-brace blocks + 21 splice routes (topology spot-checked:
`windows_<x>_in` listens on the jump port and routes internally to
`windows_<x>_out` → 127.0.0.1:<orig_port>; `linux_<x>_out` → new-host:jump_port;
OLD inbound keeps its existing dests + the new jump-out), all under ONE journal
session (84 entries, 1 session id); (c) `larry-rollback.sh --session <id>`
removed the ENTIRE batch BYTE-IDENTICAL (sha back to `fa129cfc…`); (d)
`--filter '^MFNfr'` correctly limited to 2 roots, and that batch also rolled
back byte-identical.
- Robustness re-test on a COPY (`/tmp/clvf_jumps_h`, site `epic` = 21 roots; read
fixture `/tmp/clvf_realtest` sha-confirmed untouched): (a) a CLEAN run still
provisions 21 roots → 63 blocks + 21 splice routes under ONE session (84
entries) and `--session` rollback is still byte-identical; (b) both `--dry-run`
AND a real run ABORT cleanly (exit 5, sha unchanged, no journal minted) when a
`jump_port`/NAME collides with the existing NetConfig (crafted via
`--new-port-base` = an existing listener port); (c) a simulated mid-batch
insert failure fail-fasts (exit 6), auto-rolls-back the whole session
byte-identical, and leaves NO dangling `linux_<x>_out` / splice.
- VERSION + `larry.sh` `LARRY_VERSION` → 0.8.32; MANIFEST regenerated (`--check`
clean); `bash -n` clean. (Version held at 0.8.32 — still unpushed; the
fail-safe + collision pre-flight harden the same release.)
## v0.8.31 — 2026-05-28
**★ NEW WRITE TOOL: `nc_set_field` — change a settable field (PORT, HOST/IP,
PROCESSNAME, ENCODING) on an existing thread, JOURNALED + rollback-reversible.**
Bryan's top-requested write feature ("changing port numbers and ip addresses").
Built on the exact journal/atomic-write foundation the v0.8.30 mutate pass proved
byte-identical-reversible — same idiom as `nc-table.sh` / `nc-insert-protocol.sh`
(snapshot → diff → atomic write → journal entry; undo via `larry-rollback.sh`).
- **Invocation:** `nc-set-field <thread>[.<site>] <field> <value>` (bare thread →
`$HCISITE`; `thread.site` cross-site; also `--site`). Flags: `--dry-run` (show
before→after, NO write), `--confirm yes` (skip the y/N prompt; still journaled),
`--netconfig PATH`, `--hciroot PATH`, `--completion` (emit a bash-completion
snippet: thread names + the field enum).
- **Curated safe set, explicit reject otherwise** — never blind-edits arbitrary
tokens and never CREATES a missing field: PORT → nested `PROTOCOL.PORT`; HOST
(alias IP) → nested `PROTOCOL.HOST`; PROCESSNAME → top-level; ENCODING →
top-level (must already exist). Field name is case-insensitive.
- **Anchored edit, NOT a global sed.** Locates the thread's protocol-block line
range via `nc-parse` (`protocol-line` + brace-balanced end walk), then the exact
field line within it (nested vs. top-level scoped), and replaces ONLY that value
token — preserving indentation + brace shape. A port/host SHARED by another
thread is provably untouched. Re-verifies balanced braces before the journal
write; a broken structure aborts with nothing written. No-op (value already set)
exits 4 cleanly.
- **Wired into all 4 larry.sh surfaces** (manual-tools registry, `tool_nc_set_field`
wrapper, `execute_tool` case, TOOLS_JSON schema) + the bash-completion snippet.
- Verified on a COPY of the real 24-site integrator (`/tmp/clvf_setfield_test`,
read fixture sha-confirmed untouched): dry-run shows before→after without writing;
a real PORT change (39500→39600) and HOST change (172.31.23.2→10.34.48.11) each
apply as a single surgical line edit (braces balanced, the two threads sharing
PORT 51205 untouched); both journaled; whole-session rollback restored the file
BYTE-IDENTICAL; an unsupported field (MLP_TIMEOUT, TYPE) and an unknown thread
are rejected cleanly.
## v0.8.30 — 2026-05-28
**★ WRITE/MUTATE TOOL VALIDATION PASS + journal-rollback foundation verified —
2 real bugs found & fixed.** Ran every config-mutating tool against a COPY of the
real 24-site Cloverleaf integrator fixture (`/tmp/clvf_writetest_integrator`, the
read fixture left untouched), proving each mutation (a) applies, (b) journals, and
(c) rolls back BYTE-IDENTICAL to the original. nc_table (add/delete/replace/create),
nc_add_route, nc_insert_protocol (end + after-anchor), nc_create_thread (tcpip/file,
±connect-from), nc_make_jump (gen + e2e insert), nc_tclgen (all 7 templates) all
verified. Rollback validated across ALL granularities: `--session`, `--entry`,
`--target`, `--last N`, the 2-entry chain unwind (newest-first), and the create→delete
path. The journal rollback foundation reliably restores byte-identical originals.
- **`nc-create-thread.sh` HOST brace-collision (malformed NetConfig).** The block
template inlined `{ HOST ${HOST:-{}} }`. When `--host` was supplied (the common
case for a tcpip client/outbound), the `}` closing the parameter default collided
with the template's closing brace, emitting `{ HOST 10.9.9.9} }` — one extra `}`,
brace-imbalanced (-1), invalid TCL that Cloverleaf would reject. Now the HOST field
is precomputed into `$HOST_FIELD` (real host, else `{}`) and referenced cleanly:
`{ HOST ${HOST_FIELD} }`. Only triggered with `--host`; the empty-host/file case was
already correct. `nc-make-jump.sh` was NOT affected (it uses `{ HOST ${host} }`).
- **`lessons.sh:142` printf option-injection.** Same class fixed in v0.8.29 (and
flagged there as out-of-scope): `printf '---\n\n'` parsed `---` as printf options
`printf: --: invalid option`, dropping the markdown rule from `lessons.sh export`
bundles. Now `printf '%s\n\n' '---'`. (The in-loop `printf '\n---\n\n'` on line 151
is safe — the leading `\n` shields the dashes.)
KNOWN / TRIAGE (not fixed this pass): the journal records the table `target` path with
the lowercase `tables/` segment from `nc-table.sh`'s `modify_via_csv` even when the file
lives under `Tables/` (capital). Restore works on case-insensitive filesystems (macOS
APFS, Windows) but could miss on a case-sensitive Linux mount. Low-risk for current hosts;
flagged for a path-normalization follow-up (do not patch blind — needs a real
case-sensitive box to verify against).
## v0.8.29 — 2026-05-28
**★ READ/INSPECT TOOL VALIDATION PASS — 6 real bugs found & fixed.** Ran every
read/inspect/analysis tool against a real 24-site Cloverleaf integrator fixture
(via BOTH the `lib/<x>.sh` path and the wired `execute_tool` dispatch), with real
thread/site/xlate/table names discovered from the config. Same class as the
v0.8.28 nc-engine arg-parse crash: each only surfaced when actually run. A BSD-awk
word-boundary nit in `nc-status.sh` not-up (gawk-only `\<up\>` → portable
`(^|[^a-z])up([^a-z]|$)`) was also caught at the Vera gate and folded in.
- **`nc-find.sh` `--name` mode returned ZERO matches (BSD/macOS).** The protocol-
name extraction used the GNU-only `sed \+`. BSD sed (macOS) treats `\+` as a
literal `+`, so the thread name came back empty and every `--name` hit was
silently dropped. Now POSIX-portable (`[[:space:]][[:space:]]*` /
`[A-Za-z0-9_][A-Za-z0-9_]*`). Worked on GNU/Cygwin hosts; broke on BSD.
- **`nc-find.sh` exit 1 on success for tsv/jsonl.** The trailing
`[ "$FORMAT" = "table" ] && printf ...` test left the script exit code at 1 for
non-table formats (the `&&` short-circuits). Added explicit `exit 0` — a
successful search (even zero matches) now returns 0; mis-signaled failure to
any `$?`/`&&` caller.
- **`nc-parse.sh` `tclproc-refs` dropped digit-leading proc names.** The `{ PROC`
/ `{ PROCS` regexes required a leading `[A-Za-z_]`, so a real Cloverleaf proc
like `3M_check_ack` was never reported — which also blanked `nc_find --tclproc
3M_check_ack`. Widened to `[A-Za-z0-9_]+`; `PROCSCONTROL` still excluded.
- **`nc-xlate.sh diff` could not find site-scoped xlates.** Unlike show/ops/tree/
summary, `diff` did not accept `--site`, so `locate_xlate` only checked
`$HCIROOT/Xlate` and always died `no such xlate` for the common case (xlates
under `<site>/Xlate/`). `cmd_diff` now takes `--site` (applied to both names)
and larry.sh forwards it.
- **`nc-diff-interface.sh` printf option-injection.** `printf '---\n\n'` (the
markdown horizontal rule) parsed `---` as printf options → `printf: --: invalid
option`, dropping the rule. Now `printf '%s\n\n' '---'`.
- **`nc-smat-diff.sh` printf option-injection (8 lines).** `printf '- A: ...'`
format strings starting with `- ` errored `printf: - : invalid option` in
NON-interactive bash, dropping every summary bullet. Guarded with `printf --`.
- **`nc-status.sh not-up` crashed on `--format`.** `cmd_not_up` accepted only
`--site`/`--filter`; the larry.sh dispatch ALWAYS appends `--format`, so
`not-up` died `unknown flag: --format` and was unusable via the wired tool.
`cmd_not_up` now accepts `--format` and forwards it to `cmd_threads`.
All other read/inspect tools (nc_paths up/down/full/all/site-only incl. the
cross-site ADTto_CodaMetrix chain, nc_destinations/sources, nc_list_*,
nc_protocol_*, nc_find_inbound, nc_document single+system, nc_revisions
timeline+diff, nc_msgs raw+field-filter+json, nc_xlate list/show/ops/tree/summary,
nc_xlate_refs, nc_tclproc_refs, hl7_field, hl7_diff, nc_diff_interface,
nc_regression 6-phase + chain-walk command-gen, nc_smat_diff pairing, nc_engine
+ nc_status graceful degrade, list_sites ignore-rules, all 7 nc_tclgen templates
verified `info complete` in tclsh) PASSED unchanged. Test matrix:
`Deliverables/2026-05-28-cloverleaf-v3-tool-test-matrix.md`.
KNOWN / TRIAGE (not fixed this pass): `hl7-sanitize.sh` is a silent no-op on
LF-delimited input (`RS="\r"` reads the whole file as one record) — fixing needs
a portable CR/LF normalizer (BSD awk has no regex `RS`); `nc_engine route-test/
testxlate/resend` ignore `--dry-run` (only stop/start/bounce/restart honor it) and
the dispatch never forwards it; `lessons.sh:142` has the same `printf '---'`
option-injection (out-of-scope write tool).
## v0.8.28 — 2026-05-28
**★ EXPOSE 5 lib-only tools as first-class LLM tools.** A roadmap audit found
~7 working `lib/*.sh` scripts that ran from the shell (`larry tools <name>`) but
were NOT wired into larry.sh's LLM `TOOLS_JSON`. This pass exposes the
READ/INSPECT/TEST ones (the config-mutating `nc-table` / `nc-create-thread` are
a separate next pass). Each tool now has all four larry.sh surfaces —
manual-tools registry, `tool_<name>` wrapper, `execute_tool` dispatch case, and
the `TOOLS_JSON` LLM spec — mirroring the v0.8.27 `nc_revisions` pattern. The
LLM tool count goes 38 → 43.
- **`nc_status`** ← `nc-status.sh` — engine RUNTIME status (sites/threads/
not-up/connections/queued/raw + `--site`/`--filter`/`--format`). READ-ONLY;
needs a live engine for real numbers, degrades gracefully on a config-only
host (sites → `config-present`, threads → reports the missing tstat binary,
no crash).
- **`nc_engine`** ← `nc-engine.sh` — engine process control + test drivers
(stop/start/bounce/restart/status + route-test/testxlate/tpstest/resend-ib/
resend-ob). State changes run only on a live box and are journaled; `status`
and the test drivers need a live engine. Unlocks the TPS-test (`tpstest`,
wraps `hcitps`) and route-test driver the regression tool needs.
- **`nc_xlate`** ← `nc-xlate.sh` — READ-ONLY xlate (.xlt) explorer
(list/show/ops/tree/summary/diff). Works fully on a config fixture.
- **`nc_smat_diff`** ← `nc-smat-diff.sh` — diff stored SMAT messages across two
envs, paired by MSH.10 (`--env-a`/`--env-b`/`--pair-on`). READ-ONLY; full run
needs two reachable archives.
- **`nc_tclgen`** ← `nc-tclgen.sh` — TCL UPOC skeleton generator (tps-presc/
tps-postsc/tps-iclkill/xlate-helper/trxid/ack/field-rewrite). Standalone,
deterministic, API-free.
- **`--help` header-leak fix (4 libs).** The sed comment-header range over-ran
into live code: `nc-status.sh` (`2,25p`→`2,20p`), `nc-engine.sh`
(`2,30p`→`2,29p`), `nc-xlate.sh` (`2,15p`→`2,11p`), `nc-tclgen.sh`
(`2,20p`→`2,19p`). Each now stops exactly at the end of the comment header —
no `set -o pipefail` / `NC_SELF=` / `die()` leaking into `--help`.
`nc-smat-diff.sh` (`2,22p`) was already clean.
- **FIXED `nc-engine.sh` arg-parsing crash (found during this exposure pass).**
The subcommand dispatcher treated EVERY `--flag` as taking a value
(`flags+=("$1" "${2:-}"); shift 2`), so the valueless `--dry-run` swallowed the
next token; and the `set -u` that leaks in from sourcing `journal.sh` made the
empty-array expansion `"${flags[@]}"` throw `unbound variable` under bash 3.2
(macOS/Gundersen), crashing `stop/start/bounce/restart` with `--site`/`--dry-run`.
Fix: `set +u` after sourcing `journal.sh` (nc-engine predates nounset and uses
bare `$1`/empty arrays), plus a multi-case parser — `--dry-run` is valueless,
`--site`/`--confirm` consume one value (without over-shifting on a bare trailing
flag). `status` and the test-driver paths were already unaffected.
## v0.8.27 — 2026-05-28
**★ NEW TOOL: `nc-revisions.sh` — NetConfig change-history / revision-diff.**
Shows how a THREAD, a multi-thread SYSTEM, or a WHOLE SITE changed over time by
diffing Cloverleaf's own per-save NetConfig revision SNAPSHOTS, annotated with
WHO saved each revision and WHEN. Deterministic, pure bash+awk, NO API, NO
network — runs identically on an API-blocked host.
- **Revision storage (verified on the real integrator).** Each save snapshots
the full NetConfig into `$HCIROOT/<site>/revisions/NetConfig<TS>/NetConfig`,
where `<TS>` is the save time as `M D YYYY H M S` with NO zero-padding (so
`NetConfig5212025121420` = 5/21/2025 12:14:20). Because the components are
un-padded the dir NAME is NOT lexically sortable across months/hours, so we do
NOT sort on it. WHO + DATE come from the snapshot's NetConfig PROLOGUE
(`who:` / `date:`, TAB-separated, leading-space-indented; the block is
`prologue … end_prologue`). The `date:` string (`Month DD, YYYY H:MM:SS AM/PM
TZ`) is parsed into a sortable key `YYYYMMDDHHMMSS` so the timeline is in true
chronological order. The live `<site>/NetConfig` is included last as
`(current)`; a secondary key (snapshot dir-timestamp; `~` sentinel for live)
breaks ties when two saves share the same last-editor prologue stamp.
- **Scoped, not whole-file.** The diff/summary is reduced to the relevant
protocol-block(s) via `nc-parse.sh` (single thread, or every name-matching
thread for `--system`), so it is NOT the whole 10k-line NetConfig unless
`--site` (whole-NetConfig scope).
- **Flags.** `--format timeline` (default) = plain-text who/when/summary, one
revision per block, summary counting routing threads added/removed/modified
in scope between consecutive revisions; `--format diff` = the scoped unified
diff between consecutive revisions, one block per transition. `--limit N`
(most-recent N), `--since <date>` (YYYY-MM-DD or YYYYMMDD). Invocation:
`nc-revisions <thread>` (defaults to `$HCISITE`), `<thread>.<site>`,
`<site>/<thread>`, `--system <pat> [--site S]`, or `--site <site>`.
- **Plain text + control-byte safety.** Output matches the v0.8.24/26 OneNote
conventions: no markdown. Timeline runs through `_sanitize_ctl`
(unconditional strip — human-readable artifact); `--format diff` through the
tty-gated `_sanitize_ctl_tty` (raw, byte-identical on a pipe/redirect so a
downstream consumer of the diff sees exact bytes). `--help` sed range stops at
the end of the comment header (no code leak).
- **Wired into `larry.sh`.** Added to the `tools` registry, the `tool_nc_revisions`
dispatcher, the `execute_tool` case, and a full `nc_revisions` LLM-tool schema
mirroring the other NetConfig tools.
Proved on the real test integrator (`HCIROOT=/tmp/clvf_realtest/integrator`):
`nc-revisions ADTto_uds.ancout2` rendered an 8-revision timeline spanning
Apr 2024 → Nov 2025 (correctly ordered across months despite un-padded dir
names), surfacing a `~1 modified` host change and the eventual `-1 removed` in
the live config; `--format diff` showed the scoped one-line `{ HOST 10.33.176.5 }
→ { HOST 10.34.48.11 }` change. `--system codametrix --site ancout --format diff`
showed `ADTto_CodaMetrix` added then its `{ PORT 8000 } → { PORT 39500 }` and
inbound-archive (`INFILE`/`INSAVE`) edits by BRYJOHN. `bash -n` clean on all
touched files; MANIFEST regenerated and `--check` passes.
## v0.8.26 — 2026-05-28
**★ HARDENING: extend the v0.8.25 control-byte sanitize across the whole tool
suite (Vera follow-up).** v0.8.25 fixed the terminal-corruption leak in
`lib/nc-document.sh` only. Vera flagged that the OTHER tools dumping
NetConfig/`.tcl`/HL7 content to stdout carry the SAME risk and were still
unsanitised — most importantly `nc-msgs.sh`, whose raw HL7 contains C0 framing
bytes (e.g. `0x1c` block separator) that corrupt a terminal when viewed
un-redirected.
- **Shared sanitiser.** `_sanitize_ctl` is hoisted out of `nc-document.sh` into
`lib/cygwin-safe.sh` (already the sourceable shared-primitives lib), so every
tool shares ONE definition. `nc-document.sh` now sources it; its behavior is
UNCHANGED — it still strips UNCONDITIONALLY (the doc is a human-readable
artifact; control bytes are unwanted even when redirected to a `.txt`).
- **New `_sanitize_ctl_tty` — the data-tool variant.** For the data-producing
tools (`nc-msgs.sh`, `nc-parse.sh`, `hl7-field.sh`, `hl7-diff.sh`,
`hl7-sanitize.sh`, `hl7-desanitize.sh`) stripping happens ONLY when stdout is
an interactive terminal (`[ -t 1 ]`). When the user pipes/redirects (e.g.
`nc-msgs … --format raw > input.msgs` feeding route_test, or `| awkcut`), the
output passes through RAW and byte-identical — the `0x1c` HL7 framing and
other bytes are LOAD-BEARING downstream and must not be silently corrupted.
Each tool's output region runs in a brace group piped through the gate, with
`${PIPESTATUS[0]}` propagated so subcommand exit codes survive the pipe.
(`hl7-schema.sh` is sourced-only — no stdout sink of its own — so it is
untouched.)
- **`ssh-helper.sh` `_read_hidden` trap nits (Vera).** (a) The restore trap was
installed only when `stty -g` succeeded, yet `stty -echo` ran regardless — a
`^C` in that window left echo off. Now a restore trap is installed on EVERY
path BEFORE touching echo (falling back to `stty echo` when the save failed).
(b) The trap reset was `trap -` (reset-to-default); it now captures the
caller's PRIOR `INT/TERM/HUP` trap with `trap -p` and restores it.
Proved on the real test integrator (`to_appriss.smatdb`, 11 msgs): `nc-msgs …
--format raw` to a file kept `0x1c`/`0x0d` intact (493 bytes), while the same
command through a PTY stripped ESC/FS; `cmp` confirmed they differ. `nc-document
--name codametrix` to a file still emits 0 control bytes with the em-dash
preserved. Portable: `LC_ALL=C tr` octal ranges + POSIX `[ -t 1 ]`; no GNU-only
flags. `bash -n` clean on all touched files. MANIFEST regenerated.
## v0.8.25 — 2026-05-28
**★ FIX: terminal line-editing corruption after `larry tools …` runs (Bryan,
reproduced on his live Cloverleaf box).** After running `larry tools nc-document`
(and `tbn`-class searches), his interactive shell's line editing broke —
backspace printed a literal `^H` instead of erasing, arrow keys died, and he had
to run `stty sane`/`reset` to recover. Two independent causes, both fixed
belt-and-suspenders:
- **Cause 1 — CONTROL-BYTE OUTPUT LEAK (the one Bryan hit).** `lib/nc-document.sh`
reads arbitrary NetConfig and `.tcl` proc source — author comments, the
`--raw-tcl` verbatim appendix (`cat "$apath"`), and xlate bodies — and emits it
to stdout unsanitised. Any raw C0 control byte in that content, **especially ESC
(0x1B)** and DECSET mode-switch sequences (`ESC[?1049h`, `ESC[?25l`, etc.), flips
the terminal into a different mode when the doc is printed (i.e. NOT redirected),
wrecking line editing. **Fix:** a single sanitiser at the one output choke-point
(`out_target` → new `_sanitize_ctl`) strips C0 controls except TAB/LF/CR via
`LC_ALL=C tr -d '\001-\010\013\014\016-\037\177'`. Applies to BOTH stdout and
`--out <file>`. High bytes (0x800xFF) are untouched, so UTF-8 names/comments
survive. Portable to AIX/Linux/BSD/Cygwin (POSIX `tr` octal ranges; no GNU-only
flags). The non-interactive `tools <name>` dispatch in `larry.sh` itself touches
NO termios (it `exec`s the lib tool before any trap/mouse-mode is installed),
confirming the leak was the tool's own output bytes.
- **Cause 2 — TTY-MODE LEAK on SIGINT (latent, hardened).** `lib/ssh-helper.sh`
read hidden passwords with a bare `stty -echo … read … stty echo`. A Ctrl-C (or
EOF/signal) BETWEEN the two `stty` calls left echo permanently OFF — terminal
corrupted until `stty sane`/`reset`. **Fix:** new `_read_hidden` saves the full
prior termios with `stty -g` and restores it via a `trap … INT TERM HUP` plus an
explicit restore on normal return, so every interrupt path restores the tty.
Reads from `/dev/tty`. Both hidden-password prompts (`cmd_pass`, `cmd_setup`
re-prompt) now route through it.
`bash -n` clean on all touched files. MANIFEST regenerated.
## v0.8.24 — 2026-05-28
**★ PLAIN-TEXT output rewrite of `lib/nc-document.sh` (Bryan's priority — OneNote
readability).** Bryan shares the generated interface doc with a teammate in
OneNote, which does NOT render markdown — `#`, `**bold**`, `| pipe tables |`,
`---` rules and backticks all show as literal junk. The OUTPUT layer is rewritten
to emit clean PLAIN TEXT by default (parsing/extraction logic unchanged):
- **Default = plain text everywhere.** UPPERCASE headings underlined with a
dashes line; no `**bold**`, no backtick code spans, no `---` horizontal rules,
no `| pipe tables |`. Reads cleanly in any editor and pastes straight into
OneNote.
- **Tabular sections (Message Flow, Delivery breakdown) — two render modes.**
DEFAULT = indented `label:value` blocks (one block per hop/row; reads in any
font, zero setup). New `--onenote-table` flag renders those same sections as
TAB-separated rows (header row + one data row per record, real `\t` between
cells, NO leading/trailing pipes) — paste into OneNote then Insert > Table to
get real cells. Non-tabular sections (Title/Context/Description) stay plain
text in both modes.
- **Removed** the verbose `Filter / translation logic (surfaced deterministically
from the referenced UPOC procs):` preamble line; the surfaced UPOC bits now sit
under a simple `FILTER / TRANSLATION LOGIC` plain heading, still inline in each
interface description (always on).
- **Raw TCL is now OPT-IN behind `--raw-tcl`** (off by default). The readable,
deterministically-extracted UPOC bits stay inline in the per-interface
description regardless; only the verbatim raw-proc appendix is gated. The
appendix is rendered as a plain indented source dump (no ```` ```tcl ```` fences).
- New flags `--onenote-table` and `--raw-tcl` documented in the lib `--help` and
in the `nc_document` tool schema/wrapper in `larry.sh` (`onenote_table`,
`raw_tcl` booleans). The `--help` `sed -n` range was re-pinned to the END of the
comment header (`2,92p`) so it does not leak live code.
**Folded in two Vera-deferred v0.8.22 minors** (`lib/nc-document.sh`):
- Removed dead/unused `sup`/`fan` counters in `_xlate_filter_block`.
- Added `local` to the previously un-`local`'d `_dest_hit`/`_d` loop variables in
`document_thread`.
Verified against the real integrator (`HCIROOT=/tmp/clvf_realtest/integrator`,
`--name codametrix`): default output is plain text with `label:value` hop blocks
and no markdown chars; `--onenote-table` produces tab-separated rows; the removed
preamble line is gone; raw TCL only appears with `--raw-tcl`.
## v0.8.23 — 2026-05-28
**★ REGRESSION CHAIN-WALK route-test capture (Bryan's priority).** New
`--chain-walk` mode in `lib/nc-regression.sh` — a single-env (per-env) capture
that runs `route_test` at every routing(inbound) thread along an `nc-paths`
chain and chains each step's per-destination output into the next step's input.
Orthogonal to the existing 6-phase env-A/env-B pipeline; Bryan runs it on env-A
and env-B with the same START input, then the existing `hl7-diff --ignore MSH.7`
diff phase compares the captured outputs.
Workflow (verbatim Bryan spec):
- Take a START thread + N. Grab the N most-recent messages from the START
inbound's SMAT via `nc-msgs --limit N --format raw``input.msgs`.
- Resolve the downstream chain(s) from START with `nc-paths --format jsonl`.
`--target <site/thread>` restricts to the single chain ending at that node;
with no `--target`, ALL fan-out branches are walked (each branch a chain dir).
- The route_test ENTRY threads are the START node plus every node that
immediately follows a cross-site `==>` hop (the remote inbound). Outbound
*sender* nodes (e.g. `OB3_RAD_ordersS`) are NOT entries — they are produced as
`.out.<sender>` files by the upstream route_test.
- At each entry E: `hciroutetest -a -d -f nl -s <out_base> <E> <input>` — the
command string is mined VERBATIM from the v1/v2 wrapper
(`cloverleaf_tools/tools/route_test_wrapper.py:10/149-156`); `-f nl` yields
NEWLINE output directly (no manual `len2nl`+delete). `-a` (all routes) writes
every fan-out branch's output in one invocation.
- route_test writes ONE FILE PER DESTINATION named `<out_base>.out.<DEST>`
(`.out.<DEST>` naming confirmed from `legacy_workflow_commands.py:1015`). File
SELECTION: the node immediately AFTER E in the chain is the suffix selected to
feed the NEXT step. For the cross-site boundary `S ==> R`, the selected
`.out.<S>` payload becomes the input fed to the next site's inbound `R`.
- PRODUCES under `$OUT/chain/<NN>/`: `input.msgs` (original START messages),
`step-NN.<E>.out.<DEST>` (per-destination outputs), `step-NN.<E>.selected`
(the chosen next-step input), `commands.sh` (the exact generated command
sequence), `chain.txt` (the resolved v1 chain).
`hciroutetest` is a Linux engine binary needing a live engine — NOT runnable off
the server — so chain-walk GENERATES the orchestration + command STRINGS +
file-chaining/selection and stubs the engine call under `--dry-run`; real
execution is on the server (source the Cloverleaf profile first). Everything that
does NOT need the engine is self-verified against the real integrator: the
nc_paths chain resolution, the nc-msgs message-grab command, the per-step
route_test command strings, and the `.out.<DEST>` → next-step SELECTION.
New flags: `--chain-walk --start <site/thread|thread> [--site SITE] --count N
--hciroot HCIROOT --out DIR [--target <site/thread>] [--route-test-bin BIN]
[--dry-run]`. Portable bash (Win+Linux), pure bash+awk, **API-FREE**, no
python/.pyz. Existing phase pipeline untouched.
Verified on `ORUto_CodaMextrix orders`
(`mux/RadOrdfr_epic_972310 --> mux/OB3_RAD_ordersS ==> orders/IB3_RAD_muxS -->
orders/ORUto_CodaMextrix`): 2 route_test entries (`RadOrdfr_epic_972310`@mux,
`IB3_RAD_muxS`@orders); start grab + the exact per-step `hciroutetest` strings +
the `.out.OB3_RAD_ordersS` (cross-site) → `.out.ORUto_CodaMextrix` (terminus)
selection. No-target run walks all 6 `IB3_RAD_muxS` fan-out branches.
## v0.8.22 — 2026-05-28
Interface **`document`** tool follow-on (`lib/nc-document.sh`, `inbound-systems.tsv`,
cheatsheet). All changes remain deterministic, pure bash+awk, **API-FREE**.
**★ Xlate-internal filtering & fan-out (Bryan).** The route's `.xlt` is now parsed
deterministically for the three ops that change the message COUNT, and they are
called out in the Description prose, a new **"Xlate filtering & fan-out"**
subsection, and the delivery breakdown's XLATE line:
- `OP SUPPRESS`**FILTERING** — the message/segment is dropped; the governing
`OP IF` condition is surfaced (e.g. "message SUPPRESSED when `@medicopia_fac eq =KILL`").
- `OP SEND`**FAN-OUT** — an extra output copy is emitted mid-translation
("message cloned/multiplied here"); conditional sends show the `when …` clause.
- `OP CONTINUE`**FAN-OUT** — translation continues after a send (the companion
that yields a second distinct message).
Pure-awk brace-depth + IF-frame-stack parse of the xlt; no API. A pure-pathcopy
xlate (e.g. `Epic_ADT_CodaMetrix.xlt`) correctly yields no subsection.
**★ Configurable inbound-systems lookup + known-feed hard-map (Bryan).** A new
curated `inbound-systems.tsv` (`<key>\t<upstream system identity>`, key on the
feed thread name OR `port:<n>`) deterministically labels the external sender in
the Message Flow "From"/feed row. On NO match it falls back to the honest generic
`Epic (process <name>)` — it never fabricates (Vera's fabrication concern: the map
is curated, not guessed). Resolution order: `$LARRY_HOME/inbound-systems.tsv`
shipped seed; override with `--inbound-systems PATH` or `$INBOUND_SYSTEMS_FILE`.
Seeded with known feeds (`ADTfr_epic_964700` → "Epic AIP 964700 (ADT)", etc.).
The installer seeds it copy-if-missing (never clobbers edits); it is intentionally
NOT in the MANIFEST so self-update never overwrites curation.
**Vera Minor-1 (cosmetic):** `--help` no longer leaks shell code — the header
`sed` range now stops at the end of the usage block (was `2,72p`).
**Vera Minor-2:** new optional `--strict-delivery` gate for SYSTEM mode — a thread
counts as a delivery only if it has a real downstream endpoint (non-empty
`PROTOCOL.HOST`/`PORT`) or `OUTBOUNDONLY=1`, excluding reply-only outbounds a broad
`--name` would otherwise sweep in. Default OFF (preserves prior behavior). Verified:
`--name Infor` drops the reply-only `Empfr_Infor` under the strict gate; `codametrix`
still yields its 2 deliveries.
**Vera Minor-3:** list-form `{ DEST {a b c} }` routes are now captured in
`_routes_of` (was single-form `{ DEST <thread> }` only, mirroring nc-parse's index
parser), so a delivery reachable only via a multi-dest route is matched instead of
missed. `route_thr` also falls back to the authoritative nc_paths chain's
penultimate node when the one-hop `sources` primitive misses, so the route/xlate
breakdown stays populated.
**Cheatsheet drift (Vera):** `nc_make_jump` row + the jump-thread-pattern section
now name the actual three generated threads (`linux_<tag>_out`, `windows_<tag>_in`,
`windows_<tag>_out`) instead of the stale `to_/fr_<inbound>_server_jump` pair.
Bonus: fixed a pre-existing latent `printf: --: invalid option` warning on the
`---` footer separator (now `printf '%s\n\n' '---'`).
## v0.8.21 — 2026-05-28
Interface **`document`** tool rebuilt (`lib/nc-document.sh`) — documents a
Cloverleaf interface end-to-end in Bryan's confirmed Legacy "ADT Messages"
template. Deterministic, pure bash+awk, **API-FREE** (the whole tool runs
identically on an API-blocked host like Gundersen — no python, no `.pyz`, no
network).
**Two modes.**
- SINGLE INTERFACE: `nc-document.sh <thread> [site]` (or `<site>/<thread>`),
e.g. `ADTto_CodaMetrix ancout` — one fully-detailed interface section.
- SYSTEM/PATTERN: `nc-document.sh --name <pattern>`, e.g. `--name codametrix`
one section per matching DELIVERY (outbound) thread across all sites. A delivery
is any thread that is NOT an inbound listener (`ISSERVER=1`) and NOT an ICL/file
inbound router (`OBWORKASIB=1`), so it catches both `OUTBOUNDONLY=1` threads and
bidirectional `Xto_*` deliveries (e.g. `DFTto_codaMetrix`, `OUTBOUNDONLY=0`).
**Per-interface output (Legacy template):**
- **Title** = the interface / message type.
- **Description** prose — what the messages are, the TRXID filter that selects the
delivery, where translation happens (xlate vs raw), seeded from the surfaced
UPOC bits.
- **Message Flow** table `Platform | Action | Description | From | To` — Epic feed
→ Cloverleaf routing → Final Delivery, one row per hop. The routing row uses
`nc-paths.sh` and adapts its wording to whether the chain crosses a site
boundary (cross-site `destination`-block hop `==>` vs intra-site DATAXLATE route
`-->`).
- **Delivery breakdown** — Flow chain; how-received (inbound `PROTOCOL`
TYPE/HOST/PORT/ISSERVER + `ICLSERVERPORT`); inbound TRXID/TPS proc
(`DATAFORMAT.PROC`); the route's TRXID filter + WILDCARD; route TYPE;
PREPROCS/POSTPROCS; XLATE; destination host:port / process / type.
**★ Deterministic UPOC-bits extraction (the key feature).** For every referenced
UPOC proc (inbound TRXID/TPS proc + each route's PRE/POST/PROCS), locate its
`.tcl` under `$HCIROOT/<site>/tclprocs/` (home site first, then any site) and
surface — with NO API — into the Description: (1) the proc's **comments** (header +
inline `#` filter notes), (2) **HL7 fields** referenced (dotted `PID.8` + the
underscore `PV1_3_3` form normalized to dotted), (3) literal **event-code matches**
(`A01 A02 A03 …`, boundary-checked), (4) **table lookups** (`tbllookup` / `.tbl`,
e.g. `PeriCalm_Loc`), (5) **disposition** (CONTINUE/KILL/return → "pass matching /
kill non-matching"). Rendered compactly, e.g.:
`UPOC Epic_PeriCalm_ADT_pass — … · fields: PV1.45 PID.8 · matches: A02 A03 · table: PeriCalm_Loc · disposition: pass matching / kill non-matching`.
The **raw proc TCL** is included verbatim in a plainly-labelled `## Referenced
proc source` appendix for audit — with **no "summarize by hand / on an API box"
marker** (the surfaced bits ARE the content).
**LLM polish (enrichment, NOT in the bash tool).** The bash tool calls no API. The
`nc_document` tool schema now instructs the model, when run WITH the API, to
transparently polish the surfaced UPOC bits into smoother filter prose in the
Description (no marker, no special mechanism). On API-blocked hosts the
deterministic bits + appendix ARE the deliverable.
**Portability fixes baked in:**
- All extraction awk is `\b`-free (BSD/BWK awk on macOS + mawk on Windows Git-Bash
silently match nothing on `\b`); token boundaries use explicit char-class scans.
- Internal records are `\037`(US)-delimited, not TAB — bash `read` with a TAB IFS
collapses CONSECUTIVE empty fields and was silently shifting columns when an
ICL/file inbound has empty HOST/PORT/ISSERVER. Inbound facts are read into named
globals for the same reason.
- Route parser walks the real DATAXLATE depth map (route sub-blocks at depth 3,
DEST/TYPE/XLATE at depth 6, inner `{ PROCS <name> }` at depth 8), so per-route
TRXID/TYPE/XLATE/PREPROCS extraction is exact.
**Wiring.** `tool_nc_document` (larry.sh) now takes `thread`/`name`/`site`; the
`nc_document` tool schema documents single-thread + system modes and the
UPOC-bit-polish instruction. `larry tools nc-document` drives the same script
standalone (no API).
**Verified on the REAL integrator** (`HCIROOT=/tmp/clvf_realtest/integrator`, the
24-site QA env): generated `ADTto_CodaMetrix ancout` (matches Larry's verified
prototype — flow `mux/ADTfr_epic_964700 --> mux/OB_ADT_ancS ==> ancout/IB_ADT_muxS
--> ancout/ADTto_CodaMetrix`, inbound proc `trxId_IB_ADT_muxS`, TYPE xlate, XLATE
`Epic_ADT_CodaMetrix.xlt`, dest `172.31.23.2:39500` process ADT); the `codametrix`
system doc (2 deliveries: the ADT feed + the intra-site `DFTto_codaMetrix` DFT
feed); and the PeriWatch route, proving the UPOC-bits extraction surfaces real
content from `Epic_PeriCalm_ADT_pass` (comments, fields, `A02`/`A03` matches,
`PeriCalm_Loc` table, pass/kill disposition). `bash -n` clean; TOOLS_JSON valid.
## v0.8.20 — 2026-05-28
Route-chain tracer (`lib/nc-paths.sh`) REARCHITECTED for the real integrator:
parse once, walk in memory; cross-site linking corrected from a port-match
heuristic to authoritative **`destination`-block** resolution. v0.8.20 was never
shipped — this entry supersedes the earlier port-based draft of the same version,
which FAILED Bryan's real-integrator smoke (24-site QA env): catastrophically slow
AND missing real cross-site feeders.
**Problem (measured on the real 24-site integrator, before this fix):**
- `nc-paths.sh ADTto_CodaMetrix ancout --site-only` → correct chain but **84 s**.
- full (no flag) → same single chain, **164 s**.
- `--down` → "unknown flag".
- Root cause: the walker invoked `nc-parse.sh` as a SUBPROCESS per hop / per
candidate (`destinations`/`sources`/`protocol-nested`/`protocol-field`/
`list-protocols`), and each invocation re-ran `_blocks` + `cmd_protocol_block`
— two full awk passes over the (16K-line) NetConfig. O(threads × parse-cost) =
minutes. Even the intra-site walk was a bottleneck (`sources` scans every body).
- Correctness: the draft linked sites by matching an outbound's `PROTOCOL.PORT`
to an inbound's listen/ICL port. That MISSED the real mux feeder of ancout's
`IB_ADT_muxS` (port 62043) — because no thread has `PROTOCOL.PORT 62043`; the
link is expressed only through a `destination` block.
**1. Single-pass index (`lib/nc-parse.sh` new `index` subcommand, `cmd_index`).**
ONE awk pass per NetConfig emits a flat record stream the walker needs:
`P` protocol, `D` DEST edge (handles BOTH `{ DEST name }` and the list form
`{ DEST {a b c} }` — the list form was silently dropped by the old
`cmd_destinations` regex), `L` listen port (server `PROTOCOL.PORT` with
ISSERVER=1 and/or guarded `ICLSERVERPORT`), `O` outbound dest port, and
`X <destname> <site> <thread> <port>` — the resolution of a top-level
`destination` block. Indexing all 24 live NetConfigs is <1 s.
**2. In-memory route graph + in-memory walk (`lib/nc-paths.sh`).** The index loads
once into bash associative arrays (`G_PROTO`/`G_DESTS`/`G_LISTEN`/`G_OUT`/
`G_DESTBLK`/`G_INSRC`/`G_DESTBLK_REV`; `_load_nc`, `_build_in_sources`,
`_build_graph`). `_walk_down`/`_walk_up` and the one-hop primitives
(`_outgoing`/`_incoming`/`_xsite_down_targets`/`_xsite_up_feeders`) are now pure
O(1) lookups NO subprocess and NO re-parse per hop. Cycle test is a bash
substring match (`_seen_has`), not a `grep` fork per hop.
**3. Cross-site link corrected to `destination` blocks.** Cloverleaf links sites
through the named ICL destination table: a thread's DATAXLATE `DEST` may name
either a LOCAL protocol (intra-site hop) or a `destination` block, which resolves
to `{ SITE }` `{ THREAD }` `{ PORT }`. A `DEST` naming a destination block is the
cross-site hop, resolved by NAME to the exact remote (site,thread). The `PORT`
equals the remote thread's listen/ICL port (corroboration), but it is never the
primary key. `ICLSERVERPORT` is still read GUARDED in the index (absent/`{}`
skipped, never the un-guarded `keylget` that crashed v2 `paths.tcl`).
**4. `full` mode = upstream × downstream JOIN at the thread.** No more
O(sites × threads) entry-chain scan (Vera m3). The complete chain is the thread's
upstream feeder chains (each ending AT the thread) joined to its downstream chains
(each starting AT the thread); both walks follow destination blocks, so the join
spans sites naturally.
**5. Flag standardization.** `--down`/`--up` are now accepted as aliases of
`--downstream`/`--upstream` in `nc-paths.sh` itself (they already worked via the
`/paths` slash handler; the bare script rejected them).
**6. Intra-site hops UNCHANGED in semantics** still the DATAXLATE `DEST` list,
never an `ICLSERVERPORT` walk.
**7. Removed:** the port-match cross-site index (`_build_port_index`, the `PI_*`
arrays), the per-hop subprocess primitives (`_proto_port`/`_proto_isserver`/
`_icl_port`/`_norm_port`), and the dead `_nc_for_site` helper.
**Verification — RE-MEASURED ON THE REAL 24-SITE INTEGRATOR** (tarball
`cloverleaf_test.tar.gz`, HCIROOT = extracted `integrator/`):
- `ADTto_CodaMetrix ancout --site-only`: **84 s → 0.66 s**.
- `ADTto_CodaMetrix ancout` (full): **164 s → 1.0 s**.
- whole-tree `--all` (all 24 sites, 709 chains): **4.3 s** (well under a minute).
- `--down` / `--up`: now valid flags.
- REAL cross-site chain proven: `mux/ADTfr_epic_964700 --> mux/OB_ADT_ancS ==> ancout/IB_ADT_muxS --> ancout/ADTto_CodaMetrix`. `IB_ADT_muxS`'s upstream feeder lives in the `mux`
site and reaches ancout via destination block `OB_ADT_ancS`
(`{ SITE ancout } { THREAD IB_ADT_muxS } { PORT 62043 }`) exactly the feeder
the port-match draft missed and Bryan asked for. Multi-site fan-out is
site-correct (each destination block resolves to its own site's `IB_ADT_muxS`).
- `--site-only` confirmed to suppress all cross-site hops.
- `bash -n` clean (`nc-paths.sh`, `nc-parse.sh`, `larry.sh`); `/paths` +
`tool_nc_paths` drive clean under `set -u`; MANIFEST regenerated & `--check` OK.
- No-traffic-bypass preserved (read-only NetConfig parsing; no engine/network
calls; pure bash + awk, no python/.pyz; portable Win + Linux).
## v0.8.19 — 2026-05-28
Deterministic route-chain `nc_paths` tool the #1 fix from the deterministic
tool-coverage plan (Clover). The on-server LLM had NO transitive route-chain
tool: to answer "show me the path / what feeds X / full route" it brute-forced
the whole NetConfig with grep/read_file/bash_exec + a long chain of
nc_destinations calls (the ~$1 prompt Bryan hit, which still gave up before
unrolling the chain). This release wires up a deterministic enumerator so the
model makes ONE call.
**1. New single walker backend `lib/nc-paths.sh`.**
A DFS path-enumerator that ports the v2 `paths` semantics
(`cloverleaf_tools/cli/legacy_workflow_commands.py` paths_cmd +
`_enumerate_downstream_paths`/`_enumerate_upstream_paths`/`_enumerate_all_full_paths`).
Output columns **SITE THREAD HOPS PATH** HOPS = thread count in the chain,
PATH = the chain joined by ` -> ` (one row per enumerated root-to-leaf path; a
branching thread yields multiple rows). Matches Bryan's exact format
(`pharmacy / pharm_adt_in / 2 / pharm_adt_in -> pyxismed_crh_adtorm_out`).
- **DEST-based routing, never ICLSERVERPORT.** Next hop is resolved ONLY from
the DATAXLATE `{ DEST <name> }` list (via `nc-parse.sh destinations/sources`).
Bryan's old `paths.tcl` walked via `keylget data ICLSERVERPORT`, which THROWS
on any thread lacking that key (every outbound/client thread), so the trace
died on the first client thread. The DEST list is present on every routing
thread regardless of direction and yields nothing (no crash) when absent — the
v2 paths.tcl crash cannot recur here.
- **All-mode** (`--all`): enumerates every chain from every entry point (a thread
with no incoming), deduped — the whole-site/environment chain inventory
(covers gap #2, v2 `list_full_routes`).
- **Cross-site BY DEFAULT** (Bryan's resolved decision): when a chain's terminal
thread is also an entry thread in another site's NetConfig (correlated by
shared thread name), the walk CONTINUES into that site — the
mux -> ancout -> CodaMetrix chain is followed end to end. `--site-only` scopes
to a single site.
- **Robust cross-site cycle detection.** Every walk carries the full ancestor
set keyed by (site,thread); revisiting an ancestor terminates that path (the
terminal node is still emitted), plus a global max-depth backstop (128, v2
parity). Always terminates — verified against a deliberate cross-site cycle.
- Formats: table (aligned), tsv, jsonl.
**2. Consolidated the walker backend (no second dark walker).**
Removed the never-wired `cmd_chain` BFS-node-set command from `nc-parse.sh`
(it only emitted a flat set of reachable nodes, never enumerated paths, and was
invisible to the LLM). `nc-paths.sh` is now the SINGLE route-chain backend; the
`nc-parse.sh chain` subcommand now errors with a pointer to it.
**3. Wired `nc_paths` into the LLM (the critical piece).**
- `larry.sh` — new `tool_nc_paths` wrapper (table output routed through
`_fence_aligned_table`; tsv/jsonl pass unfenced), a `nc_paths)` dispatch case,
and a `{"name":"nc_paths", ...}` schema entry. The schema description steers
the model to use `nc_paths` for ANY "show me the path / trace the chain / what
feeds X / full route / end-to-end flow / sources+destinations chain" question
instead of grep_files / read_file / bash_exec / repeated nc_destinations.
- `nc_sources` schema note tightened to "ONE HOP ONLY — use nc_paths for chains."
**4. Ergonomics (manual entry + slash command).**
- `larry tools nc-paths <thread> <site> [--site-only]` runs the enumerator
standalone (no API). Registry entry added to `larry tools list`.
- `/paths <thread> [site] [--up|--down|--site-only|--all|--format ...]` REPL
slash command (defaults `site` to the current `$HCISITE`). The fuller `tbn` +
`<thread>.<site> <subcmd>` shell-shim ergonomics remain a separate follow-on
pending Bryan's cheat sheet — not built here.
Zero traffic-bypass primitives.
## v0.8.18 — 2026-05-28
Readable terminal output + two DIRECT-mode follow-ups from Vera's v0.8.17 gate
(Clover). larry runs in a plain monospace terminal that does NOT render
markdown — this release makes tabular results actually line up and site lists
read vertically, and closes the deferred `ssh_push` direct-mode gap plus the
duplicated direct-ssh options.
**1. Readable output in a monospace terminal (Bryan's primary ask).**
The on-server LLM was re-rendering tool results as markdown tables
(`| col | col |`, `|---|`) — which print raw and never align — and collapsing
the site list into a comma-joined inline sentence. The tools already emit clean
data (nc-find/nc-inbound pre-align columns with `%-*s`; `list_sites` prints one
site per line with a `sites: N (excluded: …)` headline). The fix steers the
model to PRESERVE that, and reinforces it at the tool boundary so it can't drift:
- `agents/larry.md` — new **TERMINAL OUTPUT CONTRACT** section (5 rules): never
emit a markdown table; reproduce a tool's pre-aligned/fenced table VERBATIM in
a ```text fence; never hand-align columns (the tools do it deterministically);
render entity lists ONE PER LINE vertically (never comma-joined inline), and
keep `list_sites`'s count headline + one-per-line list as returned.
- `larry.sh` — new `_fence_aligned_table` helper wraps an already-aligned tool
table in a ```text fence with an explicit "reproduce VERBATIM; do NOT convert
to a markdown table" marker. `tool_nc_find` and `tool_nc_find_inbound` route
their `--format table` output through it (tsv/jsonl data formats pass through
unfenced). Empty / error / usage output passes through UNCHANGED (the operator
must still see errors plainly); the table bytes are never altered — only the
fence + marker lines are added.
**2. `cmd_push` DIRECT-mode branch (closes Vera's deferred MAJOR).**
`cmd_push` called `_resolve_open_master` unconditionally, so a DIRECT-mode alias
(no ControlMaster) died "no open master" — breaking `ssh_push` (exposed tool;
used by `nc_regression` phase 4 to push cross-env input bundles, central to the
`epic_adt_in` cross-env goal). Added the symmetric direct branch mirroring
`cmd_pull`: `_direct_scp <alias> <local> <addr>:<remote>` for the transfer, then
post-transfer size verification via `_dispatch_remote` (fresh per-command
`_run_direct`). `lib/ssh-helper.sh`.
**3. De-duplicated the direct-ssh options (closes Vera's MINOR).**
The shared direct-mode `-o` flags (the five security-critical ones —
`PreferredAuthentications=password`, `PubkeyAuthentication=no`,
`StrictHostKeyChecking=accept-new`, `ControlMaster=no`, `ControlPath=none`
plus `NumberOfPasswordPrompts=1` and `ConnectTimeout`) were copied in three
places (`cmd_setup` probe, `_run_direct`, `_direct_scp`). Extracted into one
`_direct_ssh_opts` helper so a security-option change can't drift across copies;
all three sites now splat `$(_direct_ssh_opts)`. ssh `-o` ordering is immaterial
(no conflicting duplicate keys) so this is byte-equivalent in BEHAVIOR — verified
the reconstructed argv matches the prior inline copies token-for-token.
`lib/ssh-helper.sh`.
**No traffic bypass (unchanged, absolute).** DIRECT mode is legitimate
forced-password auth only — no proxy, no tunnel, no masking, and host-key
checking STAYS ON (`StrictHostKeyChecking=accept-new`). The dedup helper keeps
that posture in a single source of truth.
`bash -n` clean on every changed file. `/sites` slash path drives clean under
`set -u`. VERSION + larry.sh:81 bumped to 0.8.18; MANIFEST regenerated.
## v0.8.17 — 2026-05-28
Per-alias DIRECT (no-multiplex) SSH mode (Clover). The real unblock for
`/sites qa` on Bryan's Legacy→qa box.
**Confirmed root cause (live-verified on the qa box).** The qa server
(`bryjohnx@lhsixfqa` → Cloverleaf host `shdclvf01q`, release cis2025.01)
REJECTS SSH ControlMaster session multiplexing. The master opens and
authenticates fine, but any session multiplexed over it dies with
`read from master failed: Connection reset by peer`, then ssh falls back to a
fresh connection that fails auth. A DIRECT per-command connection WORKS and
returns the 24 site dirs (21 after the helloworld/siteProto/master filter).
**Fix — per-alias DIRECT mode.** A new TSV column 5 (`direct`, on|off) opts an
alias out of the ControlMaster entirely. When on, ALL remote ops for that alias
(`exec`, `discover`, `pull-smat`, `pull`) run a FRESH per-command
`sshpass -f <credfile> ssh -o PreferredAuthentications=password -o
PubkeyAuthentication=no -o StrictHostKeyChecking=accept-new -o ControlMaster=no
-o ControlPath=none -o ConnectTimeout=<n> <user@host> '<remote-cmd>'`. The
remote command is shaped by the SAME `_remote_cmd_for` chokepoint as the master
path, so a pinned HCIROOT (`set-hciroot`) is honoured identically — only the
transport changes. NO traffic bypass: legitimate forced-password auth, no
proxy/tunnel/masking, host-key checking stays on (accept-new).
1. `lib/ssh-helper.sh` — TSV column 5 (`direct`); `read_host_direct` /
`_alias_is_direct` readers; `cmd_set_direct` (`set-direct <alias> on|off`,
whitespace-trimmed, legacy <5-column files backfilled, pinned HCIROOT in
col 4 never clobbered); `_run_direct` (per-command sshpass dispatch +
stderr filtering); `_direct_scp` (fresh-sshpass scp for pull); centralised
`_dispatch_remote` (direct vs master, identical command shaping); `cmd_exec`
/ `cmd_discover` / `cmd_pull_smat` / `cmd_pull` route through it and SKIP the
open-master requirement in direct mode; `cmd_setup` in direct mode VALIDATES
the stored password with one trivial direct command instead of opening a
(pointless) master; `cmd_hosts` shows the `direct` column (`master=n/a` for
direct aliases). New `set-direct|direct` subcommand + help.
2. `larry.sh` `/ssh-set-direct <alias> on|off` slash command (set-u-safe
split, the v0.8.16 idiom); added to the completion array + descriptions +
`print_help`; `tool_list_sites` reports the transport (`direct sshpass` vs
`ControlMaster`) and gives a transport-correct recovery hint on discover
failure (stale password in direct mode, not "closed master"); `list_sites`
tool description + the system-prompt remote-alias guidance updated for the
multiplex-rejection symptom and the `/ssh-set-direct` recovery.
**Clean output (UX).** The qa login profile emits a pre-auth banner
("Unauthorized access…/monitored") AND `sudo: a terminal is required` /
`sudo: a password is required` on STDERR for non-interactive sessions. The
parsed site list is on STDOUT (already clean). `_filter_direct_stderr` strips
exactly those known-benign lines; remaining stderr is surfaced ONLY on an
actual non-zero command failure. So `/sites qa` shows just the 21 sites + the
`(excluded: …)` note no banner/sudo noise.
**No regression.** Aliases WITHOUT the direct flag keep the existing
ControlMaster-multiplex path unchanged (and still die "no open master" when
none is open).
**New flow for qa.** `/ssh-pass qa` `/ssh-set-hciroot qa
/hci/cis2025.01/integrator` `/ssh-set-direct qa on` `/sites qa`. No master
step in direct mode.
**Self-verify (this gate, pre-Vera).** Driven non-interactively on this Mac:
flag round-trip (add set-direct on/off hosts raw TSV; col 4 HCIROOT
preserved across col 5 writes); legacy 3-column file backfilled to 5 columns;
command-shaping (`_remote_cmd_for`) identical pinned-export for direct and
master, login-shell for unpinned; `_alias_is_direct` 0/1; stderr filter against
a sample banner+sudo+real-error (only the genuine `find: … Permission denied`
survives; pure banner+sudo empty); `_run_direct` end-to-end via a stubbed
`sshpass` STDOUT clean + STDERR empty on success, real error surfaced + banner
stripped + rc propagated on failure; `tool_list_sites qa` over a stubbed direct
discover `sites: 21 (excluded: helloworld, siteProto, master)`, clean,
correct mode line; direct-mode `setup`/`discover` skip the open-master die;
non-direct `exec`/`discover` still die "no open master" (no regression); and the
`/ssh-set-direct` REPL slash handler (+ neighbours `/ssh-set-hciroot`,
`/ssh-setup`, `/ssh`) drive through a `case` dispatcher under `set -u` +
`compat32` with no unbound-variable abort and correct routing. (Live connect to
the qa box not run here `sshpass` is the production host's tool, absent on the
Mac; the path up to the sshpass invocation is fully exercised via the stub.)
**Vera v0.8.16 caveat closed (evidence attached).** Repo-wide self-ref-class
grep `grep -rnE '^[[:space:]]*(local|declare|typeset|readonly|export)[[:space:]]+[A-Za-z_].*=.*$' larry.sh lib/`
(712 superficial matches) + a targeted same-statement self-reference detector:
**ZERO** same-statement self-references every match is the safe sequential
`;`-separated idiom. `bash -n larry.sh lib/*.sh` clean (36/36 files OK).
---
## v0.8.16 — 2026-05-28
Hotfix (Clover): `set -u` unbound-variable abort in the v0.8.15
`/ssh-set-hciroot` REPL slash command, reported live on Bryan's MobaXterm /
Cygwin bash (`larry.sh: line 6903: _sh_alias: unbound variable`).
**Root cause.** `larry.sh:6903` was a single-line `local` declaration in which
the 2nd variable's initializer referenced the 1st variable assigned in the SAME
`local` statement:
`local _sh_alias="${rest%% *}" _sh_path="${rest#"$_sh_alias"}"`.
Under `set -u`, `$_sh_alias` is treated as unbound at expansion time within its
own `local` statement, aborting the command. This is NOT Cygwin/bash-3.2
specific it reproduces identically on bash 5.x. The same latent pattern lived
at `larry.sh:6925` (the `/ssh <alias> <cmd>` handler:
`local alias="${rest%% *}" rcmd="${rest#"$alias"}"`), not yet triggered only
because that path hadn't been run.
**Fix.** Both split into set-u-safe form declare the locals first, assign the
first, THEN reference it on a later line. Correct on every bash version.
1. `/ssh-set-hciroot` handler (was `larry.sh:6903`) safe-split; preserves the
empty-path "clear the pin" semantics.
2. `/ssh` handler (was `larry.sh:6925`) same safe-split.
**Codebase-wide bug-class audit.** Every `.sh` (larry.sh + all `lib/*.sh` +
top-level scripts) was scanned for any single-statement `local`/`declare`/
`typeset`/`readonly`/`export` where one variable's initializer references
another variable assigned in the SAME statement. After the two fixes above,
**zero remaining instances**. All superficially-similar lines elsewhere are
`;`-separated sequential statements (e.g. `_x="${rest%% *}"; rest="${rest#"$_x"}"`)
where the referenced variable is already bound these are the CORRECT idiom and
are safe under `set -u`.
**Gate-gap closed.** The v0.8.15 gate exercised the `ssh-helper.sh set-hciroot`
SUBCOMMAND but never the larry.sh `/ssh-set-hciroot` REPL slash dispatch where
the bug actually was. The v0.8.16 gate drives the ACTUAL slash-command handler
bodies (extracted verbatim from the fixed source) through a `case` dispatcher
under `set -u` with `BASH_COMPAT=3.2`, non-interactively, with no self-update /
API / bootstrap. Verified `/ssh-set-hciroot <alias> <path>` (normal),
`/ssh-set-hciroot <alias>` (empty-path CLEAR), trailing-space clear, no-args
usage, `/ssh <alias> <cmd>`, and `/ssh <alias>` (usage) all execute WITHOUT the
unbound-var abort. The same harness run against the OLD v0.8.15 line aborts with
`_sh_alias: unbound variable` (rc=1), confirming the gate now catches this
regression class. The no-traffic-bypass security line is unchanged.
---
## v0.8.15 — 2026-05-28
Legacy/qa remote-enumeration fix (Clover). Three confirmed-live properties of
the qa box `bryjohnx@lhsixfqa` (→ Cloverleaf host `shdclvf01q`, release
cis2025.01) broke v0.8.13/v0.8.14 remote site enumeration: a **sudo-gated login
profile** (a non-interactive SSH session hits `sudo: a terminal is required`, so
`bash -lc` can't initialize the env and `$HCIROOT` comes back EMPTY); **no
`hcisitelist`** on the box; and a **password that rotates ~every 12h** (stale
stored credential, and a pre-auth banner that masked the real auth error). All
three are now handled. The no-traffic-bypass security line is unchanged **zero
proxy / masking / evasion primitives** (same Gundersen-class control).
1. **Per-alias HCIROOT pin (load-bearing).** New `ssh-helper.sh set-hciroot
<alias> <path>` and the `/ssh-set-hciroot <alias> <path>` slash command
persist an HCIROOT for an alias as a 4th column in `.ssh-hosts.tsv` (old
3-column files stay valid; an empty path clears the pin). When an alias is
pinned, `exec` / `discover` / `pull-smat` run the remote command with
`HCIROOT=<path>` exported EXPLICITLY under a NON-login `sh -c` — they do NOT
wrap in `bash -lc`, so the sudo-gated login profile is never invoked. A single
chokepoint, `_remote_cmd_for`, makes every remote path honour the pin
identically (unpinned aliases keep the v0.8.13 `bash -lc` login-shell
behaviour, unchanged). `/sites <alias> --hciroot <path>` is a convenience that
persists the pin then enumerates. This makes qa work regardless of the broken
profile. `qa` HCIROOT = `/hci/cis2025.01/integrator`.
2. **Portable site enumeration (no `hcisitelist` dependency).** `discover`'s
remote script now makes the **NetConfig walk the PRIMARY path**, identical to
`lib/each-site.sh` (`find $HCIROOT -mindepth 1 -maxdepth 2 -name NetConfig
-type f` → dirname → basename → sort -u). `hcisitelist` is consulted ONLY if
it is actually present AND the walk found nothing — never as the dependency.
Works on a box with no `hcisitelist`. Emits clear `NOTE` lines (HCIROOT empty
→ suggests the pin; not a directory; no NetConfigs found).
3. **ControlMaster-open hardening (banner + rotating password).** `setup` now
forces `-o PreferredAuthentications=password -o PubkeyAuthentication=no -o
NumberOfPasswordPrompts=1` so sshpass feeds the password cleanly past the
pre-auth banner and a stale credential fails fast instead of hanging; it
surfaces the REAL auth error (greps for permission/auth/password/host-key
keywords) instead of echoing only the banner; and on an auth failure it
RE-PROMPTS for a fresh password (the 12h rotation), stores it 0600, and
retries ONCE. Every failure path emits a clear next step — never a silent
no-op.
4. **`/sites` excludes non-real entries (transparent).** The enumeration now
drops (a) static scaffolding/special sites — `helloworld siteProto master`,
a documented, tunable `SITES_EXCLUDE` env var — and (b) any site dir whose
name equals the host: the REMOTE `discover` walk computes `hostname -s` and
full `hostname` and drops a match (qa's alias host is `lhsixfqa` but the
engine box is `shdclvf01q`; a dir just named after the box is not a site),
and also drops a match against the alias's configured SSH host. The filter
is applied at the SINGLE enumeration source so REMOTE (pinned + login-shell)
and LOCAL `/sites` behave identically. NOT silently hidden: the walk emits an
`EXCLUDED` line and the tool layer renders the real count as the headline with
a note, e.g. `sites: 21 (excluded: helloworld, master, siteProto)`. Acceptance
(qa, 24 raw dirs): `/sites qa` → 21, with the 3 exclusions noted; no dir
matches `shdclvf01q`. The no-traffic-bypass security line is unchanged.
Acceptance (qa): with HCIROOT pinned to `/hci/cis2025.01/integrator`,
`/sites qa` returns the site list via the NetConfig walk with no `bash -lc` and
no `hcisitelist`; `/ssh-setup qa` with a fresh password opens the master past
the banner. Self-verified with `bash -n` on every changed file. POSIX-sh remote
scripts; compatible with bash 3.2 / Cygwin; no regression to unpinned aliases.
---
## v0.8.14 — 2026-05-28
Locked-down-box survivability (Clover): make the full toolkit usable BY HAND
with no API/LLM, and turn a blocked-API failure from a raw error dump into
honest, actionable guidance. Both deliverables are graceful degradation only —
**ZERO traffic-bypass, masking, proxy-hiding, or block-circumvention primitives
were added** (hard security line: larry must not try to defeat a corporate
security control on a PHI box).
1. **`larry tools` — manual-tools dispatcher.** A discoverable, low-friction
entry point so all 24 operator-facing Cloverleaf/HL7 tools in `lib/` are
listable and runnable by hand with no REPL, no API, and no LLM. `larry tools
list` prints every tool grouped (NetConfig read/write, diff & regression,
HL7, site-iteration/format) with a one-line description and a `(missing)`
flag for any registry entry absent on disk; `larry tools <name> [args]` runs
a tool (no args → its own `--help`); `larry tools help` explains the mode.
It dispatches at the very top of `larry.sh` — BEFORE bootstrap, self-update,
the jq gate, and any network call — so it works on a fresh install with the
API unreachable. Only requirement is a `lib/` dir next to `larry.sh` (or in
`$LARRY_HOME/lib`). Internal plumbing (oauth/phi/fetch-safe/cygwin-safe/
ssh-helper/journal/lessons/etc.) is intentionally NOT listed — those are
REPL/agent support, not operator-facing tools. The registry descriptions are
the SSOT for `tools list`.
2. **`_diagnose_api_block` — honest blocked-API detection + guidance.** On a
locked-down box where `api.anthropic.com` is blocked, larry no longer dumps a
bare network error. It inspects the last call's curl exit code, curl stderr,
the response body, and the response headers for block signatures — TLS
interception / MITM cert (`unable to get local issuer certificate`, self-
signed, `certificate verify failed`), DNS-filter / refused / timed-out / TLS-
handshake exit codes, `Could not resolve host` / `Connection refused`,
HTML-where-JSON-was-expected 403/interstitial pages, and explicit Cisco
Umbrella fingerprints in body or headers (`Server: Cisco`, `X-Cisco`,
`Cisco Umbrella`, `This site is blocked`, …). When it recognizes a block it
prints what happened, the target URL, the path into manual-tools mode
(`larry tools list` / `<name> --help` / a worked `nc-parse` example), and the
correct remedy — ask IT to allowlist the API host, or run from a permitting
network. It states plainly that this is a corporate control on a PHI box and
that larry will not, and must not, try to bypass it. Wired into both the
non-streaming and streaming API paths; reads its diagnostics from the
deterministic `$LARRY_HOME/.last-curl-*` / `.last-stream-*` files that
`call_api` persists (the call runs in a subshell, so in-memory globals don't
reach the diagnoser).
Fires before bootstrap/network; compatible with bash 3.2 / Cygwin; no
regression to the v0.8.13 paths.
---
## v0.8.13 — 2026-05-28
Proactive both-mode Cloverleaf-env detection + the `$HCIROOT` login-shell fix +
a first-class site lister + the dominant residual-slowness fix (Clover). Closes
the live-session friction where "how many sites are on qa?" stalled on repeated
`export $HCIROOT` nags despite a working `qa` SSH alias.
1. **`$HCIROOT` login-shell fix (root cause).** `ssh-helper.sh exec` ran the
remote command in a NON-login, non-interactive shell, so the Cloverleaf
login profile (which exports `$HCIROOT`, `$HCISITE`, and the `hci*` PATH)
never sourced and `$HCIROOT` arrived empty — and Larry gave up and asked for a
path. `exec` now wraps the command in `bash -lc` (login shell), so the remote
env populates exactly as for an interactive operator login. Version-agnostic,
zero-config. Escape hatch: `NOLOGIN ` prefix or `LARRY_SSH_NO_LOGIN=1`. The
`pull-smat` find/sample paths use the same wrapper so `$HCISITEDIR`/`sqlite3`
resolve there too.
2. **Both-mode detection (proactive).** Startup now determines and surfaces a
`MODE=` line: **LOCAL** (`$HCIROOT` set by the local profile, or a Cloverleaf
install auto-discovered at a common path — work the local tree, no SSH),
**REMOTE** (no local install but a Cloverleaf SSH alias configured — discover
over a login shell), or **UNKNOWN** (ask which mode applies). Larry leads with
what it found instead of asking Bryan to spoon-feed paths.
3. **First-class `list_sites` tool + `/sites [alias]`.** "How many sites are on
X / what sites exist" now Just Works in both modes: REMOTE
(`list_sites(alias=qa)`) resolves the remote `$HCIROOT` in a login shell and
enumerates sites (Cloverleaf `hcisitelist` fast-path, NetConfig-walk
fallback); LOCAL (`list_sites()`) does the same against the detected local
`$HCIROOT`. Reports the resolved HCIROOT, a count, and the names. New
`ssh-helper.sh discover <alias>` backs the remote path.
4. **System-prompt de-nagging.** `agents/larry.md` and the `/nc-diff-env` /
`/nc-regression-env` templated prompts no longer instruct Larry to ask Bryan
to `export $HCIROOT` for a reachable host. The cardinal rule is explicit:
never request a path you can resolve; the only remote precondition surfaced
is an open ControlMaster (`/ssh-setup <alias>`).
5. **Streaming slowness — dominant residual fix.** v0.8.12 collapsed per-delta
routing to one jq call, but each `text_delta`/`input_json_delta` STILL forked
a second jq purely to un-escape the `@json`-encoded payload — ~N forks/turn
(N≈output tokens), and on Cygwin/MobaXterm a fork is ~50-100ms. New
`_json_str_decode` un-escapes in pure bash for the common escape-free chunk
(zero forks), deferring to jq only when a real backslash escape is present.
Halves per-turn fork count on top of v0.8.12. Round-trip verified for tab,
escaped quote/backslash, embedded newline, `\uXXXX` Unicode, and empty.
6. **`pull-smat` path capture hardened (Vera Minor #1).** The remote `.smatdb`
path lookup previously took `tail -1` of the merged stdout+stderr stream.
Under the new login shell (item 1) a profile MOTD/banner can land on stdout,
which a blind `tail -1` could mistake for the path. The remote `find` now
emits the resolved path behind a `SMATDB_PATH:` sentinel; the client selects
by pattern, not position, and only falls back to the prior `tail -1` when no
sentinel is present (so no host that worked before can regress). Selection
logic unit-verified for banner-before, banner-after, ERROR, empty, multi-
sentinel, and no-sentinel-fallback.
Compatible with bash 3.2 / Cygwin; no regression to the v0.8.12 fixes.
---
## v0.8.12 — 2026-05-27
Crash fix + slowness + cost pass on the new API-key rail (Clover). Full
diagnosis: `Deliverables/2026-05-27-cloverleaf-larry-v0812-crash-slowness-cost.md`.
1. **Fix the post-response arithmetic crash** (`bash: <n>: syntax error: invalid
arithmetic operator (error token is "")`). The token/cost accounting fed
jq-extracted usage counts straight into `$(( ))`; on Cygwin/MobaXterm a
CRLF-translated response body makes `jq -r` emit `1234\r`, and `// 0` only
guards JSON null, not the CR. `coerce_int` now defends all three accounting
sites (non-streaming cost block, streaming cost block, `_record_ctx_used`).
This is the v0.7.5 OAuth crash's Anomaly-#4 recurrence on the response path.
2. **Prompt caching wired into the request** (cost). The ~12.7K-token static
prefix (system ~6K + 35 tools ~6.7K) was re-sent uncached every turn. v0.8.12
sends `system` as a block array and marks the last tool with
`cache_control: ephemeral` (apikey rail; `LARRY_PROMPT_CACHE=0` to disable),
so the prefix is billed at the 0.1x cache-read rate after turn 1 — a ~90% cut
on the static prefix and a large TTFB reduction.
3. **Streaming parser per-delta jq forks collapsed** (slowness, dominant fix).
The SSE `content_block_delta` hot path spawned 3 `jq` processes per event
(`.index`, `.delta.type`, then the text/json payload). A normal response
ships dozens-to-hundreds of these, and on Cygwin/MobaXterm a process fork is
~50-100ms (Windows fork emulation) — so the per-turn render lagged multiple
seconds ("veeery slow"). The routing fields are now pulled in ONE jq call
(payload `@json`-encoded so embedded newlines/tabs survive the line read),
cutting the text path to 2 forks/event and ignored deltas to 1. Verified
round-trip on text with embedded NL/TAB/quote and on `input_json_delta`.
4. **Per-turn PHI sidecar `/health` probe cached for the session** (slowness).
`phi_client_available` ran `curl -m1` every turn; on MobaXterm the sidecar can
never run, so it was a guaranteed curl fork + up-to-1s wait per turn. Now
probed once per session (file-flag, survives the `$(...)` subshell);
`LARRY_PHI_REPROBE=1` forces a fresh probe.
---
## v0.8.11 — 2026-05-27
Two headline features land together (Clover):
1. **API key is now the default auth rail** — sanctioned, not edge-throttled;
OAuth-impersonation disabled by default to protect the user's Max account;
secure per-client key via `/set-api-key`, stored 0600, CR-safe, masked in
diagnostics.
2. **Manifest-hashing auto-update** — relaunch skips unchanged files by
comparing each MANIFEST sha256 to a local hash, so only changed/missing
files download. Relaunch drops from minutes to seconds.
---
### 1. API key as the default auth rail
Bryan's decision (2026-05-27): the durable fix for the multi-hour OAuth
`rate_limit_error` is to STOP impersonating the official Claude Code client and
move to the sanctioned API-key rail. Anthropic actively fingerprints and blocks
Claude-Code OAuth impersonation (server-side enforcement since ~2026-01-09, ToS
change 2026-02-19/20); every impersonated OAuth request flags the user's Max
account. The API key (`sk-ant-api03-`) is the intended programmatic rail — a
plain `x-api-key` request, billed pay-as-you-go, not edge-throttled. (Research:
`Deliverables/2026-05-27-claude-code-oauth-request-requirements-research.md`.)
- **API key is the default.** `LARRY_AUTH_MODE` defaults to `apikey`. The
request shape is the clean sanctioned form: `x-api-key` + `anthropic-version:
2023-06-01` + `content-type: application/json` — **no `Authorization: Bearer`,
no `anthropic-beta: claude-code-*`, no `claude-cli (external, cli)` UA, no
`x-app: cli`, no `?beta=true`, no "You are Claude Code" system block.** All of
the prior impersonation scaffolding was removed.
- **OAuth is OFF by default (opt-in only).** larry fires OAuth ONLY when
`LARRY_AUTH_MODE=oauth` is set explicitly, and prints a one-time
account-risk warning when it does. There is **no silent OAuth fallback**
larry never auto-pokes the impersonation tripwire. The opt-in OAuth request is
minimal and honest (`Bearer` + `anthropic-beta: oauth-2025-04-20` only).
- **Secure per-client API-key provisioning + storage** (Bryan's core ask):
- `/set-api-key` (and `larry-auth.sh --api-key`) prompts with `read -s`
(silent, never echoed, never in argv / process table / shell history),
optionally validates with one cheap `/v1/messages` ping (`max_tokens:1`),
then stores the key at `$LARRY_HOME/.api-key` (mode **0600**, owner-only),
**CR-stripped** (MobaXterm/Cygwin CRLF-safe). `--clear` removes it;
`--status` shows it masked. Each client holds its OWN key (mint one per
machine at console.anthropic.com — independently revocable, leak-contained).
- The key is fed to curl via `--config -` on **stdin**, so it never appears in
curl's argv / the process table (`ps`-clean), in all request and validation
paths.
- **Secret hygiene:** the key is never logged, never committed (added to
`.gitignore` and to larry's PHI/secret `read_file` path-block), and is
**masked** as `sk-ant-api03-XXXX…last4` in `/auth`, `/auth-debug` (alias
`/api-debug`), and `/set-api-key --status`. The audit-trail secret-guard's
`sk-ant-[A-Za-z0-9_-]{20,}` pattern catches `sk-ant-api03-` specifically.
- **429-discrimination (reused from the prior thread's good work):** a real
rate-limit 429 ALWAYS carries `anthropic-ratelimit-*` headers → legitimate
backoff; a 429 with NO such headers on the API-key rail → clear "edge/
transient bounce, not your quota" message (no futile backoff). On the opt-in
OAuth rail, that same edge-reject signature triggers an automatic flip to the
sanctioned API-key rail if a key is configured (`LARRY_NO_EDGE_FALLBACK=1` to
opt out).
- **Migration UX:** first launch with the apikey default and no key → friendly
prompt to run `/set-api-key` (not a bounce into the risky OAuth path).
### 2. Manifest-hashing auto-update speedup
`sync_from_manifest` used to re-download **all 48 manifest entries** every
relaunch over authenticated HTTPS (Gitea via proxy + Cloudflare) and `cmp`
locally to find the 03 that changed — ~3 min on the work-box for a 3-file
update, because the MANIFEST was paths-only and the client could not tell what
changed without fetching everything.
- **MANIFEST now ships each file's expected sha256** (`path<TAB>sha256`,
generated at release by `scripts/make-manifest.sh`). The client fetches
MANIFEST once, hashes its LOCAL copy of each path, and downloads ONLY entries
whose hash differs or are missing. 48 round-trips → **1 (MANIFEST) + a local
hash pass + N real downloads** (N = actually-changed files, usually 03).
Relaunch drops from minutes to seconds.
- **Fail-SAFE: a doubt NEVER skips an update.** A download is skipped only when
ALL hold — a working sha256 tool, a valid 64-hex manifest hash, the local file
exists, and its hash matches exactly. No tool / empty / malformed / non-hex
hash, missing local file, hash-tool error, or any mismatch (including a stale
or wrong published hash) all fall through to **download**. Worst case is a
needless re-download, never a missed update.
- **sha256 tool fallback chain:** `sha256sum``shasum -a 256`
`openssl dgst -sha256` → (none → full-download fallback, updater never breaks).
Detected once, cached. Tool output normalized to bare lowercase 64-hex.
- **CR-safe:** the MANIFEST is fetched from Gitea (CRLF risk); the whole line is
CR-stripped before splitting and the hash-tool output is `tr -d '\r'`'d, so a
CRLF-tainted hash never forces a needless re-download.
- **Download path unchanged** from v0.8.9 — still routes through `fetch_validate`
(HTML-sign-in-trap detection), keeps the post-download `cmp` guard
(idempotent), `chmod +x` for `*.sh`, and per-file `--max-time`. The v0.8.9
live progress indicator now renders over the new "verifying (local)" phase and
the (fewer) "downloading" frames. New summary line: `manifest sync: N updated,
M unchanged (local hash), F failed, T total`.
- **Release tooling (committed, not synced to clients):**
`scripts/make-manifest.sh` regenerates/checks the MANIFEST hashes;
`scripts/hooks/pre-commit` blocks a commit whose MANIFEST hashes drifted. These
are release-side only — the work-box consumes manifests, never generates them —
so they are deliberately NOT listed in MANIFEST.
Deliverables: `Deliverables/2026-05-27-cloverleaf-larry-api-key-default-rail.md`,
`Deliverables/2026-05-27-cloverleaf-larry-manifest-hashing-speedup.md`.
## v0.8.9 — 2026-05-27
Manifest-sync live progress indicator (Clover). Symptom: Bryan's auto-update
relaunch is very slow and **looks frozen** — a multi-minute silent gap between
`update found: X -> Y … relaunching` and `manifest sync: N updated, 0 failed, M
total`, with NO output in between. Observed: `[21:32:29]``[21:35:37]` = ~3 min
of silence to update just 3 of 48 files; earlier `[20:19:42]``[20:20:32]` =
~50s for 22 files. Bryan cannot tell working-but-slow from hung.
**Root cause (confirmed by reading `sync_from_manifest`).** Phase-A sync does NOT
do cheap HEAD/hash-checks — it **fully downloads EVERY manifest entry** over an
authenticated HTTPS round-trip (Gitea via the corporate proxy + Cloudflare),
then uses `cmp -s` locally to decide whether the bytes actually changed
(discarding unchanged ones). With 48 entries that is **48 sequential full
downloads**, every relaunch, regardless of how few files changed. The loop emits
nothing until the trailing summary `log` line → the entire window is silent →
looks hung. The "3 files changed" in the summary is just how many survived the
`cmp`; all 48 were still fetched.
- **Live in-place progress over the WHOLE sync.** New `_sync_progress` /
`_sync_progress_done` helpers render `checking N/48 lib/foo.sh` per entry,
rewriting the line via `\r\033[K` (carriage-return + clear-line). When a fetch
actually lands a changed file, the frame switches to `downloading N/48
lib/foo.sh` so real writes are distinguishable from the common unchanged case.
The current filename is always shown, so a genuine stall is VISIBLE — you see
exactly which file it is stuck on instead of a blank freeze.
- **MobaXterm-safe escapes only.** Uses solely `\r` + `ESC[K` (the same
primitive already at the readline prompt, audited safe in the v0.8.7 escape
inventory). Deliberately NO DECSTBM scroll-region, cursor save/restore, or
absolute-row addressing — the exact escapes MobaXterm mis-honors. Verified
under a pty: only `ESC[0m`, `ESC[2m`, `ESC[K` ever emitted; zero forbidden
escapes. Independent of `LARRY_NO_STREAM` / mouse mode (gates only on a TTY).
- **Non-TTY safe.** Gates on `[ -t 2 ]`. When stderr is a pipe/log it emits a
plain newline-terminated heartbeat every 10 files (no `\r`), so captured logs
stay clean — verified zero `\r` bytes in piped output.
- **Heartbeat for hangs.** The per-file fetch already carries curl `--max-time`
(5s for VERSION, 15s otherwise); a stuck file now shows its name, times out,
counts as a fail, and the loop advances — never an infinite silent stall.
- **Speedup deferred (by design).** The clean fix — fetch the manifest once,
compare per-file hashes locally, and SKIP unchanged files with NO per-file
network round-trip — requires the MANIFEST to carry hashes. It currently
carries **paths only** (no hashes/sizes), so the optimization would require a
manifest-format + release-tooling change (higher risk, separate change). Filed
as a follow-up: `MANIFEST` should emit `path<TAB>sha256` so v0.9.x can turn 48
network round-trips into 1 fetch + local compare. This release ships the
indicator (Bryan's explicit ask); correctness of the sync is unchanged.
Note: the v0.8.9 relaunch itself is still the slow path (the indicator helps
from the NEXT update onward), but you now see live forward progress during it.
## v0.8.8 — 2026-05-27
Force unconditional 429 header capture (Clover). Symptom: Bryan's MobaXterm
work-box hits `rate_limit_error` repeatedly, but `$LARRY_HOME/log/headers.log`
NEVER generates — so we cannot diagnose which rate-limit rail / auth path is
failing. The single goal: guarantee the log generates on the NEXT 429 so Bryan
can `tail` it and paste it (manual paste is the plan; auto-sync is dropped).
**Call-flow trace (first, to disprove the deeper hypothesis).** Bryan's box
runs `LARRY_NO_STREAM=1` (auto-set on MobaXterm since v0.8.5), so `agent_turn`
takes `resp=$(call_api …)`. `call_api` (larry.sh) ALWAYS dumps response headers
via `curl -D` and ALWAYS calls `_parse_response_headers` on that dump after curl
returns — regardless of HTTP status (it explicitly comments "headers carry
rate-limit info even on 429s"). So the non-stream 429 path WAS reaching the
parser. The parser was NOT the missing call — the bug was entirely the
over-clever write-gate INSIDE the parser.
**Root cause = the write-gate was too clever for its own purpose.** The v0.8.5
gate wrote headers.log only if `(OAuth-mode AND a unified-* header was present)
OR (retry-after was non-empty)`. Bryan's 429s carry NEITHER: the backoff used
the exponential 2/4/8s fallback (proving no server `retry-after`), and a
per-minute burst 429 routinely omits the `unified-*` family. Neither branch
fired → no write → no log → no diagnosis. The capture defeated its own purpose.
- **Unconditional write on ANY 429, detected from the status line.**
`_parse_response_headers` now greps the `-D` dump for `^HTTP/<ver> 429`
(CRLF-tolerant) and, on a match, ALWAYS writes the full raw header block to
`$LARRY_HOME/log/headers.log` — regardless of `retry-after`, `unified-*`, or
auth mode. A bare 429 with no diagnostic headers STILL logs; that absence is
itself the finding (signals a low/bare-tier limit).
- **429s exempt from the OAuth 50-call cap.** New `STATUS_429_headers_logged`
counter with its own budget (`STATUS_429_HEADER_LOG_LIMIT=200`), independent
of the 200-path OAuth sampling cap. A session that burned all 50 OAuth
captures on successful calls STILL logs its next (51st-call) 429.
- **Full diagnostic dump.** The 429 block writes: a banner with `auth-mode`
(OAuth-Max vs API-key rail), the detected limit-rail, `retry-after`, org-id,
and request-id; then the HTTP status line, ALL `anthropic-*` headers (not just
`-ratelimit-*`), `retry-after`, `request-id`, and every `x-*` header — so the
auth-rail + which-limit question is answerable from one paste.
- **Live stderr pointer.** On every 429 capture, prints
`phi/rl> 429 headers logged to ~/.larry/log/headers.log (rail=<x>,
retry-after=<y>) — paste for diagnosis` so Bryan knows the log now exists.
- **Same-pattern sweep.** Streaming path (`call_api_stream` →
`_drain_pending_stream_headers``_parse_response_headers`) shares the same
function, so Mac/Linux streaming users get identical 429 capture. The v0.8.0
`tool_read_file` PHI path-block (which blocks `$LARRY_HOME/log/`) is
tool-dispatch-only — Bryan reading his own headers.log via interactive shell
`tail` is unaffected (verified: no shell-level block; `bash_exec` runs
`bash -c` directly without the path-block). The 200-path OAuth sampling cap is
unchanged.
## v0.8.7 — 2026-05-27
Status-line render fix for MobaXterm/Cygwin (Clover). Symptom: the dim
between-turn status line (session context + rate-limit reset date) never
appeared on Bryan's MobaXterm work-box — the v0.7.1 status-line feature that
MEMORY.md flagged as a never-verified passive item.
**Root cause = suppress-when-empty gate, NOT terminal positioning.**
`render_status_line` (larry.sh) gated the OAuth arm on `ctx_used_tokens`,
`oauth_5h_utilization`, AND `oauth_7d_utilization` ALL being empty — returning
silently (rendering nothing) when so. Two facts made all three stay empty turn
after turn on Bryan's box: (1) `STATUS_ctx_used_tokens` is populated by
`_record_ctx_used`, which runs only AFTER a successful `agent_turn`; on a
`rate_limit_error` the turn returns early (larry.sh ~L3929), so ctx was never
recorded; (2) pre-v0.8.5 the `anthropic-ratelimit-unified-*` utilization
headers weren't captured on error responses. With every turn erroring, all
three gate fields were empty every turn, so the line was suppressed for the
whole session and never rendered. This was NOT a positioning bug: the status
line is a single plain `printf`'d dim line printed between turns — there is no
DECSTBM scroll-region reservation, no cursor save/restore, no absolute-row
positioning anywhere in the codebase, so MobaXterm's terminal emulation had
nothing to mis-honor. It was also NOT coupled to streaming or mouse mode.
- **Gate on turn count, not data presence.** `render_status_line` now suppresses
ONLY before the first turn has run (`_LARRY_TURNS == 0`, coerce_int-guarded);
thereafter it ALWAYS renders. Both auth-mode helpers already self-render a
`—` placeholder for any unpopulated field, so the line always shows session
context (model context window, turns, session cost) and the rate-limit reset
time fills in once a successful call — or, since v0.8.5, a captured error
response — populates the headers.
- **`/status` always renders on demand**, even before the first turn — an
explicit request bypasses the turn-0 gate. Lets Bryan verify the line renders
on MobaXterm without first completing a (possibly rate-limited) turn.
- **CR-taint hardening (same-pattern sweep).** The OAuth segment's reset-epoch
comparisons (`[ <epoch> -le <now> ]`) read `STATUS_oauth_{5h,7d}_reset_epoch`,
which come from `_header_value` (strips only the TRAILING CR). A CRLF response
on MobaXterm or a non-numeric token would have crashed the arithmetic test and
aborted the entire line. Both comparisons now `coerce_int` the epoch first;
the `STATUS_oauth_status` color-override `case` is now `strip_cr`-guarded so a
`rate_limited\r` value still matches its literal-glob arm.
- **Same-pattern sweep results:** audited every escape sequence in larry.sh +
lib/ — only color SGR, clear-screen (`\033[2J\033[H`), erase-line
(`\r\033[K`), and the opt-in mouse/bracketed-paste modes (off by default since
v0.7.5) are used; ZERO scroll-region / cursor-save-restore / absolute-cursor
sequences, so no other UI element is at risk of MobaXterm mis-rendering.
Confirmed no user-visible element is gated behind streaming (`used_stream`
only guards re-printing already-streamed text) or mouse mode.
- **Verification:** `bash -n` clean; 7/7 unit tests pass against the shipped
render functions — turn-0 suppressed; turn≥1 with empty data renders with
placeholders; reset date shown when populated; renders with `LARRY_NO_STREAM=1`
+ mouse off (Bryan's exact config); survives a CR-tainted reset epoch without
crashing; `LARRY_NO_STATUS=1` still fully disables.
## v0.8.6 — 2026-05-27
Work-box → Mac `headers.log` sync (tsk-2026-05-27-023, Clover headers-sync).
Closes the last gap in the rate-limit-diagnosis pipeline: the
`anthropic-ratelimit-*` headers captured on Bryan's MobaXterm work-box (where
the testing happens) never reached the Mac's memory daemon, so they could not
be analyzed. v0.8.6 pushes the work-box `headers.log` to a daemon-watched path
on the Mac automatically; the Mac daemon ingests it to memory Tier 4
(Hindsight) + Tier 7 (mem0).
- **New `lib/headers-sync.sh`** — incremental, offset-tracked, idempotent push
of `$LARRY_HOME/log/headers.log` to a per-host file on the Mac
(`~/.cloverleaf/headers-<workbox-hostname>.jsonl`, a daemon-watched dir).
Transport rides the EXISTING authenticated SSH ControlMaster
(`/ssh-setup <alias>`) — no new key, no second auth, the password is never in
argv/env. Only the new bytes since the last sync are sent (`dd skip=offset`
→ remote `cat >>`); a no-op when nothing is new; a re-seed (truncate + resend)
when the local file rotates/shrinks. Fully graceful: missing target, closed
master, or transport failure logs a warn to `$LARRY_HOME/log/sync.log` and
returns non-fatally — it can NEVER crash or wedge the larry session.
- **`/headers-sync on|off|status|target <alias>|now`** slash command. `target`
binds the Mac SSH alias; `on`/`off` toggle auto-sync (persisted to
`$LARRY_HOME/.env` as `LARRY_HEADERS_SYNC` / `LARRY_HEADERS_SYNC_TARGET`);
`status` shows enabled?, target, dest, last-sync time, bytes pushed, and
master state; `now` runs one incremental sync on demand. Registered in the
TAB-completion arrays and `/help`.
- **Auto-sync cadence: on larry exit.** The REPL EXIT/INT/TERM handler flushes
headers.log if auto-sync is enabled (cheap + incremental). On-demand
`/headers-sync now` is always available. (After-EVERY-turn cadence was
intentionally deferred to keep this change out of the turn/streaming loop
that v0.8.5 just reworked.)
- **Mac-daemon receive side** (`scripts/headers_log_ingest.py`, not part of the
larry bundle): now resolves `headers-*.jsonl` glob sources under the watched
dirs IN ADDITION to the fixed canonical `headers.log`, and processes ALL
sources with PER-SOURCE offsets — so the Mac's own stream and one or more
work-box streams are surfaced independently. Each fact carries a `source=`
label (the work-box hostname) so the memory layer can tell them apart.
- **Security (Vera PHI audit V7):** headers.log holds only `anthropic-*`
response headers (rate-limit metadata + org id) and HTTP status lines — NO
message body, NO PHI — so syncing is safe. The existing key/password-auth
ControlMaster transport is reused unchanged (not weakened).
## v0.8.5 — 2026-05-27
Diagnose-don't-assume rate-limit cluster fix (Clover #8). Symptom: a `hello`
turn threw `rate_limit_error` on a work-box with 90% of the Claude Max 5h quota
free — so NOT 5h-window exhaustion. Root cause = a short-window BURST rail
tripped by a stream→non-stream **double-send** per turn, with no backoff.
- **Rate-limit backoff + actionable message (ROOT).** A 429 no longer fails the
turn or fires an immediate re-send. `agent_turn` now retries with backoff that
HONORS the `retry-after` header (else exponential 2/4/8s capped at 30s;
`LARRY_RL_MAX_RETRIES`/`LARRY_RL_BACKOFF_MAX` tunable). The error message is now
ACTIONABLE: `_parse_response_headers` captures `retry-after` + which rail
tripped (`anthropic-ratelimit-{requests,input-tokens,output-tokens}-remaining:0`
or `unified-{5h,7d}` for OAuth) and `_humanize_rate_limit` renders e.g.
`rate limit: requests-per-minute exhausted (short-window burst, NOT your 5h
quota) — resets in 38s; retrying with backoff`. `headers.log` now captures the
full header block on ANY 429 (was: OAuth-mode + unified-* header only), tagged
`*** 429 retry-after=Ns rail=… ***`, so the next rate-limit is always
diagnosable.
- **Streaming parse failure no longer double-sends (burst trigger).** A
streaming 429/overload returns a plain JSON error body (not SSE);
`parse_stream_to_response` previously dropped those non-`data:` lines, produced
zero blocks, returned 1, and `agent_turn` blindly re-SENT the whole prompt
non-streaming — a SECOND full API call within the same second (the per-minute
burst). The parser now buffers the non-SSE body and, if it parses as a JSON
error, returns a distinct code so the caller surfaces it WITH backoff instead
of re-sending (single-send invariant: one logical attempt per turn). Also
auto-defaults `LARRY_NO_STREAM=1` on MobaXterm/Cygwin/MSYS (`_is_cygwin_like`)
where SSE parsing is fragile; an explicit `LARRY_NO_STREAM=0` still forces it on.
- **`ErrorPI` mangled error string fixed (CR-taint).** `— ErrorPI error:
rate_limit_error` was a carriage-return overprint: on MobaXterm the response
field `jq -r '.error.type'` carried a trailing `\r`, which (a) broke the
`case "$err_type" in rate_limit_error)` match → fell to the `%s — %s` default
(the stray ` — `), and (b) CR-returned the cursor so the terminal overprinted
"API error" → "ErrorPI". Fix: `strip_cr` on `err_type`/`err_msg` in
`_humanize_api_error`, and `err()`/`warn()`/`log()` now strip embedded CRs
defensively. (The v0.7.5 CR sweep missed the error-DISPLAY construction path.)
- **phi tier-5 notice fires once per session (was per-turn nag).** The
`tier-5 (presidio NER) disabled — sidecar not running` notice printed every
turn because `auto_detect_phi` runs inside `$(...)` command substitution and
the old `export _LARRY_PHI_TIER5_WARNED=1` flag died in the subshell. Now keyed
to a `$LARRY_HOME/.phi-notice-shown` file holding `SESSION_ID` — fires once per
session, survives the subshell, resets for a genuinely new session. Same-pattern
sweep caught the identical subshell-flag bug in `_auto_phi_b64_roundtrip`'s
python3-missing notice (`_LARRY_B64_PY3_WARNED`) — fixed the same way.
## v0.8.4 — 2026-05-27
- **Installer/updater now detects HTML-sign-in-page responses and fails loud
instead of silently corrupting.** Root cause (Clover #5's diagnosis,
`Deliverables/2026-05-27-cloverleaf-larry-stuck-update-and-tab-bug.md`): a
private/sign-in-gated Gitea answers an unauthenticated raw-file read with the
**HTML Sign-In page at HTTP 200** (303 → `/user/login`, followed by `curl -L`
to a 200 HTML page). `curl -fsSL` treats that as success, so the old
installer/auto-updater parsed the HTML as VERSION/MANIFEST/`larry.sh` content
— silently aborting, or overwriting real on-disk files with HTML soup. This
is exactly what stranded a work-box at v0.7.3 until the Gitea
`REQUIRE_SIGNIN_VIEW=false` flip.
- **New `lib/fetch-safe.sh`** — a content-validating fetch wrapper
(`fetch_validate URL DEST KIND [MAX_TIME]`). After every `curl`, BEFORE
trusting the bytes, it (a) detects the HTML-login trap (`<!DOCTYPE html` /
`<html` / `Sign In - Gitea` / `<title>Sign In` markers, or a `text/html`
`Content-Type` when a raw file was expected) and (b) validates the content
shape per file type: VERSION must match `^[0-9]+\.[0-9]+\.[0-9]+`, MANIFEST
must be a path-list with no HTML, `larry.sh` must start with
`#!/usr/bin/env bash`, other `.sh` must be non-HTML. On any failure it prints
an actionable error and returns non-zero **without overwriting the target**.
The bootstrap `install-larry.sh` (curl|bash, runs before any lib exists) and
`larry.sh`'s `self_update()` (runs before lib is sourced) each carry a
byte-identical inline copy; the canonical file is in MANIFEST and auto-syncs.
- **Every remote-content fetch hardened.** `install-larry.sh` `fetch()`;
`larry.sh` agent fetch, `sync_from_manifest` MANIFEST + per-file fetches, and
`_fetch_with_fallback` (Phase-B VERSION + larry.sh) all route through the
validator. No trusted-content fetch still uses raw `curl -fsSL`.
- **Optional `LARRY_GITEA_TOKEN` (alias `GITEA_TOKEN`) for authenticated
fetch.** When set, fetches add `Authorization: token <PAT>` so the
installer/updater works against a PRIVATE repo without the public-flip. The
token is never hardcoded and never logged. Documented in `--help` + MANUAL.md.
## v0.8.3 — 2026-05-27
- **Tab-completion trailing space no longer breaks command dispatch.** The
slash-command completer intentionally appends a trailing space after a
unique match (so arg-taking commands feel snappy), but the main_loop
dispatcher matched exact `case` globs, so `/quit ` (completed) missed the
`/quit)` arm and fell through to "unknown command". Latent since v0.6.6
when tab completion shipped. Fixed by rtrimming the dispatch key once at
the `case "$input"` boundary (`larry.sh`), which tolerates the completer's
space, a user-typed trailing space, and any CR remnant while preserving
interior `/load FILE` argument spacing. Added a shared `rtrim()` helper to
`lib/cygwin-safe.sh` (and the inline fallback) next to `strip_cr`.
## v0.8.2 — 2026-05-27
Microsoft Presidio sidecar for free-text NER. Closes V1 from Vera's audit —
the dominant real-world failure mode (patient names, addresses, un-keyworded
dates in prose chat). Opt-in install; larry runs in v0.8.1 mode on hosts
where Presidio isn't installed (MobaXterm/Cygwin per Bryan's accepted
tradeoff).
- **`lib/phi-presidio-sidecar.py`** — FastAPI service on
`127.0.0.1:$LARRY_PHI_PORT` (default `41189`). Wraps Presidio's
`AnalyzerEngine` + `AnonymizerEngine` over spaCy `en_core_web_sm`
(12MB model, ~9-second cold start). Two endpoints: `POST /redact`
takes `{"text": "..."}` and returns `{"redacted": "...", "entities":
[...], "latency_ms": N}`; `GET /health` for the launcher's readiness
probe. Three HL7-specific custom recognizers added (`HL7_MRN` for
6-12 digit numerics with patient/MRN/account context; `HL7_CARET_NAME`
for `SMITH^JOHN` outside Tier-3 line context; `HL7_PHONE_BARE` for
plain 10-digit phones). Confidence threshold for tier-5 tokenize is
0.3 (below that is too noisy).
- **`lib/phi-sidecar.sh`** — lifecycle launcher. Subcommands:
`start / stop / status / health / ensure`. `ensure` is idempotent
(no-op if already up); called from `larry.sh` main_loop startup,
backgrounded so it never blocks larry's first prompt. Waits up to
30 seconds for the sidecar to become healthy after `start`; surfaces
the log tail if startup fails. PID file at
`$LARRY_HOME/.phi-sidecar.pid`; log at `$LARRY_HOME/log/phi-sidecar.log`.
Honors `LARRY_PHI_VENV` env to use a dedicated virtualenv (which the
installer sets up at `$LARRY_HOME/phi-venv` when the user opts in).
- **`lib/phi-client.sh`** — bash wrapper around `/redact`. Sourceable
functions: `phi_client_available`, `phi_redact_text`, `phi_redact_entities`.
Also runs standalone as a CLI (`./phi-client.sh check / redact / entities`).
CR-safe (sources `cygwin-safe.sh` defensively); 5-second curl timeout
bounds any tier-5 stall.
- **Tier-5 integration in `larry.sh:auto_detect_phi`.** New stage AFTER
the existing tier-1/2/3/4 substitution and BEFORE the status summary.
Sources `phi-client.sh` lazily, probes `phi_client_available`, and on
success runs `phi_redact_entities` to get Presidio's per-entity output.
Each entity is tokenized through the SAME `hl7-sanitize.sh tokenize-value`
pipeline as tiers 1-4 (category prefixed `presidio_<TYPE>`) so token IDs
remain stable across surfaces and the `/tokens` listing stays unified.
Tier-5 honors `LARRY_AUTO_PHI=confirm` (prompts Y/n once per value) and
`strict` (aborts the turn if `tokenize-value` fails on a Presidio hit).
Critically, v0.8.2 removes the v0.7.3 early-return that exited
`auto_detect_phi` when tiers 1-4 found nothing — pure-prose input now
ALWAYS reaches tier-5.
- **Graceful degradation.** If the sidecar is unreachable (not installed,
not started, crashed), tier-5 silently no-ops with a one-time stderr
warning per session. Larry's REPL remains fully functional in v0.8.1
mode. `LARRY_AUTO_PHI=strict` does NOT abort on absent sidecar (the
strict mode escape is for HL7-shaped content where rule-pack would
have caught the leak; tier-5 is additive coverage).
- **`/phi-sidecar` slash command** — `start / stop / status / health /
ensure` exposed to the user. Slash-completion table and `_LARRY_SLASH_CMDS_DESC`
updated.
- **`install-larry.sh` install path.** On hosts with Python 3.9+ + pip,
the installer prompts before creating `$LARRY_HOME/phi-venv` and
installing `presidio_analyzer + presidio_anonymizer + fastapi +
uvicorn + spaCy en_core_web_sm` (~400MB on disk, ~250MB RAM resident).
On MobaXterm/Cygwin without python3, the installer skips the prompt
entirely and prints Bryan's accepted tradeoff (MobaXterm stays on
v0.8.1 + nudges). Re-runnable; idempotent.
- **MANIFEST.** Added three new lib files. They auto-sync to every
running client on next launch; clients without Python 3 won't run
the sidecar but the files are harmless to ship.
**Prototype validation (Bryan's Mac, Apple Silicon, Python 3.14).**
Cold start (model load): ~9 seconds with `en_core_web_sm` (vs ~82s with
the larger `en_core_web_lg` Presidio auto-downloads by default — we
explicitly pin `_sm` for the latency-sensitive REPL use case). Warm
analyzer latency: P50 20.6ms, P95 22.7ms over 20 sequential requests
on 100-word input. End-to-end HTTP round-trip (curl + json roundtrip):
P50 ~57ms warm; first request post-startup pays a ~150ms tokenizer
warmup tax then steady. Well under the 200ms-per-turn REPL budget.
Detection quality on the canonical "John Doe MRN 623000286" sample: 8
core entities caught (PERSON x2, DATE_TIME x2, PHONE_NUMBER, US_*),
plus the three custom HL7 recognizers add MRN + caret-name + bare-phone
coverage. Misclassifications (MRN as US_PASSPORT, "ED" as PERSON) are
within tolerance for the tokenize-everything-suspicious policy — the
auto-PHI lookup table sees them as `presidio_*` categories and the
operator can audit via `/tokens`.
**MobaXterm compatibility verdict.** Per Bryan's accepted tradeoff:
v0.8.2 ships Mac/Linux-only. MobaXterm/Cygwin stays on v0.8.1
(rule-pack + path-block + content-shape gating + strict mode + base64
round-trip + tool-result review gate). Test path: install-larry.sh
detects platform and skips the Presidio install on `windows-cygwin`
with a clear "v0.8.1 mode" note. No code in larry.sh is platform-gated
— tier-5 silently no-ops when the sidecar is absent, which IS the
MobaXterm path.
**Proactive same-pattern sweep.** Searched for other call sites where
free-text NER would help: tool-result surface already gets HL7-shape
sanitize (v0.8.1) and base64 round-trip (v0.8.1-c). Tier-5 is
user_input-only by design — tool-result free-text NER deferred to a
future patch (would require deciding on per-tool latency budgets;
Bryan to call when needed).
## v0.8.1 — 2026-05-27
Tool-result PHI gating expansion. Closes V2 / V12 and the V2 base64 sub-gap
from Vera's audit. No behavior change for users not on HL7-shaped data;
opt-in friction for the 8KB+ tool-result review gate.
- **Tool-name allow-list dropped; content-shape gating only.** The v0.7.3
tool-result auto-PHI gate ran only on `read_file (.hl7|.txt)`, `nc_msgs`,
`hl7_field`, `hl7_diff`. v0.8.1 runs `_auto_phi_looks_like_hl7` on
EVERY tool result. On hit → route through `lib/hl7-sanitize.sh`.
On miss → pass through unchanged. Closes V2: `bash_exec`/`ssh_exec`/
`grep_files`/`read_file` of `.log`/`.csv`/`.dat`/no-suffix files are
now all covered when their output is HL7-shaped. False-positive cost
is cheap (extra regex pass with zero behavioral impact on non-HL7).
- **Base64-wrapped HL7 round-trip.** New `_auto_phi_b64_roundtrip` helper.
Detects candidate base64 runs (length >= 200 chars, `[A-Za-z0-9+/=]`
only, length divisible by 4 — NOT entropy-based, per Pax §V2-sub:
HL7's repetitive prefixes survive base64 with LOW entropy). Speculatively
decodes each run; if decoded bytes look like HL7, routes through
`hl7-sanitize.sh` and re-encodes (`base64 -w0`) back into the result.
Catches `ssh_pull_smat` sampled mode TSV (server-side encoding kept
for binary-safe TSV transport; client-side unwrap handles the safety
concern). Requires `python3` (installed everywhere larry-anywhere
runs); skipped with a one-time stderr warning if unavailable.
- **Operator review gate for `bash_exec`/`ssh_exec`/`ssh_pull`/
`ssh_pull_smat` results.** When the tool produced HL7-shaped output OR
the result exceeds `LARRY_TOOL_RESULT_REVIEW_THRESHOLD` bytes
(default 8192), Larry prompts `[Y/n/i]` before passing the result
back to the model. `i` opens the result in `$PAGER` then re-prompts.
Default Y — zero friction by default. `N` substitutes a refusal JSON
so the model knows a result was withheld. Skipped when
`LARRY_AUTO_PHI=off` (consistent with the opt-out) OR running
non-interactively (no TTY — never blocks headless scripts).
Override with `LARRY_TOOL_RESULT_REVIEW=always` to gate every result.
Per Pax §V2/V12: closes the "operator wanted to see this themselves,
didn't want the model to see it" gap that's the actual common case.
**Proactive same-pattern sweep.** Searched the codebase for other call
sites where tool output bypasses content-shape gating: found only the
one in `agent_turn`. The v0.8.0-c strict-mode tool-result branch was
hardened in lockstep so it now triggers on the broader (content-only)
eligibility instead of the old name-allow-list.
Manifest unchanged.
## v0.8.0 — 2026-05-27
PHI-safety quick-wins pack — three independent zero-risk patches closing
four gap-classes Vera identified in the v0.7.5 static audit
(`Deliverables/2026-05-27-cloverleaf-larry-phi-leak-audit.md`) with Pax's
recommended mitigations
(`Deliverables/2026-05-27-cloverleaf-larry-phi-mitigation-research.md`).
No new dependencies, no behavior change for users not interacting with PHI.
- **`read_file`/`grep_files`/`glob_files`/`list_dir` path-block list
(closes V4 + V6 + V11).** Refuse — with a structured JSON error the
model must surface, NOT a silent "file not found" — any tool-side
attempt to read or enumerate under `$LARRY_HOME/log/` (auto-phi.log,
headers.log, oauth.log, session logs), `$LARRY_HOME/sanitize/`
(lookup.tsv — the desanitization key), `$LARRY_HOME/sessions/`,
`$LARRY_HOME/.oauth.json`, or `$LARRY_HOME/.env`. Block-list resolves
`$LARRY_HOME` at call time (not script-parse time) and runs against
both the literal path and its `realpath -m` canonical form, so symlink
detours don't bypass. The proactive same-pattern sweep (Bryan standing
rule, 2026-05-27) extended the block from `tool_read_file` alone to
also cover `tool_grep_files`, `tool_glob_files`, and `tool_list_dir`
— those tools would otherwise leak filenames or grep-matched content
out of the same protected dirs without any approval gate.
- **`/load <file>` HL7 pre-routing (closes V3).** When the loaded file's
content matches `_auto_phi_looks_like_hl7`, route it through
`lib/hl7-sanitize.sh` (the segment-aware tokenizer with the full PHI
field rule set: PID, PV1, NK1, GT1, IN1, OBR, OBX, DG1, ORC) BEFORE
the existing user_input auto-PHI pass. Closes the gap where smat dumps
loaded via `/load` only got the lighter per-word classifier, which
misses bare HL7 PID fields. Status line reports how many fields were
tokenized: `phi> /load: hl7-sanitize.sh tokenized N HL7 field(s) from
<file> before passing to auto-PHI`. Strict mode (see below) aborts the
`/load` if sanitize fails; default/confirm modes warn-and-continue.
- **`LARRY_AUTO_PHI=strict` fail-closed mode (closes V5).** New fourth
value alongside `off / on / confirm`. In strict mode, the auto-PHI
pipeline aborts the surrounding turn (no payload built, no API call)
when: (a) `lib/hl7-sanitize.sh` is missing/non-executable on HL7-shaped
user_input, (b) the sanitizer returns empty on HL7-shaped content,
(c) any single value's `tokenize-value` call fails inside the
detection loop. On the tool-result surface (which can't kill the
in-flight tool_use), strict mode substitutes the result with a
structured JSON refusal sentinel so the raw HL7 NEVER reaches the
model. Existing `off / on / confirm` semantics unchanged (still
fail-open per Bryan's "don't break tools" priority). Strict is the
opt-in tradeoff for HIPAA work where a silent leak is worse than a
broken turn. `/phi-auto strict` toggle and `/help` text updated.
Wired into both auto-PHI invocation sites: user input scan and the
tool-result HL7 sanitizer gate.
**Proactive same-pattern sweep (Bryan standing rule, 2026-05-27).**
Searched the codebase for other tools matching the pattern "reads
arbitrary path, returns content to model, no approval gate": found and
patched `tool_grep_files`, `tool_glob_files`, `tool_list_dir`
alongside `tool_read_file`. `bash_exec`/`ssh_exec` already require Y/N
operator approval — the operator is the gatekeeper there (a second gate
deferred to v0.8.1). No other matches.
Manifest unchanged (no new files in `lib/`).
## v0.7.5 — 2026-05-27
Three focused changes, one common cause: the Cygwin/MobaXterm CR-taint pattern
that crashed OAuth on Bryan's v0.7.3 work-box with the cryptic error
`bash: ...: arithmetic syntax error: invalid arithmetic operator (error token is "")`.
- **OAuth/arithmetic CR fix.** `lib/oauth.sh` now routes every operand entering
a bash arithmetic context (`fetched_at`, `expires_in`, `now`) through a
dedicated `coerce_int` helper that strips non-digits at the source. The
failure mode: `$(date +%s)` against a Cygwin pty where Windows-native
`date.exe` shadows Cygwin `date` can return a CR-tainted epoch like
`"1779999999\r"`, which crashes the very next `$((expires_at - now))`.
Diagnosis in `Deliverables/2026-05-27-cloverleaf-larry-oauth-arithmetic-fix.md`.
- **Mouse mode is opt-in.** REPL mouse handling now defaults to OFF and is
enabled via `LARRY_MOUSE=1` env var or `/mouse on` slash command. Several
terminals (notably MobaXterm and stripped tmux) were swallowing the mouse
ANSI sequences and printing literal `^[[?1000h` garbage when v0.7.0 turned
it on unconditionally. Diagnosis in
`Deliverables/2026-05-27-cloverleaf-larry-mouse-regression-fix.md`.
- **CR-safety sweep across `lib/*.sh` and top-level scripts.** Three new
primitives in `lib/cygwin-safe.sh` (sourced by every tool family member):
- `coerce_int VAL [DEFAULT]` — for arithmetic and integer-test operands
- `strip_cr VAL` — for case patterns, regex tests, paths, HTTP headers
- `read_clean VAR [PROMPT]``read -r` wrapper that strips CR pre-assign
Hardened call sites:
- `larry.sh` — status-line `date +%s` / `tput cols`, three y/N approval
prompts (write_file, bash_exec, first-run auth), API-key paste,
first-run auth menu
- `lib/oauth.sh``cmd_login` and `cmd_refresh` `date +%s` captures
- `lib/nc-engine.sh` — five y/N action prompts (stop/start/bounce, resend,
route-test, testxlate, tpstest) + `find ... | wc -l` arithmetic
- `lib/nc-msgs.sh``parse_time_ms` `date` captures (4 sites),
meta-TSV `tm` field, `MSG_COUNT` `wc -l`
- `lib/nc-regression.sh``tr | wc -c` count, hl7-diff `?`-fallback
arithmetic
- `lib/nc-smat-diff.sh``A_COUNT`/`B_COUNT`/`DIFFS_TOTAL`
- `lib/nc-insert-protocol.sh` — every awk-emitted line-number that feeds
`head -n $((N-1))` / `tail -n +$((N+1))` arithmetic
- `lib/journal.sh``_next_seq` `wc -l` arithmetic
- `lib/lessons.sh``_next_id`, `cmd_list`, `cmd_count` arithmetic +
two y/N prompts (clear all, clear since)
- `lib/hl7-sanitize.sh``cmd_count` arithmetic + clear-table y/N
- `lib/ssh-helper.sh` — local + remote `wc -c` integer compares (4 sites)
- `lib/nc-find.sh``wc -l` count for `%d` printf
- `lib/nc-table.sh``$(date +%s)` in backup-filename construction
- `lib/nc-document.sh` — two `wc -l | %d` printf sites
- `larry-rollback.sh` — Proceed? y/N prompt
Reproduction (now exercised by `cygwin-safe.sh`'s in-line tests):
```
now=$(printf '%s\r' 1779999999); echo $((now - 1)) # pre-fix: crashes
now=$(coerce_int "$(printf '%s\r' 1779999999)" 0); echo $((now - 1)) # fix: 1779999998
```
Added `lib/cygwin-safe.sh` to `MANIFEST` so it auto-syncs to every running
client on next launch.
## v0.7.4 — 2026-05-27
- Drop GitHub fallback from auto-update. Single-source Gitea
(`https://git.bjnoela.com/bryan/cloverleaf-larry.git`).
## v0.7.3 — 2026-05-26
- Automatic PHI detection (tiered detection + blacklist contexts).
## v0.7.2 — 2026-05-26
- Gitea becomes primary auto-update origin; GitHub demoted to fallback.
## v0.7.1 — 2026-05-26
- Status line moves to between-turn position (post-input, pre-response).
- Status line below prompt; automatic PHI detection; session-artifact upload.
## v0.7.0 — 2026-05-26
- HL7-aware tab completion + REPL mouse mode (later made opt-in in v0.7.5).