Root-cause fix for the live-session friction where "how many sites are on qa?" stalled on repeated `export $HCIROOT` nags despite a working `qa` SSH alias: 1. $HCIROOT login-shell fix: ssh-helper.sh `exec` now wraps remote commands in `bash -lc` so the Cloverleaf login profile sources and $HCIROOT/$HCISITE/PATH populate as for an interactive operator login. Escape hatch: NOLOGIN prefix or LARRY_SSH_NO_LOGIN=1. pull-smat find/sample use the same wrapper. 2. Both-mode detection: startup surfaces a MODE= line (LOCAL / REMOTE / UNKNOWN) and leads with what it found instead of asking for paths. 3. First-class list_sites tool + /sites [alias]: enumerates sites in both modes (hcisitelist fast-path, NetConfig-walk fallback) via new ssh-helper discover. 4. System-prompt de-nagging: agents/larry.md + env-diff/regression prompts no longer tell Larry to ask Bryan to export $HCIROOT for a reachable host. 5. Streaming slowness (dominant residual): new pure-bash _json_str_decode un-escapes the common escape-free delta with zero forks, halving per-turn jq forks on top of v0.8.12. Round-trip verified. 6. pull-smat path capture hardened (Vera Minor #1): resolved path now emitted behind a SMATDB_PATH: sentinel and selected by pattern not position, so a login-shell MOTD/banner on stdout can't be mistaken for the path; falls back to prior tail -1 when no sentinel present. Selection logic unit-verified. Vera gate: PASS-WITH-NOTES (v0.8.13). bash -n clean on larry.sh + ssh-helper.sh; MANIFEST regenerated (48 entries) and --check clean. Co-Authored-By: Clover (Claude Opus 4.7) <noreply@anthropic.com>
47 KiB
Changelog
All notable changes to cloverleaf-larry / larry-anywhere are recorded here.
Versioning is loose-semver; bumps trigger the in-process self-update on every
running client via LARRY_BASE_URL + MANIFEST.
v0.8.13 — 2026-05-28
Proactive both-mode Cloverleaf-env detection + the $HCIROOT login-shell fix +
a first-class site lister + the dominant residual-slowness fix (Clover). Closes
the live-session friction where "how many sites are on qa?" stalled on repeated
export $HCIROOT nags despite a working qa SSH alias.
$HCIROOTlogin-shell fix (root cause).ssh-helper.sh execran the remote command in a NON-login, non-interactive shell, so the Cloverleaf login profile (which exports$HCIROOT,$HCISITE, and thehci*PATH) never sourced and$HCIROOTarrived empty — and Larry gave up and asked for a path.execnow wraps the command inbash -lc(login shell), so the remote env populates exactly as for an interactive operator login. Version-agnostic, zero-config. Escape hatch:NOLOGINprefix orLARRY_SSH_NO_LOGIN=1. Thepull-smatfind/sample paths use the same wrapper so$HCISITEDIR/sqlite3resolve there too.- Both-mode detection (proactive). Startup now determines and surfaces a
MODE=line: LOCAL ($HCIROOTset by the local profile, or a Cloverleaf install auto-discovered at a common path — work the local tree, no SSH), REMOTE (no local install but a Cloverleaf SSH alias configured — discover over a login shell), or UNKNOWN (ask which mode applies). Larry leads with what it found instead of asking Bryan to spoon-feed paths. - First-class
list_sitestool +/sites [alias]. "How many sites are on X / what sites exist" now Just Works in both modes: REMOTE (list_sites(alias=qa)) resolves the remote$HCIROOTin a login shell and enumerates sites (Cloverleafhcisitelistfast-path, NetConfig-walk fallback); LOCAL (list_sites()) does the same against the detected local$HCIROOT. Reports the resolved HCIROOT, a count, and the names. Newssh-helper.sh discover <alias>backs the remote path. - System-prompt de-nagging.
agents/larry.mdand the/nc-diff-env//nc-regression-envtemplated prompts no longer instruct Larry to ask Bryan toexport $HCIROOTfor a reachable host. The cardinal rule is explicit: never request a path you can resolve; the only remote precondition surfaced is an open ControlMaster (/ssh-setup <alias>). - Streaming slowness — dominant residual fix. v0.8.12 collapsed per-delta
routing to one jq call, but each
text_delta/input_json_deltaSTILL forked a second jq purely to un-escape the@json-encoded payload — ~N forks/turn (N≈output tokens), and on Cygwin/MobaXterm a fork is ~50-100ms. New_json_str_decodeun-escapes in pure bash for the common escape-free chunk (zero forks), deferring to jq only when a real backslash escape is present. Halves per-turn fork count on top of v0.8.12. Round-trip verified for tab, escaped quote/backslash, embedded newline,\uXXXXUnicode, and empty. pull-smatpath capture hardened (Vera Minor #1). The remote.smatdbpath lookup previously tooktail -1of the merged stdout+stderr stream. Under the new login shell (item 1) a profile MOTD/banner can land on stdout, which a blindtail -1could mistake for the path. The remotefindnow emits the resolved path behind aSMATDB_PATH:sentinel; the client selects by pattern, not position, and only falls back to the priortail -1when no sentinel is present (so no host that worked before can regress). Selection logic unit-verified for banner-before, banner-after, ERROR, empty, multi- sentinel, and no-sentinel-fallback.
Compatible with bash 3.2 / Cygwin; no regression to the v0.8.12 fixes.
v0.8.12 — 2026-05-27
Crash fix + slowness + cost pass on the new API-key rail (Clover). Full
diagnosis: Deliverables/2026-05-27-cloverleaf-larry-v0812-crash-slowness-cost.md.
- Fix the post-response arithmetic crash (
bash: <n>: syntax error: invalid arithmetic operator (error token is "")). The token/cost accounting fed jq-extracted usage counts straight into$(( )); on Cygwin/MobaXterm a CRLF-translated response body makesjq -remit1234\r, and// 0only guards JSON null, not the CR.coerce_intnow defends all three accounting sites (non-streaming cost block, streaming cost block,_record_ctx_used). This is the v0.7.5 OAuth crash's Anomaly-#4 recurrence on the response path. - Prompt caching wired into the request (cost). The ~12.7K-token static
prefix (system ~6K + 35 tools ~6.7K) was re-sent uncached every turn. v0.8.12
sends
systemas a block array and marks the last tool withcache_control: ephemeral(apikey rail;LARRY_PROMPT_CACHE=0to disable), so the prefix is billed at the 0.1x cache-read rate after turn 1 — a ~90% cut on the static prefix and a large TTFB reduction. - Streaming parser per-delta jq forks collapsed (slowness, dominant fix).
The SSE
content_block_deltahot path spawned 3jqprocesses per event (.index,.delta.type, then the text/json payload). A normal response ships dozens-to-hundreds of these, and on Cygwin/MobaXterm a process fork is ~50-100ms (Windows fork emulation) — so the per-turn render lagged multiple seconds ("veeery slow"). The routing fields are now pulled in ONE jq call (payload@json-encoded so embedded newlines/tabs survive the line read), cutting the text path to 2 forks/event and ignored deltas to 1. Verified round-trip on text with embedded NL/TAB/quote and oninput_json_delta. - Per-turn PHI sidecar
/healthprobe cached for the session (slowness).phi_client_availablerancurl -m1every turn; on MobaXterm the sidecar can never run, so it was a guaranteed curl fork + up-to-1s wait per turn. Now probed once per session (file-flag, survives the$(...)subshell);LARRY_PHI_REPROBE=1forces a fresh probe.
v0.8.11 — 2026-05-27
Two headline features land together (Clover):
- API key is now the default auth rail — sanctioned, not edge-throttled;
OAuth-impersonation disabled by default to protect the user's Max account;
secure per-client key via
/set-api-key, stored 0600, CR-safe, masked in diagnostics. - Manifest-hashing auto-update — relaunch skips unchanged files by comparing each MANIFEST sha256 to a local hash, so only changed/missing files download. Relaunch drops from minutes to seconds.
1. API key as the default auth rail
Bryan's decision (2026-05-27): the durable fix for the multi-hour OAuth
rate_limit_error is to STOP impersonating the official Claude Code client and
move to the sanctioned API-key rail. Anthropic actively fingerprints and blocks
Claude-Code OAuth impersonation (server-side enforcement since ~2026-01-09, ToS
change 2026-02-19/20); every impersonated OAuth request flags the user's Max
account. The API key (sk-ant-api03-) is the intended programmatic rail — a
plain x-api-key request, billed pay-as-you-go, not edge-throttled. (Research:
Deliverables/2026-05-27-claude-code-oauth-request-requirements-research.md.)
- API key is the default.
LARRY_AUTH_MODEdefaults toapikey. The request shape is the clean sanctioned form:x-api-key+anthropic-version: 2023-06-01+content-type: application/json— noAuthorization: Bearer, noanthropic-beta: claude-code-*, noclaude-cli (external, cli)UA, nox-app: cli, no?beta=true, no "You are Claude Code" system block. All of the prior impersonation scaffolding was removed. - OAuth is OFF by default (opt-in only). larry fires OAuth ONLY when
LARRY_AUTH_MODE=oauthis set explicitly, and prints a one-time account-risk warning when it does. There is no silent OAuth fallback — larry never auto-pokes the impersonation tripwire. The opt-in OAuth request is minimal and honest (Bearer+anthropic-beta: oauth-2025-04-20only). - Secure per-client API-key provisioning + storage (Bryan's core ask):
/set-api-key(andlarry-auth.sh --api-key) prompts withread -s(silent, never echoed, never in argv / process table / shell history), optionally validates with one cheap/v1/messagesping (max_tokens:1), then stores the key at$LARRY_HOME/.api-key(mode 0600, owner-only), CR-stripped (MobaXterm/Cygwin CRLF-safe).--clearremoves it;--statusshows it masked. Each client holds its OWN key (mint one per machine at console.anthropic.com — independently revocable, leak-contained).- The key is fed to curl via
--config -on stdin, so it never appears in curl's argv / the process table (ps-clean), in all request and validation paths.
- Secret hygiene: the key is never logged, never committed (added to
.gitignoreand to larry's PHI/secretread_filepath-block), and is masked assk-ant-api03-XXXX…last4in/auth,/auth-debug(alias/api-debug), and/set-api-key --status. The audit-trail secret-guard'ssk-ant-[A-Za-z0-9_-]{20,}pattern catchessk-ant-api03-specifically. - 429-discrimination (reused from the prior thread's good work): a real
rate-limit 429 ALWAYS carries
anthropic-ratelimit-*headers → legitimate backoff; a 429 with NO such headers on the API-key rail → clear "edge/ transient bounce, not your quota" message (no futile backoff). On the opt-in OAuth rail, that same edge-reject signature triggers an automatic flip to the sanctioned API-key rail if a key is configured (LARRY_NO_EDGE_FALLBACK=1to opt out). - Migration UX: first launch with the apikey default and no key → friendly
prompt to run
/set-api-key(not a bounce into the risky OAuth path).
2. Manifest-hashing auto-update speedup
sync_from_manifest used to re-download all 48 manifest entries every
relaunch over authenticated HTTPS (Gitea via proxy + Cloudflare) and cmp
locally to find the 0–3 that changed — ~3 min on the work-box for a 3-file
update, because the MANIFEST was paths-only and the client could not tell what
changed without fetching everything.
- MANIFEST now ships each file's expected sha256 (
path<TAB>sha256, generated at release byscripts/make-manifest.sh). The client fetches MANIFEST once, hashes its LOCAL copy of each path, and downloads ONLY entries whose hash differs or are missing. 48 round-trips → 1 (MANIFEST) + a local hash pass + N real downloads (N = actually-changed files, usually 0–3). Relaunch drops from minutes to seconds. - Fail-SAFE: a doubt NEVER skips an update. A download is skipped only when ALL hold — a working sha256 tool, a valid 64-hex manifest hash, the local file exists, and its hash matches exactly. No tool / empty / malformed / non-hex hash, missing local file, hash-tool error, or any mismatch (including a stale or wrong published hash) all fall through to download. Worst case is a needless re-download, never a missed update.
- sha256 tool fallback chain:
sha256sum→shasum -a 256→openssl dgst -sha256→ (none → full-download fallback, updater never breaks). Detected once, cached. Tool output normalized to bare lowercase 64-hex. - CR-safe: the MANIFEST is fetched from Gitea (CRLF risk); the whole line is
CR-stripped before splitting and the hash-tool output is
tr -d '\r''d, so a CRLF-tainted hash never forces a needless re-download. - Download path unchanged from v0.8.9 — still routes through
fetch_validate(HTML-sign-in-trap detection), keeps the post-downloadcmpguard (idempotent),chmod +xfor*.sh, and per-file--max-time. The v0.8.9 live progress indicator now renders over the new "verifying (local)" phase and the (fewer) "downloading" frames. New summary line:manifest sync: N updated, M unchanged (local hash), F failed, T total. - Release tooling (committed, not synced to clients):
scripts/make-manifest.shregenerates/checks the MANIFEST hashes;scripts/hooks/pre-commitblocks a commit whose MANIFEST hashes drifted. These are release-side only — the work-box consumes manifests, never generates them — so they are deliberately NOT listed in MANIFEST.
Deliverables: Deliverables/2026-05-27-cloverleaf-larry-api-key-default-rail.md,
Deliverables/2026-05-27-cloverleaf-larry-manifest-hashing-speedup.md.
v0.8.9 — 2026-05-27
Manifest-sync live progress indicator (Clover). Symptom: Bryan's auto-update
relaunch is very slow and looks frozen — a multi-minute silent gap between
update found: X -> Y … relaunching and manifest sync: N updated, 0 failed, M total, with NO output in between. Observed: [21:32:29] → [21:35:37] = ~3 min
of silence to update just 3 of 48 files; earlier [20:19:42] → [20:20:32] =
~50s for 22 files. Bryan cannot tell working-but-slow from hung.
Root cause (confirmed by reading sync_from_manifest). Phase-A sync does NOT
do cheap HEAD/hash-checks — it fully downloads EVERY manifest entry over an
authenticated HTTPS round-trip (Gitea via the corporate proxy + Cloudflare),
then uses cmp -s locally to decide whether the bytes actually changed
(discarding unchanged ones). With 48 entries that is 48 sequential full
downloads, every relaunch, regardless of how few files changed. The loop emits
nothing until the trailing summary log line → the entire window is silent →
looks hung. The "3 files changed" in the summary is just how many survived the
cmp; all 48 were still fetched.
- Live in-place progress over the WHOLE sync. New
_sync_progress/_sync_progress_donehelpers renderchecking N/48 lib/foo.shper entry, rewriting the line via\r\033[K(carriage-return + clear-line). When a fetch actually lands a changed file, the frame switches todownloading N/48 lib/foo.shso real writes are distinguishable from the common unchanged case. The current filename is always shown, so a genuine stall is VISIBLE — you see exactly which file it is stuck on instead of a blank freeze. - MobaXterm-safe escapes only. Uses solely
\r+ESC[K(the same primitive already at the readline prompt, audited safe in the v0.8.7 escape inventory). Deliberately NO DECSTBM scroll-region, cursor save/restore, or absolute-row addressing — the exact escapes MobaXterm mis-honors. Verified under a pty: onlyESC[0m,ESC[2m,ESC[Kever emitted; zero forbidden escapes. Independent ofLARRY_NO_STREAM/ mouse mode (gates only on a TTY). - Non-TTY safe. Gates on
[ -t 2 ]. When stderr is a pipe/log it emits a plain newline-terminated heartbeat every 10 files (no\r), so captured logs stay clean — verified zero\rbytes in piped output. - Heartbeat for hangs. The per-file fetch already carries curl
--max-time(5s for VERSION, 15s otherwise); a stuck file now shows its name, times out, counts as a fail, and the loop advances — never an infinite silent stall. - Speedup deferred (by design). The clean fix — fetch the manifest once,
compare per-file hashes locally, and SKIP unchanged files with NO per-file
network round-trip — requires the MANIFEST to carry hashes. It currently
carries paths only (no hashes/sizes), so the optimization would require a
manifest-format + release-tooling change (higher risk, separate change). Filed
as a follow-up:
MANIFESTshould emitpath<TAB>sha256so v0.9.x can turn 48 network round-trips into 1 fetch + local compare. This release ships the indicator (Bryan's explicit ask); correctness of the sync is unchanged.
Note: the v0.8.9 relaunch itself is still the slow path (the indicator helps from the NEXT update onward), but you now see live forward progress during it.
v0.8.8 — 2026-05-27
Force unconditional 429 header capture (Clover). Symptom: Bryan's MobaXterm
work-box hits rate_limit_error repeatedly, but $LARRY_HOME/log/headers.log
NEVER generates — so we cannot diagnose which rate-limit rail / auth path is
failing. The single goal: guarantee the log generates on the NEXT 429 so Bryan
can tail it and paste it (manual paste is the plan; auto-sync is dropped).
Call-flow trace (first, to disprove the deeper hypothesis). Bryan's box
runs LARRY_NO_STREAM=1 (auto-set on MobaXterm since v0.8.5), so agent_turn
takes resp=$(call_api …). call_api (larry.sh) ALWAYS dumps response headers
via curl -D and ALWAYS calls _parse_response_headers on that dump after curl
returns — regardless of HTTP status (it explicitly comments "headers carry
rate-limit info even on 429s"). So the non-stream 429 path WAS reaching the
parser. The parser was NOT the missing call — the bug was entirely the
over-clever write-gate INSIDE the parser.
Root cause = the write-gate was too clever for its own purpose. The v0.8.5
gate wrote headers.log only if (OAuth-mode AND a unified-* header was present) OR (retry-after was non-empty). Bryan's 429s carry NEITHER: the backoff used
the exponential 2/4/8s fallback (proving no server retry-after), and a
per-minute burst 429 routinely omits the unified-* family. Neither branch
fired → no write → no log → no diagnosis. The capture defeated its own purpose.
- Unconditional write on ANY 429, detected from the status line.
_parse_response_headersnow greps the-Ddump for^HTTP/<ver> 429(CRLF-tolerant) and, on a match, ALWAYS writes the full raw header block to$LARRY_HOME/log/headers.log— regardless ofretry-after,unified-*, or auth mode. A bare 429 with no diagnostic headers STILL logs; that absence is itself the finding (signals a low/bare-tier limit). - 429s exempt from the OAuth 50-call cap. New
STATUS_429_headers_loggedcounter with its own budget (STATUS_429_HEADER_LOG_LIMIT=200), independent of the 200-path OAuth sampling cap. A session that burned all 50 OAuth captures on successful calls STILL logs its next (51st-call) 429. - Full diagnostic dump. The 429 block writes: a banner with
auth-mode(OAuth-Max vs API-key rail), the detected limit-rail,retry-after, org-id, and request-id; then the HTTP status line, ALLanthropic-*headers (not just-ratelimit-*),retry-after,request-id, and everyx-*header — so the auth-rail + which-limit question is answerable from one paste. - Live stderr pointer. On every 429 capture, prints
phi/rl> 429 headers logged to ~/.larry/log/headers.log (rail=<x>, retry-after=<y>) — paste for diagnosisso Bryan knows the log now exists. - Same-pattern sweep. Streaming path (
call_api_stream→_drain_pending_stream_headers→_parse_response_headers) shares the same function, so Mac/Linux streaming users get identical 429 capture. The v0.8.0tool_read_filePHI path-block (which blocks$LARRY_HOME/log/) is tool-dispatch-only — Bryan reading his own headers.log via interactive shelltailis unaffected (verified: no shell-level block;bash_execrunsbash -cdirectly without the path-block). The 200-path OAuth sampling cap is unchanged.
v0.8.7 — 2026-05-27
Status-line render fix for MobaXterm/Cygwin (Clover). Symptom: the dim between-turn status line (session context + rate-limit reset date) never appeared on Bryan's MobaXterm work-box — the v0.7.1 status-line feature that MEMORY.md flagged as a never-verified passive item.
Root cause = suppress-when-empty gate, NOT terminal positioning.
render_status_line (larry.sh) gated the OAuth arm on ctx_used_tokens,
oauth_5h_utilization, AND oauth_7d_utilization ALL being empty — returning
silently (rendering nothing) when so. Two facts made all three stay empty turn
after turn on Bryan's box: (1) STATUS_ctx_used_tokens is populated by
_record_ctx_used, which runs only AFTER a successful agent_turn; on a
rate_limit_error the turn returns early (larry.sh ~L3929), so ctx was never
recorded; (2) pre-v0.8.5 the anthropic-ratelimit-unified-* utilization
headers weren't captured on error responses. With every turn erroring, all
three gate fields were empty every turn, so the line was suppressed for the
whole session and never rendered. This was NOT a positioning bug: the status
line is a single plain printf'd dim line printed between turns — there is no
DECSTBM scroll-region reservation, no cursor save/restore, no absolute-row
positioning anywhere in the codebase, so MobaXterm's terminal emulation had
nothing to mis-honor. It was also NOT coupled to streaming or mouse mode.
- Gate on turn count, not data presence.
render_status_linenow suppresses ONLY before the first turn has run (_LARRY_TURNS == 0, coerce_int-guarded); thereafter it ALWAYS renders. Both auth-mode helpers already self-render a—placeholder for any unpopulated field, so the line always shows session context (model context window, turns, session cost) and the rate-limit reset time fills in once a successful call — or, since v0.8.5, a captured error response — populates the headers. /statusalways renders on demand, even before the first turn — an explicit request bypasses the turn-0 gate. Lets Bryan verify the line renders on MobaXterm without first completing a (possibly rate-limited) turn.- CR-taint hardening (same-pattern sweep). The OAuth segment's reset-epoch
comparisons (
[ <epoch> -le <now> ]) readSTATUS_oauth_{5h,7d}_reset_epoch, which come from_header_value(strips only the TRAILING CR). A CRLF response on MobaXterm or a non-numeric token would have crashed the arithmetic test and aborted the entire line. Both comparisons nowcoerce_intthe epoch first; theSTATUS_oauth_statuscolor-overridecaseis nowstrip_cr-guarded so arate_limited\rvalue still matches its literal-glob arm. - Same-pattern sweep results: audited every escape sequence in larry.sh +
lib/ — only color SGR, clear-screen (
\033[2J\033[H), erase-line (\r\033[K), and the opt-in mouse/bracketed-paste modes (off by default since v0.7.5) are used; ZERO scroll-region / cursor-save-restore / absolute-cursor sequences, so no other UI element is at risk of MobaXterm mis-rendering. Confirmed no user-visible element is gated behind streaming (used_streamonly guards re-printing already-streamed text) or mouse mode. - Verification:
bash -nclean; 7/7 unit tests pass against the shipped render functions — turn-0 suppressed; turn≥1 with empty data renders with placeholders; reset date shown when populated; renders withLARRY_NO_STREAM=1- mouse off (Bryan's exact config); survives a CR-tainted reset epoch without
crashing;
LARRY_NO_STATUS=1still fully disables.
- mouse off (Bryan's exact config); survives a CR-tainted reset epoch without
crashing;
v0.8.6 — 2026-05-27
Work-box → Mac headers.log sync (tsk-2026-05-27-023, Clover headers-sync).
Closes the last gap in the rate-limit-diagnosis pipeline: the
anthropic-ratelimit-* headers captured on Bryan's MobaXterm work-box (where
the testing happens) never reached the Mac's memory daemon, so they could not
be analyzed. v0.8.6 pushes the work-box headers.log to a daemon-watched path
on the Mac automatically; the Mac daemon ingests it to memory Tier 4
(Hindsight) + Tier 7 (mem0).
- New
lib/headers-sync.sh— incremental, offset-tracked, idempotent push of$LARRY_HOME/log/headers.logto a per-host file on the Mac (~/.cloverleaf/headers-<workbox-hostname>.jsonl, a daemon-watched dir). Transport rides the EXISTING authenticated SSH ControlMaster (/ssh-setup <alias>) — no new key, no second auth, the password is never in argv/env. Only the new bytes since the last sync are sent (dd skip=offset→ remotecat >>); a no-op when nothing is new; a re-seed (truncate + resend) when the local file rotates/shrinks. Fully graceful: missing target, closed master, or transport failure logs a warn to$LARRY_HOME/log/sync.logand returns non-fatally — it can NEVER crash or wedge the larry session. /headers-sync on|off|status|target <alias>|nowslash command.targetbinds the Mac SSH alias;on/offtoggle auto-sync (persisted to$LARRY_HOME/.envasLARRY_HEADERS_SYNC/LARRY_HEADERS_SYNC_TARGET);statusshows enabled?, target, dest, last-sync time, bytes pushed, and master state;nowruns one incremental sync on demand. Registered in the TAB-completion arrays and/help.- Auto-sync cadence: on larry exit. The REPL EXIT/INT/TERM handler flushes
headers.log if auto-sync is enabled (cheap + incremental). On-demand
/headers-sync nowis always available. (After-EVERY-turn cadence was intentionally deferred to keep this change out of the turn/streaming loop that v0.8.5 just reworked.) - Mac-daemon receive side (
scripts/headers_log_ingest.py, not part of the larry bundle): now resolvesheaders-*.jsonlglob sources under the watched dirs IN ADDITION to the fixed canonicalheaders.log, and processes ALL sources with PER-SOURCE offsets — so the Mac's own stream and one or more work-box streams are surfaced independently. Each fact carries asource=label (the work-box hostname) so the memory layer can tell them apart. - Security (Vera PHI audit V7): headers.log holds only
anthropic-*response headers (rate-limit metadata + org id) and HTTP status lines — NO message body, NO PHI — so syncing is safe. The existing key/password-auth ControlMaster transport is reused unchanged (not weakened).
v0.8.5 — 2026-05-27
Diagnose-don't-assume rate-limit cluster fix (Clover #8). Symptom: a hello
turn threw rate_limit_error on a work-box with 90% of the Claude Max 5h quota
free — so NOT 5h-window exhaustion. Root cause = a short-window BURST rail
tripped by a stream→non-stream double-send per turn, with no backoff.
- Rate-limit backoff + actionable message (ROOT). A 429 no longer fails the
turn or fires an immediate re-send.
agent_turnnow retries with backoff that HONORS theretry-afterheader (else exponential 2/4/8s capped at 30s;LARRY_RL_MAX_RETRIES/LARRY_RL_BACKOFF_MAXtunable). The error message is now ACTIONABLE:_parse_response_headerscapturesretry-after+ which rail tripped (anthropic-ratelimit-{requests,input-tokens,output-tokens}-remaining:0orunified-{5h,7d}for OAuth) and_humanize_rate_limitrenders e.g.rate limit: requests-per-minute exhausted (short-window burst, NOT your 5h quota) — resets in 38s; retrying with backoff.headers.lognow captures the full header block on ANY 429 (was: OAuth-mode + unified-* header only), tagged*** 429 retry-after=Ns rail=… ***, so the next rate-limit is always diagnosable. - Streaming parse failure no longer double-sends (burst trigger). A
streaming 429/overload returns a plain JSON error body (not SSE);
parse_stream_to_responsepreviously dropped those non-data:lines, produced zero blocks, returned 1, andagent_turnblindly re-SENT the whole prompt non-streaming — a SECOND full API call within the same second (the per-minute burst). The parser now buffers the non-SSE body and, if it parses as a JSON error, returns a distinct code so the caller surfaces it WITH backoff instead of re-sending (single-send invariant: one logical attempt per turn). Also auto-defaultsLARRY_NO_STREAM=1on MobaXterm/Cygwin/MSYS (_is_cygwin_like) where SSE parsing is fragile; an explicitLARRY_NO_STREAM=0still forces it on. ErrorPImangled error string fixed (CR-taint).— ErrorPI error: rate_limit_errorwas a carriage-return overprint: on MobaXterm the response fieldjq -r '.error.type'carried a trailing\r, which (a) broke thecase "$err_type" in rate_limit_error)match → fell to the%s — %sdefault (the stray—), and (b) CR-returned the cursor so the terminal overprinted "API error" → "ErrorPI". Fix:strip_cronerr_type/err_msgin_humanize_api_error, anderr()/warn()/log()now strip embedded CRs defensively. (The v0.7.5 CR sweep missed the error-DISPLAY construction path.)- phi tier-5 notice fires once per session (was per-turn nag). The
tier-5 (presidio NER) disabled — sidecar not runningnotice printed every turn becauseauto_detect_phiruns inside$(...)command substitution and the oldexport _LARRY_PHI_TIER5_WARNED=1flag died in the subshell. Now keyed to a$LARRY_HOME/.phi-notice-shownfile holdingSESSION_ID— fires once per session, survives the subshell, resets for a genuinely new session. Same-pattern sweep caught the identical subshell-flag bug in_auto_phi_b64_roundtrip's python3-missing notice (_LARRY_B64_PY3_WARNED) — fixed the same way.
v0.8.4 — 2026-05-27
- Installer/updater now detects HTML-sign-in-page responses and fails loud
instead of silently corrupting. Root cause (Clover #5's diagnosis,
Deliverables/2026-05-27-cloverleaf-larry-stuck-update-and-tab-bug.md): a private/sign-in-gated Gitea answers an unauthenticated raw-file read with the HTML Sign-In page at HTTP 200 (303 →/user/login, followed bycurl -Lto a 200 HTML page).curl -fsSLtreats that as success, so the old installer/auto-updater parsed the HTML as VERSION/MANIFEST/larry.shcontent — silently aborting, or overwriting real on-disk files with HTML soup. This is exactly what stranded a work-box at v0.7.3 until the GiteaREQUIRE_SIGNIN_VIEW=falseflip. - New
lib/fetch-safe.sh— a content-validating fetch wrapper (fetch_validate URL DEST KIND [MAX_TIME]). After everycurl, BEFORE trusting the bytes, it (a) detects the HTML-login trap (<!DOCTYPE html/<html/Sign In - Gitea/<title>Sign Inmarkers, or atext/htmlContent-Typewhen a raw file was expected) and (b) validates the content shape per file type: VERSION must match^[0-9]+\.[0-9]+\.[0-9]+, MANIFEST must be a path-list with no HTML,larry.shmust start with#!/usr/bin/env bash, other.shmust be non-HTML. On any failure it prints an actionable error and returns non-zero without overwriting the target. The bootstrapinstall-larry.sh(curl|bash, runs before any lib exists) andlarry.sh'sself_update()(runs before lib is sourced) each carry a byte-identical inline copy; the canonical file is in MANIFEST and auto-syncs. - Every remote-content fetch hardened.
install-larry.shfetch();larry.shagent fetch,sync_from_manifestMANIFEST + per-file fetches, and_fetch_with_fallback(Phase-B VERSION + larry.sh) all route through the validator. No trusted-content fetch still uses rawcurl -fsSL. - Optional
LARRY_GITEA_TOKEN(aliasGITEA_TOKEN) for authenticated fetch. When set, fetches addAuthorization: token <PAT>so the installer/updater works against a PRIVATE repo without the public-flip. The token is never hardcoded and never logged. Documented in--help+ MANUAL.md.
v0.8.3 — 2026-05-27
- Tab-completion trailing space no longer breaks command dispatch. The
slash-command completer intentionally appends a trailing space after a
unique match (so arg-taking commands feel snappy), but the main_loop
dispatcher matched exact
caseglobs, so/quit(completed) missed the/quit)arm and fell through to "unknown command". Latent since v0.6.6 when tab completion shipped. Fixed by rtrimming the dispatch key once at thecase "$input"boundary (larry.sh), which tolerates the completer's space, a user-typed trailing space, and any CR remnant while preserving interior/load FILEargument spacing. Added a sharedrtrim()helper tolib/cygwin-safe.sh(and the inline fallback) next tostrip_cr.
v0.8.2 — 2026-05-27
Microsoft Presidio sidecar for free-text NER. Closes V1 from Vera's audit — the dominant real-world failure mode (patient names, addresses, un-keyworded dates in prose chat). Opt-in install; larry runs in v0.8.1 mode on hosts where Presidio isn't installed (MobaXterm/Cygwin per Bryan's accepted tradeoff).
-
lib/phi-presidio-sidecar.py— FastAPI service on127.0.0.1:$LARRY_PHI_PORT(default41189). Wraps Presidio'sAnalyzerEngine+AnonymizerEngineover spaCyen_core_web_sm(12MB model, ~9-second cold start). Two endpoints:POST /redacttakes{"text": "..."}and returns{"redacted": "...", "entities": [...], "latency_ms": N};GET /healthfor the launcher's readiness probe. Three HL7-specific custom recognizers added (HL7_MRNfor 6-12 digit numerics with patient/MRN/account context;HL7_CARET_NAMEforSMITH^JOHNoutside Tier-3 line context;HL7_PHONE_BAREfor plain 10-digit phones). Confidence threshold for tier-5 tokenize is 0.3 (below that is too noisy). -
lib/phi-sidecar.sh— lifecycle launcher. Subcommands:start / stop / status / health / ensure.ensureis idempotent (no-op if already up); called fromlarry.shmain_loop startup, backgrounded so it never blocks larry's first prompt. Waits up to 30 seconds for the sidecar to become healthy afterstart; surfaces the log tail if startup fails. PID file at$LARRY_HOME/.phi-sidecar.pid; log at$LARRY_HOME/log/phi-sidecar.log. HonorsLARRY_PHI_VENVenv to use a dedicated virtualenv (which the installer sets up at$LARRY_HOME/phi-venvwhen the user opts in). -
lib/phi-client.sh— bash wrapper around/redact. Sourceable functions:phi_client_available,phi_redact_text,phi_redact_entities. Also runs standalone as a CLI (./phi-client.sh check / redact / entities). CR-safe (sourcescygwin-safe.shdefensively); 5-second curl timeout bounds any tier-5 stall. -
Tier-5 integration in
larry.sh:auto_detect_phi. New stage AFTER the existing tier-1/2/3/4 substitution and BEFORE the status summary. Sourcesphi-client.shlazily, probesphi_client_available, and on success runsphi_redact_entitiesto get Presidio's per-entity output. Each entity is tokenized through the SAMEhl7-sanitize.sh tokenize-valuepipeline as tiers 1-4 (category prefixedpresidio_<TYPE>) so token IDs remain stable across surfaces and the/tokenslisting stays unified. Tier-5 honorsLARRY_AUTO_PHI=confirm(prompts Y/n once per value) andstrict(aborts the turn iftokenize-valuefails on a Presidio hit). Critically, v0.8.2 removes the v0.7.3 early-return that exitedauto_detect_phiwhen tiers 1-4 found nothing — pure-prose input now ALWAYS reaches tier-5. -
Graceful degradation. If the sidecar is unreachable (not installed, not started, crashed), tier-5 silently no-ops with a one-time stderr warning per session. Larry's REPL remains fully functional in v0.8.1 mode.
LARRY_AUTO_PHI=strictdoes NOT abort on absent sidecar (the strict mode escape is for HL7-shaped content where rule-pack would have caught the leak; tier-5 is additive coverage). -
/phi-sidecarslash command —start / stop / status / health / ensureexposed to the user. Slash-completion table and_LARRY_SLASH_CMDS_DESCupdated. -
install-larry.shinstall path. On hosts with Python 3.9+ + pip, the installer prompts before creating$LARRY_HOME/phi-venvand installingpresidio_analyzer + presidio_anonymizer + fastapi + uvicorn + spaCy en_core_web_sm(~400MB on disk, ~250MB RAM resident). On MobaXterm/Cygwin without python3, the installer skips the prompt entirely and prints Bryan's accepted tradeoff (MobaXterm stays on v0.8.1 + nudges). Re-runnable; idempotent. -
MANIFEST. Added three new lib files. They auto-sync to every running client on next launch; clients without Python 3 won't run the sidecar but the files are harmless to ship.
Prototype validation (Bryan's Mac, Apple Silicon, Python 3.14).
Cold start (model load): ~9 seconds with en_core_web_sm (vs ~82s with
the larger en_core_web_lg Presidio auto-downloads by default — we
explicitly pin _sm for the latency-sensitive REPL use case). Warm
analyzer latency: P50 20.6ms, P95 22.7ms over 20 sequential requests
on 100-word input. End-to-end HTTP round-trip (curl + json roundtrip):
P50 ~57ms warm; first request post-startup pays a ~150ms tokenizer
warmup tax then steady. Well under the 200ms-per-turn REPL budget.
Detection quality on the canonical "John Doe MRN 623000286" sample: 8
core entities caught (PERSON x2, DATE_TIME x2, PHONE_NUMBER, US_*),
plus the three custom HL7 recognizers add MRN + caret-name + bare-phone
coverage. Misclassifications (MRN as US_PASSPORT, "ED" as PERSON) are
within tolerance for the tokenize-everything-suspicious policy — the
auto-PHI lookup table sees them as presidio_* categories and the
operator can audit via /tokens.
MobaXterm compatibility verdict. Per Bryan's accepted tradeoff:
v0.8.2 ships Mac/Linux-only. MobaXterm/Cygwin stays on v0.8.1
(rule-pack + path-block + content-shape gating + strict mode + base64
round-trip + tool-result review gate). Test path: install-larry.sh
detects platform and skips the Presidio install on windows-cygwin
with a clear "v0.8.1 mode" note. No code in larry.sh is platform-gated
— tier-5 silently no-ops when the sidecar is absent, which IS the
MobaXterm path.
Proactive same-pattern sweep. Searched for other call sites where free-text NER would help: tool-result surface already gets HL7-shape sanitize (v0.8.1) and base64 round-trip (v0.8.1-c). Tier-5 is user_input-only by design — tool-result free-text NER deferred to a future patch (would require deciding on per-tool latency budgets; Bryan to call when needed).
v0.8.1 — 2026-05-27
Tool-result PHI gating expansion. Closes V2 / V12 and the V2 base64 sub-gap from Vera's audit. No behavior change for users not on HL7-shaped data; opt-in friction for the 8KB+ tool-result review gate.
-
Tool-name allow-list dropped; content-shape gating only. The v0.7.3 tool-result auto-PHI gate ran only on
read_file (.hl7|.txt),nc_msgs,hl7_field,hl7_diff. v0.8.1 runs_auto_phi_looks_like_hl7on EVERY tool result. On hit → route throughlib/hl7-sanitize.sh. On miss → pass through unchanged. Closes V2:bash_exec/ssh_exec/grep_files/read_fileof.log/.csv/.dat/no-suffix files are now all covered when their output is HL7-shaped. False-positive cost is cheap (extra regex pass with zero behavioral impact on non-HL7). -
Base64-wrapped HL7 round-trip. New
_auto_phi_b64_roundtriphelper. Detects candidate base64 runs (length >= 200 chars,[A-Za-z0-9+/=]only, length divisible by 4 — NOT entropy-based, per Pax §V2-sub: HL7's repetitive prefixes survive base64 with LOW entropy). Speculatively decodes each run; if decoded bytes look like HL7, routes throughhl7-sanitize.shand re-encodes (base64 -w0) back into the result. Catchesssh_pull_smatsampled mode TSV (server-side encoding kept for binary-safe TSV transport; client-side unwrap handles the safety concern). Requirespython3(installed everywhere larry-anywhere runs); skipped with a one-time stderr warning if unavailable. -
Operator review gate for
bash_exec/ssh_exec/ssh_pull/ssh_pull_smatresults. When the tool produced HL7-shaped output OR the result exceedsLARRY_TOOL_RESULT_REVIEW_THRESHOLDbytes (default 8192), Larry prompts[Y/n/i]before passing the result back to the model.iopens the result in$PAGERthen re-prompts. Default Y — zero friction by default.Nsubstitutes a refusal JSON so the model knows a result was withheld. Skipped whenLARRY_AUTO_PHI=off(consistent with the opt-out) OR running non-interactively (no TTY — never blocks headless scripts). Override withLARRY_TOOL_RESULT_REVIEW=alwaysto gate every result. Per Pax §V2/V12: closes the "operator wanted to see this themselves, didn't want the model to see it" gap that's the actual common case.
Proactive same-pattern sweep. Searched the codebase for other call
sites where tool output bypasses content-shape gating: found only the
one in agent_turn. The v0.8.0-c strict-mode tool-result branch was
hardened in lockstep so it now triggers on the broader (content-only)
eligibility instead of the old name-allow-list.
Manifest unchanged.
v0.8.0 — 2026-05-27
PHI-safety quick-wins pack — three independent zero-risk patches closing
four gap-classes Vera identified in the v0.7.5 static audit
(Deliverables/2026-05-27-cloverleaf-larry-phi-leak-audit.md) with Pax's
recommended mitigations
(Deliverables/2026-05-27-cloverleaf-larry-phi-mitigation-research.md).
No new dependencies, no behavior change for users not interacting with PHI.
-
read_file/grep_files/glob_files/list_dirpath-block list (closes V4 + V6 + V11). Refuse — with a structured JSON error the model must surface, NOT a silent "file not found" — any tool-side attempt to read or enumerate under$LARRY_HOME/log/(auto-phi.log, headers.log, oauth.log, session logs),$LARRY_HOME/sanitize/(lookup.tsv — the desanitization key),$LARRY_HOME/sessions/,$LARRY_HOME/.oauth.json, or$LARRY_HOME/.env. Block-list resolves$LARRY_HOMEat call time (not script-parse time) and runs against both the literal path and itsrealpath -mcanonical form, so symlink detours don't bypass. The proactive same-pattern sweep (Bryan standing rule, 2026-05-27) extended the block fromtool_read_filealone to also covertool_grep_files,tool_glob_files, andtool_list_dir— those tools would otherwise leak filenames or grep-matched content out of the same protected dirs without any approval gate. -
/load <file>HL7 pre-routing (closes V3). When the loaded file's content matches_auto_phi_looks_like_hl7, route it throughlib/hl7-sanitize.sh(the segment-aware tokenizer with the full PHI field rule set: PID, PV1, NK1, GT1, IN1, OBR, OBX, DG1, ORC) BEFORE the existing user_input auto-PHI pass. Closes the gap where smat dumps loaded via/loadonly got the lighter per-word classifier, which misses bare HL7 PID fields. Status line reports how many fields were tokenized:phi> /load: hl7-sanitize.sh tokenized N HL7 field(s) from <file> before passing to auto-PHI. Strict mode (see below) aborts the/loadif sanitize fails; default/confirm modes warn-and-continue. -
LARRY_AUTO_PHI=strictfail-closed mode (closes V5). New fourth value alongsideoff / on / confirm. In strict mode, the auto-PHI pipeline aborts the surrounding turn (no payload built, no API call) when: (a)lib/hl7-sanitize.shis missing/non-executable on HL7-shaped user_input, (b) the sanitizer returns empty on HL7-shaped content, (c) any single value'stokenize-valuecall fails inside the detection loop. On the tool-result surface (which can't kill the in-flight tool_use), strict mode substitutes the result with a structured JSON refusal sentinel so the raw HL7 NEVER reaches the model. Existingoff / on / confirmsemantics unchanged (still fail-open per Bryan's "don't break tools" priority). Strict is the opt-in tradeoff for HIPAA work where a silent leak is worse than a broken turn./phi-auto stricttoggle and/helptext updated. Wired into both auto-PHI invocation sites: user input scan and the tool-result HL7 sanitizer gate.
Proactive same-pattern sweep (Bryan standing rule, 2026-05-27).
Searched the codebase for other tools matching the pattern "reads
arbitrary path, returns content to model, no approval gate": found and
patched tool_grep_files, tool_glob_files, tool_list_dir
alongside tool_read_file. bash_exec/ssh_exec already require Y/N
operator approval — the operator is the gatekeeper there (a second gate
deferred to v0.8.1). No other matches.
Manifest unchanged (no new files in lib/).
v0.7.5 — 2026-05-27
Three focused changes, one common cause: the Cygwin/MobaXterm CR-taint pattern
that crashed OAuth on Bryan's v0.7.3 work-box with the cryptic error
bash: ...: arithmetic syntax error: invalid arithmetic operator (error token is "").
-
OAuth/arithmetic CR fix.
lib/oauth.shnow routes every operand entering a bash arithmetic context (fetched_at,expires_in,now) through a dedicatedcoerce_inthelper that strips non-digits at the source. The failure mode:$(date +%s)against a Cygwin pty where Windows-nativedate.exeshadows Cygwindatecan return a CR-tainted epoch like"1779999999\r", which crashes the very next$((expires_at - now)). Diagnosis inDeliverables/2026-05-27-cloverleaf-larry-oauth-arithmetic-fix.md. -
Mouse mode is opt-in. REPL mouse handling now defaults to OFF and is enabled via
LARRY_MOUSE=1env var or/mouse onslash command. Several terminals (notably MobaXterm and stripped tmux) were swallowing the mouse ANSI sequences and printing literal^[[?1000hgarbage when v0.7.0 turned it on unconditionally. Diagnosis inDeliverables/2026-05-27-cloverleaf-larry-mouse-regression-fix.md. -
CR-safety sweep across
lib/*.shand top-level scripts. Three new primitives inlib/cygwin-safe.sh(sourced by every tool family member):coerce_int VAL [DEFAULT]— for arithmetic and integer-test operandsstrip_cr VAL— for case patterns, regex tests, paths, HTTP headersread_clean VAR [PROMPT]—read -rwrapper that strips CR pre-assign Hardened call sites:larry.sh— status-linedate +%s/tput cols, three y/N approval prompts (write_file, bash_exec, first-run auth), API-key paste, first-run auth menulib/oauth.sh—cmd_loginandcmd_refreshdate +%scaptureslib/nc-engine.sh— five y/N action prompts (stop/start/bounce, resend, route-test, testxlate, tpstest) +find ... | wc -larithmeticlib/nc-msgs.sh—parse_time_msdatecaptures (4 sites), meta-TSVtmfield,MSG_COUNTwc -llib/nc-regression.sh—tr | wc -ccount, hl7-diff?-fallback arithmeticlib/nc-smat-diff.sh—A_COUNT/B_COUNT/DIFFS_TOTALlib/nc-insert-protocol.sh— every awk-emitted line-number that feedshead -n $((N-1))/tail -n +$((N+1))arithmeticlib/journal.sh—_next_seqwc -larithmeticlib/lessons.sh—_next_id,cmd_list,cmd_countarithmetic + two y/N prompts (clear all, clear since)lib/hl7-sanitize.sh—cmd_countarithmetic + clear-table y/Nlib/ssh-helper.sh— local + remotewc -cinteger compares (4 sites)lib/nc-find.sh—wc -lcount for%dprintflib/nc-table.sh—$(date +%s)in backup-filename constructionlib/nc-document.sh— twowc -l | %dprintf siteslarry-rollback.sh— Proceed? y/N prompt
Reproduction (now exercised by
cygwin-safe.sh's in-line tests):now=$(printf '%s\r' 1779999999); echo $((now - 1)) # pre-fix: crashes now=$(coerce_int "$(printf '%s\r' 1779999999)" 0); echo $((now - 1)) # fix: 1779999998Added
lib/cygwin-safe.shtoMANIFESTso it auto-syncs to every running client on next launch.
v0.7.4 — 2026-05-27
- Drop GitHub fallback from auto-update. Single-source Gitea
(
https://git.bjnoela.com/bryan/cloverleaf-larry.git).
v0.7.3 — 2026-05-26
- Automatic PHI detection (tiered detection + blacklist contexts).
v0.7.2 — 2026-05-26
- Gitea becomes primary auto-update origin; GitHub demoted to fallback.
v0.7.1 — 2026-05-26
- Status line moves to between-turn position (post-input, pre-response).
- Status line below prompt; automatic PHI detection; session-artifact upload.
v0.7.0 — 2026-05-26
- HL7-aware tab completion + REPL mouse mode (later made opt-in in v0.7.5).