From fe2f67a1aaaf69b6ffd84e1fee81e899621fab63 Mon Sep 17 00:00:00 2001
From: Bryan Johnson
Date: Thu, 28 May 2026 07:40:53 -0700
Subject: [PATCH] v0.8.13: $HCIROOT login-shell fix + both-mode detection + list_sites/sites + per-delta jq-fork slowness fix

Root-cause fix for the live-session friction where "how many sites are on qa?"
stalled on repeated `export $HCIROOT` nags despite a working `qa` SSH alias:

1. $HCIROOT login-shell fix: ssh-helper.sh `exec` now wraps remote commands in
   `bash -lc` so the Cloverleaf login profile sources and
   $HCIROOT/$HCISITE/PATH populate as for an interactive operator login.
   Escape hatch: NOLOGIN prefix or LARRY_SSH_NO_LOGIN=1. pull-smat find/sample
   use the same wrapper.
2. Both-mode detection: startup surfaces a MODE= line (LOCAL / REMOTE /
   UNKNOWN) and leads with what it found instead of asking for paths.
3. First-class list_sites tool + /sites [alias]: enumerates sites in both
   modes (hcisitelist fast-path, NetConfig-walk fallback) via new ssh-helper
   discover.
4. System-prompt de-nagging: agents/larry.md + env-diff/regression prompts no
   longer tell Larry to ask Bryan to export $HCIROOT for a reachable host.
5. Streaming slowness (dominant residual): new pure-bash _json_str_decode
   un-escapes the common escape-free delta with zero forks, halving per-turn
   jq forks on top of v0.8.12. Round-trip verified.
6. pull-smat path capture hardened (Vera Minor #1): resolved path now emitted
   behind a SMATDB_PATH: sentinel and selected by pattern not position, so a
   login-shell MOTD/banner on stdout can't be mistaken for the path; falls
   back to prior tail -1 when no sentinel present. Selection logic
   unit-verified.

Vera gate: PASS-WITH-NOTES (v0.8.13). bash -n clean on larry.sh +
ssh-helper.sh; MANIFEST regenerated (48 entries) and --check clean.
Co-Authored-By: Clover (Claude Opus 4.7)
---
 CHANGELOG.md      |  56 ++++++++++++
 MANIFEST          |  10 +--
 VERSION           |   2 +-
 agents/larry.md   |  19 ++--
 larry.sh          | 221 +++++++++++++++++++++++++++++++++++++++++-----
 lib/ssh-helper.sh | 123 ++++++++++++++++++++++++--
 6 files changed, 388 insertions(+), 43 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index b1afe34..1d1cef2 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -4,6 +4,62 @@ All notable changes to `cloverleaf-larry` / `larry-anywhere` are recorded here.
 Versioning is loose-semver; bumps trigger the in-process self-update on every
 running client via `LARRY_BASE_URL` + `MANIFEST`.
 
+## v0.8.13 — 2026-05-28
+
+Proactive both-mode Cloverleaf-env detection + the `$HCIROOT` login-shell fix +
+a first-class site lister + the dominant residual-slowness fix (Clover). Closes
+the live-session friction where "how many sites are on qa?" stalled on repeated
+`export $HCIROOT` nags despite a working `qa` SSH alias.
+
+1. **`$HCIROOT` login-shell fix (root cause).** `ssh-helper.sh exec` ran the
+   remote command in a NON-login, non-interactive shell, so the Cloverleaf
+   login profile (which exports `$HCIROOT`, `$HCISITE`, and the `hci*` PATH)
+   never sourced and `$HCIROOT` arrived empty — and Larry gave up and asked for a
+   path. `exec` now wraps the command in `bash -lc` (login shell), so the remote
+   env populates exactly as for an interactive operator login. Version-agnostic,
+   zero-config. Escape hatch: `NOLOGIN ` prefix or `LARRY_SSH_NO_LOGIN=1`. The
+   `pull-smat` find/sample paths use the same wrapper so `$HCISITEDIR`/`sqlite3`
+   resolve there too.
+2. **Both-mode detection (proactive).** Startup now determines and surfaces a
+   `MODE=` line: **LOCAL** (`$HCIROOT` set by the local profile, or a Cloverleaf
+   install auto-discovered at a common path — work the local tree, no SSH),
+   **REMOTE** (no local install but a Cloverleaf SSH alias configured — discover
+   over a login shell), or **UNKNOWN** (ask which mode applies).
+   Larry leads with what it found instead of asking Bryan to spoon-feed paths.
+3. **First-class `list_sites` tool + `/sites [alias]`.** "How many sites are on
+   X / what sites exist" now Just Works in both modes: REMOTE
+   (`list_sites(alias=qa)`) resolves the remote `$HCIROOT` in a login shell and
+   enumerates sites (Cloverleaf `hcisitelist` fast-path, NetConfig-walk
+   fallback); LOCAL (`list_sites()`) does the same against the detected local
+   `$HCIROOT`. Reports the resolved HCIROOT, a count, and the names. New
+   `ssh-helper.sh discover ` backs the remote path.
+4. **System-prompt de-nagging.** `agents/larry.md` and the `/nc-diff-env` /
+   `/nc-regression-env` templated prompts no longer instruct Larry to ask Bryan
+   to `export $HCIROOT` for a reachable host. The cardinal rule is explicit:
+   never request a path you can resolve; the only remote precondition surfaced
+   is an open ControlMaster (`/ssh-setup `).
+5. **Streaming slowness — dominant residual fix.** v0.8.12 collapsed per-delta
+   routing to one jq call, but each `text_delta`/`input_json_delta` STILL forked
+   a second jq purely to un-escape the `@json`-encoded payload — ~N forks/turn
+   (N≈output tokens), and on Cygwin/MobaXterm a fork is ~50-100ms. New
+   `_json_str_decode` un-escapes in pure bash for the common escape-free chunk
+   (zero forks), deferring to jq only when a real backslash escape is present.
+   Halves per-turn fork count on top of v0.8.12. Round-trip verified for tab,
+   escaped quote/backslash, embedded newline, `\uXXXX` Unicode, and empty.
+6. **`pull-smat` path capture hardened (Vera Minor #1).** The remote `.smatdb`
+   path lookup previously took `tail -1` of the merged stdout+stderr stream.
+   Under the new login shell (item 1) a profile MOTD/banner can land on stdout,
+   which a blind `tail -1` could mistake for the path.
+   The remote `find` now emits the resolved path behind a `SMATDB_PATH:`
+   sentinel; the client selects by pattern, not position, and only falls back
+   to the prior `tail -1` when no sentinel is present (so no host that worked
+   before can regress). Selection logic unit-verified for banner-before,
+   banner-after, ERROR, empty, multi-sentinel, and no-sentinel-fallback.
+
+Compatible with bash 3.2 / Cygwin; no regression to the v0.8.12 fixes.
+
+---
+
 ## v0.8.12 — 2026-05-27
 
 Crash fix + slowness + cost pass on the new API-key rail (Clover). Full
diff --git a/MANIFEST b/MANIFEST
index 87ad8dd..f8888ce 100644
--- a/MANIFEST
+++ b/MANIFEST
@@ -23,19 +23,19 @@
 # scripts/make-manifest.sh and bump VERSION.
 
 # Top-level scripts
-larry.sh 668a0951759c0ee67841f64d8d5f55f25a32fd908da46bc7733814123f7b0d68
+larry.sh fabb64714cc0b910805f9d2fdaf8de4a903ce546ef1b3a8923fe0bcb980b9b7e
 larry-tunnel.sh 6b050e4eeab15669f4858eaf3b807f168f211ced07815db9521bc40a093f6aaa
 larry-auth.sh a220cdf7878569dc3028951ee57fc8d5e706a8ca5c6aa45347b58facb386f831
 larry-rollback.sh 91b5e9aa6c79266bf306dcfba4ca791c07971bd6924d67a779037531648aa6d0
 install-larry.sh e97da4e12a0d8863ca18d79b12f6c4294c72fa6d4b11dffeab66504236bb4eb1
 
 # Metadata
-VERSION 48732179dae488fe78e7210c84f94a44cc6913283f845d7d978302f20750f8f8
+VERSION 414f2b8d3ed8f7c983632c4765179167948b1519b0cfc3596c68db66c9617dc3
 MANUAL.md 755d98b802cb16a5d2d207d423b12c6ca632f118ee372cb5093fe2320a6515ce
-CHANGELOG.md 78608882c507e1f8edd650c577eb17913e719f4836e6331e6885aeee89da9306
+CHANGELOG.md bc695402eaafd52bb718e1852355d01a50da7cb86320df980178489de1c683fb
 
 # Agent personas (system-prompt overlays)
-agents/larry.md ace30b97a166c9f244df66ac5f5944e9251dda375a45340d443bccb34bc5ec94
+agents/larry.md 11ea905fa7cac6fa7baeb11b2d62af07b15a666ce90cfe36491bcbc555244397
 agents/clover.md d1bbfd6cc4642c2bff6e15dcbdf051d71b063b3fe29e0be97d17b3180d3c7ac5
 agents/cloverleaf-cheatsheet.md c0a2aab91f1ddf092bce312def02cc6f3f62a1f653ca5af67a9430c3fcef4c3f
 agents/regress.md
bb05ed1439b1e35d6e9799e32d683bfab166472c72115c1f02757e227c74e42f
@@ -52,7 +52,7 @@ lib/fetch-safe.sh abecf0045b9856f63ffa346119443c11de56547344be32bddaed9fbae6b021
 lib/oauth.sh 04a93376f88fe53cc1c86a5dbe577735c60375dadd4f2fda55b921ef3cddf22b
 
 # Secure SSH with ControlMaster (password hidden from Larry-the-LLM)
-lib/ssh-helper.sh d73924a18b0c0c5856c68fb538af706316e23679f72cd94b694be73325231c9b
+lib/ssh-helper.sh 7aa2aa7b3860cb48b7ba5120f9efc2563a6cdaed41242f42ecc9dd03fdebeb28
 
 # v0.8.6: work-box → Mac headers.log sync (tsk-2026-05-27-023). Incremental,
 # offset-tracked push of $LARRY_HOME/log/headers.log to a daemon-watched path
diff --git a/VERSION b/VERSION
index 7eff8ab..c2f73c6 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-0.8.12
+0.8.13
diff --git a/agents/larry.md b/agents/larry.md
index 69477fe..7d84c6b 100644
--- a/agents/larry.md
+++ b/agents/larry.md
@@ -12,14 +12,21 @@ You are **Larry, Bryan's team orchestrator at myPKA**, running in portable mode
 
 Bryan downloaded you onto a locked-down machine (no install rights). You are running as a single bash script that calls the Anthropic API directly. Your job here is **Cloverleaf interface build and Netconfig analysis** — pure interface work, **no PHI is involved**, no production push, no destructive shell commands without explicit Y/N confirmation.
 
-## Site-awareness on startup (use this!)
+## Site-awareness on startup — TWO deployment modes (be proactive, never nag)
 
-Larry-Anywhere auto-detects the Cloverleaf runtime context every session and includes it under "**Detected runtime context (read-only)**" at the bottom of your system prompt. It tells you:
-- `$HCIROOT` and whether the directory exists
-- `$HCISITE`, `$HCISITEDIR`, and counts of `NetConfig`, `Xlate/`, `tables/`, `tclprocs/`, `formats/`
-- Which tool layer is present: modern `cloverleaf-tools.pyz`, classic Eric scripts (`tbn`, `hlq`, `mr`, `mp`, `mg`, etc.), or neither.
+Larry-Anywhere auto-detects the Cloverleaf runtime context every session, under "**Detected runtime context (read-only)**" at the bottom of your system prompt. The first line is **`MODE=`** — read it and act:
 
-**Lead every Cloverleaf-shaped task with the detected context in mind.** If `HCIROOT` is unset and Bryan asks "what threads are on this site," ask him to `export HCIROOT=…` and `export HCISITE=…` first, or use `/site ` mid-session. Don't fabricate a path.
+- **`MODE=LOCAL`** — Cloverleaf is on THIS box. `$HCIROOT` is detected from the local login profile or auto-discovered at a common install path. **Work the local tree directly. Never ask Bryan for a path.**
+- **`MODE=REMOTE`** — Cloverleaf is on a remote server reached via a configured SSH alias (e.g. `qa`). The context lists the configured aliases. The remote `$HCIROOT` is set by that host's LOGIN profile, so **you must reach it over a login shell** — which the tools already do for you.
+- **`MODE=UNKNOWN`** — no local install and no SSH alias. Only here do you ask a question: "Is Cloverleaf on this box, or on a remote host I should `/ssh-add`?"
+
+It also lists `$HCISITE`/`$HCISITEDIR`, artifact counts, and which tool layer is present.
+
+**The cardinal rule (this fixed real friction): NEVER ask Bryan to `export $HCIROOT` or hand you a path for a host you can already reach.** Concretely:
+- "How many sites are on `qa`?" / "what sites exist?" → call **`list_sites(alias="qa")`** (REMOTE) or **`list_sites()`** (LOCAL). It resolves `$HCIROOT` for you (REMOTE: in a login shell over the open ControlMaster; LOCAL: from the detected env) and returns the count + names. Do NOT first ask Bryan to export anything.
+- Any remote command runs in a **login shell automatically** (`ssh_exec` wraps it in `bash -lc`), so `$HCIROOT`, `$HCISITE`, and the `hci*` binaries are populated exactly as for an interactive operator login. You do **not** need to source a profile yourself or ask Bryan to.
+- The ONLY remote precondition you surface is the ControlMaster: if a `list_sites`/`ssh_exec` result says the master is closed, tell Bryan to run `/ssh-setup ` — that's it. Never the path.
+- Lead with what you found ("`qa` has N sites: …"), don't fabricate a path, and don't spoon-feed prompts back to Bryan.
 
 The cheat-sheet (`agents/cloverleaf-cheatsheet.md`) is loaded into your system prompt — use it. When proposing a command, **prefer the modern `cloverleaf-tools.pyz` form if present**, fall back to classic Eric scripts, fall back to bash one-liners only if neither layer is on PATH.
 
diff --git a/larry.sh b/larry.sh
index 31b5131..9f89580 100755
--- a/larry.sh
+++ b/larry.sh
@@ -72,7 +72,7 @@ set -o pipefail
 # ─────────────────────────────────────────────────────────────────────────────
 # Config
 # ─────────────────────────────────────────────────────────────────────────────
-LARRY_VERSION="0.8.12"
+LARRY_VERSION="0.8.13"
 LARRY_HOME="${LARRY_HOME:-$HOME/.larry}"
 
 # ─────────────────────────────────────────────────────────────────────────────
@@ -1016,18 +1016,86 @@ if [ -z "$LARRY_AUTH_MODE" ]; then
 fi
 
 # ─────────────────────────────────────────────────────────────────────────────
-# Cloverleaf environment detection
-# Surfaces HCIROOT / HCISITE / HCISITEDIR and which tool layer is present
-# (modern cloverleaf-tools.pyz, classic Eric scripts, or neither).
-# Result is appended to the system prompt so the model knows where it is.
+# Cloverleaf environment detection — BOTH deployment modes (v0.8.13)
+#
+# Larry-Anywhere runs in one of two modes, and it must DETECT which and act
+# proactively rather than nagging Bryan for a path:
+#
+#   MODE LOCAL  — larry is installed directly ON a Cloverleaf box. $HCIROOT is
+#                 set by the local login profile, or a Cloverleaf install sits
+#                 at a common path. No SSH needed; work the local tree.
+#   MODE REMOTE — larry is on a client/local box; Cloverleaf is on a remote
+#                 server reachable via a configured SSH alias (e.g. `qa`).
+#                 The REMOTE env is discovered over SSH in a LOGIN shell (so
+#                 the remote $HCIROOT populates) — see ssh-helper.sh `discover`.
+#
+# This function surfaces the detected MODE, the local env (if any), and any
+# configured SSH aliases, so the model leads with what it found instead of
+# asking the user to spoon-feed paths.
 # ─────────────────────────────────────────────────────────────────────────────
+
+# _local_cloverleaf_root — echo a local Cloverleaf $HCIROOT if one is set or
+# auto-discoverable, else empty. Order: $HCIROOT (if it's a real dir) → common
+# install paths that contain a Cloverleaf marker (a site with a NetConfig, or a
+# server/ + bin/ pair). Cheap, read-only, depth-limited.
+_local_cloverleaf_root() {
+  if [ -n "${HCIROOT:-}" ] && [ -d "$HCIROOT" ]; then printf '%s' "$HCIROOT"; return 0; fi
+  local p
+  for p in /quovadx/qdx*/integrator /quovadx/integrator /opt/cloverleaf/integrator \
+           /cloverleaf/integrator /usr/local/cloverleaf "$HOME/integrator" /qdx/integrator; do
+    [ -d "$p" ] || continue
+    # Marker: a server/ dir, or at least one immediate-child site with a NetConfig.
+    if [ -d "$p/server" ] || find "$p" -mindepth 2 -maxdepth 2 -name NetConfig -type f 2>/dev/null | head -1 | grep -q .; then
+      printf '%s' "$p"; return 0
+    fi
+  done
+  return 1
+}
+
+# _ssh_aliases — echo configured SSH aliases (one per line) from the hosts TSV.
+# Available even though LARRY_LIB_DIR isn't resolved yet (detection runs early).
+_ssh_aliases() {
+  local f="${LARRY_HOME:-$HOME/.larry}/.ssh-hosts.tsv"
+  [ -f "$f" ] || return 0
+  awk -F'\t' 'NR>1 && $1!="" { print $1 }' "$f" 2>/dev/null
+}
+
 detect_cloverleaf_env() {
   CLOVERLEAF_CTX=""
   local lines=()
+
+  # ── Mode determination ──────────────────────────────────────────────────
+  local local_root; local_root=$(_local_cloverleaf_root || true)
+  local aliases; aliases=$(_ssh_aliases)
+  local alias_count=0
+  [ -n "$aliases" ] && alias_count=$(printf '%s\n' "$aliases" | grep -c .)
+  # If $HCIROOT wasn't already exported but we auto-discovered a local install,
+  # adopt it so the rest of detection + the nc_* tools resolve against it.
+  if [ -z "${HCIROOT:-}" ] && [ -n "$local_root" ]; then
+    HCIROOT="$local_root"; export HCIROOT
+  fi
+
+  local mode="UNKNOWN"
+  if [ -n "$local_root" ]; then
+    mode="LOCAL"
+  elif [ "$alias_count" -gt 0 ]; then
+    mode="REMOTE"
+  fi
+  lines+=("MODE=$mode")
+  case "$mode" in
+    LOCAL)  lines+=("→ Cloverleaf is on THIS box. Work the local tree at \$HCIROOT directly — do NOT ask Bryan for a path.") ;;
+    REMOTE) lines+=("→ Cloverleaf is on a REMOTE host. ${alias_count} SSH alias(es) configured: $(printf '%s' "$aliases" | tr '\n' ' ')") ;;
+    *)      lines+=("→ No local Cloverleaf install and no SSH alias configured. If Bryan names a host, /ssh-add it; otherwise ask which mode applies.") ;;
+  esac
+  if [ -n "$aliases" ]; then
+    lines+=("Configured SSH aliases: $(printf '%s' "$aliases" | tr '\n' ' ')")
+    lines+=("For a remote alias: discover its env with the list_sites tool (alias=) — it opens a LOGIN shell so the remote \$HCIROOT resolves. NEVER ask Bryan to export \$HCIROOT for a remote host.")
+  fi
+
   if [ -n "${HCIROOT:-}" ]; then
     lines+=("HCIROOT=$HCIROOT (exists=$([ -d "$HCIROOT" ] && echo yes || echo no))")
   else
-    lines+=("HCIROOT=")
+    lines+=("HCIROOT=")
   fi
   if [ -n "${HCISITE:-}" ]; then
     local sitedir="${HCISITEDIR:-${HCIROOT:-}/$HCISITE}"
@@ -3586,6 +3654,64 @@ tool_ssh_status() {
   "$helper" status 2>&1
 }
 
+# ── v0.8.13: first-class "list / count Cloverleaf sites" — works in BOTH modes.
+# This is the proactive answer to "how many sites are on qa?" that previously
+# stalled on a missing $HCIROOT.
+#
+#   REMOTE: alias given → ssh-helper.sh `discover ` resolves the remote
+#           $HCIROOT in a LOGIN shell, then lists sites (hcisitelist fast-path,
+#           NetConfig-walk fallback). No path is ever requested from Bryan.
+#   LOCAL:  no alias → enumerate sites under the local $HCIROOT (or hciroot
+#           override): hcisitelist if on PATH, else NetConfig walk.
+#
+# Output is human-readable: a count + one site name per line, plus the resolved
+# HCIROOT so the model can cite it.
+tool_list_sites() {
+  local alias="${1:-}" hciroot_ovr="${2:-}"
+  if [ -n "$alias" ]; then
+    # ── REMOTE mode ──────────────────────────────────────────────────────
+    local helper="$LARRY_LIB_DIR/ssh-helper.sh"
+    [ -x "$helper" ] || { echo "ERROR: ssh-helper.sh not installed"; return 1; }
+    local out rc
+    out=$("$helper" discover "$alias" 2>&1); rc=$?
+    if [ "$rc" -ne 0 ]; then
+      printf '%s\n[list_sites: discover failed for alias=%s rc=%d. If the master is closed, tell Bryan to run /ssh-setup %s]\n' \
+        "$out" "$alias" "$rc" "$alias"
+      return 0
+    fi
+    local rroot; rroot=$(printf '%s\n' "$out" | awk -F'\t' '$1=="HCIROOT"{print $2; exit}')
+    local sites; sites=$(printf '%s\n' "$out" | awk -F'\t' '$1=="SITE"{print $2}' | sort -u)
+    local note; note=$(printf '%s\n' "$out" | awk -F'\t' '$1=="NOTE"{print $2}')
+    local n=0; [ -n "$sites" ] && n=$(printf '%s\n' "$sites" | grep -c .)
+    printf 'Cloverleaf env on alias "%s" (REMOTE, login shell):\n' "$alias"
+    printf ' HCIROOT = %s\n' "${rroot:-}"
+    [ -n "$note" ] && printf ' NOTE: %s\n' "$note"
+    printf ' sites: %d\n' "$n"
+    [ -n "$sites" ] && printf '%s\n' "$sites" | sed 's/^/ - /'
+    return 0
+  fi
+
+  # ── LOCAL mode ─────────────────────────────────────────────────────────
+  local root="${hciroot_ovr:-${HCIROOT:-}}"
+  if [ -z "$root" ] || [ ! -d "$root" ]; then
+    echo "ERROR: no local \$HCIROOT (and no hciroot override). This box has no detected Cloverleaf install. If Cloverleaf is on a remote host, pass alias= (run ssh_status to see configured aliases)."
+    return 0
+  fi
+  local sites=""
+  if command -v hcisitelist >/dev/null 2>&1; then
+    sites=$(hcisitelist 2>/dev/null | tr ' ' '\n' | grep -v '^$' | sort -u)
+  fi
+  if [ -z "$sites" ]; then
+    sites=$(find "$root" -mindepth 1 -maxdepth 2 -name NetConfig -type f 2>/dev/null \
+      | while IFS= read -r nc; do basename "$(dirname "$nc")"; done | sort -u)
+  fi
+  local n=0; [ -n "$sites" ] && n=$(printf '%s\n' "$sites" | grep -c .)
+  printf 'Cloverleaf env (LOCAL):\n'
+  printf ' HCIROOT = %s\n' "$root"
+  printf ' sites: %d\n' "$n"
+  [ -n "$sites" ] && printf '%s\n' "$sites" | sed 's/^/ - /'
+}
+
 # ── v0.6.8: cross-env file transfer over the open ControlMaster ────────────
 # ssh_pull pulls a remote file → local; ssh_push pushes local → remote. Both
 # multiplex via the existing master socket (set up by /ssh-setup ALIAS) — no
@@ -3761,6 +3887,7 @@ execute_tool() {
     hl7_sanitize) tool_hl7_sanitize "$(J '.input_path')" "$(J '.strict // 0' | sed "s/false/0/;s/true/1/")" ;;
     ssh_exec) tool_ssh_exec "$(J '.alias')" "$(J '.command')" "$(J '.max_lines // 500')" ;;
     ssh_status) tool_ssh_status ;;
+    list_sites) tool_list_sites "$(J '.alias // ""')" "$(J '.hciroot // ""')" ;;
     ssh_pull) tool_ssh_pull "$(J '.alias')" "$(J '.remote_path')" "$(J '.local_path // ""')" ;;
     ssh_push) tool_ssh_push "$(J '.alias')" "$(J '.local_path')" "$(J '.remote_path')" ;;
     ssh_pull_smat) tool_ssh_pull_smat "$(J '.alias')" "$(J '.site')" "$(J '.thread')" "$(J '.days_back // ""')" ;;
@@ -3810,6 +3937,7 @@ TOOLS_JSON=$(cat <<'TOOLS_END'
   {"name":"ssh_exec","description":"Run a shell command on a remote test/dev host via an authenticated SSH ControlMaster session. Bryan must have already configured the alias (via /ssh-add) and opened the master (via /ssh-setup). The password is stored locally and you CANNOT see it — do not ask Bryan for it; if the master is closed, tell him to run the /ssh-setup ALIAS slash command. Use ssh_status first to confirm which aliases are open. Output capped at max_lines (default 500).
Tool result includes the remote exit code as a [ssh_exec: exit rc=N] footer.","input_schema":{"type":"object","properties":{"alias":{"type":"string","description":"Host alias Bryan configured. Run ssh_status to see the list."},"command":{"type":"string","description":"Shell command to execute on the remote. Quote as needed; will be passed through ssh as a single string."},"max_lines":{"type":"integer","description":"Cap output lines (default 500). Increase for known-large output, but prefer targeted commands."}},"required":["alias","command"]}},
   {"name":"ssh_status","description":"List the SSH hosts Bryan has configured and which ones have an open ControlMaster session. Call this BEFORE ssh_exec to confirm an alias exists and the master is open. Each line shows: alias, user@host, port, cred (present/absent), master (open or dash). If the master is not open for an alias you need, ask Bryan to run the /ssh-setup ALIAS slash command. Do NOT attempt to authenticate yourself — you have no access to the password.","input_schema":{"type":"object","properties":{},"required":[]}},
+  {"name":"list_sites","description":"List and COUNT the Cloverleaf sites in the environment. This is your proactive answer to 'how many sites are on ' / 'what sites exist' — NEVER ask Bryan to export or hand you $HCIROOT first; this tool resolves it for you. Works in BOTH deployment modes. REMOTE mode: pass alias= (a configured SSH alias, e.g. qa); the tool opens a LOGIN shell on that host so the remote $HCIROOT populates from the operator profile, then enumerates sites (Cloverleaf's hcisitelist if present, else a NetConfig walk). The ControlMaster must be open — if it is not, the result tells you to have Bryan run /ssh-setup . LOCAL mode: omit alias; the tool enumerates sites under the locally-detected $HCIROOT (or the hciroot override).
Returns the resolved HCIROOT, a site count, and the site names.","input_schema":{"type":"object","properties":{"alias":{"type":"string","description":"REMOTE mode: an SSH alias from ssh_status (e.g. 'qa'). Omit for LOCAL mode (sites on this box)."},"hciroot":{"type":"string","description":"LOCAL mode only: override the detected $HCIROOT."}},"required":[]}},
   {"name":"hl7_diff","description":"HL7-aware diff between two message files (or multi-message dumps). Compares segment-by-segment, field-by-field, with component and subcomponent precision. Ignores configured fields (default MSH.7 timestamp) so timestamp-only diffs do not show up as noise. Use for regression testing between environments (e.g. test vs prod route-test outputs).","input_schema":{"type":"object","properties":{"left":{"type":"string","description":"Path to left HL7 file."},"right":{"type":"string","description":"Path to right HL7 file."},"ignore":{"type":"string","description":"Comma-separated list of fields to ignore (e.g. MSH.7,MSH.10,EVN.6). Default MSH.7."},"include":{"type":"string","description":"If set, ONLY these fields are compared (overrides ignore for that set)."},"format":{"type":"string","enum":["text","tsv","count"],"description":"text=human-readable diff, tsv=machine-parseable, count=just the difference count."}},"required":["left","right"]}},
@@ -4166,6 +4294,48 @@ _humanize_rate_limit() {
 #   - writes a JSON file with {content:[...], stop_reason, usage} on success
 #   - on a non-SSE JSON error body, writes that body to $2 (if given)
 #   - updates _LARRY_LAST_ASSISTANT_TEXT
+#
+# v0.8.13 (slowness, dominant residual fix): _json_str_decode — fork-free
+# decode of a jq @json-encoded string ("...") back to raw text, in PURE bash.
+#
+# WHY this is the hot path: the streaming text delta arrives @json-encoded
+# (so embedded newlines/tabs survive the line-oriented `read`).
+# v0.8.12 cut the per-delta routing to ONE jq call, but each text_delta and
+# input_json_delta STILL forked a SECOND jq (`jq -r '.'`) purely to un-escape
+# that string. A normal answer ships dozens-to-hundreds of deltas; on
+# Cygwin/MobaXterm a fork is ~50-100ms (Windows fork emulation), so that second
+# fork is the bulk of the residual "feels slow" lag — ~N forks per turn, where
+# N≈output tokens.
+#
+# The overwhelmingly common payload is a short chunk with NO backslash escapes
+# (plain words/spaces). For that case we strip the surrounding quotes and emit
+# the body verbatim — ZERO forks. We only fall back to jq when a backslash is
+# actually present (rare: a literal \n, \t, \", \uXXXX in the model's text).
+# Net: the dominant text path drops from 2 forks/delta to ~1, halving per-turn
+# fork count on top of v0.8.12. Verified round-trip below for escaped + Unicode.
+#
+# $1 = the @json string INCLUDING surrounding double-quotes. Echoes raw text.
+_json_str_decode() {
+  local s="$1"
+  # Empty / unquoted (jq emitted "" or a bare value) → nothing to do.
+  case "$s" in
+    '""'|'') printf ''; return ;;
+    '"'*'"') : ;; # well-formed quoted string → proceed
+    *) printf '%s' "$s"; return ;; # defensive: not quoted, pass through
+  esac
+  # Strip the surrounding quotes.
+  s="${s#\"}"; s="${s%\"}"
+  case "$s" in
+    *\\*)
+      # Has at least one escape — defer to jq for correct \uXXXX / \" / \\ / \n.
+      printf '"%s"' "$s" | jq -r '. // ""' 2>/dev/null
+      ;;
+    *)
+      # No escapes — verbatim, no fork.
+      printf '%s' "$s"
+      ;;
+  esac
+}
+
 parse_stream_to_response() {
   local out_file="$1"
   local err_body_file="${2:-}"
@@ -4255,7 +4425,9 @@ parse_stream_to_response() {
         text_delta)
           local t
           # _dpay is a JSON-encoded string ("..."); decode back to raw.
-          t=$(printf '%s' "$_dpay" | jq -r '. // ""' 2>/dev/null)
+          # v0.8.13: fork-free for the common (escape-free) chunk; jq only
+          # when an actual backslash escape is present. See _json_str_decode.
+          t=$(_json_str_decode "$_dpay")
           # Stream to stderr so it can't get swallowed by stdout redirect.
           # Color whole stream with magenta (Larry's voice).
           if [ "$started_text" = "0" ]; then
@@ -4267,7 +4439,9 @@ parse_stream_to_response() {
           ;;
         input_json_delta)
           local pj
-          pj=$(printf '%s' "$_dpay" | jq -r '. // ""' 2>/dev/null)
+          # v0.8.13: fork-free decode (see _json_str_decode). Tool-call
+          # arg fragments are typically escape-free JSON token slices.
+          pj=$(_json_str_decode "$_dpay")
           block_input_buf[$idx]+="$pj"
           ;;
         thinking_delta|signature_delta)
@@ -5024,8 +5198,8 @@ Slash commands:
   Audit: every tokenization writes a JSONL entry to \$LARRY_HOME/log/auto-phi.log
   (ts/value/category/token/tier/surface/context).
 
-  /redetect re-scan for HCIROOT/HCISITE/tools
-  /sites list site dirs under HCIROOT
+  /redetect re-scan for HCIROOT/HCISITE/tools + deployment mode
+  /sites [alias] count/list Cloverleaf sites — local, or REMOTE via (login shell)
   /site switch HCISITE for this session
   /pwd show current working directory
   /help this help
@@ -5208,7 +5382,7 @@ _LARRY_SLASH_CMDS_DESC=(
   [/ssh-close]=" close the ControlMaster"
   [/ssh-status]="show open ControlMaster sessions + cred presence"
   [/redetect]="re-scan for HCIROOT/HCISITE/tools"
-  [/sites]="list site dirs under HCIROOT"
+  [/sites]="count/list Cloverleaf sites — local, or REMOTE via /sites "
   [/site]=" switch HCISITE for this session"
   [/reset]="clear conversation history (keeps log)"
   [/model]=" switch model (e.g. /model claude-opus-4-7)"
@@ -6358,12 +6532,10 @@ main_loop() {
         system_prompt=$(build_system_prompt)
         larry_say "re-detected. /env to view."
         continue ;;
-      /sites) if [ -n "${HCIROOT:-}" ] && [ -d "$HCIROOT" ]; then
-          if command -v sites >/dev/null 2>&1; then sites; else
-            find "$HCIROOT" -mindepth 1 -maxdepth 1 -type d -exec basename {} \; \
-              | grep -Ev '^(archiving|master|lib|tcl|server|client|clgui|cchgs|Alerts|AppDefaults|Tables|backup.*)$' | sort
-          fi
-        else err "HCIROOT not set"; fi
+      /sites*) # v0.8.13: both-mode site listing. `/sites` → LOCAL; `/sites `
+        # → REMOTE (login-shell discover over the open ControlMaster).
+        local _site_alias; _site_alias=$(_slash_args "/sites" "$input")
+        tool_list_sites "${_site_alias:-}" ""
         continue ;;
       /site\ *) HCISITE="${input#/site }"; HCISITEDIR="$HCIROOT/$HCISITE"
         export HCISITE HCISITEDIR
@@ -6452,9 +6624,11 @@ Pattern filter: ${_pat:-}.
 Plan and execute:
 1. Run ssh_status to confirm both aliases have an open ControlMaster. If
    either is closed, stop and tell me to run /ssh-setup .
-2. Use ssh_exec to locate the NetConfig paths on each env (e.g.
-   find \$HCIROOT -maxdepth 3 -name NetConfig -type f), or ask me for the
-   site name if HCIROOT isn't exported on the remote.
+2. Discover each env's sites with list_sites(alias=...) — it resolves the
+   remote \$HCIROOT in a login shell for you. Then locate NetConfig paths via
+   ssh_exec (e.g. find \$HCIROOT -maxdepth 3 -name NetConfig -type f) — ssh_exec
+   already runs in a login shell, so \$HCIROOT is populated. Do NOT ask me to
+   export \$HCIROOT.
 3. ssh_pull each NetConfig locally. Also pull the matching Xlate/, tclprocs/,
    tables/ directories alongside if you intend to diff referenced artifacts.
 4. Use nc_diff_interface with --interface set per protocol, --left and --right
@@ -6494,8 +6668,11 @@ Output root: $_out
 Plan and execute:
 1. ssh_status to confirm BOTH aliases have an open ControlMaster. If either is
    closed, stop and tell me to run /ssh-setup .
-2. Discover the remote HCIROOT for each alias (ssh_exec 'echo \$HCIROOT'). If
-   not exported, ask me. Same for HCISITE if scope=site.
+2.
+   Discover the remote HCIROOT for each alias with list_sites(alias=...) — it
+   resolves \$HCIROOT in a login shell and lists the sites. (Equivalently,
+   ssh_exec 'echo \$HCIROOT' now works because ssh_exec runs a login shell.)
+   Do NOT ask me to export it. Only ask which site if scope=site and it's
+   ambiguous from the discovered list.
 3. Call nc_regression with:
    - scope = "$_scope"
    - source_ssh_alias = "$_src"
diff --git a/lib/ssh-helper.sh b/lib/ssh-helper.sh
index 9260dd0..8987075 100755
--- a/lib/ssh-helper.sh
+++ b/lib/ssh-helper.sh
@@ -24,6 +24,14 @@
 #   close               close ControlMaster
 #   status [alias]      show open masters / cred presence
 #   exec                run command via master (returns output)
+#   discover            auto-detect remote Cloverleaf env:
+#                       resolves $HCIROOT (LOGIN shell), then
+#                       enumerates sites (hcisitelist fast-path,
+#                       NetConfig-walk fallback). Prints TSV:
+#                         HCIROOT
+#                         SITE (one per site)
+#                       No nagging for paths — the remote's own
+#                       login profile is the source of truth.
 #   pull [local]        scp remote → local via existing master
 #   push                scp local → remote via existing master
 #   pull-smat [days_back]
@@ -238,6 +246,36 @@ cmd_status() {
   cmd_hosts
 }
 
+# v0.8.13 (Cloverleaf login-shell fix): exec defaults to a LOGIN shell.
+#
+# Root cause of the "qa keeps asking me for $HCIROOT" friction: a plain
+#   ssh host 'cmd'
+# runs a NON-interactive, NON-login shell. On a Cloverleaf host, $HCIROOT (and
+# $HCISITE, the hci* binaries on PATH, etc.) are exported by the LOGIN profile
+# (/etc/profile.d, the hci user's ~/.profile / ~/.bash_profile, the per-site
+# `.profile`). A non-login shell never sources those, so $HCIROOT arrives empty
+# and Larry used to give up and nag the user for a path. Wrapping the command in
+# `bash -lc` forces a login shell, so the Cloverleaf environment populates
+# exactly as it does for an interactive operator login. This is the version-
+# agnostic, no-config fix — it works on any Cloverleaf host whose operator login
+# sets up the environment (i.e.
all of them). +# +# Escape hatch: prefix the command with the literal token NOLOGIN (or set +# LARRY_SSH_NO_LOGIN=1) to run a bare non-login shell — for the rare host where +# the login profile is interactive-only and hangs a non-tty `bash -l`. +_build_login_cmd() { + # $1 = raw command string. Echoes the command to hand to ssh. + local raw="$1" + case "$raw" in + NOLOGIN\ *) printf '%s' "${raw#NOLOGIN }"; return ;; + esac + [ "${LARRY_SSH_NO_LOGIN:-0}" = "1" ] && { printf '%s' "$raw"; return; } + # Single-quote the payload for a robust `bash -lc ''`. Embedded + # single quotes become '\'' (close, escaped-quote, reopen) — POSIX-portable. + local esc; esc=$(printf '%s' "$raw" | sed "s/'/'\\\\''/g") + printf "bash -lc '%s'" "$esc" +} + cmd_exec() { local alias="${1:-}" [ -n "$alias" ] || die "usage: exec " @@ -253,8 +291,56 @@ cmd_exec() { if [ ! -S "$sock" ] || ! ssh -S "$sock" -O check -p "$port" "$addr" 2>/dev/null; then die "no open master for $alias — run 'setup $alias' first" fi - # Multiplexed; no password needed. - ssh -S "$sock" -p "$port" -o BatchMode=yes "$addr" "$cmd" + # Multiplexed; no password needed. Run in a login shell so $HCIROOT et al. + # populate from the remote Cloverleaf login profile (see _build_login_cmd). + ssh -S "$sock" -p "$port" -o BatchMode=yes "$addr" "$(_build_login_cmd "$cmd")" +} + +# cmd_discover ALIAS — proactively detect the remote Cloverleaf environment. +# Resolves $HCIROOT in a LOGIN shell, then enumerates sites two ways: +# 1. hcisitelist (the Cloverleaf-shipped site lister) if it's on the login PATH +# 2. NetConfig walk under $HCIROOT (version-agnostic ground truth — the same +# "a site is a dir with a NetConfig" rule each-site.sh uses) +# Emits TSV to stdout the tool layer can parse deterministically: +# HCIROOT (or HCIROOT if unresolved) +# SITE (zero or more) +# Never prompts; on failure it emits what it could resolve + a NOTE line. 
+cmd_discover() { + local alias="${1:-}" + [ -n "$alias" ] || die "usage: discover " + local addr_port; addr_port=$(read_host_addr "$alias") + [ -n "$addr_port" ] || die "no such alias: $alias" + local addr port + addr=$(printf '%s' "$addr_port" | cut -f1) + port=$(printf '%s' "$addr_port" | cut -f2) + local sock="$SSH_SOCKETS_DIR/$alias.sock" + if [ ! -S "$sock" ] || ! ssh -S "$sock" -O check -p "$port" "$addr" 2>/dev/null; then + die "no open master for $alias — run 'setup $alias' first" + fi + + # A single login-shell remote script. It: + # - prints HCIROOT\t$HCIROOT + # - tries `hcisitelist` (Cloverleaf site lister); each non-empty token → SITE + # - falls back to a NetConfig walk under $HCIROOT (depth ≤2) + # Kept POSIX-sh so it runs under whatever /bin/sh the login shell spawns. + local remote=' + printf "HCIROOT\t%s\n" "${HCIROOT:-}"; + if [ -z "${HCIROOT:-}" ]; then + printf "NOTE\tHCIROOT empty even in a login shell — operator profile may not export it\n"; + exit 0; + fi + got=0; + if command -v hcisitelist >/dev/null 2>&1; then + for s in $(hcisitelist 2>/dev/null); do + [ -n "$s" ] && { printf "SITE\t%s\n" "$s"; got=1; } + done; + fi; + if [ "$got" = "0" ]; then + find "$HCIROOT" -mindepth 1 -maxdepth 2 -name NetConfig -type f 2>/dev/null \ + | while IFS= read -r nc; do d=$(dirname "$nc"); printf "SITE\t%s\n" "$(basename "$d")"; done \ + | sort -u; + fi' + ssh -S "$sock" -p "$port" -o BatchMode=yes "$addr" "$(_build_login_cmd "$remote")" } # ── v0.6.8: scp helpers that multiplex via the existing ControlMaster ──────── @@ -396,9 +482,10 @@ cmd_pull_smat() { || die "usage: pull-smat [days_back]" _resolve_open_master "$alias" - # Discover the remote .smatdb path. We rely on HCIROOT being exported in - # the remote shell rc (typical Cloverleaf user profile), else SITEDIR is - # taken as / via ssh-resolved $HCIROOT. We do the find + # Discover the remote .smatdb path. 
$HCISITEDIR/$HCIROOT are resolved by the + # LOGIN shell (see _build_login_cmd) — the v0.8.13 fix — so we no longer + # depend on a non-login rc happening to export them. SITEDIR falls back to + # / if HCISITEDIR isn't set for that site. The find runs # remotely to avoid hard-coding process directory names. local find_cmd find_cmd='set -e; SDIR="${HCISITEDIR:-${HCIROOT:-}/'"$site"'}"; ' @@ -406,10 +493,26 @@ cmd_pull_smat() { find_cmd+='F=$(find "$SDIR/exec/processes" -maxdepth 2 -type f -name "'"$thread"'.smatdb" 2>/dev/null | head -1); ' find_cmd+='[ -n "$F" ] || F=$(find "$SDIR" -type f -name "'"$thread"'.smatdb" 2>/dev/null | head -1); ' find_cmd+='[ -n "$F" ] || { echo "ERROR: no smatdb found for thread '"$thread"' under $SDIR" >&2; exit 3; }; ' - find_cmd+='printf "%s\n" "$F"' + # v0.8.13 M1 hardening (Vera Minor #1): emit the resolved path behind an + # unambiguous sentinel prefix instead of relying on it being the last stdout + # line. A login shell (`bash -lc`, the v0.8.13 fix) is the case most likely to + # print a MOTD/banner to stdout, which a blind `tail -1` would mistake for the + # path. We grep for the sentinel line and strip it; only if no sentinel is + # present (host somehow stripped it) do we fall back to the prior `tail -1` + # behaviour, so this can never regress a host that worked before. + find_cmd+='printf "SMATDB_PATH:%s\n" "$F"' - local remote_smatdb - remote_smatdb=$(ssh -S "$_RH_SOCK" -p "$_RH_PORT" -o BatchMode=yes "$_RH_ADDR" "$find_cmd" 2>&1 | tail -1) + local _smat_raw remote_smatdb + _smat_raw=$(ssh -S "$_RH_SOCK" -p "$_RH_PORT" -o BatchMode=yes "$_RH_ADDR" "$(_build_login_cmd "$find_cmd")" 2>&1) + remote_smatdb=$(printf '%s\n' "$_smat_raw" | grep '^SMATDB_PATH:' | tail -1) + if [ -n "$remote_smatdb" ]; then + remote_smatdb="${remote_smatdb#SMATDB_PATH:}" + else + # No sentinel — surface any ERROR: line if present, else fall back to the + # last line (pre-hardening behaviour) so failure modes stay diagnosable. 
+ remote_smatdb=$(printf '%s\n' "$_smat_raw" | grep '^ERROR:' | tail -1) + [ -n "$remote_smatdb" ] || remote_smatdb=$(printf '%s\n' "$_smat_raw" | tail -1) + fi case "$remote_smatdb" in ERROR:*|'') die "remote smatdb lookup failed: $remote_smatdb" ;; esac @@ -453,7 +556,8 @@ cmd_pull_smat() { sample_cmd+='RETURNED=$(sqlite3 "'"$remote_smatdb"'" "SELECT MIN(1000, COUNT(*)) FROM smat_msgs WHERE Time >= $CUTOFF_MS"); ' sample_cmd+='echo "# smatdb=$(basename '"$remote_smatdb"') days_back='"$days_back"' total_in_window=$TOTAL returned=$RETURNED truncated=$([ "$TOTAL" -gt 1000 ] && echo yes || echo no)" >&2' - ssh -S "$_RH_SOCK" -p "$_RH_PORT" -o BatchMode=yes "$_RH_ADDR" "$sample_cmd" + # Login shell so sqlite3 resolves from the operator's PATH (v0.8.13). + ssh -S "$_RH_SOCK" -p "$_RH_PORT" -o BatchMode=yes "$_RH_ADDR" "$(_build_login_cmd "$sample_cmd")" } case "${1:-help}" in @@ -465,6 +569,7 @@ case "${1:-help}" in close|exit) shift; cmd_close "$@" ;; status) shift; cmd_status "$@" ;; exec|run) shift; cmd_exec "$@" ;; + discover) shift; cmd_discover "$@" ;; pull) shift; cmd_pull "$@" ;; push) shift; cmd_push "$@" ;; pull-smat) shift; cmd_pull_smat "$@" ;;
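The sentinel-based path capture in `cmd_pull_smat` above can be exercised without a remote host. What follows is a minimal sketch rather than code from the patch: `select_smatdb_path` is a hypothetical helper name, and the sample banner text and `/tmp/example/...` path are made up for illustration. It mirrors the selection order the hunk describes: the last `SMATDB_PATH:` line first, then an `ERROR:` line, then the last line of output.

```shell
# Hypothetical helper (not part of the patch): choose the smatdb path from raw
# ssh output that may also contain a login-shell banner or MOTD.
select_smatdb_path() {
  local raw line
  raw="$1"
  # Prefer the last line that carries the sentinel prefix, and strip the prefix.
  line=$(printf '%s\n' "$raw" | grep '^SMATDB_PATH:' | tail -1)
  if [ -n "$line" ]; then
    printf '%s\n' "${line#SMATDB_PATH:}"
    return 0
  fi
  # No sentinel: surface an ERROR: line if one exists, else fall back to the
  # last line of output (the pre-hardening behaviour).
  line=$(printf '%s\n' "$raw" | grep '^ERROR:' | tail -1)
  [ -n "$line" ] || line=$(printf '%s\n' "$raw" | tail -1)
  printf '%s\n' "$line"
}

# Made-up sample output: banner text appears both before and after the path,
# which is the case a blind `tail -1` would get wrong.
sample=$(printf 'Welcome to qa\nSMATDB_PATH:/tmp/example/thread.smatdb\nHave a nice day')
select_smatdb_path "$sample"
```

Selecting by pattern instead of by position means trailing banner text no longer matters. The caller still has to reject an `ERROR:` or empty result, as the `case "$remote_smatdb"` check in the patch does.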