v0.8.1: tool-result content-shape gating + base64 round-trip + review gate

Three changes expanding PHI safety envelope on the tool-result surface. Closes V2 + V12 + V2-sub from Vera's audit. No behavior change for users not interacting with HL7-shaped data. - Tool-name allow-list dropped. The v0.7.3 tool-result auto-PHI gate ran only on read_file (.hl7|.txt), nc_msgs, hl7_field, hl7_diff. v0.8.1 runs _auto_phi_looks_like_hl7 on EVERY tool result. On hit → route through lib/hl7-sanitize.sh. On miss → pass through unchanged. Closes V2: bash_exec / ssh_exec / grep_files / read_file of any extension all get scanned when their output is HL7-shaped. False- positive cost is negligible (extra regex pass on non-HL7 has zero behavioral impact). - Base64-wrapped HL7 round-trip. New _auto_phi_b64_roundtrip helper. Detects candidate base64 runs (length >= 200, [A-Za-z0-9+/=] only, length divisible by 4 — NOT entropy-based per Pax §V2-sub: HL7's repetitive prefixes survive base64 with LOW entropy, so entropy is the wrong signal). Speculatively decodes each candidate; if decoded bytes look like HL7, routes through hl7-sanitize.sh and re-encodes back into the result. Catches ssh_pull_smat sampled mode's TSV format. Requires python3 (installed everywhere larry runs); skipped with a one-time stderr warning when unavailable. Server-side TSV encoding kept (binary-safe transport); client-side unwrap handles the safety concern, no remote refactor needed. - Operator review gate for bash_exec/ssh_exec/ssh_pull/ssh_pull_smat results. When the tool produced HL7-shaped output OR the result exceeds LARRY_TOOL_RESULT_REVIEW_THRESHOLD bytes (default 8192), Larry prompts [Y/n/i] before passing the result back to the model. 'i' opens the full output in $PAGER then re-prompts. Default Y (zero friction). N substitutes a refusal JSON so the model surfaces that something was withheld. Skipped when LARRY_AUTO_PHI=off (opt-out consistency) OR no TTY (headless scripts unaffected). Override with LARRY_TOOL_RESULT_REVIEW=always for paranoid mode. Closes V12. Proactive same-pattern sweep. Searched for other call sites where tool output bypasses content-shape gating: only the one in agent_turn. The v0.8.0-c strict-mode tool-result branch was updated in lockstep so it now triggers on the broader (content-only) eligibility. Verification: bash -n clean; b64 round-trip unit-tested with three cases (real-world HL7 base64 → decoded contains tokenized PHI not clear-text PHI; plain text → passthrough; non-HL7 b64 → passthrough, no false positive). Co-Authored-By: Clover (Claude Opus 4.7) <noreply@anthropic.com>
2026-05-27 19:45:23 -07:00 · 2026-05-27 19:45:23 -07:00 · 9fc38e743d
commit 9fc38e743d
parent 7434e6e8b8
3 changed files with 314 additions and 17 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@ -4,6 +4,53 @@ All notable changes to `cloverleaf-larry` / `larry-anywhere` are recorded here.
 Versioning is loose-semver; bumps trigger the in-process self-update on every
 running client via `LARRY_BASE_URL` + `MANIFEST`.
 ## v0.8.1 — 2026-05-27
 Tool-result PHI gating expansion. Closes V2 / V12 and the V2 base64 sub-gap
 from Vera's audit. No behavior change for users not on HL7-shaped data;
 opt-in friction for the 8KB+ tool-result review gate.
 - **Tool-name allow-list dropped; content-shape gating only.** The v0.7.3
  tool-result auto-PHI gate ran only on `read_file (.hl7|.txt)`, `nc_msgs`,
  `hl7_field`, `hl7_diff`. v0.8.1 runs `_auto_phi_looks_like_hl7` on
  EVERY tool result. On hit → route through `lib/hl7-sanitize.sh`.
  On miss → pass through unchanged. Closes V2: `bash_exec`/`ssh_exec`/
  `grep_files`/`read_file` of `.log`/`.csv`/`.dat`/no-suffix files are
  now all covered when their output is HL7-shaped. False-positive cost
  is cheap (extra regex pass with zero behavioral impact on non-HL7).
 - **Base64-wrapped HL7 round-trip.** New `_auto_phi_b64_roundtrip` helper.
  Detects candidate base64 runs (length >= 200 chars, `[A-Za-z0-9+/=]`
  only, length divisible by 4 — NOT entropy-based, per Pax §V2-sub:
  HL7's repetitive prefixes survive base64 with LOW entropy). Speculatively
  decodes each run; if decoded bytes look like HL7, routes through
  `hl7-sanitize.sh` and re-encodes (`base64 -w0`) back into the result.
  Catches `ssh_pull_smat` sampled mode TSV (server-side encoding kept
  for binary-safe TSV transport; client-side unwrap handles the safety
  concern). Requires `python3` (installed everywhere larry-anywhere
  runs); skipped with a one-time stderr warning if unavailable.
 - **Operator review gate for `bash_exec`/`ssh_exec`/`ssh_pull`/
  `ssh_pull_smat` results.** When the tool produced HL7-shaped output OR
  the result exceeds `LARRY_TOOL_RESULT_REVIEW_THRESHOLD` bytes
  (default 8192), Larry prompts `[Y/n/i]` before passing the result
  back to the model. `i` opens the result in `$PAGER` then re-prompts.
  Default Y — zero friction by default. `N` substitutes a refusal JSON
  so the model knows a result was withheld. Skipped when
  `LARRY_AUTO_PHI=off` (consistent with the opt-out) OR running
  non-interactively (no TTY — never blocks headless scripts).
  Override with `LARRY_TOOL_RESULT_REVIEW=always` to gate every result.
  Per Pax §V2/V12: closes the "operator wanted to see this themselves,
  didn't want the model to see it" gap that's the actual common case.
 **Proactive same-pattern sweep.** Searched the codebase for other call
 sites where tool output bypasses content-shape gating: found only the
 one in `agent_turn`. The v0.8.0-c strict-mode tool-result branch was
 hardened in lockstep so it now triggers on the broader (content-only)
 eligibility instead of the old name-allow-list.
 Manifest unchanged.
 ## v0.8.0 — 2026-05-27
 PHI-safety quick-wins pack — three independent zero-risk patches closing
--- a/2
+++ b/2
@ -1 +1 @@
-0.8.0
+0.8.1
--- a/larry.sh
+++ b/larry.sh
@ -57,7 +57,7 @@ set -o pipefail
 # ─────────────────────────────────────────────────────────────────────────────
 # Config
 # ─────────────────────────────────────────────────────────────────────────────
-LARRY_VERSION="0.8.0"
+LARRY_VERSION="0.8.1"
 LARRY_HOME="${LARRY_HOME:-$HOME/.larry}"
 # ─────────────────────────────────────────────────────────────────────────────
@ -1553,6 +1553,106 @@ _auto_phi_looks_like_hl7() {
  return 1
 }
 # v0.8.1-c: base64-wrapped HL7 round-trip.
 # Walks every base64-shaped run (>=200 chars, [A-Za-z0-9+/=], length%4==0)
 # in TEXT. For each: speculatively decode; if the decoded bytes look like
 # HL7, route through hl7-sanitize.sh and re-encode (base64 -w0 on GNU,
 # `base64 | tr -d \n` on BSD). Substitute the re-encoded form back into
 # TEXT. Echoes the rewritten text on stdout; empty stdout means "nothing
 # matched, leave the result alone" (callers MUST check for non-empty).
 #
 # Per Pax §V2-sub: do NOT use entropy as the trigger — HL7's repetitive
 # prefixes (`PID|1||...`) compress to LOW-entropy base64. Use length +
 # charset + modulo-4 as the candidate filter; speculative decode is the
 # decision point. False-positive cost is one extra base64-and-grep round
 # trip per matched run; cheap.
 _auto_phi_b64_roundtrip() {
  local text="$1" toolname="${2:-tool}"
  local sanitize_script="$LARRY_LIB_DIR/hl7-sanitize.sh"
  [ -x "$sanitize_script" ] || { printf ''; return 1; }
  # We use python3 for the walk because pure-bash regex over potentially
  # 400KB input with reliable run extraction is fragile; python3 is
  # present everywhere larry-anywhere runs (macOS, Linux, Cygwin).
  # The python helper:
  #   - finds every [A-Za-z0-9+/]{200,}={0,2} run with length%4==0
  #   - speculatively decodes
  #   - checks for HL7 shape in decoded bytes
  #   - if shape matches, writes decoded to a tempfile, runs hl7-sanitize.sh,
  #     re-encodes the sanitized output, and patches it back into the text
  #   - on no matches OR no shape hits, prints nothing (caller leaves result alone)
  if ! command -v python3 >/dev/null 2>&1; then
    # Python3 unavailable. We do NOT attempt a half-implementation in
    # bash/awk — base64 round-trip with byte-faithful re-encode is fiddly
    # to get right and a buggy substitute could corrupt the tool result
    # mid-payload. Conservative: leave the result intact, log once per
    # session for visibility. Strict-mode escape is handled upstream;
    # default mode falls back to the plain HL7-shape branch which catches
    # cleartext MSH/PID anyway.
    if [ -z "${_LARRY_B64_PY3_WARNED:-}" ]; then
      printf '%sphi>%s base64 unwrap pass skipped: python3 not on PATH (install python3 to enable v0.8.1-c)\n' \
        "$C_DIM" "$C_RESET" >&2
      _LARRY_B64_PY3_WARNED=1
    fi
    printf ''
    return 1
  fi
  # Write the python helper to a tempfile so we can pass text via stdin
  # (the `python3 - <<HEREDOC` form conflicts with stdin-piping — the
  # heredoc consumes the dash's stdin slot). Tempfile approach keeps both
  # channels clean.
  local _b64py; _b64py=$(mktemp -t larry-b64.XXXXXX.py 2>/dev/null || mktemp)
  cat > "$_b64py" <<'PY'
 import os, re, subprocess, sys, tempfile, base64
 sanitize_script = sys.argv[1]
 text = sys.stdin.buffer.read()
 # Candidate b64 runs: length >= 200, only [A-Za-z0-9+/=], length % 4 == 0.
 pat = re.compile(rb"[A-Za-z0-9+/]{200,}={0,2}")
 data = text
 changed = False
 def is_hl7(b):
    head = b[:4096]
    if head.startswith(b"MSH|"): return True
    for sep in (b"\rMSH|", b"\nMSH|", b"\rPID|", b"\nPID|",
                b"\rEVN|", b"\nEVN|", b"\rPV1|", b"\nPV1|"):
        if sep in head: return True
    return False
 def replace(match):
    global changed
    run = match.group(0)
    if len(run) % 4 != 0:
        return run
    try:
        decoded = base64.b64decode(run, validate=True)
    except Exception:
        return run
    if not is_hl7(decoded):
        return run
    with tempfile.NamedTemporaryFile(delete=False) as tf:
        tf.write(decoded)
        tfname = tf.name
    try:
        proc = subprocess.run(["bash", sanitize_script, tfname],
                              capture_output=True, timeout=30)
    finally:
        os.unlink(tfname)
    if proc.returncode != 0 or not proc.stdout:
        # Sanitize failure: keep original (fail-open). Strict-mode escape is
        # already handled upstream — this helper is best-effort cleanup.
        return run
    san = proc.stdout
    reencoded = base64.b64encode(san)
    changed = True
    return reencoded
 new_data = pat.sub(replace, data)
 if changed:
    sys.stdout.buffer.write(new_data)
 PY
  printf '%s' "$text" | python3 "$_b64py" "$sanitize_script"
  local _rc=$?
  rm -f "$_b64py"
  return $_rc
 }
 # Main detector. Args: surface ("user_input"|"tool_result"), input text.
 # Echoes the rewritten input. Status message goes to stderr.
 #
@ -2447,6 +2547,102 @@ _pretty_tool_input() {
  ' 2>/dev/null
 }
 # v0.8.1-b: second approval gate for tool results.
 # Sets the shell-global $_LARRY_GATE_RESULT to either the original result
 # (user accepted) or a refusal sentinel (user declined). Never silently
 # replaces — the caller reads $_LARRY_GATE_RESULT explicitly.
 #
 # Triggers (any of):
 #   - tool name is in the "operator-intent" list (bash_exec, ssh_exec,
 #     ssh_pull, ssh_pull_smat) AND
 #       - result is HL7-shaped, OR
 #       - result > LARRY_TOOL_RESULT_REVIEW_THRESHOLD bytes (default 8192)
 #   - LARRY_TOOL_RESULT_REVIEW=always (covers every tool result)
 #
 # Skipped when:
 #   - LARRY_AUTO_PHI=off (operator explicitly opted out of PHI prompts)
 #   - no controlling TTY (headless / non-interactive scripts must not block)
 _LARRY_GATE_RESULT=""
 _maybe_tool_result_review_gate() {
  local name="$1" result="$2"
  _LARRY_GATE_RESULT="$result"
  # Bypasses.
  [ "$AUTO_PHI_MODE" = "off" ] && return 0
  [ -t 0 ] || return 0
  [ -t 2 ] || return 0
  # Always-on env override.
  if [ "${LARRY_TOOL_RESULT_REVIEW:-}" = "always" ]; then
    :  # fall through to prompt
  else
    # Otherwise, only gate the operator-intent tools.
    case "$name" in
      bash_exec|ssh_exec|ssh_pull|ssh_pull_smat) ;;
      *) return 0 ;;
    esac
    # Trigger check.
    local size; size=$(printf '%s' "$result" | wc -c | tr -d ' ')
    local threshold="${LARRY_TOOL_RESULT_REVIEW_THRESHOLD:-8192}"
    # coerce_int defends against CR-tainted env on Cygwin (v0.7.5 lessons).
    if declare -F coerce_int >/dev/null 2>&1; then
      threshold=$(coerce_int "$threshold" 8192)
      size=$(coerce_int "$size" 0)
    fi
    local hl7=0
    _auto_phi_looks_like_hl7 "$result" && hl7=1
    if [ "$hl7" != "1" ] && [ "$size" -le "$threshold" ]; then
      return 0
    fi
  fi
  # Render the prompt. Show a 240-char preview of the result.
  local preview; preview=$(printf '%s' "$result" | head -c 240)
  printf '\n%s══ tool result review ══%s\n' "$C_YELLOW" "$C_RESET" >&2
  printf '  tool:   %s\n' "$name" >&2
  printf '  bytes:  %s\n' "$(printf '%s' "$result" | wc -c | tr -d ' ')" >&2
  if _auto_phi_looks_like_hl7 "$result"; then
    printf '  shape:  HL7-shaped (post-sanitize)\n' >&2
  fi
  printf '%s── preview (first 240 chars) ──%s\n' "$C_DIM" "$C_RESET" >&2
  printf '%s\n' "$preview" >&2
  printf '%sphi> send this output back to the model? [Y/n/i (inspect)]:%s ' \
    "$C_BOLD" "$C_RESET" >&2
  local ans=""
  if declare -F read_clean >/dev/null 2>&1; then
    read_clean ans </dev/tty 2>/dev/null || ans=""
  else
    IFS= read -r ans </dev/tty 2>/dev/null || ans=""
    ans="${ans//$'\r'/}"
  fi
  while [ "$ans" = "i" ] || [ "$ans" = "I" ]; do
    local pager="${PAGER:-less}"
    local _ins_tmp; _ins_tmp=$(mktemp)
    printf '%s' "$result" > "$_ins_tmp"
    if command -v "$pager" >/dev/null 2>&1; then
      "$pager" "$_ins_tmp" </dev/tty || true
    else
      cat "$_ins_tmp" >&2
    fi
    rm -f "$_ins_tmp"
    printf '%sphi> send this output back to the model? [Y/n/i (inspect again)]:%s ' \
      "$C_BOLD" "$C_RESET" >&2
    if declare -F read_clean >/dev/null 2>&1; then
      read_clean ans </dev/tty 2>/dev/null || ans=""
    else
      IFS= read -r ans </dev/tty 2>/dev/null || ans=""
      ans="${ans//$'\r'/}"
    fi
  done
  case "$ans" in
    n|N|no|NO|No)
      _LARRY_GATE_RESULT='{"error":"tool result withheld by operator","tool":"'"$name"'","note":"operator reviewed the output locally and declined to send it back to the model"}'
      printf '%sphi>%s tool result WITHHELD from model (operator declined)\n' \
        "$C_DIM" "$C_RESET" >&2
      ;;
    *)
      # Default Y — pass through.
      ;;
  esac
 }
 # Display a tool call header (cyan + bold name, dim args, optional truncation hint).
 display_tool_call() {
  local name="$1" input_json="$2"
@ -3248,17 +3444,25 @@ agent_turn() {
      # designed for prose, not pipe-delimited segment data). The two
      # share lookup.tsv so tokens are stable across surfaces.
      if [ "$AUTO_PHI_MODE" != "off" ]; then
-        local _ap_eligible=0
+        # v0.8.1-a: content-shape gating replaces the v0.7.3 tool-name
-        case "$name" in
+        # allow-list. The shape detector (_auto_phi_looks_like_hl7) runs on
-          read_file)
+        # EVERY tool result regardless of which tool produced it. On hit →
-            local _ap_path
+        # route through hl7-sanitize.sh. On miss → pass through unchanged.
-            _ap_path=$(printf '%s' "$input_json" | jq -r '.path // ""' 2>/dev/null)
+        # This closes V2 (bash_exec, ssh_exec, grep_files, and read_file of
-            case "$_ap_path" in
+        # any extension all get scanned when their output is HL7-shaped).
-              *.hl7|*.HL7|*.txt|*.TXT) _ap_eligible=1 ;;
+        # False-positive cost: cheap — sanitizer runs against output that
-            esac
+        # doesn't actually match its rules; mints no tokens; passes through.
-            ;;
+        local _ap_eligible=1
-          nc_msgs|hl7_field|hl7_diff) _ap_eligible=1 ;;
+        # v0.8.1-c: base64 unwrap pass. Detect candidate base64 (length+
-        esac
+        # charset+modulo, NOT entropy — per Pax §V2-sub: HL7's repetitive
        # prefixes survive base64 with LOW entropy, so entropy is the wrong
        # signal). Speculatively decode each candidate; if the decoded bytes
        # look like HL7, route THOSE through hl7-sanitize.sh and re-encode
        # back into the result. Catches ssh_pull_smat sampled mode TSV.
        local _ap_b64=0
        if printf '%s' "$result" | head -c 65536 | grep -qE '[A-Za-z0-9+/]{200,}={0,2}' 2>/dev/null; then
          _ap_b64=1
        fi
        # v0.8.0-c: strict mode aborts if sanitizer script is missing/non-exec
        # when we have HL7-shaped output. We can't kill the tool-loop iteration
        # without sending SOMETHING back to satisfy the tool_use; substitute
@ -3271,6 +3475,20 @@ agent_turn() {
            "$C_DIM" "$C_RESET" "$name" >&2
          result='{"error":"auto-PHI sanitizer unavailable on HL7-shaped result","tool":"'"$name"'","action":"result withheld; set LARRY_AUTO_PHI=on to fall back to best-effort, or repair lib/hl7-sanitize.sh"}'
          _ap_eligible=0   # skip the normal sanitize path below
          _ap_b64=0
        fi
        # v0.8.1-c: base64-wrapped HL7 round-trip (decode → sanitize → re-encode).
        # Runs BEFORE the plain HL7-shape branch so a result that's pure b64
        # (no MSH| in cleartext) still gets the field-aware sanitize.
        if [ "$_ap_b64" = "1" ] && [ -x "$LARRY_LIB_DIR/hl7-sanitize.sh" ]; then
          local _b64_changed
          _b64_changed=$(_auto_phi_b64_roundtrip "$result" "$name") || true
          if [ -n "$_b64_changed" ]; then
            result="$_b64_changed"
            printf '%sphi>%s base64-wrapped HL7 detected in %s result; decoded, sanitized, re-encoded\n' \
              "$C_DIM" "$C_RESET" "$name" >&2
            _auto_phi_log "(b64-hl7 roundtrip)" "BATCH" "(b64-decoded-and-sanitized)" "hl7_pipeline" "tool_result" "$name (b64)"
          fi
        fi
        if [ "$_ap_eligible" = "1" ] && _auto_phi_looks_like_hl7 "$result"; then
          local _ap_tmp _ap_sanitized _ap_before _ap_after
@ -3302,6 +3520,26 @@ agent_turn() {
        fi
      fi
      # v0.8.1-b: second approval gate. After the tool ran and we have the
      # (possibly sanitized) result, prompt the user before passing it back
      # to the model. Triggers:
      #   - tool produced HL7-shaped output (post-sanitize, in case sanitize
      #     missed something or fell open), OR
      #   - output exceeds LARRY_TOOL_RESULT_REVIEW_THRESHOLD bytes
      #     (default 8192), OR
      #   - LARRY_TOOL_RESULT_REVIEW=always
      # Skipped when:
      #   - LARRY_AUTO_PHI=off (user has explicitly opted out of all PHI
      #     safety prompts; consistent with that opt-out)
      #   - non-interactive shell (no TTY — never block headless scripts)
      #   - tool name is read_file/list_dir/grep_files/glob_files (these
      #     already had user intent via the model's tool_use; the model
      #     asked for them. The dominant V12 risk is bash_exec/ssh_exec/
      #     ssh_pull/ssh_pull_smat where the operator's "run + show" intent
      #     may not include "show to model")
      _maybe_tool_result_review_gate "$name" "$result"
      result="$_LARRY_GATE_RESULT"
      _LARRY_LAST_TOOL_RESULT="$result"
      log_append '```'; log_append "$result"; log_append '```'
@ -3436,10 +3674,22 @@ Slash commands:
      4 KNOWN        Value matches an existing lookup.tsv entry — Bryan
                     has seen this value before. Always.
-    Tool-result scan applies ONLY to read_file (.hl7/.txt), hl7_*, and
+    Tool-result scan (v0.8.1): runs on EVERY tool result. The tool-name
-    nc_msgs outputs, and ONLY if content is HL7-shaped. Generic outputs
+    allow-list was dropped — content-shape gating (_auto_phi_looks_like_hl7)
-    (list_dir, grep_files, bash_exec, web search) are NEVER scanned — the
+    is now the only filter. HL7-shaped output from any tool (bash_exec,
-    risk of breaking legitimate text outweighs the catch rate.
+    ssh_exec, grep_files, read_file of any extension, nc_msgs, etc.) is
    routed through hl7-sanitize.sh. Non-HL7-shaped output passes through
    unchanged (no behavior change for normal text). v0.8.1 also adds a
    base64 round-trip pass (decode → shape-check → sanitize → re-encode)
    for ssh_pull_smat sampled mode and any other base64-wrapped HL7.
    Operator review gate (v0.8.1): for bash_exec/ssh_exec/ssh_pull/
    ssh_pull_smat results that are HL7-shaped OR exceed
    LARRY_TOOL_RESULT_REVIEW_THRESHOLD bytes (default 8192), Larry prompts
    [Y/n/i] before passing the result back to the model. 'i' opens the
    full output in \$PAGER. Default Y (no friction). Skipped when
    LARRY_AUTO_PHI=off OR no controlling TTY. Override with
    LARRY_TOOL_RESULT_REVIEW=always to gate every tool result.
    Modes (env LARRY_AUTO_PHI or /phi-auto):
      on           default — all four tiers always tokenize (caution-first)