v0.8.1: tool-result content-shape gating + base64 round-trip + review gate

Three changes expanding PHI safety envelope on the tool-result surface.
Closes V2 + V12 + V2-sub from Vera's audit. No behavior change for users
not interacting with HL7-shaped data.

- Tool-name allow-list dropped. The v0.7.3 tool-result auto-PHI gate ran
  only on read_file (.hl7|.txt), nc_msgs, hl7_field, hl7_diff. v0.8.1
  runs _auto_phi_looks_like_hl7 on EVERY tool result. On hit → route
  through lib/hl7-sanitize.sh. On miss → pass through unchanged.
  Closes V2: bash_exec / ssh_exec / grep_files / read_file of any
  extension all get scanned when their output is HL7-shaped. False-
  positive cost is negligible (extra regex pass on non-HL7 has zero
  behavioral impact).

- Base64-wrapped HL7 round-trip. New _auto_phi_b64_roundtrip helper.
  Detects candidate base64 runs (length >= 200, [A-Za-z0-9+/=] only,
  length divisible by 4 — NOT entropy-based per Pax §V2-sub: HL7's
  repetitive prefixes survive base64 with LOW entropy, so entropy is
  the wrong signal). Speculatively decodes each candidate; if decoded
  bytes look like HL7, routes through hl7-sanitize.sh and re-encodes
  back into the result. Catches ssh_pull_smat sampled mode's TSV
  format. Requires python3 (installed everywhere larry runs); skipped
  with a one-time stderr warning when unavailable. Server-side TSV
  encoding kept (binary-safe transport); client-side unwrap handles
  the safety concern, no remote refactor needed.

- Operator review gate for bash_exec/ssh_exec/ssh_pull/ssh_pull_smat
  results. When the tool produced HL7-shaped output OR the result
  exceeds LARRY_TOOL_RESULT_REVIEW_THRESHOLD bytes (default 8192),
  Larry prompts [Y/n/i] before passing the result back to the model.
  'i' opens the full output in $PAGER then re-prompts. Default Y
  (zero friction). N substitutes a refusal JSON so the model surfaces
  that something was withheld. Skipped when LARRY_AUTO_PHI=off (opt-out
  consistency) OR no TTY (headless scripts unaffected). Override with
  LARRY_TOOL_RESULT_REVIEW=always for paranoid mode. Closes V12.

Proactive same-pattern sweep. Searched for other call sites where tool
output bypasses content-shape gating: only the one in agent_turn. The
v0.8.0-c strict-mode tool-result branch was updated in lockstep so it
now triggers on the broader (content-only) eligibility.

Verification: bash -n clean; b64 round-trip unit-tested with three
cases (real-world HL7 base64 → decoded contains tokenized PHI not
clear-text PHI; plain text → passthrough; non-HL7 b64 → passthrough,
no false positive).

Co-Authored-By: Clover (Claude Opus 4.7) <noreply@anthropic.com>
This commit is contained in:
Bryan Johnson 2026-05-27 19:45:23 -07:00
parent 7434e6e8b8
commit 9fc38e743d
3 changed files with 314 additions and 17 deletions

View File

@ -4,6 +4,53 @@ All notable changes to `cloverleaf-larry` / `larry-anywhere` are recorded here.
Versioning is loose-semver; bumps trigger the in-process self-update on every
running client via `LARRY_BASE_URL` + `MANIFEST`.
## v0.8.1 — 2026-05-27
Tool-result PHI gating expansion. Closes V2 / V12 and the V2 base64 sub-gap
from Vera's audit. No behavior change for users not on HL7-shaped data;
opt-in friction for the 8KB+ tool-result review gate.
- **Tool-name allow-list dropped; content-shape gating only.** The v0.7.3
tool-result auto-PHI gate ran only on `read_file (.hl7|.txt)`, `nc_msgs`,
`hl7_field`, `hl7_diff`. v0.8.1 runs `_auto_phi_looks_like_hl7` on
EVERY tool result. On hit → route through `lib/hl7-sanitize.sh`.
On miss → pass through unchanged. Closes V2: `bash_exec`/`ssh_exec`/
`grep_files`/`read_file` of `.log`/`.csv`/`.dat`/no-suffix files are
now all covered when their output is HL7-shaped. False-positive cost
is cheap (extra regex pass with zero behavioral impact on non-HL7).
- **Base64-wrapped HL7 round-trip.** New `_auto_phi_b64_roundtrip` helper.
Detects candidate base64 runs (length >= 200 chars, `[A-Za-z0-9+/=]`
only, length divisible by 4 — NOT entropy-based, per Pax §V2-sub:
HL7's repetitive prefixes survive base64 with LOW entropy). Speculatively
decodes each run; if decoded bytes look like HL7, routes through
`hl7-sanitize.sh` and re-encodes (`base64 -w0`) back into the result.
Catches `ssh_pull_smat` sampled mode TSV (server-side encoding kept
for binary-safe TSV transport; client-side unwrap handles the safety
concern). Requires `python3` (installed everywhere larry-anywhere
runs); skipped with a one-time stderr warning if unavailable.
- **Operator review gate for `bash_exec`/`ssh_exec`/`ssh_pull`/
`ssh_pull_smat` results.** When the tool produced HL7-shaped output OR
the result exceeds `LARRY_TOOL_RESULT_REVIEW_THRESHOLD` bytes
(default 8192), Larry prompts `[Y/n/i]` before passing the result
back to the model. `i` opens the result in `$PAGER` then re-prompts.
Default Y — zero friction by default. `N` substitutes a refusal JSON
so the model knows a result was withheld. Skipped when
`LARRY_AUTO_PHI=off` (consistent with the opt-out) OR running
non-interactively (no TTY — never blocks headless scripts).
Override with `LARRY_TOOL_RESULT_REVIEW=always` to gate every result.
Per Pax §V2/V12: closes the "operator wanted to see this themselves,
didn't want the model to see it" gap that's the actual common case.
**Proactive same-pattern sweep.** Searched the codebase for other call
sites where tool output bypasses content-shape gating: found only the
one in `agent_turn`. The v0.8.0-c strict-mode tool-result branch was
hardened in lockstep so it now triggers on the broader (content-only)
eligibility instead of the old name-allow-list.
Manifest unchanged.
## v0.8.0 — 2026-05-27
PHI-safety quick-wins pack — three independent zero-risk patches closing

View File

@ -1 +1 @@
0.8.0
0.8.1

282
larry.sh
View File

@ -57,7 +57,7 @@ set -o pipefail
# ─────────────────────────────────────────────────────────────────────────────
# Config
# ─────────────────────────────────────────────────────────────────────────────
LARRY_VERSION="0.8.0"
LARRY_VERSION="0.8.1"
LARRY_HOME="${LARRY_HOME:-$HOME/.larry}"
# ─────────────────────────────────────────────────────────────────────────────
@ -1553,6 +1553,106 @@ _auto_phi_looks_like_hl7() {
return 1
}
# v0.8.1-c: base64-wrapped HL7 round-trip.
# Walks every base64-shaped run (>=200 chars, [A-Za-z0-9+/=], length%4==0)
# in TEXT. For each: speculatively decode; if the decoded bytes look like
# HL7, route through hl7-sanitize.sh and re-encode (base64 -w0 on GNU,
# `base64 | tr -d \n` on BSD). Substitute the re-encoded form back into
# TEXT. Echoes the rewritten text on stdout; empty stdout means "nothing
# matched, leave the result alone" (callers MUST check for non-empty).
#
# Per Pax §V2-sub: do NOT use entropy as the trigger — HL7's repetitive
# prefixes (`PID|1||...`) compress to LOW-entropy base64. Use length +
# charset + modulo-4 as the candidate filter; speculative decode is the
# decision point. False-positive cost is one extra base64-and-grep round
# trip per matched run; cheap.
_auto_phi_b64_roundtrip() {
local text="$1" toolname="${2:-tool}"
local sanitize_script="$LARRY_LIB_DIR/hl7-sanitize.sh"
[ -x "$sanitize_script" ] || { printf ''; return 1; }
# We use python3 for the walk because pure-bash regex over potentially
# 400KB input with reliable run extraction is fragile; python3 is
# present everywhere larry-anywhere runs (macOS, Linux, Cygwin).
# The python helper:
# - finds every [A-Za-z0-9+/]{200,}={0,2} run with length%4==0
# - speculatively decodes
# - checks for HL7 shape in decoded bytes
# - if shape matches, writes decoded to a tempfile, runs hl7-sanitize.sh,
# re-encodes the sanitized output, and patches it back into the text
# - on no matches OR no shape hits, prints nothing (caller leaves result alone)
if ! command -v python3 >/dev/null 2>&1; then
# Python3 unavailable. We do NOT attempt a half-implementation in
# bash/awk — base64 round-trip with byte-faithful re-encode is fiddly
# to get right and a buggy substitute could corrupt the tool result
# mid-payload. Conservative: leave the result intact, log once per
# session for visibility. Strict-mode escape is handled upstream;
# default mode falls back to the plain HL7-shape branch which catches
# cleartext MSH/PID anyway.
if [ -z "${_LARRY_B64_PY3_WARNED:-}" ]; then
printf '%sphi>%s base64 unwrap pass skipped: python3 not on PATH (install python3 to enable v0.8.1-c)\n' \
"$C_DIM" "$C_RESET" >&2
_LARRY_B64_PY3_WARNED=1
fi
printf ''
return 1
fi
# Write the python helper to a tempfile so we can pass text via stdin
# (the `python3 - <<HEREDOC` form conflicts with stdin-piping — the
# heredoc consumes the dash's stdin slot). Tempfile approach keeps both
# channels clean.
local _b64py; _b64py=$(mktemp -t larry-b64.XXXXXX.py 2>/dev/null || mktemp)
cat > "$_b64py" <<'PY'
import os, re, subprocess, sys, tempfile, base64
sanitize_script = sys.argv[1]
text = sys.stdin.buffer.read()
# Candidate b64 runs: length >= 200, only [A-Za-z0-9+/=], length % 4 == 0.
pat = re.compile(rb"[A-Za-z0-9+/]{200,}={0,2}")
data = text
changed = False
def is_hl7(b):
head = b[:4096]
if head.startswith(b"MSH|"): return True
for sep in (b"\rMSH|", b"\nMSH|", b"\rPID|", b"\nPID|",
b"\rEVN|", b"\nEVN|", b"\rPV1|", b"\nPV1|"):
if sep in head: return True
return False
def replace(match):
global changed
run = match.group(0)
if len(run) % 4 != 0:
return run
try:
decoded = base64.b64decode(run, validate=True)
except Exception:
return run
if not is_hl7(decoded):
return run
with tempfile.NamedTemporaryFile(delete=False) as tf:
tf.write(decoded)
tfname = tf.name
try:
proc = subprocess.run(["bash", sanitize_script, tfname],
capture_output=True, timeout=30)
finally:
os.unlink(tfname)
if proc.returncode != 0 or not proc.stdout:
# Sanitize failure: keep original (fail-open). Strict-mode escape is
# already handled upstream — this helper is best-effort cleanup.
return run
san = proc.stdout
reencoded = base64.b64encode(san)
changed = True
return reencoded
new_data = pat.sub(replace, data)
if changed:
sys.stdout.buffer.write(new_data)
PY
printf '%s' "$text" | python3 "$_b64py" "$sanitize_script"
local _rc=$?
rm -f "$_b64py"
return $_rc
}
# Main detector. Args: surface ("user_input"|"tool_result"), input text.
# Echoes the rewritten input. Status message goes to stderr.
#
@ -2447,6 +2547,102 @@ _pretty_tool_input() {
' 2>/dev/null
}
# v0.8.1-b: second approval gate for tool results.
# Sets the shell-global $_LARRY_GATE_RESULT to either the original result
# (user accepted) or a refusal sentinel (user declined). Never silently
# replaces — the caller reads $_LARRY_GATE_RESULT explicitly.
#
# Triggers (any of):
# - tool name is in the "operator-intent" list (bash_exec, ssh_exec,
# ssh_pull, ssh_pull_smat) AND
# - result is HL7-shaped, OR
# - result > LARRY_TOOL_RESULT_REVIEW_THRESHOLD bytes (default 8192)
# - LARRY_TOOL_RESULT_REVIEW=always (covers every tool result)
#
# Skipped when:
# - LARRY_AUTO_PHI=off (operator explicitly opted out of PHI prompts)
# - no controlling TTY (headless / non-interactive scripts must not block)
_LARRY_GATE_RESULT=""
_maybe_tool_result_review_gate() {
local name="$1" result="$2"
_LARRY_GATE_RESULT="$result"
# Bypasses.
[ "$AUTO_PHI_MODE" = "off" ] && return 0
[ -t 0 ] || return 0
[ -t 2 ] || return 0
# Always-on env override.
if [ "${LARRY_TOOL_RESULT_REVIEW:-}" = "always" ]; then
: # fall through to prompt
else
# Otherwise, only gate the operator-intent tools.
case "$name" in
bash_exec|ssh_exec|ssh_pull|ssh_pull_smat) ;;
*) return 0 ;;
esac
# Trigger check.
local size; size=$(printf '%s' "$result" | wc -c | tr -d ' ')
local threshold="${LARRY_TOOL_RESULT_REVIEW_THRESHOLD:-8192}"
# coerce_int defends against CR-tainted env on Cygwin (v0.7.5 lessons).
if declare -F coerce_int >/dev/null 2>&1; then
threshold=$(coerce_int "$threshold" 8192)
size=$(coerce_int "$size" 0)
fi
local hl7=0
_auto_phi_looks_like_hl7 "$result" && hl7=1
if [ "$hl7" != "1" ] && [ "$size" -le "$threshold" ]; then
return 0
fi
fi
# Render the prompt. Show a 240-char preview of the result.
local preview; preview=$(printf '%s' "$result" | head -c 240)
printf '\n%s══ tool result review ══%s\n' "$C_YELLOW" "$C_RESET" >&2
printf ' tool: %s\n' "$name" >&2
printf ' bytes: %s\n' "$(printf '%s' "$result" | wc -c | tr -d ' ')" >&2
if _auto_phi_looks_like_hl7 "$result"; then
printf ' shape: HL7-shaped (post-sanitize)\n' >&2
fi
printf '%s── preview (first 240 chars) ──%s\n' "$C_DIM" "$C_RESET" >&2
printf '%s\n' "$preview" >&2
printf '%sphi> send this output back to the model? [Y/n/i (inspect)]:%s ' \
"$C_BOLD" "$C_RESET" >&2
local ans=""
if declare -F read_clean >/dev/null 2>&1; then
read_clean ans </dev/tty 2>/dev/null || ans=""
else
IFS= read -r ans </dev/tty 2>/dev/null || ans=""
ans="${ans//$'\r'/}"
fi
while [ "$ans" = "i" ] || [ "$ans" = "I" ]; do
local pager="${PAGER:-less}"
local _ins_tmp; _ins_tmp=$(mktemp)
printf '%s' "$result" > "$_ins_tmp"
if command -v "$pager" >/dev/null 2>&1; then
"$pager" "$_ins_tmp" </dev/tty || true
else
cat "$_ins_tmp" >&2
fi
rm -f "$_ins_tmp"
printf '%sphi> send this output back to the model? [Y/n/i (inspect again)]:%s ' \
"$C_BOLD" "$C_RESET" >&2
if declare -F read_clean >/dev/null 2>&1; then
read_clean ans </dev/tty 2>/dev/null || ans=""
else
IFS= read -r ans </dev/tty 2>/dev/null || ans=""
ans="${ans//$'\r'/}"
fi
done
case "$ans" in
n|N|no|NO|No)
_LARRY_GATE_RESULT='{"error":"tool result withheld by operator","tool":"'"$name"'","note":"operator reviewed the output locally and declined to send it back to the model"}'
printf '%sphi>%s tool result WITHHELD from model (operator declined)\n' \
"$C_DIM" "$C_RESET" >&2
;;
*)
# Default Y — pass through.
;;
esac
}
# Display a tool call header (cyan + bold name, dim args, optional truncation hint).
display_tool_call() {
local name="$1" input_json="$2"
@ -3248,17 +3444,25 @@ agent_turn() {
# designed for prose, not pipe-delimited segment data). The two
# share lookup.tsv so tokens are stable across surfaces.
if [ "$AUTO_PHI_MODE" != "off" ]; then
local _ap_eligible=0
case "$name" in
read_file)
local _ap_path
_ap_path=$(printf '%s' "$input_json" | jq -r '.path // ""' 2>/dev/null)
case "$_ap_path" in
*.hl7|*.HL7|*.txt|*.TXT) _ap_eligible=1 ;;
esac
;;
nc_msgs|hl7_field|hl7_diff) _ap_eligible=1 ;;
esac
# v0.8.1-a: content-shape gating replaces the v0.7.3 tool-name
# allow-list. The shape detector (_auto_phi_looks_like_hl7) runs on
# EVERY tool result regardless of which tool produced it. On hit →
# route through hl7-sanitize.sh. On miss → pass through unchanged.
# This closes V2 (bash_exec, ssh_exec, grep_files, and read_file of
# any extension all get scanned when their output is HL7-shaped).
# False-positive cost: cheap — sanitizer runs against output that
# doesn't actually match its rules; mints no tokens; passes through.
local _ap_eligible=1
# v0.8.1-c: base64 unwrap pass. Detect candidate base64 (length+
# charset+modulo, NOT entropy — per Pax §V2-sub: HL7's repetitive
# prefixes survive base64 with LOW entropy, so entropy is the wrong
# signal). Speculatively decode each candidate; if the decoded bytes
# look like HL7, route THOSE through hl7-sanitize.sh and re-encode
# back into the result. Catches ssh_pull_smat sampled mode TSV.
local _ap_b64=0
if printf '%s' "$result" | head -c 65536 | grep -qE '[A-Za-z0-9+/]{200,}={0,2}' 2>/dev/null; then
_ap_b64=1
fi
# v0.8.0-c: strict mode aborts if sanitizer script is missing/non-exec
# when we have HL7-shaped output. We can't kill the tool-loop iteration
# without sending SOMETHING back to satisfy the tool_use; substitute
@ -3271,6 +3475,20 @@ agent_turn() {
"$C_DIM" "$C_RESET" "$name" >&2
result='{"error":"auto-PHI sanitizer unavailable on HL7-shaped result","tool":"'"$name"'","action":"result withheld; set LARRY_AUTO_PHI=on to fall back to best-effort, or repair lib/hl7-sanitize.sh"}'
_ap_eligible=0 # skip the normal sanitize path below
_ap_b64=0
fi
# v0.8.1-c: base64-wrapped HL7 round-trip (decode → sanitize → re-encode).
# Runs BEFORE the plain HL7-shape branch so a result that's pure b64
# (no MSH| in cleartext) still gets the field-aware sanitize.
if [ "$_ap_b64" = "1" ] && [ -x "$LARRY_LIB_DIR/hl7-sanitize.sh" ]; then
local _b64_changed
_b64_changed=$(_auto_phi_b64_roundtrip "$result" "$name") || true
if [ -n "$_b64_changed" ]; then
result="$_b64_changed"
printf '%sphi>%s base64-wrapped HL7 detected in %s result; decoded, sanitized, re-encoded\n' \
"$C_DIM" "$C_RESET" "$name" >&2
_auto_phi_log "(b64-hl7 roundtrip)" "BATCH" "(b64-decoded-and-sanitized)" "hl7_pipeline" "tool_result" "$name (b64)"
fi
fi
if [ "$_ap_eligible" = "1" ] && _auto_phi_looks_like_hl7 "$result"; then
local _ap_tmp _ap_sanitized _ap_before _ap_after
@ -3302,6 +3520,26 @@ agent_turn() {
fi
fi
# v0.8.1-b: second approval gate. After the tool ran and we have the
# (possibly sanitized) result, prompt the user before passing it back
# to the model. Triggers:
# - tool produced HL7-shaped output (post-sanitize, in case sanitize
# missed something or fell open), OR
# - output exceeds LARRY_TOOL_RESULT_REVIEW_THRESHOLD bytes
# (default 8192), OR
# - LARRY_TOOL_RESULT_REVIEW=always
# Skipped when:
# - LARRY_AUTO_PHI=off (user has explicitly opted out of all PHI
# safety prompts; consistent with that opt-out)
# - non-interactive shell (no TTY — never block headless scripts)
# - tool name is read_file/list_dir/grep_files/glob_files (these
# already had user intent via the model's tool_use; the model
# asked for them. The dominant V12 risk is bash_exec/ssh_exec/
# ssh_pull/ssh_pull_smat where the operator's "run + show" intent
# may not include "show to model")
_maybe_tool_result_review_gate "$name" "$result"
result="$_LARRY_GATE_RESULT"
_LARRY_LAST_TOOL_RESULT="$result"
log_append '```'; log_append "$result"; log_append '```'
@ -3436,10 +3674,22 @@ Slash commands:
4 KNOWN Value matches an existing lookup.tsv entry — Bryan
has seen this value before. Always.
Tool-result scan applies ONLY to read_file (.hl7/.txt), hl7_*, and
nc_msgs outputs, and ONLY if content is HL7-shaped. Generic outputs
(list_dir, grep_files, bash_exec, web search) are NEVER scanned — the
risk of breaking legitimate text outweighs the catch rate.
Tool-result scan (v0.8.1): runs on EVERY tool result. The tool-name
allow-list was dropped — content-shape gating (_auto_phi_looks_like_hl7)
is now the only filter. HL7-shaped output from any tool (bash_exec,
ssh_exec, grep_files, read_file of any extension, nc_msgs, etc.) is
routed through hl7-sanitize.sh. Non-HL7-shaped output passes through
unchanged (no behavior change for normal text). v0.8.1 also adds a
base64 round-trip pass (decode → shape-check → sanitize → re-encode)
for ssh_pull_smat sampled mode and any other base64-wrapped HL7.
Operator review gate (v0.8.1): for bash_exec/ssh_exec/ssh_pull/
ssh_pull_smat results that are HL7-shaped OR exceed
LARRY_TOOL_RESULT_REVIEW_THRESHOLD bytes (default 8192), Larry prompts
[Y/n/i] before passing the result back to the model. 'i' opens the
full output in \$PAGER. Default Y (no friction). Skipped when
LARRY_AUTO_PHI=off OR no controlling TTY. Override with
LARRY_TOOL_RESULT_REVIEW=always to gate every tool result.
Modes (env LARRY_AUTO_PHI or /phi-auto):
on default — all four tiers always tokenize (caution-first)