v0.7.1: status line below prompt + automatic PHI detection + session-artifact upload

Feature 1 — Status line BELOW the prompt (was: above).
The dim status line now renders AFTER each completed agent_turn and BEFORE
the next prompt, sitting between turns as a footer to the just-finished
exchange. Shipped Option B from the spec: render_status_line moved to the
tail of the REPL loop, and the call before printing the prompt was removed.
Option A (cursor manipulation under an active readline prompt) was rejected
because `read -e` takes exclusive control of the cursor and inserting a
repositioned footer below an active prompt is fragile on MobaXterm / Cygwin
(readline redisplay clobbers manual cursor moves). Visual outcome is
identical to "below the previous prompt cycle", and /status still forces a
re-render mid-conversation if needed.
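
The ordering that Option B describes can be sketched as below. This is a
minimal illustration, not the shipped loop: repl_demo and the stub bodies of
render_status_line and agent_turn are hypothetical stand-ins, and input is
read with plain `read` rather than readline.

```shell
# Hypothetical stubs; the real functions live in larry.sh.
render_status_line() { printf '[status]\n'; }
agent_turn() { printf 'reply to: %s\n' "$1"; }

# Option B ordering: prompt, then turn, then the status footer.
repl_demo() {
  while IFS= read -r line; do
    printf 'you> %s\n' "$line"
    agent_turn "$line"
    render_status_line   # footer for the turn that just finished
  done
}

printf 'hello\nbye\n' | repl_demo
```

Because the footer is printed before the next prompt is drawn, nothing has to
move the cursor while a prompt is active.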

Feature 2 — Automatic PHI detection.
New auto_detect_phi() runs BEFORE preprocess_phi_markers and tokenizes any
value matching PHI-shaped patterns (email, SSN, phone, DOB, MRN 6-12 digits,
HL7 caret-name, "Last, First", or loose "Title Case Title Case"). Uses the
existing hl7-sanitize.sh tokenize-value pipeline so canonicalization
(sort-unique-lowercase NAME tokens, ISO DOB, digits-only PHONE/SSN,
lowercase EMAIL) collapses different surface forms onto one token across
the session. Skipped: paths, URLs, already-tokenized values, manual @@/{{phi:}}
markers, timestamps (13+ digits or 10 digits starting with '1'), and a
built-in allowlist of common non-PHI two-word phrases ("Home Assistant",
"Mac Studio", etc.).

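As a rough illustration of the ordering described above (skip paths and URLs
first, apply the timestamp guard before the digit-based categories), here is
a simplified sketch. classify_demo is a hypothetical name; it covers only the
EMAIL, SSN, and MRN shapes and accepts only the dashed SSN form, so it is
narrower than the shipped _auto_phi_classify.

```shell
# Hypothetical, simplified classifier; prints a category or nothing.
classify_demo() {
  val=$1
  case $val in
    /*|./*|../*|http://*|https://*) return 0 ;;   # paths and URLs: skip
  esac
  if printf '%s' "$val" | grep -Eq '^[^@[:space:]]+@[^@[:space:]]+\.[^@[:space:]]+$'; then
    printf 'EMAIL'
  elif printf '%s' "$val" | grep -Eq '^[0-9]{13,}$|^1[0-9]{9}$'; then
    return 0                                       # timestamp guard
  elif printf '%s' "$val" | grep -Eq '^[0-9]{3}-[0-9]{2}-[0-9]{4}$'; then
    printf 'SSN'
  elif printf '%s' "$val" | grep -Eq '^[0-9]{6,12}$'; then
    printf 'MRN'
  fi
}

for sample in jane@example.com 123-45-6789 1234567 1716850000 /tmp/123456; do
  printf '%s => %s\n' "$sample" "$(classify_demo "$sample")"
done
```

The guard has to come before the MRN check, otherwise a 10-digit epoch value
would fall into the 6-12 digit range.
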
Modes: confirm (default — prompts Y/n on loose name-like matches once per
session), aggressive (silent always-tokenize), off. Env LARRY_AUTO_PHI;
runtime /auto-phi and /auto-phi-status slash commands. Per-turn override
with "!nophi " prefix. Manual markers always win. New normalize-value
subcommand on hl7-sanitize.sh exposes the canonicalization step so the
per-session memory cache uses canonical keys (so "John Smith" and
"JOHN SMITH" share one confirm decision). EMAIL + PHONE categories added
to normalize_value().
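
The point of keying the confirm cache on the canonical form can be shown with
a small sketch. name_key_demo, remember, seen, and DECIDED are hypothetical
names, and the canonicalization only approximates the sort-unique-lowercase
rule for NAME values; it is not the normalize-value implementation.

```shell
# Hypothetical canonicalizer: lowercase, one word per line, sort,
# de-duplicate, then re-join with single spaces.
name_key_demo() {
  printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -s '[:space:]' '\n' \
    | sed '/^$/d' | sort -u | paste -s -d ' ' -
}

DECIDED=""   # pipe-delimited canonical keys, like the bash<4 fallback lists
remember() { DECIDED="${DECIDED}|$1"; }
seen() { case "|${DECIDED}|" in *"|$1|"*) return 0 ;; esac; return 1; }

remember "$(name_key_demo 'John Smith')"
if seen "$(name_key_demo 'JOHN SMITH')"; then
  echo "decision reused"
fi
```

Because remember changes a shell variable, it has to run in the calling
shell. Anything executed inside `$( ... )` runs in a subshell, so cache
updates made there would not be visible on later turns.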

Feature 3 — Session-artifact upload at close.
New upload_session_artifacts() POSTs $LARRY_HOME/log/headers.log,
$LARRY_HOME/sessions/<id>.log.md, and <id>.messages.json to
$LARRY_MEMORY_UPLOAD_URL on session exit. Each request carries
X-Larry-Source (headers-log | session-log | session-messages),
X-Larry-Version, and X-Session-Id headers so the ingest side can route
appropriately. Fires from both the clean main_loop exit and the EXIT/INT/TERM
trap (idempotent via _LARRY_UPLOAD_FIRED guard). Unset URL = upload skipped
with a one-line warning. Auth tokens are never logged: headers.log captures
only response headers matching ^anthropic-* or ^retry-after: (per v0.6.9
writer); the session log + messages contain post-tokenization content only.
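
To make the request shape concrete, the sketch below prints the curl
invocation for one artifact instead of sending it, and shows a run-once guard
of the kind described. upload_cmd_demo, upload_once_demo, the .invalid URL,
the session id, and the file path are placeholders chosen for illustration;
only the header names and source labels come from the description above.

```shell
UPLOAD_URL="https://ingest.example.invalid/upload"   # placeholder endpoint
VERSION="0.7.1"
SESSION="demo-session"                               # placeholder id

# Dry run: print each curl argument on its own line rather than executing curl.
# Arguments: $1 = file path, $2 = source label, $3 = content type.
upload_cmd_demo() {
  printf '%s\n' curl -sS --max-time 15 -X POST "$UPLOAD_URL" \
    -H "Content-Type: $3" \
    -H "X-Larry-Source: $2" \
    -H "X-Larry-Version: $VERSION" \
    -H "X-Session-Id: $SESSION" \
    --data-binary "@$1"
}

FIRED=0
POSTED=0
# Run-once guard: a second call (for example from an exit trap) is a no-op.
upload_once_demo() {
  [ "$FIRED" = 1 ] && return 0
  FIRED=1
  upload_cmd_demo "/tmp/headers.log" headers-log text/plain
  POSTED=$((POSTED + 1))
}

upload_once_demo
upload_once_demo   # prints nothing the second time
```

Setting the guard flag before doing the work means a signal that arrives
mid-upload cannot start a second, overlapping attempt.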

No regressions to v0.7.0 work — HL7 tab completion, mouse mode toggles,
TOOLS_JSON heredoc, streaming, @file refs, status-line existence, slash
completion, and all v0.6.x machinery remain untouched. MANIFEST unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bryan Johnson 2026-05-27 16:59:01 -07:00
parent 8661948cf6
commit af2ffe883c
3 changed files with 539 additions and 10 deletions

@@ -1 +1 @@
-0.7.0
+0.7.1

larry.sh (525 lines changed)

@@ -47,7 +47,7 @@ set -o pipefail
 # ─────────────────────────────────────────────────────────────────────────────
 # Config
 # ─────────────────────────────────────────────────────────────────────────────
-LARRY_VERSION="0.7.0"
+LARRY_VERSION="0.7.1"
 LARRY_HOME="${LARRY_HOME:-$HOME/.larry}"
 LARRY_BASE_URL="${LARRY_BASE_URL:-https://raw.githubusercontent.com/bojj27/cloverleaf-larry/main}"
 LARRY_UPDATE_URL="${LARRY_UPDATE_URL:-${LARRY_BASE_URL}/larry.sh}"
@@ -852,6 +852,369 @@ preprocess_phi_markers() {
 printf '%s' "$input"
 }
# ─────────────────────────────────────────────────────────────────────────────
# v0.7.1 — Automatic PHI detection
#
# Runs BEFORE preprocess_phi_markers (so explicit markers still take precedence)
# and BEFORE @file inline expansion (so file contents don't get token-walked
# here; they're tokenized by hl7_sanitize when needed).
#
# Strategy: walk every whitespace-delimited token and decide one of:
# * leave alone (path / URL / already-token / already-marker / timestamp)
# * tokenize via hl7-sanitize.sh tokenize-value (same pipeline as manual)
#
# Bryan's directive: err on the side of caution. We tokenize anything that
# *looks* like PHI as long as it doesn't interfere with required canonical
# matching. The same tokenize-value pipeline handles normalization, so
# different surface forms of the same value share one token across the session
# and across sanitized files.
#
# Modes (env LARRY_AUTO_PHI or /auto-phi slash):
# confirm (default) — prompt Y/n on first sighting of a name-like value
# aggressive — tokenize every match silently
# off — disable auto-detection entirely
#
# Per-turn override: prepend "!nophi " to skip auto-detection for that turn.
# ─────────────────────────────────────────────────────────────────────────────
# Mode (default confirm). Promoted to AUTO_PHI_MODE so /auto-phi can mutate it.
AUTO_PHI_MODE="${LARRY_AUTO_PHI:-confirm}"
# Per-session memory: declined values (user said "n" to confirm prompt) and
# accepted values (cached so we don't re-prompt). Keys are normalized canonical
# strings. For bash<4, fall back to two pipe-delimited strings.
if (( BASH_VERSINFO[0] >= 4 )); then
declare -A AUTO_PHI_ACCEPTED 2>/dev/null
declare -A AUTO_PHI_DECLINED 2>/dev/null
else
AUTO_PHI_ACCEPTED_LIST=""
AUTO_PHI_DECLINED_LIST=""
fi
AUTO_PHI_SESSION_COUNT=0
# Built-in allowlist of common non-PHI two-word phrases that match the loose
# "Title Case Title Case" name pattern. Lowercased + sorted on lookup.
# This is intentionally small — confirm-mode catches the rest interactively.
_AUTO_PHI_NAME_ALLOWLIST=$(cat <<'EOF'
home assistant
mac studio
mac mini
mac pro
mac book
apple watch
apple tv
new york
los angeles
san francisco
san diego
las vegas
united states
united kingdom
north america
south america
microsoft office
google cloud
amazon web
visual studio
sublime text
android studio
docker desktop
node red
linux mint
windows server
ubuntu server
debian linux
red hat
oracle linux
EOF
)
_auto_phi_in_allowlist() {
local v_lower
v_lower=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
grep -Fxq -- "$v_lower" <<< "$_AUTO_PHI_NAME_ALLOWLIST"
}
_auto_phi_seen_accepted() {
local key="$1"
if (( BASH_VERSINFO[0] >= 4 )); then
[ -n "${AUTO_PHI_ACCEPTED[$key]:-}" ]
else
[[ "|$AUTO_PHI_ACCEPTED_LIST|" == *"|$key|"* ]]
fi
}
_auto_phi_seen_declined() {
local key="$1"
if (( BASH_VERSINFO[0] >= 4 )); then
[ -n "${AUTO_PHI_DECLINED[$key]:-}" ]
else
[[ "|$AUTO_PHI_DECLINED_LIST|" == *"|$key|"* ]]
fi
}
_auto_phi_mark_accepted() {
local key="$1"
if (( BASH_VERSINFO[0] >= 4 )); then
AUTO_PHI_ACCEPTED[$key]=1
else
AUTO_PHI_ACCEPTED_LIST="${AUTO_PHI_ACCEPTED_LIST}|$key"
fi
}
_auto_phi_mark_declined() {
local key="$1"
if (( BASH_VERSINFO[0] >= 4 )); then
AUTO_PHI_DECLINED[$key]=1
else
AUTO_PHI_DECLINED_LIST="${AUTO_PHI_DECLINED_LIST}|$key"
fi
}
# _auto_phi_classify VALUE → echoes a category (EMAIL/SSN/PHONE/DOB/MRN/NAME/NAME_LOOSE)
# or empty string if the value is not a tokenization candidate.
_auto_phi_classify() {
local v="$1"
[ -z "$v" ] && return 0
# Already-token format: [[CATEGORY_NNNN]] — leave alone.
[[ "$v" =~ ^\[\[[A-Z][A-Z0-9_]*_[0-9]+\]\]$ ]] && return 0
# Already-marker formats: @@VALUE@@, @@VALUE, {{phi:...}} — manual handles.
[[ "$v" == @@* ]] && return 0
[[ "$v" == *@@ ]] && return 0
[[ "$v" == \{\{phi:* ]] && return 0
# Path-like — leave alone.
case "$v" in
/*|./*|../*|~/*) return 0 ;;
[A-Z]:\\*) return 0 ;;
esac
# URL-like — leave alone.
case "$v" in
http://*|https://*|ssh://*|ftp://*|sftp://*|file://*|ws://*|wss://*) return 0 ;;
esac
# Strip a single trailing punctuation that is sentence-grammar, not part
# of the value. Re-evaluate the cleaned form.
local trimmed="$v"
case "$trimmed" in
*[.,\;:\!\?\)]) trimmed="${trimmed%?}" ;;
esac
# Email-like. Must have exactly one @, dotted domain.
if [[ "$trimmed" =~ ^[^@[:space:]]+@[^@[:space:]]+\.[^@[:space:]]+$ ]]; then
printf 'EMAIL'; return
fi
# Long-digit timestamp guard FIRST (before phone/SSN/MRN checks). Pure
# digits 13+ chars OR 10 chars starting with '1' (epoch seconds / millis).
# These would otherwise match the bare-phone or MRN patterns. Leave alone.
if [[ "$trimmed" =~ ^[0-9]+$ ]]; then
local n="${#trimmed}"
if [ "$n" -ge 13 ]; then return 0; fi
if [ "$n" -eq 10 ] && [[ "$trimmed" == 1* ]]; then return 0; fi
fi
# SSN-like. 9 digits with optional dashes (must total exactly 9 digits).
if [[ "$trimmed" =~ ^[0-9]{3}-?[0-9]{2}-?[0-9]{4}$ ]]; then
local d="${trimmed//-/}"
[ "${#d}" -eq 9 ] && { printf 'SSN'; return; }
fi
# Phone-like. The regex needs to match the FULL token, including a "(212)"
# prefix when the next token is "555-1234". We can't see across token
# boundaries here, so we accept the most-common single-token forms:
# 555-123-4567 5551234567 555.123.4567
# (212)555-1234 (212)5551234
# A space-separated "555 123 4567" arrives as three tokens and is not seen.
# Multi-token "(212) 555-1234" is reconstructed by the two-token-PHONE
# pass below in auto_detect_phi (caller side).
if [[ "$trimmed" =~ ^\(?[0-9]{3}\)?[-\.\ ]?[0-9]{3}[-\.\ ]?[0-9]{4}$ ]]; then
# Distinguish from pure-digit MRN: 10-digit all-numeric reaches here
# too. If trimmed is 10 pure digits starting with '1' we already
# returned above (timestamp). Otherwise treat as PHONE.
printf 'PHONE'; return
fi
# DOB / date-like.
if [[ "$trimmed" =~ ^[0-9]{1,4}[/-][0-9]{1,2}[/-][0-9]{1,4}$ ]]; then
printf 'DOB'; return
fi
# MRN-like: pure digits, 6-12 chars (conservative — see spec rule #9).
if [[ "$trimmed" =~ ^[0-9]+$ ]]; then
local n2="${#trimmed}"
if [ "$n2" -ge 6 ] && [ "$n2" -le 12 ]; then
printf 'MRN'; return
fi
return 0
fi
# Name-like (HL7 caret).
if [[ "$trimmed" =~ ^[A-Za-z]+\^[A-Za-z]+ ]]; then
printf 'NAME'; return
fi
# The loose "Title Case Title Case" pattern is handled across two whitespace
# tokens at the caller level — not classified per-token here.
return 0
}
# auto_detect_phi INPUT — main entrypoint. Echoes the rewritten input.
# Per-turn override: input starting with "!nophi " causes the function to
# strip the prefix and return without scanning.
auto_detect_phi() {
local input="$1"
local sanitize_script="$LARRY_LIB_DIR/hl7-sanitize.sh"
[ -x "$sanitize_script" ] || { printf '%s' "$input"; return; }
# Per-turn override.
if [[ "$input" == '!nophi '* ]]; then
printf '%s' "${input#!nophi }"
return 0
fi
if [ "$AUTO_PHI_MODE" = "off" ]; then
printf '%s' "$input"
return 0
fi
# Build a list of replacements ("orig|category" entries) so we don't mutate
# the string mid-scan (which would invalidate offsets).
local -a hits=()
# Pass A: per-whitespace-token classification.
local IFS=$' \t\n' tok
local -a tokens
read -r -a tokens <<< "$input"
local t cat key strip_trailing
for t in "${tokens[@]}"; do
[ -z "$t" ] && continue
# Also split comma-delimited sub-tokens (e.g. "a@b.com,c@d.com").
local sub
for sub in ${t//,/ }; do
[ -z "$sub" ] && continue
cat=$(_auto_phi_classify "$sub")
[ -z "$cat" ] && continue
# Strip trailing sentence-grammar punct for the actual replace string,
# but only one char to match classify's behaviour.
strip_trailing="$sub"
case "$strip_trailing" in
*[.,\;:\!\?\)]) strip_trailing="${strip_trailing%?}" ;;
esac
hits+=("$strip_trailing|$cat")
done
done
# Pass B: loose "Title Case Title Case" two-word names. Detect using a
# tolerant regex over the prose; per Bryan's confirm-first default, every
# hit goes through confirm unless mode=aggressive.
local i name_pair
for ((i=0; i<${#tokens[@]}-1; i++)); do
local left="${tokens[$i]}" right="${tokens[$i+1]}"
# Strip one trailing punct from right for the test.
local right_clean="$right"
case "$right_clean" in
*[.,\;:\!\?\)]) right_clean="${right_clean%?}" ;;
esac
if [[ "$left" =~ ^[A-Z][a-z]+$ ]] && [[ "$right_clean" =~ ^[A-Z][a-z]+$ ]]; then
name_pair="$left $right_clean"
# Allowlist check (case-insensitive).
if _auto_phi_in_allowlist "$name_pair"; then
continue
fi
hits+=("$name_pair|NAME_LOOSE")
fi
done
# Pass C: two-token phone "(212) 555-1234" or "(212) 5551234" etc. The
# single-token classifier can't see across whitespace.
local phone_pair
for ((i=0; i<${#tokens[@]}-1; i++)); do
local p_left="${tokens[$i]}" p_right="${tokens[$i+1]}"
# Strip one trailing punct from p_right.
case "$p_right" in
*[.,\;:\!\?\)]) p_right="${p_right%?}" ;;
esac
if [[ "$p_left" =~ ^\(?[0-9]{3}\)?$ ]] \
&& [[ "$p_right" =~ ^[0-9]{3}[-\.]?[0-9]{4}$ ]]; then
phone_pair="$p_left $p_right"
hits+=("$phone_pair|PHONE")
fi
done
# No hits — fast path.
[ ${#hits[@]} -eq 0 ] && { printf '%s' "$input"; return 0; }
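# Dedupe hits while preserving order. Uses a newline-delimited string
# rather than an associative array so the bash<4 fallback path still works.
local seen_hits=$'\n'
local -a uhits=()
local h
for h in "${hits[@]}"; do
case "$seen_hits" in
*$'\n'"$h"$'\n'*) ;;
*) seen_hits+="$h"$'\n'; uhits+=("$h") ;;
esac
done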
# Apply each hit: confirm where needed, then tokenize + substitute.
local summary=""
local mode="$AUTO_PHI_MODE"
for h in "${uhits[@]}"; do
local orig="${h%|*}"
local cat="${h##*|}"
local actual_cat="$cat"
[ "$cat" = "NAME_LOOSE" ] && actual_cat="NAME"
# Use canonical normalize for memory key (so "John Smith" / "JOHN SMITH"
# share one decision).
local mem_key
mem_key=$("$sanitize_script" normalize-value "$orig" "$actual_cat" 2>/dev/null) || mem_key="$orig"
[ -z "$mem_key" ] && mem_key="$orig"
# User previously declined this value this session.
if _auto_phi_seen_declined "$mem_key"; then continue; fi
# Confirm-first prompting only for NAME_LOOSE (the high-FP-rate detector).
# Strict-format hits (EMAIL/SSN/PHONE/DOB/MRN/NAME-with-caret) are always
# tokenized. This matches Bryan's "err on the side of caution" while
# keeping confirms rare and high-signal.
if [ "$cat" = "NAME_LOOSE" ] && [ "$mode" = "confirm" ] \
&& ! _auto_phi_seen_accepted "$mem_key"; then
local ans
printf '%sphi auto>%s possible PHI detected: "%s". Tokenize? [Y/n] ' \
"$C_YELLOW" "$C_RESET" "$orig" >&2
IFS= read -r ans </dev/tty 2>/dev/null || ans=""
case "$ans" in
n|N|no|NO|No) _auto_phi_mark_declined "$mem_key"; continue ;;
*) _auto_phi_mark_accepted "$mem_key" ;;
esac
fi
# Tokenize.
local token
token=$("$sanitize_script" tokenize-value --category "$actual_cat" "$orig" 2>/dev/null)
[ -z "$token" ] && continue
# Substitute. Use literal string replacement (all occurrences).
input="${input//"$orig"/"$token"}"
# Build summary line ("orig → token" pairs, comma-separated).
if [ -z "$summary" ]; then
summary="${orig} → ${token}"
else
summary="${summary}, ${orig} → ${token}"
fi
AUTO_PHI_SESSION_COUNT=$((AUTO_PHI_SESSION_COUNT + 1))
done
if [ -n "$summary" ]; then
local count
count=$(awk -F', ' '{print NF}' <<< "$summary")
printf '%sphi auto>%s tokenized %d value(s): %s\n' \
"$C_YELLOW" "$C_RESET" "$count" "$summary" >&2
fi
printf '%s' "$input"
}
 tool_hl7_sanitize() {
 local input_path="$1" strict="${2:-0}"
 _lib_err_if_missing || return
@@ -2431,6 +2794,23 @@ Slash commands:
 all collapse to the same token.
 Category is auto-detected from value shape (MRN/SSN/DOB/NAME/MANUAL).
 {{phi:VALUE}} / {{phi:CAT:VALUE}} legacy syntax (still works)
Automatic PHI detection (v0.7.1):
Larry now scans every prompt for PHI-shaped values and tokenizes them
BEFORE sending to Anthropic. Detects emails, SSNs, phones, dates,
MRNs (6-12 pure digits), HL7 caret-names, "Last, First" names, and
title-case "John Smith" patterns. Paths, URLs, timestamps, and a small
allowlist (Home Assistant, Mac Studio, etc.) are skipped.
Modes (env LARRY_AUTO_PHI or /auto-phi):
confirm default — prompts Y/n on loose name-like matches once per
session; explicit-format hits (email/SSN/phone/etc.) are
always tokenized
aggressive tokenize every match silently
off disable auto-detection entirely (manual markers still work)
Per-turn override: prefix any prompt with "!nophi " to skip the scan
for that turn only. Manual @@VALUE / {{phi:VALUE}} markers always win.
 /redetect re-scan for HCIROOT/HCISITE/tools
 /sites list site dirs under HCIROOT
 /site <name> switch HCISITE for this session
@@ -2453,13 +2833,23 @@ Multi-line input:
 are not matched. Binary files and files >250 KB are skipped/truncated with
 a warning. TAB after @ autocompletes against files in cwd (fzf if installed).
-Status line (v0.6.9):
-A dim 1-line summary prints above each you[...] > prompt:
+Status line (v0.6.9, repositioned v0.7.1):
+A dim 1-line summary now prints BELOW each just-completed turn (after the
+Larry response, before the next you[...]> prompt) so it stays adjacent
+to the conversation flow:
 OAuth: ─ ctx 12% (24K/200K) ─ 5h 1.8% reset 19:45 ─ 7d 73.7% reset Mon Jun 2
 API key: ─ ctx 12% (24K/200K)$0.213 session ─ 14 turns ─
 Disable entirely with LARRY_NO_STATUS=1. Force re-display with /status.
 Suppressed automatically on the first turn (no data yet).
Memory upload at session close (v0.7.1):
When LARRY_MEMORY_UPLOAD_URL is set, on clean exit Larry POSTs three
artifacts to the configured endpoint: $LARRY_HOME/log/headers.log
(headers-log), $LARRY_HOME/sessions/<id>.log.md (session-log), and
<id>.messages.json (session-messages). Each request carries
X-Larry-Source, X-Larry-Version, and X-Session-Id headers.
Unset = upload skipped, with a one-line warning at exit.
 TAB completion (v0.6.6/v0.6.7/v0.7.0):
 Type '/' followed by any prefix and press TAB.
 /h<TAB> → /help
@@ -2614,6 +3004,8 @@ _LARRY_SLASH_CMDS_DESC=(
 [/hl7]="<SEGMENT> print full field list for an HL7 segment (e.g. /hl7 PID)"
 [/hl7-fields]="<SEG.FIELD> print component breakdown (e.g. /hl7-fields PID.5)"
 [/mouse]="on|off toggle xterm mouse mode for this session"
[/auto-phi]="on|off|aggressive|confirm — runtime control for v0.7.1 auto PHI detection"
[/auto-phi-status]="show current auto-PHI mode + session tokenization count"
 )
 # __larry_complete_slash — bound to TAB via `bind -x` (see _install_readline_tab).
@@ -3100,8 +3492,83 @@ _uninstall_mouse_mode() {
 printf '\033[?1006l\033[?1000l' 2>/dev/null || true
 _LARRY_MOUSE_ACTIVE=0
 }
-# Ensure mouse mode is disabled on REPL exit (Ctrl-C, /quit, EOF). Idempotent.
-trap '_uninstall_mouse_mode' EXIT INT TERM
+# Ensure mouse mode is disabled on REPL exit (Ctrl-C, /quit, EOF). The trap
+# itself is registered AFTER the v0.7.1 upload function below, so we can
+# chain mouse-mode teardown after the memory upload in a single trap.
# ─────────────────────────────────────────────────────────────────────────────
# v0.7.1 — session-artifact upload at session close.
#
# When LARRY_MEMORY_UPLOAD_URL is set, on clean exit we POST the headers.log,
# the session log.md, and the messages.json file to the configured endpoint.
# Each artifact goes as its own request with distinguishing headers so the
# ingest side can route appropriately.
#
# Bryan's memory pipeline (fswatch + ingest daemon) only sees files on his
# Mac; the WORK BOX (MobaXterm/Cygwin) where larry.sh runs is isolated, so
# we upload over the existing tailscale/network path.
#
# Safety:
# - headers.log filters to ^anthropic-* / ^retry-after: response headers
# only — request auth headers (Authorization / x-api-key) are NEVER
# captured into the log at write time (see _parse_response_headers).
# - session log.md contains conversation content. By design Bryan uses
# PHI markers / auto-PHI, so PHI is already tokenized before reaching
# the log. Auth tokens never enter the conversation stream.
# - messages.json contains the same token-substituted conversation
# content as the log.
#
# Set LARRY_MEMORY_UPLOAD_URL=<endpoint> (e.g. on proxy.bjnoela.com) to
# enable. Unset = upload skipped, with a one-line warning at session close.
# ─────────────────────────────────────────────────────────────────────────────
_LARRY_UPLOAD_FIRED=0
upload_session_artifacts() {
# Run once per session (in case both clean exit and EXIT trap fire).
[ "$_LARRY_UPLOAD_FIRED" = "1" ] && return 0
_LARRY_UPLOAD_FIRED=1
local url="${LARRY_MEMORY_UPLOAD_URL:-}"
if [ -z "$url" ]; then
warn "(memory upload skipped: LARRY_MEMORY_UPLOAD_URL not configured)"
return 0
fi
command -v curl >/dev/null 2>&1 || { warn "(memory upload skipped: curl missing)"; return 0; }
local artifacts=(
"$LARRY_HOME/log/headers.log|headers-log|text/plain"
"$LOG_FILE|session-log|text/markdown"
"$MESSAGES_FILE|session-messages|application/json"
)
local entry path kind ctype http_code uploaded=0
for entry in "${artifacts[@]}"; do
path="${entry%%|*}"
kind="${entry#*|}"; kind="${kind%%|*}"
ctype="${entry##*|}"
[ -f "$path" ] || continue
[ -s "$path" ] || continue
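# No -f here: with --fail, an HTTP 4xx/5xx makes curl exit non-zero and the
# "|| http_code=000" fallback would hide the real status from the warning.
http_code=$(curl -sS --max-time 15 \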
-o /dev/null -w '%{http_code}' \
-X POST "$url" \
-H "Content-Type: $ctype" \
-H "X-Larry-Source: $kind" \
-H "X-Larry-Version: $LARRY_VERSION" \
-H "X-Session-Id: $SESSION_ID" \
--data-binary "@$path" 2>/dev/null) || http_code="000"
if [ "$http_code" = "200" ] || [ "$http_code" = "201" ] || [ "$http_code" = "202" ] || [ "$http_code" = "204" ]; then
uploaded=$((uploaded + 1))
else
warn "(memory upload: $kind → HTTP $http_code)"
fi
done
if [ "$uploaded" -gt 0 ]; then
larry_say "memory upload: posted $uploaded artifact(s) to $url"
fi
}
# Fire upload on EXIT trap too (covers Ctrl-C / EOF / kill). The function
# is idempotent (_LARRY_UPLOAD_FIRED guard) so the clean-exit call from
# main_loop won't double-post.
trap 'upload_session_artifacts || true; _uninstall_mouse_mode' EXIT INT TERM
 read_user_input() {
 # Returns user input via global LARRY_INPUT.
@@ -3242,10 +3709,14 @@ main_loop() {
 while true; do
 local _short; _short=$(model_short_name)
-# v0.6.9: persistent status line above the prompt.
-# Only on the FIRST line of input — heredoc continuation reads in
-# read_user_input do not invoke this loop iteration.
-render_status_line
+# v0.7.1: status line is rendered AFTER the previous agent_turn (see end
+# of loop), so it sits BELOW the just-completed prompt cycle / agent
+# response and ABOVE the next prompt. Net visual effect: status reads as
+# a footer to the most-recent turn. This is "Option B" from the v0.7.1
+# spec — chosen over cursor-manipulation Option A because `read -e`
+# (readline) takes exclusive control of the cursor and inserting a
+# repositioned footer below an active prompt is fragile on MobaXterm /
+# Cygwin (readline redisplay clobbers manual cursor moves).
 printf '%syou[%s]>%s ' "$C_GREEN" "$_short" "$C_RESET"
 if ! read_user_input; then
 echo ""; break
@@ -3387,6 +3858,33 @@ main_loop() {
 ;;
 esac
 continue ;;
# v0.7.1: auto-PHI runtime control.
/auto-phi|/auto-phi\ *)
local _arg; _arg=$(_slash_args "/auto-phi" "$input")
case "${_arg:-status}" in
on|confirm)
AUTO_PHI_MODE="confirm"
larry_say "auto-phi: confirm (default — prompts on loose name-like matches)"
;;
aggressive)
AUTO_PHI_MODE="aggressive"
larry_say "auto-phi: aggressive (tokenizes all candidates silently)"
;;
off)
AUTO_PHI_MODE="off"
larry_say "auto-phi: off (explicit markers @@VALUE / {{phi:VALUE}} still work)"
;;
status)
larry_say "auto-phi mode: $AUTO_PHI_MODE (tokenized this session: $AUTO_PHI_SESSION_COUNT)"
;;
*)
err "usage: /auto-phi on|off|aggressive|confirm (no arg → status)"
;;
esac
continue ;;
/auto-phi-status)
larry_say "auto-phi mode: $AUTO_PHI_MODE (tokenized this session: $AUTO_PHI_SESSION_COUNT)"
continue ;;
 /show-last-tool)
 if [ -z "$_LARRY_LAST_TOOL_NAME" ]; then
 err "no tool calls yet this session"
@@ -3608,6 +4106,11 @@ EOF
 ;;
 esac
# v0.7.1: auto-PHI detection runs BEFORE explicit markers, but the function
# itself defers to existing markers (it leaves anything inside @@...@@ or
# {{phi:...}} alone). Manual markers still win.
input=$(auto_detect_phi "$input")
 # PHI preprocessing: replace any {{phi:VALUE}} markers with local tokens
 # BEFORE the input enters conversation history and gets sent to Anthropic.
 if [[ "$input" == *"{{phi:"* ]] || [[ "$input" == *"@@"* ]]; then
@@ -3618,10 +4121,14 @@ EOF
 add_user_text "$input"
 agent_turn "$system_prompt" || warn "turn ended with error"
 echo ""
# v0.7.1: status line below the just-completed prompt cycle. Lives between
# turns, immediately above the next prompt. /status forces a re-render.
render_status_line
 done
 log_section "session-end"
 log_append "- end: $(date -Iseconds 2>/dev/null || date)"
upload_session_artifacts || true
 larry_say "session log: $LOG_FILE"
 }

@@ -202,6 +202,17 @@ normalize_value() {
 # it as an option flag. Same caveat applies for any future tr -d call.
 printf '%s' "$value" | tr -d '[:space:]-'
 ;;
PHONE)
# Strip all non-digits so "(555) 123-4567" and "5551234567" share one
# token. Keep digits only.
printf '%s' "$value" | tr -cd '[:digit:]'
;;
EMAIL)
# Lowercase + trim. Emails are case-insensitive in the local part per
# most providers' practice (RFC technically allows local-part case
# sensitivity, but tokenizing as one value is fine for PHI).
printf '%s' "$value" | tr '[:upper:]' '[:lower:]' | awk '{$1=$1; print}'
;;
 *)
 printf '%s' "$value" | awk '{$1=$1; print}'
 ;;
@@ -437,6 +448,17 @@ case "$SUB" in
 count) shift; cmd_count "$@" ;;
 tokenize-value) shift; cmd_tokenize_value "$@" ;;
 detokenize-value) shift; cmd_detokenize_value "$@" ;;
normalize-value)
# normalize-value VALUE [CATEGORY] — emit canonical form without
# tokenizing or touching the table. Used by larry.sh's auto-PHI to
# build per-session memory keys.
shift
nv_val="${1:-}"; nv_cat="${2:-}"
[ -n "$nv_val" ] || die "normalize-value needs a VALUE"
[ -n "$nv_cat" ] || nv_cat=$(detect_category "$nv_val")
normalize_value "$nv_val" "$nv_cat"
printf '\n'
;;
 -h|--help) sed -n '2,30p' "$NC_SELF"; exit 0 ;;
 *)
 # Default = sanitize mode