cloverleaf-larry/MANUAL.md
Bryan Johnson b141d54847 v0.4.3: cross-env bundle for regression — no direct peer protocol needed
Each Larry is independent. Bryan's question "how will Larry on Windows
talk to Larry on Linux for regression file transfer" answered: they don't.
File transfer is YOUR responsibility (scp / gh release / shared mount /
USB), but nc-regression now produces and consumes portable bundles that
make the split a one-command-on-each-side workflow.

Changes:

lib/nc-regression.sh
  + --phase env-a    convenience for phases 1+2+3 (env-A side)
  + --phase env-b    convenience for phases 4+5+6 (env-B side + diff)
  + --bundle-out PATH  after env-A phases, tar inputs+outputs/env-a +
                       manifest.json + README.md + inbounds.txt
  + --bundle-in PATH   at start, untar a bundle into $OUT; pulls scope
                       from the manifest so the env-B side just needs
                       --env-b and --route-test-cmd

MANUAL.md
  + New "Cross-environment Larry — how the boxes communicate" section
  + Bundle transport table (scp, gh release, NFS, USB, etc.)
  + Notes that the lesson loop uses the same local-capture / manual-
    transport / central-merge model

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-26 11:25:02 -07:00

Larry-Anywhere — Manual Tool Cheat Sheet (no-Larry / offline mode)

Every Larry-Anywhere capability is also a standalone bash script under lib/. When the internet is down, you're on a plane, or the Anthropic API is unreachable, you can drive these tools directly from the shell. Larry just sequences them — the tools work without Larry.

This page documents every command with copy-paste examples. Print it.


Conventions

  • $LARRY_HOME defaults to ~/.larry/. Lib scripts live at $LARRY_HOME/lib/.
  • For brevity, examples below use lib/<tool>.sh. From $LARRY_HOME/, that's ./lib/<tool>.sh. Or add $LARRY_HOME/lib to your PATH.
  • $HCIROOT = Cloverleaf install root (e.g. /opt/cloverleaf/cis2025/integrator).
  • $HCISITE = current site name (e.g. adt).
  • $HCISITEDIR = $HCIROOT/$HCISITE.

Set these before running anything site-specific:

export HCIROOT=/opt/cloverleaf/cis2025/integrator
export HCISITE=adt
export HCISITEDIR="$HCIROOT/$HCISITE"
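
Before running anything site-specific, it can help to confirm that the three variables resolve to a real site. The sketch below is not part of Larry-Anywhere: larry_env_check is a hypothetical helper name, and it relies only on what the examples on this page already assume, namely that the site's NetConfig lives at $HCISITEDIR/NetConfig.

```shell
# Hypothetical helper (not shipped with Larry-Anywhere): fail early when the
# environment is incomplete instead of letting a lib/ tool fail halfway.
larry_env_check() {
  local var val
  for var in HCIROOT HCISITE HCISITEDIR; do
    eval "val=\${$var:-}"                      # portable indirect lookup
    [ -n "$val" ] || { echo "unset: $var" >&2; return 1; }
  done
  # Assumption taken from the examples on this page: NetConfig sits in the site dir.
  [ -f "$HCISITEDIR/NetConfig" ] || {
    echo "no NetConfig under $HCISITEDIR" >&2
    return 1
  }
}
```

Usage: larry_env_check && lib/nc-parse.sh list-protocols "$HCISITEDIR/NetConfig"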

Authentication (larry-auth.sh, lib/oauth.sh)

Only needed if you're running the Larry REPL (larry.sh). The lib/ tools themselves never call Anthropic — they're pure local bash.

larry-auth.sh login              # OAuth via Claude.ai subscription
larry-auth.sh status             # show current auth state + expiry
larry-auth.sh refresh            # force-refresh the access token
larry-auth.sh logout             # delete tokens (revert to API key)

Falling back to an API key: add ANTHROPIC_API_KEY=sk-ant-... to $LARRY_HOME/.env, then chmod 600 the file. Larry uses the API key whenever $LARRY_HOME/.oauth.json is absent.


NetConfig parsing — read (lib/nc-parse.sh)

The foundational reader. Every other NetConfig tool calls this.

# List every protocol (thread) in a NetConfig file
lib/nc-parse.sh list-protocols "$HCISITEDIR/NetConfig"

# List every process
lib/nc-parse.sh list-processes "$HCISITEDIR/NetConfig"

# Find the line where a thread is declared
lib/nc-parse.sh protocol-line "$HCISITEDIR/NetConfig" ADTto_3m
#   → 488

# Get the FULL TCL block for one protocol
lib/nc-parse.sh protocol-block "$HCISITEDIR/NetConfig" IB_ADT_muxS

# Extract a top-level field value
lib/nc-parse.sh protocol-field "$HCISITEDIR/NetConfig" IB_ADT_muxS PROCESSNAME
#   → ADT
lib/nc-parse.sh protocol-field "$HCISITEDIR/NetConfig" IB_ADT_muxS OBWORKASIB
#   → 1

# Drill into nested blocks via dotted path — HOST/PORT/TYPE/ISSERVER live inside PROTOCOL{}
lib/nc-parse.sh protocol-nested "$HCISITEDIR/NetConfig" ADTto_3m PROTOCOL.PORT
#   → 51006
lib/nc-parse.sh protocol-nested "$HCISITEDIR/NetConfig" ADTto_3m PROTOCOL.HOST
#   → SHD360ENCINT02T
lib/nc-parse.sh protocol-nested "$HCISITEDIR/NetConfig" ORU_fr_OPACS PROTOCOL.ISSERVER
#   → 1 (it's a TCP listener)

# TSV summary of every protocol — direction, port, host, type at a glance
lib/nc-parse.sh protocol-summary "$HCISITEDIR/NetConfig"
lib/nc-parse.sh protocol-summary "$HCISITEDIR/NetConfig" --filter adt   # only ADT-ish names

# Routing destinations (what does X route to?)
lib/nc-parse.sh destinations "$HCISITEDIR/NetConfig" IB_ADT_muxS

# Routing sources (what routes INTO X?) — inverse
lib/nc-parse.sh sources "$HCISITEDIR/NetConfig" ADTto_CodaMetrix

# Xlate files referenced (one protocol, or all)
lib/nc-parse.sh xlate-refs "$HCISITEDIR/NetConfig" IB_ADT_muxS
lib/nc-parse.sh xlate-refs "$HCISITEDIR/NetConfig"        # all in file

# TCL procs referenced
lib/nc-parse.sh tclproc-refs "$HCISITEDIR/NetConfig" IB_ADT_muxS

# Get just the DATAXLATE routing block (the heart of routing config)
lib/nc-parse.sh route-block "$HCISITEDIR/NetConfig" IB_ADT_muxS
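
The subcommands above compose well in a loop. The following is a minimal sketch that prints a name/port pair for every protocol. It assumes that list-protocols prints one protocol name per line (not confirmed on this page) and that protocol-nested prints nothing when a thread has no PROTOCOL.PORT. port_table and the NC_PARSE variable are hypothetical names introduced for illustration.

```shell
# Hypothetical wrapper: one "name<TAB>port" row per protocol in a NetConfig.
# NC_PARSE is an illustrative override for the parser path.
port_table() {
  local nc=$1 parse=${NC_PARSE:-lib/nc-parse.sh} name port
  "$parse" list-protocols "$nc" | while IFS= read -r name; do
    [ -n "$name" ] || continue
    # Assumed: empty output (or a non-zero exit) means "no port configured".
    port=$("$parse" protocol-nested "$nc" "$name" PROTOCOL.PORT 2>/dev/null) || port=""
    printf '%s\t%s\n' "$name" "${port:--}"
  done
}
```

Usage: port_table "$HCISITEDIR/NetConfig" | sort -k2,2n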

Inbound thread classifier (lib/nc-inbound.sh)

Identifies inbound threads — TCP listeners directly fed by upstream clients (Epic etc.), or ICL/file inbounds fed via Cloverleaf's internal link.

# Every inbound thread (both classes), table format
lib/nc-inbound.sh "$HCISITEDIR/NetConfig" --format table

# Just real TCP listeners (the "directly fed by upstream" subset)
lib/nc-inbound.sh "$HCISITEDIR/NetConfig" --mode tcp-listen --format table

# Just ICL/file inbounds
lib/nc-inbound.sh "$HCISITEDIR/NetConfig" --mode icl-or-file --format table

# JSONL for piping into other tools
lib/nc-inbound.sh "$HCISITEDIR/NetConfig" --mode tcp-listen --format jsonl

Cross-site finder (lib/nc-find.sh) — the v1 tbn/tbp/tbh/tbpr/where replacements

Walks every NetConfig under $HCIROOT (or a passed list) and returns matches.

# tbn equivalent: partial name match
lib/nc-find.sh --name adt --format table

# tbp equivalent: exact port
lib/nc-find.sh --port 51204 --format table

# tbh equivalent: substring on host
lib/nc-find.sh --host SHD360 --format table

# tbpr equivalent: substring on PROCESSNAME
lib/nc-find.sh --process codametrix --format table

# v1 `<thread> where` — locate a thread across all sites
lib/nc-find.sh --where IB_ADT_muxS --format table

# Threads referencing a specific xlate file
lib/nc-find.sh --xlate Epic_ADT_CodaMetrix --format table

# Threads referencing a specific TCL proc
lib/nc-find.sh --tclproc trxId_IB_ADT_muxS --format table

# Override HCIROOT or pass explicit netconfigs
lib/nc-find.sh --name adt --hciroot /other/install/integrator --format table
lib/nc-find.sh --name adt --netconfigs "/a/NetConfig:/b/NetConfig" --format jsonl

Message search — smat queries (lib/nc-msgs.sh)

Smat databases are SQLite 3 files. nc-msgs.sh reads them with the native sqlite3 CLI (-ascii mode), so no Cloverleaf binary is involved.

# Count messages in a thread's smat
lib/nc-msgs.sh ADTto_3m --format count

# Recent 5 messages (text format = segments per line, with metadata header)
lib/nc-msgs.sh ADTto_3m --limit 5 --format text

# OUTPUT FORMATS:
#   text     (default) segments per line + metadata header per message
#   oneline  one message per line; segments separated by ⏎ marker
#   fields   each non-empty field on its own line: "SEG.N: value"
#   mp       alias for fields (v1 `mp`-style)
#   labeled  fields with alias names where known: "MSH.9 (msg_type): ADT^A08"
#   raw      raw bytes; messages separated by 0x1c (FS) — for piping
#   json     structured JSON
#   count    just the count
lib/nc-msgs.sh ADTto_3m --limit 3 --format oneline
lib/nc-msgs.sh ADTto_3m --limit 1 --format fields
lib/nc-msgs.sh ADTto_3m --limit 1 --format labeled    # adds friendly aliases

# Time range — supports human expressions
lib/nc-msgs.sh ADTto_3m --after "3 days ago" --format count
lib/nc-msgs.sh ADTto_3m --after "2026-05-20" --before "2026-05-26 12:00:00"

# Filter operators (paths accept either . or - separators; same field name aliases everywhere):
#   PATH=VALUE     exact equality (any repetition)
#   PATH!=VALUE    not equal (no repetition matches)
#   PATH~VALUE     contains, case-insensitive
#   PATH!~VALUE    does not contain, case-insensitive
#   PATH=NULL      empty / absent / "" — any of those
#   PATH=          same as =NULL
#   PATH=*         wildcard — any non-empty value
#   PATH!=NULL     present (any non-empty repetition)
# Multiple --field flags AND together. For OR, run two queries.

# Examples using field-name ALIASES (case-insensitive; auto-translates to SEG.N)
lib/nc-msgs.sh ADTto_3m --field 'mrn=5720501458' --format count
lib/nc-msgs.sh ADTto_3m --field 'account_number=623000286' --format text
lib/nc-msgs.sh ADTto_3m --field 'event=A08' --format count
lib/nc-msgs.sh ADTto_3m --field 'visit=*' --format count           # any non-empty
lib/nc-msgs.sh ADTto_3m --field 'ssn=NULL' --format count          # missing SSN
lib/nc-msgs.sh ADTto_3m --field 'name~smith' --format text         # contains
lib/nc-msgs.sh ADTto_3m --field 'name!~test' --format count        # production-looking
lib/nc-msgs.sh ADTto_3m --field 'event=A08' --field 'visit=*'       # AND of both
# Component access: name.2 (PID.5 component 2 = given name)
lib/nc-msgs.sh ADTto_3m --field 'name.2=SALLY' --format count
# Dash syntax (cheat-sheet style):
lib/nc-msgs.sh ADTto_3m --field 'PV1-3.4=100200' --format count

# JSON output for piping
lib/nc-msgs.sh ADTto_3m --field 'PID.3=5720501458' --format json | jq .

# Explicit smatdb path (skip auto-locate)
lib/nc-msgs.sh ADTto_3m --db "$HCISITEDIR/exec/processes/3M/ADTto_3m.smatdb" --format count

# Raw format (for piping into route-test inputs) — messages separated by 0x1c
lib/nc-msgs.sh ADTto_3m --limit 10 --format raw > inputs.msgs
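
To make the filter operator list above concrete, here is an illustrative sketch that applies one operator to one raw field value. It mirrors the operator descriptions on this page only; it is not the code inside nc-msgs.sh, and field_matches is a hypothetical name. Repetitions are assumed to be separated by ~, as in standard HL7.

```shell
# Illustrative only: evaluates one filter against one raw field value.
# $1 = field value (repetitions separated by ~), $2 = operator, $3 = operand
# Exit status 0 means "matches".
field_matches() {
  VAL=$1 OP=$2 WANT=$3 awk 'BEGIN {
    val = ENVIRON["VAL"]; op = ENVIRON["OP"]; want = ENVIRON["WANT"]
    n = split(val, rep, "~")
    nonempty = 0; equal = 0; contains = 0
    for (i = 1; i <= n; i++) {
      if (rep[i] != "") nonempty = 1
      if (rep[i] "" == want "") equal = 1          # force string comparison
      if (index(tolower(rep[i]), tolower(want)) > 0) contains = 1
    }
    if (op == "=" && (want == "NULL" || want == "")) ok = !nonempty
    else if (op == "=" && want == "*")               ok = nonempty
    else if (op == "!=" && want == "NULL")           ok = nonempty
    else if (op == "=")                              ok = equal
    else if (op == "!=")                             ok = !equal
    else if (op == "~")                              ok = contains
    else if (op == "!~")                             ok = !contains
    else                                             ok = 0
    exit(ok ? 0 : 1)
  }'
}
```

Usage: field_matches '123~456' '=' '456' && echo "one repetition matched"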

HL7 field extraction (lib/hl7-field.sh)

Extract specific fields from a single HL7 message.

# Read message from file, extract MRN
lib/hl7-field.sh PID.3 /path/to/message.hl7
#   → 5720501458

# From stdin
cat msg.hl7 | lib/hl7-field.sh MSH.10
#   → 27175  (message control ID)

# Component extraction
lib/hl7-field.sh MSH.9 msg.hl7        # → ADT^A08
lib/hl7-field.sh MSH.9.1 msg.hl7      # → ADT
lib/hl7-field.sh MSH.9.2 msg.hl7      # → A08

# Patient name (whole field + components)
lib/hl7-field.sh PID.5 msg.hl7        # → MORRIS^SALLY^^^^^^LHS^^^^^LEH^M
lib/hl7-field.sh PID.5.1 msg.hl7      # → MORRIS  (family name)
lib/hl7-field.sh PID.5.2 msg.hl7      # → SALLY   (given name)

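# Pipe smat dump → extract → sort | uniq
# (close() flushes each temp file before hl7-field.sh reads it)
lib/nc-msgs.sh ADTto_3m --limit 100 --format raw \
  | awk -v RS=$'\x1c' 'length($0) { f = "/tmp/m" NR; print $0 > f; close(f); system("lib/hl7-field.sh PID.3 " f) }' \
  | sort | uniq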
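
For readers new to HL7 paths, the following sketch shows how a SEG.N.C path maps onto the delimiters of a message. It is a simplified stand-in written for this page, not the implementation of hl7-field.sh: hl7_get is a hypothetical name, it returns only the first matching segment, and it ignores repetitions, subcomponents, and MSH.1 (the separator itself).

```shell
# Illustrative sketch: resolve SEG.N or SEG.N.C against a message on stdin.
# Segments may end in CR or LF; fields split on "|", components on "^".
hl7_get() {
  tr '\r' '\n' | awk -F'|' -v path="$1" '
    BEGIN {
      n = split(path, p, ".")
      seg = p[1]; fld = p[2] + 0
      comp = (n >= 3) ? p[3] + 0 : 0
    }
    $1 == seg {
      # In MSH the first "|" is itself MSH.1, so MSH.N is awk field N;
      # in every other segment SEG.N is awk field N + 1.
      idx = (seg == "MSH") ? fld : fld + 1
      val = $idx
      if (comp > 0) { split(val, c, "\\^"); val = c[comp] }
      print val
      exit
    }'
}
```

Usage: hl7_get PID.5.1 < msg.hl7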

HL7-aware diff (lib/hl7-diff.sh)

Compare two HL7 files (or multi-message dumps) with field-level normalization.

# Default — ignores MSH.7 (timestamp); shows everything else
lib/hl7-diff.sh left.hl7 right.hl7

# Add more fields to ignore
lib/hl7-diff.sh --ignore "MSH.7,MSH.10,EVN.6" left.hl7 right.hl7

# Inverse: ONLY compare specific fields
lib/hl7-diff.sh --include-fields "PID.3,PID.18,MSH.9" left.hl7 right.hl7

# Just count the differences
lib/hl7-diff.sh --format count left.hl7 right.hl7
#   → 5

# TSV output for parsing
lib/hl7-diff.sh --format tsv left.hl7 right.hl7
# columns: msg_idx \t field_path \t left_value \t right_value
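
The TSV format is easy to post-process. As a sketch, the function below counts differences per field path, which quickly shows whether a regression is one noisy field or many. It assumes the four columns listed above and no header row (the page does not say whether one is printed); summarize_diff_tsv is a hypothetical name.

```shell
# Illustrative: read hl7-diff TSV on stdin, print "count<TAB>field_path",
# most frequent field first. Assumes no header row.
summarize_diff_tsv() {
  awk -F'\t' '
    NF >= 4 { count[$2]++ }                    # column 2 = field_path
    END { for (f in count) printf "%d\t%s\n", count[f], f }
  ' | sort -rn
}
```

Usage: lib/hl7-diff.sh --format tsv left.hl7 right.hl7 | summarize_diff_tsv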

Jump thread generation (lib/nc-make-jump.sh)

Generates the 3-thread cross-environment data-replay pattern: linux_<tag>_out on OLD, windows_<tag>_in + windows_<tag>_out in NEW's server_jump site. Output is plain TCL text on stdout (or in separate files with --out-prefix); the NetConfig itself is not modified.

# Generate for one inbound, target new linux host:port, output to stdout
lib/nc-make-jump.sh "$HCISITEDIR/NetConfig" \
  --inbound ORU_fr_OPACS \
  --new-host newlinux01.test \
  --jump-port 61204

# Write each artifact to separate files
lib/nc-make-jump.sh "$HCISITEDIR/NetConfig" \
  --inbound ORU_fr_OPACS \
  --new-host newlinux01.test \
  --jump-port 61204 \
  --out-prefix /tmp/oru_jump
# Produces:
#   /tmp/oru_jump.old_out.tcl       — paste into OLD env NetConfig
#   /tmp/oru_jump.new_in.tcl        — paste into NEW server_jump NetConfig
#   /tmp/oru_jump.new_out.tcl       — paste into NEW server_jump NetConfig
#   /tmp/oru_jump.route_add.tcl     — splice into OLD inbound's DATAXLATE

# Override defaults:
#   --inbound-host   NEW-side outbound dials this instead of 127.0.0.1
#   --process-jump   different process on NEW than default "server_jump"
#   --encoding       override if not ASCII
lib/nc-make-jump.sh "$HCISITEDIR/NetConfig" \
  --inbound ORU_fr_OPACS --new-host newlinux01 --jump-port 61204 \
  --inbound-host 10.0.0.5 \
  --process-jump migration_jump \
  --encoding UTF8

NetConfig modification — journaled writes (lib/nc-insert-protocol.sh)

Inserts new protocol blocks and splices route entries. Every write is journaled — backup, diff, atomic replace.

# Insert a new protocol at end of file
lib/nc-insert-protocol.sh insert "$HCISITEDIR/NetConfig" /tmp/oru_jump.old_out.tcl
# →  journal entry: 2026-05-26-09xxxx/001_NetConfig
#    rollback: larry-rollback.sh --entry 2026-05-26-09xxxx/001_NetConfig

# Insert before/after a named anchor protocol
lib/nc-insert-protocol.sh insert "$HCISITEDIR/NetConfig" /tmp/new_block.tcl --mode after --anchor IB_ADT_muxS
lib/nc-insert-protocol.sh insert "$HCISITEDIR/NetConfig" /tmp/new_block.tcl --mode before --anchor ADTto_3m

# Splice a route entry into an existing protocol's DATAXLATE block
lib/nc-insert-protocol.sh add-route "$HCISITEDIR/NetConfig" ORU_fr_OPACS /tmp/oru_jump.route_add.tcl

After any write, see what changed:

larry-rollback.sh --list                                 # all journal entries
larry-rollback.sh --list --session <session-id>          # one session
cat "$LARRY_HOME/journal/<session>/manifest.md"           # human-readable summary
cat "$LARRY_HOME/journal/<session>/files/NNN_*.diff"      # the unified diff

Roll back:

larry-rollback.sh --target "$HCISITEDIR/NetConfig"       # all changes to this file, newest first
larry-rollback.sh --session 2026-05-26-09xxxx --yes      # whole session, no prompt
larry-rollback.sh --last 1                                # just the most recent write
larry-rollback.sh --entry 2026-05-26-09xxxx/001_NetConfig # one specific entry
larry-rollback.sh --dry-run --session ...                 # preview without changing anything

Pre-rollback copies land at <target>.larry-prerollback.<unix-ts> so you can redo.
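
The backup / diff / atomic-replace sequence that every journaled write follows can be sketched in a few lines. This is a generic illustration of the pattern under stated assumptions, not the code in nc-insert-protocol.sh: journaled_write, the .orig/.diff file names, and the journal directory layout are all hypothetical.

```shell
# Illustrative journaled write: keep the original, record a unified diff,
# then swap the new content in with a same-directory rename (atomic on POSIX).
# $1 = target file, $2 = file holding the new content, $3 = journal directory
journaled_write() {
  local target=$1 new=$2 jdir=$3 base tmp
  base=$(basename "$target")
  mkdir -p "$jdir" || return 1
  cp -p "$target" "$jdir/$base.orig" || return 1            # backup
  # diff exits 1 when the files differ; only 2 or higher is a real error.
  diff -u "$target" "$new" > "$jdir/$base.diff" || [ $? -eq 1 ] || return 1
  tmp=$(mktemp "$target.XXXXXX") || return 1                # same filesystem
  cp -p "$target" "$tmp" &&                                 # inherit mode
    cat "$new" > "$tmp" &&
    mv -f "$tmp" "$target"                                  # atomic replace
}
```

Restoring is then a matter of copying the .orig file back, which is the kind of step larry-rollback.sh automates.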


System documentation (lib/nc-document.sh)

Walks every NetConfig under $HCIROOT, finds threads matching a pattern, composes a markdown knowledge entry.

# Auto-derived doc to stdout
lib/nc-document.sh --name codametrix

# Write to a file with context fields
lib/nc-document.sh --name codametrix \
  --out "$LARRY_HOME/knowledge/codametrix.md" \
  --title "CodaMetrix Coding System" \
  --status "production" \
  --poc-vendor "John Doe at CodaMetrix, jdoe@codametrix.com" \
  --poc-internal "Sarah Smith, Integration Team" \
  --escalation "Page #integration-oncall in Slack" \
  --open-items "- Renewal Q3 2026" \
  --notes "Lives in epic site mostly; 1 thread in ancout"

# Different scope sources
lib/nc-document.sh --name "3M" --hciroot /other/integrator --out /tmp/3m.md
lib/nc-document.sh --name epic_adt --netconfigs "$HCIROOT/epic/NetConfig:$HCIROOT/ancout/NetConfig"

Output: a complete markdown doc with cluster threads, sources, destinations, xlates, tclprocs, plus the placeholder context sections for the team.


Interface diff (lib/nc-diff-interface.sh)

Compares one interface (and connected threads) between two NetConfigs.

# Diff ADTto_3m + 1 hop of connected threads between test and prod
lib/nc-diff-interface.sh \
  --interface ADTto_3m \
  --left  /test/integrator/ancout/NetConfig \
  --right /prod/integrator/ancout/NetConfig \
  --left-label TEST --right-label PROD \
  --depth 1 \
  --out /tmp/adt_diff.md

# Walk further out
lib/nc-diff-interface.sh --interface ADTto_3m \
  --left  /test/integrator/ancout/NetConfig \
  --right /prod/integrator/ancout/NetConfig \
  --depth 3 \
  --out /tmp/adt_chain_diff.md

# Include table file diffs too (.tbl referenced by xlates/tclprocs)
lib/nc-diff-interface.sh --interface ADTto_3m ... --include-tables

Output: markdown report with cluster overview, per-thread protocol-block diff, per-xlate file diff, per-tclproc file diff.


Cross-environment Larry — how the boxes communicate

They don't, directly. Each Larry on each box is independent. For regression testing across two firewalled environments, the workflow is:

env-A (Windows)                      env-B (Linux)
  nc-regression --phase env-a   →    nc-regression --phase env-b
  --bundle-out /tmp/reg.tar.gz       --bundle-in /tmp/reg.tar.gz
       │                                    │
       └────── you move the bundle ────────┘
              (scp, gh release, USB,
               shared mount, etc.)

The --bundle-out flag (after env-A phases 1-3) produces a tarball with:

  • inputs/*.msgs — sampled messages from env-A smatdbs
  • outputs/env-a/*/*.out — env-A route_test outputs
  • manifest.json — env-A metadata + hints
  • inbounds.txt — the threads tested
  • README.md — instructions for the env-B side

Move the bundle however you can (see below). Then on env-B:

lib/nc-regression.sh \
  --bundle-in /path/to/reg.tar.gz \
  --out /tmp/reg \
  --env-b "$HCIROOT" --site-b "$HCISITE" \
  --route-test-cmd '<env-b route_test command>' \
  --phase env-b

That runs phases 4-6 (route_test on env-B with same inputs, diff each pair, master summary).

Bundle transport options (pick whichever your network allows)

| How | When |
|---|---|
| scp between the boxes | both boxes mutually SSH-reachable |
| scp to your laptop, scp to env-B | you can SSH to each separately |
| Private GitHub release / gist | both boxes can reach github.com but not each other |
| Shared NFS / SMB mount | the boxes share a filesystem |
| Box / SharePoint / S3 | enterprise common-storage |
| USB stick / portable drive | nothing else works |
| Tarball as email attachment | small bundles, both boxes have email |

The bundle is plain tar.gz — nothing Cloverleaf-specific. Inspect it with tar -tzf reg.tar.gz to see what's inside before moving it.
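
Since the bundle is an ordinary tarball, a quick pre-flight check before transport is straightforward. The sketch below looks for the three top-level files named in the bundle contents list; bundle_check is a hypothetical helper, and the check is deliberately loose about any leading directory prefix because the exact layout inside the archive is not specified here.

```shell
# Illustrative pre-flight check: does the tarball contain the expected
# metadata files? Returns 0 if all are present, 1 if any is missing,
# 2 if the archive cannot be read.
bundle_check() {
  local bundle=$1 listing want missing=0
  listing=$(tar -tzf "$bundle") || return 2
  for want in manifest.json inbounds.txt README.md; do
    # Accept the file at the archive root or under any directory prefix.
    if ! printf '%s\n' "$listing" | grep -Eq "(^|/)$want\$"; then
      echo "missing from bundle: $want" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Usage: bundle_check /tmp/reg.tar.gz && scp /tmp/reg.tar.gz user@env-b-host:/tmp/ (the scp destination is a placeholder).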

Lessons / refinements use the same pattern

Larry on a client captures lessons locally (/lesson, lesson_record tool). You export the lesson bundle with lib/lessons.sh export and paste it back to home-Larry (the Larry session on your dev machine). Same model: local capture, manual transport, central merge. No direct peer protocol.


Regression testing — end to end (lib/nc-regression.sh)

Full Example 6 orchestrator. Six phases: discover → sample → route-test A → route-test B → diff → summary.

# Phase-by-phase walk through (Bryan's house pattern)

# Phase 1+2: discover inbounds and sample messages from env-A only
lib/nc-regression.sh \
  --scope site \
  --env-a /opt/cloverleaf/test/integrator    --site-a adt \
  --env-b /opt/cloverleaf/prod/integrator    --site-b adt \
  --out   /tmp/reg-2026-05-26 \
  --count 10 \
  --phase 2

# Inspect what would be sampled (dry-run)
lib/nc-regression.sh --scope site --env-a /test --env-b /prod \
  --site-a adt --site-b adt --out /tmp/reg --count 10 --phase 2 --dry-run

# Full run — needs Cloverleaf route_test command supplied:
lib/nc-regression.sh \
  --scope site --site-a adt --site-b adt \
  --env-a /opt/cloverleaf/test/integrator \
  --env-b /opt/cloverleaf/prod/integrator \
  --out /tmp/reg-2026-05-26 \
  --count 10 \
  --route-test-cmd 'cd {HCIROOT}/{HCISITE} && . ./.profile && {THREAD} route_test {INPUT} && cp *.out.* {OUTPUT_DIR}/' \
  --phase all

# Just diff existing outputs (you ran route_test manually before)
lib/nc-regression.sh --scope site --site-a adt --site-b adt \
  --env-a /opt/cloverleaf/test/integrator \
  --env-b /opt/cloverleaf/prod/integrator \
  --out /tmp/reg-2026-05-26 \
  --phase 5

# Other scopes
--scope thread:ADTto_3m                      # one thread
--scope threads:ADTto_3m,MFNto_3m,DFTto_3m   # specific list
--scope server                               # every inbound in every site under HCIROOT
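
The --route-test-cmd value in the full-run example is a template with {HCIROOT}, {HCISITE}, {THREAD}, {INPUT} and {OUTPUT_DIR} placeholders. To preview what a template turns into for one thread, a literal substitution like the sketch below is enough. How nc-regression.sh performs the substitution internally is not documented here, so treat this as an approximation; expand_route_cmd is a hypothetical name.

```shell
# Illustrative placeholder expansion (literal, no regex involved).
# $1 = template, $2 = HCIROOT, $3 = HCISITE, $4 = THREAD, $5 = INPUT, $6 = OUTPUT_DIR
expand_route_cmd() {
  T=$1 HCIROOT=$2 HCISITE=$3 THREAD=$4 INPUT=$5 OUTPUT_DIR=$6 awk 'BEGIN {
    s = ENVIRON["T"]
    n = split("HCIROOT HCISITE THREAD INPUT OUTPUT_DIR", keys, " ")
    for (i = 1; i <= n; i++) {
      tag = "{" keys[i] "}"; out = ""
      # Replace every occurrence of the tag with the matching value.
      while ((p = index(s, tag)) > 0) {
        out = out substr(s, 1, p - 1) ENVIRON[keys[i]]
        s = substr(s, p + length(tag))
      }
      s = out s
    }
    print s
  }'
}
```

Usage: expand_route_cmd 'cd {HCIROOT}/{HCISITE} && {THREAD} route_test {INPUT}' /opt/cloverleaf/test/integrator adt IB_ADT_muxS inputs/IB_ADT_muxS.msgs /tmp/out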

Output tree:

/tmp/reg-2026-05-26/
├── inbounds.txt                       # the scope
├── inputs/<thread>.msgs               # sampled inputs (1 per inbound)
├── outputs/env-a/<thread>/<dest>...   # env-A route_test outputs
├── outputs/env-b/<thread>/<dest>...   # env-B route_test outputs (using same inputs)
├── diff/<thread>.<dest>.md            # per-pair hl7_diff report
├── diff/_index.md                     # diff summary table
└── regression-summary.md              # master report

Tip: if env-B is remote, pass --env-b-host <host> --env-b-user <user> and Phase 4 copies the inputs over with scp before invoking route_test there. Alternatively, run route_test on env-B separately, then use --phase 5 for a diff-only pass that skips Phase 4.
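
Before a diff-only pass it is worth checking that every thread in scope actually has env-B output. The sketch below walks the output tree shown above. It assumes inbounds.txt holds one thread name per line, which this page does not state explicitly; missing_env_b is a hypothetical name.

```shell
# Illustrative: print threads listed in inbounds.txt that have no
# outputs/env-b/<thread>/ directory yet.
# $1 = the --out directory of a regression run
missing_env_b() {
  local out=$1 thread
  while IFS= read -r thread; do
    [ -n "$thread" ] || continue
    [ -d "$out/outputs/env-b/$thread" ] || printf '%s\n' "$thread"
  done < "$out/inbounds.txt"
}
```

Usage: missing_env_b /tmp/reg-2026-05-26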


Reverse SSH tunnel (larry-tunnel.sh)

To let a home Larry SSH into the client box (this works when the client allows outbound SSH):

# Zero-config: serveo.net (third-party, NOT for sensitive sessions)
larry-tunnel.sh --serveo

# Your own hop (needs hop sshd configured with GatewayPorts)
LARRY_HOP_USER=larry-tunnel \
LARRY_HOP_HOST=bjnoela.com \
LARRY_HOP_KEY=~/.ssh/id_ed25519 \
  larry-tunnel.sh

# Inspect / stop
larry-tunnel.sh --status
larry-tunnel.sh --stop
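
For orientation, a reverse tunnel through your own hop usually comes down to a single ssh -R invocation. The sketch below only prints such a command so you can inspect it. It is an assumption about the general technique, not a description of what larry-tunnel.sh runs; tunnel_cmd, the remote port 2222, and the forwarded local port 22 are illustrative choices.

```shell
# Illustrative: build (but do not run) a reverse-tunnel command.
# $1 = hop user, $2 = hop host, $3 = private key path, $4 = port opened on the hop
tunnel_cmd() {
  local user=$1 host=$2 key=$3 remote_port=${4:-2222}
  # -N: no remote command; -R: the hop listens on remote_port and forwards
  # connections back to this box's sshd on port 22.
  printf 'ssh -i %s -N -o ExitOnForwardFailure=yes -o ServerAliveInterval=30 -R %s:localhost:22 %s@%s\n' \
    "$key" "$remote_port" "$user" "$host"
}
```

From the hop, the client box would then be reachable with ssh -p 2222 <client-user>@localhost, where <client-user> is your account on the client.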

PHI handling — sanitize / desanitize (lib/hl7-sanitize.sh, lib/hl7-desanitize.sh)

When working with prod data, tokenize PHI fields BEFORE they reach the API.

# Sanitize a file: replaces PHI fields with [[CATEGORY_NNNN]] tokens.
# Lookup table at ~/.larry/sanitize/lookup.tsv (mode 0600, never leaves the box).
lib/hl7-sanitize.sh /opt/cloverleaf/.../some.hl7 > /tmp/sanitized.hl7

# Pipe an entire smat-dump through sanitize:
lib/nc-msgs.sh ADTto_3m --limit 100 --format raw \
  | lib/hl7-sanitize.sh > /tmp/sanitized-batch.hl7

# Strict mode also tokenizes unknown Z-segments wholesale:
lib/hl7-sanitize.sh --strict ./msg.hl7 > /tmp/sanitized.hl7

# See the current lookup table (PHI is here — DON'T share):
lib/hl7-sanitize.sh show-table

# Count entries:
lib/hl7-sanitize.sh count

# Clear the table (asks for confirmation):
lib/hl7-sanitize.sh clear-table

# Tokenize a single value (used by Larry's {{phi:...}} preprocessor):
lib/hl7-sanitize.sh tokenize-value --category MRN 12345
#   → [[MRN_0001]]

# Detokenize a single token:
lib/hl7-sanitize.sh detokenize-value "[[MRN_0001]]"
#   → 12345

# Desanitize a whole document (e.g. view Larry's tokenized output unmasked, locally):
cat larry-output.txt | lib/hl7-desanitize.sh | less

# Quick token lookup:
lib/hl7-desanitize.sh --token "[[NAME_0042]]"

# Override default PHI rules (rule file format: SEG|FIELD|CATEGORY per line)
lib/hl7-sanitize.sh --rules-file /tmp/my-rules.txt /tmp/msg.hl7
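
The core idea behind tokenize-value is a stable lookup: the same value always maps to the same token, and new values receive the next number. The sketch below illustrates that idea with a two-column TSV. The real format of lookup.tsv is not documented on this page, so the token<TAB>value layout and the per-category counter are assumptions, and tokenize_value is a hypothetical name.

```shell
# Illustrative stable tokenizer. Table layout (assumed): token<TAB>value
# $1 = category (e.g. MRN), $2 = value, $3 = path to the lookup table
tokenize_value() {
  local category=$1 value=$2 table=$3 token n
  touch "$table" && chmod 600 "$table" || return 1   # table holds real PHI
  # Reuse an existing token when this category/value pair is already known.
  token=$(CAT=$category VAL=$value awk -F'\t' '
    BEGIN { prefix = "[[" ENVIRON["CAT"] "_" }
    index($1, prefix) == 1 && ($2 "") == (ENVIRON["VAL"] "") { print $1; exit }
  ' "$table")
  if [ -z "$token" ]; then
    # Next number = entries already recorded for this category, plus one.
    n=$(CAT=$category awk -F'\t' '
      BEGIN { prefix = "[[" ENVIRON["CAT"] "_" }
      index($1, prefix) == 1 { k++ }
      END { print k + 1 }
    ' "$table")
    token=$(printf '[[%s_%04d]]' "$category" "$n")
    printf '%s\t%s\n' "$token" "$value" >> "$table"
  fi
  printf '%s\n' "$token"
}
```

Calling it twice with the same category and value returns the same token, which is what keeps sanitized messages joinable across runs.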

Inside Larry (the REPL)

you> /phi 5720501458
phi> [[MRN_0001]]  (use this in your next prompt)

you> find messages for {{phi:MRN:5720501458}} in last 3 days
phi> {{phi:MRN:5720501458}} → [[MRN_0001]]
   (the actual prompt sent to Anthropic has [[MRN_0001]] — the original MRN never leaves the box)

you> /unmask [[NAME_0042]]
unmask> [[NAME_0042]] → MORRIS^SALLY^^^...  (local only; never sent to API)

you> /tokens
  (prints the full PHI ↔ token lookup table — local terminal only)

PHI inline syntax in any prompt:

  • {{phi:VALUE}} — tokenize before send; auto-detects category (matches existing entries)
  • {{phi:MRN:12345}} — explicit category=MRN (matches sanitized data)
  • {{phi:NAME:JOHN SMITH}} — explicit category=NAME

Default PHI rule set

Fields tokenized by default (override with --rules-file):

PID.2..7, .9, .11, .13, .14, .18, .19, .20, .21, .29, .30   (patient IDs, name, DOB, address, phone, account, SSN, license)
PV1.7, .8, .9, .17, .19, .50, .52                            (providers, visit number)
NK1.2, .3, .4, .5, .6, .16                                   (next of kin)
GT1.3, .4, .5, .6, .7, .11, .12, .19                         (guarantor)
IN1.16, .17, .18, .19, .20, .36, .49                         (insurance)
OBR.10, .16, .32  / OBX.16  / DG1.3, .4  / ORC.10, .12       (orders/observations)

Limitations (read these)

  • Your typed prompt can still leak PHI if you don't use {{phi:…}} markers. Be deliberate.
  • Custom Z segments aren't tokenized unless --strict is passed (which then redacts unknown Zs wholesale).
  • Free-text fields (OBX.5 narratives, comments in NTE segments) can contain PHI in prose form. Default rules don't tokenize OBX.5; add it via --rules-file if your shop carries PHI in lab narratives.
  • Repetitions (~-separated within a field) are tokenized as one value, not per-rep. Adequate for most analysis.
  • The lookup table at ~/.larry/sanitize/lookup.tsv contains real PHI. Mode 0600, never sent anywhere by these scripts, but it's still on disk. Wipe with clear-table before shipping the box anywhere.

Quick recipe: "I have to do X without internet"

| Task | Command |
|---|---|
| "what threads are here?" | lib/nc-parse.sh list-protocols $HCISITEDIR/NetConfig |
| "find threads named 3m" | lib/nc-find.sh --name 3m --format table |
| "where does d_foo live?" | lib/nc-find.sh --where d_foo --format table |
| "what feeds d_foo?" | lib/nc-parse.sh sources $HCISITEDIR/NetConfig d_foo |
| "what does d_foo route to?" | lib/nc-parse.sh destinations $HCISITEDIR/NetConfig d_foo |
| "what xlates does d_foo use?" | lib/nc-parse.sh xlate-refs $HCISITEDIR/NetConfig d_foo |
| "find messages for MRN X" | lib/nc-msgs.sh d_foo --field PID.3=X --format text |
| "diff this interface across two envs" | lib/nc-diff-interface.sh --interface NAME --left A/NC --right B/NC --depth 1 --out out.md |
| "generate jump threads for a migration" | lib/nc-make-jump.sh ... --inbound X --new-host Y --jump-port Z --out-prefix /tmp/jump |
| "insert a new protocol with rollback" | lib/nc-insert-protocol.sh insert NC /tmp/block.tcl, then larry-rollback.sh --target NC if needed |
| "what changes did I make recently?" | larry-rollback.sh --list |
| "undo my last change" | larry-rollback.sh --last 1 --dry-run, then --last 1 --yes |
| "document a system" | lib/nc-document.sh --name PATTERN --out FILE |
| "full regression test between two envs" | lib/nc-regression.sh --scope site --site-a SA --site-b SB --env-a EA --env-b EB --out DIR --count N --route-test-cmd '...' --phase all |

Where to look when something breaks

  • ~/.larry/sessions/<id>.log.md — every Larry session is logged as markdown.
  • ~/.larry/journal/<session>/manifest.md — every journaled write in that session, with diffs.
  • ~/.larry/journal/index.tsv — flat index of every write across all sessions.
  • ~/.larry/.env — your API key (if using API-key auth).
  • ~/.larry/.oauth.json — OAuth tokens (if using subscription auth).
  • ~/.larry/agents/*.md — the personas loaded into Larry's system prompt. Editable; reloaded each launch.

All lib/ scripts accept --help (or -h) to print usage.