From 12989b2cedd71f2296c531e40e7c473e94eb19a7 Mon Sep 17 00:00:00 2001 From: Bryan Johnson Date: Thu, 28 May 2026 10:25:57 -0700 Subject: [PATCH] =?UTF-8?q?v0.8.19:=20nc=5Fpaths=20deterministic=20route-c?= =?UTF-8?q?hain=20tracer=20=E2=80=94=20DFS=20path=20enumerator=20(SITE/THR?= =?UTF-8?q?EAD/HOPS/PATH),=20cross-site,=20DEST-routing;=20wires=20the=20p?= =?UTF-8?q?reviously-dark=20walker=20into=20the=20LLM=20schema=20+=20/path?= =?UTF-8?q?s=20+=20manual=20tool,=20consolidates=20the=20BFS=20walker,=20c?= =?UTF-8?q?heatsheet=20steers=20to=20it.=20Kills=20brute-force=20route-tra?= =?UTF-8?q?cing.?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Co-Authored-By: Claude Opus 4.7 --- CHANGELOG.md | 65 ++++ MANIFEST | 13 +- MANUAL.md | 49 +++ VERSION | 2 +- agents/cloverleaf-cheatsheet.md | 4 +- larry.sh | 79 ++++- lib/nc-parse.sh | 74 +---- lib/nc-paths.sh | 530 ++++++++++++++++++++++++++++++++ 8 files changed, 746 insertions(+), 70 deletions(-) create mode 100755 lib/nc-paths.sh diff --git a/CHANGELOG.md b/CHANGELOG.md index 4699d79..fa8ef72 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -4,6 +4,71 @@ All notable changes to `cloverleaf-larry` / `larry-anywhere` are recorded here. Versioning is loose-semver; bumps trigger the in-process self-update on every running client via `LARRY_BASE_URL` + `MANIFEST`. +## v0.8.19 — 2026-05-28 + +Deterministic route-chain `nc_paths` tool — the #1 fix from the deterministic +tool-coverage plan (Clover). The on-server LLM had NO transitive route-chain +tool: to answer "show me the path / what feeds X / full route" it brute-forced +the whole NetConfig with grep/read_file/bash_exec + a long chain of +nc_destinations calls (the ~$1 prompt Bryan hit, which still gave up before +unrolling the chain). This release wires up a deterministic enumerator so the +model makes ONE call. + +**1. 
New single walker backend `lib/nc-paths.sh`.** +A DFS path-enumerator that ports the v2 `paths` semantics +(`cloverleaf_tools/cli/legacy_workflow_commands.py` paths_cmd + +`_enumerate_downstream_paths`/`_enumerate_upstream_paths`/`_enumerate_all_full_paths`). +Output columns **SITE THREAD HOPS PATH** — HOPS = thread count in the chain, +PATH = the chain joined by ` -> ` (one row per enumerated root-to-leaf path; a +branching thread yields multiple rows). Matches Bryan's exact format +(`pharmacy / pharm_adt_in / 2 / pharm_adt_in -> pyxismed_crh_adtorm_out`). + +- **DEST-based routing, never ICLSERVERPORT.** Next hop is resolved ONLY from + the DATAXLATE `{ DEST }` list (via `nc-parse.sh destinations/sources`). + Bryan's old `paths.tcl` walked via `keylget data ICLSERVERPORT`, which THROWS + on any thread lacking that key (every outbound/client thread), so the trace + died on the first client thread. The DEST list is present on every routing + thread regardless of direction and yields nothing (no crash) when absent — the + v2 paths.tcl crash cannot recur here. +- **All-mode** (`--all`): enumerates every chain from every entry point (a thread + with no incoming), deduped — the whole-site/environment chain inventory + (covers gap #2, v2 `list_full_routes`). +- **Cross-site BY DEFAULT** (Bryan's resolved decision): when a chain's terminal + thread is also an entry thread in another site's NetConfig (correlated by + shared thread name), the walk CONTINUES into that site — the + mux -> ancout -> CodaMetrix chain is followed end to end. `--site-only` scopes + to a single site. +- **Robust cross-site cycle detection.** Every walk carries the full ancestor + set keyed by (site,thread); revisiting an ancestor terminates that path (the + terminal node is still emitted), plus a global max-depth backstop (128, v2 + parity). Always terminates — verified against a deliberate cross-site cycle. +- Formats: table (aligned), tsv, jsonl. + +**2. 
Consolidated the walker backend (no second dark walker).** +Removed the never-wired `cmd_chain` BFS-node-set command from `nc-parse.sh` +(it only emitted a flat set of reachable nodes, never enumerated paths, and was +invisible to the LLM). `nc-paths.sh` is now the SINGLE route-chain backend; the +`nc-parse.sh chain` subcommand now errors with a pointer to it. + +**3. Wired `nc_paths` into the LLM (the critical piece).** +- `larry.sh` — new `tool_nc_paths` wrapper (table output routed through + `_fence_aligned_table`; tsv/jsonl pass unfenced), a `nc_paths)` dispatch case, + and a `{"name":"nc_paths", ...}` schema entry. The schema description steers + the model to use `nc_paths` for ANY "show me the path / trace the chain / what + feeds X / full route / end-to-end flow / sources+destinations chain" question + instead of grep_files / read_file / bash_exec / repeated nc_destinations. +- `nc_sources` schema note tightened to "ONE HOP ONLY — use nc_paths for chains." + +**4. Ergonomics (manual entry + slash command).** +- `larry tools nc-paths [--site-only]` runs the enumerator + standalone (no API). Registry entry added to `larry tools list`. +- `/paths [site] [--up|--down|--site-only|--all|--format ...]` REPL + slash command (defaults `site` to the current `$HCISITE`). The fuller `tbn` + + `. ` shell-shim ergonomics remain a separate follow-on + pending Bryan's cheat sheet — not built here. + +Zero traffic-bypass primitives. + ## v0.8.18 — 2026-05-28 Readable terminal output + two DIRECT-mode follow-ups from Vera's v0.8.17 gate diff --git a/MANIFEST b/MANIFEST index 991b758..a9f4695 100644 --- a/MANIFEST +++ b/MANIFEST @@ -23,21 +23,21 @@ # scripts/make-manifest.sh and bump VERSION. 
# Top-level scripts -larry.sh 087cc26634aa330049d46940ff6370dad2b84b267a8d4ce87b528eb8bd333d5d +larry.sh 8bc938bc3351b88b4fcf2c4244617ef335c9c9e3352fcc1b8da6ddbb9275cdf9 larry-tunnel.sh 6b050e4eeab15669f4858eaf3b807f168f211ced07815db9521bc40a093f6aaa larry-auth.sh a220cdf7878569dc3028951ee57fc8d5e706a8ca5c6aa45347b58facb386f831 larry-rollback.sh 91b5e9aa6c79266bf306dcfba4ca791c07971bd6924d67a779037531648aa6d0 install-larry.sh e97da4e12a0d8863ca18d79b12f6c4294c72fa6d4b11dffeab66504236bb4eb1 # Metadata -VERSION 1d14fd69d4f2d2b8118fa821e3c9a3d88f0a45cb6b262645ff643b4ae101d2b2 -MANUAL.md 666128a086b59ff3c31a574aec0c5dd681666d66319da9f078451bf9013ca5e1 -CHANGELOG.md 41763bdd066ed12d25a0f212378102fac4b5cfd91895a330f34e0859ae697d91 +VERSION d6cb21adf47733cbddb6f624c559d39c4fa8f018d961f0e577f71b91327880e6 +MANUAL.md 956f736291ed3ada0f7bd61c20f60f5267a16776bae918fe3fa17d9c8e07b997 +CHANGELOG.md 83fb342bf07fd2086070974ea7ec031ae665493307f95406591e89c7da222959 # Agent personas (system-prompt overlays) agents/larry.md 0a1ef737e7fc133ab35be09f79c3a4df33de814e0404b69b950932d0c8a01be1 agents/clover.md d1bbfd6cc4642c2bff6e15dcbdf051d71b063b3fe29e0be97d17b3180d3c7ac5 -agents/cloverleaf-cheatsheet.md c0a2aab91f1ddf092bce312def02cc6f3f62a1f653ca5af67a9430c3fcef4c3f +agents/cloverleaf-cheatsheet.md 4bd63c40bcc71ee4a15a330a3450118d8b88c1de1174366aaeef37b8940df751 agents/regress.md bb05ed1439b1e35d6e9799e32d683bfab166472c72115c1f02757e227c74e42f # Cygwin/MobaXterm CR-taint defense primitives (sourced by every tool) @@ -97,7 +97,8 @@ lib/nc-xlate.sh ea02693c3dff5db271771d4bb2927b23465b07798df2f9912bc2d2b58a134d54 lib/nc-smat-diff.sh ac003954701ea6b7f4aa1f6941f8536af5b5cdfbb75e306789753d453f06800e lib/nc-create-thread.sh 5a9d5407c117183cad831d6b95f0e785b1b806f5ccc67f803c12b3695882b5b7 lib/nc-tclgen.sh dc95f523d543192fc7b3ae204107ce67ebb9b7e5184fa0642a1af2e2454d3241 -lib/nc-parse.sh 834c294b156f4b10776db27203a8cc0ede1e98c753ef0d9d087c8619ca710d73 +lib/nc-parse.sh 
473b64c66a55f07ef19fc589467102c9bf2f389c20eabea63bcf272cad3e16fb +lib/nc-paths.sh dadc4138dd24c5585e40253ef33a2a9adb0af1259bc6a601df44f26667934fb7 lib/nc-inbound.sh 52d28c5f8d97bdf96f0fc7b5300d35b106b8e1226578f4cda430deb2a8b4a91b lib/nc-make-jump.sh 08a0bc58a299c95c60a59a5202792daf0ada3a8a0be7dc1b4cccc5724f5c9c79 lib/nc-msgs.sh 729e2d6c9159e83fa177fc6b982e48ed8453a9743477cc90afdd3cd4ec7e620c diff --git a/MANUAL.md b/MANUAL.md index 1e653ef..80b06a6 100644 --- a/MANUAL.md +++ b/MANUAL.md @@ -157,6 +157,55 @@ lib/nc-parse.sh tclproc-refs "$HCISITEDIR/NetConfig" IB_ADT_muxS lib/nc-parse.sh route-block "$HCISITEDIR/NetConfig" IB_ADT_muxS ``` +> `destinations` / `sources` are ONE HOP. To trace a full multi-hop chain, use +> the route-chain tracer below (`nc-paths.sh`) — do not loop these by hand. + +--- + +## Route-chain path tracer (`lib/nc-paths.sh`) — the single walker + +Enumerates the full root-to-leaf message path(s) by following the DATAXLATE +`{ DEST }` routing graph. Output columns **SITE THREAD HOPS PATH** — HOPS +is the thread count in the chain, PATH is the chain joined by ` -> ` (one row per +enumerated path; a branch yields multiple rows). Routing resolves via DEST only, +never `ICLSERVERPORT` (so it never recurs the old `paths.tcl` crash). **Cross-site +by default**: when a chain's terminal thread is also an entry thread in another +site's NetConfig (same name), the chain continues into that site +(mux -> ancout -> CodaMetrix). `--site-only` scopes to one site. Cycle-safe; always +terminates. + +```bash +# One thread — every full path containing it (default), table output. +# Either pass , or let it auto-locate under $HCIROOT. 
+lib/nc-paths.sh pharm_adt_in pharmacy +lib/nc-paths.sh IB_ADT_muxS anc # cross-site chain followed + +# Only downstream chains from a thread / only upstream feeders +lib/nc-paths.sh IB_ADT_muxS anc --downstream +lib/nc-paths.sh ADTto_CodaMetrix codametrix --upstream + +# Stop at the site boundary (no cross-site join) +lib/nc-paths.sh IB_ADT_muxS anc --site-only + +# Whole-site / whole-environment chain inventory (every entry-point chain) +lib/nc-paths.sh --all # every site under $HCIROOT +lib/nc-paths.sh --all --site anc # scope to one site + +# Formats + HCIROOT override +lib/nc-paths.sh pharm_adt_in pharmacy --format jsonl +lib/nc-paths.sh --all --hciroot /other/install/integrator --format tsv +``` + +In the REPL this is the `/paths` slash command (the `site` defaults to the +current `$HCISITE` if omitted), and the LLM reaches it via the `nc_paths` tool +for any "show me the path / what feeds X / full route" question: + +```text +/paths pharm_adt_in pharmacy +/paths IB_ADT_muxS anc --site-only +/paths --all +``` + --- ## Inbound thread classifier (`lib/nc-inbound.sh`) diff --git a/VERSION b/VERSION index 492fd93..ad2cbac 100644 --- a/VERSION +++ b/VERSION @@ -1 +1 @@ -0.8.18 +0.8.19 diff --git a/agents/cloverleaf-cheatsheet.md b/agents/cloverleaf-cheatsheet.md index 38d47af..0b2d17b 100644 --- a/agents/cloverleaf-cheatsheet.md +++ b/agents/cloverleaf-cheatsheet.md @@ -19,7 +19,9 @@ Two kinds of capability: | `nc_protocol_field(netconfig, name, field)` | top-level field (`PROCESSNAME`, `OBWORKASIB`, `OUTBOUNDONLY`, `GROUPS`, `ENCODING`, `ICLSERVERPORT`, `AUTOSTART`, `HOSTDOWN`) | | `nc_protocol_nested(netconfig, name, path)` | nested field via dotted path. **Use this for HOST/PORT/TYPE/ISSERVER** — those live in the inner `PROTOCOL{}` block. e.g. 
path=`PROTOCOL.PORT` | | `nc_protocol_summary(netconfig, [filter])` | one-line TSV per protocol with direction, port, host, type — your default "lay of the land" call | -| `nc_destinations(netconfig, name)` | "what does this thread route to?" — unique DEST list from DATAXLATE | +| `nc_destinations(netconfig, name)` | "what does this thread route to?" — unique DEST list from DATAXLATE. **ONE HOP only — for the full multi-hop chain use `nc_paths`.** | +| `nc_sources(netconfig, name)` | "what routes INTO this thread?" — unique source list. **ONE HOP only — for the full chain use `nc_paths`.** | +| `nc_paths(thread, site, [all], [site_only])` | **"trace the FULL route chain / what feeds X / the whole path / downstream + upstream"** — deterministic DFS path enumerator, output `SITE THREAD HOPS PATH`, cross-site by default. **Use this instead of repeated `nc_destinations`/`nc_sources`, grep, or read_file** for ANY path / chain / route-tracing question. | | `nc_xlate_refs(netconfig, [name])` | "what .xlt files are referenced?" — all or scoped to one protocol | | `nc_find_inbound(netconfig, mode, format)` | "which threads are inbound?" — modes: `tcp-listen` (real upstream-client listeners, ISSERVER=1), `icl-or-file` (OBWORKASIB=1 internal mux/file inbounds), `all`. 
formats: tsv, jsonl, table | diff --git a/larry.sh b/larry.sh index f5f04ee..a776989 100755 --- a/larry.sh +++ b/larry.sh @@ -78,7 +78,7 @@ set -o pipefail # ───────────────────────────────────────────────────────────────────────────── # Config # ───────────────────────────────────────────────────────────────────────────── -LARRY_VERSION="0.8.18" +LARRY_VERSION="0.8.19" LARRY_HOME="${LARRY_HOME:-$HOME/.larry}" # ───────────────────────────────────────────────────────────────────────────── @@ -340,7 +340,8 @@ _tools_resolve_lib_dir() { _tools_registry() { cat <<'REG' #NetConfig (read) -nc-parse.sh|Parse a NetConfig: list/inspect protocols & processes, fields, routes, xlate refs, thread chains +nc-parse.sh|Parse a NetConfig: list/inspect protocols & processes, fields, routes, xlate refs, one-hop destinations/sources +nc-paths.sh|Route-chain PATH tracer: enumerate full root-to-leaf chains for a thread or whole site (cross-site by default). Usage: nc-paths.sh [--up|--down|--site-only] | --all [--site NAME] nc-find.sh|Cross-site search for threads/protocols by name/host/port/xlate across every site under $HCIROOT nc-inbound.sh|List the inbound (server/listener) threads in a NetConfig nc-status.sh|Engine runtime status (sites/threads/not-up/queued/connections) — wraps the shipped tstat binaries @@ -1700,6 +1701,40 @@ tool_nc_sources() { "$LARRY_LIB_DIR/nc-parse.sh" sources "$nc" "$name" 2>&1 } +# nc_paths — deterministic route-chain path ENUMERATOR (v0.8.19). The single +# walker backend; the model calls this ONCE instead of chaining +# nc_destinations + grep_files + read_file (the old ~$1 brute-force). Resolves +# the next hop ONLY from the DATAXLATE DEST list (never ICLSERVERPORT) so it +# cannot recur the old paths.tcl crash. Cross-site by default; --site-only scopes +# to one site. Either pass an explicit netconfig, or a (thread,site) pair, or +# --all for the whole-site/cross-site entry-chain inventory. 
+tool_nc_paths() { + local netconfig="$1" thread="$2" site="$3" direction="${4:-full}" + local all_mode="${5:-0}" site_only="${6:-0}" fmt="${7:-table}" hciroot="${8:-${HCIROOT:-}}" + _lib_err_if_missing || return + local args=() + [ -n "$netconfig" ] && args+=(--netconfig "$netconfig") + [ -n "$thread" ] && args+=("$thread") + [ -n "$site" ] && args+=(--site "$site") + case "$direction" in + up) args+=(--upstream) ;; + down) args+=(--downstream) ;; + full|"") : ;; + *) echo "ERROR: unknown nc_paths direction: $direction (full|up|down)"; return 1 ;; + esac + [ "$all_mode" = "1" ] && args+=(--all) + [ "$site_only" = "1" ] && args+=(--site-only) + [ -n "$hciroot" ] && args+=(--hciroot "$hciroot") + args+=(--format "$fmt") + # v0.8.18 convention: fence aligned tables so the model reproduces them + # verbatim in the monospace terminal. tsv/jsonl are data — passed unfenced. + if [ "$fmt" = "table" ]; then + "$LARRY_LIB_DIR/nc-paths.sh" "${args[@]}" 2>&1 | _fence_aligned_table + else + "$LARRY_LIB_DIR/nc-paths.sh" "${args[@]}" 2>&1 + fi +} + tool_nc_tclproc_refs() { local nc="$1" name="${2:-}" _lib_err_if_missing || return @@ -4114,6 +4149,11 @@ execute_tool() { nc_make_jump) tool_nc_make_jump "$(J '.netconfig')" "$(J '.inbound')" "$(J '.new_host')" "$(J '.jump_port')" \ "$(J '.inbound_host // "127.0.0.1"')" "$(J '.process_jump // "server_jump"')" "$(J '.encoding // ""')" ;; nc_sources) tool_nc_sources "$(J '.netconfig')" "$(J '.name')" ;; + nc_paths) tool_nc_paths "$(J '.netconfig // ""')" "$(J '.thread // ""')" "$(J '.site // ""')" \ + "$(J '.direction // "full"')" \ + "$(J '.all // 0' | sed "s/false/0/;s/true/1/")" \ + "$(J '.site_only // 0' | sed "s/false/0/;s/true/1/")" \ + "$(J '.format // "table"')" "$(J '.hciroot // ""')" ;; nc_tclproc_refs) tool_nc_tclproc_refs "$(J '.netconfig')" "$(J '.name // ""')" ;; hl7_field) tool_hl7_field "$(J '.message')" "$(J '.field_path')" ;; nc_msgs) tool_nc_msgs "$(J '.thread')" "$(J '.after // ""')" "$(J '.before // ""')" \ @@ 
-4174,7 +4214,8 @@ TOOLS_JSON=$(cat <<'TOOLS_END' {"name":"nc_find_inbound","description":"Find inbound threads in a NetConfig. mode=tcp-listen (ISSERVER=1, directly fed by upstream client systems), mode=icl-or-file (OBWORKASIB=1, fed by internal Cloverleaf link or file drop), mode=all (default). Output formats: tsv, jsonl, table.","input_schema":{"type":"object","properties":{"netconfig":{"type":"string"},"mode":{"type":"string","enum":["tcp-listen","icl-or-file","all"],"description":"Which class of inbound to return."},"format":{"type":"string","enum":["tsv","jsonl","table"]}},"required":["netconfig"]}}, {"name":"nc_make_jump","description":"Generate the 3-thread jump set for the cross-environment data replay pattern Bryan uses. Emits FOUR artifacts: (1) linux__out for OLD env (outbound tcpip-client to new linux:jump_port), (2) windows__in for NEW env server_jump site (inbound tcpip-server listening on jump_port, routes internally to #3), (3) windows__out for NEW env server_jump site (outbound tcpip-client to 127.0.0.1:, where orig_port is the existing inbound listening port read from the NetConfig), (4) route-add snippet to splice into the OLD inbound DATAXLATE block. Tag = inbound thread name (auto). The NEW env existing inbound is left COMPLETELY UNCHANGED. Pure generation; caller uses write_file (Y/N) to persist.","input_schema":{"type":"object","properties":{"netconfig":{"type":"string","description":"NetConfig path containing the inbound thread (OLD env)."},"inbound":{"type":"string","description":"Existing inbound protocol name to mirror. Must be a TCP-listener (ISSERVER=1); read its PROTOCOL.PORT first to confirm."},"new_host":{"type":"string","description":"Hostname/IP of the NEW linux env that OLD will TCP to."},"jump_port":{"type":"string","description":"TCP port for the OLD to NEW hop. 
linux__out targets it, windows__in listens on it."},"inbound_host":{"type":"string","description":"Host that windows__out connects to on NEW (the existing inbound on NEW). Default 127.0.0.1 (same box, loopback)."},"process_jump":{"type":"string","description":"Process for NEW-side threads on server_jump. Default server_jump."},"encoding":{"type":"string","description":"ENCODING override. Default = same as the existing inbound."}},"required":["netconfig","inbound","new_host","jump_port"]}}, - {"name":"nc_sources","description":"List every protocol that has a DATAXLATE DEST routing to the named thread. The inverse of nc_destinations. Use this to find what feeds a given thread.","input_schema":{"type":"object","properties":{"netconfig":{"type":"string"},"name":{"type":"string","description":"Target thread name."}},"required":["netconfig","name"]}}, + {"name":"nc_sources","description":"List every protocol that has a DATAXLATE DEST routing to the named thread. The inverse of nc_destinations. ONE HOP ONLY — to trace a full multi-hop chain use nc_paths, not repeated nc_sources calls.","input_schema":{"type":"object","properties":{"netconfig":{"type":"string"},"name":{"type":"string","description":"Target thread name."}},"required":["netconfig","name"]}}, + {"name":"nc_paths","description":"Deterministic ROUTE-CHAIN tracer. Enumerates the full root-to-leaf message path(s) by following the DATAXLATE DEST routing graph (NEVER ICLSERVERPORT). USE THIS — DO NOT brute-force with grep_files / read_file / bash_exec / repeated nc_destinations — for ANY of: 'show me the path', 'trace the chain', 'what feeds X', 'where does X go', 'full route', 'end-to-end flow', 'sources and destinations chain', 'how does a message get from A to B', 'map the interface flow'. ONE call answers the whole question. Output columns SITE THREAD HOPS PATH where HOPS = thread count in the chain and PATH = the chain joined by ' -> ' (one row per enumerated path; a branch yields multiple rows). 
MODES: (a) one thread — set `thread` (and optionally `site`); default returns every full path containing that thread; set direction=down for only downstream, direction=up for only upstream feeders. (b) whole-site / whole-environment inventory — set all=true (optionally scope with `site`); enumerates every chain from every entry point (a thread with no incoming), deduped. CROSS-SITE BY DEFAULT: when a chain's terminal thread is also an entry thread in another site's NetConfig (same thread name), the chain CONTINUES into that site — e.g. mux -> ancout -> CodaMetrix spanning sites. Set site_only=true to stop at the site boundary. Resolves sites under $HCIROOT automatically (or pass hciroot / an explicit netconfig). Cycle-safe across sites; always terminates.","input_schema":{"type":"object","properties":{"thread":{"type":"string","description":"Thread/protocol name to trace. Omit only when all=true."},"site":{"type":"string","description":"Site name (the NetConfig's parent dir). Optional — disambiguates a thread present in multiple sites, or scopes all-mode to one site."},"netconfig":{"type":"string","description":"Optional explicit NetConfig path. If given, the thread's home site is its parent dir; cross-site joins still scan $HCIROOT unless site_only=true."},"direction":{"type":"string","enum":["full","up","down"],"description":"full (default) = every path containing the thread; down = only downstream chains; up = only upstream feeder chains."},"all":{"type":"boolean","description":"true = enumerate every chain from every entry point (whole-site/whole-environment inventory). No thread needed."},"site_only":{"type":"boolean","description":"true = do NOT cross site boundaries (scope to one site). Default false = follow the chain across sites."},"format":{"type":"string","enum":["table","tsv","jsonl"],"description":"Output format. 
Default table (aligned, monospace)."},"hciroot":{"type":"string","description":"Override $HCIROOT for site discovery / cross-site joins."}},"required":[]}}, {"name":"nc_tclproc_refs","description":"List every TCL proc name referenced from a protocol block (or from the whole NetConfig if name is omitted). Pulls from DATAFORMAT.PROC, PREPROCS.PROCS, POSTPROCS.PROCS, etc. Unique sorted.","input_schema":{"type":"object","properties":{"netconfig":{"type":"string"},"name":{"type":"string","description":"Optional. Scope to one protocol."}},"required":["netconfig"]}}, {"name":"hl7_field","description":"Extract a specific HL7 v2 field from a message. field_path = SEG[.FIELD[.COMPONENT[.SUBCOMPONENT]]]. Examples: PID.3 (MRN), PID.18 (account number), MSH.7 (timestamp), MSH.9.2 (event code, like A08), PID.5 (patient name with components). Multiple repetitions are returned one per line. Native v3, no v1/v2 dependency.","input_schema":{"type":"object","properties":{"message":{"type":"string","description":"Raw HL7 message text. Segments separated by \\r."},"field_path":{"type":"string","description":"Field path like PID.3 or MSH.9.2"}},"required":["message","field_path"]}}, {"name":"nc_msgs","description":"Query Cloverleaf smat (SQLite!) databases for messages from a thread. Filters: time range, exact HL7 field match. Native v3 — reads smatdb directly with sqlite3 -ascii, no hcidbdump/dbExtract needed. Format text shows messages line-by-line with metadata; count returns just the count; json returns structured data. Operates on LOCAL smatdbs; for a remote env's smatdb, use ssh_pull_smat first (sampled mode is cheaper than pulling the whole DB).","input_schema":{"type":"object","properties":{"thread":{"type":"string","description":"Thread name. The .smatdb file under $HCISITEDIR/exec/processes/*/.smatdb is auto-located unless db is given."},"after":{"type":"string","description":"Time-after filter. 
Accepts \"3 days ago\", \"2026-05-20 14:30:00\", \"2026-05-20\", or a unix timestamp."},"before":{"type":"string","description":"Time-before filter, same formats as after."},"field":{"type":"string","description":"HL7 field path for exact-match filter, e.g. PID.18 or MSH.10."},"value":{"type":"string","description":"Value the field must equal. Use with field. Repeatable filters not supported via this single tool call — chain calls if you need multi-field AND."},"limit":{"type":"integer","description":"Max messages to return. Default 10."},"format":{"type":"string","enum":["text","json","count","raw"],"description":"text = human-readable with metadata; count = just the number; json = structured; raw = raw bytes separated by 0x1c."},"sitedir":{"type":"string","description":"Override $HCISITEDIR for thread-to-db location."},"db":{"type":"string","description":"Explicit .smatdb path; overrides auto-locate."}},"required":["thread"]}}, @@ -7019,6 +7060,38 @@ main_loop() { fi _run_ssh_helper exec "$alias" "$rcmd" continue ;; + /paths|/paths\ *) + # v0.8.19: deterministic route-chain tracer (muscle-memory entry). 
+ # /paths [site] [--up|--down] [--site-only] [--all] [--format tsv|table|jsonl] + # /paths --all [site] [--site-only] + local _pa; _pa=$(_slash_args "/paths" "$input") + local _p_thread="" _p_site="" _p_dir="full" _p_all=0 _p_siteonly=0 _p_fmt="table" _ptok _pexpect="" + for _ptok in $_pa; do + if [ "$_pexpect" = "format" ]; then _p_fmt="$_ptok"; _pexpect=""; continue; fi + case "$_ptok" in + --up|--upstream) _p_dir="up" ;; + --down|--downstream) _p_dir="down" ;; + --all) _p_all=1 ;; + --site-only) _p_siteonly=1 ;; + --format) _pexpect="format" ;; + --format=*) _p_fmt="${_ptok#--format=}" ;; + --*) err "/paths: unknown flag $_ptok"; continue 2 ;; + *) + if [ -z "$_p_thread" ] && [ "$_p_all" = "0" ]; then _p_thread="$_ptok" + elif [ -z "$_p_site" ]; then _p_site="$_ptok" + fi ;; + esac + done + # default site to the current $HCISITE when a thread is given without one + if [ "$_p_all" = "0" ] && [ -z "$_p_thread" ]; then + err "usage: /paths [site] [--up|--down|--site-only|--all|--format tsv|table|jsonl]" + continue + fi + if [ "$_p_all" = "0" ] && [ -z "$_p_site" ] && [ -n "${HCISITE:-}" ]; then + _p_site="$HCISITE" + fi + tool_nc_paths "" "$_p_thread" "$_p_site" "$_p_dir" "$_p_all" "$_p_siteonly" "$_p_fmt" "" + continue ;; /redetect) detect_cloverleaf_env system_prompt=$(build_system_prompt) larry_say "re-detected. /env to view." diff --git a/lib/nc-parse.sh b/lib/nc-parse.sh index 3a59f7b..21afd76 100755 --- a/lib/nc-parse.sh +++ b/lib/nc-parse.sh @@ -21,10 +21,16 @@ # protocol-nested — drill into nested block, e.g. 
"PROTOCOL.PORT" # protocol-summary [--all|--filter R] — TSV summary of all protocols with key fields # destinations — list DEST values from DATAXLATE routing block +# sources — inverse: protocols that DEST to NAME # xlate-refs [] — list xlate .xlt files referenced +# tclproc-refs [] — list TCL proc names referenced # route-block — emit the DATAXLATE block (the routing config) # help — this help # +# Route-chain PATH enumeration (root-to-leaf chains, all-mode, cross-site) lives +# in lib/nc-paths.sh — it is the single walker backend built on the one-hop +# destinations/sources primitives here. The old `chain` subcommand was removed. +# # Exit codes: 0 OK, 1 usage error, 2 not found, 3 parse error. set -u set -o pipefail @@ -321,64 +327,14 @@ cmd_tclproc_refs() { ' | sort -u | grep -v '^$' } -# Walk the full thread chain starting from a thread name. BFS over sources -# and/or destinations to a configurable depth (default unlimited). -# Output: TSV with columns "depth direction thread" -# depth 0 = the start thread -# direction = self|up|down -cmd_chain() { - local nc="$1" start="$2"; shift 2 - local max_depth=99 dir="both" - while [ $# -gt 0 ]; do - case "$1" in - --depth) shift; max_depth="$1" ;; - --direction) shift; dir="$1" ;; - *) die "unknown flag for chain: $1" ;; - esac - shift - done - require_file "$nc" - - # BFS using two associative arrays in awk-style via files - # We'll just use plain arrays in bash. - local tmp_visited; tmp_visited=$(mktemp) - local tmp_frontier; tmp_frontier=$(mktemp) - local tmp_next; tmp_next=$(mktemp) - printf '%s\n' "$start" > "$tmp_visited" - printf '0\t%s\tself\n' "$start" - printf '%s\n' "$start" > "$tmp_frontier" - - local d - for ((d=1; d<=max_depth; d++)); do - : > "$tmp_next" - while IFS= read -r t; do - [ -z "$t" ] && continue - if [ "$dir" = "both" ] || [ "$dir" = "up" ]; then - while IFS= read -r s; do - [ -z "$s" ] && continue - if ! 
grep -qxF "$s" "$tmp_visited"; then - printf '%s\n' "$s" >> "$tmp_visited" - printf '%s\n' "$s" >> "$tmp_next" - printf '%d\t%s\tup\n' "$d" "$s" - fi - done < <(cmd_sources "$nc" "$t" 2>/dev/null) - fi - if [ "$dir" = "both" ] || [ "$dir" = "down" ]; then - while IFS= read -r dd; do - [ -z "$dd" ] && continue - if ! grep -qxF "$dd" "$tmp_visited"; then - printf '%s\n' "$dd" >> "$tmp_visited" - printf '%s\n' "$dd" >> "$tmp_next" - printf '%d\t%s\tdown\n' "$d" "$dd" - fi - done < <(cmd_destinations "$nc" "$t" 2>/dev/null) - fi - done < "$tmp_frontier" - if [ ! -s "$tmp_next" ]; then break; fi - cp "$tmp_next" "$tmp_frontier" - done - rm -f "$tmp_visited" "$tmp_frontier" "$tmp_next" -} +# NOTE (v0.8.19): the old `cmd_chain` BFS-node-set walker was removed and +# CONSOLIDATED into lib/nc-paths.sh, which is now the SINGLE route-chain backend. +# cmd_chain only emitted a flat set of reachable nodes (depth/direction/thread), +# never enumerated root-to-leaf PATHS, was never wired into the LLM, and would +# have left two competing walkers. nc-paths.sh ports the v2 `paths` DFS +# enumerator (SITE/THREAD/HOPS/PATH output, all-mode, cross-site joins) and reuses +# the one-hop DEST primitives (cmd_destinations / cmd_sources) below. Do not +# reintroduce a second walker here — extend nc-paths.sh. 
cmd_route_block() { local nc="$1" name="$2" @@ -424,7 +380,7 @@ case "$SUB" in protocol-summary) [ $# -ge 2 ] || die "usage: $0 protocol-summary [--filter REGEX]"; cmd_protocol_summary "$2" "${@:3}" ;; destinations) [ $# -ge 3 ] || die "usage: $0 destinations "; cmd_destinations "$2" "$3" ;; sources) [ $# -ge 3 ] || die "usage: $0 sources "; cmd_sources "$2" "$3" ;; - chain) [ $# -ge 3 ] || die "usage: $0 chain [--depth N] [--direction both|up|down]"; cmd_chain "$2" "$3" "${@:4}" ;; + chain) die "the 'chain' subcommand was removed in v0.8.19 — use nc-paths.sh (route-chain path enumerator) instead" ;; xlate-refs) [ $# -ge 2 ] || die "usage: $0 xlate-refs [name]"; cmd_xlate_refs "$2" "${3:-}" ;; tclproc-refs) [ $# -ge 2 ] || die "usage: $0 tclproc-refs [name]"; cmd_tclproc_refs "$2" "${3:-}" ;; route-block) [ $# -ge 3 ] || die "usage: $0 route-block "; cmd_route_block "$2" "$3" ;; diff --git a/lib/nc-paths.sh b/lib/nc-paths.sh new file mode 100755 index 0000000..2357fb6 --- /dev/null +++ b/lib/nc-paths.sh @@ -0,0 +1,530 @@ +#!/usr/bin/env bash +# nc-paths.sh — deterministic route-chain path ENUMERATOR for Larry-Anywhere v3. +# +# This is the SINGLE walker backend for Cloverleaf message routing. It replaces +# the old dark `nc-parse.sh chain` BFS-node-set command (which only ever +# returned a flat set of reachable nodes, never enumerated paths, and was never +# wired into the LLM). It ports the v2 `paths` semantics +# (cloverleaf_tools/cli/legacy_workflow_commands.py paths_cmd + the three +# _enumerate_* helpers, lines 315-464) faithfully: +# +# - Downstream DFS from a start thread, following the DATAXLATE DEST list +# (find_outgoing). A leaf (no outgoing) OR a cycle hit terminates that path +# and the terminal node is included in the emitted chain. +# - Upstream DFS (mirror), following incoming threads (find_incoming). +# - All-mode: enumerate from every entry point (a thread with no incoming), +# deduped — gives the whole-site chain inventory (v2 list_full_routes). 
+# +# ROUTING RESOLUTION: next hop is resolved ONLY from the DATAXLATE { DEST } +# list (via nc-parse.sh destinations / sources). It NEVER reads ICLSERVERPORT. +# This is deliberate: Bryan's old paths.tcl walked routes via +# `keylget data ICLSERVERPORT`, which THROWS on any thread lacking that key +# (every outbound/client thread), so the trace died on the first client thread. +# The DEST list is present on every routing thread regardless of direction and +# simply yields nothing (no crash) when a thread has no routes. DO NOT +# reintroduce an ICLSERVERPORT-based hop here. +# +# CROSS-SITE BY DEFAULT (Bryan's resolved decision, 2026-05-28): when a chain's +# terminal thread (a downstream leaf with no further DEST in its own site) is +# ALSO an entry/inbound thread declared in ANOTHER discovered site's NetConfig +# (correlated by shared thread name), the walk CONTINUES into that site — so the +# mux -> ancout -> CodaMetrix style chain is followed end to end across the site +# boundary. Pass --site-only to scope the walk to a single site. +# +# Robust cycle detection across sites: every walk carries the full ancestor set +# keyed by "site\037thread"; revisiting any (site,thread) ancestor terminates the +# path (the terminal node is still emitted), so the enumeration always +# terminates. A global max-depth cap (default 128, matching v2) is a second +# backstop. +# +# Output columns: SITE THREAD HOPS PATH +# THREAD = the start/anchor thread of the row +# HOPS = number of threads in the chain (len of the path list) +# PATH = the chain joined by " -> " (space-arrow-space) +# One row per enumerated root-to-leaf path; a branching thread yields N rows. 
+# +# Usage: +# nc-paths.sh --netconfig <file> <thread> [flags] # explicit NetConfig +# nc-paths.sh <thread> [<site>] [flags] # resolve site under $HCIROOT +# nc-paths.sh --all [--site <site>] [flags] # whole-site entry chains +# +# Flags: +# --upstream only the upstream chains feeding the thread +# --downstream only the downstream chains from the thread +# (neither flag = full paths containing the thread, +# v2 default, falling back to downstream-from-thread) +# --all enumerate from every entry point (no thread arg) +# --site <site> scope all-mode (or site resolution) to one site +# --site-only do NOT cross site boundaries (downstream only) +# --hciroot <dir> override $HCIROOT for site/cross-site discovery +# --netconfig <file> operate on one explicit NetConfig (implies the site is +# basename(dirname(file)); cross-site still scans $HCIROOT) +# --max-depth N recursion cap (default 128) +# --format tsv|table|jsonl default: table +# +# Exit codes: 0 OK, 1 error (usage error, or thread/site/NetConfig not found). +set -u +set -o pipefail + +NC_SELF="$0" +LIB_DIR="$(cd "$(dirname "$NC_SELF")" && pwd)" +NCP="$LIB_DIR/nc-parse.sh" + +die() { printf 'nc-paths: %s\n' "$*" >&2; exit 1; } + +# ───────────────────────────────────────────────────────────────────────────── +# Arg parsing +# ───────────────────────────────────────────────────────────────────────────── +THREAD="" +SITE_ARG="" +NETCONFIG="" +HCIROOT_OVERRIDE="" +DIR_MODE="full" # full | up | down +ALL_MODE=0 +SITE_ONLY=0 +MAX_DEPTH=128 +FORMAT="table" + +POSITIONAL=() +while [ $# -gt 0 ]; do + case "$1" in + --upstream) DIR_MODE="up" ;; + --downstream) DIR_MODE="down" ;; + --all) ALL_MODE=1 ;; + --site) shift; SITE_ARG="${1:-}" ;; + --site-only) SITE_ONLY=1 ;; + --hciroot) shift; HCIROOT_OVERRIDE="${1:-}" ;; + --netconfig) shift; NETCONFIG="${1:-}" ;; + --max-depth) shift; MAX_DEPTH="${1:-128}" ;; + --format) shift; FORMAT="${1:-table}" ;; + -h|--help) sed -n '2,70p' "$NC_SELF" | sed 's/^# \{0,1\}//'; exit 0 ;; + --*) die "unknown flag: $1" ;; + *) POSITIONAL+=("$1") ;; + esac + shift +done + +case
"$FORMAT" in tsv|table|jsonl) ;; *) die "bad --format: $FORMAT (tsv|table|jsonl)" ;; esac + +# Positional shapes: +# (manual: thread only; site from $HCISITE/$HCISITEDIR) +# (manual muscle-memory: thread + site) +if [ "${#POSITIONAL[@]}" -ge 1 ]; then THREAD="${POSITIONAL[0]}"; fi +if [ "${#POSITIONAL[@]}" -ge 2 ] && [ -z "$SITE_ARG" ]; then SITE_ARG="${POSITIONAL[1]}"; fi +if [ "${#POSITIONAL[@]}" -gt 2 ]; then die "too many positional args: ${POSITIONAL[*]}"; fi + +if [ "$ALL_MODE" = "0" ] && [ -z "$THREAD" ]; then + die "no thread given (and --all not set). Try: nc-paths.sh OR nc-paths.sh --all --site " +fi + +ROOT="${HCIROOT_OVERRIDE:-${HCIROOT:-}}" + +# ───────────────────────────────────────────────────────────────────────────── +# Site discovery — map every discovered NetConfig to a site name. +# Two parallel arrays (portable to bash 3.2 on macOS; no associative-array dep). +# SITE_NAMES[i] = site (basename of NetConfig's parent dir) +# SITE_NCS[i] = absolute NetConfig path +# An explicit --netconfig is always included; cross-site scanning still walks +# $HCIROOT so a terminal can hop into another site. +# ───────────────────────────────────────────────────────────────────────────── +SITE_NAMES=() +SITE_NCS=() + +_add_site() { + local name="$1" nc="$2" i + [ -f "$nc" ] || return 0 + # de-dupe by NetConfig path + for ((i=0; i<${#SITE_NCS[@]}; i++)); do + [ "${SITE_NCS[$i]}" = "$nc" ] && return 0 + done + SITE_NAMES+=("$name") + SITE_NCS+=("$nc") +} + +_discover_sites() { + # explicit NetConfig first (its site name is the parent dir basename) + if [ -n "$NETCONFIG" ]; then + [ -f "$NETCONFIG" ] || die "not a file: $NETCONFIG" + _add_site "$(basename "$(dirname "$NETCONFIG")")" "$NETCONFIG" + fi + # When --site-only with an explicit NetConfig, do not scan further. 
+ if [ "$SITE_ONLY" = "1" ] && [ -n "$NETCONFIG" ]; then + return 0 + fi + # Otherwise discover all sites under $HCIROOT (for cross-site joins / site + # resolution / all-mode), same walk nc-find.sh uses. + if [ -n "$ROOT" ]; then + local nc sname + while IFS= read -r nc; do + sname=$(basename "$(dirname "$nc")") + # When --site-only (no explicit NetConfig) and a site was named, keep only it. + if [ "$SITE_ONLY" = "1" ] && [ -n "$SITE_ARG" ] && [ "$sname" != "$SITE_ARG" ]; then + continue + fi + _add_site "$sname" "$nc" + done < <(find "$ROOT" -maxdepth 2 -name NetConfig -type f 2>/dev/null | sort) + fi +} + +# Resolve the NetConfig path for a given site name (first match wins). +_nc_for_site() { + local want="$1" i + for ((i=0; i<${#SITE_NAMES[@]}; i++)); do + if [ "${SITE_NAMES[$i]}" = "$want" ]; then + printf '%s' "${SITE_NCS[$i]}" + return 0 + fi + done + return 1 +} + +# Given a thread name, find the FIRST discovered (site,nc) pair whose NetConfig +# declares that thread as a protocol. Emits "site\037nc" or returns 1. +US=$'\037' # unit separator: safe field delimiter for site/thread keys +_locate_thread() { + local want="$1" i sname nc + for ((i=0; i<${#SITE_NCS[@]}; i++)); do + sname="${SITE_NAMES[$i]}"; nc="${SITE_NCS[$i]}" + if "$NCP" list-protocols "$nc" 2>/dev/null | grep -qxF "$want"; then + printf '%s%s%s' "$sname" "$US" "$nc" + return 0 + fi + done + return 1 +} + +# ───────────────────────────────────────────────────────────────────────────── +# One-hop primitives (DEST-based, never ICLSERVERPORT). +# ───────────────────────────────────────────────────────────────────────────── +_outgoing() { "$NCP" destinations "$1" "$2" 2>/dev/null; } # nc thread -> dest names +_incoming() { "$NCP" sources "$1" "$2" 2>/dev/null; } # nc thread -> source names + +# Is <thread> an entry point (no incoming) in <nc>?
+_is_entry_in() { + local nc="$1" t="$2" + [ -z "$(_incoming "$nc" "$t")" ] +} + +# ───────────────────────────────────────────────────────────────────────────── +# Path enumeration. Emitted paths are written to $OUT_PATHS as one line each: +# site<TAB>chain where chain = thread1 -> thread2 -> ... +# We carry the running chain as a space-joined token list of "site\037thread" +# keys, and the ancestor set as newline-joined keys (for cycle detection). +# ───────────────────────────────────────────────────────────────────────────── +OUT_PATHS=$(mktemp) +trap 'rm -f "$OUT_PATHS"' EXIT + +# _emit_chain ANCHOR_SITE KEYCHAIN +# KEYCHAIN = space-separated list of "site\037thread" keys +# Renders to "anchor_site<TAB>t1 -> t2 -> ..." (thread names only in PATH). +_emit_chain() { + local anchor_site="$1" keychain="$2" + local out="" k thr first=1 + for k in $keychain; do + thr="${k#*$US}" + if [ "$first" = "1" ]; then out="$thr"; first=0; else out="$out -> $thr"; fi + done + printf '%s\t%s\n' "$anchor_site" "$out" +} + +# Downstream DFS. Mirrors v2 _enumerate_downstream_paths + cross-site hop.
+# $1 anchor_site — site to report in the SITE column for these rows +# $2 cur_site — site of current thread +# $3 cur_nc — NetConfig of current thread +# $4 cur_thread — current thread name +# $5 keychain — space-joined ancestor keys NOT including current +# $6 seen — newline-joined ancestor keys (for cycle detection) +# $7 depth +_walk_down() { + local anchor_site="$1" cur_site="$2" cur_nc="$3" cur_thread="$4" + local keychain="$5" seen="$6" depth="$7" + local curkey="${cur_site}${US}${cur_thread}" + local newchain + if [ -z "$keychain" ]; then newchain="$curkey"; else newchain="$keychain $curkey"; fi + + # cycle / depth cap → terminate, include current node (v2 semantics) + if [ "$depth" -gt "$MAX_DEPTH" ] || printf '%s\n' "$seen" | grep -qxF "$curkey"; then + _emit_chain "$anchor_site" "$newchain" + return 0 + fi + + # gather outgoing within the current site + local outgoing=() + local d + while IFS= read -r d; do + [ -z "$d" ] && continue + outgoing+=("$d") + done < <(_outgoing "$cur_nc" "$cur_thread") + + if [ "${#outgoing[@]}" -gt 0 ]; then + local nseen + nseen="$seen"$'\n'"$curkey" + for d in "${outgoing[@]}"; do + _walk_down "$anchor_site" "$cur_site" "$cur_nc" "$d" "$newchain" "$nseen" $((depth+1)) + done + return 0 + fi + + # No outgoing in this site = a leaf for this site. CROSS-SITE HOP: + # if cross-site is enabled and this leaf thread is an entry/inbound thread in + # ANOTHER site's NetConfig (shared name) that DOES have outgoing there, + # continue the walk into that site. 
+ if [ "$SITE_ONLY" = "0" ]; then + local i osite onc okey + for ((i=0; i<${#SITE_NCS[@]}; i++)); do + osite="${SITE_NAMES[$i]}"; onc="${SITE_NCS[$i]}" + [ "$osite" = "$cur_site" ] && [ "$onc" = "$cur_nc" ] && continue + # the thread must exist in the other site AND have outgoing there + "$NCP" list-protocols "$onc" 2>/dev/null | grep -qxF "$cur_thread" || continue + [ -n "$(_outgoing "$onc" "$cur_thread")" ] || continue + okey="${osite}${US}${cur_thread}" + # cycle guard across sites: don't re-enter an ancestor (site,thread) + printf '%s\n' "$seen" | grep -qxF "$okey" && continue + # Continue the chain in the other site. We DROP the duplicate boundary + # node: cur_thread is already the last node in newchain, and it is the + # same thread name in osite, so we recurse on its destinations directly, + # carrying newchain as the prefix and marking both (site,thread) keys seen. + local nseen2 + nseen2="$seen"$'\n'"$curkey"$'\n'"$okey" + local dd + while IFS= read -r dd; do + [ -z "$dd" ] && continue + _walk_down "$anchor_site" "$osite" "$onc" "$dd" "$newchain" "$nseen2" $((depth+1)) + done < <(_outgoing "$onc" "$cur_thread") + # only join into the first matching downstream site, then stop scanning + return 0 + done + fi + + # true terminal — emit the chain + _emit_chain "$anchor_site" "$newchain" +} + +# Upstream DFS. Mirrors v2 _enumerate_upstream_paths. Cross-site upstream: +# if a thread has no incoming in its own site but the same-named thread is a +# downstream/leaf in another site, follow that site's incoming (the feeders). 
+# builds the chain as a PREFIX (sources come before current) +_walk_up() { + local anchor_site="$1" cur_site="$2" cur_nc="$3" cur_thread="$4" + local keychain="$5" seen="$6" depth="$7" + local curkey="${cur_site}${US}${cur_thread}" + local newchain + if [ -z "$keychain" ]; then newchain="$curkey"; else newchain="$curkey $keychain"; fi + + if [ "$depth" -gt "$MAX_DEPTH" ] || printf '%s\n' "$seen" | grep -qxF "$curkey"; then + _emit_chain "$anchor_site" "$newchain" + return 0 + fi + + local incoming=() + local s + while IFS= read -r s; do + [ -z "$s" ] && continue + incoming+=("$s") + done < <(_incoming "$cur_nc" "$cur_thread") + + if [ "${#incoming[@]}" -gt 0 ]; then + local nseen + nseen="$seen"$'\n'"$curkey" + for s in "${incoming[@]}"; do + _walk_up "$anchor_site" "$cur_site" "$cur_nc" "$s" "$newchain" "$nseen" $((depth+1)) + done + return 0 + fi + + # cross-site upstream hop: same-named thread fed in another site + if [ "$SITE_ONLY" = "0" ]; then + local i osite onc okey + for ((i=0; i<${#SITE_NCS[@]}; i++)); do + osite="${SITE_NAMES[$i]}"; onc="${SITE_NCS[$i]}" + [ "$osite" = "$cur_site" ] && [ "$onc" = "$cur_nc" ] && continue + "$NCP" list-protocols "$onc" 2>/dev/null | grep -qxF "$cur_thread" || continue + [ -n "$(_incoming "$onc" "$cur_thread")" ] || continue + okey="${osite}${US}${cur_thread}" + printf '%s\n' "$seen" | grep -qxF "$okey" && continue + local nseen2 + nseen2="$seen"$'\n'"$curkey"$'\n'"$okey" + local ss + while IFS= read -r ss; do + [ -z "$ss" ] && continue + _walk_up "$anchor_site" "$osite" "$onc" "$ss" "$newchain" "$nseen2" $((depth+1)) + done < <(_incoming "$onc" "$cur_thread") + return 0 + done + fi + + _emit_chain "$anchor_site" "$newchain" +} + +# ───────────────────────────────────────────────────────────────────────────── +# Drivers +# ───────────────────────────────────────────────────────────────────────────── + +# Enumerate every full path in a site by starting from each entry point. 
+# Cross-site continuation happens naturally inside _walk_down. Dedup by the +# rendered "site\tchain" line. +_enumerate_all_in_site() { + local site="$1" nc="$2" + local entry tmp + tmp=$(mktemp) + # entry points = threads with no incoming in this site + "$NCP" list-protocols "$nc" 2>/dev/null | while IFS= read -r entry; do + [ -z "$entry" ] && continue + if _is_entry_in "$nc" "$entry"; then + printf '%s\n' "$entry" >> "$tmp" + fi + done + # if no entry points (every thread has an incoming, e.g. a pure cycle), + # fall back to all protocols as start points (v2 fallback) + if [ ! -s "$tmp" ]; then + "$NCP" list-protocols "$nc" 2>/dev/null > "$tmp" + fi + while IFS= read -r entry; do + [ -z "$entry" ] && continue + _walk_down "$site" "$site" "$nc" "$entry" "" "" 0 + done < "$tmp" + rm -f "$tmp" +} + +main_enumerate() { + _discover_sites + [ "${#SITE_NCS[@]}" -gt 0 ] || die "no NetConfig found (set \$HCIROOT, or pass --netconfig / --hciroot)" + + local raw + raw=$(mktemp) + trap 'rm -f "$OUT_PATHS" "$raw"' EXIT + + if [ "$ALL_MODE" = "1" ]; then + # whole-site entry chains; scope to --site if given (else every site) + local i sname snc + for ((i=0; i<${#SITE_NAMES[@]}; i++)); do + sname="${SITE_NAMES[$i]}"; snc="${SITE_NCS[$i]}" + if [ -n "$SITE_ARG" ] && [ "$sname" != "$SITE_ARG" ]; then continue; fi + _enumerate_all_in_site "$sname" "$snc" >> "$raw" + done + else + # locate the thread's home site + local home_site home_nc loc + if [ -n "$NETCONFIG" ]; then + home_nc="$NETCONFIG"; home_site="$(basename "$(dirname "$NETCONFIG")")" + "$NCP" list-protocols "$home_nc" 2>/dev/null | grep -qxF "$THREAD" \ + || die "thread not found in $home_nc: $THREAD" + elif [ -n "$SITE_ARG" ]; then + home_nc="$(_nc_for_site "$SITE_ARG")" || die "site not found under \$HCIROOT: $SITE_ARG" + home_site="$SITE_ARG" + "$NCP" list-protocols "$home_nc" 2>/dev/null | grep -qxF "$THREAD" \ + || die "thread not found in site $SITE_ARG: $THREAD" + else + loc="$(_locate_thread "$THREAD")" || die 
"thread not found in any discovered site: $THREAD" + home_site="${loc%%$US*}"; home_nc="${loc#*$US}" + fi + + case "$DIR_MODE" in + up) _walk_up "$home_site" "$home_site" "$home_nc" "$THREAD" "" "" 0 >> "$raw" ;; + down) _walk_down "$home_site" "$home_site" "$home_nc" "$THREAD" "" "" 0 >> "$raw" ;; + full) + # v2 default: every full path (entry-point enumeration) that CONTAINS the + # thread; fall back to downstream-from-thread if none contain it. + local all_tmp + all_tmp=$(mktemp) + _enumerate_all_in_site "$home_site" "$home_nc" > "$all_tmp" + # cross-site: also enumerate full paths in any site whose entry chains + # could pass through the thread (the home site's own entry enumeration + # already crosses outward; inbound feeders in other sites are picked up + # because those sites' entry chains are enumerated in all-mode — but for + # a single-thread query we only have the home site's chains, so we also + # scan every discovered site's chains to catch upstream feeders). + if [ "$SITE_ONLY" = "0" ]; then + local j js jn + for ((j=0; j<${#SITE_NAMES[@]}; j++)); do + js="${SITE_NAMES[$j]}"; jn="${SITE_NCS[$j]}" + [ "$jn" = "$home_nc" ] && continue + _enumerate_all_in_site "$js" "$jn" >> "$all_tmp" + done + fi + # keep only chains containing the thread (match on " -> THREAD ->", + # leading "THREAD ->", or trailing "-> THREAD", or exact) + local kept + kept=$(awk -F'\t' -v t="$THREAD" ' + { + chain=$2 + # pad with arrows for unambiguous boundary matching + padded=" -> " chain " -> " + if (index(padded, " -> " t " -> ") > 0) print $0 + }' "$all_tmp" | sort -u) + if [ -n "$kept" ]; then + printf '%s\n' "$kept" >> "$raw" + else + _walk_down "$home_site" "$home_site" "$home_nc" "$THREAD" "" "" 0 >> "$raw" + fi + rm -f "$all_tmp" + ;; + esac + fi + + # dedup the raw "sitechain" lines, preserving first-seen order + awk '!seen[$0]++' "$raw" > "$OUT_PATHS" + rm -f "$raw" + trap 'rm -f "$OUT_PATHS"' EXIT +} + +# 
───────────────────────────────────────────────────────────────────────────── +# Render: OUT_PATHS holds "site<TAB>chain" lines. Build SITE THREAD HOPS PATH. +# THREAD = first node of the chain (the anchor/root for this row) +# HOPS = number of nodes in the chain +# ───────────────────────────────────────────────────────────────────────────── +render() { + if [ ! -s "$OUT_PATHS" ]; then + printf 'No paths found.\n' + return 0 + fi + # produce a 4-col TSV: site thread hops path + local tsv + tsv=$(awk -F'\t' ' + { + site=$1; chain=$2 + # first node + first=chain + sub(/ -> .*/, "", first) + # hop count = number of " -> " separators + 1 + n=split(chain, parts, / -> /) + printf "%s\t%s\t%d\t%s\n", site, first, n, chain + }' "$OUT_PATHS") + + case "$FORMAT" in + tsv) + printf 'site\tthread\thops\tpath\n' + printf '%s\n' "$tsv" + ;; + jsonl) + printf '%s\n' "$tsv" | awk -F'\t' ' + function esc(s){ gsub(/\\/,"\\\\",s); gsub(/"/,"\\\"",s); return s } + { printf "{\"site\":\"%s\",\"thread\":\"%s\",\"hops\":%s,\"path\":\"%s\"}\n", + esc($1),esc($2),$3,esc($4) }' + ;; + table) + { + printf 'SITE\tTHREAD\tHOPS\tPATH\n' + printf '%s\n' "$tsv" + } | awk -F'\t' ' + { for (i=1;i<=NF;i++){ if (length($i)>w[i]) w[i]=length($i); cell[NR,i]=$i }; rows=NR; cols=NF } + END { + for (r=1; r<=rows; r++) { + for (c=1; c<=cols; c++) printf "%-*s ", w[c], cell[r,c] + printf "\n" + if (r==1) { for (c=1; c<=cols; c++) { for (k=0;k&2 + fi + return 0 +} + +main_enumerate +render