cloverleaf-larry/lib/nc-regression.sh
Bryan Johnson 1709655a9c v0.6.8: cross-env Cloverleaf workflows over SSH ControlMaster
Closes the gap between v0.6.7's ssh_exec/ssh_status primitives and the local
nc_* tools, so Bryan's two motivating workflows compose cleanly:

  1. "Compare the ADT site NetConfig on qa to dev"
  2. "Grab smat files from dev and bring to qa for regression testing"

ssh_pull, ssh_push (lib/ssh-helper.sh + larry.sh):
  scp via the existing ControlMaster socket — no second auth, no second TCP
  handshake. Master-not-open and missing-remote-file paths fail with explicit
  messages ("open the master with /ssh-setup <alias> first"). Pull caches to
  /tmp/larry-pulls/<alias>.<basename>.<hash-of-remote-path> when local_path is
  omitted, so repeat pulls of the same remote file are idempotent. Validates
  byte counts post-transfer to catch partial transfers.
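
  As a rough illustration of the default cache naming, the sketch below
  derives a path of the form /tmp/larry-pulls/<alias>.<basename>.<hash>.
  The function name pull_cache_path and the use of cksum as the hash are
  assumptions made for this example; the hash actually used by
  lib/ssh-helper.sh is not shown in this commit message.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive a stable local cache path for a pulled file.
# Assumption: cksum (POSIX CRC) stands in for whatever hash ssh-helper.sh uses.
pull_cache_path() {
  local alias="$1" remote_path="$2"
  local base hash
  base=$(basename "$remote_path")
  # Hash the full remote path so same-named files in different
  # directories map to different cache entries.
  hash=$(printf '%s' "$remote_path" | cksum | awk '{print $1}')
  printf '/tmp/larry-pulls/%s.%s.%s\n' "$alias" "$base" "$hash"
}

# Illustrative alias and path only.
pull_cache_path qa /opt/cloverleaf/adt/NetConfig
```

  Because the path depends only on the alias and the remote path, pulling
  the same remote file twice resolves to the same local file.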

ssh_pull_smat (lib/ssh-helper.sh + larry.sh):
  Cloverleaf-aware smatdb pull. Full mode scp's the entire .smatdb;
  sampled mode (days_back=N) runs sqlite3 server-side via ssh_exec to extract
  up to 1000 recent messages as TSV with base64-encoded MessageContent blobs
  (verified end-to-end with a synthetic smatdb fixture matching nc-msgs.sh's
  smat_msgs schema). Avoids transferring multi-GB archives when only N
  samples are needed.
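
  On the receiving side, the sampled TSV can be turned back into a .msgs
  file roughly as follows. The column layout (six tab-separated columns
  with the base64 payload last) follows the read loop in phase_2 of
  nc-regression.sh; the function name decode_sampled_tsv and the demo data
  are hypothetical, and `base64 -d` is assumed to be available.

```shell
#!/usr/bin/env bash
# Hypothetical helper: turn sampled TSV rows into a 0x1c-separated .msgs file.
# Assumed columns: unix_ts, dir, type, src, dst, base64(MessageContent).
# Caveat: tab is IFS whitespace in bash, so empty columns would collapse;
# this sketch assumes every column is populated.
decode_sampled_tsv() {
  local tsv="$1" out="$2" limit="$3"
  local got=0 unix_ts dir typ src dst b64
  : > "$out"
  while IFS=$'\t' read -r unix_ts dir typ src dst b64; do
    [ -z "$b64" ] && continue
    # Append the decoded message, then the 0x1c separator.
    if printf '%s' "$b64" | base64 -d >> "$out" 2>/dev/null; then
      printf '\x1c' >> "$out"
      got=$((got + 1))
      [ "$got" -ge "$limit" ] && break
    fi
  done < "$tsv"
  printf '%s\n' "$got"
}

# Demo with two synthetic rows; keep at most one message.
demo_dir=$(mktemp -d)
for body in 'MSH|first' 'MSH|second'; do
  printf '1700000000\tin\tADT\tsrc\tdst\t%s\n' "$(printf '%s' "$body" | base64)"
done > "$demo_dir/sample.tsv"
decode_sampled_tsv "$demo_dir/sample.tsv" "$demo_dir/adt_in.msgs" 1
rm -rf "$demo_dir"
```

  The function prints how many messages it wrote, which is the figure the
  caller can log per thread.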

nc_diff_interface tool (newly wired):
  Promotes lib/nc-diff-interface.sh into the LLM-callable tool surface. Used
  by the new /nc-diff-env slash command for workflow #1.

nc_regression cross-env (lib/nc-regression.sh + larry.sh):
  source_ssh_alias / target_ssh_alias args. Phase 1 (discovery) and Phase 2
  (sample) run via ssh_exec + ssh_pull / ssh_pull_smat against the source
  alias. Phase 3/4 (route_test) push inputs over and pull outputs back via
  ssh_push / ssh_pull. Phases 5/6 (diff + summary) stay local. Reports
  reference the SSH alias names rather than raw user@host strings.
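
  For the remote case, the route_test template is rendered with staging
  paths on the remote host rather than local ones. The sketch below
  mirrors render_cmd from nc-regression.sh; the thread name, HCIROOT, site,
  and template are made-up values for illustration.

```shell
#!/usr/bin/env bash
# Placeholder substitution in the style of render_cmd in nc-regression.sh.
render_cmd() {
  local cmd="$1" thread="$2" input="$3" outdir="$4" hciroot="$5" hcisite="$6"
  cmd="${cmd//\{THREAD\}/$thread}"
  cmd="${cmd//\{INPUT\}/$input}"
  cmd="${cmd//\{OUTPUT_DIR\}/$outdir}"
  cmd="${cmd//\{HCIROOT\}/$hciroot}"
  cmd="${cmd//\{HCISITE\}/$hcisite}"
  printf '%s' "$cmd"
}

# Illustrative values only (thread, paths, and template are hypothetical).
thread="adt_in"
remote_input="/tmp/larry-regress/inputs/${thread}.msgs"
remote_outdir="/tmp/larry-regress/outputs/env-b/${thread}"
render_cmd 'cd {HCIROOT}/{HCISITE} && {THREAD} route_test {INPUT} > {OUTPUT_DIR}/out.txt' \
  "$thread" "$remote_input" "$remote_outdir" /opt/hci qa_adt
echo
```

  The rendered string is what gets handed to ssh_exec, so every path in it
  has to exist on the remote side, not on the machine running the script.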

/nc-diff-env and /nc-regression-env slash commands (larry.sh):
  Templated prompts to Larry-the-LLM that explicitly cite the motivating
  workflows, call out ssh_status / ssh_pull / nc_diff_interface and the
  nc_regression cross-env fields. Registered in _LARRY_SLASH_CMDS +
  _LARRY_SLASH_CMDS_DESC + /help per v0.6.7 patterns.

Bug fix unearthed during cross-env work:
  lib/nc-regression.sh phase_5 / phase_6 used printf 'FORMAT' where FORMAT
  begins with '- '. bash 3.2 (macOS default) reads the leading '-' as a bad
  option and emits nothing — silently dropping the entire "Configuration"
  section of regression-summary.md. Switched the affected lines to
  printf -- 'FORMAT' so the format string is unambiguous.
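
  The guard can be seen in isolation in the small sketch below. The helper
  name md_bullet is hypothetical; the point is only that `--` ends option
  parsing, so a format string beginning with '- ' is taken as the format.

```shell
#!/usr/bin/env bash
# Hypothetical helper: emit a Markdown bullet safely.
# The "--" ends option parsing, so the leading "- " in the format
# string cannot be mistaken for an option.
md_bullet() {
  printf -- '- %s: `%s`\n' "$1" "$2"
}

md_bullet scope site
md_bullet ignore MSH.7
```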

Tool/slash surface deltas vs v0.6.7:
  Tools: 31 → 35 (+ssh_pull, +ssh_push, +ssh_pull_smat, +nc_diff_interface)
  Slash commands: 34 → 36 (+/nc-diff-env, +/nc-regression-env)

Updated tool descriptions for read_file, grep_files, nc_msgs to point at
ssh_pull / ssh_pull_smat as the cross-env pre-step so Larry-the-LLM picks
the right chain on the first attempt.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 15:52:58 -07:00

545 lines
25 KiB
Bash
Executable File

#!/usr/bin/env bash
# nc-regression.sh — Example 6 orchestrator: end-to-end regression testing
# between two Cloverleaf environments.
#
# Phases:
# 1. discover → list inbound threads in scope (uses nc-find-inbound)
# 2. sample → grab N messages per inbound from env-A smatdb → input .msgs files
# 3. route-A → run route_test on env-A for each inbound → captured outputs/env-a/...
# 4. route-B → run route_test on env-B with same inputs → captured outputs/env-b/...
# 5. diff → hl7-diff every paired output file with --ignore MSH.7 → per-pair report
# 6. summary → one master regression-summary.md compiling everything
#
# Phases 3 and 4 require Cloverleaf's route_test command on each box. The
# command is parameterized via --route-test-cmd with placeholders:
# {THREAD} → the inbound thread name
# {INPUT} → absolute path to the .msgs input file
# {OUTPUT_DIR} → absolute path where output files should land
# {HCIROOT} → the env's HCIROOT
# {HCISITE} → the env's HCISITE
# Default: not set — you must pass it once for your shop's invocation pattern.
#
# A common pattern: a wrapper script that sources the Cloverleaf profile and
# runs `<thread> route_test <INPUT>`, with output redirected to OUTPUT_DIR.
# Example template you might pass:
# --route-test-cmd 'cd {HCIROOT}/{HCISITE} && . ./.profile && {THREAD} route_test {INPUT} && cp *.out.* {OUTPUT_DIR}/'
#
# Usage:
# nc-regression.sh --scope <SCOPE>
# --count N
# --env-a HCIROOT_A --site-a SITENAME
# --env-b HCIROOT_B --site-b SITENAME
# --out DIR
# --route-test-cmd 'TEMPLATE'
# [--ignore "FIELDS"]
# [--include-fields "FIELDS"]
# [--phase 1|2|3|4|5|6|all|env-a|env-b]
# [--dry-run]
# [--inbound-mode tcp-listen|icl-or-file|all]
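# [--env-b-host HOST] [--env-b-user USER] # legacy: raw scp of inputs to env-B
# [--bundle-out FILE] [--bundle-in FILE] # tar up / unpack the env-A artifacts
# [--source-ssh-alias ALIAS] [--target-ssh-alias ALIAS] # cross-env via ssh-helper.sh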
#
# Scope formats:
# thread:NAME one specific thread
# threads:N1,N2,N3 comma-separated list
# site every inbound in the configured site
# server every inbound in every site under HCIROOT
set -o pipefail
NC_SELF="$0"
LIB_DIR="$(cd "$(dirname "$NC_SELF")" && pwd)"
NCP="$LIB_DIR/nc-parse.sh"
NCI="$LIB_DIR/nc-inbound.sh"
NCM="$LIB_DIR/nc-msgs.sh"
HL7DIFF="$LIB_DIR/hl7-diff.sh"
die() { printf 'nc-regression: %s\n' "$*" >&2; exit 1; }
say() { printf 'nc-regression: %s\n' "$*" >&2; }
SCOPE=""
COUNT=10
ENV_A=""
SITE_A=""
ENV_B=""
SITE_B=""
OUT=""
ROUTE_TEST_CMD=""
IGNORE="MSH.7"
INCLUDE=""
PHASE="all"
DRY_RUN=0
INBOUND_MODE="all"
ENV_B_HOST=""
ENV_B_USER=""
BUNDLE_OUT="" # after env-A phases, tar up the artifacts here
BUNDLE_IN="" # at start, untar a bundle here as the env-A artifacts
# v0.6.8: cross-env via ssh-helper.sh ControlMaster aliases. When set, phases
# 1-4 do their reads/writes against the named remote alias; phases 5-6 always
# run locally because all the artifacts live in $OUT (local).
SOURCE_SSH_ALIAS=""
TARGET_SSH_ALIAS=""
while [ $# -gt 0 ]; do
case "$1" in
--scope) shift; SCOPE="$1" ;;
--count) shift; COUNT="$1" ;;
--env-a) shift; ENV_A="$1" ;;
--site-a) shift; SITE_A="$1" ;;
--env-b) shift; ENV_B="$1" ;;
--site-b) shift; SITE_B="$1" ;;
--out) shift; OUT="$1" ;;
--route-test-cmd) shift; ROUTE_TEST_CMD="$1" ;;
--ignore) shift; IGNORE="$1" ;;
--include-fields) shift; INCLUDE="$1" ;;
--phase) shift; PHASE="$1" ;;
--dry-run) DRY_RUN=1 ;;
--inbound-mode) shift; INBOUND_MODE="$1" ;;
--env-b-host) shift; ENV_B_HOST="$1" ;;
--env-b-user) shift; ENV_B_USER="$1" ;;
--bundle-out) shift; BUNDLE_OUT="$1" ;;
--bundle-in) shift; BUNDLE_IN="$1" ;;
--source-ssh-alias) shift; SOURCE_SSH_ALIAS="$1" ;;
--target-ssh-alias) shift; TARGET_SSH_ALIAS="$1" ;;
-h|--help) sed -n '2,55p' "$NC_SELF"; exit 0 ;;
-*) die "unknown flag: $1" ;;
*) die "extra arg: $1" ;;
esac
shift
done
# Resolve ssh-helper.sh for cross-env support. Caller can pre-export
# LARRY_LIB_DIR; otherwise we look next to ourselves.
SSH_HELPER="${LARRY_LIB_DIR:-$LIB_DIR}/ssh-helper.sh"
[ -x "$SSH_HELPER" ] || SSH_HELPER="$LIB_DIR/ssh-helper.sh"
# Helper: run a command on a remote alias if non-empty, else locally.
# `$1`=alias (empty=local); rest=command (single string).
_run_on() {
local alias="$1"; shift
if [ -z "$alias" ]; then
bash -c "$*"
else
[ -x "$SSH_HELPER" ] || die "ssh-helper.sh not found at $SSH_HELPER (needed for --source/target-ssh-alias)"
"$SSH_HELPER" exec "$alias" "$*"
fi
}
[ -n "$OUT" ] || die "missing --out DIR"
# When --bundle-in is given, we don't need scope/env-a/etc. — the bundle has them.
if [ -z "$BUNDLE_IN" ]; then
[ -n "$SCOPE" ] || die "missing --scope (thread:NAME | threads:N1,N2 | site | server)"
[ -n "$ENV_A" ] || die "missing --env-a HCIROOT_A"
[ -n "$ENV_B" ] || die "missing --env-b HCIROOT_B"
# ENV_A directory check is only meaningful when SOURCE is local; same for ENV_B.
if [ -z "$SOURCE_SSH_ALIAS" ]; then
[ -d "$ENV_A" ] || die "env-a is not a directory: $ENV_A (and --source-ssh-alias unset)"
fi
fi
if [ -n "$SOURCE_SSH_ALIAS" ] || [ -n "$TARGET_SSH_ALIAS" ]; then
[ -x "$SSH_HELPER" ] || die "ssh-helper.sh required for cross-env mode but not found at $SSH_HELPER"
fi
case "$PHASE" in 1|2|3|4|5|6|all|env-a|env-b) ;; *) die "bad --phase (use 1|2|3|4|5|6|all|env-a|env-b)" ;; esac
[ "$DRY_RUN" = "1" ] || [ -n "$ROUTE_TEST_CMD" ] || say "WARNING: --route-test-cmd is unset; phases 3 and 4 will be skipped (you can run them manually using the generated input files)"
mkdir -p "$OUT" "$OUT/inputs" "$OUT/outputs/env-a" "$OUT/outputs/env-b" "$OUT/diff" 2>/dev/null
# If --bundle-in given, untar the bundle into $OUT first. Manifest tells us
# what env-A was and (optionally) what route_test command to use.
if [ -n "$BUNDLE_IN" ]; then
[ -f "$BUNDLE_IN" ] || die "bundle-in file not found: $BUNDLE_IN"
say "unpacking bundle $BUNDLE_IN into $OUT/"
tar -xzf "$BUNDLE_IN" -C "$OUT" 2>&1 | tail -5
if [ -f "$OUT/manifest.json" ]; then
say "manifest from env-A:"
cat "$OUT/manifest.json" >&2
# Pull scope and route-test-cmd hints if not overridden
if [ -z "$SCOPE" ] && command -v jq >/dev/null 2>&1; then
SCOPE=$(jq -r '.scope // ""' "$OUT/manifest.json")
fi
fi
fi
# ─────────────────────────────────────────────────────────────────────────────
# Phase 1: discover inbound threads in scope
# ─────────────────────────────────────────────────────────────────────────────
discover_inbounds() {
case "$SCOPE" in
thread:*) echo "${SCOPE#thread:}" ;;
threads:*) echo "${SCOPE#threads:}" | tr ',' '\n' ;;
site)
[ -n "$SITE_A" ] || die "scope=site requires --site-a"
if [ -n "$SOURCE_SSH_ALIAS" ]; then
# Pull the remote NetConfig locally first, then parse with our local NCI.
local remote_nc="$ENV_A/$SITE_A/NetConfig"
local local_nc; local_nc=$("$SSH_HELPER" pull "$SOURCE_SSH_ALIAS" "$remote_nc" 2>/dev/null | tail -1)
[ -n "$local_nc" ] && [ -f "$local_nc" ] || die "could not pull remote NetConfig $remote_nc from $SOURCE_SSH_ALIAS"
"$NCI" "$local_nc" --mode "$INBOUND_MODE" --format tsv \
| awk -F'\t' 'NR>1 {print $1}'
else
"$NCI" "$ENV_A/$SITE_A/NetConfig" --mode "$INBOUND_MODE" --format tsv \
| awk -F'\t' 'NR>1 {print $1}'
fi
;;
server)
if [ -n "$SOURCE_SSH_ALIAS" ]; then
# find NetConfigs remotely, then pull each and parse locally.
local ncs
ncs=$("$SSH_HELPER" exec "$SOURCE_SSH_ALIAS" "find $ENV_A -maxdepth 2 -name NetConfig -type f 2>/dev/null" 2>/dev/null \
| grep -v '^\[ssh_exec:' )
local nc local_nc
while IFS= read -r nc; do
[ -z "$nc" ] && continue
local_nc=$("$SSH_HELPER" pull "$SOURCE_SSH_ALIAS" "$nc" 2>/dev/null | tail -1)
[ -n "$local_nc" ] && [ -f "$local_nc" ] || continue
"$NCI" "$local_nc" --mode "$INBOUND_MODE" --format tsv \
| awk -F'\t' 'NR>1 {print $1}'
done <<< "$ncs"
else
while IFS= read -r nc; do
"$NCI" "$nc" --mode "$INBOUND_MODE" --format tsv \
| awk -F'\t' 'NR>1 {print $1}'
done < <(find "$ENV_A" -maxdepth 2 -name NetConfig -type f 2>/dev/null)
fi
;;
*) die "bad --scope: $SCOPE" ;;
esac
}
phase_1() {
say "=== PHASE 1: discover inbounds ==="
local inbounds; inbounds=$(discover_inbounds | sort -u | grep -v '^$')
if [ -z "$inbounds" ]; then
say "no inbounds found in scope $SCOPE"
return 1
fi
printf '%s\n' "$inbounds" > "$OUT/inbounds.txt"
local count; count=$(printf '%s\n' "$inbounds" | wc -l | tr -d ' ')
say "discovered $count inbound thread(s) → $OUT/inbounds.txt"
printf '%s\n' "$inbounds" | sed 's/^/ - /'
}
# ─────────────────────────────────────────────────────────────────────────────
# Phase 2: sample N messages per inbound from env-A's smatdbs
# ─────────────────────────────────────────────────────────────────────────────
phase_2() {
say "=== PHASE 2: sample $COUNT messages per inbound from env-A${SOURCE_SSH_ALIAS:+ via ssh:$SOURCE_SSH_ALIAS} ==="
[ -f "$OUT/inbounds.txt" ] || { say "no inbounds file from phase 1; running phase 1 first"; phase_1 || return 1; }
local thread sitedir
local count=0
while IFS= read -r thread; do
[ -z "$thread" ] && continue
local input="$OUT/inputs/${thread}.msgs"
if [ -n "$SOURCE_SSH_ALIAS" ]; then
# Remote sampling: use ssh_pull_smat in sampled mode. ssh-helper's pull-smat
# requires a site name, so when --site-a is unset we make a best-effort
# attempt to infer the site from a remote find for the thread's .smatdb.
# If no site can be inferred, the thread is skipped.
if [ "$DRY_RUN" = "1" ]; then
say " [dry-run] would pull-smat $thread (sampled days=14) from $SOURCE_SSH_ALIAS → $input"
else
local site_for_smat="${SITE_A:-}"
if [ -z "$site_for_smat" ]; then
# Try to locate the .smatdb path remotely and infer the site.
local remote_smatdb
remote_smatdb=$("$SSH_HELPER" exec "$SOURCE_SSH_ALIAS" \
"find $ENV_A -maxdepth 5 -name ${thread}.smatdb -type f 2>/dev/null | head -1" 2>/dev/null \
| grep -v '^\[ssh_exec:' | head -1)
if [ -n "$remote_smatdb" ]; then
# site = first dir component under $ENV_A
site_for_smat=$(printf '%s' "$remote_smatdb" | sed -e "s#^${ENV_A}/##" -e 's#/.*##')
fi
fi
if [ -n "$site_for_smat" ]; then
# Pull recent (last 14d) messages as TSV+b64; decode locally into the
# input file separated by 0x1c (matching nc-msgs --format raw output).
local sample_tsv; sample_tsv=$(mktemp)
"$SSH_HELPER" pull-smat "$SOURCE_SSH_ALIAS" "$site_for_smat" "$thread" 14 > "$sample_tsv" 2>/dev/null
# decode b64 column 6, separate messages with 0x1c
: > "$input"
local got=0
while IFS=$'\t' read -r unix_ts dir typ src dst b64; do
[ -z "$b64" ] && continue
printf '%s' "$b64" | base64 -d >> "$input" 2>/dev/null && {
printf '\x1c' >> "$input"
got=$((got+1))
[ "$got" -ge "$COUNT" ] && break
}
done < "$sample_tsv"
rm -f "$sample_tsv"
say " sampled $thread (remote, site=$site_for_smat) → $input ($got messages)"
else
say " skip $thread: could not infer remote site (set --site-a)"
continue
fi
fi
else
# Local sampling (original behaviour).
sitedir=""
if [ -n "$SITE_A" ]; then
sitedir="$ENV_A/$SITE_A"
else
sitedir=$(find "$ENV_A" -maxdepth 4 -name "${thread}.smatdb" -type f 2>/dev/null | head -1 | xargs -I{} dirname {} | sed "s#/exec/processes/.*##")
[ -z "$sitedir" ] && { say " skip $thread: smatdb not found under $ENV_A"; continue; }
fi
if [ "$DRY_RUN" = "1" ]; then
say " [dry-run] would sample $COUNT msgs from $thread → $input"
else
HCISITEDIR="$sitedir" "$NCM" "$thread" --limit "$COUNT" --format raw > "$input" 2>/dev/null
local got; got=$(tr -cd $'\x1c' < "$input" | wc -c | tr -d ' ')
say " sampled $thread → $input ($got messages)"
fi
fi
count=$((count+1))
done < "$OUT/inbounds.txt"
say "phase 2 done: $count thread(s) processed"
}
# ─────────────────────────────────────────────────────────────────────────────
# Phase 3 / 4: execute route_test on each env
# ─────────────────────────────────────────────────────────────────────────────
render_cmd() {
local tmpl="$1" thread="$2" input="$3" outdir="$4" hciroot="$5" hcisite="$6"
local cmd="$tmpl"
cmd="${cmd//\{THREAD\}/$thread}"
cmd="${cmd//\{INPUT\}/$input}"
cmd="${cmd//\{OUTPUT_DIR\}/$outdir}"
cmd="${cmd//\{HCIROOT\}/$hciroot}"
cmd="${cmd//\{HCISITE\}/$hcisite}"
printf '%s' "$cmd"
}
phase_routes() {
local label="$1" hciroot="$2" hcisite="$3" ssh_alias="${4:-}"
if [ -n "$ssh_alias" ]; then
say "=== PHASE ${label}: route_test on env-${label} via ssh:${ssh_alias} ==="
else
say "=== PHASE ${label}: route_test on env-${label} (local) ==="
fi
if [ -z "$ROUTE_TEST_CMD" ]; then
say "no --route-test-cmd; skipping phase ${label}"
say "to run manually, use the input files at $OUT/inputs/*.msgs"
return 0
fi
local thread
while IFS= read -r thread; do
[ -z "$thread" ] && continue
local input="$OUT/inputs/${thread}.msgs"
local outdir="$OUT/outputs/env-${label}/${thread}"
[ -f "$input" ] || { say " skip $thread: no input file"; continue; }
mkdir -p "$outdir"
if [ -n "$ssh_alias" ]; then
# Cross-env: push the input to a deterministic remote staging path,
# render the route_test cmd with REMOTE paths, run via ssh_exec, then
# ssh_pull the output files back into the local outdir.
local remote_input="/tmp/larry-regress/inputs/${thread}.msgs"
local remote_outdir="/tmp/larry-regress/outputs/env-${label}/${thread}"
local cmd; cmd=$(render_cmd "$ROUTE_TEST_CMD" "$thread" "$remote_input" "$remote_outdir" "$hciroot" "$hcisite")
if [ "$DRY_RUN" = "1" ]; then
say " [dry-run] $thread (remote):"
say " ssh_push $input → $ssh_alias:$remote_input"
say " ssh_exec $ssh_alias: $cmd"
say " ssh_pull $ssh_alias:$remote_outdir/* → $outdir"
continue
fi
say " $thread (remote on $ssh_alias):"
"$SSH_HELPER" exec "$ssh_alias" "mkdir -p $remote_outdir $(dirname "$remote_input")" >/dev/null 2>&1 || true
"$SSH_HELPER" push "$ssh_alias" "$input" "$remote_input" >/dev/null 2>&1 || { say " push failed for $thread"; continue; }
say " \$ $cmd"
"$SSH_HELPER" exec "$ssh_alias" "$cmd" 2>&1 | sed 's/^/ /' || say " (route_test exit non-zero — continuing)"
# Pull every file from remote_outdir back to local outdir.
local out_listing
out_listing=$("$SSH_HELPER" exec "$ssh_alias" "find $remote_outdir -maxdepth 1 -type f 2>/dev/null" 2>/dev/null | grep -v '^\[ssh_exec:')
local rf
while IFS= read -r rf; do
[ -z "$rf" ] && continue
"$SSH_HELPER" pull "$ssh_alias" "$rf" "$outdir/$(basename "$rf")" >/dev/null 2>&1 || say " pull failed: $rf"
done <<< "$out_listing"
else
local cmd; cmd=$(render_cmd "$ROUTE_TEST_CMD" "$thread" "$input" "$outdir" "$hciroot" "$hcisite")
if [ "$DRY_RUN" = "1" ]; then
say " [dry-run] $thread:"
say " \$ $cmd"
else
say " $thread:"
say " \$ $cmd"
bash -c "$cmd" 2>&1 | sed 's/^/ /' || say " (route_test exit non-zero — continuing)"
fi
fi
done < "$OUT/inbounds.txt"
}
phase_3() { phase_routes "a" "$ENV_A" "$SITE_A" "$SOURCE_SSH_ALIAS"; }
phase_4() {
# Phase 4: if target is remote via ssh_alias, phase_routes handles the push.
# Legacy --env-b-host path (raw scp) is preserved for environments where the
# ssh-helper master isn't open.
if [ -z "$TARGET_SSH_ALIAS" ] && [ -n "$ENV_B_HOST" ]; then
say "copying input files to ${ENV_B_USER:-$USER}@${ENV_B_HOST}:${OUT}/inputs/"
if [ "$DRY_RUN" = "1" ]; then
say " [dry-run] scp -r $OUT/inputs/ ${ENV_B_USER:-$USER}@${ENV_B_HOST}:${OUT}/"
else
ssh "${ENV_B_USER:-$USER}@${ENV_B_HOST}" "mkdir -p $OUT/inputs $OUT/outputs/env-b" || true
scp -r "$OUT/inputs/" "${ENV_B_USER:-$USER}@${ENV_B_HOST}:${OUT}/" || say "scp failed; you'll need to copy manually"
fi
fi
phase_routes "b" "$ENV_B" "$SITE_B" "$TARGET_SSH_ALIAS"
}
# ─────────────────────────────────────────────────────────────────────────────
# Phase 5: diff outputs pair-by-pair
# ─────────────────────────────────────────────────────────────────────────────
phase_5() {
say "=== PHASE 5: diff env-a vs env-b outputs ==="
local diff_index="$OUT/diff/_index.md"
{
printf '# Regression diff index\n\n'
# NOTE: format string starts with '- ', so use printf '--' separator —
# otherwise bash 3.2's printf (macOS default) reads the leading '-' as a
# bad option and emits nothing. This was a latent bug pre-v0.6.8.
printf -- '- env-A: `%s`%s\n- env-B: `%s`%s\n- scope: `%s`\n- count: %s msgs per inbound\n- ignore: `%s`\n%s\n\n' \
"$ENV_A" "$([ -n "$SOURCE_SSH_ALIAS" ] && printf ' (via ssh:%s)' "$SOURCE_SSH_ALIAS")" \
"$ENV_B" "$([ -n "$TARGET_SSH_ALIAS" ] && printf ' (via ssh:%s)' "$TARGET_SSH_ALIAS")" \
"$SCOPE" "$COUNT" "$IGNORE" \
"$([ -n "$INCLUDE" ] && printf -- '- include-only: `%s`' "$INCLUDE")"
printf '| thread | dest | diffs | report |\n|---|---|---|---|\n'
} > "$diff_index"
local total_diff=0 total_pairs=0
local thread destfile destname
while IFS= read -r thread; do
[ -z "$thread" ] && continue
local a_dir="$OUT/outputs/env-a/${thread}"
local b_dir="$OUT/outputs/env-b/${thread}"
[ -d "$a_dir" ] || { say " skip $thread: no env-a outputs"; continue; }
[ -d "$b_dir" ] || { say " skip $thread: no env-b outputs"; continue; }
# Pair up by filename
while IFS= read -r destfile; do
destname=$(basename "$destfile")
local b_pair="$b_dir/$destname"
total_pairs=$((total_pairs+1))
if [ ! -f "$b_pair" ]; then
echo "| \`$thread\` | \`$destname\` | (missing on env-b) | — |" >> "$diff_index"
continue
fi
local report="$OUT/diff/${thread}.${destname}.md"
local count
if [ "$DRY_RUN" = "1" ]; then
echo " [dry-run] would diff $destfile vs $b_pair → $report"
echo "| \`$thread\` | \`$destname\` | [dry-run] | — |" >> "$diff_index"
continue
fi
local diff_args=(--ignore "$IGNORE")
[ -n "$INCLUDE" ] && diff_args+=(--include-fields "$INCLUDE")
count=$("$HL7DIFF" "${diff_args[@]}" --format count "$destfile" "$b_pair" 2>/dev/null || echo "?")
{
printf '# Diff: %s → %s\n\n- env-A: `%s`\n- env-B: `%s`\n\n' "$thread" "$destname" "$destfile" "$b_pair"
"$HL7DIFF" "${diff_args[@]}" --format text "$destfile" "$b_pair" 2>/dev/null || true
} > "$report"
echo "| \`$thread\` | \`$destname\` | $count | [report](./$(basename "$report")) |" >> "$diff_index"
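# Only add numeric counts; a failed hl7-diff run is recorded as "?" and
# would otherwise break the arithmetic expansion.
case "$count" in ''|*[!0-9]*) ;; *) total_diff=$((total_diff + count)) ;; esac
say " $thread → $destname: $count diff(s)"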
done < <(find "$a_dir" -maxdepth 1 -type f 2>/dev/null)
done < "$OUT/inbounds.txt"
{
printf '\n## Summary\n\n- pairs compared: %s\n- total field differences (post-ignore): %s\n' "$total_pairs" "$total_diff"
} >> "$diff_index"
say "phase 5 done: $total_pairs pairs compared, $total_diff total diffs"
say "index: $diff_index"
}
# ─────────────────────────────────────────────────────────────────────────────
# Phase 6: master summary
# ─────────────────────────────────────────────────────────────────────────────
phase_6() {
say "=== PHASE 6: master regression-summary.md ==="
local summary="$OUT/regression-summary.md"
{
printf '# Regression test summary\n\n'
printf 'Generated: %s\n\n' "$(date -Iseconds 2>/dev/null || date)"
printf '## Configuration\n\n'
# printf '--' guard for leading '-' format strings (bash 3.2 / macOS).
printf -- '- scope: `%s`\n' "$SCOPE"
printf -- '- count: %s messages per inbound\n' "$COUNT"
printf -- '- env-A: `%s` (site=%s)%s\n' "$ENV_A" "${SITE_A:-auto}" \
"$([ -n "$SOURCE_SSH_ALIAS" ] && printf ' via ssh:%s' "$SOURCE_SSH_ALIAS")"
printf -- '- env-B: `%s` (site=%s)%s\n' "$ENV_B" "${SITE_B:-auto}" \
"$([ -n "$TARGET_SSH_ALIAS" ] && printf ' via ssh:%s' "$TARGET_SSH_ALIAS")"
printf -- '- ignore: `%s`\n' "$IGNORE"
[ -n "$INCLUDE" ] && printf -- '- include-only: `%s`\n' "$INCLUDE"
printf '\n## Inbounds tested\n\n'
[ -f "$OUT/inbounds.txt" ] && awk '{print "- `" $0 "`"}' "$OUT/inbounds.txt"
printf '\n## Inputs\n\n'
find "$OUT/inputs" -maxdepth 1 -type f 2>/dev/null \
| awk '{print "- `" $0 "`"}' || true
printf '\n## Output directories\n\n'
printf -- '- env-A: `%s`\n' "$OUT/outputs/env-a"
printf -- '- env-B: `%s`\n' "$OUT/outputs/env-b"
printf '\n## Diff details\n\n'
printf 'See [diff/_index.md](./diff/_index.md) for the per-pair table.\n'
} > "$summary"
say "summary: $summary"
}
# ─────────────────────────────────────────────────────────────────────────────
# Dispatch
# ─────────────────────────────────────────────────────────────────────────────
case "$PHASE" in
1) phase_1 ;;
2) phase_1 && phase_2 ;;
3) phase_3 ;;
4) phase_4 ;;
5) phase_5 ;;
6) phase_6 ;;
all) phase_1 && phase_2 && phase_3 && phase_4 && phase_5 && phase_6 ;;
env-a) phase_1 && phase_2 && phase_3 ;; # everything that uses env-A
env-b) phase_4 && phase_5 && phase_6 ;; # everything that uses env-B + diff
esac
# Optional: produce a portable bundle of the env-A artifacts (inputs + a-outputs)
# so the user can move them to the env-B box manually.
if [ -n "$BUNDLE_OUT" ]; then
say "producing bundle: $BUNDLE_OUT"
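# Escape backslashes and double quotes so string values stay valid JSON
# (route_test templates often contain quotes).
_json_esc() { printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'; }
{
printf '{'
printf '"generated":"%s",' "$(date -Iseconds 2>/dev/null || date)"
printf '"host":"%s",' "$(hostname 2>/dev/null || echo unknown)"
printf '"env_a":"%s",' "$(_json_esc "$ENV_A")"
printf '"site_a":"%s",' "$(_json_esc "$SITE_A")"
printf '"env_b_expected":"%s",' "$(_json_esc "$ENV_B")"
printf '"site_b_expected":"%s",' "$(_json_esc "$SITE_B")"
printf '"scope":"%s",' "$(_json_esc "$SCOPE")"
printf '"count":%s,' "$COUNT"
printf '"ignore":"%s",' "$(_json_esc "$IGNORE")"
printf '"route_test_cmd_hint":"%s"' "$(_json_esc "$ROUTE_TEST_CMD")"
printf '}\n'
} > "$OUT/manifest.json"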
cat > "$OUT/README.md" <<EOF
# Regression bundle — env-A artifacts
Take this bundle to the env-B box and run:
\`\`\`
nc-regression.sh --bundle-in $(basename "$BUNDLE_OUT") --out $OUT \\
--env-b /path/to/env-b/integrator --site-b <site> \\
--route-test-cmd '<env-b route_test command>' \\
--phase env-b
\`\`\`
The bundle contains:
- inputs/ — sampled messages from env-A (one .msgs per inbound)
- outputs/env-a/ — route_test outputs from env-A
- manifest.json — env-A metadata
- inbounds.txt — the threads tested
env-B side will produce outputs/env-b/ and the diff/ tree.
EOF
tar -czf "$BUNDLE_OUT" -C "$OUT" inputs outputs/env-a inbounds.txt manifest.json README.md 2>/dev/null
say "bundle ready: $BUNDLE_OUT ($(du -h "$BUNDLE_OUT" | awk '{print $1}'))"
fi
say "regression run done. Output root: $OUT"