From adc196ac203418731ed12202e5c14df05afae5b7 Mon Sep 17 00:00:00 2001
From: Chaim
Date: Sun, 31 May 2026 10:51:31 +0000
Subject: [PATCH] =?UTF-8?q?docs(plan):=20FU-8a=20process=E2=86=92code=20gu?=
 =?UTF-8?q?ards=20implementation=20plan=20(3=20tasks,=20TDD)?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-Authored-By: Claude Opus 4.8 (1M context)
---
 .../2026-05-31-fu8a-process-to-code-guards.md | 326 ++++++++++++++++++
 1 file changed, 326 insertions(+)
 create mode 100644 docs/superpowers/plans/2026-05-31-fu8a-process-to-code-guards.md

diff --git a/docs/superpowers/plans/2026-05-31-fu8a-process-to-code-guards.md b/docs/superpowers/plans/2026-05-31-fu8a-process-to-code-guards.md
new file mode 100644
index 0000000..bf9f852
--- /dev/null
+++ b/docs/superpowers/plans/2026-05-31-fu8a-process-to-code-guards.md
@@ -0,0 +1,326 @@
+# FU-8a: Process→Code Guards — Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Make two process barriers enforceable in code: `sync_agents_across_companies.py --verify` exits non-zero on any drift (incl. adapter_type mismatch, loud not silent), and a fitness-function test fails the suite if the repo gains raw Paperclip HTTP calls or direct `agent_wakeup_requests` inserts.
+
+**Architecture:** GAP-21 — extract the drift loop into a pure `build_drift_report(...)` and a pure `_verify_exit_code(...)`, then make `--verify` exit `1` on drift. GAP-22 — a self-contained pytest fitness function that scans `web/`, `mcp-server/src/`, `scripts/` for forbidden Paperclip-access patterns with an explicit allowlist. Both pure-code; repo pre-scanned clean (0 existing violations).
+
+**Tech Stack:** Python 3.12, asyncpg (sync script), pytest offline, `.venv` at `mcp-server/.venv`.
+
+**Spec:** [docs/superpowers/specs/2026-05-31-fu8a-process-to-code-guards-design.md](../specs/2026-05-31-fu8a-process-to-code-guards-design.md)
+
+**Run tests:** `cd ~/legal-ai/mcp-server && .venv/bin/python -m pytest tests/test_sync_verify_gate.py tests/test_paperclip_access_guard.py -v`
+
+---
+
+## File Structure
+
+- **Modify** `scripts/sync_agents_across_companies.py` — extract `build_drift_report(...)` + `_verify_exit_code(...)` (pure); `--verify` exits non-zero on drift; adapter_type mismatch + missing-in-mirror counted as drift.
+- **Create** `mcp-server/tests/test_sync_verify_gate.py` — offline tests for the two pure functions (imports the script via importlib, like the FU-2b test).
+- **Create** `mcp-server/tests/test_paperclip_access_guard.py` — the fitness-function guard (scan + fixtures + real-repo assertion).
+- **Modify** `scripts/SCRIPTS.md` — note the new `--verify` gate semantics.
+
+---
+
+## Task 1: GAP-21 — `--verify` becomes an enforceable drift gate
+
+**Files:** Modify `scripts/sync_agents_across_companies.py`; Create `mcp-server/tests/test_sync_verify_gate.py`
+
+- [ ] **Step 1: Write the failing tests**
+
+```python
+"""FU-8a / GAP-21: sync --verify drift-gate logic (offline)."""
+from __future__ import annotations
+
+import importlib.util
+from pathlib import Path
+
+_SCRIPT = Path(__file__).resolve().parents[2] / "scripts" / "sync_agents_across_companies.py"
+_spec = importlib.util.spec_from_file_location("sync_agents", _SCRIPT)
+sync = importlib.util.module_from_spec(_spec)
+_spec.loader.exec_module(sync)
+
+
+def _agent(name, adapter="claude_code", cfg=None):
+    return {"id": f"id-{name}", "name": name, "adapter_type": adapter,
+            "adapter_config": cfg or {"model": "x"}, "runtime_config": {}, "metadata": {},
+            "budget_monthly_cents": 0, "icon": "", "title": "", "role": "", "agent_api_keys": []}
+
+
+def test_verify_exit_code_clean_is_zero():
+    assert sync._verify_exit_code(plan=[], mismatches=[], missing=[]) == 0
+
+
+def test_verify_exit_code_drift_is_nonzero():
+    assert sync._verify_exit_code(plan=[("m", "mi", {"x": 1})], mismatches=[], missing=[]) == 1
+
+
+def test_verify_exit_code_adapter_mismatch_is_nonzero():
+    # adapter_type mismatch must count as drift (not silent skip)
+    assert sync._verify_exit_code(plan=[], mismatches=["עוזר משפטי"], missing=[]) == 1
+
+
+def test_verify_exit_code_missing_is_nonzero():
+    assert sync._verify_exit_code(plan=[], mismatches=[], missing=["סוכן"]) == 1
+
+
+def test_build_drift_report_flags_adapter_mismatch():
+    master = [_agent("A", adapter="claude_code")]
+    mirror_by_name = {"A": _agent("A", adapter="deepseek_local")}
+    rep = sync.build_drift_report(master, mirror_by_name, mirror_skills=set(), only=None)
+    assert "A" in rep["mismatches"]
+    assert rep["plan"] == []  # mismatch short-circuits the diff
+
+
+def test_build_drift_report_flags_missing_and_plan():
+    master = [_agent("A"), _agent("B")]
+    # A missing in mirror; B present but differing config
+    mirror_by_name = {"B": _agent("B", cfg={"model": "different"})}
+    rep = sync.build_drift_report(master, mirror_by_name, mirror_skills=set(), only=None)
+    assert "A" in rep["missing"]
+    assert any(p[0]["name"] == "B" for p in rep["plan"])
+```
+
+- [ ] **Step 2: Run to verify it fails**
+
+Run: `cd ~/legal-ai/mcp-server && .venv/bin/python -m pytest tests/test_sync_verify_gate.py -v`
+Expected: FAIL — `AttributeError: module 'sync_agents' has no attribute '_verify_exit_code'` / `build_drift_report`.
+(Note: the script imports `asyncpg` at module top — confirm it imports cleanly under importlib; it does not connect at import time.)
+
+- [ ] **Step 3: Add the two pure functions**
+
+In `scripts/sync_agents_across_companies.py`, add ABOVE `async def main()`:
+
+```python
+def build_drift_report(master_agents, mirror_by_name, mirror_skills, only=None) -> dict:
+    """Pure drift computation (no DB, no printing). Returns:
+    {"plan": [(master, mirror, diff), ...], "mismatches": [name, ...], "missing": [name, ...]}.
+    adapter_type mismatch and missing-in-mirror are recorded as drift, not skipped silently.
+    """
+    plan, mismatches, missing = [], [], []
+    for m in master_agents:
+        if only and m["name"] != only:
+            continue
+        mirror = mirror_by_name.get(m["name"])
+        if not mirror:
+            missing.append(m["name"])
+            continue
+        if m["adapter_type"] != mirror["adapter_type"]:
+            mismatches.append(m["name"])
+            continue
+        diff = compute_diff(m, mirror, mirror_skills)
+        if diff:
+            plan.append((m, mirror, diff))
+    return {"plan": plan, "mismatches": mismatches, "missing": missing}
+
+
+def _verify_exit_code(plan, mismatches, missing) -> int:
+    """0 iff fully in sync; 1 if any drift (needs-sync / adapter mismatch / missing-in-mirror)."""
+    return 1 if (plan or mismatches or missing) else 0
+```
+
+- [ ] **Step 4: Rewire `main()`'s drift loop + `--verify` to use them**
+
+In `main()`, REPLACE the inline drift loop (the `plan = []` block through the `for m in master_agents:` loop that builds `plan`) with:
+
+```python
+    print("=== Drift report ===")
+    report = build_drift_report(master_agents, mirror_by_name, mirror_skills, only=args.only)
+    plan = report["plan"]
+    for name in report["missing"]:
+        print(f" ⚠ {name:14s} — NOT FOUND in mirror (we never auto-create) — DRIFT")
+    for name in report["mismatches"]:
+        m = next(a for a in master_agents if a["name"] == name)
+        mi = mirror_by_name[name]
+        print(f" ❌ {name:14s} — adapter_type mismatch ({m['adapter_type']} vs {mi['adapter_type']}) "
+              f"— DRIFT (apply skips it; fix manually in both companies)")
+    for master, mirror, diff in plan:
+        print_diff(master["name"], diff, master["id"], mirror["id"])
+```
+
+And REPLACE the `if args.verify:` block with:
+
+```python
+    if args.verify:
+        code = _verify_exit_code(plan, report["mismatches"], report["missing"])
+        total_drift = len(plan) + len(report["mismatches"]) + len(report["missing"])
+        print(f"\nSummary: {len(plan)} need sync, {len(report['mismatches'])} adapter-mismatch, "
+              f"{len(report['missing'])} missing-in-mirror → {'DRIFT' if code else 'IN SYNC'}")
+        sys.exit(code)
+```
+
+(The `--apply` path still uses `plan` and still does NOT touch adapter_type-mismatch agents — only `--verify`'s exit code changes + the loud reporting.)
+
+- [ ] **Step 5: Run tests + import check**
+
+Run: `cd ~/legal-ai/mcp-server && .venv/bin/python -m pytest tests/test_sync_verify_gate.py -v` → all PASS.
+Run: `mcp-server/.venv/bin/python -c "import importlib.util,pathlib; p=pathlib.Path('scripts/sync_agents_across_companies.py'); s=importlib.util.spec_from_file_location('s',p); m=importlib.util.module_from_spec(s); s.loader.exec_module(m); print('imports')"` (from repo root) → `imports`.
+
+- [ ] **Step 6: Commit**
+
+```bash
+cd ~/legal-ai
+git add scripts/sync_agents_across_companies.py mcp-server/tests/test_sync_verify_gate.py
+git commit -m "feat(sync): --verify exits non-zero on drift; adapter mismatch = loud drift (GAP-21, FU-8a)"
+```
+
+---
+
+## Task 2: GAP-22 — Paperclip-access fitness function
+
+**Files:** Create `mcp-server/tests/test_paperclip_access_guard.py`
+
+- [ ] **Step 1: Write the guard + its tests**
+
+```python
+"""FU-8a / GAP-22: fitness function — forbid un-sanctioned Paperclip access.
+
+Fails if any scanned source (outside the allowlist) reaches the Paperclip API
+with a raw HTTP client or inserts directly into agent_wakeup_requests. The
+sanctioned paths are web/paperclip_api.py::pc_request (Python) and scripts/pc.sh
+(bash); wakeup must go through POST /api/agents/{id}/wakeup.
+"""
+from __future__ import annotations
+
+import re
+from pathlib import Path
+
+import pytest
+
+REPO = Path(__file__).resolve().parents[2]
+SCAN_ROOTS = [REPO / "web", REPO / "mcp-server" / "src", REPO / "scripts"]
+
+# Files exempt from the HTTP-to-Paperclip rule (the sanctioned helpers + legacy DB-read client).
+ALLOWLIST = {
+    REPO / "web" / "paperclip_api.py",     # the sanctioned pc_request helper
+    REPO / "scripts" / "pc.sh",            # the sanctioned bash wrapper
+    REPO / "web" / "paperclip_client.py",  # legacy: DB reads only (no raw http, no wakeup insert)
+}
+
+_PC_URL = re.compile(r"PAPERCLIP_API_URL|127\.0\.0\.1:3100|localhost:3100|pc\.nautilus\.marcusgroup\.org")
+_HTTP_CLIENT = re.compile(r"\bhttpx\b|\brequests\.(get|post|put|patch|delete)\b|\baiohttp\b|\bcurl\b")
+_WAKEUP_INSERT = re.compile(r"insert\s+into\s+agent_wakeup_requests", re.IGNORECASE)
+
+
+def _scan_text(text: str) -> list[str]:
+    """Return violation reasons for a single file's text."""
+    reasons = []
+    if _WAKEUP_INSERT.search(text):
+        reasons.append("direct INSERT INTO agent_wakeup_requests — use the wakeup API")
+    # raw HTTP to Paperclip: both a paperclip-URL token and an http-client token present
+    if _PC_URL.search(text) and _HTTP_CLIENT.search(text):
+        reasons.append("raw HTTP client + Paperclip URL — use web/paperclip_api.pc_request or scripts/pc.sh")
+    return reasons
+
+
+def _iter_source_files():
+    for root in SCAN_ROOTS:
+        if not root.exists():
+            continue
+        for ext in ("*.py", "*.sh"):
+            for f in root.rglob(ext):
+                if f in ALLOWLIST or "/.venv/" in str(f) or "/tests/" in str(f):
+                    continue
+                yield f
+
+
+def find_violations() -> list[tuple[str, str]]:
+    out = []
+    for f in _iter_source_files():
+        try:
+            text = f.read_text(encoding="utf-8")
+        except (UnicodeDecodeError, OSError):
+            continue
+        for reason in _scan_text(text):
+            out.append((str(f.relative_to(REPO)), reason))
+    return out
+
+
+# ── the guard catches positives, ignores sanctioned negatives ──────────
+def test_scan_flags_raw_http_to_paperclip():
+    bad = 'import httpx\nasync def f():\n await httpx.post(f"{PAPERCLIP_API_URL}/x")\n'
+    assert _scan_text(bad)
+
+
+def test_scan_flags_wakeup_insert():
+    bad = "await conn.execute('INSERT INTO agent_wakeup_requests (id) VALUES ($1)', x)"
+    assert _scan_text(bad)
+
+
+def test_scan_ignores_sanctioned_helper_shape():
+    ok = 'url = f"{PAPERCLIP_API_URL}{path}"\n# the only place httpx is allowed for paperclip\n'
+    # this shape WOULD flag if not allowlisted — proving the allowlist is what protects it
+    assert _scan_text(ok)  # raw text matches; the file is protected by ALLOWLIST, not by content
+
+
+def test_scan_ignores_plain_code():
+    assert _scan_text("def add(a, b):\n return a + b\n") == []
+
+
+# ── the real repo must be clean (pre-scanned 2026-05-31: 0 violations) ──
+def test_repo_has_no_paperclip_access_violations():
+    violations = find_violations()
+    assert violations == [], "Un-sanctioned Paperclip access found:\n" + "\n".join(
+        f" {f}: {r}" for f, r in violations)
+```
+
+- [ ] **Step 2: Run the guard tests**
+
+Run: `cd ~/legal-ai/mcp-server && .venv/bin/python -m pytest tests/test_paperclip_access_guard.py -v`
+Expected: ALL PASS — including `test_repo_has_no_paperclip_access_violations` (repo is clean).
+If `test_repo_has_no_paperclip_access_violations` FAILS, it found a real violation: either fix the offending code to use the sanctioned helper, or (if it's a genuine sanctioned location) add it to `ALLOWLIST` with a comment justifying it. Report any such case.
+
+- [ ] **Step 3: Commit**
+
+```bash
+cd ~/legal-ai
+git add mcp-server/tests/test_paperclip_access_guard.py
+git commit -m "feat(guard): fitness function blocking raw Paperclip access (GAP-22, FU-8a)"
+```
+
+---
+
+## Task 3: SCRIPTS.md + full suite + smoke + PR
+
+- [ ] **Step 1: Note the `--verify` gate semantics in SCRIPTS.md**
+
+In `scripts/SCRIPTS.md`, in the `sync_agents_across_companies.py` row, append to its Purpose cell: "**`--verify` יוצא exit≠0 על drift** (כולל adapter_type-mismatch — מדווח רם, נספר כ-drift) — שמיש כ-gate ל-cron/CI (GAP-21/FU-8a)." (In English: "`--verify` exits non-zero on drift, including adapter_type mismatch, which is reported loudly and counted as drift; usable as a gate for cron/CI.")
+
+- [ ] **Step 2: Full offline suite**
+
+Run: `cd ~/legal-ai/mcp-server && .venv/bin/python -m pytest tests/ -q`
+Expected: all pass (prior suite + the new GAP-21/GAP-22 tests). Report the summary line.
+
+- [ ] **Step 3: Smoke — run `--verify` against the live Paperclip DB (read-only)**
+
+```bash
+cd ~/legal-ai && set -a && source ~/.env 2>/dev/null && set +a
+PAPERCLIP_BOARD_API_KEY="${PAPERCLIP_BOARD_API_KEY:-}" \
+  /home/chaim/legal-ai/mcp-server/.venv/bin/python scripts/sync_agents_across_companies.py --verify; echo "exit=$?"
+```
+Report the output + exit code. Expected: prints a drift report; `exit=0` if agents are in sync, `exit=1` if drift exists (either is a valid result — it proves the gate works). The script only READS in `--verify` (no mutation).
+(If the script needs `PAPERCLIP_DB_URL`/board key and they're absent, report that the smoke needs the Paperclip env; the offline unit tests already validate the gate logic.)
+
+- [ ] **Step 4: Commit + PR**
+
+```bash
+cd ~/legal-ai
+git add scripts/SCRIPTS.md
+git commit -m "docs(scripts): note sync --verify drift-gate semantics (FU-8a)"
+git push -u origin fix/fu8a-process-to-code-guards
+```
+Create the PR via the Gitea REST API (token from `~/.git-credentials`) and merge per the standing PR+merge rule.
+
+- [ ] **Step 5: TaskMaster #66 → done** (controller; verify via MCP). GAP-23 remains in #69.
+
+---
+
+## Self-Review Notes
+
+- **GAP-21** → Task 1: `build_drift_report` + `_verify_exit_code` (pure, tested); `--verify` exits 1 on drift; adapter mismatch loud + counted. `--apply` behavior unchanged.
+- **GAP-22** → Task 2: fitness function; tested on positive fixtures + sanctioned negatives + the real repo (clean). Allowlist explicit (`paperclip_api.py`, `pc.sh`, legacy `paperclip_client.py`).
+- **Repo pre-scanned clean** — Task 2 Step 2's repo assertion passes today; if it ever fails, that's the guard doing its job.
+- **No production-data risk** — pure-code; smoke `--verify` is read-only.
+- **Type consistency:** `build_drift_report(...)->{plan,mismatches,missing}`, `_verify_exit_code(plan,mismatches,missing)->int`, `find_violations()->[(file,reason)]`, `_scan_text(text)->[reason]` — names match across tasks + tests.
+- **GAP-23 out of scope** (#69 / FU-8b).
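
Because `--verify` reports drift through its exit status, a cron job or CI step can gate on it without parsing the printed report. The following is a minimal sketch of such a caller, not part of the plan: `gate_passes` is a hypothetical helper name, and the two stand-in commands only mimic the script's `exit=0` and `exit=1` outcomes so the example runs without a Paperclip database.

```python
import subprocess
import sys


def gate_passes(cmd: list[str]) -> bool:
    """Run a verify-style command and report whether it exited 0 (in sync).

    Hypothetical helper for illustration; any non-zero exit is treated as drift.
    """
    result = subprocess.run(cmd, check=False)
    return result.returncode == 0


# Stand-in commands that mimic the two outcomes of `--verify`.
# A real caller would pass the interpreter, the script path, and "--verify".
in_sync = [sys.executable, "-c", "raise SystemExit(0)"]
drifted = [sys.executable, "-c", "raise SystemExit(1)"]

print(gate_passes(in_sync))   # True
print(gate_passes(drifted))   # False
```

Treating every non-zero status as failure means a crash in the script (for example a missing environment variable) also blocks the pipeline, which is usually the safer default for a gate.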