legal-ai/docs/superpowers/plans/2026-05-31-fu2b-identifier-reconciliation.md
2026-05-31 08:09:22 +00:00

# FU-2b: Internal Identifier Reconciliation — Implementation Plan
> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
**Goal:** Build a reversible, chair-gated migration script that rewrites `internal_committee` `case_number` values currently holding a full citation into the canonical normalized bare number (X1: trim · prefix-strip · `/`→`-`, month preserved), leaving `citation_formatted` untouched.
**Architecture:** A standalone `scripts/` migration (not editable-service code) that runs as a dry-run by default (no flag needed). The dry-run emits a reconciliation table (CSV + Hebrew Markdown) for chair review; `--apply --approved <csv>` writes a backup, then updates only chair-approved rows. Extraction is deterministic (a single number-token regex), with no LLM. The production apply runs only AFTER Dafna approves the table.
**Tech Stack:** Python 3.12, asyncpg, PostgreSQL@localhost:5433, pytest offline, `.venv` at `mcp-server/.venv`.
**Spec:** [docs/superpowers/specs/2026-05-31-fu2b-identifier-reconciliation-design.md](../specs/2026-05-31-fu2b-identifier-reconciliation-design.md)
**Run script:** `PY=/home/chaim/legal-ai/mcp-server/.venv/bin/python; $PY scripts/fu2b_reconcile_internal_case_numbers.py` (dry-run)
**Run tests:** `cd ~/legal-ai/mcp-server && .venv/bin/python -m pytest tests/test_fu2b_reconcile.py -v`
---
## File Structure
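- **Create** `scripts/fu2b_reconcile_internal_case_numbers.py`: the migration (pure `_extract_bare` and `_consistency_flag` + reconciliation builder + table/backup writers + argparse `--apply`/`--approved`; dry-run is the default when `--apply` is absent). Reverting is done from the backup CSV.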
- **Create** `mcp-server/tests/test_fu2b_reconcile.py` — offline tests for `_extract_bare` + consistency flagging (imports the script module via sys.path).
- **Modify** `scripts/SCRIPTS.md` — register the new script (CLAUDE.md rule).
- **Artifact (produced, committed for review)** `data/audit/fu2b-reconciliation-<ts>.md` — the chair table from the dry-run.
No service code changes; no schema change. FK-safe (all `case_law` FKs use `id` UUID — verified).
---
## Task 1: Failing tests for `_extract_bare`
**Files:** Create `mcp-server/tests/test_fu2b_reconcile.py`
- [ ] **Step 1: Write the failing tests**
```python
"""FU-2b: deterministic bare-number extraction (offline)."""
from __future__ import annotations

import importlib.util
from pathlib import Path

import pytest

# Load the migration script as a module (it lives in scripts/, not a package).
_SCRIPT = Path(__file__).resolve().parents[2] / "scripts" / "fu2b_reconcile_internal_case_numbers.py"
_spec = importlib.util.spec_from_file_location("fu2b_reconcile", _SCRIPT)
fu2b = importlib.util.module_from_spec(_spec)
_spec.loader.exec_module(fu2b)


@pytest.mark.parametrize("raw,expected_bare", [
    ("ערר (‏ועדות ערר - תכנון ובנייה ירושלים‏) 403/17 אהרון ברק נ'", "403-17"),
    ("ערר (...) 8136-10-24 שחר שות'", "8136-10-24"),  # month preserved
    ("בל\"מ (...) 1028/20 חלוואני ריאד", "1028-20"),
    ("8047/23", "8047-23"),  # already-bare-ish
    ("ערר 81002-01-21", "81002-01-21"),
])
def test_extract_bare_single_token(raw, expected_bare):
    bare, flag = fu2b._extract_bare(raw)
    assert bare == expected_bare
    assert flag == "OK"


def test_extract_bare_no_number():
    bare, flag = fu2b._extract_bare("ערר אדלר נ' הוועדה")
    assert bare is None and flag == "NO_NUMBER"


def test_extract_bare_multiple_numbers_flagged():
    # Two case-number-shaped tokens → ambiguous, must NOT auto-pick.
    bare, flag = fu2b._extract_bare("ערר 403/17 ו-1024/24 מאוחדים")
    assert bare is None and flag == "MULTI_NUMBER"


def test_extract_bare_preserves_month_not_padding():
    # Month kept exactly; 2-part stays 2-part (no invented month).
    assert fu2b._extract_bare("ערר 8126/24 פלוני")[0] == "8126-24"
    assert fu2b._extract_bare("ערר 8126-03-25 פלוני")[0] == "8126-03-25"


def test_consistency_flag_when_bare_absent_from_citation():
    # proposed bare must appear in citation_formatted, else MISMATCH.
    assert fu2b._consistency_flag("403-17", "ערר (...) 403/17 אהרון ברק") == "OK"
    assert fu2b._consistency_flag("403-17", "ערר (...) 1975/24 מישהו אחר") == "MISMATCH"
    assert fu2b._consistency_flag("403-17", "") == "NO_CITATION"
```
- [ ] **Step 2: Run to verify failure**
Run: `cd ~/legal-ai/mcp-server && .venv/bin/python -m pytest tests/test_fu2b_reconcile.py -v`
Expected: FAIL at collection time with `FileNotFoundError` (the script doesn't exist yet, and the test module loads it at import). If a stub script already exists, expect `AttributeError` on `_extract_bare` instead.
- [ ] **Step 3: Commit**
```bash
cd ~/legal-ai
git add mcp-server/tests/test_fu2b_reconcile.py
git commit -m "test(fu2b): failing tests for bare-number extraction (FU-2b)"
```
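To see why the tests above expect these values, the token pattern can be tried in isolation. The sketch below copies the regex from the Task 2 script; `find_tokens` and `normalize` are illustrative names, and the `12/05/2024` input is a made-up example rather than a value from the data. It suggests that a date-shaped substring also counts as a token, so a citation holding both a case number and such a date would end up flagged `MULTI_NUMBER` instead of being guessed.

```python
import re

# Same pattern the Task 2 script uses for case-number-shaped tokens.
TOKEN_RE = re.compile(r"[0-9]{2,6}(?:[-/][0-9]{1,2}){1,2}")


def find_tokens(text: str) -> list[str]:
    """Return every case-number-shaped token found in text."""
    return TOKEN_RE.findall(text)


def normalize(token: str) -> str:
    """X1 separator rule: '/' becomes '-'; nothing is added or removed."""
    return token.replace("/", "-")


# The third sample is hypothetical: a date, not a case number.
samples = ["ערר 403/17", "ערר 8136-10-24", "hearing on 12/05/2024"]
normalized = {s: [normalize(t) for t in find_tokens(s)] for s in samples}
```

Here `normalized` maps the first two samples to `["403-17"]` and `["8136-10-24"]`, while the date sample yields `["12-05-20"]`, which is the kind of row the chair table is meant to surface.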
---
## Task 2: The migration script (dry-run + apply + backup)
**Files:** Create `scripts/fu2b_reconcile_internal_case_numbers.py`
- [ ] **Step 1: Write the script**
```python
#!/usr/bin/env python3
"""FU-2b: reconcile internal_committee case_number → canonical bare number.

Rewrites case_number values that currently hold a full citation into the
canonical normalized bare number (X1: trim · prefix-strip · '/'→'-', month
preserved). citation_formatted is the display field and is left untouched.

DETERMINISTIC: no LLM. Extraction takes the single case-number-shaped token
from the value; 0 or >1 tokens are flagged for chair review, never guessed.

Usage (must use the mcp-server venv; asyncpg/pgvector are vendored there):
    PY=/home/chaim/legal-ai/mcp-server/.venv/bin/python
    # Dry-run (default): builds the reconciliation table for chair review.
    $PY scripts/fu2b_reconcile_internal_case_numbers.py
    # Apply ONLY the chair-approved rows (after Dafna's review), backup first:
    $PY scripts/fu2b_reconcile_internal_case_numbers.py --apply \
        --approved data/audit/fu2b-approved-<ts>.csv

Scope: source_kind='internal_committee' only (external → #68/FU-2c). FK-safe:
all case_law FKs reference case_law.id (UUID), not case_number.
"""
from __future__ import annotations

import argparse
import asyncio
import csv
import os
import re
import sys
from datetime import datetime, timezone
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(REPO_ROOT / "mcp-server" / "src"))

if "POSTGRES_URL" not in os.environ:
    os.environ["POSTGRES_URL"] = (
        f"postgres://{os.environ.get('POSTGRES_USER','legal_ai')}:"
        f"{os.environ.get('POSTGRES_PASSWORD','')}@"
        f"{os.environ.get('POSTGRES_HOST','127.0.0.1')}:"
        f"{os.environ.get('POSTGRES_PORT','5433')}/"
        f"{os.environ.get('POSTGRES_DB','legal_ai')}"
    )

AUDIT_DIR = REPO_ROOT / "data" / "audit"
_TOKEN_RE = re.compile(r"[0-9]{2,6}(?:[-/][0-9]{1,2}){1,2}")


def _extract_bare(case_number: str) -> tuple[str | None, str]:
    """Return (canonical_bare, flag). flag ∈ {OK, NO_NUMBER, MULTI_NUMBER}.

    Deterministic: finds case-number-shaped tokens (NNNN/YY or NNNN-MM-YY).
    Exactly one → normalize '/'→'-' (month preserved, none invented). 0 or >1
    → None + flag (chair decides; never guess).
    """
    tokens = _TOKEN_RE.findall(case_number or "")
    if len(tokens) == 1:
        return tokens[0].replace("/", "-"), "OK"
    if not tokens:
        return None, "NO_NUMBER"
    return None, "MULTI_NUMBER"


def _consistency_flag(bare: str | None, citation_formatted: str) -> str:
    """OK if bare appears in citation_formatted; MISMATCH if not; NO_CITATION if empty."""
    if not citation_formatted:
        return "NO_CITATION"
    if not bare:
        return "NO_NUMBER"
    # compare against the citation with separators unified, to match 403/17 vs 403-17
    cf = citation_formatted.replace("/", "-")
    return "OK" if bare in cf else "MISMATCH"


async def _build_reconciliation() -> list[dict]:
    from collections import Counter

    from legal_mcp.services import db

    pool = await db.get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            "SELECT id, case_number, proceeding_type, coalesce(citation_formatted,'') AS cf "
            "FROM case_law WHERE source_kind='internal_committee' ORDER BY case_number")
    out: list[dict] = []
    for r in rows:
        bare, flag = _extract_bare(r["case_number"])
        cons = _consistency_flag(bare, r["cf"])
        changes = bare is not None and bare != r["case_number"]
        out.append({
            "id": str(r["id"]),
            "current_case_number": r["case_number"],
            "proposed_bare": bare or "",
            "proceeding_type": r["proceeding_type"] or "",
            "citation_formatted": r["cf"],
            "extract_flag": flag,
            "consistency": cons,
            "will_change": "yes" if changes else "no",
        })
    # DUP_CHECK: same proposed_bare appearing on >1 row (any proceeding_type)
    bare_counts = Counter(d["proposed_bare"] for d in out if d["proposed_bare"])
    for d in out:
        if d["proposed_bare"] and bare_counts[d["proposed_bare"]] > 1:
            d["dup_check"] = "DUP_CHECK"
        else:
            d["dup_check"] = ""
    return out


def _ts() -> str:
    return datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def _write_table(rows: list[dict], ts: str) -> tuple[Path, Path]:
    AUDIT_DIR.mkdir(parents=True, exist_ok=True)
    csv_path = AUDIT_DIR / f"fu2b-reconciliation-{ts}.csv"
    md_path = AUDIT_DIR / f"fu2b-reconciliation-{ts}.md"
    cols = ["id", "current_case_number", "proposed_bare", "proceeding_type",
            "citation_formatted", "extract_flag", "consistency", "dup_check", "will_change"]
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        w = csv.DictWriter(f, fieldnames=cols)
        w.writeheader()
        w.writerows(rows)
    changing = [r for r in rows if r["will_change"] == "yes"]
    flagged = [r for r in rows if r["extract_flag"] != "OK" or r["consistency"] == "MISMATCH" or r["dup_check"]]
    # The Markdown table is written in Hebrew for the chair.
    with md_path.open("w", encoding="utf-8") as f:
        # Title: "identifier reconciliation table (internal_committee)"
        f.write(f"# FU-2b — טבלת-תיאום מזהים (internal_committee) — {ts}\n\n")
        # Summary: total records / will change / flagged for review
        f.write(f"- סה\"כ רשומות: {len(rows)}\n- ישתנו: {len(changing)}\n- מסומנות לסקירה: {len(flagged)}\n\n")
        # Section: "requires the chair's decision (flags)"
        f.write("## דורש הכרעת-יו\"ר (flags)\n\n")
        f.write("| current_case_number | proposed_bare | proc | flags |\n|---|---|---|---|\n")
        for r in flagged:
            fl = " ".join(x for x in [r["extract_flag"] if r["extract_flag"] != "OK" else "",
                                      r["consistency"] if r["consistency"] == "MISMATCH" else "",
                                      r["dup_check"]] if x)
            f.write(f"| {r['current_case_number'][:50]} | {r['proposed_bare']} | {r['proceeding_type']} | {fl} |\n")
        # Section: "all proposed changes"
        f.write("\n## כל השינויים המוצעים\n\n")
        f.write("| current_case_number | → proposed_bare | proc |\n|---|---|---|\n")
        for r in changing:
            f.write(f"| {r['current_case_number'][:55]} | {r['proposed_bare']} | {r['proceeding_type']} |\n")
    return csv_path, md_path


async def _apply(approved_csv: Path, ts: str) -> dict:
    from legal_mcp.services import db

    with approved_csv.open(encoding="utf-8") as f:
        approved = [r for r in csv.DictReader(f)
                    if r.get("will_change") == "yes" and r.get("proposed_bare")]
    if not approved:
        return {"applied": 0, "note": "no approved changing rows"}
    AUDIT_DIR.mkdir(parents=True, exist_ok=True)
    backup = AUDIT_DIR / f"fu2b-backup-{ts}.csv"
    pool = await db.get_pool()
    applied = 0
    with backup.open("w", newline="", encoding="utf-8") as bf:
        bw = csv.writer(bf)
        bw.writerow(["id", "old_case_number"])
        async with pool.acquire() as conn:
            for r in approved:
                old = await conn.fetchval("SELECT case_number FROM case_law WHERE id=$1", r["id"])
                if old is None:
                    continue
                # Record the old value before overwriting it (revert path).
                bw.writerow([r["id"], old])
                await conn.execute(
                    "UPDATE case_law SET case_number=$2 WHERE id=$1 "
                    "AND source_kind='internal_committee'",
                    r["id"], r["proposed_bare"])
                applied += 1
    return {"applied": applied, "backup": str(backup)}


async def main() -> int:
    parser = argparse.ArgumentParser(description="FU-2b internal case_number reconciliation")
    parser.add_argument("--apply", action="store_true", help="apply approved changes (default: dry-run)")
    parser.add_argument("--approved", type=str, help="path to chair-approved CSV (required with --apply)")
    args = parser.parse_args()
    ts = _ts()
    if not args.apply:
        rows = await _build_reconciliation()
        csv_path, md_path = _write_table(rows, ts)
        changing = sum(1 for r in rows if r["will_change"] == "yes")
        flagged = sum(1 for r in rows if r["extract_flag"] != "OK" or r["consistency"] == "MISMATCH" or r["dup_check"])
        print(f"DRY-RUN: {len(rows)} rows | will_change={changing} | flagged={flagged}")
        print(f" table: {md_path}")
        print(f" csv: {csv_path}")
        print("Review the table with the chair, then run --apply --approved <reviewed.csv>.")
        return 0
    if not args.approved:
        print("ERROR: --apply requires --approved <csv> (the chair-reviewed table).", file=sys.stderr)
        return 2
    result = await _apply(Path(args.approved), ts)
    print(f"APPLIED: {result}")
    return 0


if __name__ == "__main__":
    sys.exit(asyncio.run(main()))
```
- [ ] **Step 2: Run the unit tests**
Run: `cd ~/legal-ai/mcp-server && .venv/bin/python -m pytest tests/test_fu2b_reconcile.py -v`
Expected: ALL pass (extraction + flags + consistency).
- [ ] **Step 3: Commit**
```bash
cd ~/legal-ai
chmod +x scripts/fu2b_reconcile_internal_case_numbers.py
git add scripts/fu2b_reconcile_internal_case_numbers.py
git commit -m "feat(fu2b): chair-gated internal case_number reconciliation script (GAP-07/08)"
```
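The DUP_CHECK logic in `_build_reconciliation` sits behind a database fetch, so the Task 1 tests do not cover it. One way to exercise it offline is to lift the counting step into a pure helper. The sketch below assumes a hypothetical `flag_duplicates` function (not part of the script above) that mirrors the same `Counter`-based loop on plain dicts; the `proceeding_type` values `"A"` and `"B"` are placeholders.

```python
from collections import Counter


def flag_duplicates(rows: list[dict]) -> list[dict]:
    """Set row['dup_check'] to 'DUP_CHECK' when its proposed_bare is shared.

    Hypothetical helper: same rule as the loop in _build_reconciliation,
    but it takes plain dicts so it can run without a database.
    """
    counts = Counter(r["proposed_bare"] for r in rows if r["proposed_bare"])
    for r in rows:
        shared = bool(r["proposed_bare"]) and counts[r["proposed_bare"]] > 1
        r["dup_check"] = "DUP_CHECK" if shared else ""
    return rows


# Two rows share a bare number across proceeding types; one row has none.
rows = flag_duplicates([
    {"proposed_bare": "8047-23", "proceeding_type": "A"},
    {"proposed_bare": "8047-23", "proceeding_type": "B"},
    {"proposed_bare": "403-17", "proceeding_type": "A"},
    {"proposed_bare": "", "proceeding_type": "A"},  # e.g. a NO_NUMBER row
])
```

Both `8047-23` rows come back flagged, while rows with an empty `proposed_bare` are never counted as duplicates of each other.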
---
## Task 3: Dry-run against the DB → produce the chair table
**Files:** Produces `data/audit/fu2b-reconciliation-<ts>.{csv,md}`
- [ ] **Step 1: Run the dry-run**
```bash
cd ~/legal-ai && set -a && source ~/.env 2>/dev/null && set +a
PY=/home/chaim/legal-ai/mcp-server/.venv/bin/python
$PY scripts/fu2b_reconcile_internal_case_numbers.py
```
Expected output: `DRY-RUN: 56 rows | will_change=~52 | flagged=~2` (`flagged` counts rows, so the single expected issue, the 8047/23 DUP_CHECK pair, shows up as 2 flagged rows). Note the exact numbers.
- [ ] **Step 2: Sanity-check the produced table**
Open `data/audit/fu2b-reconciliation-<ts>.md`. Verify:
- `will_change` rows: each `current_case_number` (full citation) → a clean `proposed_bare` matching the number inside it.
- `flagged` section: should contain the `8047-23` DUP_CHECK pair (ערר + בל"מ) and ideally nothing else (0 MULTI_NUMBER, 0 MISMATCH expected per the analysis).
- If MULTI_NUMBER / MISMATCH rows appear unexpectedly, STOP and report them (the analysis predicted 0; an unexpected flag means the data changed and needs investigation before chair review).
- [ ] **Step 3: Commit the produced table as a review artifact**
```bash
cd ~/legal-ai
git add data/audit/fu2b-reconciliation-*.md data/audit/fu2b-reconciliation-*.csv
git commit -m "chore(fu2b): dry-run reconciliation table for chair review (GAP-07/08)"
```
(If `data/audit/` is gitignored, skip the commit and report the path instead — the table still exists on disk for review.)
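The sanity check in Step 2 can also be done mechanically on the CSV. This is a minimal sketch, assuming the column layout written by `_write_table`; `summarize` is a hypothetical helper, and the two-row table is made-up data used only to show the shape of the result.

```python
import csv
import io


def summarize(csv_file) -> dict:
    """Count changing and DUP_CHECK rows and collect unexpected flags."""
    rows = list(csv.DictReader(csv_file))
    # The plan expects zero MULTI_NUMBER and zero MISMATCH rows.
    unexpected = [
        r["id"] for r in rows
        if r["extract_flag"] == "MULTI_NUMBER" or r["consistency"] == "MISMATCH"
    ]
    return {
        "rows": len(rows),
        "will_change": sum(r["will_change"] == "yes" for r in rows),
        "dup_check": sum(r["dup_check"] == "DUP_CHECK" for r in rows),
        "unexpected_ids": unexpected,
    }


# Made-up table in the same column order the dry-run writes.
sample = io.StringIO(
    "id,current_case_number,proposed_bare,proceeding_type,"
    "citation_formatted,extract_flag,consistency,dup_check,will_change\n"
    "a1,appeal 403/17 X,403-17,t,appeal 403/17 X,OK,OK,,yes\n"
    "b2,appeal 403/17 and 1024/24,,t,other 99/24,MULTI_NUMBER,NO_NUMBER,,no\n"
)
summary = summarize(sample)
```

For a real run, the file would be opened with `open(path, newline="", encoding="utf-8")` and passed to `summarize`; a non-empty `unexpected_ids` list is the STOP condition described in Step 2.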
---
## Task 4: SCRIPTS.md + PR
- [ ] **Step 1: Register the script in `scripts/SCRIPTS.md`**
Add a row to the active-scripts table (match the file's existing table format) describing `fu2b_reconcile_internal_case_numbers.py`: purpose (FU-2b internal case_number reconciliation, GAP-07/08), status (active, chair-gated), usage (dry-run default / `--apply --approved`).
- [ ] **Step 2: Full suite + commit + push + PR**
```bash
cd ~/legal-ai/mcp-server && .venv/bin/python -m pytest tests/ -q # report summary (expect all pass)
cd ~/legal-ai
git add scripts/SCRIPTS.md
git commit -m "docs(scripts): register fu2b reconciliation script (FU-2b)"
git push -u origin fix/fu2b-identifier-reconciliation
```
Then create the PR via the Gitea REST API (token from `~/.git-credentials`) and merge per the standing PR+merge rule. The PR delivers the **tooling + dry-run table**; the production `--apply` is the separate gated step below.
---
## Task 5: [HUMAN GATE] Chair review + gated apply (NOT automated)
> This task is the chair-approval gate. It is NOT executed by an implementer subagent.
- [ ] **Step 1:** Present `data/audit/fu2b-reconciliation-<ts>.md` to the controller, who presents it to Dafna: the ~52 proposed changes + the `8047-23` ערר/בל"מ DUP_CHECK pair. Dafna confirms the mapping and adjudicates whether 8047/23 is two distinct proceedings (keep both) or a mis-tagged duplicate (manual delete, separate).
- [ ] **Step 2:** Save the reviewed table as `data/audit/fu2b-approved-<ts>.csv` (rows Dafna approved; `will_change=yes` only for those).
- [ ] **Step 3:** Run the gated apply against the DB:
```bash
cd ~/legal-ai && set -a && source ~/.env && set +a
PY=/home/chaim/legal-ai/mcp-server/.venv/bin/python
$PY scripts/fu2b_reconcile_internal_case_numbers.py --apply --approved data/audit/fu2b-approved-<ts>.csv
```
- [ ] **Step 4:** Verify: re-run dry-run → `will_change=0` (idempotent); spot-check `get_case_by_number` still resolves a migrated case; confirm a backup CSV was written (revert path). Mark TaskMaster #67 done.
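The script as written records old values in the backup CSV but has no revert command, so a revert would be a manual replay of that file. The sketch below shows one way to prepare it: `revert_params` and `REVERT_SQL` are hypothetical names, the backup row is made-up, and the final replay through asyncpg is left as a comment because it needs a live connection.

```python
import csv
import io

# Hypothetical statement: restores case_number with the same scope guard
# the apply step uses.
REVERT_SQL = (
    "UPDATE case_law SET case_number=$2 "
    "WHERE id=$1 AND source_kind='internal_committee'"
)


def revert_params(backup_file) -> list[tuple[str, str]]:
    """Read a backup CSV (id, old_case_number) into (id, old) pairs."""
    return [(r["id"], r["old_case_number"]) for r in csv.DictReader(backup_file)]


# Made-up backup content in the two-column layout _apply writes.
backup = io.StringIO("id,old_case_number\nabc,appeal (...) 403/17 someone\n")
params = revert_params(backup)

# With an open asyncpg connection, each pair could then be replayed, e.g.:
#     for row_id, old_value in params:
#         await conn.execute(REVERT_SQL, row_id, old_value)
```

Keeping the parsing separate from the database call means the revert input can be inspected, and reviewed by the chair if needed, before anything is written.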
---
## Self-Review Notes
- **GAP-07/08 (internal)** → Task 2 script + Task 3 dry-run + Task 5 gated apply. Canonical form per X1 (month preserved) — `_extract_bare` replaces only `/`→`-` on the single extracted token, never strips/pads a month.
- **Reversible:** `_apply` writes `fu2b-backup-<ts>.csv` (id, old_case_number) before each UPDATE.
- **Chair gate:** `--apply` requires `--approved <csv>`; production apply is Task 5 (human), not part of the PR merge.
- **Determinism / safety:** 0/>1 token → flagged, never guessed; consistency + DUP_CHECK flags surface the 8047 edge.
- **Scope:** `source_kind='internal_committee'` only (the UPDATE has the `AND source_kind='internal_committee'` guard); external → #68.
- **FK-safe:** verified all 11 `case_law` FKs use `id` (UUID).
- **Type consistency:** `_extract_bare(case_number)->(bare|None,flag)`, `_consistency_flag(bare,citation)->str` — names match tests (Task 1) and script (Task 2).
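
The idempotency claim in Task 5 (a second dry-run reports `will_change=0` for applied rows) rests on extraction being a fixed point on its own output. A minimal offline check of that property, using a stand-alone copy of the extraction logic under the illustrative names `extract_bare` and `will_change`, with a made-up citation as input:

```python
import re
from typing import Optional

TOKEN_RE = re.compile(r"[0-9]{2,6}(?:[-/][0-9]{1,2}){1,2}")


def extract_bare(value: str) -> tuple[Optional[str], str]:
    """Stand-alone copy of the script's _extract_bare, for illustration."""
    tokens = TOKEN_RE.findall(value or "")
    if len(tokens) == 1:
        return tokens[0].replace("/", "-"), "OK"
    return None, ("NO_NUMBER" if not tokens else "MULTI_NUMBER")


def will_change(current: str) -> bool:
    """True when the dry-run would propose a different case_number."""
    bare, _ = extract_bare(current)
    return bare is not None and bare != current


before = "appeal (...) 403/17 someone"  # made-up full citation
after = extract_bare(before)[0]         # value an approved apply would store
```

`will_change(before)` is true and `will_change(after)` is false, which is why rows that were applied drop out of the next dry-run; rows the chair did not approve keep showing up as pending changes.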