
Chaim c2de69272d docs(plan): FU-2b identifier-reconciliation implementation plan (chair-gated, TDD)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-05-31 08:09:22 +00:00


FU-2b: Internal Identifier Reconciliation — Implementation Plan

For agentic workers: REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (- [ ]) syntax for tracking.

Goal: Build a reversible, chair-gated migration script that rewrites internal_committee case_number values currently holding a full citation into the canonical normalized bare number (X1: trim · prefix-strip · /→-, month preserved), leaving citation_formatted untouched.

Architecture: A standalone scripts/ migration (not editable-service code), --dry-run by default. Dry-run emits a reconciliation table (CSV + Hebrew Markdown) for chair review; --apply --approved <csv> writes a backup then updates only chair-approved rows. Extraction is deterministic (single number-token regex) — no LLM. The production apply runs only AFTER Dafna approves the table.

Tech Stack: Python 3.12, asyncpg, PostgreSQL@localhost:5433, pytest offline, .venv at mcp-server/.venv.

Spec: docs/superpowers/specs/2026-05-31-fu2b-identifier-reconciliation-design.md

Run script: PY=/home/chaim/legal-ai/mcp-server/.venv/bin/python; $PY scripts/fu2b_reconcile_internal_case_numbers.py (dry-run)

Run tests: cd ~/legal-ai/mcp-server && .venv/bin/python -m pytest tests/test_fu2b_reconcile.py -v

File Structure

Create scripts/fu2b_reconcile_internal_case_numbers.py — the migration (pure _extract_bare + reconciliation builder + table/backup/revert writers + argparse --dry-run/--apply).
Create mcp-server/tests/test_fu2b_reconcile.py — offline tests for _extract_bare + consistency flagging (imports the script module via sys.path).
Modify scripts/SCRIPTS.md — register the new script (CLAUDE.md rule).
Artifact (produced, committed for review) data/audit/fu2b-reconciliation-<ts>.md — the chair table from the dry-run.

No service code changes; no schema change. FK-safe (all case_law FKs use id UUID — verified).
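The FK-safety claim can be re-checked against the PostgreSQL catalog before any apply. The following is a minimal sketch, assuming the table is named case_law as in this plan; the query constant and the helper `non_id_fk_targets` are hypothetical and are not part of the FU-2b script.

```python
# Hypothetical pre-flight check (not part of the FU-2b script).
# The query lists every foreign key that points AT case_law together with the
# referenced column; the helper reports any that do not target "id".

FK_TARGETS_SQL = """
SELECT con.conname AS constraint_name, att.attname AS referenced_column
FROM pg_constraint AS con
JOIN pg_attribute AS att
  ON att.attrelid = con.confrelid AND att.attnum = ANY (con.confkey)
WHERE con.contype = 'f' AND con.confrelid = 'case_law'::regclass
"""


def non_id_fk_targets(rows):
    """Return constraint names whose referenced column is not 'id'."""
    return sorted(r["constraint_name"] for r in rows if r["referenced_column"] != "id")


# Real rows would come from something like `await conn.fetch(FK_TARGETS_SQL)`;
# the sample below is invented purely to show the helper's behaviour.
sample = [
    {"constraint_name": "fk_a", "referenced_column": "id"},
    {"constraint_name": "fk_b", "referenced_column": "case_number"},
]
print(non_id_fk_targets(sample))  # ['fk_b']
```

An empty result from the helper on the live catalog would be consistent with the "FK-safe" statement above; any name it returns would need a look before case_number values are rewritten.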

Task 1: Failing tests for `_extract_bare`

Files: Create mcp-server/tests/test_fu2b_reconcile.py

Step 1: Write the failing tests

"""FU-2b: deterministic bare-number extraction (offline)."""
from __future__ import annotations

import importlib.util
from pathlib import Path

import pytest

# Load the migration script as a module (it lives in scripts/, not a package).
_SCRIPT = Path(__file__).resolve().parents[2] / "scripts" / "fu2b_reconcile_internal_case_numbers.py"
_spec = importlib.util.spec_from_file_location("fu2b_reconcile", _SCRIPT)
fu2b = importlib.util.module_from_spec(_spec)
_spec.loader.exec_module(fu2b)


@pytest.mark.parametrize("raw,expected_bare", [
    ("ערר ‏(‏ועדות ערר - תכנון ובנייה ירושלים‏)‏ 403/17 אהרון ברק נ'", "403-17"),
    ("ערר (...) 8136-10-24 שחר שות'", "8136-10-24"),          # month preserved
    ("בל\"מ (...) 1028/20 חלוואני ריאד", "1028-20"),
    ("8047/23", "8047-23"),                                     # already-bare-ish
    ("ערר 81002-01-21", "81002-01-21"),
])
def test_extract_bare_single_token(raw, expected_bare):
    bare, flag = fu2b._extract_bare(raw)
    assert bare == expected_bare
    assert flag == "OK"


def test_extract_bare_no_number():
    bare, flag = fu2b._extract_bare("ערר אדלר נ' הוועדה")
    assert bare is None and flag == "NO_NUMBER"


def test_extract_bare_multiple_numbers_flagged():
    # Two case-number-shaped tokens → ambiguous, must NOT auto-pick.
    bare, flag = fu2b._extract_bare("ערר 403/17 ו-1024/24 מאוחדים")
    assert bare is None and flag == "MULTI_NUMBER"


def test_extract_bare_preserves_month_not_padding():
    # Month kept exactly; 2-part stays 2-part (no invented month).
    assert fu2b._extract_bare("ערר 8126/24 פלוני")[0] == "8126-24"
    assert fu2b._extract_bare("ערר 8126-03-25 פלוני")[0] == "8126-03-25"


def test_consistency_flag_when_bare_absent_from_citation():
    # proposed bare must appear in citation_formatted, else MISMATCH.
    assert fu2b._consistency_flag("403-17", "ערר (...) 403/17 אהרון ברק") == "OK"
    assert fu2b._consistency_flag("403-17", "ערר (...) 1975/24 מישהו אחר") == "MISMATCH"
    assert fu2b._consistency_flag("403-17", "") == "NO_CITATION"

Step 2: Run to verify failure

Run: cd ~/legal-ai/mcp-server && .venv/bin/python -m pytest tests/test_fu2b_reconcile.py -v

Expected: FAIL with FileNotFoundError/ModuleNotFoundError (the script doesn't exist yet) or AttributeError: _extract_bare.

Step 3: Commit

cd ~/legal-ai
git add mcp-server/tests/test_fu2b_reconcile.py
git commit -m "test(fu2b): failing tests for bare-number extraction (FU-2b)"

Task 2: The migration script (dry-run + apply + backup)

Files: Create scripts/fu2b_reconcile_internal_case_numbers.py

Step 1: Write the script

#!/usr/bin/env python3
"""FU-2b — reconcile internal_committee case_number → canonical bare number.

Rewrites case_number values that currently hold a full citation into the
canonical normalized bare number (X1: trim · prefix-strip · '/'→'-', month
preserved). citation_formatted is the display field and is left untouched.

DETERMINISTIC — no LLM. Extraction takes the single case-number-shaped token
from the value; 0 or >1 tokens are flagged for chair review, never guessed.

Usage (must use the mcp-server venv — asyncpg/pgvector vendored there):
    PY=/home/chaim/legal-ai/mcp-server/.venv/bin/python

    # Dry-run (default): builds the reconciliation table for chair review.
    $PY scripts/fu2b_reconcile_internal_case_numbers.py

    # Apply ONLY the chair-approved rows (after Dafna's review), backup first:
    $PY scripts/fu2b_reconcile_internal_case_numbers.py --apply \
        --approved data/audit/fu2b-approved-<ts>.csv

Scope: source_kind='internal_committee' only (external → #68/FU-2c). FK-safe:
all case_law FKs reference case_law.id (UUID), not case_number.
"""
from __future__ import annotations

import argparse
import asyncio
import csv
import os
import re
import sys
from datetime import datetime, timezone
from pathlib import Path

REPO_ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(REPO_ROOT / "mcp-server" / "src"))

if "POSTGRES_URL" not in os.environ:
    os.environ["POSTGRES_URL"] = (
        f"postgres://{os.environ.get('POSTGRES_USER','legal_ai')}:"
        f"{os.environ.get('POSTGRES_PASSWORD','')}@"
        f"{os.environ.get('POSTGRES_HOST','127.0.0.1')}:"
        f"{os.environ.get('POSTGRES_PORT','5433')}/"
        f"{os.environ.get('POSTGRES_DB','legal_ai')}"
    )

AUDIT_DIR = REPO_ROOT / "data" / "audit"
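# Digit lookarounds keep a token from being cut out of a longer digit run
# (e.g. a 4-digit year such as 403/2017): such values are flagged, not truncated.
_TOKEN_RE = re.compile(r"(?<![0-9])[0-9]{2,6}(?:[-/][0-9]{1,2}){1,2}(?![0-9])")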


def _extract_bare(case_number: str) -> tuple[str | None, str]:
    """Return (canonical_bare, flag). flag ∈ {OK, NO_NUMBER, MULTI_NUMBER}.

    Deterministic: finds case-number-shaped tokens (NNNN/YY or NNNN-MM-YY).
    Exactly one → normalize '/'→'-' (month preserved, none invented). 0 or >1
    → None + flag (chair decides; never guess).
    """
    tokens = _TOKEN_RE.findall(case_number or "")
    if len(tokens) == 1:
        return tokens[0].replace("/", "-"), "OK"
    if not tokens:
        return None, "NO_NUMBER"
    return None, "MULTI_NUMBER"


def _consistency_flag(bare: str | None, citation_formatted: str) -> str:
    """OK if bare appears in citation_formatted; MISMATCH if not; NO_CITATION if empty."""
    if not citation_formatted:
        return "NO_CITATION"
    if not bare:
        return "NO_NUMBER"
    # compare against the citation with separators unified, to match 403/17 vs 403-17
    cf = citation_formatted.replace("/", "-")
    return "OK" if bare in cf else "MISMATCH"


async def _build_reconciliation() -> list[dict]:
    from legal_mcp.services import db
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            "SELECT id, case_number, proceeding_type, coalesce(citation_formatted,'') AS cf "
            "FROM case_law WHERE source_kind='internal_committee' ORDER BY case_number")
    # One reconciliation row per record; DUP_CHECK is computed after the loop.
    out: list[dict] = []
    for r in rows:
        bare, flag = _extract_bare(r["case_number"])
        cons = _consistency_flag(bare, r["cf"])
        changes = bare is not None and bare != r["case_number"]
        out.append({
            "id": str(r["id"]),
            "current_case_number": r["case_number"],
            "proposed_bare": bare or "",
            "proceeding_type": r["proceeding_type"] or "",
            "citation_formatted": r["cf"],
            "extract_flag": flag,
            "consistency": cons,
            "will_change": "yes" if changes else "no",
        })
    # DUP_CHECK: same proposed_bare appearing on >1 row (any proceeding_type)
    from collections import Counter
    bare_counts = Counter(d["proposed_bare"] for d in out if d["proposed_bare"])
    for d in out:
        if d["proposed_bare"] and bare_counts[d["proposed_bare"]] > 1:
            d["dup_check"] = "DUP_CHECK"
        else:
            d["dup_check"] = ""
    return out


def _ts() -> str:
    return datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def _write_table(rows: list[dict], ts: str) -> tuple[Path, Path]:
    AUDIT_DIR.mkdir(parents=True, exist_ok=True)
    csv_path = AUDIT_DIR / f"fu2b-reconciliation-{ts}.csv"
    md_path = AUDIT_DIR / f"fu2b-reconciliation-{ts}.md"
    cols = ["id", "current_case_number", "proposed_bare", "proceeding_type",
            "citation_formatted", "extract_flag", "consistency", "dup_check", "will_change"]
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        w = csv.DictWriter(f, fieldnames=cols)
        w.writeheader()
        w.writerows(rows)
    changing = [r for r in rows if r["will_change"] == "yes"]
    flagged = [r for r in rows if r["extract_flag"] != "OK" or r["consistency"] == "MISMATCH" or r["dup_check"]]
    with md_path.open("w", encoding="utf-8") as f:
        f.write(f"# FU-2b — טבלת-תיאום מזהים (internal_committee) — {ts}\n\n")
        f.write(f"- סה\"כ רשומות: {len(rows)}\n- ישתנו: {len(changing)}\n- מסומנות לסקירה: {len(flagged)}\n\n")
        f.write("## דורש הכרעת-יו\"ר (flags)\n\n")
        f.write("| current_case_number | proposed_bare | proc | flags |\n|---|---|---|---|\n")
        for r in flagged:
            fl = " ".join(x for x in [r["extract_flag"] if r["extract_flag"] != "OK" else "",
                                       r["consistency"] if r["consistency"] == "MISMATCH" else "",
                                       r["dup_check"]] if x)
            f.write(f"| {r['current_case_number'][:50]} | {r['proposed_bare']} | {r['proceeding_type']} | {fl} |\n")
        f.write("\n## כל השינויים המוצעים\n\n")
        f.write("| current_case_number | → proposed_bare | proc |\n|---|---|---|\n")
        for r in changing:
            f.write(f"| {r['current_case_number'][:55]} | {r['proposed_bare']} | {r['proceeding_type']} |\n")
    return csv_path, md_path


async def _apply(approved_csv: Path, ts: str) -> dict:
    from legal_mcp.services import db
    with approved_csv.open(encoding="utf-8") as f:
        approved = [r for r in csv.DictReader(f)
                    if r.get("will_change") == "yes" and r.get("proposed_bare")]
    if not approved:
        return {"applied": 0, "note": "no approved changing rows"}
    AUDIT_DIR.mkdir(parents=True, exist_ok=True)
    backup = AUDIT_DIR / f"fu2b-backup-{ts}.csv"
    pool = await db.get_pool()
    applied = 0
    with backup.open("w", newline="", encoding="utf-8") as bf:
        bw = csv.writer(bf)
        bw.writerow(["id", "old_case_number"])
        async with pool.acquire() as conn:
            for r in approved:
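                # Same source_kind guard as the UPDATE, so a row that will not
                # be updated is neither backed up nor counted as applied.
                old = await conn.fetchval(
                    "SELECT case_number FROM case_law WHERE id=$1 "
                    "AND source_kind='internal_committee'", r["id"])
                if old is None:
                    continue
                bw.writerow([r["id"], old])
                bf.flush()  # write the backup row out before the UPDATE runs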
                await conn.execute(
                    "UPDATE case_law SET case_number=$2 WHERE id=$1 "
                    "AND source_kind='internal_committee'",
                    r["id"], r["proposed_bare"])
                applied += 1
    return {"applied": applied, "backup": str(backup)}


async def main() -> int:
    parser = argparse.ArgumentParser(description="FU-2b internal case_number reconciliation")
    parser.add_argument("--apply", action="store_true", help="apply approved changes (default: dry-run)")
    parser.add_argument("--approved", type=str, help="path to chair-approved CSV (required with --apply)")
    args = parser.parse_args()
    ts = _ts()

    if not args.apply:
        rows = await _build_reconciliation()
        csv_path, md_path = _write_table(rows, ts)
        changing = sum(1 for r in rows if r["will_change"] == "yes")
        flagged = sum(1 for r in rows if r["extract_flag"] != "OK" or r["consistency"] == "MISMATCH" or r["dup_check"])
        print(f"DRY-RUN: {len(rows)} rows | will_change={changing} | flagged={flagged}")
        print(f"  table:  {md_path}")
        print(f"  csv:    {csv_path}")
        print("Review the table with the chair, then run --apply --approved <reviewed.csv>.")
        return 0

    if not args.approved:
        print("ERROR: --apply requires --approved <csv> (the chair-reviewed table).", file=sys.stderr)
        return 2
    result = await _apply(Path(args.approved), ts)
    print(f"APPLIED: {result}")
    return 0


if __name__ == "__main__":
    sys.exit(asyncio.run(main()))

Step 2: Run the unit tests

Run: cd ~/legal-ai/mcp-server && .venv/bin/python -m pytest tests/test_fu2b_reconcile.py -v

Expected: ALL pass (extraction + flags + consistency).
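Beyond the parametrized cases, it can help to probe inputs the tests do not cover. The sketch below is standalone and uses its own copy of the token shape (NNNN/YY or NNNN-MM-YY); the names `TOKEN_RE` and `classify` are illustrative, and the script's own `_TOKEN_RE` and `_extract_bare` remain the source of truth.

```python
from __future__ import annotations

import re

# Stand-alone copy of the token shape so the snippet runs without the repo.
# This is an assumption for illustration, not an import of the real script.
TOKEN_RE = re.compile(r"[0-9]{2,6}(?:[-/][0-9]{1,2}){1,2}")


def classify(value: str) -> tuple[str | None, str]:
    """Mirror the one-token rule: exactly one token is normalized, else flagged."""
    tokens = TOKEN_RE.findall(value or "")
    if len(tokens) == 1:
        return tokens[0].replace("/", "-"), "OK"
    if not tokens:
        return None, "NO_NUMBER"
    return None, "MULTI_NUMBER"


# A date inside the value looks like a second token, so the row is flagged
# for review instead of being rewritten automatically.
print(classify("ערר 403/17 מיום 12/03/2024"))  # (None, 'MULTI_NUMBER')
# Surrounding whitespace is ignored and every '/' becomes '-'.
print(classify("  8136/10/24  "))              # ('8136-10-24', 'OK')
print(classify(""))                            # (None, 'NO_NUMBER')
```

The point of the first case is that an unexpected extra number never produces a silent rewrite; it lands in the flagged section of the chair table.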

Step 3: Commit

cd ~/legal-ai
chmod +x scripts/fu2b_reconcile_internal_case_numbers.py
git add scripts/fu2b_reconcile_internal_case_numbers.py
git commit -m "feat(fu2b): chair-gated internal case_number reconciliation script (GAP-07/08)"

Task 3: Dry-run against the DB → produce the chair table

Files: Produces data/audit/fu2b-reconciliation-<ts>.{csv,md}

Step 1: Run the dry-run

cd ~/legal-ai && set -a && source ~/.env 2>/dev/null && set +a
PY=/home/chaim/legal-ai/mcp-server/.venv/bin/python
$PY scripts/fu2b_reconcile_internal_case_numbers.py

Expected output: DRY-RUN: 56 rows | will_change=~52 | flagged=~2 (the 8047/23 DUP_CHECK pair; flagged counts rows, so the one pair shows as 2 rows). Note the exact numbers.

Step 2: Sanity-check the produced table

Open data/audit/fu2b-reconciliation-<ts>.md. Verify:

will_change rows: each current_case_number (full citation) → a clean proposed_bare matching the number inside it.
flagged section: should contain the 8047-23 DUP_CHECK pair (ערר + בל"מ) and ideally nothing else (0 MULTI_NUMBER, 0 MISMATCH expected per the analysis).
If MULTI_NUMBER / MISMATCH rows appear unexpectedly, STOP and report them (the analysis predicted 0; an unexpected flag means the data changed and needs investigation before chair review).
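The sanity check above can also be done mechanically on the CSV. This is a rough sketch that assumes the column names written by the dry-run (extract_flag, consistency, dup_check); the helper `summarize_flags` and the sample rows are invented for illustration.

```python
import csv
import io
from collections import Counter


def summarize_flags(csv_text: str) -> Counter:
    """Count rows per review reason, using the dry-run CSV's column names."""
    counts: Counter = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["extract_flag"] != "OK":
            counts[row["extract_flag"]] += 1
        if row["consistency"] == "MISMATCH":
            counts["MISMATCH"] += 1
        if row["dup_check"]:
            counts["DUP_CHECK"] += 1
    return counts


# Invented sample: one DUP_CHECK pair plus one ambiguous row.
sample = (
    "id,extract_flag,consistency,dup_check\n"
    "a,OK,OK,DUP_CHECK\n"
    "b,OK,OK,DUP_CHECK\n"
    "c,MULTI_NUMBER,NO_NUMBER,\n"
)
counts = summarize_flags(sample)
print(dict(counts))  # {'DUP_CHECK': 2, 'MULTI_NUMBER': 1}

# Per the step above, MULTI_NUMBER or MISMATCH rows are the stop condition.
unexpected = counts["MULTI_NUMBER"] + counts["MISMATCH"]
print("STOP and report" if unexpected else "only expected flags present")
```

On the real file, `csv_text` would be the contents of data/audit/fu2b-reconciliation-<ts>.csv read with UTF-8 encoding.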

Step 3: Commit the produced table as a review artifact

cd ~/legal-ai
git add data/audit/fu2b-reconciliation-*.md data/audit/fu2b-reconciliation-*.csv
git commit -m "chore(fu2b): dry-run reconciliation table for chair review (GAP-07/08)"

(If data/audit/ is gitignored, skip the commit and report the path instead — the table still exists on disk for review.)

Task 4: SCRIPTS.md + PR

Step 1: Register the script in scripts/SCRIPTS.md

Add a row to the active-scripts table (match the file's existing table format) describing fu2b_reconcile_internal_case_numbers.py: purpose (FU-2b internal case_number reconciliation, GAP-07/08), status (active, chair-gated), usage (dry-run default / --apply --approved).

Step 2: Full suite + commit + push + PR

cd ~/legal-ai/mcp-server && .venv/bin/python -m pytest tests/ -q   # report summary (expect all pass)
cd ~/legal-ai
git add scripts/SCRIPTS.md
git commit -m "docs(scripts): register fu2b reconciliation script (FU-2b)"
git push -u origin fix/fu2b-identifier-reconciliation

Then create the PR via the Gitea REST API (token from ~/.git-credentials) and merge per the standing PR+merge rule. The PR delivers the tooling + dry-run table; the production --apply is the separate gated step below.
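For orientation, a PR-creation call against the Gitea REST API could be assembled roughly as follows. Only the head branch name comes from this plan; the host, owner, repo, token, base branch, and title are placeholders, and `build_pr_request` is a hypothetical helper. The sketch builds the request without sending it.

```python
import json
import urllib.request


def build_pr_request(base_url, owner, repo, token, head, base, title):
    """Build (but do not send) a Gitea 'create pull request' API call."""
    payload = json.dumps({"head": head, "base": base, "title": title}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/v1/repos/{owner}/{repo}/pulls",
        data=payload,
        method="POST",
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )


# Every value here is a placeholder except the head branch from the plan.
req = build_pr_request(
    base_url="https://gitea.example.invalid",
    owner="OWNER",
    repo="REPO",
    token="TOKEN",
    head="fix/fu2b-identifier-reconciliation",
    base="main",  # assumption: the real base branch may differ
    title="FU-2b: internal case_number reconciliation tooling + dry-run table",
)
print(req.get_method(), req.full_url)
# Sending would be: urllib.request.urlopen(req)
```

Keeping construction separate from sending makes it easy to inspect the URL and payload before anything reaches the server.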

Task 5: [HUMAN GATE] Chair review + gated apply (NOT automated)

This task is the chair-approval gate. It is NOT executed by an implementer subagent.

Step 1: Present data/audit/fu2b-reconciliation-<ts>.md to the controller, who presents it to Dafna: the ~52 proposed changes + the 8047-23 ערר/בל"מ DUP_CHECK pair. Dafna confirms the mapping and adjudicates whether 8047/23 is two distinct proceedings (keep both) or a mis-tagged duplicate (manual delete, handled separately).

Step 2: Save the reviewed table as data/audit/fu2b-approved-<ts>.csv (rows Dafna approved; will_change=yes only for those).
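One way to produce the approved file is to copy the dry-run CSV and downgrade every non-approved row to will_change=no, since _apply only acts on rows with will_change=yes and a non-empty proposed_bare. The sketch below is illustrative: `build_approved_csv` and the sample ids are hypothetical, and the set of approved ids is whatever the chair review yields.

```python
from __future__ import annotations

import csv
import io


def build_approved_csv(reconciliation_csv: str, approved_ids: set[str]) -> str:
    """Copy the dry-run table, keeping will_change=yes only for approved ids."""
    reader = csv.DictReader(io.StringIO(reconciliation_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row["id"] not in approved_ids:
            row["will_change"] = "no"  # not approved, so --apply skips this row
        writer.writerow(row)
    return out.getvalue()


# Invented sample with only the columns that matter for the apply filter.
dry_run = "id,proposed_bare,will_change\n1,403-17,yes\n2,8047-23,yes\n"
print(build_approved_csv(dry_run, approved_ids={"1"}))
```

Keeping unapproved rows in the file (rather than deleting them) leaves a record of what was reviewed and declined.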

Step 3: Run the gated apply against the DB:

cd ~/legal-ai && set -a && source ~/.env && set +a
PY=/home/chaim/legal-ai/mcp-server/.venv/bin/python
$PY scripts/fu2b_reconcile_internal_case_numbers.py --apply --approved data/audit/fu2b-approved-<ts>.csv

Step 4: Verify: re-run dry-run → will_change=0 (idempotent); spot-check get_case_by_number still resolves a migrated case; confirm a backup CSV was written (revert path). Mark TaskMaster #67 done.
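If a revert is ever needed, the backup CSV (columns id, old_case_number) holds everything required. The following is a minimal sketch under stated assumptions: `load_revert_pairs` and `revert` are hypothetical helpers, `conn` is assumed to behave like an asyncpg connection whose execute returns a status string such as "UPDATE 1", and `FakeConn` exists only so the example runs without a database.

```python
from __future__ import annotations

import asyncio
import csv
import io


def load_revert_pairs(backup_csv: str) -> list[tuple[str, str]]:
    """Parse a backup CSV (id, old_case_number) into (id, old_value) pairs."""
    reader = csv.DictReader(io.StringIO(backup_csv))
    return [(r["id"], r["old_case_number"]) for r in reader]


async def revert(conn, pairs) -> int:
    """Restore old case_number values; returns how many rows were updated."""
    restored = 0
    for row_id, old in pairs:
        status = await conn.execute(
            "UPDATE case_law SET case_number=$2 WHERE id=$1 "
            "AND source_kind='internal_committee'", row_id, old)
        if status == "UPDATE 1":
            restored += 1
    return restored


class FakeConn:
    """Stand-in for a real connection so the sketch runs offline."""

    def __init__(self):
        self.calls = []

    async def execute(self, sql, *args):
        self.calls.append(args)
        return "UPDATE 1"


pairs = load_revert_pairs("id,old_case_number\nabc,ערר 403/17 פלוני\n")
conn = FakeConn()
print(asyncio.run(revert(conn, pairs)), conn.calls)
```

Against the real database the connection would come from the same pool the script uses, and the restored count should equal the number of rows in the backup file.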

Self-Review Notes

GAP-07/08 (internal) → Task 2 script + Task 3 dry-run + Task 5 gated apply. Canonical form per X1 (month preserved) — _extract_bare replaces only /→- on the single extracted token, never strips/pads a month.
Reversible: _apply writes fu2b-backup-<ts>.csv (id, old_case_number) before each UPDATE.
Chair gate: --apply requires --approved <csv>; production apply is Task 5 (human), not part of the PR merge.
Determinism / safety: 0/>1 token → flagged, never guessed; consistency + DUP_CHECK flags surface the 8047 edge.
Scope: source_kind='internal_committee' only (the UPDATE has the AND source_kind='internal_committee' guard); external → #68.
FK-safe: verified all 11 case_law FKs use id (UUID).
Type consistency: _extract_bare(case_number)->(bare|None,flag), _consistency_flag(bare,citation)->str — names match tests (Task 1) and script (Task 2).
