feat: Stage A finalizers + #35/#36/#37 — critical-gap closure

Four parallel sub-agents closed the remaining critical gaps from the 26/05 Stage A/B sprint. Each block was tested independently; the results are aggregated here.

## #30/#31 finalizers (sub-agent A)

* Auto-derive practice_area in case_create from the case_number prefix (1xxx→rishuy_uvniya, 8xxx→betterment_levy, 9xxx→compensation_197); the default for CaseCreateRequest is now "" (the DB constraint catches any stray "appeals_committee").
* practice_area.py: derive_subtype now handles axis-B domain values (rishuy_uvniya/betterment_levy/compensation_197) without parsing the case number; new helper derive_domain_practice_area().
* Halacha re-extraction verified unnecessary: all 6 reclassified records already had is_binding=false and approved halachot.
* Regression tests: 6 cases in tests/test_corpus_constraints.py covering the practice_area enum, internal-committee chair/district, the external-upload arar prefix, and the MCP guard.
* UI: district input → Select dropdown (7 districts) in precedent-edit-sheet.tsx, preserving legacy free-text values.

## #37 בל"מ (extension-of-time request) subtypes (sub-agent B)

* 3 new appeal_subtypes: extension_request_{building_permit,betterment_levy,compensation}. APPEALS_COMMITTEE_SUBTYPES extended, SUBTYPES_BY_AREA mappings added.
* New helpers: is_blam_subject(), is_blam_subtype(), derive_subtype_with_blam(case_number, subject, practice_area). case_create now uses the last of these to auto-detect "בקשה להארכת מועד" (request for extension of time) subjects.
* 3 methodology templates under docs/methodology/extension-request-*.md.
* paperclip_client.py mapping updated for the 3 new subtypes (extension_request_building_permit→CMP, the other two→CMPA).
* Frontend: bilingual "בל"מ" badge + filter dropdown on the cases list + detail header; appeal-type-bars collapseBlam() merges בל"מ into its parent domain for aggregate bars.
* Wizard auto-detects בל"מ from the subject during case creation.
* 3 Berlinger cases (1017/1018/1019-03-26) migrated to appeal_subtype=extension_request_building_permit via psql.

## #35 missing_precedents feature (sub-agent C)

* Schema V13: missing_precedents table (citation, case_id, party, legal_topic, status, linked_case_law_id, claim_quote, ...) + FK constraints + 3 indexes. Applied via psql + idempotent migration.
* 6 db.py service functions, 3 MCP tools, 6 FastAPI endpoints (POST/GET/PATCH/DELETE/upload; upload routes by citation prefix to ingest_internal_decision or ingest_precedent).
* Next.js page /missing-precedents with 5 status tabs + filters + sidebar badge counter + detail drawer with metadata edit + smart upload form that switches fields per committee/court.
* Bootstrap: 7 rows imported from the JSON file (3 citations × cases, all status=closed with linked_case_law_id).
* legal-researcher.md: new §2ב.5 with missing_precedent_create usage + dedup semantics + tool grant.

## #36 legal_arguments aggregation (sub-agent D)

* Schema V14: legal_arguments + legal_argument_propositions M:M. Applied via psql.
* New service argument_aggregator.py with two functions: aggregate_claims_to_arguments() (Claude CLI / claude_session) and get_legal_arguments(). Graceful llm_unavailable handling when the CLI is missing (containers).
* 2 MCP tools + 2 API endpoints (POST .../aggregate-arguments as BackgroundTask, GET .../legal-arguments).
* Frontend: shadcn Accordion + new legal-arguments-panel.tsx with hierarchical (party → priority badge → arguments) display, a "טיעונים" (Arguments) tab on the case page, and "חשב/חשב מחדש" (Compute/Recompute) buttons.
* scripts/backfill_legal_arguments.py + SCRIPTS.md entry; the dry-run found 8 candidate cases including 1017/1018/1019.

## Open follow-ups (intentionally deferred)

* npm run api:types in web-ui (CLAUDE.md flow): recommended before the next UI commit; not required for backend deployment.
* Run backfill_legal_arguments.py --apply once the container picks up the new aggregator service.
* Webhook to Paperclip on missing-precedents upload-close (optional).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
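
The case-number prefix rule from the #30/#31 notes (1xxx→rishuy_uvniya, 8xxx→betterment_levy, 9xxx→compensation_197) can be illustrated with a minimal sketch. This is not the repository's derive_domain_practice_area(), whose signature is not shown in this commit; the function name, the assumption that the serial before the first hyphen has exactly four digits, and the empty-string fallback for unrecognized prefixes are all hypothetical.

```python
# Hypothetical sketch of the prefix rule; NOT the repository's
# derive_domain_practice_area(). Mapping values come from the commit
# message; everything else is an assumption made for illustration.
PREFIX_TO_PRACTICE_AREA = {
    "1": "rishuy_uvniya",      # 1xxx
    "8": "betterment_levy",    # 8xxx
    "9": "compensation_197",   # 9xxx
}


def sketch_practice_area_from_prefix(case_number: str) -> str:
    """Return the domain practice_area for a case number, or "" if unknown.

    Assumes case numbers look like "1017-03-26": a four-digit serial
    followed by hyphen-separated parts.
    """
    serial = case_number.strip().split("-", 1)[0]
    if len(serial) == 4 and serial.isdigit():
        return PREFIX_TO_PRACTICE_AREA.get(serial[0], "")
    return ""


if __name__ == "__main__":
    for number in ("1017-03-26", "8123-01-25", "2345-01-25"):
        print(number, "->", repr(sketch_practice_area_from_prefix(number)))
```

Returning "" for an unrecognized prefix mirrors the new empty default on CaseCreateRequest, so a caller could fall through to an explicit value; whether the real helper behaves this way is not confirmed by the diff.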
2026-05-26 08:34:40 +00:00
parent af651d0135
commit f3cc9ca9d4
33 changed files with 4588 additions and 37 deletions
--- a/web/app.py
+++ b/web/app.py
@@ -1228,7 +1228,11 @@ class CaseCreateRequest(BaseModel):
    hearing_date: str = ""
    notes: str = ""
    expected_outcome: str = ""
-    practice_area: str = "appeals_committee"
+    # Empty default → cases_tools.case_create auto-derives the domain
+    # practice_area from the case_number prefix (1xxx→rishuy_uvniya,
+    # 8xxx→betterment_levy, 9xxx→compensation_197). Callers can still
+    # send a domain value explicitly.
+    practice_area: str = ""
    appeal_subtype: str = ""


@@ -1267,8 +1271,10 @@ async def api_case_create(req: CaseCreateRequest):
    )
    parsed = json.loads(result)

-    # Auto-create Paperclip project for the new case
-    appeal_type = req.appeal_subtype or "רישוי"
+    # Auto-create Paperclip project for the new case. case_create may have
+    # auto-derived appeal_subtype from the case-number prefix; prefer the
+    # resolved value over the (possibly empty) request value.
+    appeal_type = parsed.get("appeal_subtype") or req.appeal_subtype or "רישוי"
    try:
        pc_result = await pc_create_project(
            case_number=req.case_number,
@@ -1744,6 +1750,77 @@ async def api_get_claims(case_number: str):
    return {"case_number": case_number, "claims": claims_by_party, "total": len(rows)}


+# ── Legal Arguments (aggregated claims) ────────────────────────────
+# The aggregator groups raw ``claims`` rows into ~6-12 distinct legal
+# arguments per party. The heavy lifting (LLM call) runs in the local
+# MCP server context where Claude CLI is available; here we expose
+# read + trigger endpoints. The trigger is a BackgroundTask only when
+# Claude CLI is actually present in the runtime (i.e. dev box) — inside
+# the FastAPI container it short-circuits with status="llm_unavailable".
+
+@app.post("/api/cases/{case_number}/aggregate-arguments")
+async def api_aggregate_arguments(
+    case_number: str,
+    background_tasks: BackgroundTasks,
+    force: bool = False,
+):
+    """Aggregate raw claims into distinct legal arguments via Claude.
+
+    Runs as a BackgroundTask because the LLM pass can take 30-90 seconds.
+    """
+    case = await db.get_case_by_number(case_number)
+    if not case:
+        raise HTTPException(404, f"תיק {case_number} לא נמצא")
+
+    async def _run() -> None:
+        try:
+            from legal_mcp.services import argument_aggregator
+            result = await argument_aggregator.aggregate_claims_to_arguments(
+                UUID(case["id"]), force=force,
+            )
+            logger.info(
+                "aggregate_arguments[%s] finished: %s",
+                case_number, result,
+            )
+        except Exception as e:  # noqa: BLE001
+            logger.exception(
+                "aggregate_arguments[%s] failed: %s", case_number, e,
+            )
+
+    background_tasks.add_task(_run)
+    return {
+        "status": "started",
+        "case_number": case_number,
+        "force": force,
+        "message": "Aggregation started in background. Poll /legal-arguments for results.",
+    }
+
+
+@app.get("/api/cases/{case_number}/legal-arguments")
+async def api_get_legal_arguments(case_number: str, party: str = ""):
+    """Return aggregated legal arguments for a case, grouped by party."""
+    case = await db.get_case_by_number(case_number)
+    if not case:
+        raise HTTPException(404, f"תיק {case_number} לא נמצא")
+
+    from legal_mcp.services import argument_aggregator
+    args = await argument_aggregator.get_legal_arguments(
+        UUID(case["id"]), party=party,
+    )
+
+    # Group by party for the UI.
+    by_party: dict[str, list[dict]] = {}
+    for a in args:
+        by_party.setdefault(a["party"], []).append(a)
+
+    return {
+        "case_number": case_number,
+        "total": len(args),
+        "by_party": by_party,
+        "arguments": args,
+    }
+
+
 @app.post("/api/cases/{case_number}/direction")
 async def api_set_direction(case_number: str, req: DirectionRequest):
    """Save the approved direction document for the discussion block."""
@@ -4789,3 +4866,332 @@ async def halacha_update(halacha_id: str, req: HalachaUpdateRequest):
    if not row:
        raise HTTPException(404, "הלכה לא נמצאה")
    return row
+
+
+# ── Missing Precedents (TaskMaster #35) ────────────────────────────
+# Track citations from party briefs that aren't yet in the precedent
+# corpus. Researcher logs gaps; chair closes them by uploading the
+# actual decision via /api/precedent-library/upload or
+# /api/internal-decisions/upload, then links via the upload endpoint
+# here which delegates to one of those depending on the citation type.
+
+
+_ALLOWED_MP_PARTIES = {
+    "appellant", "respondent", "committee", "permit_applicant", "unknown",
+}
+_ALLOWED_MP_STATUS = {"open", "uploaded", "closed", "irrelevant"}
+
+
+class MissingPrecedentCreate(BaseModel):
+    citation: str
+    case_number: str = ""  # cited-in case
+    cited_in_document_id: str | None = None
+    cited_by_party: Literal[
+        "appellant", "respondent", "committee", "permit_applicant", "unknown",
+    ] = "unknown"
+    cited_by_party_name: str | None = None
+    legal_topic: str | None = None
+    legal_issue: str | None = None
+    claim_quote: str | None = None
+    case_name: str | None = None
+    notes: str | None = None
+
+
+class MissingPrecedentPatch(BaseModel):
+    legal_topic: str | None = None
+    legal_issue: str | None = None
+    notes: str | None = None
+    cited_by_party: Literal[
+        "appellant", "respondent", "committee", "permit_applicant", "unknown",
+    ] | None = None
+    cited_by_party_name: str | None = None
+    case_name: str | None = None
+    status: Literal["open", "uploaded", "closed", "irrelevant"] | None = None
+    citation: str | None = None
+    claim_quote: str | None = None
+
+
+def _is_internal_committee_citation(citation: str) -> bool:
+    """Detect ועדת ערר citations — must go through internal_decision_upload
+    so they get chair_name + district. The legacy library upload doesn't
+    enforce those fields and the records end up un-searchable by chair."""
+    norm = citation.strip()
+    committee_prefixes = ("ערר ", "ערר(", "בל\"מ ", "בל\"מ(", "ARAR ")
+    return any(norm.startswith(p) for p in committee_prefixes)
+
+
+@app.post("/api/missing-precedents")
+async def missing_precedent_create(req: MissingPrecedentCreate):
+    """Log a new missing precedent (status='open'). Dedupes by
+    (citation, cited_in_case_id) — duplicate POST returns the existing row."""
+    if not req.citation.strip():
+        raise HTTPException(400, "citation חובה")
+
+    case_id: UUID | None = None
+    if req.case_number.strip():
+        c = await db.get_case_by_number(req.case_number.strip())
+        if not c:
+            raise HTTPException(404, f"תיק לא נמצא: {req.case_number}")
+        case_id = UUID(c["id"])
+
+    doc_id: UUID | None = None
+    if req.cited_in_document_id:
+        try:
+            doc_id = UUID(req.cited_in_document_id)
+        except ValueError:
+            raise HTTPException(400, "cited_in_document_id לא תקין")
+
+    existing = await db.find_missing_precedent_by_citation(
+        citation=req.citation.strip(),
+        case_id=case_id,
+    )
+    if existing:
+        return {**existing, "_duplicate": True}
+
+    row = await db.create_missing_precedent(
+        citation=req.citation.strip(),
+        case_name=req.case_name,
+        cited_in_case_id=case_id,
+        cited_in_document_id=doc_id,
+        cited_by_party=req.cited_by_party,
+        cited_by_party_name=req.cited_by_party_name,
+        legal_topic=req.legal_topic,
+        legal_issue=req.legal_issue,
+        claim_quote=req.claim_quote,
+        notes=req.notes,
+    )
+    return row
+
+
+@app.get("/api/missing-precedents")
+async def missing_precedents_list(
+    status: str = "",
+    case_id: str = "",
+    case_number: str = "",
+    legal_topic: str = "",
+    limit: int = 200,
+    offset: int = 0,
+):
+    """List missing precedents, optionally filtered by status / case."""
+    s = status.strip() or None
+    if s and s not in _ALLOWED_MP_STATUS:
+        raise HTTPException(400, f"status לא תקין: {status}")
+
+    case_uuid: UUID | None = None
+    if case_id.strip():
+        try:
+            case_uuid = UUID(case_id.strip())
+        except ValueError:
+            raise HTTPException(400, "case_id לא תקין")
+    elif case_number.strip():
+        c = await db.get_case_by_number(case_number.strip())
+        if not c:
+            raise HTTPException(404, f"תיק לא נמצא: {case_number}")
+        case_uuid = UUID(c["id"])
+
+    rows = await db.list_missing_precedents(
+        status=s,
+        case_id=case_uuid,
+        legal_topic=legal_topic.strip() or None,
+        limit=max(1, min(int(limit), 500)),
+        offset=max(0, int(offset)),
+    )
+    # Counters useful for the sidebar badge.
+    pool = await db.get_pool()
+    async with pool.acquire() as conn:
+        counts = await conn.fetch(
+            "SELECT status, COUNT(*) AS n FROM missing_precedents GROUP BY status"
+        )
+    by_status = {r["status"]: r["n"] for r in counts}
+    return {
+        "items": rows,
+        "count": len(rows),
+        "by_status": by_status,
+        "total_open": by_status.get("open", 0),
+    }
+
+
+@app.get("/api/missing-precedents/{mp_id}")
+async def missing_precedent_get(mp_id: str):
+    try:
+        uid = UUID(mp_id)
+    except ValueError:
+        raise HTTPException(400, "id לא תקין")
+    row = await db.get_missing_precedent(uid)
+    if not row:
+        raise HTTPException(404, "רשומה לא נמצאה")
+    return row
+
+
+@app.patch("/api/missing-precedents/{mp_id}")
+async def missing_precedent_update(mp_id: str, req: MissingPrecedentPatch):
+    try:
+        uid = UUID(mp_id)
+    except ValueError:
+        raise HTTPException(400, "id לא תקין")
+    fields = {k: v for k, v in req.model_dump(exclude_unset=True).items() if v is not None}
+    if not fields:
+        row = await db.get_missing_precedent(uid)
+        if not row:
+            raise HTTPException(404, "רשומה לא נמצאה")
+        return row
+    try:
+        row = await db.update_missing_precedent(uid, **fields)
+    except ValueError as e:
+        raise HTTPException(400, str(e))
+    if not row:
+        raise HTTPException(404, "רשומה לא נמצאה")
+    return row
+
+
+@app.delete("/api/missing-precedents/{mp_id}")
+async def missing_precedent_delete(mp_id: str):
+    try:
+        uid = UUID(mp_id)
+    except ValueError:
+        raise HTTPException(400, "id לא תקין")
+    pool = await db.get_pool()
+    async with pool.acquire() as conn:
+        result = await conn.execute(
+            "DELETE FROM missing_precedents WHERE id = $1", uid,
+        )
+    deleted = int(result.split()[-1]) > 0
+    if not deleted:
+        raise HTTPException(404, "רשומה לא נמצאה")
+    return {"deleted": True, "id": mp_id}
+
+
+@app.post("/api/missing-precedents/{mp_id}/upload")
+async def missing_precedent_upload(
+    mp_id: str,
+    file: UploadFile = File(...),
+    case_number: str = Form(""),  # for internal-committee path
+    chair_name: str = Form(""),
+    district: str = Form(""),
+    case_name: str = Form(""),
+    court: str = Form(""),
+    decision_date: str = Form(""),
+    practice_area: str = Form(""),
+    appeal_subtype: str = Form(""),
+    subject_tags: str = Form("[]"),
+    is_binding: bool = Form(True),
+    headnote: str = Form(""),
+    summary: str = Form(""),
+    precedent_level: str = Form(""),
+    source_type: str = Form(""),
+):
+    """Upload the decision file behind a missing-precedent and link it.
+
+    Routes to ingest_internal_decision if the citation looks like a
+    committee decision (ערר / בל"מ prefix), otherwise to ingest_precedent.
+    Once the case_law row is created, the missing_precedents row is marked
+    status='closed' with linked_case_law_id pointing to the new row.
+    """
+    try:
+        uid = UUID(mp_id)
+    except ValueError:
+        raise HTTPException(400, "id לא תקין")
+    mp = await db.get_missing_precedent(uid)
+    if not mp:
+        raise HTTPException(404, "רשומה לא נמצאה")
+    if mp["status"] in {"closed", "uploaded"} and mp.get("linked_case_law_id"):
+        raise HTTPException(409, "הרשומה כבר נסגרה — הסר קישור לפני העלאה חוזרת")
+
+    suffix = Path(file.filename or "").suffix.lower()
+    if suffix not in ALLOWED_EXTENSIONS:
+        raise HTTPException(400, f"סוג קובץ לא נתמך: {suffix}")
+
+    UPLOAD_DIR.mkdir(parents=True, exist_ok=True)
+    staged = UPLOAD_DIR / f"mp_{uuid4().hex[:8]}_{file.filename}"
+    size = 0
+    with staged.open("wb") as out:
+        while chunk := await file.read(1024 * 1024):
+            size += len(chunk)
+            if size > MAX_FILE_SIZE:
+                staged.unlink(missing_ok=True)
+                raise HTTPException(413, "קובץ גדול מדי")
+            out.write(chunk)
+
+    try:
+        tags = json.loads(subject_tags) if subject_tags else []
+        if not isinstance(tags, list):
+            tags = []
+    except json.JSONDecodeError:
+        tags = []
+
+    citation = mp["citation"]
+    is_committee = _is_internal_committee_citation(citation)
+    case_law_id: str | None = None
+    closed: dict | None = None
+
+    try:
+        if is_committee:
+            if not chair_name.strip() or not district.strip():
+                raise HTTPException(
+                    400,
+                    "החלטת ועדת ערר דורשת chair_name + district",
+                )
+            # case_number for the committee decision (not the cited-in case)
+            committee_case_number = case_number.strip() or citation
+            result = await int_decisions_service.ingest_internal_decision(
+                case_number=committee_case_number,
+                case_name=(case_name.strip() or mp.get("case_name") or "").strip(),
+                court=court.strip(),
+                decision_date=decision_date or None,
+                chair_name=chair_name.strip(),
+                district=district.strip(),
+                practice_area=practice_area,
+                appeal_subtype=appeal_subtype.strip(),
+                subject_tags=tags,
+                is_binding=is_binding,
+                summary=summary.strip(),
+                file_path=staged,
+            )
+            case_law_id = (
+                result.get("case_law_id") if isinstance(result, dict) else None
+            )
+        else:
+            if practice_area and practice_area not in _PRACTICE_AREAS:
+                raise HTTPException(400, "practice_area לא תקין")
+            if source_type and source_type not in _SOURCE_TYPES:
+                raise HTTPException(400, "source_type לא תקין")
+            result = await plib_service.ingest_precedent(
+                file_path=staged,
+                citation=citation,
+                case_name=(case_name.strip() or mp.get("case_name") or "").strip(),
+                court=court.strip(),
+                decision_date=decision_date or None,
+                source_type=source_type or "court_ruling",
+                precedent_level=precedent_level,
+                practice_area=practice_area,
+                appeal_subtype=appeal_subtype.strip(),
+                subject_tags=tags,
+                is_binding=is_binding,
+                headnote=headnote.strip(),
+                summary=summary.strip(),
+            )
+            case_law_id = (
+                result.get("case_law_id") if isinstance(result, dict) else None
+            )
+
+        if not case_law_id:
+            raise HTTPException(500, "לא התקבל case_law_id מההעלאה")
+
+        try:
+            closed = await db.close_missing_precedent(
+                mp_id=uid,
+                linked_case_law_id=UUID(case_law_id),
+                notes=mp.get("notes"),
+                status="closed",
+            )
+        except Exception as e:
+            logger.exception("missing-precedent close failed")
+            raise HTTPException(500, f"קישור הרשומה נכשל: {e}")
+    finally:
+        staged.unlink(missing_ok=True)
+
+    return {
+        "missing_precedent": closed,
+        "case_law_id": case_law_id,
+        "route": "internal_committee" if is_committee else "external_upload",
+    }
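
The routing decision made by missing_precedent_upload in the diff above depends only on the citation prefix, so it can be exercised in isolation. The sketch below copies the prefix tuple from _is_internal_committee_citation and adds a hypothetical upload_route() helper that returns the same "route" strings the endpoint reports; the sample citations are invented for illustration and are not taken from the corpus.

```python
# Standalone sketch of the prefix-based routing. The prefix tuple is copied
# from the diff; upload_route() and the sample citations are hypothetical.
COMMITTEE_PREFIXES = ("ערר ", "ערר(", 'בל"מ ', 'בל"מ(', "ARAR ")


def is_internal_committee_citation(citation: str) -> bool:
    """True when the citation starts with a committee prefix after trimming."""
    # str.startswith accepts a tuple and matches any of its members.
    return citation.strip().startswith(COMMITTEE_PREFIXES)


def upload_route(citation: str) -> str:
    """Mirror the 'route' value returned by the upload endpoint."""
    if is_internal_committee_citation(citation):
        return "internal_committee"
    return "external_upload"


if __name__ == "__main__":
    samples = [
        "ערר 1017-03-26",    # committee prefix followed by a space
        'בל"מ(ים) 55/20',    # committee prefix followed by a parenthesis
        'עע"מ 1234/20',      # court citation, no committee prefix
        "ערר1017-03-26",     # no space or parenthesis after the prefix
    ]
    for citation in samples:
        print(upload_route(citation), "<-", citation)
```

Because every prefix ends in a space or an opening parenthesis, a citation written without that separator (the last sample) falls through to the external path, which may be worth keeping in mind when citations are entered by hand.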