feat: Stage A finalizers + #35/#36/#37 — critical-gap closure
Some checks failed
Build & Deploy / build-and-deploy (push) Has been cancelled
Four parallel sub-agents closed the remaining critical gaps from the 26/05 Stage A/B sprint. Each block was tested independently; the results are aggregated here.

## #30/#31 finalizers (sub-agent A)

* Auto-derive practice_area in case_create from the case_number prefix (1xxx→rishuy_uvniya, 8xxx→betterment_levy, 9xxx→compensation_197); the default for CaseCreateRequest is now "" (the DB constraint catches any stray "appeals_committee").
* practice_area.py: derive_subtype now handles axis-B domain values (rishuy_uvniya/betterment_levy/compensation_197) without parsing the case number; new helper derive_domain_practice_area().
* Halacha re-extraction verified unnecessary: all 6 reclassified records already had is_binding=false and approved halachot.
* Regression tests: 6 cases in tests/test_corpus_constraints.py covering the practice_area enum, internal-committee chair/district, the external-upload arar prefix, and the MCP guard.
* UI: district input → Select dropdown (7 districts) in precedent-edit-sheet.tsx, preserving legacy free-text values.

## #37 בל"מ (extension-of-time request) subtypes (sub-agent B)

* 3 new appeal_subtypes: extension_request_{building_permit, betterment_levy, compensation}. APPEALS_COMMITTEE_SUBTYPES extended, SUBTYPES_BY_AREA mappings added.
* New helpers: is_blam_subject(), is_blam_subtype(), derive_subtype_with_blam(case_number, subject, practice_area). case_create now uses the latter to auto-detect "בקשה להארכת מועד" (request for extension of time) subjects.
* 3 methodology templates under docs/methodology/extension-request-*.md.
* paperclip_client.py mapping updated for the 3 new subtypes (extension_request_building_permit→CMP, the other two→CMPA).
* Frontend: bilingual "בל"מ" badge and filter dropdown on the cases list and the detail header; appeal-type-bars collapseBlam() merges בל"מ into its parent domain for aggregate bars.
* Wizard auto-detects בל"מ from the subject during case creation.
* 3 Berlinger cases (1017/1018/1019-03-26) migrated to appeal_subtype=extension_request_building_permit via psql.

## #35 missing_precedents feature (sub-agent C)

* Schema V13: missing_precedents table (citation, case_id, party, legal_topic, status, linked_case_law_id, claim_quote, ...) + FK constraints + 3 indexes. Applied via psql + idempotent migration.
* 6 db.py service functions, 3 MCP tools, 6 FastAPI endpoints (POST/GET/PATCH/DELETE/upload; upload routes by citation prefix to ingest_internal_decision or ingest_precedent).
* Next.js page /missing-precedents with 5 status tabs, filters, a sidebar badge counter, a detail drawer with metadata edit, and a smart upload form that switches fields per committee/court.
* Bootstrap: 7 rows imported from the JSON file (3 citations × cases, all status=closed with linked_case_law_id).
* legal-researcher.md: new §2ב.5 with missing_precedent_create usage, dedup semantics, and the tool grant.

## #36 legal_arguments aggregation (sub-agent D)

* Schema V14: legal_arguments + legal_argument_propositions M:M. Applied via psql.
* New service argument_aggregator.py with two functions: aggregate_claims_to_arguments() (Claude CLI / claude_session) and get_legal_arguments(). Graceful llm_unavailable handling when the CLI is missing (containers).
* 2 MCP tools + 2 API endpoints (POST .../aggregate-arguments as BackgroundTask, GET .../legal-arguments).
* Frontend: shadcn Accordion + new legal-arguments-panel.tsx with a hierarchical (party → priority badge → arguments) display, a "טיעונים" (Arguments) tab on the case page, and "חשב/חשב מחדש" (Compute/Recompute) buttons.
* scripts/backfill_legal_arguments.py + SCRIPTS.md entry; the dry run found 8 candidate cases, including 1017/1018/1019.

## Open follow-ups (intentionally deferred)

* npm run api:types in web-ui (CLAUDE.md flow): recommended before the next UI commit; not required for backend deployment.
* Run backfill_legal_arguments.py --apply once the container picks up the new aggregator service.
* Webhook on missing-precedents upload-close to Paperclip (optional).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
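The prefix rule listed under #30/#31 can be sketched as follows. This is a minimal illustration and not the code in practice_area.py: the function name, the lookup table, and the sample case numbers other than 1017-03-26 are assumptions made for the example, and it assumes that "1xxx" means "the first digit of the case number is 1".

```python
# Hypothetical sketch of the case_number prefix rule from the commit notes.
# The real helper (derive_domain_practice_area) is not shown in this diff
# and may behave differently.
PREFIX_TO_PRACTICE_AREA = {
    "1": "rishuy_uvniya",
    "8": "betterment_levy",
    "9": "compensation_197",
}


def derive_practice_area_from_prefix(case_number: str) -> str:
    """Return the domain practice_area for a case number, or "" if unknown."""
    first_digit = case_number.strip()[:1]
    return PREFIX_TO_PRACTICE_AREA.get(first_digit, "")


print(derive_practice_area_from_prefix("1017-03-26"))  # rishuy_uvniya
print(derive_practice_area_from_prefix("8123-01-25"))  # betterment_levy
print(repr(derive_practice_area_from_prefix("5000-01-25")))  # ''
```

Returning "" for an unrecognized prefix matches the new empty default described above, leaving it to the database constraint to reject values that should not be stored.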
@@ -54,6 +54,8 @@ from legal_mcp.tools import ( # noqa: E402
    cases, documents, search, drafting, workflow, precedents,
    precedent_library as plib,
    internal_decisions as int_tools,
    legal_arguments as la_tools,
    missing_precedents as mp_tools,
)

@@ -364,6 +366,28 @@ async def get_claims(
    return await documents.get_claims(case_number, party_role)


# Legal arguments: aggregated (de-duped) propositions
@mcp.tool()
async def aggregate_claims_to_arguments(
    case_number: str,
    force: bool = False,
) -> str:
    """Consolidate raw propositions (claims) into distinct legal arguments, roughly 6-12 per party.

    Uses Claude headless for classification and grouping. force=True deletes existing arguments before recomputing.
    """
    return await la_tools.aggregate_claims_to_arguments(case_number, force=force)


@mcp.tool()
async def get_legal_arguments(
    case_number: str,
    party: str = "",
) -> str:
    """Fetch aggregated legal arguments. party: appellant/respondent/committee/permit_applicant (empty = all)."""
    return await la_tools.get_legal_arguments(case_number, party)


# References
@mcp.tool()
async def extract_references(
@@ -703,6 +727,82 @@ async def internal_decision_upload(
    )


# ── Missing precedents (TaskMaster #35) ───────────────────────────


@mcp.tool()
async def missing_precedent_create(
    citation: str,
    case_number: str = "",
    cited_in_document_id: str = "",
    cited_by_party: str = "unknown",
    cited_by_party_name: str = "",
    legal_topic: str = "",
    legal_issue: str = "",
    claim_quote: str = "",
    case_name: str = "",
    notes: str = "",
) -> str:
    """Record case law that was cited in the pleadings but is not in the corpus.

    Usage: the research agent (legal-researcher) calls this when it finds a
    citation that cannot be verified against the corpus. The record stays
    'open' until the chair uploads the decision.
    cited_by_party: appellant / respondent / committee / permit_applicant / unknown.
    Automatic de-duplication: an identical citation + case returns the existing record.
    """
    return await mp_tools.missing_precedent_create(
        citation=citation,
        case_number=case_number,
        cited_in_document_id=cited_in_document_id,
        cited_by_party=cited_by_party,
        cited_by_party_name=cited_by_party_name,
        legal_topic=legal_topic,
        legal_issue=legal_issue,
        claim_quote=claim_quote,
        case_name=case_name,
        notes=notes,
    )


@mcp.tool()
async def missing_precedent_list(
    case_number: str = "",
    status: str = "open",
    legal_topic: str = "",
    limit: int = 50,
) -> str:
    """List missing precedents for one case or across all cases. status: open/uploaded/closed/irrelevant.

    Usage: the chair sees what is awaiting upload; the agent confirms it is not creating duplicates.
    """
    return await mp_tools.missing_precedent_list(
        case_number=case_number,
        status=status,
        legal_topic=legal_topic,
        limit=limit,
    )


@mcp.tool()
async def missing_precedent_close(
    id: str,
    linked_case_law_id: str = "",
    notes: str = "",
    status: str = "closed",
) -> str:
    """Close a missing-precedent record after the decision is uploaded to the corpus.

    status: closed (uploaded and linked) / uploaded (uploaded, awaiting linking) /
    irrelevant (the chair decided it is not relevant to the corpus).
    """
    return await mp_tools.missing_precedent_close(
        id=id,
        linked_case_law_id=linked_case_law_id,
        notes=notes,
        status=status,
    )


@mcp.tool()
async def record_chair_feedback(
    case_number: str,

mcp-server/src/legal_mcp/services/argument_aggregator.py
Normal file, 358 lines
@@ -0,0 +1,358 @@
"""Consolidating propositions into distinct legal arguments (argument de-duplication).

Workflow:
1. ``claims_extractor`` extracts ~20-30 raw propositions per litigation
   brief into the ``claims`` table.
2. This module groups those raw propositions, per party, into 6-12
   distinct legal arguments via Claude headless (``claude_session``).
3. The result is stored in ``legal_arguments`` plus
   ``legal_argument_propositions`` (M:M join) so we keep traceability
   back to the source claims.

Manually de-duping 184 propositions in 3 cases yielded 82 arguments
(~24/case); see
``data/cases/{1017,1018,1019}-03-26/documents/research/legal-arguments.md``
for the gold standard.

**Architectural constraint**: ``claude_session`` only works from the local
MCP server (Claude CLI is not installed in the FastAPI container). Calls
from ``web/`` must go through MCP tools; calls from MCP tools land here
directly.
"""

from __future__ import annotations

import json
import logging
from uuid import UUID

from legal_mcp.services import claude_session, db

logger = logging.getLogger(__name__)

# Allowed enum values mirror the DB CHECK constraints.
ALLOWED_PARTIES = {"appellant", "respondent", "committee", "permit_applicant", "unknown"}
ALLOWED_PRIORITIES = {"threshold", "substantive", "procedural", "relief"}

# Hebrew labels for the prompt (Claude needs context in the same
# language as the source material).
PARTY_LABELS_HE = {
    "appellant": "עוררים",
    "respondent": "משיבים",
    "committee": "ועדה מקומית",
    "permit_applicant": "מבקשי היתר",
    "unknown": "צד לא מזוהה",
}


AGGREGATE_PROMPT_TEMPLATE = """אתה מנתח כתבי טענות בתחום תכנון ובנייה (ועדת ערר).

לפניך {n} פרופוזיציות גולמיות שחולצו ממסמכי {party_he} בתיק ערר.
מטרתך: לקבץ אותן ל-{target_min}-{target_max} **טיעונים משפטיים מובחנים**
(ארגומנטים אמיתיים, לא חזרה מילולית של הפרופוזיציות).

## כללי איגוד:
1. **טיעון אמיתי = רעיון משפטי אחד** — לא רשימה של פרופוזיציות, אלא טענה משפטית עצמאית.
2. **מקבצים פרופוזיציות שתומכות באותו רעיון משפטי** — גם אם הניסוח שלהן שונה.
3. **מפרידים בין סוגי טענות**:
   - **threshold** = טענות סף (זכות עמידה, סמכות, מועדים, שיהוי)
   - **substantive** = טענות מהותיות (תחולת חוק, פרשנות, חישוב)
   - **procedural** = פגמי הליך (פרסום, פרוטוקול, ניגוד עניינים)
   - **relief** = סעדים מבוקשים / סיכומים
4. **כותרת קצרה ובהירה** — תיאורית, לא משפטית מפורטת. 5-15 מילים.
5. **גוף הטיעון בפסקה אחת** — 3-7 שורות עברית, נאמן למקור.
6. **שמירת ה-claim_ids המקוריים** — לכל טיעון, רשום אילו פרופוזיציות תומכות בו.

## פלט:
החזר JSON בלבד (ללא markdown, ללא הסברים), array של אובייקטים:
```
[
  {{
    "title": "כותרת קצרה של הטיעון",
    "body": "גוף הטיעון בפסקה אחת",
    "topic": "סוגיה משפטית קצרה (לדוגמה: 'זכות עמידה', 'תחולת תמ\\"א 38')",
    "priority": "threshold|substantive|procedural|relief",
    "claim_ids": ["uuid-1", "uuid-2"]
  }}
]
```

## הפרופוזיציות:
{propositions_json}
"""


def _build_prompt(party: str, propositions: list[dict]) -> str:
    """Compose the per-party aggregation prompt."""
    n = len(propositions)
    # Target range: roughly one argument per 2-4 propositions, with a floor
    # of 4. The cap of 12 bounds target_max only while target_min is below 12.
    target_min = max(4, n // 4)
    target_max = max(target_min + 1, min(12, n // 2 + 1))

    party_he = PARTY_LABELS_HE.get(party, party)
    # Strip noise from propositions for the prompt: Claude only needs
    # the id and the text to do the grouping.
    compact = [
        {"id": str(p["id"]), "text": p["claim_text"]}
        for p in propositions
    ]
    propositions_json = json.dumps(compact, ensure_ascii=False, indent=2)

    return AGGREGATE_PROMPT_TEMPLATE.format(
        n=n,
        party_he=party_he,
        target_min=target_min,
        target_max=target_max,
        propositions_json=propositions_json,
    )


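To make the target arithmetic in `_build_prompt` concrete, the sketch below copies the two expressions into a standalone helper and evaluates them for a few proposition counts. The helper name `target_range` is hypothetical and exists only for this illustration.

```python
def target_range(n: int) -> tuple[int, int]:
    """Mirror the target_min / target_max arithmetic used in _build_prompt."""
    target_min = max(4, n // 4)
    target_max = max(target_min + 1, min(12, n // 2 + 1))
    return target_min, target_max


for n in (8, 30, 100):
    print(n, target_range(n))
# 8 (4, 5)
# 30 (7, 12)
# 100 (25, 26)
```

As the last row shows, the upper bound stays at 12 only while `n // 4` is below 12 (fewer than 48 propositions); beyond that, both bounds grow with `n`.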
def _normalize_argument(raw: dict, fallback_topic: str = "") -> dict | None:
    """Validate & normalize a single argument dict from Claude.

    Returns None if the row is unusable (missing required fields).
    """
    if not isinstance(raw, dict):
        return None
    title = (raw.get("title") or "").strip()
    body = (raw.get("body") or "").strip()
    if not title or not body:
        return None
    priority = raw.get("priority", "substantive")
    if priority not in ALLOWED_PRIORITIES:
        priority = "substantive"
    topic = (raw.get("topic") or fallback_topic or "").strip() or None
    claim_ids_raw = raw.get("claim_ids") or []
    claim_ids: list[UUID] = []
    if isinstance(claim_ids_raw, list):
        for cid in claim_ids_raw:
            try:
                claim_ids.append(UUID(str(cid)))
            except (ValueError, TypeError):
                continue
    return {
        "title": title,
        "body": body,
        "topic": topic,
        "priority": priority,
        "claim_ids": claim_ids,
    }


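The claim-id filtering step in `_normalize_argument` can be tried in isolation. The following is a small sketch of that step only; the helper name `parse_claim_ids` and the sample values are made up for the example.

```python
from uuid import UUID


def parse_claim_ids(raw) -> list[UUID]:
    """Keep only values that parse as UUIDs; silently drop everything else."""
    parsed: list[UUID] = []
    if not isinstance(raw, list):
        return parsed
    for value in raw:
        try:
            parsed.append(UUID(str(value)))
        except (ValueError, TypeError):
            # Placeholder ids such as "uuid-1" or None end up here.
            continue
    return parsed


sample = ["12345678-1234-5678-1234-567812345678", "uuid-1", None, 7]
print(parse_claim_ids(sample))
# [UUID('12345678-1234-5678-1234-567812345678')]
```

Parsing only checks the format. An id that is well formed but does not exist in the `claims` table still passes this step and can only be caught when the join row is inserted.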
async def _aggregate_party(
    party: str, propositions: list[dict],
) -> list[dict]:
    """Ask Claude to group one party's propositions; return normalized rows."""
    if not propositions:
        return []
    prompt = _build_prompt(party, propositions)

    try:
        raw_result = await claude_session.query_json(prompt)
    except RuntimeError as e:
        # Surface CLI-unavailable specifically so the caller can report
        # cleanly instead of crashing the whole job.
        raise RuntimeError(
            f"argument_aggregator: claude_session.query_json failed for party "
            f"'{party}': {e}"
        ) from e

    if not isinstance(raw_result, list):
        logger.warning(
            "argument_aggregator: Claude returned non-list (%s) for party '%s'",
            type(raw_result).__name__, party,
        )
        return []

    out: list[dict] = []
    for entry in raw_result:
        norm = _normalize_argument(entry)
        if norm:
            out.append(norm)
    return out


async def aggregate_claims_to_arguments(
    case_id: UUID, force: bool = False,
) -> dict:
    """For a given case, group existing claims into distinct legal arguments.

    Args:
        case_id: The case UUID.
        force: If True, delete existing ``legal_arguments`` for the case
            before aggregating. Otherwise short-circuit if any rows exist.

    Returns:
        A summary dict:
        ``{"status": "completed"|"completed_with_errors"|"skipped"|"no_claims"|"llm_unavailable",
        "by_party": {party: count}, "total": int, "message": ...}``
        ``errors`` is present only when status is "completed_with_errors".
    """
    pool = await db.get_pool()

    async with pool.acquire() as conn:
        existing = await conn.fetchval(
            "SELECT COUNT(*) FROM legal_arguments WHERE case_id = $1",
            case_id,
        )
        if existing and not force:
            return {
                "status": "skipped",
                "message": f"Found {existing} existing arguments. Use force=True to re-run.",
                "total": existing,
            }

        if force and existing:
            await conn.execute(
                "DELETE FROM legal_arguments WHERE case_id = $1", case_id,
            )

        # Pull all claims for this case, ordered by party.
        rows = await conn.fetch(
            """SELECT id, party_role, claim_text, claim_index, source_document
               FROM claims
               WHERE case_id = $1
               ORDER BY party_role, claim_index""",
            case_id,
        )

    if not rows:
        return {
            "status": "no_claims",
            "message": "No claims found for this case. Run extract_claims first.",
            "total": 0,
        }

    # Group propositions by party.
    by_party: dict[str, list[dict]] = {}
    for r in rows:
        party = r["party_role"]
        # Map deprecated 'appraiser' or unknown labels to 'unknown'.
        if party not in ALLOWED_PARTIES:
            party = "unknown"
        by_party.setdefault(party, []).append(dict(r))

    party_counts: dict[str, int] = {}
    inserted = 0
    errors: list[str] = []

    for party, props in by_party.items():
        try:
            arguments = await _aggregate_party(party, props)
        except RuntimeError as e:
            # Most likely cause: the Claude CLI is not installed (running
            # from the container). In that case stop and report
            # llm_unavailable; for any other failure, record the error and
            # continue with the next party.
            msg = str(e)
            if "Claude CLI not found" in msg:
                return {
                    "status": "llm_unavailable",
                    "message": (
                        "Claude CLI not available. This service must run from "
                        "the local MCP server (not the FastAPI container)."
                    ),
                    "total": 0,
                }
            errors.append(f"{party}: {msg}")
            continue

        if not arguments:
            party_counts[party] = 0
            continue

        async with pool.acquire() as conn:
            async with conn.transaction():
                for idx, arg in enumerate(arguments):
                    arg_id = await conn.fetchval(
                        """INSERT INTO legal_arguments
                               (case_id, party, argument_index, argument_title,
                                argument_body, legal_topic, priority)
                           VALUES ($1, $2, $3, $4, $5, $6, $7)
                           RETURNING id""",
                        case_id,
                        party,
                        idx + 1,
                        arg["title"],
                        arg["body"],
                        arg["topic"],
                        arg["priority"],
                    )
                    for cid in arg["claim_ids"]:
                        try:
                            # Nested transaction (savepoint): a failed insert
                            # must not abort the outer transaction.
                            async with conn.transaction():
                                await conn.execute(
                                    """INSERT INTO legal_argument_propositions
                                           (argument_id, claim_id)
                                       VALUES ($1, $2)
                                       ON CONFLICT DO NOTHING""",
                                    arg_id, cid,
                                )
                        except Exception as e:  # noqa: BLE001
                            # Likely FK violation if the LLM hallucinated
                            # a claim_id. Log and continue.
                            logger.warning(
                                "argument_aggregator: skipped bad claim_id %s for arg %s: %s",
                                cid, arg_id, e,
                            )
                    inserted += 1
        party_counts[party] = len(arguments)

    result: dict = {
        "status": "completed",
        "total": inserted,
        "by_party": party_counts,
        "propositions_processed": len(rows),
    }
    if errors:
        result["errors"] = errors
        result["status"] = "completed_with_errors"
    return result


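A caller has to branch on the `status` field of the summary dict returned by `aggregate_claims_to_arguments`. The sketch below shows one way to turn that dict into a one-line report; the function name `describe_result` and the message wording are assumptions for illustration, while the status values are the ones the function above returns.

```python
def describe_result(result: dict) -> str:
    """Turn the aggregator's summary dict into a short human-readable line."""
    status = result.get("status", "")
    total = result.get("total", 0)
    if status == "completed":
        return f"aggregated {total} arguments"
    if status == "completed_with_errors":
        errors = "; ".join(result.get("errors", []))
        return f"aggregated {total} arguments, with errors: {errors}"
    if status in {"skipped", "no_claims", "llm_unavailable"}:
        # These three statuses carry an explanatory message.
        return f"{status}: {result.get('message', '')}"
    return f"unexpected status: {status!r}"


print(describe_result({"status": "completed", "total": 9}))
# aggregated 9 arguments
print(describe_result({"status": "no_claims", "message": "No claims found.", "total": 0}))
# no_claims: No claims found.
```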
async def get_legal_arguments(
    case_id: UUID, party: str = "",
) -> list[dict]:
    """Return aggregated legal arguments for a case, optionally filtered by party.

    Each row includes ``supporting_claims`` (list of source claim_ids).
    """
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        if party and party in ALLOWED_PARTIES:
            rows = await conn.fetch(
                """SELECT id, case_id, party, argument_index, argument_title,
                          argument_body, legal_topic, priority, cited_precedents,
                          created_at, updated_at
                   FROM legal_arguments
                   WHERE case_id = $1 AND party = $2
                   ORDER BY priority, argument_index""",
                case_id, party,
            )
        else:
            rows = await conn.fetch(
                """SELECT id, case_id, party, argument_index, argument_title,
                          argument_body, legal_topic, priority, cited_precedents,
                          created_at, updated_at
                   FROM legal_arguments
                   WHERE case_id = $1
                   ORDER BY party, priority, argument_index""",
                case_id,
            )

        # Pull supporting claim ids for each argument in one round-trip.
        arg_ids = [r["id"] for r in rows]
        supporting: dict[UUID, list[str]] = {}
        if arg_ids:
            joins = await conn.fetch(
                """SELECT argument_id, claim_id
                   FROM legal_argument_propositions
                   WHERE argument_id = ANY($1::uuid[])""",
                arg_ids,
            )
            for j in joins:
                supporting.setdefault(j["argument_id"], []).append(str(j["claim_id"]))

    out: list[dict] = []
    for r in rows:
        d = dict(r)
        d["id"] = str(d["id"])
        d["case_id"] = str(d["case_id"])
        d["supporting_claims"] = supporting.get(r["id"], [])
        out.append(d)
    return out

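The rows returned by `get_legal_arguments` are flat, while the commit notes describe a party → priority → arguments display. Because `priority` is a TEXT column, `ORDER BY priority` sorts it as text rather than in the order the values are listed in the CHECK constraint, so a consumer that wants threshold arguments first has to impose that order itself. The sketch below shows one way to do the grouping; the function name, the `PRIORITY_ORDER` list, and the sample rows are assumptions for the example and are not taken from legal-arguments-panel.tsx.

```python
from collections import defaultdict

# Assumed display order; the diff itself does not define one.
PRIORITY_ORDER = ["threshold", "substantive", "procedural", "relief"]


def group_arguments(rows: list[dict]) -> dict[str, dict[str, list[str]]]:
    """Group flat argument rows as party -> priority -> list of titles."""
    grouped: dict[str, dict[str, list[str]]] = defaultdict(lambda: defaultdict(list))
    for row in rows:
        grouped[row["party"]][row["priority"]].append(row["argument_title"])
    # Rebuild plain dicts with priorities in the assumed display order.
    return {
        party: {p: by_priority[p] for p in PRIORITY_ORDER if p in by_priority}
        for party, by_priority in grouped.items()
    }


rows = [
    {"party": "appellant", "priority": "substantive", "argument_title": "B"},
    {"party": "appellant", "priority": "threshold", "argument_title": "A"},
    {"party": "committee", "priority": "relief", "argument_title": "C"},
]
print(group_arguments(rows))
# {'appellant': {'threshold': ['A'], 'substantive': ['B']}, 'committee': {'relief': ['C']}}
```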
@@ -745,6 +745,84 @@ CREATE INDEX IF NOT EXISTS idx_halachot_tsv
"""


# ── V13: Missing precedents log ───────────────────────────────────
# Track citations that the parties brought up but which are NOT yet in
# the precedent_library. Created by the researcher (auto or chair)
# whenever a citation can't be found in the corpus; closed by uploading
# the actual decision via internal_decision_upload or
# precedent_library_upload, at which point linked_case_law_id points to
# the new case_law row and status flips to 'closed'.
SCHEMA_V13_SQL = """
CREATE TABLE IF NOT EXISTS missing_precedents (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    citation TEXT NOT NULL,
    case_name TEXT,
    cited_in_case_id UUID REFERENCES cases(id) ON DELETE CASCADE,
    cited_in_document_id UUID REFERENCES documents(id) ON DELETE SET NULL,
    cited_by_party TEXT CHECK (cited_by_party IN (
        'appellant', 'respondent', 'committee', 'permit_applicant', 'unknown'
    )),
    cited_by_party_name TEXT,
    legal_topic TEXT,
    legal_issue TEXT,
    claim_quote TEXT,
    status TEXT DEFAULT 'open' CHECK (status IN (
        'open', 'uploaded', 'closed', 'irrelevant'
    )),
    linked_case_law_id UUID REFERENCES case_law(id) ON DELETE SET NULL,
    closed_at TIMESTAMPTZ,
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW(),
    notes TEXT
);

CREATE INDEX IF NOT EXISTS idx_missing_precedents_case
    ON missing_precedents(cited_in_case_id);
CREATE INDEX IF NOT EXISTS idx_missing_precedents_status
    ON missing_precedents(status);
CREATE INDEX IF NOT EXISTS idx_missing_precedents_citation
    ON missing_precedents(citation);
"""


# ── V14: Legal arguments (aggregated propositions) ────────────────
# After ``claims_extractor`` extracts raw propositions (rows in ``claims``)
# the LLM-driven aggregator groups them into ~6-12 distinct legal arguments
# per party. ``legal_arguments`` holds the consolidated argument; the M:M
# join table ``legal_argument_propositions`` links back to the source
# propositions for traceability ("which raw claims feed this argument?").
SCHEMA_V14_SQL = """
CREATE TABLE IF NOT EXISTS legal_arguments (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    case_id UUID NOT NULL REFERENCES cases(id) ON DELETE CASCADE,
    party TEXT NOT NULL CHECK (party IN (
        'appellant', 'respondent', 'committee', 'permit_applicant', 'unknown'
    )),
    argument_index INTEGER NOT NULL,
    argument_title TEXT NOT NULL,
    argument_body TEXT NOT NULL,
    legal_topic TEXT,
    priority TEXT DEFAULT 'substantive' CHECK (priority IN (
        'threshold', 'substantive', 'procedural', 'relief'
    )),
    cited_precedents TEXT[],
    created_at TIMESTAMPTZ DEFAULT NOW(),
    updated_at TIMESTAMPTZ DEFAULT NOW()
);
CREATE INDEX IF NOT EXISTS idx_legal_arguments_case
    ON legal_arguments(case_id);
CREATE INDEX IF NOT EXISTS idx_legal_arguments_party
    ON legal_arguments(case_id, party);

-- M:M back to ``claims`` (raw propositions).
CREATE TABLE IF NOT EXISTS legal_argument_propositions (
    argument_id UUID NOT NULL REFERENCES legal_arguments(id) ON DELETE CASCADE,
    claim_id UUID NOT NULL REFERENCES claims(id) ON DELETE CASCADE,
    PRIMARY KEY (argument_id, claim_id)
);
"""


async def _run_schema_migrations(pool: asyncpg.Pool) -> None:
    async with pool.acquire() as conn:
        await conn.execute(SCHEMA_SQL)
@@ -760,7 +838,9 @@ async def _run_schema_migrations(pool: asyncpg.Pool) -> None:
         await conn.execute(SCHEMA_V10_SQL)
         await conn.execute(SCHEMA_V11_SQL)
         await conn.execute(SCHEMA_V12_SQL)
-        logger.info("Database schema initialized (v1-v12)")
+        await conn.execute(SCHEMA_V13_SQL)
+        await conn.execute(SCHEMA_V14_SQL)
+        logger.info("Database schema initialized (v1-v14)")
 
 
 async def init_schema() -> None:
@@ -782,7 +862,10 @@ async def create_case(
     hearing_date: date | None = None,
     notes: str = "",
     expected_outcome: str = "",
-    practice_area: str = "appeals_committee",
+    # Default "": the DB CHECK constraint accepts empty, and the upstream
+    # tool (cases.case_create) is responsible for deriving the domain value
+    # from the case_number prefix before calling here.
+    practice_area: str = "",
     appeal_subtype: str = "",
 ) -> dict:
     pool = await get_pool()
@@ -3106,3 +3189,228 @@ async def search_precedent_library_hybrid(
        merged.append(d)
    merged.sort(key=lambda x: -x["score"])
    return merged[:limit]


# ── Missing precedents (V13) ───────────────────────────────────────
# Track citations from party briefs that aren't yet in the corpus.
# Lifecycle: researcher logs the gap (status='open') → chair uploads the
# decision (status='uploaded', file ingested) → row is linked to a
# case_law row (status='closed'). 'irrelevant' = chair decided the
# citation isn't worth adding to the library.

ALLOWED_MP_PARTIES = {
    "appellant", "respondent", "committee", "permit_applicant", "unknown",
}
ALLOWED_MP_STATUS = {"open", "uploaded", "closed", "irrelevant"}


def _row_to_missing_precedent(row: asyncpg.Record) -> dict:
    d = dict(row)
    d["id"] = str(d["id"])
    if d.get("cited_in_case_id") is not None:
        d["cited_in_case_id"] = str(d["cited_in_case_id"])
    if d.get("cited_in_document_id") is not None:
        d["cited_in_document_id"] = str(d["cited_in_document_id"])
    if d.get("linked_case_law_id") is not None:
        d["linked_case_law_id"] = str(d["linked_case_law_id"])
    return d


async def create_missing_precedent(
    citation: str,
    case_name: str | None = None,
    cited_in_case_id: UUID | None = None,
    cited_in_document_id: UUID | None = None,
    cited_by_party: str | None = None,
    cited_by_party_name: str | None = None,
    legal_topic: str | None = None,
    legal_issue: str | None = None,
    claim_quote: str | None = None,
    notes: str | None = None,
) -> dict:
    """Create a new missing-precedent row (status='open' by default)."""
    if not citation.strip():
        raise ValueError("citation is required")
    if cited_by_party and cited_by_party not in ALLOWED_MP_PARTIES:
        raise ValueError(
            f"cited_by_party must be one of {sorted(ALLOWED_MP_PARTIES)}"
        )
    pool = await get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow(
            """INSERT INTO missing_precedents (
                   citation, case_name, cited_in_case_id, cited_in_document_id,
                   cited_by_party, cited_by_party_name, legal_topic, legal_issue,
                   claim_quote, notes
               )
               VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10)
               RETURNING *""",
            citation.strip(), case_name, cited_in_case_id, cited_in_document_id,
            cited_by_party, cited_by_party_name, legal_topic, legal_issue,
            claim_quote, notes,
        )
    return _row_to_missing_precedent(row)


async def list_missing_precedents(
    status: str | None = None,
    case_id: UUID | None = None,
    legal_topic: str | None = None,
    limit: int = 200,
    offset: int = 0,
) -> list[dict]:
    """List missing precedents, joining the cited-in case_number for display."""
    pool = await get_pool()
    conditions: list[str] = []
    params: list = []
    idx = 1
    if status:
        conditions.append(f"mp.status = ${idx}")
        params.append(status)
        idx += 1
    if case_id:
        conditions.append(f"mp.cited_in_case_id = ${idx}")
        params.append(case_id)
        idx += 1
    if legal_topic:
        conditions.append(f"mp.legal_topic ILIKE ${idx}")
        params.append(f"%{legal_topic}%")
        idx += 1
    where = f"WHERE {' AND '.join(conditions)}" if conditions else ""
    params.append(limit)
    params.append(offset)
    sql = f"""
        SELECT mp.*,
               c.case_number AS cited_in_case_number,
               cl.case_number AS linked_case_law_number,
               cl.case_name AS linked_case_law_name
        FROM missing_precedents mp
        LEFT JOIN cases c ON c.id = mp.cited_in_case_id
        LEFT JOIN case_law cl ON cl.id = mp.linked_case_law_id
        {where}
        ORDER BY
            CASE mp.status
                WHEN 'open' THEN 0
                WHEN 'uploaded' THEN 1
                WHEN 'closed' THEN 2
                WHEN 'irrelevant' THEN 3
            END,
            mp.created_at DESC
        LIMIT ${idx} OFFSET ${idx + 1}
    """
    async with pool.acquire() as conn:
        rows = await conn.fetch(sql, *params)
    return [_row_to_missing_precedent(r) for r in rows]


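The filter-building pattern in `list_missing_precedents` (append a condition, append its parameter, advance the placeholder index) can be checked without a database. The sketch below factors that pattern into a pure function; `build_filters` is a hypothetical helper written for this illustration, and it returns only the WHERE/LIMIT fragment rather than the full query.

```python
from __future__ import annotations


def build_filters(
    status: str | None = None,
    case_id: str | None = None,
    legal_topic: str | None = None,
    limit: int = 200,
    offset: int = 0,
) -> tuple[str, list]:
    """Return (sql_fragment, params) with $1..$n placeholders kept in sync."""
    conditions: list[str] = []
    params: list = []

    def add(template: str, value) -> None:
        # The placeholder number is always the parameter's 1-based position.
        params.append(value)
        conditions.append(template.format(n=len(params)))

    if status:
        add("mp.status = ${n}", status)
    if case_id:
        add("mp.cited_in_case_id = ${n}", case_id)
    if legal_topic:
        add("mp.legal_topic ILIKE ${n}", f"%{legal_topic}%")

    where = f"WHERE {' AND '.join(conditions)}" if conditions else ""
    params.extend([limit, offset])
    tail = f"LIMIT ${len(params) - 1} OFFSET ${len(params)}"
    return f"{where} {tail}".strip(), params


sql, params = build_filters(status="open", legal_topic="standing")
print(sql)     # WHERE mp.status = $1 AND mp.legal_topic ILIKE $2 LIMIT $3 OFFSET $4
print(params)  # ['open', '%standing%', 200, 0]
```

Only column names and placeholder numbers are interpolated into the SQL text; user-supplied values travel separately in `params`, which is what keeps the dynamic query safe from injection.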
async def get_missing_precedent(mp_id: UUID) -> dict | None:
    pool = await get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow(
            """
            SELECT mp.*,
                   c.case_number AS cited_in_case_number,
                   cl.case_number AS linked_case_law_number,
                   cl.case_name AS linked_case_law_name
            FROM missing_precedents mp
            LEFT JOIN cases c ON c.id = mp.cited_in_case_id
            LEFT JOIN case_law cl ON cl.id = mp.linked_case_law_id
            WHERE mp.id = $1
            """,
            mp_id,
        )
    return _row_to_missing_precedent(row) if row else None


async def update_missing_precedent(mp_id: UUID, **fields) -> dict | None:
    """Patch a missing-precedent row. Allowed fields: legal_topic,
    legal_issue, notes, cited_by_party, cited_by_party_name, case_name,
    status, linked_case_law_id, closed_at, claim_quote, citation.
    Any other keyword is ignored."""
    if not fields:
        return await get_missing_precedent(mp_id)
    allowed = {
        "legal_topic", "legal_issue", "notes", "cited_by_party",
        "cited_by_party_name", "case_name", "status", "linked_case_law_id",
        "closed_at", "claim_quote", "citation",
    }
    clean = {k: v for k, v in fields.items() if k in allowed}
    if not clean:
        return await get_missing_precedent(mp_id)
    if "status" in clean and clean["status"] not in ALLOWED_MP_STATUS:
        raise ValueError(
            f"status must be one of {sorted(ALLOWED_MP_STATUS)}"
        )
    if (
        "cited_by_party" in clean
        and clean["cited_by_party"]
        and clean["cited_by_party"] not in ALLOWED_MP_PARTIES
    ):
        raise ValueError(
            f"cited_by_party must be one of {sorted(ALLOWED_MP_PARTIES)}"
        )
    # Column names come from the ``allowed`` whitelist above; values are
    # always passed as query parameters ($1 is reserved for the row id).
    set_clauses = []
    values = []
    for i, (key, val) in enumerate(clean.items(), start=2):
        set_clauses.append(f"{key} = ${i}")
        values.append(val)
    set_clauses.append("updated_at = now()")
    sql = (
        f"UPDATE missing_precedents SET {', '.join(set_clauses)} "
        f"WHERE id = $1 RETURNING *"
    )
    pool = await get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow(sql, mp_id, *values)
    return _row_to_missing_precedent(row) if row else None


async def close_missing_precedent(
|
||||
mp_id: UUID,
|
||||
linked_case_law_id: UUID | None = None,
|
||||
notes: str | None = None,
|
||||
status: str = "closed",
|
||||
) -> dict | None:
|
||||
"""Mark a missing-precedent row as closed (or 'uploaded'/'irrelevant')
|
||||
and link it to a case_law row if provided."""
|
||||
if status not in ALLOWED_MP_STATUS:
|
||||
raise ValueError(
|
||||
f"status must be one of {sorted(ALLOWED_MP_STATUS)}"
|
||||
)
|
||||
pool = await get_pool()
|
||||
async with pool.acquire() as conn:
|
||||
sets = ["status = $2", "closed_at = now()", "updated_at = now()"]
|
||||
params: list = [mp_id, status]
|
||||
idx = 3
|
||||
if linked_case_law_id is not None:
|
||||
sets.append(f"linked_case_law_id = ${idx}")
|
||||
params.append(linked_case_law_id)
|
||||
idx += 1
|
||||
if notes is not None:
|
||||
sets.append(f"notes = ${idx}")
|
||||
params.append(notes)
|
||||
idx += 1
|
||||
sql = (
|
||||
f"UPDATE missing_precedents SET {', '.join(sets)} "
|
||||
f"WHERE id = $1 RETURNING *"
|
||||
)
|
||||
row = await conn.fetchrow(sql, *params)
|
||||
return _row_to_missing_precedent(row) if row else None
|
||||
|
||||
|
||||
async def find_missing_precedent_by_citation(
|
||||
citation: str,
|
||||
case_id: UUID | None = None,
|
||||
) -> dict | None:
|
||||
"""Look up an existing row by citation string (exact match) and optionally
|
||||
cited-in case_id. Used to deduplicate auto-creation by the researcher."""
|
||||
pool = await get_pool()
|
||||
async with pool.acquire() as conn:
|
||||
if case_id is not None:
|
||||
row = await conn.fetchrow(
|
||||
"SELECT * FROM missing_precedents "
|
||||
"WHERE citation = $1 AND cited_in_case_id = $2 LIMIT 1",
|
||||
citation.strip(), case_id,
|
||||
)
|
||||
else:
|
||||
row = await conn.fetchrow(
|
||||
"SELECT * FROM missing_precedents WHERE citation = $1 LIMIT 1",
|
||||
citation.strip(),
|
||||
)
|
||||
return _row_to_missing_precedent(row) if row else None
|
||||
|
||||
@@ -52,16 +52,44 @@ DOMAIN_PRACTICE_AREAS: set[str] = {
|
||||
"compensation_197",
|
||||
}
|
||||
|
||||
-# Union — what ``validate()`` accepts for backward-compat
-PRACTICE_AREAS: set[str] = MULTI_TENANT_PRACTICE_AREAS | DOMAIN_PRACTICE_AREAS
+# Union — what ``validate()`` accepts for backward-compat.
+# Empty string is permitted because the DB CHECK constraint allows it as
+# a "not yet classified" sentinel (e.g. when auto-derivation fails on an
+# unrecognized case_number format).
+PRACTICE_AREAS: set[str] = MULTI_TENANT_PRACTICE_AREAS | DOMAIN_PRACTICE_AREAS | {""}
|
||||
|
||||
APPEALS_COMMITTEE_SUBTYPES: set[str] = {
|
||||
"building_permit",
|
||||
"betterment_levy",
|
||||
"compensation_197",
|
||||
    # בל"מ: request to extend the deadline for filing an appeal. Separate tracks per domain:
    "extension_request_building_permit", # 1xxx: section 152, 30 days
    "extension_request_betterment_levy", # 8xxx: section 14 of the Third Schedule, 45 days
    "extension_request_compensation", # 9xxx: section 198(d), 30 days
|
||||
"unknown",
|
||||
}
|
||||
|
||||
# בל"מ subtypes: easy to identify by prefix
|
||||
BLAM_SUBTYPES: set[str] = {
|
||||
"extension_request_building_permit",
|
||||
"extension_request_betterment_levy",
|
||||
"extension_request_compensation",
|
||||
}
|
||||
|
||||
# Mapping: domain → בל"מ subtype
|
||||
_DOMAIN_TO_BLAM_SUBTYPE: dict[str, str] = {
|
||||
"rishuy_uvniya": "extension_request_building_permit",
|
||||
"betterment_levy": "extension_request_betterment_levy",
|
||||
"compensation_197": "extension_request_compensation",
|
||||
}
|
||||
|
||||
# Mapping: first digit → בל"מ subtype (same structure as _APPEALS_COMMITTEE_DIGIT_TO_SUBTYPE)
|
||||
_APPEALS_COMMITTEE_DIGIT_TO_BLAM = {
|
||||
"1": "extension_request_building_permit",
|
||||
"8": "extension_request_betterment_levy",
|
||||
"9": "extension_request_compensation",
|
||||
}
|
||||
|
||||
DEFAULT_PRACTICE_AREA = "appeals_committee"
|
||||
|
||||
# Subtypes per practice_area (extend when adding domains)
|
||||
@@ -70,9 +98,11 @@ SUBTYPES_BY_AREA: dict[str, set[str]] = {
|
||||
"national_insurance": {"unknown"},
|
||||
"labor_law": {"unknown"},
|
||||
     # Domain values — subtype is implicit in the value itself
-    "rishuy_uvniya": {"building_permit", "unknown"},
-    "betterment_levy": {"betterment_levy", "unknown"},
-    "compensation_197": {"compensation_197", "unknown"},
+    "rishuy_uvniya": {"building_permit", "extension_request_building_permit", "unknown"},
+    "betterment_levy": {"betterment_levy", "extension_request_betterment_levy", "unknown"},
+    "compensation_197": {"compensation_197", "extension_request_compensation", "unknown"},
|
||||
# Empty (unclassified) — allow any of the appeals_committee subtypes
|
||||
"": APPEALS_COMMITTEE_SUBTYPES,
|
||||
}
|
||||
|
||||
# Mapping: (multi_tenant_pa, appeal_subtype) → domain_pa
|
||||
@@ -80,9 +110,39 @@ _SUBTYPE_TO_DOMAIN: dict[str, str] = {
|
||||
"building_permit": "rishuy_uvniya",
|
||||
"betterment_levy": "betterment_levy",
|
||||
"compensation_197": "compensation_197",
|
||||
"extension_request_building_permit": "rishuy_uvniya",
|
||||
"extension_request_betterment_levy": "betterment_levy",
|
||||
"extension_request_compensation": "compensation_197",
|
||||
}
|
||||
|
||||
|
||||
# Regex for detecting "בקשה להארכת מועד" (request for an extension of time) in the
# appeal subject: common variants. Case-insensitive; meant to tolerate smart and regular quotes.
|
||||
_BLAM_SUBJECT_PATTERNS = (
|
||||
re.compile(r"בקשה\s+להארכת\s+מועד", re.IGNORECASE),
|
||||
re.compile(r"בל[\"\u05F4\u201C\u201D]מ", re.IGNORECASE), # בל"מ with quote variants: ASCII quote, gershayim (U+05F4), curly quotes
|
||||
re.compile(r"הארכת\s+מועד\s+להגשת", re.IGNORECASE),
|
||||
)
|
||||
|
||||
|
||||
def is_blam_subject(subject: str) -> bool:
|
||||
"""True iff subject indicates a בל"מ (extension-of-time request).
|
||||
|
||||
Recognizes: "בקשה להארכת מועד", "בל\"מ", "הארכת מועד להגשת..."
|
||||
|
||||
Examples:
|
||||
>>> is_blam_subject("בל\"מ אלחנן ברלינגר נ' לינדאב")
|
||||
True
|
||||
>>> is_blam_subject("בקשה להארכת מועד להגשת ערר")
|
||||
True
|
||||
>>> is_blam_subject("היתר בנייה ברחוב X")
|
||||
False
|
||||
"""
|
||||
if not subject:
|
||||
return False
|
||||
return any(p.search(subject) for p in _BLAM_SUBJECT_PATTERNS)
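
The subject check above can be exercised on its own. The sketch below is an assumption-laden rewrite rather than the module's code: `BLAM_PATTERNS` spells the quote variants as explicit code points (ASCII quote, Hebrew gershayim U+05F4, curly quotes U+201C/U+201D), and treating curly quotes as valid is my assumption based on the "smart quotes" remark in the comment.

```python
import re

# Quote variants written as code points so visually identical characters
# cannot be confused. Including the curly quotes is an assumption.
BLAM_PATTERNS = (
    re.compile(r"בקשה\s+להארכת\s+מועד"),
    re.compile(r'בל["\u05F4\u201C\u201D]מ'),
    re.compile(r"הארכת\s+מועד\s+להגשת"),
)


def is_blam_subject(subject: str) -> bool:
    """True when the subject looks like an extension-of-time request."""
    return bool(subject) and any(p.search(subject) for p in BLAM_PATTERNS)
```

Hebrew has no letter case, so `re.IGNORECASE` does not change what these particular patterns match; it is left out of the sketch for that reason.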
|
||||
|
||||
|
||||
def to_db_practice_area(practice_area: str, appeal_subtype: str = "") -> str:
|
||||
"""Convert a multi-tenant practice_area + appeal_subtype to the
|
||||
domain value stored in DB columns (case_law/cases).
|
||||
@@ -120,14 +180,28 @@ _CASE_NUM = re.compile(r"(?:ARAR[-\s]*\d{2}[-\s]*(?:\d{2}[-\s]*)?)(\d{4})", re.I
|
||||
_PLAIN_NUM = re.compile(r"(\d{4})")
|
||||
|
||||
|
||||
_DOMAIN_TO_SUBTYPE: dict[str, str] = {
|
||||
"rishuy_uvniya": "building_permit",
|
||||
"betterment_levy": "betterment_levy",
|
||||
"compensation_197": "compensation_197",
|
||||
}
|
||||
|
||||
|
||||
def derive_subtype(case_number: str, practice_area: str = DEFAULT_PRACTICE_AREA) -> str:
|
||||
"""Infer the appeal_subtype from case_number.
|
||||
|
||||
-    For appeals_committee, the convention is:
+    For appeals_committee (axis A), the convention is:
|
||||
1xxx → building_permit, 8xxx → betterment_levy, 9xxx → compensation_197.
|
||||
|
||||
For domain values (axis B — rishuy_uvniya/betterment_levy/compensation_197),
|
||||
the subtype is implicit in the practice_area itself — we map directly
|
||||
without parsing the case number.
|
||||
|
||||
Handles multiple formats: ARAR-25-8126, 8126/25, 1170, ערר 1024-25.
|
||||
"""
|
||||
# Axis B: practice_area is already a domain value — map directly.
|
||||
if practice_area in DOMAIN_PRACTICE_AREAS:
|
||||
return _DOMAIN_TO_SUBTYPE.get(practice_area, "unknown")
|
||||
if practice_area != "appeals_committee":
|
||||
return "unknown"
|
||||
cn = case_number or ""
|
||||
@@ -142,6 +216,82 @@ def derive_subtype(case_number: str, practice_area: str = DEFAULT_PRACTICE_AREA)
|
||||
return _APPEALS_COMMITTEE_DIGIT_TO_SUBTYPE.get(first_digit, "unknown")
|
||||
|
||||
|
||||
def derive_subtype_with_blam(
|
||||
case_number: str,
|
||||
subject: str = "",
|
||||
practice_area: str = DEFAULT_PRACTICE_AREA,
|
||||
) -> str:
|
||||
"""Like ``derive_subtype()`` but also detects בל"מ from the subject.
|
||||
|
||||
If ``subject`` indicates a בקשה להארכת מועד, the returned subtype is
|
||||
one of the ``extension_request_*`` values (chosen per case_number /
|
||||
practice_area). Otherwise behaviour matches ``derive_subtype()``.
|
||||
|
||||
Examples:
|
||||
>>> derive_subtype_with_blam("1017-03-26", "בל\"מ ברלינגר נ' לינדאב")
|
||||
'extension_request_building_permit'
|
||||
>>> derive_subtype_with_blam("8500-25", "בקשה להארכת מועד")
|
||||
'extension_request_betterment_levy'
|
||||
>>> derive_subtype_with_blam("1033-25", "ערר על החלטת ועדה")
|
||||
'building_permit'
|
||||
"""
|
||||
base = derive_subtype(case_number, practice_area)
|
||||
if not is_blam_subject(subject):
|
||||
return base
|
||||
# subject says it's בל"מ — return the matching extension_request_* variant.
|
||||
# For domain practice_area (axis B), use the direct mapping.
|
||||
if practice_area in DOMAIN_PRACTICE_AREAS:
|
||||
return _DOMAIN_TO_BLAM_SUBTYPE.get(practice_area, base)
|
||||
# For appeals_committee (axis A), derive from case_number digit.
|
||||
if practice_area == "appeals_committee":
|
||||
cn = case_number or ""
|
||||
m = _CASE_NUM.search(cn) or _PLAIN_NUM.search(cn)
|
||||
if m:
|
||||
first_digit = m.group(1)[0]
|
||||
blam = _APPEALS_COMMITTEE_DIGIT_TO_BLAM.get(first_digit)
|
||||
if blam:
|
||||
return blam
|
||||
return base
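
The axis-A branch of `derive_subtype_with_blam` boils down to "pick the first digit, then choose between two lookup tables". A minimal sketch follows, under stated assumptions: it uses only the plain four-digit pattern (the `_CASE_NUM` pattern is truncated in this view, so it is not reproduced), and `derive`, `BLAM_HINT` and the table names are hypothetical.

```python
import re

PLAIN_NUM = re.compile(r"(\d{4})")  # simplified: first run of four digits
DIGIT_TO_SUBTYPE = {
    "1": "building_permit",
    "8": "betterment_levy",
    "9": "compensation_197",
}
DIGIT_TO_BLAM = {
    "1": "extension_request_building_permit",
    "8": "extension_request_betterment_levy",
    "9": "extension_request_compensation",
}
# Reduced subject detector, enough for the examples in the docstring above.
BLAM_HINT = re.compile(r'בקשה\s+להארכת\s+מועד|בל["\u05F4]מ')


def derive(case_number: str, subject: str = "") -> str:
    """Return the subtype implied by the case number, upgraded when the subject matches."""
    m = PLAIN_NUM.search(case_number or "")
    if not m:
        return "unknown"
    digit = m.group(1)[0]
    base = DIGIT_TO_SUBTYPE.get(digit, "unknown")
    if subject and BLAM_HINT.search(subject):
        # Unmapped digits keep the base value, mirroring the fallback above.
        return DIGIT_TO_BLAM.get(digit, base)
    return base
```

The subject only ever refines the result within the domain chosen by the case number; it never moves a case to a different domain.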
|
||||
|
||||
|
||||
def is_blam_subtype(appeal_subtype: str) -> bool:
|
||||
"""True iff appeal_subtype is one of the extension_request_* variants.
|
||||
|
||||
Useful for UI badges and routing logic that need to detect בל"מ cases
|
||||
regardless of which domain they belong to.
|
||||
"""
|
||||
return appeal_subtype in BLAM_SUBTYPES
|
||||
|
||||
|
||||
def derive_domain_practice_area(case_number: str) -> str:
|
||||
"""Map a case_number prefix to a domain practice_area (axis B).
|
||||
|
||||
Returns:
|
||||
``"rishuy_uvniya"`` for 1xxx, ``"betterment_levy"`` for 8xxx,
|
||||
``"compensation_197"`` for 9xxx, or ``""`` when the prefix is
|
||||
unrecognized (caller decides the fallback).
|
||||
|
||||
Examples:
|
||||
>>> derive_domain_practice_area("8126/25")
|
||||
'betterment_levy'
|
||||
>>> derive_domain_practice_area("1170")
|
||||
'rishuy_uvniya'
|
||||
>>> derive_domain_practice_area("ARAR-24-01-9007")
|
||||
'compensation_197'
|
||||
>>> derive_domain_practice_area("foo")
|
||||
''
|
||||
"""
|
||||
cn = case_number or ""
|
||||
m = _CASE_NUM.search(cn) or _PLAIN_NUM.search(cn)
|
||||
if not m:
|
||||
return ""
|
||||
first_digit = m.group(1)[0]
|
||||
subtype = _APPEALS_COMMITTEE_DIGIT_TO_SUBTYPE.get(first_digit)
|
||||
if not subtype:
|
||||
return ""
|
||||
return _SUBTYPE_TO_DOMAIN.get(subtype, "")
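
One property of the plain `(\d{4})` fallback is worth seeing in a runnable form: it takes the first run of four digits, wherever it sits. The sketch below assumes only that pattern (same as `_PLAIN_NUM`); `domain_from_plain_number` and `DIGIT_TO_DOMAIN` are hypothetical names, and whether year-first inputs occur in practice is not shown in this diff.

```python
import re

PLAIN_NUM = re.compile(r"(\d{4})")  # same pattern as _PLAIN_NUM in the diff
DIGIT_TO_DOMAIN = {
    "1": "rishuy_uvniya",
    "8": "betterment_levy",
    "9": "compensation_197",
}


def domain_from_plain_number(case_number: str) -> str:
    """Map the first four-digit run to a domain, or '' when nothing fits."""
    m = PLAIN_NUM.search(case_number or "")
    return DIGIT_TO_DOMAIN.get(m.group(1)[0], "") if m else ""


year_first = domain_from_plain_number("2025/8126")  # the year is matched first
```

With this fallback alone, a year-first string resolves on the year and yields the empty sentinel, which the caller then has to handle.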
|
||||
|
||||
|
||||
# ── Validation ─────────────────────────────────────────────────────
|
||||
|
||||
|
||||
@@ -164,6 +314,20 @@ def validate(practice_area: str, appeal_subtype: str | None) -> None:
|
||||
|
||||
def is_override(case_number: str, practice_area: str, appeal_subtype: str) -> bool:
|
||||
"""True iff the user-supplied subtype disagrees with what derive_subtype
|
||||
-    would have produced (and the derived value is not 'unknown')."""
+    would have produced (and the derived value is not 'unknown').
+
+    Note: בל"מ variants (extension_request_*) are NOT considered overrides
+    of their parent domain — extension_request_building_permit on a 1xxx
+    case is consistent with the case-number convention.
+    """
     derived = derive_subtype(case_number, practice_area)
-    return derived != "unknown" and derived != appeal_subtype
+    if derived == "unknown":
+        return False
+    if derived == appeal_subtype:
+        return False
+    # בל"מ variants of the same domain are not overrides.
+    if appeal_subtype in BLAM_SUBTYPES:
+        # extension_request_building_permit ↔ building_permit (1xxx) — same domain
+        if _SUBTYPE_TO_DOMAIN.get(appeal_subtype) == _SUBTYPE_TO_DOMAIN.get(derived):
+            return False
+    return True
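
The override rule can be checked as a small truth table. This sketch takes the already-derived subtype as an argument instead of a case number, which is a simplification of mine; the mapping is copied from `_SUBTYPE_TO_DOMAIN` in the diff.

```python
SUBTYPE_TO_DOMAIN = {
    "building_permit": "rishuy_uvniya",
    "betterment_levy": "betterment_levy",
    "compensation_197": "compensation_197",
    "extension_request_building_permit": "rishuy_uvniya",
    "extension_request_betterment_levy": "betterment_levy",
    "extension_request_compensation": "compensation_197",
}
BLAM_SUBTYPES = {s for s in SUBTYPE_TO_DOMAIN if s.startswith("extension_request_")}


def is_override(derived: str, supplied: str) -> bool:
    """True when the supplied subtype contradicts the derived one."""
    if derived == "unknown" or derived == supplied:
        return False
    # An extension request inside the same domain agrees with the convention.
    if supplied in BLAM_SUBTYPES and SUBTYPE_TO_DOMAIN[supplied] == SUBTYPE_TO_DOMAIN.get(derived):
        return False
    return True
```

An extension request in a different domain still counts as an override, so the exemption is scoped to the parent domain only.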
|
||||
|
||||
@@ -128,7 +128,7 @@ async def case_create(
|
||||
hearing_date: str = "",
|
||||
notes: str = "",
|
||||
expected_outcome: str = "",
|
||||
-    practice_area: str = "appeals_committee",
+    practice_area: str = "",
|
||||
appeal_subtype: str = "",
|
||||
) -> str:
    """Create a new appeal case.
|
||||
@@ -145,7 +145,9 @@ async def case_create(
|
||||
         hearing_date: hearing date (YYYY-MM-DD)
         notes: notes
         expected_outcome: expected outcome (rejection/partial_acceptance/full_acceptance/betterment_levy)
-        practice_area: legal field (appeals_committee / national_insurance / labor_law)
+        practice_area: legal field, given as a domain value (rishuy_uvniya / betterment_levy /
+            compensation_197). Empty or "appeals_committee" = derived
+            automatically from the case number (1xxx→licensing, 8xxx→betterment, 9xxx→197)
         appeal_subtype: appeal type (building_permit / betterment_levy / compensation_197).
             Empty = derived automatically from the case number
|
||||
"""
|
||||
@@ -155,8 +157,18 @@ async def case_create(
|
||||
if hearing_date:
|
||||
h_date = date_type.fromisoformat(hearing_date)
|
||||
|
||||
-    # Resolve appeal_subtype: explicit override > auto-derive > 'unknown'
-    derived_subtype = pa.derive_subtype(case_number, practice_area)
+    # Auto-derive practice_area when missing or set to the legacy multi-tenant
+    # value. The DB's cases_practice_area_check rejects 'appeals_committee',
+    # so we MUST map it to a domain value before INSERT. If derivation fails
+    # (unknown case number format), fall back to '' which the constraint allows.
+    if not practice_area or practice_area == "appeals_committee":
+        practice_area = pa.derive_domain_practice_area(case_number)
+
+    # Resolve appeal_subtype: explicit override > auto-derive > 'unknown'.
+    # derive_subtype_with_blam inspects the subject to detect בל"מ
+    # (בקשה להארכת מועד) and returns an extension_request_* variant when
+    # appropriate. Falls back to regular derive_subtype when subject is empty.
+    derived_subtype = pa.derive_subtype_with_blam(case_number, subject, practice_area)
|
||||
if not appeal_subtype:
|
||||
appeal_subtype = derived_subtype
|
||||
pa.validate(practice_area, appeal_subtype)
|
||||
|
||||
83
mcp-server/src/legal_mcp/tools/legal_arguments.py
Normal file
@@ -0,0 +1,83 @@
|
||||
"""MCP tools — aggregated legal arguments (claim de-duplication)."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
from uuid import UUID
|
||||
|
||||
from legal_mcp.services import argument_aggregator, db
|
||||
|
||||
|
||||
async def aggregate_claims_to_arguments(
|
||||
case_number: str,
|
||||
force: bool = False,
|
||||
) -> str:
    """Consolidate raw propositions into distinct legal arguments.

    Args:
        case_number: the appeal case number.
        force: True = delete existing arguments and recompute.
|
||||
"""
|
||||
case = await db.get_case_by_number(case_number)
|
||||
if not case:
|
||||
return json.dumps(
|
||||
{"status": "error", "message": f"תיק {case_number} לא נמצא."},
|
||||
ensure_ascii=False, indent=2,
|
||||
)
|
||||
|
||||
case_id = UUID(case["id"])
|
||||
result = await argument_aggregator.aggregate_claims_to_arguments(
|
||||
case_id, force=force,
|
||||
)
|
||||
result["case_number"] = case_number
|
||||
return json.dumps(result, ensure_ascii=False, indent=2, default=str)
|
||||
|
||||
|
||||
async def get_legal_arguments(
|
||||
case_number: str,
|
||||
party: str = "",
|
||||
) -> str:
    """Retrieve the aggregated legal arguments for a case.

    Args:
        case_number: the appeal case number.
        party: filter by party (appellant/respondent/committee/permit_applicant).
            Empty = all parties.
|
||||
"""
|
||||
case = await db.get_case_by_number(case_number)
|
||||
if not case:
|
||||
return json.dumps(
|
||||
{"status": "error", "message": f"תיק {case_number} לא נמצא."},
|
||||
ensure_ascii=False, indent=2,
|
||||
)
|
||||
|
||||
case_id = UUID(case["id"])
|
||||
args = await argument_aggregator.get_legal_arguments(case_id, party=party)
|
||||
|
||||
if not args:
|
||||
return json.dumps({
|
||||
"status": "empty",
|
||||
"case_number": case_number,
|
||||
"message": "לא נמצאו טיעונים מאוגדים. הרץ aggregate_claims_to_arguments תחילה.",
|
||||
"arguments": [],
|
||||
}, ensure_ascii=False, indent=2)
|
||||
|
||||
# Group by party for nicer display.
|
||||
party_he = {
|
||||
"appellant": "עוררים",
|
||||
"respondent": "משיבים",
|
||||
"committee": "ועדה מקומית",
|
||||
"permit_applicant": "מבקשי היתר",
|
||||
"unknown": "צד לא מזוהה",
|
||||
}
|
||||
by_party: dict[str, list[dict]] = {}
|
||||
for a in args:
|
||||
label = party_he.get(a["party"], a["party"])
|
||||
by_party.setdefault(label, []).append(a)
|
||||
|
||||
return json.dumps({
|
||||
"status": "ok",
|
||||
"case_number": case_number,
|
||||
"total": len(args),
|
||||
"by_party": by_party,
|
||||
}, ensure_ascii=False, indent=2, default=str)
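
The grouping step in `get_legal_arguments` and the effect of `ensure_ascii=False` can be seen in a few lines. This is a sketch with made-up sample data: `group_by_party`, the `title` key and the `intervener` code are hypothetical, and only two of the labels from `party_he` are reproduced.

```python
from __future__ import annotations

import json

# Two of the labels from the diff's party_he table; the rest are omitted here.
PARTY_HE = {"appellant": "עוררים", "respondent": "משיבים"}


def group_by_party(args: list[dict]) -> dict[str, list[dict]]:
    """Bucket argument dicts under a display label, keeping input order."""
    grouped: dict[str, list[dict]] = {}
    for arg in args:
        # Unknown party codes fall back to the raw code, as in the diff.
        label = PARTY_HE.get(arg["party"], arg["party"])
        grouped.setdefault(label, []).append(arg)
    return grouped


sample = [
    {"party": "appellant", "title": "standing"},
    {"party": "respondent", "title": "delay"},
    {"party": "appellant", "title": "proportionality"},
    {"party": "intervener", "title": "notice"},  # hypothetical code, not in the table
]
grouped = group_by_party(sample)
readable = json.dumps(grouped, ensure_ascii=False)  # Hebrew stays as Hebrew
escaped = json.dumps(grouped)  # default: non-ASCII becomes \uXXXX escapes
```

Without `ensure_ascii=False` the payload is still valid JSON, but the Hebrew labels arrive as escape sequences, which is harder to read in logs and tool output.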
|
||||
210
mcp-server/src/legal_mcp/tools/missing_precedents.py
Normal file
@@ -0,0 +1,210 @@
|
||||
"""MCP tools for the missing-precedents log.
|
||||
|
||||
When a researcher (or chair) finds a citation in a party brief that
|
||||
isn't yet in the precedent_library, they record it here so:
|
||||
|
||||
1. The gap is visible in the UI (the chair can see all open citations
|
||||
that need to be uploaded).
|
||||
2. The writer agent doesn't try to use a precedent that isn't in the
|
||||
corpus — it knows the gap is being tracked.
|
||||
3. The chair has a clean closing workflow: upload the actual decision
|
||||
via the precedent library / internal-decisions, then link it here.
|
||||
|
||||
Three tools:
|
||||
- ``missing_precedent_create`` — log a new gap (researcher / chair).
|
||||
- ``missing_precedent_list`` — list open gaps (optionally filtered).
|
||||
- ``missing_precedent_close`` — close a gap (chair workflow).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
from uuid import UUID
|
||||
|
||||
from legal_mcp.services import db
|
||||
|
||||
|
||||
def _ok(payload) -> str:
|
||||
return json.dumps(payload, ensure_ascii=False, indent=2, default=str)
|
||||
|
||||
|
||||
def _err(msg: str) -> str:
|
||||
return json.dumps({"error": msg}, ensure_ascii=False)
|
||||
|
||||
|
||||
async def _resolve_case_id(case_number: str) -> UUID | None:
|
||||
"""Translate a human case_number (e.g. '1017-03-26') to a UUID."""
|
||||
if not case_number or not case_number.strip():
|
||||
return None
|
||||
row = await db.get_case_by_number(case_number.strip())
|
||||
if not row:
|
||||
return None
|
||||
return UUID(row["id"])
|
||||
|
||||
|
||||
async def missing_precedent_create(
|
||||
citation: str,
|
||||
case_number: str = "",
|
||||
cited_in_document_id: str = "",
|
||||
cited_by_party: str = "unknown",
|
||||
cited_by_party_name: str = "",
|
||||
legal_topic: str = "",
|
||||
legal_issue: str = "",
|
||||
claim_quote: str = "",
|
||||
case_name: str = "",
|
||||
notes: str = "",
|
||||
) -> str:
    """Record case law that was cited but is not in the corpus. The agent creates a row when it
    identifies a citation that cannot be verified against the corpus; the chair closes it after
    uploading the document.

    Args:
        citation: the full citation (required).
        case_number: number of the appeal case in which the case law was cited (e.g. '1017-03-26').
        cited_in_document_id: UUID of the document in which the citation appears (optional).
        cited_by_party: appellant / respondent / committee / permit_applicant / unknown.
        cited_by_party_name: name of the party (to make clear who cited it).
        legal_topic: short legal topic (e.g. "זכות עמידה", i.e. standing).
        legal_issue: detailed legal question.
        claim_quote: the quotation as it appears in the pleading.
        case_name: short name of the missing decision.
        notes: free-form notes.

    Returns: JSON of the created row (including id), or an error.
|
||||
"""
|
||||
if not citation.strip():
|
||||
return _err("citation חובה")
|
||||
|
||||
case_id = None
|
||||
if case_number:
|
||||
case_id = await _resolve_case_id(case_number)
|
||||
if case_id is None:
|
||||
return _err(f"תיק לא נמצא: {case_number}")
|
||||
|
||||
doc_uuid: UUID | None = None
|
||||
if cited_in_document_id.strip():
|
||||
try:
|
||||
doc_uuid = UUID(cited_in_document_id.strip())
|
||||
except ValueError:
|
||||
return _err("cited_in_document_id לא תקין")
|
||||
|
||||
party = cited_by_party.strip() or "unknown"
|
||||
if party not in db.ALLOWED_MP_PARTIES:
|
||||
return _err(
|
||||
f"cited_by_party לא תקין. ערכים תקפים: "
|
||||
f"{', '.join(sorted(db.ALLOWED_MP_PARTIES))}"
|
||||
)
|
||||
|
||||
# Deduplication: if a row already exists for the same citation in
|
||||
# the same case, return that one rather than creating a duplicate.
|
||||
existing = await db.find_missing_precedent_by_citation(
|
||||
citation=citation.strip(),
|
||||
case_id=case_id,
|
||||
)
|
||||
if existing:
|
||||
return _ok({**existing, "_duplicate": True})
|
||||
|
||||
try:
|
||||
row = await db.create_missing_precedent(
|
||||
citation=citation.strip(),
|
||||
case_name=case_name.strip() or None,
|
||||
cited_in_case_id=case_id,
|
||||
cited_in_document_id=doc_uuid,
|
||||
cited_by_party=party,
|
||||
cited_by_party_name=cited_by_party_name.strip() or None,
|
||||
legal_topic=legal_topic.strip() or None,
|
||||
legal_issue=legal_issue.strip() or None,
|
||||
claim_quote=claim_quote.strip() or None,
|
||||
notes=notes.strip() or None,
|
||||
)
|
||||
except Exception as e:
|
||||
return _err(str(e))
|
||||
return _ok(row)
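
The find-then-create deduplication in `missing_precedent_create` can be modelled without a database. The class below is a toy stand-in (`InMemoryMissingPrecedents` is a hypothetical name) that mirrors the two lookup branches: with a case id the match is scoped to that case, without one any row with the same citation matches.

```python
from __future__ import annotations

from uuid import UUID, uuid4


class InMemoryMissingPrecedents:
    """Toy stand-in for the missing_precedents table (illustration only)."""

    def __init__(self) -> None:
        self._rows: list[dict] = []

    def find(self, citation: str, case_id: UUID | None = None) -> dict | None:
        # Exact match on the stripped citation; the case filter applies only
        # when a case_id is given, as in the two SQL branches of the diff.
        wanted = citation.strip()
        for row in self._rows:
            if row["citation"] == wanted and case_id in (None, row["cited_in_case_id"]):
                return row
        return None

    def create(self, citation: str, case_id: UUID | None = None) -> dict:
        existing = self.find(citation, case_id)
        if existing is not None:
            return {**existing, "_duplicate": True}
        row = {"id": uuid4(), "citation": citation.strip(), "cited_in_case_id": case_id}
        self._rows.append(row)
        return row
```

A check followed by an insert is not atomic, so two concurrent callers could both miss and both insert; whether the real table has a unique index to prevent that is not visible in this diff.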
|
||||
|
||||
|
||||
async def missing_precedent_list(
|
||||
case_number: str = "",
|
||||
status: str = "open",
|
||||
legal_topic: str = "",
|
||||
limit: int = 50,
|
||||
) -> str:
    """List missing precedents. Default = open only.

    Args:
        case_number: filter by the appeal case in which they were cited.
        status: open / uploaded / closed / irrelevant (empty = all).
        legal_topic: filter by legal topic (substring).
        limit: maximum number of results.

    Returns: JSON with a list of rows, plus linked_case_law_number where closed.
|
||||
"""
|
||||
case_id = None
|
||||
if case_number:
|
||||
case_id = await _resolve_case_id(case_number)
|
||||
if case_id is None:
|
||||
return _err(f"תיק לא נמצא: {case_number}")
|
||||
|
||||
s = status.strip() or None
|
||||
if s and s not in db.ALLOWED_MP_STATUS:
|
||||
return _err(
|
||||
f"status לא תקין. ערכים תקפים: "
|
||||
f"{', '.join(sorted(db.ALLOWED_MP_STATUS))}"
|
||||
)
|
||||
try:
|
||||
rows = await db.list_missing_precedents(
|
||||
status=s,
|
||||
case_id=case_id,
|
||||
legal_topic=legal_topic.strip() or None,
|
||||
limit=max(1, min(int(limit), 500)),
|
||||
)
|
||||
except Exception as e:
|
||||
return _err(str(e))
|
||||
return _ok({"items": rows, "count": len(rows)})
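
The input normalisation in `missing_precedent_list` (empty status means "all", limit clamped to 1..500) can be sketched as one helper. `normalize_filters` is a hypothetical name, and the status set is taken from the docstring above rather than from `db.ALLOWED_MP_STATUS`, whose definition is not shown here.

```python
from __future__ import annotations

# Assumed values, copied from the docstring; the real set lives in the db module.
ALLOWED_STATUS = {"open", "uploaded", "closed", "irrelevant"}


def normalize_filters(status: str, limit: int | str) -> tuple[str | None, int]:
    """Return (status or None, limit clamped to the range 1..500)."""
    s = status.strip() or None  # empty string means "no status filter"
    if s is not None and s not in ALLOWED_STATUS:
        raise ValueError(f"status must be one of {sorted(ALLOWED_STATUS)}")
    # int() accepts numeric strings; non-numeric input raises ValueError.
    return s, max(1, min(int(limit), 500))
```

Clamping instead of rejecting keeps the tool forgiving for agent callers that pass 0 or a very large number.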
|
||||
|
||||
|
||||
async def missing_precedent_close(
|
||||
id: str,
|
||||
linked_case_law_id: str = "",
|
||||
notes: str = "",
|
||||
status: str = "closed",
|
||||
) -> str:
    """Close a missing-precedent row. Default = 'closed' plus a link to case_law.

    Args:
        id: UUID of the row.
        linked_case_law_id: UUID of the decision uploaded to precedent_library / internal_decisions.
        notes: closing notes (e.g. "אינו רלוונטי", i.e. not relevant, for status='irrelevant').
        status: closed / uploaded / irrelevant.

    Returns: JSON of the updated row.
|
||||
"""
|
||||
try:
|
||||
mp_id = UUID(id.strip())
|
||||
except ValueError:
|
||||
return _err("id לא תקין")
|
||||
|
||||
cl_uuid: UUID | None = None
|
||||
if linked_case_law_id.strip():
|
||||
try:
|
||||
cl_uuid = UUID(linked_case_law_id.strip())
|
||||
except ValueError:
|
||||
return _err("linked_case_law_id לא תקין")
|
||||
|
||||
status_clean = status.strip() or "closed"
|
||||
if status_clean not in db.ALLOWED_MP_STATUS:
|
||||
return _err(
|
||||
f"status לא תקין. ערכים תקפים: "
|
||||
f"{', '.join(sorted(db.ALLOWED_MP_STATUS))}"
|
||||
)
|
||||
|
||||
try:
|
||||
row = await db.close_missing_precedent(
|
||||
mp_id=mp_id,
|
||||
linked_case_law_id=cl_uuid,
|
||||
notes=notes.strip() or None,
|
||||
status=status_clean,
|
||||
)
|
||||
except Exception as e:
|
||||
return _err(str(e))
|
||||
if row is None:
|
||||
return _err("רשומה לא נמצאה")
|
||||
return _ok(row)
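
The tools above validate string ids before touching the database. A minimal sketch of that guard follows; `parse_uuid` and `close_request` are hypothetical helpers, and the English error text is a placeholder, not the message the tool returns.

```python
from __future__ import annotations

import json
from uuid import UUID


def parse_uuid(raw: str) -> UUID | None:
    """Return a UUID for well-formed input, or None instead of raising."""
    try:
        return UUID(raw.strip())
    except ValueError:
        return None


def close_request(raw_id: str) -> str:
    """Hypothetical wrapper: reject a bad id before any database call."""
    mp_id = parse_uuid(raw_id)
    if mp_id is None:
        return json.dumps({"error": "invalid id"}, ensure_ascii=False)
    return json.dumps({"id": str(mp_id)}, ensure_ascii=False)
```

Returning an error envelope rather than raising keeps the tool's contract uniform: the caller always receives a JSON string.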
|
||||
276
mcp-server/tests/test_corpus_constraints.py
Normal file
@@ -0,0 +1,276 @@
|
||||
"""Regression tests for Stage-A corpus integrity fixes (TaskMaster #30, #31).
|
||||
|
||||
These tests document the bugs that were closed in Stage A so they don't
|
||||
regress quietly. Each test maps to a real bug or constraint:
|
||||
|
||||
1. DB CHECK ``cases_practice_area_check`` rejects the legacy
|
||||
``'appeals_committee'`` value — only domain values (rishuy_uvniya /
|
||||
betterment_levy / compensation_197) and ``''`` are allowed.
|
||||
(Bug: many ``cases`` rows stored ``'appeals_committee'`` instead of
|
||||
the domain.)
|
||||
|
||||
2. DB CHECK ``case_law_internal_chair_check`` and
|
||||
``case_law_internal_district_check`` reject internal_committee rows
|
||||
with empty chair_name/district.
|
||||
(Bug: 6 records had source_kind='external_upload' but were really
|
||||
internal committee decisions; the flip to internal_committee in
|
||||
Stage A.2 surfaced the missing chair/district fields.)
|
||||
|
||||
3. DB CHECK ``case_law_external_arar_check`` rejects external_upload
|
||||
rows whose case_number starts with ``"ערר"`` or ``"בל\\"מ"`` —
|
||||
committee decisions must go through internal_decision_upload, not
|
||||
precedent_library_upload.
|
||||
(Bug: the legacy upload path stored everything as external_upload,
|
||||
including appeal-committee decisions; the citation guard now
|
||||
redirects them.)
|
||||
|
||||
4. MCP tool ``precedent_library_upload`` returns an ``_err`` envelope
|
||||
when the citation starts with ``"ערר"`` (citation guard, not DB
|
||||
constraint — fires before INSERT to surface a helpful error).
|
||||
|
||||
These tests connect to the live local Postgres (port 5433) — they do not
|
||||
mock asyncpg. Run with::
|
||||
|
||||
pytest mcp-server/tests/test_corpus_constraints.py -v
|
||||
|
||||
If you don't have ``DATABASE_URL`` set, the tests are skipped.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import json
|
||||
import os
|
||||
from uuid import uuid4
|
||||
|
||||
import asyncpg
|
||||
import pytest
|
||||
|
||||
|
||||
def _dsn() -> str | None:
|
||||
return (
|
||||
os.environ.get("DATABASE_URL")
|
||||
or os.environ.get("LEGAL_AI_DATABASE_URL")
or None  # no hardcoded credentials; the dsn fixture skips when nothing is configured
|
||||
)
|
||||
|
||||
|
||||
@pytest.fixture()
|
||||
def dsn() -> str:
|
||||
d = _dsn()
|
||||
if not d:
|
||||
pytest.skip("No DATABASE_URL set; skipping live-DB regression tests")
|
||||
return d
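
The module docstring says the tests are skipped when no database URL is configured. For that skip to be reachable, the resolver has to be able to return `None`. A minimal sketch, assuming the two environment variable names used above; `resolve_dsn` and its injectable `env` parameter are hypothetical.

```python
from __future__ import annotations

import os
from collections.abc import Mapping


def resolve_dsn(env: Mapping[str, str] = os.environ) -> str | None:
    """Return the first non-empty DSN from the environment, else None."""
    # An empty value is treated the same as a missing one.
    return env.get("DATABASE_URL") or env.get("LEGAL_AI_DATABASE_URL") or None
```

Passing the environment as a parameter lets the resolver itself be tested with plain dictionaries, without patching `os.environ`.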
|
||||
|
||||
|
||||
@pytest.fixture()
|
||||
def event_loop():
|
||||
"""Provide a fresh event loop per test so asyncpg doesn't leak across cases."""
|
||||
loop = asyncio.new_event_loop()
|
||||
try:
|
||||
yield loop
|
||||
finally:
|
||||
loop.close()
|
||||
|
||||
|
||||
def _run(loop, coro):
|
||||
return loop.run_until_complete(coro)
|
||||
|
||||
|
||||
# ── 1. cases.practice_area CHECK ─────────────────────────────────────
|
||||
|
||||
|
||||
def test_cases_rejects_appeals_committee_practice_area(dsn: str, event_loop) -> None:
|
||||
"""``cases.practice_area = 'appeals_committee'`` must violate the CHECK."""
|
||||
|
||||
async def attempt() -> None:
|
||||
conn = await asyncpg.connect(dsn)
|
||||
try:
|
||||
with pytest.raises(asyncpg.exceptions.CheckViolationError):
|
||||
await conn.execute(
|
||||
"""INSERT INTO cases (id, case_number, title, practice_area)
|
||||
VALUES ($1, $2, $3, $4)""",
|
||||
uuid4(), f"TEST-{uuid4().hex[:8]}", "regression-test",
|
||||
"appeals_committee",
|
||||
)
|
||||
finally:
|
||||
await conn.close()
|
||||
|
||||
_run(event_loop, attempt())
|
||||
|
||||
|
||||
def test_cases_accepts_domain_practice_area(dsn: str, event_loop) -> None:
|
||||
"""Sanity check: rishuy_uvniya / betterment_levy / compensation_197
|
||||
+ empty string must be accepted."""
|
||||
|
||||
async def attempt() -> None:
|
||||
conn = await asyncpg.connect(dsn)
|
||||
try:
|
||||
tx = conn.transaction()
|
||||
await tx.start()
|
||||
try:
|
||||
for value in ("rishuy_uvniya", "betterment_levy",
|
||||
"compensation_197", ""):
|
||||
await conn.execute(
|
||||
"""INSERT INTO cases (id, case_number, title, practice_area)
|
||||
VALUES ($1, $2, $3, $4)""",
|
||||
uuid4(), f"TEST-{uuid4().hex[:8]}",
|
||||
f"regression-{value or 'empty'}", value,
|
||||
)
|
||||
finally:
|
||||
await tx.rollback()
|
||||
finally:
|
||||
await conn.close()
|
||||
|
||||
_run(event_loop, attempt())
|
||||
|
||||
|
||||
# ── 2. case_law internal_committee chair/district CHECK ─────────────
|
||||
|
||||
|
||||
def test_case_law_internal_requires_chair_and_district(dsn: str, event_loop) -> None:
|
||||
"""``case_law`` rows with ``source_kind='internal_committee'`` must have
|
||||
non-empty ``chair_name`` AND ``district``."""
|
||||
|
||||
async def attempt_missing_chair() -> None:
|
||||
conn = await asyncpg.connect(dsn)
|
||||
try:
|
||||
with pytest.raises(asyncpg.exceptions.CheckViolationError):
|
||||
await conn.execute(
|
||||
"""INSERT INTO case_law (id, case_number, case_name,
|
||||
source_kind, district, chair_name)
|
||||
VALUES ($1, $2, $3, $4, $5, $6)""",
|
||||
uuid4(), f"ערר {uuid4().hex[:6]}",
|
||||
"test internal w/o chair",
|
||||
"internal_committee", "ירושלים", "",
|
||||
)
|
||||
finally:
|
||||
await conn.close()
|
||||
|
||||
async def attempt_missing_district() -> None:
|
||||
conn = await asyncpg.connect(dsn)
|
||||
try:
|
||||
with pytest.raises(asyncpg.exceptions.CheckViolationError):
|
||||
await conn.execute(
|
||||
"""INSERT INTO case_law (id, case_number, case_name,
|
||||
source_kind, district, chair_name)
|
||||
VALUES ($1, $2, $3, $4, $5, $6)""",
|
||||
uuid4(), f"ערר {uuid4().hex[:6]}",
|
||||
"test internal w/o district",
|
||||
"internal_committee", "", "עו\"ד דפנה תמיר",
|
||||
)
|
||||
finally:
|
||||
await conn.close()
|
||||
|
||||
_run(event_loop, attempt_missing_chair())
|
||||
_run(event_loop, attempt_missing_district())
|
||||
|
||||
|
||||
# ── 3. case_law external_upload + ערר citation CHECK ────────────────
|
||||
|
||||
|
||||
def test_case_law_external_upload_rejects_arar_citation(dsn: str, event_loop) -> None:
|
||||
"""``case_law`` rows with ``source_kind='external_upload'`` cannot have
|
||||
a ``case_number`` that starts with ``"ערר"`` or ``"בל\"מ"`` — those
|
||||
are committee decisions and must use ``source_kind='internal_committee'``."""
|
||||
|
||||
async def attempt_arar() -> None:
|
||||
conn = await asyncpg.connect(dsn)
|
||||
try:
|
||||
with pytest.raises(asyncpg.exceptions.CheckViolationError):
|
||||
await conn.execute(
|
||||
"""INSERT INTO case_law (id, case_number, case_name,
|
||||
source_kind)
|
||||
VALUES ($1, $2, $3, $4)""",
|
||||
uuid4(), "ערר 1170/24 חיים נ' ועדה",
|
||||
"test external arar", "external_upload",
|
||||
)
|
||||
finally:
|
||||
await conn.close()
|
||||
|
||||
async def attempt_balam() -> None:
|
||||
conn = await asyncpg.connect(dsn)
|
||||
try:
|
||||
with pytest.raises(asyncpg.exceptions.CheckViolationError):
|
||||
await conn.execute(
|
||||
"""INSERT INTO case_law (id, case_number, case_name,
|
||||
source_kind)
|
||||
VALUES ($1, $2, $3, $4)""",
|
||||
uuid4(), 'בל"מ 1234/25 פלוני',
|
||||
"test external balam", "external_upload",
|
||||
)
|
||||
finally:
|
||||
await conn.close()
|
||||
|
||||
_run(event_loop, attempt_arar())
|
||||
_run(event_loop, attempt_balam())
|
||||
|
||||
|
||||

# ── 4. MCP precedent_library_upload citation guard ──────────────────


def test_mcp_precedent_upload_rejects_arar_citation() -> None:
    """The MCP tool ``precedent_library_upload`` must short-circuit
    citations that start with ``"ערר"`` / ``"בל\"מ"`` and return an
    ``_err`` envelope (a helpful message redirecting to
    ``internal_decision_upload``), without touching the DB."""

    from legal_mcp.tools import precedent_library as tools

    async def call(citation: str) -> dict:
        # file_path won't be touched because the guard fires first.
        return json.loads(
            await tools.precedent_library_upload(
                file_path="/nonexistent",
                citation=citation,
            )
        )

    loop = asyncio.new_event_loop()
    try:
        for citation in (
            "ערר 1170/24 חיים נ' ועדה",
            'בל"מ 1234/25 פלוני',
            "ARAR 8126-25 ב. קרן-נכסים",
        ):
            result = loop.run_until_complete(call(citation))
            assert "error" in result, (
                f"expected guard to reject {citation!r}, got {result!r}"
            )
            # The error message should mention internal_decision_upload so
            # the caller knows the alternative path.
            assert "internal_decision_upload" in result["error"], (
                f"error message should redirect to internal_decision_upload, "
                f"got {result['error']!r}"
            )
    finally:
        loop.close()


def test_practice_area_module_invariants() -> None:
    """Quick guard that the ``practice_area`` service module exposes the
    helpers tools and tests depend on, and that derivation is consistent
    with the case-number convention (1xxx/8xxx/9xxx)."""

    from legal_mcp.services import practice_area as pa

    # Domain mapping is consistent with the case-number prefix convention.
    assert pa.derive_domain_practice_area("1170") == "rishuy_uvniya"
    assert pa.derive_domain_practice_area("8126/25") == "betterment_levy"
    assert pa.derive_domain_practice_area("9001") == "compensation_197"
    assert pa.derive_domain_practice_area("ARAR-25-8126") == "betterment_levy"
    # Unparseable input → empty (caller decides fallback).
    assert pa.derive_domain_practice_area("foo") == ""
    assert pa.derive_domain_practice_area("") == ""

    # Empty practice_area is valid (DB allows it as 'unclassified').
    pa.validate("", "unknown")
    pa.validate("rishuy_uvniya", "building_permit")
    pa.validate("betterment_levy", "betterment_levy")

    # appeals_committee (axis A) is still recognised for backward-compat.
    pa.validate("appeals_committee", "building_permit")

    # is_override returns False when subtype matches derivation.
    assert pa.is_override("1170", "rishuy_uvniya", "building_permit") is False
    assert pa.is_override("8126", "betterment_levy", "betterment_levy") is False