fix(learning): set chair_name at the source so a committee final always enters the case-law corpus (TaskMaster #134)

The bug: the learning stage (ingest_final_version → ingest_internal_decision) adds every
final as a citable precedent in case_law (source_kind=internal_committee), but it fails
silently (non-fatal warning) when cases.chair_name is empty, because of the constraint
case_law_internal_chair_check. As a result, the finals of 1194/1200/8070 never entered
the case-law corpus. Root causes: (1) chair_name was not set when the case was opened;
(2) the MCP path passed the raw chair while the UI path (web/) already resolved it
deterministically: **parallel paths diverging (an INV-G2 violation)**; (3) the failure
was swallowed (against §6).

Root-cause fix (3 layers):
1. **Single SoT (INV-G2):** `config.committee_chair_for_case` is the only place
   from which web/app.py, tools/workflow.py and db.create_case derive the chair
   (by case-number prefix; overridable via env). web/ is unified onto it (the
   duplicate was removed).
2. **Normalise at the source (INV-G1):** `db.create_case` always sets a non-empty
   chair_name; `cases.case_create` exposes it as a param. `ingest_final_version`
   derives the chair from the SoT instead of the raw value → the constraint no
   longer fails.
3. **Visibility (§6/feedback_silent_swallow):** a copy failure is returned in the
   result (`internal_corpus_error`) and `final_learning_pipeline` prints a warning,
   so it is not swallowed. Backfilled 11 cases that had an empty chair.
   `audit_corpus_integrity`: added CHECK_D (decided cases without a chair) +
   CHECK_E (a final missing from the case-law corpus); both are 0 now.

Invariants: upholds INV-G1 (normalise on write), INV-G2 (single path, web↔MCP unified),
§6 (no silent swallowing). Tests: py_compile + 14 pytest (chair_seed_gate,
audit_provenance) + integration test of create_case (default+override) + a run of
the live audit (A–E=0).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 07:25:54 +00:00
parent 412bd091cf
commit 242e6cfd11
8 changed files with 124 additions and 25 deletions

@@ -362,3 +362,34 @@ def parse_llm_json(raw: str):
except json.JSONDecodeError:
pass
return None
+
+# ── Committee chair — single source of truth (INV-G2) ─────────────────
+# internal_committee rows REQUIRE a non-empty chair_name (DB constraint
+# case_law_internal_chair_check). Our committee (CMP 1xxx, CMPA 8/9xxx) is
+# chaired by Dafna Tamir; map by case-number prefix so adding a future chair
+# stays a one-line local change. This resolver is the ONE place both the
+# FastAPI final-upload path (web/app.py) and the MCP learning path
+# (tools/workflow.py + services/db.create_case) derive the chair from — so
+# the two cannot drift into parallel logic. Override via env for another
+# committee.
+COMMITTEE_CHAIR_DEFAULT = os.environ.get("DEFAULT_CHAIR_NAME", "דפנה תמיר")
+COMMITTEE_CHAIR_BY_PREFIX = {
+    "1": COMMITTEE_CHAIR_DEFAULT,
+    "8": COMMITTEE_CHAIR_DEFAULT,
+    "9": COMMITTEE_CHAIR_DEFAULT,
+}
+
+
+def committee_chair_for_case(case: dict | None, case_number: str) -> str:
+    """Resolve the chair for one of OUR decisions deterministically (no LLM):
+    the case's own chair_name, else the committee default by case-number prefix.
+    Never returns empty for a valid case number — this is how chair_name is
+    normalised at the source (INV-G1) so internal_committee corpus copies of
+    finals never silently fail the DB chair constraint.
+    """
+    existing = ((case or {}).get("chair_name") or "").strip()
+    if existing:
+        return existing
+    return COMMITTEE_CHAIR_BY_PREFIX.get((case_number or "")[:1], COMMITTEE_CHAIR_DEFAULT)
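
A minimal sketch of how the resolver in this hunk behaves, assuming its definitions are copied into a standalone script. The sample case numbers and the name "Jane Doe" are hypothetical, and `Optional[dict]` is swapped in for `dict | None` only so the sketch also runs on older Python versions.

```python
import os
from typing import Optional

# Copied from the hunk so the sketch runs on its own.
COMMITTEE_CHAIR_DEFAULT = os.environ.get("DEFAULT_CHAIR_NAME", "דפנה תמיר")
COMMITTEE_CHAIR_BY_PREFIX = {
    "1": COMMITTEE_CHAIR_DEFAULT,
    "8": COMMITTEE_CHAIR_DEFAULT,
    "9": COMMITTEE_CHAIR_DEFAULT,
}


def committee_chair_for_case(case: Optional[dict], case_number: str) -> str:
    # An explicit, non-blank chair on the case wins; otherwise fall back
    # to the committee default chosen by the first character of the number.
    existing = ((case or {}).get("chair_name") or "").strip()
    if existing:
        return existing
    return COMMITTEE_CHAIR_BY_PREFIX.get((case_number or "")[:1], COMMITTEE_CHAIR_DEFAULT)


# Hypothetical inputs, for illustration only.
explicit = committee_chair_for_case({"chair_name": "  Jane Doe  "}, "1194")
blank = committee_chair_for_case({"chair_name": "   "}, "8070")
missing = committee_chair_for_case(None, "")

print(explicit, blank, missing, sep=" | ")
```

Because the `.get` fallback is also `COMMITTEE_CHAIR_DEFAULT`, an unknown prefix resolves to the default as well. One caveat that follows from `os.environ.get`: if `DEFAULT_CHAIR_NAME` is set to an empty string, the default itself is empty, so the non-empty guarantee presumably relies on that variable being unset or meaningful.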

@@ -1555,22 +1555,30 @@ async def create_case(
     practice_area: str = "",
     appeal_subtype: str = "",
     proceeding_type: str = "ערר",
+    # Default "" — resolved below to the committee chair (never stored empty).
+    # internal_committee corpus copies of this case's final REQUIRE a chair
+    # (DB constraint case_law_internal_chair_check); setting it at creation
+    # (INV-G1, source) keeps the learning loop's precedent copy from failing.
+    chair_name: str = "",
 ) -> dict:
     pool = await get_pool()
     case_id = uuid4()
+    canonical_number = _canonical_case_number(case_number)
+    resolved_chair = config.committee_chair_for_case(
+        {"chair_name": chair_name}, canonical_number)
     async with pool.acquire() as conn:
         await conn.execute(
             """INSERT INTO cases (id, case_number, title, appellants, respondents,
                 subject, property_address, permit_number, committee_type,
                 hearing_date, notes, expected_outcome,
-                practice_area, appeal_subtype, proceeding_type)
-               VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15)""",
-            case_id, _canonical_case_number(case_number), title,
+                practice_area, appeal_subtype, proceeding_type, chair_name)
+               VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, $14, $15, $16)""",
+            case_id, canonical_number, title,
             json.dumps(appellants or []),
             json.dumps(respondents or []),
             subject, property_address, permit_number, committee_type,
             hearing_date, notes, expected_outcome,
-            practice_area, appeal_subtype, proceeding_type,
+            practice_area, appeal_subtype, proceeding_type, resolved_chair,
         )
     return await get_case(case_id)
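
The reason for resolving the chair before the INSERT can be illustrated with a small in-memory sketch. Everything in it is an assumption made for illustration: it uses SQLite rather than the project's database, the table has only three columns, the real definition of `case_law_internal_chair_check` is not shown in this commit and is merely guessed here, and `resolve_chair` is a hypothetical stand-in for `config.committee_chair_for_case`.

```python
import sqlite3


def resolve_chair(raw_chair: str, default: str = "דפנה תמיר") -> str:
    # Hypothetical stand-in for config.committee_chair_for_case.
    return (raw_chair or "").strip() or default


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE case_law (
        case_number TEXT NOT NULL,
        source_kind TEXT NOT NULL,
        chair_name  TEXT NOT NULL DEFAULT '',
        -- Assumed shape of the constraint; the real one is not in the diff.
        CONSTRAINT case_law_internal_chair_check CHECK (
            source_kind <> 'internal_committee'
            OR length(trim(chair_name)) > 0
        )
    )
    """
)


def insert_internal_precedent(case_number: str, chair_name: str) -> None:
    conn.execute(
        "INSERT INTO case_law (case_number, source_kind, chair_name) "
        "VALUES (?, 'internal_committee', ?)",
        (case_number, chair_name),
    )


# Old behaviour: the raw, empty chair trips the CHECK constraint.
try:
    insert_internal_precedent("1194", "")
    raw_insert_succeeded = True
except sqlite3.IntegrityError:
    raw_insert_succeeded = False

# New behaviour: the chair is normalised before the write, so it succeeds.
insert_internal_precedent("1194", resolve_chair(""))
stored = conn.execute("SELECT chair_name FROM case_law").fetchall()
print(raw_insert_succeeded, stored)
```

Under these assumptions the raw insert is rejected and only the normalised row is stored, which mirrors why `create_case` now never writes an empty `chair_name`.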

@@ -132,6 +132,7 @@ async def case_create(
     practice_area: str = "",
     appeal_subtype: str = "",
     proceeding_type: str = "",
+    chair_name: str = "",
 ) -> str:
     """Create a new appeal case.
@@ -153,6 +154,9 @@ async def case_create(
         appeal_subtype: appeal type (building_permit / betterment_levy / compensation_197).
             Empty = inferred automatically from the case number
         proceeding_type: 'ערר' / 'בל"מ'. Empty = inferred from appeal_subtype/subject.
+        chair_name: name of the committee chair. Empty = the committee default by
+            case-number prefix (SoT: config.committee_chair_for_case); always stored
+            non-empty so the final's copy into the case-law corpus does not fail.
     """
     # INV-TOOL3 / GAP-52: idempotent on case_number (already UNIQUE in schema).
     # Re-creating an existing case returns it instead of raising a unique-violation.
@@ -204,6 +208,7 @@ async def case_create(
         practice_area=practice_area,
         appeal_subtype=appeal_subtype,
         proceeding_type=resolved_proc,
+        chair_name=chair_name,
     )

     # If the user overrode the case-number convention (e.g. case 8500 marked

@@ -326,13 +326,20 @@ async def ingest_final_version(
         return err(str(e))

     # Auto-ingest into internal committee decisions corpus (best-effort).
+    # chair_name is resolved via the shared SoT (config.committee_chair_for_case)
+    # — the SAME resolver the FastAPI upload path uses — so the two paths cannot
+    # drift (INV-G2) and the DB chair constraint is never hit on an empty chair
+    # (INV-G1: chair normalised at source). Failures are surfaced, not swallowed
+    # (engineering rule §6 / feedback_silent_swallow): the result carries the
+    # reason and final_learning_pipeline prints it.
     try:
+        from legal_mcp import config
         from legal_mcp.services import internal_decisions as int_svc
         await int_svc.ingest_internal_decision(
             case_number=case_number,
             case_name=case.get("title", ""),
             decision_date=case.get("decision_date"),
-            chair_name=case.get("chair_name", ""),
+            chair_name=config.committee_chair_for_case(case, case_number),
             district="ירושלים",
             practice_area=case.get("practice_area", ""),
             appeal_subtype=case.get("appeal_subtype", ""),
@@ -340,8 +347,10 @@ async def ingest_final_version(
         )
         result["internal_corpus_ingested"] = True
     except Exception as e:
-        logger.warning("ingest_final_version: internal corpus ingestion failed (non-fatal): %s", e)
+        logger.warning(
+            "ingest_final_version: internal corpus ingestion failed (non-fatal): %s", e)
         result["internal_corpus_ingested"] = False
+        result["internal_corpus_error"] = str(e)

     return ok(result)
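
The commit message says `final_learning_pipeline` prints a warning when the corpus copy fails, but that caller is not part of the hunks shown here. A minimal sketch of how a caller might consume the two result keys, assuming a plain dict result; the helper name `corpus_warning` and the message wording are hypothetical:

```python
from typing import Optional


def corpus_warning(result: dict) -> Optional[str]:
    """Return a human-readable warning if the corpus copy failed, else None."""
    if result.get("internal_corpus_ingested"):
        return None
    # internal_corpus_error is only set on failure; fall back to a generic
    # reason so the warning is still visible if the key is absent.
    reason = result.get("internal_corpus_error") or "unknown reason"
    return f"WARNING: final was not copied into the case-law corpus: {reason}"


# Hypothetical results, shaped like the dict built in ingest_final_version.
ok_result = {"internal_corpus_ingested": True}
failed_result = {
    "internal_corpus_ingested": False,
    "internal_corpus_error": "violates check constraint case_law_internal_chair_check",
}

for res in (ok_result, failed_result):
    message = corpus_warning(res)
    if message:
        print(message)
```

Keeping the ingestion non-fatal while returning the reason lets the learning step finish and still gives the operator something actionable, which appears to be the intent of the §6 rule cited above.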