fix(learning): set chair_name at source so a committee final always enters the case-law corpus (TaskMaster #134)
All checks were successful
G12 Leak-Guard / leak-guard (pull_request) Successful in 5s
The bug: the learning stage (ingest_final_version → ingest_internal_decision) adds every final as a citable precedent in case_law (source_kind=internal_committee), but it failed silently (non-fatal warning) whenever cases.chair_name was empty, because of the constraint case_law_internal_chair_check. As a result, the finals of 1194/1200/8070 never entered the case-law corpus.

Root cause: (1) chair_name was not set when the case was opened; (2) the MCP path passed a raw chair, while the UI path (web/) already resolved it deterministically: **parallel paths diverge (violation of INV-G2)**; (3) the failure was swallowed (contrary to §6).

Root fix (3 layers):

1. **Single SoT (INV-G2):** `config.committee_chair_for_case` is the only place from which web/app.py, tools/workflow.py and db.create_case derive the chair (by case-number prefix; override via env). web/ is unified onto it (the duplication was removed).
2. **Normalization at source (INV-G1):** `db.create_case` always sets a non-empty chair_name; `cases.case_create` exposes a param. `ingest_final_version` derives the chair from the SoT instead of the raw value, so the constraint no longer fails.
3. **Visibility (§6/feedback_silent_swallow):** a copy failure is returned in the result (`internal_corpus_error`) and `final_learning_pipeline` prints a warning; it is not swallowed.

Backfill for 11 cases with an empty chair.

`audit_corpus_integrity`: added CHECK_D (decided cases without a chair) + CHECK_E (signed final missing from the case-law corpus); both are 0 now.

invariants: satisfies INV-G1 (normalization on write), INV-G2 (single path, web↔MCP unified), §6 (no silent swallowing).

Tests: py_compile + 14 pytest (chair_seed_gate, audit_provenance) + integration of create_case (default+override) + a run of the live audit (A–E=0).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
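The first two layers of the fix can be illustrated with a minimal sketch. It is not the code from this commit: the prefix table, the default chair, the `COMMITTEE_CHAIR_OVERRIDE` variable name and both function signatures are assumptions made for illustration, since the commit message only says the chair is derived from the case-number prefix with an env override.

```python
import os

# Hypothetical prefix table and default; the real mapping lives in config
# and is not shown in this commit.
_CHAIR_BY_PREFIX = {"1": "Chair A", "8": "Chair B"}
_DEFAULT_CHAIR = "Chair A"


def committee_chair_for_case(case_number, env=None):
    """Single source of truth: derive the chair from the case-number prefix."""
    env = os.environ if env is None else env
    # Assumed variable name; the commit only mentions "override via env".
    override = env.get("COMMITTEE_CHAIR_OVERRIDE", "").strip()
    if override:
        return override
    number = (case_number or "").strip()
    for prefix, chair in _CHAIR_BY_PREFIX.items():
        if number.startswith(prefix):
            return chair
    return _DEFAULT_CHAIR


def create_case(case_number, chair_name=None, env=None):
    """Normalize at write time: the stored chair_name is never empty."""
    chair = (chair_name or "").strip() or committee_chair_for_case(case_number, env)
    return {"case_number": case_number, "chair_name": chair}
```

Because every entry point calls the same resolver, a blank or missing `chair_name` cannot reach the constrained table, whichever path opened the case.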
@@ -82,6 +82,28 @@ CHECK_C_SQL = (
     " 'compensation_197', '') "
     "ORDER BY case_number"
 )
+# D. cases that reached a decided state but have no chair_name. An empty chair
+# silently breaks the internal_committee corpus copy of the final
+# (case_law_internal_chair_check) — chair must be set at source (INV-G1).
+CHECK_D_SQL = (
+    "SELECT id, case_number, status FROM cases "
+    "WHERE status IN ('final', 'exported', 'reviewed') "
+    "AND (chair_name IS NULL OR chair_name = '') "
+    "ORDER BY case_number"
+)
+# E. SIGNED finals that never landed in the citable precedent corpus
+# (case_law, source_kind='internal_committee'). Only status='final' means the
+# chair's signed decision was ingested — 'exported' is merely OUR draft DOCX
+# and legitimately has no precedent copy. This is the exact failure the
+# chair_name fix prevents going forward; the check catches any regression.
+CHECK_E_SQL = (
+    "SELECT c.id, c.case_number, c.status FROM cases c "
+    "WHERE c.status = 'final' "
+    "AND NOT EXISTS (SELECT 1 FROM case_law cl "
+    " WHERE cl.case_number = c.case_number "
+    " AND cl.source_kind = 'internal_committee') "
+    "ORDER BY c.case_number"
+)
 
 
 logging.basicConfig(
@@ -178,6 +200,8 @@ def _format_report(
     a_hits: list[dict],
     b_hits: list[dict],
     c_hits: list[dict],
+    d_hits: list[dict],
+    e_hits: list[dict],
     ts: datetime,
 ) -> str:
     parts: list[str] = []
@@ -215,6 +239,29 @@ def _format_report(
     if len(c_hits) > 50:
         parts.append(f" ... ({len(c_hits) - 50} more truncated)")
     parts.append("")
+    parts.append(
+        f"Check D (decided cases missing chair_name): {len(d_hits)} hit(s)"
+    )
+    for row in d_hits[:50]:
+        parts.append(
+            f" - id={row['id']} case_number={row['case_number']!r} "
+            f"status={row.get('status')!r}"
+        )
+    if len(d_hits) > 50:
+        parts.append(f" ... ({len(d_hits) - 50} more truncated)")
+    parts.append("")
+    parts.append(
+        f"Check E (signed-final cases missing from internal_committee "
+        f"precedent corpus): {len(e_hits)} hit(s)"
+    )
+    for row in e_hits[:50]:
+        parts.append(
+            f" - id={row['id']} case_number={row['case_number']!r} "
+            f"status={row.get('status')!r}"
+        )
+    if len(e_hits) > 50:
+        parts.append(f" ... ({len(e_hits) - 50} more truncated)")
+    parts.append("")
     return "\n".join(parts)
 
 
@@ -225,12 +272,14 @@ async def main(args: argparse.Namespace) -> int:
         a_hits = await _run_check(conn, CHECK_A_SQL)
         b_hits = await _run_check(conn, CHECK_B_SQL)
         c_hits = await _run_check(conn, CHECK_C_SQL)
+        d_hits = await _run_check(conn, CHECK_D_SQL)
+        e_hits = await _run_check(conn, CHECK_E_SQL)
     finally:
         await conn.close()
 
-    total = len(a_hits) + len(b_hits) + len(c_hits)
+    total = len(a_hits) + len(b_hits) + len(c_hits) + len(d_hits) + len(e_hits)
     ts = datetime.now(timezone.utc)
-    report = _format_report(a_hits, b_hits, c_hits, ts)
+    report = _format_report(a_hits, b_hits, c_hits, d_hits, e_hits, ts)
 
     # Always write to log (creates dir + file if missing).
     LOG_PATH.parent.mkdir(parents=True, exist_ok=True)
@@ -246,8 +295,8 @@ async def main(args: argparse.Namespace) -> int:
         return 0
 
     logger.warning(
-        "found %d total violation(s) (A=%d, B=%d, C=%d)",
-        total, len(a_hits), len(b_hits), len(c_hits),
+        "found %d total violation(s) (A=%d, B=%d, C=%d, D=%d, E=%d)",
+        total, len(a_hits), len(b_hits), len(c_hits), len(d_hits), len(e_hits),
     )
 
     if args.notify:
@@ -256,6 +305,8 @@ async def main(args: argparse.Namespace) -> int:
             f"- Check A (external_upload עם prefix פנימי): {len(a_hits)}",
             f"- Check B (internal_committee חסר chair/district): {len(b_hits)}",
             f"- Check C (cases.practice_area לא תקין): {len(c_hits)}",
+            f"- Check D (תיקים מוכרעים ללא chair_name): {len(d_hits)}",
+            f"- Check E (סופיים חסרים מקורפוס-הפסיקה הפנימי): {len(e_hits)}",
             "",
             f"פירוט מלא: {LOG_PATH}",
         ]
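To see what the two new checks flag, the queries can be run against a throwaway database. The sketch below is an illustration only: it uses the standard-library `sqlite3` module as a stand-in for the project's async connection, and the two-table schema and sample rows are invented for the demo (only the column names used by the queries are taken from the diff).

```python
import sqlite3

# Queries copied from the diff above.
CHECK_D_SQL = (
    "SELECT id, case_number, status FROM cases "
    "WHERE status IN ('final', 'exported', 'reviewed') "
    "AND (chair_name IS NULL OR chair_name = '') "
    "ORDER BY case_number"
)
CHECK_E_SQL = (
    "SELECT c.id, c.case_number, c.status FROM cases c "
    "WHERE c.status = 'final' "
    "AND NOT EXISTS (SELECT 1 FROM case_law cl "
    " WHERE cl.case_number = c.case_number "
    " AND cl.source_kind = 'internal_committee') "
    "ORDER BY c.case_number"
)


def demo():
    """Build an in-memory fixture (assumed schema) and run checks D and E."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.executescript(
        """
        CREATE TABLE cases (id INTEGER PRIMARY KEY, case_number TEXT,
                            status TEXT, chair_name TEXT);
        CREATE TABLE case_law (id INTEGER PRIMARY KEY, case_number TEXT,
                               source_kind TEXT);
        -- final with empty chair and no precedent copy: hits D and E
        INSERT INTO cases VALUES (1, '1194', 'final', '');
        -- healthy final: chair set, precedent copy present
        INSERT INTO cases VALUES (2, '1200', 'final', 'Chair A');
        -- exported draft: no precedent copy is expected, so no E hit
        INSERT INTO cases VALUES (3, '8070', 'exported', 'Chair B');
        -- final whose only case_law row is not internal_committee: hits E
        INSERT INTO cases VALUES (4, '9000', 'final', 'Chair A');
        INSERT INTO case_law VALUES (1, '1200', 'internal_committee');
        INSERT INTO case_law VALUES (2, '9000', 'external_upload');
        """
    )
    d_hits = [dict(r) for r in conn.execute(CHECK_D_SQL)]
    e_hits = [dict(r) for r in conn.execute(CHECK_E_SQL)]
    conn.close()
    return d_hits, e_hits
```

In this fixture Check D reports only case 1194, while Check E reports 1194 and 9000; the exported draft 8070 is skipped by E because only `status = 'final'` implies that a signed decision was ingested.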