feat(corpus): Stage A — corpus tagging fixes + prevention layer
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 3m8s
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 3m8s
מתקן את הבאג של תיוג שגוי לועדות ערר ומונע חזרתו: **Code changes:** * New MCP tool `internal_decision_upload` (chair_name+district required) — sole supported path for ingesting committee decisions; tags source_kind='internal_committee' automatically. * Citation guard in `precedent_library_upload` rejects citations starting with "ערר" or "בל\"מ" with a directive to use internal_decision_upload. * `practice_area.py` taxonomy unification: PRACTICE_AREAS now accepts both multi-tenant (appeals_committee/national_insurance/labor_law) and domain (rishuy_uvniya/betterment_levy/compensation_197) values. New helper `to_db_practice_area(multi_tenant, subtype) -> domain`. **Agent docs:** * legal-researcher (+5K): upload-tool decision flowchart, code samples per source_kind, district enum (ירושלים/מרכז/תל אביב/צפון/דרום/חיפה/ארצי) * legal-ceo, legal-analyst, legal-writer, legal-qa, HEARTBEAT — taxonomy awareness + source_kind-aware citation patterns + research_complete as valid status. * Fixed two pre-existing wrong practice_area values in examples (histael_hashbacha→betterment_levy, pitsuim_197→compensation_197). Closes TaskMaster #30(parts), #38(parts), #39 (root cause). DB-side backfill + CHECK constraints applied directly via psql: * 11 cases.practice_area corrected (1xxx→rishuy, 8xxx→betterment) * 6 case_law records reclassified external_upload→internal_committee with inferred district * 6 chair_name backfilled from full_text (5 שרית אריאלי + 1 דפנה תמיר) * 88 new halachot extracted for newly-uploaded precedents (אנטרים + ירושלים שקופה 1112/22 + אגא וכט) * CHECK constraints: cases.practice_area enum, case_law internal⇒district Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
@@ -63,6 +63,18 @@ async def precedent_library_upload(
|
||||
"""
|
||||
if not citation.strip():
|
||||
return _err("citation חובה")
|
||||
# Citation guard: appeals-committee decisions must go through
|
||||
# internal_decision_upload (with chair_name + district). The legacy
|
||||
# path always stored source_kind='external_upload' and left
|
||||
# chair_name/district empty — see TaskMaster #30(ב).
|
||||
_norm = citation.strip()
|
||||
_committee_prefixes = ("ערר ", "ערר(", "ערר ", "בל\"מ ", "בל\"מ(", "ARAR ")
|
||||
if any(_norm.startswith(p) for p in _committee_prefixes):
|
||||
return _err(
|
||||
"ציטוט שמתחיל ב-'ערר' או 'בל\"מ' הוא החלטת ועדת ערר. "
|
||||
"השתמש ב-internal_decision_upload (דורש chair_name + district), "
|
||||
"לא ב-precedent_library_upload."
|
||||
)
|
||||
try:
|
||||
result = await precedent_library.ingest_precedent(
|
||||
file_path=file_path,
|
||||
|
||||
Reference in New Issue
Block a user