feat(corpus): Stage A — corpus tagging fixes + prevention layer
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 3m8s
Fixes the bug that mis-tagged appeals-committee (ועדות ערר) decisions and prevents it from recurring.

**Code changes:**
* New MCP tool `internal_decision_upload` (chair_name+district required): the sole supported path for ingesting committee decisions; tags source_kind='internal_committee' automatically.
* Citation guard in `precedent_library_upload` rejects citations starting with `ערר` or `בל"מ` with a directive to use internal_decision_upload.
* `practice_area.py` taxonomy unification: PRACTICE_AREAS now accepts both multi-tenant (appeals_committee/national_insurance/labor_law) and domain (rishuy_uvniya/betterment_levy/compensation_197) values. New helper `to_db_practice_area(multi_tenant, subtype) -> domain`.

**Agent docs:**
* legal-researcher (+5K): upload-tool decision flowchart, code samples per source_kind, district enum (ירושלים/מרכז/תל אביב/צפון/דרום/חיפה/ארצי)
* legal-ceo, legal-analyst, legal-writer, legal-qa, HEARTBEAT: taxonomy awareness, source_kind-aware citation patterns, and research_complete as a valid status.
* Fixed two pre-existing wrong practice_area values in examples (histael_hashbacha→betterment_levy, pitsuim_197→compensation_197).

Closes TaskMaster #30(parts), #38(parts), #39 (root cause).

DB-side backfill + CHECK constraints applied directly via psql:
* 11 cases.practice_area corrected (1xxx→rishuy, 8xxx→betterment)
* 6 case_law records reclassified external_upload→internal_committee with inferred district
* 6 chair_name backfilled from full_text (5 שרית אריאלי + 1 דפנה תמיר)
* 88 new halachot extracted for newly-uploaded precedents (אנטרים + ירושלים שקופה 1112/22 + אגא וכט)
* CHECK constraints: cases.practice_area enum, case_law internal⇒district

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
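The citation guard mentioned in the commit message is not part of the diff shown here. As a rough illustration only, a prefix check along the following lines would produce the described behavior. The function name `check_citation_is_external`, the constant name, and the message wording are assumptions made for this sketch, not the code in `precedent_library_upload`.

```python
from __future__ import annotations

# Prefixes the commit message names as marking appeals-committee decisions.
# The constant name is hypothetical.
COMMITTEE_CITATION_PREFIXES = ("ערר", 'בל"מ')


def check_citation_is_external(citation: str) -> str | None:
    """Return an error message if the citation looks like a committee
    decision, otherwise None (meaning the upload may proceed)."""
    normalized = citation.strip()
    # str.startswith accepts a tuple and matches any of its prefixes.
    if normalized.startswith(COMMITTEE_CITATION_PREFIXES):
        return (
            f"citation {normalized!r} looks like an appeals-committee "
            "decision; use internal_decision_upload instead"
        )
    return None
```

A guard of this shape only catches citations written with the ASCII double quote in `בל"מ`; citations typed with the Hebrew gershayim character would need an extra prefix entry.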
104
mcp-server/src/legal_mcp/tools/internal_decisions.py
Normal file
@@ -0,0 +1,104 @@
"""MCP tools for the Internal Decisions corpus.

Decisions of appeals committees (ועדות ערר) live in the same physical
``case_law`` table as court rulings but are distinguished by
``source_kind='internal_committee'`` and must carry ``chair_name`` and
``district``.

The existing ``precedent_library_upload`` MCP tool always stores
``source_kind='external_upload'`` and does not accept chair/district,
which is why **44+ existing appeals-committee decisions were tagged
incorrectly**. This wrapper is the authoritative ingestion path for committee
decisions and enforces the required metadata at the tool boundary.
"""

from __future__ import annotations

import json

from legal_mcp.services import internal_decisions as int_svc

# Valid Hebrew district names (matches _COURT_TO_DISTRICT in service)
VALID_DISTRICTS = {"ירושלים", "מרכז", "תל אביב", "תל-אביב", "צפון", "דרום", "חיפה", "ארצי"}


def _ok(payload) -> str:
    return json.dumps(payload, ensure_ascii=False, indent=2, default=str)


def _err(msg: str) -> str:
    return json.dumps({"error": msg}, ensure_ascii=False)


async def internal_decision_upload(
    file_path: str,
    case_number: str,
    chair_name: str,
    district: str,
    case_name: str = "",
    court: str = "",
    decision_date: str = "",
    practice_area: str = "",
    appeal_subtype: str = "",
    subject_tags: list[str] | None = None,
    summary: str = "",
    is_binding: bool = False,
) -> str:
    """Upload an appeals-committee decision (internal_committee) to the authoritative corpus.

    Required: file_path, case_number, chair_name, district.
    The tool enforces chair_name+district so the record cannot be saved
    in the broken legacy mode (external_upload with empty chair/district).

    Args:
        file_path: Full path to a PDF/DOCX/RTF/TXT/MD file.
        case_number: Appeal number ("ערר (ועדות ערר - תכנון ובנייה ירושלים) 1110/20 ...").
        chair_name: Name of the committee chair (required).
        district: District (ירושלים/מרכז/תל אביב/צפון/דרום/חיפה/ארצי); required.
        case_name: Short name.
        court: Tribunal ("ועדת הערר לתכנון ובנייה — מחוז ירושלים").
        decision_date: ISO date (YYYY-MM-DD), optional.
        practice_area: rishuy_uvniya / betterment_levy / compensation_197.
        appeal_subtype: building_permit, etc.
        subject_tags: Subject tags.
        summary: Free-text summary, optional.
        is_binding: Usually False (one appeals committee does not bind
            another; its decisions carry horizontal persuasive weight only).

    Returns: JSON with case_law_id, the number of chunks, and halachot_pending.
    """
    # Reject blank required fields up front; each error names the missing field.
    if not file_path.strip():
        return _err("file_path חובה")  # "file_path is required"
    if not case_number.strip():
        return _err("case_number חובה")  # "case_number is required"
    if not chair_name.strip():
        # Message: chair_name is required; without the chair's name the
        # decision cannot be searched selectively by panel composition.
        return _err(
            "chair_name חובה. החלטות ועדת ערר חייבות שם יו\"ר — "
            "בלעדיו ההחלטה לא ניתנת לחיפוש סלקטיבי לפי הרכב."
        )
    if not district.strip():
        # Message: district is required, followed by the valid values.
        return _err(
            "district חובה. ערכים תקפים: " + ", ".join(sorted(VALID_DISTRICTS))
        )
    if district.strip() not in VALID_DISTRICTS:
        # Message: invalid district, followed by the valid values.
        return _err(
            f"district לא תקין: {district!r}. ערכים תקפים: "
            + ", ".join(sorted(VALID_DISTRICTS))
        )

    try:
        result = await int_svc.ingest_internal_decision(
            case_number=case_number,
            case_name=case_name,
            court=court,
            decision_date=decision_date or None,
            chair_name=chair_name,
            # The district check compares district.strip(), so store that same
            # value rather than one with stray whitespace.
            district=district.strip(),
            practice_area=practice_area,
            appeal_subtype=appeal_subtype,
            subject_tags=subject_tags or [],
            summary=summary,
            is_binding=is_binding,
            file_path=file_path,
        )
    except Exception as e:
        # Surface any service failure to the caller as a JSON error payload.
        return _err(str(e))
    return _ok(result)
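
`VALID_DISTRICTS` accepts two spellings of Tel Aviv (`תל אביב` and `תל-אביב`), while the district enum in the commit message lists only the spaced form. One way to keep stored values uniform is to map accepted spellings onto a canonical name before ingestion. The sketch below is an illustration under that assumption; `CANONICAL_DISTRICTS`, `DISTRICT_ALIASES`, and `normalize_district` are hypothetical names, not part of this commit.

```python
from __future__ import annotations

# Canonical names, taken from the district enum in the commit message.
CANONICAL_DISTRICTS = ("ירושלים", "מרכז", "תל אביב", "צפון", "דרום", "חיפה", "ארצי")

# Hypothetical alias table: alternate spellings mapped to the canonical name.
DISTRICT_ALIASES = {"תל-אביב": "תל אביב"}


def normalize_district(raw: str) -> str | None:
    """Return the canonical district name, or None if the input is not valid."""
    # Trim the ends and collapse internal runs of whitespace to one space.
    value = " ".join(raw.split())
    # Map a known alternate spelling to its canonical form.
    value = DISTRICT_ALIASES.get(value, value)
    return value if value in CANONICAL_DISTRICTS else None
```

With a helper like this, the tool could validate and store the returned value in a single step, so a filter on `district` would not need to account for both spellings.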