fix(corpus): move citation guard to service level
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m31s

Defense in depth — the MCP wrapper guard catches researcher uploads, but
the HTTP API (/api/precedent-library/upload) bypasses the wrapper and
calls services.precedent_library.ingest_precedent directly. The guard
now also lives in the service, so HTTP uploads of ערר/בל"מ citations
to the external corpus get rejected at the source.

Companion to DB constraint case_law_external_arar_check (applied via
psql) — three independent layers now enforce the same invariant.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-05-26 07:49:49 +00:00
parent c6e368e4f7
commit b197d2329c

View File

@@ -116,6 +116,18 @@ async def ingest_precedent(
raise FileNotFoundError(f"file not found: {src}") raise FileNotFoundError(f"file not found: {src}")
if not citation.strip(): if not citation.strip():
raise ValueError("citation is required") raise ValueError("citation is required")
# Citation guard at service level (catches both MCP and HTTP API paths).
# Appeals-committee decisions must go through ingest_internal_decision
# which records chair_name+district. The MCP wrapper has the same guard
# for an earlier, friendlier error message — but this is the source of
# truth. See TaskMaster #30(ב) and DB constraint case_law_external_arar_check.
_norm = citation.strip()
if _norm.startswith(("ערר ", "ערר(", "בל\"מ ", "בל\"מ(", "ARAR ")):
raise ValueError(
"ציטוט שמתחיל ב-'ערר' או 'בל\"מ' הוא החלטת ועדת ערר. "
"השתמש ב-internal_decision_upload (דורש chair_name + district), "
"לא ב-precedent_library_upload."
)
if practice_area not in _VALID_PRACTICE_AREAS: if practice_area not in _VALID_PRACTICE_AREAS:
raise ValueError(f"invalid practice_area: {practice_area!r}") raise ValueError(f"invalid practice_area: {practice_area!r}")
if source_type not in _VALID_SOURCE_TYPES: if source_type not in _VALID_SOURCE_TYPES: