fix(precedent-library): allow re-extraction for internal_committee rows
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 3m13s
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 3m13s
The "חלץ מטא-דאטה" / "חלץ הלכות" buttons in the UI were returning 404 for any precedent with `source_kind != 'external_upload'`. The original restriction was meant to keep LLM extraction off internal-committee imports (their metadata supposedly came from the case file system), but the same precedent rows can still need re-extraction when ingest produces broken data — e.g. the corrupted `subject_tags` value `['[','"','ה','י',...]` that motivated this change (an early ingest stored a JSON literal into a TEXT[] column, which Postgres split into single chars). Two changes here: 1. db.request_metadata_extraction / request_halacha_extraction: drop the `AND source_kind='external_upload'` filter. The extractor already preserves user values (only fills empty fields), so this is safe. 2. precedent_metadata_extractor.extract_and_apply: detect the character-by-character corruption above and treat it as empty so the freshly-extracted tags actually replace the broken ones. Heuristic: 3+ elements where every element is at most 2 chars (legitimate tags are multi-character Hebrew words). Coolify deploy required for the FastAPI container to pick this up.
This commit is contained in:
@@ -257,8 +257,11 @@ async def reextract_halachot(
|
||||
case_law_id = UUID(case_law_id)
|
||||
|
||||
record = await db.get_case_law(case_law_id)
|
||||
if not record or record.get("source_kind") != "external_upload":
|
||||
raise ValueError("precedent not found or not chair-uploaded")
|
||||
if not record:
|
||||
raise ValueError("precedent not found")
|
||||
# Was restricted to source_kind='external_upload'; opened 2026-05-06 so
|
||||
# internal_committee rows can also be re-extracted when ingest produced
|
||||
# bad data. See note in db.request_metadata_extraction.
|
||||
|
||||
await progress("extracting_halachot", 50, "מחלץ הלכות מחדש")
|
||||
result = await halacha_extractor.extract(case_law_id)
|
||||
@@ -402,8 +405,9 @@ async def reextract_metadata(
|
||||
case_law_id = UUID(case_law_id)
|
||||
|
||||
record = await db.get_case_law(case_law_id)
|
||||
if not record or record.get("source_kind") != "external_upload":
|
||||
raise ValueError("precedent not found or not chair-uploaded")
|
||||
if not record:
|
||||
raise ValueError("precedent not found")
|
||||
# See note in db.request_metadata_extraction — opened to all source kinds.
|
||||
|
||||
await progress("extracting_metadata", 40, "מחלץ מטא-דאטה (תקציר, תגיות)")
|
||||
result = await precedent_metadata_extractor.extract_and_apply(case_law_id)
|
||||
|
||||
Reference in New Issue
Block a user