feat(precedents): metadata auto-fill, edit sheet, persuasive extraction

Three improvements to the precedent library based on usage feedback:

1. Auto-fill metadata at upload time. New service
   precedent_metadata_extractor reads the ruling's full_text and
   suggests case_name (short), summary, headnote, key_quote,
   subject_tags, appeal_subtype. The merge policy fills only empty
   fields, preserving everything the chair typed in the upload form.
   Wired into the ingest pipeline; also exposed as a re-run endpoint
   POST /api/precedent-library/{id}/extract-metadata for existing
   records.

2. Edit sheet in the UI. Pencil icon on each library row opens a
   pre-populated form covering every field. A Sparkles button on the
   sheet runs the metadata extractor on demand and refreshes the form.
   The case_number is read-only because halachot are FK'd to it;
   renaming requires delete + re-upload.

3. Halacha extractor branches on is_binding. Sources marked binding
   (Supreme/Administrative) keep the strict halacha prompt. Non-binding
   sources (other appeals committees, district courts on planning
   matters) get a different prompt that extracts applications,
   interpretive principles, and persuasive conclusions, labeled with
   new rule_types 'application' and 'persuasive'. The fallback also
   widens chunk selection: if the chunker labeled nothing as
   legal_analysis/ruling/conclusion, we now run on all chunks rather
   than returning zero halachot for a usable ruling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 10:19:35 +00:00
parent b51163b67c
commit 73a79ea7e8
10 changed files with 841 additions and 21 deletions
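
Note on item 1 of the commit message: the "fill only empty fields" merge policy could look roughly like the sketch below. This is an illustration under stated assumptions, not the code in precedent_metadata_extractor. The helper names `_is_empty` and `merge_fill_empty`, the rule for what counts as "empty", and the sample values are all hypothetical.

```python
from typing import Any, Mapping


def _is_empty(value: Any) -> bool:
    """Hypothetical rule: None, blank strings, and empty collections are 'empty'."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, (list, tuple, set, dict)):
        return len(value) == 0
    return False


def merge_fill_empty(
    existing: Mapping[str, Any], suggested: Mapping[str, Any]
) -> dict[str, Any]:
    """Copy a suggested value only into fields the user left empty."""
    merged = dict(existing)
    for field, value in suggested.items():
        # Never overwrite something the chair typed; skip empty suggestions too.
        if _is_empty(merged.get(field)) and not _is_empty(value):
            merged[field] = value
    return merged


# Sample data (made up): the chair filled case_name but left the rest blank.
record = {"case_name": "Chair's title", "summary": "", "subject_tags": []}
suggestion = {
    "case_name": "Extractor title",
    "summary": "Short summary",
    "subject_tags": ["parking"],
}
merged = merge_fill_empty(record, suggestion)
```

With this policy, re-running the extractor on an existing record is safe to repeat: fields that already hold a value are left alone, so only gaps are filled.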
--- a/web/app.py
+++ b/web/app.py
@@ -3779,6 +3779,37 @@ async def precedent_library_reextract(case_law_id: str):
    return {"task_id": task_id}


+@app.post("/api/precedent-library/{case_law_id}/extract-metadata")
+async def precedent_library_extract_metadata(case_law_id: str):
+    """Re-run metadata extraction in background. Fills empty fields only."""
+    try:
+        cid = UUID(case_law_id)
+    except ValueError:
+        raise HTTPException(400, "case_law_id לא תקין")
+    record = await db.get_case_law(cid)
+    if not record:
+        raise HTTPException(404, "פסיקה לא נמצאה")
+
+    task_id = str(uuid4())
+    label = record.get("case_number") or case_law_id
+    await _progress.set(task_id, {
+        "status": "queued", "filename": label, "stage": "queued", "percent": 0,
+    })
+    publish = _make_progress_publisher(task_id, label)
+
+    async def _run():
+        try:
+            await plib_service.reextract_metadata(cid, progress=publish)
+        except Exception as e:
+            logger.exception("re-extract metadata failed")
+            await _progress.set(task_id, {
+                "status": "failed", "error": str(e), "filename": label,
+            })
+
+    asyncio.create_task(_run())
+    return {"task_id": task_id}
+
+
 @app.get("/api/halachot")
 async def halachot_list(
    case_law_id: str = "",