feat: Stage C — RAG advanced (#33, #47, #48, #49, #50, #51)
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m35s
Six independent sub-tasks were dispatched in parallel; their results are aggregated here.

## #33: Hide case_name column

library-list-panel.tsx: the `<TableHead>` and `<TableCell>` for "שם" ("Name") get `className="hidden"` in both the Court and Committee row variants. The DB column is preserved for future use.

## #47: Periodic audit script

New scripts/audit_corpus_integrity.py: 3 SQL checks (external + ערר ("appeal") prefix, internal missing chair/district, cases.practice_area enum) + CEO wakeup on violations + cron `0 7 * * *`. First run: 0 issues.

## #48: Parent-doc retrieval (gated, default off)

Schema V17: precedent_chunks.parent_chunk_id + chunk_role ('child'|'parent').
New chunker.chunk_document_hierarchical(): section-aware parents (~1500 tokens), each containing ~5 overlapping children (~300 tokens each).
New db.store_precedent_chunks_hierarchical two-pass writer.
When the flag is on, the search SQL (semantic + lexical) LEFT JOINs the parent, swaps in the parent content, and dedupes by parent_chunk_id.
Toggle: PARENT_DOC_RETRIEVAL_ENABLED + PARENT_DOC_{CHILD,PARENT}_SIZE_TOKENS.
Backfill (~3 min, ~$0.20) is deferred to a follow-up.

## #49: Multimodal backfill

New scripts/backfill_multimodal_precedents.py with token-matching case_number ↔ source files (PDF + DOCX via PyMuPDF).
Ran in container: 26 precedents embedded, 503 pages, $0.21, 0 errors. precedent_image_embeddings grew 3 → 29 rows.
The 44 remaining are style_corpus-migrated rows (no source file on disk); they will catch up when re-uploaded.

## #50: Closed-loop feedback + nDCG

Schema V18: search_logs + search_relevance_feedback.
New telemetry.py with fire-and-forget log_search_bg (p50 = 0.002 ms, so effectively no added latency) + auto-infer_relevance_from_citations (reads case drafts → marks score=3 when a cited precedent appears in a past search's top-K).
Hooks added to 5 search paths. scripts/compute_ndcg.py handles aggregation.
Two admin API endpoints (GET /api/admin/rag-metrics + POST .../infer).
Dashboard UI is deferred; the API is enough for now.
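For readers unfamiliar with the metric in #50, here is a minimal sketch of nDCG@k over graded relevance scores. It is not the code in scripts/compute_ndcg.py: the function names are hypothetical, it assumes linear gain (the score itself rather than 2^score - 1), and it assumes the ideal ordering is built only from the scores of the returned results.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the first k results (rank 1 has discount log2(2) = 1)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int) -> float:
    """DCG divided by the DCG of the same scores sorted best-first; 0.0 if nothing is relevant."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

With this definition, a search whose only cited precedent (score=3) sits at rank 2 gets `ndcg_at_k([0, 3, 0], k=10) == 1 / math.log2(3)`, roughly 0.63, while the same precedent at rank 1 gets 1.0.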
## #51: Halacha quality monitoring

New scripts/monitor_halacha_quality.py: baseline average confidence (trusted=0.849, all=0.833, pending=0.694) with rolling-window drift detection. Default threshold is 5%. Exits non-zero on alert for cron integration. Recommended schedule: `0 8 * * 1` (weekly, Monday 8am).

## Bonus: 230 unlinked citations → missing_precedents

Bulk-imported 230 distinct unlinked citations from precedent_internal_citations into missing_precedents with status='open' and party='committee', with notes listing the source citers. Top candidate: ע"א 3213/97 (cited 5x). Total open missing_precedents is now 237.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
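The drift check described under #51 could look roughly like the sketch below. It is an illustration, not the contents of scripts/monitor_halacha_quality.py: the function names and the window size of 50 are hypothetical, and it assumes "5% threshold" means a relative deviation of the rolling mean from the baseline in either direction.

```python
from collections import deque
from statistics import fmean
from typing import Iterable


def detect_drift(
    baseline: float,
    scores: Iterable[float],
    window: int = 50,        # assumed window size, not taken from the commit
    threshold: float = 0.05,  # the 5% default mentioned in the commit message
) -> bool:
    """True if the mean of the last `window` scores deviates from `baseline` by more than `threshold` (relative)."""
    recent = deque(scores, maxlen=window)  # keeps only the most recent `window` scores
    if not recent:
        return False
    return abs(fmean(recent) - baseline) / baseline > threshold


def exit_code(baseline: float, scores: Iterable[float]) -> int:
    """Map the check to a process exit status: 1 on alert, 0 otherwise."""
    return 1 if detect_drift(baseline, scores) else 0
```

A cron wrapper would pass `exit_code(...)` to `sys.exit`, so that a non-zero status marks the run as failed.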
web/app.py
@@ -5250,3 +5250,46 @@ async def missing_precedent_upload(
         "case_law_id": case_law_id,
         "route": "internal_committee" if is_committee else "external_upload",
     }
+
+
+# ── RAG telemetry / nDCG dashboard ────────────────────────────────────
+# Backs the /admin/rag-metrics page. The heavy aggregation lives in
+# ``scripts/compute_ndcg.py``; we re-use its functions here so the API
+# response stays in lock-step with the CLI tool.
+
+
+@app.get("/api/admin/rag-metrics")
+async def api_rag_metrics(weeks: int = 12, k: int = 10):
+    """Return nDCG@k aggregates for the RAG retrieval feedback loop.
+
+    Args:
+        weeks: window for "recent" metrics (default 12).
+        k: nDCG cutoff (default 10).
+    """
+    # Late import: keeps the path-extension to scripts/ local to this route.
+    scripts_dir = Path(__file__).resolve().parent.parent / "scripts"
+    if str(scripts_dir) not in sys.path:
+        sys.path.insert(0, str(scripts_dir))
+    import compute_ndcg  # type: ignore
+
+    try:
+        metrics = await compute_ndcg.compute(weeks=weeks, k=k)
+    except Exception as e:
+        logger.exception("rag-metrics compute failed")
+        raise HTTPException(500, f"חישוב מטריקות נכשל: {e}") from e  # Hebrew: "metrics computation failed"
+    return metrics
+
+
+@app.post("/api/admin/rag-metrics/infer")
+async def api_rag_metrics_infer(limit: int | None = None):
+    """Run auto-inference: for every finalized case, mark its cited
+    precedents as ``relevance_score=3`` against any search_log where
+    they appeared in the top-K. Idempotent.
+    """
+    from legal_mcp.services import telemetry as telem_svc
+    try:
+        result = await telem_svc.infer_relevance_for_all_finalized_cases(limit=limit)
+    except Exception as e:
+        logger.exception("rag-metrics auto-inference failed")
+        raise HTTPException(500, f"auto-inference נכשל: {e}") from e  # Hebrew: "auto-inference failed"
+    return result
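The inference rule in the second endpoint's docstring can be illustrated with an in-memory sketch. This is not the body of `infer_relevance_for_all_finalized_cases`, which the diff does not show: `SearchLog`, `infer_relevance`, and the dict standing in for the search_relevance_feedback table are hypothetical, and idempotency is modelled by keying feedback on the (log, precedent) pair.

```python
from dataclasses import dataclass


@dataclass
class SearchLog:
    """Hypothetical stand-in for one search_logs row."""
    log_id: int
    result_ids: list[str]  # precedent ids in ranked order


def infer_relevance(
    search_logs: list[SearchLog],
    cited_ids: set[str],
    feedback: dict[tuple[int, str], int],
    top_k: int = 10,
    score: int = 3,
) -> int:
    """Mark cited precedents that appeared in a log's top-K; return how many new marks were added."""
    added = 0
    for log in search_logs:
        for precedent_id in log.result_ids[:top_k]:
            key = (log.log_id, precedent_id)
            # Skipping existing keys is what makes a repeated run a no-op.
            if precedent_id in cited_ids and key not in feedback:
                feedback[key] = score
                added += 1
    return added


logs = [SearchLog(1, ["a", "b", "c"]), SearchLog(2, ["c", "d"])]
feedback: dict[tuple[int, str], int] = {}
first_run = infer_relevance(logs, {"c", "x"}, feedback, top_k=2)
second_run = infer_relevance(logs, {"c", "x"}, feedback, top_k=2)
```

Here precedent "c" is cited but sits at rank 3 in log 1, outside the top 2, so only log 2 receives a mark; the second run adds nothing.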