feat: Stage C — RAG advanced (#33, #47, #48, #49, #50, #51)
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m35s

Six independent sub-tasks dispatched in parallel; aggregated here.

## #33 — Hide case_name column
library-list-panel.tsx: `<TableHead>` + `<TableCell>` for "שם"
get `className="hidden"` in both Court and Committee row variants.
DB column preserved for future use.

## #47 — Audit script periodic
New scripts/audit_corpus_integrity.py — 3 SQL checks (external+ערר
prefix, internal missing chair/district, cases.practice_area enum)
+ CEO wakeup on violations + cron `0 7 * * *`. First run: 0 issues.

## #48 — Parent-doc retrieval (gated, default off)
Schema V17: precedent_chunks.parent_chunk_id + chunk_role
('child'|'parent'). New chunker.chunk_document_hierarchical() —
section-aware parents (~1500 tokens) containing ~5 overlapping
children (~300 tokens each). New db.store_precedent_chunks_hierarchical
two-pass writer. Search SQL (semantic + lexical) LEFT-JOIN parent and
swap content + dedupe by parent_chunk_id when flag on. Toggle:
PARENT_DOC_RETRIEVAL_ENABLED + PARENT_DOC_{CHILD,PARENT}_SIZE_TOKENS.
Backfill ~3min and ~$0.20 — deferred to follow-up.

## #49 — Multimodal backfill
New scripts/backfill_multimodal_precedents.py with token-matching
case_number ↔ source files (PDF + DOCX via PyMuPDF). Ran in container:
26 precedents embedded, 503 pages, $0.21, 0 errors. precedent_image_embeddings
grew 3 → 29 rows. 44 remaining are style_corpus-migrated rows (no
source file on disk) — will catch up when re-uploaded.

## #50 — Closed-loop feedback + nDCG
Schema V18: search_logs + search_relevance_feedback. New telemetry.py
with fire-and-forget log_search_bg (p50 = 0.002ms — zero overhead) +
auto-infer_relevance_from_citations (reads case drafts → marks score=3
when cited precedent appears in past search top-K). Hooks added to 5
search paths. scripts/compute_ndcg.py for aggregation. Two admin API
endpoints (GET /api/admin/rag-metrics + POST .../infer). Dashboard UI
deferred — API is enough for now.

## #51 — Halacha quality monitoring
New scripts/monitor_halacha_quality.py — baseline avg confidence
(trusted=0.849, all=0.833, pending=0.694) with rolling window drift
detection. Default 5% threshold. Exits non-zero on alert for cron
integration. Recommended: `0 8 * * 1` weekly Mon 8am.

## Bonus: 230 unlinked citations → missing_precedents
Bulk-imported 230 distinct unlinked citations from
precedent_internal_citations to missing_precedents.status='open',
party='committee', with notes listing source citers. Top candidate:
ע"א 3213/97 (cited 5x). Total open missing_precedents now 237.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-05-26 11:26:52 +00:00
parent 3a05e30c8d
commit 2aee398b4a
15 changed files with 2493 additions and 57 deletions

View File

@@ -5250,3 +5250,46 @@ async def missing_precedent_upload(
"case_law_id": case_law_id,
"route": "internal_committee" if is_committee else "external_upload",
}
# ── RAG telemetry / nDCG dashboard ────────────────────────────────────
# Backs the /admin/rag-metrics page. The heavy aggregation lives in
# ``scripts/compute_ndcg.py`` — we re-use its functions here so the API
# response stays in lock-step with the CLI tool.
@app.get("/api/admin/rag-metrics")
async def api_rag_metrics(weeks: int = 12, k: int = 10):
"""Return nDCG@k aggregates for the RAG retrieval feedback loop.
Args:
weeks: window for "recent" metrics (default 12).
k: nDCG cutoff (default 10).
"""
# Late import — keeps the path-extension to scripts/ local to this route.
scripts_dir = Path(__file__).resolve().parent.parent / "scripts"
if str(scripts_dir) not in sys.path:
sys.path.insert(0, str(scripts_dir))
import compute_ndcg # type: ignore
try:
metrics = await compute_ndcg.compute(weeks=weeks, k=k)
except Exception as e:
logger.exception("rag-metrics compute failed")
raise HTTPException(500, f"חישוב מטריקות נכשל: {e}") from e
return metrics
@app.post("/api/admin/rag-metrics/infer")
async def api_rag_metrics_infer(limit: int | None = None):
"""Run auto-inference: for every finalized case, mark its cited
precedents as ``relevance_score=3`` against any search_log where
they appeared in the top-K. Idempotent.
"""
from legal_mcp.services import telemetry as telem_svc
try:
result = await telem_svc.infer_relevance_for_all_finalized_cases(limit=limit)
except Exception as e:
logger.exception("rag-metrics auto-inference failed")
raise HTTPException(500, f"auto-inference נכשל: {e}") from e
return result