feat(halacha-triage): quality-gated + prioritized review queue + metrics (#84)

Backend for the halacha approval-queue triage (#84). The keyboard UI, batch
actions and defer/reject (#84.4–6) already shipped; this adds the gating,
prioritization and metrics the queue was missing.

db.list_halachot — two opt-in triage controls:
  * exclude_low_quality (#84.1): drop items carrying ANY quality_flag
    (application / quote_unverified / truncated / non_decision / thin /
    nli_unsupported / near_duplicate) — they belong in a 'needs extraction fix'
    bucket, not the chair's approve queue.
  * order_by_priority (#84.3): active-learning order — negatively-treated
    first, then most-uncertain (lowest confidence), then oldest — instead of
    FIFO, so the highest-value decisions surface first.

halachot_pending (MCP) — now gated + prioritized BY DEFAULT; include_low_quality=
true reveals the needs-fix bucket. The agent review path benefits immediately.

GET /api/halachot — same two params, default OFF (non-breaking; the UI opts in).

metrics.halacha_backlog (#84.7) — splits pending into clean vs flagged, adds
deferred, reviewed_total, approve_ratio, and a pending_by_flag breakdown, so the
backlog distinguishes real review work from extraction noise.

Deferred (documented): #84.2 near-duplicate cluster cards and wiring the UI
fetch to the new params require frontend work + an api:types regen AFTER this
deploys (the new query params aren't in prod's OpenAPI until then) — a clean
follow-up. The backend fully supports both now.

Verified against the live DB (read-only):
- pending 177 → gated-clean 110, 0 flagged items leak into the clean queue.
- priority order surfaces the lowest-confidence items first (0.55, 0.55, ...).
- backlog: pending_clean=110 / pending_flagged=67 / approve_ratio=0.916,
  pending_by_flag={nli_unsupported:59, quote_unverified:3, thin:3, truncated:2}.
- pytest tests/test_halacha_quality.py — 52 passed (no regression).

Invariants: G1 (gate at source — SQL filter, not post-hoc); G2 (no parallel
path — same list_halachot); §6 (flagged items routed to a bucket, never dropped).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-06 20:00:52 +00:00
parent 32ef259843
commit 420cb819f5
4 changed files with 70 additions and 5 deletions

View File

@@ -356,7 +356,22 @@ async def halacha_review(
return _ok(row)
async def halachot_pending(limit: int = 100) -> str:
"""תור ההלכות הממתינות לאישור (review_status='pending_review')."""
rows = await db.list_halachot(review_status="pending_review", limit=limit)
async def halachot_pending(limit: int = 100, include_low_quality: bool = False) -> str:
"""תור ההלכות הממתינות לאישור (review_status='pending_review').
כברירת-מחדל (#84.1, #84.3) התור **מסונן** — הלכות עם דגל-איכות כלשהו
(application / ציטוט-לא-מאומת / קטוע / obiter / restatement דק / לא-נתמך /
near-duplicate) מוסתרות (הן שייכות ל'דורש תיקון-חילוץ', לא לתור-האישור),
ו**ממוין לפי עדיפות** (טופלו-לרעה תחילה, אז הכי לא-ודאיים, אז הישנים).
Args:
limit: מספר מקסימלי.
include_low_quality: True כדי לחשוף גם פריטים מסומני-איכות (בקט 'דורש תיקון').
"""
rows = await db.list_halachot(
review_status="pending_review",
limit=limit,
exclude_low_quality=not include_low_quality,
order_by_priority=True,
)
return _ok(rows)