feat(precedents): UI button queues extraction for local MCP worker

The chair wanted a one-click "extract metadata" button on the edit sheet. The constraint stays the same — claude_session needs the local CLI which the container doesn't have, so the button can't run the extractor itself. Compromise: button stamps a queue marker; the local MCP server drains the queue on demand. DB (V8): two nullable timestamps on case_law, metadata_extraction_requested_at and halacha_extraction_requested_at, with partial indexes for cheap "find pending" scans. API: POST /api/precedent-library/{id}/request-metadata → stamp the row POST /api/precedent-library/{id}/request-halachot → same for halacha GET /api/precedent-library/queue/pending?kind=... → read-only view UI: Sparkles button in the edit sheet header. Click → toast tells the chair what to run from Claude Code. The button never triggers the extractor directly from the container. MCP tool: precedent_process_pending(kind, limit) — runs from Claude Code with the local CLI, picks up everything stamped, calls the extractor for each, clears the timestamp on success. Failures keep the timestamp so the next invocation retries them. Architectural rule (claude_session local-only) is preserved end-to-end and called out in the new endpoint comment + tool docstring. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 12:32:25 +00:00
parent 8e1384b897
commit 4a9a6b7970
7 changed files with 293 additions and 21 deletions
--- a/mcp-server/src/legal_mcp/services/precedent_library.py
+++ b/mcp-server/src/legal_mcp/services/precedent_library.py
@@ -253,6 +253,65 @@ async def reextract_halachot(
    return result


+async def process_pending_extractions(kind: str = "metadata", limit: int = 20) -> dict:
+    """Drain the extraction queue (UI-button-stamped requests).
+
+    The button in the web UI cannot run claude_session itself (it lives in
+    the container, no CLI). It just stamps ``metadata_extraction_requested_at``
+    on the row. This function — called from local Claude Code via the MCP
+    tool — picks each stamped row up, runs the extractor, and clears the
+    timestamp.
+
+    Args:
+        kind: 'metadata' or 'halacha'.
+        limit: max rows to process this run.
+    """
+    from legal_mcp.services import halacha_extractor, precedent_metadata_extractor
+
+    if kind not in {"metadata", "halacha"}:
+        raise ValueError("kind must be 'metadata' or 'halacha'")
+
+    pending = await db.list_pending_extraction_requests(kind=kind, limit=limit)
+    if not pending:
+        return {"status": "no_pending", "kind": kind, "processed": 0, "results": []}
+
+    results: list[dict] = []
+    processed = 0
+    for row in pending:
+        cid = UUID(str(row["id"]))
+        try:
+            if kind == "metadata":
+                result = await precedent_metadata_extractor.extract_and_apply(cid)
+            else:
+                result = await halacha_extractor.extract(cid)
+            await db.clear_extraction_request(cid, kind=kind)
+            processed += 1
+            results.append({
+                "case_law_id": str(cid),
+                "case_number": row.get("case_number", ""),
+                "status": result.get("status", "unknown"),
+                "fields": result.get("fields", []),
+                "stored": result.get("stored", 0),
+            })
+        except Exception as e:
+            logger.exception("process_pending_extractions failed for %s: %s", cid, e)
+            results.append({
+                "case_law_id": str(cid),
+                "case_number": row.get("case_number", ""),
+                "status": "failed",
+                "error": str(e),
+            })
+            # Don't clear the request — it stays for the next run.
+
+    return {
+        "status": "completed",
+        "kind": kind,
+        "processed": processed,
+        "total_pending": len(pending),
+        "results": results,
+    }
+
+
 async def reextract_metadata(
    case_law_id: UUID | str,
    progress: ProgressCb | None = None,