feat(extraction): precedent metadata via Gemini Flash + scheduled drainer

The /precedents metadata queue was stuck — 24 rows requested, nothing draining
them — and the agentic claude CLI hit error_max_turns on what is a single
structured text→JSON task (slow + flaky). Metadata extraction is bounded
extraction, the wrong fit for an agentic loop.

- gemini_session.py: query_json drop-in (gemini-2.5-flash, JSON mode, httpx —
  no new SDK dep). Reads GEMINI_API_KEY (~/.env; SoT Infisical
  nautilus:/external-apis/gemini). Host-side only — no LLM from the container.
- precedent_metadata_extractor: claude_session.query_json → gemini_session.
  Validated live: rich, accurate fields (case_name/summary/appeal_subtype/tags).
- process_pending_extractions: kind-aware cooldown — metadata 2s (Gemini, fast),
  halacha keeps 30s (Claude rate limits).
- drain_metadata_queue.py + legal-metadata-drain.config.cjs (pm2 cron */15) so
  the queue never clogs again. SCRIPTS.md.
- X8 INV-FP5 updated: per-task engine choice (Gemini=bounded metadata,
  claude_session=agentic halacha), both host-side, single canonical queue (G2).

Agentic/voice-sensitive work (writing, analysis, halacha) stays on claude_session
(Daphna's subscription). Gemini cost ≈ $0.10/1M tokens — negligible.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-08 05:13:49 +00:00
parent cc9adc5c1f
commit d95a36f310
7 changed files with 202 additions and 9 deletions

View File

@@ -19,7 +19,7 @@ from datetime import date as date_type
from uuid import UUID
from legal_mcp.config import parse_llm_json
from legal_mcp.services import claude_session, db
from legal_mcp.services import db, gemini_session
logger = logging.getLogger(__name__)
@@ -150,7 +150,10 @@ async def extract_metadata(case_law_id: UUID | str) -> dict:
)
try:
result = await claude_session.query_json(
# Bounded structured extraction → Gemini Flash (JSON mode). The agentic
# claude CLI hit error_max_turns on this single-shot task; see
# gemini_session.py. Voice-sensitive/agentic work stays on claude_session.
result = await gemini_session.query_json(
user_msg, system=METADATA_EXTRACTION_PROMPT,
)
except Exception as e: