feat(extraction): precedent metadata via Gemini Flash + scheduled drainer
The /precedents metadata queue was stuck — 24 rows requested, nothing draining them — and the agentic claude CLI hit error_max_turns on what is a single structured text→JSON task (slow + flaky). Metadata extraction is bounded extraction, the wrong fit for an agentic loop. - gemini_session.py: query_json drop-in (gemini-2.5-flash, JSON mode, httpx — no new SDK dep). Reads GEMINI_API_KEY (~/.env; SoT Infisical nautilus:/external-apis/gemini). Host-side only — no LLM from the container. - precedent_metadata_extractor: claude_session.query_json → gemini_session. Validated live: rich, accurate fields (case_name/summary/appeal_subtype/tags). - process_pending_extractions: kind-aware cooldown — metadata 2s (Gemini, fast), halacha keeps 30s (Claude rate limits). - drain_metadata_queue.py + legal-metadata-drain.config.cjs (pm2 cron */15) so the queue never clogs again. SCRIPTS.md. - X8 INV-FP5 updated: per-task engine choice (Gemini=bounded metadata, claude_session=agentic halacha), both host-side, single canonical queue (G2). Agentic/voice-sensitive work (writing, analysis, halacha) stays on claude_session (Daphna's subscription). Gemini cost ≈ $0.10/1M tokens — negligible. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -15,6 +15,7 @@ from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import logging
|
||||
import os
|
||||
from pathlib import Path
|
||||
from typing import Awaitable, Callable
|
||||
from uuid import UUID
|
||||
@@ -179,6 +180,9 @@ async def reextract_halachot(
|
||||
# precedent into a 429 storm. Observed 2026-05-03: 1110/20 succeeded with 9
|
||||
# halachot, 317/10 immediately after returned silent no_halachot.
|
||||
INTER_PRECEDENT_COOLDOWN_SEC = 30
|
||||
# Metadata extraction is on Gemini (fast, high rate limits) — a brief spacer is
|
||||
# enough; the 30s above is for the Claude-backed halacha path.
|
||||
METADATA_COOLDOWN_SEC = float(os.environ.get("METADATA_COOLDOWN_SEC", "2"))
|
||||
|
||||
# How many times to retry a precedent that came back as 'extraction_failed'
|
||||
# (i.e. >50% chunks crashed). Each retry uses a longer cooldown.
|
||||
@@ -226,11 +230,14 @@ async def process_pending_extractions(kind: str = "metadata", limit: int = 20) -
|
||||
cid, effort=config.HALACHA_BULK_EXTRACT_EFFORT,
|
||||
)
|
||||
|
||||
# Metadata extraction runs on Gemini (high rate limits, fast) — the long
|
||||
# cooldown is only needed for halacha (Claude/Anthropic rate limits).
|
||||
cooldown = METADATA_COOLDOWN_SEC if kind == "metadata" else INTER_PRECEDENT_COOLDOWN_SEC
|
||||
results: list[dict] = []
|
||||
processed = 0
|
||||
for idx, row in enumerate(pending):
|
||||
if idx > 0:
|
||||
await asyncio.sleep(INTER_PRECEDENT_COOLDOWN_SEC)
|
||||
await asyncio.sleep(cooldown)
|
||||
cid = UUID(str(row["id"]))
|
||||
attempts = 0
|
||||
result: dict = {}
|
||||
|
||||
Reference in New Issue
Block a user