The /precedents metadata queue was stuck (24 rows requested, nothing draining them), and the agentic claude CLI hit error_max_turns on what is a single structured text→JSON task (slow and flaky). Metadata extraction is bounded extraction, the wrong fit for an agentic loop.

- gemini_session.py: query_json drop-in (gemini-2.5-flash, JSON mode, httpx; no new SDK dep). Reads GEMINI_API_KEY (~/.env; SoT Infisical nautilus:/external-apis/gemini). Host-side only: no LLM from the container.
- precedent_metadata_extractor: claude_session.query_json → gemini_session. Validated live: rich, accurate fields (case_name/summary/appeal_subtype/tags).
- process_pending_extractions: kind-aware cooldown. Metadata waits 2s (Gemini, fast); halacha keeps 30s (Claude rate limits).
- drain_metadata_queue.py + legal-metadata-drain.config.cjs (pm2 cron */15) so the queue never clogs again. SCRIPTS.md.
- X8 INV-FP5 updated: per-task engine choice (Gemini=bounded metadata, claude_session=agentic halacha), both host-side, single canonical queue (G2).

Agentic/voice-sensitive work (writing, analysis, halacha) stays on claude_session (Daphna's subscription). Gemini cost ≈ $0.10/1M tokens, which is negligible.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
35 lines
1.3 KiB
JavaScript
/**
 * pm2 ecosystem entry for legal-metadata-drain: scheduled (every 15 min) drain
 * of the precedent metadata-extraction queue (Gemini Flash). Keeps the
 * /precedents metadata queue from clogging (the prior agentic claude-CLI path
 * hit error_max_turns and nothing drained it autonomously).
 *
 * Pattern: cron_restart fires the script on schedule; autorestart:false → runs
 * once and exits (pm2 shows "stopped" between ticks; this is expected). Cheap
 * no-op when the queue is empty; Gemini Flash ≈ $0.10/1M tokens.
 *
 * Requires (host ~/.env via legal_mcp.config): GEMINI_API_KEY, POSTGRES_URL.
 *
 * Install (once):
 *   pm2 start /home/chaim/legal-ai/scripts/legal-metadata-drain.config.cjs
 *   pm2 save
 * Run now (manual): mcp-server/.venv/bin/python scripts/drain_metadata_queue.py
 * Schedule override: METADATA_DRAIN_CRON (default every 15 min).
 */
const cron = process.env.METADATA_DRAIN_CRON || "*/15 * * * *";

module.exports = {
  apps: [
    {
      name: "legal-metadata-drain",
      cwd: "/home/chaim/legal-ai",
      script: "/home/chaim/legal-ai/mcp-server/.venv/bin/python",
      args: "scripts/drain_metadata_queue.py 10",
      env: { HOME: "/home/chaim", PYTHONUNBUFFERED: "1" },
      autorestart: false, // one-shot per cron tick
      cron_restart: cron,
      max_memory_restart: "500M",
    },
  ],
};
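
The schedule override described in the header comment can be checked in isolation with a small sketch like the one below. `buildDrainApp` is a hypothetical helper, not something in the repository; it only mirrors the fallback logic of the config (`METADATA_DRAIN_CRON` if set, otherwise every 15 minutes) against an explicit env object, and it assumes nothing about how pm2 itself parses the cron string.

```javascript
// Hypothetical helper (not part of the repo): rebuilds the app entry from an
// explicit env object so the cron fallback can be inspected without pm2.
const DEFAULT_CRON = "*/15 * * * *";

function buildDrainApp(env = process.env) {
  // Same fallback as the config: an unset or empty variable yields the default.
  const cron = env.METADATA_DRAIN_CRON || DEFAULT_CRON;
  return {
    name: "legal-metadata-drain",
    cwd: "/home/chaim/legal-ai",
    script: "/home/chaim/legal-ai/mcp-server/.venv/bin/python",
    args: "scripts/drain_metadata_queue.py 10",
    env: { HOME: "/home/chaim", PYTHONUNBUFFERED: "1" },
    autorestart: false, // one-shot per cron tick
    cron_restart: cron,
    max_memory_restart: "500M",
  };
}

// Default schedule when the variable is absent.
console.log(buildDrainApp({}).cron_restart);
// Hourly schedule when overridden.
console.log(buildDrainApp({ METADATA_DRAIN_CRON: "0 * * * *" }).cron_restart);
```

Because the config uses `||` rather than `??`, an empty `METADATA_DRAIN_CRON` also falls back to the 15-minute default instead of passing an empty schedule to pm2.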