legal-ai/scripts/legal-court-fetch-drain.config.cjs
Chaim f4f110f0d1 feat(X13): scheduled drain — fully-autonomous digest→fetch→ingest loop
- scripts/drain_court_fetch.py: drives orchestrator.drain_pending (host-only;
  no-op when queue empty). Mirrors drain_halacha_queue.py.
- scripts/legal-court-fetch-drain.config.cjs: pm2 cron (hourly :17, one-shot),
  COURT_FETCH_DRAIN_CRON override.
- fix: orchestrator default service URL 127.0.0.1 → 10.0.1.1 (the service binds
  the docker0 gateway; the host can't reach it on loopback). Found live — the
  first drain failed "connection refused" until corrected.
- SCRIPTS.md entries.

Validated end-to-end in PRODUCTION on a real digest: administrative petition
(עת"מ) 43830-12-24 (Society for the Protection of Nature) fetched from
Net HaMishpat (נט המשפט) → case_law (79 chunks, source_url), digest relinked
(INV-DIG3 closed), halacha queued pending_review. job=done.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 20:31:53 +00:00

/**
 * pm2 ecosystem entry for legal-court-fetch-drain: a scheduled (hourly) one-shot
 * that drains the X13 court-verdict fetch queue filled by the digest trigger,
 * making the digest → fetch → ingest loop fully autonomous (no manual
 * court_fetch_drain).
 *
 * Pattern: cron_restart fires the script on schedule; autorestart: false means it
 * runs once and exits. pm2 shows it as "stopped" between ticks, which is expected
 * for a cron job. When the queue is empty the run is a fast no-op, so an hourly
 * schedule is cheap.
 *
 * Requires (already deployed): legal-court-fetch-service (+xvfb) running for the
 * browser fetch, and the host env (~/.env: POSTGRES_URL, VOYAGE_API_KEY,
 * COURT_FETCH_SHARED_SECRET), which the venv loads via legal_mcp.config. Ingest
 * uses the local claude CLI for halacha extraction. Halachot land in
 * pending_review, so the chair's approval gate is untouched.
 *
 * Install (once):
 *   pm2 start /home/chaim/legal-ai/scripts/legal-court-fetch-drain.config.cjs
 *   pm2 save
 * Logs: pm2 logs legal-court-fetch-drain --lines 50
 * Run now (manual): mcp-server/.venv/bin/python scripts/drain_court_fetch.py
 *
 * Schedule override: COURT_FETCH_DRAIN_CRON (default: hourly at :17, to avoid the
 * top-of-hour stampede with other jobs).
 */
const cron = process.env.COURT_FETCH_DRAIN_CRON || "17 * * * *";

module.exports = {
  apps: [
    {
      name: "legal-court-fetch-drain",
      cwd: "/home/chaim/legal-ai",
      script: "/home/chaim/legal-ai/mcp-server/.venv/bin/python",
      args: "scripts/drain_court_fetch.py 5",
      env: { HOME: "/home/chaim", PYTHONUNBUFFERED: "1" },
      autorestart: false, // one-shot per cron tick
      cron_restart: cron,
      max_memory_restart: "800M",
    },
  ],
};
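
The schedule override in the config above is passed to pm2 unchecked, so a malformed COURT_FETCH_DRAIN_CRON would only surface when pm2 parses it. A minimal sketch of a stricter variant follows. The helper names resolveCron and drainApp are hypothetical and not part of the committed file, and the five-field check is an assumption of this sketch: pm2's own cron parser may accept other forms. Unlike the original `||` fallback, a whitespace-only value is also treated as unset here.

```javascript
// Hypothetical helpers, not part of the deployed config.
const DEFAULT_CRON = "17 * * * *"; // hourly at :17, as in the original file

// Resolve the schedule: env override if present, else the default.
// Assumption: a valid override has exactly five whitespace-separated fields.
function resolveCron(env = process.env) {
  const raw = (env.COURT_FETCH_DRAIN_CRON || "").trim();
  if (raw === "") return DEFAULT_CRON; // unset or blank: keep the default
  const fields = raw.split(/\s+/);
  if (fields.length !== 5) {
    throw new Error(
      `COURT_FETCH_DRAIN_CRON must have 5 fields, got ${fields.length}: "${raw}"`
    );
  }
  return fields.join(" "); // normalised to single spaces
}

// Build the same one-shot ecosystem entry, using the checked schedule.
function drainApp(env = process.env) {
  return {
    apps: [
      {
        name: "legal-court-fetch-drain",
        cwd: "/home/chaim/legal-ai",
        script: "/home/chaim/legal-ai/mcp-server/.venv/bin/python",
        args: "scripts/drain_court_fetch.py 5",
        env: { HOME: "/home/chaim", PYTHONUNBUFFERED: "1" },
        autorestart: false, // run once per tick, then exit
        cron_restart: resolveCron(env),
        max_memory_restart: "800M",
      },
    ],
  };
}
```

With this shape the config file would end in `module.exports = drainApp();`, and a bad override would fail at `pm2 start` with a message naming the variable.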