feat(X13): scheduled drain — fully-autonomous digest→fetch→ingest loop
- scripts/drain_court_fetch.py: drives orchestrator.drain_pending (host-only; no-op when the queue is empty). Mirrors drain_halacha_queue.py.
- scripts/legal-court-fetch-drain.config.cjs: pm2 cron (hourly at :17, one-shot), with a COURT_FETCH_DRAIN_CRON override.
- fix: orchestrator default service URL 127.0.0.1 → 10.0.1.1 (the service binds the docker0 gateway; the host can't reach it on loopback). Found live: the first drain failed with "connection refused" until this was corrected.
- SCRIPTS.md entries.

Validated end-to-end in PRODUCTION on a real digest: עת"מ 43830-12-24 (החברה להגנת הטבע, the Society for the Protection of Nature) fetched from נט המשפט (Net HaMishpat) → case_law (79 chunks, source_url), digest relinked (INV-DIG3 closed), halacha queued pending_review. job=done.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
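The loopback-versus-gateway fix described in the message can be checked with a small TCP reachability probe. This is a minimal sketch, not code from the commit: the helper name `can_connect` and the port 8765 in the usage comment are hypothetical, since the commit does not state which port the service listens on.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Usage (8765 is a hypothetical port, not taken from the commit):
#   can_connect("127.0.0.1", 8765)  # loopback address the old default pointed at
#   can_connect("10.0.1.1", 8765)   # docker0 gateway address the fix switches to
```

A probe like this distinguishes "service is down" from "service is bound to a different interface", which is the failure mode the fix addresses.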
43
scripts/drain_court_fetch.py
Normal file
@@ -0,0 +1,43 @@
"""Drain the X13 court-verdict fetch queue (jobs the digest trigger fills).

When a digest points at a court ruling not yet in the corpus, the digest
trigger enqueues a ``court_fetch_jobs`` row (status=pending). This script
drains those: for each pending/failed job it runs the full Tier-0/Tier-1 fetch
(via the host browser service) + the canonical ingest, then links the verdict
back to its source digest. Serial with a cooldown (INV-CF4); failures are
recorded and retried until they escalate to ``manual`` (INV-CF3).

Host-only: ingest drives halacha extraction via the local ``claude`` CLI (same
constraint as ``drain_halacha_queue.py``). A no-op (fast) when the queue is
empty. Scheduled hourly by ``legal-court-fetch-drain`` (pm2 cron); also runnable
by hand:

    mcp-server/.venv/bin/python scripts/drain_court_fetch.py [limit]
"""

import asyncio
import os
import sys

sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "mcp-server", "src"))

from legal_mcp.services import court_fetch_orchestrator as orch


async def main() -> int:
    limit = int(sys.argv[1]) if len(sys.argv) > 1 else 5
    res = await orch.drain_pending(limit=limit)
    print(f"===court-fetch drain=== processed={res.get('processed', 0)} "
          f"ingested={res.get('done', 0)}", flush=True)
    for r in res.get("results", []):
        line = f" [{r.get('status')}] {r.get('citation', '')}"
        if r.get("error"):
            line += f" — {r['error'][:120]}"
        if r.get("case_law_id"):
            line += f" → case_law {r['case_law_id']}"
        print(line, flush=True)
    return 0


if __name__ == "__main__":
    sys.exit(asyncio.run(main()))
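The docstring describes a serial drain with a cooldown and a retry-until-`manual` escalation, but `drain_pending` itself is not part of this diff. The sketch below illustrates that behaviour under stated assumptions: the `Job` class, the `drain` function, the `MAX_ATTEMPTS` threshold, and the in-memory job list are all hypothetical stand-ins for the real orchestrator and the `court_fetch_jobs` table. Only the result keys (`processed`, `done`, `results`, `status`, `citation`, `error`) are taken from what the script reads.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable

MAX_ATTEMPTS = 3  # hypothetical threshold; the real escalation rule is not shown


@dataclass
class Job:
    """Hypothetical in-memory stand-in for a court_fetch_jobs row."""
    citation: str
    status: str = "pending"  # pending | failed | done | manual
    attempts: int = 0
    error: str = ""


async def drain(jobs: list[Job], fetch: Callable[[Job], Awaitable[None]],
                limit: int = 5, cooldown: float = 0.0) -> dict:
    """Process up to `limit` pending/failed jobs one at a time."""
    todo = [j for j in jobs if j.status in ("pending", "failed")][:limit]
    results = []
    for i, job in enumerate(todo):
        if i and cooldown:
            await asyncio.sleep(cooldown)  # pause between jobs; never parallel
        job.attempts += 1
        try:
            await fetch(job)
            job.status, job.error = "done", ""
        except Exception as exc:  # record the failure so a later run retries it
            job.error = str(exc)
            job.status = "manual" if job.attempts >= MAX_ATTEMPTS else "failed"
        results.append({"status": job.status, "citation": job.citation,
                        "error": job.error})
    done = sum(1 for r in results if r["status"] == "done")
    return {"processed": len(results), "done": done, "results": results}
```

With an empty or fully processed list, `drain` returns `processed=0` without calling `fetch`, which matches the "no-op when the queue is empty" behaviour the script relies on for hourly scheduling.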