Files
legal-ai/scripts/legal-halacha-supervisor.config.cjs
Chaim c7c402e7ef
All checks were successful
G12 Leak-Guard / leak-guard (pull_request) Successful in 6s
feat(operations): manual burst control for the halacha drain + permanent supervisor
The halacha-extraction backlog needs to be worked off the chair's leftover weekly
Claude quota on demand. This adds a MANUAL, time-boxed "burst" — run the drain
continuously now until a chosen deadline (default the upcoming Saturday 18:00 IL),
managed interactively from /operations — plus the permanent health-supervisor that
enforces it.

Backend (this PR; deploys via Coolify + host pm2):
- db: drain_controls.burst_until (SCHEMA_V37) + set_drain_burst/get_drain_burst/
  get_drain_bursts. Single source of truth shared by the container-side /operations
  API and the host-side supervisor.
- web: POST /api/operations/drains/{name}/burst (on→until|next-Sat-18:00, off→NULL),
  and burst_until surfaced per-service in the /operations snapshot.
- scripts/halacha_drain_supervisor.py + legal-halacha-supervisor.config.cjs: pm2 cron
  (*/15, zero Claude quota) — re-triggers idle drain, restarts a HUNG run (liveness =
  per-chunk checkpoints, NOT log mtime), backs off on 429 until the parsed reset
  (fresh-gated), verifies crash-safe staging. Reads burst_until from the DB; burst
  auto-expires at the deadline (never bleeds into a fresh week).

UI (separate follow-up PR, after Claude Design approval): the /operations toggle +
date-picker that calls the burst endpoint.

Invariants: G1 (normalize at source — burst lives once in the DB, read by both
surfaces), G2 (no parallel control path — CAPTURE field on the existing
drain_controls + orchestrates the existing drain, not a new one), G12 (no Paperclip
touch), §6 (no silent error-swallow — burst-clear failure is surfaced as a note).
2026-06-12 11:11:13 +00:00

46 lines
2.1 KiB
JavaScript
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
/**
* pm2 ecosystem entry for legal-halacha-supervisor — the permanent health
* manager for the halacha-extraction drain (legal-halacha-drain).
*
* Runs ONE health tick per cron fire (~every 15 min) and takes ZERO Claude quota
* itself (only the drain calls Opus); it just reads the DB/logs/pm2 and pokes the
* existing drain via the established run-now mechanism. Each tick:
* • re-triggers the one-shot drain when idle and the queue is non-empty
* • restarts a HUNG run (online but no new chunk-checkpoint > 25 min — the
* out-log only updates when a whole CASE finishes, so mtime is not liveness)
* • backs off on rate-limit (claude_session 429) until the CLI's parsed reset
* • verifies crash-safe per-chunk staging is committing (nothing lost)
*
* BURST (manual "run continuously now" window): source of truth is
* drain_controls.burst_until in the DB — the SAME value the /operations page
* reads/writes (G1 single source; G2 no parallel control path). While it is in
* the future the supervisor LIFTS the drain's night-window; otherwise the drain
* keeps its normal 23:0005:00 IDT window. Burst auto-expires at its deadline.
* Manual front-ends: the /operations toggle, or:
* mcp-server/.venv/bin/python scripts/halacha_drain_supervisor.py burst-on
* mcp-server/.venv/bin/python scripts/halacha_drain_supervisor.py burst-off
*
* Pattern (mirrors legal-halacha-drain): cron_restart fires the script;
* autorestart:false → one-shot per tick (pm2 shows "stopped" between ticks).
*
* Install (once):
* pm2 start /home/chaim/legal-ai/scripts/legal-halacha-supervisor.config.cjs
* pm2 save
*/
const cron = process.env.HALACHA_SUPERVISOR_CRON || "*/15 * * * *";
module.exports = {
apps: [
{
name: "legal-halacha-supervisor",
cwd: "/home/chaim/legal-ai",
script: "/home/chaim/legal-ai/mcp-server/.venv/bin/python",
args: "scripts/halacha_drain_supervisor.py",
env: { HOME: "/home/chaim", PYTHONUNBUFFERED: "1" },
autorestart: false, // one-shot per cron tick
cron_restart: cron,
max_memory_restart: "300M",
},
],
};