feat(operations): manual burst control for the halacha drain + permanent supervisor
All checks were successful
G12 Leak-Guard / leak-guard (pull_request) Successful in 6s
All checks were successful
G12 Leak-Guard / leak-guard (pull_request) Successful in 6s
The halacha-extraction backlog needs to be worked off the chair's leftover weekly
Claude quota on demand. This adds a MANUAL, time-boxed "burst" — run the drain
continuously now until a chosen deadline (default the upcoming Saturday 18:00 IL),
managed interactively from /operations — plus the permanent health-supervisor that
enforces it.
Backend (this PR; deploys via Coolify + host pm2):
- db: drain_controls.burst_until (SCHEMA_V37) + set_drain_burst/get_drain_burst/
get_drain_bursts. Single source of truth shared by the container-side /operations
API and the host-side supervisor.
- web: POST /api/operations/drains/{name}/burst (on→until|next-Sat-18:00, off→NULL),
and burst_until surfaced per-service in the /operations snapshot.
- scripts/halacha_drain_supervisor.py + legal-halacha-supervisor.config.cjs: pm2 cron
(*/15, zero Claude quota) — re-triggers idle drain, restarts a HUNG run (liveness =
per-chunk checkpoints, NOT log mtime), backs off on 429 until the parsed reset
(fresh-gated), verifies crash-safe staging. Reads burst_until from the DB; burst
auto-expires at the deadline (never bleeds into a fresh week).
UI (separate follow-up PR, after Claude Design approval): the /operations toggle +
date-picker that calls the burst endpoint.
Invariants: G1 (normalize at source — burst lives once in the DB, read by both
surfaces), G2 (no parallel control path — CAPTURE field on the existing
drain_controls + orchestrates the existing drain, not a new one), G12 (no Paperclip
touch), §6 (no silent error-swallow — burst-clear failure is surfaced as a note).
This commit is contained in:
45
scripts/legal-halacha-supervisor.config.cjs
Normal file
45
scripts/legal-halacha-supervisor.config.cjs
Normal file
@@ -0,0 +1,45 @@
|
||||
/**
|
||||
* pm2 ecosystem entry for legal-halacha-supervisor — the permanent health
|
||||
* manager for the halacha-extraction drain (legal-halacha-drain).
|
||||
*
|
||||
* Runs ONE health tick per cron fire (~every 15 min) and takes ZERO Claude quota
|
||||
* itself (only the drain calls Opus); it just reads the DB/logs/pm2 and pokes the
|
||||
* existing drain via the established run-now mechanism. Each tick:
|
||||
* • re-triggers the one-shot drain when idle and the queue is non-empty
|
||||
* • restarts a HUNG run (online but no new chunk-checkpoint > 25 min — the
|
||||
* out-log only updates when a whole CASE finishes, so mtime is not liveness)
|
||||
* • backs off on rate-limit (claude_session 429) until the CLI's parsed reset
|
||||
* • verifies crash-safe per-chunk staging is committing (nothing lost)
|
||||
*
|
||||
* BURST (manual "run continuously now" window): source of truth is
|
||||
* drain_controls.burst_until in the DB — the SAME value the /operations page
|
||||
* reads/writes (G1 single source; G2 no parallel control path). While it is in
|
||||
* the future the supervisor LIFTS the drain's night-window; otherwise the drain
|
||||
* keeps its normal 23:00–05:00 IDT window. Burst auto-expires at its deadline.
|
||||
* Manual front-ends: the /operations toggle, or:
|
||||
* mcp-server/.venv/bin/python scripts/halacha_drain_supervisor.py burst-on
|
||||
* mcp-server/.venv/bin/python scripts/halacha_drain_supervisor.py burst-off
|
||||
*
|
||||
* Pattern (mirrors legal-halacha-drain): cron_restart fires the script;
|
||||
* autorestart:false → one-shot per tick (pm2 shows "stopped" between ticks).
|
||||
*
|
||||
* Install (once):
|
||||
* pm2 start /home/chaim/legal-ai/scripts/legal-halacha-supervisor.config.cjs
|
||||
* pm2 save
|
||||
*/
|
||||
const cron = process.env.HALACHA_SUPERVISOR_CRON || "*/15 * * * *";
|
||||
|
||||
module.exports = {
|
||||
apps: [
|
||||
{
|
||||
name: "legal-halacha-supervisor",
|
||||
cwd: "/home/chaim/legal-ai",
|
||||
script: "/home/chaim/legal-ai/mcp-server/.venv/bin/python",
|
||||
args: "scripts/halacha_drain_supervisor.py",
|
||||
env: { HOME: "/home/chaim", PYTHONUNBUFFERED: "1" },
|
||||
autorestart: false, // one-shot per cron tick
|
||||
cron_restart: cron,
|
||||
max_memory_restart: "300M",
|
||||
},
|
||||
],
|
||||
};
|
||||
Reference in New Issue
Block a user