feat(operations): manual burst control for the halacha drain + permanent supervisor
All checks were successful
G12 Leak-Guard / leak-guard (pull_request) Successful in 6s

The halacha-extraction backlog needs to be worked off the chair's leftover weekly
Claude quota on demand. This adds a MANUAL, time-boxed "burst" — run the drain
continuously now until a chosen deadline (default the upcoming Saturday 18:00 IL),
managed interactively from /operations — plus the permanent health-supervisor that
enforces it.

Backend (this PR; deploys via Coolify + host pm2):
- db: drain_controls.burst_until (SCHEMA_V37) + set_drain_burst/get_drain_burst/
  get_drain_bursts. Single source of truth shared by the container-side /operations
  API and the host-side supervisor.
- web: POST /api/operations/drains/{name}/burst (on→until|next-Sat-18:00, off→NULL),
  and burst_until surfaced per-service in the /operations snapshot.
- scripts/halacha_drain_supervisor.py + legal-halacha-supervisor.config.cjs: pm2 cron
  (*/15, zero Claude quota) — re-triggers idle drain, restarts a HUNG run (liveness =
  per-chunk checkpoints, NOT log mtime), backs off on 429 until the parsed reset
  (fresh-gated), verifies crash-safe staging. Reads burst_until from the DB; burst
  auto-expires at the deadline (never bleeds into a fresh week).

UI (separate follow-up PR, after Claude Design approval): the /operations toggle +
date-picker that calls the burst endpoint.

Invariants: G1 (normalize at source — burst lives once in the DB, read by both
surfaces), G2 (no parallel control path — CAPTURE field on the existing
drain_controls + orchestrates the existing drain, not a new one), G12 (no Paperclip
touch), §6 (no silent error-swallow — burst-clear failure is surfaced as a note).
This commit is contained in:
2026-06-12 11:11:13 +00:00
parent 551d38dd7c
commit c7c402e7ef
5 changed files with 563 additions and 1 deletions

View File

@@ -1501,6 +1501,16 @@ SCHEMA_V36_SQL = """
ALTER TABLE draft_final_pairs ADD COLUMN IF NOT EXISTS learning_run JSONB;
"""
SCHEMA_V37_SQL = """
-- drain_controls.burst_until: a MANUAL, time-boxed "run continuously now" window
-- for a drain, managed interactively from /operations. While burst_until is in the
-- future the host supervisor (legal-halacha-supervisor) lifts the drain's
-- night-window and keeps it draining; NULL/past = the drain's normal schedule.
-- Chair-controlled — never set automatically. CAPTURE field on the existing
-- control table (G2 — no parallel control path).
ALTER TABLE drain_controls ADD COLUMN IF NOT EXISTS burst_until TIMESTAMPTZ;
"""
async def _run_schema_migrations(pool: asyncpg.Pool) -> None:
async with pool.acquire() as conn:
@@ -1541,7 +1551,8 @@ async def _run_schema_migrations(pool: asyncpg.Pool) -> None:
await conn.execute(SCHEMA_V34_SQL)
await conn.execute(SCHEMA_V35_SQL)
await conn.execute(SCHEMA_V36_SQL)
logger.info("Database schema initialized (v1-v36)")
await conn.execute(SCHEMA_V37_SQL)
logger.info("Database schema initialized (v1-v37)")
async def init_schema() -> None:
@@ -6908,3 +6919,37 @@ async def get_drain_controls() -> dict[str, bool]:
async with pool.acquire() as conn:
rows = await conn.fetch("SELECT name, disabled FROM drain_controls")
return {r["name"]: bool(r["disabled"]) for r in rows}
async def set_drain_burst(name: str, until) -> None:
"""Set/clear a drain's MANUAL burst window (upsert). ``until=None`` clears it.
Single source of truth shared by the container-side /operations API and the
host-side supervisor (G1) — no parallel control path (G2)."""
pool = await get_pool()
async with pool.acquire() as conn:
await conn.execute(
"INSERT INTO drain_controls (name, burst_until, updated_at) "
"VALUES ($1, $2, now()) "
"ON CONFLICT (name) DO UPDATE SET burst_until = $2, updated_at = now()",
name, until,
)
async def get_drain_burst(name: str):
"""The drain's burst_until (datetime) or None."""
pool = await get_pool()
async with pool.acquire() as conn:
return await conn.fetchval(
"SELECT burst_until FROM drain_controls WHERE name = $1", name
)
async def get_drain_bursts() -> dict[str, str]:
"""Map of drain name → burst_until ISO string (only rows with a value set)."""
pool = await get_pool()
async with pool.acquire() as conn:
rows = await conn.fetch(
"SELECT name, burst_until FROM drain_controls WHERE burst_until IS NOT NULL"
)
return {r["name"]: r["burst_until"].isoformat() for r in rows}