legal-ai

Author	SHA1	Message	Date
Chaim	013fe39ea7	fix(supervisor): re-probe claude.ai quota instead of waiting blindly for the reported reset All checks were successful G12 Leak-Guard / leak-guard (pull_request) Successful in 8s Details When the halacha drain hit a 429, the supervisor recorded the reset time the error reported (e.g. "resets 6:50pm UTC") and then HELD until that timestamp, re-reading it from its own state every tick without ever checking whether quota had actually returned. claude.ai usually frees up quota earlier than the message claims, so the drain sat idle for hours after it could have resumed — and only a manual kick (clear cooldown + trigger) got it going again. Now, on any tick where we'd otherwise hold on a cooldown, run a cheap live probe (`quota_available()` → a tiny `claude -p` call, cost ~0) and resume the instant it succeeds — at most one probe per 15-min tick, only while we believe we're limited. Conservative on failure (non-zero exit / timeout / limit message → stay held), so a flaky probe never resumes the drain into a real 429. Adds a claude_bin() resolver so the probe works under pm2 cron where PATH is bare. Invariants: G1 (resume decision driven by actual quota state, not a guessed timestamp); no new control path. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-13 09:35:45 +00:00
Chaim	a44827c3dd	fix(operations): disabling the halacha drain now stops a running process immediately All checks were successful G12 Leak-Guard / leak-guard (pull_request) Successful in 6s Details The /operations "disabled" toggle only wrote drain_controls.disabled, which the drain checks at STARTUP — so a drain already mid-run kept going until the queue emptied or the night window closed. Disabling did not stop a running drain. Three layers, immediate + backstops: - web/app.py operations_drain_toggle: on disable, also stop the running process immediately via the host pm2 bridge (_ops_pm2_control). Best-effort — a bridge failure doesn't fail the toggle. - halacha_drain_supervisor.py: each tick now reads the disabled flag (added to db_snapshot) and, when set, stops the drain and never re-triggers it — regardless of burst/window. Backstop if the UI path failed (≤ one tick). - drain_halacha_queue.py: re-check is_drain_disabled at the top of every round, so a drain disabled mid-run halts at the next round boundary. Per-chunk checkpoints mean the in-flight case loses nothing. SCRIPTS.md updated for both drain and supervisor. Invariants: G1 (fix at source — the disable control honoured along every path, not just at startup); G2 (no parallel control path — same drain_controls flag). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-13 09:03:07 +00:00
Chaim	75a1b23972	fix(supervisor): burst set/get via raw SQL, not new db helpers (host-lag-proof) All checks were successful G12 Leak-Guard / leak-guard (pull_request) Successful in 9s Details The host pm2 supervisor imports legal_mcp.services.db from the host repo checkout, which can lag main by many commits. Depending on the just-added db.set_drain_burst/ get_drain_burst would require the host checkout to be current. Use raw SQL via the stable db.get_pool() instead — the supervisor now depends only on get_pool + the drain_controls.burst_until column (the shared contract with the /operations API). The container-side API keeps using the typed helpers (it ships the code in-image). Invariants: G1/G2 unchanged (same single DB column, no parallel path).	2026-06-12 11:16:38 +00:00
Chaim	c7c402e7ef	feat(operations): manual burst control for the halacha drain + permanent supervisor All checks were successful G12 Leak-Guard / leak-guard (pull_request) Successful in 6s Details The halacha-extraction backlog needs to be worked off the chair's leftover weekly Claude quota on demand. This adds a MANUAL, time-boxed "burst" — run the drain continuously now until a chosen deadline (default the upcoming Saturday 18:00 IL), managed interactively from /operations — plus the permanent health-supervisor that enforces it. Backend (this PR; deploys via Coolify + host pm2): - db: drain_controls.burst_until (SCHEMA_V37) + set_drain_burst/get_drain_burst/ get_drain_bursts. Single source of truth shared by the container-side /operations API and the host-side supervisor. - web: POST /api/operations/drains/{name}/burst (on→until\|next-Sat-18:00, off→NULL), and burst_until surfaced per-service in the /operations snapshot. - scripts/halacha_drain_supervisor.py + legal-halacha-supervisor.config.cjs: pm2 cron (*/15, zero Claude quota) — re-triggers idle drain, restarts a HUNG run (liveness = per-chunk checkpoints, NOT log mtime), backs off on 429 until the parsed reset (fresh-gated), verifies crash-safe staging. Reads burst_until from the DB; burst auto-expires at the deadline (never bleeds into a fresh week). UI (separate follow-up PR, after Claude Design approval): the /operations toggle + date-picker that calls the burst endpoint. Invariants: G1 (normalize at source — burst lives once in the DB, read by both surfaces), G2 (no parallel control path — CAPTURE field on the existing drain_controls + orchestrates the existing drain, not a new one), G12 (no Paperclip touch), §6 (no silent error-swallow — burst-clear failure is surfaced as a note).	2026-06-12 11:11:13 +00:00

4 Commits