fix(operations): disabling the halacha drain now stops a running process immediately

The /operations "disabled" toggle only wrote drain_controls.disabled, which the drain checks at STARTUP, so a drain already mid-run kept going until the queue emptied or the night window closed. Disabling did not stop a running drain.

Three layers, immediate + backstops:

- web/app.py operations_drain_toggle: on disable, also stop the running process immediately via the host pm2 bridge (_ops_pm2_control). Best-effort: a bridge failure doesn't fail the toggle.
- halacha_drain_supervisor.py: each tick now reads the disabled flag (added to db_snapshot) and, when set, stops the drain and never re-triggers it, regardless of burst/window. Backstop if the UI path failed (≤ one tick).
- drain_halacha_queue.py: re-check is_drain_disabled at the top of every round, so a drain disabled mid-run halts at the next round boundary. Per-chunk checkpoints mean the in-flight case loses nothing.

SCRIPTS.md updated for both drain and supervisor.

Invariants: G1 (fix at source: the disable control honoured along every path, not just at startup); G2 (no parallel control path: same drain_controls flag).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-13 09:03:07 +00:00
parent 72f81734f1
commit a44827c3dd
4 changed files with 43 additions and 7 deletions
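The round-boundary re-check described in the commit message (the drain_halacha_queue.py layer) is not part of the hunks shown here, so the following is only a minimal sketch of the pattern. The names `run_drain`, `is_disabled` and `process_chunk` are hypothetical stand-ins, not the script's real functions, and the in-memory flag stands in for the drain_controls.disabled column.

```python
from typing import Callable, Iterable, List


def run_drain(rounds: Iterable[List[str]],
              is_disabled: Callable[[], bool],
              process_chunk: Callable[[str], None]) -> str:
    """Process rounds of chunks, re-checking the disable flag before each round."""
    for chunk_batch in rounds:
        # Re-check at the top of every round, not only at startup, so a
        # drain disabled mid-run halts at the next round boundary.
        if is_disabled():
            return "halted-disabled"
        for chunk in chunk_batch:
            # Assumed to checkpoint each chunk itself, so halting between
            # rounds loses no completed work.
            process_chunk(chunk)
    return "queue-empty"


# Usage: the flag flips during the first round, so the second never starts.
flag = {"disabled": False}
done: List[str] = []


def process(chunk: str) -> None:
    done.append(chunk)
    if chunk == "c2":
        flag["disabled"] = True  # simulate the operator disabling mid-run


result = run_drain([["c1", "c2"], ["c3"]], lambda: flag["disabled"], process)
```

The round already in flight finishes before the halt takes effect, which is why the commit pairs this check with the immediate stop through the bridge and the supervisor backstop.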
--- a/scripts/halacha_drain_supervisor.py
+++ b/scripts/halacha_drain_supervisor.py
@@ -98,7 +98,8 @@ def db_snapshot() -> dict:
        "  ck=await c.fetchval('SELECT count(*) FROM precedent_chunks WHERE halacha_extracted_at IS NOT NULL')\n"
        "  pend_rev=await c.fetchval(\"SELECT count(*) FROM halachot WHERE review_status='pending_review'\")\n"
        "  bu=await c.fetchval(\"SELECT burst_until FROM drain_controls WHERE name='legal-halacha-drain'\")\n"
-        "  print(json.dumps({'status_counts':st,'processing_cases':procs,'halachot_total':hal,'checkpointed_chunks':ck,'pending_review':pend_rev,'burst_until':bu.isoformat() if bu else None}))\n"
+        "  dis=await c.fetchval(\"SELECT disabled FROM drain_controls WHERE name='legal-halacha-drain'\")\n"
+        "  print(json.dumps({'status_counts':st,'processing_cases':procs,'halachot_total':hal,'checkpointed_chunks':ck,'pending_review':pend_rev,'burst_until':bu.isoformat() if bu else None,'disabled':bool(dis)}))\n"
        "asyncio.run(m())\n"
    )
    return json.loads(_venv_py(code).splitlines()[-1])
@@ -255,6 +256,23 @@ def tick():
                                    "mode": "db_error", "next_wake_sec": 900}, ensure_ascii=False))
        return

+    # /operations "disabled" switch — highest priority. The drain self-guards at
+    # startup, but a process mid-run wouldn't notice the flag, and the cron keeps
+    # firing it; so when disabled we stop any running drain and never re-trigger,
+    # regardless of burst/window. (The UI toggle also stops it immediately via the
+    # bridge; this is the backstop + the "don't re-ignite a disabled drain" gate.)
+    if snap.get("disabled"):
+        stopped = stop_drain()
+        save_state({**prev, "tick_at": now.isoformat(), "mode": "disabled",
+                    "action": "stopped-disabled" if stopped else "already-stopped"})
+        print(f"🛑 {now.astimezone(IDT):%H:%M:%S IDT}  |  מצב: disabled  |  "
+              f"מסומן לא-פעיל ב-/operations — הדריינר {'נעצר' if stopped else 'כבר עצור'}.")
+        print("JSON:" + json.dumps(
+            {"ok": True, "mode": "disabled",
+             "action": "stopped-disabled" if stopped else "already-stopped",
+             "next_wake_sec": 900}, ensure_ascii=False))
+        return
+
    # burst state from the DB + auto-expiry (manual on; auto-off at the deadline)
    burst_until = datetime.fromisoformat(snap["burst_until"]) if snap.get("burst_until") else None
    if burst_until and now >= burst_until: