feat(principles): canonical_statement synthesis service + throttled backfill (Phase E groundwork, #152)

Grounded (INV-AH) multi-instance synthesis with drift guard + chair gate
(pending_review, G10). Single path used by backfill, MCP tool, nightly drain.
HELD from production run pending the principles-redesign (rename+cull, #152).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-06-19 10:57:48 +00:00
parent db93735ed6
commit 338a8a947f
14 changed files with 1250 additions and 74 deletions

View File

@@ -162,6 +162,24 @@ HALACHA_AUTO_APPROVE_THRESHOLD = float(
os.environ.get("HALACHA_AUTO_APPROVE_THRESHOLD", "0.80")
)
# ── Tri-model panel extraction regime (legal-principles-redesign, #152) ──────
# chaim 2026-06-19: replace single-model auto-approve with a 3-model panel that
# deep-analyzes each decision. 3 models (Claude local + DeepSeek + Gemini) each
# PROPOSE candidate principles with a 0-1 score; candidates are matched across
# models (cosine ≥ MATCH_COSINE) → votes (# distinct models) + score (mean of the
# voters' scores). Approval rule (chaim): 3 votes → approve (even score<floor) ·
# ≥2 votes AND score≥SCORE_FLOOR → approve · 2 votes AND score<floor → chair
# (pending_review, G10) · 1 vote → drop. Cap MAX_NEW genuinely-new principles per
# decision (by score); recognized-existing (V41 cosine link) don't count against
# the cap. Applies to extraction (going forward) AND the retroactive cull (#152).
HALACHA_PANEL_SCORE_FLOOR = float(os.environ.get("HALACHA_PANEL_SCORE_FLOOR", "0.85"))
HALACHA_PANEL_MAX_NEW = int(os.environ.get("HALACHA_PANEL_MAX_NEW", "5"))
# 0.80: legal-principle paraphrases across models land ~0.78-0.82 on voyage-law-2
# (the canonical-synthesis dry-run showed faithful rewrites at 0.78-0.80); too high
# a floor misses genuine cross-model agreement → undercounts votes → over-culls.
# Calibrate against the gold-set in Phase C before the production cull.
HALACHA_PANEL_MATCH_COSINE = float(os.environ.get("HALACHA_PANEL_MATCH_COSINE", "0.80"))
# Halacha dedup-on-insert — within-precedent semantic cosine ceiling. Before
# storing a halacha, store_halachot_for_chunk skips it if its rule-embedding has
# cosine >= this value against an already-stored halacha of the SAME precedent
@@ -210,6 +228,20 @@ HALACHA_CONSOLIDATE_EFFORT = os.environ.get("HALACHA_CONSOLIDATE_EFFORT", "high"
HALACHA_CANONICAL_LOOKUP_ENABLED = os.environ.get("HALACHA_CANONICAL_LOOKUP_ENABLED", "true").lower() == "true"
HALACHA_CANONICAL_THRESHOLD = float(os.environ.get("HALACHA_CANONICAL_THRESHOLD", "0.85"))
# V41 canonical synthesis (Phase 4) — a claude_session pass that rewrites each
# canonical's statement (carried over verbatim from the representative halacha at
# backfill) into ONE clean, case-independent legal principle, grounded in the
# instances' supporting quotes (INV-AH), then flips review_status
# pending_synthesis → pending_review for the chair gate (G10). Opus by default —
# substance-bearing rewrite, chair-facing. Runs through the local CLI (zero $-cost,
# but consumes subscription usage windows → throttled via usage_limits).
# Drift guard: the synthesized statement is re-embedded and compared (cosine) to
# the source; below the floor the synthesis is REJECTED (kept as-is, flagged) so a
# hallucinated/topic-drifted rewrite never silently overwrites a sound principle.
HALACHA_CANONICAL_SYNTH_MODEL = os.environ.get("HALACHA_CANONICAL_SYNTH_MODEL", HALACHA_EXTRACT_MODEL)
HALACHA_CANONICAL_SYNTH_EFFORT = os.environ.get("HALACHA_CANONICAL_SYNTH_EFFORT", "high")
HALACHA_CANONICAL_SYNTH_DRIFT_FLOOR = float(os.environ.get("HALACHA_CANONICAL_SYNTH_DRIFT_FLOOR", "0.80"))
# Google Cloud Vision (OCR for scanned PDFs)
GOOGLE_CLOUD_VISION_API_KEY = os.environ.get("GOOGLE_CLOUD_VISION_API_KEY", "")