feat(principles): canonical_statement synthesis service + throttled backfill (Phase E groundwork, #152)
Grounded (INV-AH) multi-instance synthesis with drift guard + chair gate (pending_review, G10). Single path used by backfill, MCP tool, nightly drain. HELD from production run pending the principles-redesign (rename+cull, #152). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -162,6 +162,24 @@ HALACHA_AUTO_APPROVE_THRESHOLD = float(
|
||||
os.environ.get("HALACHA_AUTO_APPROVE_THRESHOLD", "0.80")
|
||||
)
|
||||
|
||||
# ── Tri-model panel extraction regime (legal-principles-redesign, #152) ──────
|
||||
# chaim 2026-06-19: replace single-model auto-approve with a 3-model panel that
|
||||
# deep-analyzes each decision. 3 models (Claude local + DeepSeek + Gemini) each
|
||||
# PROPOSE candidate principles with a 0-1 score; candidates are matched across
|
||||
# models (cosine ≥ MATCH_COSINE) → votes (# distinct models) + score (mean of the
|
||||
# voters' scores). Approval rule (chaim): 3 votes → approve (even score<floor) ·
|
||||
# ≥2 votes AND score≥SCORE_FLOOR → approve · 2 votes AND score<floor → chair
|
||||
# (pending_review, G10) · 1 vote → drop. Cap MAX_NEW genuinely-new principles per
|
||||
# decision (by score); recognized-existing (V41 cosine link) don't count against
|
||||
# the cap. Applies to extraction (going forward) AND the retroactive cull (#152).
|
||||
HALACHA_PANEL_SCORE_FLOOR = float(os.environ.get("HALACHA_PANEL_SCORE_FLOOR", "0.85"))
|
||||
HALACHA_PANEL_MAX_NEW = int(os.environ.get("HALACHA_PANEL_MAX_NEW", "5"))
|
||||
# 0.80: legal-principle paraphrases across models land ~0.78-0.82 on voyage-law-2
|
||||
# (the canonical-synthesis dry-run showed faithful rewrites at 0.78-0.80); too high
|
||||
# a floor misses genuine cross-model agreement → undercounts votes → over-culls.
|
||||
# Calibrate against the gold-set in Phase C before the production cull.
|
||||
HALACHA_PANEL_MATCH_COSINE = float(os.environ.get("HALACHA_PANEL_MATCH_COSINE", "0.80"))
|
||||
|
||||
# Halacha dedup-on-insert — within-precedent semantic cosine ceiling. Before
|
||||
# storing a halacha, store_halachot_for_chunk skips it if its rule-embedding has
|
||||
# cosine >= this value against an already-stored halacha of the SAME precedent
|
||||
@@ -210,6 +228,20 @@ HALACHA_CONSOLIDATE_EFFORT = os.environ.get("HALACHA_CONSOLIDATE_EFFORT", "high"
|
||||
HALACHA_CANONICAL_LOOKUP_ENABLED = os.environ.get("HALACHA_CANONICAL_LOOKUP_ENABLED", "true").lower() == "true"
|
||||
HALACHA_CANONICAL_THRESHOLD = float(os.environ.get("HALACHA_CANONICAL_THRESHOLD", "0.85"))
|
||||
|
||||
# V41 canonical synthesis (Phase 4) — a claude_session pass that rewrites each
|
||||
# canonical's statement (carried over verbatim from the representative halacha at
|
||||
# backfill) into ONE clean, case-independent legal principle, grounded in the
|
||||
# instances' supporting quotes (INV-AH), then flips review_status
|
||||
# pending_synthesis → pending_review for the chair gate (G10). Opus by default —
|
||||
# substance-bearing rewrite, chair-facing. Runs through the local CLI (zero $-cost,
|
||||
# but consumes subscription usage windows → throttled via usage_limits).
|
||||
# Drift guard: the synthesized statement is re-embedded and compared (cosine) to
|
||||
# the source; below the floor the synthesis is REJECTED (kept as-is, flagged) so a
|
||||
# hallucinated/topic-drifted rewrite never silently overwrites a sound principle.
|
||||
HALACHA_CANONICAL_SYNTH_MODEL = os.environ.get("HALACHA_CANONICAL_SYNTH_MODEL", HALACHA_EXTRACT_MODEL)
|
||||
HALACHA_CANONICAL_SYNTH_EFFORT = os.environ.get("HALACHA_CANONICAL_SYNTH_EFFORT", "high")
|
||||
HALACHA_CANONICAL_SYNTH_DRIFT_FLOOR = float(os.environ.get("HALACHA_CANONICAL_SYNTH_DRIFT_FLOOR", "0.80"))
|
||||
|
||||
# Google Cloud Vision (OCR for scanned PDFs)
|
||||
GOOGLE_CLOUD_VISION_API_KEY = os.environ.get("GOOGLE_CLOUD_VISION_API_KEY", "")
|
||||
|
||||
|
||||
Reference in New Issue
Block a user