feat(halacha): #81.8 — calibrate auto-approve gate on the gold-set (keep 0.80, documented)

Calibrates the auto-approve threshold against the 100 chair labels (93 keep / 7 drop),
using human ground truth (not the panel consensus, to avoid circularity):
  conf≥0.80 → P=0.98 R=0.53  ← current (errs safe)
  conf≥0.75 → P=0.96 R=0.81
  conf≥0.70 → P=0.94 R=0.94
  panel unanimous-3/3 → P=0.988 cov=95% · majority-2/3 → P=0.948 cov=100%
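
For readers who want to see what a sweep of this kind computes, here is a minimal sketch. It is not the code of db.goldset_calibrate(): the `GoldItem` type, the `sweep` function, and the six toy items are hypothetical and purely illustrative. Precision is taken as the share of items at or above the threshold that the chair labeled keep, and recall as the share of all chair-keep items that clear the threshold.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class GoldItem:
    """One gold-set row (hypothetical shape, for illustration only)."""
    confidence: float  # extractor self-confidence
    chair_keep: bool   # human (chair) label: True = keep, False = drop


def sweep(
    items: Sequence[GoldItem], thresholds: Sequence[float]
) -> List[Tuple[float, Optional[float], Optional[float]]]:
    """Return (threshold, precision, recall) for each candidate threshold."""
    total_keep = sum(1 for it in items if it.chair_keep)
    rows = []
    for t in thresholds:
        approved = [it for it in items if it.confidence >= t]
        true_pos = sum(1 for it in approved if it.chair_keep)
        # None when a ratio is undefined (nothing approved / no keep labels).
        precision = true_pos / len(approved) if approved else None
        recall = true_pos / total_keep if total_keep else None
        rows.append((t, precision, recall))
    return rows


# Synthetic toy data, NOT the real gold-set.
TOY = [
    GoldItem(0.95, True),
    GoldItem(0.85, True),
    GoldItem(0.82, False),
    GoldItem(0.78, True),
    GoldItem(0.72, True),
    GoldItem(0.60, False),
]

for threshold, precision, recall in sweep(TOY, [0.80, 0.75, 0.70]):
    print(f"conf>={threshold:.2f} -> P={precision:.2f} R={recall:.2f}")
```

On the toy data, lowering the threshold raises recall, the same kind of tradeoff the table above reports for the real labels.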

Decision: **keep 0.80**. It meets the precision≥0.90 target with margin and errs toward
the chair (low recall = more review, not less). Two findings:
 (a) self-confidence is well calibrated for precision; the rule-based validators do not
     discriminate on the gold-set (P≈0.1) → "confidence × validators" would only hurt, so it is not adopted (answer to #81.8).
 (b) the real coverage lever is the tri-model panel (unanimous 0.988/95%), not a lower confidence threshold.
     Lowering the gate to 0.75 is a governance tradeoff (more unreviewed auto-approvals, INV-G10) on
     thin evidence (7 negatives) → deferred to the chair/panel (#121), not changed here.
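
To make finding (b) concrete, the sketch below compares a unanimous-3/3 policy with a majority-2/3 policy on toy votes. It rests on assumptions this commit does not spell out: that "coverage" means the share of items on which the policy reaches a decision (a split panel under the unanimous rule is left for human review), and that precision is measured over the items the policy approves. All names and data are hypothetical and are not taken from halacha_panel_approve.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple

Votes = Tuple[bool, bool, bool]  # three model verdicts, True = keep


@dataclass(frozen=True)
class PanelItem:
    votes: Votes
    chair_keep: bool  # human (chair) label


def unanimous(votes: Votes) -> Optional[bool]:
    """Decide only when all three models agree; otherwise abstain (None)."""
    if all(votes):
        return True
    if not any(votes):
        return False
    return None


def majority(votes: Votes) -> Optional[bool]:
    """Always decide: keep when at least two of three models vote keep."""
    return sum(votes) >= 2


def evaluate(
    items: Sequence[PanelItem], policy: Callable[[Votes], Optional[bool]]
) -> Tuple[Optional[float], Optional[float]]:
    """Return (precision over approved items, coverage over all items)."""
    decisions = [(policy(it.votes), it.chair_keep) for it in items]
    decided = [d for d in decisions if d[0] is not None]
    approved = [d for d in decided if d[0]]
    coverage = len(decided) / len(items) if items else None
    precision = (
        sum(1 for _, keep in approved if keep) / len(approved)
        if approved
        else None
    )
    return precision, coverage


# Synthetic toy votes, NOT the real gold-set.
TOY_PANEL = [
    PanelItem((True, True, True), True),
    PanelItem((True, True, True), True),
    PanelItem((True, True, False), False),
    PanelItem((True, True, False), True),
    PanelItem((False, False, False), False),
]

print("unanimous:", evaluate(TOY_PANEL, unanimous))
print("majority: ", evaluate(TOY_PANEL, majority))
```

Under these assumptions the unanimous rule trades some coverage for higher precision, which is the shape of the result reported above.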

- db.goldset_calibrate(): confidence sweep + panel-policy precision/coverage against the gold-set,
  read-only, reproducible (INV-LRN3). ground_truth='chair' by default (anti-circularity).
- config: the HALACHA_AUTO_APPROVE_THRESHOLD comment is updated to the calibration finding (replacing the spot-check-of-10).

invariants: INV-G10 (the unreviewed gate was not lowered) · INV-LRN2/LRN3 (calibration documented at the source, structured).
tests: 4 offline (sweep/policies/anti-circularity/threshold-surfaced). Verified live: reproduces the numbers.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-11 16:29:24 +00:00
parent 7e1a0c879a
commit 4e06662208
3 changed files with 161 additions and 6 deletions


@@ -138,12 +138,26 @@ BM25_HYBRID_ENABLED = (
)
# Halacha extraction — auto-approve threshold. Halachot with extractor
-# confidence >= this value are inserted with review_status='approved'
-# instead of 'pending_review' (so they immediately appear in
-# search_precedent_library). Set to a value > 1.0 to disable auto-approval.
-# 0.80 baseline: 89% of historical extractions land here, manual spot-check
-# of 10 random samples confirmed quality. Tunable via env if drift is
-# observed (e.g. raise to 0.90 if false-positives appear).
+# confidence >= this value AND no quality_flags are inserted with
+# review_status='approved' (so they appear immediately in
+# search_precedent_library). Set > 1.0 to disable auto-approval.
+#
+# CALIBRATION (#81.8, 2026-06-11) against the 100-item human-labeled gold-set
+# (db.goldset_calibrate, ground_truth='chair'; 93 keep / 7 drop):
+#   conf>=0.80 -> precision 0.98, recall 0.53 <- current (errs safe)
+#   conf>=0.75 -> precision 0.96, recall 0.81
+#   conf>=0.70 -> precision 0.94, recall 0.94
+# 0.80 clears the >=0.90 precision target with margin, so we KEEP it; it errs
+# toward the chair (low recall = more items reviewed, never the reverse).
+# Two findings shape the policy:
+# (a) self-confidence alone is well-calibrated for PRECISION; the rule-based
+# validators do NOT discriminate keep/drop on the gold-set (P~0.1), so a
+# "confidence x validators" combined score would only hurt; not adopted.
+# (b) the real COVERAGE lever is the tri-model panel (halacha_panel_approve):
+# unanimous-3/3 -> precision 0.988 at 95% coverage, dominating any single
+# confidence threshold. Lowering this gate to ~0.75 is a governance
+# tradeoff (more unreviewed auto-approvals, INV-G10) on thin evidence
+# (7 negatives) -> deferred to chair/panel (TaskMaster #121), not changed here.
HALACHA_AUTO_APPROVE_THRESHOLD = float(
os.environ.get("HALACHA_AUTO_APPROVE_THRESHOLD", "0.80")
)
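
The gate that the comment describes can be sketched as follows. This is an illustration under assumptions, not the repository's insertion code: `load_threshold` and `review_status` are hypothetical helper names, and `quality_flags` is assumed to be a possibly empty list of flag strings. Only the environment variable name, the 0.80 default, and the two status values come from the diff above.

```python
import os
from typing import Mapping, Sequence


def load_threshold(env: Mapping[str, str] = os.environ) -> float:
    """Read the gate from the environment, defaulting to 0.80 as in config."""
    return float(env.get("HALACHA_AUTO_APPROVE_THRESHOLD", "0.80"))


def review_status(
    confidence: float, quality_flags: Sequence[str], threshold: float
) -> str:
    """Auto-approve only high-confidence extractions with no quality flags.

    Everything else is left for human review, so a threshold above 1.0
    disables auto-approval entirely (confidence is assumed to be <= 1.0).
    """
    if confidence >= threshold and not quality_flags:
        return "approved"
    return "pending_review"


threshold = load_threshold({})  # empty mapping -> falls back to 0.80
print(review_status(0.85, [], threshold))             # approved
print(review_status(0.85, ["truncated"], threshold))  # pending_review
print(review_status(0.79, [], threshold))             # pending_review
```

Passing the environment as a parameter keeps the sketch testable without touching the real process environment.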