fix(halacha): split authority (derived) from rule_role — stop source-conflation (INV-DM7)
The extractor classified rule_type by the bindingness of the SOURCE
(higher court → binding, committee → persuasive) instead of by the KIND of
rule. The gold-set proved it: 'binding' appeared on 19/19 external rulings
and 0 committee decisions; 'persuasive' appeared on 13/13 committee
decisions and 0 external rulings. Agreement with the human role tags was
only 58%. Two axes (authority vs. rule role) were crammed into one enum.

This splits them per INV-DM7:

- authority (binding/persuasive): DERIVED from case_law.precedent_level
  (עליון/מנהלי [Supreme/administrative] → binding, ועדת_ערר_מחוזית
  [district appeals committee] → persuasive), never stored, never
  LLM-guessed. New helper halacha_quality.derive_authority; surfaced
  read-only in list_halachot / goldset_list / search results.
- rule_type: now the rule ROLE only:
  holding/interpretive/procedural/application/obiter. Both extractor
  prompts are unified to this vocabulary; _coerce_halacha no longer
  defaults rule_type from the source; legacy values are folded for safety
  (binding → holding, persuasive → interpretive).

UI: authority is shown as a separate read-only badge (gold = מחייב
[binding] / muted = משכנע [persuasive]) across the review queue, precedent
detail, and gold-set; the gold-set role selector drops binding/persuasive
and adds מהותי (holding).

Migration: scripts/halacha_rule_role_backfill.py re-classifies the 276
pre-split binding/persuasive rows into a genuine role via local
claude_session (run after deploy). Gold-set correct_type/ai_correct_type
'binding' → 'holding' via SQL.

Sources (≥3, per research-decision policy):
- OASIS LegalRuleML v1.0 (appliesAuthority/Strength as metadata orthogonal
  to rule logic)
- SemEval-2023 Task 6 LegalEval (rhetorical roles by function, authority
  kept separate)
- Bluebook signals (weight-of-authority is a separate dimension)

Invariants: ESTABLISHES INV-DM7. Upholds G1 (normalize at source: the
extractor classifies role, the system derives authority) and G2 (single
source of truth: authority is derived, not a parallel stored field).

Tests: 211 pass + new derive_authority/coerce coverage. web-ui build + tsc
clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
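
The body of halacha_quality.derive_authority is not part of the hunks shown here. A minimal sketch of what such a helper might look like follows, assuming only the mapping stated in the commit message. The constant names, the whitespace stripping, and the choice to return None for a missing or unrecognized precedent_level are assumptions made for illustration, not the repository's actual behavior.

```python
from typing import Optional

# Mapping taken from the commit message. The constant names are hypothetical.
_BINDING_LEVELS = frozenset({"עליון", "מנהלי"})
_PERSUASIVE_LEVELS = frozenset({"ועדת_ערר_מחוזית"})


def derive_authority(precedent_level: Optional[str]) -> Optional[str]:
    """Derive authority from the source's precedent level.

    Nothing is stored and nothing is guessed: a level that is missing or
    not listed above yields None (an assumption of this sketch).
    """
    if not precedent_level:
        return None
    level = precedent_level.strip()
    if level in _BINDING_LEVELS:
        return "binding"
    if level in _PERSUASIVE_LEVELS:
        return "persuasive"
    return None


# Usage, mirroring the call sites in the diff: annotate a row read-only.
row = {"rule_type": "holding", "precedent_level": "עליון"}
row["authority"] = derive_authority(row.get("precedent_level"))
```

Because the value is computed at read time from case_law.precedent_level, correcting a case's precedent level changes the authority of all its halachot without any backfill.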
@@ -664,8 +664,10 @@ CREATE TABLE IF NOT EXISTS halachot (
     case_law_id UUID REFERENCES case_law(id) ON DELETE CASCADE,
     halacha_index INTEGER NOT NULL,
     rule_statement TEXT NOT NULL,
-    rule_type TEXT DEFAULT 'binding',
-    -- binding | interpretive | procedural | obiter
+    rule_type TEXT DEFAULT 'interpretive',
+    -- rule ROLE only (INV-DM7): holding | interpretive | procedural |
+    -- application | obiter. authority (binding/persuasive) is DERIVED
+    -- from case_law.precedent_level, never stored here.
     reasoning_summary TEXT DEFAULT '',
     supporting_quote TEXT NOT NULL,
     page_reference TEXT DEFAULT '',
@@ -4052,7 +4054,7 @@ async def store_halachot(case_law_id: UUID, halachot: list[dict]) -> int:
     case_law_id,
     i,
     h["rule_statement"],
-    h.get("rule_type", "binding"),
+    h.get("rule_type", "interpretive"),
     h.get("reasoning_summary", ""),
     h["supporting_quote"],
     h.get("page_reference", ""),
@@ -4193,7 +4195,7 @@ async def store_halachot_for_chunk(
     VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11,
     $12, $13, $14, $15, $16, {reviewed_at_clause})""",
     case_law_id, base + inserted, h["rule_statement"],
-    h.get("rule_type", "binding"), h.get("reasoning_summary", ""),
+    h.get("rule_type", "interpretive"), h.get("reasoning_summary", ""),
     h["supporting_quote"], h.get("page_reference", ""),
     h.get("practice_areas", []), h.get("subject_tags", []),
     h.get("cites", []), confidence, h.get("quote_verified", False),
@@ -4299,6 +4301,8 @@ async def list_halachot(
         d = dict(r)
         if d.get("decision_date") is not None:
             d["decision_date"] = d["decision_date"].isoformat()
+        # authority is DERIVED from the source, never stored (INV-DM7)
+        d["authority"] = halacha_quality.derive_authority(d.get("precedent_level"))
         out.append(d)
     if cluster and out:
         await _annotate_clusters(pool, out)
@@ -4721,7 +4725,7 @@ async def goldset_list(batch: str = "default") -> list[dict]:
     " g.ai_is_holding, g.ai_correct_type, g.ai_rationale, g.ai_generated_at, "
     " h.rule_statement, h.supporting_quote, h.reasoning_summary, "
     " h.rule_type, h.confidence, h.quality_flags, h.review_status, "
-    " cl.case_number, cl.case_name, cl.source_type "
+    " cl.case_number, cl.case_name, cl.source_type, cl.precedent_level "
     "FROM halacha_goldset g JOIN halachot h ON h.id = g.halacha_id "
     "LEFT JOIN case_law cl ON cl.id = h.case_law_id "
     "WHERE g.batch = $1 ORDER BY g.created_at, g.id", batch,
@@ -4735,6 +4739,8 @@ async def goldset_list(batch: str = "default") -> list[dict]:
             d["ai_generated_at"] = d["ai_generated_at"].isoformat()
         if d.get("confidence") is not None:
             d["confidence"] = float(d["confidence"])
+        # authority is DERIVED from the source, never stored (INV-DM7)
+        d["authority"] = halacha_quality.derive_authority(d.get("precedent_level"))
         out.append(d)
     return out
 
@@ -4792,7 +4798,7 @@ async def goldset_score(batch: str = "default") -> dict:
     for r in labeled:
         rule = r.get("rule_statement") or ""
         quote = r.get("supporting_quote") or ""
-        rtype = r.get("rule_type") or "binding"
+        rtype = r.get("rule_type") or "interpretive"
         qc = r["quote_complete"] if r["quote_complete"] is not None else True
         truly_bad = r["is_holding"] is False
         flags = halacha_quality.compute_quality_flags(rule, quote, "", qc, rtype)
@@ -4990,6 +4996,8 @@ async def search_precedent_library_semantic(
         _conf = float(d.get("confidence") or 0.0)
         d["score"] = float(d["score"]) + max(_conf * 0.06, 0.0)
         d["type"] = "halacha"
+        # authority is DERIVED from the source, never stored (INV-DM7)
+        d["authority"] = halacha_quality.derive_authority(d.get("precedent_level"))
         results.append(d)
 
     rows = await pool.fetch(chunk_sql, *c_params)
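
The commit message also describes a legacy fold in _coerce_halacha (binding → holding, persuasive → interpretive), which is not among the hunks above. One way that normalization could be sketched is shown below. The function name normalize_rule_role and both constants are hypothetical, and falling back to 'interpretive' for unknown values is an assumption chosen to mirror the new column default in the diff.

```python
from typing import Optional

# Role vocabulary from the commit message (INV-DM7).
RULE_ROLES = frozenset(
    {"holding", "interpretive", "procedural", "application", "obiter"}
)

# Pre-split authority values folded into a role, as the commit describes.
LEGACY_FOLD = {"binding": "holding", "persuasive": "interpretive"}


def normalize_rule_role(raw: Optional[str], default: str = "interpretive") -> str:
    """Return a valid rule role, folding legacy authority values.

    Hypothetical helper: unknown or empty input falls back to `default`.
    The source's precedent level is never consulted here.
    """
    value = (raw or "").strip().lower()
    value = LEGACY_FOLD.get(value, value)
    return value if value in RULE_ROLES else default
```

Keeping the fold in one place means extractor output that still uses the old vocabulary lands on a valid role instead of reintroducing an authority value into rule_type.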