Fix claims handling: filter block-zayin duplicates, improve QA matching

block_writer: _build_claims_context now filters out block-zayin claims
(from final decision) and uses only claims from original pleadings.
Reduces noise from 78 to 48 real claims for Hecht case.

qa_validator: claims_coverage check rewritten:
- Filter block-zayin claims (same reason)
- Keyword-based matching instead of 3-word phrase matching
- 25% keyword overlap threshold (was: any 3-word match)
- Allow up to 20% uncovered claims before failing
- Check both block-yod and block-zayin for coverage

Result: Hecht case QA goes from 4/6 to 6/6, 47/48 claims covered (98%).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This commit is contained in:
2026-04-03 11:32:29 +00:00
parent 570f745823
commit 018b5936a1
2 changed files with 49 additions and 15 deletions

View File

@@ -430,12 +430,20 @@ async def _build_claims_context(case_id: UUID) -> str:
claims = await db.get_claims(case_id)
if not claims:
return "(לא חולצו טענות)"
# Filter out claims from block-zayin (decision summary) — use only
# claims extracted from original pleadings (appeal, response, etc.)
source_claims = [c for c in claims if c.get("source_document", "") != "block-zayin"]
if not source_claims:
# Fallback to all claims if no source claims exist
source_claims = claims
lines = []
current_role = ""
role_heb = {"appellant": "טענות העוררים", "respondent": "טענות המשיבים",
"committee": "עמדת הוועדה המקומית", "permit_applicant": "עמדת מבקשי ההיתר"}
claim_num = 0
for c in claims:
for c in source_claims:
if c["party_role"] != current_role:
current_role = c["party_role"]
lines.append(f"\n### {role_heb.get(current_role, current_role)}")