legal-ai

Author	SHA1	Message	Date
Chaim	f8c3fd6c89	fix(nevo): strip preamble/mini-ratio from court rulings too (#86.1) strip_nevo_preamble's _DECISION_START only matched ועדת-ערר openings (בפנינו / הערר שבנדון / ...), so Nevo COURT judgments — exactly the ones carrying a מיני-רציו — slipped through unstripped. The editorial mini-ratio then leaked into the chunked body, risking that the halacha extractor reads Nevo's answer key (contamination) and polluting the corpus. Proven on בג"ץ 1764/05: its full_text still contained the מיני-רציו (unstripped). Fix: - Extend _DECISION_START with court-ruling openings: פסק-דין/פסק דין header and the authoring-judge line (השופט/ת, כב' השופט, הנשיא, המשנה לנשיא). re.search picks the earliest line-start match → the real opinion start, not the prose ratio above it. - Widen the Nevo-marker detection window 400→1500 chars so a long court/parties header doesn't push חקיקה שאוזכרה:/מיני-רציו: out of range. Verified on the real 1764/05 full_text: strips 2702 chars, body now starts at 'השופט ס' ג'ובראן:', מיני-רציו gone. Regression: ועדת-ערר openings still strip; non-Nevo text untouched; markers-past-400 now detected. Suite 182 passed (6 new). This is the anti-contamination prerequisite for the Nevo-ratio gold-set (#86.3/#81.7). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 16:55:31 +00:00
Chaim	fb60dca796	feat(halacha): over-extraction consolidation — fold facets via claude_session (#81.5) After a precedent finishes extracting, a claude_session pass folds facets of the SAME legal question (below #82's dedup cosine — the שפר 14-vs-4 / 403-17→89 granularity gap) into one canonical; the rest are marked 'rejected' (reversible: out of the active corpus AND the review queue, but recoverable). FOLD-ONLY — never merges distinct legal questions, never invents. - Engine: claude_session-as-judge (local CLI, zero cost), 'high' effort — folding needs careful judgment. One pass per precedent, runs in _extract_impl once all chunks are done (the prompt dedups within a chunk; this catches across chunks). - Pure, unit-tested helpers in halacha_quality: CONSOLIDATE_SYSTEM, build_consolidation_prompt, parse_fold_groups (fails SAFE → [] on any malformed shape; drops <2-member groups; coerces/dedups indices). - halacha_extractor._consolidate_precedent picks the canonical per group (approved>pending, higher confidence, quote_verified, longer) and rejects the rest via the existing update_halachot_batch (#84). Never rejects a canonical. Fails OPEN on any error (no CLI / parse fail → 0 folds, data untouched). - config: HALACHA_CONSOLIDATE_ENABLED/MODEL/EFFORT. Verified: suite 176 passed (10 new); integration vs dev DB — a 2-facet group folds to 1 canonical + 1 rejected (tagged), distinct rules untouched, claude error → 0 folds (fail-open). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 16:26:44 +00:00
Chaim	f196bed564	feat(halacha): NLI entailment validator via claude_session (#81.3) + task #86 #81.3 — a post-extraction validator that flags halachot whose rule_statement is NOT entailed by its supporting_quote (the model over-reaching beyond its source). - Engine: claude_session-as-judge (local CLI, zero API cost) per chaim's standing preference — one batched judge call per chunk, NOT a hosted NLI model. - Pure, unit-tested helpers in halacha_quality: NLI_SYSTEM, build_nli_prompt, parse_nli_verdicts (fails OPEN — any shape/label ambiguity → 'entailed'). - halacha_extractor._nli_check wraps the call; fails OPEN on any error (e.g. no CLI in the container) so a flaky judge never blocks a genuine halacha. - Non-entailed (neutral/contradiction) → quality_flag 'nli_unsupported' which blocks auto-approve (routes to pending_review) via the existing store gate. - config: HALACHA_NLI_ENABLED/MODEL/EFFORT (effort 'low' — entailment is simple). Verified: suite 166 passed (10 new); LIVE smoke test against the real claude CLI returned ['entailed','neutral'] for a supported vs unsupported rule. Also commits TaskMaster #86 (Nevo preamble/ratio: anti-contamination strip fix + gold-set benchmark) capturing today's strip_nevo_preamble findings. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 14:46:12 +00:00
Chaim	ca959d4a9c	feat(halacha): strict-rubric quality gate + dedup-on-insert (#81,#82) Bake the 2026-06-03 strict-cleanup rubric into the extraction pipeline so the corpus stays clean at the source instead of accumulating duplicates, obiter dicta, truncated quotes and thin restatements that clog the review queue. #81 — quality gate: - New pure module halacha_quality.py with unit-tested validators: non-decision/obiter (Wambaugh markers), truncated-quote (mid-word cut), thin-restatement (rule≈quote), quote-unverified. - Validators run in halacha_extractor._process; a non-decision is re-typed obiter; flags persist in new halachot.quality_flags column. - Auto-approve now requires confidence>=threshold AND no quality flags; flagged items route to pending_review regardless of confidence. - Both extraction prompts hardened: reject undecided dicta, exclude case-specific applications, require abstraction, forbid over-splitting. #82 — dedup-on-insert (store_halachot_for_chunk): - Within the same precedent, skip a halacha whose normalized supporting_quote already exists, or whose rule-embedding has cosine>=HALACHA_DEDUP_COSINE (0.93) against an already-stored one. Makes re-runs idempotent. Migration: halachot.quality_flags TEXT[] (additive, idempotent ALTER). Tests: 19 new unit tests; full suite 156 passed. Validated end-to-end against dev DB (dedup skips dups, flag blocks auto-approve, re-run inserts 0). Calibration: flags fire on only ~10% of current survivors (low false-positive). Spec: docs/halacha-strict-rubric.md Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 12:30:38 +00:00
Chaim	df007784c9	feat(corroboration): approval_action decision fn + kill-switch (INV-COR2/COR4, X11 Phase 2) - HALACHA_CORROBORATION_AUTO_APPROVE config (default ON, Dafna validated 2026-06-01) - approval_action(agg, has_overruled): overruled→demote, corroborated→approve, else None - 4 offline unit tests; Phase 2 plan + TaskMaster #75 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 04:34:23 +00:00
Chaim	33f955e372	feat(corroboration): aggregator — distinct positive + negative-flag (INV-COR4, X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 19:00:16 +00:00
Chaim	dbc176ae66	feat(corroboration): halacha matcher + cosine threshold (INV-COR3, X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 18:57:47 +00:00
Chaim	09eec6a906	feat(corroboration): treatment classifier + polarity (INV-COR2, X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 18:54:50 +00:00
Chaim	4d8422198a	feat(guard): fitness function blocking raw Paperclip access (GAP-22, FU-8a) Wakeup-INSERT rule is universal (never allowlisted — hard invariant). Raw-HTTP rule exempts the sanctioned helpers + standalone operator/admin scripts (a distinct category per fitness-function scope differentiation + DRY: tooling needn't reuse the FastAPI wrapper). Repo scanned clean under these rules. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 11:35:07 +00:00
Chaim	a66ab3b3cd	feat(guard): fitness function blocking raw Paperclip access (GAP-22, FU-8a) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 11:16:36 +00:00
Chaim	aac383acb7	feat(sync): --verify exits non-zero on drift; adapter mismatch = loud drift (GAP-21, FU-8a) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 11:14:44 +00:00
Chaim	e46868feda	feat(fu2b): flag PROC_MISMATCH (case_number prefix vs proceeding_type) for chair Dry-run surfaced 2 rows with בל"מ prefix but proceeding_type=ערר. Since the migration strips the prefix, a wrong proceeding_type would silently lose the בל"מ signal — must be chair-adjudicated, not auto-applied. Chair table now flags 4 rows: 2 DUP_CHECK (8047-23) + 2 PROC_MISMATCH. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 08:57:42 +00:00
Chaim	a41fcedc28	test(fu2b): failing tests for bare-number extraction (FU-2b)	2026-05-31 08:52:48 +00:00
Chaim	7e35a24d80	test(reindex): cover empty-text raise path (FU-3 review) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 22:13:18 +00:00
Chaim	63abf83e76	test(reindex): fix mark_indexed stub arity in FU-1 fixture (FU-3) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 22:07:39 +00:00
Chaim	c8de42150e	test(reindex): stub db.mark_indexed in FU-1/FU-2a ingest fixtures (FU-3 interaction) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 22:07:18 +00:00
Chaim	e522555b1a	test(reindex): failing tests for content-hash re-index (FU-3)	2026-05-30 22:02:16 +00:00
Chaim	bffd2ec701	test(audit): failing tests for audit-trail + provenance (FU-7)	2026-05-30 21:27:54 +00:00
Chaim	5d3c340243	test(ingest): stub recompute_searchable in FU-1 fixture (FU-2a interaction) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 20:59:11 +00:00
Chaim	358d82e90e	feat(retrieval): require practice_area only for internal/cases; enable searchable filter + health visibility (GAP-13, FU-2a) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 20:57:27 +00:00
Chaim	bcd226ac1a	test(ingest): failing tests for idempotent ingest + searchable (FU-2a) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 20:41:34 +00:00
Chaim	9ae2d47d03	test(ingest): failing tests for unified pipeline (FU-1) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 19:09:37 +00:00
Chaim	0c8d415044	fix(retrieval): scope search_decisions by domain — derive from case, block only on undeterminable case (GAP-12, INV-RET1) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-30 18:23:41 +00:00
Chaim	084b31cd9b	fix(qa): enforce critical-QA gate on export + fix neutral_background critical-but-passed (GAP-15/16, INV-QA3/EX3) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-30 17:58:50 +00:00
Chaim	1af689a969	fix(retrieval): enforce source_kind on halacha_filters — close cross-corpus leak (GAP-10, INV-RET1) Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-30 17:46:59 +00:00
Chaim	f3cc9ca9d4	feat: Stage A finalizers + #35/#36/#37 — critical-gap closure Four parallel sub-agents closed the remaining critical gaps from the 26/05 Stage A/B sprint. Each block independently tested; aggregated here. ## #30/#31 finalizers (sub-agent A) * Auto-derive practice_area in case_create from case_number prefix (1xxx→rishuy_uvniya, 8xxx→betterment_levy, 9xxx→compensation_197); default for CaseCreateRequest is now "" (the DB constraint catches any stray "appeals_committee"). * practice_area.py: derive_subtype now handles axis-B domain values (rishuy_uvniya/betterment_levy/compensation_197) without parsing the case number; new helper derive_domain_practice_area(). * Halacha re-extraction verified unnecessary — all 6 reclassified records already had is_binding=false and approved halachot. * Regression tests: 6 cases in tests/test_corpus_constraints.py covering practice_area enum, internal-committee chair/district, external-upload arar prefix, MCP guard. * UI: district input → Select dropdown (7 districts) in precedent-edit-sheet.tsx, preserving legacy free-text values. ## #37 בל"מ subtypes (sub-agent B) * 3 new appeal_subtypes: extension_request_{building_permit, betterment_levy,compensation}. APPEALS_COMMITTEE_SUBTYPES extended, SUBTYPES_BY_AREA mappings added. * New helpers: is_blam_subject(), is_blam_subtype(), derive_subtype_with_blam(case_number, subject, practice_area). case_create now uses it to auto-detect "בקשה להארכת מועד" subjects. * 3 methodology templates under docs/methodology/extension-request-.md. paperclip_client.py mapping updated for the 3 new subtypes (extension_request_building_permit→CMP, the other two→CMPA). * Frontend: bilingual "בל"מ" badge + filter dropdown on cases list + detail header; appeal-type-bars collapseBlam() merges בל"מ into its parent domain for aggregate bars. * Wizard auto-detects בל"מ from subject during case creation. 
* 3 Berlinger cases (1017/1018/1019-03-26) migrated to appeal_subtype=extension_request_building_permit via psql. ## #35 missing_precedents feature (sub-agent C) * Schema V13: missing_precedents table (citation, case_id, party, legal_topic, status, linked_case_law_id, claim_quote, ...) + FK constraints + 3 indexes. Applied via psql + idempotent migration. * 6 db.py service functions, 3 MCP tools, 6 FastAPI endpoints (POST/GET/PATCH/DELETE/upload — upload routes by citation prefix to ingest_internal_decision or ingest_precedent). * Next.js page /missing-precedents with 5 status tabs + filters + sidebar badge counter + detail drawer with metadata edit + smart upload form that switches fields per committee/court. * Bootstrap: 7 rows imported from the JSON file (3 citations × cases, all status=closed with linked_case_law_id). * legal-researcher.md: new §2ב.5 with missing_precedent_create usage + dedup semantics + tool grant. ## #36 legal_arguments aggregation (sub-agent D) * Schema V14: legal_arguments + legal_argument_propositions M:M. Applied via psql. * New service argument_aggregator.py with two functions — aggregate_claims_to_arguments() (Claude CLI / claude_session) and get_legal_arguments(). Graceful llm_unavailable handling when CLI is missing (containers). * 2 MCP tools + 2 API endpoints (POST .../aggregate-arguments as BackgroundTask, GET .../legal-arguments). * Frontend: shadcn Accordion + new legal-arguments-panel.tsx with hierarchical (party → priority badge → arguments) display, "טיעונים" tab on the case page, "חשב/חשב מחדש" buttons. * scripts/backfill_legal_arguments.py + SCRIPTS.md entry — dry-run found 8 candidate cases including 1017/1018/1019. ## Open follow-ups (intentionally deferred) * npm run api:types in web-ui (CLAUDE.md flow) — recommended before the next UI commit; not required for backend deployment. * Run backfill_legal_arguments.py --apply once the container picks up the new aggregator service. 
* webhook on missing-precedents upload-close to Paperclip (optional). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 08:34:40 +00:00
Chaim	03e7d88aee	DOCX exporter: 3-layer RTL + David font on all slots Hebrew was rendering LTR or in Times New Roman fallback in some Word contexts. Root cause: incomplete RTL marking and missing font hints on the run level. Three layers of RTL are required (per skills/docx/SKILL.md): 1. Section: <w:bidi/> in sectPr (now inherited from template) 2. Paragraph: <w:bidi/> directly in pPr (paragraph direction) 3. Run: <w:rtl/> in rPr — tells Word to use cs (complex-script) font Without an explicit font on the run, Hebrew renders in the ascii slot (Times New Roman). Force David on all four slots (ascii / hAnsi / cs / eastAsia) so every shaping path picks the correct font. Changes: - TEMPLATE_PATH now points to skills/docx/decision_template.docx (carries David, RTL, margins, styles); replaces hard-coded constants. - _mark_run_rtl: writes rFonts on all four slots, then appends <w:rtl/>. - _mark_paragraph_rtl: places <w:bidi/> directly in pPr (not nested in rPr — that was the bug), and adds <w:rtl/> to the paragraph-mark rPr. - _set_paragraph_jc: forces explicit jc, overriding style-inherited. Tests: - test_mark_paragraph_rtl_adds_bidi_directly_in_pPr — guards against the regression where bidi was nested inside rPr. - test_mark_run_rtl_forces_david_on_all_font_slots — ensures all four font slots are set, not just cs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 17:37:52 +00:00
Chaim	36ca713dfa	Retrofit: tighten yod-bet pattern, add cover-block fallback The "על כן" pattern for block-yod-bet was too greedy and matched mid-discussion transitional sentences (e.g. "על כן, במקום בו..."), which caused forward-scan to skip block-yod-alef ("סוף דבר") via the pointer advance. Tightened to require an operative subject (אנו / הערר / הוועדה / ועדת הערר) so terminal "על כן, אנו מחליטים" still matches but mid-block transitions don't. Added structural_fallback for cover blocks (alef/bet/gimel/dalet) — these are template metadata not present in user-edited DOCX bodies. Inject zero-content anchors so apply_user_edit can still target them later. The frontend toast distinguishes real content gaps from fallback anchors. Also expanded heading patterns based on training corpus inspection: - block-vav: על המקרקעין חלות / במצב התכנוני / התכניות החלות - block-zayin: טענות העוררת - block-chet: עיקר תגובת המשיב - block-tet: הדיון בוועדת הערר For case 1130-25, this raises detection from 6/12 to 11/12 blocks — only block-yod-bet remains missing (Daphna's edit ends at "סוף דבר" + numbered ruling, no terminal "ההחלטה" or "על כן אנו מחליטים" paragraph). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 06:57:41 +00:00
Chaim	726498126d	Add Track Changes architecture for draft revisions (CMP + CMPA) Fixes critical bug in 1033-25: user-uploaded עריכה-*.docx files were orphaned on disk while exports kept rebuilding from stale DB blocks. New architecture: - User-uploaded DOCX becomes the source of truth (cases.active_draft_path) - System edits via XML surgery with real Word <w:ins>/<w:del> revisions - User can Accept/Reject each change from within Word Components: - docx_reviser.py: XML surgery for Track Changes (15 tests) - docx_retrofit.py: retroactive bookmark injection with Hebrew marker detection + heading heuristic (9 tests) - docx_exporter.py: emits bookmarks around each of the 12 blocks - 3 new MCP tools: apply_user_edit, list_bookmarks, revise_draft - 4 new/updated endpoints: upload (auto-registers active draft), /exports/revise, /exports/bookmarks, /exports/{filename}/retrofit, /active-draft - DB migration: cases.active_draft_path column - UI: correct banner using real v-numbers, "מקור האמת" badge, detailed upload toast with bookmarks_added/missing_blocks - agents: legal-exporter (3 export modes), legal-ceo (stage G for revision handling), legal-writer (revision mode) Multi-tenancy: - Works for both CMP (1xxx cases) and CMPA (8xxx/9xxx cases) - New revise-draft skill added to both companies - deploy-track-changes.sh syncs skills CMP ↔ CMPA - retrofit_case.py: one-off retrofit of existing files Tests: 34 passing (15 reviser + 9 retrofit + 4 exporter bookmarks + 6 e2e) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-16 18:49:30 +00:00
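
Note on commit ca959d4a9c (#81, #82): the message describes two rules, dedup-on-insert within a precedent and an auto-approve gate that requires both sufficient confidence and no quality flags. The sketch below is only an illustration of those two rules as worded in the message, not the repository's code. The function names, the dict shape of a halacha, the whitespace-only quote normalizer, and the "approved" status label are all assumptions; only the 0.93 cosine threshold, the "pending_review" routing, and the field names supporting_quote and quality_flags come from the log.

```python
import math
import re

HALACHA_DEDUP_COSINE = 0.93  # threshold quoted in commit ca959d4a9c


def normalize_quote(quote: str) -> str:
    """Hypothetical normalizer: collapse whitespace runs and trim the ends."""
    return re.sub(r"\s+", " ", quote).strip()


def cosine(a, b) -> float:
    """Plain cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_duplicate(candidate, stored, threshold=HALACHA_DEDUP_COSINE) -> bool:
    """True if the candidate repeats a halacha already stored for the same precedent.

    Assumed shape: each item is a dict with 'supporting_quote' and 'embedding'.
    """
    quote = normalize_quote(candidate["supporting_quote"])
    for row in stored:
        # Rule 1: same normalized supporting quote.
        if normalize_quote(row["supporting_quote"]) == quote:
            return True
        # Rule 2: rule embedding at or above the cosine threshold.
        if cosine(candidate["embedding"], row["embedding"]) >= threshold:
            return True
    return False


def review_status(confidence: float, quality_flags, threshold: float) -> str:
    """Auto-approve only when confidence clears the threshold AND no flags are set."""
    if confidence >= threshold and not quality_flags:
        return "approved"  # assumed label for the auto-approved state
    return "pending_review"
```

Under these assumptions a re-run over the same chunk inserts nothing new, because every candidate matches a stored quote exactly, which is the idempotence the commit message reports.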

29 Commits