legal-ai

Author	SHA1	Message	Date
Chaim	e096c51037	fix(#85 ): claude_session — retry על כשלים חולפים של claude -p שורש #85 התברר: `claude -p` נכשל מדי פעם ב-exit מהיר + stderr ריק על פרומפטים גדולים/איטיים (CEO write_interim_draft, learning_loop distillation), אותו פרומפט מצליח בריצה חוזרת — כשל חולף, לא nesting (אומת: nested claude מ-bash וגם פרומפט 70K הצליחו; הכשל אינו דטרמיניסטי). query() עוטף spawn+communicate ב-לולאת retry (MAX_RETRIES=3, backoff לינארי 5s*attempt). FileNotFoundError + timeout נשארים דטרמיניסטיים (ללא retry). empty-response גם מטופל כ-transient. אומת e2e: distillation על 1130-25 רץ בהצלחה → pair=analyzed (9 שינויים, 6 style_method, 33.8% diff). פותר גם את write_interim_draft של ה-CEO. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 20:08:54 +00:00
Chaim	420cb819f5	feat(halacha-triage): quality-gated + prioritized review queue + metrics (#84 ) Backend for the halacha approval-queue triage (#84). The keyboard UI, batch actions and defer/reject (#84.4–6) already shipped; this adds the gating, prioritization and metrics the queue was missing. db.list_halachot — two opt-in triage controls: * exclude_low_quality (#84.1): drop items carrying ANY quality_flag (application / quote_unverified / truncated / non_decision / thin / nli_unsupported / near_duplicate) — they belong in a 'needs extraction fix' bucket, not the chair's approve queue. * order_by_priority (#84.3): active-learning order — negatively-treated first, then most-uncertain (lowest confidence), then oldest — instead of FIFO, so the highest-value decisions surface first. halachot_pending (MCP) — now gated + prioritized BY DEFAULT; include_low_quality= true reveals the needs-fix bucket. The agent review path benefits immediately. GET /api/halachot — same two params, default OFF (non-breaking; the UI opts in). metrics.halacha_backlog (#84.7) — splits pending into clean vs flagged, adds deferred, reviewed_total, approve_ratio, and a pending_by_flag breakdown, so the backlog distinguishes real review work from extraction noise. Deferred (documented): #84.2 near-duplicate cluster cards and wiring the UI fetch to the new params require frontend work + an api:types regen AFTER this deploys (the new query params aren't in prod's OpenAPI until then) — a clean follow-up. The backend fully supports both now. Verified against the live DB (read-only): - pending 177 → gated-clean 110, 0 flagged items leak into the clean queue. - priority order surfaces the lowest-confidence items first (0.55, 0.55, ...). - backlog: pending_clean=110 / pending_flagged=67 / approve_ratio=0.916, pending_by_flag={nli_unsupported:59, quote_unverified:3, thin:3, truncated:2}. - pytest tests/test_halacha_quality.py — 52 passed (no regression). Invariants: G1 (gate at source — SQL filter, not post-hoc); G2 (no parallel path — same list_halachot); §6 (flagged items routed to a bucket, never dropped). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 20:00:52 +00:00
Chaim	1286a1e60d	feat(halacha): application gate + lexical dedup tail + quality harnesses (#81,#82) Halacha-extraction quality (#81) and dedup-on-insert (#82) — engine changes (pure + tested) plus measurement/ops tooling. halacha_quality.py - #81.4 application gate: is_fact_dependent() (high-precision "applied to THIS case" deixis per the strict rubric §3/§27) + FLAG_APPLICATION. compute_quality_flags now takes rule_type and flags rule_type=='application' OR fact-dependent — blocking auto-approve (an illustration is not a generalizable holding). - #82.3 lexical tail signal: jaccard_shingles / normalized_levenshtein / lexical_near_duplicate + FLAG_NEAR_DUPLICATE, for the 0.83–0.93 cosine band. halacha_extractor.py — pass rule_type to the flag computation; re-type a binding-labeled fact-application to 'application' (mirrors non_decision→obiter). db.py (store_halachot_for_chunk) — dedup now fetches the nearest same-precedent neighbor once: cosine ≥ DEDUP → skip (unchanged); cosine in [BAND, DEDUP) with high lexical overlap → FLAG_NEAR_DUPLICATE (review, not skip — never drop a possibly-distinct principle unreviewed). config.py — HALACHA_DEDUP_BAND_COSINE (0.83). Scripts: - scripts/halacha_goldset.py (#81.7) — export stratified sample for human tagging; score validators (P/R/F1) against the tags. Backbone for #81.8. - scripts/halacha_batch_reconcile.py (#82.7) — conservative cross-precedent dedup (cosine ≥0.95), dry-run report only. - scripts/calibrate_halacha_dedup.py (#82.1) — calibrate the lexical thresholds against the 2026-06-03 cleanup gold-set. Deferred (documented): #82.4 merge-provenance and #82.5 DB ON CONFLICT/UNIQUE on normalized quote are NOT included — the current skip+flag behavior is safe, whereas a UNIQUE on normalized_quote would fail on existing dups and a blind merge risks losing provenance; they need their own chair-reviewed migration. #82.6 over-merge guard is moot until merge lands. #81.6 full rhetorical-role classifier deferred (section pre-filter + application flag cover the practical case); #81.8 blocked on the human-tagged gold-set (harness now provided). Verified: - pytest tests/test_halacha_quality.py — 52 passed (14 new). - calibrate: configured (0.55,0.70) → precision 1.0 (zero false-merge), recall 0.30 — correct profile for an auto-approve-blocking signal. - goldset export: 15-row sample CSV. batch reconcile: 819 halachot → 5 cross-precedent candidate pairs. Invariants: G1 (normalize at source — flag at insert, not at read); §6 (no silent swallow — suspect items flagged to review, never dropped); G2 (no parallel path — same store_halachot_for_chunk / compute_quality_flags). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 19:55:45 +00:00
Chaim	fb51a0e869	feat(nevo): backfill leaked preamble + ratio gold-set benchmark (#86 ) #86.2 backfill + #86.3 benchmark, plus a #86.1 over-strip fix found en route. extractor.py - extract_nevo_ratio(): capture Nevo's מיני-רציו block (editorial holdings summary) before it is stripped — a free professional gold-set (#86.3). - _DECISION_START hardening (#86.2): the merged #86.1 regex over-stripped. (a) פסק-דין headers are markdown-wrapped (פסק דין); the old anchor required the keyword as the first line char with one separator, so it missed the header and matched a citation 32K deep (עמ"נ 50567-07-21, losing 45% of the body). Now tolerates leading markdown + 0-3 seps, and the final-nun form (דין ן vs דינו נ). (b) bare השופט/הנשיא matched CITATIONS ("השופט מ' חשין, פסקה 23"). The authoring-judge line ends with a colon; we now require it. ingest.py - capture the ratio before stripping and store it on the row (best-effort, non-fatal); also strip the text-upload path (was file-only). db.py - add case_law.nevo_ratio column (additive); allow it in update_case_law. scripts/backfill_nevo_preamble.py (#86.2) — dry-run-by-default data migration: finds historically-leaked rulings, captures ratio→nevo_ratio, rewrites full_text (+content_hash), reindexes, and FLAGS (never deletes) halachot whose quote lives in the removed preamble (review_status=pending_review + nevo_preamble_leak flag). Safety guard: rows with keep%<--min-keep (60) are excluded from --apply as suspected over-strip. --apply writes backup+manifest to data/audit/ first. Chair-gated — NOT applied here. scripts/nevo_ratio_benchmark.py (#86.3) — LLM-as-judge (local claude_session, zero cost) measures recall/precision/granularity of our halachot vs the Nevo ratio. Works pre- and post-backfill (reads nevo_ratio, falls back to full_text). Verified: - pytest tests/test_nevo_preamble.py — 12 passed (incl. citation/markdown over-strip regressions). - backfill dry-run: 19 leaked rulings, 27 contaminated halachot, all ≥75% keep (the 32K over-strip is gone). - benchmark on בג"ץ 1764/05: recall=0.875 precision=1.0 granularity=1.75x. Invariants: G1 (normalize at source — strip/capture at ingest, not at read); no silent swallow (contaminated halachot flagged + reported, not dropped); data-migration is dry-run-default with backup+manifest, chair-gated. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 19:45:43 +00:00
chaim	12bdec10fa	Merge pull request 'fix(claude_session): surface real CLI error + sanitize nested env (#85 )' (#90 ) from worktree-task85-claude-session-nested into main All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m27s Details	2026-06-06 19:30:22 +00:00
Chaim	8ec24cf822	fix(claude_session): surface real CLI error + sanitize nested env (#85 ) write_interim_draft failed for all blocks from the CEO MCP instance with "Claude CLI failed (exit 1): unknown error". Two fixes: 1. Error surfacing (the certain win): on non-zero exit, capture and log both stderr AND stdout (the CLI sometimes writes its diagnostic to stdout or nowhere), so the next occurrence is diagnosable instead of collapsing to "unknown error". This is why #85 was unsolved — the real error was swallowed (engineering rule §6: no silent swallow). 2. Defensive hardening: strip Claude Code session markers (CLAUDECODE, CLAUDE_CODE_, CLAUDE_AGENT_, AI_AGENT, CLAUDE_EFFORT) from the env of nested `claude -p` calls and run them from $HOME, decoupling them from the parent agent's session/project state. Aligns query() with the existing query_streaming() path (which already sets cwd=HOME). Auth/ config vars are preserved. Note: the original adapter-context failure could not be reproduced in a plain interactive session (nested claude -p succeeds there in both old and new code), so the env markers are a suspect, not a proven cause. The real value is the diagnostics. Verified: nested query() returns PONG from inside a CLAUDECODE=1 session; unit tests cover env sanitization. Invariants: G1 (normalize at source — fix the spawn, not readers), G2 (no parallel path — same query()), §6 (no silent error swallow). INV: feedback_claude_session_local_only preserved (all calls stay local). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 19:29:36 +00:00
Chaim	5fa76a09b4	feat(style-acq T8): analyze_corpus — הסרת LIMIT 20 (כיסוי 48/48) LIMIT 20 קבוע השמיט בשקט שליש מקורפוס דפנה מחילוץ author-features שהפרופיל של הכותב (T0) נסמך עליו. עכשיו limit=0 (ברירת-מחדל) = כל הקורפוס; פרמטר lim>0 אופציונלי לתקרה. G11. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 19:25:13 +00:00
Chaim	3c68383e86	fix(style-acq T9): מספור-אוטומטי אמיתי בייצוא DOCX (היה ללא מספור) באג: ה-exporter הסיר את הקידומת "N." והחיל סגנון "List Paragraph" — שאין לו numPr בתבנית (אין numbering.xml) → ההחלטות יצאו ללא מספור כלל. - docx_exporter._ensure_decision_numbering: מזריק abstractNum עשרוני (RTL, lvlJc=right) + num לחלק-המספור פעם אחת; _apply_list_numbering מחבר כל פסקת-גוף לרשימה הרציפה. מספור Word אמיתי — מתעדכן בעריכה, copy/paste נקי. אומת מבנית: numId יחיד, decimal, שתי פסקאות→אותו numId, docx נשמר. - התאמת ANTI_PATTERNS (T7): הוסר manual_paragraph_numbers — "N." בתחילת-שורה הוא ה-signal הנדרש לייצוא, לא אנטי-דפוס. נשאר inline (1)..(2)/markdown/bullets. - voice-fingerprint §3.1: תוקן — הכותב כן מקדים "N. " בתחילת-שורה (signal), הייצוא ממיר ל-auto-numbering. סתירה קודמת ("אל תקליד מספרים") יושבה. ⚠️ אימות-מבנה עבר; אימות ויזואלי ב-Word מומלץ על ייצוא ראשון. G11. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 19:23:29 +00:00
Chaim	f20a3a09fd	feat(style-acq T14): שער-יו"ר לאישור הצעות-curator → הטמעה לפרופיל סוגר את הלולאה מקצה-לקצה (INV-G10/LRN1): ה-curator מציע (status=analyzed), היו"ר מאשרת, והלקחים נכתבים לערוצים שהכותב צורך (T15) — אין auto-commit. - db.get_draft_final_pair(id) — שורת-פנקס מלאה כולל analysis. - app.py: GET /api/learning/pairs/{id} (חושף רק changes מסוג style_method — INV-LRN5) + POST .../promote (לקחים→discussion_rules['universal'], ביטויים→transition_phrases['universal'] דרך merge ל-appeal_type_rules; status→lessons_folded). _append_methodology_override משותף. - web-ui: usePairDetail/usePromoteLearning + ProposalReview (בחירת לקחים/ ביטויים לאימוץ) בטאב "למידה" עבור pairs במצב analyzed. INV-G10 (שער-יו"ר) · INV-LRN1 (אין auto-commit) · INV-LRN5 (טוהר). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 19:17:56 +00:00
Chaim	e4fbda6c1f	feat(style-acq T12): /methodology — קטגוריות ביטויי-מעבר + אנטי-דפוסים מרחיב את עורך-הפרופיל ב-/methodology עם 2 קטגוריות נוספות שהכותב (T15) והמדד (T7) צורכים — כך שהיו"ר עורכת אותן והעריכה זורמת לכתיבה: - app.py: _METHODOLOGY_DEFAULTS += transition_phrases (מקובץ לפי תוצאה) + anti_patterns (מ-lessons.ANTI_PATTERNS). דרך ה-CRUD הגנרי הקיים (appeal_type_rules). - block_writer (T15 loop): קורא overrides גם ל-transition_phrases + anti_patterns. - web-ui: GenericMethodologyPanel (עורך key→JSON) + 2 טאבים ב-/methodology. voice_invariants (doc) — נדחה (לא key-value). G11, INV-LRN4. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 19:08:44 +00:00
Chaim	a2236363d4	feat(mcp): FU-14 GAP-50 — deprecate draft_section לטובת get_block_context INV-TOOL2. מיפוי הראה ש-6 כלי-הבלוק אינם כפילות מיותרת: write_block/ write_all_blocks/save_block_content/write_interim_draft משרתים זרימות שונות (CLI/initial-draft מול תהליך-ה-writer "התיקון בקובץ, לא ב-DB"). הכפילות האמיתית היחידה — draft_section (הקשר לפי-סעיף, granularity ישן) חופף ל- get_block_context (לפי-בלוק, תואם 12-הבלוקים הקנוני). הכרעת-יו"ר: draft_section סומן deprecated (docstring ב-server.py + drafting.py מפנה ל-get_block_context; draft-decision.md עודכן). ללא הסרה, ללא מיזוג כלי- הכתיבה — שמירת תהליך-הכתיבה המכוון. בדיקות: 182/182 עוברים. GAP-49+50 סגורים. Invariants: מקדם INV-TOOL2 + G2. מתועד ב-X9 (נסגר) + gap-audit פרוסה 9. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 18:57:02 +00:00
Chaim	14568fdd15	feat(mcp): FU-14 GAP-49 — תיקון שם-הכלי המטעה (precedent_search_library) INV-TOOL2: `precedent_search_library` (שמחפש ציטוטים מצורפים-לתיק) היה הפוך וכמעט-זהה ל-`search_precedent_library` (ספריית-הפסיקה הסמכותית, מקור CREAC), מה שסיכן ציטוט מהמקור הלא-נכון בהחלטה. שונה ל-`search_case_precedents` (שם ברור: case-attached). השם הישן נשמר כ-@mcp.tool() alias deprecated המנתב לחדש → אפס שבירה לסוכנים חיים. docstrings של שני כלי-הפסיקה הובהרו (case-attached מול authoritative). עודכנו: web/app.py (typeahead), legal-researcher/legal-writer docs, precedent_library docstring. 5 כלי-החיפוש הנותרים (search_decisions/case_documents/find_similar/internal/ precedent_library) מחפשים קורפוסים מובחנים בשמות סבירים — לא בוצע rename המוני (churn גבוה, ערך נמוך מול הסיכון). בדיקות: 182/182 עוברים. אחרי deploy — סנכרון cross-company של doc-הסוכן. Invariants: מקדם INV-TOOL2 + G2. מתועד ב-X9 + gap-audit פרוסה 8. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 18:51:17 +00:00
chaim	172511339f	Merge pull request 'fix(style-acq T1): insert_style_exemplar — vector כ-list (register_vector)' (#81 ) from worktree-style-acquisition-mvp into main All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m36s Details	2026-06-06 18:15:31 +00:00
Chaim	ad4350029a	fix(style-acq T1): insert_style_exemplar — vector כ-list לא str (register_vector) asyncpg עם pgvector register_vector מקבל את ה-embedding כ-list[float] ישירות; str() גרם ל-DataError. תוקן בהתאם לדפוס store_*_image_embeddings. Backfill הורץ בהצלחה: 2670 דוגמאות מ-83 החלטות. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 18:14:56 +00:00
chaim	424dc7cd18	Merge pull request 'feat(style-acq T1-T3): קורפוס-דוגמאות של דפנה לכותב (style_exemplars)' (#80 ) from worktree-style-acquisition-mvp into main All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m41s Details	2026-06-06 18:10:31 +00:00
Chaim	2e20e27e17	feat(style-acq T1-T3): קורפוס-דוגמאות של דפנה לכותב (style_exemplars) ממלא את ערוץ-הדוגמאות (B) של מערכת רכישת-הסגנון: הכותב מאחזר פסקאות-בלוק אמיתיות של דפנה בזמן כתיבה, ממוקדות section+outcome+practice_area. T1 — תשתית + backfill: - SCHEMA_V27: טבלת style_exemplars (purpose-built — בלי תיקים מזויפים בשרשרת decision_paragraphs). decision_number/source/section/outcome/practice_area+embedding. - db: insert/delete/search_style_exemplars + count_style_exemplars. - scripts/backfill_style_exemplars.py: מפצל קורפוס דפנה (style_corpus + internal_committee) לסעיפים→פסקאות, embed, שמירה. אידמפוטנטי, dry-run/apply. T2 — אחזור ממוקד: - search_style_exemplars(section, outcome, practice_area) — section=hard filter, outcome/practice_area=soft. block_writer._build_precedents_context ממפה block→section ומאחזר (ראשי), לצד הנתיב הישן (משלים). T3 — contrastive/adapt: - הדוגמאות מתויגות "מבנה/קול בלבד — התאם, אל תעתיק תוכן"; פסקה מלאה (1100 תווים). INV-LRN5 (טוהר — סגנון בלבד). G11. הרצת backfill --apply בנפרד. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 18:10:01 +00:00
Chaim	29af008271	feat(mcp): FU-14 GAP-48 פרוסה 3 — envelope למשפחת drafting (סגירת GAP-48) הפרוסה האחרונה של GAP-48 (INV-TOOL1). 18 כלי drafting הומרו ל-{status,data,message} דרך tools/envelope.py — כולל מסלול הפקת-ההחלטה הקריטי. עיקרון לכלים עם כשל משמעותי (export_docx/revise_draft/apply_user_edit): err() ברמת-המעטפת — כך שהסוכן והמשתמש רואים את הכשל; failed_gates רוכב ב-data. שאר הכלים: ok(data=payload) להצלחה, err להיעדר-תיק/קלט-שגוי/חריגה. 6 צרכני-app.py חוּוטו (get_decision_template, apply_user_edit ×2, revise_draft, list_bookmarks, export_docx) עם envelope_unwrap + בדיקת status=="error"→4xx, לשמירת חוזה-ה-API (X6) ללא-שינוי. test_export_qa_gate עודכן לחוזה החדש. בדיקות: 182/182 עוברים (כולל שערי-QA של הייצוא). GAP-48 סגור: כל ~12 משפחות-הכלים אחידות. נותר ב-FU-14: GAP-49/50 (שובר), GAP-54. Invariants: משלים INV-TOOL1 + G2. מתועד ב-X9 (נסגר) + gap-audit פרוסה 7. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 17:51:56 +00:00
chaim	9a3e7faf08	Merge pull request 'feat(mcp): FU-14 GAP-48 פרוסה 2 — envelope אחיד ל-11 משפחות-כלים' (#77 ) from fix/fu14-gap48-envelope-rest into main All checks were successful Build & Deploy / build-and-deploy (push) Successful in 2m5s Details	2026-06-06 17:42:00 +00:00
Chaim	79b9c37301	feat(mcp): FU-14 GAP-48 פרוסה 2 — envelope אחיד ל-11 משפחות-כלים המשך מיגרציית INV-TOOL1 מעבר למשפחת-החיפוש (#71). הומרו ל-{status,data,message}: precedent_library, citations, internal_decisions, missing_precedents, training_enrichment, precedents, legal_arguments, cases, documents, workflow (~55 כלים). בוטלו 5 עותקי _ok/_err משוכפלים (alias ל-tools/envelope.py — SSoT, G2). עיקרון: envelope-status = הצלחת-הקריאה-לכלי; תוצאה-עסקית (idempotent_existing, noop, completed...) נשמרת בתוך data. err רק לכשל אמיתי (not-found/invalid/exception). תאימות-API: צרכני web/app.py של cases/workflow/precedents חוּוטו דרך envelope_unwrap + בדיקת status=="error"→4xx — תשובת ה-HTTP זהה, web-ui לא מושפע. (documents/legal_arguments/citations/... אינם נצרכים מ-app.py — agent-only.) בדיקות: 182/182 עוברים (test_corpus_constraints עודכן לחוזה החדש). נותר: משפחת drafting (מסלול הפקת-ההחלטה) בפרוסה נפרדת עם שער טסט-ייצוא. Invariants: מקדם INV-TOOL1 + G2 (SSoT, ביטול כפילות). מתועד ב-X9 + gap-audit. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 17:41:39 +00:00
Chaim	a3451775fa	feat(style-acq T7): מדד מרחק-סגנון — האם הטיוטות מתכנסות לדפנה סוגר את ה-MVP (T0+T4+T5+T7): מטא-אות על בריאות-הלמידה (INV-LRN4), דטרמיניסטי וללא LLM. - lessons.ANTI_PATTERNS — אנטי-דפוסים נמדדים (מ-voice-fingerprint §3 המתוקן): מספרים-ידניים, רשימת-מיני (1)..(2), כותרות markdown, תבליטים. - services/style_distance.py — 3 רכיבים: golden_ratio_adherence (סטיית אחוזי-סעיפים מ-GOLDEN_RATIOS), anti_pattern_hits, draft_to_final_diff (change_percent מפנקס-ההתאמה). מקור-אמת אחד עם lessons.py. - MCP tool style_distance(case_number). INV-LRN4. G9. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 17:33:00 +00:00
Chaim	0d995483ce	feat(style-acq T4+T5): פנקס-התאמה draft↔final + דיסטילציה אוטומטית דרך ה-curator סוגר את לולאת-הלמידה (INV-LRN4): כל החלטה נסגרת מול הסופי, וכל סופי מנותח מול הטיוטה. מזין את הטבלאות ש-T15 כבר קורא מהן. T5 — פנקס-התאמה: - SCHEMA_V26: טבלת draft_final_pairs (snapshot draft + final + diff + analysis + status). - db: create/update/list_draft_final_pairs. - mark-final (app.py): תופס snapshot של הטיוטה (decision_blocks) ברגע החתימה, לפני שאפשר לדרוס אותו, ופותח שורת-פנקס (status=final_received). T4 — דיסטילציה אוטומטית: - learning_loop.process_final_version: משתמש ב-snapshot (לא בבלוקים שאולי השתנו), מסווג style_method↔substance, שומר הצעה ב-pair (status=analyzed). הוסר ה-auto-upsert של style_patterns — ביטל את ה-bug שדרס את שער-היו"ר וזיהם סגנון במהות (INV-LRN1 + INV-LRN5). - LESSONS_PROMPT: הפרדת style_method↔substance מפורשת + לקח מופשט בלבד. - curator wake + hermes-curator.md: מריץ ingest_final_version ראשון; מציע רק style_method שלא תועד; substance→מסלול precedent. INV-LRN1 (שער-יו"ר, אין auto-commit) · INV-LRN4 (ניגוד-אמת) · INV-LRN5 (טוהר). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 17:20:57 +00:00
chaim	014eb4937e	Merge pull request 'feat(style-acq T15): הכותב צורך את כל הלמידה (/methodology + /training) + תיקון-מספור' (#72 ) from worktree-style-acquisition-mvp into main All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m56s Details	2026-06-06 16:37:01 +00:00
Chaim	b9bdca0572	feat(style-acq T15): הכותב צורך את כל הלמידה (/methodology overrides + /training lessons) + תיקון-מספור עונה ל"להתחשב במה שכבר למדנו": הכותב התעלם מעריכות היו"ר ב-/methodology (נשמרו ב-appeal_type_rules אך block_writer קרא רק קבועי lessons.py) ומ- decision_lessons של /training. עכשיו הכל מגיע לכתיבה. - db.get_methodology_overrides(category) — overrides של היו"ר (יחסי-זהב, כללי-דיון, צ׳קליסטים) מ-appeal_type_rules (כמו merge של ה-API). - db.get_recent_decision_lessons(limit, practice_area) — לקחי /training. - _build_style_context(practice_area): מוסיף סעיף "⭐ למידה מצטברת — גובר על ברירת-מחדל" עם שניהם, אחרי voice-fingerprint (T0). שני ה-callers מעבירים practice_area. עובד יחד עם הלולאה (T4/T5) שתזין לאותן טבלאות. תיקון-מספור (חלק מ-T9, דחוף כי T0 הזריק את הטעות): voice-fingerprint §3.1 תוקן — ההחלטה ממוספרת תמיד (מספור-אוטומטי ב-Word); "ללא מספור" היה ארטיפקט-חילוץ. האנטי-דפוס האמיתי: רשימת-מיני בתוך פסקה + מספרים ידניים. INV-LRN4 (הזרמת למידה) · INV-LRN5 (טוהר). G11. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 16:36:32 +00:00
Chaim	aa0a736a7b	feat(mcp): FU-14 GAP-48 פרוסה 1 — envelope אחיד (SSoT) + משפחת-חיפוש INV-TOOL1: כלי-ה-MCP החזירו 3 מוסכמות סותרות (raw payload / {error} / {status,message} אד-הוק) + 5 עותקי _ok/_err משוכפלים. נוצר tools/envelope.py כמקור-אמת יחיד: ok/empty/err → {status,data,message}, כש-status מבחין מפורשות הצלחה/ריק/שגיאה. פרוסה 1 ממירה את משפחת-החיפוש (search_decisions, search_case_documents, find_similar_cases, search_internal_decisions). web/app.py מפרק את המעטפת דרך envelope_unwrap כדי לשמר את חוזה-ה-UI↔API (X6) ללא-שינוי — תשובת ה-HTTP זהה (list על hits, {"message"} על ריק/שגיאה). טסט test_search_domain_scope עודכן לחוזה החדש (5/5 עוברים). החלטה: הדרגתי לפי-משפחה ולא big-bang. מפת-צרכנים: server.py pass-through, web-ui מבודד (/api/*), רק 17 כלים נצרכים ישירות מ-app.py → סיכון מינימלי לסוכנים החיים. ~73 כלים נותרו לפרוסות הבאות. Invariants: מקדם INV-TOOL1 (envelope עקבי) + G2 (SSoT, ביטול כפילות _ok/_err). לא נוגע ב-G1. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 16:32:07 +00:00
chaim	c52b5986a3	Merge pull request 'feat(ui): אינדיקטור התקדמות לחילוץ מטא-דאטה + מתג-מקטעים בספריית הפסיקה' (#70 ) from worktree-feat+metadata-extraction-progress into main All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m42s Details	2026-06-06 16:22:42 +00:00
Chaim	6bf19bd0d7	feat(ui): אינדיקטור התקדמות לחילוץ מטא-דאטה + מתג-מקטעים בספריית הפסיקה שתי בעיות UX בדף /precedents: 1. חילוץ מטא-דאטה לא נתן שום אינדיקציה שהוא רץ. בניגוד לחילוץ טקסט/הלכות (extraction_status / halacha_extraction_status) למטא-דאטה היתה רק חותמת-זמן metadata_extraction_requested_at — אין מצב "processing", לכן StatusPill לא הציג כלום. נוספה עמודת metadata_extraction_status ('pending'\|'processing'\| 'completed'\|'failed') במתכונת העמודות הקיימות, וה-worker (process_pending_extractions + reextract_metadata) מעדכן אותה: processing בתחילת פריט, completed בסיום (מנקה גם את החותמת), pending בכשל (לריטריי). ה-UI מציג תג "מחלץ מטא-דאטה" + באנר מונה-אצווה עם אחוז התקדמות (high-water-mark של עומק-התור) שמתעדכן אוטומטית דרך ה-polling הקיים (5ש'). 2. שתי טבלאות מוערמות (בתי משפט / ועדות ערר) חייבו גלילה ארוכה. הוחלפו במתג- מקטעים — טבלה אחת בכל פעם, עם שמירה על העמודות הייעודיות לכל סוג. Invariants: G2 (מרחיב מנגנון-סטטוס קיים, לא מסלול מקביל), INV-TOOL4/GAP-45 (המשך חשיפת תור-החילוץ הסמוי). אין נגיעה בתוכן משפטי (G11). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 16:21:41 +00:00
Chaim	8a3bcd3ffc	feat(style-acq T0): הזרקת פרופיל-הקול לכותב + מדיניות-העתקה + הפרדת דוגמאות↔פסיקה הלוֹבר הראשי של מערכת רכישת-הסגנון. block_writer עבר היום מ"העתקה + ערבוב-מהות" ל"הכללת-סגנון + הפרדה": - _build_style_context: טוען את daphna-voice-fingerprint.md (פרופיל-הקול המופשט — המנגנון המרכזי) + מדיניות-העתקה מפורשת לפי סוג-תוכן (נוסחה→מותר, ניתוח→הכלל, מהות מתיק אחר→אסור). INV-LRN5. - _build_precedents_context: פוצל לשני זרמים נפרדים — daphna_style_exemplars (איך דפנה כותבת) מול case_law_citations (מהות לציטוט). - block-yod prompt: שני סעיפים מסומנים במקום "פסיקה רלוונטית (צטט מכאן)" שערבב סגנון ומהות; הדוגמאות-סגנוניות מתויגות "מבנה/קול בלבד". INV: G11 (סגנון דפנה), INV-LRN5 (טוהר-הקול). חלק מתוכנית style-acquisition. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 16:20:24 +00:00
Chaim	40c1111e9b	feat(mcp): FU-14 GAP-47 (חלק provenance) — draft_section מחזיר document_id+page+score ה-provenance (document_id, page_number, score) כבר נשלף ב-search_similar אך נזרק בבניית פלט draft_section. כעת מוחזר לכל קטע ב-case_documents/precedents, כך שהכותב יכול לעקוב אחורה אל מסמך-המקור והעמוד ולצטטם, ולא לסמוך על תוכן חסר-מקור. תוספתי בלבד — אין צרכן שמפרסר את מפתחות-הפלט, תואם-לאחור. נותר ב-GAP-47: העברת הנחיות-יו"ר מ-analysis-and-research.md ל-DB (get_chair_directions) — שינוי-מסלול גדול יותר, לפרוסה נפרדת. Invariants: מקיים INV-TOOL4 (מקור-אמת נגיש) + G9 (provenance). לא נוגע ב-G2/G1. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 15:58:39 +00:00
Chaim	701efab726	feat(mcp): FU-14 GAP-51 — איחוד אוצר-המילים של תוצאת-תיק (set_outcome SSoT) הכרעת-יו"ר: קנוני = 3 תוצאות אמיתיות (rejection/partial_acceptance/full_acceptance); betterment_levy יוצא מהיותו "תוצאה" ועובר ל-override לפי practice_area. + עקרון "אנגלית-ב-DB, עברית-ב-UI": מפת-תוויות SSoT אחת. lessons.py: - VALID_OUTCOMES = 3 (הוסר betterment_levy). - OUTCOME_LABELS_HE (SSoT לתצוגה) + LEGACY_OUTCOME_MAP + canonical_outcome(). - PRACTICE_AREA_OVERRIDES["betterment_levy"] מרכז את כל ה-guidance שהיה מפתוח כ-outcome (golden_ratios/opening/summary/discussion/template). - get_lessons_for_outcome(outcome, practice_area) + format_ratios_comment(..., practice_area) מחילים override + מנרמלים legacy. block_writer.py: STRUCTURE_GUIDANCE קנוני + תווית מ-OUTCOME_LABELS_HE + override betterment. workflow.set_outcome: קנוני 3 + מיפוי-legacy סלחני; תווית מ-SSoT. drafting.py: טבלת יחסי-זהב + get_decision_template מודעי-practice_area (override). web-ui case.ts: הסרת betterment_levy מ-expectedOutcomes (הוא practice_area). server.py: docstrings קנוניים. מיגרציה: migrate_gap51_outcomes.py — 9 שורות נורמלו (rejected→rejection וכו'), גיבוי ב-data/audit/. הקוד canonicalize בקריאה ⇒ backward-compatible גם בלי מיגרציה. אומת: py_compile (5 קבצים) + בדיקות-יחידה offline (override/legacy/labels) + אימות-DB. עודכנו X9 §3 + gap-audit (GAP-51 ✅). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 15:34:49 +00:00
Chaim	ea8b48c6ac	feat(mcp): FU-14 GAP-45 — extraction_status (חשיפת תור-החילוץ הסמוי) INV-TOOL4 (visibility / persistence). תור בקשות-החילוץ (metadata/halacha) נשמר ב-case_law.{metadata,halacha}_extraction_requested_at ומרוקן ע"י precedent_process_pending — אבל לא היה כלי לראות את עומק-התור. נוסף: - db.extraction_queue_status() — count + גיל הבקשה הוותיקה לכל kind (read-only). - plib.extraction_status() — tool wrapper (envelope _ok/_err). - רישום extraction_status ב-server.py ליד precedent_process_pending. - precedent_process_pending קיבל _clamp_limit (עקביות עם GAP-53). תוספתי, read-only, אפס שבירה. עודכנו X9 (INV-TOOL4 ✅) ו-gap-audit (GAP-45 ✅). py_compile עבר על 3 קבצי הקוד. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 15:00:25 +00:00
Chaim	034b609bd3	feat(mcp): FU-14 GAP-52 — idempotency על case_create/precedent_attach/document_upload INV-TOOL3 (idempotency על מפתח דטרמיניסטי). כל שלושת הכלים מחזירים את הרשומה הקיימת במקום ליצור כפילות: - case_create — מפתח case_number (כבר UNIQUE ב-schema): מחזיר את התיק הקיים במקום unique-violation. - precedent_attach — מפתח (case_id, section_id, citation, quote): צירוף חוזר של אותו ציטוט לאותו סעיף מחזיר את הקיים. - document_upload — מפתח (case_id, SHA-256 של בייטי הקובץ): העלאה חוזרת של אותו קובץ מחזירה את המסמך הקיים ומדלגת על copy+OCR+embed (החלק היקר). נוספה עמודת documents.content_hash (תוספתי, DEFAULT '') + get_document_by_hash. נבחרה בדיקת-מפתח ברמת-אפליקציה (SELECT-לפני-INSERT) ולא UNIQUE-constraint — כדי לא לשבור startup אם קיימים נתונים-כפולים legacy. אין מיגרציה הרסנית. עודכנו docs/spec/X9 (INV-TOOL3 ✅) ו-gap-audit (GAP-52 ✅, פרוסה 2). py_compile עבר על 4 קבצי הקוד. אימות runtime (restart MCP server) נדחה עד שהחילוץ הפעיל יסתיים. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 14:52:33 +00:00
Chaim	ebfe7f6a1d	feat(mcp): FU-14 פרוסה 1 — get_appraiser_facts (GAP-44) + limit-caps (GAP-53) תוספתי בלבד, אפס שבירת-תאימות. שני invariants מחוזה-כלי-ה-MCP (X9): GAP-44 (INV-TOOL4, סימטריית extract/get): נוסף get_appraiser_facts — ה-get המקביל ל-extract_appraiser_facts. קורא list_appraiser_facts + detect_appraiser_conflicts מה-DB ללא חילוץ-LLM יקר ולא-דטרמיניסטי. מחזיר count=0 (לא שגיאה) אם טרם חולץ. GAP-53 (INV-TOOL5, limit-caps / OWASP API4:2023): נוסף _clamp_limit (תקרה 200, non-positive→max) על ~13 כלי list/search ב-server.py (case_list, search_, precedent_library_list, halachot_pending, missing_precedent_list, list__citations…). list_chair_feedback קיבל param limit חדש (server→workflow→db עם LIMIT) — היה ללא תקרה כלל. לא הוסף get_appraiser_facts ל-frontmatter של סוכנים (INV-AG3 "לא עודף" — ההוראות עוד לא מפנות אליו; חיווט = follow-up). נותר ב-FU-14: GAP-45/48/49/50/51/52. עודכנו docs/spec/X9 (INV-TOOL4/5) ו-gap-audit (סטטוס פרוסה 1). אומת: py_compile על 4 קבצי הקוד. אימות runtime (restart MCP server) נדחה עד שהחילוץ הפעיל של היו"ר יסתיים. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 14:37:30 +00:00
Chaim	4174217179	feat(feedback): סימון "יושם" מפעיל CEO לקיפול הלקח לקובץ הנכון סוגר את לולאת פידבק-יו"ר→ידע-סוכנים. עד כה resolve רק עדכן את ה-DB; עכשיו לחיצה ב-/feedback מעירה את ה-CEO שמקפל את הלקח לקובץ לפי הקטגוריה. - paperclip_client.py: wake_ceo_for_feedback_fold() — יוצר issue ב-Paperclip עם הלקח + rubric ניתוב (style→SKILL.md, wrong_structure→block-schema, אחר→lessons.md), מעיר CEO. משכפל את דפוס wake_for_precedent_extraction - db.py: get_chair_feedback(id) — שליפת הערה בודדת עם case_number/appeal_type - app.py: resolve endpoint מקבל fold (ברירת מחדל true); BackgroundTask fire-and-forget; guard — רק עם lesson_extracted. מחזיר fold_queued - legal-ceo.md: dispatch ל-feedback_fold_ + סעיף "קיפול הערת יו"ר" עם rubric - frontend: useResolveFeedback מקבל fold; /feedback שולח fold=true עם toast; drafts-panel שולח fold=false (bookkeeping per-case, בלי קיפול כפול) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 13:08:41 +00:00
Chaim	68a77c11b6	feat(upload): חסימת כפילות בהעלאת פסיקה + banner עם אפשרויות All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m38s Details - בקאנד: GET לפני ה-async task — אם citation כבר קיים כ-external_upload מחזיר 409 - DB: get_external_case_law_by_citation — lookup לפי citation + source_kind - פרונט: banner אדום עם פרטי הרשומה הקיימת ושני כפתורות: • "הפעל חילוץ מחדש" — request-halachot ל-ID הקיים וסגירת הטופס • "מחק את הרשומה" — DELETE עם confirm, ניקוי conflict לאחר מכן Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-06 12:11:33 +00:00
Chaim	f8c3fd6c89	fix(nevo): strip preamble/mini-ratio from court rulings too (#86.1) strip_nevo_preamble's _DECISION_START only matched ועדת-ערר openings (בפנינו / הערר שבנדון / ...), so Nevo COURT judgments — exactly the ones carrying a מיני-רציו — slipped through unstripped. The editorial mini-ratio then leaked into the chunked body, risking that the halacha extractor reads Nevo's answer key (contamination) and polluting the corpus. Proven on בג"ץ 1764/05: its full_text still contained the מיני-רציו (unstripped). Fix: - Extend _DECISION_START with court-ruling openings: פסק-דין/פסק דין header and the authoring-judge line (השופט/ת, כב' השופט, הנשיא, המשנה לנשיא). re.search picks the earliest line-start match → the real opinion start, not the prose ratio above it. - Widen the Nevo-marker detection window 400→1500 chars so a long court/parties header doesn't push חקיקה שאוזכרה:/מיני-רציו: out of range. Verified on the real 1764/05 full_text: strips 2702 chars, body now starts at 'השופט ס' ג'ובראן:', מיני-רציו gone. Regression: ועדת-ערר openings still strip; non-Nevo text untouched; markers-past-400 now detected. Suite 182 passed (6 new). This is the anti-contamination prerequisite for the Nevo-ratio gold-set (#86.3/#81.7). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 16:55:31 +00:00
Chaim	fb60dca796	feat(halacha): over-extraction consolidation — fold facets via claude_session (#81.5) After a precedent finishes extracting, a claude_session pass folds facets of the SAME legal question (below #82's dedup cosine — the שפר 14-vs-4 / 403-17→89 granularity gap) into one canonical; the rest are marked 'rejected' (reversible: out of the active corpus AND the review queue, but recoverable). FOLD-ONLY — never merges distinct legal questions, never invents. - Engine: claude_session-as-judge (local CLI, zero cost), 'high' effort — folding needs careful judgment. One pass per precedent, runs in _extract_impl once all chunks are done (the prompt dedups within a chunk; this catches across chunks). - Pure, unit-tested helpers in halacha_quality: CONSOLIDATE_SYSTEM, build_consolidation_prompt, parse_fold_groups (fails SAFE → [] on any malformed shape; drops <2-member groups; coerces/dedups indices). - halacha_extractor._consolidate_precedent picks the canonical per group (approved>pending, higher confidence, quote_verified, longer) and rejects the rest via the existing update_halachot_batch (#84). Never rejects a canonical. Fails OPEN on any error (no CLI / parse fail → 0 folds, data untouched). - config: HALACHA_CONSOLIDATE_ENABLED/MODEL/EFFORT. Verified: suite 176 passed (10 new); integration vs dev DB — a 2-facet group folds to 1 canonical + 1 rejected (tagged), distinct rules untouched, claude error → 0 folds (fail-open). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 16:26:44 +00:00
Chaim	f196bed564	feat(halacha): NLI entailment validator via claude_session (#81.3) + task #86 #81.3 — a post-extraction validator that flags halachot whose rule_statement is NOT entailed by its supporting_quote (the model over-reaching beyond its source). - Engine: claude_session-as-judge (local CLI, zero API cost) per chaim's standing preference — one batched judge call per chunk, NOT a hosted NLI model. - Pure, unit-tested helpers in halacha_quality: NLI_SYSTEM, build_nli_prompt, parse_nli_verdicts (fails OPEN — any shape/label ambiguity → 'entailed'). - halacha_extractor._nli_check wraps the call; fails OPEN on any error (e.g. no CLI in the container) so a flaky judge never blocks a genuine halacha. - Non-entailed (neutral/contradiction) → quality_flag 'nli_unsupported' which blocks auto-approve (routes to pending_review) via the existing store gate. - config: HALACHA_NLI_ENABLED/MODEL/EFFORT (effort 'low' — entailment is simple). Verified: suite 166 passed (10 new); LIVE smoke test against the real claude CLI returned ['entailed','neutral'] for a supported vs unsupported rule. Also commits TaskMaster #86 (Nevo preamble/ratio: anti-contamination strip fix + gold-set benchmark) capturing today's strip_nevo_preamble findings. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 14:46:12 +00:00
Chaim	476c2fc5d1	feat(upload): accept legacy .doc, convert via LibreOffice in container Legacy Hebrew .doc precedents (e.g. nevo.co.il CP1255 OLE2) can now be uploaded directly through the precedent-library, missing-precedent, and training upload paths — the frontend already advertised .doc but the backend gate rejected it before reaching the extractor. - web/app.py: add .doc to ALLOWED_EXTENSIONS (covers all paths that share the set: precedent library, missing-precedent, training). - Dockerfile: install libreoffice-writer-nogui (no X11/Java) so the extractor's existing _extract_doc LibreOffice conversion works in the Coolify container (was missing → would fail at runtime). - extractor.py: isolate the LibreOffice user profile per call to avoid a profile-lock failure on concurrent .doc conversions. Verified in python:3.12-slim (prod base): .doc→.docx→text yields text byte-identical to a native Word .docx save (103 paragraphs, 24,341 chars). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 13:47:47 +00:00
Chaim	eeb70a5758	feat(halacha): review-queue triage — defer + batch group actions + quality-flag badges (#84 ) Make the chair's pending-halacha review faster and less exhausting. Backend: - New 'deferred' review_status (snooze): stays out of the active library AND out of the default pending queue, without the finality of 'rejected'. update_halacha stamps reviewer+reviewed_at on defer; HALACHA_REVIEW_STATUSES is the single source of valid statuses (PATCH validation now uses it). - db.update_halachot_batch(ids, status, reviewer) — one atomic UPDATE for a whole group; invalid status / empty ids are a no-op. - POST /api/halachot/batch (HalachaBatchReviewRequest) wraps it. - update_halacha now RETURNs quality_flags too (parity with list_halachot). Frontend (halacha-review-panel): - Quality-flag badges (#81: non_decision / truncated_quote / thin_restatement / quote_unverified) so the chair sees WHY an item was held back. - Defer action — button + keyboard 'D' — to snooze without rejecting (fixes the 'leave in pending forever' anti-pattern; reject stays the junk verb). - Per-precedent batch bar: 'אשר הכל' / 'דחה הכל' via useBatchReviewHalachot (one request, one refetch) with confirm guards. - Halacha/HalachaPatch types gain quality_flags + 'deferred'. Verified: mcp-server suite 156 passed; web build green; end-to-end integration against dev DB (batch approve/reject, defer sets status+timestamp, pending excludes approved+deferred, deferred queryable, invalid status no-op). Note: api:types regen deferred until deploy (the batch hook is hand-typed, not dependent on generated types). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 13:42:21 +00:00
Chaim	0f64b4c062	feat(halacha): UNIQUE(case_law_id, halacha_index) backstop + task tracking (#83 ) #83 pipeline robustness — the index-numbering correctness guarantee: - Add CREATE UNIQUE INDEX idx_halachot_unique_index ON halachot(case_law_id, halacha_index). The extractor assigns the index as MAX+1 under an in-process store-lock + a cross-process pg advisory lock, so collisions shouldn't occur in normal operation — but per the research (FireHydrant/OneUptime) the constraint is the actual correctness guarantee while the lock is the optimization. A racing/double run now fails LOUDLY (UniqueViolation, chunk left un-checkpointed → clean resume) instead of silently appending the duplicates that were the 2026-05/06 over-extraction root cause. Data prep (run against the live DB before the constraint, backed up to data/audit/halacha-reindex-backup-*.sql): the 6 precedents that still carried colliding halacha_index values (9 groups, distinct principles that shared a number — NOT content dups) were renumbered to unique sequential indices. Verified: advisory lock holds cross-process and the DB path is direct asyncpg (no transaction-pooler), so the session lock is safe (83.1); force=True does delete+checkpoint-clear in one transaction (83.5); constraint rejects a duplicate-index insert (integration-checked). Full suite 156 passed. Also commits the TaskMaster tracking for the whole halacha-quality initiative (#81-#84 + research-backed subtasks, statuses). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 13:06:58 +00:00
Chaim	ca959d4a9c	feat(halacha): strict-rubric quality gate + dedup-on-insert (#81,#82) Bake the 2026-06-03 strict-cleanup rubric into the extraction pipeline so the corpus stays clean at the source instead of accumulating duplicates, obiter dicta, truncated quotes and thin restatements that clog the review queue. #81 — quality gate: - New pure module halacha_quality.py with unit-tested validators: non-decision/obiter (Wambaugh markers), truncated-quote (mid-word cut), thin-restatement (rule≈quote), quote-unverified. - Validators run in halacha_extractor._process; a non-decision is re-typed obiter; flags persist in new halachot.quality_flags column. - Auto-approve now requires confidence>=threshold AND no quality flags; flagged items route to pending_review regardless of confidence. - Both extraction prompts hardened: reject undecided dicta, exclude case-specific applications, require abstraction, forbid over-splitting. #82 — dedup-on-insert (store_halachot_for_chunk): - Within the same precedent, skip a halacha whose normalized supporting_quote already exists, or whose rule-embedding has cosine>=HALACHA_DEDUP_COSINE (0.93) against an already-stored one. Makes re-runs idempotent. Migration: halachot.quality_flags TEXT[] (additive, idempotent ALTER). Tests: 19 new unit tests; full suite 156 passed. Validated end-to-end against dev DB (dedup skips dups, flag blocks auto-approve, re-run inserts 0). Calibration: flags fire on only ~10% of current survivors (low false-positive). Spec: docs/halacha-strict-rubric.md Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 12:30:38 +00:00
Chaim	6fcfdc76db	fix(#79 ): chunker never emits sub-50-char fragment chunks (#55 follow-up) A section that opens with a short header line ('דיון', 'טענות המשיבים') followed by a paragraph larger than chunk_size flushed the header alone as a tiny chunk. #55 added a query-time >=50 filter to hide these; this removes them at the source. _split_section: (1) don't flush a buffer still below MIN_CHUNK_CHARS — let it absorb the next paragraph even if that overflows chunk_size, so a short header rides with its following content; (2) fold a trailing tiny chunk back into its predecessor. Verified: re-chunked the 4 corpus docs that still had a tiny chunk (ע"א 5138/04, בר"מ 2340/02, בג"ץ 6525/15, 403-17) — corpus-wide chunks<50 went 4 -> 0; all 4 stay embedded/searchable and rank top in a relevant search (נווה שלום #1 for the s.19(ג)(1) exemption query). No regression. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 08:10:10 +00:00
Chaim	f46bf47d5b	feat(web-ui): expose citation-corroboration badge on halachot (X11) - db.list_halachot: aggregate corroboration_count (distinct positive sources) + corroboration_negative from halacha_citation_corroboration (LEFT JOIN) - web-ui: CorroborationBadge — 'מתוקף · N ציטוטים' at ≥2 (gold), soft single citation, danger badge on negative treatment; native title tooltips - shown in ExtractedHalachotSection (per-precedent) + halacha review panel Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 05:04:31 +00:00
Chaim	ed547e20ad	feat(corroboration): wire approval gate + backfill driver + rebuild tool (X11 Phase 2) - db: approve_halacha_by_corroboration (pending_review→approved only), demote_halacha_overruled (approved→pending_review only), list_corroboration_grouped, precedents_with_halachot_and_incoming_citations - corroboration: reconcile_approvals (INV-COR2/COR4/COR5), build_all backfill; build_for_precedent now returns approved/demoted counts - mcp: corroboration_rebuild write tool (single precedent or full-corpus backfill) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 04:35:37 +00:00
Chaim	df007784c9	feat(corroboration): approval_action decision fn + kill-switch (INV-COR2/COR4, X11 Phase 2) - HALACHA_CORROBORATION_AUTO_APPROVE config (default ON, Dafna validated 2026-06-01) - approval_action(agg, has_overruled): overruled→demote, corroborated→approve, else None - 4 offline unit tests; Phase 2 plan + TaskMaster #75 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 04:34:23 +00:00
Chaim	885cba543e	feat(halacha): lighter effort for BULK queue-drain extraction (speed at scale) xhigh is the quality sweet-spot for a single precedent but very slow at scale (64-chunk case ≈ 20 min). Bulk queue-drains (process_pending over many precedents) now use a lighter effort to cut wall-clock; interactive single re-extraction keeps xhigh quality. - config.HALACHA_BULK_EXTRACT_EFFORT (env, default 'high'; set 'medium' for max speed, 'xhigh' to match single). - extract()/_extract_impl()/_extract_chunk() take an `effort` override threaded to claude_session.query_json; None falls back to HALACHA_EXTRACT_EFFORT (xhigh). - process_pending_extractions(kind='halacha') passes the bulk effort; single reextract_halachot keeps xhigh. Verified end-to-end (mocked LLM): _extract_chunk(effort='medium') → query_json effort='medium'; effort=None → 'xhigh' fallback. Closes the open item in #72. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 21:34:13 +00:00
Chaim	8e4ea23882	feat(halacha): crash-safe incremental extraction + resume (A + resume) Halacha extraction held ALL chunk results in memory and stored once at the very end — a crash/interrupt mid-run (e.g. the 2026-05-31 freeze) lost everything and re-paid the full LLM cost on retry. Now each chunk's halachot are stored AND the chunk is checkpointed (precedent_chunks.halacha_extracted_at) the moment it finishes: - V25 schema: precedent_chunks.halacha_extracted_at (per-chunk checkpoint). - db.store_halachot_for_chunk: atomic per-chunk insert (halacha_index continues from MAX, caller serializes via an in-process store-lock) + checkpoint mark. - db.reset_halacha_extraction (force) / mark_all_chunks_extracted (legacy backfill). - _extract_impl rewritten: resume by default (skip checkpointed chunks; failed chunks stay pending and are retried; status stays 'processing' until all done); force=True wipes + redoes all. reextract_halachot passes force=True; the queue drain (process_pending) resumes by default. - Legacy guard: a pre-V25 precedent (halachot exist, no checkpoints) is backfilled and treated as complete — never re-extracted (would duplicate). Verified on 9002-24 (55 halachot, legacy): resume → legacy-backfill, NO duplication (stays 55), all chunks checkpointed. Index continuation: store at 55,56 after max 54, no collision. Tracks #72. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 21:27:46 +00:00
Chaim	807053ec54	fix(halacha): global advisory lock — one extraction at a time (prevents box freeze) 2026-05-31: opus-4-8 @ xhigh extraction + overlapping driver processes (agent fallback retries each spawn an independent `python -c` driver; process_pending is serial WITHIN a process but the box ran 4-5 drivers in parallel) → 12-16 concurrent xhigh `claude -p` procs → load 69 → hard reboot. Fix: halacha_extractor.extract() now takes a Postgres advisory lock (pg_try_advisory_lock, key 'HALA') before any work. If another extraction (any process/agent/driver — all share the legal-ai DB) holds it, the call returns status='busy' and the precedent stays pending for the next drain. Guarantees ONE extraction at a time ACROSS PROCESSES — an in-process Semaphore cannot (drivers are separate OS processes). Core logic moved to _extract_impl (unchanged) under the lock. CHUNK_CONCURRENCY now env-tunable (HALACHA_CHUNK_CONCURRENCY, default 3). Verified: while a lock is held, extract() returns 'busy' with no LLM call; lock releases cleanly and the next extraction proceeds. Tracks #72. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 20:42:15 +00:00
Chaim	5abfbd2746	feat(mcp): halacha_corroboration read-only tool (INV-COR6, X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 19:07:37 +00:00
Chaim	b57e590275	feat(corroboration): orchestrator + persistence over both citation graphs (X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 19:04:20 +00:00

1 2 3 4 5

212 Commits