legal-ai

Author	SHA1	Message	Date
Chaim	5f93c7492f	fix(halacha): #81.7 - report Gwet AC1 + consensus-vs-human (κ paradox under skew) The live panel run exposed Fleiss κ=-0.07 despite 97.5% coarse agreement (28/40 unanimous, 11/40 majority). This is not unreliability but the kappa paradox: the is_holding marginal is extremely skewed (≈all True, like the 93/100 keep in the human labels), and as Pe→1, κ→0 (Feinstein & Cicchetti 1990, "high agreement, low kappa"). - gwet_ac1(): a prevalence-robust agreement measure (Gwet 2008): same Pa as Fleiss, different chance estimate (2·p·(1-p)). It becomes the headline; Fleiss κ is still reported for transparency, plus raw 3/3. - consensus-vs-HUMAN: when a chair label exists, the report measures the consensus's agreement with it (external validity). Actual validation against the 100 chair labels: 29/29 = 100% agreement. invariants: no change to write behavior; metric only. tests: 21 (3 new, including an explicit paradox case). Source: Gwet 2008 (AC1) · Feinstein & Cicchetti 1990. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-11 16:13:24 +00:00
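The chance-correction difference described in commit 5f93c7492f can be sketched as follows. This is a minimal illustration for a binary label, not the repository's gwet_ac1(): the function name `agreement_stats`, its input shape (a count of True votes per item), and the sample data are assumptions made for the example.

```python
from typing import Sequence


def agreement_stats(true_votes: Sequence[int], n_raters: int = 3) -> dict[str, float]:
    """Fleiss kappa and Gwet AC1 for a binary label (hypothetical helper).

    true_votes[i] is the number of raters (0..n_raters) who voted True on item i.
    """
    n = n_raters
    n_items = len(true_votes)
    if n < 2 or n_items == 0:
        raise ValueError("need at least 2 raters and 1 item")

    # Observed agreement Pa: share of agreeing rater pairs, averaged over items.
    pa = sum(
        (t * (t - 1) + (n - t) * (n - t - 1)) / (n * (n - 1)) for t in true_votes
    ) / n_items

    # Overall prevalence of the True category across all votes.
    p = sum(true_votes) / (n_items * n)

    pe_fleiss = p * p + (1 - p) * (1 - p)  # chance agreement under Fleiss
    pe_gwet = 2 * p * (1 - p)              # chance agreement under Gwet AC1 (2 categories)

    # Fleiss kappa is undefined when every vote falls in one category (Pe = 1).
    kappa = (pa - pe_fleiss) / (1 - pe_fleiss) if pe_fleiss < 1 else float("nan")
    ac1 = (pa - pe_gwet) / (1 - pe_gwet)
    return {"pa": pa, "fleiss_kappa": kappa, "gwet_ac1": ac1}


# Illustrative, made-up panel: 36 unanimous True items and 4 items split 2-1.
stats = agreement_stats([3] * 36 + [2] * 4)
```

On this made-up skewed panel, raw agreement is about 93%, Fleiss κ comes out slightly negative, and AC1 stays above 0.9, which is the pattern the commit attributes to the kappa paradox.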
chaim	e6c6237ef6	Merge pull request 'feat(halacha): #81.7 - gold-set labeling by tri-model consensus (Opus+DeepSeek+Gemini), κ + anonymization' (#188) from worktree-goldset-tri-model-consensus into main	2026-06-11 16:04:04 +00:00
Chaim	5b001bbd9d	feat(halacha): #81.7 - gold-set labeled by tri-model consensus (Opus+DeepSeek+Gemini) Removes the man-in-the-loop from gold-set labeling (chair directive, 2026-06-11): instead of manual labeling by Chaim/Daphna, the benchmark is set by the consensus of three independent model lineages, the same panel the live approval system already uses (halacha_panel_approve), with 92% cross-model agreement on the coarse axis. Why this is not circular: the validators measured in #81.8 (compute_quality_flags / is_fact_dependent / is_quote_truncated / is_thin_restatement) are rule-based heuristics, a different method family from the LLM judges. Two integrity guards: (1) a split vote (no 2/3 majority) writes no label; the item stays NULL and is escalated to the chair (INV-G10); (2) an anonymization test: re-judging with the case identifier masked, where a flip in the consensus indicates memorization rather than reasoning (arXiv:2505.02172). - db.py: per-lineage columns (ds_/gm_; ai_*=claude already exists) + consensus/agreement/anon + goldset_set_panel_label(), which writes the 2/3 majority to is_holding/correct_type (tagged_by='panel:…', does not overwrite tagged_by='chair'). goldset_score is unchanged; it reads is_holding (G2, no parallel scoring path). Updated the schema comment (the "MUST be human" requirement was dropped). - scripts/goldset_panel_label.py: 3 judges (imported from halacha_panel_approve, single source of truth) + rich prompt (imported from goldset_ai_recommend) + Fleiss κ + anonymization test. Report→data/audit/. - SCRIPTS.md: new script; goldset_ai_recommend/independent_judge marked as single-model, superseded. invariants: G2 (judges+prompt imported, no duplication; single scoring) · INV-G10 (split→chair) · INV-LRN2/LRN3 (quality-at-source, structured capture). Source: PoLL · Trust-or-Escalate (ICLR 2025) · arXiv:2505.02172. tests: 18 offline (consensus/type/Fleiss-κ/anonymize). live labeling = an operational step after deploy. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-11 16:03:32 +00:00
Chaim	3c169a76f2	feat(halacha): rhetorical-role pre-filter - fallback excludes facts/arguments (#81.6) Halacha extraction is restricted to reasoning/ruling sections only (INV-LRN2 quality-at-source). The gap closed: the fallback path (when the chunker tagged no section as extractable; non-standard headings → everything 'other') previously fell back to all chunks, returning exactly the sections the primary filter excludes (factual background + the parties' arguments). Facts↔Reasoning confusion is the dominant error class (LegalSeg), so feeding facts into extraction directly hurts precision. - NON_REASONING_SECTIONS = (facts, appellant_claims, respondent_claims, intro) - _select_extractable_chunks(): centralizes the selection policy (primary + fallback) in one function used both for the primary selection and for the status-determining re-read (G2: single source of truth, no parallel path). The fallback excludes NON_REASONING_SECTIONS and still reaches reasoning that landed under 'other'. invariants: G1 (normalize at source, not fix on read) · G2 (no parallel path) · INV-LRN2 (quality-at-source). tests: 4 new (primary/fallback-excludes-args/all-nonreasoning/disjoint-sets) + 61 existing halacha tests pass. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-11 15:52:13 +00:00
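The selection policy described in commit 3c169a76f2 can be sketched as a single function with a primary and a fallback branch. This is an assumption-laden illustration, not the repository's _select_extractable_chunks(): the `Chunk` type and the names in `EXTRACTABLE_SECTIONS` are hypothetical; only NON_REASONING_SECTIONS is taken from the commit message.

```python
from dataclasses import dataclass
from typing import Sequence

# Taken from the commit message.
NON_REASONING_SECTIONS = ("facts", "appellant_claims", "respondent_claims", "intro")
# Hypothetical: the section tags that the primary filter accepts.
EXTRACTABLE_SECTIONS = ("reasoning", "ruling")


@dataclass(frozen=True)
class Chunk:
    """Hypothetical chunk record: a rhetorical-role tag plus its text."""
    section: str
    text: str


def select_extractable_chunks(chunks: Sequence[Chunk]) -> list[Chunk]:
    """Primary: chunks tagged as reasoning/ruling.

    Fallback (nothing tagged as extractable): every chunk except the known
    non-reasoning sections, so reasoning that landed under 'other' is kept.
    """
    primary = [c for c in chunks if c.section in EXTRACTABLE_SECTIONS]
    if primary:
        return primary
    return [c for c in chunks if c.section not in NON_REASONING_SECTIONS]
```

Keeping both branches in one function means any caller that re-reads the chunks to determine status applies the same policy as the extraction itself.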
Chaim	64db643e6d	fix(writer): disable tools on block_writer + style_analyzer claude_session calls Follow-up to #182: the two remaining sites with query() for text generation/analysis, which kept the CLI default (all tools enabled) and were therefore exposed to the same error_max_turns: the model emits stop_reason:"tool_use", trips --max-turns 1, and forces an expensive retry. - block_writer.py:413: writes block prose (Opus/Sonnet). Pure text generation; never needs a tool. - style_analyzer.py:166/183/196: single/multi-pass + synthesis; the output is parsed as JSON (_parse_and_store_patterns/_extract_json). Pure text→JSON. Aligns these last two with the same canonical path (claude_session.query(tools="")). Now every LLM call that needs no tools passes tools="". Invariants: satisfies INV-G2 (single canonical path; symmetry). No silent swallowing (§6). No spec change. Tests: py_compile clean; 18 tests (block/style/writer) pass. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-11 12:03:37 +00:00
Chaim	d05c1e3fce	fix(extractors): disable tools on text→JSON claude_session calls (no error_max_turns) All text→JSON calls in the 9 extractors used the CLI default (all tools enabled). The model occasionally emitted stop_reason:"tool_use", which turns --max-turns 1 into error_max_turns and forces a retry: ~$0.12-0.16 per attempt, × 3. Observed in the halacha-extraction drain (legal-halacha-drain, 15 error_max_turns failures in error.log). The infrastructure already exists: claude_session.query accepts tools="" to disable all tools, and two extractors (digest_metadata_extractor, bulletin_splitter) already use it. This change only aligns the remaining extractors with the same canonical path; no pure extraction/judging/classification call needs a tool. Fixed (11 calls, 9 files): halacha_extractor (×3: extract/NLI/consolidate), corroboration, claims_extractor, argument_aggregator, appraiser_facts_extractor, learning_loop, qa_validator, brainstorm, style_metadata_extractor. Invariants: satisfies INV-G2 (single canonical path; symmetry between sibling extractors): not a new parallel path but consistent use of the existing parameter. No silent swallowing (§6): the failure/retry paths are preserved. No spec change. Tests: 60/60 in tests/test_halacha_coerce.py + test_halacha_quality.py pass; py_compile clean on all 9 files. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-11 11:49:35 +00:00
chaim	f5650196b7	Merge pull request 'feat(pipeline): durability (LangGraph) for final_halacha (P0, X16/INV-DUR1, #114)' (#178) from worktree-langgraph-durable-pipeline into main	2026-06-10 09:53:07 +00:00
Chaim	e7d8b24d7c	feat(pipeline): durable execution for final_halacha via LangGraph (P0, X16/INV-DUR1, #114) scripts/_pipeline_runtime.py: a shared durability runtime that wraps an async step list in a linear LangGraph StateGraph with AsyncSqliteSaver (a checkpoint per step). After a crash/OOM, the run resumes from the failed step instead of rerunning everything. Graceful degradation: without langgraph → a linear run as before (the button does not break). One implementation for both pipelines (G2). final_halacha_pipeline.py: the 4 steps ([0]extract [1]citations [2]corroboration [3]panel) run through the runtime. Same CLI + --fresh (default: auto-resume). A stable thread per case; dry-run = a separate preview (always fresh). A crash in the panel [3] → resume from [3] (steps 0-2 are saved). pyproject: extra "durable" (langgraph + langgraph-checkpoint-sqlite), host-only, optional. data/checkpoints/ in .gitignore. Boundary (X16 §1): LangGraph serves only as the script's internal engine, not as an orchestrator (not a parallel path to Paperclip; G2/G12). #108 (atomic extract) preceded this as a prerequisite. Verification: test_pipeline_runtime.py: with langgraph (temporary venv): 3 passed (resume skips completed steps · fresh reruns · linear). Without langgraph (shared venv): 1 passed + 2 skipped (degradation). final_halacha compiles and imports cleanly in both modes. End-to-end run on the live pipeline (DB+LLM): after `pip install -e ".[durable]"` in the main tree. Invariants: INV-DUR1 (durability), G2 (single runtime), G3 (idempotency strengthened). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-10 09:52:35 +00:00
Chaim	d2b622f28e	feat(ci): G12 leak-guard - enforce the Agent Platform Port seam (R4, #113) The automated enforcer for INV-G12 (docs/spec/X15 §4). Two hard rules: 1. mcp-server/src (the intelligence layer) contains no Paperclip symbols; a justified substring allowlist covers the 6 legitimate references (pm2-bridge + company_id source comments). 2. import seam: only web/agent_platform_port.py (+ the wrapper files) import paperclip_*. One canonical implementation (scripts/leak_guard.py, stdlib-only), shared by three enforcers (G2): • CI hard gate: .gitea/workflows/leak-guard.yaml (pull_request + push→main) • pytest: mcp-server/tests/test_platform_port_leak_guard.py (includes a self-test verifying that the guard catches an injection, so it will not rot) • real-time hook: spec-guard.sh checks the content being written (new_string/content) on writes to mcp-server/src and warns about Paperclip injection (not deduped); the spec reminder was updated to G1–G12. Excludes generated files (web-ui types.ts) and the declared wrapper; the frontend is outside the intelligence scope (finding R3). Updated scripts/SCRIPTS.md. Verification: clean scan exit 0; injecting pc.sh into mcp-server → exit 1; seam violation in web → exit 1; the hook warns on mcp-server and issues the spec reminder on web; pytest 3 passed; bash -n + YAML valid. Invariants: G12 (enforcement), G2 (a single enforcer for three consumers). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-10 09:40:42 +00:00
Chaim	26e0219219	fix(halacha): re-extraction preserves chair-approved halachot (INV-G10, #108) Data-loss fix: reset_halacha_extraction performed an unconditional DELETE before re-extraction; a crash between the delete and the first store wiped all chair approvals and left the record stuck at status='processing' with 0 rows (the עמיאל 8126-03-25 incident, 2026-06-08). The delete now exempts review_status IN ('approved','published'), so a human approval is never silently deleted (INV-G10). The dedup-on-insert in store_halachot_for_chunk skips a fresh extraction that duplicates a preserved approved halacha, so there are no duplicates. reset returns {deleted, preserved}, and the extractor logs how many approved halachot were preserved (provenance, G9). Full durability against process death (OOM) is left to X16/#114 (durable resume); this is a prerequisite. Test: test_halacha_reextract_preserves_approved.py (offline SQL-capture) verifies that the DELETE exempts approved/published; 64 existing halacha tests pass. Invariants: G10 (chair gate: an approval is not deleted), G1 (fix at source), G9 (provenance). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-10 09:08:16 +00:00
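The conditional delete described in commit 26e0219219 can be sketched against an in-memory SQLite database. This is a stand-in under stated assumptions, not the repository's reset_halacha_extraction: the table name `halachot`, the `case_law_id` column, the function names, and the use of SQLite are hypothetical; only the review_status values and the {deleted, preserved} return shape come from the commit message.

```python
import sqlite3


def reset_extraction(conn: sqlite3.Connection, case_law_id: int) -> dict[str, int]:
    """Delete re-extractable rows for one case, keeping human-approved ones."""
    with conn:  # one transaction: commit on success, roll back on error
        preserved = conn.execute(
            "SELECT COUNT(*) FROM halachot "
            "WHERE case_law_id = ? AND review_status IN ('approved', 'published')",
            (case_law_id,),
        ).fetchone()[0]
        # NULL must be handled explicitly: `NULL NOT IN (...)` is not true in SQL,
        # so rows without a status would otherwise survive the delete.
        deleted = conn.execute(
            "DELETE FROM halachot "
            "WHERE case_law_id = ? "
            "AND (review_status IS NULL "
            "     OR review_status NOT IN ('approved', 'published'))",
            (case_law_id,),
        ).rowcount
    return {"deleted": deleted, "preserved": preserved}


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny hypothetical table to exercise reset_extraction."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE halachot ("
        "id INTEGER PRIMARY KEY, case_law_id INTEGER, review_status TEXT)"
    )
    conn.executemany(
        "INSERT INTO halachot (case_law_id, review_status) VALUES (?, ?)",
        [(1, "approved"), (1, "published"), (1, "pending_review"),
         (1, None), (2, "pending_review")],
    )
    conn.commit()
    return conn
```

Because approved rows are never part of the delete, a crash right after it can no longer lose a chair approval; only unreviewed rows need to be regenerated.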
Chaim	d4d2ab4d68	feat(arguments): raw-propositions popup on clicking "מסתמך על N" ("relies on N") The argument↔propositions link was already stored in the DB (legal_argument_propositions), but the UI showed only the count. Enriches get_legal_arguments in the same round-trip (JOIN to claims) to return supporting_propositions = {id, text, source_document}, and wraps the "מסתמך על N פרופוזיציות" ("relies on N propositions") line in a Popover that shows the raw claims verbatim with their source. Transparency and traceability from the aggregated argument back to the source claims. - supporting_claims stays id-only (backward compatibility: counter, MCP consumers) - supporting_propositions is a new optional field; falls back to static text when missing - no parallel path (G2): an enrichment of the same endpoint; normalization at source (G1) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-09 06:51:09 +00:00
Chaim	d23f854c25	fix(ops): self-restart/stop of the host bridge returns 200 (detached) Restarting/stopping legal-court-fetch-service from its own /pm2/control kills the process before it can reply — the client got a misleading 502 even though pm2 performed the restart. Detach the self-action (sleep 1; pm2 ...) so the HTTP response flushes first, and report success optimistically. Other targets are unchanged. Own name via COURT_FETCH_SERVICE_PM2_NAME (default legal-court-fetch-service). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 09:09:08 +00:00
Chaim	638eef6803	feat(ops): /operations - uniform queue counters, "what's running now", and process management The page displayed the queues inconsistently (raw by_status), with no distinction between "waiting" (backlog: status=pending) and "queued" (the active queue: requested_at IS NOT NULL), without showing the item currently running, and with no control over processes. What was added: 1. Uniform queue cards: queued / waiting (backlog) / processing / completed / failed + "running now" (citation/case_number of the item in processing) for each drain (case-law fetch, metadata, halachot, digests). Human gates (halacha approval, missing precedents) remain status counters. 2. A process-management panel in the style of "Windows Services": - Daemon (court-fetch-service/xvfb/chat/reaper): restart / stop / start. - cron drain: "run now" (pm2 restart) + a toggle to enable/disable the schedule. 3. All status badges are translated into Hebrew. Mechanism: - Enable/disable schedule = a flag in the DB (table drain_controls). pm2 cron_restart revives a process stopped with stop, so the reliable "off" is a flag that every drain checks at startup (immediate no-op when disabled). The container reads/writes the DB directly. - Run-now + restart/stop/start = a proxy to pm2 through a new endpoint on the host bridge (court_fetch_service /pm2/control), secured with Bearer + a whitelist of legal-* only. - Digests: drain_digests was moved from crontab to pm2 (legal-digest-drain.config.cjs) so that it appears and is controllable like every drain. drain_halacha_queue.py was brought under version control. Invariants: satisfies G2 (extends /operations + the existing bridge, not a parallel path) and G1 (drain_controls = the single source of truth for disabling; normalization at source, not fix on read). No silent error swallowing (the bridge returns {ok,error}; mutations show a toast). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 08:57:23 +00:00
Chaim	b2ea0c28dd	feat(storage): X14 Phase 2c - route remaining sync write-sites through storage.py Completes the write-side rewiring (INV-STG1) for the call-sites that run in synchronous contexts, via a new blocking facade in storage.py (put_bytes_sync / put_file_sync: asyncio.run, or a worker thread when a loop is already running): - services/extractor.py: multimodal thumbnail JPEGs → DERIVED (rendered in a to_thread worker) - services/docx_reviser.py: track-changes save (_save_docx_xml) + empty-diff copy (copy_with_revisions) → DOCUMENTS - services/docx_retrofit.py: in-place retrofit backup → DOCUMENTS Each site keeps a fallback to a direct disk write when the target path is outside DATA_DIR (caller-provided). Under the default STORAGE_BACKEND=filesystem the bytes land exactly where they did before: zero behaviour change. Also: mcp_env_catalog MINIO_ENDPOINT default updated to the durable container-name endpoint (http://minio-bx2ykvw94xbutsex41hz4vv8:9000), matching the Coolify "Connect to Predefined Network" change made for network durability. All binary write-sites now flow through storage.py. git-tracked text (case.json/notes/research-md/draft-md) stays on disk by design (INV-STG7); court-fetch temp files are ephemeral. tests: +2 (thumbnail renderer routes through storage; put_bytes_sync round-trip); 55 storage/docx/track-changes green; 244 collected, no import breakage. Keeps G2; completes INV-STG1 write coverage. Spec: docs/spec/X14-storage-minio.md. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 08:26:09 +00:00
Chaim	5745d36bb4	feat(digests-ui): publication filter + 'מאמר'/source badges for bulletins Completes #154 on the client side: - A "מקור" (source) filter on the /digests page (כל המקורות / כל יום / עו"ד על נדל"ן); backend: list_digests + /api/digests accept publication. - DigestCard: a "מאמר" (article) badge for digest_kind='article', and a source chip for any publication other than 'כל יום'. build (webpack) passes, lint clean. digests = hand-written types (no api:types). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 08:14:23 +00:00
chaim	05e8373d22	Merge pull request 'feat(bulletins): catalog monthly "עו"ד על נדל"ן" bulletins into the radar (X12)' (#154) from worktree-bulletins-catalog into main	2026-06-08 08:08:10 +00:00
Chaim	85f94a4f3f	feat(bulletins): catalog monthly "עו"ד על נדל"ן" bulletins into the radar (X12) A monthly multi-topic bulletin (a publication separate from the daily digest) → split into N digest rows in the same table (publication='עו"ד על נדל"ן', not a parallel corpus, G2): - bulletin_splitter (LLM local-only, tools=""): splits into cases[]+articles[]; legislative updates are skipped (chair decision). - bulletin_library.ingest_bulletin: each case-law pointer → digest_kind='decision' + embedding + autolink (including X13 court-fetch); each article → digest_kind='article' (full text + embedding, background only; INV-DIG1 applies). - The per-item content_hash is the dedup key (yomon_number is empty) → idempotent. - db.create_digest: digest_kind parameter (flows into INSERT + upsert). - scripts/ingest_bulletins.py (host, venv) for processing the archive. - spec X12 §2.1. Verified (dry-run, no DB): bulletin 180 → 4 cases+1 article · bulletin 201 → 4 cases (including ערר-197) +1 article. Legislative updates were skipped. claude_session stays local-only. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 08:07:45 +00:00
Chaim	1f42a39ce4	feat(storage): X14 Phase 2b — route extracted-text + async DOCX exports through storage.py Continue the write-site rewiring onto the unified storage layer (INV-STG1): - services/processor.py: extracted-text .txt → DERIVED bucket (a derived artifact; the DB column is the source of truth per INV-STG5, so the write stays non-fatal) - services/docx_exporter.py (export_decision): DOCX → DOCUMENTS bucket via BytesIO → put_bytes, with a fallback to a direct disk write when the caller passes an output_path outside DATA_DIR - services/analysis_docx_exporter.py (build_analysis_docx): same pattern; out_path is always under DATA_DIR Under the default STORAGE_BACKEND=filesystem the bytes land at the exact legacy path (put_bytes → DATA_DIR/key), so behaviour is unchanged. The disk-reading bits that must stay for now (export_dir glob in _next_version) are kept; storage-native versioning is a cutover concern. Still on disk (sync call-sites, follow-up Phase 2c): docx_reviser (track-changes), docx_retrofit backup, and multimodal thumbnails (rendered in a to_thread). git-tracked text (case.json/notes/research-md/draft-md) stays on disk by design (INV-STG7). tests: 38 storage + docx tests green (incl. test_export_qa_gate / test_docx_exporter_bookmarks which exercise the real export path); 242 collected, no import breakage. Keeps G2; advances INV-STG1. Spec: docs/spec/X14-storage-minio.md. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 08:05:25 +00:00
Chaim	1986fe3b14	feat(storage): X14 Phase 2a — route source-document writes through storage.py Rewire the source-document staging writes onto the unified storage layer (INV-STG1), replacing direct shutil.copy2 calls: - tools/documents.py: case originals + training-corpus uploads - services/ingest.py: _stage_file (now async) — covers precedent-library, internal-decisions, and digests (the canonical intake helper) - services/digest_library.py: awaits the now-async _stage_file Each write goes through storage.put_file(..., bucket=DOCUMENTS) with the DATA_DIR-relative key; the Hebrew original filename rides as object metadata (INV-STG2), content-type is guessed from the extension. DB path columns are unchanged (still the absolute dest) — object_key backfill is Phase 3. Under the default STORAGE_BACKEND=filesystem the bytes land at the exact legacy on-disk location (put_file → shutil.copy2 to DATA_DIR/key), so this is zero behaviour change in prod. shutil import dropped where now unused. tests: +2 staging regression tests (file lands under DATA_DIR at the legacy path); 20 storage + 22 ingest tests green; 242 collected with no import breakage. Derived/export write sites (thumbnails, extracted text, DOCX exports) are Phase 2b. Keeps G2; advances INV-STG1. Spec: docs/spec/X14-storage-minio.md. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 08:00:27 +00:00
Chaim	b4a28f072d	feat(storage): X14 Phase 1 — unified storage layer (services/storage.py) The single choke-point for all binary file I/O (originals, derived artifacts, exports), replacing the scattered open()/shutil/Path.write_bytes calls across ~8 services. Backend chosen by STORAGE_BACKEND: - filesystem (default): disk under DATA_DIR — byte-for-byte legacy behaviour - dual: write disk + S3, read S3→disk fallback (migration window) - s3: MinIO via aioboto3 (lazy import; absent in the filesystem path) Keys are DATA_DIR-relative POSIX paths; the FS backend ignores the logical bucket and keeps the existing single tree, so the default backend is zero behaviour change. S3 maps a governance bucket (documents/immutable/derived) → MinIO bucket; presigned URLs are minted against the public endpoint (browser-reachable) and carry the Hebrew filename via RFC-5987 Content-Disposition. - config: STORAGE_BACKEND + MINIO_* (endpoint, public-endpoint, creds, region, 3 bucket names, presign TTL) - mcp_env_catalog: new "storage" category + 10 specs (X10/INV-ENV1) - pyproject: aioboto3>=13 (consumed here, deployed with first use) - tests: 18 unit tests (FS round-trip, key normalization/traversal guard, bucket resolution, backend selection, dual write-both + S3-down fallback) No call-sites are rewired yet — that is Phase 2 (106.3). STORAGE_BACKEND stays filesystem in prod, so behaviour is unchanged. Invariants: keeps G2 (one storage path replaces scattered I/O); establishes INV-STG1 (single layer), INV-STG2 (atomic keys, Hebrew name in metadata), INV-STG3 (governance buckets), INV-STG6 (presigned serving). Spec: docs/spec/X14-storage-minio.md. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 07:47:49 +00:00
Chaim	34d80a39e5	feat(ops): /operations dashboard — everything running in the background A single live page for all the background work that downloads/analyses, so the chair can see what's running instead of guessing. - court_fetch_service: GET /pm2 (unauthenticated, host-only) → trimmed pm2 jlist for the legal-* services (status, restarts, mem, cron schedule). - FastAPI GET /api/operations: aggregates the DB-backed pipelines (court_fetch jobs, metadata + halacha extraction queues, halacha review gate, missing_precedents, digests, recent court ingests) and proxies the host /pm2 over the docker bridge (graceful if the host service is down). - web-ui /operations page (+ src/lib/api/operations.ts hook, nav entry under admin): services grid (with Hebrew labels + schedules) + pipeline cards + recent-fetch / recent-ingest lists. Auto-refreshes every 5s. tsc --noEmit clean; pm2 status carries nothing sensitive and the bind (10.0.1.1) is host/container-only, so /pm2 needs no secret. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 07:28:41 +00:00
Chaim	8d2f1ea0a2	feat(X13 Tier-0): decode supremedecisions API — fetch serial-format Supreme verdicts The 211 open missing_precedents include 99 Supreme serial-format rulings (בג"ץ/בר"מ/עע"מ NNNN/YY) with no נט-format triple — fetchable only from supremedecisions.court.gov.il. Decoded its public JSON API (no browser, no CAPTCHA, no smart-card); validated live on בג"ץ 3483/05 + בר"מ 10212/16. - court_fetch_supreme.py: rewrite. POST Home/SearchVerdicts with a structured `document` ({Year:"YYYY", CaseNum, OldMainNumFormat:true, SearchText:[…]}) + X-Requested-With header → records; GET Home/Download?path=&fileName=&type=4 → PDF. The earlier attempt failed only on the request shape (string vs object). 2-digit→4-digit year; try candidate docs best-first (פסק-דין→pages), skipping the published-report 's'-prefix files the free endpoint WAF-blocks. - orchestrator: on successful ingest, close matching open missing_precedents (link to the new case_law). End-to-end validated (בר"מ 10212/16 → corpus). - backfill_missing_precedents.py: enqueue fetchable open gaps (supreme + net) into court_fetch_jobs; the drainer fetches+ingests+closes. dry-run default. - X13 spec + SCRIPTS.md updated (Tier-0 decoded, no longer a limitation). Very old un-digitized Supreme cases (e.g. בג"ץ 389/87 → 0 records) → manual. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 06:53:31 +00:00
chaim	a1db283ce1	Merge pull request 'fix(extraction): self-heal for the halacha-extraction queue + scheduled drainer' (#142) from worktree-halacha-selfheal into main	2026-06-08 06:05:27 +00:00
Chaim	97ede1a49d	fix(extraction): self-heal stale halacha 'processing' rows + scheduled drainer The halacha extraction queue was stuck (same class as the metadata issue): 26 precedents requested extraction with no drainer, plus 1 orphaned in 'processing' (status=processing, requested_at cleared → never re-picked by the queue). - db.requeue_stale_processing_extractions(kind): re-stamp orphaned 'processing' rows (requested_at IS NULL) so they re-drain; halacha extractor force=False resumes from chunk checkpoints (no duplicates). - process_pending_extractions calls it at the top — fully unattended, safe under the global advisory lock. Mirrors the digests-drain self-heal. - legal-halacha-drain.config.cjs: pm2 cron (every 2h, conservative — Claude is slow/rate-limited and each run adds to the chair's pending_review queue). drain_halacha_queue.py stays on claude_session (high reasoning quality for holding/ratio; NOT moved to Gemini). SCRIPTS.md. The chair-approval gate (INV-G10) is untouched — this only produces halachot; Daphna still approves each in /approvals. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 06:04:53 +00:00
Chaim	83d1a8253c	feat(digests): digest_kind classification - robust extraction for all issue types (X12) ~2% of "כל יום" issues are non-decisions (legislative updates/announcements/greetings) with no ruling → the decision-centric extraction returned empty → both-empty → cycled through self-heal. - SCHEMA_V32: `digest_kind` (decision/announcement/other) + a cheap legacy backfill (has citation→decision, otherwise announcement), applied before self-heal relies on it. - extractor: the prompt classifies and always extracts concept/headline/summary; underlying_* only for decision. extract normalizes digest_kind. - enrich: stores digest_kind; a successful extraction always ends with a non-empty kind (defaulting by citation if the model omitted it). - drain self-heal: failure definition = completed with digest_kind='' (instead of both-empty) → announcements are no longer retried forever. - db: digest_kind in _DIGEST_COLS + update-whitelist (flows into search/list/API). - X12 spec: documents digest_kind + the corrected failure definition. Verified: V32 classified 533 (525 decision + 8 announcement, 0 unclassified; self-heal does not touch them). extract: 5163→decision+citation · 5060→announcement+concept, empty citation (not both-empty). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 06:02:08 +00:00
Chaim	d95a36f310	feat(extraction): precedent metadata via Gemini Flash + scheduled drainer The /precedents metadata queue was stuck — 24 rows requested, nothing draining them — and the agentic claude CLI hit error_max_turns on what is a single structured text→JSON task (slow + flaky). Metadata extraction is bounded extraction, the wrong fit for an agentic loop. - gemini_session.py: query_json drop-in (gemini-2.5-flash, JSON mode, httpx — no new SDK dep). Reads GEMINI_API_KEY (~/.env; SoT Infisical nautilus:/external-apis/gemini). Host-side only — no LLM from the container. - precedent_metadata_extractor: claude_session.query_json → gemini_session. Validated live: rich, accurate fields (case_name/summary/appeal_subtype/tags). - process_pending_extractions: kind-aware cooldown — metadata 2s (Gemini, fast), halacha keeps 30s (Claude rate limits). - drain_metadata_queue.py + legal-metadata-drain.config.cjs (pm2 cron */15) so the queue never clogs again. SCRIPTS.md. - X8 INV-FP5 updated: per-task engine choice (Gemini=bounded metadata, claude_session=agentic halacha), both host-side, single canonical queue (G2). Agentic/voice-sensitive work (writing, analysis, halacha) stays on claude_session (Daphna's subscription). Gemini cost ≈ $0.10/1M tokens — negligible. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 05:13:49 +00:00
Chaim	a3a02ca67a	fix(digests): enrich self-cleans duplicate-yomon rows (re-sent issues) The same digest issue can arrive as two different PDFs (re-send/forward → different bytes → the content_hash dedup misses it), but yomon_number is unique → the update in enrich collides on uq_digests_yomon_number. enrich now catches the collision, deletes the duplicate row (the issue already exists), and returns status='duplicate', so the cron does not retry it again and again. Closes a potential infinite retry loop in the unattended system. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-08 04:59:12 +00:00
Chaim	69b34f1c3f	fix(X13): route by נט-format availability; robust fetch error handling Live drain surfaced three issues: 1. Tier-0 needed `h2` (httpx http2) — added to the court-fetch extra. 2. Supreme cases that carry a נט-format number (e.g. בר"מ 72182-06-25) were routed to the unvalidated Tier-0 and failed, even though נט המשפט serves Supreme cases too. classify() now parses the file-month-year triple for Supreme prefixes; the orchestrator routes by triple-availability: נט-format present → Tier-1 (validated, all courts) serial-only Supreme (עע"מ 5886/24) → Tier-0 neither → clear "no public route" failure Validated live: בר"מ 72182-06-25 fetched via Tier-1 (5-page PDF). 3. A non-`RuntimeError` fetch exception (the h2 import error) left jobs stuck in 'running'. The fetch block now catches any Exception → _record_failure (INV-CF2/CF3), so a job always reaches a terminal state. + test_supreme_with_net_format_triple. Suite 11/11. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 20:45:20 +00:00
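The routing rule described in commit 69b34f1c3f (route by availability of the נט-format triple) can be sketched as follows. This is an illustrative assumption, not the repository's classify() or orchestrator: the regular expressions, the `route` function, the tier labels as strings, and the choice of LookupError are hypothetical; the sample citations are the ones quoted in the log.

```python
import re

# Supreme Court prefixes mentioned in the log.
SUPREME_PREFIXES = ('בג"ץ', 'בר"מ', 'עע"מ')
# Hypothetical patterns: file-month-year triple, and serial number / 2-digit year.
NET_TRIPLE = re.compile(r"(?<!\d)(\d{1,6})-(\d{2})-(\d{2})(?!\d)")
SERIAL = re.compile(r"(?<!\d)(\d{1,5})/(\d{2})(?!\d)")


def route(citation: str) -> str:
    """Pick a fetch tier for a citation, or fail with a clear message."""
    citation = citation.strip()
    if NET_TRIPLE.search(citation):
        # A file-month-year triple is present: use the validated all-courts path.
        return "tier1"
    if citation.startswith(SUPREME_PREFIXES) and SERIAL.search(citation):
        # Serial-only Supreme citation: only the Supreme decisions site serves it.
        return "tier0"
    raise LookupError(f"no public route for {citation!r}")
```

Checking for the triple first is what keeps a Supreme case that carries a נט-format number on the validated Tier-1 path instead of falling through to Tier-0.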
Chaim	f4f110f0d1	feat(X13): scheduled drain — fully-autonomous digest→fetch→ingest loop - scripts/drain_court_fetch.py: drives orchestrator.drain_pending (host-only; no-op when queue empty). Mirrors drain_halacha_queue.py. - scripts/legal-court-fetch-drain.config.cjs: pm2 cron (hourly :17, one-shot), COURT_FETCH_DRAIN_CRON override. - fix: orchestrator default service URL 127.0.0.1 → 10.0.1.1 (the service binds the docker0 gateway; the host can't reach it on loopback). Found live — the first drain failed "connection refused" until corrected. - SCRIPTS.md entries. Validated end-to-end in PRODUCTION on a real digest: עת"מ 43830-12-24 (החברה להגנת הטבע) fetched from נט המשפט → case_law (79 chunks, source_url), digest relinked (INV-DIG3 closed), halacha queued pending_review. job=done. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 20:31:53 +00:00
Chaim	d3b5c563ce	fix(extract): disable tools for digest LLM extraction (no error_max_turns) Digest metadata extraction is pure text→JSON, but the claude CLI runs with tools available, and Sonnet sometimes emits stop_reason=tool_use → hits --max-turns 1 → error_max_turns → retry (slow). This wastes a lot of time during bulk backfill. - claude_session.query/query_json: new `tools` parameter → passed as --tools. "" = disable all tools (no tool_use → no max-turns trip). None = CLI default. - digest_metadata_extractor.extract: passes tools="". Verified: extract on digest 5160 with Sonnet+tools="" → num_turns=1, valid JSON, no error_max_turns. claude_session stays local-only. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 20:18:29 +00:00
Chaim	f56309da5a	feat(X13): auto-trigger court fetch from digests + drain tool Closes the loop: a digest pointing at a court judgment that is not in the corpus triggers an automatic fetch and is linked back after ingestion (INV-DIG3 + INV-CF2). - digest_library.try_autolink: on link failure, if the citation is classified as a court judgment (supreme/admin) → _enqueue_court_fetch creates court_fetch_jobs(pending); an appeals committee (skip) is not triggered. never-raises (does not break digest ingestion). - orchestrator.drain_pending(limit): drains pending/failed serially (cooldown, INV-CF4), fetch+ingest for each; on success links the digest to the ingested case_law. - MCP tool court_fetch_drain + registration in server.py. - X13 spec: updated (the INV-CF2 gap marked as fixed). Checked against the DB: עת"מ 46111-12-22 → job tier=admin pending digest-linked; ערר 1110/20 → not triggered. Local-only tool (ingest = claude CLI). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 20:04:12 +00:00
Chaim	e6dc410d7d	feat(digests): use Sonnet for digest metadata extraction (X12) Digest metadata extraction (concept tag, holding headline, citation, tags from a one-page summary) is a simple, high-volume task; Sonnet is the speed/cost balance point, unlike halacha extraction, which pins Opus. - config.DIGEST_EXTRACT_MODEL (env-tunable, default claude-sonnet-4-6). - digest_metadata_extractor.extract(model=None) → defaults from config; previously no model was specified → ran on the CLI default (Opus 4.8). Verified: extract on digest 5163 with Sonnet returned a valid concept tag / headline / citation / domain / tags (~36s). claude_session remains local-only. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 19:58:48 +00:00
Chaim	e186183527	fix(X13): harden court-fetch against browser leaks + reaper for task-master-mcp leak Three layers of defense against memory leaks from orphaned browsers, plus handling of the largest actual leak on the server (task-master-mcp). - camofox_client.py: - hard asyncio.wait_for around the entire fetch (COURT_FETCH_HARD_TIMEOUT_S=180s): hang → cancel → async-with tear-down → reap. - _reap_orphan_browsers(): kills orphaned camoufox-bin processes (ppid=1) before and after every fetch. Because fetches are serial (INV-CF4), any ppid=1 process is a leftover that is safe to kill. - scripts/reap_orphan_procs.py: general reaper for task-master-mcp (~3GB of orphans) + camoufox-bin. ppid=1 only; pure /proc. --dry-run / --loop N. - scripts/legal-reaper.config.cjs: pm2 daemon (loop 180s, max_memory_restart 100M). - X13 spec + SCRIPTS.md: documented the defense layers. The service's max_memory_restart (1.5G) already provides a process-level safety net. Invariants: upholds INV-CF4 (politeness/serial); no contract change. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 19:43:53 +00:00
Chaim	781f24c643	feat(X13 Tier-1): calibrate Net HaMishpat fetch: Camoufox python, proven on 46111-12-22 Verified end-to-end: the 34-page verdict in עת"מ 46111-12-22 was downloaded fully autonomously, purely open-source, with no smart card and no CAPTCHA solving. Main calibration findings: - Search and navigation to the case involve no reCAPTCHA at all. reCAPTCHA exists only in the viewer and only on explicit save/print, not on displaying the document. - The viewer serves pages as PNG via the PageMethod GetImages (4 per batch); they are pulled with fetch using the header X-Requested-With: XMLHttpRequest (required; the F5 WAF blocks requests without it) → assembled into a PDF (Pillow). Changes: - camofox_client.py: full rewrite; Camoufox via the Python package (in-process, not a Node REST server). Calibrated path: home→btnExternalSearchCases→Bama fields→CaseDetails→פסקי דין (Verdicts)→DecisionList→NGCSViewerPage→GetImages→PDF. - pm2 config: app Xvfb :99 + DISPLAY=:99 (Camoufox crashes headless without a virtual display). - pyproject: extra [court-fetch] = camoufox + faster-whisper (host-only; the container does not run a browser). Pillow is already in the base dependencies. - X13 spec + SCRIPTS.md: updated with the findings (image API, Xvfb, verification). reCAPTCHA audio (Whisper) is kept as a fallback for the explicit-save path only; the main path does not need it. Invariants: upholds INV-CF1/CF4/CF6 (unchanged). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 19:32:13 +00:00
chaim	f3740fef68	Merge pull request 'fix(halacha): split authority (derived) from rule_role — stop source-conflation (INV-DM7)' (#112 ) from worktree-halacha-authority-split into main All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m32s Details	2026-06-07 18:19:43 +00:00
Chaim	2e33cac043	fix(halacha): split authority (derived) from rule_role — stop source-conflation (INV-DM7) The extractor classified rule_type by SOURCE bindingness (higher-court→binding, committee→persuasive) instead of by rule KIND. The gold-set proved it: 'binding' appeared on 19/19 external rulings & 0 committees; 'persuasive' on 13/13 committees & 0 external — only 58% agreement with the human role tags. The two axes (authority vs rule role) were crammed into one enum. This splits them per INV-DM7: - authority (binding/persuasive) — DERIVED from case_law.precedent_level (עליון/מנהלי→binding, ועדת_ערר_מחוזית→persuasive), never stored, never LLM-guessed. New helper halacha_quality.derive_authority; surfaced read-only in list_halachot / goldset_list / search results. - rule_type — now the rule ROLE only: holding/interpretive/procedural/ application/obiter. Both extractor prompts unified to this vocabulary; _coerce_halacha no longer defaults rule_type from the source; legacy binding→holding / persuasive→interpretive fold for safety. UI: authority shown as a separate read-only badge (gold=מחייב / muted=משכנע) across the review queue, precedent detail, and gold-set; the gold-set role selector drops binding/persuasive and adds מהותי (holding). Migration: scripts/halacha_rule_role_backfill.py re-classifies the 276 pre-split binding/persuasive rows into a genuine role via local claude_session (run after deploy). Gold-set correct_type/ai_correct_type 'binding'→'holding' via SQL. Sources (≥3, per research-decision policy): OASIS LegalRuleML v1.0 (appliesAuthority/Strength as metadata orthogonal to rule logic) · SemEval-2023 Task 6 LegalEval (rhetorical roles by function, authority kept separate) · Bluebook signals (weight-of-authority is a separate dimension). Invariants: ESTABLISHES INV-DM7. Upholds G1 (normalize at source — extractor classifies role, system derives authority) and G2 (single source of truth — authority derived, not a parallel stored field). 
Tests: 211 pass + new derive_authority/coerce coverage. web-ui build + tsc clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 18:18:41 +00:00
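The authority/role split described in commit 2e33cac043 can be sketched as two small pure functions. This is a hedged illustration only: the mapping values come from the commit message, but the function bodies, the `coerce_rule_role` name, and the `None` results for unknown inputs are assumptions rather than the actual halacha_quality code.

```python
from typing import Optional

# precedent_level → authority, as listed in the commit message.
_AUTHORITY_BY_LEVEL = {
    "עליון": "binding",              # Supreme Court
    "מנהלי": "binding",              # Administrative court
    "ועדת_ערר_מחוזית": "persuasive",  # District appeals committee
}

# The rule-role vocabulary after the split.
RULE_ROLES = {"holding", "interpretive", "procedural", "application", "obiter"}

# Legacy fold for rows classified before the split.
_LEGACY_ROLE_FOLD = {"binding": "holding", "persuasive": "interpretive"}


def derive_authority(precedent_level: Optional[str]) -> Optional[str]:
    """Derive authority from the source level; never stored, never guessed.

    Returning None for an unknown level is an assumption of this sketch.
    """
    if precedent_level is None:
        return None
    return _AUTHORITY_BY_LEVEL.get(precedent_level)


def coerce_rule_role(raw: Optional[str]) -> Optional[str]:
    """Normalize an extractor-supplied rule_type to a rule ROLE.

    Legacy authority-style values are folded to roles; anything outside
    the vocabulary yields None instead of defaulting from the source.
    """
    if raw is None:
        return None
    value = _LEGACY_ROLE_FOLD.get(raw.strip().lower(), raw.strip().lower())
    return value if value in RULE_ROLES else None
```

Keeping authority as a derived value means a change in a precedent's level is reflected everywhere at read time, with no second stored field to drift out of sync.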
chaim	acb8e2c206	Merge pull request 'feat(X13): automatic case-law fetch from Net HaMishpat → corpus (Tier 0 + scaffold)' (#110 ) from worktree-court-fetch into main All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m21s Details	2026-06-07 18:13:15 +00:00
Chaim	0990db7a3c	feat(X13): auto-fetch court verdicts from Net HaMishpat → corpus (Tier 0 + scaffold) Automatic case-law fetch subsystem: when a digest points to a court verdict, the court instance is classified, the verdict is downloaded from the appropriate public source, and it is ingested through the canonical ingestion pipeline. - spec-first: docs/spec/X13-court-fetch.md (INV-CF1..CF7) + index - classifier court_citation.py (supreme/admin/skip) + 10 tests (עת"מ 46111-12-22 → admin) - Tier 0: court_fetch_supreme.py: supremedecisions API (reverse-engineered), httpx + browser headers (verified 200) + politeness - queue court_fetch_jobs (SCHEMA_V30) + DB helpers + court_fetch_orchestrator.py - Tier 1 scaffold: legal-court-fetch-service (aiohttp + Bearer, mirrors legal-chat-service) + camofox_client (Camoufox open-source) + recaptcha_audio (local Whisper) + pm2 - graceful Tier 2 fallback: manual + missing_precedent (INV-CF2/CF3: no silent drop) - MCP tools court_verdict_fetch / court_fetch_status; SCRIPTS.md Invariants: upholds G2 (single ingestion path, INV-CF1) · G3/G1 (idempotent + normalization, INV-CF5) · G4/§6 (no silent swallowing, INV-CF2) · G10 (human gate, INV-CF3) · G5 (source_type, INV-CF6) · G9 (provenance + audit, INV-CF7). Sources for INV-CF4: RFC 9309 · Google crawler · OWASP OAT. Follow-ups (not yet verified live): live Tier-0 validation · installing camofox-browser + whisper · calibrating Tier-1 selectors · COURT_FETCH_SHARED_SECRET (Infisical + Coolify) · trigger from digest try_autolink (worktree-digests-radar). V30 may conflict with digests-radar. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 18:12:13 +00:00
Chaim	06281996ca	feat(digests): Phase 2: API endpoints + /digests UI (X12) User-facing surfaces for the digest corpus: FastAPI endpoints + a separate /digests UI page (for browsing, search, upload, and linking to the original verdict). The digest remains a secondary source that points to the verdict: it is not cited in a decision (INV-DIG1) and no halachot are extracted from it (INV-DIG2). Backend (container-safe + local split): - digest_library: split into create_pending_digest (CONTAINER-SAFE: stage + extract_text + create row 'pending', no LLM) ↔ enrich_digest/process_pending_digests (local: LLM + embed + autolink). ingest_digest combines both. - db.list_pending_digests; MCP digest_process_pending (tool + server): an alternative to the batch script for draining the queue. - web/app.py: 10 endpoints /api/digests/* (upload/list/search/queue-pending/get/patch/delete/link/relink/unlink). upload=INSERT-only pending (the LLM runs locally; claude_session is local-only). All return a dict in the precedent pattern. Frontend (Next 16, no api:types; hooks with hand-written types, like precedent-library.ts): - lib/api/digests.ts: hooks (useDigests/useDigestSearch/useDigestPending/useUploadDigest/useLink/Relink/Unlink/Delete/Update). - a separate /digests page (not a tab in /precedents, to preserve the authoritative/secondary boundary, INV-DIG1): Digests/Search tabs + DigestCard (link-to-verdict badge) + DigestUploadDialog + pending badge. nav + header-context. Verified: full backend round-trip (create_pending→list_pending→process_pending→search→restore); web-ui compiles (webpack/tsc clean, route /digests generated). Note: the default build (turbopack) fails in the worktree because of the node_modules symlink; in CI/Docker (real node_modules) it works; verified with --webpack. Invariants: upholds INV-DIG1/2 (upload does not extract halachot, the UI shows "points to, not cited"), INV-DIG3 (link/relink/queue). G4 (no swallowing: errors → toast/HTTP), G2 (a separate path, not a parallel one). X6 (UI↔API contract: endpoints in the precedent pattern; hand-written hooks like the other domain modules). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 18:11:05 +00:00
Chaim	8171572cdd	feat(digests): digest corpus as a discovery layer (radar), X12 A new repository for the "Kol Yom" (כל יום) digests by Ofer Toister (עפר טויסטר) as a discovery layer above the case-law corpora: a secondary source that points to the original verdict, ingested into a separate `digests` table, searched semantically, and linked to the original verdict in the case-law library, but never cited in a decision and never used for halacha extraction. Phase 0 (spec): - docs/spec/X12-digests-radar.md: INV-DIG1 (points to, not cited) / INV-DIG2 (a separate ingestion path, not a parallel one; upholds G2) / INV-DIG3 (the link to the verdict is the bridge; a missing link = a visible gap). Updated index 00/03/README. Phase 1 (MVP): - SCHEMA_V30: `digests` table (HNSW on embedding, not ivfflat, to avoid a recall cliff in a small/growing corpus) + GIN/FTS + partial UNIQUE for idempotency. - services/digest_metadata_extractor.py: LLM extraction (claude_session local-only, lazy import): concept tag, holding headline, citation, two distinct dates, tags. - services/digest_library.py: a short standalone path (INV-DIG2): extract→hash→LLM→single embedding→autolink. Does not use ingest.ingest_document. - tools/digests.py + registration of 7 tools in server.py (digest_upload/list/get/link/relink/delete + search_digests). - scripts/ingest_digests_batch.py: manual ingestion from data/digests/incoming. - legal-researcher.md: step 2ב.0 (radar scan before verification) + report section ט + 3 tools in the frontmatter. HEARTBEAT §8: routing digest→digest_upload. Verified end-to-end: 4 digests ingested (accurate metadata), semantic search ranks correctly ("היטל השבחה" (betterment levy)→5160, "תמא 38" (TAMA 38)→5158), link/relink/autolink/revert + the MCP wrapper. Invariants: adds INV-DIG1/2/3 (X12). Upholds G2 (a separate bounded context, not a parallel path), G3 (idempotent upsert), G4 (no silent swallowing: a link gap is surfaced), G9 (traceability: the digest points to a traceable source). Touches G7 (RRF): deferred; semantic-only search in phase 1 (the FTS index is ready). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 17:49:00 +00:00
Chaim	0e35060d3d	feat(goldset): AI second-opinion per item (QA aid) — compare vs human tag The chair wanted an independent recommendation beside each tag, to reconsider his own judgments. Adds a NON-ground-truth AI second-opinion: - schema: halacha_goldset.ai_is_holding / ai_correct_type / ai_rationale / ai_generated_at (additive). - db.goldset_set_ai_recommendation + goldset_list now returns the ai_* fields. - scripts/goldset_ai_recommend.py — local claude_session judges is_holding + type + a one-line rationale per item, INDEPENDENTLY (own legal rubric). Independent of the rule-based validators #81.8 measures → no circularity. Never auto-applied; QA aid only. - web-ui: each card shows "🤖 המלצת AI: הלכה/לא · type" + rationale and an agreement/disagreement chip vs the human tag (amber on disagree); a "⚠ אי-הסכמות AI (N)" filter to review only the conflicts. Methodology note kept explicit: the human stays the ground truth; the AI is a prompt to reconsider, not to copy. Verified: tsc --noEmit 0; generator stores recs and flags disagreements with existing human tags. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 14:24:35 +00:00
Chaim	632fe73857	feat(goldset): separate court rulings from committee decisions in tagging Tagging is easier one source-type at a time. goldset_list now returns case_law.source_type; the page adds: - a filter (הכל / פסקי דין / ועדת ערר) with live counts, - a group-sort so even in "הכל" all court rulings come first, then all committee decisions, - a per-card source badge (פסק-דין / ועדת ערר). Verified: tsc --noEmit 0; source_type splits the live batch 58 court / 92 committee. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-07 13:55:06 +00:00
Chaim	ac279220c4	feat(goldset): interactive gold-set tagging page (#81.7/#81.8) Replaces the CSV-edit workflow with an in-app tagging page so the chair/Dafna can label the extraction-quality gold-set by clicking, and see validator precision/recall live. Schema (V29): halacha_goldset — a stratified, human-tagged evaluation batch (is_holding / correct_type / quote_complete, NULL until tagged). db.py: - goldset_create_sample (stratified round-robin over case×rule_type, idempotent), - goldset_list (items + halacha content + the machine's own labels), - goldset_tag (partial — one field at a time for keyboard tagging), - goldset_score (ports the script's P/R/F1: each validator scored as a not-a-holding detector against the human tags — the #81.8 input). API: GET /api/goldset, POST /api/goldset/sample, GET /api/goldset/score, PATCH /api/goldset/{id}. web-ui: - lib/api/goldset.ts (hooks), - components/goldset/goldset-panel.tsx — card-per-item, keyboard-first (J/K nav, H/N holding, C/X quote), progress bar, hide-tagged toggle, and a collapsible live score table, - app/goldset/page.tsx + nav link "מדגם-זהב" under ידע ולמידה. Methodology guard kept explicit in UI + docstrings: tags are HUMAN ground truth, no AI pre-fill (circular bias). Populated a 150-item stratified batch. Verified: backend create/list/tag/score against the live DB; tsc --noEmit 0; py_compile ok. (Local Turbopack build blocked by worktree symlink — CI builds clean.) Invariants: G1 (eval set modeled at source in its own table); G2 (reuses the same halacha_quality validators the extractor runs — no parallel scoring logic). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 21:52:05 +00:00
Chaim	b7b44f4453	feat(halacha): equivalent-halacha (parallel-authority) links across precedents Cross-precedent recurrence of a principle is real but is NOT citation corroboration (X11) — the 5 candidate pairs have ZERO citations between their precedents. Recording them in halacha_citation_corroboration would fabricate citation data and inflate corroboration_count. This adds a proper, separate halacha-level link for parallel authority. Schema (V28): equivalent_halachot — symmetric (halacha_a < halacha_b, CHECK + UNIQUE), non-citation, cross-precedent-only. ON DELETE CASCADE. db.py: - link_equivalent_halachot (idempotent; rejects same-id and SAME-precedent pairs — parallel authority is cross-precedent by definition), unlink, and list_equivalent_for_halacha. - list_halachot gains include_equivalents → _annotate_equivalents attaches an `equivalents` list (both directions) per row. API: include_equivalents on GET /api/halachot; GET/POST/DELETE /api/halachot/{id}/equivalents for the chair to view/link/unlink manually. scripts/halacha_batch_reconcile.py: --link records found cross-precedent pairs as equivalent_halachot (non-destructive, idempotent). web-ui: Halacha.equivalents type; the clean review queue fetches include_equivalents; the review card shows a gold "עיקרון מקביל ב-N" badge + an expandable list (case + rule + similarity) labeled "אסמכתה מקבילה — לא ציטוט". Populated the 5 reviewed pairs (chair decision: keep all + link as parallel authority). Verified: 5 rows; the 1023-20 hub annotates 3 of its halachot with equivalents; tsc --noEmit exits 0. Invariants: G1 (model recurrence at source in its own table, not by abusing the citator); G2 (no parallel path — extends list_halachot); citator integrity preserved (corroboration stays citation-only). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 21:29:46 +00:00
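The symmetric-link rule in commit b7b44f4453 (halacha_a < halacha_b, no same-id pairs, no same-precedent pairs, idempotent insert) can be sketched in memory as below. The function names and the set standing in for the equivalent_halachot table are hypothetical; in the real schema the ordering and uniqueness are enforced by CHECK and UNIQUE constraints.

```python
def normalize_equivalent_pair(id_a, precedent_a, id_b, precedent_b):
    """Return the canonical (smaller_id, larger_id) pair or raise ValueError."""
    if id_a == id_b:
        raise ValueError("a halacha cannot be equivalent to itself")
    if precedent_a == precedent_b:
        # Parallel authority is cross-precedent by definition.
        raise ValueError("equivalents must come from different precedents")
    return (min(id_a, id_b), max(id_a, id_b))


def link_equivalent(store, id_a, precedent_a, id_b, precedent_b):
    """Idempotently record a link; returns True only when a new row is added.

    `store` is a plain set used here in place of the database table.
    """
    pair = normalize_equivalent_pair(id_a, precedent_a, id_b, precedent_b)
    if pair in store:
        return False
    store.add(pair)
    return True


if __name__ == "__main__":
    links = set()
    print(link_equivalent(links, 42, "case-A", 7, "case-B"))  # new link
    print(link_equivalent(links, 7, "case-B", 42, "case-A"))  # same link, reversed
    print(links)
```

Storing each pair once in a canonical order is what makes the link symmetric without a second mirrored row.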
chaim	25e0662ead	Merge pull request 'feat(halacha-triage UI): wire gating + near-duplicate cluster cards (#84.2)' (#98 ) from worktree-task84.2-ui-clustering into main Some checks failed Build & Deploy / build-and-deploy (push) Has been cancelled Details	2026-06-06 21:02:09 +00:00
Chaim	e4651a9d06	feat(#99 / T10): get_style_guide: golden ratios measured from the corpus alongside the target style_distance.measure_corpus_ratios(): splits every decision in style_corpus into sections (chunker) and computes the mean percentage per section: an "_all" aggregate + per-outcome (when available). Cached. get_style_guide shows an "actually measured" row with ⚠️ on a deviation from the target range. Current state: style_corpus.outcome is not populated → the all-decisions aggregate is shown (n=48: background 26.4% / claims 9.7% / discussion 43.8% / summary 20.1%); the per-outcome split is future-ready. The measurement also exposes limitations of section detection (the intent of T10: flag a gap for review). Partially overlaps T7, which measures adherence per draft; this measures the corpus. A measurement failure is displayed, not swallowed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 21:01:42 +00:00
Chaim	12313774a1	feat(halacha-triage UI): wire gating + near-duplicate cluster cards (#84.2) Completes #84 — surfaces the backend gating/prioritization (#84.1/#84.3, PR #93) in the chair's review UI and adds near-duplicate clustering (#84.2). Backend - db.list_halachot gains `cluster` (#84.2): annotates each row with cluster_id + cluster_size by unioning same-precedent halachot within HALACHA_CLUSTER_COSINE (0.90, new config). Display-only — never merges/deletes. Pairwise is confined to the returned set (cheap). - GET /api/halachot exposes the `cluster` query param (default off). Frontend (web-ui) - Halacha type gains optional cluster_id / cluster_size (hand-written module; no api:types regen needed — halachot aren't typed off the generated schema). - useHalachotPending(opts): the default "clean" queue now fetches exclude_low_quality + order_by_priority + cluster; needsFix:true returns the flagged 'needs extraction fix' bucket (filtered client-side). - HalachaReviewPanel: a "תור נקי / דורש תיקון-חילוץ" toggle (#84.1); near-dup clusters collapse into ONE card showing "+N וריאנטים" with an expandable list, and approve/reject/defer on a clustered card applies to all variants via the batch endpoint (#84.2 + #84.4). Counts show true halacha totals (pendingTotal). New flag labels added (application / near_duplicate / nevo_preamble_leak). Verified: - backend: list_halachot(cluster=True) on the live queue — algorithm correct (groups related same-precedent rules at 0.78; none at the production 0.90 because dedup #82 already removed near-dups — the desired state). - frontend: `tsc --noEmit` exits 0 (type-clean); no new lint errors (the one lint error is pre-existing in training/learning-panel.tsx from #94). Local Turbopack build can't run on the worktree node_modules symlink — CI builds in a clean checkout. Invariants: G1 (gate/cluster at source in SQL, not post-hoc); G2 (same list_halachot path); §6 (flagged items routed to a visible bucket, not dropped). 
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 21:01:30 +00:00
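The display-only clustering from commit 12313774a1 (union same-precedent halachot whose embeddings are within HALACHA_CLUSTER_COSINE) can be sketched with a small union-find. The row shape (`id`, `precedent_id`, `embedding`) and the choice of the cluster root's id as `cluster_id` are assumptions of this sketch; only the 0.90 threshold and the same-precedent restriction come from the commit message.

```python
import math
from collections import defaultdict

HALACHA_CLUSTER_COSINE = 0.90  # threshold named in the commit message


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 for a zero vector)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def annotate_clusters(rows, threshold=HALACHA_CLUSTER_COSINE):
    """Return copies of rows with cluster_id and cluster_size added.

    Pairwise comparison is confined to the given rows; nothing is merged
    or deleted, the annotation is for display only.
    """
    parent = list(range(len(rows)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(rows)):
        for j in range(i + 1, len(rows)):
            if rows[i]["precedent_id"] != rows[j]["precedent_id"]:
                continue  # clusters never span precedents
            if cosine(rows[i]["embedding"], rows[j]["embedding"]) >= threshold:
                parent[find(j)] = find(i)

    sizes = defaultdict(int)
    for i in range(len(rows)):
        sizes[find(i)] += 1

    return [
        {**row, "cluster_id": rows[find(i)]["id"], "cluster_size": sizes[find(i)]}
        for i, row in enumerate(rows)
    ]
```

Because the union is transitive, three variants A~B and B~C land in one cluster even when A and C fall just under the threshold, which matches the intent of collapsing near-duplicates into a single review card.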
Chaim	a571ad535b	fix(#88+#87): automatic DB↔file sync + claims_coverage distinguishes the appeal brief from correspondence #88 (DB↔file, lessons #35): drafts/decision.md was rewritten only in save_block_content; renumber_all_blocks and other store_block paths left the file stale → QA failed twice on the same problem (CMPA-62). Fix: _update_draft_file became an automatic hook (takes decision_id, locates the case internally) called from store_block (every persist) and from renumber_all_blocks. legal-qa reads from the DB anyway → both sides are always identical. #87 (claims_coverage, 1033-25): claims from correspondence (claim_type='reply': response / supplementary argument) were flagged "unanswered" as false positives. Fix: check_claims_coverage requires an answer only for claims in the appeal brief (claim_type='claim', appellant); reply/correspondence claims are excluded. On full acceptance the threshold is relaxed (0.2→0.4) because the appellant won in full. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 20:54:31 +00:00
Chaim	e096c51037	fix(#85): claude_session: retry on transient failures of claude -p The root cause of #85 was identified: `claude -p` occasionally fails with a fast exit + empty stderr on large/slow prompts (CEO write_interim_draft, learning_loop distillation), and the same prompt succeeds on a rerun: a transient failure, not nesting (verified: a nested claude from bash and a 70K prompt both succeeded; the failure is non-deterministic). query() wraps spawn + communicate in a retry loop (MAX_RETRIES=3, linear backoff 5s*attempt). FileNotFoundError + timeout remain deterministic (no retry). An empty response is also treated as transient. Verified e2e: distillation on 1130-25 ran successfully → pair=analyzed (9 changes, 6 style_method, 33.8% diff). Also fixes the CEO's write_interim_draft. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 20:08:54 +00:00
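The retry policy from commit e096c51037 (MAX_RETRIES=3, linear backoff of 5s*attempt, no retry for FileNotFoundError or timeout, empty response treated as transient) can be sketched as below. `TransientError`, `query_with_retry`, and the injectable `run_once`/`sleep` callables are hypothetical names for illustration; the real query() wraps a subprocess spawn rather than a plain callable.

```python
import time

MAX_RETRIES = 3        # value from the commit message
BACKOFF_BASE_S = 5.0   # linear backoff: 5s * attempt


class TransientError(RuntimeError):
    """Hypothetical marker for a fast exit with empty stderr."""


def query_with_retry(run_once, sleep=time.sleep):
    """Call run_once() up to MAX_RETRIES times, backing off linearly.

    FileNotFoundError and TimeoutError are deterministic and propagate
    immediately; a TransientError or an empty response triggers a retry.
    """
    last_error = None
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            output = run_once()
        except (FileNotFoundError, TimeoutError):
            raise  # retrying would not help
        except TransientError as exc:
            last_error = exc
        else:
            if output.strip():
                return output
            last_error = TransientError("empty response")
        if attempt < MAX_RETRIES:
            sleep(BACKOFF_BASE_S * attempt)  # 5s, then 10s
    raise last_error
```

Passing `sleep` as a parameter keeps the policy testable without real delays; with the defaults, a call that fails twice and then succeeds waits 5s and then 10s.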
Chaim	420cb819f5	feat(halacha-triage): quality-gated + prioritized review queue + metrics (#84 ) Backend for the halacha approval-queue triage (#84). The keyboard UI, batch actions and defer/reject (#84.4–6) already shipped; this adds the gating, prioritization and metrics the queue was missing. db.list_halachot — two opt-in triage controls: * exclude_low_quality (#84.1): drop items carrying ANY quality_flag (application / quote_unverified / truncated / non_decision / thin / nli_unsupported / near_duplicate) — they belong in a 'needs extraction fix' bucket, not the chair's approve queue. * order_by_priority (#84.3): active-learning order — negatively-treated first, then most-uncertain (lowest confidence), then oldest — instead of FIFO, so the highest-value decisions surface first. halachot_pending (MCP) — now gated + prioritized BY DEFAULT; include_low_quality= true reveals the needs-fix bucket. The agent review path benefits immediately. GET /api/halachot — same two params, default OFF (non-breaking; the UI opts in). metrics.halacha_backlog (#84.7) — splits pending into clean vs flagged, adds deferred, reviewed_total, approve_ratio, and a pending_by_flag breakdown, so the backlog distinguishes real review work from extraction noise. Deferred (documented): #84.2 near-duplicate cluster cards and wiring the UI fetch to the new params require frontend work + an api:types regen AFTER this deploys (the new query params aren't in prod's OpenAPI until then) — a clean follow-up. The backend fully supports both now. Verified against the live DB (read-only): - pending 177 → gated-clean 110, 0 flagged items leak into the clean queue. - priority order surfaces the lowest-confidence items first (0.55, 0.55, ...). - backlog: pending_clean=110 / pending_flagged=67 / approve_ratio=0.916, pending_by_flag={nli_unsupported:59, quote_unverified:3, thin:3, truncated:2}. - pytest tests/test_halacha_quality.py — 52 passed (no regression). 
Invariants: G1 (gate at source — SQL filter, not post-hoc); G2 (no parallel path — same list_halachot); §6 (flagged items routed to a bucket, never dropped). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 20:00:52 +00:00
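The two triage controls from commit 420cb819f5 can be sketched in memory as below. The commit applies the gate as a SQL filter inside db.list_halachot; this Python version only illustrates the semantics, and the `PendingHalacha` dataclass and `triage` function are hypothetical names.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List


@dataclass
class PendingHalacha:
    """Hypothetical in-memory shape of a pending review item."""
    id: int
    confidence: float
    created_at: datetime
    negatively_treated: bool = False
    quality_flags: List[str] = field(default_factory=list)


def triage(items, exclude_low_quality=True, order_by_priority=True):
    """Gate and order the review queue.

    exclude_low_quality drops any item carrying at least one quality flag.
    order_by_priority sorts negatively-treated items first, then the
    lowest confidence, then the oldest; otherwise input order is kept.
    """
    rows = [
        item for item in items
        if not (exclude_low_quality and item.quality_flags)
    ]
    if order_by_priority:
        rows.sort(
            key=lambda item: (
                not item.negatively_treated,  # False sorts before True
                item.confidence,
                item.created_at,
            )
        )
    return rows
```

A tuple sort key expresses the three-level priority in one place, so adding a further tie-breaker later means appending one more element.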

260 Commits