legal-ai

Author	SHA1	Message	Date
chaim	2d7ab26c71	Merge pull request 'fix(#78): trigger extraction wakeup on committee-decision upload + surface silent failures' (#39) from fix/78-precedent-extraction-wakeup into main	2026-06-02 12:06:56 +00:00
Chaim	1d3e235556	fix(#78): trigger extraction wakeup on committee-decision upload + surface failures The /api/internal-decisions/upload path, used by the UI for appeals-committee (ועדת-ערר) decisions, never called pc_wake_for_precedent_extraction, so committee decisions were stuck at halacha_extraction_status='pending' forever — the CEO was never woken to drain the queue. Root cause behind 8027-25's stuck extraction. The other two upload paths (precedent_library, missing-precedent) already wake the CEO; this one was missing it. - internal-decisions upload: add the wakeup, routing the company by case number prefix (1xxx→רישוי [licensing], 8xxx→היטל [betterment levy], 9xxx→פיצויים [compensation]) when practice_area is empty (else an 8xxx case wrongly routes to the licensing CEO). - all three call sites: the wake helper returns {ok:False} WITHOUT raising on a skipped/failed wakeup; that was silently dropped. Now logged at WARNING with the reason, and the upload progress carries extraction_queued. Fallback drainer (scheduled precedent_process_pending) deferred — the missing wakeup was the actual failure; manual precedent_process_pending remains the recovery path. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 12:06:31 +00:00
chaim	7471dcf3cc	Merge pull request 'chore: tasks #76-78 + weekly chair-feedback lessons #34-35' (#38) from chore/tasks-and-weekly-lessons into main	2026-06-02 11:57:44 +00:00
Chaim	d790fb26e0	docs(lessons): weekly chair-feedback lessons #34-35 (week ending 2026-05-31) #34 don't manufacture doubt about unambiguous statutes (s.19(ג)(2)); #35 writer/QA two-sources-of-truth sync gap (DB vs drafts/decision.md). Output of the weekly-feedback-analysis job, pending commit. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 11:57:24 +00:00
Chaim	7e34c53224	chore(tasks): add #76-78 — Paperclip create-task button + 2 precedent-upload bugs #76 צור-משימה (create-task) button (enabled but submit no-ops), #77 committee-upload field mapping (citation→case_number, case_number uneditable), #78 silent extraction wakeup failure. Discovered while debugging precedent 8027-25. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 11:57:24 +00:00
chaim	77ed0361b7	Merge pull request 'fix(appraiser-facts): valid Paperclip priority enum (normal→medium)' (#37) from fix/appraiser-facts-priority-enum into main	2026-06-02 11:49:23 +00:00
Chaim	5d63a903ce	fix(appraiser-facts): valid Paperclip priority enum (normal→medium) The 'חלץ עובדות שמאיות עכשיו' ("Extract appraiser facts now") button returned HTTP 500. Root cause: wake_analyst_for_appraiser_facts POSTs a child issue to Paperclip with priority='normal', but Paperclip's ISSUE_PRIORITIES enum is only critical|high|medium|low. createChildIssueSchema (Zod) rejects 'normal' with 400 Bad Request; pc_request raise_for_status() turns it into a 500 surfaced to the chair. Fixed to 'medium' (the sole non-normal occurrence in the repo). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-02 11:48:58 +00:00
chaim	aeddcb41eb	Merge pull request 'feat(web-ui): sort corroborated halachot first' (#36) from feat/x11-corroborated-first into main	2026-06-01 05:50:29 +00:00
Chaim	1aadd3b455	feat(web-ui): sort corroborated halachot first in extracted list (X11) Halachot carrying a corroboration badge (positive citation count or a negative treatment) float to the top of 'הלכות שחולצו' ("Extracted halachot"), ordered by corroboration strength; the rest keep document order by halacha_index. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 05:50:12 +00:00
chaim	f66a2a27e7	Merge pull request 'feat(web-ui): X11 corroboration badge on halachot' (#35) from feat/x11-corroboration-web-ui into main	2026-06-01 05:04:58 +00:00
Chaim	f46bf47d5b	feat(web-ui): expose citation-corroboration badge on halachot (X11) - db.list_halachot: aggregate corroboration_count (distinct positive sources) + corroboration_negative from halacha_citation_corroboration (LEFT JOIN) - web-ui: CorroborationBadge — 'מתוקף · N ציטוטים' ("Validated · N citations") at ≥2 (gold), soft single citation, danger badge on negative treatment; native title tooltips - shown in ExtractedHalachotSection (per-precedent) + halacha review panel Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 05:04:31 +00:00
chaim	9f2adc4dd0	Merge pull request 'docs(X11): wire corroboration tools into CEO flow + user guide' (#34) from docs/x11-phase2-tool-integration into main	2026-06-01 04:52:25 +00:00
Chaim	e79f74bc23	docs(X11): wire corroboration tools into CEO halacha flow + guide (X11 Phase 2) - CEO: run corroboration_rebuild after precedent_process_pending(halacha); report {approved, demoted}; tools added to allowlist - researcher: halacha_corroboration (read) in allowlist - TaskMaster #75 → done Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 04:52:02 +00:00
chaim	3bd2d16652	Merge pull request 'feat(X11): citation-corroboration Phase 2 — wire the approval gate + backfill' (#33) from feat/x11-corroboration-phase2 into main	2026-06-01 04:43:24 +00:00
Chaim	b4d1fc5539	docs(audit): X11 Phase 2 corroboration backfill result (X11 Phase 2) 12 precedents, 20 links, 0 negatives. 4 halachot corroborated — all already confidence-approved (signal fully overlaps confidence set), so 0 transitions. Approve path proven in rolled-back tx; no chair-final state touched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 04:41:58 +00:00
Chaim	ed547e20ad	feat(corroboration): wire approval gate + backfill driver + rebuild tool (X11 Phase 2) - db: approve_halacha_by_corroboration (pending_review→approved only), demote_halacha_overruled (approved→pending_review only), list_corroboration_grouped, precedents_with_halachot_and_incoming_citations - corroboration: reconcile_approvals (INV-COR2/COR4/COR5), build_all backfill; build_for_precedent now returns approved/demoted counts - mcp: corroboration_rebuild write tool (single precedent or full-corpus backfill) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 04:35:37 +00:00
Chaim	df007784c9	feat(corroboration): approval_action decision fn + kill-switch (INV-COR2/COR4, X11 Phase 2) - HALACHA_CORROBORATION_AUTO_APPROVE config (default ON, Dafna validated 2026-06-01) - approval_action(agg, has_overruled): overruled→demote, corroborated→approve, else None - 4 offline unit tests; Phase 2 plan + TaskMaster #75 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 04:34:23 +00:00
chaim	391b025e8a	Merge pull request 'feat(halacha): lighter effort for bulk extraction (speed at scale)' (#32) from feat/halacha-bulk-effort into main	2026-05-31 21:34:44 +00:00
Chaim	885cba543e	feat(halacha): lighter effort for BULK queue-drain extraction (speed at scale) xhigh is the quality sweet-spot for a single precedent but very slow at scale (64-chunk case ≈ 20 min). Bulk queue-drains (process_pending over many precedents) now use a lighter effort to cut wall-clock; interactive single re-extraction keeps xhigh quality. - config.HALACHA_BULK_EXTRACT_EFFORT (env, default 'high'; set 'medium' for max speed, 'xhigh' to match single). - extract()/_extract_impl()/_extract_chunk() take an `effort` override threaded to claude_session.query_json; None falls back to HALACHA_EXTRACT_EFFORT (xhigh). - process_pending_extractions(kind='halacha') passes the bulk effort; single reextract_halachot keeps xhigh. Verified end-to-end (mocked LLM): _extract_chunk(effort='medium') → query_json effort='medium'; effort=None → 'xhigh' fallback. Closes the open item in #72. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 21:34:13 +00:00
chaim	acfd5bae3e	Merge pull request 'feat(halacha): crash-safe incremental extraction + resume (A + resume)' (#31) from feat/halacha-incremental-resume into main	2026-05-31 21:28:19 +00:00
Chaim	8e4ea23882	feat(halacha): crash-safe incremental extraction + resume (A + resume) Halacha extraction held ALL chunk results in memory and stored once at the very end — a crash/interrupt mid-run (e.g. the 2026-05-31 freeze) lost everything and re-paid the full LLM cost on retry. Now each chunk's halachot are stored AND the chunk is checkpointed (precedent_chunks.halacha_extracted_at) the moment it finishes: - V25 schema: precedent_chunks.halacha_extracted_at (per-chunk checkpoint). - db.store_halachot_for_chunk: atomic per-chunk insert (halacha_index continues from MAX, caller serializes via an in-process store-lock) + checkpoint mark. - db.reset_halacha_extraction (force) / mark_all_chunks_extracted (legacy backfill). - _extract_impl rewritten: resume by default (skip checkpointed chunks; failed chunks stay pending and are retried; status stays 'processing' until all done); force=True wipes + redoes all. reextract_halachot passes force=True; the queue drain (process_pending) resumes by default. - Legacy guard: a pre-V25 precedent (halachot exist, no checkpoints) is backfilled and treated as complete — never re-extracted (would duplicate). Verified on 9002-24 (55 halachot, legacy): resume → legacy-backfill, NO duplication (stays 55), all chunks checkpointed. Index continuation: store at 55,56 after max 54, no collision. Tracks #72. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 21:27:46 +00:00
chaim	6183e24316	Merge pull request 'fix(halacha): global lock, one extraction at a time (prevents machine freeze)' (#30) from fix/halacha-extract-global-lock into main	2026-05-31 20:43:10 +00:00
Chaim	807053ec54	fix(halacha): global advisory lock — one extraction at a time (prevents box freeze) 2026-05-31: opus-4-8 @ xhigh extraction + overlapping driver processes (agent fallback retries each spawn an independent `python -c` driver; process_pending is serial WITHIN a process but the box ran 4-5 drivers in parallel) → 12-16 concurrent xhigh `claude -p` procs → load 69 → hard reboot. Fix: halacha_extractor.extract() now takes a Postgres advisory lock (pg_try_advisory_lock, key 'HALA') before any work. If another extraction (any process/agent/driver — all share the legal-ai DB) holds it, the call returns status='busy' and the precedent stays pending for the next drain. Guarantees ONE extraction at a time ACROSS PROCESSES — an in-process Semaphore cannot (drivers are separate OS processes). Core logic moved to _extract_impl (unchanged) under the lock. CHUNK_CONCURRENCY now env-tunable (HALACHA_CHUNK_CONCURRENCY, default 3). Verified: while a lock is held, extract() returns 'busy' with no LLM call; lock releases cleanly and the next extraction proceeds. Tracks #72. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 20:42:15 +00:00
chaim	62e5e5183d	Merge pull request 'fix(precedents): appeals-committee decisions are not binding (is_binding=false)' (#29) from fix/committee-decisions-not-binding into main	2026-05-31 20:40:54 +00:00
Chaim	1b62fa4af8	fix(precedents): appeals-committee (ועדת ערר) decisions are never binding (is_binding=false) The precedent-upload screen showed a "הלכה מחייבת" ("binding halacha") checkbox defaulting to true even for appeals-committee decisions (isCommittee), so halachot extracted from a non-binding decision were tagged rule_type='binding', contradicting the doctrinal definition (an appeals committee is persuasive only, not binding like the Supreme Court or an administrative court). - the submission path for appeals-committee decisions now always sends is_binding=false - the checkbox is locked (disabled+unchecked) when an appeals-committee decision is detected, with an explanation that the halachot will be marked persuasive Doctrinal alignment only: no downstream effect on ranking/injection; rule_type is a display label, and the functional gate is review_status. TaskMaster #73 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 20:39:59 +00:00
chaim	e712573766	Merge pull request 'docs(X11): open sources + verifying the decision against the open literature' (#28) from docs/x11-open-sources into main	2026-05-31 19:30:27 +00:00
Chaim	6ed5c9e99f	docs(X11): foreground open-access sources; verify decision against open literature Shifts the focus of the source lines for INV-COR1–COR5 and the G10 amendment from closed products (Shepard's/KeyCite) to open sources that were actually verified, in line with feedback_legal_db_authoritative_sources and the constitution's ≥3-sources protocol: - Fowler et al., Network Analysis and the Law (Political Analysis 2007): inbound citations = an authority measure, validated by predicting future citation (INV-COR1/COR4). - Demir & Canbaz, Validate Your Authority (NLLP/ACL 2025): an LLM classifies precedent treatment at 67.7–79.1%; the imperfect accuracy justifies the conservative safeguards (≥N, human gate, negative→flag) (INV-COR2/COR4/COR5). - CaseHOLD (arXiv 2021): holding-level classification (INV-COR3). LePaRD (arXiv 2023): citation dataset. - Hellyer (LLJ 2018, open-access), NCSC/JTC, CEPEJ, ISO 15489: unchanged, open. Conclusion: the open literature supports the decision (citator + treatment classification + citation-based authority), and if anything strengthens the conservative version. There is no access to the closed Shepard's/KeyCite; information about them came from open secondary sources only. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 19:30:02 +00:00
chaim	a9472187ff	Merge pull request 'feat(X11): citation-corroboration Phase 1 — the signal (no approval change)' (#27) from feat/x11-corroboration-phase1 into main	2026-05-31 19:18:49 +00:00
Chaim	5abfbd2746	feat(mcp): halacha_corroboration read-only tool (INV-COR6, X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 19:07:37 +00:00
Chaim	b57e590275	feat(corroboration): orchestrator + persistence over both citation graphs (X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 19:04:20 +00:00
Chaim	33f955e372	feat(corroboration): aggregator — distinct positive + negative-flag (INV-COR4, X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 19:00:16 +00:00
Chaim	dbc176ae66	feat(corroboration): halacha matcher + cosine threshold (INV-COR3, X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 18:57:47 +00:00
Chaim	09eec6a906	feat(corroboration): treatment classifier + polarity (INV-COR2, X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 18:54:50 +00:00
Chaim	ca31932a5f	feat(db): V24 — citation treatment column + halacha corroboration link table (X11)	2026-05-31 18:52:16 +00:00
Chaim	beba24dfc5	docs(plan): X11 corroboration Phase 1 implementation plan Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 18:50:50 +00:00
chaim	ae8efc0b63	Merge pull request 'feat(spec): X11 citation-corroboration + INV-G10 amendment + Opus 4.8 for halacha extraction' (#26) from feat/x11-citation-corroboration into main	2026-05-31 18:42:59 +00:00
Chaim	887079535c	feat(spec): X11 citation-corroboration + INV-G10 amendment + Opus 4.8 halacha extraction New spec for an internal citator layer: validating halachot by cumulative judicial treatment (inbound citations), to reduce the scope of the chair's manual approval: - docs/spec/X11-citation-corroboration.md: 6 invariants (INV-COR1–COR6), each with ≥3 professional sources (Shepard's/KeyCite, Hellyer LLJ 2018, UNC Law, NCSC/JTC, CEPEJ). - docs/spec/00-constitution.md: controlled amendment to INV-G10: the gate is satisfied by cumulative judicial treatment for the positive subset; the chair gate remains mandatory for the tail and for negatives. + X11 in the index. - Opus 4.8 @ xhigh as the halacha-extraction model (config HALACHA_EXTRACT_MODEL/EFFORT, env-tunable; claude_session model/effort params; halacha_extractor wired). Based on the 2026-05-31 A/B: less over-extraction, 100% quote-verified, calibrated confidence. - scripts/ab_halacha_opus48.py: non-destructive A/B harness for comparing model/effort in halacha extraction. - .taskmaster #70 (FU-2c-b): documents the שפר dedup + corpus scan (0 stuck stubs remain). Prerequisite (clean identity) completed: שפר merged into a canonical record + scan of 128 records. Audit findings are disclosed in X11 §7: halacha↔citation linking + treatment classification = greenfield, left to the implementation plan. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 18:42:13 +00:00
chaim	d83a2a2fb2	Merge pull request 'docs(spec): cycle-2: 8 application surfaces (X6–X10) + ui-audit + GAP-24..62/FU-9..15' (#25) from docs/fu9-15-cycle2-spec into main	2026-05-31 16:22:42 +00:00
Chaim	37c56ff22a	docs(spec): cycle-2 — 8 application-surface domains (X6–X10) + ui-audit + GAP-24..62/FU-9..15 Extends the system spec beyond the core pipeline to the 8 surfaces outside it: - X6 UI↔API contract + design rules (INV-UI1..6) - X7 Paperclip client & connection params (INV-INT4..8) - X8 field-population & extraction provenance (INV-FP1..5) - X9 MCP tool contract — 71 tools (INV-TOOL1..6) - X10 deploy/env/secrets (INV-ENV1..5) - ui-audit.md — page-by-page UI audit (13 pages) - 02-data-model: derived-entity invariants (INV-DM4..6) - X4-agents: tool-grant map + INV-AG3 - gap-audit: GAP-24..62 → FU-9..15; cycle-1 (FU-1..8b) marked done - constitution §7 + README index (X1..X10) Planning/spec artifacts only — no application code. All engineering invariants backed by ≥3 sources; every finding carries verified file:line. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 16:21:27 +00:00
chaim	c70a03f91e	Merge pull request 'chore(tasks): #71 — FU-5 follow-up (multi-precedent recall depth)' (#24) from chore/task-71-retrieval-depth into main	2026-05-31 16:06:12 +00:00
Chaim	1cc7c0e757	chore(tasks): #71 — FU-5 follow-up, multi-precedent recall depth tuning Diagnosis from the FU-5 eval: co-relevant precedents for broad legal questions rank 15-16 (retrieved, not absent — recall ~1.0 by rank 20). Tracked as a deliberate, harness-measured tuning task rather than an unmeasured global limit change (which affects UI + writer agents + token cost). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 16:05:53 +00:00
chaim	ae7d475103	Merge pull request 'FU-8b: wiring the spec into agents, INV-AG1 read-before-act (GAP-23)' (#23) from feat/fu-8b-spec-wiring-agents into main	2026-05-31 16:03:45 +00:00
Chaim	a02a606f34	feat(agents): wire spec into agents — INV-AG1 read-before-act gate (FU-8b/GAP-23) Wires the system spec into the Paperclip agents so that every agent must read 00-constitution first, then the domain spec relevant to its role (per the X4 §2 table), before any substantive work. - HEARTBEAT.md: a top section, "Spec reading: first the constitution (00), then the domain spec", before §0–§8, with a role→spec table for the 8 agents. - 8 agent files (ceo/proofreader/researcher/analyst/writer/qa/exporter/hermes): a "Read before acting (INV-AG1)" section at the top of the body. - X4-agents.md: the INV-AG1 "enforcement" field → "wired (procedural)"; §5 → "done". Enforcement is deliberately procedural: this is a project-operational invariant, and no code gate forces the reading. Prereq for the process agents (sub-project 5). gap-audit is kept as a snapshot (as in FU-8a). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 16:02:04 +00:00
chaim	ff5187c9c1	Merge pull request 'chore(eval): add 9 chair-approved semantic queries to FU-5 gold-set' (#22) from chore/goldset-semantic-queries into main	2026-05-31 15:58:10 +00:00
Chaim	7161c3d010	chore(eval): add 9 chair-approved semantic queries to gold-set (FU-5) The gold-set was 77 known-item probes (query=case_name). Added 9 chair-approved SEMANTIC queries (S1–S9) — a real legal question per row, relevant = the precedents that should surface (drawn from subject_tags, chair-confirmed). These test what matters: does retrieval answer a legal issue, not just find a case by name. source='chair' (preserved across re-bootstrap). practice_area left empty so the filter never excludes a cross-tagged precedent (s.197 rulings sit under betterment_levy). Baseline now 86 queries. Finding from the 9 semantic queries: MRR ≈ 1.0 — the system surfaces a lead relevant precedent at rank 1 for nearly every question — but R@10 ranges 0.5–1.0: for broad questions with many co-relevant precedents (e.g. נטרול תמ"א 38 [neutralizing TAMA 38] = 5 relevant → R@10 0.60; שמאי מכריע [deciding appraiser] = 2 → 0.50) some co-relevant rulings miss the top-10. Lead-precedent retrieval is strong; exhaustive multi-precedent recall is the gap. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 15:57:45 +00:00
chaim	eef04b0f09	Merge pull request 'chore(eval): chair fix — rename ARAR-24-9002 → קרקעות ירושלים 2 + refresh gold-set' (#21) from chore/goldset-chair-fix-arar into main	2026-05-31 15:48:17 +00:00
Chaim	411ee18786	chore(eval): chair review — rename code-named record + refresh gold-set Chair review of the FU-5 gold-set surfaced one internal_committee record whose case_name was a code ("ARAR-24-9002") rather than a real name. Per the chair's citation (ערר 9002/24 קרקעות ירושלים 2 בע"מ נ' הוועדה המקומית ירושלים, נבו 13.8.2025, a s.197 compensation appeal), case_name corrected in the DB to "קרקעות ירושלים 2" (case_number 9002-24 and citation_formatted were already correct; only 1 such code-named record exists corpus-wide). Re-bootstrapped the gold-set (the known-item query is now the real name) and refreshed baseline (aggregate unchanged — the case retrieves identically under the corrected name). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 15:47:57 +00:00
chaim	83d6b5ecf0	Merge pull request 'fix: drop gold-set card from chair approval center (data/ not in image)' (#20) from fix/chair-pending-drop-goldset-card into main	2026-05-31 15:41:40 +00:00
Chaim	c231782ee8	fix(ui): drop gold-set card from /api/chair/pending — data/ excluded from image The gold-set card read data/eval/gold-set.jsonl, but .dockerignore excludes data/ from the build context, so the file is never in the container and the card silently never rendered. Baking eval data into the image is the wrong layering (data/ is runtime volumes). The gold-set review is a one-time task, not a recurring chair queue, so it doesn't belong on the live board — it's tracked via task #63 and reviewed directly with the chair. The board now returns the 4 robust DB-backed gates (halachot, missing precedents, feedback, qa_failed). Removes the best-effort file read + its unused Path import. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 15:41:00 +00:00
chaim	dfa2f5bd7f	Merge pull request 'מרכז אישורים — chair approval center (everything Dafna must approve, in one page)' (#19) from feat/chair-approval-center into main	2026-05-31 15:37:00 +00:00
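
Commit df007784c9 above describes the decision rule `approval_action(agg, has_overruled)`: overruled→demote, corroborated→approve, else None, guarded by the HALACHA_CORROBORATION_AUTO_APPROVE kill-switch. A minimal sketch of that rule follows, not the repository's actual code. The `CorroborationAggregate` shape, the `min_sources` threshold of 2, the string return values, and the choice that the kill-switch disables both actions are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: illustrative threshold; the log only mentions a ">=N" safeguard.
MIN_POSITIVE_SOURCES = 2


@dataclass(frozen=True)
class CorroborationAggregate:
    """Hypothetical shape for the `agg` argument."""
    distinct_positive_sources: int


def approval_action(
    agg: CorroborationAggregate,
    has_overruled: bool,
    auto_approve: bool = True,          # stands in for HALACHA_CORROBORATION_AUTO_APPROVE
    min_sources: int = MIN_POSITIVE_SOURCES,
) -> Optional[str]:
    """Return "demote", "approve", or None for one halacha."""
    if not auto_approve:
        # Assumption: with the kill-switch off, no automatic transition happens.
        return None
    if has_overruled:
        # Negative treatment wins over any number of positive citations.
        return "demote"
    if agg.distinct_positive_sources >= min_sources:
        return "approve"
    return None
```

Checking the overruled case first keeps the rule conservative: a halacha with many positive citations and one overruling treatment still goes back to review rather than being approved.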


516 Commits