legal-ai

Author	SHA1	Message	Date
chaim	aeddcb41eb	Merge pull request 'feat(web-ui): sort corroborated halachot first' (#36) from feat/x11-corroborated-first into main	2026-06-01 05:50:29 +00:00
Chaim	1aadd3b455	feat(web-ui): sort corroborated halachot first in extracted list (X11) Halachot carrying a corroboration badge (positive citation count or a negative treatment) float to the top of 'הלכות שחולצו' ("Extracted halachot"), ordered by corroboration strength; the rest keep document order by halacha_index. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 05:50:12 +00:00
chaim	f66a2a27e7	Merge pull request 'feat(web-ui): X11 corroboration badge on halachot' (#35) from feat/x11-corroboration-web-ui into main	2026-06-01 05:04:58 +00:00
Chaim	f46bf47d5b	feat(web-ui): expose citation-corroboration badge on halachot (X11) - db.list_halachot: aggregate corroboration_count (distinct positive sources) + corroboration_negative from halacha_citation_corroboration (LEFT JOIN) - web-ui: CorroborationBadge: 'מתוקף · N ציטוטים' ("Validated · N citations") at ≥2 (gold), soft single citation, danger badge on negative treatment; native title tooltips - shown in ExtractedHalachotSection (per-precedent) + halacha review panel Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 05:04:31 +00:00
chaim	9f2adc4dd0	Merge pull request 'docs(X11): wire corroboration tools into CEO flow + user guide' (#34) from docs/x11-phase2-tool-integration into main	2026-06-01 04:52:25 +00:00
Chaim	e79f74bc23	docs(X11): wire corroboration tools into CEO halacha flow + guide (X11 Phase 2) - CEO: run corroboration_rebuild after precedent_process_pending(halacha); report {approved, demoted}; tools added to allowlist - researcher: halacha_corroboration (read) in allowlist - TaskMaster #75 → done Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 04:52:02 +00:00
chaim	3bd2d16652	Merge pull request 'feat(X11): citation-corroboration Phase 2 — wire the approval gate + backfill' (#33) from feat/x11-corroboration-phase2 into main	2026-06-01 04:43:24 +00:00
Chaim	b4d1fc5539	docs(audit): X11 Phase 2 corroboration backfill result (X11 Phase 2) 12 precedents, 20 links, 0 negatives. 4 halachot corroborated — all already confidence-approved (signal fully overlaps confidence set), so 0 transitions. Approve path proven in rolled-back tx; no chair-final state touched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 04:41:58 +00:00
Chaim	ed547e20ad	feat(corroboration): wire approval gate + backfill driver + rebuild tool (X11 Phase 2) - db: approve_halacha_by_corroboration (pending_review→approved only), demote_halacha_overruled (approved→pending_review only), list_corroboration_grouped, precedents_with_halachot_and_incoming_citations - corroboration: reconcile_approvals (INV-COR2/COR4/COR5), build_all backfill; build_for_precedent now returns approved/demoted counts - mcp: corroboration_rebuild write tool (single precedent or full-corpus backfill) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 04:35:37 +00:00
Chaim	df007784c9	feat(corroboration): approval_action decision fn + kill-switch (INV-COR2/COR4, X11 Phase 2) - HALACHA_CORROBORATION_AUTO_APPROVE config (default ON, Dafna validated 2026-06-01) - approval_action(agg, has_overruled): overruled→demote, corroborated→approve, else None - 4 offline unit tests; Phase 2 plan + TaskMaster #75 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 04:34:23 +00:00
chaim	391b025e8a	Merge pull request 'feat(halacha): lighter effort for bulk extraction (speed at scale)' (#32) from feat/halacha-bulk-effort into main	2026-05-31 21:34:44 +00:00
Chaim	885cba543e	feat(halacha): lighter effort for BULK queue-drain extraction (speed at scale) xhigh is the quality sweet-spot for a single precedent but very slow at scale (64-chunk case ≈ 20 min). Bulk queue-drains (process_pending over many precedents) now use a lighter effort to cut wall-clock; interactive single re-extraction keeps xhigh quality. - config.HALACHA_BULK_EXTRACT_EFFORT (env, default 'high'; set 'medium' for max speed, 'xhigh' to match single). - extract()/_extract_impl()/_extract_chunk() take an `effort` override threaded to claude_session.query_json; None falls back to HALACHA_EXTRACT_EFFORT (xhigh). - process_pending_extractions(kind='halacha') passes the bulk effort; single reextract_halachot keeps xhigh. Verified end-to-end (mocked LLM): _extract_chunk(effort='medium') → query_json effort='medium'; effort=None → 'xhigh' fallback. Closes the open item in #72. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 21:34:13 +00:00
chaim	acfd5bae3e	Merge pull request 'feat(halacha): crash-safe incremental extraction + resume (A + resume)' (#31) from feat/halacha-incremental-resume into main	2026-05-31 21:28:19 +00:00
Chaim	8e4ea23882	feat(halacha): crash-safe incremental extraction + resume (A + resume) Halacha extraction held ALL chunk results in memory and stored once at the very end — a crash/interrupt mid-run (e.g. the 2026-05-31 freeze) lost everything and re-paid the full LLM cost on retry. Now each chunk's halachot are stored AND the chunk is checkpointed (precedent_chunks.halacha_extracted_at) the moment it finishes: - V25 schema: precedent_chunks.halacha_extracted_at (per-chunk checkpoint). - db.store_halachot_for_chunk: atomic per-chunk insert (halacha_index continues from MAX, caller serializes via an in-process store-lock) + checkpoint mark. - db.reset_halacha_extraction (force) / mark_all_chunks_extracted (legacy backfill). - _extract_impl rewritten: resume by default (skip checkpointed chunks; failed chunks stay pending and are retried; status stays 'processing' until all done); force=True wipes + redoes all. reextract_halachot passes force=True; the queue drain (process_pending) resumes by default. - Legacy guard: a pre-V25 precedent (halachot exist, no checkpoints) is backfilled and treated as complete — never re-extracted (would duplicate). Verified on 9002-24 (55 halachot, legacy): resume → legacy-backfill, NO duplication (stays 55), all chunks checkpointed. Index continuation: store at 55,56 after max 54, no collision. Tracks #72. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 21:27:46 +00:00
chaim	6183e24316	Merge pull request 'fix(halacha): global lock, one extraction at a time (prevents machine freeze)' (#30) from fix/halacha-extract-global-lock into main	2026-05-31 20:43:10 +00:00
Chaim	807053ec54	fix(halacha): global advisory lock — one extraction at a time (prevents box freeze) 2026-05-31: opus-4-8 @ xhigh extraction + overlapping driver processes (agent fallback retries each spawn an independent `python -c` driver; process_pending is serial WITHIN a process but the box ran 4-5 drivers in parallel) → 12-16 concurrent xhigh `claude -p` procs → load 69 → hard reboot. Fix: halacha_extractor.extract() now takes a Postgres advisory lock (pg_try_advisory_lock, key 'HALA') before any work. If another extraction (any process/agent/driver — all share the legal-ai DB) holds it, the call returns status='busy' and the precedent stays pending for the next drain. Guarantees ONE extraction at a time ACROSS PROCESSES — an in-process Semaphore cannot (drivers are separate OS processes). Core logic moved to _extract_impl (unchanged) under the lock. CHUNK_CONCURRENCY now env-tunable (HALACHA_CHUNK_CONCURRENCY, default 3). Verified: while a lock is held, extract() returns 'busy' with no LLM call; lock releases cleanly and the next extraction proceeds. Tracks #72. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 20:42:15 +00:00
chaim	62e5e5183d	Merge pull request 'fix(precedents): appeals committee decisions are not binding (is_binding=false)' (#29) from fix/committee-decisions-not-binding into main	2026-05-31 20:40:54 +00:00
Chaim	1b62fa4af8	fix(precedents): appeals committee (ועדת ערר) decisions are never binding (is_binding=false) The case-law upload screen showed a "binding halacha" checkbox defaulting to true even for appeals committee decisions (isCommittee), so halachot extracted from a non-binding decision were tagged rule_type='binding', contradicting the doctrinal definition (appeals committee = persuasive only, not binding like Supreme Court or administrative court rulings). - The submission path for appeals committee decisions now always sends is_binding=false - The checkbox is locked (disabled+unchecked) when an appeals committee decision is detected, with an explanation that the halachot will be marked persuasive Doctrinal alignment only: no downstream effect on ranking/injection; rule_type is a display label, and the functional gate is review_status. TaskMaster #73 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 20:39:59 +00:00
chaim	e712573766	Merge pull request 'docs(X11): open sources + verification of the decision against the open literature' (#28) from docs/x11-open-sources into main	2026-05-31 19:30:27 +00:00
Chaim	6ed5c9e99f	docs(X11): foreground open-access sources; verify decision against open literature Shifted the focus of the source lines for INV-COR1–COR5 and the G10 amendment from closed products (Shepard's/KeyCite) to open sources that were actually verified, in line with feedback_legal_db_authoritative_sources and the constitution's ≥3-source protocol: - Fowler et al., Network Analysis and the Law (Political Analysis 2007): inbound citations = an authority measure, validated by predicting future citation (INV-COR1/COR4). - Demir & Canbaz, Validate Your Authority (NLLP/ACL 2025): an LLM classifies precedent treatment at 67.7–79.1%; the imperfect accuracy justifies the conservative safeguards (≥N, human gate, negative→flag) (INV-COR2/COR4/COR5). - CaseHOLD (arXiv 2021): holding-level classification (INV-COR3). LePaRD (arXiv 2023): citation dataset. - Hellyer (LLJ 2018, open-access), NCSC/JTC, CEPEJ, ISO 15489: unchanged, open. Conclusion: the open literature supports the decision (citator + treatment classification + citation-based authority), and if anything strengthens the conservative version. There is no access to the closed Shepard's/KeyCite products; information about them came only from open secondary sources. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 19:30:02 +00:00
chaim	a9472187ff	Merge pull request 'feat(X11): citation-corroboration Phase 1 — the signal (no approval change)' (#27) from feat/x11-corroboration-phase1 into main	2026-05-31 19:18:49 +00:00
Chaim	5abfbd2746	feat(mcp): halacha_corroboration read-only tool (INV-COR6, X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 19:07:37 +00:00
Chaim	b57e590275	feat(corroboration): orchestrator + persistence over both citation graphs (X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 19:04:20 +00:00
Chaim	33f955e372	feat(corroboration): aggregator — distinct positive + negative-flag (INV-COR4, X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 19:00:16 +00:00
Chaim	dbc176ae66	feat(corroboration): halacha matcher + cosine threshold (INV-COR3, X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 18:57:47 +00:00
Chaim	09eec6a906	feat(corroboration): treatment classifier + polarity (INV-COR2, X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 18:54:50 +00:00
Chaim	ca31932a5f	feat(db): V24 — citation treatment column + halacha corroboration link table (X11)	2026-05-31 18:52:16 +00:00
Chaim	beba24dfc5	docs(plan): X11 corroboration Phase 1 implementation plan Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 18:50:50 +00:00
chaim	ae8efc0b63	Merge pull request 'feat(spec): X11 citation-corroboration + INV-G10 amendment + Opus 4.8 for halacha extraction' (#26) from feat/x11-citation-corroboration into main	2026-05-31 18:42:59 +00:00
Chaim	887079535c	feat(spec): X11 citation-corroboration + INV-G10 amendment + Opus 4.8 halacha extraction New spec for an internal citator layer: validating halachot by cumulative judicial treatment (inbound citations), to reduce the scope of the chair's manual approval: - docs/spec/X11-citation-corroboration.md: 6 invariants (INV-COR1–COR6), each with ≥3 professional sources (Shepard's/KeyCite, Hellyer LLJ 2018, UNC Law, NCSC/JTC, CEPEJ). - docs/spec/00-constitution.md: controlled amendment to INV-G10: the gate is satisfied by cumulative judicial treatment for the positive subset; the chair gate remains mandatory for the tail and for negatives. + X11 in the index. - Opus 4.8 @ xhigh as the halacha extraction model (config HALACHA_EXTRACT_MODEL/EFFORT, env-tunable; claude_session model/effort params; halacha_extractor wired). Based on the 2026-05-31 A/B: less over-extraction, 100% quote-verified, calibrated confidence. - scripts/ab_halacha_opus48.py: non-destructive A/B harness for comparing model/effort in halacha extraction. - .taskmaster #70 (FU-2c-b): documents the Shefer (שפר) dedup + corpus scan (0 stuck stubs remaining). Prerequisite (clean identity) completed: Shefer merged into a canonical record + scan of 128 records. Audit findings visible in X11 §7: halacha↔citation linking + treatment classification = greenfield, deferred to the implementation plan. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 18:42:13 +00:00
chaim	d83a2a2fb2	Merge pull request 'docs(spec): cycle 2: 8 application surfaces (X6–X10) + ui-audit + GAP-24..62/FU-9..15' (#25) from docs/fu9-15-cycle2-spec into main	2026-05-31 16:22:42 +00:00
Chaim	37c56ff22a	docs(spec): cycle-2 — 8 application-surface domains (X6–X10) + ui-audit + GAP-24..62/FU-9..15 Extends the system spec beyond the core pipeline to the 8 surfaces outside it: - X6 UI↔API contract + design rules (INV-UI1..6) - X7 Paperclip client & connection params (INV-INT4..8) - X8 field-population & extraction provenance (INV-FP1..5) - X9 MCP tool contract — 71 tools (INV-TOOL1..6) - X10 deploy/env/secrets (INV-ENV1..5) - ui-audit.md — page-by-page UI audit (13 pages) - 02-data-model: derived-entity invariants (INV-DM4..6) - X4-agents: tool-grant map + INV-AG3 - gap-audit: GAP-24..62 → FU-9..15; cycle-1 (FU-1..8b) marked done - constitution §7 + README index (X1..X10) Planning/spec artifacts only — no application code. All engineering invariants backed by ≥3 sources; every finding carries verified file:line. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>	2026-05-31 16:21:27 +00:00
chaim	c70a03f91e	Merge pull request 'chore(tasks): #71 — FU-5 follow-up (multi-precedent recall depth)' (#24) from chore/task-71-retrieval-depth into main	2026-05-31 16:06:12 +00:00
Chaim	1cc7c0e757	chore(tasks): #71 — FU-5 follow-up, multi-precedent recall depth tuning Diagnosis from the FU-5 eval: co-relevant precedents for broad legal questions rank 15-16 (retrieved, not absent — recall ~1.0 by rank 20). Tracked as a deliberate, harness-measured tuning task rather than an unmeasured global limit change (which affects UI + writer agents + token cost). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 16:05:53 +00:00
chaim	ae7d475103	Merge pull request 'FU-8b: wire the spec into the agents, INV-AG1 read-before-act (GAP-23)' (#23) from feat/fu-8b-spec-wiring-agents into main	2026-05-31 16:03:45 +00:00
Chaim	a02a606f34	feat(agents): wire spec into agents, INV-AG1 read-before-act gate (FU-8b/GAP-23) Wires the system spec into the Paperclip agents so that every agent must read 00-constitution first, then the domain spec relevant to its role (per the X4 §2 table), before substantive work. - HEARTBEAT.md: top section "Spec reading: constitution (00) first, then the domain spec" before §0–§8, with a role→spec table for the 8 agents. - 8 agent files (ceo/proofreader/researcher/analyst/writer/qa/exporter/hermes): a "Read before acting (INV-AG1)" section at the top of the body. - X4-agents.md: INV-AG1 "enforcement" field → "wired (procedural)"; §5 → "done". Enforcement is deliberately procedural: this is a project-operational invariant, and no code gate forces the reading. Prereq for the process agents (sub-project 5). gap-audit kept as a snapshot (as in FU-8a). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 16:02:04 +00:00
chaim	ff5187c9c1	Merge pull request 'chore(eval): add 9 chair-approved semantic queries to FU-5 gold-set' (#22) from chore/goldset-semantic-queries into main	2026-05-31 15:58:10 +00:00
Chaim	7161c3d010	chore(eval): add 9 chair-approved semantic queries to gold-set (FU-5) The gold-set was 77 known-item probes (query=case_name). Added 9 chair-approved SEMANTIC queries (S1–S9) — a real legal question per row, relevant = the precedents that should surface (drawn from subject_tags, chair-confirmed). These test what matters: does retrieval answer a legal issue, not just find a case by name. source='chair' (preserved across re-bootstrap). practice_area left empty so the filter never excludes a cross-tagged precedent (s.197 rulings sit under betterment_levy). Baseline now 86 queries. Finding from the 9 semantic queries: MRR ≈ 1.0 — the system surfaces a lead relevant precedent at rank 1 for nearly every question — but R@10 ranges 0.5–1.0: for broad questions with many co-relevant precedents (e.g. נטרול תמ"א 38 = 5 relevant → R@10 0.60; שמאי מכריע = 2 → 0.50) some co-relevant rulings miss the top-10. Lead-precedent retrieval is strong; exhaustive multi-precedent recall is the gap. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 15:57:45 +00:00
chaim	eef04b0f09	Merge pull request 'chore(eval): chair fix — rename ARAR-24-9002 → קרקעות ירושלים 2 + refresh gold-set' (#21) from chore/goldset-chair-fix-arar into main	2026-05-31 15:48:17 +00:00
Chaim	411ee18786	chore(eval): chair review — rename code-named record + refresh gold-set Chair review of the FU-5 gold-set surfaced one internal_committee record whose case_name was a code ("ARAR-24-9002") rather than a real name. Per the chair's citation (ערר 9002/24 קרקעות ירושלים 2 בע"מ נ' הוועדה המקומית ירושלים, נבו 13.8.2025, a s.197 compensation appeal), case_name corrected in the DB to "קרקעות ירושלים 2" (case_number 9002-24 and citation_formatted were already correct; only 1 such code-named record exists corpus-wide). Re-bootstrapped the gold-set (the known-item query is now the real name) and refreshed baseline (aggregate unchanged — the case retrieves identically under the corrected name). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 15:47:57 +00:00
chaim	83d6b5ecf0	Merge pull request 'fix: drop gold-set card from chair approval center (data/ not in image)' (#20) from fix/chair-pending-drop-goldset-card into main	2026-05-31 15:41:40 +00:00
Chaim	c231782ee8	fix(ui): drop gold-set card from /api/chair/pending — data/ excluded from image The gold-set card read data/eval/gold-set.jsonl, but .dockerignore excludes data/ from the build context, so the file is never in the container and the card silently never rendered. Baking eval data into the image is the wrong layering (data/ is runtime volumes). The gold-set review is a one-time task, not a recurring chair queue, so it doesn't belong on the live board — it's tracked via task #63 and reviewed directly with the chair. The board now returns the 4 robust DB-backed gates (halachot, missing precedents, feedback, qa_failed). Removes the best-effort file read + its unused Path import. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 15:41:00 +00:00
chaim	dfa2f5bd7f	Merge pull request 'מרכז אישורים (approvals center): chair approval center (everything Dafna must approve, in one page)' (#19) from feat/chair-approval-center into main	2026-05-31 15:37:00 +00:00
Chaim	19d3dc81d0	feat(ui): chair approval center — one page for every pending human-gate (#63 follow-up) Dafna asked for a single page under the prod site listing everything she needs to approve, so nothing is forgotten — the visible embodiment of INV-G10 (human gates) and INV-QA1 (halacha backlog must be visible). Backend — GET /api/chair/pending aggregates every pending chair gate, each as a direct source query (count + sample + action link): - halachot review backlog (review_status='pending_review') + oldest - open missing precedents - unresolved chair_feedback - qa_failed cases - gold-set review (FU-5, file-based, best-effort: total vs source='chair') Frontend — /approvals page ("מרכז אישורים"): - src/lib/api/chair.ts — usePendingApprovals() (hand-typed until next api:types) - src/app/approvals/page.tsx — card per category, severity-coloured count, sample rows, oldest-pending date, link to where each is handled; live (60s refetch) - app-shell nav: "מרכז אישורים" in the work group + total-pending badge (quiet at 0) Live counts at build time surfaced the value immediately: 226 open missing precedents, 178 pending halachot, 20 unapplied feedback notes, 1 qa_failed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 15:36:29 +00:00
chaim	aee2140b0b	Merge pull request 'FU-5 — retrieval eval harness + halacha backlog visibility (#63)' (#18) from feat/fu5-eval-harness-backlog-visibility into main	2026-05-31 14:58:47 +00:00
Chaim	6ff2e36bf9	feat(eval): FU-5 — retrieval eval harness + halacha backlog visibility (#63) Covers GAP-11 (INV-RET4/G8) and GAP-14 (INV-QA1/G10). Retrieval quality was never measured (only telemetry observation) and the halacha review backlog was invisible (the 10/19 gap was found by accident). Unit B — backlog visibility (pure code, container): - metrics.halacha_backlog(conn) → {pending_review, approved, rejected, published, total, oldest_pending_at}; surfaced in metrics.get_dashboard() (get_metrics MCP tool) and /api/system/diagnostics. Live count revealed 178 pending / 1552 total, oldest from 2026-05-03 — previously invisible. Unit A — retrieval eval harness (host-side scripts): - scripts/eval_gold_bootstrap.py — seeds data/eval/gold-set.jsonl. Two sources: citations (cited==relevant via search_relevance_feedback — empty until decisions cite precedents) and known_item (query=case_name → relevant=self; a real citation-free signal, the methodology #52 checked by hand). Idempotent; preserves source='chair' rows. - scripts/eval_retrieval.py — runs the production retrieval path (search_library / search_internal) over the gold-set; computes precision@k, recall@k, MRR, nDCG@k (k=5,10); aggregates overall + per-corpus + per-practice_area; writes a report and a delta vs committed baseline.json (which records the retrieval_config it reflects). --self-test unit-checks the metric math offline. Gold-set strategy = hybrid (chair decision): bootstrap + chair review. The citation source is empty today (0 cited precedents in decisions), so the seed is known-item (77 queries: 54 internal_decisions + 23 precedent_library). The gold-set is PROVISIONAL until Dafna reviews it (the domain chair-gate). Baseline (production config: multimodal+rerank on): R@10=0.987, MRR=0.837, nDCG@10=0.872. Finding: MULTIMODAL_ENABLED=true slightly lowers known-item recall (image-page results displace exact name matches) — relevant to #15. precedent_library weaker than internal (R@10 0.957 vs 1.0) — one external precedent unfindable by name. "CI gate" realized as discipline (re-runnable harness + committed baseline + run before/after any retrieval-layer change) — retrieval needs prod DB + Voyage, no CI runner has that access. Spec: docs/superpowers/specs/2026-05-31-fu5-eval-harness-design.md Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 14:58:13 +00:00
chaim	cfcac80de2	Merge pull request 'FU-2c — reconcile external case_law identifiers (GAP-08, #68)' (#17) from feat/fu2c-external-id-reconciliation into main	2026-05-31 14:13:25 +00:00
Chaim	4fce9d503f	feat(migration): FU-2c — reconcile external case_law identifiers (GAP-08, #68 ) External court precedents stored the full citation (designator + docket + parties + Nevo date) inside case_number, violating INV-ID2/G1 (citation as identifier). Chair decision 2026-05-31 (Option A): canonical external case_number = proceeding-designator + docket, '/' preserved (court convention, not X1's '/'→'-'); parties/court/date → citation_formatted. scripts/fu2c_reconcile_external_case_numbers.py — deterministic dry-run → chair-review → apply, mirroring FU-2b: - extracts designator+docket; flags split into BLOCKING (MISMATCH / CIT_NO_DOCKET / DESIG_MISMATCH / DUP_CHECK / NO_DOCKET) vs ADVISORY (NO_CITATION — case_number fix still deterministic, missing citation is a separate gap), so advisory rows apply while uncertain identity does not. - --overrides CSV (id,proposed_canonical,citation_formatted,reason) for audited chair adjudication of blocking rows. - apply scoped to source_kind='external_upload' (task target) while keeping cited_only/nevo_seed in the reconciliation VIEW so DUP_CHECK spans the full external unique space; pre-flight collision guard before every UPDATE. Applied to production 2026-05-31: 21 case_number normalized + 3 citation_formatted reconciled (D = consolidated Supreme Court judgment לויתן/קלמנוביץ → lead docket 25226-04-25; 2×C empty citations composed from metadata). אהוד שפר עע"מ 317/10 deferred — cross-source duplicate with an existing cited_only reference (collision guard held; → #70). 49 cited_only records out of scope → new task #70 (committee-form NNNN-NN dockets the extractor misses, dedup, unresolvable "ערר אדלר"). Extraction + gating verified offline on all 24 records. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 14:12:45 +00:00
chaim	9dbc1bafbf	Merge pull request 'FU-8a: process→code guards (GAP-21/22)' (#16) from fix/fu8a-process-to-code-guards into main	2026-05-31 11:36:07 +00:00
Chaim	e5b34e01dc	docs(scripts): note sync --verify drift-gate semantics (FU-8a)	2026-05-31 11:36:06 +00:00
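
The decision rule named in commit df007784c9 (approval_action(agg, has_overruled): overruled→demote, corroborated→approve, else None) could look roughly like the sketch below. The log gives only the function name, its two arguments, and the three outcomes. The Aggregate class, its positive_sources field, the min_sources threshold of 2, and the way the HALACHA_CORROBORATION_AUTO_APPROVE switch short-circuits the function are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Aggregate:
    """Hypothetical per-halacha aggregate; the field name is illustrative."""
    positive_sources: int  # number of distinct positively-citing sources


def approval_action(
    agg: Aggregate,
    has_overruled: bool,
    *,
    auto_approve: bool = True,  # stands in for HALACHA_CORROBORATION_AUTO_APPROVE
    min_sources: int = 2,       # assumed threshold; the log does not state it
) -> Optional[str]:
    """Return "demote", "approve", or None for one halacha."""
    if not auto_approve:
        # Assumption: the kill-switch disables every automatic transition.
        return None
    if has_overruled:
        # A negative treatment takes priority over any positive count.
        return "demote"
    if agg.positive_sources >= min_sources:
        return "approve"
    return None


if __name__ == "__main__":
    print(approval_action(Aggregate(positive_sources=3), has_overruled=True))
    print(approval_action(Aggregate(positive_sources=2), has_overruled=False))
    print(approval_action(Aggregate(positive_sources=1), has_overruled=False))
```

Checking has_overruled before the positive count keeps the rule conservative: a halacha with many supporting citations is still sent back to review once a negative treatment appears.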

509 Commits
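
Commit 6ff2e36bf9 says scripts/eval_retrieval.py computes precision@k, recall@k, MRR and nDCG@k over the gold-set. A minimal sketch of two of those metrics, under their standard definitions, is shown below. The function names and the document IDs are hypothetical and are not taken from the repository's script.

```python
from typing import Sequence, Set


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(ranked[:k]) & relevant)
    return hits / len(relevant)


def reciprocal_rank(ranked: Sequence[str], relevant: Set[str]) -> float:
    """1/rank of the first relevant result, or 0.0 if none is retrieved.

    MRR is the mean of this value over all queries.
    """
    for rank, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


if __name__ == "__main__":
    # Hypothetical query with 5 relevant precedents, 3 of them in the top 10.
    ranked = ["p1", "x1", "p2", "x2", "x3", "p3",
              "x4", "x5", "x6", "x7", "p4", "p5"]
    relevant = {"p1", "p2", "p3", "p4", "p5"}
    print(recall_at_k(ranked, relevant, 10))   # 0.6
    print(reciprocal_rank(ranked, relevant))   # 1.0
```

The example query mirrors the pattern reported in commit 7161c3d010: the lead precedent sits at rank 1, so the reciprocal rank is 1.0, while two co-relevant precedents fall outside the top 10 and recall@10 drops to 0.6.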