legal-ai

Author	SHA1	Message	Date
Chaim	ea8b48c6ac	feat(mcp): FU-14 GAP-45 — extraction_status (חשיפת תור-החילוץ הסמוי) INV-TOOL4 (visibility / persistence). תור בקשות-החילוץ (metadata/halacha) נשמר ב-case_law.{metadata,halacha}_extraction_requested_at ומרוקן ע"י precedent_process_pending — אבל לא היה כלי לראות את עומק-התור. נוסף: - db.extraction_queue_status() — count + גיל הבקשה הוותיקה לכל kind (read-only). - plib.extraction_status() — tool wrapper (envelope _ok/_err). - רישום extraction_status ב-server.py ליד precedent_process_pending. - precedent_process_pending קיבל _clamp_limit (עקביות עם GAP-53). תוספתי, read-only, אפס שבירה. עודכנו X9 (INV-TOOL4 ✅) ו-gap-audit (GAP-45 ✅). py_compile עבר על 3 קבצי הקוד. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 15:00:25 +00:00
Chaim	034b609bd3	feat(mcp): FU-14 GAP-52 — idempotency על case_create/precedent_attach/document_upload INV-TOOL3 (idempotency על מפתח דטרמיניסטי). כל שלושת הכלים מחזירים את הרשומה הקיימת במקום ליצור כפילות: - case_create — מפתח case_number (כבר UNIQUE ב-schema): מחזיר את התיק הקיים במקום unique-violation. - precedent_attach — מפתח (case_id, section_id, citation, quote): צירוף חוזר של אותו ציטוט לאותו סעיף מחזיר את הקיים. - document_upload — מפתח (case_id, SHA-256 של בייטי הקובץ): העלאה חוזרת של אותו קובץ מחזירה את המסמך הקיים ומדלגת על copy+OCR+embed (החלק היקר). נוספה עמודת documents.content_hash (תוספתי, DEFAULT '') + get_document_by_hash. נבחרה בדיקת-מפתח ברמת-אפליקציה (SELECT-לפני-INSERT) ולא UNIQUE-constraint — כדי לא לשבור startup אם קיימים נתונים-כפולים legacy. אין מיגרציה הרסנית. עודכנו docs/spec/X9 (INV-TOOL3 ✅) ו-gap-audit (GAP-52 ✅, פרוסה 2). py_compile עבר על 4 קבצי הקוד. אימות runtime (restart MCP server) נדחה עד שהחילוץ הפעיל יסתיים. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 14:52:33 +00:00
Chaim	ebfe7f6a1d	feat(mcp): FU-14 פרוסה 1 — get_appraiser_facts (GAP-44) + limit-caps (GAP-53) תוספתי בלבד, אפס שבירת-תאימות. שני invariants מחוזה-כלי-ה-MCP (X9): GAP-44 (INV-TOOL4, סימטריית extract/get): נוסף get_appraiser_facts — ה-get המקביל ל-extract_appraiser_facts. קורא list_appraiser_facts + detect_appraiser_conflicts מה-DB ללא חילוץ-LLM יקר ולא-דטרמיניסטי. מחזיר count=0 (לא שגיאה) אם טרם חולץ. GAP-53 (INV-TOOL5, limit-caps / OWASP API4:2023): נוסף _clamp_limit (תקרה 200, non-positive→max) על ~13 כלי list/search ב-server.py (case_list, search_, precedent_library_list, halachot_pending, missing_precedent_list, list__citations…). list_chair_feedback קיבל param limit חדש (server→workflow→db עם LIMIT) — היה ללא תקרה כלל. לא הוסף get_appraiser_facts ל-frontmatter של סוכנים (INV-AG3 "לא עודף" — ההוראות עוד לא מפנות אליו; חיווט = follow-up). נותר ב-FU-14: GAP-45/48/49/50/51/52. עודכנו docs/spec/X9 (INV-TOOL4/5) ו-gap-audit (סטטוס פרוסה 1). אומת: py_compile על 4 קבצי הקוד. אימות runtime (restart MCP server) נדחה עד שהחילוץ הפעיל של היו"ר יסתיים. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 14:37:30 +00:00
Chaim	4174217179	feat(feedback): סימון "יושם" מפעיל CEO לקיפול הלקח לקובץ הנכון סוגר את לולאת פידבק-יו"ר→ידע-סוכנים. עד כה resolve רק עדכן את ה-DB; עכשיו לחיצה ב-/feedback מעירה את ה-CEO שמקפל את הלקח לקובץ לפי הקטגוריה. - paperclip_client.py: wake_ceo_for_feedback_fold() — יוצר issue ב-Paperclip עם הלקח + rubric ניתוב (style→SKILL.md, wrong_structure→block-schema, אחר→lessons.md), מעיר CEO. משכפל את דפוס wake_for_precedent_extraction - db.py: get_chair_feedback(id) — שליפת הערה בודדת עם case_number/appeal_type - app.py: resolve endpoint מקבל fold (ברירת מחדל true); BackgroundTask fire-and-forget; guard — רק עם lesson_extracted. מחזיר fold_queued - legal-ceo.md: dispatch ל-feedback_fold_ + סעיף "קיפול הערת יו"ר" עם rubric - frontend: useResolveFeedback מקבל fold; /feedback שולח fold=true עם toast; drafts-panel שולח fold=false (bookkeeping per-case, בלי קיפול כפול) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 13:08:41 +00:00
Chaim	68a77c11b6	feat(upload): חסימת כפילות בהעלאת פסיקה + banner עם אפשרויות All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m38s Details - בקאנד: GET לפני ה-async task — אם citation כבר קיים כ-external_upload מחזיר 409 - DB: get_external_case_law_by_citation — lookup לפי citation + source_kind - פרונט: banner אדום עם פרטי הרשומה הקיימת ושני כפתורות: • "הפעל חילוץ מחדש" — request-halachot ל-ID הקיים וסגירת הטופס • "מחק את הרשומה" — DELETE עם confirm, ניקוי conflict לאחר מכן Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-06 12:11:33 +00:00
Chaim	f8c3fd6c89	fix(nevo): strip preamble/mini-ratio from court rulings too (#86.1) strip_nevo_preamble's _DECISION_START only matched ועדת-ערר openings (בפנינו / הערר שבנדון / ...), so Nevo COURT judgments — exactly the ones carrying a מיני-רציו — slipped through unstripped. The editorial mini-ratio then leaked into the chunked body, risking that the halacha extractor reads Nevo's answer key (contamination) and polluting the corpus. Proven on בג"ץ 1764/05: its full_text still contained the מיני-רציו (unstripped). Fix: - Extend _DECISION_START with court-ruling openings: פסק-דין/פסק דין header and the authoring-judge line (השופט/ת, כב' השופט, הנשיא, המשנה לנשיא). re.search picks the earliest line-start match → the real opinion start, not the prose ratio above it. - Widen the Nevo-marker detection window 400→1500 chars so a long court/parties header doesn't push חקיקה שאוזכרה:/מיני-רציו: out of range. Verified on the real 1764/05 full_text: strips 2702 chars, body now starts at 'השופט ס' ג'ובראן:', מיני-רציו gone. Regression: ועדת-ערר openings still strip; non-Nevo text untouched; markers-past-400 now detected. Suite 182 passed (6 new). This is the anti-contamination prerequisite for the Nevo-ratio gold-set (#86.3/#81.7). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 16:55:31 +00:00
Chaim	fb60dca796	feat(halacha): over-extraction consolidation — fold facets via claude_session (#81.5) After a precedent finishes extracting, a claude_session pass folds facets of the SAME legal question (below #82's dedup cosine — the שפר 14-vs-4 / 403-17→89 granularity gap) into one canonical; the rest are marked 'rejected' (reversible: out of the active corpus AND the review queue, but recoverable). FOLD-ONLY — never merges distinct legal questions, never invents. - Engine: claude_session-as-judge (local CLI, zero cost), 'high' effort — folding needs careful judgment. One pass per precedent, runs in _extract_impl once all chunks are done (the prompt dedups within a chunk; this catches across chunks). - Pure, unit-tested helpers in halacha_quality: CONSOLIDATE_SYSTEM, build_consolidation_prompt, parse_fold_groups (fails SAFE → [] on any malformed shape; drops <2-member groups; coerces/dedups indices). - halacha_extractor._consolidate_precedent picks the canonical per group (approved>pending, higher confidence, quote_verified, longer) and rejects the rest via the existing update_halachot_batch (#84). Never rejects a canonical. Fails OPEN on any error (no CLI / parse fail → 0 folds, data untouched). - config: HALACHA_CONSOLIDATE_ENABLED/MODEL/EFFORT. Verified: suite 176 passed (10 new); integration vs dev DB — a 2-facet group folds to 1 canonical + 1 rejected (tagged), distinct rules untouched, claude error → 0 folds (fail-open). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 16:26:44 +00:00
Chaim	f196bed564	feat(halacha): NLI entailment validator via claude_session (#81.3) + task #86 #81.3 — a post-extraction validator that flags halachot whose rule_statement is NOT entailed by its supporting_quote (the model over-reaching beyond its source). - Engine: claude_session-as-judge (local CLI, zero API cost) per chaim's standing preference — one batched judge call per chunk, NOT a hosted NLI model. - Pure, unit-tested helpers in halacha_quality: NLI_SYSTEM, build_nli_prompt, parse_nli_verdicts (fails OPEN — any shape/label ambiguity → 'entailed'). - halacha_extractor._nli_check wraps the call; fails OPEN on any error (e.g. no CLI in the container) so a flaky judge never blocks a genuine halacha. - Non-entailed (neutral/contradiction) → quality_flag 'nli_unsupported' which blocks auto-approve (routes to pending_review) via the existing store gate. - config: HALACHA_NLI_ENABLED/MODEL/EFFORT (effort 'low' — entailment is simple). Verified: suite 166 passed (10 new); LIVE smoke test against the real claude CLI returned ['entailed','neutral'] for a supported vs unsupported rule. Also commits TaskMaster #86 (Nevo preamble/ratio: anti-contamination strip fix + gold-set benchmark) capturing today's strip_nevo_preamble findings. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 14:46:12 +00:00
Chaim	476c2fc5d1	feat(upload): accept legacy .doc, convert via LibreOffice in container Legacy Hebrew .doc precedents (e.g. nevo.co.il CP1255 OLE2) can now be uploaded directly through the precedent-library, missing-precedent, and training upload paths — the frontend already advertised .doc but the backend gate rejected it before reaching the extractor. - web/app.py: add .doc to ALLOWED_EXTENSIONS (covers all paths that share the set: precedent library, missing-precedent, training). - Dockerfile: install libreoffice-writer-nogui (no X11/Java) so the extractor's existing _extract_doc LibreOffice conversion works in the Coolify container (was missing → would fail at runtime). - extractor.py: isolate the LibreOffice user profile per call to avoid a profile-lock failure on concurrent .doc conversions. Verified in python:3.12-slim (prod base): .doc→.docx→text yields text byte-identical to a native Word .docx save (103 paragraphs, 24,341 chars). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 13:47:47 +00:00
Chaim	eeb70a5758	feat(halacha): review-queue triage — defer + batch group actions + quality-flag badges (#84 ) Make the chair's pending-halacha review faster and less exhausting. Backend: - New 'deferred' review_status (snooze): stays out of the active library AND out of the default pending queue, without the finality of 'rejected'. update_halacha stamps reviewer+reviewed_at on defer; HALACHA_REVIEW_STATUSES is the single source of valid statuses (PATCH validation now uses it). - db.update_halachot_batch(ids, status, reviewer) — one atomic UPDATE for a whole group; invalid status / empty ids are a no-op. - POST /api/halachot/batch (HalachaBatchReviewRequest) wraps it. - update_halacha now RETURNs quality_flags too (parity with list_halachot). Frontend (halacha-review-panel): - Quality-flag badges (#81: non_decision / truncated_quote / thin_restatement / quote_unverified) so the chair sees WHY an item was held back. - Defer action — button + keyboard 'D' — to snooze without rejecting (fixes the 'leave in pending forever' anti-pattern; reject stays the junk verb). - Per-precedent batch bar: 'אשר הכל' / 'דחה הכל' via useBatchReviewHalachot (one request, one refetch) with confirm guards. - Halacha/HalachaPatch types gain quality_flags + 'deferred'. Verified: mcp-server suite 156 passed; web build green; end-to-end integration against dev DB (batch approve/reject, defer sets status+timestamp, pending excludes approved+deferred, deferred queryable, invalid status no-op). Note: api:types regen deferred until deploy (the batch hook is hand-typed, not dependent on generated types). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 13:42:21 +00:00
Chaim	0f64b4c062	feat(halacha): UNIQUE(case_law_id, halacha_index) backstop + task tracking (#83 ) #83 pipeline robustness — the index-numbering correctness guarantee: - Add CREATE UNIQUE INDEX idx_halachot_unique_index ON halachot(case_law_id, halacha_index). The extractor assigns the index as MAX+1 under an in-process store-lock + a cross-process pg advisory lock, so collisions shouldn't occur in normal operation — but per the research (FireHydrant/OneUptime) the constraint is the actual correctness guarantee while the lock is the optimization. A racing/double run now fails LOUDLY (UniqueViolation, chunk left un-checkpointed → clean resume) instead of silently appending the duplicates that were the 2026-05/06 over-extraction root cause. Data prep (run against the live DB before the constraint, backed up to data/audit/halacha-reindex-backup-*.sql): the 6 precedents that still carried colliding halacha_index values (9 groups, distinct principles that shared a number — NOT content dups) were renumbered to unique sequential indices. Verified: advisory lock holds cross-process and the DB path is direct asyncpg (no transaction-pooler), so the session lock is safe (83.1); force=True does delete+checkpoint-clear in one transaction (83.5); constraint rejects a duplicate-index insert (integration-checked). Full suite 156 passed. Also commits the TaskMaster tracking for the whole halacha-quality initiative (#81-#84 + research-backed subtasks, statuses). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 13:06:58 +00:00
Chaim	ca959d4a9c	feat(halacha): strict-rubric quality gate + dedup-on-insert (#81,#82) Bake the 2026-06-03 strict-cleanup rubric into the extraction pipeline so the corpus stays clean at the source instead of accumulating duplicates, obiter dicta, truncated quotes and thin restatements that clog the review queue. #81 — quality gate: - New pure module halacha_quality.py with unit-tested validators: non-decision/obiter (Wambaugh markers), truncated-quote (mid-word cut), thin-restatement (rule≈quote), quote-unverified. - Validators run in halacha_extractor._process; a non-decision is re-typed obiter; flags persist in new halachot.quality_flags column. - Auto-approve now requires confidence>=threshold AND no quality flags; flagged items route to pending_review regardless of confidence. - Both extraction prompts hardened: reject undecided dicta, exclude case-specific applications, require abstraction, forbid over-splitting. #82 — dedup-on-insert (store_halachot_for_chunk): - Within the same precedent, skip a halacha whose normalized supporting_quote already exists, or whose rule-embedding has cosine>=HALACHA_DEDUP_COSINE (0.93) against an already-stored one. Makes re-runs idempotent. Migration: halachot.quality_flags TEXT[] (additive, idempotent ALTER). Tests: 19 new unit tests; full suite 156 passed. Validated end-to-end against dev DB (dedup skips dups, flag blocks auto-approve, re-run inserts 0). Calibration: flags fire on only ~10% of current survivors (low false-positive). Spec: docs/halacha-strict-rubric.md Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 12:30:38 +00:00
Chaim	6fcfdc76db	fix(#79 ): chunker never emits sub-50-char fragment chunks (#55 follow-up) A section that opens with a short header line ('דיון', 'טענות המשיבים') followed by a paragraph larger than chunk_size flushed the header alone as a tiny chunk. #55 added a query-time >=50 filter to hide these; this removes them at the source. _split_section: (1) don't flush a buffer still below MIN_CHUNK_CHARS — let it absorb the next paragraph even if that overflows chunk_size, so a short header rides with its following content; (2) fold a trailing tiny chunk back into its predecessor. Verified: re-chunked the 4 corpus docs that still had a tiny chunk (ע"א 5138/04, בר"מ 2340/02, בג"ץ 6525/15, 403-17) — corpus-wide chunks<50 went 4 -> 0; all 4 stay embedded/searchable and rank top in a relevant search (נווה שלום #1 for the s.19(ג)(1) exemption query). No regression. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-03 08:10:10 +00:00
Chaim	f46bf47d5b	feat(web-ui): expose citation-corroboration badge on halachot (X11) - db.list_halachot: aggregate corroboration_count (distinct positive sources) + corroboration_negative from halacha_citation_corroboration (LEFT JOIN) - web-ui: CorroborationBadge — 'מתוקף · N ציטוטים' at ≥2 (gold), soft single citation, danger badge on negative treatment; native title tooltips - shown in ExtractedHalachotSection (per-precedent) + halacha review panel Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 05:04:31 +00:00
Chaim	ed547e20ad	feat(corroboration): wire approval gate + backfill driver + rebuild tool (X11 Phase 2) - db: approve_halacha_by_corroboration (pending_review→approved only), demote_halacha_overruled (approved→pending_review only), list_corroboration_grouped, precedents_with_halachot_and_incoming_citations - corroboration: reconcile_approvals (INV-COR2/COR4/COR5), build_all backfill; build_for_precedent now returns approved/demoted counts - mcp: corroboration_rebuild write tool (single precedent or full-corpus backfill) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 04:35:37 +00:00
Chaim	df007784c9	feat(corroboration): approval_action decision fn + kill-switch (INV-COR2/COR4, X11 Phase 2) - HALACHA_CORROBORATION_AUTO_APPROVE config (default ON, Dafna validated 2026-06-01) - approval_action(agg, has_overruled): overruled→demote, corroborated→approve, else None - 4 offline unit tests; Phase 2 plan + TaskMaster #75 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-01 04:34:23 +00:00
Chaim	885cba543e	feat(halacha): lighter effort for BULK queue-drain extraction (speed at scale) xhigh is the quality sweet-spot for a single precedent but very slow at scale (64-chunk case ≈ 20 min). Bulk queue-drains (process_pending over many precedents) now use a lighter effort to cut wall-clock; interactive single re-extraction keeps xhigh quality. - config.HALACHA_BULK_EXTRACT_EFFORT (env, default 'high'; set 'medium' for max speed, 'xhigh' to match single). - extract()/_extract_impl()/_extract_chunk() take an `effort` override threaded to claude_session.query_json; None falls back to HALACHA_EXTRACT_EFFORT (xhigh). - process_pending_extractions(kind='halacha') passes the bulk effort; single reextract_halachot keeps xhigh. Verified end-to-end (mocked LLM): _extract_chunk(effort='medium') → query_json effort='medium'; effort=None → 'xhigh' fallback. Closes the open item in #72. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 21:34:13 +00:00
Chaim	8e4ea23882	feat(halacha): crash-safe incremental extraction + resume (A + resume) Halacha extraction held ALL chunk results in memory and stored once at the very end — a crash/interrupt mid-run (e.g. the 2026-05-31 freeze) lost everything and re-paid the full LLM cost on retry. Now each chunk's halachot are stored AND the chunk is checkpointed (precedent_chunks.halacha_extracted_at) the moment it finishes: - V25 schema: precedent_chunks.halacha_extracted_at (per-chunk checkpoint). - db.store_halachot_for_chunk: atomic per-chunk insert (halacha_index continues from MAX, caller serializes via an in-process store-lock) + checkpoint mark. - db.reset_halacha_extraction (force) / mark_all_chunks_extracted (legacy backfill). - _extract_impl rewritten: resume by default (skip checkpointed chunks; failed chunks stay pending and are retried; status stays 'processing' until all done); force=True wipes + redoes all. reextract_halachot passes force=True; the queue drain (process_pending) resumes by default. - Legacy guard: a pre-V25 precedent (halachot exist, no checkpoints) is backfilled and treated as complete — never re-extracted (would duplicate). Verified on 9002-24 (55 halachot, legacy): resume → legacy-backfill, NO duplication (stays 55), all chunks checkpointed. Index continuation: store at 55,56 after max 54, no collision. Tracks #72. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 21:27:46 +00:00
Chaim	807053ec54	fix(halacha): global advisory lock — one extraction at a time (prevents box freeze) 2026-05-31: opus-4-8 @ xhigh extraction + overlapping driver processes (agent fallback retries each spawn an independent `python -c` driver; process_pending is serial WITHIN a process but the box ran 4-5 drivers in parallel) → 12-16 concurrent xhigh `claude -p` procs → load 69 → hard reboot. Fix: halacha_extractor.extract() now takes a Postgres advisory lock (pg_try_advisory_lock, key 'HALA') before any work. If another extraction (any process/agent/driver — all share the legal-ai DB) holds it, the call returns status='busy' and the precedent stays pending for the next drain. Guarantees ONE extraction at a time ACROSS PROCESSES — an in-process Semaphore cannot (drivers are separate OS processes). Core logic moved to _extract_impl (unchanged) under the lock. CHUNK_CONCURRENCY now env-tunable (HALACHA_CHUNK_CONCURRENCY, default 3). Verified: while a lock is held, extract() returns 'busy' with no LLM call; lock releases cleanly and the next extraction proceeds. Tracks #72. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 20:42:15 +00:00
Chaim	5abfbd2746	feat(mcp): halacha_corroboration read-only tool (INV-COR6, X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 19:07:37 +00:00
Chaim	b57e590275	feat(corroboration): orchestrator + persistence over both citation graphs (X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 19:04:20 +00:00
Chaim	33f955e372	feat(corroboration): aggregator — distinct positive + negative-flag (INV-COR4, X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 19:00:16 +00:00
Chaim	dbc176ae66	feat(corroboration): halacha matcher + cosine threshold (INV-COR3, X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 18:57:47 +00:00
Chaim	09eec6a906	feat(corroboration): treatment classifier + polarity (INV-COR2, X11) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 18:54:50 +00:00
Chaim	ca31932a5f	feat(db): V24 — citation treatment column + halacha corroboration link table (X11)	2026-05-31 18:52:16 +00:00
Chaim	887079535c	feat(spec): X11 citation-corroboration + INV-G10 amendment + Opus 4.8 halacha extraction ספ חדש לשכבת citator פנימית — תיקוף הלכות לפי טיפול-שיפוטי מצטבר (ציטוטים נכנסים), לצמצום היקף האישור-הידני של היו"ר: - docs/spec/X11-citation-corroboration.md — 6 invariants (INV-COR1–COR6), כל אחד עם ≥3 מקורות מקצועיים (Shepard's/KeyCite, Hellyer LLJ 2018, UNC Law, NCSC/JTC, CEPEJ). - docs/spec/00-constitution.md — תיקון מבוקר ל-INV-G10: השער מסופק ע"י טיפול-שיפוטי-מצטבר לתת-הקבוצה החיובית, שער-היו"ר נשאר חובה לזנב ולשלילי. + X11 באינדקס. - Opus 4.8 @ xhigh כמודל חילוץ הלכות (config HALACHA_EXTRACT_MODEL/EFFORT, env-tunable; claude_session model/effort params; halacha_extractor מחווט). מבוסס A/B 2026-05-31: פחות חילוץ-יתר, 100% quote-verified, ביטחון מכויל. - scripts/ab_halacha_opus48.py — harness A/B לא-הרסני להשוואת מודל/effort בחילוץ הלכות. - .taskmaster #70 (FU-2c-b) — תיעוד dedup שפר + סריקת-קורפוס (0 stubs תקועים נותרו). תנאי-קדם (זהות נקייה) הושלם: שפר מוזג לרשומה קנונית + סריקת 128 רשומות. audit-findings גלויים ב-X11 §7: קישור הלכה↔ציטוט + סיווג-טיפול = greenfield, ל-implementation plan. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 18:42:13 +00:00
Chaim	6ff2e36bf9	feat(eval): FU-5 — retrieval eval harness + halacha backlog visibility (#63 ) Covers GAP-11 (INV-RET4/G8) and GAP-14 (INV-QA1/G10). Retrieval quality was never measured (only telemetry observation) and the halacha review backlog was invisible (the 10/19 gap was found by accident). Unit B — backlog visibility (pure code, container): - metrics.halacha_backlog(conn) → {pending_review, approved, rejected, published, total, oldest_pending_at}; surfaced in metrics.get_dashboard() (get_metrics MCP tool) and /api/system/diagnostics. Live count revealed 178 pending / 1552 total, oldest from 2026-05-03 — previously invisible. Unit A — retrieval eval harness (host-side scripts): - scripts/eval_gold_bootstrap.py — seeds data/eval/gold-set.jsonl. Two sources: citations (cited==relevant via search_relevance_feedback — empty until decisions cite precedents) and known_item (query=case_name → relevant=self; a real citation-free signal, the methodology #52 checked by hand). Idempotent; preserves source='chair' rows. - scripts/eval_retrieval.py — runs the production retrieval path (search_library / search_internal) over the gold-set; computes precision@k, recall@k, MRR, nDCG@k (k=5,10); aggregates overall + per-corpus + per-practice_area; writes a report and a delta vs committed baseline.json (which records the retrieval_config it reflects). --self-test unit-checks the metric math offline. Gold-set strategy = hybrid (chair decision): bootstrap + chair review. The citation source is empty today (0 cited precedents in decisions), so the seed is known-item (77 queries: 54 internal_decisions + 23 precedent_library). The gold-set is PROVISIONAL until Dafna reviews it (the domain chair-gate). Baseline (production config: multimodal+rerank on): R@10=0.987, MRR=0.837, nDCG@10=0.872. Finding: MULTIMODAL_ENABLED=true slightly lowers known-item recall (image-page results displace exact name matches) — relevant to #15. precedent_library weaker than internal (R@10 0.957 vs 1.0) — one external precedent unfindable by name. "CI gate" realized as discipline (re-runnable harness + committed baseline + run before/after any retrieval-layer change) — retrieval needs prod DB + Voyage, no CI runner has that access. Spec: docs/superpowers/specs/2026-05-31-fu5-eval-harness-design.md Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 14:58:13 +00:00
Chaim	4d8422198a	feat(guard): fitness function blocking raw Paperclip access (GAP-22, FU-8a) Wakeup-INSERT rule is universal (never allowlisted — hard invariant). Raw-HTTP rule exempts the sanctioned helpers + standalone operator/admin scripts (a distinct category per fitness-function scope differentiation + DRY: tooling needn't reuse the FastAPI wrapper). Repo scanned clean under these rules. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 11:35:07 +00:00
Chaim	a66ab3b3cd	feat(guard): fitness function blocking raw Paperclip access (GAP-22, FU-8a) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 11:16:36 +00:00
Chaim	aac383acb7	feat(sync): --verify exits non-zero on drift; adapter mismatch = loud drift (GAP-21, FU-8a) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 11:14:44 +00:00
Chaim	e46868feda	feat(fu2b): flag PROC_MISMATCH (case_number prefix vs proceeding_type) for chair Dry-run surfaced 2 rows with בל"מ prefix but proceeding_type=ערר. Since the migration strips the prefix, a wrong proceeding_type would silently lose the בל"מ signal — must be chair-adjudicated, not auto-applied. Chair table now flags 4 rows: 2 DUP_CHECK (8047-23) + 2 PROC_MISMATCH. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 08:57:42 +00:00
Chaim	a41fcedc28	test(fu2b): failing tests for bare-number extraction (FU-2b)	2026-05-31 08:52:48 +00:00
Chaim	7e35a24d80	test(reindex): cover empty-text raise path (FU-3 review) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 22:13:18 +00:00
Chaim	8a0c206ecd	feat(reindex): precedent_reindex MCP tool (GAP-09, FU-3) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 22:09:44 +00:00
Chaim	f008820ec8	feat(reindex): health-check stale_embedding_case_law count (GAP-09, FU-3) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 22:08:27 +00:00
Chaim	63abf83e76	test(reindex): fix mark_indexed stub arity in FU-1 fixture (FU-3) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 22:07:39 +00:00
Chaim	c8de42150e	test(reindex): stub db.mark_indexed in FU-1/FU-2a ingest fixtures (FU-3 interaction) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 22:07:18 +00:00
Chaim	c7c7a1e119	feat(reindex): reindex_case_law from stored text + mark_indexed on ingest (GAP-09, FU-3) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 22:06:17 +00:00
Chaim	96ae83081f	feat(reindex): V23 content/indexed hashes + helpers + write content_hash (GAP-09, FU-3) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 22:04:43 +00:00
Chaim	e522555b1a	test(reindex): failing tests for content-hash re-index (FU-3)	2026-05-30 22:02:16 +00:00
Chaim	9bfb912bdf	fix(audit): _collect_block_sources mirrors None-doc-types (provenance accuracy, FU-7 review) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 21:40:42 +00:00
Chaim	677f29ddec	feat(audit): blocks_stale drift flag + health-check visibility (GAP-17, FU-7) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 21:36:56 +00:00
Chaim	7e2f4b2872	feat(qa): citation→corpus resolution as non-blocking warning (GAP-20, FU-7) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 21:35:24 +00:00
Chaim	769f5020eb	feat(audit): block→source provenance via write_block audit event (GAP-19, FU-7) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 21:33:36 +00:00
Chaim	1f483383b9	feat(audit): log document_upload/extract_claims/export_docx (GAP-18, FU-7) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 21:31:09 +00:00
Chaim	a121f79d6a	feat(audit): log_action_safe + V22 blocks_stale + citation resolver (FU-7) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 21:29:26 +00:00
Chaim	bffd2ec701	test(audit): failing tests for audit-trail + provenance (FU-7)	2026-05-30 21:27:54 +00:00
Chaim	5d3c340243	test(ingest): stub recompute_searchable in FU-1 fixture (FU-2a interaction) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 20:59:11 +00:00
Chaim	358d82e90e	feat(retrieval): require practice_area only for internal/cases; enable searchable filter + health visibility (GAP-13, FU-2a) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 20:57:27 +00:00
Chaim	6dbcb7e798	feat(ingest): recompute searchable on ingest + metadata completion (GAP-13, FU-2a) Wire db.recompute_searchable into the ingest pipeline (after statuses are set) and into extract_and_apply (after fields are persisted to DB, success path only). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 20:47:51 +00:00

1 2 3 4

183 Commits