legal-ai

Author	SHA1	Message	Date
Chaim	79b9c37301	feat(mcp): FU-14 GAP-48 פרוסה 2 — envelope אחיד ל-11 משפחות-כלים המשך מיגרציית INV-TOOL1 מעבר למשפחת-החיפוש (#71). הומרו ל-{status,data,message}: precedent_library, citations, internal_decisions, missing_precedents, training_enrichment, precedents, legal_arguments, cases, documents, workflow (~55 כלים). בוטלו 5 עותקי _ok/_err משוכפלים (alias ל-tools/envelope.py — SSoT, G2). עיקרון: envelope-status = הצלחת-הקריאה-לכלי; תוצאה-עסקית (idempotent_existing, noop, completed...) נשמרת בתוך data. err רק לכשל אמיתי (not-found/invalid/exception). תאימות-API: צרכני web/app.py של cases/workflow/precedents חוּוטו דרך envelope_unwrap + בדיקת status=="error"→4xx — תשובת ה-HTTP זהה, web-ui לא מושפע. (documents/legal_arguments/citations/... אינם נצרכים מ-app.py — agent-only.) בדיקות: 182/182 עוברים (test_corpus_constraints עודכן לחוזה החדש). נותר: משפחת drafting (מסלול הפקת-ההחלטה) בפרוסה נפרדת עם שער טסט-ייצוא. Invariants: מקדם INV-TOOL1 + G2 (SSoT, ביטול כפילות). מתועד ב-X9 + gap-audit. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 17:41:39 +00:00
Chaim	701efab726	feat(mcp): FU-14 GAP-51 — איחוד אוצר-המילים של תוצאת-תיק (set_outcome SSoT) הכרעת-יו"ר: קנוני = 3 תוצאות אמיתיות (rejection/partial_acceptance/full_acceptance); betterment_levy יוצא מהיותו "תוצאה" ועובר ל-override לפי practice_area. + עקרון "אנגלית-ב-DB, עברית-ב-UI": מפת-תוויות SSoT אחת. lessons.py: - VALID_OUTCOMES = 3 (הוסר betterment_levy). - OUTCOME_LABELS_HE (SSoT לתצוגה) + LEGACY_OUTCOME_MAP + canonical_outcome(). - PRACTICE_AREA_OVERRIDES["betterment_levy"] מרכז את כל ה-guidance שהיה מפתוח כ-outcome (golden_ratios/opening/summary/discussion/template). - get_lessons_for_outcome(outcome, practice_area) + format_ratios_comment(..., practice_area) מחילים override + מנרמלים legacy. block_writer.py: STRUCTURE_GUIDANCE קנוני + תווית מ-OUTCOME_LABELS_HE + override betterment. workflow.set_outcome: קנוני 3 + מיפוי-legacy סלחני; תווית מ-SSoT. drafting.py: טבלת יחסי-זהב + get_decision_template מודעי-practice_area (override). web-ui case.ts: הסרת betterment_levy מ-expectedOutcomes (הוא practice_area). server.py: docstrings קנוניים. מיגרציה: migrate_gap51_outcomes.py — 9 שורות נורמלו (rejected→rejection וכו'), גיבוי ב-data/audit/. הקוד canonicalize בקריאה ⇒ backward-compatible גם בלי מיגרציה. אומת: py_compile (5 קבצים) + בדיקות-יחידה offline (override/legacy/labels) + אימות-DB. עודכנו X9 §3 + gap-audit (GAP-51 ✅). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 15:34:49 +00:00
Chaim	ebfe7f6a1d	feat(mcp): FU-14 פרוסה 1 — get_appraiser_facts (GAP-44) + limit-caps (GAP-53) תוספתי בלבד, אפס שבירת-תאימות. שני invariants מחוזה-כלי-ה-MCP (X9): GAP-44 (INV-TOOL4, סימטריית extract/get): נוסף get_appraiser_facts — ה-get המקביל ל-extract_appraiser_facts. קורא list_appraiser_facts + detect_appraiser_conflicts מה-DB ללא חילוץ-LLM יקר ולא-דטרמיניסטי. מחזיר count=0 (לא שגיאה) אם טרם חולץ. GAP-53 (INV-TOOL5, limit-caps / OWASP API4:2023): נוסף _clamp_limit (תקרה 200, non-positive→max) על ~13 כלי list/search ב-server.py (case_list, search_, precedent_library_list, halachot_pending, missing_precedent_list, list__citations…). list_chair_feedback קיבל param limit חדש (server→workflow→db עם LIMIT) — היה ללא תקרה כלל. לא הוסף get_appraiser_facts ל-frontmatter של סוכנים (INV-AG3 "לא עודף" — ההוראות עוד לא מפנות אליו; חיווט = follow-up). נותר ב-FU-14: GAP-45/48/49/50/51/52. עודכנו docs/spec/X9 (INV-TOOL4/5) ו-gap-audit (סטטוס פרוסה 1). אומת: py_compile על 4 קבצי הקוד. אימות runtime (restart MCP server) נדחה עד שהחילוץ הפעיל של היו"ר יסתיים. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 14:37:30 +00:00
Chaim	92a2763b86	feat: add internal committee decisions corpus (source_kind='internal_committee') All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m31s Details Three-layer separation: style learning (style_corpus), appeals-committee decisions (internal_committee), and court rulings (external_upload). - SCHEMA_V10: chair_name + district columns on case_law and cases, partial indexes - create_internal_committee_decision() DB upsert function - search_precedent_library_semantic() now accepts source_kind/district/chair_name params - search_precedent_library_hybrid() passes through new params - services/internal_decisions.py: ingest_internal_decision, migrate_from_style_corpus, migrate_from_external_corpus (identifies rows via source_type='appeals_committee') - search_internal_decisions() MCP tool (server.py + tools/search.py) - internal_decision_migrate() MCP admin tool - Web endpoints: POST /api/internal-decisions/upload, POST /api/internal-decisions/migrate, GET /api/internal-decisions - ingest_final_version auto-ingests finalized decisions into internal corpus - SKILL.md updated: agents now search internal + external in parallel, present separately Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 18:33:39 +00:00
Chaim	81ccf3a888	feat(retrieval): track page_number on text chunks for multimodal hybrid boost All checks were successful Build & Deploy / build-and-deploy (push) Successful in 6m33s Details The legacy chunker did not track which PDF page each chunk came from. Stored chunks had page_number=NULL, which blocked the multimodal hybrid retriever's text+image boost — it joins (chunk, image) on (document_id, page_number) and the join could never fire. This change: - extractor.extract_text now returns (text, page_count, page_offsets); page_offsets[i] is the start char offset of page (i+1) in the joined text. None for non-PDFs. - chunker.chunk_document accepts an optional page_offsets and tags each chunk with the page that contains its first character (uses the existing chunker logic; pages assigned post-hoc by content search to keep the diff minimal). - processor.process_document and precedent_library.ingest_precedent forward page_offsets through the chunker. New uploads now carry accurate page_number on every chunk. - Other extract_text callers (tools/documents, tools/workflow, web/app.py) updated to unpack the third element (ignored). - scripts/backfill_chunk_pages.py: per-case retrofit. Re-extracts each PDF (re-OCRs via Google Vision if needed, ~$0.0015/page), computes page_offsets, and updates page_number on every chunk by content search. Idempotent; --force re-runs on already-tagged docs. Forward-only would leave the 419 image embeddings backfilled on cases 8174-24 + 8137-24 unable to boost their corresponding text chunks. The retrofit script closes that gap (cost ~$0.60). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 19:49:41 +00:00
Chaim	50eaa887db	Add chair feedback system and content checklists for block-yod Backend changes cherry-picked from ui-rewrite branch to enable feedback API endpoints for the Next.js staging UI. - chair_feedback DB table + API endpoints (GET/POST/PATCH) - Content checklists by appeal subtype injected into block-yod prompt - MCP tools for recording and listing chair feedback - Corpus analysis documentation (24 decisions) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 21:05:53 +00:00
Chaim	5fc52ce530	Switch to cases/{new,in-progress,completed}/ directory structure Replace single CASES_DIR with find_case_dir() that searches across all status directories. New cases created in cases/new/{number}/. Config: CASES_BASE, CASES_NEW, CASES_IN_PROGRESS, CASES_COMPLETED Docker: added -v /home/chaim/legal-ai/cases:/cases volume mount Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 10:45:47 +00:00
Chaim	d9e5ef0f46	Add full decision writing pipeline: classify, extract, brainstorm, write, QA, export New services (11 files): - classifier.py: auto doc-type classification + party identification (Claude Haiku) - claims_extractor.py: claim extraction from pleadings (Claude Sonnet + regex) - references_extractor.py: plan/case-law/legislation detection (regex) - brainstorm.py: direction generation with 2-3 options (Claude Sonnet) - block_writer.py: 12-block decision writer (template + Claude Sonnet/Opus) - docx_exporter.py: DOCX export with David font, RTL, headings - qa_validator.py: 6 QA checks with export blocking on critical failure - learning_loop.py: draft vs final comparison + lesson extraction - metrics.py: KPIs dashboard per case and global - audit.py: action audit log - cli.py: standalone CLI with 11 commands Updated pipeline: extract → classify → chunk → embed → store → extract_references New MCP tools: 29 total (was 16) New DB tables: audit_log, decisions CRUD, claims CRUD Config: Infisical support, external service allowlist Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 10:21:47 +00:00
Chaim	6f515dc2cb	Initial commit: MCP server + web upload interface Ezer Mishpati - AI legal decision drafting system with: - MCP server (FastMCP) with document processing pipeline - Web upload interface (FastAPI) for file upload and classification - pgvector-based semantic search - Hebrew legal document chunking and embedding	2026-03-23 12:33:07 +00:00

9 Commits