legal-ai

Author	SHA1	Message	Date
Chaim	a3ef9e5e34	fix(ui): precedent library default: appeals committee decisions first. The segment toggle now opens on "החלטות ועדות ערר" (appeals committee decisions, the chair's main corpus) instead of "פסיקת בתי משפט" (court rulings). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 17:26:49 +00:00
Chaim	6bf19bd0d7	feat(ui): progress indicator for metadata extraction + segment toggle in the precedent library. Two UX problems on the /precedents page: 1. Metadata extraction gave no indication that it was running. Unlike text/halacha extraction (extraction_status / halacha_extraction_status), metadata had only the metadata_extraction_requested_at timestamp, with no "processing" state, so StatusPill showed nothing. Added a metadata_extraction_status column ('pending'|'processing'|'completed'|'failed') modeled on the existing columns, and the worker (process_pending_extractions + reextract_metadata) updates it: processing when an item starts, completed when it finishes (also clearing the timestamp), pending on failure (for retry). The UI shows a "מחלץ מטא-דאטה" ("extracting metadata") tag plus a batch-counter banner with a progress percentage (high-water mark of the queue depth) that updates automatically through the existing polling (5s). 2. Two stacked tables (courts / appeals committees) required long scrolling. They were replaced with a segment toggle: one table at a time, keeping the columns specific to each type. Invariants: G2 (extends an existing status mechanism, not a parallel path), INV-TOOL4/GAP-45 (continues exposing the hidden extraction queue). No legal content is touched (G11). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-06 16:21:41 +00:00
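The status transitions described in commit 6bf19bd0d7 can be sketched roughly as follows. This is a minimal illustration, not the project's worker code: `PrecedentRow`, `run_metadata_extraction`, and `batch_progress_percent` are hypothetical names, the row is an in-memory stand-in for a `case_law` record, and the progress formula is only one plausible reading of "high-water mark of the queue depth".

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class PrecedentRow:
    """Hypothetical in-memory stand-in for one case_law row."""
    metadata_extraction_status: str = "pending"
    metadata_extraction_requested_at: Optional[datetime] = None


def run_metadata_extraction(row: PrecedentRow,
                            extract: Callable[[PrecedentRow], None]) -> bool:
    """Apply the transitions named in the commit message to one queued row."""
    row.metadata_extraction_status = "processing"      # start of item
    try:
        extract(row)
    except Exception:
        # On failure the commit goes back to 'pending' so the item is retried;
        # the request timestamp is left in place (assumption).
        row.metadata_extraction_status = "pending"
        return False
    row.metadata_extraction_status = "completed"       # end of item
    row.metadata_extraction_requested_at = None        # clear the queue stamp
    return True


def batch_progress_percent(high_water_mark: int, queue_depth: int) -> int:
    """Assumed formula: share of the deepest observed queue already drained."""
    if high_water_mark <= 0:
        return 100
    done = max(0, high_water_mark - queue_depth)
    return round(100 * done / high_water_mark)
```

With a queue that peaked at 4 items and currently holds 1, `batch_progress_percent(4, 1)` gives 75.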
Chaim	cbc7a1e336	feat(precedents): formal citation per Israeli citation rules + copy/edit UI All checks were successful Build & Deploy / build-and-deploy (push) Successful in 3m25s Details Until now, "case_number" was the only stored identifier for a precedent. But a citation per the Israeli unified citation rules is a different beast — it has bold parties, an unbold prefix (court abbrev + panel/ district parenthetical + case number), and an unbold trailing reporter (נבו / פ"ד...). Without storing it as a first-class field we couldn't hand the chair a one-click "copy as citation" experience for pasting into decisions. Changes: - Schema V19: case_law.citation_formatted TEXT (Markdown — parties wrapped in … so the copy helper can render <strong> for Word/Docs paste and keep plain-text fallback meaningful). - Metadata extractor: composes citation_formatted from the document text per the unified citation rules, with worked examples for ע"א / עת"מ / ערר / בל"מ in the prompt. Refuses to store half-formed strings. - PATCH /api/precedent-library/{id} accepts citation_formatted so the chair can correct LLM mistakes. - /precedents/[id]: dedicated "מראה מקום" block with bold rendering, a copy-to-clipboard button (text/html + text/plain so Word keeps the bolds), and an inline edit textarea. - /precedents list rows: link displays the formatted citation when available, with a small inline copy button — falls back to the bare case_number for older rows. Backfill of existing rows happens by re-stamping the extraction queue once V19 has rolled out and the new field is reachable. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-27 07:14:34 +00:00
Chaim	2aee398b4a	feat: Stage C - RAG advanced (#33, #47, #48, #49, #50, #51) All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m35s Details Six independent sub-tasks dispatched in parallel; aggregated here. ## #33 - Hide case_name column library-list-panel.tsx: `<TableHead>` + `<TableCell>` for "שם" get `className="hidden"` in both Court and Committee row variants. DB column preserved for future use. ## #47 - Periodic audit script New scripts/audit_corpus_integrity.py: 3 SQL checks (external+ערר prefix, internal missing chair/district, cases.practice_area enum) + CEO wakeup on violations + cron `0 7 * * *`. First run: 0 issues. ## #48 - Parent-doc retrieval (gated, default off) Schema V17: precedent_chunks.parent_chunk_id + chunk_role ('child'|'parent'). New chunker.chunk_document_hierarchical(): section-aware parents (~1500 tokens) containing ~5 overlapping children (~300 tokens each). New db.store_precedent_chunks_hierarchical two-pass writer. Search SQL (semantic + lexical) LEFT-JOINs the parent, swaps in its content, and dedupes by parent_chunk_id when the flag is on. Toggle: PARENT_DOC_RETRIEVAL_ENABLED + PARENT_DOC_{CHILD,PARENT}_SIZE_TOKENS. Backfill (~3 min, ~$0.20) deferred to a follow-up. ## #49 - Multimodal backfill New scripts/backfill_multimodal_precedents.py with token-matching case_number ↔ source files (PDF + DOCX via PyMuPDF). Ran in container: 26 precedents embedded, 503 pages, $0.21, 0 errors. precedent_image_embeddings grew 3 → 29 rows. The 44 remaining are style_corpus-migrated rows (no source file on disk); they will catch up when re-uploaded. ## #50 - Closed-loop feedback + nDCG Schema V18: search_logs + search_relevance_feedback. New telemetry.py with fire-and-forget log_search_bg (p50 = 0.002ms, effectively zero overhead) + auto-infer_relevance_from_citations (reads case drafts → marks score=3 when a cited precedent appears in a past search's top-K). Hooks added to 5 search paths. scripts/compute_ndcg.py for aggregation. 
Two admin API endpoints (GET /api/admin/rag-metrics + POST .../infer). Dashboard UI deferred; the API is enough for now. ## #51 - Halacha quality monitoring New scripts/monitor_halacha_quality.py: baseline avg confidence (trusted=0.849, all=0.833, pending=0.694) with rolling-window drift detection. Default 5% threshold. Exits non-zero on alert for cron integration. Recommended: `0 8 * * 1` (weekly, Mon 8am). ## Bonus: 230 unlinked citations → missing_precedents Bulk-imported 230 distinct unlinked citations from precedent_internal_citations to missing_precedents.status='open', party='committee', with notes listing source citers. Top candidate: ע"א 3213/97 (cited 5x). Total open missing_precedents now 237. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-26 11:26:52 +00:00
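The drift check in sub-task #51 of commit 2aee398b4a could look something like the sketch below. It is not the contents of scripts/monitor_halacha_quality.py: the function names are hypothetical, and treating the 5% threshold as a relative deviation from the baseline mean is an assumption (the commit does not say whether it is relative or absolute).

```python
from statistics import mean
from typing import Sequence


def confidence_drift(baseline: float, recent: Sequence[float]) -> float:
    """Relative deviation of the rolling-window mean from the baseline (assumed metric)."""
    return abs(mean(recent) - baseline) / baseline


def drift_exit_code(baseline: float, recent: Sequence[float],
                    threshold: float = 0.05) -> int:
    """Return 1 when drift exceeds the threshold, else 0.

    A cron wrapper would pass this value to sys.exit() so that a non-zero
    status signals an alert, as the commit message describes.
    """
    return 1 if confidence_drift(baseline, recent) > threshold else 0


# Baseline taken from the commit message (trusted halachot); the window values
# below are made-up illustrations, not measured data.
healthy = drift_exit_code(0.849, [0.85, 0.84, 0.86])
drifting = drift_exit_code(0.849, [0.70, 0.72, 0.71])
```

Here `healthy` is 0 and `drifting` is 1, since a window mean of 0.71 sits roughly 16% below the 0.849 baseline.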
Chaim	10a63fb9e0	fix(precedents): separate court rulings from committee decisions correctly All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m37s Details - DB: add 'all_committees' virtual source_kind covering internal_committee + external_upload appeals_committee rows in one query - DB: stats now count all case_law rows (not just external_upload), fixing the precedents_total that excluded 44 internal-committee records - UI: courts table filters to source_type=court_ruling only; committees table uses the new all_committees query Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-10 09:59:30 +00:00
Chaim	f94201c577	feat(precedents): make citation link to detail page All checks were successful Build & Deploy / build-and-deploy (push) Successful in 34s Details Both CourtRow and CommitteeRow citation cells are now Next.js Links → /precedents/{id}, letting users navigate directly from the list. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-10 09:01:26 +00:00
Chaim	171da84680	feat(precedent-library): add halacha-extract button to library list rows All checks were successful Build & Deploy / build-and-deploy (push) Successful in 3m8s Details When a precedent has not had successful halacha extraction yet, show a small wand icon between the edit and delete buttons. Clicking it queues the precedent for the local MCP worker (request-halachot endpoint). Visibility rule (`needsHalachaExtraction`): show when text extraction is complete AND halacha status is "pending without requested_at" (never tried) or "failed" (allow retry). Hide while processing, after completion, or when already queued — to avoid duplicate requests. Pairs with the metadata-extract button on the edit sheet.	2026-05-07 06:30:03 +00:00
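The visibility rule from commit 171da84680 is small enough to write out. The original `needsHalachaExtraction` lives in the UI code; the Python rendering below only mirrors the rule as the commit message states it, and the parameter name `halacha_requested_at` is an assumption about how the request timestamp is exposed.

```python
from datetime import datetime
from typing import Optional


def needs_halacha_extraction(extraction_status: str,
                             halacha_extraction_status: str,
                             halacha_requested_at: Optional[datetime]) -> bool:
    """Show the wand button only when a (re)request makes sense."""
    # Text extraction must be complete before halachot can be extracted.
    if extraction_status != "completed":
        return False
    # A failed run may be retried.
    if halacha_extraction_status == "failed":
        return True
    # 'pending' with no request timestamp means extraction was never tried.
    # 'pending' with a timestamp is already queued; 'processing' and
    # 'completed' hide the button to avoid duplicate requests.
    return halacha_extraction_status == "pending" and halacha_requested_at is None
```

For example, a row with completed text extraction and a `pending` halacha status shows the button only while its request timestamp is still empty.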
Chaim	c0f67ab841	feat(precedents): split library into court rulings + appeals committee tables All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m34s Details - /api/precedent-library now accepts source_kind param (default external_upload) - list_external_case_law returns chair_name/district fields - LibraryListPanel renders two separate tables with appropriate columns - internal_decisions migration: added queue_halachot param to defer extraction - Fixed practice_area mapping from style_corpus (appeals_committee → proper enum) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-04 18:49:32 +00:00
Chaim	1f17419ee9	ui(precedents): live status pill with shimmer + auto-queue + auto-refresh All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m44s Details The chair pointed out three UX gaps after uploading a new precedent: 1. The status said "מחלץ הלכות" but nothing was actually running — the field only meant "halacha_extraction_status != completed", which includes the post-upload "pending" state where the local MCP worker hasn't been told to drain anything yet. Misleading. 2. The page didn't refresh on its own. The chair had to F5 to see new counts after extraction completed. 3. Clicking the trash icon mid-extraction would cascade-delete the row while the extractor was still using it (FK errors, partial writes). Fixes: - ingest_precedent now auto-queues both metadata and halacha extraction on upload by stamping the request timestamps. The chair (or me) drains the queue with one `precedent_process_pending` call from chat — no need to click any button before that. - StatusPill is now five-state with proper labels: "נכשל" (extraction_status=failed) — red "מעבד טקסט" — shimmer (extraction_status=processing) "בתור" — neutral (chunks queued, not yet running) "מחלץ הלכות" — shimmer (halacha_extraction_status=processing) "ממתין לחילוץ" — neutral (queued for local MCP worker) "לא חולץ" — neutral (pending without queue stamp — shouldn't happen) "X/Y מאושרות" — gold (done, with halachot count) The shimmer is a CSS-only sliding-stripe animation defined in globals. - usePrecedents has a conditional refetchInterval — polls every 5s while any row is mid-extraction or queued, then stops once everything settles to completed/failed. New helper isPrecedentActive() centralises the "is this row mid-something" check so the UI and the destructive-action guard agree. - Trash button is disabled (opacity 30%, tooltip explains) while the row is active. 
Pencil/edit stays enabled — editing metadata fields during extraction is safe (last write wins, low-stakes race). Schema: list_external_case_law now exposes the two *_requested_at timestamps so the UI can distinguish "queued" from "never asked". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 12:47:31 +00:00
Chaim	8e1384b897	fix(precedents): wrap citation column + extractor fills source_type All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m27s Details Two follow-ups after running the metadata extractor on 403-17: 1. Library table: shadcn TableCell defaults to whitespace-nowrap and the table wrapper has overflow-x-auto, so the long citation forced a horizontal scrollbar inside the row. Override on the citation cell only — whitespace-normal + break-words + min/max-w to keep the column readable. Same for the case-name cell. Row aligns to top so wrapping doesn't push neighbours up. 2. Extractor now also fills source_type (court_ruling / appeals_committee). The previous round added decision_date_iso, precedent_level, and court but left source_type empty. Same closed-enum + merge-only-if-empty policy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 12:28:35 +00:00
Chaim	fc3b6b6cae	ui(precedents): collapsible groups by precedent + Hebrew labels + RTL fixes All checks were successful Build & Deploy / build-and-deploy (push) Successful in 33s Details After running the dual-mode halacha extractor on a real appeals committee decision (403-17), the pending-review tab surfaced 351 halachot in a single flat list — the chair correctly pointed out that this is unusable without grouping. Three fixes: 1. Group pending halachot by precedent (case_law_id). Each group shows the citation, court, date, level and item count; default state is collapsed so the chair picks one ruling at a time. Within a group, items still sort by confidence ascending so the doubtful ones surface first. J/K/A/R/E now scope to currently-expanded groups; toggling open auto-focuses the first item. 2. Translate the badges that were leaking English: rule_type values (`persuasive`, `interpretive`, `binding`, `application`, `procedural`, `obiter`) now render as Hebrew labels, and `confidence X.XX` becomes `ביטחון X.XX`. The card header no longer repeats the citation since it's already in the group header. 3. Strip Unicode bidi marks (U+200E/F/202A-E/2066-9) from displayed citations. Nevo PDFs and the upload form embed these in the case_number; they render as zero-width but visually push the text away from the right edge of the table cell. Also: hide the empty court line under the case name in the list (was rendering as a stray em-dash), and use a muted em-dash for empty date/level rather than blank/dash inconsistency across columns. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 12:05:40 +00:00
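The bidi-mark stripping in fix 3 of commit fc3b6b6cae covers the code points U+200E, U+200F, U+202A through U+202E, and U+2066 through U+2069. The actual fix is in the UI layer; as a rough Python equivalent (the helper name `strip_bidi_marks` is hypothetical):

```python
import re

# Character class built from the ranges listed in the commit message:
# LRM/RLM, the embedding/override controls, and the isolate controls.
BIDI_MARKS = re.compile("[\u200e\u200f\u202a-\u202e\u2066-\u2069]")


def strip_bidi_marks(text: str) -> str:
    """Remove invisible directional marks so citations sit flush in RTL cells."""
    return BIDI_MARKS.sub("", text)


# A case number as it might arrive from a PDF or upload form (illustrative only).
cleaned = strip_bidi_marks("\u200f403-17\u200e")
```

Only the directional controls are removed; visible characters, including Hebrew letters and punctuation, pass through unchanged.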
Chaim	2cfdf35191	refactor(precedents): keep all LLM calls on the local-MCP path All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m28s Details Architectural correction: every claude_session caller in this project runs through the local MCP server (~/.claude.json points at /home/chaim/legal-ai/mcp-server/.venv/bin/python). The Coolify container has no `claude` CLI and no claude.ai session, so any LLM call originating from web/ FastAPI fails with "Claude CLI not found" — which is exactly what we hit on 403-17. The earlier Anthropic SDK fallback would have made it work, but at direct API cost. The chair's preference is to stay on the claude.ai session for everything. So: - claude_session.py: removed the SDK fallback, restored CLI-only. The error message now points the next person at the architectural rule in the module docstring instead of papering over it. - precedent_library.py:ingest_precedent (called from FastAPI on upload) now does only the non-LLM half: extract → chunk → embed → store. Sets halacha_extraction_status='pending' for the chair to act on. - reextract_halachot / reextract_metadata kept, but lazy-import their extractors so the FastAPI path can't accidentally pull them in. They are reachable only via the MCP tools precedent_extract_halachot / precedent_extract_metadata, which run locally with CLI. - Removed POST /api/precedent-library/{id}/extract-halachot and /extract-metadata — they were dead ends from the container. - Dropped the `anthropic` Python dep that the SDK fallback required. - UI: removed the "refresh halachot" and "sparkles metadata" buttons that called those endpoints. Edit sheet now points the chair at the MCP tool names instead. 
Halacha and metadata extraction for an uploaded precedent now happen when the chair (via Claude Code) runs: mcp__legal-ai__precedent_extract_metadata <case_law_id> mcp__legal-ai__precedent_extract_halachot <case_law_id> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 11:06:08 +00:00
Chaim	73a79ea7e8	feat(precedents): metadata auto-fill, edit sheet, persuasive extraction All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m28s Details Three improvements to the precedent library based on usage feedback: 1. Auto-fill metadata at upload time. New service precedent_metadata_extractor reads the ruling's full_text and suggests case_name (short), summary, headnote, key_quote, subject_tags, appeal_subtype. The merge policy fills only empty fields, preserving everything the chair typed in the upload form. Wired into the ingest pipeline; also exposed as a re-run endpoint POST /api/precedent-library/{id}/extract-metadata for existing records. 2. Edit sheet in the UI. Pencil icon on each library row opens a pre-populated form covering every field. A Sparkles button on the sheet runs the metadata extractor on demand and refreshes the form. The case_number is read-only because halachot are FK'd to it; renaming requires delete + re-upload. 3. Halacha extractor branches on is_binding. Sources marked binding (Supreme/Administrative) keep the strict halacha prompt. Non-binding sources (other appeals committees, district courts on planning matters) get a different prompt that extracts applications, interpretive principles, and persuasive conclusions — labeled with new rule_types 'application' and 'persuasive'. The fallback also widens chunk selection: if the chunker labeled nothing as legal_analysis/ruling/conclusion, we now run on all chunks rather than returning zero halachot for a usable ruling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 10:19:35 +00:00
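The merge policy in item 1 of commit 73a79ea7e8 (fill only empty fields, never overwrite what the chair typed) can be illustrated as below. This is a sketch under assumptions: the function names are hypothetical, and what counts as "empty" (None, blank strings, empty lists) is a guess, since the commit does not define it.

```python
from typing import Any, Dict


def is_empty(value: Any) -> bool:
    """Assumed emptiness test: None, blank string, or empty collection."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, (list, tuple, set, dict)):
        return len(value) == 0
    return False


def merge_fill_empty(existing: Dict[str, Any],
                     suggested: Dict[str, Any]) -> Dict[str, Any]:
    """Copy a suggested value only where the existing field is empty."""
    merged = dict(existing)
    for field, value in suggested.items():
        if is_empty(merged.get(field)) and not is_empty(value):
            merged[field] = value
    return merged


# Field names come from the commit message; the values are illustrative.
form = {"case_name": "typed by chair", "summary": "", "subject_tags": []}
llm = {"case_name": "LLM guess", "summary": "LLM summary", "subject_tags": ["tag"]}
result = merge_fill_empty(form, llm)
```

In this example `result` keeps the chair's `case_name` and takes `summary` and `subject_tags` from the suggestion.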
Chaim	7ee90dce31	feat: external precedent library with auto halacha extraction All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m27s Details Adds a third corpus of legal authority distinct from style_corpus (Daphna's prior decisions for voice) and case_precedents (chair-attached quotes per case). The new corpus holds chair-uploaded court rulings and other appeals committee decisions, with binding rules (הלכות) extracted automatically and queued for chair approval. Pipeline (web/app.py + services/precedent_library.py): file → extract → chunk → Voyage embed → halacha_extractor → store + publish progress over the existing Redis SSE channel. Schema V7 (services/db.py): extends case_law with source_kind + extraction status fields under a CHECK constraint pinning practice_area to the three appeals committee domains (rishuy_uvniya, betterment_levy, compensation_197). New precedent_chunks (vector(1024)) and halachot tables (vector(1024) over rule_statement, IVFFlat indexes, gin on practice_areas/subject_tags). Halachot start as pending_review; only approved/published rows are visible to search_precedent_library. Agents: legal-writer, legal-researcher, legal-analyst, legal-ceo, legal-qa get search_precedent_library. legal-writer prompt explains the three-corpus distinction and CREAC use; legal-qa now verifies that every cited halacha resolves to an approved row in the corpus. UI: /precedents page with four tabs — library / semantic search / pending review (J/K nav, A/R/E shortcuts, badge count) / stats. Reuses the existing upload-sheet progress + SSE pattern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 08:38:18 +00:00

14 Commits