legal-ai

Author	SHA1	Message	Date
Chaim	b409f1c7eb	Add case data, benchmark embeddings, and bug report Add cases symlink, Google Vision extraction and benchmark embedding data, and Paperclip bug report. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 17:20:40 +00:00
Chaim	3f759d3610	Improve document processing pipeline and agent workflows - Add delete_document_chunks for reprocessing, save extracted text to disk - Expand case directory structure (original/extracted/proofread/backup) - Update classifier patterns (תגובה, הודעת עמדה) - Fix proofreader agent paths for new directory layout - Update HEARTBEAT to notify on every task completion - Improve bidi_table with LRE/PDF directional embedding - Add Paperclip project verification and auto-close setup issue - Add auto-sync-cases.sh for Gitea synchronization Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 16:45:49 +00:00
Chaim	63c9ca184b	Fix processing badge: treat 'proofread' status as completed Documents with extraction_status='proofread' were incorrectly shown as "in processing" on the case list page. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 15:36:16 +00:00
Chaim	bfcbb6708a	Add local files section to case view (research, drafts, proofread) - New API endpoint /api/cases/{num}/local-files lists files from disk - New API endpoint /api/cases/{num}/local-files/{folder}/{file} serves file content - Case view now shows research/analysis files, proofread texts, and draft decisions - Files are clickable and open in new tab Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 14:43:02 +00:00
Chaim	22e819363e	Flatten cases directory structure and unify paths - Remove cases/new|in-progress|completed subdivision (status managed in DB) - Rename documents/original → documents/originals (consistent plural) - Move exports from global data/exports/ into cases/{num}/exports/ - Add documents/research/ for case law and analysis files - Update all agents, scripts, config, web API endpoints, and DB paths Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 14:33:27 +00:00
Chaim	4d674bf475	Add proofreader and exporter agents + abbreviations dictionary - legal-proofreader: OCR proofreading agent (Opus) that fixes broken Hebrew text before legal analysis — corrects abbreviations (עוייד→עו"ד), broken words, and illogical sentences - legal-exporter: Final draft export agent — validates decision, exports DOCX, saves versioned drafts (טיוטה-V1.docx etc.) - abbreviations.json: Dictionary of ~70 Hebrew legal/general/planning abbreviations for automated OCR correction - legal-ceo.md: Updated workflow to include proofreader before analyst and exporter after QA Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 20:34:10 +00:00
Chaim	6aaca14e31	Replace Claude Vision OCR with Google Cloud Vision Benchmark results on Hebrew legal docs (case 1130-25): - Google Vision: 1s/page, $0.001/page, high accuracy - Claude Opus Vision: 90s/page, $0.05/page, poor accuracy - PyMuPDF broken OCR layers now detected via quality check Changes: - extractor.py: Google Vision OCR with Hebrew language hint (300 DPI) - extractor.py: text quality detection (word length, words-per-line, Hebrew ratio) - extractor.py: Hebrew abbreviation quote fixer (15 known patterns) - config.py: add GOOGLE_CLOUD_VISION_API_KEY, remove ANTHROPIC_API_KEY - pyproject.toml: add google-cloud-vision, remove anthropic Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 20:17:58 +00:00
Chaim	bc72a83a71	Switch embedding model from voyage-3-large to voyage-law-2 Benchmark on case 1130-25 (4 Hebrew legal docs, 8 queries) showed: - voyage-law-2: avg top-1 score 0.5839 (+27% over voyage-3-large) - voyage-4-large: avg top-1 score 0.4119 (worse than current) - voyage-3-large: avg top-1 score 0.4589 (baseline) voyage-law-2 costs ~4.6x more per run but delivers significantly better retrieval quality for Hebrew legal text. Model is now configurable via VOYAGE_MODEL env var. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 19:05:58 +00:00
Chaim	d8e888ad6a	Add sync-to-DB and delete-from-DB actions for skills - POST /api/admin/skills/{slug}/sync — read SKILL.md from disk, insert/update DB - DELETE /api/admin/skills/{slug} — remove skill from DB (keeps disk files) - UI: Sync/Re-sync and Delete buttons per skill in the skills list Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 17:52:00 +00:00
Chaim	2d265d2f0e	Add Paperclip skill install/upgrade UI and API - POST /api/admin/skills/install — upload ZIP, extract to skills dir, update DB - GET /api/admin/skills — list installed skills with DB/disk sync status - POST /api/admin/paperclip/restart — restart Paperclip (pm2 or flag file) - New Skills page in web UI with drag-and-drop ZIP upload - Coolify volume mount for /paperclip-skills - Host-side crontab watcher for restart flag file Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 17:38:29 +00:00
Chaim	6a62edbdb4	Fix: add /api/health endpoint for Coolify healthcheck Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 12:15:11 +00:00
Chaim	5a8d5cac0a	Add exports panel: versioned drafts, download, upload revisions, mark final Export DOCX now saves to data/exports/{case_number}/ with auto-versioning (טיוטה-v1, v2...). The case view UI shows all drafts with download buttons, allows uploading revised versions (עריכה-v1...), and marking a version as final (copies to training corpus for style learning). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 12:10:02 +00:00
Chaim	b2f60d51f4	Remove project .mcp.json (moved to global ~/.claude.json) legal-ai MCP server now configured globally for all Claude Code sessions, including Paperclip agents. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 19:16:34 +00:00
Chaim	e1d2e18ea8	Add email notifications: agents send mail when human action needed New: scripts/notify.py — sends via SMTP (notify@marcus-law.co.il → paperclip+chaim@marcus-law.co.il) Updated: HEARTBEAT.md — agents must send email when waiting for human decision Triggers: outcome choice, direction approval, QA failures, review ready. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 17:07:43 +00:00
Chaim	22196f48cb	Enforce Hebrew-only output for all agents in HEARTBEAT.md All agent output — comments, status, errors, summaries, thinking — must be in Hebrew. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 16:58:46 +00:00
Chaim	4df2040a40	Fix: save_block_content now writes draft file + writer must update status Two issues that caused QA agent to fail: 1. save_block_content saved to DB only — now also rebuilds drafts/decision.md 2. legal-writer.md now has explicit mandatory step: case_update(status="drafted") Without these, workflow_status reports has_draft=false and QA can't run. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 15:25:53 +00:00
Chaim	85880c482e	Revert Paperclip DB URL to host embedded-postgres (localhost:54329) Paperclip moved back from Docker to pm2 on host. Reverts `c83dcd6` (Docker migration URL change). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 12:03:08 +00:00
Chaim	c83dcd660e	Update Paperclip DB URL for Docker migration Paperclip moved from embedded PG (localhost:54329) to Coolify-managed PostgreSQL container (10.0.2.13:5432). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 11:17:57 +00:00
Chaim	e6293250aa	Fix CEO agent: brainstorm directly instead of calling claude-in-claude brainstorm_directions tool uses claude -p subprocess which times out when called from inside a claude session (agent). CEO should think about directions directly — it already has all the context. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 18:37:45 +00:00
Chaim	6a93292f56	Update CEO agent with interactive decision workflow via Paperclip CEO now follows a step-by-step interactive flow: A. Check status and what's been done B. Summarize case + ask Chaim for outcome (1/2/3) C. Read response, run brainstorm, present directions D. Read direction choice, approve, launch writer agent E. Monitor writing progress F. QA and export All interaction happens through Paperclip comments. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 18:10:19 +00:00
Chaim	65e78f493c	Add CEO agent: עוזר משפטי — orchestrates all legal agents Manages the decision writing pipeline: - Creates issues and assigns to specialist agents - Tracks status across all active cases - Reports to human (Chaim) when approvals needed - Never writes or analyzes directly — delegates All 4 specialist agents now report to CEO. Hierarchy: עוזר משפטי → מנתח/חוקר/כותב/בודק Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 17:18:35 +00:00
Chaim	f4dd4f7134	Add shared HEARTBEAT.md checklist for all agents Symlinked to Paperclip instructions directory for each agent. Single source of truth: .claude/agents/ files → symlinked to Paperclip. Cleaned duplicate soul_md from DB metadata. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 17:02:15 +00:00
Chaim	4574987a69	Add 3 new agents: legal-researcher, legal-writer, legal-qa Complete agent pipeline for decision writing: 1. legal-analyst (existing) — extract claims/responses/replies 2. legal-researcher (new) — analyze precedents, plans, protocols 3. legal-writer (new) — write decision blocks in Dafna's style 4. legal-qa (new) — validate before export (6 checks) All agents use claude_local adapter (Claude Code session, zero API cost). Each has YAML frontmatter with specific tools and detailed Hebrew instructions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 16:47:28 +00:00
Chaim	9ba489ee21	Add legal-analyst agent definition in .claude/agents/ Defines the agent's role, tools, document type rules, and workflow. Linked to Paperclip agent via --agent legal-analyst extraArg. Key rules: - Claims only from appeal docs, responses from response docs, replies from supplementary - Never extract from precedents, plans, or protocols - Must report results to Paperclip before finishing Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 16:22:16 +00:00
Chaim	96ea54dc6e	Add claim_type field: distinguish claims vs responses vs replies Legal documents have 3 types of assertions: - claim: from appeal documents (כתב ערר) - response: from original responses (כתב תשובה) - reply: from supplementary responses (תגובה, השלמת טיעון) DB: added claim_type column to claims table Extractor: _infer_claim_type() auto-detects from doc_type + title Updated existing 113 records: 29 claims, 28 responses, 56 replies Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 15:35:16 +00:00
Chaim	328436f56d	Remove stale classifier import from processor.py (was deleted) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 14:45:19 +00:00
Chaim	911c797eb2	Reorganize: skills/ directory + move memory to docs/ skill-legal-decision/ → skills/decision/ skill-legal-assistant/ → skills/assistant/ skill-legal-docx/ → skills/docx/ memory/*.md → docs/ Also removed: TASKS.md (use TaskMaster), classifier.py (replaced by local_classifier.py) Updated all references in CLAUDE.md, scripts, PRDs, docs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 14:27:07 +00:00
Chaim	d5ccf03e4c	Add docs, scripts, skills, commands, and taskmaster config to repo Includes: - docs/: architecture, block-schema, migration-plan, product-specification - scripts/: bidi_table, decompose-decisions, extract-claims, seed-knowledge, etc. - skill-legal-decision/: SKILL.md + references + block-schema - skill-legal-assistant/: SKILL.md - skill-legal-docx/: SKILL.md + references - .claude/commands/: bidi-table skill - .taskmaster/: task config + PRDs - .gitignore: exclude legacy/, kiryat-yearim/, node_modules/, memory/ Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 14:19:17 +00:00
Chaim	bacb330a2a	Replace all Anthropic API calls with Claude Code session (claude -p) New module claude_session.py provides query() and query_json() that run prompts via `claude -p` CLI — uses the claude.ai session, zero API cost. Converted 6 services: - claims_extractor.py: extract_claims_with_ai - brainstorm.py: brainstorm_directions - block_writer.py: write_block (was streaming+thinking, now simple) - qa_validator.py: claims_coverage check - style_analyzer.py: 3 API calls (single pass, multi pass, synthesis) - learning_loop.py: extract_lessons Only extractor.py still uses Anthropic API (for PDF OCR with Vision). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 14:14:08 +00:00
Chaim	e5dc037088	Create Paperclip issue + plugin state link when opening new case Wizard now creates: project + issue (CMP-N) + plugin_state entry linking the issue to the legal-ai case number. This enables the sync job in the marcusgroup.legal-ai plugin to track case status. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 13:44:49 +00:00
Chaim	8db06c9ac6	Set git default branch to main in Docker image Prevents master/main mismatch when pushing new case repos to Gitea. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 13:36:10 +00:00
Chaim	52ee3419d3	Add local rule-based classifier with Claude Code headless fallback Replaces API-based classifier with: 1. Filename pattern matching (covers 95%+ of legal docs) 2. Content keyword matching for ambiguous filenames 3. Claude Code headless (claude -p) fallback for edge cases No Anthropic API calls needed for classification. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 13:14:13 +00:00
Chaim	9e7492e761	Make classification and reference extraction non-fatal in document pipeline Text extraction, chunking and embedding proceed even if Claude API classification or reference extraction fails (e.g. API quota exceeded). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 13:00:34 +00:00
Chaim	40406b5fde	Keep original filename when doc_type is auto instead of 'auto-{case}.ext' Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 12:52:18 +00:00
Chaim	561a4f7bcf	Allow .md file uploads alongside PDF, DOCX, RTF, TXT Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 12:44:30 +00:00
Chaim	10071d7f18	Prevent duplicate Paperclip projects: check existing before creating Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 11:21:06 +00:00
Chaim	5fc52ce530	Switch to cases/{new,in-progress,completed}/ directory structure Replace single CASES_DIR with find_case_dir() that searches across all status directories. New cases created in cases/new/{number}/. Config: CASES_BASE, CASES_NEW, CASES_IN_PROGRESS, CASES_COMPLETED Docker: added -v /home/chaim/legal-ai/cases:/cases volume mount Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 10:45:47 +00:00
Chaim	dc6026100c	Improve case-not-found error: show clear message with create button Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 10:27:36 +00:00
Chaim	0dfb42ab00	Fix env var loading for Docker: support GITEA_TOKEN fallback, configurable Paperclip DB URL Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 10:18:24 +00:00
Chaim	0593fe9b01	Add interactive case creation wizard + document upload with auto-rename New SPA UI with 4 views: - Case list (#/) with status cards and document counts - New case wizard (#/new) with 4-step form: details, parties, schedule, review+create - Case view (#/case/:id) with grouped documents and drag-drop upload with tagging - Legacy upload (#/upload) for backwards compatibility Auto-creation pipeline in wizard step 4: 1. Creates case in legal-ai DB with local git repo 2. Creates Gitea repo in 'cases' org and pushes initial commit 3. Creates Paperclip project via direct DB insert Document upload with smart rename: - scan_001.pdf -> כתב-ערר-קובר-1130-25.pdf - Based on doc_type + party_name + case_number New files: - web/gitea_client.py: Gitea REST API client - web/paperclip_client.py: Paperclip embedded DB client Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 10:17:24 +00:00
Chaim	cb41867bc9	Remove din-leumi: fully separate into standalone service - Removed din-leumi imports, endpoints, and processing from app.py - Removed bundled din-leumi source from repo - Simplified Dockerfile (no din-leumi dependency) - din-leumi now runs as its own Coolify application Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 08:34:35 +00:00
Chaim	324807ff1d	Fix Docker build: bundle din-leumi instead of git clone Removes GITEA_TOKEN dependency from build by copying din-leumi MCP server source directly into the Docker context. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 08:21:31 +00:00
Chaim	316dd2aefb	Redesign upload UI + update Gitea org references web/static/index.html: Complete redesign with clean modern layout: - RTL Hebrew throughout - Two-column layout: upload zone + pending files - Cleaner drag & drop with visual feedback - Improved classification form with radio buttons - Better progress tracking display - Status bar with system metrics CLAUDE.md: Updated Gitea URL to new org ezer-mishpati/legal-ai Closes ezer-mishpati/legal-ai#1 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 08:10:18 +00:00
Chaim	59bb471368	Add expanded workflow API endpoints and update CLAUDE.md New endpoints: outcome, direction, claims, QA validation, learning loop, document text retrieval. Updated Dockerfile and project documentation. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 08:04:28 +00:00
Chaim	081c7fb17a	Replace Haiku with Sonnet in classifier for better accuracy classify_document and identify_parties both used Haiku, which produced parsing failures and 0% confidence on Beit HaKerem documents. Sonnet handles Hebrew legal documents more reliably. No more Haiku usage in the entire codebase. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 07:47:12 +00:00
Chaim	586f1db402	QA claims check: Haiku→Sonnet + filter appellant claims only Two fixes for claims_coverage false negatives (55% → expected ~85%+): 1. Model upgrade: Haiku → Sonnet for semantic matching. Haiku missed obvious matches (e.g., paragraph about "כריתת עצים" not matching claim about tree cutting). Sonnet understands context better. 2. Filter: only check appellant/respondent claims, not committee or permit_applicant claims. Committee claims are defensive positions ("the application complies with the plan") — they don't need to be "addressed" in the discussion section. 3. Send full discussion text (was truncated to 12K chars). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 07:37:23 +00:00
Chaim	9d0a73a1dc	Add context-only mode: Claude Code writes blocks, no API needed New architecture: MCP provides context, Claude Code writes. New functions: - get_block_context(case_id, block_id) → returns full context package (prompt, source docs, claims, direction, precedents, style guide) WITHOUT calling Anthropic API - save_block_content(case_id, block_id, content) → saves block to DB New MCP tools: get_block_context, save_block_content The old write_block (API-based) still works as fallback. The new flow uses Claude Code's own model (Opus 4.6, 1M context) which has no separate API billing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 16:18:25 +00:00
Chaim	7033d2d3ee	Embed full style guide in block prompts for Dafna's voice _build_style_context rewritten from 10-line summary to comprehensive style guide including: - Tone rules per appeal type (warm for licensing, cold for levy) - 15 mandatory expressions ("כידוע", "ברי כי", "אין בידנו לקבל") - Discussion structure rules (continuous prose, conclusion first) - Per-party phrasing templates (appellants, committee, permit applicants) - DB patterns grouped by type (phrases, transitions, openings, closings) This addresses the main quality gap: style rated 2/5 because the output was "dry and overly formal" vs Dafna's "direct and clear" voice. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 16:12:09 +00:00
Chaim	e725f9ecd7	Fix claims parsing: truncated JSON recovery + chunking + compact output config.py parse_llm_json: Added truncated JSON recovery. When Claude's output is cut mid-JSON (common with long claim lists), the parser now: - Finds the last complete JSON item (closing "}") - Closes the array/object brackets - Returns partial but valid results instead of None Tested: recovers 2/3 items from truncated array, all cases pass. claims_extractor.py: - Prompt asks for compact output (150 words max per claim, group similar) - Explicitly requests "no markdown, no explanations, JSON only" - Long documents split into chunks at paragraph boundaries - Each chunk processed separately, results merged - max_tokens already at 8192 This fixes the recurring "0 claims" bug for committee responses and permit applicant responses where the JSON was getting truncated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 16:04:34 +00:00
Chaim	7d1dc73112	Fix max_tokens to 16K for Opus (API limit is 32K, need room for thinking) block-yod max_tokens reduced from 32K to 16K — the API returned "max_tokens: 32768 > 32000" error. With thinking enabled, the actual limit for output is lower. 16K is sufficient for discussion blocks. Also: extractor.py now supports .md files (was missing, blocked Beit HaKerem upload). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 16:00:49 +00:00
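
The truncated-JSON recovery described in commit e725f9ecd7 (find the last complete item, close the open brackets, return partial results instead of None) could look roughly like the sketch below. This is not the repository's parse_llm_json, whose source does not appear in this log; the function name recover_truncated_json and the list of closing suffixes it tries are assumptions made for illustration.

```python
import json


def recover_truncated_json(text):
    """Parse JSON, falling back to the last complete item if the text was cut off.

    Hypothetical helper for illustration; not the repository's parse_llm_json.
    Returns the parsed value, or None when nothing valid can be recovered.
    """
    text = text.strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass

    # Walk backwards over each closing brace and try to close the structure there.
    pos = text.rfind("}")
    while pos != -1:
        candidate = text[: pos + 1]
        # Assumed shapes: a bare array, an array inside one object, a nested object.
        for suffix in ("", "]", "]}", "}"):
            try:
                return json.loads(candidate + suffix)
            except json.JSONDecodeError:
                continue
        pos = text.rfind("}", 0, pos)
    return None


if __name__ == "__main__":
    cut = '[{"claim": "a"}, {"claim": "b"}, {"claim": "c", "par'
    print(recover_truncated_json(cut))
```

Trying a parse at each candidate position, rather than counting brackets by hand, means a brace inside a string value is simply rejected by the parser and the search moves on to the previous one. For the three-item array above, the first two items are returned, which matches the "recovers 2/3 items" case in the commit message.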

65 Commits
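
The relative figures quoted in commit bc72a83a71 can be checked from the three average top-1 scores it reports. The snippet below is only an arithmetic check using those numbers; the helper name relative_change is hypothetical and nothing here reproduces the benchmark itself.

```python
# Average top-1 scores as reported in commit bc72a83a71 (case 1130-25, 8 queries).
scores = {
    "voyage-3-large": 0.4589,  # baseline
    "voyage-law-2": 0.5839,
    "voyage-4-large": 0.4119,
}


def relative_change(score, baseline):
    """Percentage change of score relative to baseline (hypothetical helper)."""
    return (score - baseline) / baseline * 100


if __name__ == "__main__":
    baseline = scores["voyage-3-large"]
    for model, score in scores.items():
        print(f"{model}: {relative_change(score, baseline):+.1f}%")
```

With these inputs voyage-law-2 comes out at roughly +27% over the baseline, consistent with the commit message, and voyage-4-large at roughly -10%.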