- New API endpoint /api/cases/{num}/local-files lists files from disk
- New API endpoint /api/cases/{num}/local-files/{folder}/{file} serves file content
- Case view now shows research/analysis files, proofread texts, and draft decisions
- Files are clickable and open in new tab
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove cases/new|in-progress|completed subdivision (status managed in DB)
- Rename documents/original → documents/originals (consistent plural)
- Move exports from global data/exports/ into cases/{num}/exports/
- Add documents/research/ for case law and analysis files
- Update all agents, scripts, config, web API endpoints, and DB paths
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Benchmark on case 1130-25 (4 Hebrew legal docs, 8 queries) showed:
- voyage-law-2: avg top-1 score 0.5839 (+27% over voyage-3-large)
- voyage-4-large: avg top-1 score 0.4119 (worse than current)
- voyage-3-large: avg top-1 score 0.4589 (baseline)
voyage-law-2 costs ~4.6x more per run but delivers significantly
better retrieval quality for Hebrew legal text. Model is now
configurable via VOYAGE_MODEL env var.
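A minimal sketch of how the model name could be read from the environment. The helper name and the fallback default are assumptions; the commit does not say which default the real config uses.

```python
import os

# Assumption: fall back to voyage-law-2 when VOYAGE_MODEL is unset.
DEFAULT_VOYAGE_MODEL = "voyage-law-2"


def get_voyage_model(env=None):
    """Return the embedding model name, honoring VOYAGE_MODEL if it is set."""
    env = os.environ if env is None else env
    value = env.get("VOYAGE_MODEL", "").strip()
    return value or DEFAULT_VOYAGE_MODEL
```

Passing a plain dict as `env` makes the lookup easy to exercise without touching the real process environment.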
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- POST /api/admin/skills/{slug}/sync — read SKILL.md from disk, insert/update DB
- DELETE /api/admin/skills/{slug} — remove skill from DB (keeps disk files)
- UI: Sync/Re-sync and Delete buttons per skill in the skills list
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- POST /api/admin/skills/install — upload ZIP, extract to skills dir, update DB
- GET /api/admin/skills — list installed skills with DB/disk sync status
- POST /api/admin/paperclip/restart — restart Paperclip (pm2 or flag file)
- New Skills page in web UI with drag-and-drop ZIP upload
- Coolify volume mount for /paperclip-skills
- Host-side crontab watcher for restart flag file
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Export DOCX now saves to data/exports/{case_number}/ with auto-versioning
(טיוטה-v1, טיוטה-v2, ...; טיוטה means "draft"). The case view UI shows all
drafts with download buttons, allows uploading revised versions (עריכה-v1, ...;
עריכה means "edit"), and allows marking a version as final (the final version
is copied to the training corpus for style learning).
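The auto-versioning described above might look roughly like this. The function name and the `.docx` extension default are assumptions for illustration; only the `<prefix>-vN` naming comes from the commit.

```python
import re


def next_version_name(existing_names, prefix="טיוטה", ext=".docx"):
    """Return the next free '<prefix>-vN<ext>' name given existing filenames."""
    pattern = re.compile(rf"^{re.escape(prefix)}-v(\d+){re.escape(ext)}$")
    versions = [
        int(m.group(1))
        for name in existing_names
        if (m := pattern.match(name))
    ]
    # Files with another prefix (e.g. uploaded edits) do not affect the count.
    return f"{prefix}-v{max(versions, default=0) + 1}{ext}"
```

Taking the highest existing number plus one, rather than counting files, avoids reusing a version number after a draft is deleted.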
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
legal-ai MCP server now configured globally for all Claude Code sessions,
including Paperclip agents.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New: scripts/notify.py — sends via SMTP (notify@marcus-law.co.il → paperclip+chaim@marcus-law.co.il)
Updated: HEARTBEAT.md — agents must send email when waiting for human decision
Triggers: outcome choice, direction approval, QA failures, review ready.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
All agent output — comments, status, errors, summaries, thinking — must be in Hebrew.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Two issues caused the QA agent to fail:
1. save_block_content saved to DB only — now also rebuilds drafts/decision.md
2. legal-writer.md now has explicit mandatory step: case_update(status="drafted")
Without these, workflow_status reports has_draft=false and QA can't run.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Paperclip moved back from Docker to pm2 on host.
Reverts c83dcd6 (Docker migration URL change).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Paperclip moved from embedded PG (localhost:54329) to Coolify-managed
PostgreSQL container (10.0.2.13:5432).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
brainstorm_directions tool uses claude -p subprocess which times out
when called from inside a claude session (agent). CEO should think
about directions directly — it already has all the context.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
CEO now follows a step-by-step interactive flow:
A. Check status and what's been done
B. Summarize case + ask Chaim for outcome (1/2/3)
C. Read response, run brainstorm, present directions
D. Read direction choice, approve, launch writer agent
E. Monitor writing progress
F. QA and export
All interaction happens through Paperclip comments.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Manages the decision writing pipeline:
- Creates issues and assigns to specialist agents
- Tracks status across all active cases
- Reports to human (Chaim) when approvals are needed
- Never writes or analyzes directly — delegates
All 4 specialist agents now report to CEO.
Hierarchy: עוזר משפטי (legal assistant) → מנתח/חוקר/כותב/בודק (analyst/researcher/writer/checker)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Symlinked to Paperclip instructions directory for each agent.
Single source of truth: .claude/agents/ files → symlinked to Paperclip.
Cleaned duplicate soul_md from DB metadata.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Complete agent pipeline for decision writing:
1. legal-analyst (existing) — extract claims/responses/replies
2. legal-researcher (new) — analyze precedents, plans, protocols
3. legal-writer (new) — write decision blocks in Dafna's style
4. legal-qa (new) — validate before export (6 checks)
All agents use claude_local adapter (Claude Code session, zero API cost).
Each has YAML frontmatter with specific tools and detailed Hebrew instructions.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Defines the agent's role, tools, document type rules, and workflow.
Linked to Paperclip agent via --agent legal-analyst extraArg.
Key rules:
- Claims only from appeal docs, responses from response docs, replies from supplementary
- Never extract from precedents, plans, or protocols
- Must report results to Paperclip before finishing
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New module claude_session.py provides query() and query_json() that
run prompts via `claude -p` CLI — uses the claude.ai session, zero API cost.
Converted 6 services:
- claims_extractor.py: extract_claims_with_ai
- brainstorm.py: brainstorm_directions
- block_writer.py: write_block (was streaming+thinking, now simple)
- qa_validator.py: claims_coverage check
- style_analyzer.py: 3 API calls (single pass, multi pass, synthesis)
- learning_loop.py: extract_lessons
Only extractor.py still uses Anthropic API (for PDF OCR with Vision).
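A rough sketch of what query() and query_json() could look like. The real claude_session.py is not shown in this commit, so the timeout value, the error handling, and the injectable `runner` parameter are assumptions.

```python
import json
import subprocess


def query(prompt, timeout=300, runner=subprocess.run):
    """Run a prompt through `claude -p` and return its stdout as text."""
    result = runner(
        ["claude", "-p", prompt],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if result.returncode != 0:
        raise RuntimeError(f"claude -p failed: {result.stderr.strip()}")
    return result.stdout.strip()


def query_json(prompt, **kwargs):
    """Like query(), but parse the reply as JSON."""
    return json.loads(query(prompt, **kwargs))
```

The `runner` argument exists only so the wrapper can be tested without the CLI installed; production callers would leave it at its default.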
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wizard now creates: project + issue (CMP-N) + plugin_state entry
linking the issue to the legal-ai case number. This enables the
sync job in the marcusgroup.legal-ai plugin to track case status.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces API-based classifier with:
1. Filename pattern matching (covers 95%+ of legal docs)
2. Content keyword matching for ambiguous filenames
3. Claude Code headless (claude -p) fallback for edge cases
No Anthropic API calls needed for classification.
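The three tiers could be arranged as below. The pattern and keyword tables are hypothetical placeholders (the real rules are not in this commit), and `fallback` stands in for the `claude -p` call.

```python
import re

# Hypothetical rule tables for illustration only.
FILENAME_PATTERNS = [
    (re.compile(r"appeal", re.IGNORECASE), "appeal"),
    (re.compile(r"response", re.IGNORECASE), "response"),
    (re.compile(r"protocol", re.IGNORECASE), "protocol"),
]
CONTENT_KEYWORDS = {
    "appeal": ["hereby appeal"],
    "response": ["in response to the appeal"],
}


def classify(filename, text, fallback=None):
    """Return (doc_type, method) using the cheapest tier that matches."""
    # Tier 1: filename patterns.
    for pattern, doc_type in FILENAME_PATTERNS:
        if pattern.search(filename):
            return doc_type, "filename"
    # Tier 2: content keywords, for ambiguous filenames.
    lowered = text.lower()
    for doc_type, keywords in CONTENT_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return doc_type, "content"
    # Tier 3: optional model fallback for edge cases.
    if fallback is not None:
        return fallback(filename, text), "fallback"
    return "unknown", "none"
```

Returning the method alongside the type makes it easy to measure how often each tier is actually used.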
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Text extraction, chunking and embedding proceed even if Claude API
classification or reference extraction fails (e.g. API quota exceeded).
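One way to keep the pipeline moving past a failed optional step is a small wrapper like this. The helper name and its signature are assumptions, not the project's actual code.

```python
import logging

logger = logging.getLogger(__name__)


def run_optional(step_name, func, *args, default=None):
    """Run a non-critical step; log and return `default` if it raises."""
    try:
        return func(*args)
    except Exception as exc:  # broad on purpose: any failure here is non-fatal
        logger.warning("%s failed, continuing: %s", step_name, exc)
        return default
```

Classification and reference extraction would go through `run_optional`, while text extraction, chunking and embedding are called directly so their errors still surface.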
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace single CASES_DIR with find_case_dir() that searches across
all status directories. New cases created in cases/new/{number}/.
Config: CASES_BASE, CASES_NEW, CASES_IN_PROGRESS, CASES_COMPLETED
Docker: added -v /home/chaim/legal-ai/cases:/cases volume mount
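A minimal sketch of the lookup, assuming the three status directories are named new, in-progress and completed under the cases base. The real find_case_dir() signature is not shown here.

```python
from pathlib import Path

# Assumed directory names under the cases base directory.
STATUS_DIRS = ("new", "in-progress", "completed")


def find_case_dir(cases_base, case_number):
    """Return the first existing <base>/<status>/<case_number> dir, or None."""
    base = Path(cases_base)
    for status in STATUS_DIRS:
        candidate = base / status / case_number
        if candidate.is_dir():
            return candidate
    return None
```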
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New SPA UI with 4 views:
- Case list (#/) with status cards and document counts
- New case wizard (#/new) with 4-step form: details, parties, schedule, review+create
- Case view (#/case/:id) with grouped documents and drag-drop upload with tagging
- Legacy upload (#/upload) for backwards compatibility
Auto-creation pipeline in wizard step 4:
1. Creates case in legal-ai DB with local git repo
2. Creates Gitea repo in 'cases' org and pushes initial commit
3. Creates Paperclip project via direct DB insert
Document upload with smart rename:
- scan_001.pdf -> כתב-ערר-קובר-1130-25.pdf
- Based on doc_type + party_name + case_number
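The rename rule can be sketched as follows. The function name and the choice to replace spaces with hyphens are assumptions inferred from the example filename above.

```python
from pathlib import Path


def smart_rename(original_name, doc_type, party_name, case_number):
    """Build '<doc_type>-<party_name>-<case_number><ext>' from the parts."""
    ext = Path(original_name).suffix
    parts = [doc_type, party_name, case_number]
    # Skip empty parts and turn inner spaces into hyphens.
    stem = "-".join(p.strip().replace(" ", "-") for p in parts if p and p.strip())
    return f"{stem}{ext}"
```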
New files:
- web/gitea_client.py: Gitea REST API client
- web/paperclip_client.py: Paperclip embedded DB client
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Removed din-leumi imports, endpoints, and processing from app.py
- Removed bundled din-leumi source from repo
- Simplified Dockerfile (no din-leumi dependency)
- din-leumi now runs as its own Coolify application
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Removes GITEA_TOKEN dependency from build by copying din-leumi
MCP server source directly into the Docker context.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
web/static/index.html: Complete redesign with clean modern layout:
- RTL Hebrew throughout
- Two-column layout: upload zone + pending files
- Cleaner drag & drop with visual feedback
- Improved classification form with radio buttons
- Better progress tracking display
- Status bar with system metrics
CLAUDE.md: Updated Gitea URL to new org ezer-mishpati/legal-ai
Closes ezer-mishpati/legal-ai#1
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New endpoints: outcome, direction, claims, QA validation, learning loop,
document text retrieval. Updated Dockerfile and project documentation.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
classify_document and identify_parties both used Haiku, which produced
parsing failures and 0% confidence on Beit HaKerem documents.
Sonnet handles Hebrew legal documents more reliably.
No more Haiku usage in the entire codebase.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Three fixes for claims_coverage false negatives (55% → expected ~85%+):
1. Model upgrade: Haiku → Sonnet for semantic matching. Haiku missed
obvious matches (e.g., paragraph about "כריתת עצים" not matching
claim about tree cutting). Sonnet understands context better.
2. Filter: only check appellant/respondent claims, not committee or
permit_applicant claims. Committee claims are defensive positions
("the application complies with the plan") — they don't need to
be "addressed" in the discussion section.
3. Send full discussion text (was truncated to 12K chars).
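The party filter in fix 2 amounts to something like the following. The role names come from the text above; the `party_role` key is a hypothetical field name.

```python
# Only these roles raise claims that the discussion section must address.
CHECKED_ROLES = {"appellant", "respondent"}


def claims_to_check(claims):
    """Keep only claims whose party role is subject to the coverage check."""
    # "party_role" is an assumed key; the real claim schema is not shown.
    return [c for c in claims if c.get("party_role") in CHECKED_ROLES]
```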
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New architecture: MCP provides context, Claude Code writes.
New functions:
- get_block_context(case_id, block_id) → returns full context package
(prompt, source docs, claims, direction, precedents, style guide)
WITHOUT calling Anthropic API
- save_block_content(case_id, block_id, content) → saves block to DB
New MCP tools: get_block_context, save_block_content
The old write_block (API-based) still works as fallback.
The new flow uses Claude Code's own model (Opus 4.6, 1M context)
which has no separate API billing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
_build_style_context rewritten from 10-line summary to comprehensive
style guide including:
- Tone rules per appeal type (warm for licensing, cold for levy)
- 15 mandatory expressions (e.g. "כידוע" [as is known], "ברי כי" [it is clear that], "אין בידנו לקבל" [we cannot accept])
- Discussion structure rules (continuous prose, conclusion first)
- Per-party phrasing templates (appellants, committee, permit applicants)
- DB patterns grouped by type (phrases, transitions, openings, closings)
This addresses the main quality gap: style rated 2/5 because the output
was "dry and overly formal" vs Dafna's "direct and clear" voice.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
config.py parse_llm_json: Added truncated JSON recovery. When Claude's
output is cut mid-JSON (common with long claim lists), the parser now:
- Finds the last complete JSON item (closing "}")
- Closes the array/object brackets
- Returns partial but valid results instead of None
Tested: recovers 2/3 items from truncated array, all cases pass.
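A simplified sketch of the recovery idea, limited to top-level JSON arrays. It is not the actual parse_llm_json code; the function name is made up for illustration.

```python
import json


def parse_truncated_json_array(text):
    """Parse a JSON array, salvaging complete items if the text is cut off."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Walk backwards over each closing brace and try to close the array there.
    end = text.rfind("}")
    while end != -1:
        candidate = text[: end + 1] + "]"
        try:
            return json.loads(candidate)
        except json.JSONDecodeError:
            end = text.rfind("}", 0, end)
    return None
```

For example, an array cut off inside its third object still yields the first two items:

```python
truncated = '[{"claim": "a"}, {"claim": "b"}, {"claim": "c'
print(parse_truncated_json_array(truncated))
```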
claims_extractor.py:
- Prompt asks for compact output (150 words max per claim, group similar)
- Explicitly requests "no markdown, no explanations, JSON only"
- Long documents split into chunks at paragraph boundaries
- Each chunk processed separately, results merged
- max_tokens already at 8192
This fixes the recurring "0 claims" bug for committee responses and
permit applicant responses where the JSON was getting truncated.
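Splitting at paragraph boundaries can be sketched as below, assuming paragraphs are separated by blank lines and that size is measured in characters. The real chunk size and splitting rule are not given in this commit.

```python
def split_at_paragraphs(text, max_chars):
    """Group paragraphs into chunks of up to max_chars characters.

    A single paragraph longer than max_chars is kept whole rather than cut.
    """
    chunks, current = [], ""
    for para in text.split("\n\n"):
        para = para.strip()
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = para
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be sent to the extractor separately and the resulting claim lists concatenated.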
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
block-yod max_tokens reduced from 32K to 16K — the API returned
"max_tokens: 32768 > 32000" error. With thinking enabled, the actual
limit for output is lower. 16K is sufficient for discussion blocks.
Also: extractor.py now supports .md files (was missing, blocked
Beit HaKerem upload).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
block_writer: Rewrote block-zayin prompt to require synthesis by topic
instead of listing each claim separately. Now produces 3 organized
sections (appellants: 8 paragraphs, committee: 6, permit applicants: 3+)
instead of 40 scattered paragraphs. Target: 800-1500 words.
claims_extractor: Fix markdown code block stripping (same bug as
qa_validator had). Enables parsing claims from Claude responses
wrapped in ```json blocks.
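Stripping a fenced reply before parsing might look like this. It is a sketch, not the project's code; the fence marker is built programmatically only so the example itself stays fence-safe.

```python
import json
import re

FENCE = "`" * 3  # three backticks
_FENCED = re.compile(
    rf"^\s*{FENCE}[\w-]*\s*\n(.*?)\n?\s*{FENCE}\s*$", re.DOTALL
)


def strip_code_fence(text):
    """Return the payload of a fenced reply, or the stripped text unchanged."""
    match = _FENCED.match(text)
    return match.group(1).strip() if match else text.strip()


def parse_claims(reply):
    """Parse a model reply as JSON after removing any surrounding fence."""
    return json.loads(strip_code_fence(reply))
```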
Tested on Hecht: block-zayin from 40 paragraphs/1049 words to
17 organized paragraphs/1039 words. Structure now matches Dafna's
original (3 parties, grouped by topic).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
block_writer: new renumber_all_blocks() function that renumbers all
paragraphs across all blocks sequentially (1, 2, 3...). Handles both
plain "N." and bold "**N.**" formats. Added missing 'import re'.
qa_validator: sequential_numbering check now matches bold-formatted
numbers (**N.**) in addition to plain (N.).
Tested on Hecht: renumbered 115 paragraphs across 7 blocks, QA 6/6.
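The renumbering and the matching QA check can be sketched with one shared regex. This is an illustration under the assumption that paragraph numbers sit at the start of a line; it is not the actual renumber_all_blocks() implementation.

```python
import re

# Matches "N." or "**N.**" at the start of a line, followed by whitespace.
_NUMBERED = re.compile(r"^(?:\*\*(\d+)\.\*\*|(\d+)\.)(?=\s)", re.MULTILINE)


def renumber_paragraphs(blocks):
    """Renumber paragraphs 1, 2, 3... across all blocks, keeping bold style."""
    counter = 0

    def _replace(match):
        nonlocal counter
        counter += 1
        # Group 1 is set only for the bold form.
        return f"**{counter}.**" if match.group(1) else f"{counter}."

    return [_NUMBERED.sub(_replace, block) for block in blocks]


def is_sequential(blocks):
    """Check that paragraph numbers run 1..N across blocks without gaps."""
    numbers = [
        int(m.group(1) or m.group(2))
        for block in blocks
        for m in _NUMBERED.finditer(block)
    ]
    return numbers == list(range(1, len(numbers) + 1))
```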
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>