legal-ai

Author	SHA1	Message	Date
Chaim	28f49defff	LLM session: async, 30min timeout, semantic chunking + parallel All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m28s Details The claude_session bridge had two structural defects that made any non-trivial document extraction unreliable: 1. subprocess.run() blocks the asyncio event loop in the MCP server for the full duration of every LLM call (60-180s typical). 2. The 120-second timeout was below the cold-cache cost of any document over ~12K Hebrew characters. Three back-to-back timeouts on case 8174-24 dropped 43 appellant claims on the floor. Phase 1 of the remediation plan — keeps claude_session as the engine (no Anthropic API switch) and restructures around it: claude_session.py • query / query_json are now async — asyncio.create_subprocess_exec instead of subprocess.run, so MCP server can serve other coroutines while a call is in flight. • DEFAULT_TIMEOUT 120 → 1800 (30 min). High enough that no realistic document hits it; bounded so a runaway never zombifies forever. • LONG_TIMEOUT 300 → 3600 for opus block writing on full case context. • TimeoutError now actually kills the subprocess (asyncio.wait_for cancellation alone leaves the child running). claims_extractor.py • _split_by_sections: chunks at numbered sections / Hebrew letter headings / "פרק" markers / markdown ##, falls back to paragraph breaks, then to hard splits. Targets 12K chars per chunk — small enough that each chunk reliably finishes inside the timeout. • _extract_chunk: per-chunk retry (1 attempt by default) with structured logging on failure. Failed chunks no longer crash the overall extraction; they're skipped with a partial-result warning. • extract_claims_with_ai now runs chunks in parallel via asyncio.gather bounded by a semaphore (CHUNK_CONCURRENCY=3). For a 25K-char appeal: was sequential 150-300s, now ~70-90s. Updated all 9 callers (claims, appraiser facts, block writer, qa validator, brainstorm, learning loop, style analyzer × 3) to await the now-async API. The one-shot scripts/extract_claims_8174.py used to recover 43 appellant claims on case 8174-24 has been moved to .archive/ — phase 1 makes it obsolete. SCRIPTS.md updated. Phase 2 (background-task wrapper around LLM-bound MCP tools, persistent llm_tasks table, SSE progress) is the structural follow-up — separate PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 14:21:35 +00:00
Chaim	96ea54dc6e	Add claim_type field: distinguish claims vs responses vs replies Legal documents have 3 types of assertions: - claim: from appeal documents (כתב ערר) - response: from original responses (כתב תשובה) - reply: from supplementary responses (תגובה, השלמת טיעון) DB: added claim_type column to claims table Extractor: _infer_claim_type() auto-detects from doc_type + title Updated existing 113 records: 29 claims, 28 responses, 56 replies Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 15:35:16 +00:00
Chaim	bacb330a2a	Replace all Anthropic API calls with Claude Code session (claude -p) New module claude_session.py provides query() and query_json() that run prompts via `claude -p` CLI — uses the claude.ai session, zero API cost. Converted 6 services: - claims_extractor.py: extract_claims_with_ai - brainstorm.py: brainstorm_directions - block_writer.py: write_block (was streaming+thinking, now simple) - qa_validator.py: claims_coverage check - style_analyzer.py: 3 API calls (single pass, multi pass, synthesis) - learning_loop.py: extract_lessons Only extractor.py still uses Anthropic API (for PDF OCR with Vision). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 14:14:08 +00:00
Chaim	e725f9ecd7	Fix claims parsing: truncated JSON recovery + chunking + compact output config.py parse_llm_json: Added truncated JSON recovery. When Claude's output is cut mid-JSON (common with long claim lists), the parser now: - Finds the last complete JSON item (closing "}") - Closes the array/object brackets - Returns partial but valid results instead of None Tested: recovers 2/3 items from truncated array, all cases pass. claims_extractor.py: - Prompt asks for compact output (150 words max per claim, group similar) - Explicitly requests "no markdown, no explanations, JSON only" - Long documents split into chunks at paragraph boundaries - Each chunk processed separately, results merged - max_tokens already at 8192 This fixes the recurring "0 claims" bug for committee responses and permit applicant responses where the JSON was getting truncated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 16:04:34 +00:00
Chaim	e24e24dac5	Maximize context and output per Anthropic best practices Per official Anthropic documentation (April 2026): Output tokens increased to match model capabilities: - block-yod (discussion): 8K → 32K (Opus supports 128K) - block-zayin (claims): 4K → 16K - block-vav (background): 4K → 16K - claims_extractor: 4K → 8K (fixes truncated JSON) - qa_validator: 4K → 8K Source documents sent in full (not truncated): - Was: 3000 chars per doc, 15K total - Now: full document text, no truncation - Reduces hallucinations: "extract word-for-word quotes first" Prompt structure follows long-context tips: - Source documents placed FIRST (top of prompt) - Instructions and query placed LAST - "Queries at the end improve quality by up to 30%" Extended thinking uses adaptive mode for Opus 4.6. Streaming enabled for all requests > 21K tokens. Unified JSON parsing via parse_llm_json() helper in config.py. Applied to: classifier, claims_extractor, brainstorm, qa_validator, learning_loop (5 files). Also: extractor.py now supports .md files. Sources: - https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking - https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips - https://docs.anthropic.com/en/docs/minimizing-hallucinations - https://docs.anthropic.com/en/docs/about-claude/models/overview Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 14:17:43 +00:00
Chaim	bed9d5c7e9	Improve block-zayin: synthesize claims by topic + fix markdown JSON parsing block_writer: Rewrote block-zayin prompt to require synthesis by topic instead of listing each claim separately. Now produces 3 organized sections (appellants 8, committee 6, permit applicants 3+) instead of 40 scattered paragraphs. Target: 800-1500 words. claims_extractor: Fix markdown code block stripping (same bug as qa_validator had). Enables parsing claims from Claude responses wrapped in ```json blocks. Tested on Hecht: block-zayin from 40 paragraphs/1049 words to 17 organized paragraphs/1039 words. Structure now matches Dafna's original (3 parties, grouped by topic). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 12:54:42 +00:00
Chaim	d9e5ef0f46	Add full decision writing pipeline: classify, extract, brainstorm, write, QA, export New services (11 files): - classifier.py: auto doc-type classification + party identification (Claude Haiku) - claims_extractor.py: claim extraction from pleadings (Claude Sonnet + regex) - references_extractor.py: plan/case-law/legislation detection (regex) - brainstorm.py: direction generation with 2-3 options (Claude Sonnet) - block_writer.py: 12-block decision writer (template + Claude Sonnet/Opus) - docx_exporter.py: DOCX export with David font, RTL, headings - qa_validator.py: 6 QA checks with export blocking on critical failure - learning_loop.py: draft vs final comparison + lesson extraction - metrics.py: KPIs dashboard per case and global - audit.py: action audit log - cli.py: standalone CLI with 11 commands Updated pipeline: extract → classify → chunk → embed → store → extract_references New MCP tools: 29 total (was 16) New DB tables: audit_log, decisions CRUD, claims CRUD Config: Infisical support, external service allowlist Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 10:21:47 +00:00

7 Commits