legal-ai

Author	SHA1	Message	Date
Chaim	be9fa9e712	Add decision-writing methodology based on FJC, Garner, Posner sources "School for Decisions" ("בית ספר להחלטות") Phase 2: the system now has formal analytical methodology for building quasi-judicial decisions, separate from Dafna's writing style (SKILL.md) and content checklists. What was done: - Downloaded 5 authoritative sources (~341K words): FJC Judicial Writing Manual (1991+2020), Garner Legal Writing in Plain English, Posner How Judges Think, Scalia/Garner Making Your Case - Extracted principles from all sources into intermediate docs - Synthesized into docs/decision-methodology.md (3,400 words, 12 sections, 10 guiding principles) - Integrated methodology into block-yod prompt via {methodology_guidance} - Restructured legal-writer agent workflow to follow analytical stages - Made "answer all claims" flexible (bundle/skip via chair_directions) - Added methodology compliance check (#7) to legal-qa agent - Updated all knowledge files (CLAUDE.md, SKILL.md, lessons, corpus) Three-layer architecture: 1. Methodology (decision-methodology.md): universal, how to think 2. Content checklists (lessons.py): specific per appeal subtype 3. Style (SKILL.md): Dafna's personal writing patterns Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 23:29:16 +00:00
Chaim	0fef20e272	Add content checklists for block-yod and chair feedback system Addresses Dafna's observation that licensing decisions lack comprehensive planning discussion. Systematic corpus analysis of all 24 training decisions revealed the system learned writing style but not substantive content. Changes: - Corpus analysis of all 24 decisions (docs/corpus-analysis.md) - 5 content checklists by appeal subtype injected into block-yod prompt - chair_feedback DB table + API endpoints + MCP tools - Feedback management page in Next.js UI (/feedback) - Navigation updated with "הערות יו״ר" ("Chair's Comments") link Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-12 20:58:28 +00:00
Chaim	4df2040a40	Fix: save_block_content now writes draft file + writer must update status Two issues that caused QA agent to fail: 1. save_block_content saved to DB only — now also rebuilds drafts/decision.md 2. legal-writer.md now has explicit mandatory step: case_update(status="drafted") Without these, workflow_status reports has_draft=false and QA can't run. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 15:25:53 +00:00
Chaim	bacb330a2a	Replace all Anthropic API calls with Claude Code session (claude -p) New module claude_session.py provides query() and query_json() that run prompts via `claude -p` CLI — uses the claude.ai session, zero API cost. Converted 6 services: - claims_extractor.py: extract_claims_with_ai - brainstorm.py: brainstorm_directions - block_writer.py: write_block (was streaming+thinking, now simple) - qa_validator.py: claims_coverage check - style_analyzer.py: 3 API calls (single pass, multi pass, synthesis) - learning_loop.py: extract_lessons Only extractor.py still uses Anthropic API (for PDF OCR with Vision). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-04 14:14:08 +00:00
Chaim	9d0a73a1dc	Add context-only mode: Claude Code writes blocks, no API needed New architecture: MCP provides context, Claude Code writes. New functions: - get_block_context(case_id, block_id) → returns full context package (prompt, source docs, claims, direction, precedents, style guide) WITHOUT calling Anthropic API - save_block_content(case_id, block_id, content) → saves block to DB New MCP tools: get_block_context, save_block_content The old write_block (API-based) still works as fallback. The new flow uses Claude Code's own model (Opus 4.6, 1M context) which has no separate API billing. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 16:18:25 +00:00
Chaim	7033d2d3ee	Embed full style guide in block prompts for Dafna's voice _build_style_context rewritten from 10-line summary to comprehensive style guide including: - Tone rules per appeal type (warm for licensing, cold for levy) - 15 mandatory expressions ("כידוע" ["as is known"], "ברי כי" ["it is clear that"], "אין בידנו לקבל" ["we are unable to accept"]) - Discussion structure rules (continuous prose, conclusion first) - Per-party phrasing templates (appellants, committee, permit applicants) - DB patterns grouped by type (phrases, transitions, openings, closings) This addresses the main quality gap: style rated 2/5 because the output was "dry and overly formal" vs Dafna's "direct and clear" voice. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 16:12:09 +00:00
Chaim	7d1dc73112	Fix max_tokens to 16K for Opus (API limit is 32K, need room for thinking) block-yod max_tokens reduced from 32K to 16K — the API returned "max_tokens: 32768 > 32000" error. With thinking enabled, the actual limit for output is lower. 16K is sufficient for discussion blocks. Also: extractor.py now supports .md files (was missing, blocked Beit HaKerem upload). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 16:00:49 +00:00
Chaim	e24e24dac5	Maximize context and output per Anthropic best practices Per official Anthropic documentation (April 2026): Output tokens increased to match model capabilities: - block-yod (discussion): 8K → 32K (Opus supports 128K) - block-zayin (claims): 4K → 16K - block-vav (background): 4K → 16K - claims_extractor: 4K → 8K (fixes truncated JSON) - qa_validator: 4K → 8K Source documents sent in full (not truncated): - Was: 3000 chars per doc, 15K total - Now: full document text, no truncation - Reduces hallucinations: "extract word-for-word quotes first" Prompt structure follows long-context tips: - Source documents placed FIRST (top of prompt) - Instructions and query placed LAST - "Queries at the end improve quality by up to 30%" Extended thinking uses adaptive mode for Opus 4.6. Streaming enabled for all requests > 21K tokens. Unified JSON parsing via parse_llm_json() helper in config.py. Applied to: classifier, claims_extractor, brainstorm, qa_validator, learning_loop (5 files). Also: extractor.py now supports .md files. Sources: - https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking - https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips - https://docs.anthropic.com/en/docs/minimizing-hallucinations - https://docs.anthropic.com/en/docs/about-claude/models/overview Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 14:17:43 +00:00
Chaim	bed9d5c7e9	Improve block-zayin: synthesize claims by topic + fix markdown JSON parsing block_writer: Rewrote block-zayin prompt to require synthesis by topic instead of listing each claim separately. Now produces 3 organized sections (appellants 8, committee 6, permit applicants 3+) instead of 40 scattered paragraphs. Target: 800-1500 words. claims_extractor: Fix markdown code block stripping (same bug as qa_validator had). Enables parsing claims from Claude responses wrapped in ```json blocks. Tested on Hecht: block-zayin from 40 paragraphs/1049 words to 17 organized paragraphs/1039 words. Structure now matches Dafna's original (3 parties, grouped by topic). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 12:54:42 +00:00
Chaim	e438740ab4	Add renumber_all_blocks + fix sequential_numbering check for bold format block_writer: new renumber_all_blocks() function that renumbers all paragraphs across all blocks sequentially (1, 2, 3...). Handles both plain "N." and bold "**N.**" formats. Added missing 'import re'. qa_validator: sequential_numbering check now matches bold-formatted numbers (**N.**) in addition to plain (N.). Tested on Hecht: renumbered 115 paragraphs across 7 blocks, QA 6/6. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 12:30:31 +00:00
Chaim	7781987c3a	Fix precedents search + auto-update case parties block_writer: _build_precedents_context now searches both paragraph_embeddings (other decisions by Dafna) and case_law_embeddings (precedent case law). Previously only searched document_chunks which had no cross-case data. Now returns ~2400 chars from 3 other decisions. processor: Step 1.6 auto-updates case appellants/respondents from classifier results when they're empty. Prevents blank party fields. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 11:59:33 +00:00
Chaim	018b5936a1	Fix claims handling: filter block-zayin duplicates, improve QA matching block_writer: _build_claims_context now filters out block-zayin claims (from final decision) and uses only claims from original pleadings. Reduces noise from 78 to 48 real claims for Hecht case. qa_validator: claims_coverage check rewritten: - Filter block-zayin claims (same reason) - Keyword-based matching instead of 3-word phrase matching - 25% keyword overlap threshold (was: any 3-word match) - Allow up to 20% uncovered claims before failing - Check both block-yod and block-zayin for coverage Result: Hecht case QA goes from 4/6 to 6/6, 47/48 claims covered (98%). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 11:32:29 +00:00
Chaim	570f745823	Improve block-yod prompt: require minimum length, numbered claims, precedent citations - Add minimum word count guidance (2000-4000 words) - Number each claim in claims_context for explicit tracking - Require 3-5 case law citations minimum - Fix max_tokens > budget_tokens for extended thinking - Use streaming for opus+thinking requests (>10min timeout) Tested on Hecht case: block-yod improved from 1039 to 1927 words. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 11:28:12 +00:00
Chaim	d9e5ef0f46	Add full decision writing pipeline: classify, extract, brainstorm, write, QA, export New services (11 files): - classifier.py: auto doc-type classification + party identification (Claude Haiku) - claims_extractor.py: claim extraction from pleadings (Claude Sonnet + regex) - references_extractor.py: plan/case-law/legislation detection (regex) - brainstorm.py: direction generation with 2-3 options (Claude Sonnet) - block_writer.py: 12-block decision writer (template + Claude Sonnet/Opus) - docx_exporter.py: DOCX export with David font, RTL, headings - qa_validator.py: 6 QA checks with export blocking on critical failure - learning_loop.py: draft vs final comparison + lesson extraction - metrics.py: KPIs dashboard per case and global - audit.py: action audit log - cli.py: standalone CLI with 11 commands Updated pipeline: extract → classify → chunk → embed → store → extract_references New MCP tools: 29 total (was 16) New DB tables: audit_log, decisions CRUD, claims CRUD Config: Infisical support, external service allowlist Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-03 10:21:47 +00:00
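
Commit 018b5936a1 describes the rewritten claims_coverage check only in prose: keyword-based matching, a 25% keyword overlap threshold, and up to 20% uncovered claims allowed before failing. The following is a minimal sketch of that idea, not the code from qa_validator.py. The function names, the tokenizer, and the 4-character minimum word length are assumptions made for illustration; `decision_text` stands for the combined text of block-yod and block-zayin.

```python
import re


def keywords(text: str, min_len: int = 4) -> set[str]:
    """Lowercased word tokens of at least min_len characters (assumed tokenizer)."""
    return {w for w in re.findall(r"\w+", text.lower()) if len(w) >= min_len}


def is_covered(claim: str, decision_text: str, threshold: float = 0.25) -> bool:
    """A claim counts as covered when enough of its keywords appear in the decision."""
    claim_kw = keywords(claim)
    if not claim_kw:
        return True  # nothing to match against; treat as covered
    overlap = len(claim_kw & keywords(decision_text)) / len(claim_kw)
    return overlap >= threshold


def claims_coverage(claims: list[str], decision_text: str,
                    threshold: float = 0.25, max_uncovered: float = 0.20) -> dict:
    """Pass when the share of uncovered claims does not exceed max_uncovered."""
    uncovered = [c for c in claims if not is_covered(c, decision_text, threshold)]
    ratio = len(uncovered) / len(claims) if claims else 0.0
    return {
        "passed": ratio <= max_uncovered,
        "covered": len(claims) - len(uncovered),
        "total": len(claims),
        "uncovered": uncovered,
    }


if __name__ == "__main__":
    decision = "The committee considered the parking shortage and the building height deviation."
    claims = [
        "Parking shortage near the site",
        "Building height exceeds the plan",
        "Noise from the generator",
    ]
    print(claims_coverage(claims, decision))
```

The two thresholds are separate knobs: the first decides whether a single claim is matched, the second decides how many unmatched claims the check tolerates overall.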

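Commit e438740ab4 adds renumber_all_blocks(), which renumbers paragraphs sequentially across all blocks in both plain and bold formats. A minimal sketch of that behavior might look like the following. It assumes each block is a Markdown string, that paragraph numbers sit at the start of a line, and that the bold form is written `**N.**`; the function name carries a `_sketch` suffix because the real implementation is not shown in the log.

```python
import re

# Leading paragraph number at the start of a line, either bold ("**12.**")
# or plain ("12."), followed by whitespace.
_NUM_RE = re.compile(r"^(?:\*\*\d+\.\*\*|\d+\.)(?=\s)", re.MULTILINE)


def renumber_all_blocks_sketch(blocks: list[str]) -> list[str]:
    """Renumber paragraphs 1, 2, 3... across all blocks, keeping each number's format."""
    counter = 0

    def _next(match: re.Match) -> str:
        nonlocal counter
        counter += 1
        label = f"{counter}."
        # Preserve bold markers when the original number was bold.
        return f"**{label}**" if match.group(0).startswith("**") else label

    return [_NUM_RE.sub(_next, block) for block in blocks]


if __name__ == "__main__":
    blocks = ["1. First\n2. Second", "**1.** Third\n5. Fourth"]
    for block in renumber_all_blocks_sketch(blocks):
        print(block)
```

The lookahead for whitespace keeps values such as "3.5" at the start of a line from being treated as paragraph numbers. Numbered lists nested inside a paragraph would still be renumbered by this sketch.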
14 Commits
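
Commits bed9d5c7e9 and e24e24dac5 mention stripping Markdown code fences from model responses before JSON parsing, unified in a parse_llm_json() helper in config.py. The helper itself does not appear in the log, so the following is only a sketch of the general technique. The names `strip_code_fence` and `parse_llm_json_sketch` are hypothetical, and the sketch assumes the whole response is either bare JSON or a single fenced block.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example easy to embed

# An optional language tag after the opening fence, then the payload, then the closing fence.
_FENCE_RE = re.compile(
    r"^\s*" + FENCE + r"[A-Za-z0-9_-]*[ \t]*\n(.*?)\n?[ \t]*" + FENCE + r"\s*$",
    re.DOTALL,
)


def strip_code_fence(text: str) -> str:
    """Return the payload of a fenced block, or the stripped text if there is no fence."""
    match = _FENCE_RE.match(text)
    return match.group(1) if match else text.strip()


def parse_llm_json_sketch(text: str):
    """Parse JSON from a response that may be wrapped in a Markdown code fence."""
    return json.loads(strip_code_fence(text))


if __name__ == "__main__":
    wrapped = FENCE + 'json\n{"claims": [1, 2]}\n' + FENCE
    print(parse_llm_json_sketch(wrapped))
    print(parse_llm_json_sketch('  {"claims": []}  '))
```

Responses that mix prose with a fenced block would need a search for the first fence rather than a whole-string match; json.loads raises json.JSONDecodeError in that case, which a caller can catch.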