legal-ai

Author	SHA1	Message	Date
Chaim	36f21c815e	fix(precedents): distinguish silent extraction failure from "no halachot"	2026-05-04 05:13:10 +00:00

CI: Build & Deploy / build-and-deploy (push), successful in 3m5s.

Observed 2026-05-03: a `precedent_process_pending(halacha)` run that chained two precedents (1110/20 → 317/10) succeeded for the first (9 halachot, 129 chunks) and produced status=`no_halachot` for the second, despite it being a 47KB Supreme Court ruling with rich legal analysis. A manual single-precedent re-run on 317/10 immediately extracted 53 halachot.

Diagnosis: every chunk's claude_session call in the back-to-back run silently failed (likely an Anthropic rate-limit storm after the 1110/20 token burn), and the empty list was reported as "Claude looked and found nothing", the same code path as a real 0-halacha ruling. The user couldn't tell the difference.

Three changes:

1. Surface chunk-level failures (halacha_extractor.py)
   `_extract_chunk` now returns `(halachot, succeeded)` so the caller can count how many chunks crashed. `extract()` uses this to distinguish:
   - `no_halachot`: chunks ran cleanly, Claude found nothing
   - `extraction_failed`: ≥50% of chunks crashed AND zero halachot came back (rate limit, subprocess crash, etc.)
   When the result is `extraction_failed`, the DB status is left as 'processing' so the request stays in the queue for the caller to retry, instead of the old behaviour where it was marked 'completed' and silently dropped from the queue.

2. Inter-precedent cooldown (precedent_library.py)
   `process_pending_extractions` now sleeps 30s between precedents. Anthropic rate-limits per-org, and back-to-back large rulings (~4M tokens for 1110/20, immediately followed by another 2-3M) were the empirical trigger. 30s gives the per-minute counter time to drain.

3. Auto-retry on extraction_failed (precedent_library.py)
   When a precedent comes back as `extraction_failed`, retry once after a 60s cooldown before giving up. Rate-limit storms are transient: the manual re-run of 317/10 minutes later succeeded with 53 halachot and zero chunk failures, which suggests a single retry is sufficient. Only `extraction_failed` is retried, never `no_halachot` (Claude looked and there genuinely is no holding). The DB status now ends up as 'failed' only after retries are exhausted, matching the UI's terminal-failure chip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
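The outcome classification and single retry described in commit 36f21c815e can be sketched roughly as follows. This is a minimal illustration, not the code from halacha_extractor.py or precedent_library.py: the function names `classify_extraction` and `extract_with_retry`, the `"completed"` return value for a successful run, and the treatment of a run with fewer than 50% crashed chunks and no halachot as `no_halachot` are all assumptions.

```python
import time
from typing import Callable, List, Tuple

# One entry per chunk: (halachot found in the chunk, whether the call succeeded).
ChunkResult = Tuple[List[str], bool]


def classify_extraction(results: List[ChunkResult]) -> str:
    """Hypothetical helper: map per-chunk results to an extraction status."""
    halachot = [h for found, _ in results for h in found]
    failed = sum(1 for _, succeeded in results if not succeeded)
    if halachot:
        return "completed"  # assumed name for the success status
    # Zero halachot AND at least half of the chunks crashed: not a real "nothing found".
    if results and failed * 2 >= len(results):
        return "extraction_failed"
    return "no_halachot"


def extract_with_retry(
    run: Callable[[], str],
    sleep: Callable[[float], None] = time.sleep,
    retry_cooldown_s: float = 60.0,
) -> str:
    """Hypothetical helper: retry once after a cooldown, only on extraction_failed."""
    status = run()
    if status == "extraction_failed":
        sleep(retry_cooldown_s)
        status = run()
    # Terminal 'failed' only once the single retry is exhausted.
    return "failed" if status == "extraction_failed" else status
```

Passing `sleep` as a parameter keeps the cooldown testable without actually waiting 60 seconds.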
Chaim	5d836ca414	fix(precedents): Anthropic SDK fallback, format() crash, UI refresh	2026-05-03 10:52:31 +00:00

CI: Build & Deploy / build-and-deploy (push), successful in 1m31s.

Three fixes to the precedent library after the first end-to-end test on 403-17 surfaced runtime issues:

1. Anthropic SDK fallback in claude_session. The legal-ai Docker container does not ship the `claude` CLI, so every halacha and metadata extraction was failing with "Claude CLI not found." The module now tries the CLI first (zero-cost local path) and falls back to the Anthropic SDK with ANTHROPIC_API_KEY when the binary is absent. Default model is claude-sonnet-4-6, overridable via the CLAUDE_SDK_MODEL env variable. The system message gets cache_control: ephemeral so multi-chunk runs reuse the cached instruction prefix at ~10% read cost. Adds `anthropic` to pyproject deps.

2. precedent_metadata_extractor crashed with KeyError because the JSON example inside the prompt template contained literal { } characters that str.format() interpreted as placeholders. Switched to f-string concatenation; the prompt template no longer needs format() at all.

3. The library list query stayed stale after upload because the upload mutation's onSuccess fires when the POST returns task_id, not when SSE reports completion. Added a second invalidate inside the SSE watcher in PrecedentUploadSheet so the new row appears with up-to-date chunk and halachot counts the moment processing finishes.

Halacha and metadata extractors now route the long static prompt through the new `system=` parameter so the SDK path actually caches it; the CLI path concatenates and behaves as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
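The format() crash in fix 2 of commit 5d836ca414 is easy to reproduce in isolation. The sketch below is illustrative only: the prompt text, the `JSON_EXAMPLE` constant, and `build_prompt` are hypothetical stand-ins, not the contents of precedent_metadata_extractor.

```python
# Hypothetical prompt fragment containing a literal JSON example.
JSON_EXAMPLE = 'Return JSON like {"case_name": "...", "summary": "..."}'

# Failing approach: str.format() treats the JSON braces as a replacement field.
template = JSON_EXAMPLE + "\n\nRuling text:\n{text}"
try:
    template.format(text="(ruling text)")
    format_error = None
except KeyError as exc:
    # The field name is parsed as '"case_name"', which is not a supplied keyword.
    format_error = exc


def build_prompt(text: str) -> str:
    """Concatenate instead of format(), so literal braces need no escaping."""
    return f"{JSON_EXAMPLE}\n\nRuling text:\n{text}"


prompt = build_prompt("(ruling text)")
```

An alternative would be to keep `str.format()` and double the literal braces (`{{` and `}}`); concatenation avoids having to escape the JSON example at all.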
Chaim	73a79ea7e8	feat(precedents): metadata auto-fill, edit sheet, persuasive extraction	2026-05-03 10:19:35 +00:00

CI: Build & Deploy / build-and-deploy (push), successful in 1m28s.

Three improvements to the precedent library based on usage feedback:

1. Auto-fill metadata at upload time. New service precedent_metadata_extractor reads the ruling's full_text and suggests case_name (short), summary, headnote, key_quote, subject_tags, appeal_subtype. The merge policy fills only empty fields, preserving everything the chair typed in the upload form. Wired into the ingest pipeline; also exposed as a re-run endpoint POST /api/precedent-library/{id}/extract-metadata for existing records.

2. Edit sheet in the UI. Pencil icon on each library row opens a pre-populated form covering every field. A Sparkles button on the sheet runs the metadata extractor on demand and refreshes the form. The case_number is read-only because halachot are FK'd to it; renaming requires delete + re-upload.

3. Halacha extractor branches on is_binding. Sources marked binding (Supreme/Administrative) keep the strict halacha prompt. Non-binding sources (other appeals committees, district courts on planning matters) get a different prompt that extracts applications, interpretive principles, and persuasive conclusions, labeled with new rule_types 'application' and 'persuasive'. The fallback also widens chunk selection: if the chunker labeled nothing as legal_analysis/ruling/conclusion, we now run on all chunks rather than returning zero halachot for a usable ruling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
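The "fill only empty fields" merge policy from commit 73a79ea7e8 might look something like the sketch below. The helper names `is_empty` and `merge_metadata` are hypothetical, as is the definition of "empty" used here (None, a blank string, or an empty list); the commit does not say how the real service decides.

```python
from typing import Any, Dict


def is_empty(value: Any) -> bool:
    """Assumed notion of 'empty': None, blank string, or empty list."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, list):
        return len(value) == 0
    return False


def merge_metadata(existing: Dict[str, Any], suggested: Dict[str, Any]) -> Dict[str, Any]:
    """Copy a suggested value only where the existing record has nothing."""
    merged = dict(existing)
    for key, value in suggested.items():
        if is_empty(merged.get(key)) and not is_empty(value):
            merged[key] = value
    return merged


# Usage: what the chair typed survives; blanks are filled from the suggestion.
typed = {"case_name": "Typed by chair", "summary": "", "subject_tags": []}
suggested = {"case_name": "Suggested name", "summary": "Suggested summary",
             "subject_tags": ["tag-a"], "headnote": "Suggested headnote"}
result = merge_metadata(typed, suggested)
```

Returning a new dict rather than mutating `existing` keeps the form values available for comparison, for example when the edit sheet refreshes after an on-demand extraction.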
Chaim	7ee90dce31	feat: external precedent library with auto halacha extraction	2026-05-03 08:38:18 +00:00

CI: Build & Deploy / build-and-deploy (push), successful in 1m27s.

Adds a third corpus of legal authority distinct from style_corpus (Daphna's prior decisions for voice) and case_precedents (chair-attached quotes per case). The new corpus holds chair-uploaded court rulings and other appeals committee decisions, with binding rules (הלכות) extracted automatically and queued for chair approval.

Pipeline (web/app.py + services/precedent_library.py): file → extract → chunk → Voyage embed → halacha_extractor → store + publish progress over the existing Redis SSE channel.

Schema V7 (services/db.py): extends case_law with source_kind + extraction status fields under a CHECK constraint pinning practice_area to the three appeals committee domains (rishuy_uvniya, betterment_levy, compensation_197). New precedent_chunks (vector(1024)) and halachot tables (vector(1024) over rule_statement, IVFFlat indexes, gin on practice_areas/subject_tags). Halachot start as pending_review; only approved/published rows are visible to search_precedent_library.

Agents: legal-writer, legal-researcher, legal-analyst, legal-ceo, legal-qa get search_precedent_library. legal-writer prompt explains the three-corpus distinction and CREAC use; legal-qa now verifies that every cited halacha resolves to an approved row in the corpus.

UI: /precedents page with four tabs: library / semantic search / pending review (J/K nav, A/R/E shortcuts, badge count) / stats. Reuses the existing upload-sheet progress + SSE pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

4 Commits