legal-ai

Author	SHA1	Message	Date
Chaim	a9cd8aeb12	fix: prevent write_interim_draft context overflow (465K → ≤300K chars) Two bugs caused all 5 interim blocks to fail with "Claude CLI failed (exit 1): unknown error": 1. source_context was embedded BOTH inside the prompt template (via {source_context}) AND prepended again in write_block — doubling every block's context size (232K chars × 2 = 465K chars). 2. _build_source_context loaded all 9 case documents for every block regardless of relevance. Fixes: - Remove the duplicate source_context prepend in write_block; the template already contains it via {source_context} - Add per-block document filtering (_BLOCK_DOC_TYPES): block-he/zayin → empty, block-chet → protocol only, block-tet → appraisals only - Add 400K char guard before calling claude -p with a descriptive error (vs opaque "exit 1: unknown error") - Add prompt-size warning and size info in claude_session error messages Result: block-he 0 chars, block-zayin 0 chars, block-vav ~172K, block-chet ~45K, block-tet ~300K (all under 400K limit) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-10 10:49:47 +00:00
Chaim	10a63fb9e0	fix(precedents): separate court rulings from committee decisions correctly All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m37s Details - DB: add 'all_committees' virtual source_kind covering internal_committee + external_upload appeals_committee rows in one query - DB: stats now count all case_law rows (not just external_upload), fixing the precedents_total that excluded 44 internal-committee records - UI: courts table filters to source_type=court_ruling only; committees table uses the new all_committees query Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-10 09:59:30 +00:00
Chaim	3e14cd6798	feat: link related precedents across court instances (SCHEMA_V11) Add ability to mark case_law records as related (e.g. same appeal through ועדת ערר → מנהלי → עליון): - DB: case_law_relations join table (bidirectional, V11 migration) - DB CRUD: add/remove/get_case_law_relations - Service: get_precedent() now returns related_cases[] - MCP: precedent_link_cases + precedent_unlink_cases tools - REST: POST/DELETE /api/precedent-library/{id}/relations - UI: RelatedCasesSection on detail page with search dialog and unlink Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-10 07:52:29 +00:00
Chaim	fff2d1c859	fix(precedent-library): per-record extraction must drain the queue too All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m36s Details reextract_metadata / reextract_halachot extract & apply but never cleared metadata_extraction_requested_at / halacha_extraction_requested_at — only the bulk worker (process_pending_extractions) did. Result: clicking "חלץ מטא-דאטה" on the edit sheet (or calling precedent_extract_metadata directly) left the row stuck in the queue forever, with the UI badge showing "ממתין לחילוץ" even after extraction succeeded. Mirror the worker's behaviour: on success ('completed' / 'no_changes' / 'no_halachot'), call db.clear_extraction_request to drain the queue. Coolify deploy required for the FastAPI container; local MCP server needs a process restart for the change to take effect (long-running).	2026-05-07 07:08:31 +00:00
Chaim	36b78ea404	fix(precedent-library): queue listing must include internal_committee too All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m36s Details Earlier commit `afcc481` opened request_metadata_extraction and request_halacha_extraction to all source kinds — but list_pending_extraction_requests still hard-filtered to external_upload. Result: stamping a queue request on an internal_committee row succeeded silently, but the worker (and the queue badge) never saw it. Even with the auto-wakeup added in `c7132ba` the CEO would wake, find 0 pending items, and exit. Drop the legacy filter so the queue listing matches the writer side. Coolify deploy required for the FastAPI container to pick this up.	2026-05-07 06:51:19 +00:00
Chaim	afcc4818a4	fix(precedent-library): allow re-extraction for internal_committee rows All checks were successful Build & Deploy / build-and-deploy (push) Successful in 3m13s Details The "חלץ מטא-דאטה" / "חלץ הלכות" buttons in the UI were returning 404 for any precedent with `source_kind != 'external_upload'`. The original restriction was meant to keep LLM extraction off internal-committee imports (their metadata supposedly came from the case file system), but the same precedent rows can still need re-extraction when ingest produces broken data — e.g. the corrupted `subject_tags` value `['[','"','ה','י',...]` that motivated this change (an early ingest stored a JSON literal into a TEXT[] column, which Postgres split into single chars). Two changes here: 1. db.request_metadata_extraction / request_halacha_extraction: drop the `AND source_kind='external_upload'` filter. The extractor already preserves user values (only fills empty fields), so this is safe. 2. precedent_metadata_extractor.extract_and_apply: detect the character-by-character corruption above and treat it as empty so the freshly-extracted tags actually replace the broken ones. Heuristic: 3+ elements where every element is at most 2 chars (legitimate tags are multi-character Hebrew words). Coolify deploy required for the FastAPI container to pick this up.	2026-05-06 19:44:13 +00:00
Chaim	bd4b0ca766	feat(mcp): case_get_final_text — fall back to PDF/DOC/RTF/TXT/MD All checks were successful Build & Deploy / build-and-deploy (push) Successful in 3m58s Details The Hermes Knowledge Curator's hermes-curator.md says it must be able to read both DOCX and PDF final decisions. The original implementation hardcoded the .docx extension only. Extend to try .docx → .pdf → .doc → .rtf → .txt → .md, returning the first match. extractor.extract_text already supports all six formats, so no extractor changes needed. If none found, the not_found response now includes the tried_extensions list so the caller knows what was attempted. Verified on case 1130-25 (.docx still picked first) and tested via `curator-cmp mcp test legal-ai`.	2026-05-05 19:18:57 +00:00
Chaim	7c9582ed04	feat(mcp): case_get_final_text — let agents read the signed final DOCX All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m36s Details The Knowledge Curator (Hermes) couldn't read סופי-{case}.docx because document_get_text only works on rows in the documents table — the final file is just a copy in the case's exports/ directory, not a tracked document. CMP-71 hit this and produced an unproductive interaction asking the user how to fix the access issue. Add a new MCP tool that: - Locates exports/סופי-{case_number}.docx via config.find_case_dir - Extracts text using the existing extractor service (python-docx based) - Returns JSON with status + text + page_count + truncation info - Optional max_chars cap for large decisions Smoke test on case 1130-25: 400-char preview returns proper Hebrew text beginning with "לפנינו ערר על החלטת הוועדה המקומית...". The local MCP server reloads on next Hermes spawn (stdio mode), so the tool is immediately available — no Coolify deploy needed. Curator's promptTemplate (DB-stored) updated to use the new tool as the primary path for reading the final. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-05 15:57:10 +00:00
Chaim	69d4827f33	feat(migration): enrich internal committee entries — fix case_number + metadata + halachot All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m32s Details - precedent_metadata_extractor: add case_number_clean extraction field - apply_to_record: overwrite_case_number param for one-time migration - internal_decisions: enrich_migrated_entries() — runs metadata then queues halachot - server: expose as internal_decision_enrich MCP tool Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-04 18:59:20 +00:00
Chaim	c0f67ab841	feat(precedents): split library into court rulings + appeals committee tables All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m34s Details - /api/precedent-library now accepts source_kind param (default external_upload) - list_external_case_law returns chair_name/district fields - LibraryListPanel renders two separate tables with appropriate columns - internal_decisions migration: added queue_halachot param to defer extraction - Fixed practice_area mapping from style_corpus (appeals_committee → proper enum) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-04 18:49:32 +00:00
Chaim	92a2763b86	feat: add internal committee decisions corpus (source_kind='internal_committee') All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m31s Details Three-layer separation: style learning (style_corpus), appeals-committee decisions (internal_committee), and court rulings (external_upload). - SCHEMA_V10: chair_name + district columns on case_law and cases, partial indexes - create_internal_committee_decision() DB upsert function - search_precedent_library_semantic() now accepts source_kind/district/chair_name params - search_precedent_library_hybrid() passes through new params - services/internal_decisions.py: ingest_internal_decision, migrate_from_style_corpus, migrate_from_external_corpus (identifies rows via source_type='appeals_committee') - search_internal_decisions() MCP tool (server.py + tools/search.py) - internal_decision_migrate() MCP admin tool - Web endpoints: POST /api/internal-decisions/upload, POST /api/internal-decisions/migrate, GET /api/internal-decisions - ingest_final_version auto-ingests finalized decisions into internal corpus - SKILL.md updated: agents now search internal + external in parallel, present separately Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 18:33:39 +00:00
Chaim	eab0ca906c	feat(interim): include block-he opening in pre-ruling interim drafts block-he (פתיחה ניטרלית) was previously emitted only in final decisions. For interim drafts shown to the chair before ruling, including a neutral opening helps the chair confirm framing before approving downstream blocks. Skipped if empty, so legacy cases without block-he are unaffected. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 17:25:54 +00:00
Chaim	69bdf7b30a	fix(settings): harden PATCH/redeploy per code review - Add infisicalsdk dependency - Narrow update→create fallback to NotFound errors only (no silent swallow) - Truncate Coolify error response text to 200 chars - Add 60s cooldown to redeploy endpoint - Move httpx to top-level import	2026-05-04 06:33:01 +00:00
Chaim	f6bb46dc4a	fix(retrieval): restore _base(limit=) contract in hybrid precedent search All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m23s Details `rerank.maybe_rerank` calls `base_search(limit=…, base_kwargs)` on both the rerank-on and rerank-off paths. Commit `242f668` moved the closure into hybrid_search.py and renamed its parameter to `limit_inner`, so every call to `/api/precedent-library/search` raised TypeError 500 regardless of the VOYAGE_RERANK_ENABLED flag. Sibling `search_documents_hybrid` was unaffected because it uses `lambda kw:` which absorbs the kwarg. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 05:19:53 +00:00
Chaim	36f21c815e	fix(precedents): distinguish silent extraction failure from "no halachot" All checks were successful Build & Deploy / build-and-deploy (push) Successful in 3m5s Details Observed 2026-05-03: a `precedent_process_pending(halacha)` run that chained two precedents (1110/20 → 317/10) succeeded for the first (9 halachot, 129 chunks) and produced status=`no_halachot` for the second despite it being a 47KB Supreme Court ruling with rich legal analysis. A manual single-precedent re-run on 317/10 immediately extracted 53 halachot. Diagnosis: every chunk's claude_session call in the back-to-back run silently failed (likely Anthropic rate-limit storm after the 1110/20 token burn), and the empty list was reported as "Claude looked and found nothing" — same code path as a real 0-halacha ruling. The user couldn't tell the difference. Three changes: 1. Surface chunk-level failures (halacha_extractor.py) `_extract_chunk` now returns `(halachot, succeeded)` so the caller can count how many chunks crashed. `extract()` uses this to distinguish: - `no_halachot` — chunks ran cleanly, Claude found nothing - `extraction_failed` — ≥50% of chunks crashed AND zero halachot came back (rate limit, subprocess crash, etc.) When `extraction_failed`, DB status is left as 'processing' so the request stays in the queue for the caller to retry — instead of the old behaviour where it got marked 'completed' and silently dropped from the queue. 2. Inter-precedent cooldown (precedent_library.py) `process_pending_extractions` now sleeps 30s between precedents. Anthropic rate-limits per-org, and back-to-back large rulings (~4M tokens for 1110/20, immediately followed by another 2-3M) was the empirical trigger. 30s gives the per-minute counter time to drain. 3. Auto-retry on extraction_failed (precedent_library.py) When a precedent comes back as `extraction_failed`, retry once after a 60s cooldown before giving up. Rate-limit storms are transient — the manual re-run of 317/10 minutes later succeeded with 53 halachot and zero chunk failures, confirming a single retry is sufficient. Only retries `extraction_failed`; never `no_halachot` (Claude looked and there genuinely is no holding). The DB status now ends up as 'failed' only after retries are exhausted, matching the UI's terminal-failure chip. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-04 05:13:10 +00:00
Chaim	d4496b96f1	fix(mcp): eliminate "No such tool available" race at agent wakeup All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m26s Details When Paperclip wakes the CEO and the model issues an mcp__legal-ai__* call within ~10s of session init, Claude Code sometimes returns "No such tool available" because the legal-ai MCP server hasn't finished bringing up its tool catalog yet. Observed twice today on CMPA precedent-extraction wakeups (sessions 9989fbaf and a9c61801); the agent fell back to bash + .venv/bin/python and finished the work, but the race needed fixing on the server side. Three changes that close the window: 1. Lazy schema init (services/db.py + server.py) `init_schema()` was awaited inside the FastMCP lifespan, blocking the `initialize`/`tools/list` handshake until ~10 CREATE TABLE IF NOT EXISTS statements ran. Under contention (two CEOs waking at once for different companies) this stretched. Now the lifespan returns immediately and `get_pool()` runs the schema migrations exactly once on first DB access, guarded by an asyncio.Lock. tools/list is answered in milliseconds regardless of DB state. 2. Lazy heavy imports - services/embeddings.py: voyageai (~450ms) loaded only inside _get_client() - services/extractor.py: google.cloud.vision (~550ms) loaded only inside _get_vision_client() and _ocr_with_google_vision() These two were being imported at module top from legal_mcp.tools.documents -> services.processor -> services.{extractor,embeddings}, so the FastMCP server couldn't even start responding until both finished. Cold start dropped from 2.7s to 1.17s end-to-end (init + tools/list response). 3. Agent-side warmup + retry guidance (.claude/agents/legal-ceo.md) Even with a fast server, the model can still race on the very first call. The precedent-extraction section now tells the CEO to call workflow_status as a warmup probe and to retry after a short sleep if it sees "No such tool available", before falling back to the python bypass. Also expanded the precedent-tool whitelists on the sub-agents that delegate halacha/library work (commits `4a9a6b7` + `7ee90dc` added the tools to the MCP server but only the CEO got them in its allowed list). Added to: legal-researcher (full extraction set), legal-analyst (library_get/list + halacha review), legal-writer (library lookups + halacha_review), legal-qa (library_get + halacha_review), and the two that the CEO was already missing (halacha_review, halachot_pending). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 20:23:14 +00:00
Chaim	81ccf3a888	feat(retrieval): track page_number on text chunks for multimodal hybrid boost All checks were successful Build & Deploy / build-and-deploy (push) Successful in 6m33s Details The legacy chunker did not track which PDF page each chunk came from. Stored chunks had page_number=NULL, which blocked the multimodal hybrid retriever's text+image boost — it joins (chunk, image) on (document_id, page_number) and the join could never fire. This change: - extractor.extract_text now returns (text, page_count, page_offsets); page_offsets[i] is the start char offset of page (i+1) in the joined text. None for non-PDFs. - chunker.chunk_document accepts an optional page_offsets and tags each chunk with the page that contains its first character (uses the existing chunker logic; pages assigned post-hoc by content search to keep the diff minimal). - processor.process_document and precedent_library.ingest_precedent forward page_offsets through the chunker. New uploads now carry accurate page_number on every chunk. - Other extract_text callers (tools/documents, tools/workflow, web/app.py) updated to unpack the third element (ignored). - scripts/backfill_chunk_pages.py: per-case retrofit. Re-extracts each PDF (re-OCRs via Google Vision if needed, ~$0.0015/page), computes page_offsets, and updates page_number on every chunk by content search. Idempotent; --force re-runs on already-tagged docs. Forward-only would leave the 419 image embeddings backfilled on cases 8174-24 + 8137-24 unable to boost their corresponding text chunks. The retrofit script closes that gap (cost ~$0.60). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 19:49:41 +00:00
Chaim	c31fe0866b	fix(retrieval): switch hybrid merge to Reciprocal Rank Fusion (RRF) Some checks are pending Build & Deploy / build-and-deploy (push) Waiting to run Details Cosine scores in voyage-3 (~0.4-0.5) and voyage-multimodal-3 (~0.2-0.25) live on different scales. The previous weighted-sum merge let text always dominate — verified empirically: 0 image-only hits across 7 queries on case 8174-24, image side contributed nothing. RRF combines by rank in each list rather than raw score, robust to scale differences. Per-item score: rrf_score = text_weight / (k + text_rank) + image_weight / (k + image_rank) A row that appears in both lists (joined on (id_field, page_number)) gets both terms — surfaced as match_type='text+image'. After fix on 8174-24 (146 image rows): 2 image-only hits land in top-5 across all 7 test queries, surfacing actual table/diagram/ signature pages (p12, p13 of שומת המשיבה for 'טבלת השוואת ערכי שומה', p25 of שומת השגה for 'תרשים גוש וחלקה', etc). On 8137-24 (273 image rows): 'חישוב היוון של דמי החכירה' goes from 0 baseline results → 5 hybrid results (3 text + 2 image), opening recall on scanned content the OCR layer misses. Default MULTIMODAL_TEXT_WEIGHT 0.65 → 0.5 (vanilla RRF) since the prior 0.65 was tuned for raw cosine scales that no longer apply. New env knob MULTIMODAL_RRF_K (default 60, standard literature). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 19:39:31 +00:00
Chaim	242f668319	feat(retrieval): add voyage-multimodal-3 page-image embeddings (feature flag) All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m50s Details Stage C: per-page image embeddings via voyage-multimodal-3 + hybrid text+image search. Off by default; enable with MULTIMODAL_ENABLED=true. - Schema V9: document_image_embeddings + precedent_image_embeddings (vector(1024), page_number, image_thumbnail_path) - extractor.render_pages_for_multimodal renders PDF pages at MULTIMODAL_DPI (144) for embedding + JPEG thumbnails at MULTIMODAL_THUMB_DPI (96) for UI preview, in one pass - embeddings.embed_images calls voyage-multimodal-3 in 50-page batches - services/hybrid_search.py orchestrator: rerank applied to text side first (rerank-2 is text-only); image side cosine; weighted merge with text_weight 0.65 (env-tunable); image-only pages surface as match_type='image' so dense scanned content still appears - processor.process_document and precedent_library.ingest_precedent gated by flag — non-fatal on multimodal failure - scripts/multimodal_backfill.py — idempotent per-case CLI to embed existing documents without re-extracting text Validated locally on a 5-page response brief: render 0.31s, embed 8.32s, hybrid merge surfaces image rows correctly. Production rollout starts with flag=false (no behavior change), then per-case A/B. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 19:24:52 +00:00
Chaim	36e464f668	fix(halachot): exclude embedding from update_halacha RETURNING All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m26s Details PATCH /api/halachot/{id} was returning 500 because the row included ``embedding`` as a numpy.ndarray of np.float32, which FastAPI's jsonable_encoder cannot serialize (vars() and dict() both fail on it). The bug had been latent — it triggered for the first time today after the auto-approve batch left only low-confidence halachot for the chair to review manually, and her first PATCH hit the unserializable response. Replace ``RETURNING *`` with an explicit column list (everything except ``embedding``). Callers that need the embedding can re-fetch via ``get_halacha`` — but no current caller does. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 19:04:46 +00:00
Chaim	4d1924c7e6	feat(halachot): auto-approve high-confidence halachot at insert All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m29s Details Halachot extracted by halacha_extractor with confidence >= 0.80 are now inserted with review_status='approved' instead of 'pending_review' — they appear in search_precedent_library immediately. Halachot below the threshold still require manual chair approval. Threshold tunable via env (HALACHA_AUTO_APPROVE_THRESHOLD), defaults to 0.80. Rationale: 89% of historical extractions (356/400) score 0.80+, spot-checks confirmed quality, and the manual review backlog was the single biggest reason rerank-2 was returning passages-only on ההבחנה-style queries. After this change + the one-time backfill UPDATE, search now returns 9/10 halachot for "ההבחנה בין השבחה לפיצויים" instead of 0 — and the top-3 are exact-match rules, not adjacent passages. Reviewer field records "auto-approved (confidence ≥ X.XX)" with the threshold value at insert time, for traceability. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 19:01:03 +00:00
Chaim	26c3fddf41	feat(retrieval): add voyage rerank-2 cross-encoder stage (feature flag) All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m29s Details Stage B of voyage-upgrades-plan rewritten: instead of context-3 (which 4 POCs showed inconsistent improvement), add a cross-encoder rerank layer on top of voyage-3. Default off (VOYAGE_RERANK_ENABLED=false). POC validation (785-doc corpus, 12 queries, claude-haiku-4-5 judge): - mean@3 +4.5% (4.306 → 4.500) - practical-category queries +11.6% (3.78 → 4.22) - latency +702ms per query - no schema change, no re-embed, no double storage Plumbing: - config: VOYAGE_RERANK_ENABLED / _MODEL / _FETCH_K env vars - embeddings.voyage_rerank() wraps voyageai client.rerank - services/rerank.py: maybe_rerank() helper — fetches FETCH_K candidates via the bi-encoder then reranks to top-K. Fail-open if Voyage rerank is unavailable. - tools/search.py: search_decisions, search_case_documents, find_similar_cases all wrapped - services/precedent_library.search_library wrapped Smoke-tested locally with flag on/off — produces expected behaviour and latency profile. Ready for production rollout via Coolify env flip after deploy. POCs (kept under scripts/ for reference): - voyage_context3_poc{_long}.py — context-3 evaluation (rejected) - voyage_multimodal_poc.py — multimodal-3 (stage C, deferred) - voyage_rerank_judge_poc.py — single-case rerank benchmark - voyage_rerank_corpus_poc.py — full-corpus rerank validation Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 18:43:41 +00:00
Chaim	72c4593e74	fix(precedents): auto-clear _requested_at on terminal status All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m27s Details set_case_law_extraction_status and set_case_law_halacha_status now NULL the corresponding _requested_at timestamp when status transitions to "completed" or "failed". Without this, completed rows kept lingering in the local-MCP work queue (which scans by `WHERE *_requested_at IS NOT NULL`) and the UI's isPrecedentActive check, leaving them undeletable until a manual SQL cleanup. The pre-existing process_pending_extractions path already called clear_extraction_request, but other paths (re-extraction, status set during upload) didn't — so the cleanup belongs at the status setter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 16:39:24 +00:00
Chaim	1f17419ee9	ui(precedents): live status pill with shimmer + auto-queue + auto-refresh All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m44s Details The chair pointed out three UX gaps after uploading a new precedent: 1. The status said "מחלץ הלכות" but nothing was actually running — the field only meant "halacha_extraction_status != completed", which includes the post-upload "pending" state where the local MCP worker hasn't been told to drain anything yet. Misleading. 2. The page didn't refresh on its own. The chair had to F5 to see new counts after extraction completed. 3. Clicking the trash icon mid-extraction would cascade-delete the row while the extractor was still using it (FK errors, partial writes). Fixes: - ingest_precedent now auto-queues both metadata and halacha extraction on upload by stamping the request timestamps. The chair (or me) drains the queue with one `precedent_process_pending` call from chat — no need to click any button before that. - StatusPill is now five-state with proper labels: "נכשל" (extraction_status=failed) — red "מעבד טקסט" — shimmer (extraction_status=processing) "בתור" — neutral (chunks queued, not yet running) "מחלץ הלכות" — shimmer (halacha_extraction_status=processing) "ממתין לחילוץ" — neutral (queued for local MCP worker) "לא חולץ" — neutral (pending without queue stamp — shouldn't happen) "X/Y מאושרות" — gold (done, with halachot count) The shimmer is a CSS-only sliding-stripe animation defined in globals. - usePrecedents has a conditional refetchInterval — polls every 5s while any row is mid-extraction or queued, then stops once everything settles to completed/failed. New helper isPrecedentActive() centralises the "is this row mid-something" check so the UI and the destructive-action guard agree. - Trash button is disabled (opacity 30%, tooltip explains) while the row is active. Pencil/edit stays enabled — editing metadata fields during extraction is safe (last write wins, low-stakes race). Schema: list_external_case_law now exposes the two *_requested_at timestamps so the UI can distinguish "queued" from "never asked". Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 12:47:31 +00:00
Chaim	4a9a6b7970	feat(precedents): UI button queues extraction for local MCP worker All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m27s Details The chair wanted a one-click "extract metadata" button on the edit sheet. The constraint stays the same — claude_session needs the local CLI which the container doesn't have, so the button can't run the extractor itself. Compromise: button stamps a queue marker; the local MCP server drains the queue on demand. DB (V8): two nullable timestamps on case_law, metadata_extraction_requested_at and halacha_extraction_requested_at, with partial indexes for cheap "find pending" scans. API: POST /api/precedent-library/{id}/request-metadata → stamp the row POST /api/precedent-library/{id}/request-halachot → same for halacha GET /api/precedent-library/queue/pending?kind=... → read-only view UI: Sparkles button in the edit sheet header. Click → toast tells the chair what to run from Claude Code. The button never triggers the extractor directly from the container. MCP tool: precedent_process_pending(kind, limit) — runs from Claude Code with the local CLI, picks up everything stamped, calls the extractor for each, clears the timestamp on success. Failures keep the timestamp so the next invocation retries them. Architectural rule (claude_session local-only) is preserved end-to-end and called out in the new endpoint comment + tool docstring. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 12:32:25 +00:00
Chaim	8e1384b897	fix(precedents): wrap citation column + extractor fills source_type All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m27s Details Two follow-ups after running the metadata extractor on 403-17: 1. Library table: shadcn TableCell defaults to whitespace-nowrap and the table wrapper has overflow-x-auto, so the long citation forced a horizontal scrollbar inside the row. Override on the citation cell only — whitespace-normal + break-words + min/max-w to keep the column readable. Same for the case-name cell. Row aligns to top so wrapping doesn't push neighbours up. 2. Extractor now also fills source_type (court_ruling / appeals_committee). The previous round added decision_date_iso, precedent_level, and court but left source_type empty. Same closed-enum + merge-only-if-empty policy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 12:28:35 +00:00
Chaim	6420fe4b0b	feat(precedents): metadata extractor also fills date, level, court All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m26s Details The first end-to-end run on 403-17 surfaced three fields the auto-fill left blank because the chair didn't set them in the upload form: date, precedent_level, and court. All three are right there in the ruling's header text — there's no reason to require manual entry. Prompt now asks for: - decision_date_iso (YYYY-MM-DD parsed from "ניתנה היום, … 5 בספטמבר 2022" style signatures) - precedent_level (closed enum: עליון/מנהלי/ועדת_ערר_ארצית/ועדת_ערר_מחוזית) - court (the full court name from the title block) Validation is unchanged: precedent_level only accepts the four enum values; decision_date_iso is parsed into a Python date object before being handed to update_case_law (asyncpg doesn't coerce strings to DATE columns); court is stored verbatim. Merge policy is unchanged — only fills empty fields. Anything the chair typed in the upload form survives. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 12:16:03 +00:00
Chaim	2cfdf35191	refactor(precedents): keep all LLM calls on the local-MCP path All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m28s Details Architectural correction: every claude_session caller in this project runs through the local MCP server (~/.claude.json points at /home/chaim/legal-ai/mcp-server/.venv/bin/python). The Coolify container has no `claude` CLI and no claude.ai session, so any LLM call originating from web/ FastAPI fails with "Claude CLI not found" — which is exactly what we hit on 403-17. The earlier Anthropic SDK fallback would have made it work, but at direct API cost. The chair's preference is to stay on the claude.ai session for everything. So: - claude_session.py: removed the SDK fallback, restored CLI-only. The error message now points the next person at the architectural rule in the module docstring instead of papering over it. - precedent_library.py:ingest_precedent (called from FastAPI on upload) now does only the non-LLM half: extract → chunk → embed → store. Sets halacha_extraction_status='pending' for the chair to act on. - reextract_halachot / reextract_metadata kept, but lazy-import their extractors so the FastAPI path can't accidentally pull them in. They are reachable only via the MCP tools precedent_extract_halachot / precedent_extract_metadata, which run locally with CLI. - Removed POST /api/precedent-library/{id}/extract-halachot and /extract-metadata — they were dead ends from the container. - Dropped the `anthropic` Python dep that the SDK fallback required. - UI: removed the "refresh halachot" and "sparkles metadata" buttons that called those endpoints. Edit sheet now points the chair at the MCP tool names instead. Halacha and metadata extraction for an uploaded precedent now happen when the chair (via Claude Code) runs: mcp__legal-ai__precedent_extract_metadata <case_law_id> mcp__legal-ai__precedent_extract_halachot <case_law_id> Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 10:52:31 +00:00
Chaim	5d836ca414	fix(precedents): Anthropic SDK fallback, format() crash, UI refresh All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m31s Details Three fixes to the precedent library after the first end-to-end test on 403-17 surfaced runtime issues: 1. Anthropic SDK fallback in claude_session. The legal-ai Docker container does not ship the `claude` CLI, so every halacha and metadata extraction was failing with "Claude CLI not found." Module now tries the CLI first (zero-cost local path) and falls back to the Anthropic SDK with ANTHROPIC_API_KEY when the binary is absent. Default model is claude-sonnet-4-6, overridable via CLAUDE_SDK_MODEL env. The system message gets cache_control: ephemeral so multi-chunk runs reuse the cached instruction prefix at ~10% read cost. Adds `anthropic` to pyproject deps. 2. precedent_metadata_extractor crashed with KeyError because the JSON example inside the prompt template contained literal { } characters that str.format() interpreted as placeholders. Switched to f-string concatenation; the prompt template no longer needs format() at all. 3. Library list query stays stale after upload because the upload mutation's onSuccess fires when the POST returns task_id, not when SSE reports completion. Added a second invalidate inside the SSE watcher in PrecedentUploadSheet so the new row appears with up-to-date chunk and halachot counts the moment processing finishes. Halacha and metadata extractors now route the long static prompt through the new `system=` parameter so the SDK path actually caches it; the CLI path concatenates and behaves as before. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 10:52:31 +00:00
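The `format()` crash in fix 2 is easy to reproduce in isolation. The sketch below uses a made-up template and hypothetical function names (`build_prompt_broken`, `build_prompt_fixed`); it is not the project's actual prompt, only a minimal instance of the same failure and of the f-string fix.

```python
import json

# A template with a literal JSON example and one real placeholder.
TEMPLATE = 'Return JSON like {"case_name": "..."}\n\nText:\n{full_text}'


def build_prompt_broken(full_text: str) -> str:
    # str.format() treats {"case_name": "..."} as a replacement field
    # and raises KeyError because no such keyword argument exists.
    return TEMPLATE.format(full_text=full_text)


def build_prompt_fixed(full_text: str) -> str:
    # Build the example separately and interpolate it with an f-string;
    # braces inside interpolated values are never re-parsed.
    example = json.dumps({"case_name": "..."}, ensure_ascii=False)
    return f"Return JSON like {example}\n\nText:\n{full_text}"
```

An alternative that keeps `format()` is to double the literal braces (`{{` and `}}`) in the template, at the cost of a less readable JSON example.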
Chaim	73a79ea7e8	feat(precedents): metadata auto-fill, edit sheet, persuasive extraction All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m28s Details Three improvements to the precedent library based on usage feedback: 1. Auto-fill metadata at upload time. New service precedent_metadata_extractor reads the ruling's full_text and suggests case_name (short), summary, headnote, key_quote, subject_tags, appeal_subtype. The merge policy fills only empty fields, preserving everything the chair typed in the upload form. Wired into the ingest pipeline; also exposed as a re-run endpoint POST /api/precedent-library/{id}/extract-metadata for existing records. 2. Edit sheet in the UI. Pencil icon on each library row opens a pre-populated form covering every field. A Sparkles button on the sheet runs the metadata extractor on demand and refreshes the form. The case_number is read-only because halachot are FK'd to it; renaming requires delete + re-upload. 3. Halacha extractor branches on is_binding. Sources marked binding (Supreme/Administrative) keep the strict halacha prompt. Non-binding sources (other appeals committees, district courts on planning matters) get a different prompt that extracts applications, interpretive principles, and persuasive conclusions — labeled with new rule_types 'application' and 'persuasive'. The fallback also widens chunk selection: if the chunker labeled nothing as legal_analysis/ruling/conclusion, we now run on all chunks rather than returning zero halachot for a usable ruling. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 10:19:35 +00:00
Chaim	7ee90dce31	feat: external precedent library with auto halacha extraction All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m27s Details Adds a third corpus of legal authority distinct from style_corpus (Daphna's prior decisions for voice) and case_precedents (chair-attached quotes per case). The new corpus holds chair-uploaded court rulings and other appeals committee decisions, with binding rules (הלכות) extracted automatically and queued for chair approval. Pipeline (web/app.py + services/precedent_library.py): file → extract → chunk → Voyage embed → halacha_extractor → store + publish progress over the existing Redis SSE channel. Schema V7 (services/db.py): extends case_law with source_kind + extraction status fields under a CHECK constraint pinning practice_area to the three appeals committee domains (rishuy_uvniya, betterment_levy, compensation_197). New precedent_chunks (vector(1024)) and halachot tables (vector(1024) over rule_statement, IVFFlat indexes, gin on practice_areas/subject_tags). Halachot start as pending_review; only approved/published rows are visible to search_precedent_library. Agents: legal-writer, legal-researcher, legal-analyst, legal-ceo, legal-qa get search_precedent_library. legal-writer prompt explains the three-corpus distinction and CREAC use; legal-qa now verifies that every cited halacha resolves to an approved row in the corpus. UI: /precedents page with four tabs — library / semantic search / pending review (J/K nav, A/R/E shortcuts, badge count) / stats. Reuses the existing upload-sheet progress + SSE pattern. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-03 08:38:18 +00:00
Chaim	f256eddbb1	git_sync: full case-dir backup to Gitea (sweep + explicit commits) All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m25s Details The case repo is the user's backup, so anything in the dir must end up on Gitea. Two layers: 1. Periodic sweep (every 30s) — git_sync.sweep_loop runs as a FastAPI background task. It scans every case dir, runs git status --porcelain on each, and commit_and_push's any dirty changes with an auto-built Hebrew message ("אוטו: טיוטות (2) · מסמכים"). Catches files written outside the API path: agent research artefacts, manual edits, etc. 2. Explicit commits at known write paths — DOCX export, interim draft, apply_user_edit, revise_draft, mark-final, analysis DOCX export. These give immediate feedback with descriptive messages instead of waiting up to 30s for the sweep. safe.directory injection added to _git_env so sweep + explicit commits work even when the running uid differs from the case-dir owner (host runs vs. uniform-root container). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 18:27:36 +00:00
Chaim	fa70944ed4	case-create: surface Gitea repo result + UI retry button All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m29s Details The auto-creation in case_create had two failure modes that combined to make repos silently missing: a stale GITEA_TOKEN returning 401, and the outer try/except in case_create that swallowed every exception with a bare pass. Result: cases like 8174-24 ended up with a local git repo and Paperclip project but no Gitea repo, with no signal anywhere. _setup_gitea_remote now returns {ok, url, error} and never raises; the result is attached to the case JSON and the FastAPI endpoint logs a warning when ok=false. The UI gets a "צור ריפו ב-Gitea" button on the case header that appears only when the repo or remote is missing. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 18:12:05 +00:00
Chaim	903fb4d140	db: add missing delete_case (cases_tools.case_delete was calling a ghost) All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m30s Details The case_delete tool in tools/cases.py and the DELETE /api/cases endpoint in web/app.py both invoke await db.delete_case(case_id), but no such function existed in services/db.py — every call returned 500 with an AttributeError. Discovered while wiping case 8174-24 for a clean rerun. Implementation is straightforward because the FK graph already does the work: 7 dependent tables CASCADE on cases.id (documents, document_chunks, claims, appraiser_facts, decisions, qa_results, case_precedents) and 2 SET NULL (audit_log, chair_feedback). A single DELETE FROM cases is enough — no manual ordering needed. Documented in the docstring that this only touches the legal-ai DB — Paperclip projects/issues and Gitea repos for the case are separate systems and must be cleaned up by the caller. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 14:44:44 +00:00
Chaim	28f49defff	LLM session: async, 30min timeout, semantic chunking + parallel All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m28s Details The claude_session bridge had two structural defects that made any non-trivial document extraction unreliable: 1. subprocess.run() blocks the asyncio event loop in the MCP server for the full duration of every LLM call (60-180s typical). 2. The 120-second timeout was below the cold-cache cost of any document over ~12K Hebrew characters. Three back-to-back timeouts on case 8174-24 dropped 43 appellant claims on the floor. Phase 1 of the remediation plan — keeps claude_session as the engine (no Anthropic API switch) and restructures around it: claude_session.py • query / query_json are now async — asyncio.create_subprocess_exec instead of subprocess.run, so MCP server can serve other coroutines while a call is in flight. • DEFAULT_TIMEOUT 120 → 1800 (30 min). High enough that no realistic document hits it; bounded so a runaway never zombifies forever. • LONG_TIMEOUT 300 → 3600 for opus block writing on full case context. • TimeoutError now actually kills the subprocess (asyncio.wait_for cancellation alone leaves the child running). claims_extractor.py • _split_by_sections: chunks at numbered sections / Hebrew letter headings / "פרק" markers / markdown ##, falls back to paragraph breaks, then to hard splits. Targets 12K chars per chunk — small enough that each chunk reliably finishes inside the timeout. • _extract_chunk: per-chunk retry (1 attempt by default) with structured logging on failure. Failed chunks no longer crash the overall extraction; they're skipped with a partial-result warning. • extract_claims_with_ai now runs chunks in parallel via asyncio.gather bounded by a semaphore (CHUNK_CONCURRENCY=3). For a 25K-char appeal: was sequential 150-300s, now ~70-90s. 
Updated all 9 callers (claims, appraiser facts, block writer, qa validator, brainstorm, learning loop, style analyzer × 3) to await the now-async API. The one-shot scripts/extract_claims_8174.py used to recover 43 appellant claims on case 8174-24 has been moved to .archive/ — phase 1 makes it obsolete. SCRIPTS.md updated. Phase 2 (background-task wrapper around LLM-bound MCP tools, persistent llm_tasks table, SSE progress) is the structural follow-up — separate PR. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-30 14:21:35 +00:00
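The two mechanisms this commit describes, a non-blocking subprocess call that kills the child on timeout and a semaphore-bounded `asyncio.gather`, might look roughly like this. The function names `run_with_timeout` and `process_chunks` are hypothetical, and the command line is left to the caller; only `CHUNK_CONCURRENCY=3` is taken from the commit message.

```python
import asyncio

CHUNK_CONCURRENCY = 3  # value quoted in the commit message


async def run_with_timeout(argv: list[str], stdin_text: str, timeout: float) -> str:
    """Run a child process without blocking the event loop; kill it on timeout."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        out, _err = await asyncio.wait_for(
            proc.communicate(stdin_text.encode()), timeout
        )
    except asyncio.TimeoutError:
        # wait_for only cancels the awaiting coroutine; the child process
        # keeps running unless it is killed and reaped explicitly.
        proc.kill()
        await proc.wait()
        raise
    return out.decode()


async def process_chunks(chunks, worker):
    """Run worker(chunk) for every chunk, at most CHUNK_CONCURRENCY at a time."""
    sem = asyncio.Semaphore(CHUNK_CONCURRENCY)

    async def guarded(chunk):
        async with sem:
            try:
                return await worker(chunk)
            except Exception:
                return None  # a failed chunk is skipped, not fatal

    results = await asyncio.gather(*(guarded(c) for c in chunks))
    return [r for r in results if r is not None]
```

Skipping failed chunks trades completeness for availability; a caller that needs to know about gaps would have to log or count the `None` results before they are filtered out.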
Chaim	03e7d88aee	DOCX exporter: 3-layer RTL + David font on all slots All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m30s Details Hebrew was rendering LTR or in Times New Roman fallback in some Word contexts. Root cause: incomplete RTL marking and missing font hints on the run level. Three layers of RTL are required (per skills/docx/SKILL.md): 1. Section: <w:bidi/> in sectPr (now inherited from template) 2. Paragraph: <w:bidi/> directly in pPr (paragraph direction) 3. Run: <w:rtl/> in rPr — tells Word to use cs (complex-script) font Without an explicit font on the run, Hebrew renders in the ascii slot (Times New Roman). Force David on all four slots (ascii / hAnsi / cs / eastAsia) so every shaping path picks the correct font. Changes: - TEMPLATE_PATH now points to skills/docx/decision_template.docx (carries David, RTL, margins, styles); replaces hard-coded constants. - _mark_run_rtl: writes rFonts on all four slots, then appends <w:rtl/>. - _mark_paragraph_rtl: places <w:bidi/> directly in pPr (not nested in rPr — that was the bug), and adds <w:rtl/> to the paragraph-mark rPr. - _set_paragraph_jc: forces explicit jc, overriding style-inherited. Tests: - test_mark_paragraph_rtl_adds_bidi_directly_in_pPr — guards against the regression where bidi was nested inside rPr. - test_mark_run_rtl_forces_david_on_all_font_slots — ensures all four font slots are set, not just cs. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 17:37:52 +00:00
Chaim	5e4c03d0cd	Case sync: refresh remote URL with current token before each push All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m28s Details Cases failed to push silently after the Gitea token in Infisical was rotated: the embedded credential in each case repo's origin URL was the old token, the rotation never propagated, and capture_output=True hid the auth failure as a logger.warning. Three cases (1033-25, 1130-25, 1194-25) accumulated unpushed commits over weeks before this was noticed. Fixes the root cause in two places: web/gitea_client.py for uploads through the FastAPI endpoint, and mcp-server/services/git_sync.py for case_update / document_upload through MCP tools (which previously committed but never pushed at all). The new commit_and_push helper: - re-injects the current GITEA_ACCESS_TOKEN into the existing origin URL on every call, so pushes survive token rotation - logs push failures at WARNING with the actual stderr (the previous code suppressed errors entirely) - continues to push even when the commit was a no-op, in case earlier commits are still unpushed Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-28 17:14:57 +00:00
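Re-injecting the current token into an existing origin URL can be done with the standard library alone. The following is a sketch under assumptions: `inject_token` is a hypothetical helper, the `git` username and the example host are placeholders, and the real `commit_and_push` may build the URL differently.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def inject_token(origin_url: str, token: str, user: str = "git") -> str:
    """Return origin_url with any embedded credential replaced by user:token."""
    parts = urlsplit(origin_url)
    host = parts.hostname or ""
    if parts.port:
        host = f"{host}:{parts.port}"
    # Percent-encode the token so characters like '@' or '/' cannot break the URL.
    netloc = f"{user}:{quote(token, safe='')}@{host}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))
```

The result would then be applied with `git remote set-url origin <url>` before each push, so a rotated token takes effect on the next call rather than staying stale in the repo config.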
Chaim	2b7f291928	Case archive/restore with Paperclip sync All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m27s Details Adds a comprehensive archive flow for closed cases — separate /archive screen in the UI, archive/restore actions on the case detail page, and automatic two-way sync with Paperclip. Backend (web/app.py + mcp-server/services/db.py): - New SCHEMA_V6 migration: cases.archived_at TIMESTAMPTZ + partial index - list_cases gains include_archived/archived_only flags; default excludes archived rows so the main /api/cases list hides closed cases - archive_case / restore_case helpers in db.py - POST /api/cases/{n}/archive sets archived_at and calls pc_archive_project (sets Paperclip projects.archived_at via direct DB) - POST /api/cases/{n}/restore clears archived_at and calls pc_restore_project (clears Paperclip archived_at) - archive_project / restore_project in paperclip_client.py — name-based match consistent with create_project's lookup Frontend (web-ui): - cases.ts: scope param ("active"\|"archived"\|"all") on useCases; useArchiveCase / useRestoreCase mutations - /archive page (new): table of archived cases with restore button + search, sort, empty state matching the editorial aesthetic of / - case-archive-action.tsx: button on case detail header. Active case → confirm dialog → archive. Archived case → restore (no confirm). Toast announces both legal-ai and Paperclip outcomes (synced, not found in pc, error) - case-header shows "בארכיון" badge when archived_at is set - Nav: ארכיון link added to AppShell after בית Tested end-to-end against the live DB: - 1130-25 archive → list_cases(include_archived=False) excludes it, list_cases(archived_only=True) includes it, restore reverses - pc archive/restore on 1194-25 verified via direct DB lookup - TypeScript compiles clean Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-27 18:54:52 +00:00
Chaim	36ca713dfa	Retrofit: tighten yod-bet pattern, add cover-block fallback All checks were successful Build & Deploy / build-and-deploy (push) Successful in 6s Details The "על כן" pattern for block-yod-bet was too greedy and matched mid-discussion transitional sentences (e.g. "על כן, במקום בו..."), which caused forward-scan to skip block-yod-alef ("סוף דבר") via the pointer advance. Tightened to require an operative subject (אנו / הערר / הוועדה / ועדת הערר) so terminal "על כן, אנו מחליטים" still matches but mid-block transitions don't. Added structural_fallback for cover blocks (alef/bet/gimel/dalet) — these are template metadata not present in user-edited DOCX bodies. Inject zero-content anchors so apply_user_edit can still target them later. The frontend toast distinguishes real content gaps from fallback anchors. Also expanded heading patterns based on training corpus inspection: - block-vav: על המקרקעין חלות / במצב התכנוני / התכניות החלות - block-zayin: טענות העוררת - block-chet: עיקר תגובת המשיב - block-tet: הדיון בוועדת הערר For case 1130-25, this raises detection from 6/12 to 11/12 blocks — only block-yod-bet remains missing (Daphna's edit ends at "סוף דבר" + numbered ruling, no terminal "ההחלטה" or "על כן אנו מחליטים" paragraph). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-26 06:57:41 +00:00
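The tightened block-yod-bet pattern can be illustrated with a small regular expression. This is a guess at the shape of the rule, not the pattern in `docx_retrofit.py`: `TERMINAL_RE` and `is_terminal_ruling` are hypothetical names, and only the four operative subjects come from the commit message.

```python
import re

# "על כן" must be followed by an operative subject to count as the
# terminal ruling paragraph; mid-discussion transitions do not qualify.
TERMINAL_RE = re.compile(r"^\s*על כן,?\s+(?:אנו|הערר|הוועדה|ועדת הערר)\b")


def is_terminal_ruling(paragraph: str) -> bool:
    """True when the paragraph opens with 'על כן' plus an operative subject."""
    return TERMINAL_RE.match(paragraph) is not None
```

Anchoring at the start of the paragraph keeps the check cheap and avoids matching the phrase when it appears inside a longer sentence.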
Chaim	c536ed0e63	Edit document doc_type and appraiser side from the case UI All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m26s Details Until now changing a document's doc_type required a manual SQL update. Adds an inline editor on the document badge so the chair can retag without leaving the case page, and threads an appraiser_side tag (committee / appellant / deciding) through the appraisal pipeline so betterment-levy cases — which usually have 2-3 appraisers — render conflicts with the deciding appraiser's view marked as governing. Backend - New appraiser_facts.appraiser_side column (V5.1) populated from documents.metadata.appraiser_side at extraction time. - extract_appraiser_facts now returns status='sides_missing' with the list of untagged appraisals instead of running with empty side labels — chair must tag every appraisal first via the UI. - Conflict detection orders entries committee → appellant → deciding so the deciding appraiser appears last; block-tet's prompt instructs the writer to phrase the deciding appraiser's view as the governing factual finding ("ואולם, השמאי המכריע קבע..."). - New PATCH /api/cases/{n}/documents/{doc_id} (Pydantic model with whitelist validation) and matching document_update MCP tool. Both merge appraiser_side into metadata JSONB instead of touching the schema. UI - New shared doc-types module exports the canonical 11 doc_type options plus the 3 appraiser-side options; both upload-sheet and the document badge now read from it instead of duplicating Hebrew labels. - New DocumentTypeEditor renders a Popover off the doc-type Badge with two Selects. The save button stays disabled while doc_type is appraisal but no side has been picked, mirroring the backend enforcement so the user finds out before triggering extraction. - usePatchDocument React-Query mutation invalidates the case detail on success so the badge updates without a manual refresh.	2026-04-19 06:26:51 +00:00
Chaim	c619c22a51	Add pre-ruling interim draft (טיוטת ביניים) for appeals committee All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m26s Details Lets the chair generate a partial decision DOCX before the discussion-and- ruling block is decided. Same template, skill and DOCX styling as the final decision (David, RTL, bookmarks) — only the block selection and order differ: רקע (ו) → תכניות+היתרים (ט) → טענות (ז) → הליכים (ח). The opening (ה), ruling (י), summary (יא), and signatures (יב) are omitted. - New appraiser_facts table + CRUD + conflict detection in db.py (V5 schema). Conflict = same plan/permit identifier reported differently by 2+ appraisers. - New appraiser_facts_extractor service: per-appraisal Claude extraction of plans + permits with raw quotes and page numbers. - block-tet prompt extended with a permits sub-section sourced from the extracted facts, plus an explicit instruction to flag inter-appraiser conflicts in neutral wording without resolving them (deferred to block-yod). - block-chet prompt extended with a post-hearing materials context sourced from documents.metadata.is_post_hearing. - docx_exporter.export_decision now accepts mode='interim' which reorders the blocks per the chair's mental model and writes טיוטת-ביניים-v{N}.docx (versioned independently of regular drafts). - 3 new MCP tools: extract_appraiser_facts, write_interim_draft, export_interim_draft. write_interim_draft auto-runs extraction if the appraiser_facts table is empty for the case.	2026-04-18 13:28:04 +00:00
Chaim	726498126d	Add Track Changes architecture for draft revisions (CMP + CMPA) All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m29s Details Fixes critical bug in 1033-25: user-uploaded עריכה-*.docx files were orphaned on disk while exports kept rebuilding from stale DB blocks. New architecture: - User-uploaded DOCX becomes the source of truth (cases.active_draft_path) - System edits via XML surgery with real Word <w:ins>/<w:del> revisions - User can Accept/Reject each change from within Word Components: - docx_reviser.py: XML surgery for Track Changes (15 tests) - docx_retrofit.py: retroactive bookmark injection with Hebrew marker detection + heading heuristic (9 tests) - docx_exporter.py: emits bookmarks around each of the 12 blocks - 3 new MCP tools: apply_user_edit, list_bookmarks, revise_draft - 4 new/updated endpoints: upload (auto-registers active draft), /exports/revise, /exports/bookmarks, /exports/{filename}/retrofit, /active-draft - DB migration: cases.active_draft_path column - UI: correct banner using real v-numbers, "מקור האמת" badge, detailed upload toast with bookmarks_added/missing_blocks - agents: legal-exporter (3 export modes), legal-ceo (stage G for revision handling), legal-writer (revision mode) Multi-tenancy: - Works for both CMP (1xxx cases) and CMPA (8xxx/9xxx cases) - New revise-draft skill added to both companies - deploy-track-changes.sh syncs skills CMP ↔ CMPA - retrofit_case.py: one-off retrofit of existing files Tests: 34 passing (15 reviser + 9 retrofit + 4 exporter bookmarks + 6 e2e) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-16 18:49:30 +00:00
Chaim	28daff58be	Pre-existing agent updates + analysis DOCX export Updates accumulated from prior sessions: - HEARTBEAT: company-based filtering (CMP/CMPA) rules - legal-qa, legal-researcher: routine updates - analysis_docx_exporter: new service for analysis DOCX export - compose page: "הורד כ-DOCX" button for analysis - decision_template.docx: template for exporter Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-04-16 18:49:10 +00:00
Chaim	3da4d73498	Upgrade agents to Claude Opus 4.7 All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m28s Details - legal-analyst: opus 4.6 → opus 4.7 - legal-proofreader: opus 4.6 → opus 4.7 - legal-writer: sonnet 4.6 → opus 4.7 (complex block writing benefits from stronger model) - block_writer MODEL_MAP: updated opus ID to 4.7 Opus 4.7 brings: high-res images (2576px), better file-based memory, improved DOCX generation, and task budgets for agentic loops. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-16 16:10:56 +00:00
Chaim	5dd24729e2	Auto-strip Nevo preambles and separate style analysis per appeal subtype - Add strip_nevo_preamble() to extractor.py — auto-removes Nevo database headers (bibliography, legislation, mini-ratio) during training upload - Add appeal_subtype column to style_patterns table — patterns are now stored per subtype instead of globally mixed - Update clear_style_patterns() to support subtype-scoped deletion - Pass appeal_subtype through analyze_corpus → store → upsert pipeline Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 14:03:06 +00:00
Chaim	ba39707c70	Add CMPA (betterment levy) training support and update methodology Support ingestion of betterment levy (היטל השבחה) decisions into a separate training corpus (CMPA). Key changes: - Add .doc file extraction via LibreOffice conversion in extractor - Add practice_area/appeal_subtype columns to style_corpus table - Route training files to cmp/ or cmpa/ subdirs based on appeal subtype - Fix derive_subtype to handle ARAR-YY-NNNN format (was matching year digit) - Expose practice_area/appeal_subtype params in MCP upload_training tool - Add appeal_subtype filter to analyze_style for per-type style analysis - Update betterment levy methodology in lessons.py: checklist (from generic to corpus-based), opening/closing strategies, and discussion rules Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 14:00:35 +00:00
Chaim	684a4cfd3b	Fix 500 error on precedents API — add default=str to json.dumps All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m41s Details UUID and datetime objects from PostgreSQL RETURNING * were not serializable. All other tool files already used default=str. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-15 12:11:30 +00:00
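The serialization failure and its fix can be shown with plain standard-library types. The row below is invented for illustration and `to_json` is a hypothetical helper; the only detail taken from the commit is the use of `default=str`.

```python
import json
import uuid
from datetime import datetime, timezone

# A row shaped like what RETURNING * might yield: UUID and datetime values.
row = {
    "id": uuid.UUID("12345678-1234-5678-1234-567812345678"),
    "created_at": datetime(2026, 4, 15, 12, 0, tzinfo=timezone.utc),
    "case_name": "example",
}


def to_json(record: dict) -> str:
    # default=str is called for any object json cannot serialize natively,
    # so UUID and datetime values are emitted as their string form.
    return json.dumps(record, default=str, ensure_ascii=False)
```

Calling `json.dumps(row)` without `default=str` raises `TypeError`, which is the kind of failure that surfaces as a 500 from an API handler.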
Chaim	2e2d2d42b6	Prevent status regression in case_update All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m32s Details CEO agent was reverting case status from "processing" to "new" when updating metadata fields. Added ordered status list — case_update now silently ignores status changes that would move backwards. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 17:05:40 +00:00
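A guard of this kind might be sketched as follows. Only the `new` and `processing` statuses appear in the commit message; the remaining entries in `STATUS_ORDER`, and the name `resolve_status`, are assumptions made for the example.

```python
# Ordered from earliest to latest. "new" and "processing" come from the
# commit message; "drafted" and "final" are hypothetical placeholders.
STATUS_ORDER = ["new", "processing", "drafted", "final"]


def resolve_status(current: str, requested: str) -> str:
    """Return the status to store: ignore any change that moves backwards."""
    if requested not in STATUS_ORDER or current not in STATUS_ORDER:
        return current  # unknown values never overwrite a stored status
    if STATUS_ORDER.index(requested) < STATUS_ORDER.index(current):
        return current  # silently ignore the regression
    return requested
```

Ignoring the regression silently matches the behaviour the commit describes; logging the ignored request would make it easier to spot an agent that keeps sending stale statuses.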
Chaim	82ba4663ba	Fix case repo sync + auto-create Gitea repos + add sync indicator All checks were successful Build & Deploy / build-and-deploy (push) Successful in 1m30s Details - auto-sync-cases.sh: fix broken directory scan (was looking for status subdirs that don't exist), fix env var word-splitting bug, add safe.directory handling and error logging - cases.py: auto-create Gitea repo on case_create, fix documents/original → documents/originals naming mismatch - app.py: add GET /api/cases/{case_number}/git-status endpoint - web-ui: add SyncIndicator component in case header showing sync status (synced/pending/no remote) with last commit time - pyproject.toml: add httpx dependency - CLAUDE.md: update Paperclip wakeup API docs - settings page: switch tag input from Select to free-text with datalist Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 15:28:16 +00:00
Chaim	e698419faf	Fix git not found error crashing document uploads in container All checks were successful Build & Deploy / build-and-deploy (push) Successful in 3m13s Details Install git in Docker image and wrap all subprocess git calls in try/except so a missing or failing git binary never kills an upload that already succeeded. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:38:40 +00:00


97 Commits