The "ספרייה" tab only exposed approved/total counts in a status pill;
to inspect the actual extracted halachot per case the chair had to use
the global "ממתין לאישור" tab, which only surfaces pending items, or
the MCP tool. Now the per-precedent edit sheet renders a read-only
roll-up of every halacha (approved + pending + rejected) with status
filter tabs and counts. Review actions intentionally stay in the
review tab to avoid duplicate approve/reject UX.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`rerank.maybe_rerank` calls `base_search(limit=…, **base_kwargs)` on both
the rerank-on and rerank-off paths. Commit 242f668 moved the closure into
hybrid_search.py and renamed its parameter to `limit_inner`, so every call
to `/api/precedent-library/search` raised a TypeError (HTTP 500) regardless of the
VOYAGE_RERANK_ENABLED flag. Sibling `search_documents_hybrid` was unaffected
because it uses `lambda **kw:` which absorbs the kwarg.
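A minimal sketch of the mismatch described above, using stand-in functions rather than the project's real code. `base_search_closure`, `call_fails`, and `absorbing` are hypothetical names; only the `limit=` call shape and the `limit_inner` rename come from this message.

```python
def base_search_closure(query, limit_inner=10):
    # Stand-in for the moved closure: after the rename it no longer
    # accepts a keyword called `limit`.
    return [f"{query}-{i}" for i in range(limit_inner)]


def maybe_rerank(base_search, limit, **base_kwargs):
    # Mirrors the call shape from the message: always passes limit=...
    return base_search(limit=limit, **base_kwargs)


def call_fails():
    # Returns True when the renamed parameter triggers the TypeError.
    try:
        maybe_rerank(base_search_closure, 3, query="q")
    except TypeError:
        return True
    return False


# A **kw lambda absorbs any keyword, which is why the sibling path kept working.
absorbing = lambda **kw: base_search_closure(kw["query"], limit_inner=kw["limit"])
```

Calling `maybe_rerank(absorbing, 2, query="q")` succeeds because the lambda maps `limit` onto `limit_inner` itself.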
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Observed 2026-05-03: a `precedent_process_pending(halacha)` run that
chained two precedents (1110/20 → 317/10) succeeded for the first
(9 halachot, 129 chunks) and produced status=`no_halachot` for the
second despite it being a 47KB Supreme Court ruling with rich legal
analysis. A manual single-precedent re-run on 317/10 immediately
extracted 53 halachot. Diagnosis: every chunk's claude_session call
in the back-to-back run silently failed (likely Anthropic rate-limit
storm after the 1110/20 token burn), and the empty list was reported
as "Claude looked and found nothing" — same code path as a real
0-halacha ruling. The user couldn't tell the difference.
Three changes:
1. Surface chunk-level failures (halacha_extractor.py)
`_extract_chunk` now returns `(halachot, succeeded)` so the caller
can count how many chunks crashed. `extract()` uses this to
distinguish:
- `no_halachot` — chunks ran cleanly, Claude found nothing
- `extraction_failed` — ≥50% of chunks crashed AND zero halachot
came back (rate limit, subprocess crash, etc.)
When `extraction_failed`, DB status is left as 'processing' so the
request stays in the queue for the caller to retry — instead of
the old behaviour where it got marked 'completed' and silently
dropped from the queue.
2. Inter-precedent cooldown (precedent_library.py)
`process_pending_extractions` now sleeps 30s between precedents.
Anthropic rate-limits per-org, and back-to-back large rulings
(~4M tokens for 1110/20, immediately followed by another 2-3M)
were the empirical trigger. 30s gives the per-minute counter time
to drain.
3. Auto-retry on extraction_failed (precedent_library.py)
When a precedent comes back as `extraction_failed`, retry once
after a 60s cooldown before giving up. Rate-limit storms are
transient — the manual re-run of 317/10 minutes later succeeded
with 53 halachot and zero chunk failures, confirming a single
retry is sufficient. Only retries `extraction_failed`; never
`no_halachot` (Claude looked and there genuinely is no holding).
The DB status now ends up as 'failed' only after retries are
exhausted, matching the UI's terminal-failure chip.
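The status logic in changes 1 and 3 can be sketched roughly as follows. This is an illustration under assumptions, not the code in halacha_extractor.py: `classify`, `extract_with_retry`, the `"completed"` label for a run that found halachot, and the injectable `sleep` are all hypothetical.

```python
import time
from typing import Callable, List, Tuple

ChunkResult = Tuple[List[str], bool]  # (halachot, succeeded), as _extract_chunk returns


def classify(chunk_results: List[ChunkResult]) -> str:
    halachot = [h for found, _ in chunk_results for h in found]
    if halachot:
        return "completed"  # assumed label for a run that produced halachot
    failed = sum(1 for _, ok in chunk_results if not ok)
    # >= 50% of chunks crashed AND zero halachot came back
    if chunk_results and failed * 2 >= len(chunk_results):
        return "extraction_failed"
    return "no_halachot"


def extract_with_retry(run: Callable[[], List[ChunkResult]],
                       cooldown_s: float = 60.0,
                       sleep: Callable[[float], None] = time.sleep) -> str:
    status = classify(run())
    if status == "extraction_failed":
        sleep(cooldown_s)              # single retry after the cooldown
        status = classify(run())
        if status == "extraction_failed":
            status = "failed"          # terminal only once the retry is spent
    return status                      # no_halachot is never retried
```

Passing `sleep` as a parameter keeps the cooldown testable without waiting a real minute.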
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When Paperclip wakes the CEO and the model issues an mcp__legal-ai__*
call within ~10s of session init, Claude Code sometimes returns
"No such tool available" because the legal-ai MCP server hasn't
finished bringing up its tool catalog yet. Observed twice today on
CMPA precedent-extraction wakeups (sessions 9989fbaf and a9c61801);
the agent fell back to bash + .venv/bin/python and finished the work,
but the race needed fixing on the server side.
Three changes that close the window:
1. Lazy schema init (services/db.py + server.py)
`init_schema()` was awaited inside the FastMCP lifespan, blocking
the `initialize`/`tools/list` handshake until ~10 CREATE TABLE IF
NOT EXISTS statements ran. Under contention (two CEOs waking at
once for different companies) the handshake took even longer. Now the lifespan
returns immediately and `get_pool()` runs the schema migrations
exactly once on first DB access, guarded by an asyncio.Lock.
tools/list is answered in milliseconds regardless of DB state.
2. Lazy heavy imports
- services/embeddings.py: voyageai (~450ms) loaded only inside
_get_client()
- services/extractor.py: google.cloud.vision (~550ms) loaded only
inside _get_vision_client() and _ocr_with_google_vision()
These two were being imported at module top from
legal_mcp.tools.documents -> services.processor -> services.{
extractor,embeddings}, so the FastMCP server couldn't even start
responding until both finished. Cold start dropped from 2.7s to
1.17s end-to-end (init + tools/list response).
3. Agent-side warmup + retry guidance (.claude/agents/legal-ceo.md)
Even with a fast server, the model can still race on the very
first call. The precedent-extraction section now tells the CEO
to call workflow_status as a warmup probe and to retry after a
short sleep if it sees "No such tool available", before falling
back to the python bypass.
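The lazy schema init from change 1 follows a common double-checked pattern. The sketch below is a generic illustration, not the code in services/db.py: `LazyPool`, `connect`, `init_schema` as injected callables, and `demo` are hypothetical.

```python
import asyncio


class LazyPool:
    """Defers connection and schema setup until the first get_pool() call."""

    def __init__(self, connect, init_schema):
        self._connect = connect          # hypothetical async pool factory
        self._init_schema = init_schema  # hypothetical async migration runner
        self._pool = None
        self._lock = None
        self.init_calls = 0

    async def get_pool(self):
        if self._pool is not None:       # fast path once initialised
            return self._pool
        if self._lock is None:           # no await in between, so this is race-free
            self._lock = asyncio.Lock()
        async with self._lock:
            if self._pool is None:       # re-check after waiting on the lock
                pool = await self._connect()
                await self._init_schema(pool)
                self.init_calls += 1
                self._pool = pool
        return self._pool


async def demo():
    async def connect():
        await asyncio.sleep(0)
        return object()

    async def init_schema(pool):
        await asyncio.sleep(0)

    lazy = LazyPool(connect, init_schema)
    pools = await asyncio.gather(*(lazy.get_pool() for _ in range(5)))
    return lazy.init_calls, len({id(p) for p in pools})
```

Five concurrent callers in `demo()` share one pool and trigger the schema step once, which is the property the lifespan change relies on.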
Also expanded the precedent-tool whitelists on the sub-agents that
delegate halacha/library work (commits 4a9a6b7 + 7ee90dc added the
tools to the MCP server but only the CEO got them in its allowed
list). Added to: legal-researcher (full extraction set), legal-analyst
(library_get/list + halacha review), legal-writer (library lookups +
halacha_review), and legal-qa (library_get + halacha_review). The CEO's
own list also gains the two tools it was still missing (halacha_review,
halachot_pending).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stage C of the voyage-upgrades-plan shipped to production on
2026-05-03. The doc now leads with the final state and the two
empirical corrections vs the original plan:
1. Reciprocal Rank Fusion replaces weighted-sum hybrid merge.
voyage-3 cosines (~0.4-0.5) systematically outscale
voyage-multimodal-3 cosines (~0.20-0.25); a weighted sum lets
text dominate even when image is the better signal. RRF is
rank-based and robust to scale differences.
2. Chunker now propagates page_number end-to-end (extractor returns
per-page offsets, chunker tags each chunk by its first character's
page). A retrofit script backfills page_number on existing
document_chunks without re-OCR — uses the stored
documents.extracted_text plus PyMuPDF direct text reads as page
anchors (linear interpolation for OCR-only pages).
Production state on cases 8174-24 + 8137-24: 419 page-image
embeddings, 819 chunks tagged with page_number, MULTIMODAL_ENABLED=true
in Coolify env, hybrid search verified A/B against text-only baseline.
The original stage C plan section is retained below for reference.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The first-pass retrofit re-extracted via extractor.extract_text, which
re-runs Google Vision OCR on scanned pages. OCR is non-deterministic,
so the new text didn't match the chunk content stored in the DB
(produced by the original OCR run) — only ~7% of chunks were located.
New approach (no OCR cost):
1. Use the stored documents.extracted_text from the DB — the exact
text the chunks were produced from, so chunk lookups match.
2. Anchor page boundaries via PyMuPDF direct text reads (free, no
OCR). Pages with usable direct text are anchored by snippet match;
OCR-only pages are linearly interpolated between anchors.
3. Search each chunk in extracted_text using a whitespace-tolerant
helper. This is needed because the chunker joins paragraphs with a
single '\n' while extracted_text uses '\n\n' as the page separator.
Verified on 8174-24 (5 docs, 307 chunks) + 8137-24 (9 docs, 512
chunks): 100% chunks tagged, 13s total, $0 cost.
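One way to build the whitespace-tolerant lookup from step 3 is to match the chunk's tokens with any run of whitespace between them. This is a sketch of that idea, not the helper in the retrofit script; `find_whitespace_tolerant` is a hypothetical name.

```python
import re


def find_whitespace_tolerant(haystack: str, needle: str, start: int = 0) -> int:
    """Return the offset of needle in haystack, treating any whitespace run as equal."""
    tokens = needle.split()
    if not tokens:
        return -1
    # Escape each token literally and allow any whitespace between tokens,
    # so '\n' in the chunk can match '\n\n' in the stored text.
    pattern = re.compile(r"\s+".join(re.escape(t) for t in tokens))
    match = pattern.search(haystack, start)
    return match.start() if match else -1
```

The `start` argument lets a caller scan forward through the document so repeated passages resolve to successive positions.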
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The legacy chunker did not track which PDF page each chunk came from.
Stored chunks had page_number=NULL, which blocked the multimodal
hybrid retriever's text+image boost — it joins (chunk, image) on
(document_id, page_number) and the join could never fire.
This change:
- extractor.extract_text now returns (text, page_count, page_offsets);
page_offsets[i] is the start char offset of page (i+1) in the joined
text. None for non-PDFs.
- chunker.chunk_document accepts an optional page_offsets and tags
each chunk with the page that contains its first character (uses
the existing chunker logic; pages assigned post-hoc by content
search to keep the diff minimal).
- processor.process_document and precedent_library.ingest_precedent
forward page_offsets through the chunker. New uploads now carry
accurate page_number on every chunk.
- Other extract_text callers (tools/documents, tools/workflow,
web/app.py) updated to unpack the third element (ignored).
- scripts/backfill_chunk_pages.py: per-case retrofit. Re-extracts each
PDF (re-OCRs via Google Vision if needed, ~$0.0015/page), computes
page_offsets, and updates page_number on every chunk by content
search. Idempotent; --force re-runs on already-tagged docs.
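Given the `page_offsets` contract above (entry i is the start offset of page i+1), the page lookup reduces to a binary search. The sketch below assumes exact substring matches and uses hypothetical helper names (`page_for_offset`, `tag_chunks`); the real chunker may locate chunks differently.

```python
from bisect import bisect_right
from typing import List, Optional, Sequence, Tuple


def page_for_offset(page_offsets: Optional[Sequence[int]], char_offset: int) -> Optional[int]:
    """1-based page containing char_offset; None when there are no offsets (non-PDF)."""
    if not page_offsets:
        return None
    return max(1, bisect_right(page_offsets, char_offset))


def tag_chunks(text: str, chunks: List[str],
               page_offsets: Optional[Sequence[int]]) -> List[Tuple[str, Optional[int]]]:
    """Tag each chunk with the page of its first character, found by content search."""
    tagged = []
    cursor = 0
    for chunk in chunks:
        pos = text.find(chunk, cursor)
        if pos == -1:
            tagged.append((chunk, None))  # leave untagged rather than guess
            continue
        tagged.append((chunk, page_for_offset(page_offsets, pos)))
        cursor = pos                      # chunks may overlap, so do not skip past
    return tagged
```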
A forward-only change would leave the 419 image embeddings backfilled on
cases 8174-24 + 8137-24 unable to boost their corresponding text
chunks. The retrofit script closes that gap (cost ~$0.60).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Previous push to main did not trigger a workflow run; act-runner
went silent after task 112. Empty commit to re-fire the webhook.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Cosine scores in voyage-3 (~0.4-0.5) and voyage-multimodal-3
(~0.2-0.25) live on different scales. The previous weighted-sum
merge let text always dominate — verified empirically: 0 image-only
hits across 7 queries on case 8174-24, image side contributed nothing.
RRF combines results by *rank* in each list rather than by raw score,
which makes it robust to scale differences. Per-item score:
rrf_score = text_weight / (k + text_rank)
+ image_weight / (k + image_rank)
A row that appears in both lists (joined on (id_field, page_number))
gets both terms — surfaced as match_type='text+image'.
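The formula above can be sketched as a small merge function. This is an illustration, not the code in hybrid_search.py: `rrf_merge` is a hypothetical name, ranks are assumed to be 1-based, and the image weight is passed explicitly rather than derived from the text weight.

```python
from typing import Dict, Hashable, List, Sequence, Tuple


def rrf_merge(text_keys: Sequence[Hashable], image_keys: Sequence[Hashable],
              text_weight: float = 0.5, image_weight: float = 0.5,
              k: int = 60) -> List[Tuple[Hashable, float, str]]:
    """Merge two ranked lists of (id, page_number) keys by weighted RRF."""
    scores: Dict[Hashable, float] = {}
    for rank, key in enumerate(text_keys, start=1):
        scores[key] = scores.get(key, 0.0) + text_weight / (k + rank)
    for rank, key in enumerate(image_keys, start=1):
        scores[key] = scores.get(key, 0.0) + image_weight / (k + rank)

    in_text, in_image = set(text_keys), set(image_keys)

    def match_type(key: Hashable) -> str:
        if key in in_text and key in in_image:
            return "text+image"
        return "text" if key in in_text else "image"

    ranked = sorted(scores.items(), key=lambda item: -item[1])
    return [(key, score, match_type(key)) for key, score in ranked]
```

Because only ranks enter the score, a page that sits second in both lists outranks a page that is first in one list and absent from the other.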
After fix on 8174-24 (146 image rows): 2 image-only hits land in
top-5 across all 7 test queries, surfacing actual table/diagram/
signature pages (p12, p13 of שומת המשיבה for 'טבלת השוואת ערכי שומה',
p25 of שומת השגה for 'תרשים גוש וחלקה', etc).
On 8137-24 (273 image rows): 'חישוב היוון של דמי החכירה' goes from
0 baseline results → 5 hybrid results (3 text + 2 image), opening
recall on scanned content the OCR layer misses.
Default MULTIMODAL_TEXT_WEIGHT 0.65 → 0.5 (vanilla RRF) since the
prior 0.65 was tuned for raw cosine scales that no longer apply.
New env knob MULTIMODAL_RRF_K (default 60, standard literature).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Stage C: per-page image embeddings via voyage-multimodal-3 + hybrid
text+image search. Off by default; enable with MULTIMODAL_ENABLED=true.
- Schema V9: document_image_embeddings + precedent_image_embeddings
(vector(1024), page_number, image_thumbnail_path)
- extractor.render_pages_for_multimodal renders PDF pages at
MULTIMODAL_DPI (144) for embedding + JPEG thumbnails at
MULTIMODAL_THUMB_DPI (96) for UI preview, in one pass
- embeddings.embed_images calls voyage-multimodal-3 in 50-page batches
- services/hybrid_search.py orchestrator: rerank applied to text side
first (rerank-2 is text-only); image side cosine; weighted merge
with text_weight 0.65 (env-tunable); image-only pages surface as
match_type='image' so dense scanned content still appears
- processor.process_document and precedent_library.ingest_precedent
gated by flag — non-fatal on multimodal failure
- scripts/multimodal_backfill.py — idempotent per-case CLI to embed
existing documents without re-extracting text
Validated locally on a 5-page response brief: render 0.31s, embed 8.32s,
hybrid merge surfaces image rows correctly. Production rollout starts
with flag=false (no behavior change), then per-case A/B.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The halacha-review panel was rendering raw slugs (`betterment_levy`,
`rishuy_uvniya`, `compensation_197`) as English badges. Pipe them through
the existing `practiceAreaLabel()` helper so the chair sees
"היטל השבחה", "רישוי ובניה", "פיצויים לפי ס' 197".
All other UI sites (library-list-panel, library-stats-panel,
precedent-edit-sheet) were already using the helper — this was the
sole place left rendering the raw slug.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
PATCH /api/halachot/{id} was returning 500 because the row included
``embedding`` as a numpy.ndarray of np.float32, which FastAPI's
jsonable_encoder cannot serialize (vars() and dict() both fail on it).
The bug had been latent — it triggered for the first time today after
the auto-approve batch left only low-confidence halachot for the chair
to review manually, and her first PATCH hit the unserializable response.
Replace ``RETURNING *`` with an explicit column list (everything except
``embedding``). Callers that need the embedding can re-fetch via
``get_halacha`` — but no current caller does.
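A rough sketch of building the statement with an explicit column list. The table name, the column names in `HALACHA_COLUMNS`, and `build_patch_sql` are assumptions for illustration; only the exclusion of ``embedding`` from the RETURNING clause comes from this message.

```python
from typing import Sequence

# Hypothetical column list; the real table has its own set of columns.
HALACHA_COLUMNS = ("id", "case_law_id", "rule_text", "rule_type",
                   "confidence", "review_status", "reviewer", "embedding")


def build_patch_sql(fields: Sequence[str]) -> str:
    """Build an UPDATE with $n placeholders that never returns the embedding."""
    bad = [f for f in fields if f not in HALACHA_COLUMNS or f in ("id", "embedding")]
    if bad or not fields:
        raise ValueError(f"cannot patch fields: {bad or 'none given'}")
    # $1 is reserved for the row id; patched values start at $2.
    assignments = ", ".join(f"{name} = ${i}" for i, name in enumerate(fields, start=2))
    returning = ", ".join(c for c in HALACHA_COLUMNS if c != "embedding")
    return f"UPDATE halachot SET {assignments} WHERE id = $1 RETURNING {returning}"
```

Checking field names against a fixed tuple also keeps caller-supplied keys out of the SQL text.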
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Halachot extracted by halacha_extractor with confidence >= 0.80 are now
inserted with review_status='approved' instead of 'pending_review' —
they appear in search_precedent_library immediately. Halachot below the
threshold still require manual chair approval.
Threshold tunable via env (HALACHA_AUTO_APPROVE_THRESHOLD), defaults to
0.80. Rationale: 89% of historical extractions (356/400) score 0.80+,
spot-checks confirmed quality, and the manual review backlog was the
single biggest reason rerank-2 was returning passages-only on
ההבחנה-style queries.
After this change + the one-time backfill UPDATE, search now returns
9/10 halachot for "ההבחנה בין השבחה לפיצויים" instead of 0 — and the
top-3 are exact-match rules, not adjacent passages.
Reviewer field records "auto-approved (confidence ≥ X.XX)" with the
threshold value at insert time, for traceability.
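The insert-time decision can be sketched as below. `initial_review` and its return shape are hypothetical; the env variable name, the 0.80 default, the two status values, and the reviewer string format come from this message.

```python
import os
from typing import Mapping, Optional, Tuple


def initial_review(confidence: float,
                   env: Mapping[str, str] = os.environ) -> Tuple[str, Optional[str]]:
    """Return (review_status, reviewer) for a freshly extracted halacha."""
    threshold = float(env.get("HALACHA_AUTO_APPROVE_THRESHOLD", "0.80"))
    if confidence >= threshold:
        # Record the threshold in force at insert time, for traceability.
        return "approved", f"auto-approved (confidence ≥ {threshold:.2f})"
    return "pending_review", None
```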
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
In an RTL paragraph the bidi algorithm puts the *first* logical token
on the right, so "פתח דאשבורד Paperclip" rendered visually as
"Paperclip" on the LEFT — which reads as the *last* word in Hebrew
and looks like an afterthought rather than the brand name the menu
opens. Reorders to "Paperclip פתח דאשבורד" so Paperclip sits on the
right (read first) and centers the label so it sits above both items
instead of hugging the inline-start edge.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous flex layout used `flex-1` on the search wrapper, which
centers the search relative to the *remaining* space — so as the brand
subtitle grows ("עוזר משפטי · ערר 8137-24 · ניסוח") or the agent
trigger label changes, the search drifts off-center.
Switches Row 1 to `grid-cols-[minmax(0,1fr)_minmax(280px,460px)_minmax(0,1fr)]`:
brand on the right, search in the middle (anchored to the viewport
midpoint), agent dropdown on the left. The side cells flex equally so
the center stays put regardless of side content width.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous layout used `justify-between` with the board name and the
prefix·hint text on the same row. With Hebrew labels + the long hint
"תיקי 8xxx / 9xxx" the row overflowed the 220px content and wrapped the
hint into 2-3 lines, breaking visual alignment.
Stacks each item now: bold board name on top, dim prefix·hint underneath.
Adds whitespace-nowrap to both lines and bumps min-width to 240px so the
content drives the dropdown width instead of fighting it.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Splits the AppShell header into:
Row 1 — brand: logo + dynamic context subtitle (route-aware) +
global search + agent boards dropdown
Row 2 — nav: work group (בית · ארכיון) | knowledge group (ספריית
פסיקה · אימון · מתודולוגיה) + admin dropdown (⚙) on the left
Three changes from the previous flat 8-item nav:
1. Grouping reflects intent. Daily-driver pages are in "work", corpus
pages in "knowledge"; system pages (skills · diagnostics · settings)
move into a single ⚙ dropdown so they stop competing for attention.
2. Subtitle is now dynamic. `headerSubtitle(pathname)` resolves the
current section so the user always sees where they are without
scanning the nav row. Case routes show the case number explicitly
("ערר 1234-24" / "ערר 1234-24 · ניסוח").
3. The gold-underline active state is preserved and the admin trigger
inherits it whenever any admin route is active.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures accumulated backend drift since last regeneration. Triggered
by the new /api/search/cases endpoint added for header global search,
but the diff also picks up many other endpoints that had been added
without re-running api:types.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds an always-visible debounced search input in the AppShell header
that fans out to three independent sources in parallel and renders
per-source result groups with their own loading/empty/error states:
- /api/search/cases (NEW): SQL ILIKE on case_number, address, parties,
title, subject. Returns small projections, no embeddings needed.
- /api/precedent-library/search (existing): semantic over case-law
halachot + passages.
- /api/search (existing): semantic over case documents + past decisions.
Cmd/Ctrl+K focuses the input; Esc and click-outside close the panel.
This is Phase A of the header redesign — the bar layout itself is
unchanged; row grouping + dynamic context follow in Phase B.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the hardcoded CMPA link with a dropdown listing both
Paperclip boards (CMP = רישוי ובניה, CMPA = היטלי השבחה). Fixes the
mislabeling where the original link pointed to the wrong board, and
gives the user a single entry point that scales if a third board is
added later.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a "ניהול סוכנים" link on the opposite side of the "עוזר משפטי"
title in the app shell header. Opens the Paperclip CMPA dashboard
(pc.nautilus.marcusgroup.org/CMPA/dashboard) in a new tab for quick
cross-tool navigation between the legal-ai workspace and agent ops.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
When a precedent is uploaded to the library, the FastAPI container now
fires a Paperclip wakeup so Claude (running locally as the CEO agent)
picks up the new row and runs `precedent_process_pending` for both
metadata and halacha extraction. The user no longer has to remember to
trigger it manually.
Mechanics:
- New `wake_for_precedent_extraction()` in paperclip_client.py creates
(or reuses) a per-company "ספריית פסיקה — תור חילוץ" project, opens
a fresh issue assigned to the company CEO with the case_law_id +
citation in the description, and pings the Board API wakeup endpoint
with `triggerDetail=precedent_library_upload`.
- ingest_precedent's _run() in app.py captures the returned case_law_id
and best-effort calls the wake function (failures are logged, not
surfaced — the upload itself stays clean).
- legal-ceo.md adds the precedent_process_pending tool family and a
new "חילוץ פסיקה אוטומטי" section that tells the CEO to short-circuit
past the heartbeat scan when woken with this trigger.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase A — voyage-3 migration (executed):
- VOYAGE_MODEL=voyage-3 set in Coolify (legal-ai app) and ~/.env
- scripts/reembed_voyage.py: re-embeds document_chunks (6157),
case_law_embeddings (9), precedent_chunks (385), and halachot (400)
using the new model. paragraph_embeddings was empty. 6951 rows
re-embedded in 93s, ~75 rows/sec.
- Same 1024 dim → no schema change needed.
Why voyage-3 over voyage-law-2: benchmark on 3 Hebrew legal queries
with real passages from the corpus gave voyage-3 perfect ordering on
3/3 tests AND the largest separation (+0.483 vs voyage-law-2's
+0.238). voyage-4 family had bigger separation but missed top-1 on
the hardest test.
Phase B (voyage-context-3) and Phase C (voyage-multimodal-3.5 for
scanned + appraiser docs) are designed in docs/voyage-upgrades-plan.md
but deferred — to be picked up in a fresh conversation. The plan
includes:
- Phase B: contextualized embeddings refactor (~49% recall lift on
legal docs per Anthropic's research). Same dim, but ingestion
pipeline must pass full doc context per chunk.
- Phase C: page-level image embeddings via voyage-multimodal-3.5,
stored in a parallel *_image_embeddings table. Hybrid text+image
search. Targets appraiser report tables and scanned PDFs where
current OCR loses layout.
After this commit: MCP server needs a /mcp reconnect to pick up the
new VOYAGE_MODEL env, and the legal-ai container will pick it up on
its next redeploy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
set_case_law_extraction_status and set_case_law_halacha_status now NULL
the corresponding *_requested_at timestamp when status transitions to
"completed" or "failed". Without this, completed rows kept lingering in
the local-MCP work queue (which scans by `WHERE *_requested_at IS NOT NULL`)
and the UI's isPrecedentActive check, leaving them undeletable until a
manual SQL cleanup.
The pre-existing process_pending_extractions path already called
clear_extraction_request, but other paths (re-extraction, status set
during upload) didn't — so the cleanup belongs at the status setter.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The local MCP worker is supposed to NULL `*_extraction_requested_at` after
a successful run, but in practice these timestamps linger. The previous
isPrecedentActive logic treated any non-null timestamp as "still active",
which left completed rows permanently undeletable.
Now only "processing" status (or genuinely queued: pending + timestamp)
counts as active. Once a row is "completed"/"failed", stale timestamps
no longer block the delete button.
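The revised rule, rendered in Python for illustration (the real check lives in the UI code). The dict keys and the restriction to the halacha timestamp are simplifying assumptions.

```python
from typing import Any, Mapping


def is_precedent_active(row: Mapping[str, Any]) -> bool:
    """True only while work is running or genuinely queued."""
    statuses = (row.get("extraction_status"), row.get("halacha_extraction_status"))
    if "processing" in statuses:
        return True
    # Queued means: still pending AND a request timestamp is present.
    queued = row.get("halacha_extraction_requested_at") is not None
    return row.get("halacha_extraction_status") == "pending" and queued
```

A completed row with a leftover timestamp now evaluates to inactive, so the delete button is no longer blocked.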
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The chair pointed out three UX gaps after uploading a new precedent:
1. The status said "מחלץ הלכות" but nothing was actually running — the
field only meant "halacha_extraction_status != completed", which
includes the post-upload "pending" state where the local MCP worker
hasn't been told to drain anything yet. Misleading.
2. The page didn't refresh on its own. The chair had to F5 to see new
counts after extraction completed.
3. Clicking the trash icon mid-extraction would cascade-delete the row
while the extractor was still using it (FK errors, partial writes).
Fixes:
- ingest_precedent now auto-queues both metadata and halacha extraction
on upload by stamping the request timestamps. The chair (or I) drains
the queue with one `precedent_process_pending` call from chat, with
no need to click any button first.
- StatusPill is now seven-state with proper labels:
"נכשל" (extraction_status=failed) — red
"מעבד טקסט" — shimmer (extraction_status=processing)
"בתור" — neutral (chunks queued, not yet running)
"מחלץ הלכות" — shimmer (halacha_extraction_status=processing)
"ממתין לחילוץ" — neutral (queued for local MCP worker)
"לא חולץ" — neutral (pending without queue stamp — shouldn't happen)
"X/Y מאושרות" — gold (done, with halachot count)
The shimmer is a CSS-only sliding-stripe animation defined in globals.
- usePrecedents has a conditional refetchInterval — polls every 5s while
any row is mid-extraction or queued, then stops once everything settles
to completed/failed. New helper isPrecedentActive() centralises the
"is this row mid-something" check so the UI and the destructive-action
guard agree.
- Trash button is disabled (opacity 30%, tooltip explains) while the row
is active. Pencil/edit stays enabled — editing metadata fields during
extraction is safe (last write wins, low-stakes race).
Schema: list_external_case_law now exposes the two *_requested_at
timestamps so the UI can distinguish "queued" from "never asked".
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The chair wanted a one-click "extract metadata" button on the edit sheet.
The constraint stays the same — claude_session needs the local CLI which
the container doesn't have, so the button can't run the extractor itself.
Compromise: button stamps a queue marker; the local MCP server drains the
queue on demand.
DB (V8): two nullable timestamps on case_law,
metadata_extraction_requested_at and halacha_extraction_requested_at,
with partial indexes for cheap "find pending" scans.
API:
POST /api/precedent-library/{id}/request-metadata → stamp the row
POST /api/precedent-library/{id}/request-halachot → same for halacha
GET /api/precedent-library/queue/pending?kind=... → read-only view
UI: Sparkles button in the edit sheet header. Click → toast tells the
chair what to run from Claude Code. The button never triggers the
extractor directly from the container.
MCP tool: precedent_process_pending(kind, limit) — runs from Claude Code
with the local CLI, picks up everything stamped, calls the extractor for
each, clears the timestamp on success. Failures keep the timestamp so the
next invocation retries them.
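The drain semantics (clear the stamp on success, keep it on failure) can be sketched with in-memory rows. `drain_queue`, the `requested_at` key, and the `extract` callable are hypothetical stand-ins for the MCP tool and the DB columns.

```python
from typing import Any, Callable, Dict, List


def drain_queue(rows: List[Dict[str, Any]],
                extract: Callable[[Any], None],
                limit: int = 10) -> List[Any]:
    """Run extract() on stamped rows; return the ids that succeeded."""
    pending = [r for r in rows if r.get("requested_at") is not None][:limit]
    processed = []
    for row in pending:
        try:
            extract(row["id"])
        except Exception:
            continue                 # keep the stamp so the next call retries
        row["requested_at"] = None   # clear only after a successful run
        processed.append(row["id"])
    return processed
```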
Architectural rule (claude_session local-only) is preserved end-to-end
and called out in the new endpoint comment + tool docstring.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two follow-ups after running the metadata extractor on 403-17:
1. Library table: shadcn TableCell defaults to whitespace-nowrap and
the table wrapper has overflow-x-auto, so the long citation forced
a horizontal scrollbar inside the row. Override on the citation
cell only — whitespace-normal + break-words + min/max-w to keep the
column readable. Same for the case-name cell. Row aligns to top so
wrapping doesn't push neighbours up.
2. Extractor now also fills source_type (court_ruling /
appeals_committee). The previous round added decision_date_iso,
precedent_level, and court but left source_type empty. Same
closed-enum + merge-only-if-empty policy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The first end-to-end run on 403-17 surfaced three fields the auto-fill
left blank because the chair didn't set them in the upload form: date,
precedent_level, and court. All three are right there in the ruling's
header text — there's no reason to require manual entry.
Prompt now asks for:
- decision_date_iso (YYYY-MM-DD parsed from "ניתנה היום, … 5 בספטמבר 2022"
style signatures)
- precedent_level (closed enum: עליון/מנהלי/ועדת_ערר_ארצית/ועדת_ערר_מחוזית)
- court (the full court name from the title block)
Validation is unchanged: precedent_level only accepts the four enum
values; decision_date_iso is parsed into a Python date object before
being handed to update_case_law (asyncpg doesn't coerce strings to
DATE columns); court is stored verbatim.
Merge policy is unchanged — only fills empty fields. Anything the
chair typed in the upload form survives.
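The validation and merge rules can be sketched like this. The function names and the `decision_date` target key are assumptions; the four enum values, the ISO date parsing, and the fill-only-empty policy come from this message.

```python
from datetime import date
from typing import Any, Dict

PRECEDENT_LEVELS = {"עליון", "מנהלי", "ועדת_ערר_ארצית", "ועדת_ערר_מחוזית"}


def clean_suggestions(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Validate extractor output before it reaches the DB layer."""
    out: Dict[str, Any] = {}
    iso = raw.get("decision_date_iso")
    if iso:
        try:
            # asyncpg wants a date object for DATE columns, not a string.
            out["decision_date"] = date.fromisoformat(iso)
        except ValueError:
            pass                      # drop unparseable dates
    if raw.get("precedent_level") in PRECEDENT_LEVELS:
        out["precedent_level"] = raw["precedent_level"]
    if raw.get("court"):
        out["court"] = raw["court"]   # stored verbatim
    return out


def merge_only_if_empty(existing: Dict[str, Any], suggested: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only suggestions for fields the chair left empty."""
    return {k: v for k, v in suggested.items() if existing.get(k) in (None, "", [])}
```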
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
After running the dual-mode halacha extractor on a real appeals committee
decision (403-17), the pending-review tab surfaced 351 halachot in a
single flat list — the chair correctly pointed out that this is unusable
without grouping. Three fixes:
1. Group pending halachot by precedent (case_law_id). Each group shows
the citation, court, date, level and item count; default state is
collapsed so the chair picks one ruling at a time. Within a group,
items still sort by confidence ascending so the doubtful ones surface
first. J/K/A/R/E now scope to currently-expanded groups; toggling
open auto-focuses the first item.
2. Translate the badges that were leaking English: rule_type values
(`persuasive`, `interpretive`, `binding`, `application`, `procedural`,
`obiter`) now render as Hebrew labels, and `confidence X.XX` becomes
`ביטחון X.XX`. The card header no longer repeats the citation since
it's already in the group header.
3. Strip Unicode bidi marks (U+200E/F/202A-E/2066-9) from displayed
citations. Nevo PDFs and the upload form embed these in the
case_number; they render as zero-width but visually push the text
away from the right edge of the table cell. Also: hide the empty
court line under the case name in the list (was rendering as a
stray em-dash), and use a muted em-dash for empty date/level rather
than blank/dash inconsistency across columns.
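The bidi-mark stripping in item 3 amounts to one character class. The sketch below is in Python for illustration (the UI applies the equivalent on display); `strip_bidi` is a hypothetical name.

```python
import re

# U+200E/U+200F (LRM/RLM), U+202A-U+202E (embeddings/overrides),
# U+2066-U+2069 (isolates), matching the ranges listed above.
BIDI_MARKS = re.compile("[\u200e\u200f\u202a-\u202e\u2066-\u2069]")


def strip_bidi(value: str) -> str:
    """Remove invisible bidi control characters from a display string."""
    return BIDI_MARKS.sub("", value)
```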
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Architectural correction: every claude_session caller in this project
runs through the local MCP server (~/.claude.json points at
/home/chaim/legal-ai/mcp-server/.venv/bin/python). The Coolify container
has no `claude` CLI and no claude.ai session, so any LLM call originating
from web/ FastAPI fails with "Claude CLI not found" — which is exactly
what we hit on 403-17.
The earlier Anthropic SDK fallback would have made it work, but at
direct API cost. The chair's preference is to stay on the claude.ai
session for everything. So:
- claude_session.py: removed the SDK fallback, restored CLI-only.
The error message now points the next person at the architectural
rule in the module docstring instead of papering over it.
- precedent_library.py:ingest_precedent (called from FastAPI on upload)
now does only the non-LLM half: extract → chunk → embed → store.
Sets halacha_extraction_status='pending' for the chair to act on.
- reextract_halachot / reextract_metadata kept, but lazy-import their
extractors so the FastAPI path can't accidentally pull them in. They
are reachable only via the MCP tools precedent_extract_halachot /
precedent_extract_metadata, which run locally with CLI.
- Removed POST /api/precedent-library/{id}/extract-halachot and
/extract-metadata — they were dead ends from the container.
- Dropped the `anthropic` Python dep that the SDK fallback required.
- UI: removed the "refresh halachot" and "sparkles metadata" buttons
that called those endpoints. Edit sheet now points the chair at the
MCP tool names instead.
Halacha and metadata extraction for an uploaded precedent now happen
when the chair (via Claude Code) runs:
mcp__legal-ai__precedent_extract_metadata <case_law_id>
mcp__legal-ai__precedent_extract_halachot <case_law_id>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three fixes to the precedent library after the first end-to-end test on
403-17 surfaced runtime issues:
1. Anthropic SDK fallback in claude_session. The legal-ai Docker container
does not ship the `claude` CLI, so every halacha and metadata extraction
was failing with "Claude CLI not found." Module now tries the CLI first
(zero-cost local path) and falls back to the Anthropic SDK with
ANTHROPIC_API_KEY when the binary is absent. Default model is
claude-sonnet-4-6, overridable via CLAUDE_SDK_MODEL env. The system
message gets cache_control: ephemeral so multi-chunk runs reuse the
cached instruction prefix at ~10% read cost. Adds `anthropic` to
pyproject deps.
2. precedent_metadata_extractor crashed with KeyError because the JSON
example inside the prompt template contained literal { } characters
that str.format() interpreted as placeholders. Switched to f-string
concatenation; the prompt template no longer needs format() at all.
3. The library list query stayed stale after upload because the upload
mutation's onSuccess fires when the POST returns task_id, not when
SSE reports completion. Added a second invalidate inside the SSE
watcher in PrecedentUploadSheet so the new row appears with up-to-date
chunk and halachot counts the moment processing finishes.
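The KeyError from fix 2 is easy to reproduce in isolation. The template text below is invented for illustration; the failure mode (a literal JSON example inside a str.format() template) and the switch to concatenation are what the message describes.

```python
# A template whose JSON example contains literal braces.
TEMPLATE = 'Return JSON like {"case_name": "..."} for:\n{text}'


def format_fails() -> bool:
    """True when str.format() chokes on the literal braces."""
    try:
        TEMPLATE.format(text="ruling")
    except (KeyError, ValueError):
        return True
    return False


# The fix: keep the static part as a plain string and append the dynamic part.
PROMPT_HEAD = 'Return JSON like {"case_name": "..."} for:\n'


def build_prompt(text: str) -> str:
    return f"{PROMPT_HEAD}{text}"
```

Doubling the braces (`{{` and `}}`) would also work, but dropping format() removes the need to escape future edits to the example.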
Halacha and metadata extractors now route the long static prompt through
the new `system=` parameter so the SDK path actually caches it; the CLI
path concatenates and behaves as before.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three improvements to the precedent library based on usage feedback:
1. Auto-fill metadata at upload time. New service
precedent_metadata_extractor reads the ruling's full_text and
suggests case_name (short), summary, headnote, key_quote,
subject_tags, appeal_subtype. The merge policy fills only empty
fields, preserving everything the chair typed in the upload form.
Wired into the ingest pipeline; also exposed as a re-run endpoint
POST /api/precedent-library/{id}/extract-metadata for existing
records.
2. Edit sheet in the UI. Pencil icon on each library row opens a
pre-populated form covering every field. A Sparkles button on the
sheet runs the metadata extractor on demand and refreshes the
form. The case_number is read-only because halachot are FK'd to
it; renaming requires delete + re-upload.
3. Halacha extractor branches on is_binding. Sources marked binding
(Supreme/Administrative) keep the strict halacha prompt. Non-binding
sources (other appeals committees, district courts on planning
matters) get a different prompt that extracts applications,
interpretive principles, and persuasive conclusions — labeled with
new rule_types 'application' and 'persuasive'. A new fallback also
widens chunk selection: if the chunker labeled nothing as
legal_analysis/ruling/conclusion, we now run on all chunks rather
than returning zero halachot for a usable ruling.
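The branching and the widened chunk selection from item 3 can be sketched as follows. The `Chunk` type, the function names, and the prompt identifiers are hypothetical; only the three chunk labels and the is_binding switch come from the description above.

```python
from dataclasses import dataclass

# Labels the chunker uses for the sections worth mining.
ANALYSIS_LABELS = {"legal_analysis", "ruling", "conclusion"}


@dataclass
class Chunk:
    label: str
    text: str


def select_chunks(chunks: list[Chunk]) -> list[Chunk]:
    """Prefer analysis-type chunks; fall back to every chunk if none were labeled."""
    labeled = [c for c in chunks if c.label in ANALYSIS_LABELS]
    return labeled or list(chunks)


def pick_prompt(is_binding: bool) -> str:
    """Binding sources keep the strict prompt; others get the persuasive one."""
    # Prompt identifiers are placeholders, not the real prompt names.
    return "strict_halacha" if is_binding else "persuasive"
```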
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Reduce vertical padding, number font size, and inter-element gaps so
the four counters take less vertical real estate. Width unchanged.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a third corpus of legal authority distinct from style_corpus
(Daphna's prior decisions for voice) and case_precedents (chair-attached
quotes per case). The new corpus holds chair-uploaded court rulings and
other appeals committee decisions, with binding rules (הלכות) extracted
automatically and queued for chair approval.
Pipeline (web/app.py + services/precedent_library.py):
file → extract → chunk → Voyage embed → halacha_extractor → store +
publish progress over the existing Redis SSE channel.
Schema V7 (services/db.py): extends case_law with source_kind +
extraction status fields under a CHECK constraint pinning practice_area
to the three appeals committee domains (rishuy_uvniya, betterment_levy,
compensation_197). New precedent_chunks (vector(1024)) and halachot
tables (vector(1024) over rule_statement, IVFFlat indexes, gin on
practice_areas/subject_tags). Halachot start as pending_review; only
approved/published rows are visible to search_precedent_library.
Agents: legal-writer, legal-researcher, legal-analyst, legal-ceo,
legal-qa get search_precedent_library. legal-writer prompt explains
the three-corpus distinction and CREAC use; legal-qa now verifies that
every cited halacha resolves to an approved row in the corpus.
UI: /precedents page with four tabs — library / semantic search /
pending review (J/K nav, A/R/E shortcuts, badge count) / stats.
Reuses the existing upload-sheet progress + SSE pattern.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The list's scroll container set only overflow-y:auto, and CSS then computes
overflow-x to auto as well. Combined with the row's -mx-2 hover-background
extension, this surfaced an unwanted horizontal scrollbar.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Backend (cases listing)
- /api/cases: also return updated_at, created_at, practice_area,
appeal_subtype, subject. The detail-mode response was previously
dropping these even though db.list_cases reads them, leaving the
UI's "תחום" (practice area) and "עודכן" (updated) columns blank.
Frontend
- Split the home table into two: רישוי (licensing, 1xxx) and היטל השבחה
ופיצויים (betterment levy and compensation, 8xxx + 9xxx), bucketing on
appeal_subtype with a case-number-prefix fallback. The "תחום" column is
now redundant and removed.
- New AppealTypeBars chart in the right rail next to the existing
status donut.
- Donut: switch to a vertical layout (donut on top, legend below in a
3-col grid) so labels like "חדש / בעיבוד" ("new / in progress") no longer wrap inside the
320px sidebar; counts now align in a tabular column.
- CasesTable accepts emptyText/searchPlaceholder so each split table
has its own copy.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Raise HTTPException(503) when Paperclip DB is unreachable instead of
silently falling through to disk-only mode and returning [].
- Honor PAPERCLIP_SKILLS_DIR env var (falls back to ~/.paperclip/...).
In the Coolify container the host's skills dir is bind-mounted at
/paperclip-skills; without this, Path.home() resolved to /root/ and
the disk inventory was always empty.
Both bugs together silently turned a Paperclip DB outage into "no skills
installed" on the /skills page.
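A minimal sketch of the two behaviors, kept free of FastAPI so it runs standalone. `PaperclipUnavailable` stands in for `HTTPException(503)`, the function names are hypothetical, and the fallback directory is passed in rather than spelled out because the default path is abbreviated above.

```python
import os
from pathlib import Path


class PaperclipUnavailable(RuntimeError):
    """Stand-in for HTTPException(503) so the sketch has no FastAPI dependency."""


def resolve_skills_dir(default: Path) -> Path:
    """Honor PAPERCLIP_SKILLS_DIR when set, otherwise use the caller's default."""
    override = os.environ.get("PAPERCLIP_SKILLS_DIR")
    return Path(override) if override else default


def list_skills(fetch_db_rows, skills_dir: Path) -> dict:
    """Fail loudly on a DB outage instead of degrading to an empty list."""
    try:
        rows = fetch_db_rows()
    except ConnectionError as exc:
        raise PaperclipUnavailable("Paperclip DB unreachable") from exc
    on_disk = sorted(p.name for p in skills_dir.iterdir()) if skills_dir.is_dir() else []
    return {"db": rows, "disk": on_disk}
```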
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Paperclip heartbeat staleness gate (heartbeat.js evaluateQueuedRunStaleness)
cancels queued runs when issue.assigneeAgentId !== run.agentId, with error
"issue assignee changed before the queued run could start". Older Paperclip
versions auto-assigned on wakeup; the current version does not, so issues
created with NULL assignee silently never run.
Set assignee_agent_id to the company's CEO at INSERT time. Affects both the
project setup issue and the "התחל תהליך ניסוח" ("start drafting process") workflow issue.
Default 10mb caused upload-tagged 500s on scanned PDFs in case 1027-26
(Next 16 truncates body, FastAPI sees broken multipart, socket hang up).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The case repo is the user's backup, so anything in the dir must end up
on Gitea. Two layers:
1. Periodic sweep (every 30s): git_sync.sweep_loop runs as a FastAPI
background task. It scans every case dir, runs git status --porcelain
on each, and calls commit_and_push on any dirty tree with an auto-built
Hebrew message ("אוטו: טיוטות (2) · מסמכים", i.e. "auto: drafts (2) ·
documents"). Catches files written outside the API path: agent research
artefacts, manual edits, etc.
2. Explicit commits at known write paths — DOCX export, interim draft,
apply_user_edit, revise_draft, mark-final, analysis DOCX export.
These give immediate feedback with descriptive messages instead of
waiting up to 30s for the sweep.
safe.directory injection added to _git_env so sweep + explicit commits
work even when the running uid differs from the case-dir owner (host
runs vs. uniform-root container).
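The dirty check and the auto-built message can be sketched roughly as below. The function names and the English, folder-based message format are assumptions for illustration; the real sweep builds a Hebrew message and its grouping rules are not shown here.

```python
import subprocess
from collections import Counter
from pathlib import Path


def porcelain(case_dir: Path) -> str:
    """Raw `git status --porcelain` output; an empty string means a clean tree."""
    result = subprocess.run(
        ["git", "-C", str(case_dir), "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def summarize(status: str) -> str:
    """Group changed paths by top-level folder, e.g. 'auto: documents · drafts (2)'."""
    counts = Counter()
    for line in status.splitlines():
        # Each line is two status letters, a space, then the path (renames use 'old -> new').
        path = line[3:].split(" -> ")[-1].strip('"')
        counts[path.split("/")[0]] += 1
    parts = [f"{name} ({n})" if n > 1 else name for name, n in sorted(counts.items())]
    return "auto: " + " · ".join(parts)
```

A sweep loop would call `porcelain()` per case dir and only build a message and commit when the output is non-empty.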
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two latent issues surfaced today while watching the case 8174-24
end-to-end run, both worth documenting and engineering around because
they will recur on every future case.
Bug 1 — issue.released flips done→todo
After an agent successfully PATCHes its issue to "done", Paperclip's
internal issue.released action reverts the status to "todo" within
~30 seconds. This triggers a fresh wakeup of the same agent on a
task that is already complete.
Reproduced on CMPA-18 (2026-04-30):
18:14:57 agent PATCH → status: done
18:15:35 Paperclip → issue.released → status: todo
18:15:54 new researcher run started
The fix at the right altitude (Paperclip itself) is outside our repo.
Mitigation in HEARTBEAT.md §3 — when an agent boots and finds the
issue in `todo` while expected outputs (file, DB rows) already exist,
it must short-circuit: post a "no change" comment, PATCH back to done,
and exit. Costs ~$0.20 per false wakeup but breaks the loop.
Bug 2 — Bash backtick trap on long comment bodies
Researcher agent built a curl pipeline like:
curl ... -d "$(python3 -c "body = '''...
📁 קובץ מחקר: `/path/to/file.md`
'''")"
The backticks around the file path (markdown convention) get
evaluated by the OUTER bash $(...) as command substitution. Bash
then tries to exec /path/to/file.md, which is not executable, and
prints "Permission denied", a misleading error since the actual
file ownership is fine. The curl itself succeeded; the failed
substitution only added noise to the log.
Fix in HEARTBEAT.md §4א: long bodies must go via Write→tempfile
then `curl -d @file`. Avoids every shell quoting edge case.
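For agents that build the request from Python, the same temp-file pattern might look like the sketch below. The function name, the URL argument, and the `body` payload field are hypothetical; the point is that the markdown never passes through a shell, so backticks stay literal.

```python
import json
import tempfile


def curl_command(url: str, body_markdown: str) -> tuple[list[str], str]:
    """Write the JSON payload to a temp file and return (argv, path) for curl."""
    with tempfile.NamedTemporaryFile(
        "w", suffix=".json", delete=False, encoding="utf-8"
    ) as fh:
        # json.dump escapes newlines inside the string, so the file is a single line.
        json.dump({"body": body_markdown}, fh, ensure_ascii=False)
        path = fh.name
    argv = [
        "curl", "-sS", "-X", "POST", url,
        "-H", "Content-Type: application/json",
        "-d", f"@{path}",
    ]
    return argv, path
```

Passing `argv` as a list to `subprocess.run` avoids shell parsing entirely. Note that `curl -d @file` strips raw newlines from the file, which is harmless for a single-line JSON payload; `--data-binary @file` preserves them if the body is sent as raw text.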
Files:
• docs/paperclip-quirks.md — new. Full writeup of both bugs plus
two prior known-quirks (CEO auto-block in_progress, INSERT vs
API for wakeups). Each section: what happens, empirical evidence
from logs, impact, workaround, status.
• .claude/agents/HEARTBEAT.md — added the self-recovery section to
§3 and the temp-file pattern to §4א. The temp-file pattern is the
canonical answer for any agent posting markdown comments —
applies to all 7 agents in this skill set.
• CLAUDE.md — referenced the new doc from the docs index.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The auto-creation in case_create had two failure modes that combined to
make repos silently missing: a stale GITEA_TOKEN returning 401, and the
outer try/except in case_create that swallowed every exception with a
bare pass. Result: cases like 8174-24 ended up with a local git repo and
Paperclip project but no Gitea repo, with no signal anywhere.
_setup_gitea_remote now returns {ok, url, error} and never raises; the
result is attached to the case JSON and the FastAPI endpoint logs a
warning when ok=false. The UI gets a "צור ריפו ב-Gitea" ("create repo in Gitea") button on the
case header that appears only when the repo or remote is missing.
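The "return a result, never raise" contract can be sketched like this. `setup_remote` and the `create_repo` callable are hypothetical stand-ins for _setup_gitea_remote and its Gitea call; only the {ok, url, error} shape comes from the description above.

```python
from typing import Callable, Optional, TypedDict


class RemoteResult(TypedDict):
    ok: bool
    url: Optional[str]
    error: Optional[str]


def setup_remote(create_repo: Callable[[], str]) -> RemoteResult:
    """Run create_repo and report the outcome as data instead of an exception."""
    try:
        return {"ok": True, "url": create_repo(), "error": None}
    except Exception as exc:  # deliberately broad: callers get a result, never a raise
        return {"ok": False, "url": None, "error": f"{type(exc).__name__}: {exc}"}
```

Unlike a bare `pass`, the failure reason survives in `error`, so the caller can log it and the UI can decide whether to show the repair button.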
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two structural gaps in legal-researcher's "שלב 5: דיווח" ("step 5: reporting") surfaced while
auditing the case 8174-24 run:
1. **No DB linkage.** The skill told the researcher to post a comment
summarizing precedents but never to call mcp__legal-ai__precedent_attach.
The MCP tool itself wasn't even in the tools frontmatter — so even
a researcher that wanted to write to case_precedents physically
couldn't. Result: 0 rows in case_precedents after a successful
research run, even with 8 precedents identified and verified in
the comment text. The writer then has to grep free-text instead
of querying a structured table.
2. **No persisted file.** Research output existed only as a Paperclip
comment. The writer/QA can't `Read` it from disk; they have to go
through Paperclip API to fetch comment bodies. Compare to the
analyst, which is required to write `analysis-and-research.md`.
Fix:
• Added precedent_attach, precedent_list, precedent_search_library
to the tools frontmatter.
• Rewrote step 5 with explicit ordering: save to disk → attach
verified precedents to DB → update status → email → post comment.
• Documented the precedent_attach call signature inline (case_number,
citation, quote, section_id) so the agent doesn't have to reverse-
engineer it. Includes guidance on which precedents to attach
(verified with quote) vs which to leave for external verification.
Effect: future research runs will populate case_precedents and
data/cases/{N}/documents/research/precedent-research.md, both of which
the writer needs.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The retry loop bug we fixed in legal-analyst yesterday existed in every
single sub-agent skill. They all post a comment + wake the CEO + exit,
leaving their own issue in `in_progress`. Paperclip's "in_progress with
no live execution" watchdog then re-wakes them, repeating until something
external transitions the issue. Watched it happen on CMPA-17 (researcher)
today — 4 iterations + manual SIGTERM + manual PATCH.
Same fix applied to all 5 remaining agents:
• legal-researcher.md
• legal-writer.md
• legal-qa.md
• legal-exporter.md
• legal-proofreader.md (file was incomplete — also added the missing
שלב 5: דיווח and wake-CEO sections to bring it to parity with the
other agents)
Each gets a "סגור את ה-issue של עצמך — חובה!" ("close your own issue, mandatory!") section with two PATCH
templates: one for `done` after a successful run, one for `blocked` if
checks fail or output is incomplete. The section sits before the
wake-CEO block, with an explicit reference to the CMPA-17 incident so
the rule has a concrete anchor.
Result: every agent now has the same close-issue contract. No more
zombie in_progress issues, no more 4× wakeup loops.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two structural bugs surfaced while monitoring the fresh end-to-end
run on case 8174-24:
1. **No appraiser_facts extraction.** legal-analyst.md's "what to
extract" table didn't mention doc_type='appraisal' at all, and
`extract_appraiser_facts` wasn't in its tools frontmatter. The
CEO compounded this by writing in CMPA-16's body that all 3
appraisals were "reference materials, do not extract" — which
is correct for `extract_claims` but wrong for the appraisal-
specific extractor. Result: 0 appraiser_facts in DB after a
full run, even though the user had carefully tagged each
appraisal's `appraiser_side` (committee/appellant) precisely
so detect_conflicts could compare them.
2. **Issue stays in_progress, Paperclip retries forever.** Step 7
("שמירה ודיווח", "save and report") instructed the analyst to update
the *case* status, post a comment, send email, and wake the CEO, but
never to PATCH the issue itself to `done`. Paperclip's
"in_progress with no live execution" watchdog then re-woke the
analyst, which posted "I'm done" again, which triggered
another wakeup. We saw three iterations on CMPA-16 before the
issue finally transitioned. The PATCH pattern was already
documented in HEARTBEAT.md §4ב — the analyst skill just never
referenced it.
Changes:
• legal-analyst.md
- Added mcp__legal-ai__extract_appraiser_facts to tools list.
- Rewrote the "what to extract" table to use doc_type as the
key column and added an `appraisal` row + a callout explaining
why it goes through a different extractor.
- Added explicit step 5 "חלץ עובדות שמאי" ("extract appraiser facts") with the call.
- Step 7 now PATCHes the issue to `done` (or `blocked` on
failure) before waking the CEO. Refers to the actual incident
so the rule has a concrete anchor.
- Cleaned up the chunking guidance — phase 1 of claude_session
already handles big docs automatically; no need to manually
split.
• legal-ceo.md (analyst issue template section)
- Replaced the generic "list of docs not to extract from" with a
per-doc_type action table that explicitly says
`appraisal → extract_appraiser_facts (NOT extract_claims)`.
- Added an explicit guard: "for every appraisal in the case,
verify the issue body says to run extract_appraiser_facts —
otherwise the writer gets a numbers-free block ז".
- Added the close-the-issue-with-PATCH instruction so the CEO
knows to write that into every analyst issue.
These edits don't affect the run currently in flight (the CEO's
prompt was already cached and the analyst already ran). They take
effect on the next analyst invocation.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Without a primary workspace on a project, the "סביבות עבודה" ("workspaces") tab in
Paperclip stays hidden (gate: enableIsolatedWorkspaces && S0t list
non-empty), and agents wake with cwd=`/home/chaim` instead of the
legal-ai source tree. New helper inserts a primary workspace pointing at
LEGAL_AI_WORKSPACE_CWD (default /home/chaim/legal-ai) on both new and
legacy/existing-project paths. Idempotent — skips if any workspace row
already exists.
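The idempotent insert can be sketched as below. SQLite stands in for the Paperclip DB so the example runs standalone, and the `project_workspaces` table, its columns, and the function name are hypothetical; only the env var, its default, and the skip-if-any-row rule come from the description above.

```python
import os
import sqlite3


def ensure_primary_workspace(conn: sqlite3.Connection, project_id: str, cwd=None) -> bool:
    """Insert a primary workspace unless the project already has any workspace row."""
    cwd = cwd or os.environ.get("LEGAL_AI_WORKSPACE_CWD", "/home/chaim/legal-ai")
    exists = conn.execute(
        "SELECT 1 FROM project_workspaces WHERE project_id = ? LIMIT 1",
        (project_id,),
    ).fetchone()
    if exists:
        return False  # idempotent: leave existing workspaces untouched
    conn.execute(
        "INSERT INTO project_workspaces (project_id, cwd, is_primary) VALUES (?, ?, 1)",
        (project_id, cwd),
    )
    conn.commit()
    return True
```

Returning a bool lets the new-project and legacy-project paths share the helper and log whether a row was actually created.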
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Captures the full deletion procedure we worked out empirically while
wiping case 8174-24 for a clean rerun. Covers all four systems where
case state lives, in dependency order:
1. legal-ai DB + on-disk dir — DELETE /api/cases?remove_files=true
(now actually works after 903fb4d added the missing db.delete_case)
2. Paperclip DB — no API; raw SQL with explicit FK-blocker ordering
(issue_comments, cost_events, finance_events, feedback_votes,
issue_inbox_archives, issue_read_states must go before issues;
heartbeat_runs.wakeup_request_id must be NULLed before
agent_wakeup_requests can be deleted)
3. Gitea — DELETE /api/v1/repos/cases/{N}
4. Verification queries for each system
Two gotchas worth highlighting in the doc:
• The case directory inside /data/cases is owned by root because the
container runs as root — host-side rm needs sudo, or use the API
(rmtree happens inside the container).
• Paperclip projects are referenced via name LIKE '%{N}%' since
there's no slug column. Stricter matching is recommended if N
appears in multiple project names.
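One possible form of stricter matching, as a sketch: fetch candidates with the LIKE filter, then keep only names where the case number is not embedded in a longer number. The function name is hypothetical and the boundary rule (no adjacent digit or hyphen) is an assumption about how case numbers appear in project names.

```python
import re


def project_matches(case_number: str, project_name: str) -> bool:
    """True when case_number appears in project_name and is not part of a longer number."""
    # Reject matches that are directly preceded or followed by a digit or hyphen.
    pattern = rf"(?<![\d-]){re.escape(case_number)}(?![\d-])"
    return re.search(pattern, project_name) is not None
```

Filtering the LIKE results through a check like this before running any DELETE guards against wiping a project for a different case whose number merely contains N.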
Linked from legal-ai/CLAUDE.md docs index. A future scripts/delete-case.sh
that automates the runbook with a confirmation prompt is noted as TODO
inside the runbook itself.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>