Commit Graph

4 Commits

Author SHA1 Message Date
d4496b96f1 fix(mcp): eliminate "No such tool available" race at agent wakeup
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m26s
When Paperclip wakes the CEO and the model issues an mcp__legal-ai__*
call within ~10s of session init, Claude Code sometimes returns
"No such tool available" because the legal-ai MCP server hasn't
finished bringing up its tool catalog yet. Observed twice today on
CMPA precedent-extraction wakeups (sessions 9989fbaf and a9c61801);
the agent fell back to bash + .venv/bin/python and finished the work,
but the race needed fixing on the server side.

Three changes that close the window:

1. Lazy schema init (services/db.py + server.py)
   `init_schema()` was awaited inside the FastMCP lifespan, blocking
   the `initialize`/`tools/list` handshake until ~10 CREATE TABLE IF
   NOT EXISTS statements ran. Under contention (two CEOs waking at
   once for different companies) this stretched. Now the lifespan
   returns immediately and `get_pool()` runs the schema migrations
   exactly once on first DB access, guarded by an asyncio.Lock.
   tools/list is answered in milliseconds regardless of DB state.

2. Lazy heavy imports
   - services/embeddings.py: voyageai (~450ms) loaded only inside
     _get_client()
   - services/extractor.py: google.cloud.vision (~550ms) loaded only
     inside _get_vision_client() and _ocr_with_google_vision()
   These two were being imported at module top from
   legal_mcp.tools.documents -> services.processor -> services.{
   extractor,embeddings}, so the FastMCP server couldn't even start
   responding until both finished. Cold start dropped from 2.7s to
   1.17s end-to-end (init + tools/list response).

3. Agent-side warmup + retry guidance (.claude/agents/legal-ceo.md)
   Even with a fast server, the model can still race on the very
   first call. The precedent-extraction section now tells the CEO
   to call workflow_status as a warmup probe and to retry after a
   short sleep if it sees "No such tool available", before falling
   back to the python bypass.

Also expanded the precedent-tool whitelists on the sub-agents that
delegate halacha/library work (commits 4a9a6b7 + 7ee90dc added the
tools to the MCP server but only the CEO got them in its allowed
list). Added to: legal-researcher (full extraction set), legal-analyst
(library_get/list + halacha review), legal-writer (library lookups +
halacha_review), legal-qa (library_get + halacha_review), and the two
that the CEO was already missing (halacha_review, halachot_pending).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 20:23:14 +00:00
242f668319 feat(retrieval): add voyage-multimodal-3 page-image embeddings (feature flag)
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m50s
Stage C: per-page image embeddings via voyage-multimodal-3 + hybrid
text+image search. Off by default; enable with MULTIMODAL_ENABLED=true.

- Schema V9: document_image_embeddings + precedent_image_embeddings
  (vector(1024), page_number, image_thumbnail_path)
- extractor.render_pages_for_multimodal renders PDF pages at
  MULTIMODAL_DPI (144) for embedding + JPEG thumbnails at
  MULTIMODAL_THUMB_DPI (96) for UI preview, in one pass
- embeddings.embed_images calls voyage-multimodal-3 in 50-page batches
- services/hybrid_search.py orchestrator: rerank applied to text side
  first (rerank-2 is text-only); image side cosine; weighted merge
  with text_weight 0.65 (env-tunable); image-only pages surface as
  match_type='image' so dense scanned content still appears
- processor.process_document and precedent_library.ingest_precedent
  gated by flag — non-fatal on multimodal failure
- scripts/multimodal_backfill.py — idempotent per-case CLI to embed
  existing documents without re-extracting text

Validated locally on a 5-page response brief: render 0.31s, embed 8.32s,
hybrid merge surfaces image rows correctly. Production rollout starts
with flag=false (no behavior change), then per-case A/B.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 19:24:52 +00:00
26c3fddf41 feat(retrieval): add voyage rerank-2 cross-encoder stage (feature flag)
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m29s
Stage B of voyage-upgrades-plan rewritten: instead of context-3 (which
4 POCs showed inconsistent improvement), add a cross-encoder rerank
layer on top of voyage-3. Default off (VOYAGE_RERANK_ENABLED=false).

POC validation (785-doc corpus, 12 queries, claude-haiku-4-5 judge):
- mean@3 +4.5% (4.306 → 4.500)
- practical-category queries +11.6% (3.78 → 4.22)
- latency +702ms per query
- no schema change, no re-embed, no double storage

Plumbing:
- config: VOYAGE_RERANK_ENABLED / _MODEL / _FETCH_K env vars
- embeddings.voyage_rerank() wraps voyageai client.rerank
- services/rerank.py: maybe_rerank() helper — fetches FETCH_K candidates
  via the bi-encoder then reranks to top-K. Fail-open if Voyage rerank is
  unavailable.
- tools/search.py: search_decisions, search_case_documents,
  find_similar_cases all wrapped
- services/precedent_library.search_library wrapped

Smoke-tested locally with flag on/off — produces expected behaviour and
latency profile. Ready for production rollout via Coolify env flip after
deploy.

POCs (kept under scripts/ for reference):
- voyage_context3_poc{_long}.py — context-3 evaluation (rejected)
- voyage_multimodal_poc.py — multimodal-3 (stage C, deferred)
- voyage_rerank_judge_poc.py — single-case rerank benchmark
- voyage_rerank_corpus_poc.py — full-corpus rerank validation

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 18:43:41 +00:00
6f515dc2cb Initial commit: MCP server + web upload interface
Ezer Mishpati - AI legal decision drafting system with:
- MCP server (FastMCP) with document processing pipeline
- Web upload interface (FastAPI) for file upload and classification
- pgvector-based semantic search
- Hebrew legal document chunking and embedding
2026-03-23 12:33:07 +00:00