feat(retrieval): add voyage rerank-2 cross-encoder stage (feature flag)
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m29s
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m29s
Stage B of voyage-upgrades-plan rewritten: instead of context-3 (which
4 POCs showed inconsistent improvement), add a cross-encoder rerank
layer on top of voyage-3. Default off (VOYAGE_RERANK_ENABLED=false).
POC validation (785-doc corpus, 12 queries, claude-haiku-4-5 judge):
- mean@3 +4.5% (4.306 → 4.500)
- practical-category queries +11.6% (3.78 → 4.22)
- latency +702ms per query
- no schema change, no re-embed, no double storage
Plumbing:
- config: VOYAGE_RERANK_ENABLED / _MODEL / _FETCH_K env vars
- embeddings.voyage_rerank() wraps voyageai client.rerank
- services/rerank.py: maybe_rerank() helper — fetches FETCH_K candidates
via the bi-encoder then reranks to top-K. Fail-open if Voyage rerank is
unavailable.
- tools/search.py: search_decisions, search_case_documents,
find_similar_cases all wrapped
- services/precedent_library.search_library wrapped
Smoke-tested locally with flag on/off — produces expected behaviour and
latency profile. Ready for production rollout via Coolify env flip after
deploy.
POCs (kept under scripts/ for reference):
- voyage_context3_poc{_long}.py — context-3 evaluation (rejected)
- voyage_multimodal_poc.py — multimodal-3 (stage C, deferred)
- voyage_rerank_judge_poc.py — single-case rerank benchmark
- voyage_rerank_corpus_poc.py — full-corpus rerank validation
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -40,7 +40,7 @@ Local (developer machine, pm2):
|
||||
|
||||
External:
|
||||
← Claude API (Opus 4.7 for agents)
|
||||
← Voyage AI (voyage-3-large, 1024-dim embeddings)
|
||||
← Voyage AI (voyage-3, 1024-dim embeddings)
|
||||
← Infisical (secret management)
|
||||
← Gmail SMTP (agent notifications)
|
||||
```
|
||||
@@ -59,7 +59,7 @@ External:
|
||||
- מפעיל OCR (Google Vision) אם PDF ללא טקסט
|
||||
- מריץ proofreader להסרת artifacts מ-Nevo
|
||||
- מחלץ טקסט ל-`documents.extracted_text`
|
||||
- מפצל ל-chunks של ~500 מילים, מחשב embeddings (voyage-3-large, 1024D), שומר ב-`document_chunks`
|
||||
- מפצל ל-chunks של ~500 מילים, מחשב embeddings (voyage-3, 1024D), שומר ב-`document_chunks`
|
||||
4. סטטוס תיק: `new` → `proofread`
|
||||
|
||||
### שלב 2 — ניתוח משפטי (legal-researcher + analyst)
|
||||
@@ -223,7 +223,7 @@ legal-qa מריץ 6 בדיקות איכות:
|
||||
`case_law`, `statutory_provisions`, `transition_phrases`, `lessons_learned`, `style_corpus`, `style_patterns`
|
||||
|
||||
### Layer 4: Semantic Search (RAG)
|
||||
`document_embeddings`, `paragraph_embeddings`, `case_law_embeddings` (pgvector 1024-dim, voyage-3-large)
|
||||
`document_embeddings`, `paragraph_embeddings`, `case_law_embeddings` (pgvector 1024-dim, voyage-3)
|
||||
|
||||
### Layer 5 — Multi-tenancy
|
||||
`companies`, `tag_company_mappings` (appeal_subtype → company_id)
|
||||
@@ -283,7 +283,9 @@ legal-qa מריץ 6 בדיקות איכות:
|
||||
## טכנולוגיות עיקריות
|
||||
|
||||
- **Database**: PostgreSQL 15 + pgvector 0.8.1
|
||||
- **Embeddings**: Voyage AI (`voyage-3-large`, 1024-dim)
|
||||
- **Embeddings**: Voyage AI (`voyage-3`, 1024-dim) + cross-encoder rerank (`rerank-2`)
|
||||
- bi-encoder: voyage-3 לכל chunk (חד-פעמי בעת ingestion)
|
||||
- cross-encoder: rerank-2 לכל query (top-50 → top-K), feature flag `VOYAGE_RERANK_ENABLED`
|
||||
- **Agents**: Claude Opus 4.7 (via Paperclip pm2)
|
||||
- **DOCX manipulation**: `python-docx` 1.2+ ו-`lxml` 5.2+ (XML surgery)
|
||||
- **Frontend**: Next.js + TanStack Query + Tailwind
|
||||
|
||||
Reference in New Issue
Block a user