Benchmark on case 1130-25 (4 Hebrew legal docs, 8 queries) showed:
- voyage-law-2: avg top-1 score 0.5839 (+27% over voyage-3-large)
- voyage-4-large: avg top-1 score 0.4119 (worse than current)
- voyage-3-large: avg top-1 score 0.4589 (baseline)
voyage-law-2 costs ~4.6x more per run but delivers significantly
better retrieval quality for Hebrew legal text. Model is now
configurable via VOYAGE_MODEL env var.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>