legal-ai


Chaim 6aaca14e31 Replace Claude Vision OCR with Google Cloud Vision

Benchmark results on Hebrew legal docs (case 1130-25):
- Google Vision: 1s/page, $0.001/page, high accuracy
- Claude Opus Vision: 90s/page, $0.05/page, poor accuracy
- Broken OCR text layers returned by PyMuPDF are now detected via a text quality check
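
A minimal sketch of what such a quality check could look like, using the three signals named in this commit (word length, words per line, Hebrew ratio). The function names and every threshold below are assumptions for illustration; the actual values in extractor.py are not stated in this message.

```python
def text_quality(text: str) -> dict:
    """Compute three simple signals for an extracted text layer."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    words = text.split()
    letters = [ch for ch in text if ch.isalpha()]
    # U+0590..U+05FF is the Unicode Hebrew block.
    hebrew = [ch for ch in letters if "\u0590" <= ch <= "\u05FF"]
    return {
        "avg_word_len": sum(len(w) for w in words) / len(words) if words else 0.0,
        "words_per_line": len(words) / len(lines) if lines else 0.0,
        "hebrew_ratio": len(hebrew) / len(letters) if letters else 0.0,
    }


def looks_broken(
    text: str,
    min_word_len: float = 2.0,        # hypothetical threshold
    min_words_per_line: float = 2.0,  # hypothetical threshold
    min_hebrew_ratio: float = 0.5,    # hypothetical threshold
) -> bool:
    """Return True when any signal falls below its threshold."""
    m = text_quality(text)
    return (
        m["avg_word_len"] < min_word_len
        or m["words_per_line"] < min_words_per_line
        or m["hebrew_ratio"] < min_hebrew_ratio
    )
```

A page whose embedded text layer fails a check like this would presumably be routed to OCR instead of being trusted as-is. Empty text scores zero on all three signals, so it is treated as broken.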

Changes:
- extractor.py: Google Vision OCR with Hebrew language hint (300 DPI)
- extractor.py: text quality detection (word length, words-per-line, Hebrew ratio)
- extractor.py: Hebrew abbreviation quote fixer (15 known patterns)
- config.py: add GOOGLE_CLOUD_VISION_API_KEY, remove ANTHROPIC_API_KEY
- pyproject.toml: add google-cloud-vision, remove anthropic
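
The abbreviation quote fixer could be sketched roughly as follows. Hebrew abbreviations carry a double-quote mark before the last letter, and OCR output may render it as two apostrophes, a typographic quote, or a gershayim character. The four abbreviations and the set of quote variants below are a hypothetical sample, not the 15 patterns from extractor.py, and `fix_abbreviation_quotes` is an assumed name.

```python
import re

# Hypothetical sample; the commit's 15 patterns are not listed in this message.
KNOWN_ABBREVIATIONS = ['עו"ד', 'בע"מ', 'ע"י', 'ת"א']

# Quote-like sequences assumed to appear in place of the ASCII double quote:
# two apostrophes, ASCII ", gershayim (U+05F4), and curly/low quotes.
QUOTE_VARIANTS = "(?:''|[\"\u05F4\u201C\u201D\u201E])"


def _build_pattern(abbr: str) -> re.Pattern:
    """Match the abbreviation with any quote variant and an optional stray space."""
    head, tail = abbr.split('"')
    # No leading boundary, so prefixed forms (e.g. with a preposition letter)
    # are fixed too; the trailing lookahead stops matches inside longer words.
    return re.compile(
        re.escape(head) + " ?" + QUOTE_VARIANTS + " ?" + re.escape(tail) + r"(?!\w)"
    )


_PATTERNS = [(_build_pattern(a), a) for a in KNOWN_ABBREVIATIONS]


def fix_abbreviation_quotes(text: str) -> str:
    """Rewrite known abbreviations to their canonical ASCII-quote form."""
    for pattern, canonical in _PATTERNS:
        text = pattern.sub(canonical, text)
    return text
```

Restricting the rewrite to a fixed list of abbreviations keeps the fixer from touching genuine quotations elsewhere in the text, at the cost of missing abbreviations that are not on the list.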

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

2026-04-08 20:17:58 +00:00

src/legal_mcp    Replace Claude Vision OCR with Google Cloud Vision    2026-04-08 20:17:58 +00:00
pyproject.toml   Replace Claude Vision OCR with Google Cloud Vision    2026-04-08 20:17:58 +00:00