Chaim 4892fb6e8f
fix(extractor): apply Hebrew quote fixer to direct PDF extraction path
Born-digital Hebrew PDFs from legal software often encode gershayim (״)
as double-yod (יי), producing the same corruption patterns as OCR.
The fixer was only called after Google Cloud Vision OCR, so digitally
created PDFs that passed quality checks received no correction.

Changes:
- Apply _fix_hebrew_quotes() in the direct extraction path
- Add 'בליימ' → 'בל"מ' (בקשה להארכת מועד — systematic corruption in 1017-03-26)
- Add 'תמייא' → 'תמ"א' (תכנית מתאר ארצית)
- Update docstring to reflect the broader scope
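
A minimal sketch of the idea behind these changes, not the actual code in
this commit. Only the two replacement pairs come from the commit message;
the names KNOWN_FIXES, fix_hebrew_quotes_sketch and extract_text_sketch, the
regex guard against longer words, and the shape of the extraction function
are assumptions made for illustration.

```python
import re

# Hypothetical replacement table. Only these two pairs are taken from the
# commit message; the real table presumably contains more entries.
KNOWN_FIXES = {
    "בליימ": 'בל"מ',   # double-yod where the abbreviation mark belongs
    "תמייא": 'תמ"א',
}

# Hebrew letters alef through tav, used to avoid matching inside longer words.
_HEBREW_LETTERS = "\u05d0-\u05ea"


def fix_hebrew_quotes_sketch(text: str) -> str:
    """Replace known double-yod corruptions with the quoted abbreviation."""
    for corrupted, fixed in KNOWN_FIXES.items():
        # Assumption: the corrupted form must not be followed by another
        # Hebrew letter. Prefixed forms (preposition + abbreviation) still match.
        pattern = re.escape(corrupted) + rf"(?![{_HEBREW_LETTERS}])"
        text = re.sub(pattern, lambda _m, fixed=fixed: fixed, text)
    return text


def extract_text_sketch(direct_text: str, passes_quality: bool, run_ocr) -> str:
    """Apply the fixer on both paths, not only after OCR.

    run_ocr is a hypothetical zero-argument callable standing in for the
    OCR fallback.
    """
    raw = direct_text if passes_quality else run_ocr()
    return fix_hebrew_quotes_sketch(raw)
```

The point of the sketch is that the fixer runs on the returned text whether
it came from direct extraction or from the OCR fallback.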

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-25 15:59:39 +00:00
Description
AI Legal Decision Drafting System — MCP server, web upload, RAG search