Install git in Docker image and wrap all subprocess git calls in
try/except so a missing or failing git binary never kills an upload
that already succeeded.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds two orthogonal columns — practice_area (top-level legal domain:
appeals_committee / national_insurance / labor_law) and appeal_subtype
(building_permit / betterment_levy / compensation_197) — denormalized
into cases, documents, document_chunks, decisions, and style_corpus so
vector searches can filter without JOINs.
Why: the system handles two unrelated sub-domains under the same
appeals committee (1xxx building permits and 8xxx/9xxx betterment/197),
with different rules and writing style. Without a separation axis,
search_similar() and the block-writer's precedent lookup were free to
surface betterment-levy paragraphs while drafting a building-permit
decision — a real risk of cross-domain contamination. The same axis
also lets future domains (national insurance, labor law) coexist
without separate schemas.
Schema (V4 migration in db.py):
- ALTER ... ADD COLUMN IF NOT EXISTS on all five tables + composite
indexes (practice_area first).
- Idempotent backfill: case_number ~ '^1' → building_permit, '^8' →
betterment_levy, '^9' → compensation_197; propagated to documents,
chunks, and decisions via case_id; training-corpus rows (case_id NULL)
default to appeals_committee.
Code:
- New services/practice_area.py with derive_subtype, validate, and
is_override + enum constants.
- db.create_case / create_document / store_chunks / create_decision
inherit practice_area from the parent case (or take an explicit
override for the case_id=None training corpus).
- db.search_similar and search_similar_paragraphs accept practice_area
+ appeal_subtype filters using the denormalized columns.
- tools/search.py auto-resolves the filter from case_number when given.
- block_writer._build_precedents_context now passes the active case's
practice_area to search_similar_paragraphs — closes the contamination
hole for the discussion-block precedent fetch.
- tools/cases.case_create auto-derives subtype from case_number; an
explicit override that disagrees writes a case_subtype_override entry
to audit_log so we can spot bad classifications later.
- tools/documents.document_upload_training tags new training material
with practice_area + subtype end-to-end (corpus, document, chunks).
UI (web/static/index.html + web/app.py):
- New-case wizard gets a practice_area dropdown (others disabled until
national_insurance / labor_law arrive) and an appeal_subtype dropdown
with JS auto-fill from the case-number prefix; manual edits stick.
- Case header shows a blue badge with practice_area · subtype.
- CaseCreateRequest plumbs both fields through to cases_tools.case_create.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove cases/new|in-progress|completed subdivision (status managed in DB)
- Rename documents/original → documents/originals (consistent plural)
- Move exports from global data/exports/ into cases/{num}/exports/
- Add documents/research/ for case law and analysis files
- Update all agents, scripts, config, web API endpoints, and DB paths
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace single CASES_DIR with find_case_dir() that searches across
all status directories. New cases created in cases/new/{number}/.
Config: CASES_BASE, CASES_NEW, CASES_IN_PROGRESS, CASES_COMPLETED
Docker: added -v /home/chaim/legal-ai/cases:/cases volume mount
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ezer Mishpati - AI legal decision drafting system with:
- MCP server (FastMCP) with document processing pipeline
- Web upload interface (FastAPI) for file upload and classification
- pgvector-based semantic search
- Hebrew legal document chunking and embedding