Practice area separation: multi-tenant axis across DB, RAG, and UI

Adds two orthogonal columns — practice_area (top-level legal domain:
appeals_committee / national_insurance / labor_law) and appeal_subtype
(building_permit / betterment_levy / compensation_197) — denormalized
into cases, documents, document_chunks, decisions, and style_corpus so
vector searches can filter without JOINs.
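
A minimal sketch of what filtering on the denormalized columns could
look like. The helper name build_area_filter, the $n placeholder style,
and the shape of the example query are assumptions made for
illustration; they are not taken from this commit.

```python
from typing import Optional


def build_area_filter(
    practice_area: Optional[str] = None,
    appeal_subtype: Optional[str] = None,
    first_param: int = 2,
) -> tuple[str, list[str]]:
    """Return (sql_fragment, params) for the optional separation filters.

    Hypothetical helper: placeholders are numbered from `first_param`
    because $1 is assumed to be taken by the query embedding.
    """
    clauses: list[str] = []
    params: list[str] = []
    for column, value in (
        ("practice_area", practice_area),
        ("appeal_subtype", appeal_subtype),
    ):
        if value:  # empty string / None means "do not filter on this column"
            params.append(value)
            clauses.append(f"{column} = ${first_param + len(params) - 1}")
    fragment = (" AND " + " AND ".join(clauses)) if clauses else ""
    return fragment, params


# Both columns live on the chunk row itself, so no JOIN to cases is needed.
fragment, params = build_area_filter("appeals_committee", "building_permit")
query = "SELECT id FROM document_chunks WHERE embedding IS NOT NULL" + fragment
```

Because the fragment is empty when neither filter is given, callers that
do not know the case's domain keep the old unfiltered behaviour.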

Why: the system handles two unrelated sub-domains under the same
appeals committee (1xxx building-permit cases, and 8xxx/9xxx
betterment_levy / compensation_197 cases), each with its own rules and
writing style. Without a separation axis,
search_similar() and the block-writer's precedent lookup were free to
surface betterment-levy paragraphs while drafting a building-permit
decision — a real risk of cross-domain contamination. The same axis
also lets future domains (national insurance, labor law) coexist
without separate schemas.

Schema (V4 migration in db.py):
- ALTER ... ADD COLUMN IF NOT EXISTS on all five tables + composite
  indexes (practice_area first).
- Idempotent backfill: case_number ~ '^1' → building_permit, '^8' →
  betterment_levy, '^9' → compensation_197; propagated to documents,
  chunks, and decisions via case_id; training-corpus rows (case_id NULL)
  default to appeals_committee.
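
The migration steps above could be generated roughly as follows. This is
a sketch under stated assumptions: the table list and prefix rules come
from this message, while the TEXT column type, the index names, the
cases.id key, and the IS NULL guards used for idempotency are guesses,
not the contents of db.py.

```python
# Table names and prefix rules are taken from the commit message; column
# types, index names, and the IS NULL guards are assumptions of this sketch.
TABLES = ("cases", "documents", "document_chunks", "decisions", "style_corpus")
CHILD_TABLES = ("documents", "document_chunks", "decisions")
PREFIX_RULES = (
    ("^1", "building_permit"),
    ("^8", "betterment_levy"),
    ("^9", "compensation_197"),
)


def v4_statements() -> list[str]:
    """Build PostgreSQL statements that are safe to run more than once."""
    stmts: list[str] = []
    for table in TABLES:
        for column in ("practice_area", "appeal_subtype"):
            stmts.append(
                f"ALTER TABLE {table} ADD COLUMN IF NOT EXISTS {column} TEXT"
            )
        # Composite index with practice_area as the leading column.
        stmts.append(
            f"CREATE INDEX IF NOT EXISTS idx_{table}_area_subtype "
            f"ON {table} (practice_area, appeal_subtype)"
        )
    # Backfill cases from the case-number prefix, touching only untagged rows.
    for pattern, subtype in PREFIX_RULES:
        stmts.append(
            "UPDATE cases SET practice_area = 'appeals_committee', "
            f"appeal_subtype = '{subtype}' "
            f"WHERE appeal_subtype IS NULL AND case_number ~ '{pattern}'"
        )
    # Propagate from the parent case to child rows via case_id.
    for table in CHILD_TABLES:
        stmts.append(
            f"UPDATE {table} AS t SET practice_area = c.practice_area, "
            "appeal_subtype = c.appeal_subtype FROM cases AS c "
            "WHERE t.case_id = c.id AND t.practice_area IS NULL"
        )
    # Training-corpus rows have no parent case; give them the default domain.
    for table in CHILD_TABLES:
        stmts.append(
            f"UPDATE {table} SET practice_area = 'appeals_committee' "
            "WHERE case_id IS NULL AND practice_area IS NULL"
        )
    return stmts
```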

Code:
- New services/practice_area.py with derive_subtype, validate, and
  is_override + enum constants.
- db.create_case / create_document / store_chunks / create_decision
  inherit practice_area from the parent case (or take an explicit
  override for the case_id=None training corpus).
- db.search_similar and search_similar_paragraphs accept practice_area
  + appeal_subtype filters using the denormalized columns.
- tools/search.py auto-resolves the filter from case_number when given.
- block_writer._build_precedents_context now passes the active case's
  practice_area to search_similar_paragraphs — closes the contamination
  hole for the discussion-block precedent fetch.
- tools/cases.case_create auto-derives subtype from case_number; an
  explicit override that disagrees writes a case_subtype_override entry
  to audit_log so we can spot bad classifications later.
- tools/documents.document_upload_training tags new training material
  with practice_area + subtype end-to-end (corpus, document, chunks).

UI (web/static/index.html + web/app.py):
- New-case wizard gets a practice_area dropdown (the national_insurance
  and labor_law options stay disabled until those domains arrive) and an
  appeal_subtype dropdown with JS auto-fill from the case-number prefix;
  a manual selection is not overwritten by later auto-fill.
- Case header shows a blue badge with practice_area · subtype.
- CaseCreateRequest plumbs both fields through to cases_tools.case_create.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 16:36:48 +00:00
parent a8b79822bf
commit 26d09d648f
8 changed files with 468 additions and 34 deletions

@@ -8,7 +8,7 @@ from pathlib import Path
from uuid import UUID
from legal_mcp import config
-from legal_mcp.services import db
+from legal_mcp.services import audit, db, practice_area as pa
async def case_create(
@@ -23,6 +23,8 @@ async def case_create(
hearing_date: str = "",
notes: str = "",
expected_outcome: str = "",
practice_area: str = "appeals_committee",
appeal_subtype: str = "",
) -> str:
"""Create a new appeal case.
@@ -38,6 +40,9 @@ async def case_create(
hearing_date: hearing date (YYYY-MM-DD)
notes: notes
expected_outcome: expected outcome (rejection/partial_acceptance/full_acceptance/betterment_levy)
practice_area: legal domain (appeals_committee / national_insurance / labor_law)
appeal_subtype: appeal type (building_permit / betterment_levy / compensation_197).
empty = derived automatically from the case number
"""
from datetime import date as date_type
@@ -45,6 +50,12 @@ async def case_create(
if hearing_date:
h_date = date_type.fromisoformat(hearing_date)
# Resolve appeal_subtype: explicit override > auto-derive > 'unknown'
derived_subtype = pa.derive_subtype(case_number, practice_area)
if not appeal_subtype:
appeal_subtype = derived_subtype
pa.validate(practice_area, appeal_subtype)
case = await db.create_case(
case_number=case_number,
title=title,
@@ -57,8 +68,24 @@ async def case_create(
hearing_date=h_date,
notes=notes,
expected_outcome=expected_outcome,
practice_area=practice_area,
appeal_subtype=appeal_subtype,
)
# If the user overrode the case-number convention (e.g. case 8500 marked
# as building_permit), record it so we can audit later.
if pa.is_override(case_number, practice_area, appeal_subtype):
await audit.log_action(
action="case_subtype_override",
case_id=UUID(case["id"]),
details={
"case_number": case_number,
"derived_subtype": derived_subtype,
"chosen_subtype": appeal_subtype,
"practice_area": practice_area,
},
)
# Initialize git repo for the case
case_dir = config.find_case_dir(case_number)
case_dir.mkdir(parents=True, exist_ok=True)