Commit Graph

3 Commits

Author SHA1 Message Date
28f49defff LLM session: async, 30min timeout, semantic chunking + parallel
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m28s
The claude_session bridge had two structural defects that made any
non-trivial document extraction unreliable:

  1. subprocess.run() blocks the asyncio event loop in the MCP server
     for the full duration of every LLM call (60-180s typical).
  2. The 120-second timeout was below the cold-cache cost of any
     document over ~12K Hebrew characters. Three back-to-back timeouts
     on case 8174-24 dropped 43 appellant claims on the floor.

Phase 1 of the remediation plan — keeps claude_session as the engine
(no Anthropic API switch) and restructures around it:

claude_session.py
  • query / query_json are now async — asyncio.create_subprocess_exec
    instead of subprocess.run, so MCP server can serve other coroutines
    while a call is in flight.
  • DEFAULT_TIMEOUT 120 → 1800 (30 min). High enough that no realistic
    document hits it; bounded so a runaway never zombifies forever.
  • LONG_TIMEOUT 300 → 3600 for opus block writing on full case context.
  • TimeoutError now actually kills the subprocess (asyncio.wait_for
    cancellation alone leaves the child running).

claims_extractor.py
  • _split_by_sections: chunks at numbered sections / Hebrew letter
    headings / "פרק" markers / markdown ##, falls back to paragraph
    breaks, then to hard splits. Targets 12K chars per chunk — small
    enough that each chunk reliably finishes inside the timeout.
  • _extract_chunk: per-chunk retry (1 attempt by default) with
    structured logging on failure. Failed chunks no longer crash the
    overall extraction; they're skipped with a partial-result warning.
  • extract_claims_with_ai now runs chunks in parallel via
    asyncio.gather bounded by a semaphore (CHUNK_CONCURRENCY=3).
    For a 25K-char appeal: was sequential 150-300s, now ~70-90s.

Updated all 9 callers (claims, appraiser facts, block writer, qa
validator, brainstorm, learning loop, style analyzer × 3) to await
the now-async API.

The one-shot scripts/extract_claims_8174.py used to recover 43
appellant claims on case 8174-24 has been moved to .archive/ — phase 1
makes it obsolete. SCRIPTS.md updated.

Phase 2 (background-task wrapper around LLM-bound MCP tools, persistent
llm_tasks table, SSE progress) is the structural follow-up — separate PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 14:21:35 +00:00
c536ed0e63 Edit document doc_type and appraiser side from the case UI
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m26s
Until now changing a document's doc_type required a manual SQL update.
Adds an inline editor on the document badge so the chair can retag
without leaving the case page, and threads an appraiser_side tag
(committee / appellant / deciding) through the appraisal pipeline so
betterment-levy cases — which usually have 2-3 appraisers — render
conflicts with the deciding appraiser's view marked as governing.

Backend
- New appraiser_facts.appraiser_side column (V5.1) populated from
  documents.metadata.appraiser_side at extraction time.
- extract_appraiser_facts now returns status='sides_missing' with the
  list of untagged appraisals instead of running with empty side
  labels — chair must tag every appraisal first via the UI.
- Conflict detection orders entries committee → appellant → deciding so
  the deciding appraiser appears last; block-tet's prompt instructs the
  writer to phrase the deciding appraiser's view as the governing
  factual finding ("ואולם, השמאי המכריע קבע...").
- New PATCH /api/cases/{n}/documents/{doc_id} (Pydantic model with
  whitelist validation) and matching document_update MCP tool. Both
  merge appraiser_side into metadata JSONB instead of touching the
  schema.

UI
- New shared doc-types module exports the canonical 11 doc_type
  options plus the 3 appraiser-side options; both upload-sheet and
  the document badge now read from it instead of duplicating Hebrew
  labels.
- New DocumentTypeEditor renders a Popover off the doc-type Badge
  with two Selects. The save button stays disabled while doc_type is
  appraisal but no side has been picked, mirroring the backend
  enforcement so the user finds out before triggering extraction.
- usePatchDocument React-Query mutation invalidates the case detail
  on success so the badge updates without a manual refresh.
2026-04-19 06:26:51 +00:00
c619c22a51 Add pre-ruling interim draft (טיוטת ביניים) for appeals committee
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m26s
Lets the chair generate a partial decision DOCX before the discussion-and-
ruling block is decided. Same template, skill and DOCX styling as the final
decision (David, RTL, bookmarks) — only the block selection and order differ:
רקע (ו) → תכניות+היתרים (ט) → טענות (ז) → הליכים (ח). The opening (ה),
ruling (י), summary (יא), and signatures (יב) are omitted.

- New appraiser_facts table + CRUD + conflict detection in db.py (V5 schema).
  Conflict = same plan/permit identifier reported differently by 2+ appraisers.
- New appraiser_facts_extractor service: per-appraisal Claude extraction of
  plans + permits with raw quotes and page numbers.
- block-tet prompt extended with a permits sub-section sourced from the
  extracted facts, plus an explicit instruction to flag inter-appraiser
  conflicts in neutral wording without resolving them (deferred to block-yod).
- block-chet prompt extended with a post-hearing materials context sourced
  from documents.metadata.is_post_hearing.
- docx_exporter.export_decision now accepts mode='interim' which reorders
  the blocks per the chair's mental model and writes
  טיוטת-ביניים-v{N}.docx (versioned independently of regular drafts).
- 3 new MCP tools: extract_appraiser_facts, write_interim_draft,
  export_interim_draft. write_interim_draft auto-runs extraction if the
  appraiser_facts table is empty for the case.
2026-04-18 13:28:04 +00:00