Commit Graph

8 Commits

Author SHA1 Message Date
885cba543e feat(halacha): lighter effort for BULK queue-drain extraction (speed at scale)
xhigh is the quality sweet-spot for a single precedent but very slow at scale
(64-chunk case ≈ 20 min). Bulk queue-drains (process_pending over many
precedents) now use a lighter effort to cut wall-clock; interactive single
re-extraction keeps xhigh quality.

- config.HALACHA_BULK_EXTRACT_EFFORT (env, default 'high'; set 'medium' for max
  speed, 'xhigh' to match single).
- extract()/_extract_impl()/_extract_chunk() take an `effort` override threaded
  to claude_session.query_json; None falls back to HALACHA_EXTRACT_EFFORT (xhigh).
- process_pending_extractions(kind='halacha') passes the bulk effort; single
  reextract_halachot keeps xhigh.

Verified end-to-end (mocked LLM): _extract_chunk(effort='medium') → query_json
effort='medium'; effort=None → 'xhigh' fallback. Closes the open item in #72.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 21:34:13 +00:00
8e4ea23882 feat(halacha): crash-safe incremental extraction + resume (A + resume)
Halacha extraction held ALL chunk results in memory and stored once at the very
end — a crash/interrupt mid-run (e.g. the 2026-05-31 freeze) lost everything and
re-paid the full LLM cost on retry.

Now each chunk's halachot are stored AND the chunk is checkpointed
(precedent_chunks.halacha_extracted_at) the moment it finishes:

- V25 schema: precedent_chunks.halacha_extracted_at (per-chunk checkpoint).
- db.store_halachot_for_chunk: atomic per-chunk insert (halacha_index continues
  from MAX, caller serializes via an in-process store-lock) + checkpoint mark.
- db.reset_halacha_extraction (force) / mark_all_chunks_extracted (legacy backfill).
- _extract_impl rewritten: resume by default (skip checkpointed chunks; failed
  chunks stay pending and are retried; status stays 'processing' until all done);
  force=True wipes + redoes all. reextract_halachot passes force=True; the queue
  drain (process_pending) resumes by default.
- Legacy guard: a pre-V25 precedent (halachot exist, no checkpoints) is
  backfilled and treated as complete — never re-extracted (would duplicate).

Verified on 9002-24 (55 halachot, legacy): resume → legacy-backfill, NO
duplication (stays 55), all chunks checkpointed. Index continuation: store at
55,56 after max 54, no collision. Tracks #72.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 21:27:46 +00:00
807053ec54 fix(halacha): global advisory lock — one extraction at a time (prevents box freeze)
2026-05-31: opus-4-8 @ xhigh extraction + overlapping driver processes (agent
fallback retries each spawn an independent `python -c` driver; process_pending is
serial WITHIN a process but the box ran 4-5 drivers in parallel) → 12-16 concurrent
xhigh `claude -p` procs → load 69 → hard reboot.

Fix: halacha_extractor.extract() now takes a Postgres advisory lock
(pg_try_advisory_lock, key 'HALA') before any work. If another extraction (any
process/agent/driver — all share the legal-ai DB) holds it, the call returns
status='busy' and the precedent stays pending for the next drain. Guarantees ONE
extraction at a time ACROSS PROCESSES — an in-process Semaphore cannot (drivers
are separate OS processes). Core logic moved to _extract_impl (unchanged) under
the lock. CHUNK_CONCURRENCY now env-tunable (HALACHA_CHUNK_CONCURRENCY, default 3).

Verified: while a lock is held, extract() returns 'busy' with no LLM call; lock
releases cleanly and the next extraction proceeds. Tracks #72.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 20:42:15 +00:00
887079535c feat(spec): X11 citation-corroboration + INV-G10 amendment + Opus 4.8 halacha extraction
ספ חדש לשכבת citator פנימית — תיקוף הלכות לפי טיפול-שיפוטי מצטבר (ציטוטים נכנסים),
לצמצום היקף האישור-הידני של היו"ר:

- docs/spec/X11-citation-corroboration.md — 6 invariants (INV-COR1–COR6), כל אחד עם
  ≥3 מקורות מקצועיים (Shepard's/KeyCite, Hellyer LLJ 2018, UNC Law, NCSC/JTC, CEPEJ).
- docs/spec/00-constitution.md — תיקון מבוקר ל-INV-G10: השער מסופק ע"י טיפול-שיפוטי-מצטבר
  לתת-הקבוצה החיובית, שער-היו"ר נשאר חובה לזנב ולשלילי. + X11 באינדקס.
- Opus 4.8 @ xhigh כמודל חילוץ הלכות (config HALACHA_EXTRACT_MODEL/EFFORT, env-tunable;
  claude_session model/effort params; halacha_extractor מחווט). מבוסס A/B 2026-05-31:
  פחות חילוץ-יתר, 100% quote-verified, ביטחון מכויל.
- scripts/ab_halacha_opus48.py — harness A/B לא-הרסני להשוואת מודל/effort בחילוץ הלכות.
- .taskmaster #70 (FU-2c-b) — תיעוד dedup שפר + סריקת-קורפוס (0 stubs תקועים נותרו).

תנאי-קדם (זהות נקייה) הושלם: שפר מוזג לרשומה קנונית + סריקת 128 רשומות.
audit-findings גלויים ב-X11 §7: קישור הלכה↔ציטוט + סיווג-טיפול = greenfield, ל-implementation plan.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-31 18:42:13 +00:00
36f21c815e fix(precedents): distinguish silent extraction failure from "no halachot"
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 3m5s
Observed 2026-05-03: a `precedent_process_pending(halacha)` run that
chained two precedents (1110/20 → 317/10) succeeded for the first
(9 halachot, 129 chunks) and produced status=`no_halachot` for the
second despite it being a 47KB Supreme Court ruling with rich legal
analysis. A manual single-precedent re-run on 317/10 immediately
extracted 53 halachot. Diagnosis: every chunk's claude_session call
in the back-to-back run silently failed (likely Anthropic rate-limit
storm after the 1110/20 token burn), and the empty list was reported
as "Claude looked and found nothing" — same code path as a real
0-halacha ruling. The user couldn't tell the difference.

Three changes:

1. Surface chunk-level failures (halacha_extractor.py)
   `_extract_chunk` now returns `(halachot, succeeded)` so the caller
   can count how many chunks crashed. `extract()` uses this to
   distinguish:
   - `no_halachot` — chunks ran cleanly, Claude found nothing
   - `extraction_failed` — ≥50% of chunks crashed AND zero halachot
     came back (rate limit, subprocess crash, etc.)
   When `extraction_failed`, DB status is left as 'processing' so the
   request stays in the queue for the caller to retry — instead of
   the old behaviour where it got marked 'completed' and silently
   dropped from the queue.

2. Inter-precedent cooldown (precedent_library.py)
   `process_pending_extractions` now sleeps 30s between precedents.
   Anthropic rate-limits per-org, and back-to-back large rulings
   (~4M tokens for 1110/20, immediately followed by another 2-3M)
   was the empirical trigger. 30s gives the per-minute counter time
   to drain.

3. Auto-retry on extraction_failed (precedent_library.py)
   When a precedent comes back as `extraction_failed`, retry once
   after a 60s cooldown before giving up. Rate-limit storms are
   transient — the manual re-run of 317/10 minutes later succeeded
   with 53 halachot and zero chunk failures, confirming a single
   retry is sufficient. Only retries `extraction_failed`; never
   `no_halachot` (Claude looked and there genuinely is no holding).

The DB status now ends up as 'failed' only after retries are
exhausted, matching the UI's terminal-failure chip.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 05:13:10 +00:00
5d836ca414 fix(precedents): Anthropic SDK fallback, format() crash, UI refresh
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m31s
Three fixes to the precedent library after the first end-to-end test on
403-17 surfaced runtime issues:

1. Anthropic SDK fallback in claude_session. The legal-ai Docker container
   does not ship the `claude` CLI, so every halacha and metadata extraction
   was failing with "Claude CLI not found." Module now tries the CLI first
   (zero-cost local path) and falls back to the Anthropic SDK with
   ANTHROPIC_API_KEY when the binary is absent. Default model is
   claude-sonnet-4-6, overridable via CLAUDE_SDK_MODEL env. The system
   message gets cache_control: ephemeral so multi-chunk runs reuse the
   cached instruction prefix at ~10% read cost. Adds `anthropic` to
   pyproject deps.

2. precedent_metadata_extractor crashed with KeyError because the JSON
   example inside the prompt template contained literal { } characters
   that str.format() interpreted as placeholders. Switched to f-string
   concatenation; the prompt template no longer needs format() at all.

3. Library list query stays stale after upload because the upload
   mutation's onSuccess fires when the POST returns task_id, not when
   SSE reports completion. Added a second invalidate inside the SSE
   watcher in PrecedentUploadSheet so the new row appears with up-to-date
   chunk and halachot counts the moment processing finishes.

Halacha and metadata extractors now route the long static prompt through
the new `system=` parameter so the SDK path actually caches it; the CLI
path concatenates and behaves as before.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 10:52:31 +00:00
73a79ea7e8 feat(precedents): metadata auto-fill, edit sheet, persuasive extraction
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m28s
Three improvements to the precedent library based on usage feedback:

1. Auto-fill metadata at upload time. New service
   precedent_metadata_extractor reads the ruling's full_text and
   suggests case_name (short), summary, headnote, key_quote,
   subject_tags, appeal_subtype. The merge policy fills only empty
   fields, preserving everything the chair typed in the upload form.
   Wired into the ingest pipeline; also exposed as a re-run endpoint
   POST /api/precedent-library/{id}/extract-metadata for existing
   records.

2. Edit sheet in the UI. Pencil icon on each library row opens a
   pre-populated form covering every field. A Sparkles button on the
   sheet runs the metadata extractor on demand and refreshes the
   form. The case_number is read-only because halachot are FK'd to
   it; renaming requires delete + re-upload.

3. Halacha extractor branches on is_binding. Sources marked binding
   (Supreme/Administrative) keep the strict halacha prompt. Non-binding
   sources (other appeals committees, district courts on planning
   matters) get a different prompt that extracts applications,
   interpretive principles, and persuasive conclusions — labeled with
   new rule_types 'application' and 'persuasive'. The fallback also
   widens chunk selection: if the chunker labeled nothing as
   legal_analysis/ruling/conclusion, we now run on all chunks rather
   than returning zero halachot for a usable ruling.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 10:19:35 +00:00
7ee90dce31 feat: external precedent library with auto halacha extraction
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m27s
Adds a third corpus of legal authority distinct from style_corpus
(Daphna's prior decisions for voice) and case_precedents (chair-attached
quotes per case). The new corpus holds chair-uploaded court rulings and
other appeals committee decisions, with binding rules (הלכות) extracted
automatically and queued for chair approval.

Pipeline (web/app.py + services/precedent_library.py):
file → extract → chunk → Voyage embed → halacha_extractor → store +
publish progress over the existing Redis SSE channel.

Schema V7 (services/db.py): extends case_law with source_kind +
extraction status fields under a CHECK constraint pinning practice_area
to the three appeals committee domains (rishuy_uvniya, betterment_levy,
compensation_197). New precedent_chunks (vector(1024)) and halachot
tables (vector(1024) over rule_statement, IVFFlat indexes, gin on
practice_areas/subject_tags). Halachot start as pending_review; only
approved/published rows are visible to search_precedent_library.

Agents: legal-writer, legal-researcher, legal-analyst, legal-ceo,
legal-qa get search_precedent_library. legal-writer prompt explains
the three-corpus distinction and CREAC use; legal-qa now verifies that
every cited halacha resolves to an approved row in the corpus.

UI: /precedents page with four tabs — library / semantic search /
pending review (J/K nav, A/R/E shortcuts, badge count) / stats.
Reuses the existing upload-sheet progress + SSE pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 08:38:18 +00:00