legal-ai

Author	SHA1	Message	Date
chaim	83d6b5ecf0	Merge pull request 'fix: drop gold-set card from chair approval center (data/ not in image)' (#20) from fix/chair-pending-drop-goldset-card into main	2026-05-31 15:41:40 +00:00
Chaim	c231782ee8	fix(ui): drop gold-set card from /api/chair/pending — data/ excluded from image The gold-set card read data/eval/gold-set.jsonl, but .dockerignore excludes data/ from the build context, so the file is never in the container and the card silently never rendered. Baking eval data into the image is the wrong layering (data/ is runtime volumes). The gold-set review is a one-time task, not a recurring chair queue, so it doesn't belong on the live board — it's tracked via task #63 and reviewed directly with the chair. The board now returns the 4 robust DB-backed gates (halachot, missing precedents, feedback, qa_failed). Removes the best-effort file read + its unused Path import. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 15:41:00 +00:00
chaim	dfa2f5bd7f	Merge pull request 'מרכז אישורים — chair approval center (everything Dafna must approve, in one page)' (#19) from feat/chair-approval-center into main	2026-05-31 15:37:00 +00:00
Chaim	19d3dc81d0	feat(ui): chair approval center — one page for every pending human-gate (#63 follow-up) Dafna asked for a single page under the prod site listing everything she needs to approve, so nothing is forgotten — the visible embodiment of INV-G10 (human gates) and INV-QA1 (halacha backlog must be visible). Backend — GET /api/chair/pending aggregates every pending chair gate, each as a direct source query (count + sample + action link): - halachot review backlog (review_status='pending_review') + oldest - open missing precedents - unresolved chair_feedback - qa_failed cases - gold-set review (FU-5, file-based, best-effort: total vs source='chair') Frontend — /approvals page ("מרכז אישורים"): - src/lib/api/chair.ts — usePendingApprovals() (hand-typed until next api:types) - src/app/approvals/page.tsx — card per category, severity-coloured count, sample rows, oldest-pending date, link to where each is handled; live (60s refetch) - app-shell nav: "מרכז אישורים" in the work group + total-pending badge (quiet at 0) Live counts at build time surfaced the value immediately: 226 open missing precedents, 178 pending halachot, 20 unapplied feedback notes, 1 qa_failed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 15:36:29 +00:00
chaim	aee2140b0b	Merge pull request 'FU-5 — retrieval eval harness + halacha backlog visibility (#63)' (#18) from feat/fu5-eval-harness-backlog-visibility into main	2026-05-31 14:58:47 +00:00
Chaim	6ff2e36bf9	feat(eval): FU-5 — retrieval eval harness + halacha backlog visibility (#63) Covers GAP-11 (INV-RET4/G8) and GAP-14 (INV-QA1/G10). Retrieval quality was never measured (only telemetry observation) and the halacha review backlog was invisible (the 10/19 gap was found by accident). Unit B — backlog visibility (pure code, container): - metrics.halacha_backlog(conn) → {pending_review, approved, rejected, published, total, oldest_pending_at}; surfaced in metrics.get_dashboard() (get_metrics MCP tool) and /api/system/diagnostics. Live count revealed 178 pending / 1552 total, oldest from 2026-05-03 — previously invisible. Unit A — retrieval eval harness (host-side scripts): - scripts/eval_gold_bootstrap.py — seeds data/eval/gold-set.jsonl. Two sources: citations (cited==relevant via search_relevance_feedback — empty until decisions cite precedents) and known_item (query=case_name → relevant=self; a real citation-free signal, the methodology #52 checked by hand). Idempotent; preserves source='chair' rows. - scripts/eval_retrieval.py — runs the production retrieval path (search_library / search_internal) over the gold-set; computes precision@k, recall@k, MRR, nDCG@k (k=5,10); aggregates overall + per-corpus + per-practice_area; writes a report and a delta vs committed baseline.json (which records the retrieval_config it reflects). --self-test unit-checks the metric math offline. Gold-set strategy = hybrid (chair decision): bootstrap + chair review. The citation source is empty today (0 cited precedents in decisions), so the seed is known-item (77 queries: 54 internal_decisions + 23 precedent_library). The gold-set is PROVISIONAL until Dafna reviews it (the domain chair-gate). Baseline (production config: multimodal+rerank on): R@10=0.987, MRR=0.837, nDCG@10=0.872. Finding: MULTIMODAL_ENABLED=true slightly lowers known-item recall (image-page results displace exact name matches) — relevant to #15. precedent_library weaker than internal (R@10 0.957 vs 1.0) — one external precedent unfindable by name. "CI gate" realized as discipline (re-runnable harness + committed baseline + run before/after any retrieval-layer change) — retrieval needs prod DB + Voyage, no CI runner has that access. Spec: docs/superpowers/specs/2026-05-31-fu5-eval-harness-design.md Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 14:58:13 +00:00
chaim	cfcac80de2	Merge pull request 'FU-2c — reconcile external case_law identifiers (GAP-08, #68)' (#17) from feat/fu2c-external-id-reconciliation into main	2026-05-31 14:13:25 +00:00
Chaim	4fce9d503f	feat(migration): FU-2c — reconcile external case_law identifiers (GAP-08, #68 ) External court precedents stored the full citation (designator + docket + parties + Nevo date) inside case_number, violating INV-ID2/G1 (citation as identifier). Chair decision 2026-05-31 (Option A): canonical external case_number = proceeding-designator + docket, '/' preserved (court convention, not X1's '/'→'-'); parties/court/date → citation_formatted. scripts/fu2c_reconcile_external_case_numbers.py — deterministic dry-run → chair-review → apply, mirroring FU-2b: - extracts designator+docket; flags split into BLOCKING (MISMATCH / CIT_NO_DOCKET / DESIG_MISMATCH / DUP_CHECK / NO_DOCKET) vs ADVISORY (NO_CITATION — case_number fix still deterministic, missing citation is a separate gap), so advisory rows apply while uncertain identity does not. - --overrides CSV (id,proposed_canonical,citation_formatted,reason) for audited chair adjudication of blocking rows. - apply scoped to source_kind='external_upload' (task target) while keeping cited_only/nevo_seed in the reconciliation VIEW so DUP_CHECK spans the full external unique space; pre-flight collision guard before every UPDATE. Applied to production 2026-05-31: 21 case_number normalized + 3 citation_formatted reconciled (D = consolidated Supreme Court judgment לויתן/קלמנוביץ → lead docket 25226-04-25; 2×C empty citations composed from metadata). אהוד שפר עע"מ 317/10 deferred — cross-source duplicate with an existing cited_only reference (collision guard held; → #70). 49 cited_only records out of scope → new task #70 (committee-form NNNN-NN dockets the extractor misses, dedup, unresolvable "ערר אדלר"). Extraction + gating verified offline on all 24 records. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 14:12:45 +00:00
chaim	9dbc1bafbf	Merge pull request 'FU-8a: process→code guards (GAP-21/22)' (#16) from fix/fu8a-process-to-code-guards into main	2026-05-31 11:36:07 +00:00
Chaim	e5b34e01dc	docs(scripts): note sync --verify drift-gate semantics (FU-8a)	2026-05-31 11:36:06 +00:00
Chaim	4d8422198a	feat(guard): fitness function blocking raw Paperclip access (GAP-22, FU-8a) Wakeup-INSERT rule is universal (never allowlisted — hard invariant). Raw-HTTP rule exempts the sanctioned helpers + standalone operator/admin scripts (a distinct category per fitness-function scope differentiation + DRY: tooling needn't reuse the FastAPI wrapper). Repo scanned clean under these rules. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 11:35:07 +00:00
Chaim	a66ab3b3cd	feat(guard): fitness function blocking raw Paperclip access (GAP-22, FU-8a) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 11:16:36 +00:00
Chaim	aac383acb7	feat(sync): --verify exits non-zero on drift; adapter mismatch = loud drift (GAP-21, FU-8a) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 11:14:44 +00:00
Chaim	adc196ac20	docs(plan): FU-8a process→code guards implementation plan (3 tasks, TDD) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 10:51:31 +00:00
Chaim	e8431a2adf	docs(spec): FU-8a process→code guards design (GAP-21/22) + split GAP-23 to #69 GAP-21: sync_agents --verify exits non-zero on drift; adapter_type mismatch counted as drift (loud), not silent skip — makes it an enforceable gate (INV-MC1). GAP-22: fitness-function pytest guarding against raw Paperclip HTTP + direct agent_wakeup_requests INSERT (INV-INT1/INT3). Repo pre-scanned: 0 existing violations → clean forward-fence. Verified vs 3+ sources (architectural fitness functions; drift-verify non-zero exit). GAP-23 (spec→agents) split to #69. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 10:48:15 +00:00
chaim	43873adc90	Merge pull request 'FU-2b: internal case_number reconciliation tooling (GAP-07/08)' (#15) from fix/fu2b-identifier-reconciliation into main	2026-05-31 08:59:13 +00:00
Chaim	8477fd87e7	docs(scripts): register fu2b reconciliation script (FU-2b)	2026-05-31 08:58:32 +00:00
Chaim	e46868feda	feat(fu2b): flag PROC_MISMATCH (case_number prefix vs proceeding_type) for chair Dry-run surfaced 2 rows with בל"מ prefix but proceeding_type=ערר. Since the migration strips the prefix, a wrong proceeding_type would silently lose the בל"מ signal — must be chair-adjudicated, not auto-applied. Chair table now flags 4 rows: 2 DUP_CHECK (8047-23) + 2 PROC_MISMATCH. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 08:57:42 +00:00
Chaim	ab8d17fdd8	feat(fu2b): chair-gated internal case_number reconciliation script (GAP-07/08) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-31 08:54:38 +00:00
Chaim	a41fcedc28	test(fu2b): failing tests for bare-number extraction (FU-2b)	2026-05-31 08:52:48 +00:00
Chaim	c2de69272d	docs(plan): FU-2b identifier-reconciliation implementation plan (chair-gated, TDD) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 08:09:22 +00:00
Chaim	105d9626ca	docs(spec): FU-2b internal identifier reconciliation design (GAP-07/08) + split external to #68 Deterministic migration of ~52 internal_committee rows whose case_number holds a full citation → normalized bare number (citation_formatted already correct). DB analysis (2026-05-31): clean 1-token extraction, 0 key-collisions, 0 citation↔case_number mismatches, no month-padding dups. Chair-gated reversible migration (backup→dry-run→approve→apply). One edge for chair: 8047/23 ערר vs בל"מ. External (#68/FU-2c) split out — its citation_formatted is inconsistent. Verified all 11 case_law FKs use id(UUID), not case_number → rename is FK-safe. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-31 06:12:43 +00:00
chaim	fc502a6441	Merge pull request 'FU-3: re-index on content change (GAP-09)' (#14) from fix/fu3-reindex-on-change into main	2026-05-30 22:13:54 +00:00
Chaim	7e35a24d80	test(reindex): cover empty-text raise path (FU-3 review) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 22:13:18 +00:00
Chaim	7341ee8275	tasks(legal-ai): mark FU-3 (#61) done; 61.1 done, 61.2 cancelled (not-applicable) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 22:10:27 +00:00
Chaim	8a0c206ecd	feat(reindex): precedent_reindex MCP tool (GAP-09, FU-3) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 22:09:44 +00:00
Chaim	f008820ec8	feat(reindex): health-check stale_embedding_case_law count (GAP-09, FU-3) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 22:08:27 +00:00
Chaim	63abf83e76	test(reindex): fix mark_indexed stub arity in FU-1 fixture (FU-3) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 22:07:39 +00:00
Chaim	c8de42150e	test(reindex): stub db.mark_indexed in FU-1/FU-2a ingest fixtures (FU-3 interaction) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 22:07:18 +00:00
Chaim	c7c7a1e119	feat(reindex): reindex_case_law from stored text + mark_indexed on ingest (GAP-09, FU-3) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 22:06:17 +00:00
Chaim	96ae83081f	feat(reindex): V23 content/indexed hashes + helpers + write content_hash (GAP-09, FU-3) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 22:04:43 +00:00
Chaim	e522555b1a	test(reindex): failing tests for content-hash re-index (FU-3)	2026-05-30 22:02:16 +00:00
Chaim	8b3f191c8b	docs(plan): FU-3 re-index on content change implementation plan (6 tasks, TDD) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 22:00:02 +00:00
Chaim	a62116a571	docs(spec): FU-3 re-index on content change design (GAP-09) + close #61.2 not-applicable content_hash/indexed_hash change detection + reindex_case_law from stored full_text (no re-OCR) + drift health-check. Verified vs 3+ sources (content-hash change detection, RAG re-embed-on-edit). #61.2 multimodal backfill closed: 42 rows are text-ingested (document_id NULL, no source PDF) — page-images impossible without a PDF to render. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 21:52:40 +00:00
chaim	63dc08c963	Merge pull request 'FU-7: audit-trail + provenance (GAP-17/18/19/20)' (#13) from fix/fu7-audit-provenance into main	2026-05-30 21:43:33 +00:00
Chaim	9bfb912bdf	fix(audit): _collect_block_sources mirrors None-doc-types (provenance accuracy, FU-7 review) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 21:40:42 +00:00
Chaim	d28f7b8398	tasks(legal-ai): mark FU-7 (#65) done; FU-2a (#60) done Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 21:37:46 +00:00
Chaim	677f29ddec	feat(audit): blocks_stale drift flag + health-check visibility (GAP-17, FU-7) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 21:36:56 +00:00
Chaim	7e2f4b2872	feat(qa): citation→corpus resolution as non-blocking warning (GAP-20, FU-7) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 21:35:24 +00:00
Chaim	769f5020eb	feat(audit): block→source provenance via write_block audit event (GAP-19, FU-7) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 21:33:36 +00:00
Chaim	1f483383b9	feat(audit): log document_upload/extract_claims/export_docx (GAP-18, FU-7) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 21:31:09 +00:00
Chaim	a121f79d6a	feat(audit): log_action_safe + V22 blocks_stale + citation resolver (FU-7) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 21:29:26 +00:00
Chaim	bffd2ec701	test(audit): failing tests for audit-trail + provenance (FU-7)	2026-05-30 21:27:54 +00:00
Chaim	2994a884e9	docs(plan): FU-7 audit-trail + provenance implementation plan (7 tasks, TDD) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 21:26:30 +00:00
Chaim	99cd6bc4dd	docs(spec): FU-7 audit-trail + provenance design (GAP-17/18/19/20) Reuse audit_log.log_action with details JSONB (X5 §4, no new table) for end-to-end audit + block→source provenance. GAP-17 drift = blocks_stale flag + health-check (not fragile DOCX→blocks reparse). GAP-20 = structural case_law_id resolution (not Hebrew citation NLP). Verified vs 3+ sources (append-only lineage event; GitOps drift detect-don't-auto-remediate). Pure-code, no migration. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 21:15:50 +00:00
chaim	3b758850e0	Merge pull request 'FU-2a: idempotent ingest + write-time normalization + searchable flag (GAP-03/06/13)' (#12) from fix/fu2a-idempotent-ingest into main	2026-05-30 21:06:32 +00:00
Chaim	5d3c340243	test(ingest): stub recompute_searchable in FU-1 fixture (FU-2a interaction) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 20:59:11 +00:00
Chaim	358d82e90e	feat(retrieval): require practice_area only for internal/cases; enable searchable filter + health visibility (GAP-13, FU-2a) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-05-30 20:57:27 +00:00
Chaim	6dbcb7e798	feat(ingest): recompute searchable on ingest + metadata completion (GAP-13, FU-2a) Wire db.recompute_searchable into the ingest pipeline (after statuses are set) and into extract_and_apply (after fields are persisted to DB, success path only). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 20:47:51 +00:00
Chaim	4b8bbc3794	feat(data-model): V21 searchable flag + recompute_searchable (GAP-13, FU-2a) Add SCHEMA_V21_SQL (searchable boolean column + index on case_law), wire it into _run_schema_migrations, and implement _compute_searchable (pure predicate) + recompute_searchable (idempotent async backfill/update). All 5 unit tests pass. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-05-30 20:46:29 +00:00

469 Commits