Commit Graph

509 Commits

Author SHA1 Message Date
be4f7bbe99 feat(ingest): canonical ingest_document pipeline (FU-1) 2026-05-30 19:13:15 +00:00
d4663eba8f feat(ingest): IntakeSpec + shared helpers for canonical pipeline (FU-1)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 19:11:27 +00:00
9ae2d47d03 test(ingest): failing tests for unified pipeline (FU-1)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-05-30 19:09:37 +00:00
15f42bc91c docs(plan): FU-1 unified-ingest implementation plan (6 tasks, TDD)
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 19:05:14 +00:00
357a5238c4 docs(spec): FU-1 unified-ingest design + FU-3 backfill task (#61.2)
Design for unifying the two parallel ingest paths (ingest_precedent /
ingest_internal_decision) into one canonical pipeline parameterized by an
IntakeSpec config object — Template Method skeleton + Strategy injection.
Closes the GAP-02 root cause (missing metadata queue on internal path) by
making a skipped step structurally impossible.
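
The Template Method plus Strategy shape described above might look roughly like the sketch below. Only `ingest_document`, `IntakeSpec`, and the two `source_kind` values come from this log; the field names, the step order, and the in-memory queue are assumptions made for illustration, not the project's actual code.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class IntakeSpec:
    """Hypothetical per-source configuration injected into the pipeline."""
    source_kind: str
    extract_metadata: Callable[[str], dict]  # strategy: varies per source


def ingest_document(text: str, spec: IntakeSpec, metadata_queue: list) -> dict:
    """Fixed skeleton: every source runs the same steps in the same order."""
    record = {"source_kind": spec.source_kind, "text": text.strip()}
    record["metadata"] = spec.extract_metadata(record["text"])
    # The queue step lives in the skeleton, so no spec can skip it.
    metadata_queue.append(record)
    return record


def _first_line(text: str) -> str:
    return text.partition("\n")[0]


# Assumed specs: the two former paths differ only in their injected strategy.
PRECEDENT = IntakeSpec("external_upload", lambda t: {"title": _first_line(t)})
INTERNAL = IntakeSpec(
    "internal_committee",
    lambda t: {"title": _first_line(t), "chair_name": None},
)
```

Because the queue append sits in the shared skeleton and not in either spec, adding a third source means writing a new `IntakeSpec`, not a new pipeline.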

Architecture choice verified against 3+ authoritative sources (refactoring.guru
Template-Method/Replace-Conditional, Fowler FlagArgument, Strategy pattern).
DB check (2026-05-30): no migration needed — 0/56 internal rows lack metadata,
0 invalid enums; multimodal backfill (42 rows) tracked as TaskMaster #61.2 / FU-3.

Covers GAP-01/02/04/05 · provides INV-ING1/ING3/G2/G4 · TaskMaster #59.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 19:00:30 +00:00
df437c2462 tasks(legal-ai): mark FU-4 (62) + FU-6 (64) + subtasks done (merged+deployed)
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m34s
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 18:30:26 +00:00
a53d8eef14 merge: GAP-12 — domain-scope search_decisions (INV-RET1)
Some checks failed
Build & Deploy / build-and-deploy (push) Has been cancelled
Derive practice_area from case (case row → number-prefix fallback); block only when a
case is present but undeterminable; case-less/exploratory search stays cross-domain.
Verified offline (test_search_domain_scope.py 5/5). Closes PR #10.
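
A minimal sketch of the scoping rule described above, assuming a case is passed as a plain dict. The prefix table, the exception name, and the function name are hypothetical; only the three outcomes (case row, number-prefix fallback, block on an undeterminable case) come from the commit message.

```python
from typing import Optional

# Hypothetical mapping from case-number prefix to practice area.
PREFIX_TO_AREA = {"81": "area_a", "85": "area_b"}


class UndeterminableDomainError(Exception):
    """Raised when a case is present but its practice area cannot be derived."""


def resolve_practice_area(case: Optional[dict]) -> Optional[str]:
    """Return the practice area to scope by, or None for a cross-domain search."""
    if case is None:
        return None  # case-less / exploratory search stays cross-domain
    area = case.get("practice_area")  # 1) prefer the case row itself
    if area:
        return area
    number = str(case.get("case_number", ""))
    for prefix, mapped in PREFIX_TO_AREA.items():  # 2) number-prefix fallback
        if number.startswith(prefix):
            return mapped
    # 3) a case exists but its domain is unknown: block, do not search everything
    raise UndeterminableDomainError(f"cannot derive practice area for {number!r}")
```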
2026-05-30 18:29:45 +00:00
0c8d415044 fix(retrieval): scope search_decisions by domain — derive from case, block only on undeterminable case (GAP-12, INV-RET1)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 18:23:41 +00:00
bd6edb8937 merge: FU-6 — code-enforced QA gates (GAP-15/16)
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m38s
export_docx hard-blocks on critical QA failures (gates on stored qa_results, no LLM
re-run); neutral_background severity consistency fix; export HTTP endpoint returns 409
on block (UI shows error, not false success). Verified offline (test_export_qa_gate.py 5/5).
Closes PR #9.
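
The gate described above can be sketched as a pure function over stored results. The record shape (`check`, `severity`, `passed`) and both function names are assumptions; the commit only states that export blocks on critical failures read from stored `qa_results` and that the endpoint answers 409 when blocked.

```python
def critical_failures(qa_results: list[dict]) -> list[str]:
    """Names of stored checks that are critical and did not pass (no LLM re-run)."""
    return [
        r["check"]
        for r in qa_results
        if r.get("severity") == "critical" and not r.get("passed")
    ]


def export_response(qa_results: list[dict]) -> tuple[int, dict]:
    """Return an (HTTP status, body) pair for a hypothetical export endpoint."""
    failed = critical_failures(qa_results)
    if failed:
        # 409 Conflict lets the UI show an error, not a false success.
        return 409, {"error": "export blocked by critical QA failures",
                     "failed_checks": failed}
    return 200, {"status": "exported"}
```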
2026-05-30 18:14:40 +00:00
a61495f5ef fix(api): export endpoint returns 409 when QA gate blocks (FU-6 UX — avoid false success toast)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 18:03:21 +00:00
084b31cd9b fix(qa): enforce critical-QA gate on export + fix neutral_background critical-but-passed (GAP-15/16, INV-QA3/EX3)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 17:58:50 +00:00
1473bdf3c2 merge: FU-4/GAP-10 corpus-isolation fix
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m39s
Enforce source_kind on halacha_filters (db.py) — closes cross-corpus halacha leak (#56).
Verified by offline regression test (mcp-server/tests/test_precedent_corpus_isolation.py).
2026-05-30 17:53:46 +00:00
f51036bd98 merge: System Spec-set + gap-audit (sub-projects 1+2)
Adds docs/spec/ (14-file living system spec, 11 invariants) + gap-audit (23 findings
→ 8 fix-units) + TaskMaster tasks 59-66. Closes PR #8. Docs/tasks only — no runtime code.
2026-05-30 17:53:46 +00:00
1af689a969 fix(retrieval): enforce source_kind on halacha_filters — close cross-corpus leak (GAP-10, INV-RET1)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 17:46:59 +00:00
80d1c5ff27 tasks(legal-ai): reconcile #56 (cancel→superseded by 62.1) + #57 (link to FU-3)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 17:43:12 +00:00
d72d5429ed tasks(legal-ai): 8 fix-unit tasks (59-66) + 23 GAP subtasks from gap-audit
Granularity (epic-per-fix-unit + subtask-per-gap) and dependency-aware/WSJF
prioritization both backed by ≥3 authoritative sources (SAFe/Pichler/OWASP/CVSS;
Wake-INVEST/Cohn/Agile-Alliance/Atlassian/SAFe).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 17:38:31 +00:00
28bed4906c docs(spec): gap-audit — 23 findings mapped to invariants + proposed fix-units (sub-project 2) 2026-05-30 17:27:06 +00:00
ebfda74575 docs(spec): X1 — canonical case_number = official assigned number (no month invention); mixed-form reconciliation is a migration task 2026-05-30 17:23:14 +00:00
e3880aef4e docs(spec): sign-off fixes — 06 index row (G2,G9), refresh stale §7 note, fix X3 G9 anchor niqqud
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 17:15:00 +00:00
380998da17 docs(spec): X5 — file:line/name precision (log_search_bg, user param, active_draft_path) 2026-05-30 17:09:33 +00:00
8c4b8cf19e docs(spec): X5-audit-provenance
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 17:05:43 +00:00
b0351958db docs(spec): X4-agents map + reserved process-agents section
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 16:59:31 +00:00
c881665b7c docs(spec): constitution index — X3 enforces G2,G9 (operational)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 16:56:39 +00:00
7fd6d8cb95 docs(spec): X3 — replace out-of-repo memory links with plain mentions (self-containment) 2026-05-30 16:56:20 +00:00
951f2366e6 docs(spec): X3-integration-deploy
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 16:53:01 +00:00
a0004f0274 docs(spec): constitution — document third authority model (project-operational)
X2/X3/X4 invariants are facts about this system's own integration/ops (no external
authority); they use מקור-סמכות (source of authority) = project runbooks, tied to a global engineering invariant.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 16:49:58 +00:00
f0fd405f4e docs(spec): X2-multi-company sync rules
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 16:47:19 +00:00
b0e4e14832 docs(spec): X1-identifiers canonical model
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 16:41:37 +00:00
b46d25f605 docs(spec): 07-learning loop
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 15:21:34 +00:00
0fd06659da docs(spec): 06-export DOCX contract
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 15:16:00 +00:00
c0ef90d722 docs(spec): 05-qa-review — clarify neutral_background dual return path (critical fallback w/ passed=True); fix line ref
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 15:12:30 +00:00
c1872aa214 docs(spec): 05-qa-review — QA gates + human gates 2026-05-30 15:09:42 +00:00
1582556b0b docs(spec): 04-analysis-writing — 12 blocks + reasoned-decision invariants
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 15:03:56 +00:00
5e80bf560d docs(spec): constitution index — add G9 to 03-retrieval row (consistency)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 15:00:30 +00:00
72737df154 docs(spec): 03-retrieval corpora + retrieval invariants
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 14:57:11 +00:00
998194462f docs(spec): 02-data-model entities + completeness contract
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 14:50:06 +00:00
9199214b7c docs(spec): 01-ingest — trim §4 redundancy (reference INV-ING3)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 14:46:23 +00:00
da80bcf0fe docs(spec): 01-ingest unified intake contract
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 14:42:26 +00:00
6afd155dc1 docs(spec): scope ≥3-source rule to engineering decisions; reframe legal-content (G11)
Per chair clarification: the ≥3-authoritative-source verification protocol governs
ENGINEERING/architecture decisions only (G1–G10). Legal-domain content (G11) is the
authority of the chair + project docs (block-schema, decision-methodology, lessons,
skills/decision) — NOT externally triple-sourced.

- §2/§4/§5 scoped to engineering invariants; added the two-authority distinction
- G11 reframed: source-of-authority = chair + project docs; removed FJC/South Bucks/
  1958-statute as "sources to verify" and the UNVERIFIED flag
- Removed the "open items — primary-source verification" section (the over-application)
- Pruned now-orphaned legal sources from the appendix (kept NCSC/CEPEJ/FJC for G9/G10)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 14:37:54 +00:00
1daaa4861b docs(spec): reframe G2 example as structural asymmetry + note forthcoming files
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 14:21:00 +00:00
fd682d130f docs(spec): 00-constitution — mission, 11 global invariants, engineering rules
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 14:15:28 +00:00
c351d6d714 docs(spec): scaffold docs/spec/ living spec-set
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 14:12:25 +00:00
1d01135e32 docs(plan): implementation plan for system spec-set (sub-project 1)
13 tasks across 3 phases (keystone constitution → lifecycle files → cross-cutting),
each verification-gated (≥3 sources or UNVERIFIED+escalate) with review checkpoints.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 14:08:31 +00:00
a5b22dadf3 docs(spec): master design for system spec + integrity layer
Establishes the foundation to fix a recurring root-cause failure class
(non-canonical identifiers, asymmetric ingest paths, silent manual gates):

- Confirmed system mission (quasi-judicial decision assistant; human decides)
- Decomposition into 5 sub-projects (spec → audit → integrity layer → re-check → process agents)
- spec-set structure under docs/spec/ (lifecycle-organized + cross-cutting files)
- 11 global invariants + engineering rules, each backed by ≥3 authoritative sources
  (NCSC/JTC, FJC, CEPEJ, South Bucks; RAG/Lewis, Manning IR, Elastic/Pinecone/Weaviate;
   DAMA-DMBOK, ISO 8000, ISO 15489, Kleppmann, Codd, Fowler)
- 3-source verification protocol; UNVERIFIED items escalated, not decided solo

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-30 14:05:06 +00:00
7826ff4910 fix(cases): tolerant case_number lookup so agents see case documents
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m39s
Reported: an agent claimed the case had no documents because document_list
returned empty — but the documents exist. Root cause: get_case_by_number did
an exact `WHERE case_number = $1`, so any formatting variant of the number
silently failed to resolve. Verified on 8137-24 (9 docs): "8137/24",
"ערר 8137-24", leading/trailing space, and "בל\"מ 8126/03/25" all returned
"תיק לא נמצא" ("case not found"), which the agent read as "no documents" and went blind.

Add _normalize_case_number (strip leading proceeding-type prefix to the first
digit, trim, unify '/'→'-') and a normalized fallback in the lookup query
(exact match preferred via ORDER BY). One fix covers every case_number-scoped
tool (document_list, extract_references, search_case_documents, get_claims,
drafting, ...). Bogus numbers still correctly resolve to "not found". (#58)
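
The three normalization steps named above (strip the leading prefix up to the first digit, trim, unify '/' to '-') can be sketched as follows. This is an illustration under those stated rules, not the project's `_normalize_case_number`; the public name `normalize_case_number` is hypothetical.

```python
import re


def normalize_case_number(raw: str) -> str:
    """Reduce formatting variants of a case number to one comparable form."""
    trimmed = raw.strip()
    # Drop any proceeding-type prefix: everything before the first digit.
    without_prefix = re.sub(r"^\D+", "", trimmed)
    # Unify the separator so "8137/24" and "8137-24" compare equal.
    return without_prefix.replace("/", "-").strip()
```

Applied to the variants listed in the commit message, each input reduces to the hyphenated number, so a normalized fallback comparison can match the stored value while an exact match is still preferred.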

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 11:54:52 +00:00
58ab003206 fix(retrieval): make decisions findable by name + unhide committee uploads
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 3m57s
Root cause of "agent can't find the Agasi decision in the corpus" (CMPA-55):
the decision was fully ingested, but the retrieval layer failed on the
realistic agent query — searching by case name.

- RC-A (#52): lexical tsvector covered only chunk content + halacha text,
  so a bare-name query ("אגסי", i.e. "Agasi") matched decisions that *cite* the case, not
  the case itself. Add meta_tsv on case_law(case_name, case_number) (SCHEMA
  V20) and OR it into the lexical halacha/chunk SQL with a match boost, so a
  name/number hit surfaces the case's own rows. Agasi: rank 4 → rank 1.
- RC-B (#53): precedent_library_list hard-defaulted source_kind=external_upload
  and never exposed the param, hiding uploaded ערר/בל"מ (internal_committee)
  decisions. Thread source_kind through service → tool → MCP tool (supports
  'internal_committee' / 'all_committees').
- #54: agent instructions (researcher/analyst/writer) — search-by-name
  protocol: add content/case-number, search both corpora, use all_committees
  before declaring "not in corpus".
- #55: chunker produced tiny fragment chunks ("דיון" [discussion], "החלטה" [decision]) from header
  keywords matched mid-sentence. Anchor SECTION_PATTERNS to line start +
  merge sub-min sections; exclude <50-char fragments at query time (484
  existing fragments hidden; full re-chunk tracked as #57).
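
The chunker fix in #55 can be illustrated with a small sketch. The two header keywords and the 50-character floor are taken from the item above; the pattern, the function name, and the merge policy are assumptions and will differ from the real `SECTION_PATTERNS`.

```python
import re

MIN_SECTION_CHARS = 50

# "^" with re.MULTILINE matches only at the start of a line, so a header
# keyword that appears mid-sentence no longer opens a new section.
SECTION_PATTERN = re.compile(r"^(?:דיון|החלטה)\b", re.MULTILINE)


def split_sections(text: str) -> list[str]:
    """Split on line-start headers, then merge sections below the minimum size."""
    starts = [m.start() for m in SECTION_PATTERN.finditer(text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)
    bounds = starts + [len(text)]
    raw = [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]

    merged: list[str] = []
    for section in raw:
        if not section:
            continue
        too_small = len(section) < MIN_SECTION_CHARS
        prev_too_small = bool(merged) and len(merged[-1]) < MIN_SECTION_CHARS
        if merged and (too_small or prev_too_small):
            merged[-1] = merged[-1] + "\n" + section  # fold fragment into neighbour
        else:
            merged.append(section)
    return merged
```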

Tests: scripts/test_retrieval_by_name.py (name ranks case above citer +
substantive regressions); chunker unit checks (0 tiny chunks). New findings
filed as tasks #56 (halacha source_kind leak) and #57 (re-chunk migration).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 11:26:19 +00:00
165efc62b0 docs(claude): correct canonical tasks.json path + add CLI cwd footgun warning
TaskMaster's --tag selects the logical group inside a file, not which
tasks.json to write; the CLI resolves the file from cwd. Document the
canonical project-root-relative path and the cwd footgun.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-05-30 11:19:47 +00:00
d3c6baf9e2 security(chat): bind chat service to docker bridge + require Bearer auth
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m38s
Address security-review finding: the host-side legal-chat-service was
binding 0.0.0.0:8770 with no authentication. The service spawns the
claude CLI, whose tool set includes Bash + Edit — so an unauthenticated
/chat/start is effectively RCE. Oracle Cloud's security list closes the
port externally, but defense-in-depth requires two independent layers:

1. Bind defaults to 10.0.1.1 (docker0 bridge gateway). Reachable from
   containers on docker bridges (the legal-ai container has a route via
   the coolify network), invisible to anything outside the host. The
   --host flag is still configurable for local-dev (127.0.0.1) or
   special-case deployments, but 0.0.0.0 is explicitly discouraged in
   the docstring.
2. /chat/start requires Authorization: Bearer <LEGAL_CHAT_SHARED_SECRET>.
   The secret is loaded from /home/chaim/.legal-chat-service.env (chmod
   600, off-repo) by the pm2 ecosystem and mirrored as a Coolify env
   var so the FastAPI chat_proxy sends a matching header. hmac.compare_digest
   prevents timing oracles. /health stays unauthenticated (static OK,
   no subprocess) so the FastAPI proxy can probe liveness without the
   secret.

The service refuses to start if LEGAL_CHAT_SHARED_SECRET is empty or
shorter than 24 chars — no silent fallback to an open mode.
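
A minimal sketch of the two checks described above, using only the standard library. `LEGAL_CHAT_SHARED_SECRET`, the 24-character minimum, and `hmac.compare_digest` come from the commit message; the function names and the dict-based environment are hypothetical.

```python
import hmac
from typing import Mapping, Optional

MIN_SECRET_LENGTH = 24


def load_secret(env: Mapping[str, str]) -> str:
    """Refuse to start when the secret is missing or too short (no open mode)."""
    secret = env.get("LEGAL_CHAT_SHARED_SECRET", "")
    if len(secret) < MIN_SECRET_LENGTH:
        raise RuntimeError(
            "LEGAL_CHAT_SHARED_SECRET must be at least "
            f"{MIN_SECRET_LENGTH} characters"
        )
    return secret


def is_authorized(authorization_header: Optional[str], secret: str) -> bool:
    """Constant-time comparison of the Authorization header with the expected value."""
    expected = f"Bearer {secret}".encode()
    provided = (authorization_header or "").encode()
    return hmac.compare_digest(provided, expected)
```

In a real service, `load_secret` would be called once at startup with `os.environ`, and `is_authorized` would guard `/chat/start` while `/health` stays open.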

When the Infisical MCP comes back, migrate the secret into the vault
at /_GUIDELINES per the project secrets policy.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 10:22:14 +00:00
5ad541e54c ui(precedents): upload sheet routes ערר/בל"מ to internal-decisions endpoint
Some checks failed
Build & Deploy / build-and-deploy (push) Has been cancelled
Citations starting with ערר/בל"מ/ARAR are committee decisions and must
carry chair_name + district. The /precedents upload form previously
errored out for these (precedent_library service rejects them) with no
in-UI path forward — internal_decision_upload was only reachable via
the /missing-precedents flow.

The form now auto-detects committee citations, reveals chair_name +
district fields, hides the irrelevant source_type/precedent_level
(derived server-side), and posts to /api/internal-decisions/upload.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-27 10:22:03 +00:00
a3454bcb57 fix(training): bundle reference content + use docker bridge gateway
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 9s
The Style Studio's curator-prompt + chat features read reference docs
from disk at runtime. Two issues from the initial production run:

1. Dockerfile + .dockerignore excluded .claude/, docs/, and most of
   skills/. Now COPY the four specific files the new endpoints need:
     - .claude/agents/hermes-curator.md
     - skills/decision/SKILL.md
     - docs/legal-decision-lessons.md
     - docs/corpus-analysis.md
   .dockerignore opens whitelists for just those files.

2. Coolify's custom_docker_run_options=--add-host=host.docker.internal:host-gateway
   is not honored on dockerimage build_pack apps (ExtraHosts stayed []).
   Switch chat_proxy.py default to http://10.0.1.1:8770 — the docker0
   bridge gateway, same pattern Paperclip uses for 3100. Bind the host
   pm2 service to 0.0.0.0:8770 so the container can reach it via the
   bridge IP. Oracle Cloud's security list keeps the port unreachable
   from the public internet.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 10:15:27 +00:00