Commit Graph

657 Commits

Author SHA1 Message Date
e3e3da09e5 feat(feedback): דף מרכזי /feedback להערות יו"ר + תיקון קישורי מרכז אישורים
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 37s
- דף /feedback חדש: מאגד את כל הערות chair_feedback מכל התיקים, סינון
  טרם-יושמו/הכל + לפי קטגוריה, כפתור "סמן כיושם" לכל הערה
- מרכז אישורים: כרטיס "הערות יו"ר" קישר ל-/ (חסר תועלת) → עכשיו /feedback
- מרכז אישורים: כרטיס "תיקים שנכשלו ב-QA" — כל תיק במדגם קליקבילי לדף
  התיק, והכרטיס מקשר ישירות לתיק כשיש רק אחד
- ApprovalSample.href אופציונלי; פריטי מדגם נהפכים ל-Link כשיש href
- ניווט: הוספת "הערות יו"ר" לקבוצת work ב-app-shell

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 12:38:04 +00:00
59ff4e31cf feat(halacha): כפתורי אישור/דחייה/שחזור inline ברכיב "הלכות שחולצו"
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 38s
ExtractedHalachotSection היה read-only — הוסף כפתורי פעולה לכל הלכה לפי
review_status: נדחתה → אשר/שחזר לתור · מאושרת → בטל אישור/דחה ·
ממתינה → אשר/דחה. משתמש ב-useUpdateHalacha שמרענן את detail query.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-06 12:27:48 +00:00
68a77c11b6 feat(upload): חסימת כפילות בהעלאת פסיקה + banner עם אפשרויות
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m38s
- בקאנד: GET לפני ה-async task — אם citation כבר קיים כ-external_upload מחזיר 409
- DB: get_external_case_law_by_citation — lookup לפי citation + source_kind
- פרונט: banner אדום עם פרטי הרשומה הקיימת ושני כפתורות:
  • "הפעל חילוץ מחדש" — request-halachot ל-ID הקיים וסגירת הטופס
  • "מחק את הרשומה" — DELETE עם confirm, ניקוי conflict לאחר מכן

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 12:11:33 +00:00
c83d0162ca feat(halacha): טאבים נדחו/אושרו + שחזור הלכה + הסרת placeholders עם שמות
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 43s
- מוסיף טאב "נדחו" לדף האישורים: הלכות שנדחו מופיעות עם כפתורי "אשר" (ישירות) ו-"שחזר לתור"
- מוסיף טאב "אושרו": הלכות שאושרו עם "בטל אישור" ו-"דחה"
- ספירה צבועה על כל טאב (זהב/אדום/כחול)
- מוסיף useHalachotByStatus hook ב-API
- מסיר placeholders עם שמות ("דפנה תמיר") משדות יו"ר

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 12:07:49 +00:00
f5926506fe chore(types): regenerate OpenAPI types after decision-blocks endpoints
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 39s
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 09:42:05 +00:00
df97e21d22 Merge pull request 'feat(ui): interactive decision-block viewer + inline editor on case page' (#57) from feat/decision-blocks-viewer into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 4m15s
feat(ui): interactive decision-block viewer + inline editor on case page (#57)
2026-06-06 09:37:13 +00:00
c35e0e50ed feat(ui): interactive decision-block viewer + inline editor on case page
Adds a new "ההחלטה" tab to the case detail page showing all 12 decision
blocks with rendered markdown content and inline editing that saves back
to the DB via two new FastAPI endpoints.

Backend (web/app.py):
- GET  /api/cases/{n}/decision-blocks   — returns all 12 blocks (empty
  ones included) merged from BLOCK_CONFIG + decision_blocks table.
  Exposes source_of_truth ("docx"|"blocks") and active_draft_path.
- PUT  /api/cases/{n}/decision-blocks/{block_id} — inline save via
  block_writer.save_block_content; warns (does not block) when an
  active DOCX draft exists.

Frontend:
- src/lib/api/decision-blocks.ts    — typed hooks (useDecisionBlocks,
  useSaveBlock) following the cases.ts hand-written-module pattern.
- src/components/cases/decision-blocks-panel.tsx — accordion of 12
  blocks; view mode renders Markdown component; edit mode is a textarea
  with on-blur save (derived from ChairEditor pattern, setState-during-
  render for re-sync to avoid effect cascade).
- BLOCK_LABELS in feedback.ts extended from 7 → 12 blocks.
- cases/[caseNumber]/page.tsx — new "ההחלטה" tab wired to the panel.

No DB migration required — decision_blocks + active_draft_path exist.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-06 09:36:51 +00:00
6dd125c491 Merge pull request 'fix(nevo): strip preamble/mini-ratio from court rulings too (#86.1)' (#56) from fix/nevo-preamble-court-rulings into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m35s
2026-06-03 16:56:01 +00:00
f8c3fd6c89 fix(nevo): strip preamble/mini-ratio from court rulings too (#86.1)
strip_nevo_preamble's _DECISION_START only matched ועדת-ערר openings (בפנינו /
הערר שבנדון / ...), so Nevo COURT judgments — exactly the ones carrying a
מיני-רציו — slipped through unstripped. The editorial mini-ratio then leaked into
the chunked body, risking that the halacha extractor reads Nevo's answer key
(contamination) and polluting the corpus. Proven on בג"ץ 1764/05: its full_text
still contained the מיני-רציו (unstripped).

Fix:
- Extend _DECISION_START with court-ruling openings: פסק-דין/פסק דין header and
  the authoring-judge line (השופט/ת, כב' השופט, הנשיא, המשנה לנשיא). re.search
  picks the earliest line-start match → the real opinion start, not the prose
  ratio above it.
- Widen the Nevo-marker detection window 400→1500 chars so a long court/parties
  header doesn't push חקיקה שאוזכרה:/מיני-רציו: out of range.

Verified on the real 1764/05 full_text: strips 2702 chars, body now starts at
'השופט ס' ג'ובראן:', מיני-רציו gone. Regression: ועדת-ערר openings still strip;
non-Nevo text untouched; markers-past-400 now detected. Suite 182 passed (6 new).

This is the anti-contamination prerequisite for the Nevo-ratio gold-set (#86.3/#81.7).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 16:55:31 +00:00
d47a633fcf Merge pull request 'feat(halacha): over-extraction consolidation — fold facets via claude_session (#81.5)' (#55) from feat/halacha-consolidation into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m38s
2026-06-03 16:27:13 +00:00
fb60dca796 feat(halacha): over-extraction consolidation — fold facets via claude_session (#81.5)
After a precedent finishes extracting, a claude_session pass folds facets of the
SAME legal question (below #82's dedup cosine — the שפר 14-vs-4 / 403-17→89
granularity gap) into one canonical; the rest are marked 'rejected' (reversible:
out of the active corpus AND the review queue, but recoverable). FOLD-ONLY —
never merges distinct legal questions, never invents.

- Engine: claude_session-as-judge (local CLI, zero cost), 'high' effort — folding
  needs careful judgment. One pass per precedent, runs in _extract_impl once all
  chunks are done (the prompt dedups within a chunk; this catches across chunks).
- Pure, unit-tested helpers in halacha_quality: CONSOLIDATE_SYSTEM,
  build_consolidation_prompt, parse_fold_groups (fails SAFE → [] on any malformed
  shape; drops <2-member groups; coerces/dedups indices).
- halacha_extractor._consolidate_precedent picks the canonical per group
  (approved>pending, higher confidence, quote_verified, longer) and rejects the
  rest via the existing update_halachot_batch (#84). Never rejects a canonical.
  Fails OPEN on any error (no CLI / parse fail → 0 folds, data untouched).
- config: HALACHA_CONSOLIDATE_ENABLED/MODEL/EFFORT.

Verified: suite 176 passed (10 new); integration vs dev DB — a 2-facet group
folds to 1 canonical + 1 rejected (tagged), distinct rules untouched, claude
error → 0 folds (fail-open).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 16:26:44 +00:00
5efb8cf915 Merge pull request 'feat(halacha): NLI entailment validator via claude_session (#81.3)' (#54) from feat/halacha-nli-validator into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m39s
2026-06-03 14:46:40 +00:00
f196bed564 feat(halacha): NLI entailment validator via claude_session (#81.3) + task #86
#81.3 — a post-extraction validator that flags halachot whose rule_statement is
NOT entailed by its supporting_quote (the model over-reaching beyond its source).

- Engine: claude_session-as-judge (local CLI, zero API cost) per chaim's standing
  preference — one batched judge call per chunk, NOT a hosted NLI model.
- Pure, unit-tested helpers in halacha_quality: NLI_SYSTEM, build_nli_prompt,
  parse_nli_verdicts (fails OPEN — any shape/label ambiguity → 'entailed').
- halacha_extractor._nli_check wraps the call; fails OPEN on any error (e.g. no
  CLI in the container) so a flaky judge never blocks a genuine halacha.
- Non-entailed (neutral/contradiction) → quality_flag 'nli_unsupported' which
  blocks auto-approve (routes to pending_review) via the existing store gate.
- config: HALACHA_NLI_ENABLED/MODEL/EFFORT (effort 'low' — entailment is simple).

Verified: suite 166 passed (10 new); LIVE smoke test against the real claude CLI
returned ['entailed','neutral'] for a supported vs unsupported rule.

Also commits TaskMaster #86 (Nevo preamble/ratio: anti-contamination strip fix +
gold-set benchmark) capturing today's strip_nevo_preamble findings.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 14:46:12 +00:00
e25507f9ad Merge pull request 'feat(upload): accept legacy .doc, convert via LibreOffice in container' (#53) from feat/doc-upload-support into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 2m3s
2026-06-03 13:48:26 +00:00
476c2fc5d1 feat(upload): accept legacy .doc, convert via LibreOffice in container
Legacy Hebrew .doc precedents (e.g. nevo.co.il CP1255 OLE2) can now be
uploaded directly through the precedent-library, missing-precedent, and
training upload paths — the frontend already advertised .doc but the
backend gate rejected it before reaching the extractor.

- web/app.py: add .doc to ALLOWED_EXTENSIONS (covers all paths that share
  the set: precedent library, missing-precedent, training).
- Dockerfile: install libreoffice-writer-nogui (no X11/Java) so the
  extractor's existing _extract_doc LibreOffice conversion works in the
  Coolify container (was missing → would fail at runtime).
- extractor.py: isolate the LibreOffice user profile per call to avoid a
  profile-lock failure on concurrent .doc conversions.

Verified in python:3.12-slim (prod base): .doc→.docx→text yields text
byte-identical to a native Word .docx save (103 paragraphs, 24,341 chars).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:47:47 +00:00
db6bad5d1e Merge pull request 'feat(halacha): review-queue triage — defer + batch + quality-flag badges (#84)' (#52) from feat/halacha-review-triage into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m41s
2026-06-03 13:42:53 +00:00
eeb70a5758 feat(halacha): review-queue triage — defer + batch group actions + quality-flag badges (#84)
Make the chair's pending-halacha review faster and less exhausting.

Backend:
- New 'deferred' review_status (snooze): stays out of the active library AND
  out of the default pending queue, without the finality of 'rejected'.
  update_halacha stamps reviewer+reviewed_at on defer; HALACHA_REVIEW_STATUSES
  is the single source of valid statuses (PATCH validation now uses it).
- db.update_halachot_batch(ids, status, reviewer) — one atomic UPDATE for a
  whole group; invalid status / empty ids are a no-op.
- POST /api/halachot/batch (HalachaBatchReviewRequest) wraps it.
- update_halacha now RETURNs quality_flags too (parity with list_halachot).

Frontend (halacha-review-panel):
- Quality-flag badges (#81: non_decision / truncated_quote / thin_restatement /
  quote_unverified) so the chair sees WHY an item was held back.
- Defer action — button + keyboard 'D' — to snooze without rejecting (fixes the
  'leave in pending forever' anti-pattern; reject stays the junk verb).
- Per-precedent batch bar: 'אשר הכל' / 'דחה הכל' via useBatchReviewHalachot
  (one request, one refetch) with confirm guards.
- Halacha/HalachaPatch types gain quality_flags + 'deferred'.

Verified: mcp-server suite 156 passed; web build green; end-to-end integration
against dev DB (batch approve/reject, defer sets status+timestamp, pending
excludes approved+deferred, deferred queryable, invalid status no-op).

Note: api:types regen deferred until deploy (the batch hook is hand-typed, not
dependent on generated types).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:42:21 +00:00
7ebddcce6d Merge pull request 'feat(halacha): UNIQUE(case_law_id, halacha_index) backstop (#83)' (#51) from feat/halacha-unique-index into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m37s
2026-06-03 13:07:30 +00:00
0f64b4c062 feat(halacha): UNIQUE(case_law_id, halacha_index) backstop + task tracking (#83)
#83 pipeline robustness — the index-numbering correctness guarantee:
- Add CREATE UNIQUE INDEX idx_halachot_unique_index ON halachot(case_law_id,
  halacha_index). The extractor assigns the index as MAX+1 under an in-process
  store-lock + a cross-process pg advisory lock, so collisions shouldn't occur
  in normal operation — but per the research (FireHydrant/OneUptime) the
  constraint is the actual correctness guarantee while the lock is the
  optimization. A racing/double run now fails LOUDLY (UniqueViolation, chunk
  left un-checkpointed → clean resume) instead of silently appending the
  duplicates that were the 2026-05/06 over-extraction root cause.

Data prep (run against the live DB before the constraint, backed up to
data/audit/halacha-reindex-backup-*.sql): the 6 precedents that still carried
colliding halacha_index values (9 groups, distinct principles that shared a
number — NOT content dups) were renumbered to unique sequential indices.

Verified: advisory lock holds cross-process and the DB path is direct asyncpg
(no transaction-pooler), so the session lock is safe (83.1); force=True does
delete+checkpoint-clear in one transaction (83.5); constraint rejects a
duplicate-index insert (integration-checked). Full suite 156 passed.

Also commits the TaskMaster tracking for the whole halacha-quality initiative
(#81-#84 + research-backed subtasks, statuses).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:06:58 +00:00
8e3d14abee Merge pull request 'feat(halacha): strict-rubric quality gate + dedup-on-insert (#81,#82)' (#50) from feat/halacha-quality-gate into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m36s
2026-06-03 12:31:27 +00:00
ca959d4a9c feat(halacha): strict-rubric quality gate + dedup-on-insert (#81,#82)
Bake the 2026-06-03 strict-cleanup rubric into the extraction pipeline so the
corpus stays clean at the source instead of accumulating duplicates, obiter
dicta, truncated quotes and thin restatements that clog the review queue.

#81 — quality gate:
- New pure module halacha_quality.py with unit-tested validators:
  non-decision/obiter (Wambaugh markers), truncated-quote (mid-word cut),
  thin-restatement (rule≈quote), quote-unverified.
- Validators run in halacha_extractor._process; a non-decision is re-typed
  obiter; flags persist in new halachot.quality_flags column.
- Auto-approve now requires confidence>=threshold AND no quality flags;
  flagged items route to pending_review regardless of confidence.
- Both extraction prompts hardened: reject undecided dicta, exclude
  case-specific applications, require abstraction, forbid over-splitting.

#82 — dedup-on-insert (store_halachot_for_chunk):
- Within the same precedent, skip a halacha whose normalized supporting_quote
  already exists, or whose rule-embedding has cosine>=HALACHA_DEDUP_COSINE
  (0.93) against an already-stored one. Makes re-runs idempotent.

Migration: halachot.quality_flags TEXT[] (additive, idempotent ALTER).
Tests: 19 new unit tests; full suite 156 passed. Validated end-to-end against
dev DB (dedup skips dups, flag blocks auto-approve, re-run inserts 0).
Calibration: flags fire on only ~10% of current survivors (low false-positive).

Spec: docs/halacha-strict-rubric.md

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 12:30:38 +00:00
b0ec24a9d5 Merge pull request 'chore(#80): backfill 8070-25 → appraisal multimodal 12/12; close #80' (#49) from chore/80-multimodal-appraisal-coverage into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 7s
2026-06-03 09:46:44 +00:00
f5d14fd6b8 chore(#80): backfill 8070-25 -> appraisal multimodal coverage 12/12; close #80
Full check found the premise wrong on every count (like #71/#70):
- Not 140 docs/17,700 pages/2hr/$$ needing Dafna+chaim. Of 140 image-less
  docs, only 65 are PDF (rest MD/DOCX — pipeline renders PDF only) = 704 pages.
- The value docs (appraisal, where multimodal's table/image worth is) were
  already 8/12 embedded. The only gap was ONE case, 8070-25 (4 appraisal docs).
- Backfilled 8070-25 locally (voyage-multimodal-3, ~30s, cents): all 14 docs
  embedded. Appraisal coverage now 12/12 (100%).
- Remaining 51 PDFs/649 pages are all text-dense (reference/response/appeal);
  #15 proved multimodal does NOT help text-dense docs, so they're intentionally
  left text-only. Not an inconsistency — the correct config.

No gold-set / Dafna labeling / chaim cost approval needed — cost was cents and
value was already proven in #15. #80 done (technical, not human-gated).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 09:46:23 +00:00
bbe3db7b94 Merge pull request 'chore(#70): delete 15 orphaned cited_only stubs + close #70' (#48) from chore/70-orphan-stub-cleanup into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 8s
2026-06-03 09:38:51 +00:00
7d0d4a9b27 chore(#70): delete 15 orphaned cited_only stubs + close #70
The 4 'ambiguous' citation items flagged for chair turned out to be dead
orphan stubs: 0 inbound/outbound edges across all 5 citation mechanisms,
0 full_text, 0 halachot, 0 chunks/embeddings. A corpus-wide check found 15
such orphans total (incl. clean-looking ones). Per OpenCitations (keep an
id-less entity only if it is CITED — these are cited by nothing), these are
pure noise → deleted, not chair-judgment.

- 15 orphan cited_only stubs deleted (cited_only 46 -> 31); backup in
  data/audit/fu2b-orphan-stub-cleanup-*.json.
- 0 malformed / 0 orphans remain; all 31 remaining stubs are cited.
- Combines with the 3 earlier mechanical normalizations. #70 fully done.
- Known forward-edge (no current data, no task): '+' combined-citation
  handling in citation_extractor if it recurs in future extraction.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 09:38:30 +00:00
61dde4cd83 Merge pull request 'chore(tasks): research-backed decisions — close #71/#42/#14/#76 + #70 normalization' (#47) from chore/close-open-tasks-research-decisions into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 8s
2026-06-03 09:10:02 +00:00
2a9168a1b4 chore(tasks): research-backed decisions to close open tasks (#71/#42/#14/#76/#70)
Per chaim's directive — for decisions not requiring Dafna/chaim, decide after
>=3 authoritative open sources.

#71 DONE — resolved by #15's weight fix (measured: all multi-relevant docs now
  in top-10, the rank-15/16 weak queries fixed). Research (6 sources) said
  enable rerank; tested empirically → it HURT (nDCG@5 0.879 vs 0.960, MRR 0.867
  vs 0.954) because recall is saturated and the cross-encoder demotes exact
  known-item matches. Measurement overrides theory: no rerank, no limit change.
#42 CANCELLED — obviated by BM25 hybrid (already on; handles abbreviation
  tokens lexically); 0 abbrev queries in eval, recall ~0.99, no measured gap.
#14 DEFERRED (reviewed) — no current blocker; YAGNI; trigger documented.
#76 CANCELLED — upstream Paperclip bug (ee=companyId), not safely fixable our
  side; workaround + #78 documented.
#70 — research-backed normalization (ECLI/Akoma Ntoso/ELI/OpenCitations +
  Christen). Applied 3 deterministic mechanical fixes to cited_only (whitespace
  + missing prefix-space); 0 malformed remain. 4 ambiguous items (2 garbled,
  'ערר אדלר', 1 combined citation) flagged for chair — NOT auto-guessed, per
  the entity-resolution false-merge guardrail.

#80 stays pending — human-gated (Dafna value-labeling + chaim cost).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 09:09:30 +00:00
5a00a0ef47 Merge pull request 'chore(#15): adopt MULTIMODAL_TEXT_WEIGHT=0.65 + close #15, open #80' (#46) from chore/15-multimodal-weight-065 into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 7s
2026-06-03 08:45:29 +00:00
4debe9995b chore(#15): adopt MULTIMODAL_TEXT_WEIGHT=0.65 + close #15, open #80
A/B eval (eval_retrieval.py, 86-query gold-set) showed the 0.5 default was
mis-tuned: the image side was too heavy and dragged precedent_library recall
0.971 -> 0.885. Sweep 0.5..0.75 — at 0.65 multimodal beats text-only on every
overall metric AND every corpus (R@5 0.994 vs 0.989, nDCG@5 0.960 vs 0.944,
MRR 0.954 vs 0.936). Dafna approved.

- MULTIMODAL_TEXT_WEIGHT=0.65 set in Coolify (legal-ai, runtime) + redeploy.
- baseline.json updated to the 0.65 config (future regression reference).
- #15 done (premise was stale — multimodal already default on 110 docs; the
  win was tuning the weight, not the backfill).
- #80 opened: the costly 140-doc legacy backfill is deferred until a targeted
  image-answer gold-set proves the table/image value prop (untested here).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 08:45:06 +00:00
bb42aeeff4 Merge pull request 'fix(#79): chunker never emits sub-50-char fragment chunks (#55 follow-up)' (#45) from fix/79-chunker-no-tiny-fragments into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m38s
2026-06-03 08:10:39 +00:00
6fcfdc76db fix(#79): chunker never emits sub-50-char fragment chunks (#55 follow-up)
A section that opens with a short header line ('דיון', 'טענות המשיבים')
followed by a paragraph larger than chunk_size flushed the header alone as a
tiny chunk. #55 added a query-time >=50 filter to hide these; this removes
them at the source.

_split_section: (1) don't flush a buffer still below MIN_CHUNK_CHARS — let it
absorb the next paragraph even if that overflows chunk_size, so a short header
rides with its following content; (2) fold a trailing tiny chunk back into its
predecessor.

Verified: re-chunked the 4 corpus docs that still had a tiny chunk
(ע"א 5138/04, בר"מ 2340/02, בג"ץ 6525/15, 403-17) — corpus-wide chunks<50
went 4 -> 0; all 4 stay embedded/searchable and rank top in a relevant search
(נווה שלום #1 for the s.19(ג)(1) exemption query). No regression.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 08:10:10 +00:00
0a88bed58b Merge pull request 'chore(tasks): #79#55 follow-up (isolated section-heading chunks)' (#44) from chore/task-79-chunker-followup into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 7s
2026-06-03 07:59:26 +00:00
d4046c2fbd chore(tasks): #79#55 follow-up for isolated section-heading chunks
Discovered closing #57: the current chunker still emits 4 tiny chunks that
are standalone section headings ('דיון', 'טענות המשיבים', ...). Low priority
— filtered at query time, search unaffected. Proposed fix: anchor a short
isolated heading forward into the following section.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 07:58:54 +00:00
f74fa13146 Merge pull request 'chore(#57): re-chunk+re-embed legacy precedents (pre-#55 remediation)' (#43) from chore/57-rechunk-legacy-precedents into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 3m11s
2026-06-03 07:56:12 +00:00
434341cc29 chore(#57): re-chunk+re-embed legacy precedents (pre-#55 chunker remediation)
Adds scripts/rechunk_legacy_precedents.py: selects every case_law with a tiny
chunk (content<50 — the pre-fix chunker fingerprint) and runs
ingest.reindex_case_law (re-chunk+re-embed from stored full_text only, no
re-OCR/LLM, idempotent). Batch-idempotent (re-queries the affected set).

Run result (2026-06-03): 73 precedents reindexed, 0 failed. Tiny chunks
483 -> 4 (99.2%); total precedent_chunks 5019 -> 3115 (fragments merged).
Search verified healthy (substantial coherent passages, no errors).

The 4 residual tiny chunks are isolated section headings ('דיון',
'טענות המשיבים', ...) emitted by the CURRENT (fixed) chunker — not legacy
fragments — and are already filtered at query time (>=50, #55). Minor
chunker edge case, candidate #55 follow-up.

The DB chunk migration is already applied to prod; this commit is the script
+ SCRIPTS.md entry only (no app code change, no deploy needed).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 07:55:42 +00:00
c7c6f3eb9c Merge pull request 'chore(tasks): #77+#78 done; #76 deferred with root-cause' (#42) from chore/tasks-76-78-status into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 10s
2026-06-02 12:28:02 +00:00
76fae77393 chore(tasks): #77+#78 done; #76 deferred with root-cause diagnosis
#78 (committee-upload wakeup) + #77 (case_number identity) shipped.
#76 (Paperclip create-task button): root-caused to ee=companyId guard —
button enabled on title only but submit requires a company; not safely
patchable via injection. Deferred with workaround + upstream note.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 12:27:45 +00:00
901ec9f869 Merge pull request 'fix(#77 frontend): separate מספר-תיק field on committee upload + editable case_number' (#41) from fix/77-precedent-identity-frontend into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 38s
2026-06-02 12:17:35 +00:00
7be1c3162c fix(#77 frontend): separate מספר-תיק field on committee upload + editable case_number in edit sheet
Pairs with the backend PR. Stops the citation (מראה-מקום) from being stored
as the identifier, and lets a wrong identifier be corrected after the fact.

- upload sheet: new required 'מספר תיק (מזהה ייחודי)' field for committee
  decisions → sent as case_number; the citation field is now sent as the
  separate citation (→ citation_formatted) instead of as case_number.
- edit sheet: the case_number block is now an editable input (was read-only).
  Halachot/chunks key off case_law_id (UUID), so renaming case_number is safe.
- precedent-library.ts: InternalDecisionUploadInput += citation; PrecedentPatch
  += case_number.
- types.ts: regenerated (api:types) — PrecedentUpdateRequest now carries
  case_number.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 12:17:16 +00:00
9295e74762 Merge pull request 'fix(#77 backend): editable case_number + separate citation field on committee upload' (#40) from fix/77-precedent-identity-backend into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 8s
2026-06-02 12:09:59 +00:00
fc0c36b2f8 fix(#77 backend): make case_number editable + separate citation field on committee upload
Two identity fixes for the precedent corpus:
1. PrecedentUpdateRequest += case_number — the canonical identifier was not
   in the edit model, so a wrong id captured at upload (e.g. the full
   citation pasted into the field) could not be corrected. update_case_law
   already whitelists case_number.
2. /api/internal-decisions/upload += citation form field — case_number is
   now the clean identifier (e.g. 8027-25) and citation is the full
   מראה-מקום, stored as citation_formatted up-front (previously the UI sent
   the citation AS case_number, leaving the id polluted and citation_formatted
   empty until extraction). Stored via a post-ingest update_case_law, not the
   core INSERT.

Frontend (separate case_number field in the upload + edit sheets) follows in
a second PR after api:types regen.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 12:09:40 +00:00
2d7ab26c71 Merge pull request 'fix(#78): trigger extraction wakeup on committee-decision upload + surface silent failures' (#39) from fix/78-precedent-extraction-wakeup into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 10s
2026-06-02 12:06:56 +00:00
1d3e235556 fix(#78): trigger extraction wakeup on committee-decision upload + surface failures
The /api/internal-decisions/upload path (used by the UI for ועדת-ערר
decisions) never called pc_wake_for_precedent_extraction, so committee
decisions were stuck at halacha_extraction_status='pending' forever — the
CEO was never woken to drain the queue. Root cause behind 8027-25's stuck
extraction. The other two upload paths (precedent_library, missing-precedent)
already wake the CEO; this one was missing it.

- internal-decisions upload: add the wakeup, routing the company by case
  number prefix (1xxx→רישוי, 8xxx→היטל, 9xxx→פיצויים) when practice_area is
  empty (else an 8xxx case wrongly routes to the licensing CEO).
- all three call sites: the wake helper returns {ok:False} WITHOUT raising
  on a skipped/failed wakeup; that was silently dropped. Now logged at
  WARNING with the reason, and the upload progress carries extraction_queued.

Fallback drainer (scheduled precedent_process_pending) deferred — the
missing wakeup was the actual failure; manual precedent_process_pending
remains the recovery path.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 12:06:31 +00:00
7471dcf3cc Merge pull request 'chore: tasks #76-78 + weekly chair-feedback lessons #34-35' (#38) from chore/tasks-and-weekly-lessons into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 8s
2026-06-02 11:57:44 +00:00
d790fb26e0 docs(lessons): weekly chair-feedback lessons #34-35 (week ending 2026-05-31)
#34 don't manufacture doubt about unambiguous statutes (s.19(ג)(2));
#35 writer/QA two-sources-of-truth sync gap (DB vs drafts/decision.md).
Output of the weekly-feedback-analysis job, pending commit.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 11:57:24 +00:00
7e34c53224 chore(tasks): add #76-78 — Paperclip create-task button + 2 precedent-upload bugs
#76 צור-משימה button (enabled but submit no-ops), #77 committee-upload
field mapping (citation→case_number, case_number uneditable), #78 silent
extraction wakeup failure. Discovered while debugging precedent 8027-25.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 11:57:24 +00:00
77ed0361b7 Merge pull request 'fix(appraiser-facts): valid Paperclip priority enum (normal→medium)' (#37) from fix/appraiser-facts-priority-enum into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 3m18s
2026-06-02 11:49:23 +00:00
5d63a903ce fix(appraiser-facts): valid Paperclip priority enum (normal→medium)
The 'חלץ עובדות שמאיות עכשיו' button returned HTTP 500. Root cause:
wake_analyst_for_appraiser_facts POSTs a child issue to Paperclip with
priority='normal', but Paperclip's ISSUE_PRIORITIES enum is only
critical|high|medium|low. createChildIssueSchema (Zod) rejects 'normal'
with 400 Bad Request; pc_request raise_for_status() turns it into a 500
surfaced to the chair. Fixed to 'medium' (the sole non-normal occurrence
in the repo).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-02 11:48:58 +00:00
aeddcb41eb Merge pull request 'feat(web-ui): sort corroborated halachot first' (#36) from feat/x11-corroborated-first into main
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 36s
2026-06-01 05:50:29 +00:00
1aadd3b455 feat(web-ui): sort corroborated halachot first in extracted list (X11)
Halachot carrying a corroboration badge (positive citation count or a
negative treatment) float to the top of 'הלכות שחולצו', ordered by
corroboration strength; the rest keep document order by halacha_index.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-01 05:50:12 +00:00