feat(training): Style Studio — upload, rich corpus, lessons, curator portrait, chat

Six-phase upgrade of /training from a read-only dashboard into a full Style Studio for managing Daphna's style corpus.

- Upload Sheet on /training: file → proofread preview → commit (no more CLI-only `upload-training` skill).
- Rich corpus metadata: GET /api/training/corpus returns summary, outcome, key_principles, page_count, parties (regex), legal_citation, lessons_count. PATCH endpoint for chair edits. CorpusDetailDrawer with 4 tabs (details/content/lessons/patterns) replaces the bare table row.
- LLM metadata enrichment: style_metadata_extractor + MCP tools (style_corpus_enrich, style_corpus_pending_enrichment) fill summary/outcome/key_principles via claude_session (free, host-side).
- Per-decision lessons: new decision_lessons table + 4 REST endpoints + LessonsTab in drawer; hermes-curator now auto-posts findings as decision_lessons(source=curator).
- Curator Portrait tab: prompt rendered with link to Gitea, recent curator findings, style_analyzer training prompts, propose-change form that writes proposals to data/curator-proposals/ for manual chair review (no auto-mutation of the agent file).
- Style chat tab: SSE-streamed conversations with the style agent. New host-side pm2 service (legal-chat-service, port 8770) wraps claude CLI with stream-json + --resume continuation; FastAPI proxies via host.docker.internal. Zero API cost: uses chaim's claude.ai subscription. chat_conversations + chat_messages persist history.

Architecture: keeps the existing rule that claude_session only runs on the host (not the container). The new legal-chat-service is the canonical bridge between the container and the local CLI for the chat feature; everything else (upload, metadata, lessons) stays within the container's existing capabilities.

Audit script (scripts/audit_training_corpus.py) included for verifying which corpus rows still need enrichment.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 10:06:22 +00:00
parent 0629f19d5f
commit bb0cd7c6a2
23 changed files with 4568 additions and 75 deletions
--- a/.claude/agents/hermes-curator.md
+++ b/.claude/agents/hermes-curator.md
@@ -76,6 +76,24 @@ profiles:
   Authorization: Bearer $PAPERCLIP_API_KEY
   { "body": "<my findings>" }
   ```
 5b. **Also records every finding in the legal-ai API as a decision_lesson**, so that it shows up in the UI
    under the decision's "מה למדנו" ("What we learned") tab in the corpus. Prerequisite: first find the `style_corpus_id`
    that matches the decision's `decision_number` (`GET /api/training/corpus`, then filter).
    For each finding:
    ```
    POST https://legal-ai.nautilus.marcusgroup.org/api/training/corpus/{corpus_id}/lessons
    Content-Type: application/json
    {
      "lesson_text": "<summary of the finding: what I saw + suggestion, on one line>",
      "category": "<style|structure|lexicon|tabular|general>",
      "source": "curator"
    }
    ```
    Mapping of finding tags to `category`:
    - `[סגנון]` → `style`
    - `[מבנה]` → `structure`
    - `[לקסיקון משפטי]` → `lexicon`
    - `[טבלאי]` → `tabular`
 6. Closes the issue (status=done) after the comment has been written
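
Step 5b above boils down to a tag lookup plus a small JSON body. A minimal sketch in Python, assuming each finding is a string that starts with one of the bracketed tags. `TAG_TO_CATEGORY` and `build_lesson_payload` are hypothetical names that do not appear in the agent file. Falling back to `general` for untagged findings, and stripping the tag from the text, are assumptions rather than documented behaviour.

```python
# Tag -> category mapping copied from the curator instructions.
TAG_TO_CATEGORY = {
    "[סגנון]": "style",
    "[מבנה]": "structure",
    "[לקסיקון משפטי]": "lexicon",
    "[טבלאי]": "tabular",
}


def build_lesson_payload(finding: str) -> dict:
    """Build the JSON body for POST /api/training/corpus/{corpus_id}/lessons."""
    text = finding.strip()
    category = "general"  # assumed fallback when no known tag is present
    for tag, mapped in TAG_TO_CATEGORY.items():
        if text.startswith(tag):
            category = mapped
            text = text[len(tag):]  # assumption: the tag itself is not stored
            break
    # The instructions ask for a one-line summary, so collapse whitespace.
    lesson_text = " ".join(text.split())
    return {"lesson_text": lesson_text, "category": category, "source": "curator"}
```

The payload builder is kept free of HTTP so it can be checked without a running server; the actual POST still needs the `corpus_id` resolved from `GET /api/training/corpus`.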
 ## Comment format
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -91,6 +91,16 @@
 - Code changes take effect after `pm2 restart paperclip`
 - **No Docker or Coolify needed**
 **legal-chat-service**: runs **locally via pm2** (new, since April 2026):
 - Port: `localhost:8770` (loopback only)
 - A small aiohttp service that wraps the `claude` CLI with streaming + session continuation and serves the "שיחה" (chat) tab on the `/training` page. The container proxies to it via `host.docker.internal:8770`.
 - Code: [mcp-server/src/legal_mcp/chat_service/](mcp-server/src/legal_mcp/chat_service/)
 - Install: `pm2 start /home/chaim/legal-ai/scripts/legal-chat-service.config.cjs && pm2 save`
 - Health: `curl http://127.0.0.1:8770/health` → `{"ok":true,...}`
 - Code changes: `pm2 restart legal-chat-service`
 - **Zero API cost**: the claude CLI uses chaim's claude.ai subscription. The core assumption of `claude_session.py` (the claude CLI runs locally only) is preserved; this service is the official bridge between the container and the host.
 - Coolify dependency: the legal-ai Service Definition must include `extra_hosts: host.docker.internal:host-gateway` (otherwise the proxy gets a ConnectError).
 ---
 ## Directory structure
--- a/mcp-server/src/legal_mcp/chat_service/__init__.py
+++ b/mcp-server/src/legal_mcp/chat_service/__init__.py
@@ -0,0 +1,13 @@
 """legal-chat-service — host-side SSE bridge to ``claude`` CLI.
 Runs as a pm2-managed process on the host (port 127.0.0.1:8770 by default).
 The legal-ai FastAPI container proxies chat requests to it via
 ``host.docker.internal:8770``.
 Why a separate service:
    The chat needs real-time streaming + multi-turn session continuation
    (``claude --resume <session_id>``). The container can't run the
    claude CLI (no binary, no claude.ai credentials). Splitting this out
    keeps the architectural rule of ``claude_session.py`` intact while
    enabling the new chat feature for free (no API key).
 """
--- a/mcp-server/src/legal_mcp/chat_service/server.py
+++ b/mcp-server/src/legal_mcp/chat_service/server.py
@@ -0,0 +1,144 @@
 """HTTP+SSE bridge from FastAPI (in container) to local claude CLI.
 Endpoints:
    POST /chat/start    — body: {prompt, system?, resume_session_id?}
                          returns SSE stream of events from
                          ``claude_session.query_streaming``.
    GET  /health        — liveness probe.
 Run with pm2 (same config file as the install command in CLAUDE.md):
    pm2 start scripts/legal-chat-service.config.cjs
 Standalone for dev:
    cd ~/legal-ai/mcp-server
    .venv/bin/python -m legal_mcp.chat_service.server --port 8770
 We intentionally bind to 127.0.0.1 only — the FastAPI container reaches
 us via ``host.docker.internal``, and exposing the bridge publicly would
 let anyone run claude CLI commands against Daphna's session.
 """
 from __future__ import annotations
 import argparse
 import asyncio
 import json
 import logging
 import os
 import sys
 from typing import Any
 from aiohttp import web
 # Run-via-CLI bootstrap so ``python -m legal_mcp.chat_service.server``
 # works even when the package isn't installed (it is in the venv, but
 # this safeguard keeps the entrypoint robust).
 _pkg_root = os.path.dirname(os.path.dirname(os.path.dirname(__file__)))
 if _pkg_root not in sys.path:
    sys.path.insert(0, _pkg_root)
 from legal_mcp.services import claude_session  # noqa: E402
 logger = logging.getLogger("legal_chat_service")
 async def health(request: web.Request) -> web.Response:
    return web.json_response({"ok": True, "service": "legal-chat-service"})
 async def chat_start(request: web.Request) -> web.StreamResponse:
    """Drive ``claude_session.query_streaming`` and forward events as SSE.
    Request body (JSON):
        prompt: str                    — required, user message
        system: str | None             — system instructions (ignored if resuming)
        resume_session_id: str | None  — continue a prior CLI session
        timeout: int = 3600            — hard timeout for the subprocess
    """
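    try:
        body = await request.json()
    except json.JSONDecodeError:
        return web.json_response({"error": "invalid JSON body"}, status=400)
    # A valid JSON array or scalar has no .get(); reject it explicitly.
    if not isinstance(body, dict):
        return web.json_response({"error": "JSON body must be an object"}, status=400)
    prompt = body.get("prompt") or ""
    if not isinstance(prompt, str) or not prompt.strip():
        return web.json_response({"error": "prompt is required"}, status=400)
    system = body.get("system")
    resume_session_id = body.get("resume_session_id")
    # A non-numeric timeout should be a 400, not an unhandled ValueError.
    try:
        timeout = int(body.get("timeout") or 3600)
    except (TypeError, ValueError):
        return web.json_response({"error": "timeout must be an integer"}, status=400)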
    response = web.StreamResponse(
        status=200,
        reason="OK",
        headers={
            "Content-Type": "text/event-stream",
            "Cache-Control": "no-cache, no-transform",
            "Connection": "keep-alive",
            # X-Accel-Buffering=no defeats nginx/traefik buffering — the
            # FastAPI container proxies via httpx and forwards bytes as
            # they arrive, but the inner header is harmless and makes
            # browser-direct testing easier.
            "X-Accel-Buffering": "no",
        },
    )
    await response.prepare(request)
    async def send_event(payload: dict[str, Any]) -> None:
        line = f"data: {json.dumps(payload, ensure_ascii=False)}\n\n"
        await response.write(line.encode("utf-8"))
    try:
        async for event in claude_session.query_streaming(
            prompt,
            system=system,
            resume_session_id=resume_session_id,
            timeout=timeout,
        ):
            await send_event(event)
            if event.get("type") in ("done", "error"):
                break
    except asyncio.CancelledError:
        # Client disconnected — bail cleanly.
        logger.info("chat_start: client disconnected")
    except Exception as e:
        logger.exception("chat_start: streaming failed")
        try:
            await send_event({"type": "error", "message": str(e)})
        except ConnectionResetError:
            pass
    try:
        await response.write_eof()
    except ConnectionResetError:
        pass
    return response
 def build_app() -> web.Application:
    app = web.Application()
    app.router.add_get("/health", health)
    app.router.add_post("/chat/start", chat_start)
    return app
 def main() -> int:
    parser = argparse.ArgumentParser(description="legal-chat-service")
    parser.add_argument("--port", type=int, default=8770)
    parser.add_argument("--host", default="127.0.0.1",
                        help="bind address; 127.0.0.1 keeps the service "
                             "loopback-only — leave it alone in production")
    parser.add_argument("--log-level", default="INFO")
    args = parser.parse_args()
    logging.basicConfig(
        level=args.log_level.upper(),
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )
    app = build_app()
    web.run_app(app, host=args.host, port=args.port, print=lambda _msg: None)
    return 0
 if __name__ == "__main__":
    sys.exit(main())
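
The bridge above writes each event as a single `data: <json>` line followed by a blank line. A minimal sketch of how a consumer could reassemble those frames from arbitrary byte chunks, assuming only that framing. `iter_sse_events` is a hypothetical helper that is not part of this commit, and it ignores other SSE fields such as `event:` or `id:` because the service does not emit them.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(chunks: Iterable[bytes]) -> Iterator[dict]:
    """Yield decoded JSON payloads from ``data: ...`` frames.

    Chunks may split a frame at any byte offset, so bytes are buffered
    until the blank-line terminator (``\\n\\n``) is seen.
    """
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while b"\n\n" in buffer:
            frame, buffer = buffer.split(b"\n\n", 1)
            # Decode per complete frame so a multi-byte UTF-8 character
            # split across chunks is never decoded half-way.
            for line in frame.decode("utf-8").splitlines():
                if line.startswith("data: "):
                    yield json.loads(line[len("data: "):])


if __name__ == "__main__":
    sample = [b'data: {"type": "text_del', b'ta", "text": "hi"}\n\n']
    for event in iter_sse_events(sample):
        print(event)
```

Buffering on the frame terminator rather than on chunk boundaries matters here because `json.dumps(..., ensure_ascii=False)` emits raw Hebrew bytes, which a proxy is free to split mid-character.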
--- a/mcp-server/src/legal_mcp/server.py
+++ b/mcp-server/src/legal_mcp/server.py
@@ -57,6 +57,7 @@ from legal_mcp.tools import (  # noqa: E402
    legal_arguments as la_tools,
    missing_precedents as mp_tools,
    citations as cit_tools,
    training_enrichment as train_tools,
 )
@@ -248,6 +249,18 @@ async def precedent_extract_metadata(case_law_id: str) -> str:
    return await plib.precedent_extract_metadata(case_law_id)
@mcp.tool()
 async def style_corpus_enrich(corpus_id: str, overwrite: bool = False) -> str:
    """Extract metadata (summary, outcome, key_principles, appeal_subtype) for a decision in Daphna's style corpus. By default only empty fields are filled. Pass `overwrite=true` to refresh existing values."""
    return await train_tools.extract_decision_metadata(corpus_id, overwrite=overwrite)
@mcp.tool()
 async def style_corpus_pending_enrichment(limit: int = 50) -> str:
    """List decisions in the style corpus that are still missing summary/outcome/key_principles, i.e. candidates for extraction."""
    return await train_tools.list_corpus_pending_enrichment(limit)
@mcp.tool()
 async def precedent_process_pending(kind: str = "metadata", limit: int = 20) -> str:
    """Drain the queue of extraction requests sent from the UI. kind: 'metadata' or 'halacha'. Runs the extractor locally with the CLI on every queued item and clears the flag after success."""
--- a/mcp-server/src/legal_mcp/services/claude_session.py
+++ b/mcp-server/src/legal_mcp/services/claude_session.py
@@ -142,3 +142,175 @@ async def query_json(
    """
    raw = await query(prompt, timeout=timeout, system=system)
    return parse_llm_json(raw)
 # ── Streaming + session continuation ────────────────────────────────
 async def query_streaming(
    prompt: str,
    *,
    system: str | None = None,
    resume_session_id: str | None = None,
    timeout: int = LONG_TIMEOUT,
    cwd: str | None = None,
 ):
    """Stream Claude's response as an async iterator of events.
    Wraps `claude -p --output-format=stream-json` (newline-delimited JSON
    objects from the CLI) and translates each line into a small, stable
    shape that the chat service / SSE proxy can forward without leaking
    CLI internals to the browser.
    Event shapes yielded:
        {"type": "session_id",  "value": "<uuid>"}      # first event, used for resume
        {"type": "text_delta",  "text":  "<partial>"}   # incremental assistant text
        {"type": "tool_use",    "name": "...", "input": {...}}
        {"type": "error",       "message": "..."}
        {"type": "done",        "text": "<full response>"}
    The CLI emits a richer stream; we project to this minimal set so the
    front-end can stay stable across CLI upgrades.
    Args:
        prompt: The user message to send.
        system: Optional system instructions (used only when starting a
            fresh conversation — when resume_session_id is set, the
            session already carries its system prompt).
        resume_session_id: Continue a prior conversation. When given,
            we don't re-send the system prompt; the CLI loads the
            entire conversation history from disk.
        timeout: Hard ceiling on the subprocess.
        cwd: Working directory for the subprocess — defaults to the
            host's HOME so claude.ai credentials resolve correctly.
    """
    if resume_session_id:
        # When resuming, system is already baked into the on-disk session
        # — sending it again would be a no-op at best and confuse the
        # conversation at worst.
        full_prompt = prompt
        cmd = [
            "claude", "-p",
            "--output-format", "stream-json",
            "--verbose",
            "--resume", resume_session_id,
        ]
    else:
        full_prompt = f"{system}\n\n{prompt}" if system else prompt
        cmd = [
            "claude", "-p",
            "--output-format", "stream-json",
            "--verbose",
        ]
    if len(full_prompt) > 200_000:
        logger.warning(
            "Streaming: large prompt (%d chars) — may hit CLI input limits",
            len(full_prompt),
        )
    try:
        proc = await asyncio.create_subprocess_exec(
            *cmd,
            stdin=asyncio.subprocess.PIPE,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.PIPE,
            cwd=cwd,
        )
    except FileNotFoundError:
        yield {
            "type": "error",
            "message": (
                "Claude CLI not found on host — legal-chat-service must "
                "run where the `claude` binary is installed (Daphna's host, "
                "not the legal-ai container)."
            ),
        }
        return
    assert proc.stdin is not None  # for type checkers
    assert proc.stdout is not None
    # Send the prompt and close stdin so the CLI knows the user message
    # is complete.
    try:
        proc.stdin.write(full_prompt.encode("utf-8"))
        await proc.stdin.drain()
        proc.stdin.close()
    except BrokenPipeError:
        # CLI exited before reading the prompt — drain stderr and bail.
        stderr_b = await proc.stderr.read() if proc.stderr else b""
        yield {
            "type": "error",
            "message": f"Claude CLI closed stdin early: {stderr_b.decode('utf-8', errors='replace')[:300]}",
        }
        return
    accumulated_text: list[str] = []
    session_id_emitted = False
    # get_running_loop() is the idiomatic call inside a coroutine.
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    try:
        while True:
            remaining = deadline - loop.time()
            if remaining <= 0:
                yield {"type": "error", "message": f"timed out after {timeout}s"}
                break
            try:
                line_b = await asyncio.wait_for(proc.stdout.readline(), timeout=remaining)
            except asyncio.TimeoutError:
                yield {"type": "error", "message": f"stream timed out after {timeout}s"}
                break
            if not line_b:
                break
            line = line_b.decode("utf-8", errors="replace").strip()
            if not line:
                continue
            try:
                event = json.loads(line)
            except json.JSONDecodeError:
                # Stray non-JSON line from CLI — surface a snippet for debug.
                logger.debug("non-JSON stream line: %s", line[:120])
                continue
            # The CLI's stream-json emits several event types. We only
            # care about the ones the chat service forwards.
            t = event.get("type")
            if not session_id_emitted:
                sid = event.get("session_id")
                if sid:
                    session_id_emitted = True
                    yield {"type": "session_id", "value": sid}
            if t == "assistant":
                # event["message"]["content"] is a list of blocks; we extract
                # text blocks and tool_use blocks.
                msg = event.get("message") or {}
                for block in msg.get("content") or []:
                    btype = block.get("type")
                    if btype == "text":
                        text = block.get("text") or ""
                        if text:
                            accumulated_text.append(text)
                            yield {"type": "text_delta", "text": text}
                    elif btype == "tool_use":
                        yield {
                            "type": "tool_use",
                            "name": block.get("name") or "",
                            "input": block.get("input") or {},
                        }
            elif t == "result":
                # Final synthesized result line from the CLI — we already
                # delivered the deltas, so just stop here.
                break
    finally:
        if proc.returncode is None:
            try:
                proc.kill()
            except ProcessLookupError:
                pass
        try:
            await proc.wait()
        except Exception:
            pass
    yield {"type": "done", "text": "".join(accumulated_text)}
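
The projection step in `query_streaming` can be isolated as a pure function, which makes it testable without spawning the CLI. A minimal sketch under that assumption: `project_line` is a hypothetical helper that is not part of this commit, and the input shape (`type`, `message.content[]`, with `text` and `tool_use` blocks) is taken from the code above rather than from the CLI's documentation. Session-id tracking and the final `done` event are stateful and deliberately left out.

```python
import json


def project_line(line: str) -> list[dict]:
    """Translate one stream-json line into zero or more minimal events."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return []  # stray non-JSON output is skipped, as in the original loop
    if not isinstance(event, dict) or event.get("type") != "assistant":
        return []
    projected: list[dict] = []
    message = event.get("message") or {}
    for block in message.get("content") or []:
        block_type = block.get("type")
        if block_type == "text" and block.get("text"):
            projected.append({"type": "text_delta", "text": block["text"]})
        elif block_type == "tool_use":
            projected.append({
                "type": "tool_use",
                "name": block.get("name") or "",
                "input": block.get("input") or {},
            })
    return projected
```

Keeping the mapping in one place also means a CLI upgrade that changes the raw stream only needs this function adjusted, which is the stability goal the docstring states.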
--- a/mcp-server/src/legal_mcp/services/db.py
+++ b/mcp-server/src/legal_mcp/services/db.py
@@ -194,6 +194,55 @@ ALTER TABLE style_corpus ADD COLUMN IF NOT EXISTS appeal_subtype TEXT DEFAULT ''
 -- Extend style_patterns with appeal_subtype for separate style analysis per appeal type
 ALTER TABLE style_patterns ADD COLUMN IF NOT EXISTS appeal_subtype TEXT DEFAULT '';
 -- decision_lessons: per-decision learnings the chair / curator / style_analyzer
 -- attaches to a corpus row. The generic legal-decision-lessons.md file stays
 -- as the source of truth for cross-corpus patterns; this table stores the
 -- granular "what we learned from THIS decision" notes that drive the writer's
 -- future drafts and let the curator look up prior observations on the same row.
 CREATE TABLE IF NOT EXISTS decision_lessons (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    style_corpus_id UUID NOT NULL REFERENCES style_corpus(id) ON DELETE CASCADE,
    lesson_text TEXT NOT NULL,
    category TEXT DEFAULT 'general',           -- style / structure / lexicon / tabular / general
    source TEXT DEFAULT 'manual',              -- manual / curator / chair / style_analyzer
    applied_to_skill BOOLEAN DEFAULT false,    -- has this been promoted into SKILL.md?
    created_by TEXT DEFAULT 'chaim',
    created_at TIMESTAMPTZ DEFAULT now(),
    updated_at TIMESTAMPTZ DEFAULT now()
 );
 CREATE INDEX IF NOT EXISTS idx_decision_lessons_corpus ON decision_lessons(style_corpus_id);
 CREATE INDEX IF NOT EXISTS idx_decision_lessons_applied ON decision_lessons(applied_to_skill);
 -- chat_conversations / chat_messages: persistent history for the
 -- "שיחה עם הסוכן" tab on /training. Each conversation can optionally be
 -- scoped to a single style_corpus row (when the chair starts a chat
 -- "about decision X"). claude_session_id is the value the local claude
 -- CLI returns in stream-json — we pass it back via `--resume` on the
 -- next message so the model continues the same conversation without
 -- re-loading the system prompt every time.
 CREATE TABLE IF NOT EXISTS chat_conversations (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    title TEXT NOT NULL DEFAULT 'שיחה חדשה',
    style_corpus_id UUID REFERENCES style_corpus(id) ON DELETE SET NULL,
    claude_session_id TEXT,
    system_prompt_version TEXT DEFAULT 'v1',
    created_at TIMESTAMPTZ DEFAULT now(),
    last_message_at TIMESTAMPTZ DEFAULT now()
 );
 CREATE TABLE IF NOT EXISTS chat_messages (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
    conversation_id UUID NOT NULL REFERENCES chat_conversations(id) ON DELETE CASCADE,
    role TEXT NOT NULL,                -- 'user' | 'assistant'
    content TEXT NOT NULL,
    raw_events JSONB DEFAULT '[]',     -- stream-json events for the assistant turn (optional, for debug)
    created_at TIMESTAMPTZ DEFAULT now()
 );
 CREATE INDEX IF NOT EXISTS idx_chat_messages_conv ON chat_messages(conversation_id, created_at);
 CREATE INDEX IF NOT EXISTS idx_chat_conv_corpus ON chat_conversations(style_corpus_id);
 CREATE INDEX IF NOT EXISTS idx_chat_conv_last ON chat_conversations(last_message_at DESC);
 -- qa_results table
 CREATE TABLE IF NOT EXISTS qa_results (
    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
@@ -1609,6 +1658,284 @@ async def delete_from_style_corpus(corpus_id: UUID) -> dict:
    }
 async def get_style_corpus_row(corpus_id: UUID) -> dict | None:
    """Return a single style_corpus row by id, or None if missing."""
    pool = await get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow(
            """
            SELECT id, document_id, decision_number, decision_date,
                   subject_categories, full_text, summary, outcome,
                   key_principles, practice_area, appeal_subtype, created_at
            FROM style_corpus WHERE id = $1
            """,
            corpus_id,
        )
    return dict(row) if row else None
 async def update_style_corpus_metadata(
    corpus_id: UUID,
    *,
    summary: str | None = None,
    outcome: str | None = None,
    key_principles: list[str] | None = None,
    appeal_subtype: str | None = None,
    practice_area: str | None = None,
    overwrite: bool = False,
 ) -> dict:
    """Patch the enriched-metadata columns of a style_corpus row.
    By default, only empty columns are filled — passing ``overwrite=True``
    is the caller's signal that they intentionally want to replace existing
    values (used by the re-extract flow when the chair runs it manually).
    """
    pool = await get_pool()
    async with pool.acquire() as conn:
        existing = await conn.fetchrow(
            "SELECT summary, outcome, key_principles, appeal_subtype, practice_area "
            "FROM style_corpus WHERE id = $1",
            corpus_id,
        )
        if not existing:
            return {"updated": False, "reason": "not found"}
        sets: dict = {}
        if summary is not None and (overwrite or not (existing["summary"] or "").strip()):
            sets["summary"] = summary
        if outcome is not None and (overwrite or not (existing["outcome"] or "").strip()):
            sets["outcome"] = outcome
        if key_principles is not None:
            current = existing["key_principles"]
            if isinstance(current, str):
                try:
                    current = json.loads(current)
                except json.JSONDecodeError:
                    current = []
            if overwrite or not (current or []):
                sets["key_principles"] = json.dumps(key_principles)
        if appeal_subtype is not None and (overwrite or not (existing["appeal_subtype"] or "").strip()):
            sets["appeal_subtype"] = appeal_subtype
        if practice_area is not None and (overwrite or not (existing["practice_area"] or "").strip()):
            sets["practice_area"] = practice_area
        if not sets:
            return {"updated": False, "reason": "nothing to update", "fields": []}
        cols = list(sets.keys())
        set_clause = ", ".join(f"{c} = ${i + 2}" for i, c in enumerate(cols))
        values = [sets[c] for c in cols]
        await conn.execute(
            f"UPDATE style_corpus SET {set_clause} WHERE id = $1",
            corpus_id, *values,
        )
        return {"updated": True, "fields": cols}
 # ── decision_lessons (per-corpus row notes) ────────────────────────
 async def list_decision_lessons(corpus_id: UUID) -> list[dict]:
    pool = await get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            "SELECT id, style_corpus_id, lesson_text, category, source, "
            "       applied_to_skill, created_by, created_at, updated_at "
            "FROM decision_lessons WHERE style_corpus_id = $1 "
            "ORDER BY created_at DESC",
            corpus_id,
        )
    return [dict(r) for r in rows]
 async def add_decision_lesson(
    corpus_id: UUID,
    *,
    lesson_text: str,
    category: str = "general",
    source: str = "manual",
    created_by: str = "chaim",
 ) -> dict:
    pool = await get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow(
            "INSERT INTO decision_lessons "
            "(style_corpus_id, lesson_text, category, source, created_by) "
            "VALUES ($1, $2, $3, $4, $5) "
            "RETURNING id, style_corpus_id, lesson_text, category, source, "
            "          applied_to_skill, created_by, created_at, updated_at",
            corpus_id, lesson_text, category, source, created_by,
        )
    return dict(row) if row else {}
 async def update_decision_lesson(
    lesson_id: UUID,
    *,
    lesson_text: str | None = None,
    category: str | None = None,
    applied_to_skill: bool | None = None,
 ) -> dict:
    sets: dict = {}
    if lesson_text is not None:
        sets["lesson_text"] = lesson_text
    if category is not None:
        sets["category"] = category
    if applied_to_skill is not None:
        sets["applied_to_skill"] = applied_to_skill
    if not sets:
        return {"updated": False, "reason": "nothing to update"}
    # updated_at is always bumped server-side, so it is appended as a
    # literal instead of being passed as a bind parameter.
    cols = list(sets)
    set_clause = ", ".join(f"{c} = ${i + 2}" for i, c in enumerate(cols))
    set_clause += ", updated_at = now()"
    values = [sets[c] for c in cols]
    pool = await get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow(
            f"UPDATE decision_lessons SET {set_clause} WHERE id = $1 "
            f"RETURNING id, style_corpus_id, lesson_text, category, source, "
            f"          applied_to_skill, updated_at",
            lesson_id, *values,
        )
    if not row:
        return {"updated": False, "reason": "not found"}
    return {"updated": True, **dict(row)}
 async def delete_decision_lesson(lesson_id: UUID) -> dict:
    pool = await get_pool()
    async with pool.acquire() as conn:
        result = await conn.execute(
            "DELETE FROM decision_lessons WHERE id = $1", lesson_id,
        )
    # asyncpg returns "DELETE n"
    deleted = result.split(" ", 1)[1].strip() if " " in result else "0"
    return {"deleted": deleted != "0"}
 async def count_decision_lessons_per_corpus() -> dict[str, int]:
    """Map style_corpus.id (str) → lesson count, for badge display in the list."""
    pool = await get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            "SELECT style_corpus_id, count(*) AS n "
            "FROM decision_lessons GROUP BY style_corpus_id"
        )
    return {str(r["style_corpus_id"]): r["n"] for r in rows}
 # ── chat (style agent conversations) ───────────────────────────────
 async def create_chat_conversation(
    *,
    title: str = "שיחה חדשה",
    style_corpus_id: UUID | None = None,
    system_prompt_version: str = "v1",
 ) -> dict:
    pool = await get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow(
            "INSERT INTO chat_conversations "
            "(title, style_corpus_id, system_prompt_version) "
            "VALUES ($1, $2, $3) "
            "RETURNING id, title, style_corpus_id, claude_session_id, "
            "          system_prompt_version, created_at, last_message_at",
            title, style_corpus_id, system_prompt_version,
        )
    return dict(row) if row else {}
 async def list_chat_conversations(limit: int = 50) -> list[dict]:
    pool = await get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            """
            SELECT c.id, c.title, c.style_corpus_id, c.claude_session_id,
                   c.created_at, c.last_message_at,
                   sc.decision_number,
                   (SELECT count(*) FROM chat_messages m WHERE m.conversation_id = c.id) AS message_count
            FROM chat_conversations c
            LEFT JOIN style_corpus sc ON sc.id = c.style_corpus_id
            ORDER BY c.last_message_at DESC NULLS LAST
            LIMIT $1
            """,
            limit,
        )
    return [dict(r) for r in rows]
 async def get_chat_conversation(conv_id: UUID) -> dict | None:
    pool = await get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow(
            "SELECT id, title, style_corpus_id, claude_session_id, "
            "       system_prompt_version, created_at, last_message_at "
            "FROM chat_conversations WHERE id = $1",
            conv_id,
        )
    return dict(row) if row else None
 async def delete_chat_conversation(conv_id: UUID) -> dict:
    pool = await get_pool()
    async with pool.acquire() as conn:
        result = await conn.execute(
            "DELETE FROM chat_conversations WHERE id = $1", conv_id,
        )
    deleted = result.split(" ", 1)[1].strip() if " " in result else "0"
    return {"deleted": deleted != "0"}
 async def update_chat_conversation_session_id(
    conv_id: UUID, claude_session_id: str,
 ) -> None:
    pool = await get_pool()
    async with pool.acquire() as conn:
        await conn.execute(
            "UPDATE chat_conversations SET claude_session_id = $1, "
            "       last_message_at = now() "
            "WHERE id = $2",
            claude_session_id, conv_id,
        )
 async def add_chat_message(
    conv_id: UUID,
    *,
    role: str,
    content: str,
    raw_events: list | None = None,
 ) -> dict:
    pool = await get_pool()
    async with pool.acquire() as conn:
        # Insert the message and bump last_message_at in one transaction
        # so the conversation list never sorts on a stale timestamp.
        async with conn.transaction():
            row = await conn.fetchrow(
                "INSERT INTO chat_messages "
                "(conversation_id, role, content, raw_events) "
                "VALUES ($1, $2, $3, $4) "
                "RETURNING id, conversation_id, role, content, created_at",
                conv_id, role, content, json.dumps(raw_events or []),
            )
            await conn.execute(
                "UPDATE chat_conversations SET last_message_at = now() WHERE id = $1",
                conv_id,
            )
    return dict(row) if row else {}
 async def list_chat_messages(conv_id: UUID) -> list[dict]:
    pool = await get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            "SELECT id, role, content, created_at "
            "FROM chat_messages WHERE conversation_id = $1 "
            "ORDER BY created_at ASC",
            conv_id,
        )
    return [dict(r) for r in rows]
 async def get_style_patterns(pattern_type: str | None = None) -> list[dict]:
    pool = await get_pool()
    async with pool.acquire() as conn:
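
The "fill only empty columns unless `overwrite`" policy and the `$n` placeholder numbering used by the update helpers in this file can be sketched without a database. This is a simplified illustration, not the code in the diff: `plan_update` is a hypothetical name, and it treats any falsy or whitespace-only value as empty, whereas the real function additionally JSON-decodes `key_principles` before checking it.

```python
def plan_update(existing: dict, proposed: dict, overwrite: bool = False) -> tuple[str, list]:
    """Return (set_clause, values) for the columns that should be written.

    Column names are interpolated into SQL, so callers must only pass keys
    from a fixed allow-list, never user-supplied names.
    """
    cols: list[str] = []
    for col, new_value in proposed.items():
        if new_value is None:
            continue  # None means "caller did not supply this field"
        current = existing.get(col)
        is_empty = not current.strip() if isinstance(current, str) else not current
        if overwrite or is_empty:
            cols.append(col)
    # $1 is reserved for the row id, so column placeholders start at $2.
    set_clause = ", ".join(f"{c} = ${i + 2}" for i, c in enumerate(cols))
    return set_clause, [proposed[c] for c in cols]


if __name__ == "__main__":
    row = {"summary": "kept by the chair", "outcome": "", "key_principles": []}
    print(plan_update(row, {"summary": "new", "outcome": "accepted"}))
```

With the default `overwrite=False`, a chair-edited `summary` survives a re-run of the extractor while the empty `outcome` is filled.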
--- a/mcp-server/src/legal_mcp/services/style_metadata_extractor.py
+++ b/mcp-server/src/legal_mcp/services/style_metadata_extractor.py
@@ -0,0 +1,195 @@
 """Auto-extract per-decision metadata for a style_corpus row.
 Populates the fields that the upload flow leaves empty — summary, outcome,
 key_principles, appeal_subtype, practice_area — by asking Claude (via the
 local CLI session) to read the proofread full_text and return a structured
 JSON blob.
 Caller policy (``apply_to_corpus``): by default we **only fill empty
 columns**, so chair-edited values are preserved across re-runs. The chair
 can force a refresh by passing ``overwrite=True``.
 Why this is a separate module from ``precedent_metadata_extractor``:
 that one fills the *external* case_law corpus (court rulings, third-party
 committee decisions). This one fills the *style* corpus — Daphna's own
 decisions used to teach the writer the in-house voice. The two corpora
 have different schemas, different prompts, and different downstream
 consumers, so coupling them would have been the wrong shortcut.
 """
 from __future__ import annotations
 import logging
 from uuid import UUID
 from legal_mcp.services import claude_session, db
 logger = logging.getLogger(__name__)
 # A single decision typically runs 200K-650K chars. We sample the head
 # (where outcome + parties + framing live) and the tail (where the
 # operative ruling sits). Sampling both edges caps the decision text at
 # 40K chars (_HEAD_CHARS + _TAIL_CHARS), which keeps the whole prompt
 # (instructions + context + window) under 60K chars.
 _HEAD_CHARS = 25_000
 _TAIL_CHARS = 15_000
 def _build_text_window(full_text: str) -> str:
    if len(full_text) <= _HEAD_CHARS + _TAIL_CHARS:
        return full_text
    head = full_text[:_HEAD_CHARS]
    tail = full_text[-_TAIL_CHARS:]
    return (
        f"{head}\n\n"
        f"[... חתך: {len(full_text) - _HEAD_CHARS - _TAIL_CHARS:,} תווים מהאמצע "
        f"הושמטו — שמרנו על ההתחלה (טענות + רקע) ועל הסוף (הכרעה + הוצאות) ...]"
        f"\n\n{tail}"
    )
 # Static instructions — go via ``system`` so the SDK path can cache them
 # across batch enrichment runs (24+ decisions in one pass).
 METADATA_PROMPT = """אתה מסייע משפטי שמקטלג את הקורפוס הסגנוני של דפנה תמיר (יו"ר ועדת ערר).
 תפקידך: לקרוא החלטה אחת ולחלץ מטא-דאטה ל-style_corpus — שדות שהמשתמש לא הזין בעת ההעלאה.
 **אל תמציא**. אם המידע לא מופיע בטקסט, השאר מחרוזת ריקה או מערך ריק. אסור להסיק עובדות שלא כתובות.
 ## פלט נדרש
 החזר JSON אחד (object אחד — לא array, לא markdown, לא הסברים):
 {
  "summary": "תקציר עניני ב-2-3 משפטים: מי העורר, מה דרש, מה הוכרע. סגנון יבש, ניטרלי, ללא שיפוט. דוגמה: 'ערר על דחיית בקשה להיתר לתוספת מרפסת בקומה ג׳. דפנה קיבלה את הערר חלקית — אישרה את המרפסת בהקטנה ל-12 מ״ר.'",
  "outcome": "התוצאה התמציתית. אחד מאלה (או צירוף קצר): 'קבלה' / 'קבלה חלקית' / 'דחייה' / 'הסתלקות' / 'החזרה לוועדה המקומית'. אם זה לא ברור — מחרוזת ריקה.",
  "key_principles": [
    "עיקרון משפטי 1 שעולה מההחלטה — משפט אחד, ניסוח מופשט. למשל 'שיקול דעת מוגבל לחריגות בנייה קטנות'.",
    "עיקרון 2",
    "..."
  ],
  "appeal_subtype": "תת-סוג ערר. ערכים מותרים: 'building_permit' (היתר בנייה / רישוי), 'betterment_levy' (היטל השבחה), 'compensation_197' (פיצויים ס׳ 197), 'use_change' (שימוש חורג), 'tama_38' (תמ\\"א 38), או מחרוזת ריקה אם לא ברור.",
  "practice_area": "תחום משפט גנרי. ברירת מחדל: 'appeals_committee'. אם זה במובהק 'planning_law' — סמן.",
  "parties_appellant": "שם העורר/ים המרכזיים בהחלטה (אחד או כמה, מופרדים בפסיק). אם זו החלטה מאוחדת — שם הצד המוביל. השאר ריק אם לא ניתן לזהות במדויק.",
  "parties_respondent": "שם המשיב/ים. ברירת מחדל לעררי 1xxx ו-8xxx: 'הוועדה המקומית לתכנון ובניה ירושלים' או דומה. השאר ריק אם לא ברור."
 }
 ## כללי איכות
 1. **summary** — חייב להזכיר את התוצאה. בלי 'בית המשפט קבע ש...' (אנחנו לא בית משפט). בלי הערכה אישית.
 2. **outcome** — קבלה / קבלה חלקית / דחייה / הסתלקות / החזרה לוועדה המקומית. אם דפנה הכריעה חלקית — 'קבלה חלקית'. אסור 'התקבל' או 'נדחה' בלשון פעולה — רק שם פעולה.
 3. **key_principles** — 2-5 עקרונות מקסימום. כל אחד משפט אחד. לא ציטוטים מילוליים, אלא תמצות העיקרון.
 4. **appeal_subtype** — תמיד ערך אחד. אם החלטה מערבת כמה תת-סוגים — בחר את העיקרי.
 5. **parties_appellant / parties_respondent** — שם בלבד, בלי 'נ׳' או 'נגד'.
 החזר רק את ה-JSON. אל תכתוב שום דבר לפניו או אחריו.
 """
 async def extract_decision_metadata(corpus_id: UUID | str) -> dict:
    """Run Claude over the row's full_text and return suggested fields.
    Does NOT touch the DB. The caller decides what to apply.
    """
    if isinstance(corpus_id, str):
        corpus_id = UUID(corpus_id)
    row = await db.get_style_corpus_row(corpus_id)
    if not row:
        return {}
    full_text = (row.get("full_text") or "").strip()
    if not full_text:
        return {}
    context = (
        f"מספר החלטה: {row.get('decision_number') or '—'}\n"
        f"תאריך: {row.get('decision_date') or '—'}\n"
        f"תת-סוג נוכחי: {row.get('appeal_subtype') or '—'}\n"
        f"נושאים מתויגים: {row.get('subject_categories') or '—'}"
    )
    window = _build_text_window(full_text)
    user_msg = (
        f"## הקלט\n{context}\n\n"
        f"--- תחילת ההחלטה ---\n{window}\n--- סוף ההחלטה ---"
    )
    try:
        result = await claude_session.query_json(user_msg, system=METADATA_PROMPT)
    except Exception as e:
        logger.warning("style_metadata_extractor: query failed: %s", e)
        return {}
    if not isinstance(result, dict):
        logger.warning(
            "style_metadata_extractor: expected JSON object, got %s",
            type(result).__name__,
        )
        return {}
    out: dict = {}
    if isinstance(result.get("summary"), str):
        out["summary"] = result["summary"].strip()
    if isinstance(result.get("outcome"), str):
        out["outcome"] = result["outcome"].strip()
    kp = result.get("key_principles") or []
    if isinstance(kp, list):
        out["key_principles"] = [str(p).strip() for p in kp if str(p).strip()]
    if isinstance(result.get("appeal_subtype"), str):
        st = result["appeal_subtype"].strip()
        # Open enum — but log values outside the documented list so we can
        # tighten the prompt later if needed.
        known = {
            "building_permit", "betterment_levy", "compensation_197",
            "use_change", "tama_38", "",
        }
        if st not in known:
            logger.info("style_metadata: unknown appeal_subtype=%r (kept)", st)
        out["appeal_subtype"] = st
    if isinstance(result.get("practice_area"), str):
        out["practice_area"] = result["practice_area"].strip()
    # Parties: not stored in the schema today, but worth surfacing in the
    # extractor's return value so callers (and the UI's drawer) can display
    # them. The list endpoint extracts via regex; LLM output is the
    # higher-quality fallback when regex fails.
    if isinstance(result.get("parties_appellant"), str):
        out["parties_appellant"] = result["parties_appellant"].strip()
    if isinstance(result.get("parties_respondent"), str):
        out["parties_respondent"] = result["parties_respondent"].strip()
    return out
 async def extract_and_apply(
    corpus_id: UUID | str, *, overwrite: bool = False,
 ) -> dict:
    """Convenience: extract → apply → return summary of what changed.
    Idempotent under default ``overwrite=False`` — re-runs only fill empty
    fields. Use ``overwrite=True`` to refresh values the chair (or a prior
    extraction) already wrote.
    """
    if isinstance(corpus_id, str):
        corpus_id = UUID(corpus_id)
    suggested = await extract_decision_metadata(corpus_id)
    if not suggested:
        return {"extracted": False, "applied": False, "reason": "no suggestion"}
    update_result = await db.update_style_corpus_metadata(
        corpus_id,
        summary=suggested.get("summary"),
        outcome=suggested.get("outcome"),
        key_principles=suggested.get("key_principles"),
        appeal_subtype=suggested.get("appeal_subtype"),
        practice_area=suggested.get("practice_area"),
        overwrite=overwrite,
    )
    return {
        "extracted": True,
        "applied": update_result.get("updated", False),
        "fields_set": update_result.get("fields", []),
        "suggested": suggested,
    }
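The fill-empty policy described in the docstrings above can be sketched in isolation. This is only an illustration under stated assumptions: the body of `db.update_style_corpus_metadata` is not part of this diff, so `merge_fill_empty`, `_is_empty`, and the `(merged, fields_set)` return shape are hypothetical names chosen for the example, and the rule that an empty suggestion never replaces a stored value is an assumption rather than confirmed behavior.

```python
from __future__ import annotations

from typing import Any


def _is_empty(value: Any) -> bool:
    """Treat None, blank strings, and empty lists as 'not filled in yet'."""
    if value is None:
        return True
    if isinstance(value, str):
        return value.strip() == ""
    if isinstance(value, list):
        return len(value) == 0
    return False


def merge_fill_empty(
    current: dict[str, Any],
    suggested: dict[str, Any],
    *,
    overwrite: bool = False,
) -> tuple[dict[str, Any], list[str]]:
    """Return (merged_row, fields_set) without mutating the inputs.

    Hypothetical helper: an empty suggestion never replaces anything, and a
    non-empty stored value is kept unless overwrite=True.
    """
    merged = dict(current)
    fields_set: list[str] = []
    for field, new_value in suggested.items():
        if _is_empty(new_value):
            continue  # nothing useful to write
        if not overwrite and not _is_empty(current.get(field)):
            continue  # preserve the chair's (or an earlier run's) value
        if current.get(field) != new_value:
            merged[field] = new_value
            fields_set.append(field)
    return merged, fields_set


row = {"summary": "chair-edited summary", "outcome": "", "key_principles": []}
llm = {"summary": "LLM summary", "outcome": "דחייה", "key_principles": ["p1"]}

merged, changed = merge_fill_empty(row, llm)
print(changed)  # ['outcome', 'key_principles']
```

Running the merge a second time on `merged` sets no fields, which is the sense in which the default mode is idempotent; passing `overwrite=True` would also replace the chair-edited summary.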
--- a/mcp-server/src/legal_mcp/tools/training_enrichment.py
+++ b/mcp-server/src/legal_mcp/tools/training_enrichment.py
@@ -0,0 +1,85 @@
 """MCP tool wrappers for the style_corpus metadata-enrichment flow.
 The actual extractor lives in
 ``legal_mcp.services.style_metadata_extractor``; this module just exposes
 it as MCP tools that the chair (or a future automation) can call from
 Claude Code.
 Why these tools matter: the upload pipeline (`/api/training/upload` →
 `_process_proofread_training`) inserts a style_corpus row with
 ``summary=''``, ``outcome=''``, ``key_principles=[]`` because LLM
 extraction can't run from the FastAPI container (no claude CLI there).
 This module fills that gap — call it from the host, where ``claude``
 CLI is available, and the row gets enriched.
 """
 from __future__ import annotations
 import json
 from uuid import UUID
 from legal_mcp.services import db, style_metadata_extractor
 def _ok(payload) -> str:
    return json.dumps({"ok": True, **payload}, ensure_ascii=False, default=str)
 def _err(msg: str) -> str:
    return json.dumps({"ok": False, "error": msg}, ensure_ascii=False)
 async def extract_decision_metadata(corpus_id: str, overwrite: bool = False) -> str:
    """Extract metadata (summary, outcome, key_principles, appeal_subtype) for a style-corpus decision.
    With the default ``overwrite=False`` only empty fields are filled. Pass
    ``overwrite=true`` to refresh values that were already written.
    """
    try:
        cid = UUID(corpus_id)
    except ValueError:
        return _err("corpus_id לא תקין")
    try:
        result = await style_metadata_extractor.extract_and_apply(cid, overwrite=overwrite)
    except Exception as e:
        return _err(str(e))
    return _ok(result)
 async def list_corpus_pending_enrichment(limit: int = 50) -> str:
    """List style_corpus rows that lack summary/outcome/key_principles (enrichment candidates)."""
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            """
            SELECT id, decision_number, decision_date,
                   length(full_text) AS chars,
                   coalesce(summary, '') = '' AS missing_summary,
                   coalesce(outcome, '') = '' AS missing_outcome,
                   coalesce(jsonb_array_length(key_principles), 0) = 0 AS missing_principles
            FROM style_corpus
            WHERE coalesce(summary, '') = ''
               OR coalesce(outcome, '') = ''
               OR coalesce(jsonb_array_length(key_principles), 0) = 0
            ORDER BY decision_date NULLS LAST
            LIMIT $1
            """,
            limit,
        )
    items = [
        {
            "corpus_id": str(r["id"]),
            "decision_number": r["decision_number"] or "",
            "decision_date": str(r["decision_date"]) if r["decision_date"] else "",
            "chars": r["chars"],
            "missing": [
                f for f, v in (
                    ("summary", r["missing_summary"]),
                    ("outcome", r["missing_outcome"]),
                    ("key_principles", r["missing_principles"]),
                ) if v
            ],
        }
        for r in rows
    ]
    return _ok({"count": len(items), "items": items})
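The SQL predicate in `list_corpus_pending_enrichment` can be mirrored in plain Python, which may be handy for unit-testing callers without a database. This is a rough sketch, assuming rows arrive as dictionaries with `key_principles` already decoded to a list; `missing_fields` and `ENRICHABLE_FIELDS` are hypothetical names that do not exist in the module.

```python
from __future__ import annotations

from typing import Any, Mapping

ENRICHABLE_FIELDS = ("summary", "outcome", "key_principles")


def missing_fields(row: Mapping[str, Any]) -> list[str]:
    """Names of enrichable fields that are NULL or empty on this row.

    Mirrors the SQL: coalesce(text, '') = '' for the text columns and
    coalesce(jsonb_array_length(...), 0) = 0 for key_principles.
    """
    missing = []
    for field in ENRICHABLE_FIELDS:
        value = row.get(field)
        if value is None or len(value) == 0:
            missing.append(field)
    return missing


rows = [
    {"id": "a", "summary": "s", "outcome": "דחייה", "key_principles": ["p"]},
    {"id": "b", "summary": "", "outcome": None, "key_principles": ["p"]},
    {"id": "c", "summary": "s", "outcome": "קבלה", "key_principles": []},
]
pending = {r["id"]: missing_fields(r) for r in rows if missing_fields(r)}
print(pending)
```

As in the SQL, a whitespace-only string counts as populated here; only NULL and the empty string (or an empty array) mark a field as missing.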
--- a/scripts/SCRIPTS.md
+++ b/scripts/SCRIPTS.md
@@ -35,6 +35,7 @@
 | `compute_ndcg.py` | python | חישוב nDCG@10 על `search_relevance_feedback` (TaskMaster #50, Stage C). aggregation לפי `search_type` ולפי שבוע, כולל top-cited case_law ו-coverage %. דגלים: `--k 10`, `--weeks 12`, `--pretty`. read-only, פלט JSON. משמש גם את `GET /api/admin/rag-metrics` (מיובא inline) — שינוי חתימה ב-`compute()` ישבור את ה-endpoint | ידני / cron עתידי לדיווח שבועי |
 | `backfill_multimodal_precedents.py` | python | Backfill voyage-multimodal-3 page embeddings על רשומות `case_law` (external_upload + internal_committee) שחסרות `precedent_image_embeddings`. בונה אינדקס קבצים מ-`data/precedent-library/` ו-`data/internal-decisions/`, מנסה התאמה לפי tokens של מספרי תיק (כולל parts-match לפורמטים שונים של Nevo doc-id). מדלג על רשומות בלי קובץ-מקור או עם MD בלבד (PyMuPDF לא מרנדר MD). תומך `--dry-run` (default) / `--apply` / `--only external_upload\|internal_committee` / `--limit N`. רץ בקונטיינר (יש `/data` + Voyage env). **הופעל 2026-05-26**: 70 חסרים → 26 backfilled (503 pages, ~$0.21 voyage tokens), 44 אין-קובץ-מקור. ניתן להריץ שוב אחרי שיועלו עוד PDF/DOCX לספרייה | ידני |
 | `monitor_halacha_quality.py` | python | מנטר איכות חילוץ הלכות. בודק drift של `avg(confidence)` בין baseline היסטורי לחלון אחרון. מחזיר JSON מטריקות + alert ב-stderr אם drift > threshold (ברירת מחדל 5%). 2 סדרות: trusted (approved+published) ו-all_extracted. תומך `--window N` / `--threshold X` / `--min-sample N` / `--silent` / `--exit-on-alert`. רץ ב-container או מקומית עם `mcp-server/.venv` (אין תלות ב-LLM, רק SQL). **תזמון מומלץ**: `0 8 * * 1` (יום ראשון 08:00, שבועי) | `0 8 * * 1` (לתזמן) |
 | `audit_training_corpus.py` | python | audit של `style_corpus` — לכל החלטה: שדות מטא-דאטה מאוכלסים (`summary`/`outcome`/`key_principles`/`appeal_subtype`/`subject_categories`), קישור ל-`documents` (FK + chunks + embeddings). מפיק `data/audit/corpus-YYYY-MM-DD.json` + summary בקונסול. דרוש `POSTGRES_URL` או POSTGRES_*. אין תלויות חיצוניות מלבד asyncpg. **רץ מהמכונה המקומית** (לא קונטיינר) — חיבור ישיר ל-Postgres :5433 | ידני / קדם-עבודה לפני enrichment של מטא-דאטה |
 ## תיקיית `.archive/` — סקריפטים שהושלמו
--- a/scripts/audit_training_corpus.py
+++ b/scripts/audit_training_corpus.py
@@ -0,0 +1,196 @@
 #!/usr/bin/env python
 """Audit the style_corpus table — list each decision with what's populated and what's missing.
 Produces a JSON report at data/audit/corpus-YYYY-MM-DD.json so we can see at a glance
 which corpus entries lack summary/outcome/key_principles/appeal_subtype/chunks/embeddings.
 Run with the mcp-server venv (has asyncpg):
    POSTGRES_URL=postgres://... ./mcp-server/.venv/bin/python scripts/audit_training_corpus.py
 Without POSTGRES_URL, falls back to the per-field env vars used by web/mcp-server config.
 """
 from __future__ import annotations
 import asyncio
 import json
 import os
 import re
 import sys
 from datetime import UTC, date, datetime
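 from pathlib import Path
 from urllib.parse import quote
 import asyncpg
 def _build_dsn() -> str:
    if url := os.environ.get("POSTGRES_URL"):
        return url
    # Percent-encode the credentials so characters such as "@", ":" or "/"
    # in the password cannot corrupt the assembled DSN.
    user = quote(os.environ.get("POSTGRES_USER", "legal_ai"), safe="")
    password = quote(os.environ.get("POSTGRES_PASSWORD", ""), safe="")
    return (
        f"postgres://{user}:{password}@"
        f"{os.environ.get('POSTGRES_HOST', '127.0.0.1')}:"
        f"{os.environ.get('POSTGRES_PORT', '5433')}/"
        f"{os.environ.get('POSTGRES_DB', 'legal_ai')}"
    )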
 async def audit() -> dict:
    dsn = _build_dsn()
    conn = await asyncpg.connect(dsn)
    try:
        rows = await conn.fetch(
            """
            SELECT id, decision_number, decision_date, subject_categories,
                   length(full_text)     AS chars,
                   summary,
                   outcome,
                   key_principles,
                   practice_area,
                   appeal_subtype,
                   document_id,
                   created_at
            FROM style_corpus
            ORDER BY decision_date NULLS LAST, decision_number
            """
        )
        # Chunk + embedding counts for each related document — by direct FK first,
        # then by title-match for legacy rows where style_corpus.document_id is NULL.
        chunk_counts = await conn.fetch(
            """
            SELECT d.id AS doc_id, d.title,
                   count(c.id)                                AS chunks,
                   count(c.embedding) FILTER (WHERE c.embedding IS NOT NULL) AS chunks_with_emb
            FROM documents d
            LEFT JOIN document_chunks c ON c.document_id = d.id
            WHERE d.title LIKE '[קורפוס]%' OR d.id IN (SELECT document_id FROM style_corpus WHERE document_id IS NOT NULL)
            GROUP BY d.id, d.title
            """
        )
    finally:
        await conn.close()
    by_doc_id = {r["doc_id"]: r for r in chunk_counts}
    # Index corpus documents by every digit cluster in their title so we can
    # match against style_corpus.decision_number regardless of formatting
    # (e.g. style_corpus has "1109-25" but title may say "ARAR-25-1109" or
    # "ערר 1109-25"). Each digit run >=3 chars becomes a key; when two titles
    # share a run, the first one indexed wins (setdefault).
    by_digit: dict[str, dict] = {}
    for r in chunk_counts:
        title = r["title"] or ""
        for tok in re.findall(r"\d{3,}", title):
            by_digit.setdefault(tok, r)
    decisions = []
    gaps_total = {
        "summary": 0, "outcome": 0, "key_principles": 0,
        "appeal_subtype": 0, "subject_categories": 0,
        "chunks": 0, "embeddings": 0, "document_id": 0,
    }
    for row in rows:
        cats = row["subject_categories"]
        if isinstance(cats, str):
            try:
                cats = json.loads(cats)
            except json.JSONDecodeError:
                cats = []
        cats = cats or []
        kp = row["key_principles"]
        if isinstance(kp, str):
            try:
                kp = json.loads(kp)
            except json.JSONDecodeError:
                kp = []
        kp = kp or []
        # Resolve chunks: prefer FK, fall back to digit-cluster match on decision_number.
        chunks = 0
        chunks_with_emb = 0
        if row["document_id"] and row["document_id"] in by_doc_id:
            r = by_doc_id[row["document_id"]]
            chunks = r["chunks"]
            chunks_with_emb = r["chunks_with_emb"]
        elif row["decision_number"]:
            for tok in re.findall(r"\d{3,}", row["decision_number"]):
                if tok in by_digit:
                    r = by_digit[tok]
                    chunks = r["chunks"]
                    chunks_with_emb = r["chunks_with_emb"]
                    break
        missing = []
        if not row["summary"]:
            missing.append("summary")
            gaps_total["summary"] += 1
        if not row["outcome"]:
            missing.append("outcome")
            gaps_total["outcome"] += 1
        if not kp:
            missing.append("key_principles")
            gaps_total["key_principles"] += 1
        if not row["appeal_subtype"]:
            missing.append("appeal_subtype")
            gaps_total["appeal_subtype"] += 1
        if not cats:
            missing.append("subject_categories")
            gaps_total["subject_categories"] += 1
        if chunks == 0:
            missing.append("chunks")
            gaps_total["chunks"] += 1
        elif chunks_with_emb < chunks:
            missing.append(f"embeddings({chunks_with_emb}/{chunks})")
            gaps_total["embeddings"] += 1
        if row["document_id"] is None:
            missing.append("document_id")
            gaps_total["document_id"] += 1
        decisions.append({
            "id": str(row["id"]),
            "decision_number": row["decision_number"] or "",
            "decision_date": row["decision_date"].isoformat() if row["decision_date"] else None,
            "chars": row["chars"],
            "subject_categories": cats,
            "practice_area": row["practice_area"] or "",
            "appeal_subtype": row["appeal_subtype"] or "",
            "summary_len": len(row["summary"] or ""),
            "outcome_len": len(row["outcome"] or ""),
            "key_principles_count": len(kp),
            "chunks": chunks,
            "chunks_with_embeddings": chunks_with_emb,
            "document_id": str(row["document_id"]) if row["document_id"] else None,
            "missing": missing,
            "created_at": row["created_at"].isoformat() if row["created_at"] else None,
        })
    return {
        "generated_at": datetime.now(UTC).isoformat(),
        "total_decisions": len(decisions),
        "gaps_total": gaps_total,
        "decisions": decisions,
    }
 async def main() -> int:
    report = await audit()
    out_dir = Path(__file__).resolve().parents[1] / "data" / "audit"
    out_dir.mkdir(parents=True, exist_ok=True)
    today = date.today().isoformat()
    out_file = out_dir / f"corpus-{today}.json"
    out_file.write_text(json.dumps(report, ensure_ascii=False, indent=2), encoding="utf-8")
    # Console summary
    print(f"Total decisions: {report['total_decisions']}")
    print("Gaps by field (count of decisions missing it):")
    for field, n in report["gaps_total"].items():
        bar = "█" * min(n, 60)
        print(f"  {field:25s} {n:3d}  {bar}")
    print(f"\nReport written to {out_file}")
    return 0
 if __name__ == "__main__":
    sys.exit(asyncio.run(main()))
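The digit-cluster fallback used by the audit script is easier to reason about in a stripped-down form. The sketch below reuses the same `\d{3,}` pattern but works on plain title strings instead of database rows; `index_titles_by_digits` and `resolve_title` are hypothetical helper names, and the sample titles are made up for illustration.

```python
from __future__ import annotations

import re

_DIGIT_RUN = re.compile(r"\d{3,}")


def index_titles_by_digits(titles: list[str]) -> dict[str, str]:
    """Map each digit run of 3+ characters to the first title containing it."""
    index: dict[str, str] = {}
    for title in titles:
        for token in _DIGIT_RUN.findall(title):
            index.setdefault(token, title)  # first title wins on collisions
    return index


def resolve_title(decision_number: str, index: dict[str, str]) -> str | None:
    """Return the indexed title sharing a digit run with decision_number."""
    for token in _DIGIT_RUN.findall(decision_number):
        if token in index:
            return index[token]
    return None


titles = ["[קורפוס] ARAR-25-1109", "[קורפוס] ערר 8042-24"]
index = index_titles_by_digits(titles)
print(resolve_title("1109-25", index))  # matches via the "1109" run
print(resolve_title("77-25", index))    # no run of 3+ digits, so None
```

Two-digit year suffixes such as "25" never become keys, which avoids matching every decision from the same year. Two different decisions that share a longer digit run would still collide, so the match is best treated as a heuristic for legacy rows.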
--- a/scripts/legal-chat-service.config.cjs
+++ b/scripts/legal-chat-service.config.cjs
@@ -0,0 +1,48 @@
 /**
 * pm2 ecosystem entry for legal-chat-service — the host-side SSE bridge
 * to ``claude`` CLI that powers the /training chat tab.
 *
 * Why pm2:
 *   - Auto-restart if the process dies (claude CLI subprocess failures
 *     should never leave the service in a half-dead state).
 *   - Log rotation matches paperclip's behavior so the chair sees
 *     consistent log paths under ~/.pm2/logs/.
 *
 * Install (once):
 *     pm2 start /home/chaim/legal-ai/scripts/legal-chat-service.config.cjs
 *     pm2 save
 *
 * Smoke test:
 *     curl http://127.0.0.1:8770/health
 *     # → {"ok":true,"service":"legal-chat-service"}
 *
 * Update:
 *     pm2 restart legal-chat-service
 *
 * Stop:
 *     pm2 stop legal-chat-service
 */
 module.exports = {
  apps: [
    {
      name: "legal-chat-service",
      cwd: "/home/chaim/legal-ai/mcp-server",
      // Run the in-package server via the venv interpreter so all
      // imports (claude_session, etc) resolve.
      script: "/home/chaim/legal-ai/mcp-server/.venv/bin/python",
      args: "-m legal_mcp.chat_service.server --port 8770",
      // claude CLI looks up credentials under HOME, so make sure it sees
      // chaim's session, not an empty container HOME.
      env: {
        HOME: "/home/chaim",
        PATH: "/home/chaim/.local/bin:/usr/local/bin:/usr/bin:/bin",
        PYTHONUNBUFFERED: "1",
      },
      restart_delay: 5000,
      max_restarts: 10,
      autorestart: true,
      max_memory_restart: "500M",
    },
  ],
 };
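The curl smoke test in the header comment can also be scripted, for example from a deploy check. This is a minimal sketch that assumes only what the comment states: the service listens on port 8770 and `/health` returns a JSON object with `"ok": true`. The names `is_healthy_body` and `check_health` are hypothetical and not part of the repository.

```python
from __future__ import annotations

import json
import urllib.error
import urllib.request

HEALTH_URL = "http://127.0.0.1:8770/health"  # port taken from the pm2 args


def is_healthy_body(body: str) -> bool:
    """True if the body is a JSON object whose "ok" field is exactly true."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("ok") is True


def check_health(url: str = HEALTH_URL, timeout: float = 2.0) -> bool:
    """Return False (rather than raising) when the service is down or slow."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8")
            return resp.status == 200 and is_healthy_body(body)
    except (urllib.error.URLError, OSError, ValueError):
        return False


print(is_healthy_body('{"ok":true,"service":"legal-chat-service"}'))
```

Keeping the body parsing separate from the network call lets the parsing be tested without a running service; `check_health()` itself needs the pm2 process to be up.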
--- a/web-ui/src/app/training/page.tsx
+++ b/web-ui/src/app/training/page.tsx
@@ -1,18 +1,27 @@
 "use client";
 import { useState } from "react";
 import Link from "next/link";
 import { Upload } from "lucide-react";
 import { AppShell } from "@/components/app-shell";
 import { Button } from "@/components/ui/button";
 import { Card, CardContent } from "@/components/ui/card";
 import { Tabs, TabsContent, TabsList, TabsTrigger } from "@/components/ui/tabs";
 import { StyleReportPanel } from "@/components/training/style-report-panel";
 import { CorpusPanel } from "@/components/training/corpus-panel";
 import { ComparePanel } from "@/components/training/compare-panel";
 import { CuratorPortraitPanel } from "@/components/training/curator-portrait-panel";
 import { ChatPanel } from "@/components/training/chat-panel";
 import { TrainingUploadDialog } from "@/components/training/upload-dialog";
 export default function TrainingPage() {
  const [uploadOpen, setUploadOpen] = useState(false);
  return (
    <AppShell>
      <section className="space-y-6">
-        <header>
+        <header className="flex items-start justify-between gap-4 flex-wrap">
          <div>
            <nav className="text-[0.78rem] text-ink-muted mb-1">
              <Link href="/" className="hover:text-gold-deep">בית</Link>
              <span aria-hidden> · </span>
@@ -23,8 +32,18 @@ export default function TrainingPage() {
              לוח בקרה של קורפוס האימון — סטטיסטיקות, אנטומיית החלטה ממוצעת,
              ביטויי חתימה, וכלי השוואה בין שתי החלטות.
            </p>
          </div>
          <Button
            onClick={() => setUploadOpen(true)}
            className="bg-navy text-parchment hover:bg-navy-soft shrink-0"
          >
            <Upload className="w-4 h-4 me-1" />
            העלה החלטה
          </Button>
        </header>
        <TrainingUploadDialog open={uploadOpen} onOpenChange={setUploadOpen} />
        <div className="h-[2px] bg-gradient-to-l from-transparent via-gold to-transparent" />
        <Card className="bg-surface border-rule shadow-sm">
@@ -34,6 +53,8 @@ export default function TrainingPage() {
                <TabsTrigger value="report">פורטרט סגנון</TabsTrigger>
                <TabsTrigger value="corpus">קורפוס</TabsTrigger>
                <TabsTrigger value="compare">השוואה</TabsTrigger>
                <TabsTrigger value="curator">הסוכן</TabsTrigger>
                <TabsTrigger value="chat">שיחה</TabsTrigger>
              </TabsList>
              <TabsContent value="report" className="mt-5">
@@ -47,6 +68,14 @@ export default function TrainingPage() {
              <TabsContent value="compare" className="mt-5">
                <ComparePanel />
              </TabsContent>
              <TabsContent value="curator" className="mt-5">
                <CuratorPortraitPanel />
              </TabsContent>
              <TabsContent value="chat" className="mt-5">
                <ChatPanel />
              </TabsContent>
            </Tabs>
          </CardContent>
        </Card>
--- a/web-ui/src/components/training/chat-panel.tsx
+++ b/web-ui/src/components/training/chat-panel.tsx
@@ -0,0 +1,434 @@
 "use client";
 /*
 * Style-agent chat panel — the new "שיחה" tab on /training.
 *
 * Layout: two columns.
 *   - Sidebar: list of conversations + "+ שיחה חדשה" button
 *   - Main: thread of messages + composer with SSE streaming
 *
 * Each message is persisted to the legal-ai DB; the LLM call goes
 * out via FastAPI → host's legal-chat-service → claude CLI. There
 * is no API cost: the claude CLI uses chaim's claude.ai
 * subscription via the host's auth.
 *
 * Health gate: if /api/training/chat/health reports the host service
 * is unreachable, the composer is replaced by a setup notice telling
 * the chair to start the pm2 service.
 */
 import { useEffect, useRef, useState } from "react";
 import {
  Send, Plus, Trash2, Loader2, MessageSquare, Sparkles, AlertTriangle,
 } from "lucide-react";
 import { toast } from "sonner";
 import { Card, CardContent } from "@/components/ui/card";
 import { Button } from "@/components/ui/button";
 import { Textarea } from "@/components/ui/textarea";
 import { ScrollArea } from "@/components/ui/scroll-area";
 import { Badge } from "@/components/ui/badge";
 import { Skeleton } from "@/components/ui/skeleton";
 import {
  Select, SelectContent, SelectItem, SelectTrigger, SelectValue,
 } from "@/components/ui/select";
 import {
  chatKeys,
  useChatConversation,
  useChatConversations,
  useChatHealth,
  useCorpus,
  useCreateChat,
  useDeleteChat,
  type ChatMessage,
 } from "@/lib/api/training";
 import { useQueryClient } from "@tanstack/react-query";
 export function ChatPanel() {
  const [activeId, setActiveId] = useState<string | null>(null);
  const health = useChatHealth();
  return (
    <div className="grid gap-4 lg:grid-cols-[280px_1fr]">
      <ConversationsSidebar activeId={activeId} onSelect={setActiveId} />
      <div className="space-y-3">
        {health.data && !health.data.reachable && (
          <ChatServiceWarning health={health.data} />
        )}
        {activeId ? (
          <ChatThread convId={activeId} />
        ) : (
          <Card className="bg-rule-soft/40 border-rule">
            <CardContent className="px-6 py-10 text-center text-ink-muted text-sm space-y-2">
              <MessageSquare className="w-8 h-8 mx-auto opacity-50" />
              <p>בחר שיחה קיימת או פתח חדשה כדי להתחיל לדבר עם סוכן הסגנון.</p>
              <p className="text-[0.78rem]">
                הסוכן רץ על claude CLI מקומי דרך legal-chat-service. אין עלות API.
              </p>
            </CardContent>
          </Card>
        )}
      </div>
    </div>
  );
 }
 // ── Sidebar: list + new ────────────────────────────────────────────
 function ConversationsSidebar({
  activeId, onSelect,
 }: {
  activeId: string | null;
  onSelect: (id: string | null) => void;
 }) {
  const { data: convs, isPending } = useChatConversations();
  const { data: corpus } = useCorpus();
  const create = useCreateChat();
  const del = useDeleteChat();
  const [creating, setCreating] = useState(false);
  const [newTitle, setNewTitle] = useState("");
  const [newCorpusId, setNewCorpusId] = useState<string>("__none__");
  const onCreate = async () => {
    try {
      const conv = await create.mutateAsync({
        title: newTitle.trim() || "שיחה חדשה",
        style_corpus_id: newCorpusId === "__none__" ? null : newCorpusId,
      });
      onSelect(conv.id);
      setCreating(false);
      setNewTitle("");
      setNewCorpusId("__none__");
    } catch (e) {
      toast.error(e instanceof Error ? e.message : "כשל ביצירת שיחה");
    }
  };
  const onDelete = async (id: string) => {
    if (!window.confirm("למחוק את השיחה? פעולה זו לא ניתנת לביטול.")) return;
    try {
      await del.mutateAsync(id);
      if (activeId === id) onSelect(null);
      toast.success("השיחה נמחקה");
    } catch (e) {
      toast.error(e instanceof Error ? e.message : "כשל במחיקה");
    }
  };
  return (
    <Card className="bg-surface border-rule">
      <CardContent className="px-3 py-3 space-y-2">
        {!creating ? (
          <Button
            onClick={() => setCreating(true)}
            className="w-full bg-navy text-parchment hover:bg-navy-soft"
            size="sm"
          >
            <Plus className="w-4 h-4 me-1" />
            שיחה חדשה
          </Button>
        ) : (
          <div className="space-y-2 border border-rule rounded p-2 bg-rule-soft/30">
            <Textarea
              value={newTitle}
              onChange={(e) => setNewTitle(e.target.value)}
              placeholder="כותרת לשיחה (אופציונלי)"
              rows={2} dir="rtl"
            />
            <Select value={newCorpusId} onValueChange={setNewCorpusId} dir="rtl">
              <SelectTrigger>
                <SelectValue placeholder="צמד להחלטה (אופציונלי)" />
              </SelectTrigger>
              <SelectContent className="max-h-[300px]">
                <SelectItem value="__none__">— שיחה כללית —</SelectItem>
                {corpus?.map((c) => (
                  <SelectItem key={c.id} value={c.id}>
                    {c.decision_number || "—"}
                    {c.decision_date ? ` · ${c.decision_date}` : ""}
                  </SelectItem>
                ))}
              </SelectContent>
            </Select>
            <div className="flex gap-1 justify-end">
              <Button variant="ghost" size="sm"
                onClick={() => { setCreating(false); setNewTitle(""); setNewCorpusId("__none__"); }}>
                ביטול
              </Button>
              <Button size="sm" onClick={onCreate} disabled={create.isPending}
                className="bg-navy text-parchment hover:bg-navy-soft">
                צור
              </Button>
            </div>
          </div>
        )}
        <ScrollArea className="h-[520px]">
          <ul className="space-y-1">
            {isPending && (
              <>
                <li><Skeleton className="h-12 w-full" /></li>
                <li><Skeleton className="h-12 w-full" /></li>
              </>
            )}
            {convs?.length === 0 && (
              <li className="text-center text-ink-muted text-[0.78rem] py-6">
                אין עדיין שיחות
              </li>
            )}
            {convs?.map((c) => {
              const active = c.id === activeId;
              return (
                <li key={c.id}>
                  {/* A <button> may not contain another <button>, so the row is a
                      focusable div and the delete control stays a real button. */}
                  <div
                    role="button"
                    tabIndex={0}
                    onClick={() => onSelect(c.id)}
                    onKeyDown={(e) => {
                      // Ignore key presses that bubble up from the delete button.
                      if (e.target !== e.currentTarget) return;
                      if (e.key === "Enter" || e.key === " ") {
                        e.preventDefault();
                        onSelect(c.id);
                      }
                    }}
                    className={
                      "w-full text-end rounded-md px-2 py-2 transition cursor-pointer " +
                      (active
                        ? "bg-gold-wash border border-gold/40"
                        : "hover:bg-rule-soft/60 border border-transparent")
                    }
                  >
                    <div className="text-sm text-navy font-semibold truncate">
                      {c.title}
                    </div>
                    <div className="flex items-center gap-1 text-[0.7rem] text-ink-muted">
                      {c.decision_number && (
                        <Badge variant="outline"
                          className="text-[0.65rem] bg-info-bg text-info border-info/40">
                          {c.decision_number}
                        </Badge>
                      )}
                      <span className="tabular-nums">{c.message_count}</span>
                      <MessageSquare className="w-3 h-3" />
                      <span className="grow text-end">
                        {new Date(c.last_message_at).toLocaleDateString("he-IL")}
                      </span>
                      <button
                        type="button"
                        onClick={(e) => { e.stopPropagation(); onDelete(c.id); }}
                        className="hover:text-danger"
                        aria-label="מחק שיחה"
                      >
                        <Trash2 className="w-3 h-3" />
                      </button>
                    </div>
                  </div>
                </li>
              );
            })}
          </ul>
        </ScrollArea>
      </CardContent>
    </Card>
  );
 }
 // ── Thread + composer ──────────────────────────────────────────────
 function ChatThread({ convId }: { convId: string }) {
  const { data, isPending } = useChatConversation(convId);
  const qc = useQueryClient();
  const [draft, setDraft] = useState("");
  const [streaming, setStreaming] = useState(false);
  const [streamingText, setStreamingText] = useState("");
  const [streamError, setStreamError] = useState("");
  const scrollRef = useRef<HTMLDivElement | null>(null);
  /* Auto-scroll to bottom when new messages arrive. */
  useEffect(() => {
    const el = scrollRef.current;
    if (!el) return;
    el.scrollTo({ top: el.scrollHeight, behavior: "smooth" });
  }, [data?.messages.length, streamingText]);
  const onSend = async () => {
    const text = draft.trim();
    if (!text || streaming) return;
    setDraft("");
    setStreaming(true);
    setStreamingText("");
    setStreamError("");
    try {
      const res = await fetch(
        `/api/training/chat/conversations/${encodeURIComponent(convId)}/messages`,
        {
          method: "POST",
          headers: { "Content-Type": "application/json" },
          body: JSON.stringify({ content: text }),
        },
      );
      if (!res.ok || !res.body) {
        const body = await res.text();
        throw new Error(`HTTP ${res.status}: ${body.slice(0, 200)}`);
      }
      // Parse the SSE stream event by event; each event ends with a blank
      // line ("\n\n"). EventSource would be cleaner but it doesn't support
      // POST bodies, and the manual reader is small.
      const reader = res.body.getReader();
      const decoder = new TextDecoder();
      let buffer = "";
      let accumulated = "";
      while (true) {
        const { value, done } = await reader.read();
        if (done) break;
        buffer += decoder.decode(value, { stream: true });
        let nl: number;
        while ((nl = buffer.indexOf("\n\n")) !== -1) {
          const event = buffer.slice(0, nl);
          buffer = buffer.slice(nl + 2);
          if (!event.startsWith("data: ")) continue;
          try {
            const payload = JSON.parse(event.slice("data: ".length));
            if (payload.type === "text_delta" && payload.text) {
              accumulated += payload.text;
              setStreamingText(accumulated);
            } else if (payload.type === "error") {
              setStreamError(String(payload.message || "שגיאה לא ידועה"));
            } else if (payload.type === "done") {
              if (payload.text && !accumulated) {
                accumulated = payload.text;
                setStreamingText(accumulated);
              }
            }
          } catch {
            /* ignore non-JSON */
          }
        }
      }
    } catch (e) {
      setStreamError(e instanceof Error ? e.message : "שגיאה בשיחה");
    } finally {
      setStreaming(false);
      setStreamingText("");
      // Refetch the conversation so the persisted assistant turn shows up.
      qc.invalidateQueries({ queryKey: chatKeys.conversation(convId) });
      qc.invalidateQueries({ queryKey: chatKeys.conversations() });
    }
  };
  if (isPending) return <Skeleton className="h-[560px] w-full" />;
  if (!data) return null;
  return (
    <Card className="bg-surface border-rule">
      <CardContent className="px-4 py-3 space-y-3">
        <header className="flex items-center gap-2 border-b border-rule pb-2">
          <Sparkles className="w-4 h-4 text-gold-deep" />
          <h3 className="text-navy font-semibold grow">{data.conversation.title}</h3>
          {data.conversation.decision_number && (
            <Badge variant="outline" className="bg-info-bg text-info border-info/40">
              {data.conversation.decision_number}
            </Badge>
          )}
        </header>
        <div ref={scrollRef} className="h-[440px] overflow-y-auto space-y-3 pe-1">
          {data.messages.length === 0 && !streaming && (
            <p className="text-center text-ink-muted text-sm py-8">
              התחל בשאלה — למשל: &quot;מה מאפיין את הפתיחות של דפנה בעררי 1xxx?&quot;
            </p>
          )}
          {data.messages.map((m) => <MessageBubble key={m.id} message={m} />)}
          {streaming && (
            <MessageBubble
              message={{
                id: "streaming",
                role: "assistant",
                content: streamingText || "(מקליד…)",
                created_at: "",
              }}
              isStreaming
            />
          )}
          {streamError && (
            <div className="rounded-lg border border-danger/40 bg-danger-bg p-3 text-danger text-sm">
              {streamError}
            </div>
          )}
        </div>
        <div className="border-t border-rule pt-3 space-y-2">
          <Textarea
            value={draft}
            onChange={(e) => setDraft(e.target.value)}
            placeholder="שאל את הסוכן… (Shift+Enter לשורה חדשה)"
            rows={3} dir="rtl"
            disabled={streaming}
            onKeyDown={(e) => {
              if (e.key === "Enter" && !e.shiftKey) {
                e.preventDefault();
                void onSend();
              }
            }}
          />
          <div className="flex items-center gap-2">
            <p className="text-[0.72rem] text-ink-muted grow">
              {data.conversation.claude_session_id
                ? "שיחה ממשיכה (--resume) — אין צורך לטעון מחדש את ה-system prompt"
                : "שיחה חדשה — system prompt ייטען (שני מסמכי ייחוס + רשימת קורפוס)"}
            </p>
            <Button onClick={onSend} disabled={streaming || !draft.trim()}
              className="bg-navy text-parchment hover:bg-navy-soft">
              {streaming ? (
                <Loader2 className="w-4 h-4 animate-spin me-1" />
              ) : (
                <Send className="w-4 h-4 me-1" />
              )}
              שלח
            </Button>
          </div>
        </div>
      </CardContent>
    </Card>
  );
 }
 function MessageBubble({
  message, isStreaming = false,
 }: { message: ChatMessage; isStreaming?: boolean }) {
  const isUser = message.role === "user";
  return (
    <div className={isUser ? "flex justify-start" : "flex justify-end"}>
      <div
        className={
          "max-w-[85%] rounded-lg px-3 py-2 text-sm leading-relaxed whitespace-pre-wrap " +
          (isUser
            ? "bg-gold-wash text-ink border border-gold/40"
            : "bg-rule-soft text-ink border border-rule")
        }
        dir="rtl"
      >
        {message.content}
        {isStreaming && (
          <span className="inline-block w-1.5 h-3.5 bg-navy/60 align-middle ms-1 animate-pulse" />
        )}
      </div>
    </div>
  );
 }
 // ── Service-down warning ──────────────────────────────────────────
 function ChatServiceWarning({
  health,
 }: { health: { reachable: boolean; url: string; error?: string } }) {
  return (
    <Card className="bg-danger-bg border-danger/40">
      <CardContent className="px-4 py-3 space-y-1">
        <div className="flex items-center gap-2 text-danger">
          <AlertTriangle className="w-4 h-4" />
          <strong>שירות הצ&apos;אט אינו זמין</strong>
        </div>
        <p className="text-[0.78rem] text-danger">
          לא ניתן להגיע ל-legal-chat-service בכתובת
          <code className="px-1 mx-1 bg-rule-soft rounded">{health.url}</code>.
          {health.error && (<> פירוט: <code className="px-1 bg-rule-soft rounded">{health.error}</code></>)}
        </p>
        <p className="text-[0.72rem] text-ink-muted">
          על המכונה המקומית הפעל:&nbsp;
          <code className="px-1 bg-rule-soft rounded">
            pm2 start /home/chaim/legal-ai/scripts/legal-chat-service.config.cjs
          </code>
        </p>
      </CardContent>
    </Card>
  );
 }
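The manual SSE reader in `ChatThread` is easier to reason about, and to unit-test, when the framing logic lives outside the component. The sketch below shows one way to do that. It assumes the wire format the component already expects: `data: <json>` frames separated by a blank line, with `type` values `text_delta`, `error` and `done`. `SseJsonParser`, `ChatEvent` and `applyEvents` are hypothetical names introduced for illustration and are not part of this patch.

```typescript
// Hypothetical helper, not part of the patch: an incremental parser for the
// "data: {...}\n\n" frames that ChatThread reads by hand.
type ChatEvent =
  | { type: "text_delta"; text: string }
  | { type: "error"; message?: string }
  | { type: "done"; text?: string };

class SseJsonParser {
  private buffer = "";

  // Feed one decoded chunk; returns every complete event that chunk closed.
  push(chunk: string): ChatEvent[] {
    this.buffer += chunk;
    const events: ChatEvent[] = [];
    let nl: number;
    while ((nl = this.buffer.indexOf("\n\n")) !== -1) {
      const frame = this.buffer.slice(0, nl);
      this.buffer = this.buffer.slice(nl + 2);
      if (!frame.startsWith("data: ")) continue; // skip anything that is not a data frame
      try {
        events.push(JSON.parse(frame.slice("data: ".length)) as ChatEvent);
      } catch {
        // Non-JSON payloads are skipped, as in the component.
      }
    }
    return events;
  }
}

// Fold events into the text shown in the streaming bubble.
function applyEvents(current: string, events: ChatEvent[]): string {
  let text = current;
  for (const ev of events) {
    if (ev.type === "text_delta" && ev.text) text += ev.text;
    else if (ev.type === "done" && ev.text && !text) text = ev.text;
  }
  return text;
}

// Usage: a frame split across two network chunks is still parsed once.
const parser = new SseJsonParser();
const first = parser.push('data: {"type":"text_delta","te');
const second = parser.push('xt":"shalom"}\n\ndata: {"type":"done"}\n\n');
const rendered = applyEvents("", [...first, ...second]);
```

Keeping the buffer inside the parser means a frame that straddles two `reader.read()` chunks is emitted exactly once, which is the property the inline loop relies on.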
--- a/web-ui/src/components/training/corpus-detail-drawer.tsx
+++ b/web-ui/src/components/training/corpus-detail-drawer.tsx
@@ -0,0 +1,402 @@
 "use client";
 /*
 * Side-drawer for inspecting + editing a single style_corpus entry.
 *
 * Tabs:
 *   - "פרטים" (details): show + edit the enriched metadata (decision_number,
 *     date, subjects, summary, outcome, key_principles, appeal_subtype,
 *     practice_area). Saving issues a PATCH /api/training/corpus/{id} with
 *     only the changed fields and invalidates the list.
 *   - "תוכן" (content): read-only full_text view, fetched on demand when the
 *     tab opens. We never let the chair edit full_text from the UI;
 *     corrections happen by re-uploading via the Upload dialog.
 *   - "מה למדנו" (lessons): per-decision lessons, rendered by LessonsTab.
 *   - "דפוסים" (patterns): style_patterns for the whole corpus for now;
 *     scoping by appeal_subtype is not implemented yet.
 *
 * Why a Sheet, not a Dialog: the drawer needs to coexist with the corpus
 * table so the chair can scan multiple decisions without losing context.
 * Sheet (side: "left", the trailing edge in this RTL layout) gives that
 * without stealing the entire viewport.
 */
 import { useEffect, useState } from "react";
 import { Save, FileText, Tag, Calendar, BookOpen, Loader2 } from "lucide-react";
 import { toast } from "sonner";
 import {
  Sheet, SheetContent, SheetHeader, SheetTitle, SheetDescription,
 } from "@/components/ui/sheet";
 import { Tabs, TabsContent, TabsList, TabsTrigger } from "@/components/ui/tabs";
 import { Card, CardContent } from "@/components/ui/card";
 import { Button } from "@/components/ui/button";
 import { Input } from "@/components/ui/input";
 import { Label } from "@/components/ui/label";
 import { Textarea } from "@/components/ui/textarea";
 import { Badge } from "@/components/ui/badge";
 import { ScrollArea } from "@/components/ui/scroll-area";
 import {
  usePatchCorpus,
  type CorpusDecision,
  type CorpusDecisionPatch,
 } from "@/lib/api/training";
 import { LessonsTab } from "./lessons-tab";
 type Props = {
  decision: CorpusDecision | null;
  onOpenChange: (open: boolean) => void;
 };
 export function CorpusDetailDrawer({ decision, onOpenChange }: Props) {
  // Local editable state for the "details" tab. Re-seeds whenever the
  // selected decision changes so the form reflects the row the chair
  // clicked.
  const [draft, setDraft] = useState<CorpusDecisionPatch>({});
  const patch = usePatchCorpus();
  /* eslint-disable react-hooks/set-state-in-effect */
  useEffect(() => {
    if (!decision) {
      setDraft({});
      return;
    }
    setDraft({
      decision_number: decision.decision_number,
      decision_date: decision.decision_date,
      subject_categories: decision.subject_categories,
      summary: decision.summary,
      outcome: decision.outcome,
      key_principles: decision.key_principles,
      appeal_subtype: decision.appeal_subtype,
      practice_area: decision.practice_area,
    });
  }, [decision]);
  /* eslint-enable react-hooks/set-state-in-effect */
  const open = decision !== null;
  if (!decision) return null;
  // Diff against the originally loaded row — only PATCH fields the chair
  // actually changed, so concurrent edits to other fields stay intact.
  const diff: CorpusDecisionPatch = {};
  if (draft.decision_number !== decision.decision_number)
    diff.decision_number = draft.decision_number;
  if (draft.decision_date !== decision.decision_date)
    diff.decision_date = draft.decision_date;
  if (draft.summary !== decision.summary)
    diff.summary = draft.summary;
  if (draft.outcome !== decision.outcome)
    diff.outcome = draft.outcome;
  if (draft.appeal_subtype !== decision.appeal_subtype)
    diff.appeal_subtype = draft.appeal_subtype;
  if (draft.practice_area !== decision.practice_area)
    diff.practice_area = draft.practice_area;
  if (
    JSON.stringify(draft.subject_categories) !==
    JSON.stringify(decision.subject_categories)
  )
    diff.subject_categories = draft.subject_categories;
  if (
    JSON.stringify(draft.key_principles) !==
    JSON.stringify(decision.key_principles)
  )
    diff.key_principles = draft.key_principles;
  const isDirty = Object.keys(diff).length > 0;
  const onSave = async () => {
    if (!isDirty) return;
    try {
      await patch.mutateAsync({ id: decision.id, patch: diff });
      toast.success("המטא-דאטה עודכן");
    } catch (e) {
      toast.error(e instanceof Error ? e.message : "כשל בשמירה");
    }
  };
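  // NOTE: both list inputs are controlled by the parsed array, so a trailing
  // separator or space is trimmed away on the next render. Pasting a full
  // list works; typing items one by one needs a separate raw-text state.
  const setSubjects = (raw: string) =>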
    setDraft((d) => ({
      ...d,
      subject_categories: raw.split(/[,،]/).map((s) => s.trim()).filter(Boolean),
    }));
  const setPrinciples = (raw: string) =>
    setDraft((d) => ({
      ...d,
      key_principles: raw.split("\n").map((s) => s.trim()).filter(Boolean),
    }));
  return (
    <Sheet open={open} onOpenChange={onOpenChange}>
      <SheetContent side="left" className="w-full sm:max-w-3xl overflow-y-auto" dir="rtl">
        <SheetHeader>
          <SheetTitle className="text-navy flex items-center gap-2">
            <BookOpen className="w-4 h-4 shrink-0" />
            {decision.legal_citation || decision.decision_number || "—"}
          </SheetTitle>
          <SheetDescription className="text-ink-muted">
            {decision.doc_title || "החלטה בקורפוס הסגנוני"}
          </SheetDescription>
        </SheetHeader>
        {/* Summary strip — fast-scan info, always visible above the tabs. */}
        <div className="px-6 mt-3 grid grid-cols-2 md:grid-cols-4 gap-3 text-[0.78rem]">
          <DataPoint icon={<Calendar className="w-3 h-3" />} label="תאריך"
            value={decision.decision_date || "—"} />
          <DataPoint icon={<FileText className="w-3 h-3" />} label="תווים"
            value={`${(decision.chars / 1000).toFixed(1)}K`} />
          <DataPoint icon={<FileText className="w-3 h-3" />} label="עמודים"
            value={decision.page_count > 0 ? String(decision.page_count) : "—"} />
          <DataPoint icon={<Tag className="w-3 h-3" />} label="תת-סוג"
            value={decision.appeal_subtype || "—"} />
        </div>
        <div className="px-6 pb-6 mt-4">
          <Tabs defaultValue="details" dir="rtl">
            <TabsList className="bg-rule-soft/60">
              <TabsTrigger value="details">פרטים</TabsTrigger>
              <TabsTrigger value="content">תוכן</TabsTrigger>
              <TabsTrigger value="lessons">מה למדנו</TabsTrigger>
              <TabsTrigger value="patterns">דפוסים</TabsTrigger>
            </TabsList>
            {/* ── Tab: editable metadata ─────────────────────────── */}
            <TabsContent value="details" className="mt-4 space-y-4">
              <div className="grid grid-cols-2 gap-3">
                <Field label="מספר ההחלטה">
                  <Input value={draft.decision_number ?? ""}
                    onChange={(e) => setDraft((d) => ({ ...d, decision_number: e.target.value }))}
                    dir="rtl" />
                </Field>
                <Field label="תאריך">
                  <Input type="date" value={draft.decision_date ?? ""}
                    onChange={(e) => setDraft((d) => ({ ...d, decision_date: e.target.value }))} />
                </Field>
              </div>
              <Field label="נושאים (מופרדים בפסיקים)">
                <Input value={(draft.subject_categories ?? []).join(", ")}
                  onChange={(e) => setSubjects(e.target.value)} dir="rtl" />
                {decision.subject_categories.length > 0 && (
                  <div className="flex flex-wrap gap-1 mt-1">
                    {decision.subject_categories.map((s) => (
                      <Badge key={s} variant="outline"
                        className="text-[0.7rem] bg-gold-wash text-gold-deep border-gold/40">
                        {s}
                      </Badge>
                    ))}
                  </div>
                )}
              </Field>
              <div className="grid grid-cols-2 gap-3">
                <Field label="תת-סוג ערר">
                  <Input value={draft.appeal_subtype ?? ""}
                    onChange={(e) => setDraft((d) => ({ ...d, appeal_subtype: e.target.value }))}
                    placeholder="building_permit / betterment_levy / compensation_197"
                    dir="rtl" />
                </Field>
                <Field label="תחום משפט">
                  <Input value={draft.practice_area ?? ""}
                    onChange={(e) => setDraft((d) => ({ ...d, practice_area: e.target.value }))}
                    dir="rtl" />
                </Field>
              </div>
              <Field label="תקציר (summary)">
                <Textarea value={draft.summary ?? ""} rows={3}
                  onChange={(e) => setDraft((d) => ({ ...d, summary: e.target.value }))}
                  placeholder="תקציר חופשי — מי, מה, איך הוכרע"
                  dir="rtl" />
              </Field>
              <Field label="התוצאה (outcome)">
                <Textarea value={draft.outcome ?? ""} rows={2}
                  onChange={(e) => setDraft((d) => ({ ...d, outcome: e.target.value }))}
                  placeholder="קבלה / קבלה חלקית / דחייה — בקצרה"
                  dir="rtl" />
              </Field>
              <Field label="עקרונות מרכזיים (שורה לכל אחד)">
                <Textarea value={(draft.key_principles ?? []).join("\n")} rows={4}
                  onChange={(e) => setPrinciples(e.target.value)}
                  placeholder={"דוגמה:\nשיקול דעת מוגבל לחריגות קטנות\nריפוי פגם רק בנסיבות חריגות"}
                  dir="rtl" />
              </Field>
              {decision.parties.appellant && (
                <Card className="bg-rule-soft/40 border-rule">
                  <CardContent className="px-4 py-3 text-[0.78rem] text-ink-soft">
                    <p><strong className="text-navy">עורר/ת:</strong> {decision.parties.appellant}</p>
                    {decision.parties.respondent && (
                      <p className="mt-1"><strong className="text-navy">משיב/ה:</strong> {decision.parties.respondent}</p>
                    )}
                    <p className="mt-2 text-ink-muted text-[0.72rem]">
                      (חולץ אוטומטית מתחילת הטקסט — תקן ע&quot;י עריכת ה-full_text במקור.)
                    </p>
                  </CardContent>
                </Card>
              )}
              <div className="flex items-center justify-end gap-2 pt-2 border-t border-rule">
                <Button variant="ghost" onClick={() => onOpenChange(false)}>
                  סגור
                </Button>
                <Button onClick={onSave} disabled={!isDirty || patch.isPending}
                  className="bg-navy text-parchment hover:bg-navy-soft">
                  {patch.isPending ? (
                    <Loader2 className="w-4 h-4 animate-spin me-1" />
                  ) : (
                    <Save className="w-4 h-4 me-1" />
                  )}
                  שמור שינויים
                </Button>
              </div>
            </TabsContent>
            {/* ── Tab: full_text (read-only) ─────────────────────── */}
            <TabsContent value="content" className="mt-4">
              <Card className="bg-surface border-rule">
                <CardContent className="px-4 py-3">
                  <p className="text-[0.72rem] text-ink-muted mb-2">
                    {decision.chars.toLocaleString("he-IL")} תווים · קריאה בלבד
                  </p>
                  <ScrollArea className="h-[480px] pe-2">
                    <p className="text-sm text-ink leading-relaxed whitespace-pre-wrap">
                      <FullTextLazy id={decision.id} />
                    </p>
                  </ScrollArea>
                </CardContent>
              </Card>
            </TabsContent>
            {/* ── Tab: lessons (per-decision) ────────────────────── */}
            <TabsContent value="lessons" className="mt-4">
              <LessonsTab corpusId={decision.id} />
            </TabsContent>
            {/* ── Tab: patterns scoped by appeal_subtype ─────────── */}
            <TabsContent value="patterns" className="mt-4">
              <PatternsForSubtype subtype={decision.appeal_subtype} />
            </TabsContent>
          </Tabs>
        </div>
      </SheetContent>
    </Sheet>
  );
 }
 // ── helpers ────────────────────────────────────────────────────────
 function DataPoint({
  icon, label, value,
 }: { icon: React.ReactNode; label: string; value: string }) {
  return (
    <div className="flex items-center gap-1 text-ink-muted">
      {icon}
      <span>{label}:</span>
      <span className="font-semibold text-navy tabular-nums truncate">{value}</span>
    </div>
  );
 }
 function Field({
  label, children,
 }: { label: string; children: React.ReactNode }) {
  return (
    <div className="space-y-1">
      <Label className="text-[0.78rem]">{label}</Label>
      {children}
    </div>
  );
 }
 /* The corpus-list endpoint deliberately doesn't return full_text (too big),
 * so we fetch it on demand when the content tab mounts.
 *
 * Implementation note: there is no dedicated GET /api/training/corpus/{id}
 * endpoint yet. As a thin stopgap we call the planned `/full-text` route
 * with a plain fetch; if that route isn't deployed yet the UI just shows
 * the fallback message instead of crashing. The full-text endpoint lands
 * with the next backend deploy.
 */
 function FullTextLazy({ id }: { id: string }) {
  const [text, setText] = useState<string>("");
  const [loading, setLoading] = useState(true);
  const [error, setError] = useState("");
  /* eslint-disable react-hooks/set-state-in-effect */
  useEffect(() => {
    let cancelled = false;
    setLoading(true);
    setError("");
    fetch(`/api/training/corpus/${encodeURIComponent(id)}/full-text`)
      .then((r) => (r.ok ? r.json() : Promise.reject(new Error(`HTTP ${r.status}`))))
      .then((d: { full_text: string }) => {
        if (cancelled) return;
        setText(d.full_text || "");
      })
      .catch((e: Error) => {
        if (cancelled) return;
        setError(e.message);
      })
      .finally(() => !cancelled && setLoading(false));
    return () => { cancelled = true; };
  }, [id]);
  /* eslint-enable react-hooks/set-state-in-effect */
  if (loading) return <span className="text-ink-muted">טוען…</span>;
  if (error) return <span className="text-ink-muted">לא נמצא ({error})</span>;
  return text;
 }
 function PatternsForSubtype({ subtype }: { subtype: string }) {
  // Filtered patterns endpoint isn't built yet, so we fetch /patterns and
  // show corpus-wide patterns unfiltered (first 4 types, 6 items each). The
  // `subtype` prop is only used for the notice text; real filtering ships in
  // the metadata-enrichment iteration.
  const [data, setData] = useState<Record<string, { pattern_text: string; frequency: number }[]> | null>(null);
  const [loading, setLoading] = useState(true);
  useEffect(() => {
    let cancelled = false;
    fetch("/api/training/patterns")
      .then((r) => r.json())
      .then((d: { by_type: Record<string, { pattern_text: string; frequency: number }[]> }) => {
        if (!cancelled) setData(d.by_type);
      })
      .catch(() => !cancelled && setData({}))
      .finally(() => !cancelled && setLoading(false));
    return () => { cancelled = true; };
  }, []);
  if (loading) return <p className="text-ink-muted text-sm text-center py-6">טוען…</p>;
  if (!data || Object.keys(data).length === 0) {
    return <p className="text-ink-muted text-sm text-center py-6">אין דפוסים שמורים — הרץ ניתוח סגנון.</p>;
  }
  return (
    <div className="space-y-3">
      {subtype && (
        <p className="text-[0.78rem] text-ink-muted">
          דפוסים בכלל הקורפוס. סינון לפי תת-סוג {subtype} ייושם בעדכון הבא.
        </p>
      )}
      {Object.entries(data).slice(0, 4).map(([type, items]) => (
        <Card key={type} className="bg-surface border-rule">
          <CardContent className="px-4 py-3">
            <h4 className="text-[0.78rem] uppercase tracking-wider text-gold-deep font-semibold mb-2">
              {type}
            </h4>
            <ul className="space-y-1 text-sm text-ink">
              {items.slice(0, 6).map((p, i) => (
                <li key={i} className="flex items-start gap-2">
                  <span className="text-[0.72rem] tabular-nums text-ink-muted shrink-0 mt-0.5">
                    ×{p.frequency}
                  </span>
                  <span>{p.pattern_text}</span>
                </li>
              ))}
            </ul>
          </CardContent>
        </Card>
      ))}
    </div>
  );
 }
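The drawer builds its PATCH body field by field so that untouched fields are never sent. The same idea can be written once as a generic helper. This is a minimal sketch, not the patch's implementation: `buildPatch` and the `Decision` type are hypothetical. Like the component, it compares arrays with `JSON.stringify`. Unlike the component, it skips keys that are absent from the draft.

```typescript
// Hypothetical generic version of the drawer's field-by-field diff.
// Values are compared via JSON.stringify, which is how the component
// compares subject_categories and key_principles.
function buildPatch<T extends object>(
  original: T,
  draft: Partial<T>,
  keys: readonly (keyof T)[],
): Partial<T> {
  const patch: Partial<T> = {};
  for (const key of keys) {
    if (!(key in draft)) continue; // field was never part of the draft
    if (JSON.stringify(draft[key]) !== JSON.stringify(original[key])) {
      patch[key] = draft[key];
    }
  }
  return patch;
}

// Illustrative row shape; the real CorpusDecision has more fields.
type Decision = {
  id: string;
  summary: string;
  outcome: string;
  key_principles: string[];
};

const loaded: Decision = {
  id: "d1",
  summary: "old summary",
  outcome: "rejected",
  key_principles: ["principle A"],
};
const edited: Partial<Decision> = {
  summary: "new summary",
  outcome: "rejected",
  key_principles: ["principle A"], // new array instance, same content
};

// Only `summary` differs, so only `summary` ends up in the PATCH body.
const patchBody = buildPatch(loaded, edited, ["summary", "outcome", "key_principles"]);
const isDirty = Object.keys(patchBody).length > 0;
```

Sending only the changed keys is what lets a concurrent edit to another field survive the save, as the comment above the inline diff notes.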
--- a/web-ui/src/components/training/corpus-panel.tsx
+++ b/web-ui/src/components/training/corpus-panel.tsx
@@ -1,6 +1,7 @@
 "use client";
-import { Trash2 } from "lucide-react";
+import { useState } from "react";
+import { Trash2, Sparkles } from "lucide-react";
 import { toast } from "sonner";
 import {
  Table, TableBody, TableCell, TableHead, TableHeader, TableRow,
@@ -9,12 +10,20 @@ import { Button } from "@/components/ui/button";
 import { Badge } from "@/components/ui/badge";
 import { Skeleton } from "@/components/ui/skeleton";
 import { useCorpus, useDeleteCorpusEntry, type CorpusDecision } from "@/lib/api/training";
+import { CorpusDetailDrawer } from "./corpus-detail-drawer";
 /*
- * Corpus tab: table of all decisions currently in the style corpus, with a
+ * Corpus tab: table of all decisions currently in the style corpus.
- * single destructive action (remove from corpus). Uses browser confirm() for
+ *
- * the confirmation — a full shadcn AlertDialog would be overkill for an
+ * Click any row → opens CorpusDetailDrawer with the enriched metadata
- * admin-only destructive action with a server-side safety net.
+ * + edit UI. The trash button is now in its own narrow column and uses
+ * stopPropagation so deleting a row doesn't also open the drawer.
+ *
+ * We use browser confirm() for the destructive action rather than a
+ * full shadcn AlertDialog because this is a single admin operation
+ * gated by an API-level safety net (FK cascade is best-effort but
+ * style_corpus DELETE returns 404 on missing rows, so the worst case
+ * is a no-op).
 */
 function formatChars(n: number) {
@@ -30,9 +39,12 @@ function formatDate(iso: string) {
  }
 }
-function Row({ item }: { item: CorpusDecision }) {
+function Row({
+  item, onOpen,
+}: { item: CorpusDecision; onOpen: () => void }) {
  const del = useDeleteCorpusEntry();
-  const onDelete = async () => {
+  const onDelete = async (e: React.MouseEvent) => {
+    e.stopPropagation();
    if (!window.confirm(`למחוק את החלטה ${item.decision_number} מהקורפוס?`)) return;
    try {
      await del.mutateAsync(item.id);
@@ -43,7 +55,10 @@ function Row({ item }: { item: CorpusDecision }) {
  };
  return (
-    <TableRow className="border-rule hover:bg-gold-wash/30">
+    <TableRow
+      className="border-rule hover:bg-gold-wash/30 cursor-pointer"
+      onClick={onOpen}
+    >
      <TableCell className="font-semibold text-navy tabular-nums">
        {item.decision_number || "—"}
      </TableCell>
@@ -55,20 +70,39 @@ function Row({ item }: { item: CorpusDecision }) {
          <span className="text-ink-light">—</span>
        ) : (
          <div className="flex flex-wrap gap-1">
-            {item.subject_categories.map((s) => (
+            {item.subject_categories.slice(0, 3).map((s) => (
-              <Badge
+              <Badge key={s} variant="outline"
-                key={s}
+                className="text-[0.7rem] bg-gold-wash text-gold-deep border-gold/40">
-                variant="outline"
-                className="text-[0.7rem] bg-gold-wash text-gold-deep border-gold/40"
-              >
                {s}
              </Badge>
            ))}
+            {item.subject_categories.length > 3 && (
+              <span className="text-[0.7rem] text-ink-muted">
+                +{item.subject_categories.length - 3}
+              </span>
+            )}
          </div>
        )}
      </TableCell>
+      <TableCell className="text-[0.78rem] text-ink-soft">
+        <div className="flex items-center gap-2">
+          <span className="truncate">{item.legal_citation || "—"}</span>
+          {item.lessons_count > 0 && (
+            <Badge variant="outline"
+              className="text-[0.7rem] bg-info-bg text-info border-info/40 shrink-0">
+              <Sparkles className="w-3 h-3 me-0.5" />
+              {item.lessons_count}
+            </Badge>
+          )}
+        </div>
+      </TableCell>
      <TableCell className="text-ink-soft tabular-nums">
        {formatChars(item.chars)}
+        {item.page_count > 0 && (
+          <span className="text-ink-muted text-[0.72rem] ms-1">
+            · {item.page_count} ע׳
+          </span>
+        )}
      </TableCell>
      <TableCell className="text-ink-muted tabular-nums text-[0.78rem]">
        {formatDate(item.created_at)}
@@ -91,6 +125,7 @@ function Row({ item }: { item: CorpusDecision }) {
 export function CorpusPanel() {
  const { data, isPending, error } = useCorpus();
+  const [selected, setSelected] = useState<CorpusDecision | null>(null);
  if (error) {
    return (
@@ -101,6 +136,7 @@ export function CorpusPanel() {
  }
  return (
+    <>
      <div className="rounded-lg border border-rule bg-surface shadow-sm overflow-hidden">
        <Table>
          <TableHeader className="bg-rule-soft/60">
@@ -108,7 +144,8 @@ export function CorpusPanel() {
              <TableHead className="text-navy text-right">מס׳ החלטה</TableHead>
              <TableHead className="text-navy text-right">תאריך</TableHead>
              <TableHead className="text-navy text-right">נושאים</TableHead>
-            <TableHead className="text-navy text-right">תווים</TableHead>
+              <TableHead className="text-navy text-right">מראה מקום</TableHead>
+              <TableHead className="text-navy text-right">תווים / עמודים</TableHead>
              <TableHead className="text-navy text-right">נוסף בתאריך</TableHead>
              <TableHead className="text-navy" />
            </TableRow>
@@ -117,7 +154,7 @@ export function CorpusPanel() {
            {isPending ? (
              [...Array(4)].map((_, i) => (
                <TableRow key={i} className="border-rule">
-                {[...Array(6)].map((_, j) => (
+                  {[...Array(7)].map((_, j) => (
                    <TableCell key={j}>
                      <Skeleton className="h-4 w-24" />
                    </TableCell>
@@ -126,15 +163,23 @@ export function CorpusPanel() {
              ))
            ) : data?.length === 0 ? (
              <TableRow>
-              <TableCell colSpan={6} className="text-center text-ink-muted py-12">
+                <TableCell colSpan={7} className="text-center text-ink-muted py-12">
                  הקורפוס ריק
                </TableCell>
              </TableRow>
            ) : (
-            data?.map((item) => <Row key={item.id} item={item} />)
+              data?.map((item) => (
+                <Row key={item.id} item={item} onOpen={() => setSelected(item)} />
+              ))
            )}
          </TableBody>
        </Table>
      </div>
+      <CorpusDetailDrawer
+        decision={selected}
+        onOpenChange={(open) => { if (!open) setSelected(null); }}
+      />
+    </>
  );
 }
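`CorpusPanel` keeps the clicked row object itself in `selected`. If the list query is refetched after a successful PATCH, as the drawer's header comment says, that stored object is a snapshot taken before the save. One possible alternative, sketched here under that assumption, is to store only the id and derive the open row from the latest list. `resolveSelected` and `CorpusRow` are hypothetical names, not part of this patch.

```typescript
// Hypothetical helper, not part of the patch: derive the open row from the
// latest list data instead of holding a snapshot in component state.
function resolveSelected<T extends { id: string }>(
  rows: readonly T[] | undefined,
  selectedId: string | null,
): T | null {
  if (selectedId === null || rows === undefined) return null;
  return rows.find((row) => row.id === selectedId) ?? null;
}

// Illustrative row shape; the real CorpusDecision has more fields.
type CorpusRow = { id: string; summary: string };

// Usage: the same id resolves to the refreshed row after a refetch.
const beforeSave: CorpusRow[] = [{ id: "d1", summary: "" }];
const afterSave: CorpusRow[] = [{ id: "d1", summary: "enriched" }];
const staleView = resolveSelected(beforeSave, "d1");
const freshView = resolveSelected(afterSave, "d1");
const closedView = resolveSelected(afterSave, null);
```

In the component this would mean holding `selectedId` in state and passing `resolveSelected(data, selectedId)` as the drawer's `decision` prop. A row deleted while the drawer is open then resolves to `null` and closes it.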
--- a/web-ui/src/components/training/curator-portrait-panel.tsx
+++ b/web-ui/src/components/training/curator-portrait-panel.tsx
@@ -0,0 +1,338 @@
 "use client";
 /*
 * Curator-Portrait tab — shows everything about the agent that learns
 * Daphna's style:
 *   1. Snapshot stats (curator findings to date, how many were applied)
 *   2. Recent curator findings (last 10), each labelled with its decision number
 *   3. The hermes-curator system prompt, rendered + linked to Gitea
 *   4. The style_analyzer training prompts (different lifecycle — runs
 *      over the corpus at training time, not per-decision)
 *   5. Propose-change form — writes a markdown file to disk for chair
 *      review (no auto-commit)
 *
 * The prompts are deliberately read-only here: they're symlinked into
 * Paperclip and load-bearing for every curator wake. Editing them from
 * the UI would silently fork the source of truth.
 */
 import { useState } from "react";
 import {
  Sparkles, ExternalLink, Send, Loader2, FileText, Brain,
  CheckCircle2, Clock,
 } from "lucide-react";
 import { toast } from "sonner";
 import { Card, CardContent } from "@/components/ui/card";
 import { Button } from "@/components/ui/button";
 import { Input } from "@/components/ui/input";
 import { Label } from "@/components/ui/label";
 import { Textarea } from "@/components/ui/textarea";
 import { Badge } from "@/components/ui/badge";
 import { Skeleton } from "@/components/ui/skeleton";
 import { ScrollArea } from "@/components/ui/scroll-area";
 import { Tabs, TabsContent, TabsList, TabsTrigger } from "@/components/ui/tabs";
 import { Markdown } from "@/components/ui/markdown";
 import {
  useCuratorPrompt,
  useCuratorStats,
  useStyleAnalyzerPrompts,
  useSubmitCuratorProposal,
 } from "@/lib/api/training";
 export function CuratorPortraitPanel() {
  return (
    <div className="space-y-6">
      <StatsCard />
      <RecentFindings />
      <Tabs defaultValue="curator-prompt" dir="rtl">
        <TabsList className="bg-rule-soft/60">
          <TabsTrigger value="curator-prompt">פרומפט ה-Curator</TabsTrigger>
          <TabsTrigger value="analyzer-prompt">פרומפט אימון הסגנון</TabsTrigger>
          <TabsTrigger value="propose">הצעת שינוי</TabsTrigger>
        </TabsList>
        <TabsContent value="curator-prompt" className="mt-4">
          <CuratorPromptCard />
        </TabsContent>
        <TabsContent value="analyzer-prompt" className="mt-4">
          <StyleAnalyzerPromptCard />
        </TabsContent>
        <TabsContent value="propose" className="mt-4">
          <ProposeChangeForm />
        </TabsContent>
      </Tabs>
    </div>
  );
 }
 // ── stats card ─────────────────────────────────────────────────────
 function StatsCard() {
  const { data, isPending } = useCuratorStats();
  if (isPending) {
    return (
      <div className="grid grid-cols-2 md:grid-cols-4 gap-3">
        {[...Array(4)].map((_, i) => <Skeleton key={i} className="h-20 w-full" />)}
      </div>
    );
  }
  if (!data) return null;
  return (
    <div className="grid grid-cols-2 md:grid-cols-4 gap-3">
      <Kpi label="ממצאי curator" value={data.total_findings} icon={<Sparkles className="w-4 h-4" />} />
      <Kpi label="החלטות שנסקרו" value={`${data.decisions_with_findings}/${data.decisions_total}`} icon={<FileText className="w-4 h-4" />} />
      <Kpi label="ממצאים שאומצו ל-SKILL" value={data.findings_applied} icon={<CheckCircle2 className="w-4 h-4" />} />
      <Kpi label="ממוצע ממצאים להחלטה"
        value={
          data.decisions_with_findings > 0
            ? (data.total_findings / data.decisions_with_findings).toFixed(1)
            : "—"
        }
        icon={<Brain className="w-4 h-4" />}
      />
    </div>
  );
 }
 function Kpi({
  label, value, icon,
 }: { label: string; value: string | number; icon: React.ReactNode }) {
  return (
    <Card className="bg-surface border-rule">
      <CardContent className="px-4 py-3">
        <div className="flex items-center gap-2 text-ink-muted text-[0.78rem]">
          {icon}
          <span>{label}</span>
        </div>
        <p className="text-2xl text-navy font-semibold tabular-nums mt-1">{value}</p>
      </CardContent>
    </Card>
  );
 }
 // ── recent findings ────────────────────────────────────────────────
 function RecentFindings() {
  const { data, isPending } = useCuratorStats();
  if (isPending) {
    return <Skeleton className="h-40 w-full" />;
  }
  if (!data || data.recent_findings.length === 0) {
    return (
      <Card className="bg-rule-soft/40 border-rule">
        <CardContent className="px-6 py-5 text-center text-ink-muted text-sm">
          אין עדיין ממצאים של ה-Curator. הוא מופעל אוטומטית כאשר דפנה מסמנת
          החלטה כסופית (mark-final), ושומר את ממצאיו כ-decision_lessons עם
          source=&quot;curator&quot;.
        </CardContent>
      </Card>
    );
  }
  return (
    <Card className="bg-surface border-rule">
      <CardContent className="px-4 py-3">
        <h3 className="text-[0.78rem] uppercase tracking-wider text-gold-deep font-semibold mb-3">
          ממצאים אחרונים של ה-Curator
        </h3>
        <ul className="space-y-2">
          {data.recent_findings.map((f) => (
            <li key={f.id} className="border-b border-rule pb-2 last:border-0 last:pb-0">
              <div className="flex items-center gap-2 text-[0.72rem] mb-1">
                <Badge variant="outline"
                  className="bg-info-bg text-info border-info/40">
                  {f.category}
                </Badge>
                <span className="text-navy font-semibold tabular-nums">
                  {f.decision_number || "—"}
                </span>
                {f.applied_to_skill && (
                  <Badge variant="outline"
                    className="bg-success-bg text-success border-success/40">
                    <CheckCircle2 className="w-3 h-3 me-0.5" />
                    אומץ
                  </Badge>
                )}
                <span className="grow text-ink-muted text-end">
                  <Clock className="w-3 h-3 inline me-1" />
                  {new Date(f.created_at).toLocaleDateString("he-IL")}
                </span>
              </div>
              <p className="text-sm text-ink leading-relaxed">{f.lesson_text}</p>
            </li>
          ))}
        </ul>
      </CardContent>
    </Card>
  );
 }
 // ── prompts ────────────────────────────────────────────────────────
 function CuratorPromptCard() {
  const { data, isPending, error } = useCuratorPrompt();
  if (isPending) return <Skeleton className="h-96 w-full" />;
  if (error) {
    return (
      <Card className="bg-danger-bg border-danger/40">
        <CardContent className="px-6 py-4 text-danger">{error.message}</CardContent>
      </Card>
    );
  }
  if (!data) return null;
  return (
    <Card className="bg-surface border-rule">
      <CardContent className="px-5 py-4 space-y-3">
        <div className="flex items-center justify-between gap-2 flex-wrap">
          <div>
            <h3 className="text-navy font-semibold">{data.filename}</h3>
            <p className="text-[0.72rem] text-ink-muted">
              {data.bytes.toLocaleString("he-IL")} בייטים ·
              עודכן: {new Date(data.last_modified * 1000).toLocaleString("he-IL")}
            </p>
          </div>
          <Button asChild variant="outline" size="sm">
            <a href={data.gitea_url} target="_blank" rel="noopener noreferrer">
              <ExternalLink className="w-3 h-3 me-1" />
              ערוך ב-Gitea
            </a>
          </Button>
        </div>
        <ScrollArea className="h-[520px] pe-2 border border-rule rounded p-3 bg-rule-soft/30">
          <Markdown content={data.content} />
        </ScrollArea>
      </CardContent>
    </Card>
  );
 }
 function StyleAnalyzerPromptCard() {
  const { data, isPending } = useStyleAnalyzerPrompts();
  if (isPending) return <Skeleton className="h-96 w-full" />;
  if (!data) return null;
  return (
    <Card className="bg-surface border-rule">
      <CardContent className="px-5 py-4 space-y-3">
        <div>
          <h3 className="text-navy font-semibold">פרומפטים של style_analyzer.py</h3>
          <p className="text-[0.72rem] text-ink-muted">
            רץ ב-Claude Opus (1M context, עד {data.max_input_tokens.toLocaleString("he-IL")} tokens
            input) דרך claude CLI מקומי — חינמי, ללא API. נקרא ע&quot;י
            <code className="px-1 mx-1 bg-rule-soft rounded">POST /api/training/analyze-style</code>
            ומכניס דפוסים ל-<code className="px-1 bg-rule-soft rounded">style_patterns</code>.
          </p>
        </div>
        <Tabs defaultValue="analysis" dir="rtl">
          <TabsList className="bg-rule-soft/60">
            <TabsTrigger value="analysis">Single-pass (כל הקורפוס)</TabsTrigger>
            <TabsTrigger value="single">Multi-pass (החלטה אחת)</TabsTrigger>
            <TabsTrigger value="synthesis">Synthesis</TabsTrigger>
          </TabsList>
          <TabsContent value="analysis" className="mt-3">
            <PromptBlock content={data.analysis_prompt} />
          </TabsContent>
          <TabsContent value="single" className="mt-3">
            <PromptBlock content={data.single_decision_prompt} />
          </TabsContent>
          <TabsContent value="synthesis" className="mt-3">
            <PromptBlock content={data.synthesis_prompt} />
          </TabsContent>
        </Tabs>
      </CardContent>
    </Card>
  );
 }
 function PromptBlock({ content }: { content: string }) {
  return (
    <ScrollArea className="h-[420px] pe-2 border border-rule rounded p-3 bg-rule-soft/30">
      <pre className="text-[0.78rem] whitespace-pre-wrap font-mono text-ink leading-relaxed"
        dir="rtl">
        {content}
      </pre>
    </ScrollArea>
  );
 }
 // ── propose change form ────────────────────────────────────────────
 function ProposeChangeForm() {
  const [title, setTitle] = useState("");
  const [proposedChange, setProposedChange] = useState("");
  const [rationale, setRationale] = useState("");
  const submit = useSubmitCuratorProposal();
  const onSubmit = async (e: React.FormEvent) => {
    e.preventDefault();
    if (!title.trim() || !proposedChange.trim()) {
      toast.error("חובה כותרת ושינוי מוצע");
      return;
    }
    try {
      const r = await submit.mutateAsync({
        title: title.trim(),
        proposed_change: proposedChange.trim(),
        rationale: rationale.trim(),
      });
      toast.success(`נשמרה הצעה: ${r.filename}`);
      setTitle(""); setProposedChange(""); setRationale("");
    } catch (err) {
      // `err` avoids shadowing the form event parameter `e`.
      toast.error(err instanceof Error ? err.message : "כשל בשמירה");
    }
  };
  return (
    <Card className="bg-surface border-rule">
      <CardContent className="px-5 py-4">
        <h3 className="text-navy font-semibold mb-2">הצעת שינוי לפרומפט ה-Curator</h3>
        <p className="text-[0.78rem] text-ink-muted mb-4">
          ההצעה תישמר כקובץ Markdown ב-
          <code className="px-1 bg-rule-soft rounded">data/curator-proposals/</code>.
          חיים יבחן ויאשר ידנית — אין שינוי אוטומטי בפרומפט.
        </p>
        <form onSubmit={onSubmit} className="space-y-3">
          <div className="space-y-1">
            <Label htmlFor="proposal-title">כותרת השינוי</Label>
            <Input id="proposal-title" value={title}
              onChange={(e) => setTitle(e.target.value)}
              placeholder="לדוגמה: הוסף קטגוריה [צ׳קליסט תוכן] לממצאי ה-curator"
              dir="rtl" />
          </div>
          <div className="space-y-1">
            <Label htmlFor="proposal-change">השינוי המוצע (Markdown)</Label>
            <Textarea id="proposal-change" value={proposedChange} rows={6}
              onChange={(e) => setProposedChange(e.target.value)}
              placeholder={"תאר במדויק מה לשנות. אפשר להעתיק את הקטע הקיים ולסמן ב-strikethrough + להוסיף את החדש."}
              dir="rtl" />
          </div>
          <div className="space-y-1">
            <Label htmlFor="proposal-rationale">נימוק</Label>
            <Textarea id="proposal-rationale" value={rationale} rows={3}
              onChange={(e) => setRationale(e.target.value)}
              placeholder="למה השינוי הזה חשוב? איזה בעיה הוא פותר?"
              dir="rtl" />
          </div>
          <div className="flex justify-end">
            <Button type="submit" disabled={submit.isPending}
              className="bg-navy text-parchment hover:bg-navy-soft">
              {submit.isPending ? (
                <Loader2 className="w-4 h-4 animate-spin me-1" />
              ) : (
                <Send className="w-4 h-4 me-1" />
              )}
              שלח הצעה
            </Button>
          </div>
        </form>
      </CardContent>
    </Card>
  );
 }
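The submit handler in `ProposeChangeForm` trims all three fields and rejects the submission when the title or the proposed change is empty. Below is a minimal sketch of the same rule as a pure function, which is easier to unit-test than the component. `buildProposal`, `CuratorProposal` and `ProposalResult` are hypothetical names introduced for illustration. The payload keys mirror the object passed to `submit.mutateAsync` above, and the error text is an English placeholder.

```typescript
// Assumed payload shape, mirroring the fields the form submits.
type CuratorProposal = {
  title: string;
  proposed_change: string;
  rationale: string;
};

type ProposalResult =
  | { ok: true; payload: CuratorProposal }
  | { ok: false; error: string };

// Trim every field, then require a non-empty title and proposed change.
// The rationale stays optional, as in the form.
function buildProposal(
  title: string,
  proposedChange: string,
  rationale: string,
): ProposalResult {
  const payload: CuratorProposal = {
    title: title.trim(),
    proposed_change: proposedChange.trim(),
    rationale: rationale.trim(),
  };
  if (!payload.title || !payload.proposed_change) {
    return { ok: false, error: "title and proposed change are required" };
  }
  return { ok: true, payload };
}
```

A component could call this helper first and only reach the mutation when `ok` is true.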
--- a/web-ui/src/components/training/lessons-tab.tsx
+++ b/web-ui/src/components/training/lessons-tab.tsx
@@ -0,0 +1,267 @@
 "use client";
 /*
 * Per-decision lessons editor — lives inside CorpusDetailDrawer's
 * "מה למדנו" ("What we learned") tab. Lessons are persisted in the decision_lessons table
 * (one-to-many on style_corpus) and consumed by hermes-curator and
 * future style_analyzer runs as context.
 *
 * The chair can:
 *   - Add a lesson typed manually (category = "general" by default)
 *   - Edit / delete existing lessons
 *   - Mark a lesson as "applied_to_skill" (informational — doesn't
 *     auto-commit anything to SKILL.md; chair still curates that file
 *     manually in git).
 *
 * Lessons from the curator arrive with source="curator" and are visually
 * distinguished by a badge so the chair can audit auto-suggestions.
 */
 import { useState } from "react";
 import { Plus, Save, Trash2, Loader2, CheckCircle2, Sparkles } from "lucide-react";
 import { toast } from "sonner";
 import { Button } from "@/components/ui/button";
 import { Card, CardContent } from "@/components/ui/card";
 import { Textarea } from "@/components/ui/textarea";
 import { Badge } from "@/components/ui/badge";
 import { Skeleton } from "@/components/ui/skeleton";
 import {
  Select, SelectContent, SelectItem, SelectTrigger, SelectValue,
 } from "@/components/ui/select";
 import {
  useAddLesson,
  useCorpusLessons,
  useDeleteLesson,
  usePatchLesson,
  type DecisionLesson,
 } from "@/lib/api/training";
 const CATEGORIES = [
  { value: "general", label: "כללי" },
  { value: "style", label: "סגנון" },
  { value: "structure", label: "מבנה" },
  { value: "lexicon", label: "לקסיקון" },
  { value: "tabular", label: "טבלאי" },
 ] as const;
 const SOURCE_BADGE: Record<DecisionLesson["source"], { label: string; cls: string }> = {
  manual: { label: "ידני", cls: "bg-rule-soft text-ink-soft" },
  chair: { label: "יו״ר", cls: "bg-gold-wash text-gold-deep" },
  curator: { label: "Curator", cls: "bg-info-bg text-info" },
  style_analyzer: { label: "Analyzer", cls: "bg-success-bg text-success" },
 };
 export function LessonsTab({ corpusId }: { corpusId: string }) {
  const { data, isPending } = useCorpusLessons(corpusId);
  const add = useAddLesson(corpusId);
  const [draftText, setDraftText] = useState("");
  const [draftCategory, setDraftCategory] = useState<DecisionLesson["category"]>("general");
  const onAdd = async () => {
    const text = draftText.trim();
    if (!text) return;
    try {
      await add.mutateAsync({ lesson_text: text, category: draftCategory });
      setDraftText("");
      setDraftCategory("general");
      toast.success("הלקח נוסף");
    } catch (e) {
      toast.error(e instanceof Error ? e.message : "כשל בשמירה");
    }
  };
  return (
    <div className="space-y-4">
      {/* Composer */}
      <Card className="bg-surface border-rule">
        <CardContent className="px-4 py-3 space-y-2">
          <h4 className="text-[0.78rem] uppercase tracking-wider text-gold-deep font-semibold">
            הוסף לקח להחלטה
          </h4>
          <Textarea
            value={draftText}
            onChange={(e) => setDraftText(e.target.value)}
            placeholder="מה למדנו מההחלטה הזו? למשל: 'דפנה מעדיפה הוצאות מתונות (5K-10K ₪) גם בערר שהתקבל במלואו'"
            rows={3}
            dir="rtl"
            disabled={add.isPending}
          />
          <div className="flex items-center gap-2">
            <Select
              value={draftCategory}
              onValueChange={(v) => setDraftCategory(v as DecisionLesson["category"])}
              disabled={add.isPending}
              dir="rtl"
            >
              <SelectTrigger className="w-40">
                <SelectValue />
              </SelectTrigger>
              <SelectContent>
                {CATEGORIES.map((c) => (
                  <SelectItem key={c.value} value={c.value}>{c.label}</SelectItem>
                ))}
              </SelectContent>
            </Select>
            <div className="grow" />
            <Button onClick={onAdd} disabled={add.isPending || !draftText.trim()}
              className="bg-navy text-parchment hover:bg-navy-soft">
              {add.isPending ? (
                <Loader2 className="w-4 h-4 animate-spin me-1" />
              ) : (
                <Plus className="w-4 h-4 me-1" />
              )}
              שמור לקח
            </Button>
          </div>
        </CardContent>
      </Card>
      {/* List */}
      {isPending ? (
        <div className="space-y-2">
          {[...Array(3)].map((_, i) => (
            <Skeleton key={i} className="h-16 w-full" />
          ))}
        </div>
      ) : !data || data.length === 0 ? (
        <p className="text-center text-ink-muted text-sm py-6">
          אין עדיין לקחים להחלטה זו. הוסף לקח ראשון מלמעלה.
        </p>
      ) : (
        <div className="space-y-2">
          {data.map((lesson) => (
            <LessonItem key={lesson.id} lesson={lesson} corpusId={corpusId} />
          ))}
        </div>
      )}
    </div>
  );
 }
 function LessonItem({
  lesson, corpusId,
 }: { lesson: DecisionLesson; corpusId: string }) {
  const [editing, setEditing] = useState(false);
  const [text, setText] = useState(lesson.lesson_text);
  const [category, setCategory] = useState<DecisionLesson["category"]>(lesson.category);
  const patch = usePatchLesson(corpusId);
  const del = useDeleteLesson(corpusId);
  const sourceBadge = SOURCE_BADGE[lesson.source];
  const dirty = text !== lesson.lesson_text || category !== lesson.category;
  const onSave = async () => {
    // Nothing changed: leave edit mode without sending an empty PATCH.
    if (!dirty) {
      setEditing(false);
      return;
    }
    try {
      await patch.mutateAsync({
        id: lesson.id,
        patch: { lesson_text: text, category },
      });
      setEditing(false);
      toast.success("הלקח עודכן");
    } catch (e) {
      toast.error(e instanceof Error ? e.message : "כשל בעדכון");
    }
  };
  const onToggleApplied = async () => {
    try {
      await patch.mutateAsync({
        id: lesson.id,
        patch: { applied_to_skill: !lesson.applied_to_skill },
      });
    } catch (e) {
      toast.error(e instanceof Error ? e.message : "כשל בעדכון");
    }
  };
  const onDelete = async () => {
    if (!window.confirm("למחוק את הלקח?")) return;
    try {
      await del.mutateAsync(lesson.id);
      toast.success("נמחק");
    } catch (e) {
      toast.error(e instanceof Error ? e.message : "כשל במחיקה");
    }
  };
  return (
    <Card className="bg-surface border-rule">
      <CardContent className="px-4 py-3 space-y-2">
        <div className="flex items-center gap-2 text-[0.72rem]">
          <Badge variant="outline"
            className="bg-rule-soft text-ink-soft">
            {CATEGORIES.find((c) => c.value === lesson.category)?.label || lesson.category}
          </Badge>
          <Badge variant="outline" className={sourceBadge.cls}>
            {sourceBadge.label}
          </Badge>
          {lesson.applied_to_skill && (
            <Badge variant="outline"
              className="bg-success-bg text-success border-success/40">
              <CheckCircle2 className="w-3 h-3 me-1" />
              אומץ
            </Badge>
          )}
          <span className="grow text-ink-muted tabular-nums">
            {new Date(lesson.created_at).toLocaleDateString("he-IL")}
          </span>
        </div>
        {editing ? (
          <>
            <Textarea value={text} onChange={(e) => setText(e.target.value)}
              rows={3} dir="rtl" />
            <div className="flex items-center gap-2">
              <Select value={category}
                onValueChange={(v) => setCategory(v as DecisionLesson["category"])}
                dir="rtl">
                <SelectTrigger className="w-40">
                  <SelectValue />
                </SelectTrigger>
                <SelectContent>
                  {CATEGORIES.map((c) => (
                    <SelectItem key={c.value} value={c.value}>{c.label}</SelectItem>
                  ))}
                </SelectContent>
              </Select>
              <div className="grow" />
              <Button variant="ghost" size="sm"
                onClick={() => { setEditing(false); setText(lesson.lesson_text); setCategory(lesson.category); }}>
                ביטול
              </Button>
              <Button size="sm" onClick={onSave} disabled={patch.isPending}
                className="bg-navy text-parchment hover:bg-navy-soft">
                <Save className="w-3 h-3 me-1" />
                שמור
              </Button>
            </div>
          </>
        ) : (
          <>
            <p className="text-sm text-ink leading-relaxed whitespace-pre-wrap"
               onClick={() => setEditing(true)}
               style={{ cursor: "text" }}>
              {lesson.lesson_text}
            </p>
            <div className="flex items-center gap-2">
              <Button variant="ghost" size="sm" onClick={onToggleApplied}
                disabled={patch.isPending}>
                <Sparkles className="w-3 h-3 me-1" />
                {lesson.applied_to_skill ? "בטל סימון 'אומץ'" : "סמן כ'אומץ ל-SKILL'"}
              </Button>
              <Button variant="ghost" size="sm" onClick={() => setEditing(true)}>
                ערוך
              </Button>
              <div className="grow" />
              <Button variant="ghost" size="sm" onClick={onDelete}
                disabled={del.isPending}
                className="text-danger hover:text-danger hover:bg-danger-bg">
                <Trash2 className="w-3 h-3" />
              </Button>
            </div>
          </>
        )}
      </CardContent>
    </Card>
  );
 }
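`LessonItem` tracks whether the draft differs from the saved lesson through its `dirty` flag. Below is a minimal sketch of a helper that goes one step further and reduces the PATCH body to only the fields that changed. `buildLessonPatch` and `LessonDraft` are hypothetical names, and `LessonDraft` is a local stand-in for the two editable fields of `DecisionLesson`, whose full type is not shown in this diff.

```typescript
// Local stand-in for the editable fields of DecisionLesson (assumed shape).
type LessonDraft = { lesson_text: string; category: string };

// Return only the fields that differ from the saved lesson, or null when
// nothing changed so the caller can skip the request entirely.
function buildLessonPatch(
  saved: LessonDraft,
  draft: LessonDraft,
): Partial<LessonDraft> | null {
  const patch: Partial<LessonDraft> = {};
  if (draft.lesson_text !== saved.lesson_text) {
    patch.lesson_text = draft.lesson_text;
  }
  if (draft.category !== saved.category) {
    patch.category = draft.category;
  }
  return Object.keys(patch).length > 0 ? patch : null;
}
```

Whether the backend prefers a minimal body or the full pair of fields is not stated in this change, so treat the narrowing as optional.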
--- a/web-ui/src/components/training/upload-dialog.tsx
+++ b/web-ui/src/components/training/upload-dialog.tsx
@@ -0,0 +1,328 @@
 "use client";
 /*
 * Upload a Daphna decision into the style corpus, from the /training page.
 *
 * The flow is three explicit steps inside the same sheet:
 *   1. file picker → POST /api/upload                    (gets sanitized filename)
 *   2. preview     → POST /api/training/analyze          (proofread + auto-extracted meta)
 *                    chair can correct decision_number / decision_date / subjects
 *   3. commit      → POST /api/training/upload           (background task)
 *                    progress watched via SSE; on completion we invalidate
 *                    corpus + style-report so the new row appears.
 *
 * The Sheet UX mirrors precedent-upload-sheet.tsx: same dir="rtl", same
 * loading + error patterns, same toast on success. The reason this isn't
 * a single one-click upload is that style-corpus rows are write-once
 * (we don't allow editing full_text), so the chair MUST see the proofread
 * preview before committing — otherwise a bad OCR/proofread can silently
 * pollute the style portrait.
 */
 import { useEffect, useState } from "react";
 import { Upload, Loader2, CheckCircle2, AlertCircle, FileText } from "lucide-react";
 import { toast } from "sonner";
 import { useQueryClient } from "@tanstack/react-query";
 import {
  Sheet, SheetContent, SheetHeader, SheetTitle, SheetDescription,
 } from "@/components/ui/sheet";
 import { Button } from "@/components/ui/button";
 import { Input } from "@/components/ui/input";
 import { Label } from "@/components/ui/label";
 import { Progress } from "@/components/ui/progress";
 import { Badge } from "@/components/ui/badge";
 import {
  trainingKeys,
  useAnalyzeTraining,
  useCommitTrainingUpload,
  useUploadFile,
  type AnalyzeTrainingResponse,
 } from "@/lib/api/training";
 import { useProgress } from "@/lib/api/documents";
 const ACCEPT = ".pdf,.docx,.doc,.rtf,.txt,.md";
 type Props = {
  open: boolean;
  onOpenChange: (open: boolean) => void;
 };
 type Stage = "pick" | "analyzing" | "preview" | "committing" | "done" | "error";
 export function TrainingUploadDialog({ open, onOpenChange }: Props) {
  const [stage, setStage] = useState<Stage>("pick");
  const [file, setFile] = useState<File | null>(null);
  const [analysis, setAnalysis] = useState<AnalyzeTrainingResponse | null>(null);
  // editable copies of the auto-extracted metadata
  const [decisionNumber, setDecisionNumber] = useState("");
  const [decisionDate, setDecisionDate] = useState("");
  const [subjectsRaw, setSubjectsRaw] = useState("");
  const [title, setTitle] = useState("");
  const [taskId, setTaskId] = useState<string | null>(null);
  const [errorMsg, setErrorMsg] = useState("");
  const uploadFile = useUploadFile();
  const analyze = useAnalyzeTraining();
  const commit = useCommitTrainingUpload();
  const progress = useProgress(taskId);
  const qc = useQueryClient();
  // Reset everything when the sheet closes — important because Sheet keeps
  // the component mounted between opens. The cascade-render warning is the
  // intended behavior (reset is the side effect we want).
  useEffect(() => {
    if (open) return;
    /* eslint-disable react-hooks/set-state-in-effect */
    setStage("pick"); setFile(null); setAnalysis(null);
    setDecisionNumber(""); setDecisionDate(""); setSubjectsRaw("");
    setTitle(""); setTaskId(null); setErrorMsg("");
    /* eslint-enable react-hooks/set-state-in-effect */
  }, [open]);
  // Watch background task. When complete, invalidate corpus + report so the
  // new row + updated stats show up automatically. The setStage call here
  // is the deliberate UX (success card → auto-close) — synchronizing UI
  // with the external SSE stream is exactly what effects are for.
  useEffect(() => {
    if (!progress) return;
    if (progress.status === "completed") {
      qc.invalidateQueries({ queryKey: trainingKeys.corpus() });
      qc.invalidateQueries({ queryKey: trainingKeys.report() });
      // eslint-disable-next-line react-hooks/set-state-in-effect
      setStage("done");
      toast.success(`החלטה ${decisionNumber || analysis?.decision_number || ""} נוספה לקורפוס`);
      const t = window.setTimeout(() => onOpenChange(false), 1500);
      return () => window.clearTimeout(t);
    }
    if (progress.status === "failed") {
      setStage("error");
      setErrorMsg(progress.error || "כשל בעיבוד");
    }
  }, [progress, analysis, decisionNumber, qc, onOpenChange]);
  const onPickFile = async (f: File | null) => {
    setFile(f);
    setErrorMsg("");
    if (!f) return;
    setStage("analyzing");
    try {
      const { filename } = await uploadFile.mutateAsync(f);
      const result = await analyze.mutateAsync(filename);
      setAnalysis(result);
      setDecisionNumber(result.decision_number);
      setDecisionDate(result.decision_date);
      setSubjectsRaw(result.subject_categories.join(", "));
      // Default title from the original filename stem (chair can override).
      const stem = f.name.replace(/\.[^.]+$/, "");
      setTitle(stem);
      setStage("preview");
    } catch (e) {
      setStage("error");
      setErrorMsg(e instanceof Error ? e.message : "כשל בקריאת הקובץ");
    }
  };
  const onCommit = async () => {
    if (!analysis) return;
    setStage("committing");
    setErrorMsg("");
    try {
      const subjects = subjectsRaw
        .split(/[,،]/)
        .map((s) => s.trim())
        .filter(Boolean);
      const res = await commit.mutateAsync({
        filename: analysis.filename,
        decision_number: decisionNumber.trim(),
        decision_date: decisionDate || "",
        subject_categories: subjects,
        title: title.trim() || undefined,
      });
      setTaskId(res.task_id);
    } catch (e) {
      setStage("error");
      // 409 = duplicate decision_number — surface the backend's Hebrew message.
      setErrorMsg(e instanceof Error ? e.message : "כשל בהעלאה");
    }
  };
  const isProcessing =
    stage === "analyzing" || stage === "committing" ||
    (taskId !== null && progress?.status !== "completed" && progress?.status !== "failed");
  const progressStep = (progress as { step?: string } | null)?.step;
  return (
    <Sheet open={open} onOpenChange={onOpenChange}>
      <SheetContent side="left" className="w-full sm:max-w-2xl overflow-y-auto" dir="rtl">
        <SheetHeader>
          <SheetTitle className="text-navy">העלאת החלטה לקורפוס הסגנון</SheetTitle>
          <SheetDescription className="text-ink-muted">
            הקובץ יעבור הגהה (סינון Nevo, ניקוד), חילוץ אוטומטי של מספר תיק, תאריך
            ונושאים, ויוטמע ב-style_corpus עם chunks ו-embeddings. תוכל לתקן את
            פרטי המטא-דאטה לפני שמירה.
          </SheetDescription>
        </SheetHeader>
        <div className="px-6 pb-6 mt-4 space-y-4">
          {/* Stage 1: pick */}
          {stage === "pick" && (
            <div className="space-y-2">
              <Label htmlFor="t-file">קובץ ההחלטה (PDF / DOCX / DOC / RTF / TXT / MD)</Label>
              <Input
                id="t-file" type="file" accept={ACCEPT}
                onChange={(e) => onPickFile(e.target.files?.[0] ?? null)}
              />
              <p className="text-[0.78rem] text-ink-muted">
                המערכת תחלץ מהקובץ את מספר התיק, התאריך והנושאים. תוכל לערוך
                לפני השמירה.
              </p>
            </div>
          )}
          {/* Stage 2: analyzing the file */}
          {stage === "analyzing" && (
            <div className="rounded-lg border border-rule bg-rule-soft/40 p-6 space-y-2 text-center">
              <Loader2 className="w-5 h-5 animate-spin mx-auto text-navy" />
              <p className="text-sm text-navy">מבצע הגהה וחילוץ מטא-דאטה…</p>
              <p className="text-[0.78rem] text-ink-muted">
                {file?.name}
              </p>
            </div>
          )}
          {/* Stage 3: preview + editable metadata */}
          {stage === "preview" && analysis && (
            <form
              className="space-y-4"
              onSubmit={(e) => { e.preventDefault(); onCommit(); }}
            >
              <div className="rounded-lg border border-rule bg-surface px-4 py-3">
                <h3 className="text-[0.78rem] uppercase tracking-wider text-gold-deep font-semibold mb-2">
                  תצוגה מקדימה של הטקסט הנקי
                </h3>
                <p className="text-sm text-ink leading-relaxed line-clamp-6 whitespace-pre-wrap">
                  {analysis.preview}
                </p>
                <div className="mt-2 flex items-center gap-3 text-[0.72rem] text-ink-muted tabular-nums">
                  <span className="flex items-center gap-1">
                    <FileText className="w-3 h-3" />
                    {analysis.chars.toLocaleString("he-IL")} תווים
                  </span>
                </div>
              </div>
              <div className="grid grid-cols-2 gap-3">
                <div className="space-y-1">
                  <Label htmlFor="t-decision-number">מספר ההחלטה</Label>
                  <Input
                    id="t-decision-number"
                    value={decisionNumber}
                    onChange={(e) => setDecisionNumber(e.target.value)}
                    placeholder="1130-25"
                    dir="rtl"
                  />
                </div>
                <div className="space-y-1">
                  <Label htmlFor="t-decision-date">תאריך ההחלטה</Label>
                  <Input
                    id="t-decision-date" type="date"
                    value={decisionDate}
                    onChange={(e) => setDecisionDate(e.target.value)}
                  />
                </div>
              </div>
              <div className="space-y-1">
                <Label htmlFor="t-title">כותרת קצרה (אופציונלי)</Label>
                <Input
                  id="t-title" value={title}
                  onChange={(e) => setTitle(e.target.value)}
                  placeholder="ARAR-25-1130 - כרמל יצחק" dir="rtl"
                />
              </div>
              <div className="space-y-1">
                <Label htmlFor="t-subjects">נושאים (מופרדים בפסיקים)</Label>
                <Input
                  id="t-subjects" value={subjectsRaw}
                  onChange={(e) => setSubjectsRaw(e.target.value)}
                  placeholder="חניה, קווי בניין, שימוש חורג" dir="rtl"
                />
                {analysis.subject_categories.length > 0 && (
                  <div className="flex flex-wrap gap-1 mt-1">
                    <span className="text-[0.72rem] text-ink-muted">חולץ אוטומטית:</span>
                    {analysis.subject_categories.map((s) => (
                      <Badge key={s} variant="outline"
                        className="text-[0.7rem] bg-gold-wash text-gold-deep border-gold/40">
                        {s}
                      </Badge>
                    ))}
                  </div>
                )}
              </div>
              {errorMsg && (
                <div className="rounded-lg border border-danger/40 bg-danger-bg p-3 flex items-center gap-2 text-danger text-sm">
                  <AlertCircle className="w-4 h-4 shrink-0" />
                  {errorMsg}
                </div>
              )}
              <div className="flex gap-2 justify-end pt-2">
                <Button type="button" variant="ghost"
                  onClick={() => onOpenChange(false)}
                  disabled={isProcessing}>
                  ביטול
                </Button>
                <Button type="submit" disabled={isProcessing || !decisionNumber.trim()}
                  className="bg-navy text-parchment hover:bg-navy-soft">
                  <Upload className="w-4 h-4 me-1" />
                  שמור בקורפוס
                </Button>
              </div>
            </form>
          )}
          {/* Stage 4: committing — background task progress */}
          {(stage === "committing" || (taskId && stage !== "done" && stage !== "error")) && (
            <div className="rounded-lg border border-rule bg-rule-soft/40 p-4 space-y-2">
              <div className="flex items-center gap-2 text-sm text-navy">
                <Loader2 className="w-4 h-4 animate-spin" />
                <span>{progressStep || "מעבד את ההחלטה לקורפוס"}</span>
              </div>
              <Progress value={progressStep ? 60 : 30} className="h-1.5" />
            </div>
          )}
          {/* Stage 5: success */}
          {stage === "done" && (
            <div className="rounded-lg border border-gold/40 bg-gold-wash p-4 flex items-center gap-2 text-gold-deep text-sm">
              <CheckCircle2 className="w-4 h-4" />
              ההחלטה נוספה לקורפוס בהצלחה.
            </div>
          )}
          {/* Stage 6: error (after a failed analyze or upload) */}
          {stage === "error" && (
            <div className="space-y-3">
              <div className="rounded-lg border border-danger/40 bg-danger-bg p-4 flex items-center gap-2 text-danger text-sm">
                <AlertCircle className="w-4 h-4 shrink-0" />
                {errorMsg || "שגיאה לא ידועה"}
              </div>
              <div className="flex gap-2 justify-end">
                <Button type="button" variant="ghost"
                  onClick={() => onOpenChange(false)}>
                  סגור
                </Button>
                <Button type="button"
                  onClick={() => { setStage("pick"); setErrorMsg(""); setFile(null); }}>
                  נסה קובץ אחר
                </Button>
              </div>
            </div>
          )}
        </div>
      </SheetContent>
    </Sheet>
  );
 }
--- a/web-ui/src/lib/api/training.ts
+++ b/web-ui/src/lib/api/training.ts
@@ -7,10 +7,13 @@
 *   - GET /corpus → flat list of decisions for the corpus tab / compare tool
 *   - GET /compare?a=UUID&b=UUID → side-by-side comparison
 *   - DELETE /corpus/{id} → remove a decision from the corpus
 *   - POST /api/upload → multipart file → returns sanitized filename
 *   - POST /analyze → proofread + extract metadata for preview
 *   - POST /upload → commit a proofread decision to the corpus (task_id)
 */
 import { useMutation, useQuery, useQueryClient } from "@tanstack/react-query";
-import { apiRequest } from "./client";
+import { ApiError, apiRequest } from "./client";
 export type StyleReport = {
  corpus: {
@@ -69,6 +72,29 @@ export type CorpusDecision = {
  subject_categories: string[];
  chars: number;
  created_at: string;
  // Enriched metadata (added in the corpus-page upgrade).
  summary: string;
  outcome: string;
  key_principles: string[];
  appeal_subtype: string;
  practice_area: string;
  page_count: number;
  document_id: string | null;
  doc_title: string;
  parties: { appellant: string; respondent: string };
  legal_citation: string;
  lessons_count: number;
 };
 export type CorpusDecisionPatch = {
  decision_number?: string;
  decision_date?: string;
  subject_categories?: string[];
  summary?: string;
  outcome?: string;
  key_principles?: string[];
  appeal_subtype?: string;
  practice_area?: string;
 };
 export type CompareResult = {
@@ -149,3 +175,407 @@ export function useDeleteCorpusEntry() {
    },
  });
 }
 // ── Style-agent chat ─────────────────────────────────────────────
 export type ChatConversation = {
  id: string;
  title: string;
  style_corpus_id: string | null;
  decision_number: string;
  claude_session_id: string | null;
  message_count: number;
  created_at: string;
  last_message_at: string;
 };
 export type ChatMessage = {
  id: string;
  role: "user" | "assistant";
  content: string;
  created_at: string;
 };
 export type ChatHealth = {
  reachable: boolean;
  status?: number;
  url: string;
  error?: string;
 };
 export const chatKeys = {
  conversations: () => [...trainingKeys.all, "chat", "conversations"] as const,
  conversation: (id: string) =>
    [...trainingKeys.all, "chat", "conversations", id] as const,
  health: () => [...trainingKeys.all, "chat", "health"] as const,
 };
 export function useChatConversations() {
  return useQuery({
    queryKey: chatKeys.conversations(),
    queryFn: ({ signal }) =>
      apiRequest<ChatConversation[]>("/api/training/chat/conversations", { signal }),
    staleTime: 15_000,
  });
 }
 export function useChatConversation(convId: string | null) {
  return useQuery({
    queryKey: chatKeys.conversation(convId ?? ""),
    queryFn: ({ signal }) =>
      apiRequest<{ conversation: ChatConversation; messages: ChatMessage[] }>(
        `/api/training/chat/conversations/${encodeURIComponent(convId!)}`,
        { signal },
      ),
    enabled: Boolean(convId),
    staleTime: 5_000,
  });
 }
 export function useChatHealth() {
  return useQuery({
    queryKey: chatKeys.health(),
    queryFn: ({ signal }) =>
      apiRequest<ChatHealth>("/api/training/chat/health", { signal }),
    staleTime: 30_000,
    retry: false,
  });
 }
 export function useCreateChat() {
  const qc = useQueryClient();
  return useMutation({
    mutationFn: (body: { title?: string; style_corpus_id?: string | null }) =>
      apiRequest<ChatConversation>("/api/training/chat/conversations", {
        method: "POST",
        body,
      }),
    onSuccess: () => {
      qc.invalidateQueries({ queryKey: chatKeys.conversations() });
    },
  });
 }
 export function useDeleteChat() {
  const qc = useQueryClient();
  return useMutation({
    mutationFn: (id: string) =>
      apiRequest<{ deleted: boolean }>(
        `/api/training/chat/conversations/${encodeURIComponent(id)}`,
        { method: "DELETE" },
      ),
    onSuccess: () => {
      qc.invalidateQueries({ queryKey: chatKeys.conversations() });
    },
  });
 }
 // ── Curator portrait ──────────────────────────────────────────────
 export type CuratorPrompt = {
  content: string;
  filename: string;
  bytes: number;
  last_modified: number;
  gitea_url: string;
 };
 export type StyleAnalyzerPrompts = {
  analysis_prompt: string;
  single_decision_prompt: string;
  synthesis_prompt: string;
  max_input_tokens: number;
 };
 export type CuratorFinding = {
  id: string;
  lesson_text: string;
  category: string;
  applied_to_skill: boolean;
  decision_number: string;
  decision_date: string;
  created_at: string;
 };
 export type CuratorStats = {
  total_findings: number;
  decisions_with_findings: number;
  decisions_total: number;
  findings_applied: number;
  recent_findings: CuratorFinding[];
 };
 export type CuratorProposalInput = {
  title: string;
  proposed_change: string;
  rationale: string;
 };
 export type CuratorProposalFile = {
  filename: string;
  bytes: number;
  modified_at: number;
 };
 export const curatorKeys = {
  prompt: () => [...trainingKeys.all, "curator", "prompt"] as const,
  analyzerPrompt: () => [...trainingKeys.all, "curator", "analyzer-prompt"] as const,
  stats: () => [...trainingKeys.all, "curator", "stats"] as const,
  proposals: () => [...trainingKeys.all, "curator", "proposals"] as const,
 };
 export function useCuratorPrompt() {
  return useQuery({
    queryKey: curatorKeys.prompt(),
    queryFn: ({ signal }) =>
      apiRequest<CuratorPrompt>("/api/training/curator/prompt", { signal }),
    staleTime: 5 * 60_000,
  });
 }
 export function useStyleAnalyzerPrompts() {
  return useQuery({
    queryKey: curatorKeys.analyzerPrompt(),
    queryFn: ({ signal }) =>
      apiRequest<StyleAnalyzerPrompts>(
        "/api/training/curator/style-analyzer-prompt",
        { signal },
      ),
    staleTime: 5 * 60_000,
  });
 }
 export function useCuratorStats() {
  return useQuery({
    queryKey: curatorKeys.stats(),
    queryFn: ({ signal }) =>
      apiRequest<CuratorStats>("/api/training/curator/stats", { signal }),
    staleTime: 60_000,
  });
 }
 export function useCuratorProposals() {
  return useQuery({
    queryKey: curatorKeys.proposals(),
    queryFn: ({ signal }) =>
      apiRequest<CuratorProposalFile[]>("/api/training/curator/proposals", { signal }),
    staleTime: 30_000,
  });
 }
 export function useSubmitCuratorProposal() {
  const qc = useQueryClient();
  return useMutation({
    mutationFn: (body: CuratorProposalInput) =>
      apiRequest<{ saved: boolean; filename: string }>(
        "/api/training/curator/proposals",
        { method: "POST", body },
      ),
    onSuccess: () => {
      qc.invalidateQueries({ queryKey: curatorKeys.proposals() });
    },
  });
 }
 // ── Upload flow ──────────────────────────────────────────────────
 // Three-step pipeline:
 //   1. useUploadFile           → POST /api/upload (multipart)       → { filename }
 //   2. useAnalyzeTraining      → POST /api/training/analyze (form)  → preview + extracted metadata
 //   3. useCommitTrainingUpload → POST /api/training/upload (json)   → { task_id }
 //      Track task_id via useProgress() from documents.ts.
 export type UploadFileResponse = {
  filename: string;       // sanitized, time-prefixed name in UPLOAD_DIR
  original_name: string;
  size: number;
 };
 export type AnalyzeTrainingResponse = {
  filename: string;
  clean_text: string;
  preview: string;
  decision_number: string;
  decision_date: string;        // ISO YYYY-MM-DD or ""
  subject_categories: string[];
  stats: Record<string, unknown>;
  chars: number;
 };
 export type CommitTrainingRequest = {
  filename: string;
  decision_number: string;
  decision_date: string;        // YYYY-MM-DD or ""
  subject_categories: string[];
  title?: string;
 };
 export type CommitTrainingResponse = { task_id: string };
 export function useUploadFile() {
  return useMutation({
    mutationFn: async (file: File): Promise<UploadFileResponse> => {
      const fd = new FormData();
      fd.append("file", file);
      const res = await fetch("/api/upload", { method: "POST", body: fd });
      const contentType = res.headers.get("content-type") ?? "";
      const parsed = contentType.includes("application/json")
        ? await res.json().catch(() => null)
        : await res.text().catch(() => null);
      if (!res.ok) {
        throw new ApiError(
          typeof parsed === "object" && parsed && "detail" in parsed
            ? String((parsed as { detail: unknown }).detail)
            : `Upload failed with ${res.status}`,
          res.status,
          parsed,
        );
      }
      return parsed as UploadFileResponse;
    },
  });
 }
 export function useAnalyzeTraining() {
  return useMutation({
    mutationFn: async (filename: string): Promise<AnalyzeTrainingResponse> => {
      const fd = new FormData();
      fd.append("filename", filename);
      const res = await fetch("/api/training/analyze", {
        method: "POST",
        body: fd,
      });
      const contentType = res.headers.get("content-type") ?? "";
      const parsed = contentType.includes("application/json")
        ? await res.json().catch(() => null)
        : await res.text().catch(() => null);
      if (!res.ok) {
        throw new ApiError(
          typeof parsed === "object" && parsed && "detail" in parsed
            ? String((parsed as { detail: unknown }).detail)
            : `Analyze failed with ${res.status}`,
          res.status,
          parsed,
        );
      }
      return parsed as AnalyzeTrainingResponse;
    },
  });
 }
 // ── Per-decision lessons ─────────────────────────────────────────
 export type DecisionLesson = {
  id: string;
  style_corpus_id: string;
  lesson_text: string;
  category: "style" | "structure" | "lexicon" | "tabular" | "general";
  source: "manual" | "curator" | "chair" | "style_analyzer";
  applied_to_skill: boolean;
  created_by: string;
  created_at: string;
  updated_at: string;
 };
 export type LessonCreate = {
  lesson_text: string;
  category?: DecisionLesson["category"];
  source?: DecisionLesson["source"];
 };
 export type LessonPatch = {
  lesson_text?: string;
  category?: DecisionLesson["category"];
  applied_to_skill?: boolean;
 };
 export const lessonsKeys = {
  forCorpus: (corpusId: string) =>
    [...trainingKeys.all, "lessons", corpusId] as const,
 };
 export function useCorpusLessons(corpusId: string | null) {
  return useQuery({
    queryKey: lessonsKeys.forCorpus(corpusId ?? ""),
    queryFn: ({ signal }) =>
      apiRequest<DecisionLesson[]>(
        `/api/training/corpus/${encodeURIComponent(corpusId!)}/lessons`,
        { signal },
      ),
    enabled: Boolean(corpusId),
    staleTime: 30_000,
  });
 }
 export function useAddLesson(corpusId: string) {
  const qc = useQueryClient();
  return useMutation({
    mutationFn: (body: LessonCreate) =>
      apiRequest<DecisionLesson>(
        `/api/training/corpus/${encodeURIComponent(corpusId)}/lessons`,
        { method: "POST", body },
      ),
    onSuccess: () => {
      qc.invalidateQueries({ queryKey: lessonsKeys.forCorpus(corpusId) });
      // lessons_count on the corpus row is computed server-side, so
      // invalidate the list too — otherwise the badge stays stale.
      qc.invalidateQueries({ queryKey: trainingKeys.corpus() });
    },
  });
 }
 export function usePatchLesson(corpusId: string) {
  const qc = useQueryClient();
  return useMutation({
    mutationFn: ({ id, patch }: { id: string; patch: LessonPatch }) =>
      apiRequest<{ updated: boolean }>(
        `/api/training/lessons/${encodeURIComponent(id)}`,
        { method: "PATCH", body: patch },
      ),
    onSuccess: () => {
      qc.invalidateQueries({ queryKey: lessonsKeys.forCorpus(corpusId) });
    },
  });
 }
 export function useDeleteLesson(corpusId: string) {
  const qc = useQueryClient();
  return useMutation({
    mutationFn: (id: string) =>
      apiRequest<{ deleted: boolean }>(
        `/api/training/lessons/${encodeURIComponent(id)}`,
        { method: "DELETE" },
      ),
    onSuccess: () => {
      qc.invalidateQueries({ queryKey: lessonsKeys.forCorpus(corpusId) });
      qc.invalidateQueries({ queryKey: trainingKeys.corpus() });
    },
  });
 }
 export function usePatchCorpus() {
  const qc = useQueryClient();
  return useMutation({
    mutationFn: ({ id, patch }: { id: string; patch: CorpusDecisionPatch }) =>
      apiRequest<{ updated: boolean; id: string }>(
        `/api/training/corpus/${encodeURIComponent(id)}`,
        { method: "PATCH", body: patch },
      ),
    onSuccess: () => {
      qc.invalidateQueries({ queryKey: trainingKeys.corpus() });
      qc.invalidateQueries({ queryKey: trainingKeys.report() });
    },
  });
 }
 export function useCommitTrainingUpload() {
  // No onSuccess invalidation here — the row only appears after the
  // background task finishes. The dialog watches useProgress(task_id)
  // and invalidates trainingKeys when status === "completed".
  return useMutation({
    mutationFn: (body: CommitTrainingRequest) =>
      apiRequest<CommitTrainingResponse>("/api/training/upload", {
        method: "POST",
        body,
      }),
  });
 }
--- a/web/app.py
+++ b/web/app.py
@@ -12,6 +12,7 @@ import subprocess
 import sys
 import time
 from contextlib import asynccontextmanager
 from datetime import date as date_type
 from pathlib import Path
 from uuid import UUID, uuid4
@@ -945,32 +946,648 @@ async def training_corpus_delete(corpus_id: str):
    return result
 def _format_legal_citation(decision_number: str, decision_date: str) -> str:
    """Compose the Israeli ועדת ערר citation string from corpus metadata.
    Mirrors how decisions are referenced in Daphna's own writing — e.g.
    "ערר 1130-25 ועדת ערר ירושלים (26.4.2026)". Empty parts are dropped
    gracefully so partially populated rows still produce a readable label.
    """
    if not decision_number:
        return ""
    parts = [f"ערר {decision_number}", "ועדת ערר ירושלים"]
    if decision_date:
        try:
            d = date_type.fromisoformat(decision_date)
            parts.append(f"({d.day}.{d.month}.{d.year})")
        except ValueError:
            pass
    return " ".join(parts)
 _PARTIES_PATTERNS = (
    # "העורר: X", "העוררים: X" or "העוררת: X". Each pattern captures 3-120
    # characters and stops at the first newline.
    re.compile(r"העורר(?:ים|ת)?[:\s]+([^\n]{3,120})"),
    re.compile(r"המבקש(?:ים|ת)?[:\s]+([^\n]{3,120})"),
    re.compile(r"בעניין[:\s]+([^\n]{3,120})"),
 )
 _RESPONDENT_PATTERNS = (
    re.compile(r"המשיב(?:ים|ה|ות)?[:\s]+([^\n]{3,120})"),
    re.compile(r"נגד\s*\n+\s*([^\n]{3,120})"),
 )
 def _extract_parties(text: str) -> dict[str, str]:
    """Best-effort regex extraction of עורר/משיב from the first 5K of full_text.
    We only scan the head of the document because the parties are always
    declared at the top in Israeli legal decisions. The result is a hint
    for display — never authoritative — so a miss returns an empty string
    rather than raising.
    """
    head = (text or "")[:5000]
    appellant = respondent = ""
    for pat in _PARTIES_PATTERNS:
        m = pat.search(head)
        if m:
            appellant = m.group(1).strip(" .,-—")
            break
    for pat in _RESPONDENT_PATTERNS:
        m = pat.search(head)
        if m:
            respondent = m.group(1).strip(" .,-—")
            break
    return {"appellant": appellant, "respondent": respondent}
@app.get("/api/training/corpus")
 async def training_corpus_list():
-    """List all decisions currently in the style corpus."""
+    """List all decisions currently in the style corpus, with enriched metadata.
    Joins to ``documents`` via FK when available, falling back to the
    title-token match used in the chunking pipeline so legacy rows with
    ``style_corpus.document_id IS NULL`` still resolve to their page_count
    and chunk counts.
    """
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
-            "SELECT id, decision_number, decision_date, subject_categories, "
+            """
-            "       length(full_text) as chars, created_at "
+            SELECT sc.id,
-            "FROM style_corpus "
+                   sc.decision_number,
-            "ORDER BY created_at DESC"
+                   sc.decision_date,
                   sc.subject_categories,
                   length(sc.full_text) AS chars,
                   substring(sc.full_text from 1 for 5000) AS head_text,
                   sc.summary,
                   sc.outcome,
                   sc.key_principles,
                   sc.appeal_subtype,
                   sc.practice_area,
                   sc.document_id,
                   sc.created_at,
                   d.page_count AS page_count,
                   d.title       AS doc_title
            FROM style_corpus sc
            LEFT JOIN documents d ON d.id = sc.document_id
            ORDER BY sc.created_at DESC
            """
        )
-    return [
+    lessons_counts = await db.count_decision_lessons_per_corpus()
-        {
+    out = []
    for r in rows:
        cats = r["subject_categories"]
        if isinstance(cats, str):
            try:
                cats = json.loads(cats)
            except json.JSONDecodeError:
                cats = []
        kp = r["key_principles"]
        if isinstance(kp, str):
            try:
                kp = json.loads(kp)
            except json.JSONDecodeError:
                kp = []
        decision_date = str(r["decision_date"]) if r["decision_date"] else ""
        parties = _extract_parties(r["head_text"] or "")
        out.append({
            "id": str(r["id"]),
            "decision_number": r["decision_number"] or "",
-            "decision_date": str(r["decision_date"]) if r["decision_date"] else "",
+            "decision_date": decision_date,
-            "subject_categories": (
+            "subject_categories": cats or [],
-                json.loads(r["subject_categories"])
-                if isinstance(r["subject_categories"], str)
-                else r["subject_categories"] or []
-            ),
            "chars": r["chars"],
            "created_at": r["created_at"].isoformat() if r["created_at"] else "",
            # ── enriched fields ──
            "summary": r["summary"] or "",
            "outcome": r["outcome"] or "",
            "key_principles": kp or [],
            "appeal_subtype": r["appeal_subtype"] or "",
            "practice_area": r["practice_area"] or "",
            "page_count": r["page_count"] or 0,
            "document_id": str(r["document_id"]) if r["document_id"] else None,
            "doc_title": r["doc_title"] or "",
            "parties": parties,
            "legal_citation": _format_legal_citation(r["decision_number"] or "", decision_date),
            "lessons_count": lessons_counts.get(str(r["id"]), 0),
        })
    return out
 # ── Style-agent chat (delegated to legal-chat-service on host) ─────
 class ChatConversationCreate(BaseModel):
    title: str = "שיחה חדשה"
    style_corpus_id: str | None = None     # optional — scope chat to a decision
 class ChatMessageRequest(BaseModel):
    content: str
 def _conv_to_json(row: dict) -> dict:
    """Serialize a chat_conversations row for the API."""
    return {
        "id": str(row["id"]),
        "title": row.get("title") or "",
        "style_corpus_id": str(row["style_corpus_id"]) if row.get("style_corpus_id") else None,
        "decision_number": row.get("decision_number") or "",
        "claude_session_id": row.get("claude_session_id"),
        "message_count": row.get("message_count", 0),
        "created_at": row["created_at"].isoformat() if row.get("created_at") else "",
        "last_message_at": row["last_message_at"].isoformat() if row.get("last_message_at") else "",
    }
 def _msg_to_json(row: dict) -> dict:
    return {
        "id": str(row["id"]),
        "role": row["role"],
        "content": row["content"],
        "created_at": row["created_at"].isoformat() if row.get("created_at") else "",
    }
@app.post("/api/training/chat/conversations")
 async def chat_create_conversation(body: ChatConversationCreate):
    """Create a new style-agent chat conversation."""
    corpus_uuid: UUID | None = None
    if body.style_corpus_id:
        try:
            corpus_uuid = UUID(body.style_corpus_id)
        except ValueError:
            raise HTTPException(400, "invalid style_corpus_id")
    row = await db.create_chat_conversation(
        title=body.title.strip() or "שיחה חדשה",
        style_corpus_id=corpus_uuid,
    )
    if not row:
        raise HTTPException(500, "failed to create conversation")
    return _conv_to_json(row)
@app.get("/api/training/chat/conversations")
 async def chat_list_conversations(limit: int = 50):
    rows = await db.list_chat_conversations(limit=limit)
    return [_conv_to_json(r) for r in rows]
@app.get("/api/training/chat/conversations/{conv_id}")
 async def chat_get_conversation(conv_id: str):
    try:
        cid = UUID(conv_id)
    except ValueError:
        raise HTTPException(400, "invalid conv_id")
    conv = await db.get_chat_conversation(cid)
    if not conv:
        raise HTTPException(404, "conversation not found")
    messages = await db.list_chat_messages(cid)
    return {
        "conversation": _conv_to_json(conv),
        "messages": [_msg_to_json(m) for m in messages],
    }
@app.delete("/api/training/chat/conversations/{conv_id}")
 async def chat_delete_conversation(conv_id: str):
    try:
        cid = UUID(conv_id)
    except ValueError:
        raise HTTPException(400, "invalid conv_id")
    result = await db.delete_chat_conversation(cid)
    if not result.get("deleted"):
        raise HTTPException(404, "conversation not found")
    return result
@app.post("/api/training/chat/conversations/{conv_id}/messages")
 async def chat_send_message(conv_id: str, body: ChatMessageRequest):
    """Send a user message; stream the assistant response as SSE.
    Proxies through ``web.chat_proxy.stream_chat_message`` to the
    legal-chat-service running on the host.
    """
    try:
        cid = UUID(conv_id)
    except ValueError:
        raise HTTPException(400, "invalid conv_id")
    text = (body.content or "").strip()
    if not text:
        raise HTTPException(400, "content is required")
    from web import chat_proxy
    return await chat_proxy.stream_chat_message(cid, text)
@app.get("/api/training/chat/health")
 async def chat_health():
    """Probe legal-chat-service liveness from inside the container.
    Useful when the UI wants to gracefully degrade ("שירות הצ'אט אינו
    זמין") instead of letting messages fail mid-stream.
    """
    from web import chat_proxy
    try:
        async with httpx.AsyncClient(timeout=httpx.Timeout(5.0)) as client:
            r = await client.get(f"{chat_proxy.CHAT_SERVICE_URL}/health")
        return {"reachable": r.status_code == 200, "status": r.status_code,
                "url": chat_proxy.CHAT_SERVICE_URL}
    except Exception as e:
        return {"reachable": False, "error": str(e),
                "url": chat_proxy.CHAT_SERVICE_URL}
 # ── Curator portrait — read prompt + stats + accept proposals ──────
 # The curator agent's prompt is symlinked into Paperclip, but the source
 # lives in the legal-ai repo. Resolve via env so the container (where the
 # agent file is mounted from a different path) and the host both work.
 _AGENTS_DIR = Path(os.environ.get(
    "AGENTS_DIR",
    str(Path(__file__).resolve().parent.parent / ".claude" / "agents"),
 ))
 _CURATOR_PROPOSALS_DIR = Path(os.environ.get(
    "CURATOR_PROPOSALS_DIR",
    str(Path(__file__).resolve().parent.parent / "data" / "curator-proposals"),
 ))
 _GITEA_REPO_BASE = os.environ.get(
    "GITEA_REPO_BASE",
    "https://gitea.nautilus.marcusgroup.org/ezer-mishpati/legal-ai",
 )
@app.get("/api/training/curator/prompt")
 async def get_curator_prompt():
    """Return the hermes-curator agent's prompt (read-only) + Gitea source URL.
    The file is the canonical source of how the curator analyzes Daphna's
    final decisions. Changes go through git/Gitea, not the UI — the UI just
    surfaces it for transparency.
    """
    path = _AGENTS_DIR / "hermes-curator.md"
    if not path.exists():
        raise HTTPException(404, f"curator prompt not found at {path}")
    try:
        content = path.read_text(encoding="utf-8")
        stat = path.stat()
    except OSError as e:
        raise HTTPException(500, f"failed to read curator prompt: {e}")
    gitea_url = (
        f"{_GITEA_REPO_BASE}/src/branch/main/.claude/agents/hermes-curator.md"
    )
    return {
        "content": content,
        "filename": path.name,
        "bytes": stat.st_size,
        "last_modified": stat.st_mtime,
        "gitea_url": gitea_url,
    }
@app.get("/api/training/curator/style-analyzer-prompt")
 async def get_style_analyzer_prompt():
    """Return the system prompt that style_analyzer.py uses to extract patterns.
    Surfaces the *training-time* prompt (Claude Opus 1M context) so the
    chair can compare it against the curator's post-export prompt. Both
    are shown side-by-side in the curator-portrait tab.
    """
    # Imported lazily, inside the handler, so that loading web/app.py does
    # not pull in the service module (and with it claude_session + db).
    # The prompts are the ones defined in
    # mcp-server/src/legal_mcp/services/style_analyzer.py.
    try:
        from legal_mcp.services import style_analyzer
        return {
            "analysis_prompt": style_analyzer.ANALYSIS_PROMPT,
            "single_decision_prompt": style_analyzer.SINGLE_DECISION_PROMPT,
            "synthesis_prompt": style_analyzer.SYNTHESIS_PROMPT,
            "max_input_tokens": style_analyzer.MAX_INPUT_TOKENS,
        }
    except Exception as e:
        raise HTTPException(500, f"failed to load style_analyzer prompt: {e}")
@app.get("/api/training/curator/stats")
 async def get_curator_stats():
    """Cheap aggregate stats over decision_lessons + style_corpus.
    Used by the Curator-Portrait tab to show "10 curator findings across 24
    decisions". We deliberately keep this server-side and aggregate so the
    UI can render a single card without fanning out N queries.
    """
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        total_lessons = await conn.fetchval(
            "SELECT count(*) FROM decision_lessons WHERE source = 'curator'"
        )
        decisions_with_findings = await conn.fetchval(
            "SELECT count(DISTINCT style_corpus_id) FROM decision_lessons "
            "WHERE source = 'curator'"
        )
        total_corpus = await conn.fetchval("SELECT count(*) FROM style_corpus")
        applied = await conn.fetchval(
            "SELECT count(*) FROM decision_lessons "
            "WHERE source = 'curator' AND applied_to_skill = true"
        )
        # Last 10 curator findings — newest first
        recent_rows = await conn.fetch(
            """
            SELECT dl.id, dl.lesson_text, dl.category, dl.applied_to_skill,
                   dl.created_at,
                   sc.decision_number, sc.decision_date
            FROM decision_lessons dl
            JOIN style_corpus sc ON sc.id = dl.style_corpus_id
            WHERE dl.source = 'curator'
            ORDER BY dl.created_at DESC
            LIMIT 10
            """
        )
    return {
        "total_findings": total_lessons or 0,
        "decisions_with_findings": decisions_with_findings or 0,
        "decisions_total": total_corpus or 0,
        "findings_applied": applied or 0,
        "recent_findings": [
            {
                "id": str(r["id"]),
                "lesson_text": r["lesson_text"],
                "category": r["category"],
                "applied_to_skill": bool(r["applied_to_skill"]),
                "decision_number": r["decision_number"] or "",
                "decision_date": str(r["decision_date"]) if r["decision_date"] else "",
                "created_at": r["created_at"].isoformat() if r["created_at"] else "",
            }
            for r in recent_rows
        ],
    }
 class CuratorProposal(BaseModel):
    title: str
    proposed_change: str       # markdown — what to change in the prompt
    rationale: str             # markdown — why
@app.post("/api/training/curator/proposals")
 async def create_curator_proposal(body: CuratorProposal):
    """Save a proposed change to the curator prompt as a file on disk.
    No automatic commit, no overwrite — the chair (chaim) reviews the
    file manually and applies it through git. This is intentional: the
    prompt is too load-bearing to mutate from a web UI.
    """
    title = (body.title or "").strip()
    if not title:
        raise HTTPException(400, "title is required")
    if not body.proposed_change.strip():
        raise HTTPException(400, "proposed_change is required")
    _CURATOR_PROPOSALS_DIR.mkdir(parents=True, exist_ok=True)
    # Slug-ish filename — strip anything that isn't a Hebrew letter, ASCII
    # letter, digit, hyphen, or underscore. Hebrew letters are explicitly
    # allowed because most proposals will be in Hebrew.
    slug = re.sub(r"[^\w֐-׿\-]+", "-", title)[:60].strip("-_") or "proposal"
    today = date_type.today().isoformat()
    fname = f"{today}-{slug}.md"
    path = _CURATOR_PROPOSALS_DIR / fname
    # If a proposal with the same slug already exists today, append a
    # numeric suffix so we don't silently overwrite.
    idx = 2
    while path.exists():
        path = _CURATOR_PROPOSALS_DIR / f"{today}-{slug}-{idx}.md"
        idx += 1
    md = (
        f"# הצעת שינוי לפרומפט hermes-curator\n\n"
        f"- **תאריך:** {today}\n"
        f"- **כותרת:** {title}\n\n"
        f"## שינוי מוצע\n\n{body.proposed_change.strip()}\n\n"
        f"## נימוק\n\n{body.rationale.strip() or '(לא ניתן)'}\n"
    )
    try:
        path.write_text(md, encoding="utf-8")
    except OSError as e:
        raise HTTPException(500, f"failed to write proposal: {e}")
    return {
        "saved": True,
        "filename": path.name,
        "path": str(path),
        "bytes": len(md.encode("utf-8")),
    }
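# Reviewer sketch (not part of the commit): the slug + numeric-suffix logic
# above, pulled into a standalone helper so it can be exercised in isolation.
# `unique_proposal_path` and `demo_collision_names` are hypothetical names,
# and the Hebrew range is assumed to be U+0590 to U+05FF.

```python
import re
import tempfile
from pathlib import Path


def unique_proposal_path(directory: Path, title: str, today: str) -> Path:
    """Return a non-colliding ``<today>-<slug>.md`` path inside ``directory``."""
    # Collapse every run of disallowed characters into a single hyphen,
    # cap the slug at 60 chars, and fall back to "proposal" if nothing is left.
    slug = re.sub(r"[^\w\u0590-\u05FF\-]+", "-", title)[:60].strip("-_") or "proposal"
    path = directory / f"{today}-{slug}.md"
    idx = 2
    # Append -2, -3, ... until the name is free, so nothing is overwritten.
    while path.exists():
        path = directory / f"{today}-{slug}-{idx}.md"
        idx += 1
    return path


def demo_collision_names() -> list[str]:
    """Save two proposals with the same title and return their filenames."""
    with tempfile.TemporaryDirectory() as tmp:
        directory = Path(tmp)
        names = []
        for _ in range(2):
            p = unique_proposal_path(directory, "Tighten tone!", "2025-01-01")
            p.write_text("x", encoding="utf-8")
            names.append(p.name)
        return names
```

# Note: the exists()/write_text() pair is not atomic, so two concurrent
# requests could still pick the same name; opening with mode "x" would be
# one way to close that gap.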
@app.get("/api/training/curator/proposals")
 async def list_curator_proposals():
    """List proposed-change files in data/curator-proposals/, newest first."""
    if not _CURATOR_PROPOSALS_DIR.exists():
        return []
    items = []
    for p in sorted(_CURATOR_PROPOSALS_DIR.iterdir(),
                    key=lambda f: f.stat().st_mtime, reverse=True):
        if not p.is_file() or p.suffix.lower() != ".md":
            continue
        stat = p.stat()
        items.append({
            "filename": p.name,
            "bytes": stat.st_size,
            "modified_at": stat.st_mtime,
        })
    return items
 # ── Per-decision lessons (decision_lessons table) ──────────────────
 class LessonCreate(BaseModel):
    lesson_text: str
    category: str = "general"
    source: str = "manual"
 class LessonPatch(BaseModel):
    lesson_text: str | None = None
    category: str | None = None
    applied_to_skill: bool | None = None
 _LESSON_CATEGORIES = {"style", "structure", "lexicon", "tabular", "general"}
 _LESSON_SOURCES = {"manual", "curator", "chair", "style_analyzer"}
 def _lesson_to_json(row: dict) -> dict:
    return {
        "id": str(row["id"]),
        "style_corpus_id": str(row["style_corpus_id"]),
        "lesson_text": row["lesson_text"],
        "category": row["category"],
        "source": row["source"],
        "applied_to_skill": bool(row["applied_to_skill"]),
        "created_by": row.get("created_by") or "",
        "created_at": row["created_at"].isoformat() if row.get("created_at") else "",
        "updated_at": row["updated_at"].isoformat() if row.get("updated_at") else "",
    }
@app.get("/api/training/corpus/{corpus_id}/lessons")
 async def list_corpus_lessons(corpus_id: str):
    try:
        cid = UUID(corpus_id)
    except ValueError:
        raise HTTPException(400, "invalid corpus_id")
    rows = await db.list_decision_lessons(cid)
    return [_lesson_to_json(r) for r in rows]
@app.post("/api/training/corpus/{corpus_id}/lessons")
 async def add_corpus_lesson(corpus_id: str, body: LessonCreate):
    try:
        cid = UUID(corpus_id)
    except ValueError:
        raise HTTPException(400, "invalid corpus_id")
    text = (body.lesson_text or "").strip()
    if not text:
        raise HTTPException(400, "lesson_text is required")
    if body.category not in _LESSON_CATEGORIES:
        raise HTTPException(400, f"invalid category; allowed: {sorted(_LESSON_CATEGORIES)}")
    if body.source not in _LESSON_SOURCES:
        raise HTTPException(400, f"invalid source; allowed: {sorted(_LESSON_SOURCES)}")
    row = await db.add_decision_lesson(
        cid, lesson_text=text, category=body.category, source=body.source,
    )
    if not row:
        raise HTTPException(500, "failed to insert lesson")
    return _lesson_to_json(row)
@app.patch("/api/training/lessons/{lesson_id}")
 async def patch_corpus_lesson(lesson_id: str, body: LessonPatch):
    try:
        lid = UUID(lesson_id)
    except ValueError:
        raise HTTPException(400, "invalid lesson_id")
    if body.category is not None and body.category not in _LESSON_CATEGORIES:
        raise HTTPException(400, f"invalid category; allowed: {sorted(_LESSON_CATEGORIES)}")
    result = await db.update_decision_lesson(
        lid,
        lesson_text=body.lesson_text,
        category=body.category,
        applied_to_skill=body.applied_to_skill,
    )
    if not result.get("updated"):
        if result.get("reason") == "not found":
            raise HTTPException(404, "lesson not found")
        return result  # "nothing to update" — 200 with reason
    return result
@app.delete("/api/training/lessons/{lesson_id}")
 async def delete_corpus_lesson(lesson_id: str):
    try:
        lid = UUID(lesson_id)
    except ValueError:
        raise HTTPException(400, "invalid lesson_id")
    result = await db.delete_decision_lesson(lid)
    if not result.get("deleted"):
        raise HTTPException(404, "lesson not found")
    return result
@app.get("/api/training/corpus/{corpus_id}/full-text")
 async def training_corpus_full_text(corpus_id: str):
    """Return the proofread full_text for a single corpus row.
    Kept out of the list endpoint because full_text is large (50K-650K chars
    per decision) and the table view only needs counts. The drawer fetches
    it on demand when the chair opens the "content" tab.
    """
    try:
        cid = UUID(corpus_id)
    except ValueError:
        raise HTTPException(400, "invalid corpus_id")
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow(
            "SELECT decision_number, full_text FROM style_corpus WHERE id = $1",
            cid,
        )
    if not row:
        raise HTTPException(404, "corpus row not found")
    return {
        "id": corpus_id,
        "decision_number": row["decision_number"] or "",
        "full_text": row["full_text"] or "",
    }
 class TrainingCorpusPatch(BaseModel):
    """Editable metadata fields on a style_corpus row.
    full_text is intentionally NOT editable — the corpus is write-once.
    For corrections, re-upload the decision via /api/training/upload.
    """
    decision_number: str | None = None
    decision_date: str | None = None       # ISO YYYY-MM-DD, or "" to clear
    subject_categories: list[str] | None = None
    summary: str | None = None
    outcome: str | None = None
    key_principles: list[str] | None = None
    appeal_subtype: str | None = None
    practice_area: str | None = None
@app.patch("/api/training/corpus/{corpus_id}")
 async def training_corpus_patch(corpus_id: str, patch: TrainingCorpusPatch):
    """Update metadata fields on a corpus row. Only provided fields are touched."""
    try:
        cid = UUID(corpus_id)
    except ValueError:
        raise HTTPException(400, "invalid corpus_id")
    fields = patch.model_dump(exclude_none=True)
    if not fields:
        return {"updated": False, "reason": "no fields to update"}
    # Coerce decision_date "" → SQL NULL, otherwise parse as DATE.
    if "decision_date" in fields:
        v = fields["decision_date"]
        if v == "":
            fields["decision_date"] = None
        else:
            try:
                fields["decision_date"] = date_type.fromisoformat(v)
            except ValueError as e:
                raise HTTPException(400, f"invalid decision_date: {e}")
    # subject_categories + key_principles are JSONB columns.
    if "subject_categories" in fields:
        fields["subject_categories"] = json.dumps(fields["subject_categories"])
    if "key_principles" in fields:
        fields["key_principles"] = json.dumps(fields["key_principles"])
    # Build a positional UPDATE — asyncpg doesn't support named parameters.
    cols = list(fields.keys())
    set_clause = ", ".join(f"{c} = ${i + 2}" for i, c in enumerate(cols))
    values = [fields[c] for c in cols]
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        result = await conn.fetchrow(
            f"UPDATE style_corpus SET {set_clause} "
            f"WHERE id = $1 "
            f"RETURNING id, decision_number, decision_date, summary, outcome",
            cid, *values,
        )
    if not result:
        raise HTTPException(404, "corpus row not found")
    return {
        "updated": True,
        "id": str(result["id"]),
        "decision_number": result["decision_number"] or "",
        "decision_date": str(result["decision_date"]) if result["decision_date"] else "",
        "summary_len": len(result["summary"] or ""),
        "outcome_len": len(result["outcome"] or ""),
    }
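# Reviewer sketch (not part of the commit): how the positional UPDATE above is
# assembled. `build_positional_update` and `_ALLOWED_COLUMNS` are hypothetical;
# the allow-list is an assumption added here because column names are
# interpolated into the SQL text, while values travel as $n parameters.

```python
_ALLOWED_COLUMNS = {
    "decision_number", "decision_date", "subject_categories", "summary",
    "outcome", "key_principles", "appeal_subtype", "practice_area",
}


def build_positional_update(fields: dict) -> tuple[str, list]:
    """Return (sql, values) for an UPDATE where $1 is reserved for the row id."""
    unknown = set(fields) - _ALLOWED_COLUMNS
    if unknown:
        raise ValueError(f"unexpected columns: {sorted(unknown)}")
    cols = list(fields)
    # Placeholders start at $2 because $1 is the id in the WHERE clause.
    set_clause = ", ".join(f"{c} = ${i + 2}" for i, c in enumerate(cols))
    sql = f"UPDATE style_corpus SET {set_clause} WHERE id = $1 RETURNING id"
    return sql, [fields[c] for c in cols]
```

# Usage: `sql, values = build_positional_update(fields)` followed by
# `await conn.fetchrow(sql, cid, *values)`, mirroring the call above.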
        for r in rows
    ]
 # Headers that defeat proxy buffering for SSE streams. `X-Accel-Buffering: no`
--- a/web/chat_proxy.py
+++ b/web/chat_proxy.py
@@ -0,0 +1,176 @@
 """FastAPI ↔ legal-chat-service streaming bridge.
 The browser hits ``/api/training/chat/conversations/{id}/messages`` on
 the legal-ai container. The container is sealed off from the host's
 ``claude`` CLI (intentional — see ``claude_session.py`` docstring), so
 we forward each request to the pm2-managed ``legal-chat-service`` on the
 Docker host (``host.docker.internal:8770``).
 Responsibilities:
  - Save the user message to ``chat_messages`` before streaming starts.
  - Open an HTTP streaming connection to the host service.
  - Forward each SSE event to the browser as-is, accumulating the
    assistant text and any ``session_id`` so we can persist them once
    the stream closes.
  - Persist the assistant turn + the CLI's session_id at end-of-stream.
 """
 from __future__ import annotations
 import json
 import logging
 import os
 from typing import AsyncIterator
 from uuid import UUID
 import httpx
 from fastapi import HTTPException
 from fastapi.responses import StreamingResponse
 from legal_mcp.services import db
 from web import chat_system_prompt
 logger = logging.getLogger(__name__)
 # legal-chat-service lives on the host. In the container we reach it via
 # host.docker.internal — which requires ``extra_hosts: host.docker.internal:host-gateway``
 # in the Coolify service definition. Set ``CHAT_SERVICE_URL`` to override
 # (handy for local dev outside Docker).
 CHAT_SERVICE_URL = os.environ.get(
    "CHAT_SERVICE_URL",
    "http://host.docker.internal:8770",
 )
 CHAT_SERVICE_TIMEOUT_S = float(os.environ.get("CHAT_SERVICE_TIMEOUT_S", "3600"))
 _SSE_HEADERS = {
    "Cache-Control": "no-cache, no-transform",
    "X-Accel-Buffering": "no",
    "Connection": "keep-alive",
 }
 async def stream_chat_message(
    conversation_id: UUID,
    user_message: str,
 ) -> StreamingResponse:
    """Open SSE stream, forward events, persist when done.
    Returns a FastAPI StreamingResponse the route can return directly.
    """
    conv = await db.get_chat_conversation(conversation_id)
    if not conv:
        raise HTTPException(404, "conversation not found")
    # Persist the user turn immediately so a network drop doesn't lose it.
    await db.add_chat_message(
        conversation_id, role="user", content=user_message,
    )
    is_first_turn = not conv.get("claude_session_id")
    system_block: str | None = None
    if is_first_turn:
        try:
            system_block = await chat_system_prompt.build_system_prompt(
                corpus_id=conv.get("style_corpus_id"),
            )
        except Exception as e:
            logger.exception("system prompt build failed")
            raise HTTPException(500, f"system prompt failed: {e}")
    payload = {
        "prompt": user_message,
        "system": system_block,
        "resume_session_id": conv.get("claude_session_id"),
    }
    async def proxy_stream() -> AsyncIterator[bytes]:
        accumulated_text: list[str] = []
        events_log: list[dict] = []
        new_session_id: str | None = None
        try:
            timeout_cfg = httpx.Timeout(
                CHAT_SERVICE_TIMEOUT_S,
                connect=10.0,
                read=CHAT_SERVICE_TIMEOUT_S,
            )
            async with httpx.AsyncClient(timeout=timeout_cfg) as client:
                async with client.stream(
                    "POST",
                    f"{CHAT_SERVICE_URL}/chat/start",
                    json=payload,
                ) as upstream:
                    if upstream.status_code != 200:
                        body = await upstream.aread()
                        msg = body.decode("utf-8", errors="replace")[:300]
                        err = {"type": "error",
                               "message": f"chat-service {upstream.status_code}: {msg}"}
                        yield f"data: {json.dumps(err, ensure_ascii=False)}\n\n".encode("utf-8")
                        return
                    async for line in upstream.aiter_lines():
                        if not line:
                            yield b"\n"
                            continue
                        # Forward verbatim so the browser sees the same
                        # SSE framing the host emits.
                        out = line + "\n"
                        yield out.encode("utf-8")
                        # Mirror events: capture text + session_id for
                        # persistence. The line starts with "data: <json>"
                        # so we strip the prefix before parsing.
                        if line.startswith("data: "):
                            try:
                                event = json.loads(line[len("data: "):])
                            except json.JSONDecodeError:
                                continue
                            events_log.append(event)
                            t = event.get("type")
                            if t == "session_id" and event.get("value"):
                                new_session_id = event["value"]
                            elif t == "text_delta" and event.get("text"):
                                accumulated_text.append(event["text"])
                            elif t == "done" and event.get("text"):
                                if not accumulated_text:
                                    accumulated_text.append(event["text"])
        except httpx.ConnectError:
            err = {
                "type": "error",
                "message": (
                    f"לא ניתן להגיע ל-legal-chat-service בכתובת {CHAT_SERVICE_URL}. "
                    "ודא ש-pm2 מריץ אותו: `pm2 status legal-chat-service`."
                ),
            }
            yield f"data: {json.dumps(err, ensure_ascii=False)}\n\n".encode("utf-8")
            return
        except Exception as e:
            logger.exception("chat proxy failed")
            err = {"type": "error", "message": str(e)}
            yield f"data: {json.dumps(err, ensure_ascii=False)}\n\n".encode("utf-8")
            return
        # End of stream — persist the assistant turn.
        try:
            full_text = "".join(accumulated_text).strip()
            if full_text:
                await db.add_chat_message(
                    conversation_id,
                    role="assistant",
                    content=full_text,
                    raw_events=events_log,
                )
            if new_session_id:
                await db.update_chat_conversation_session_id(
                    conversation_id, new_session_id,
                )
        except Exception:
            logger.exception("failed to persist assistant turn for conv=%s", conversation_id)
    return StreamingResponse(
        proxy_stream(),
        media_type="text/event-stream",
        headers=_SSE_HEADERS,
    )
--- a/web/chat_system_prompt.py
+++ b/web/chat_system_prompt.py
@@ -0,0 +1,205 @@
 """Compose the system prompt the style-chat agent receives.
 The chat runs against the local ``claude`` CLI on the host (via
 legal-chat-service). We assemble a once-per-conversation system block
 that gives the agent everything it needs to discuss decisions in
 Daphna's voice:
  - The style guide (``skills/decision/SKILL.md``) — how she writes
  - The lessons file (``docs/legal-decision-lessons.md``) — what we've
    learned across the corpus
  - The corpus-analysis report (``docs/corpus-analysis.md``) — the
    structural map of 24+ decisions
  - A summary of every style_corpus row (number, date, subjects,
    chars + summary if extracted) so the agent can reason about the
    whole corpus without us shipping all of it inline
  - Optional: when the conversation is scoped to a specific decision
    (``style_corpus_id``), append its full_text so the chat can dive
    into the text directly
 Sent **once**, with the first message of a conversation (while no
 ``claude_session_id`` is stored yet). On subsequent messages the
 legal-chat-service uses ``claude --resume <session_id>`` and the on-disk
 CLI session keeps the system context intact, so there is no need to
 re-ship the 100K+ chars of skills + lessons every turn.
 """
 from __future__ import annotations
 import logging
 import os
 from pathlib import Path
 from uuid import UUID
 from legal_mcp.services import db
 logger = logging.getLogger(__name__)
 # The reference files live in the repo at known paths. In the
 # container they're mounted alongside the code, so resolve relative
 # to the repo root (the parent of the web/ directory) unless
 # LEGAL_AI_REPO_ROOT overrides it.
 _REPO_ROOT = Path(os.environ.get(
    "LEGAL_AI_REPO_ROOT",
    str(Path(__file__).resolve().parent.parent),
 ))
 _SKILLS_PATH = _REPO_ROOT / "skills" / "decision" / "SKILL.md"
 _LESSONS_PATH = _REPO_ROOT / "docs" / "legal-decision-lessons.md"
 _CORPUS_ANALYSIS_PATH = _REPO_ROOT / "docs" / "corpus-analysis.md"
 def _safe_read(path: Path, cap_chars: int = 50_000) -> str:
    """Read a file (UTF-8) or return a marker that it's missing.
    The cap protects against accidentally injecting an enormous file —
    even at 50K, a single source file is the lion's share of the
    system prompt budget.
    """
    try:
        text = path.read_text(encoding="utf-8")
    except FileNotFoundError:
        return f"(קובץ {path.name} לא נמצא בנתיב {path})"
    except OSError as e:
        logger.warning("could not read %s: %s", path, e)
        return f"(שגיאה בקריאת {path.name}: {e})"
    if len(text) > cap_chars:
        return text[:cap_chars] + f"\n\n[... חתך ב-{cap_chars:,} תווים מתוך {len(text):,}]"
    return text
 async def _corpus_summary_block() -> str:
    """Compact one-row-per-decision summary the agent can scan."""
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        records = await conn.fetch(
            """
            SELECT decision_number, decision_date, appeal_subtype,
                   subject_categories, length(full_text) AS chars,
                   coalesce(summary, '') AS summary,
                   coalesce(outcome, '') AS outcome
            FROM style_corpus
            ORDER BY decision_date NULLS LAST
            """
        )
    if not records:
        return "(הקורפוס ריק)"
    lines = []
    for r in records:
        cats = r["subject_categories"]
        if isinstance(cats, str):
            import json as _json
            try:
                cats = _json.loads(cats)
            except _json.JSONDecodeError:
                cats = []
        cats_str = ", ".join(cats or []) if cats else "—"
        date_str = str(r["decision_date"]) if r["decision_date"] else "—"
        summary = (r["summary"] or "").strip()
        outcome = (r["outcome"] or "").strip()
        head = f"- **{r['decision_number'] or '—'}** ({date_str}) [{r['appeal_subtype'] or '—'}] · {r['chars']:,} תווים"
        meta = f"  נושאים: {cats_str}"
        body = ""
        if summary:
            body = f"\n  תקציר: {summary}"
            if outcome:
                body += f" — תוצאה: {outcome}"
        elif outcome:
            body = f"\n  תוצאה: {outcome}"
        lines.append(head + "\n" + meta + body)
    return "\n".join(lines)
 async def _decision_full_text(corpus_id: UUID) -> str:
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow(
            "SELECT decision_number, decision_date, full_text "
            "FROM style_corpus WHERE id = $1",
            corpus_id,
        )
    if not row:
        return ""
    header = f"# החלטה {row['decision_number']} ({row['decision_date']})\n\n"
    return header + (row["full_text"] or "")
 SYSTEM_PROMPT_HEADER = """\
 אתה סוכן הסגנון של עו"ד דפנה תמיר, יו"ר ועדת הערר לתכנון ובניה — מחוז ירושלים.
 תפקידך: לעזור לחיים (העוזר המקצועי של דפנה) להבין, לנתח ולחדד את הסגנון
 של דפנה. אתה לא כותב החלטות חדשות; אתה דן בסגנון של החלטות קיימות,
 מזהה דפוסים, מקפיד שהכותבים העתידיים (ה-writer agent) יישארו נאמנים
 לקולה.
 יש לך גישה ל:
  1. **מדריך הסגנון** של דפנה (skills/decision/SKILL.md) — איך היא כותבת.
  2. **הלקחים הגנריים** מהקורפוס (docs/legal-decision-lessons.md) — מה
     למדנו לאורך 24+ החלטות. **חובה** להישען על הקבצים האלה כשאתה דן
     בסגנון, ולא להמציא תובנות חדשות מהאוויר.
  3. **ניתוח הקורפוס** המבני (docs/corpus-analysis.md) — מפת תוכן ופערים.
  4. **רשימת ההחלטות בקורפוס** (למטה) — סקירה תמציתית של כל החלטה
     שעלתה ל-style_corpus.
  5. **טקסט מלא של החלטה ספציפית** (אם השיחה הוצמדה ל-style_corpus_id).
 כללי תקשורת:
  - כל התשובות בעברית.
  - חיים יושב מולך, לא דפנה — אבל המטרה היא לחדד את הסגנון *של דפנה*.
  - אם חיים שואל "האם פסקה X מתאימה לסגנון של דפנה?" — תן ניתוח מנומק
    שמסתמך על SKILL.md ועל החלטות הקורפוס. אל תמציא ראיות.
  - אם אתה צריך החלטה ספציפית שאין בקורפוס — הודע לחיים שיצרף אותה.
  - אם חיים אומר לך משהו חדש על דפנה ("דפנה אומרת לעולם אל תפתח החלטה
    במילה X") — שמור את זה בזיכרון השיחה; אם זה מצדיק תיעוד קבוע, הצע
     לחיים להוסיף את זה כ-decision_lesson (POST /api/training/corpus/{corpus_id}/lessons)
    או כתוספת ל-SKILL.md.
  - אל תיתן לעצמך אישיות מומצאת — אתה כלי-עזר מקצועי, לא חבר.
 """
 async def build_system_prompt(
    *,
    corpus_id: UUID | None = None,
    include_corpus_summary: bool = True,
 ) -> str:
    """Assemble the full system prompt for a new chat conversation.
    Args:
        corpus_id: When set, the full_text of that decision is appended
            so the chat can dive into the text.
        include_corpus_summary: Set False for low-context chats (e.g.
            quick "what does Daphna do at the end of a betterment-levy
            decision?" — no need to ship 24 summaries).
    """
    parts: list[str] = [SYSTEM_PROMPT_HEADER]
    parts.append("\n## מדריך הסגנון (skills/decision/SKILL.md)\n")
    parts.append(_safe_read(_SKILLS_PATH, cap_chars=40_000))
    parts.append("\n\n## לקחים מהקורפוס (docs/legal-decision-lessons.md)\n")
    parts.append(_safe_read(_LESSONS_PATH, cap_chars=30_000))
    parts.append("\n\n## ניתוח קורפוס מבני (docs/corpus-analysis.md)\n")
    parts.append(_safe_read(_CORPUS_ANALYSIS_PATH, cap_chars=15_000))
    if include_corpus_summary:
        parts.append("\n\n## רשימת ההחלטות בקורפוס הסגנון\n")
        try:
            parts.append(await _corpus_summary_block())
        except Exception as e:
            logger.warning("corpus summary failed: %s", e)
            parts.append("(שגיאה בטעינת רשימת הקורפוס)")
    if corpus_id is not None:
        parts.append("\n\n## ההחלטה הספציפית בדיון (full_text)\n")
        try:
            txt = await _decision_full_text(corpus_id)
            if txt:
                parts.append(txt[:200_000])  # hard cap
            else:
                parts.append("(לא נמצאה החלטה — בדוק את ה-corpus_id)")
        except Exception as e:
            logger.warning("decision full_text failed: %s", e)
            parts.append("(שגיאה בטעינת ההחלטה)")
    return "\n".join(parts)