feat(training): Style Studio — upload, rich corpus, lessons, curator portrait, chat

Six-phase upgrade of /training from a read-only dashboard into a full Style Studio for managing Daphna's style corpus.

- Upload Sheet on /training: file → proofread preview → commit (no more CLI-only `upload-training` skill).
- Rich corpus metadata: GET /api/training/corpus returns summary, outcome, key_principles, page_count, parties (regex), legal_citation, lessons_count. PATCH endpoint for chair edits. CorpusDetailDrawer with 4 tabs (details/content/lessons/patterns) replaces the bare table row.
- LLM metadata enrichment: style_metadata_extractor + MCP tools (style_corpus_enrich, style_corpus_pending_enrichment) fill summary/outcome/key_principles via claude_session (free, host-side).
- Per-decision lessons: new decision_lessons table + 4 REST endpoints + LessonsTab in drawer; hermes-curator now auto-posts findings as decision_lessons(source=curator).
- Curator Portrait tab: prompt rendered with link to Gitea, recent curator findings, style_analyzer training prompts, propose-change form that writes proposals to data/curator-proposals/ for manual chair review (no auto-mutation of the agent file).
- Style chat tab: SSE-streamed conversations with the style agent. New host-side pm2 service (legal-chat-service, port 8770) wraps claude CLI with stream-json + --resume continuation; FastAPI proxies via host.docker.internal. Zero API cost: uses chaim's claude.ai subscription. chat_conversations + chat_messages persist history.

Architecture: keeps the existing rule that claude_session only runs on the host (not the container). The new legal-chat-service is the canonical bridge between the container and the local CLI for the chat feature; everything else (upload, metadata, lessons) stays within the container's existing capabilities.

Audit script (scripts/audit_training_corpus.py) included for verifying which corpus rows still need enrichment.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 10:06:22 +00:00
parent 0629f19d5f
commit bb0cd7c6a2
23 changed files with 4568 additions and 75 deletions
--- a/mcp-server/src/legal_mcp/services/claude_session.py
+++ b/mcp-server/src/legal_mcp/services/claude_session.py
@@ -142,3 +142,175 @@ async def query_json(
    """
    raw = await query(prompt, timeout=timeout, system=system)
    return parse_llm_json(raw)
+
+
+# ── Streaming + session continuation ────────────────────────────────
+
+
+async def query_streaming(
+    prompt: str,
+    *,
+    system: str | None = None,
+    resume_session_id: str | None = None,
+    timeout: int = LONG_TIMEOUT,
+    cwd: str | None = None,
+):
+    """Stream Claude's response as an async iterator of events.
+
+    Wraps `claude -p --output-format=stream-json` (newline-delimited JSON
+    objects from the CLI) and translates each line into a small, stable
+    shape that the chat service / SSE proxy can forward without leaking
+    CLI internals to the browser.
+
+    Event shapes yielded:
+        {"type": "session_id",  "value": "<uuid>"}      # first event, used for resume
+        {"type": "text_delta",  "text":  "<partial>"}   # assistant text, one event per text block
+        {"type": "tool_use",    "name": "...", "input": {...}}
+        {"type": "error",       "message": "..."}
+        {"type": "done",        "text": "<full response>"}
+
+    The CLI emits a richer stream; we project to this minimal set so the
+    front-end can stay stable across CLI upgrades.
+
+    Args:
+        prompt: The user message to send.
+        system: Optional system instructions (used only when starting a
+            fresh conversation — when resume_session_id is set, the
+            session already carries its system prompt).
+        resume_session_id: Continue a prior conversation. When given,
+            we don't re-send the system prompt; the CLI loads the
+            entire conversation history from disk.
+        timeout: Hard ceiling on the subprocess.
+        cwd: Working directory for the subprocess. Defaults to None, which
+            inherits the working directory of the calling process.
+    """
+    if resume_session_id:
+        # When resuming, system is already baked into the on-disk session
+        # — sending it again would be a no-op at best and confuse the
+        # conversation at worst.
+        full_prompt = prompt
+        cmd = [
+            "claude", "-p",
+            "--output-format", "stream-json",
+            "--verbose",
+            "--resume", resume_session_id,
+        ]
+    else:
+        full_prompt = f"{system}\n\n{prompt}" if system else prompt
+        cmd = [
+            "claude", "-p",
+            "--output-format", "stream-json",
+            "--verbose",
+        ]
+
+    if len(full_prompt) > 200_000:
+        logger.warning(
+            "Streaming: large prompt (%d chars) — may hit CLI input limits",
+            len(full_prompt),
+        )
+
+    try:
+        proc = await asyncio.create_subprocess_exec(
+            *cmd,
+            stdin=asyncio.subprocess.PIPE,
+            stdout=asyncio.subprocess.PIPE,
+            stderr=asyncio.subprocess.PIPE,
+            cwd=cwd,
+        )
+    except FileNotFoundError:
+        yield {
+            "type": "error",
+            "message": (
+                "Claude CLI not found on host — legal-chat-service must "
+                "run where the `claude` binary is installed (Daphna's host, "
+                "not the legal-ai container)."
+            ),
+        }
+        return
+
+    assert proc.stdin is not None  # for type checkers
+    assert proc.stdout is not None
+
+    # Send the prompt and close stdin so the CLI knows the user message
+    # is complete.
+    try:
+        proc.stdin.write(full_prompt.encode("utf-8"))
+        await proc.stdin.drain()
+        proc.stdin.close()
+    except (BrokenPipeError, ConnectionResetError):
+        # CLI exited before reading the prompt — drain stderr and bail.
+        stderr_b = await proc.stderr.read() if proc.stderr else b""
+        yield {
+            "type": "error",
+            "message": f"Claude CLI closed stdin early: {stderr_b.decode('utf-8', errors='replace')[:300]}",
+        }
+        return
+
+    accumulated_text: list[str] = []
+    session_id_emitted = False
+    deadline = asyncio.get_running_loop().time() + timeout
+    try:
+        while True:
+            remaining = deadline - asyncio.get_running_loop().time()
+            if remaining <= 0:
+                yield {"type": "error", "message": f"timed out after {timeout}s"}
+                break
+            try:
+                line_b = await asyncio.wait_for(proc.stdout.readline(), timeout=remaining)
+            except asyncio.TimeoutError:
+                yield {"type": "error", "message": f"stream timed out after {timeout}s"}
+                break
+            if not line_b:
+                break
+            line = line_b.decode("utf-8", errors="replace").strip()
+            if not line:
+                continue
+            try:
+                event = json.loads(line)
+            except json.JSONDecodeError:
+                # Stray non-JSON line from CLI — surface a snippet for debug.
+                logger.debug("non-JSON stream line: %s", line[:120])
+                continue
+
+            # The CLI's stream-json emits several event types. We only
+            # care about the ones the chat service forwards.
+            t = event.get("type")
+            if not session_id_emitted:
+                sid = event.get("session_id")
+                if sid:
+                    session_id_emitted = True
+                    yield {"type": "session_id", "value": sid}
+
+            if t == "assistant":
+                # event["message"]["content"] is a list of blocks; we extract
+                # text blocks and tool_use blocks.
+                msg = event.get("message") or {}
+                for block in msg.get("content") or []:
+                    btype = block.get("type")
+                    if btype == "text":
+                        text = block.get("text") or ""
+                        if text:
+                            accumulated_text.append(text)
+                            yield {"type": "text_delta", "text": text}
+                    elif btype == "tool_use":
+                        yield {
+                            "type": "tool_use",
+                            "name": block.get("name") or "",
+                            "input": block.get("input") or {},
+                        }
+            elif t == "result":
+                # Final synthesized result line from the CLI — we already
+                # delivered the deltas, so just stop here.
+                break
+    finally:
+        if proc.returncode is None:
+            try:
+                proc.kill()
+            except ProcessLookupError:
+                pass
+        try:
+            await proc.wait()
+        except Exception:
+            pass
+
+    yield {"type": "done", "text": "".join(accumulated_text)}
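
The docstring above says the chat service / SSE proxy forwards these events to the browser. A minimal sketch of what that forwarding step might look like is shown here. It is an illustration only, not code from this commit: `to_sse`, `fake_events` and `collect_frames` are hypothetical names, and `fake_events` stands in for `query_streaming` so the example runs without the `claude` CLI.

```python
import asyncio
import json
from typing import AsyncIterator


def to_sse(event: dict) -> str:
    """Format one event dict as a Server-Sent Events frame (hypothetical helper)."""
    # json.dumps escapes newlines inside strings, so the payload always fits
    # on a single "data:" line; ensure_ascii=False keeps non-ASCII text readable.
    payload = json.dumps(event, ensure_ascii=False)
    return f"event: {event['type']}\ndata: {payload}\n\n"


async def fake_events() -> AsyncIterator[dict]:
    """Hypothetical stand-in for query_streaming(); yields the same event shapes."""
    yield {"type": "session_id", "value": "00000000-0000-0000-0000-000000000000"}
    yield {"type": "text_delta", "text": "Hello"}
    yield {"type": "done", "text": "Hello"}


async def collect_frames(events: AsyncIterator[dict]) -> list[str]:
    """Drain an event iterator into a list of SSE frames."""
    return [to_sse(e) async for e in events]


if __name__ == "__main__":
    for frame in asyncio.run(collect_frames(fake_events())):
        print(frame, end="")
```

In a real proxy the frames would presumably be written to the response as they arrive rather than collected into a list; the list form is used here only to keep the sketch easy to check.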