feat(training): Style Studio — upload, rich corpus, lessons, curator portrait, chat
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 2m7s
Six-phase upgrade of /training from a read-only dashboard into a full Style Studio for managing Daphna's style corpus.

- Upload Sheet on /training: file → proofread preview → commit (no more CLI-only `upload-training` skill).
- Rich corpus metadata: GET /api/training/corpus returns summary, outcome, key_principles, page_count, parties (regex), legal_citation, lessons_count. PATCH endpoint for chair edits. CorpusDetailDrawer with 4 tabs (details/content/lessons/patterns) replaces the bare table row.
- LLM metadata enrichment: style_metadata_extractor + MCP tools (style_corpus_enrich, style_corpus_pending_enrichment) fill summary/outcome/key_principles via claude_session (free, host-side).
- Per-decision lessons: new decision_lessons table + 4 REST endpoints + LessonsTab in drawer; hermes-curator now auto-posts findings as decision_lessons(source=curator).
- Curator Portrait tab: prompt rendered with link to Gitea, recent curator findings, style_analyzer training prompts, propose-change form that writes proposals to data/curator-proposals/ for manual chair review (no auto-mutation of the agent file).
- Style chat tab: SSE-streamed conversations with the style agent. New host-side pm2 service (legal-chat-service, port 8770) wraps claude CLI with stream-json + --resume continuation; FastAPI proxies via host.docker.internal. Zero API cost: uses chaim's claude.ai subscription. chat_conversations + chat_messages persist history.

Architecture: keeps the existing rule that claude_session only runs on the host (not the container). The new legal-chat-service is the canonical bridge between the container and the local CLI for the chat feature; everything else (upload, metadata, lessons) stays within the container's existing capabilities.

Audit script (scripts/audit_training_corpus.py) included for verifying which corpus rows still need enrichment.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@@ -76,6 +76,24 @@ profiles:
 Authorization: Bearer $PAPERCLIP_API_KEY
 { "body": "<my findings>" }
 ```
+5b. **Also record every finding in the legal-ai API as a decision_lesson**, so that it appears in the UI
+under the decision's "מה למדנו" ("What we learned") tab in the corpus. Prerequisite: first find the `style_corpus_id`
+that matches the decision's `decision_number` (`GET /api/training/corpus`, then filter).
+For each finding:
+```
+POST https://legal-ai.nautilus.marcusgroup.org/api/training/corpus/{corpus_id}/lessons
+Content-Type: application/json
+{
+  "lesson_text": "<summary of the finding: what I saw + a suggestion, in one line>",
+  "category": "<style|structure|lexicon|tabular|general>",
+  "source": "curator"
+}
+```
+Mapping of finding tags to `category`:
+- `[סגנון]` → `style`
+- `[מבנה]` → `structure`
+- `[לקסיקון משפטי]` → `lexicon`
+- `[טבלאי]` → `tabular`
 6. Close the issue (status=done) after I have written the comment
 
 ## Comment format
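Step 5b can be sketched in Python as a pair of pure helpers, without making any HTTP calls. This is only an illustration: the helper names are hypothetical, the corpus rows are assumed to be dicts carrying `id` and `decision_number` keys, and the fallback to `general` for an unmapped tag is an assumption rather than something the instructions state.

```python
# Tag-to-category mapping copied from the curator instructions.
TAG_TO_CATEGORY = {
    "[סגנון]": "style",
    "[מבנה]": "structure",
    "[לקסיקון משפטי]": "lexicon",
    "[טבלאי]": "tabular",
}


def find_corpus_id(rows, decision_number):
    """Return the id of the first corpus row with this decision_number, or None.

    Assumption: each row is a dict with "id" and "decision_number" keys.
    """
    for row in rows:
        if row.get("decision_number") == decision_number:
            return row.get("id")
    return None


def build_lesson_payload(tag, lesson_text):
    """Build the JSON body for POST /api/training/corpus/{corpus_id}/lessons."""
    return {
        "lesson_text": lesson_text,
        # Assumption: tags without a mapping fall back to "general".
        "category": TAG_TO_CATEGORY.get(tag, "general"),
        "source": "curator",
    }
```

Keeping the lookup and the payload construction separate from the HTTP call makes the mapping easy to check on its own before anything is posted.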
10
CLAUDE.md
@@ -91,6 +91,16 @@
 - Code changes take effect after `pm2 restart paperclip`
 - **No need for Docker or Coolify**
 
+**legal-chat-service**: runs **locally via pm2** (new, as of April 2026):
+- Port: `localhost:8770` (loopback only)
+- A small aiohttp service that wraps the `claude` CLI with streaming + session continuation and serves the "שיחה" ("Chat") tab on the `/training` page. The container proxies to it via `host.docker.internal:8770`.
+- Code: [mcp-server/src/legal_mcp/chat_service/](mcp-server/src/legal_mcp/chat_service/)
+- Install: `pm2 start /home/chaim/legal-ai/scripts/legal-chat-service.config.cjs && pm2 save`
+- Health: `curl http://127.0.0.1:8770/health` → `{"ok":true,...}`
+- Code changes: `pm2 restart legal-chat-service`
+- **Zero API cost**: the claude CLI uses chaim's claude.ai subscription. The premise of `claude_session.py` (local claude CLI only) is preserved; this service is the official bridge between the container and the outside.
+- Coolify dependency: the legal-ai Service Definition must include `extra_hosts: host.docker.internal:host-gateway` (otherwise the proxy gets a ConnectError).
+
 ---
 
 ## Directory structure
13
mcp-server/src/legal_mcp/chat_service/__init__.py
Normal file
@@ -0,0 +1,13 @@
+"""legal-chat-service: host-side SSE bridge to ``claude`` CLI.
+
+Runs as a pm2-managed process on the host (port 127.0.0.1:8770 by default).
+The legal-ai FastAPI container proxies chat requests to it via
+``host.docker.internal:8770``.
+
+Why a separate service:
+The chat needs real-time streaming + multi-turn session continuation
+(``claude --resume <session_id>``). The container can't run the
+claude CLI (no binary, no claude.ai credentials). Splitting this out
+keeps the architectural rule of ``claude_session.py`` intact while
+enabling the new chat feature for free (no API key).
+"""
144
mcp-server/src/legal_mcp/chat_service/server.py
Normal file
@@ -0,0 +1,144 @@
+"""HTTP+SSE bridge from FastAPI (in container) to local claude CLI.
+
+Endpoints:
+    POST /chat/start   - body: {prompt, system?, resume_session_id?}
+                         returns SSE stream of events from
+                         ``claude_session.query_streaming``.
+    GET  /health       - liveness probe.
+
+Run with pm2:
+    pm2 start ecosystem.config.cjs --only legal-chat-service
+
+Standalone for dev:
+    cd ~/legal-ai/mcp-server
+    .venv/bin/python -m legal_mcp.chat_service.server --port 8770
+
+We intentionally bind to 127.0.0.1 only: the FastAPI container reaches
+us via ``host.docker.internal``, and exposing the bridge publicly would
+let anyone run claude CLI commands against Daphna's session.
+"""
+
+from __future__ import annotations
+
+import argparse
+import asyncio
+import json
+import logging
+import os
+import sys
+from typing import Any
+
+from aiohttp import web
+
+# Run-via-CLI bootstrap so ``python -m legal_mcp.chat_service.server``
+# works even when the package isn't installed (it is in the venv, but
+# this safeguard keeps the entrypoint robust).
+_pkg_root = os.path.dirname(os.path.dirname(os.path.dirname(__file__)))
+if _pkg_root not in sys.path:
+    sys.path.insert(0, _pkg_root)
+
+from legal_mcp.services import claude_session  # noqa: E402
+
+logger = logging.getLogger("legal_chat_service")
+
+
+async def health(request: web.Request) -> web.Response:
+    return web.json_response({"ok": True, "service": "legal-chat-service"})
+
+
+async def chat_start(request: web.Request) -> web.StreamResponse:
+    """Drive ``claude_session.query_streaming`` and forward events as SSE.
+
+    Request body (JSON):
+        prompt: str                      - required, user message
+        system: str | None               - system instructions (ignored if resuming)
+        resume_session_id: str | None    - continue a prior CLI session
+        timeout: int = 3600              - hard timeout for the subprocess
+    """
+    try:
+        body = await request.json()
+    except json.JSONDecodeError:
+        return web.json_response({"error": "invalid JSON body"}, status=400)
+
+    prompt = body.get("prompt") or ""
+    if not prompt.strip():
+        return web.json_response({"error": "prompt is required"}, status=400)
+    system = body.get("system")
+    resume_session_id = body.get("resume_session_id")
+    timeout = int(body.get("timeout") or 3600)
+
+    response = web.StreamResponse(
+        status=200,
+        reason="OK",
+        headers={
+            "Content-Type": "text/event-stream",
+            "Cache-Control": "no-cache, no-transform",
+            "Connection": "keep-alive",
+            # X-Accel-Buffering=no defeats nginx/traefik buffering. The
+            # FastAPI container proxies via httpx and forwards bytes as
+            # they arrive, but the inner header is harmless and makes
+            # browser-direct testing easier.
+            "X-Accel-Buffering": "no",
+        },
+    )
+    await response.prepare(request)
+
+    async def send_event(payload: dict[str, Any]) -> None:
+        line = f"data: {json.dumps(payload, ensure_ascii=False)}\n\n"
+        await response.write(line.encode("utf-8"))
+
+    try:
+        async for event in claude_session.query_streaming(
+            prompt,
+            system=system,
+            resume_session_id=resume_session_id,
+            timeout=timeout,
+        ):
+            await send_event(event)
+            if event.get("type") == "done" or event.get("type") == "error":
+                break
+    except asyncio.CancelledError:
+        # Client disconnected: bail cleanly.
+        logger.info("chat_start: client disconnected")
+    except Exception as e:
+        logger.exception("chat_start: streaming failed")
+        try:
+            await send_event({"type": "error", "message": str(e)})
+        except ConnectionResetError:
+            pass
+
+    try:
+        await response.write_eof()
+    except ConnectionResetError:
+        pass
+    return response
+
+
+def build_app() -> web.Application:
+    app = web.Application()
+    app.router.add_get("/health", health)
+    app.router.add_post("/chat/start", chat_start)
+    return app
+
+
+def main() -> int:
+    parser = argparse.ArgumentParser(description="legal-chat-service")
+    parser.add_argument("--port", type=int, default=8770)
+    parser.add_argument("--host", default="127.0.0.1",
+                        help="bind address; 127.0.0.1 keeps the service "
+                             "loopback-only; leave it alone in production")
+    parser.add_argument("--log-level", default="INFO")
+    args = parser.parse_args()
+
+    logging.basicConfig(
+        level=args.log_level.upper(),
+        format="%(asctime)s %(name)s %(levelname)s %(message)s",
+    )
+
+    app = build_app()
+    web.run_app(app, host=args.host, port=args.port, print=lambda _msg: None)
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
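The SSE framing used by `send_event` (one `data: <json>` line followed by a blank line) can be illustrated from the consumer's side. The sketch below is not part of this commit: `encode_sse` mirrors what `send_event` writes, and `decode_sse` is a hypothetical helper showing how a client could split an incoming byte buffer into complete events while keeping any partial frame for the next read.

```python
import json


def encode_sse(payload):
    """Encode one event the way send_event does: 'data: <json>' plus a blank line."""
    return f"data: {json.dumps(payload, ensure_ascii=False)}\n\n".encode("utf-8")


def decode_sse(buffer):
    """Split a byte buffer into (complete events, unconsumed remainder).

    Only frames terminated by a blank line are decoded, so a chunk that ends
    in the middle of a frame (or of a multi-byte character) is left intact.
    """
    events = []
    while b"\n\n" in buffer:
        frame, buffer = buffer.split(b"\n\n", 1)
        for line in frame.decode("utf-8").split("\n"):
            if line.startswith("data: "):
                events.append(json.loads(line[len("data: "):]))
    return events, buffer
```

Because `json.dumps` escapes newlines inside strings, a blank line can only appear at a frame boundary, which is what makes this split safe for the payloads shown here.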
@@ -57,6 +57,7 @@ from legal_mcp.tools import ( # noqa: E402
     legal_arguments as la_tools,
     missing_precedents as mp_tools,
     citations as cit_tools,
+    training_enrichment as train_tools,
 )
 
 
@@ -248,6 +249,18 @@ async def precedent_extract_metadata(case_law_id: str) -> str:
     return await plib.precedent_extract_metadata(case_law_id)
 
 
+@mcp.tool()
+async def style_corpus_enrich(corpus_id: str, overwrite: bool = False) -> str:
+    """Extract metadata (summary, outcome, key_principles, appeal_subtype) for a decision in Daphna's style corpus. Default: fills only empty fields. Pass `overwrite=true` to refresh."""
+    return await train_tools.extract_decision_metadata(corpus_id, overwrite=overwrite)
+
+
+@mcp.tool()
+async def style_corpus_pending_enrichment(limit: int = 50) -> str:
+    """List decisions in the style corpus that still lack summary/outcome/key_principles; these are candidates for extraction."""
+    return await train_tools.list_corpus_pending_enrichment(limit)
+
+
 @mcp.tool()
 async def precedent_process_pending(kind: str = "metadata", limit: int = 20) -> str:
     """Drain the queue of extraction requests sent from the UI. kind: 'metadata' or 'halacha'. Runs the extractor locally with the CLI on each queued item and clears the flag after success."""
@@ -142,3 +142,175 @@ async def query_json(
     """
     raw = await query(prompt, timeout=timeout, system=system)
     return parse_llm_json(raw)
+
+
+# ── Streaming + session continuation ────────────────────────────────
+
+
+async def query_streaming(
+    prompt: str,
+    *,
+    system: str | None = None,
+    resume_session_id: str | None = None,
+    timeout: int = LONG_TIMEOUT,
+    cwd: str | None = None,
+):
+    """Stream Claude's response as an async iterator of events.
+
+    Wraps `claude -p --output-format=stream-json` (newline-delimited JSON
+    objects from the CLI) and translates each line into a small, stable
+    shape that the chat service / SSE proxy can forward without leaking
+    CLI internals to the browser.
+
+    Event shapes yielded:
+        {"type": "session_id", "value": "<uuid>"}     # first event, used for resume
+        {"type": "text_delta", "text": "<partial>"}   # incremental assistant text
+        {"type": "tool_use", "name": "...", "input": {...}}
+        {"type": "error", "message": "..."}
+        {"type": "done", "text": "<full response>"}
+
+    The CLI emits a richer stream; we project to this minimal set so the
+    front-end can stay stable across CLI upgrades.
+
+    Args:
+        prompt: The user message to send.
+        system: Optional system instructions (used only when starting a
+            fresh conversation; when resume_session_id is set, the
+            session already carries its system prompt).
+        resume_session_id: Continue a prior conversation. When given,
+            we don't re-send the system prompt; the CLI loads the
+            entire conversation history from disk.
+        timeout: Hard ceiling on the subprocess.
+        cwd: Working directory for the subprocess; defaults to the
+            host's HOME so claude.ai credentials resolve correctly.
+    """
+    if resume_session_id:
+        # When resuming, system is already baked into the on-disk session;
+        # sending it again would be a no-op at best and confuse the
+        # conversation at worst.
+        full_prompt = prompt
+        cmd = [
+            "claude", "-p",
+            "--output-format", "stream-json",
+            "--verbose",
+            "--resume", resume_session_id,
+        ]
+    else:
+        full_prompt = f"{system}\n\n{prompt}" if system else prompt
+        cmd = [
+            "claude", "-p",
+            "--output-format", "stream-json",
+            "--verbose",
+        ]
+
+    if len(full_prompt) > 200_000:
+        logger.warning(
+            "Streaming: large prompt (%d chars); may hit CLI input limits",
+            len(full_prompt),
+        )
+
+    try:
+        proc = await asyncio.create_subprocess_exec(
+            *cmd,
+            stdin=asyncio.subprocess.PIPE,
+            stdout=asyncio.subprocess.PIPE,
+            stderr=asyncio.subprocess.PIPE,
+            cwd=cwd,
+        )
+    except FileNotFoundError:
+        yield {
+            "type": "error",
+            "message": (
+                "Claude CLI not found on host: legal-chat-service must "
+                "run where the `claude` binary is installed (Daphna's host, "
+                "not the legal-ai container)."
+            ),
+        }
+        return
+
+    assert proc.stdin is not None  # for type checkers
+    assert proc.stdout is not None
+
+    # Send the prompt and close stdin so the CLI knows the user message
+    # is complete.
+    try:
+        proc.stdin.write(full_prompt.encode("utf-8"))
+        await proc.stdin.drain()
+        proc.stdin.close()
+    except BrokenPipeError:
+        # CLI exited before reading the prompt: drain stderr and bail.
+        stderr_b = await proc.stderr.read() if proc.stderr else b""
+        yield {
+            "type": "error",
+            "message": f"Claude CLI closed stdin early: {stderr_b.decode('utf-8', errors='replace')[:300]}",
+        }
+        return
+
+    accumulated_text: list[str] = []
+    session_id_emitted = False
+    deadline = asyncio.get_event_loop().time() + timeout
+    try:
+        while True:
+            remaining = deadline - asyncio.get_event_loop().time()
+            if remaining <= 0:
+                yield {"type": "error", "message": f"timed out after {timeout}s"}
+                break
+            try:
+                line_b = await asyncio.wait_for(proc.stdout.readline(), timeout=remaining)
+            except asyncio.TimeoutError:
+                yield {"type": "error", "message": f"stream timed out after {timeout}s"}
+                break
+            if not line_b:
+                break
+            line = line_b.decode("utf-8", errors="replace").strip()
+            if not line:
+                continue
+            try:
+                event = json.loads(line)
+            except json.JSONDecodeError:
+                # Stray non-JSON line from CLI: surface a snippet for debug.
+                logger.debug("non-JSON stream line: %s", line[:120])
+                continue
+
+            # The CLI's stream-json emits several event types. We only
+            # care about the ones the chat service forwards.
+            t = event.get("type")
+            if not session_id_emitted:
+                sid = event.get("session_id")
+                if sid:
+                    session_id_emitted = True
+                    yield {"type": "session_id", "value": sid}
+
+            if t == "assistant":
+                # event["message"]["content"] is a list of blocks; we extract
+                # text blocks and tool_use blocks.
+                msg = event.get("message") or {}
+                for block in msg.get("content") or []:
+                    btype = block.get("type")
+                    if btype == "text":
+                        text = block.get("text") or ""
+                        if text:
+                            accumulated_text.append(text)
+                            yield {"type": "text_delta", "text": text}
+                    elif btype == "tool_use":
+                        yield {
+                            "type": "tool_use",
+                            "name": block.get("name") or "",
+                            "input": block.get("input") or {},
+                        }
+            elif t == "result":
+                # Final synthesized result line from the CLI; we already
+                # delivered the deltas, so just stop here.
+                break
+    finally:
+        if proc.returncode is None:
+            try:
+                proc.kill()
+            except ProcessLookupError:
+                pass
+            try:
+                await proc.wait()
+            except Exception:
+                pass
+
+    yield {"type": "done", "text": "".join(accumulated_text)}
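The projection step inside `query_streaming` (one stream-json line in, zero or more minimal events out) can be isolated as a pure function, which makes it easy to reason about without spawning the CLI. The sketch below is a hypothetical helper, not code from this commit; it handles only the `assistant` branch and relies solely on the fields the function above reads (`type`, `message.content`, and each block's `type`, `text`, `name`, `input`).

```python
import json


def project_line(line):
    """Translate one stream-json line into zero or more minimal events."""
    line = line.strip()
    if not line:
        return []
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return []  # stray non-JSON output is skipped
    if not isinstance(event, dict) or event.get("type") != "assistant":
        return []
    out = []
    for block in (event.get("message") or {}).get("content") or []:
        btype = block.get("type")
        if btype == "text" and block.get("text"):
            out.append({"type": "text_delta", "text": block["text"]})
        elif btype == "tool_use":
            out.append({
                "type": "tool_use",
                "name": block.get("name") or "",
                "input": block.get("input") or {},
            })
    return out
```

Anything the projection does not recognise is dropped, which is the property the docstring relies on when it says the front-end can stay stable across CLI upgrades.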
@@ -194,6 +194,55 @@ ALTER TABLE style_corpus ADD COLUMN IF NOT EXISTS appeal_subtype TEXT DEFAULT ''
 -- Extend style_patterns with appeal_subtype for separate style analysis per appeal type
 ALTER TABLE style_patterns ADD COLUMN IF NOT EXISTS appeal_subtype TEXT DEFAULT '';
 
+-- decision_lessons: per-decision learnings the chair / curator / style_analyzer
+-- attaches to a corpus row. The generic legal-decision-lessons.md file stays
+-- as the source of truth for cross-corpus patterns; this table stores the
+-- granular "what we learned from THIS decision" notes that drive the writer's
+-- future drafts and let the curator look up prior observations on the same row.
+CREATE TABLE IF NOT EXISTS decision_lessons (
+    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
+    style_corpus_id UUID NOT NULL REFERENCES style_corpus(id) ON DELETE CASCADE,
+    lesson_text TEXT NOT NULL,
+    category TEXT DEFAULT 'general',          -- style / structure / lexicon / tabular / general
+    source TEXT DEFAULT 'manual',             -- manual / curator / chair / style_analyzer
+    applied_to_skill BOOLEAN DEFAULT false,   -- has this been promoted into SKILL.md?
+    created_by TEXT DEFAULT 'chaim',
+    created_at TIMESTAMPTZ DEFAULT now(),
+    updated_at TIMESTAMPTZ DEFAULT now()
+);
+CREATE INDEX IF NOT EXISTS idx_decision_lessons_corpus ON decision_lessons(style_corpus_id);
+CREATE INDEX IF NOT EXISTS idx_decision_lessons_applied ON decision_lessons(applied_to_skill);
+
+-- chat_conversations / chat_messages: persistent history for the
+-- "שיחה עם הסוכן" ("Chat with the agent") tab on /training. Each conversation can optionally be
+-- scoped to a single style_corpus row (when the chair starts a chat
+-- "about decision X"). claude_session_id is the value the local claude
+-- CLI returns in stream-json; we pass it back via `--resume` on the
+-- next message so the model continues the same conversation without
+-- re-loading the system prompt every time.
+CREATE TABLE IF NOT EXISTS chat_conversations (
+    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
+    title TEXT NOT NULL DEFAULT 'שיחה חדשה',
+    style_corpus_id UUID REFERENCES style_corpus(id) ON DELETE SET NULL,
+    claude_session_id TEXT,
+    system_prompt_version TEXT DEFAULT 'v1',
+    created_at TIMESTAMPTZ DEFAULT now(),
+    last_message_at TIMESTAMPTZ DEFAULT now()
+);
+
+CREATE TABLE IF NOT EXISTS chat_messages (
+    id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
+    conversation_id UUID NOT NULL REFERENCES chat_conversations(id) ON DELETE CASCADE,
+    role TEXT NOT NULL,               -- 'user' | 'assistant'
+    content TEXT NOT NULL,
+    raw_events JSONB DEFAULT '[]',    -- stream-json events for the assistant turn (optional, for debug)
+    created_at TIMESTAMPTZ DEFAULT now()
+);
+
+CREATE INDEX IF NOT EXISTS idx_chat_messages_conv ON chat_messages(conversation_id, created_at);
+CREATE INDEX IF NOT EXISTS idx_chat_conv_corpus ON chat_conversations(style_corpus_id);
+CREATE INDEX IF NOT EXISTS idx_chat_conv_last ON chat_conversations(last_message_at DESC);
+
 -- qa_results table
 CREATE TABLE IF NOT EXISTS qa_results (
     id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
@@ -1609,6 +1658,284 @@ async def delete_from_style_corpus(corpus_id: UUID) -> dict:
     }
 
 
+async def get_style_corpus_row(corpus_id: UUID) -> dict | None:
+    """Return a single style_corpus row by id, or None if missing."""
+    pool = await get_pool()
+    async with pool.acquire() as conn:
+        row = await conn.fetchrow(
+            """
+            SELECT id, document_id, decision_number, decision_date,
+                   subject_categories, full_text, summary, outcome,
+                   key_principles, practice_area, appeal_subtype, created_at
+            FROM style_corpus WHERE id = $1
+            """,
+            corpus_id,
+        )
+    return dict(row) if row else None
+
+
+async def update_style_corpus_metadata(
+    corpus_id: UUID,
+    *,
+    summary: str | None = None,
+    outcome: str | None = None,
+    key_principles: list[str] | None = None,
+    appeal_subtype: str | None = None,
+    practice_area: str | None = None,
+    overwrite: bool = False,
+) -> dict:
+    """Patch the enriched-metadata columns of a style_corpus row.
+
+    By default, only empty columns are filled; passing ``overwrite=True``
+    is the caller's signal that they intentionally want to replace existing
+    values (used by the re-extract flow when the chair runs it manually).
+    """
+    pool = await get_pool()
+    async with pool.acquire() as conn:
+        existing = await conn.fetchrow(
+            "SELECT summary, outcome, key_principles, appeal_subtype, practice_area "
+            "FROM style_corpus WHERE id = $1",
+            corpus_id,
+        )
+        if not existing:
+            return {"updated": False, "reason": "not found"}
+
+        sets: dict = {}
+        if summary is not None and (overwrite or not (existing["summary"] or "").strip()):
+            sets["summary"] = summary
+        if outcome is not None and (overwrite or not (existing["outcome"] or "").strip()):
+            sets["outcome"] = outcome
+        if key_principles is not None:
+            current = existing["key_principles"]
+            if isinstance(current, str):
+                try:
+                    current = json.loads(current)
+                except json.JSONDecodeError:
+                    current = []
+            if overwrite or not (current or []):
+                sets["key_principles"] = json.dumps(key_principles)
+        if appeal_subtype is not None and (overwrite or not (existing["appeal_subtype"] or "").strip()):
+            sets["appeal_subtype"] = appeal_subtype
+        if practice_area is not None and (overwrite or not (existing["practice_area"] or "").strip()):
+            sets["practice_area"] = practice_area
+
+        if not sets:
+            return {"updated": False, "reason": "nothing to update", "fields": []}
+
+        cols = list(sets.keys())
+        set_clause = ", ".join(f"{c} = ${i + 2}" for i, c in enumerate(cols))
+        values = [sets[c] for c in cols]
+        await conn.execute(
+            f"UPDATE style_corpus SET {set_clause} WHERE id = $1",
+            corpus_id, *values,
+        )
+        return {"updated": True, "fields": cols}
+
+
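The "fill only empty columns unless `overwrite` is set" rule in `update_style_corpus_metadata` can be expressed as a small pure function. This is a sketch of the decision logic only, with no database access; `plan_updates` and `is_empty` are hypothetical names, and treating `None`, whitespace-only strings, and empty lists as "empty" is an interpretation of the checks shown above.

```python
def plan_updates(existing, proposed, overwrite=False):
    """Return the subset of `proposed` that should actually be written.

    `existing` maps column name to its current value; `proposed` maps column
    name to the new value, where None means "caller did not supply this".
    """
    def is_empty(value):
        if value is None:
            return True
        if isinstance(value, str):
            return not value.strip()
        return len(value) == 0  # e.g. an empty key_principles list

    return {
        col: new
        for col, new in proposed.items()
        if new is not None and (overwrite or is_empty(existing.get(col)))
    }
```

For example, with an existing `outcome` already set, a proposed `outcome` is skipped by default and written only when `overwrite=True` is passed.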
+# ── decision_lessons (per-corpus row notes) ────────────────────────
+
+
+async def list_decision_lessons(corpus_id: UUID) -> list[dict]:
+    pool = await get_pool()
+    async with pool.acquire() as conn:
+        rows = await conn.fetch(
+            "SELECT id, style_corpus_id, lesson_text, category, source, "
+            " applied_to_skill, created_by, created_at, updated_at "
+            "FROM decision_lessons WHERE style_corpus_id = $1 "
+            "ORDER BY created_at DESC",
+            corpus_id,
+        )
+    return [dict(r) for r in rows]
+
+
+async def add_decision_lesson(
+    corpus_id: UUID,
+    *,
+    lesson_text: str,
+    category: str = "general",
+    source: str = "manual",
+    created_by: str = "chaim",
+) -> dict:
+    pool = await get_pool()
+    async with pool.acquire() as conn:
+        row = await conn.fetchrow(
+            "INSERT INTO decision_lessons "
+            "(style_corpus_id, lesson_text, category, source, created_by) "
+            "VALUES ($1, $2, $3, $4, $5) "
+            "RETURNING id, style_corpus_id, lesson_text, category, source, "
+            " applied_to_skill, created_by, created_at, updated_at",
+            corpus_id, lesson_text, category, source, created_by,
+        )
+    return dict(row) if row else {}
+
+
+async def update_decision_lesson(
+    lesson_id: UUID,
+    *,
+    lesson_text: str | None = None,
+    category: str | None = None,
+    applied_to_skill: bool | None = None,
+) -> dict:
+    sets: dict = {}
+    if lesson_text is not None:
+        sets["lesson_text"] = lesson_text
+    if category is not None:
+        sets["category"] = category
+    if applied_to_skill is not None:
+        sets["applied_to_skill"] = applied_to_skill
+    if not sets:
+        return {"updated": False, "reason": "nothing to update"}
+    # updated_at is set inline via now(), so it never becomes a bind parameter.
+    cols = list(sets)
+    set_clause = ", ".join(f"{c} = ${i + 2}" for i, c in enumerate(cols))
+    set_clause += ", updated_at = now()"
+    values = [sets[c] for c in cols]
+    pool = await get_pool()
+    async with pool.acquire() as conn:
+        row = await conn.fetchrow(
+            f"UPDATE decision_lessons SET {set_clause} WHERE id = $1 "
+            "RETURNING id, style_corpus_id, lesson_text, category, source, "
+            " applied_to_skill, updated_at",
+            lesson_id, *values,
+        )
+    if not row:
+        return {"updated": False, "reason": "not found"}
+    return {"updated": True, **dict(row)}
+
+
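The dynamic `SET` clause used by both update helpers reserves `$1` for the row id and numbers the changed columns from `$2` upward. A minimal standalone sketch of that numbering follows; `build_update` is a hypothetical helper, not code from this commit. As in the code above, column and table names must come from a fixed set chosen by the program, never from user input, because they are interpolated into the SQL text.

```python
def build_update(table, id_value, changes):
    """Return (sql, args) for a parameterized UPDATE; $1 is reserved for the id."""
    if not changes:
        raise ValueError("nothing to update")
    cols = list(changes)
    # Placeholders start at $2 because $1 is bound to the row id.
    set_clause = ", ".join(f"{col} = ${i + 2}" for i, col in enumerate(cols))
    sql = f"UPDATE {table} SET {set_clause} WHERE id = $1"
    return sql, [id_value, *(changes[c] for c in cols)]
```

The values travel separately as bind arguments, so only identifiers ever reach the SQL string.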
+async def delete_decision_lesson(lesson_id: UUID) -> dict:
+    pool = await get_pool()
+    async with pool.acquire() as conn:
+        result = await conn.execute(
+            "DELETE FROM decision_lessons WHERE id = $1", lesson_id,
+        )
+    # asyncpg returns "DELETE n"
+    deleted = result.split(" ", 1)[1].strip() if " " in result else "0"
+    return {"deleted": deleted != "0"}
+
+
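The delete helpers infer success from the status string that `conn.execute` returns (`"DELETE n"`, per the comment above). A small sketch of parsing that string into an integer count is shown below; `rows_affected` is a hypothetical helper, and taking the last space-separated token is an assumption that also covers status strings with more than two parts.

```python
def rows_affected(status):
    """Parse the trailing row count from a status string such as "DELETE 1".

    Returns 0 when the string has no numeric last token.
    """
    last = status.rsplit(" ", 1)[-1]
    return int(last) if last.isdigit() else 0
```

A caller could then write `{"deleted": rows_affected(result) > 0}` instead of comparing the token against the string `"0"`.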
+async def count_decision_lessons_per_corpus() -> dict[str, int]:
+    """Map style_corpus.id (str) → lesson count, for badge display in the list."""
+    pool = await get_pool()
+    async with pool.acquire() as conn:
+        rows = await conn.fetch(
+            "SELECT style_corpus_id, count(*) AS n "
+            "FROM decision_lessons GROUP BY style_corpus_id"
+        )
+    return {str(r["style_corpus_id"]): r["n"] for r in rows}
+
+
|
# ── chat (style agent conversations) ───────────────────────────────


async def create_chat_conversation(
    *,
    title: str = "שיחה חדשה",
    style_corpus_id: UUID | None = None,
    system_prompt_version: str = "v1",
) -> dict:
    pool = await get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow(
            "INSERT INTO chat_conversations "
            "(title, style_corpus_id, system_prompt_version) "
            "VALUES ($1, $2, $3) "
            "RETURNING id, title, style_corpus_id, claude_session_id, "
            " system_prompt_version, created_at, last_message_at",
            title, style_corpus_id, system_prompt_version,
        )
    return dict(row) if row else {}


async def list_chat_conversations(limit: int = 50) -> list[dict]:
    pool = await get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            """
            SELECT c.id, c.title, c.style_corpus_id, c.claude_session_id,
                   c.created_at, c.last_message_at,
                   sc.decision_number,
                   (SELECT count(*) FROM chat_messages m WHERE m.conversation_id = c.id) AS message_count
            FROM chat_conversations c
            LEFT JOIN style_corpus sc ON sc.id = c.style_corpus_id
            ORDER BY c.last_message_at DESC NULLS LAST
            LIMIT $1
            """,
            limit,
        )
    return [dict(r) for r in rows]


async def get_chat_conversation(conv_id: UUID) -> dict | None:
    pool = await get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow(
            "SELECT id, title, style_corpus_id, claude_session_id, "
            " system_prompt_version, created_at, last_message_at "
            "FROM chat_conversations WHERE id = $1",
            conv_id,
        )
    return dict(row) if row else None


async def delete_chat_conversation(conv_id: UUID) -> dict:
    pool = await get_pool()
    async with pool.acquire() as conn:
        result = await conn.execute(
            "DELETE FROM chat_conversations WHERE id = $1", conv_id,
        )
    deleted = result.split(" ", 1)[1].strip() if " " in result else "0"
    return {"deleted": deleted != "0"}


async def update_chat_conversation_session_id(
    conv_id: UUID, claude_session_id: str,
) -> None:
    pool = await get_pool()
    async with pool.acquire() as conn:
        await conn.execute(
            "UPDATE chat_conversations SET claude_session_id = $1, "
            " last_message_at = now() "
            "WHERE id = $2",
            claude_session_id, conv_id,
        )


async def add_chat_message(
    conv_id: UUID,
    *,
    role: str,
    content: str,
    raw_events: list | None = None,
) -> dict:
    pool = await get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow(
            "INSERT INTO chat_messages "
            "(conversation_id, role, content, raw_events) "
            "VALUES ($1, $2, $3, $4) "
            "RETURNING id, conversation_id, role, content, created_at",
            conv_id, role, content, json.dumps(raw_events or []),
        )
        await conn.execute(
            "UPDATE chat_conversations SET last_message_at = now() WHERE id = $1",
            conv_id,
        )
    return dict(row) if row else {}


async def list_chat_messages(conv_id: UUID) -> list[dict]:
    pool = await get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            "SELECT id, role, content, created_at "
            "FROM chat_messages WHERE conversation_id = $1 "
            "ORDER BY created_at ASC",
            conv_id,
        )
    return [dict(r) for r in rows]


async def get_style_patterns(pattern_type: str | None = None) -> list[dict]:
    pool = await get_pool()
    async with pool.acquire() as conn:
195
mcp-server/src/legal_mcp/services/style_metadata_extractor.py
Normal file
@@ -0,0 +1,195 @@
"""Auto-extract per-decision metadata for a style_corpus row.

Populates the fields that the upload flow leaves empty (summary, outcome,
key_principles, appeal_subtype, practice_area) by asking Claude (via the
local CLI session) to read the proofread full_text and return a structured
JSON blob.

Caller policy (``extract_and_apply``): by default we **only fill empty
columns**, so chair-edited values are preserved across re-runs. The chair
can force a refresh by passing ``overwrite=True``.

Why this is a separate module from ``precedent_metadata_extractor``:
that one fills the *external* case_law corpus (court rulings, third-party
committee decisions). This one fills the *style* corpus: Daphna's own
decisions used to teach the writer the in-house voice. The two corpora
have different schemas, different prompts, and different downstream
consumers, so coupling them would have been the wrong shortcut.
"""

from __future__ import annotations

import logging
from uuid import UUID

from legal_mcp.services import claude_session, db

logger = logging.getLogger(__name__)


# A single decision typically runs 200K-650K chars. We sample the head
# (where outcome + parties + framing live) and the tail (where the
# operative ruling sits). Picking from both edges keeps the prompt under
# 60K chars — comfortable for any Claude tier.
_HEAD_CHARS = 25_000
_TAIL_CHARS = 15_000


def _build_text_window(full_text: str) -> str:
    if len(full_text) <= _HEAD_CHARS + _TAIL_CHARS:
        return full_text
    head = full_text[:_HEAD_CHARS]
    tail = full_text[-_TAIL_CHARS:]
    return (
        f"{head}\n\n"
        f"[... חתך: {len(full_text) - _HEAD_CHARS - _TAIL_CHARS:,} תווים מהאמצע "
        f"הושמטו — שמרנו על ההתחלה (טענות + רקע) ועל הסוף (הכרעה + הוצאות) ...]"
        f"\n\n{tail}"
    )


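The head-and-tail sampling in `_build_text_window` is easier to follow at a small scale. The sketch below re-implements the same idea with the sizes passed as parameters and an English marker; `build_window` and its `marker` argument are illustrative names, not part of the module.

```python
def build_window(text: str, head: int, tail: int, marker: str = "[...]") -> str:
    """Keep the first `head` and last `tail` characters and drop the middle.

    Illustrative stand-in for _build_text_window with configurable sizes.
    """
    if len(text) <= head + tail:
        return text  # short enough: send everything
    omitted = len(text) - head - tail
    # Slice from an explicit index so tail=0 yields an empty tail, not the whole text.
    tail_part = text[len(text) - tail:]
    return f"{text[:head]}\n\n{marker} ({omitted:,} chars omitted)\n\n{tail_part}"


sample = "A" * 10 + "B" * 100 + "C" * 5
window = build_window(sample, head=10, tail=5)
print(window)
```

With the module's real constants (25,000 and 15,000), any text of 40,000 characters or fewer passes through unchanged, and longer texts lose only their middle.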
# Static instructions — go via ``system`` so the SDK path can cache them
# across batch enrichment runs (24+ decisions in one pass).
METADATA_PROMPT = """אתה מסייע משפטי שמקטלג את הקורפוס הסגנוני של דפנה תמיר (יו"ר ועדת ערר).

תפקידך: לקרוא החלטה אחת ולחלץ מטא-דאטה ל-style_corpus — שדות שהמשתמש לא הזין בעת ההעלאה.

**אל תמציא**. אם המידע לא מופיע בטקסט, השאר מחרוזת ריקה או מערך ריק. אסור להסיק עובדות שלא כתובות.

## פלט נדרש

החזר JSON אחד (object אחד — לא array, לא markdown, לא הסברים):

{
  "summary": "תקציר עניני ב-2-3 משפטים: מי העורר, מה דרש, מה הוכרע. סגנון יבש, ניטרלי, ללא שיפוט. דוגמה: 'ערר על דחיית בקשה להיתר לתוספת מרפסת בקומה ג׳. דפנה קיבלה את הערר חלקית — אישרה את המרפסת בהקטנה ל-12 מ״ר.'",

  "outcome": "התוצאה התמציתית. אחד מאלה (או צירוף קצר): 'קבלה' / 'קבלה חלקית' / 'דחייה' / 'הסתלקות' / 'החזרה לוועדה המקומית'. אם זה לא ברור — מחרוזת ריקה.",

  "key_principles": [
    "עיקרון משפטי 1 שעולה מההחלטה — משפט אחד, ניסוח מופשט. למשל 'שיקול דעת מוגבל לחריגות בנייה קטנות'.",
    "עיקרון 2",
    "..."
  ],

  "appeal_subtype": "תת-סוג ערר. ערכים מותרים: 'building_permit' (היתר בנייה / רישוי), 'betterment_levy' (היטל השבחה), 'compensation_197' (פיצויים ס׳ 197), 'use_change' (שימוש חורג), 'tama_38' (תמ\\"א 38), או מחרוזת ריקה אם לא ברור.",

  "practice_area": "תחום משפט גנרי. ברירת מחדל: 'appeals_committee'. אם זה במובהק 'planning_law' — סמן.",

  "parties_appellant": "שם העורר/ים המרכזיים בהחלטה (אחד או כמה, מופרדים בפסיק). אם זו החלטה מאוחדת — שם הצד המוביל. השאר ריק אם לא ניתן לזהות במדויק.",

  "parties_respondent": "שם המשיב/ים. ברירת מחדל לעררי 1xxx ו-8xxx: 'הוועדה המקומית לתכנון ובניה ירושלים' או דומה. השאר ריק אם לא ברור."
}

## כללי איכות

1. **summary** — חייב להזכיר את התוצאה. בלי 'בית המשפט קבע ש...' (אנחנו לא בית משפט). בלי הערכת אישית.
2. **outcome** — קבלה / קבלה חלקית / דחייה / הסתלקות / החזרה לוועדה המקומית. אם דפנה הכריעה חלקית — 'קבלה חלקית'. אסור 'התקבל' או 'נדחה' בלשון פעולה — רק שם פעולה.
3. **key_principles** — 2-5 עקרונות מקסימום. כל אחד משפט אחד. לא ציטוטים מילוליים, אלא תמצות העיקרון.
4. **appeal_subtype** — תמיד פעולה אחת. אם החלטה מערבת כמה תת-סוגים — בחר את העיקרי.
5. **parties_appellant / parties_respondent** — שם בלבד, בלי 'נ׳' או 'נגד'.

החזר רק את ה-JSON. אל תכתוב שום דבר לפניו או אחריו.
"""


async def extract_decision_metadata(corpus_id: UUID | str) -> dict:
    """Run Claude over the row's full_text and return suggested fields.

    Does NOT touch the DB. The caller decides what to apply.
    """
    if isinstance(corpus_id, str):
        corpus_id = UUID(corpus_id)
    row = await db.get_style_corpus_row(corpus_id)
    if not row:
        return {}
    full_text = (row.get("full_text") or "").strip()
    if not full_text:
        return {}

    context = (
        f"מספר החלטה: {row.get('decision_number') or '—'}\n"
        f"תאריך: {row.get('decision_date') or '—'}\n"
        f"תת-סוג נוכחי: {row.get('appeal_subtype') or '—'}\n"
        f"נושאים מתויגים: {row.get('subject_categories') or '—'}"
    )
    window = _build_text_window(full_text)
    user_msg = (
        f"## הקלט\n{context}\n\n"
        f"--- תחילת ההחלטה ---\n{window}\n--- סוף ההחלטה ---"
    )

    try:
        result = await claude_session.query_json(user_msg, system=METADATA_PROMPT)
    except Exception as e:
        logger.warning("style_metadata_extractor: query failed: %s", e)
        return {}

    if not isinstance(result, dict):
        logger.warning(
            "style_metadata_extractor: expected JSON object, got %s",
            type(result).__name__,
        )
        return {}

    out: dict = {}
    if isinstance(result.get("summary"), str):
        out["summary"] = result["summary"].strip()
    if isinstance(result.get("outcome"), str):
        out["outcome"] = result["outcome"].strip()
    kp = result.get("key_principles") or []
    if isinstance(kp, list):
        out["key_principles"] = [str(p).strip() for p in kp if str(p).strip()]
    if isinstance(result.get("appeal_subtype"), str):
        st = result["appeal_subtype"].strip()
        # Open enum — but log values outside the documented list so we can
        # tighten the prompt later if needed.
        known = {
            "building_permit", "betterment_levy", "compensation_197",
            "use_change", "tama_38", "",
        }
        if st not in known:
            logger.info("style_metadata: unknown appeal_subtype=%r (kept)", st)
        out["appeal_subtype"] = st
    if isinstance(result.get("practice_area"), str):
        out["practice_area"] = result["practice_area"].strip()
    # Parties: not stored in the schema today, but worth surfacing in the
    # extractor's return value so callers (and the UI's drawer) can display
    # them. The list endpoint extracts via regex; LLM output is the
    # higher-quality fallback when regex fails.
    if isinstance(result.get("parties_appellant"), str):
        out["parties_appellant"] = result["parties_appellant"].strip()
    if isinstance(result.get("parties_respondent"), str):
        out["parties_respondent"] = result["parties_respondent"].strip()
    return out


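The validation at the end of `extract_decision_metadata` keeps only values of the expected type and strips whitespace. The same rule can be sketched as a small pure function, which is convenient for unit-testing without a Claude session; `normalize_suggestion` and `STRING_FIELDS` are hypothetical names, and the sketch omits the parties fields and the `appeal_subtype` logging.

```python
STRING_FIELDS = ("summary", "outcome", "appeal_subtype", "practice_area")


def normalize_suggestion(raw: object) -> dict:
    """Keep well-typed fields from an LLM reply; drop everything else."""
    if not isinstance(raw, dict):
        return {}  # the model returned a list, string, or null
    out: dict = {}
    for field in STRING_FIELDS:
        value = raw.get(field)
        if isinstance(value, str):
            out[field] = value.strip()
    principles = raw.get("key_principles") or []
    if isinstance(principles, list):
        # Coerce items to str and drop blanks, mirroring the extractor.
        out["key_principles"] = [str(p).strip() for p in principles if str(p).strip()]
    return out


reply = {"summary": "  partial acceptance ", "outcome": 5, "key_principles": [" a ", "", 3]}
print(normalize_suggestion(reply))
```

Note that a wrongly typed field (here `outcome`) is simply omitted rather than coerced, so a later merge step never sees it.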
async def extract_and_apply(
    corpus_id: UUID | str, *, overwrite: bool = False,
) -> dict:
    """Convenience: extract → apply → return summary of what changed.

    Idempotent under default ``overwrite=False`` — re-runs only fill empty
    fields. Use ``overwrite=True`` to refresh values the chair (or a prior
    extraction) already wrote.
    """
    if isinstance(corpus_id, str):
        corpus_id = UUID(corpus_id)
    suggested = await extract_decision_metadata(corpus_id)
    if not suggested:
        return {"extracted": False, "applied": False, "reason": "no suggestion"}

    update_result = await db.update_style_corpus_metadata(
        corpus_id,
        summary=suggested.get("summary"),
        outcome=suggested.get("outcome"),
        key_principles=suggested.get("key_principles"),
        appeal_subtype=suggested.get("appeal_subtype"),
        practice_area=suggested.get("practice_area"),
        overwrite=overwrite,
    )
    return {
        "extracted": True,
        "applied": update_result.get("updated", False),
        "fields_set": update_result.get("fields", []),
        "suggested": suggested,
    }
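The body of `db.update_style_corpus_metadata` is not part of this excerpt, so the following is only a sketch of the fill-only-empty policy the docstring describes, written as an in-memory merge. `merge_metadata` is a hypothetical name, and skipping empty suggestions is an assumption about the intended behavior.

```python
def merge_metadata(
    current: dict, suggested: dict, *, overwrite: bool = False
) -> tuple[dict, list[str]]:
    """Return (merged row, names of fields that changed)."""
    merged = dict(current)
    changed: list[str] = []
    for field, value in suggested.items():
        if value in (None, "", []):
            continue  # assumption: an empty suggestion never replaces anything
        if overwrite or not current.get(field):
            if merged.get(field) != value:
                merged[field] = value
                changed.append(field)
    return merged, changed


current = {"summary": "chair-written", "outcome": "", "key_principles": []}
suggested = {"summary": "llm", "outcome": "rejected", "key_principles": ["p1"], "practice_area": ""}

merged, changed = merge_metadata(current, suggested)
forced, forced_changed = merge_metadata(current, suggested, overwrite=True)
print(changed, forced_changed)
```

Running the merge twice with `overwrite=False` changes nothing the second time, which is the idempotence the docstring promises.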
85
mcp-server/src/legal_mcp/tools/training_enrichment.py
Normal file
@@ -0,0 +1,85 @@
"""MCP tool wrappers for the style_corpus metadata-enrichment flow.

The actual extractor lives in
``legal_mcp.services.style_metadata_extractor``; this module just exposes
it as MCP tools that the chair (or a future automation) can call from
Claude Code.

Why these tools matter: the upload pipeline (`/api/training/upload` →
`_process_proofread_training`) inserts a style_corpus row with
``summary=''``, ``outcome=''``, ``key_principles=[]`` because LLM
extraction can't run from the FastAPI container (no claude CLI there).
This module fills that gap — call it from the host, where ``claude``
CLI is available, and the row gets enriched.
"""

from __future__ import annotations

import json
from uuid import UUID

from legal_mcp.services import db, style_metadata_extractor


def _ok(payload) -> str:
    return json.dumps({"ok": True, **payload}, ensure_ascii=False, default=str)


def _err(msg: str) -> str:
    return json.dumps({"ok": False, "error": msg}, ensure_ascii=False)


async def extract_decision_metadata(corpus_id: str, overwrite: bool = False) -> str:
    """Extract metadata (summary, outcome, key_principles, appeal_subtype) for a decision in the style corpus.

    The default ``overwrite=False`` fills only empty fields. Pass ``overwrite=true``
    to refresh values that were already written.
    """
    try:
        cid = UUID(corpus_id)
    except ValueError:
        return _err("corpus_id לא תקין")
    try:
        result = await style_metadata_extractor.extract_and_apply(cid, overwrite=overwrite)
    except Exception as e:
        return _err(str(e))
    return _ok(result)


async def list_corpus_pending_enrichment(limit: int = 50) -> str:
    """List style_corpus rows missing summary/outcome/key_principles (candidates for enrichment)."""
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            """
            SELECT id, decision_number, decision_date,
                   length(full_text) AS chars,
                   coalesce(summary, '') = '' AS missing_summary,
                   coalesce(outcome, '') = '' AS missing_outcome,
                   coalesce(jsonb_array_length(key_principles), 0) = 0 AS missing_principles
            FROM style_corpus
            WHERE coalesce(summary, '') = ''
               OR coalesce(outcome, '') = ''
               OR coalesce(jsonb_array_length(key_principles), 0) = 0
            ORDER BY decision_date NULLS LAST
            LIMIT $1
            """,
            limit,
        )
    items = [
        {
            "corpus_id": str(r["id"]),
            "decision_number": r["decision_number"] or "",
            "decision_date": str(r["decision_date"]) if r["decision_date"] else "",
            "chars": r["chars"],
            "missing": [
                f for f, v in (
                    ("summary", r["missing_summary"]),
                    ("outcome", r["missing_outcome"]),
                    ("key_principles", r["missing_principles"]),
                ) if v
            ],
        }
        for r in rows
    ]
    return _ok({"count": len(items), "items": items})
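Both tools in this module return a JSON string shaped like `{"ok": true, ...}` or `{"ok": false, "error": "..."}`. A caller might unwrap that envelope roughly as follows; `unwrap` and `ToolError` are hypothetical names, and the sample payload is made up to match the shape produced by `list_corpus_pending_enrichment`.

```python
import json


class ToolError(RuntimeError):
    """Raised when a tool reply carries ok=false (hypothetical wrapper type)."""


def unwrap(payload: str) -> dict:
    """Parse a tool reply and return its fields without the `ok` flag."""
    data = json.loads(payload)
    if not data.get("ok"):
        raise ToolError(data.get("error", "unknown error"))
    return {k: v for k, v in data.items() if k != "ok"}


sample = '{"ok": true, "count": 1, "items": [{"corpus_id": "abc", "missing": ["summary"]}]}'
pending = unwrap(sample)
needs_summary = [item["corpus_id"] for item in pending["items"] if "summary" in item["missing"]]
print(needs_summary)
```

An automation could feed each id in `needs_summary` to the `extract_decision_metadata` tool, leaving `overwrite` at its default so chair edits are preserved.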
@@ -35,6 +35,7 @@
| `compute_ndcg.py` | python | חישוב nDCG@10 על `search_relevance_feedback` (TaskMaster #50, Stage C). aggregation לפי `search_type` ולפי שבוע, כולל top-cited case_law ו-coverage %. דגלים: `--k 10`, `--weeks 12`, `--pretty`. read-only, פלט JSON. משמש גם את `GET /api/admin/rag-metrics` (מיובא inline) — שינוי חתימה ב-`compute()` ישבור את ה-endpoint | ידני / cron עתידי לדיווח שבועי |
| `backfill_multimodal_precedents.py` | python | Backfill voyage-multimodal-3 page embeddings על רשומות `case_law` (external_upload + internal_committee) שחסרות `precedent_image_embeddings`. בונה אינדקס קבצים מ-`data/precedent-library/` ו-`data/internal-decisions/`, מנסה התאמה לפי tokens של מספרי תיק (כולל parts-match לפורמטים שונים של Nevo doc-id). מדלג על רשומות בלי קובץ-מקור או עם MD בלבד (PyMuPDF לא מרנדר MD). תומך `--dry-run` (default) / `--apply` / `--only external_upload\|internal_committee` / `--limit N`. רץ בקונטיינר (יש `/data` + Voyage env). **הופעל 2026-05-26**: 70 חסרים → 26 backfilled (503 pages, ~$0.21 voyage tokens), 44 אין-קובץ-מקור. ניתן להריץ שוב אחרי שיועלו עוד PDF/DOCX לספרייה | ידני |
| `monitor_halacha_quality.py` | python | מנטר איכות חילוץ הלכות. בודק drift של `avg(confidence)` בין baseline היסטורי לחלון אחרון. מחזיר JSON מטריקות + alert ב-stderr אם drift > threshold (ברירת מחדל 5%). 2 סדרות: trusted (approved+published) ו-all_extracted. תומך `--window N` / `--threshold X` / `--min-sample N` / `--silent` / `--exit-on-alert`. רץ ב-container או מקומית עם `mcp-server/.venv` (אין תלות ב-LLM, רק SQL). **תזמון מומלץ**: `0 8 * * 1` (יום ראשון 08:00, שבועי) | `0 8 * * 1` (לתזמן) |
| `audit_training_corpus.py` | python | audit של `style_corpus` — לכל החלטה: שדות מטא-דאטה מאוכלסים (`summary`/`outcome`/`key_principles`/`appeal_subtype`/`subject_categories`), קישור ל-`documents` (FK + chunks + embeddings). מפיק `data/audit/corpus-YYYY-MM-DD.json` + summary בקונסול. דרוש `POSTGRES_URL` או POSTGRES_*. אין תלויות חיצוניות מלבד asyncpg. **רץ מהמכונה המקומית** (לא קונטיינר) — חיבור ישיר ל-Postgres :5433 | ידני / קדם-עבודה לפני enrichment של מטא-דאטה |

## תיקיית `.archive/` — סקריפטים שהושלמו
196
scripts/audit_training_corpus.py
Executable file
@@ -0,0 +1,196 @@
#!/usr/bin/env python
"""Audit the style_corpus table — list each decision with what's populated and what's missing.

Produces a JSON report at data/audit/corpus-YYYY-MM-DD.json so we can see at a glance
which corpus entries lack summary/outcome/key_principles/appeal_subtype/chunks/embeddings.

Run with the mcp-server venv (has asyncpg):
    POSTGRES_URL=postgres://... ./mcp-server/.venv/bin/python scripts/audit_training_corpus.py

Without POSTGRES_URL, falls back to the per-field env vars used by web/mcp-server config.
"""
from __future__ import annotations

import asyncio
import json
import os
import re
import sys
from datetime import UTC, date, datetime
from pathlib import Path

import asyncpg


def _build_dsn() -> str:
    if url := os.environ.get("POSTGRES_URL"):
        return url
    return (
        f"postgres://{os.environ.get('POSTGRES_USER', 'legal_ai')}:"
        f"{os.environ.get('POSTGRES_PASSWORD', '')}@"
        f"{os.environ.get('POSTGRES_HOST', '127.0.0.1')}:"
        f"{os.environ.get('POSTGRES_PORT', '5433')}/"
        f"{os.environ.get('POSTGRES_DB', 'legal_ai')}"
    )


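`_build_dsn` interpolates the password into the URL as-is, so a password containing `@`, `/` or `:` would produce a malformed connection string. Percent-encoding the credentials is the usual remedy for URI-style DSNs. A minimal sketch follows; it takes the environment as a plain dict (an assumption made here so the function is easy to test), and `build_dsn` is not the script's own function.

```python
from urllib.parse import quote


def build_dsn(env: dict[str, str]) -> str:
    """Build a postgres:// URL, percent-encoding user and password."""
    if url := env.get("POSTGRES_URL"):
        return url  # an explicit URL wins, exactly as in the script
    user = quote(env.get("POSTGRES_USER", "legal_ai"), safe="")
    password = quote(env.get("POSTGRES_PASSWORD", ""), safe="")
    host = env.get("POSTGRES_HOST", "127.0.0.1")
    port = env.get("POSTGRES_PORT", "5433")
    name = env.get("POSTGRES_DB", "legal_ai")
    return f"postgres://{user}:{password}@{host}:{port}/{name}"


print(build_dsn({"POSTGRES_PASSWORD": "p@ss/w:rd"}))
```

In the script itself the same function could be called with `dict(os.environ)`.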
async def audit() -> dict:
    dsn = _build_dsn()
    conn = await asyncpg.connect(dsn)
    try:
        rows = await conn.fetch(
            """
            SELECT id, decision_number, decision_date, subject_categories,
                   length(full_text) AS chars,
                   summary,
                   outcome,
                   key_principles,
                   practice_area,
                   appeal_subtype,
                   document_id,
                   created_at
            FROM style_corpus
            ORDER BY decision_date NULLS LAST, decision_number
            """
        )

        # Chunk + embedding counts for each related document — by direct FK first,
        # then by title-match for legacy rows where style_corpus.document_id is NULL.
        chunk_counts = await conn.fetch(
            """
            SELECT d.id AS doc_id, d.title,
                   count(c.id) AS chunks,
                   count(c.embedding) FILTER (WHERE c.embedding IS NOT NULL) AS chunks_with_emb
            FROM documents d
            LEFT JOIN document_chunks c ON c.document_id = d.id
            WHERE d.title LIKE '[קורפוס]%' OR d.id IN (SELECT document_id FROM style_corpus WHERE document_id IS NOT NULL)
            GROUP BY d.id, d.title
            """
        )

    finally:
        await conn.close()

    by_doc_id = {r["doc_id"]: r for r in chunk_counts}

    # Index corpus documents by every digit cluster in their title so we can
    # match against style_corpus.decision_number regardless of formatting
    # (e.g. style_corpus has "1109-25" but title may say "ARAR-25-1109" or
    # "ערר 1109-25"). Each digit run >=3 chars becomes a key.
    by_digit: dict[str, dict] = {}
    for r in chunk_counts:
        title = r["title"] or ""
        for tok in re.findall(r"\d{3,}", title):
            by_digit.setdefault(tok, r)

    decisions = []
    gaps_total = {
        "summary": 0, "outcome": 0, "key_principles": 0,
        "appeal_subtype": 0, "subject_categories": 0,
        "chunks": 0, "embeddings": 0, "document_id": 0,
    }

    for row in rows:
        cats = row["subject_categories"]
        if isinstance(cats, str):
            try:
                cats = json.loads(cats)
            except json.JSONDecodeError:
                cats = []
        cats = cats or []

        kp = row["key_principles"]
        if isinstance(kp, str):
            try:
                kp = json.loads(kp)
            except json.JSONDecodeError:
                kp = []
        kp = kp or []

        # Resolve chunks: prefer FK, fall back to digit-cluster match on decision_number.
        chunks = 0
        chunks_with_emb = 0
        if row["document_id"] and row["document_id"] in by_doc_id:
            r = by_doc_id[row["document_id"]]
            chunks = r["chunks"]
            chunks_with_emb = r["chunks_with_emb"]
        elif row["decision_number"]:
            for tok in re.findall(r"\d{3,}", row["decision_number"]):
                if tok in by_digit:
                    r = by_digit[tok]
                    chunks = r["chunks"]
                    chunks_with_emb = r["chunks_with_emb"]
                    break

        missing = []
        if not row["summary"]:
            missing.append("summary")
            gaps_total["summary"] += 1
        if not row["outcome"]:
            missing.append("outcome")
            gaps_total["outcome"] += 1
        if not kp:
            missing.append("key_principles")
            gaps_total["key_principles"] += 1
        if not row["appeal_subtype"]:
            missing.append("appeal_subtype")
            gaps_total["appeal_subtype"] += 1
        if not cats:
            missing.append("subject_categories")
            gaps_total["subject_categories"] += 1
        if chunks == 0:
            missing.append("chunks")
            gaps_total["chunks"] += 1
        elif chunks_with_emb < chunks:
            missing.append(f"embeddings({chunks_with_emb}/{chunks})")
            gaps_total["embeddings"] += 1
        if row["document_id"] is None:
            missing.append("document_id")
            gaps_total["document_id"] += 1

        decisions.append({
            "id": str(row["id"]),
            "decision_number": row["decision_number"] or "",
            "decision_date": row["decision_date"].isoformat() if row["decision_date"] else None,
            "chars": row["chars"],
            "subject_categories": cats,
            "practice_area": row["practice_area"] or "",
            "appeal_subtype": row["appeal_subtype"] or "",
            "summary_len": len(row["summary"] or ""),
            "outcome_len": len(row["outcome"] or ""),
            "key_principles_count": len(kp),
            "chunks": chunks,
            "chunks_with_embeddings": chunks_with_emb,
            "document_id": str(row["document_id"]) if row["document_id"] else None,
            "missing": missing,
            "created_at": row["created_at"].isoformat() if row["created_at"] else None,
        })

    return {
        "generated_at": datetime.now(UTC).isoformat(),
        "total_decisions": len(decisions),
        "gaps_total": gaps_total,
        "decisions": decisions,
    }


async def main() -> int:
    report = await audit()
    out_dir = Path(__file__).resolve().parents[1] / "data" / "audit"
    out_dir.mkdir(parents=True, exist_ok=True)
    today = date.today().isoformat()
    out_file = out_dir / f"corpus-{today}.json"
    out_file.write_text(json.dumps(report, ensure_ascii=False, indent=2), encoding="utf-8")

    # Console summary
    print(f"Total decisions: {report['total_decisions']}")
    print("Gaps by field (count of decisions missing it):")
    for field, n in report["gaps_total"].items():
        bar = "█" * min(n, 60)
        print(f"  {field:25s} {n:3d} {bar}")
    print(f"\nReport written to {out_file}")
    return 0


if __name__ == "__main__":
    sys.exit(asyncio.run(main()))
48
scripts/legal-chat-service.config.cjs
Normal file
@@ -0,0 +1,48 @@
/**
 * pm2 ecosystem entry for legal-chat-service, the host-side SSE bridge
 * to ``claude`` CLI that powers the /training chat tab.
 *
 * Why pm2:
 *   - Auto-restart if the process dies (claude CLI subprocess failures
 *     should never leave the service in a half-dead state).
 *   - Log rotation matches paperclip's behavior so the chair sees
 *     consistent log paths under ~/.pm2/logs/.
 *
 * Install (once):
 *   pm2 start /home/chaim/legal-ai/scripts/legal-chat-service.config.cjs
 *   pm2 save
 *
 * Smoke test:
 *   curl http://127.0.0.1:8770/health
 *   # → {"ok":true,"service":"legal-chat-service"}
 *
 * Update:
 *   pm2 restart legal-chat-service
 *
 * Stop:
 *   pm2 stop legal-chat-service
 */

module.exports = {
  apps: [
    {
      name: "legal-chat-service",
      cwd: "/home/chaim/legal-ai/mcp-server",
      // Run the in-package server via the venv interpreter so all
      // imports (claude_session, etc) resolve.
      script: "/home/chaim/legal-ai/mcp-server/.venv/bin/python",
      args: "-m legal_mcp.chat_service.server --port 8770",
      // claude CLI looks up credentials under HOME, so make sure it
      // sees chaim's session, not an empty container HOME.
      env: {
        HOME: "/home/chaim",
        PATH: "/home/chaim/.local/bin:/usr/local/bin:/usr/bin:/bin",
        PYTHONUNBUFFERED: "1",
      },
      restart_delay: 5000,
      max_restarts: 10,
      autorestart: true,
      max_memory_restart: "500M",
    },
  ],
};
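The `curl` smoke test in the header comment can also be scripted, for example from a deploy check. The sketch below assumes only what that comment states: the service listens on `127.0.0.1:8770` and `/health` answers with `{"ok":true,"service":"legal-chat-service"}`. The function names are hypothetical.

```python
import json
from urllib.request import urlopen

HEALTH_URL = "http://127.0.0.1:8770/health"  # port taken from the pm2 args above


def is_healthy(body: bytes) -> bool:
    """True if the body is the expected health JSON."""
    try:
        data = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return False
    return (
        isinstance(data, dict)
        and data.get("ok") is True
        and data.get("service") == "legal-chat-service"
    )


def check(url: str = HEALTH_URL, timeout: float = 3.0) -> bool:
    """Call the health endpoint; any connection or HTTP error counts as unhealthy."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and is_healthy(resp.read())
    except OSError:  # URLError and HTTPError are OSError subclasses
        return False
```

Calling `check()` after `pm2 restart legal-chat-service` gives a yes/no answer without parsing `curl` output.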
@@ -1,18 +1,27 @@
 "use client";
 
+import { useState } from "react";
 import Link from "next/link";
+import { Upload } from "lucide-react";
 import { AppShell } from "@/components/app-shell";
+import { Button } from "@/components/ui/button";
 import { Card, CardContent } from "@/components/ui/card";
 import { Tabs, TabsContent, TabsList, TabsTrigger } from "@/components/ui/tabs";
 import { StyleReportPanel } from "@/components/training/style-report-panel";
 import { CorpusPanel } from "@/components/training/corpus-panel";
 import { ComparePanel } from "@/components/training/compare-panel";
+import { CuratorPortraitPanel } from "@/components/training/curator-portrait-panel";
+import { ChatPanel } from "@/components/training/chat-panel";
+import { TrainingUploadDialog } from "@/components/training/upload-dialog";
 
 export default function TrainingPage() {
+  const [uploadOpen, setUploadOpen] = useState(false);
+
   return (
     <AppShell>
       <section className="space-y-6">
-        <header>
+        <header className="flex items-start justify-between gap-4 flex-wrap">
+          <div>
           <nav className="text-[0.78rem] text-ink-muted mb-1">
             <Link href="/" className="hover:text-gold-deep">בית</Link>
             <span aria-hidden> · </span>
@@ -23,8 +32,18 @@ export default function TrainingPage() {
|
|||||||
לוח בקרה של קורפוס האימון — סטטיסטיקות, אנטומיית החלטה ממוצעת,
|
לוח בקרה של קורפוס האימון — סטטיסטיקות, אנטומיית החלטה ממוצעת,
|
||||||
ביטויי חתימה, וכלי השוואה בין שתי החלטות.
|
ביטויי חתימה, וכלי השוואה בין שתי החלטות.
|
||||||
</p>
|
</p>
|
||||||
|
</div>
|
||||||
|
<Button
|
||||||
|
onClick={() => setUploadOpen(true)}
|
||||||
|
className="bg-navy text-parchment hover:bg-navy-soft shrink-0"
|
||||||
|
>
|
||||||
|
<Upload className="w-4 h-4 me-1" />
|
||||||
|
העלה החלטה
|
||||||
|
</Button>
|
||||||
</header>
|
</header>
|
||||||
|
|
||||||
|
<TrainingUploadDialog open={uploadOpen} onOpenChange={setUploadOpen} />
|
||||||
|
|
||||||
<div className="h-[2px] bg-gradient-to-l from-transparent via-gold to-transparent" />
|
<div className="h-[2px] bg-gradient-to-l from-transparent via-gold to-transparent" />
|
||||||
|
|
||||||
<Card className="bg-surface border-rule shadow-sm">
|
<Card className="bg-surface border-rule shadow-sm">
|
||||||
@@ -34,6 +53,8 @@ export default function TrainingPage() {
|
|||||||
<TabsTrigger value="report">פורטרט סגנון</TabsTrigger>
|
<TabsTrigger value="report">פורטרט סגנון</TabsTrigger>
|
||||||
<TabsTrigger value="corpus">קורפוס</TabsTrigger>
|
<TabsTrigger value="corpus">קורפוס</TabsTrigger>
|
||||||
<TabsTrigger value="compare">השוואה</TabsTrigger>
|
<TabsTrigger value="compare">השוואה</TabsTrigger>
|
||||||
|
<TabsTrigger value="curator">הסוכן</TabsTrigger>
|
||||||
|
<TabsTrigger value="chat">שיחה</TabsTrigger>
|
||||||
</TabsList>
|
</TabsList>
|
||||||
|
|
||||||
<TabsContent value="report" className="mt-5">
|
<TabsContent value="report" className="mt-5">
|
||||||
@@ -47,6 +68,14 @@ export default function TrainingPage() {
|
|||||||
<TabsContent value="compare" className="mt-5">
|
<TabsContent value="compare" className="mt-5">
|
||||||
<ComparePanel />
|
<ComparePanel />
|
||||||
</TabsContent>
|
</TabsContent>
|
||||||
|
|
||||||
|
<TabsContent value="curator" className="mt-5">
|
||||||
|
<CuratorPortraitPanel />
|
||||||
|
</TabsContent>
|
||||||
|
|
||||||
|
<TabsContent value="chat" className="mt-5">
|
||||||
|
<ChatPanel />
|
||||||
|
</TabsContent>
|
||||||
</Tabs>
|
</Tabs>
|
||||||
</CardContent>
|
</CardContent>
|
||||||
</Card>
|
</Card>
|
||||||
|
|||||||
434
web-ui/src/components/training/chat-panel.tsx
Normal file
@@ -0,0 +1,434 @@
|
|||||||
|
"use client";
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Style-agent chat panel — the new "שיחה" tab on /training.
|
||||||
|
*
|
||||||
|
* Layout: two columns.
|
||||||
|
* - Sidebar: list of conversations + "+ שיחה חדשה" button
|
||||||
|
* - Main: thread of messages + composer with SSE streaming
|
||||||
|
*
|
||||||
|
* Each message is persisted to the legal-ai DB; the LLM call goes
|
||||||
|
* out via FastAPI → host's legal-chat-service → claude CLI. There
|
||||||
|
* is no API cost — the claude CLI uses Daphna's claude.ai
|
||||||
|
* subscription via the host's auth.
|
||||||
|
*
|
||||||
|
* Health gate: if /api/training/chat/health reports the host service
|
||||||
|
* is unreachable, the composer is replaced by a setup notice telling
|
||||||
|
* the chair to start the pm2 service.
|
||||||
|
*/
|
||||||
|
|
||||||
|
import { useEffect, useRef, useState } from "react";
|
||||||
|
import {
|
||||||
|
Send, Plus, Trash2, Loader2, MessageSquare, Sparkles, AlertTriangle,
|
||||||
|
} from "lucide-react";
|
||||||
|
import { toast } from "sonner";
|
||||||
|
import { Card, CardContent } from "@/components/ui/card";
|
||||||
|
import { Button } from "@/components/ui/button";
|
||||||
|
import { Textarea } from "@/components/ui/textarea";
|
||||||
|
import { ScrollArea } from "@/components/ui/scroll-area";
|
||||||
|
import { Badge } from "@/components/ui/badge";
|
||||||
|
import { Skeleton } from "@/components/ui/skeleton";
|
||||||
|
import {
|
||||||
|
Select, SelectContent, SelectItem, SelectTrigger, SelectValue,
|
||||||
|
} from "@/components/ui/select";
|
||||||
|
import {
|
||||||
|
chatKeys,
|
||||||
|
useChatConversation,
|
||||||
|
useChatConversations,
|
||||||
|
useChatHealth,
|
||||||
|
useCorpus,
|
||||||
|
useCreateChat,
|
||||||
|
useDeleteChat,
|
||||||
|
type ChatMessage,
|
||||||
|
} from "@/lib/api/training";
|
||||||
|
import { useQueryClient } from "@tanstack/react-query";
|
||||||
|
|
||||||
|
export function ChatPanel() {
|
||||||
|
const [activeId, setActiveId] = useState<string | null>(null);
|
||||||
|
const health = useChatHealth();
|
||||||
|
|
||||||
|
return (
|
||||||
|
<div className="grid gap-4 lg:grid-cols-[280px_1fr]">
|
||||||
|
<ConversationsSidebar activeId={activeId} onSelect={setActiveId} />
|
||||||
|
<div className="space-y-3">
|
||||||
|
{health.data && !health.data.reachable && (
|
||||||
|
<ChatServiceWarning health={health.data} />
|
||||||
|
)}
|
||||||
|
{activeId ? (
|
||||||
|
<ChatThread convId={activeId} />
|
||||||
|
) : (
|
||||||
|
<Card className="bg-rule-soft/40 border-rule">
|
||||||
|
<CardContent className="px-6 py-10 text-center text-ink-muted text-sm space-y-2">
|
||||||
|
<MessageSquare className="w-8 h-8 mx-auto opacity-50" />
|
||||||
|
<p>בחר שיחה קיימת או פתח חדשה כדי להתחיל לדבר עם סוכן הסגנון.</p>
|
||||||
|
<p className="text-[0.78rem]">
|
||||||
|
הסוכן רץ על claude CLI מקומי דרך legal-chat-service. אין עלות API.
|
||||||
|
</p>
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Sidebar: list + new ────────────────────────────────────────────
|
||||||
|
|
||||||
|
function ConversationsSidebar({
|
||||||
|
activeId, onSelect,
|
||||||
|
}: {
|
||||||
|
activeId: string | null;
|
||||||
|
onSelect: (id: string | null) => void;
|
||||||
|
}) {
|
||||||
|
const { data: convs, isPending } = useChatConversations();
|
||||||
|
const { data: corpus } = useCorpus();
|
||||||
|
const create = useCreateChat();
|
||||||
|
const del = useDeleteChat();
|
||||||
|
const [creating, setCreating] = useState(false);
|
||||||
|
const [newTitle, setNewTitle] = useState("");
|
||||||
|
const [newCorpusId, setNewCorpusId] = useState<string>("__none__");
|
||||||
|
|
||||||
|
const onCreate = async () => {
|
||||||
|
try {
|
||||||
|
const conv = await create.mutateAsync({
|
||||||
|
title: newTitle.trim() || "שיחה חדשה",
|
||||||
|
style_corpus_id: newCorpusId === "__none__" ? null : newCorpusId,
|
||||||
|
});
|
||||||
|
onSelect(conv.id);
|
||||||
|
setCreating(false);
|
||||||
|
setNewTitle("");
|
||||||
|
setNewCorpusId("__none__");
|
||||||
|
} catch (e) {
|
||||||
|
toast.error(e instanceof Error ? e.message : "כשל ביצירת שיחה");
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
const onDelete = async (id: string) => {
|
||||||
|
if (!window.confirm("למחוק את השיחה? פעולה זו לא ניתנת לביטול.")) return;
|
||||||
|
try {
|
||||||
|
await del.mutateAsync(id);
|
||||||
|
if (activeId === id) onSelect(null);
|
||||||
|
toast.success("השיחה נמחקה");
|
||||||
|
} catch (e) {
|
||||||
|
toast.error(e instanceof Error ? e.message : "כשל במחיקה");
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
return (
|
||||||
|
<Card className="bg-surface border-rule">
|
||||||
|
<CardContent className="px-3 py-3 space-y-2">
|
||||||
|
{!creating ? (
|
||||||
|
<Button
|
||||||
|
onClick={() => setCreating(true)}
|
||||||
|
className="w-full bg-navy text-parchment hover:bg-navy-soft"
|
||||||
|
size="sm"
|
||||||
|
>
|
||||||
|
<Plus className="w-4 h-4 me-1" />
|
||||||
|
שיחה חדשה
|
||||||
|
</Button>
|
||||||
|
) : (
|
||||||
|
<div className="space-y-2 border border-rule rounded p-2 bg-rule-soft/30">
|
||||||
|
<Textarea
|
||||||
|
value={newTitle}
|
||||||
|
onChange={(e) => setNewTitle(e.target.value)}
|
||||||
|
placeholder="כותרת לשיחה (אופציונלי)"
|
||||||
|
rows={2} dir="rtl"
|
||||||
|
/>
|
||||||
|
<Select value={newCorpusId} onValueChange={setNewCorpusId} dir="rtl">
|
||||||
|
<SelectTrigger>
|
||||||
|
<SelectValue placeholder="צמד להחלטה (אופציונלי)" />
|
||||||
|
</SelectTrigger>
|
||||||
|
<SelectContent className="max-h-[300px]">
|
||||||
|
<SelectItem value="__none__">— שיחה כללית —</SelectItem>
|
||||||
|
{corpus?.map((c) => (
|
||||||
|
<SelectItem key={c.id} value={c.id}>
|
||||||
|
{c.decision_number || "—"}
|
||||||
|
{c.decision_date ? ` · ${c.decision_date}` : ""}
|
||||||
|
</SelectItem>
|
||||||
|
))}
|
||||||
|
</SelectContent>
|
||||||
|
</Select>
|
||||||
|
<div className="flex gap-1 justify-end">
|
||||||
|
<Button variant="ghost" size="sm"
|
||||||
|
onClick={() => { setCreating(false); setNewTitle(""); setNewCorpusId("__none__"); }}>
|
||||||
|
ביטול
|
||||||
|
</Button>
|
||||||
|
<Button size="sm" onClick={onCreate} disabled={create.isPending}
|
||||||
|
className="bg-navy text-parchment hover:bg-navy-soft">
|
||||||
|
צור
|
||||||
|
</Button>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
<ScrollArea className="h-[520px]">
|
||||||
|
<ul className="space-y-1">
|
||||||
|
{isPending && (
|
||||||
|
<>
|
||||||
|
<Skeleton className="h-12 w-full" />
|
||||||
|
<Skeleton className="h-12 w-full" />
|
||||||
|
</>
|
||||||
|
)}
|
||||||
|
{convs?.length === 0 && (
|
||||||
|
<p className="text-center text-ink-muted text-[0.78rem] py-6">
|
||||||
|
אין עדיין שיחות
|
||||||
|
</p>
|
||||||
|
)}
|
||||||
|
{convs?.map((c) => {
|
||||||
|
const active = c.id === activeId;
|
||||||
|
return (
|
||||||
|
<li key={c.id}>
|
||||||
|
<div
  role="button"
  tabIndex={0}
  onClick={() => onSelect(c.id)}
  onKeyDown={(e) => {
    // Only react to keys pressed on the row itself, not on the nested delete button.
    if (e.target === e.currentTarget && (e.key === "Enter" || e.key === " ")) {
      e.preventDefault();
      onSelect(c.id);
    }
  }}
  className={
    "w-full text-end rounded-md px-2 py-2 transition cursor-pointer " +
    (active
      ? "bg-gold-wash border border-gold/40"
      : "hover:bg-rule-soft/60 border border-transparent")
  }
>
  {/* The row is a div with role="button": a <button> may not contain another <button>. */}
  <div className="text-sm text-navy font-semibold truncate">
    {c.title}
  </div>
  <div className="flex items-center gap-1 text-[0.7rem] text-ink-muted">
    {c.decision_number && (
      <Badge variant="outline"
        className="text-[0.65rem] bg-info-bg text-info border-info/40">
        {c.decision_number}
      </Badge>
    )}
    <span className="tabular-nums">{c.message_count}</span>
    <MessageSquare className="w-3 h-3" />
    <span className="grow text-end">
      {new Date(c.last_message_at).toLocaleDateString("he-IL")}
    </span>
    <button
      type="button"
      onClick={(e) => { e.stopPropagation(); onDelete(c.id); }}
      className="hover:text-danger"
      aria-label="מחק שיחה"
    >
      <Trash2 className="w-3 h-3" />
    </button>
  </div>
</div>
|
||||||
|
</li>
|
||||||
|
);
|
||||||
|
})}
|
||||||
|
</ul>
|
||||||
|
</ScrollArea>
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Thread + composer ──────────────────────────────────────────────
|
||||||
|
|
||||||
|
function ChatThread({ convId }: { convId: string }) {
|
||||||
|
const { data, isPending } = useChatConversation(convId);
|
||||||
|
const qc = useQueryClient();
|
||||||
|
const [draft, setDraft] = useState("");
|
||||||
|
const [streaming, setStreaming] = useState(false);
|
||||||
|
const [streamingText, setStreamingText] = useState("");
|
||||||
|
const [streamError, setStreamError] = useState("");
|
||||||
|
const scrollRef = useRef<HTMLDivElement | null>(null);
|
||||||
|
|
||||||
|
/* Auto-scroll to bottom when new messages arrive. */
|
||||||
|
useEffect(() => {
|
||||||
|
const el = scrollRef.current;
|
||||||
|
if (!el) return;
|
||||||
|
el.scrollTo({ top: el.scrollHeight, behavior: "smooth" });
|
||||||
|
}, [data?.messages.length, streamingText]);
|
||||||
|
|
||||||
|
const onSend = async () => {
|
||||||
|
const text = draft.trim();
|
||||||
|
if (!text || streaming) return;
|
||||||
|
setDraft("");
|
||||||
|
setStreaming(true);
|
||||||
|
setStreamingText("");
|
||||||
|
setStreamError("");
|
||||||
|
|
||||||
|
try {
|
||||||
|
const res = await fetch(
|
||||||
|
`/api/training/chat/conversations/${encodeURIComponent(convId)}/messages`,
|
||||||
|
{
|
||||||
|
method: "POST",
|
||||||
|
headers: { "Content-Type": "application/json" },
|
||||||
|
body: JSON.stringify({ content: text }),
|
||||||
|
},
|
||||||
|
);
|
||||||
|
if (!res.ok || !res.body) {
|
||||||
|
const body = await res.text();
|
||||||
|
throw new Error(`HTTP ${res.status}: ${body.slice(0, 200)}`);
|
||||||
|
}
|
||||||
|
// Parse SSE line-by-line. EventSource would be cleaner but it
|
||||||
|
// doesn't support POST bodies; the manual reader is small.
|
||||||
|
const reader = res.body.getReader();
|
||||||
|
const decoder = new TextDecoder();
|
||||||
|
let buffer = "";
|
||||||
|
let accumulated = "";
|
||||||
|
while (true) {
|
||||||
|
const { value, done } = await reader.read();
|
||||||
|
if (done) break;
|
||||||
|
buffer += decoder.decode(value, { stream: true });
|
||||||
|
let nl: number;
|
||||||
|
while ((nl = buffer.indexOf("\n\n")) !== -1) {
|
||||||
|
const event = buffer.slice(0, nl);
|
||||||
|
buffer = buffer.slice(nl + 2);
|
||||||
|
if (!event.startsWith("data: ")) continue;
|
||||||
|
try {
|
||||||
|
const payload = JSON.parse(event.slice("data: ".length));
|
||||||
|
if (payload.type === "text_delta" && payload.text) {
|
||||||
|
accumulated += payload.text;
|
||||||
|
setStreamingText(accumulated);
|
||||||
|
} else if (payload.type === "error") {
|
||||||
|
setStreamError(String(payload.message || "שגיאה לא ידועה"));
|
||||||
|
} else if (payload.type === "done") {
|
||||||
|
if (payload.text && !accumulated) {
|
||||||
|
accumulated = payload.text;
|
||||||
|
setStreamingText(accumulated);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
} catch {
|
||||||
|
/* ignore non-JSON */
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
} catch (e) {
|
||||||
|
setStreamError(e instanceof Error ? e.message : "שגיאה בשיחה");
|
||||||
|
} finally {
|
||||||
|
setStreaming(false);
|
||||||
|
setStreamingText("");
|
||||||
|
// Refetch the conversation so the persisted assistant turn shows up.
|
||||||
|
qc.invalidateQueries({ queryKey: chatKeys.conversation(convId) });
|
||||||
|
qc.invalidateQueries({ queryKey: chatKeys.conversations() });
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
if (isPending) return <Skeleton className="h-[560px] w-full" />;
|
||||||
|
if (!data) return null;
|
||||||
|
|
||||||
|
return (
|
||||||
|
<Card className="bg-surface border-rule">
|
||||||
|
<CardContent className="px-4 py-3 space-y-3">
|
||||||
|
<header className="flex items-center gap-2 border-b border-rule pb-2">
|
||||||
|
<Sparkles className="w-4 h-4 text-gold-deep" />
|
||||||
|
<h3 className="text-navy font-semibold grow">{data.conversation.title}</h3>
|
||||||
|
{data.conversation.decision_number && (
|
||||||
|
<Badge variant="outline" className="bg-info-bg text-info border-info/40">
|
||||||
|
{data.conversation.decision_number}
|
||||||
|
</Badge>
|
||||||
|
)}
|
||||||
|
</header>
|
||||||
|
|
||||||
|
<div ref={scrollRef} className="h-[440px] overflow-y-auto space-y-3 pe-1">
|
||||||
|
{data.messages.length === 0 && !streaming && (
|
||||||
|
<p className="text-center text-ink-muted text-sm py-8">
|
||||||
|
התחל בשאלה — למשל: "מה מאפיין את הפתיחות של דפנה בעררי 1xxx?"
|
||||||
|
</p>
|
||||||
|
)}
|
||||||
|
{data.messages.map((m) => <MessageBubble key={m.id} message={m} />)}
|
||||||
|
{streaming && (
|
||||||
|
<MessageBubble
|
||||||
|
message={{
|
||||||
|
id: "streaming",
|
||||||
|
role: "assistant",
|
||||||
|
content: streamingText || "(מקליד…)",
|
||||||
|
created_at: "",
|
||||||
|
}}
|
||||||
|
isStreaming
|
||||||
|
/>
|
||||||
|
)}
|
||||||
|
{streamError && (
|
||||||
|
<div className="rounded-lg border border-danger/40 bg-danger-bg p-3 text-danger text-sm">
|
||||||
|
{streamError}
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div className="border-t border-rule pt-3 space-y-2">
|
||||||
|
<Textarea
|
||||||
|
value={draft}
|
||||||
|
onChange={(e) => setDraft(e.target.value)}
|
||||||
|
placeholder="שאל את הסוכן… (Shift+Enter לשורה חדשה)"
|
||||||
|
rows={3} dir="rtl"
|
||||||
|
disabled={streaming}
|
||||||
|
onKeyDown={(e) => {
|
||||||
|
if (e.key === "Enter" && !e.shiftKey) {
|
||||||
|
e.preventDefault();
|
||||||
|
void onSend();
|
||||||
|
}
|
||||||
|
}}
|
||||||
|
/>
|
||||||
|
<div className="flex items-center gap-2">
|
||||||
|
<p className="text-[0.72rem] text-ink-muted grow">
|
||||||
|
{data.conversation.claude_session_id
|
||||||
|
? "שיחה ממשיכה (--resume) — אין צורך לטעון מחדש את ה-system prompt"
|
||||||
|
: "שיחה חדשה — system prompt ייטען (שני מסמכי ייחוס + רשימת קורפוס)"}
|
||||||
|
</p>
|
||||||
|
<Button onClick={onSend} disabled={streaming || !draft.trim()}
|
||||||
|
className="bg-navy text-parchment hover:bg-navy-soft">
|
||||||
|
{streaming ? (
|
||||||
|
<Loader2 className="w-4 h-4 animate-spin me-1" />
|
||||||
|
) : (
|
||||||
|
<Send className="w-4 h-4 me-1" />
|
||||||
|
)}
|
||||||
|
שלח
|
||||||
|
</Button>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function MessageBubble({
|
||||||
|
message, isStreaming = false,
|
||||||
|
}: { message: ChatMessage; isStreaming?: boolean }) {
|
||||||
|
const isUser = message.role === "user";
|
||||||
|
return (
|
||||||
|
<div className={isUser ? "flex justify-start" : "flex justify-end"}>
|
||||||
|
<div
|
||||||
|
className={
|
||||||
|
"max-w-[85%] rounded-lg px-3 py-2 text-sm leading-relaxed whitespace-pre-wrap " +
|
||||||
|
(isUser
|
||||||
|
? "bg-gold-wash text-ink border border-gold/40"
|
||||||
|
: "bg-rule-soft text-ink border border-rule")
|
||||||
|
}
|
||||||
|
dir="rtl"
|
||||||
|
>
|
||||||
|
{message.content}
|
||||||
|
{isStreaming && (
|
||||||
|
<span className="inline-block w-1.5 h-3.5 bg-navy/60 align-middle ms-1 animate-pulse" />
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Service-down warning ──────────────────────────────────────────
|
||||||
|
|
||||||
|
function ChatServiceWarning({
|
||||||
|
health,
|
||||||
|
}: { health: { reachable: boolean; url: string; error?: string } }) {
|
||||||
|
return (
|
||||||
|
<Card className="bg-danger-bg border-danger/40">
|
||||||
|
<CardContent className="px-4 py-3 space-y-1">
|
||||||
|
<div className="flex items-center gap-2 text-danger">
|
||||||
|
<AlertTriangle className="w-4 h-4" />
|
||||||
|
<strong>שירות הצ'אט אינו זמין</strong>
|
||||||
|
</div>
|
||||||
|
<p className="text-[0.78rem] text-danger">
|
||||||
|
לא ניתן להגיע ל-legal-chat-service בכתובת
|
||||||
|
<code className="px-1 mx-1 bg-rule-soft rounded">{health.url}</code>.
|
||||||
|
{health.error && (<> פירוט: <code className="px-1 bg-rule-soft rounded">{health.error}</code></>)}
|
||||||
|
</p>
|
||||||
|
<p className="text-[0.72rem] text-ink-muted">
|
||||||
|
על המכונה המקומית הפעל:
|
||||||
|
<code className="px-1 bg-rule-soft rounded">
|
||||||
|
pm2 start /home/chaim/legal-ai/scripts/legal-chat-service.config.cjs
|
||||||
|
</code>
|
||||||
|
</p>
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
);
|
||||||
|
}
|
||||||
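Review note on the streaming loop in `ChatThread`: the SSE handling can be read as two pure steps, splitting the growing buffer on blank lines and folding the parsed events into the visible text. The sketch below isolates those steps so they can be checked without a network call. `ChatEvent`, `drainSseBuffer` and `accumulateText` are hypothetical names introduced here for illustration, not part of the commit; the event shapes (`text_delta`, `error`, `done`) are taken from the panel code above, and everything else is an assumption.

```typescript
// Hypothetical helpers (not in the commit) mirroring the panel's SSE loop.
type ChatEvent =
  | { type: "text_delta"; text: string }
  | { type: "error"; message?: string }
  | { type: "done"; text?: string };

// Split a buffer into complete "data: <json>" frames, which are terminated
// by a blank line. An incomplete trailing frame is returned as `rest` so the
// caller can prepend it to the next network chunk.
function drainSseBuffer(buffer: string): { events: ChatEvent[]; rest: string } {
  const events: ChatEvent[] = [];
  let rest = buffer;
  let sep: number;
  while ((sep = rest.indexOf("\n\n")) !== -1) {
    const frame = rest.slice(0, sep);
    rest = rest.slice(sep + 2);
    // Frames without a data field (comments, keep-alives) are skipped.
    if (!frame.startsWith("data: ")) continue;
    try {
      events.push(JSON.parse(frame.slice("data: ".length)) as ChatEvent);
    } catch {
      // Non-JSON payloads are ignored, as in the panel.
    }
  }
  return { events, rest };
}

// Fold events into the assistant text: deltas append, and a final `done`
// payload is used only when no delta was ever received.
function accumulateText(current: string, events: ChatEvent[]): string {
  let text = current;
  for (const ev of events) {
    if (ev.type === "text_delta" && ev.text) text += ev.text;
    else if (ev.type === "done" && ev.text && !text) text = ev.text;
  }
  return text;
}
```

In the component, each `reader.read()` chunk would be decoded, appended to the previous `rest`, and passed through `drainSseBuffer`. Keeping the remainder is what lets a JSON payload that is split across two network chunks be parsed once its second half arrives.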
402
web-ui/src/components/training/corpus-detail-drawer.tsx
Normal file
@@ -0,0 +1,402 @@
|
|||||||
|
"use client";
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Side-drawer for inspecting + editing a single style_corpus entry.
|
||||||
|
*
|
||||||
|
* Tabs:
|
||||||
|
* - "פרטים" — show + edit the enriched metadata (decision_number, date,
|
||||||
|
* subjects, summary, outcome, key_principles, appeal_subtype). Saving
|
||||||
|
* issues a PATCH /api/training/corpus/{id} and invalidates the list.
|
||||||
|
* - "תוכן" — read-only full_text view (truncated to 5K with "show more").
|
||||||
|
* We never let the chair edit full_text from the UI; corrections happen
|
||||||
|
* by re-uploading via the Upload dialog.
|
||||||
|
 * - "מה למדנו": per-decision lessons, rendered by LessonsTab.
|
||||||
|
* - "דפוסים" — style_patterns scoped by appeal_subtype.
|
||||||
|
*
|
||||||
|
* Why a Sheet, not a Dialog: the drawer needs to coexist with the corpus
|
||||||
|
* table so the chair can scan multiple decisions without losing context.
|
||||||
|
* Sheet (side: "left" in RTL = right edge in LTR) gives that without
|
||||||
|
* stealing the entire viewport.
|
||||||
|
*/
|
||||||
|
|
||||||
|
import { useEffect, useState } from "react";
|
||||||
|
import { Save, FileText, Tag, Calendar, BookOpen, Loader2 } from "lucide-react";
|
||||||
|
import { toast } from "sonner";
|
||||||
|
import {
|
||||||
|
Sheet, SheetContent, SheetHeader, SheetTitle, SheetDescription,
|
||||||
|
} from "@/components/ui/sheet";
|
||||||
|
import { Tabs, TabsContent, TabsList, TabsTrigger } from "@/components/ui/tabs";
|
||||||
|
import { Card, CardContent } from "@/components/ui/card";
|
||||||
|
import { Button } from "@/components/ui/button";
|
||||||
|
import { Input } from "@/components/ui/input";
|
||||||
|
import { Label } from "@/components/ui/label";
|
||||||
|
import { Textarea } from "@/components/ui/textarea";
|
||||||
|
import { Badge } from "@/components/ui/badge";
|
||||||
|
import { ScrollArea } from "@/components/ui/scroll-area";
|
||||||
|
import {
|
||||||
|
usePatchCorpus,
|
||||||
|
type CorpusDecision,
|
||||||
|
type CorpusDecisionPatch,
|
||||||
|
} from "@/lib/api/training";
|
||||||
|
import { LessonsTab } from "./lessons-tab";
|
||||||
|
|
||||||
|
type Props = {
|
||||||
|
decision: CorpusDecision | null;
|
||||||
|
onOpenChange: (open: boolean) => void;
|
||||||
|
};
|
||||||
|
|
||||||
|
export function CorpusDetailDrawer({ decision, onOpenChange }: Props) {
|
||||||
|
// Local editable state for the "details" tab. Re-seeds whenever the
|
||||||
|
// selected decision changes so the form reflects the row the chair
|
||||||
|
// clicked.
|
||||||
|
const [draft, setDraft] = useState<CorpusDecisionPatch>({});
|
||||||
|
const patch = usePatchCorpus();
|
||||||
|
|
||||||
|
/* eslint-disable react-hooks/set-state-in-effect */
|
||||||
|
useEffect(() => {
|
||||||
|
if (!decision) {
|
||||||
|
setDraft({});
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
setDraft({
|
||||||
|
decision_number: decision.decision_number,
|
||||||
|
decision_date: decision.decision_date,
|
||||||
|
subject_categories: decision.subject_categories,
|
||||||
|
summary: decision.summary,
|
||||||
|
outcome: decision.outcome,
|
||||||
|
key_principles: decision.key_principles,
|
||||||
|
appeal_subtype: decision.appeal_subtype,
|
||||||
|
practice_area: decision.practice_area,
|
||||||
|
});
|
||||||
|
}, [decision]);
|
||||||
|
/* eslint-enable react-hooks/set-state-in-effect */
|
||||||
|
|
||||||
|
const open = decision !== null;
|
||||||
|
if (!decision) return null;
|
||||||
|
|
||||||
|
// Diff against the originally loaded row — only PATCH fields the chair
|
||||||
|
// actually changed, so concurrent edits to other fields stay intact.
|
||||||
|
const diff: CorpusDecisionPatch = {};
|
||||||
|
if (draft.decision_number !== decision.decision_number)
|
||||||
|
diff.decision_number = draft.decision_number;
|
||||||
|
if (draft.decision_date !== decision.decision_date)
|
||||||
|
diff.decision_date = draft.decision_date;
|
||||||
|
if (draft.summary !== decision.summary)
|
||||||
|
diff.summary = draft.summary;
|
||||||
|
if (draft.outcome !== decision.outcome)
|
||||||
|
diff.outcome = draft.outcome;
|
||||||
|
if (draft.appeal_subtype !== decision.appeal_subtype)
|
||||||
|
diff.appeal_subtype = draft.appeal_subtype;
|
||||||
|
if (draft.practice_area !== decision.practice_area)
|
||||||
|
diff.practice_area = draft.practice_area;
|
||||||
|
if (
|
||||||
|
JSON.stringify(draft.subject_categories) !==
|
||||||
|
JSON.stringify(decision.subject_categories)
|
||||||
|
)
|
||||||
|
diff.subject_categories = draft.subject_categories;
|
||||||
|
if (
|
||||||
|
JSON.stringify(draft.key_principles) !==
|
||||||
|
JSON.stringify(decision.key_principles)
|
||||||
|
)
|
||||||
|
diff.key_principles = draft.key_principles;
|
||||||
|
|
||||||
|
const isDirty = Object.keys(diff).length > 0;
|
||||||
|
|
||||||
|
const onSave = async () => {
|
||||||
|
if (!isDirty) return;
|
||||||
|
try {
|
||||||
|
await patch.mutateAsync({ id: decision.id, patch: diff });
|
||||||
|
toast.success("המטא-דאטה עודכן");
|
||||||
|
} catch (e) {
|
||||||
|
toast.error(e instanceof Error ? e.message : "כשל בשמירה");
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
const setSubjects = (raw: string) =>
|
||||||
|
setDraft((d) => ({
|
||||||
|
...d,
|
||||||
|
subject_categories: raw.split(/[,،]/).map((s) => s.trim()).filter(Boolean),
|
||||||
|
}));
|
||||||
|
const setPrinciples = (raw: string) =>
|
||||||
|
setDraft((d) => ({
|
||||||
|
...d,
|
||||||
|
key_principles: raw.split("\n").map((s) => s.trim()).filter(Boolean),
|
||||||
|
}));
|
||||||
|
|
||||||
|
return (
|
||||||
|
<Sheet open={open} onOpenChange={onOpenChange}>
|
||||||
|
<SheetContent side="left" className="w-full sm:max-w-3xl overflow-y-auto" dir="rtl">
|
||||||
|
<SheetHeader>
|
||||||
|
<SheetTitle className="text-navy flex items-center gap-2">
|
||||||
|
<BookOpen className="w-4 h-4 shrink-0" />
|
||||||
|
{decision.legal_citation || decision.decision_number || "—"}
|
||||||
|
</SheetTitle>
|
||||||
|
<SheetDescription className="text-ink-muted">
|
||||||
|
{decision.doc_title || "החלטה בקורפוס הסגנוני"}
|
||||||
|
</SheetDescription>
|
||||||
|
</SheetHeader>
|
||||||
|
|
||||||
|
{/* Summary strip — fast-scan info, always visible above the tabs. */}
|
||||||
|
<div className="px-6 mt-3 grid grid-cols-2 md:grid-cols-4 gap-3 text-[0.78rem]">
|
||||||
|
<DataPoint icon={<Calendar className="w-3 h-3" />} label="תאריך"
|
||||||
|
value={decision.decision_date || "—"} />
|
||||||
|
<DataPoint icon={<FileText className="w-3 h-3" />} label="תווים"
|
||||||
|
value={`${(decision.chars / 1000).toFixed(1)}K`} />
|
||||||
|
<DataPoint icon={<FileText className="w-3 h-3" />} label="עמודים"
|
||||||
|
value={decision.page_count > 0 ? String(decision.page_count) : "—"} />
|
||||||
|
<DataPoint icon={<Tag className="w-3 h-3" />} label="תת-סוג"
|
||||||
|
value={decision.appeal_subtype || "—"} />
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div className="px-6 pb-6 mt-4">
|
||||||
|
<Tabs defaultValue="details" dir="rtl">
|
||||||
|
<TabsList className="bg-rule-soft/60">
|
||||||
|
<TabsTrigger value="details">פרטים</TabsTrigger>
|
||||||
|
<TabsTrigger value="content">תוכן</TabsTrigger>
|
||||||
|
<TabsTrigger value="lessons">מה למדנו</TabsTrigger>
|
||||||
|
<TabsTrigger value="patterns">דפוסים</TabsTrigger>
|
||||||
|
</TabsList>
|
||||||
|
|
||||||
|
{/* ── Tab: editable metadata ─────────────────────────── */}
|
||||||
|
<TabsContent value="details" className="mt-4 space-y-4">
|
||||||
|
<div className="grid grid-cols-2 gap-3">
|
||||||
|
<Field label="מספר ההחלטה">
|
||||||
|
<Input value={draft.decision_number ?? ""}
|
||||||
|
onChange={(e) => setDraft((d) => ({ ...d, decision_number: e.target.value }))}
|
||||||
|
dir="rtl" />
|
||||||
|
</Field>
|
||||||
|
<Field label="תאריך">
|
||||||
|
<Input type="date" value={draft.decision_date ?? ""}
|
||||||
|
onChange={(e) => setDraft((d) => ({ ...d, decision_date: e.target.value }))} />
|
||||||
|
</Field>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<Field label="נושאים (מופרדים בפסיקים)">
|
||||||
|
<Input value={(draft.subject_categories ?? []).join(", ")}
|
||||||
|
onChange={(e) => setSubjects(e.target.value)} dir="rtl" />
|
||||||
|
{decision.subject_categories.length > 0 && (
|
||||||
|
<div className="flex flex-wrap gap-1 mt-1">
|
||||||
|
{decision.subject_categories.map((s) => (
|
||||||
|
<Badge key={s} variant="outline"
|
||||||
|
className="text-[0.7rem] bg-gold-wash text-gold-deep border-gold/40">
|
||||||
|
{s}
|
||||||
|
</Badge>
|
||||||
|
))}
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
</Field>
|
||||||
|
|
||||||
|
<div className="grid grid-cols-2 gap-3">
|
||||||
|
<Field label="תת-סוג ערר">
|
||||||
|
<Input value={draft.appeal_subtype ?? ""}
|
||||||
|
onChange={(e) => setDraft((d) => ({ ...d, appeal_subtype: e.target.value }))}
|
||||||
|
placeholder="building_permit / betterment_levy / compensation_197"
|
||||||
|
dir="rtl" />
|
||||||
|
</Field>
|
||||||
|
<Field label="תחום משפט">
|
||||||
|
<Input value={draft.practice_area ?? ""}
|
||||||
|
onChange={(e) => setDraft((d) => ({ ...d, practice_area: e.target.value }))}
|
||||||
|
dir="rtl" />
|
||||||
|
</Field>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<Field label="תקציר (summary)">
|
||||||
|
<Textarea value={draft.summary ?? ""} rows={3}
|
||||||
|
onChange={(e) => setDraft((d) => ({ ...d, summary: e.target.value }))}
|
||||||
|
placeholder="תקציר חופשי — מי, מה, איך הוכרע"
|
||||||
|
dir="rtl" />
|
||||||
|
</Field>
|
||||||
|
|
||||||
|
<Field label="התוצאה (outcome)">
|
||||||
|
<Textarea value={draft.outcome ?? ""} rows={2}
|
||||||
|
onChange={(e) => setDraft((d) => ({ ...d, outcome: e.target.value }))}
|
||||||
|
placeholder="קבלה / קבלה חלקית / דחייה — בקצרה"
|
||||||
|
dir="rtl" />
|
||||||
|
</Field>
|
||||||
|
|
||||||
|
<Field label="עקרונות מרכזיים (שורה לכל אחד)">
|
||||||
|
<Textarea value={(draft.key_principles ?? []).join("\n")} rows={4}
|
||||||
|
onChange={(e) => setPrinciples(e.target.value)}
|
||||||
|
placeholder={"דוגמה:\nשיקול דעת מוגבל לחריגות קטנות\nריפוי פגם רק בנסיבות חריגות"}
|
||||||
|
dir="rtl" />
|
||||||
|
</Field>
|
||||||
|
|
||||||
|
{decision.parties.appellant && (
|
||||||
|
<Card className="bg-rule-soft/40 border-rule">
|
||||||
|
<CardContent className="px-4 py-3 text-[0.78rem] text-ink-soft">
|
||||||
|
<p><strong className="text-navy">עורר/ת:</strong> {decision.parties.appellant}</p>
|
||||||
|
{decision.parties.respondent && (
|
||||||
|
<p className="mt-1"><strong className="text-navy">משיב/ה:</strong> {decision.parties.respondent}</p>
|
||||||
|
)}
|
||||||
|
<p className="mt-2 text-ink-muted text-[0.72rem]">
|
||||||
|
(חולץ אוטומטית מתחילת הטקסט — תקן ע"י עריכת ה-full_text במקור.)
|
||||||
|
</p>
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
)}
|
||||||
|
|
||||||
|
<div className="flex items-center justify-end gap-2 pt-2 border-t border-rule">
|
||||||
|
<Button variant="ghost" onClick={() => onOpenChange(false)}>
|
||||||
|
סגור
|
||||||
|
</Button>
|
||||||
|
<Button onClick={onSave} disabled={!isDirty || patch.isPending}
|
||||||
|
className="bg-navy text-parchment hover:bg-navy-soft">
|
||||||
|
{patch.isPending ? (
|
||||||
|
<Loader2 className="w-4 h-4 animate-spin me-1" />
|
||||||
|
) : (
|
||||||
|
<Save className="w-4 h-4 me-1" />
|
||||||
|
)}
|
||||||
|
שמור שינויים
|
||||||
|
</Button>
|
||||||
|
</div>
|
||||||
|
</TabsContent>
|
||||||
|
|
||||||
|
{/* ── Tab: full_text (read-only) ─────────────────────── */}
|
||||||
|
<TabsContent value="content" className="mt-4">
|
||||||
|
<Card className="bg-surface border-rule">
|
||||||
|
<CardContent className="px-4 py-3">
|
||||||
|
<p className="text-[0.72rem] text-ink-muted mb-2">
|
||||||
|
{decision.chars.toLocaleString("he-IL")} תווים · קריאה בלבד
|
||||||
|
</p>
|
||||||
|
<ScrollArea className="h-[480px] pe-2">
|
||||||
|
<p className="text-sm text-ink leading-relaxed whitespace-pre-wrap">
|
||||||
|
<FullTextLazy id={decision.id} />
|
||||||
|
</p>
|
||||||
|
</ScrollArea>
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
</TabsContent>
|
||||||
|
|
||||||
|
{/* ── Tab: lessons (per-decision) ────────────────────── */}
|
||||||
|
<TabsContent value="lessons" className="mt-4">
|
||||||
|
<LessonsTab corpusId={decision.id} />
|
||||||
|
</TabsContent>
|
||||||
|
|
||||||
|
{/* ── Tab: patterns scoped by appeal_subtype ─────────── */}
|
||||||
|
<TabsContent value="patterns" className="mt-4">
|
||||||
|
<PatternsForSubtype subtype={decision.appeal_subtype} />
|
||||||
|
</TabsContent>
|
||||||
|
</Tabs>
|
||||||
|
</div>
|
||||||
|
</SheetContent>
|
||||||
|
</Sheet>
|
||||||
|
);
|
||||||
|
}
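Review note: the key-principles textarea in the details tab renders `draft.key_principles` as one entry per line and hands the raw text to `setPrinciples`, whose body sits outside this hunk. Below is a minimal sketch of that round trip, assuming the setter splits on newlines and drops blank lines. `formatPrinciples` and `parsePrinciples` are hypothetical names used for illustration only, not identifiers from this commit.

```typescript
// Hypothetical helpers: the commit's real setPrinciples may behave differently.

// Array -> textarea value, mirroring `(draft.key_principles ?? []).join("\n")`.
function formatPrinciples(principles: string[] | null | undefined): string {
  return (principles ?? []).join("\n");
}

// Textarea value -> array: one principle per line, trimmed, blank lines dropped.
function parsePrinciples(raw: string): string[] {
  return raw
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}
```

If the parsed array were fed straight back into the textarea on every keystroke, dropping blank lines would swallow a freshly typed newline, so this kind of normalization is probably best applied at save time rather than in `onChange`.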
|
||||||
|
|
||||||
|
// ── helpers ────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
function DataPoint({
|
||||||
|
icon, label, value,
|
||||||
|
}: { icon: React.ReactNode; label: string; value: string }) {
|
||||||
|
return (
|
||||||
|
<div className="flex items-center gap-1 text-ink-muted">
|
||||||
|
{icon}
|
||||||
|
<span>{label}:</span>
|
||||||
|
<span className="font-semibold text-navy tabular-nums truncate">{value}</span>
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function Field({
|
||||||
|
label, children,
|
||||||
|
}: { label: string; children: React.ReactNode }) {
|
||||||
|
return (
|
||||||
|
<div className="space-y-1">
|
||||||
|
<Label className="text-[0.78rem]">{label}</Label>
|
||||||
|
{children}
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* The corpus-list endpoint deliberately doesn't return full_text (too big).
 * We fetch it on demand only when the content tab opens.
 *
 * Implementation note: we don't have a dedicated /api/training/corpus/{id}
 * GET endpoint yet. As a thin stopgap we call a planned `/full-text` shortcut
 * with a plain fetch(); if the endpoint isn't deployed yet the UI just shows
 * the fallback message instead of crashing. The full-text endpoint lands with
 * the next backend deploy.
 */
|
||||||
|
function FullTextLazy({ id }: { id: string }) {
|
||||||
|
const [text, setText] = useState<string>("");
|
||||||
|
const [loading, setLoading] = useState(true);
|
||||||
|
const [error, setError] = useState("");
|
||||||
|
|
||||||
|
/* eslint-disable react-hooks/set-state-in-effect */
|
||||||
|
useEffect(() => {
|
||||||
|
let cancelled = false;
|
||||||
|
setLoading(true);
|
||||||
|
setError("");
|
||||||
|
fetch(`/api/training/corpus/${encodeURIComponent(id)}/full-text`)
|
||||||
|
.then((r) => (r.ok ? r.json() : Promise.reject(new Error(`HTTP ${r.status}`))))
|
||||||
|
.then((d: { full_text: string }) => {
|
||||||
|
if (cancelled) return;
|
||||||
|
setText(d.full_text || "");
|
||||||
|
})
|
||||||
|
.catch((e: Error) => {
|
||||||
|
if (cancelled) return;
|
||||||
|
setError(e.message);
|
||||||
|
})
|
||||||
|
.finally(() => !cancelled && setLoading(false));
|
||||||
|
return () => { cancelled = true; };
|
||||||
|
}, [id]);
|
||||||
|
/* eslint-enable react-hooks/set-state-in-effect */
|
||||||
|
|
||||||
|
if (loading) return <span className="text-ink-muted">טוען…</span>;
|
||||||
|
if (error) return <span className="text-ink-muted">לא נמצא ({error})</span>;
|
||||||
|
return text;
|
||||||
|
}
|
||||||
|
|
||||||
|
function PatternsForSubtype({ subtype }: { subtype: string }) {
|
||||||
|
// Filtered patterns endpoint isn't built yet, so we fall back to /patterns
// and show the corpus-wide result without any subtype filtering (first 4
// types, 6 patterns each). Per-subtype filtering ships in the
// metadata-enrichment iteration.
|
||||||
|
const [data, setData] = useState<Record<string, { pattern_text: string; frequency: number }[]> | null>(null);
|
||||||
|
const [loading, setLoading] = useState(true);
|
||||||
|
|
||||||
|
useEffect(() => {
|
||||||
|
let cancelled = false;
|
||||||
|
fetch("/api/training/patterns")
|
||||||
|
.then((r) => r.json())
|
||||||
|
.then((d: { by_type: Record<string, { pattern_text: string; frequency: number }[]> }) => {
|
||||||
|
if (!cancelled) setData(d.by_type);
|
||||||
|
})
|
||||||
|
.catch(() => !cancelled && setData({}))
|
||||||
|
.finally(() => !cancelled && setLoading(false));
|
||||||
|
return () => { cancelled = true; };
|
||||||
|
}, []);
|
||||||
|
|
||||||
|
if (loading) return <p className="text-ink-muted text-sm text-center py-6">טוען…</p>;
|
||||||
|
if (!data || Object.keys(data).length === 0) {
|
||||||
|
return <p className="text-ink-muted text-sm text-center py-6">אין דפוסים שמורים — הרץ ניתוח סגנון.</p>;
|
||||||
|
}
|
||||||
|
|
||||||
|
return (
|
||||||
|
<div className="space-y-3">
|
||||||
|
{subtype && (
|
||||||
|
<p className="text-[0.78rem] text-ink-muted">
|
||||||
|
דפוסים בכלל הקורפוס. סינון לפי תת-סוג {subtype} ייושם בעדכון הבא.
|
||||||
|
</p>
|
||||||
|
)}
|
||||||
|
{Object.entries(data).slice(0, 4).map(([type, items]) => (
|
||||||
|
<Card key={type} className="bg-surface border-rule">
|
||||||
|
<CardContent className="px-4 py-3">
|
||||||
|
<h4 className="text-[0.78rem] uppercase tracking-wider text-gold-deep font-semibold mb-2">
|
||||||
|
{type}
|
||||||
|
</h4>
|
||||||
|
<ul className="space-y-1 text-sm text-ink">
|
||||||
|
{items.slice(0, 6).map((p, i) => (
|
||||||
|
<li key={i} className="flex items-start gap-2">
|
||||||
|
<span className="text-[0.72rem] tabular-nums text-ink-muted shrink-0 mt-0.5">
|
||||||
|
×{p.frequency}
|
||||||
|
</span>
|
||||||
|
<span>{p.pattern_text}</span>
|
||||||
|
</li>
|
||||||
|
))}
|
||||||
|
</ul>
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
))}
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
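Review note: `PatternsForSubtype` currently shows corpus-wide patterns and defers subtype filtering to a later iteration. One way that filtering might look on the client is sketched below. It assumes each pattern would carry an `appeal_subtypes` array, which is NOT part of the `/api/training/patterns` response shown in this diff; the field and the `filterPatternsBySubtype` helper are both hypothetical.

```typescript
interface StylePattern {
  pattern_text: string;
  frequency: number;
  // ASSUMPTION: hypothetical field, not returned by the current endpoint.
  appeal_subtypes?: string[];
}

type PatternsByType = Record<string, StylePattern[]>;

// Keep patterns tagged with the given subtype. Untagged patterns are treated
// as corpus-wide and kept; pattern types left with no entries are dropped.
function filterPatternsBySubtype(
  byType: PatternsByType,
  subtype: string,
): PatternsByType {
  if (!subtype) return byType; // nothing to filter on
  const out: PatternsByType = {};
  for (const [type, items] of Object.entries(byType)) {
    const kept = items.filter(
      (p) => !p.appeal_subtypes || p.appeal_subtypes.includes(subtype),
    );
    if (kept.length > 0) out[type] = kept;
  }
  return out;
}
```

Keeping untagged patterns is a design choice for the sketch: it avoids an empty tab while enrichment is still incomplete, at the cost of showing some patterns that may not apply to the subtype.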
|
||||||
@@ -1,6 +1,7 @@
|
|||||||
"use client";
|
"use client";
|
||||||
|
|
||||||
import { Trash2 } from "lucide-react";
|
import { useState } from "react";
|
||||||
|
import { Trash2, Sparkles } from "lucide-react";
|
||||||
import { toast } from "sonner";
|
import { toast } from "sonner";
|
||||||
import {
|
import {
|
||||||
Table, TableBody, TableCell, TableHead, TableHeader, TableRow,
|
Table, TableBody, TableCell, TableHead, TableHeader, TableRow,
|
||||||
@@ -9,12 +10,20 @@ import { Button } from "@/components/ui/button";
|
|||||||
import { Badge } from "@/components/ui/badge";
|
import { Badge } from "@/components/ui/badge";
|
||||||
import { Skeleton } from "@/components/ui/skeleton";
|
import { Skeleton } from "@/components/ui/skeleton";
|
||||||
import { useCorpus, useDeleteCorpusEntry, type CorpusDecision } from "@/lib/api/training";
|
import { useCorpus, useDeleteCorpusEntry, type CorpusDecision } from "@/lib/api/training";
|
||||||
|
import { CorpusDetailDrawer } from "./corpus-detail-drawer";
|
||||||
|
|
||||||
/*
|
/*
|
||||||
* Corpus tab: table of all decisions currently in the style corpus, with a
|
* Corpus tab: table of all decisions currently in the style corpus.
|
||||||
* single destructive action (remove from corpus). Uses browser confirm() for
|
*
|
||||||
* the confirmation — a full shadcn AlertDialog would be overkill for an
|
* Click any row → opens CorpusDetailDrawer with the enriched metadata
|
||||||
* admin-only destructive action with a server-side safety net.
|
* + edit UI. The trash button is now in its own narrow column and uses
|
||||||
|
* stopPropagation so deleting a row doesn't also open the drawer.
|
||||||
|
*
|
||||||
|
* We use browser confirm() for the destructive action rather than a
|
||||||
|
* full shadcn AlertDialog because this is a single admin operation
|
||||||
|
* gated by an API-level safety net (FK cascade is best-effort but
|
||||||
|
* style_corpus DELETE returns 404 on missing rows, so the worst case
|
||||||
|
* is a no-op).
|
||||||
*/
|
*/
|
||||||
|
|
||||||
function formatChars(n: number) {
|
function formatChars(n: number) {
|
||||||
@@ -30,9 +39,12 @@ function formatDate(iso: string) {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
function Row({ item }: { item: CorpusDecision }) {
|
function Row({
|
||||||
|
item, onOpen,
|
||||||
|
}: { item: CorpusDecision; onOpen: () => void }) {
|
||||||
const del = useDeleteCorpusEntry();
|
const del = useDeleteCorpusEntry();
|
||||||
const onDelete = async () => {
|
const onDelete = async (e: React.MouseEvent) => {
|
||||||
|
e.stopPropagation();
|
||||||
if (!window.confirm(`למחוק את החלטה ${item.decision_number} מהקורפוס?`)) return;
|
if (!window.confirm(`למחוק את החלטה ${item.decision_number} מהקורפוס?`)) return;
|
||||||
try {
|
try {
|
||||||
await del.mutateAsync(item.id);
|
await del.mutateAsync(item.id);
|
||||||
@@ -43,7 +55,10 @@ function Row({ item }: { item: CorpusDecision }) {
|
|||||||
};
|
};
|
||||||
|
|
||||||
return (
|
return (
|
||||||
<TableRow className="border-rule hover:bg-gold-wash/30">
|
<TableRow
|
||||||
|
className="border-rule hover:bg-gold-wash/30 cursor-pointer"
|
||||||
|
onClick={onOpen}
|
||||||
|
>
|
||||||
<TableCell className="font-semibold text-navy tabular-nums">
|
<TableCell className="font-semibold text-navy tabular-nums">
|
||||||
{item.decision_number || "—"}
|
{item.decision_number || "—"}
|
||||||
</TableCell>
|
</TableCell>
|
||||||
@@ -55,20 +70,39 @@ function Row({ item }: { item: CorpusDecision }) {
|
|||||||
<span className="text-ink-light">—</span>
|
<span className="text-ink-light">—</span>
|
||||||
) : (
|
) : (
|
||||||
<div className="flex flex-wrap gap-1">
|
<div className="flex flex-wrap gap-1">
|
||||||
{item.subject_categories.map((s) => (
|
{item.subject_categories.slice(0, 3).map((s) => (
|
||||||
<Badge
|
<Badge key={s} variant="outline"
|
||||||
key={s}
|
className="text-[0.7rem] bg-gold-wash text-gold-deep border-gold/40">
|
||||||
variant="outline"
|
|
||||||
className="text-[0.7rem] bg-gold-wash text-gold-deep border-gold/40"
|
|
||||||
>
|
|
||||||
{s}
|
{s}
|
||||||
</Badge>
|
</Badge>
|
||||||
))}
|
))}
|
||||||
|
{item.subject_categories.length > 3 && (
|
||||||
|
<span className="text-[0.7rem] text-ink-muted">
|
||||||
|
+{item.subject_categories.length - 3}
|
||||||
|
</span>
|
||||||
|
)}
|
||||||
</div>
|
</div>
|
||||||
)}
|
)}
|
||||||
</TableCell>
|
</TableCell>
|
||||||
|
<TableCell className="text-[0.78rem] text-ink-soft">
|
||||||
|
<div className="flex items-center gap-2">
|
||||||
|
<span className="truncate">{item.legal_citation || "—"}</span>
|
||||||
|
{item.lessons_count > 0 && (
|
||||||
|
<Badge variant="outline"
|
||||||
|
className="text-[0.7rem] bg-info-bg text-info border-info/40 shrink-0">
|
||||||
|
<Sparkles className="w-3 h-3 me-0.5" />
|
||||||
|
{item.lessons_count}
|
||||||
|
</Badge>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
</TableCell>
|
||||||
<TableCell className="text-ink-soft tabular-nums">
|
<TableCell className="text-ink-soft tabular-nums">
|
||||||
{formatChars(item.chars)}
|
{formatChars(item.chars)}
|
||||||
|
{item.page_count > 0 && (
|
||||||
|
<span className="text-ink-muted text-[0.72rem] ms-1">
|
||||||
|
· {item.page_count} ע׳
|
||||||
|
</span>
|
||||||
|
)}
|
||||||
</TableCell>
|
</TableCell>
|
||||||
<TableCell className="text-ink-muted tabular-nums text-[0.78rem]">
|
<TableCell className="text-ink-muted tabular-nums text-[0.78rem]">
|
||||||
{formatDate(item.created_at)}
|
{formatDate(item.created_at)}
|
||||||
@@ -91,6 +125,7 @@ function Row({ item }: { item: CorpusDecision }) {
|
|||||||
|
|
||||||
export function CorpusPanel() {
|
export function CorpusPanel() {
|
||||||
const { data, isPending, error } = useCorpus();
|
const { data, isPending, error } = useCorpus();
|
||||||
|
const [selected, setSelected] = useState<CorpusDecision | null>(null);
|
||||||
|
|
||||||
if (error) {
|
if (error) {
|
||||||
return (
|
return (
|
||||||
@@ -101,6 +136,7 @@ export function CorpusPanel() {
|
|||||||
}
|
}
|
||||||
|
|
||||||
return (
|
return (
|
||||||
|
<>
|
||||||
<div className="rounded-lg border border-rule bg-surface shadow-sm overflow-hidden">
|
<div className="rounded-lg border border-rule bg-surface shadow-sm overflow-hidden">
|
||||||
<Table>
|
<Table>
|
||||||
<TableHeader className="bg-rule-soft/60">
|
<TableHeader className="bg-rule-soft/60">
|
||||||
@@ -108,7 +144,8 @@ export function CorpusPanel() {
|
|||||||
<TableHead className="text-navy text-right">מס׳ החלטה</TableHead>
|
<TableHead className="text-navy text-right">מס׳ החלטה</TableHead>
|
||||||
<TableHead className="text-navy text-right">תאריך</TableHead>
|
<TableHead className="text-navy text-right">תאריך</TableHead>
|
||||||
<TableHead className="text-navy text-right">נושאים</TableHead>
|
<TableHead className="text-navy text-right">נושאים</TableHead>
|
||||||
<TableHead className="text-navy text-right">תווים</TableHead>
|
<TableHead className="text-navy text-right">מראה מקום</TableHead>
|
||||||
|
<TableHead className="text-navy text-right">תווים / עמודים</TableHead>
|
||||||
<TableHead className="text-navy text-right">נוסף בתאריך</TableHead>
|
<TableHead className="text-navy text-right">נוסף בתאריך</TableHead>
|
||||||
<TableHead className="text-navy" />
|
<TableHead className="text-navy" />
|
||||||
</TableRow>
|
</TableRow>
|
||||||
@@ -117,7 +154,7 @@ export function CorpusPanel() {
|
|||||||
{isPending ? (
|
{isPending ? (
|
||||||
[...Array(4)].map((_, i) => (
|
[...Array(4)].map((_, i) => (
|
||||||
<TableRow key={i} className="border-rule">
|
<TableRow key={i} className="border-rule">
|
||||||
{[...Array(6)].map((_, j) => (
|
{[...Array(7)].map((_, j) => (
|
||||||
<TableCell key={j}>
|
<TableCell key={j}>
|
||||||
<Skeleton className="h-4 w-24" />
|
<Skeleton className="h-4 w-24" />
|
||||||
</TableCell>
|
</TableCell>
|
||||||
@@ -126,15 +163,23 @@ export function CorpusPanel() {
|
|||||||
))
|
))
|
||||||
) : data?.length === 0 ? (
|
) : data?.length === 0 ? (
|
||||||
<TableRow>
|
<TableRow>
|
||||||
<TableCell colSpan={6} className="text-center text-ink-muted py-12">
|
<TableCell colSpan={7} className="text-center text-ink-muted py-12">
|
||||||
הקורפוס ריק
|
הקורפוס ריק
|
||||||
</TableCell>
|
</TableCell>
|
||||||
</TableRow>
|
</TableRow>
|
||||||
) : (
|
) : (
|
||||||
data?.map((item) => <Row key={item.id} item={item} />)
|
data?.map((item) => (
|
||||||
|
<Row key={item.id} item={item} onOpen={() => setSelected(item)} />
|
||||||
|
))
|
||||||
)}
|
)}
|
||||||
</TableBody>
|
</TableBody>
|
||||||
</Table>
|
</Table>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
|
<CorpusDetailDrawer
|
||||||
|
decision={selected}
|
||||||
|
onOpenChange={(open) => { if (!open) setSelected(null); }}
|
||||||
|
/>
|
||||||
|
</>
|
||||||
);
|
);
|
||||||
}
|
}
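Review note: the updated header comment explains that the trash button calls `stopPropagation` so that deleting a row does not also trigger the row's `onClick` and open the drawer. The same guard can be factored into a small wrapper, sketched here. `stopRowClick` and `StoppableEvent` are hypothetical names; the commit inlines the call in `onDelete` instead.

```typescript
// Minimal structural type so the sketch runs without React or a DOM.
// React.MouseEvent satisfies it, since it exposes stopPropagation().
interface StoppableEvent {
  stopPropagation(): void;
}

// Wrap a handler so the event never bubbles up to an enclosing clickable row.
function stopRowClick<E extends StoppableEvent>(
  handler: (event: E) => void,
): (event: E) => void {
  return (event) => {
    event.stopPropagation(); // keep the row's onClick from firing
    handler(event);
  };
}
```

A wrapper like this would only pay off if more per-row actions are added; for a single button the inline call in the diff is arguably simpler.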
|
||||||
|
|||||||
338
web-ui/src/components/training/curator-portrait-panel.tsx
Normal file
@@ -0,0 +1,338 @@
|
|||||||
|
"use client";
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Curator-Portrait tab — shows everything about the agent that learns
|
||||||
|
* Daphna's style:
|
||||||
|
 * 1. Snapshot stats (curator findings to date, number applied)
 * 2. Recent curator findings (last 10), each labeled with its decision number
|
||||||
|
* 3. The hermes-curator system prompt, rendered + linked to Gitea
|
||||||
|
* 4. The style_analyzer training prompts (different lifecycle — runs
|
||||||
|
* over the corpus at training time, not per-decision)
|
||||||
|
* 5. Propose-change form — writes a markdown file to disk for chair
|
||||||
|
* review (no auto-commit)
|
||||||
|
*
|
||||||
|
* The prompts are deliberately read-only here: they're symlinked into
|
||||||
|
* Paperclip and load-bearing for every curator wake. Editing them from
|
||||||
|
* the UI would silently fork the source of truth.
|
||||||
|
*/
|
||||||
|
|
||||||
|
import { useState } from "react";
|
||||||
|
import {
|
||||||
|
Sparkles, ExternalLink, Send, Loader2, FileText, Brain,
|
||||||
|
CheckCircle2, Clock,
|
||||||
|
} from "lucide-react";
|
||||||
|
import { toast } from "sonner";
|
||||||
|
import { Card, CardContent } from "@/components/ui/card";
|
||||||
|
import { Button } from "@/components/ui/button";
|
||||||
|
import { Input } from "@/components/ui/input";
|
||||||
|
import { Label } from "@/components/ui/label";
|
||||||
|
import { Textarea } from "@/components/ui/textarea";
|
||||||
|
import { Badge } from "@/components/ui/badge";
|
||||||
|
import { Skeleton } from "@/components/ui/skeleton";
|
||||||
|
import { ScrollArea } from "@/components/ui/scroll-area";
|
||||||
|
import { Tabs, TabsContent, TabsList, TabsTrigger } from "@/components/ui/tabs";
|
||||||
|
import { Markdown } from "@/components/ui/markdown";
|
||||||
|
import {
|
||||||
|
useCuratorPrompt,
|
||||||
|
useCuratorStats,
|
||||||
|
useStyleAnalyzerPrompts,
|
||||||
|
useSubmitCuratorProposal,
|
||||||
|
} from "@/lib/api/training";
|
||||||
|
|
||||||
|
export function CuratorPortraitPanel() {
|
||||||
|
return (
|
||||||
|
<div className="space-y-6">
|
||||||
|
<StatsCard />
|
||||||
|
<RecentFindings />
|
||||||
|
|
||||||
|
<Tabs defaultValue="curator-prompt" dir="rtl">
|
||||||
|
<TabsList className="bg-rule-soft/60">
|
||||||
|
<TabsTrigger value="curator-prompt">פרומפט ה-Curator</TabsTrigger>
|
||||||
|
<TabsTrigger value="analyzer-prompt">פרומפט אימון הסגנון</TabsTrigger>
|
||||||
|
<TabsTrigger value="propose">הצעת שינוי</TabsTrigger>
|
||||||
|
</TabsList>
|
||||||
|
<TabsContent value="curator-prompt" className="mt-4">
|
||||||
|
<CuratorPromptCard />
|
||||||
|
</TabsContent>
|
||||||
|
<TabsContent value="analyzer-prompt" className="mt-4">
|
||||||
|
<StyleAnalyzerPromptCard />
|
||||||
|
</TabsContent>
|
||||||
|
<TabsContent value="propose" className="mt-4">
|
||||||
|
<ProposeChangeForm />
|
||||||
|
</TabsContent>
|
||||||
|
</Tabs>
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── stats card ─────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
function StatsCard() {
|
||||||
|
const { data, isPending } = useCuratorStats();
|
||||||
|
|
||||||
|
if (isPending) {
|
||||||
|
return (
|
||||||
|
<div className="grid grid-cols-2 md:grid-cols-4 gap-3">
|
||||||
|
{[...Array(4)].map((_, i) => <Skeleton key={i} className="h-20 w-full" />)}
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
if (!data) return null;
|
||||||
|
|
||||||
|
return (
|
||||||
|
<div className="grid grid-cols-2 md:grid-cols-4 gap-3">
|
||||||
|
<Kpi label="ממצאי curator" value={data.total_findings} icon={<Sparkles className="w-4 h-4" />} />
|
||||||
|
<Kpi label="החלטות שנסקרו" value={`${data.decisions_with_findings}/${data.decisions_total}`} icon={<FileText className="w-4 h-4" />} />
|
||||||
|
<Kpi label="ממצאים שאומצו ל-SKILL" value={data.findings_applied} icon={<CheckCircle2 className="w-4 h-4" />} />
|
||||||
|
<Kpi label="ממוצע ממצאים להחלטה"
|
||||||
|
value={
|
||||||
|
data.decisions_with_findings > 0
|
||||||
|
? (data.total_findings / data.decisions_with_findings).toFixed(1)
|
||||||
|
: "—"
|
||||||
|
}
|
||||||
|
icon={<Brain className="w-4 h-4" />}
|
||||||
|
/>
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function Kpi({
|
||||||
|
label, value, icon,
|
||||||
|
}: { label: string; value: string | number; icon: React.ReactNode }) {
|
||||||
|
return (
|
||||||
|
<Card className="bg-surface border-rule">
|
||||||
|
<CardContent className="px-4 py-3">
|
||||||
|
<div className="flex items-center gap-2 text-ink-muted text-[0.78rem]">
|
||||||
|
{icon}
|
||||||
|
<span>{label}</span>
|
||||||
|
</div>
|
||||||
|
<p className="text-2xl text-navy font-semibold tabular-nums mt-1">{value}</p>
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── recent findings ────────────────────────────────────────────────
|
||||||
|
|
||||||
|
function RecentFindings() {
|
||||||
|
const { data, isPending } = useCuratorStats();
|
||||||
|
|
||||||
|
if (isPending) {
|
||||||
|
return <Skeleton className="h-40 w-full" />;
|
||||||
|
}
|
||||||
|
if (!data || data.recent_findings.length === 0) {
|
||||||
|
return (
|
||||||
|
<Card className="bg-rule-soft/40 border-rule">
|
||||||
|
<CardContent className="px-6 py-5 text-center text-ink-muted text-sm">
|
||||||
|
אין עדיין ממצאים של ה-Curator. הוא מופעל אוטומטית כאשר דפנה מסמנת
|
||||||
|
החלטה כסופית (mark-final), ושומר את ממצאיו כ-decision_lessons עם
|
||||||
|
source="curator".
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
return (
|
||||||
|
<Card className="bg-surface border-rule">
|
||||||
|
<CardContent className="px-4 py-3">
|
||||||
|
<h3 className="text-[0.78rem] uppercase tracking-wider text-gold-deep font-semibold mb-3">
|
||||||
|
ממצאים אחרונים של ה-Curator
|
||||||
|
</h3>
|
||||||
|
<ul className="space-y-2">
|
||||||
|
{data.recent_findings.map((f) => (
|
||||||
|
<li key={f.id} className="border-b border-rule pb-2 last:border-0 last:pb-0">
|
||||||
|
<div className="flex items-center gap-2 text-[0.72rem] mb-1">
|
||||||
|
<Badge variant="outline"
|
||||||
|
className="bg-info-bg text-info border-info/40">
|
||||||
|
{f.category}
|
||||||
|
</Badge>
|
||||||
|
<span className="text-navy font-semibold tabular-nums">
|
||||||
|
{f.decision_number || "—"}
|
||||||
|
</span>
|
||||||
|
{f.applied_to_skill && (
|
||||||
|
<Badge variant="outline"
|
||||||
|
className="bg-success-bg text-success border-success/40">
|
||||||
|
<CheckCircle2 className="w-3 h-3 me-0.5" />
|
||||||
|
אומץ
|
||||||
|
</Badge>
|
||||||
|
)}
|
||||||
|
<span className="grow text-ink-muted text-end">
|
||||||
|
<Clock className="w-3 h-3 inline me-1" />
|
||||||
|
{new Date(f.created_at).toLocaleDateString("he-IL")}
|
||||||
|
</span>
|
||||||
|
</div>
|
||||||
|
<p className="text-sm text-ink leading-relaxed">{f.lesson_text}</p>
|
||||||
|
</li>
|
||||||
|
))}
|
||||||
|
</ul>
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── prompts ────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
function CuratorPromptCard() {
|
||||||
|
const { data, isPending, error } = useCuratorPrompt();
|
||||||
|
|
||||||
|
if (isPending) return <Skeleton className="h-96 w-full" />;
|
||||||
|
if (error) {
|
||||||
|
return (
|
||||||
|
<Card className="bg-danger-bg border-danger/40">
|
||||||
|
<CardContent className="px-6 py-4 text-danger">{error.message}</CardContent>
|
||||||
|
</Card>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
if (!data) return null;
|
||||||
|
|
||||||
|
return (
|
||||||
|
<Card className="bg-surface border-rule">
|
||||||
|
<CardContent className="px-5 py-4 space-y-3">
|
||||||
|
<div className="flex items-center justify-between gap-2 flex-wrap">
|
||||||
|
<div>
|
||||||
|
<h3 className="text-navy font-semibold">{data.filename}</h3>
|
||||||
|
<p className="text-[0.72rem] text-ink-muted">
|
||||||
|
{data.bytes.toLocaleString("he-IL")} בייטים ·
|
||||||
|
עודכן: {new Date(data.last_modified * 1000).toLocaleString("he-IL")}
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
<Button asChild variant="outline" size="sm">
|
||||||
|
<a href={data.gitea_url} target="_blank" rel="noopener noreferrer">
|
||||||
|
<ExternalLink className="w-3 h-3 me-1" />
|
||||||
|
ערוך ב-Gitea
|
||||||
|
</a>
|
||||||
|
</Button>
|
||||||
|
</div>
|
||||||
|
<ScrollArea className="h-[520px] pe-2 border border-rule rounded p-3 bg-rule-soft/30">
|
||||||
|
<Markdown content={data.content} />
|
||||||
|
</ScrollArea>
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function StyleAnalyzerPromptCard() {
|
||||||
|
const { data, isPending } = useStyleAnalyzerPrompts();
|
||||||
|
|
||||||
|
if (isPending) return <Skeleton className="h-96 w-full" />;
|
||||||
|
if (!data) return null;
|
||||||
|
|
||||||
|
return (
|
||||||
|
<Card className="bg-surface border-rule">
|
||||||
|
<CardContent className="px-5 py-4 space-y-3">
|
||||||
|
<div>
|
||||||
|
<h3 className="text-navy font-semibold">פרומפטים של style_analyzer.py</h3>
|
||||||
|
<p className="text-[0.72rem] text-ink-muted">
|
||||||
|
רץ ב-Claude Opus (1M context, עד {data.max_input_tokens.toLocaleString("he-IL")} tokens
|
||||||
|
input) דרך claude CLI מקומי — חינמי, ללא API. נקרא ע"י
|
||||||
|
<code className="px-1 mx-1 bg-rule-soft rounded">POST /api/training/analyze-style</code>
|
||||||
|
ומכניס דפוסים ל-<code className="px-1 bg-rule-soft rounded">style_patterns</code>.
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<Tabs defaultValue="analysis" dir="rtl">
|
||||||
|
<TabsList className="bg-rule-soft/60">
|
||||||
|
<TabsTrigger value="analysis">Single-pass (כל הקורפוס)</TabsTrigger>
|
||||||
|
<TabsTrigger value="single">Multi-pass (החלטה אחת)</TabsTrigger>
|
||||||
|
<TabsTrigger value="synthesis">Synthesis</TabsTrigger>
|
||||||
|
</TabsList>
|
||||||
|
<TabsContent value="analysis" className="mt-3">
|
||||||
|
<PromptBlock content={data.analysis_prompt} />
|
||||||
|
</TabsContent>
|
||||||
|
<TabsContent value="single" className="mt-3">
|
||||||
|
<PromptBlock content={data.single_decision_prompt} />
|
||||||
|
</TabsContent>
|
||||||
|
<TabsContent value="synthesis" className="mt-3">
|
||||||
|
<PromptBlock content={data.synthesis_prompt} />
|
||||||
|
</TabsContent>
|
||||||
|
</Tabs>
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function PromptBlock({ content }: { content: string }) {
|
||||||
|
return (
|
||||||
|
<ScrollArea className="h-[420px] pe-2 border border-rule rounded p-3 bg-rule-soft/30">
|
||||||
|
<pre className="text-[0.78rem] whitespace-pre-wrap font-mono text-ink leading-relaxed"
|
||||||
|
dir="rtl">
|
||||||
|
{content}
|
||||||
|
</pre>
|
||||||
|
</ScrollArea>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── propose change form ────────────────────────────────────────────
|
||||||
|
|
||||||
|
function ProposeChangeForm() {
|
||||||
|
const [title, setTitle] = useState("");
|
||||||
|
const [proposedChange, setProposedChange] = useState("");
|
||||||
|
const [rationale, setRationale] = useState("");
|
||||||
|
const submit = useSubmitCuratorProposal();
|
||||||
|
|
||||||
|
const onSubmit = async (e: React.FormEvent) => {
|
||||||
|
e.preventDefault();
|
||||||
|
if (!title.trim() || !proposedChange.trim()) {
|
||||||
|
toast.error("חובה כותרת ושינוי מוצע");
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
try {
|
||||||
|
const r = await submit.mutateAsync({
|
||||||
|
title: title.trim(),
|
||||||
|
proposed_change: proposedChange.trim(),
|
||||||
|
rationale: rationale.trim(),
|
||||||
|
});
|
||||||
|
toast.success(`נשמרה הצעה: ${r.filename}`);
|
||||||
|
setTitle(""); setProposedChange(""); setRationale("");
|
||||||
|
} catch (e) {
|
||||||
|
toast.error(e instanceof Error ? e.message : "כשל בשמירה");
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
return (
|
||||||
|
<Card className="bg-surface border-rule">
|
||||||
|
<CardContent className="px-5 py-4">
|
||||||
|
<h3 className="text-navy font-semibold mb-2">הצעת שינוי לפרומפט ה-Curator</h3>
|
||||||
|
<p className="text-[0.78rem] text-ink-muted mb-4">
|
||||||
|
ההצעה תישמר כקובץ Markdown ב-
|
||||||
|
<code className="px-1 bg-rule-soft rounded">data/curator-proposals/</code>.
|
||||||
|
חיים יבחן ויאשר ידנית — אין שינוי אוטומטי בפרומפט.
|
||||||
|
</p>
|
||||||
|
<form onSubmit={onSubmit} className="space-y-3">
|
||||||
|
<div className="space-y-1">
|
||||||
|
<Label htmlFor="proposal-title">כותרת השינוי</Label>
|
||||||
|
<Input id="proposal-title" value={title}
|
||||||
|
onChange={(e) => setTitle(e.target.value)}
|
||||||
|
placeholder="לדוגמה: הוסף קטגוריה [צ׳קליסט תוכן] לממצאי ה-curator"
|
||||||
|
dir="rtl" />
|
||||||
|
</div>
|
||||||
|
<div className="space-y-1">
|
||||||
|
<Label htmlFor="proposal-change">השינוי המוצע (Markdown)</Label>
|
||||||
|
<Textarea id="proposal-change" value={proposedChange} rows={6}
|
||||||
|
onChange={(e) => setProposedChange(e.target.value)}
|
||||||
|
placeholder={"תאר במדויק מה לשנות. אפשר להעתיק את הקטע הקיים ולסמן ב-strikethrough + להוסיף את החדש."}
|
||||||
|
dir="rtl" />
|
||||||
|
</div>
|
||||||
|
<div className="space-y-1">
|
||||||
|
<Label htmlFor="proposal-rationale">נימוק</Label>
|
||||||
|
<Textarea id="proposal-rationale" value={rationale} rows={3}
|
||||||
|
onChange={(e) => setRationale(e.target.value)}
|
||||||
|
placeholder="למה השינוי הזה חשוב? איזה בעיה הוא פותר?"
|
||||||
|
dir="rtl" />
|
||||||
|
</div>
|
||||||
|
<div className="flex justify-end">
|
||||||
|
<Button type="submit" disabled={submit.isPending}
|
||||||
|
className="bg-navy text-parchment hover:bg-navy-soft">
|
||||||
|
{submit.isPending ? (
|
||||||
|
<Loader2 className="w-4 h-4 animate-spin me-1" />
|
||||||
|
) : (
|
||||||
|
<Send className="w-4 h-4 me-1" />
|
||||||
|
)}
|
||||||
|
שלח הצעה
|
||||||
|
</Button>
|
||||||
|
</div>
|
||||||
|
</form>
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
);
|
||||||
|
}
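Review note: `ProposeChangeForm` requires a non-empty title and proposed change, trims all three fields, and posts them with a snake_case `proposed_change` key. That validation and mapping can be pulled into a pure function that is easy to unit-test, as in the sketch below. `buildProposalPayload` and the two interfaces are hypothetical names; the field names themselves are taken from the `submit.mutateAsync` call in the diff.

```typescript
interface ProposalDraft {
  title: string;
  proposedChange: string;
  rationale: string;
}

interface ProposalPayload {
  title: string;
  proposed_change: string;
  rationale: string;
}

// Returns the trimmed payload, or null when a required field is blank.
// Rationale is optional, matching the form's check on title and change only.
function buildProposalPayload(draft: ProposalDraft): ProposalPayload | null {
  const title = draft.title.trim();
  const proposed_change = draft.proposedChange.trim();
  if (!title || !proposed_change) return null;
  return { title, proposed_change, rationale: draft.rationale.trim() };
}
```

With a helper of this shape, `onSubmit` would reduce to showing the error toast on `null` and passing the payload to the mutation otherwise.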
|
||||||
267
web-ui/src/components/training/lessons-tab.tsx
Normal file
@@ -0,0 +1,267 @@
|
|||||||
|
"use client";
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Per-decision lessons editor — lives inside CorpusDetailDrawer's
|
||||||
|
* "מה למדנו" tab. Lessons are persisted in the decision_lessons table
|
||||||
|
* (one-to-many on style_corpus) and consumed by hermes-curator and
|
||||||
|
* future style_analyzer runs as context.
|
||||||
|
*
|
||||||
|
* The chair can:
|
||||||
|
* - Add a lesson typed manually (category = "general" by default)
|
||||||
|
* - Edit / delete existing lessons
|
||||||
|
* - Mark a lesson as "applied_to_skill" (informational — doesn't
|
||||||
|
* auto-commit anything to SKILL.md; chair still curates that file
|
||||||
|
* manually in git).
|
||||||
|
*
|
||||||
|
* Lessons from the curator arrive with source="curator" and are visually
|
||||||
|
* distinguished by a badge so the chair can audit auto-suggestions.
|
||||||
|
*/
|
||||||
|
|
||||||
|
import { useState } from "react";
|
||||||
|
import { Plus, Save, Trash2, Loader2, CheckCircle2, Sparkles } from "lucide-react";
|
||||||
|
import { toast } from "sonner";
|
||||||
|
import { Button } from "@/components/ui/button";
|
||||||
|
import { Card, CardContent } from "@/components/ui/card";
|
||||||
|
import { Textarea } from "@/components/ui/textarea";
|
||||||
|
import { Badge } from "@/components/ui/badge";
|
||||||
|
import { Skeleton } from "@/components/ui/skeleton";
|
||||||
|
import {
|
||||||
|
Select, SelectContent, SelectItem, SelectTrigger, SelectValue,
|
||||||
|
} from "@/components/ui/select";
|
||||||
|
import {
|
||||||
|
useAddLesson,
|
||||||
|
useCorpusLessons,
|
||||||
|
useDeleteLesson,
|
||||||
|
usePatchLesson,
|
||||||
|
type DecisionLesson,
|
||||||
|
} from "@/lib/api/training";
|
||||||
|
|
||||||
|
const CATEGORIES = [
|
||||||
|
{ value: "general", label: "כללי" },
|
||||||
|
{ value: "style", label: "סגנון" },
|
||||||
|
{ value: "structure", label: "מבנה" },
|
||||||
|
{ value: "lexicon", label: "לקסיקון" },
|
||||||
|
{ value: "tabular", label: "טבלאי" },
|
||||||
|
] as const;
|
||||||
|
|
||||||
|
const SOURCE_BADGE: Record<DecisionLesson["source"], { label: string; cls: string }> = {
|
||||||
|
manual: { label: "ידני", cls: "bg-rule-soft text-ink-soft" },
|
||||||
|
chair: { label: "יו״ר", cls: "bg-gold-wash text-gold-deep" },
|
||||||
|
curator: { label: "Curator", cls: "bg-info-bg text-info" },
|
||||||
|
style_analyzer: { label: "Analyzer", cls: "bg-success-bg text-success" },
|
||||||
|
};
|
||||||
|
|
||||||
|
export function LessonsTab({ corpusId }: { corpusId: string }) {
|
||||||
|
const { data, isPending } = useCorpusLessons(corpusId);
|
||||||
|
const add = useAddLesson(corpusId);
|
||||||
|
const [draftText, setDraftText] = useState("");
|
||||||
|
const [draftCategory, setDraftCategory] = useState<DecisionLesson["category"]>("general");
|
||||||
|
|
||||||
|
const onAdd = async () => {
|
||||||
|
const text = draftText.trim();
|
||||||
|
if (!text) return;
|
||||||
|
try {
|
||||||
|
await add.mutateAsync({ lesson_text: text, category: draftCategory });
|
||||||
|
setDraftText("");
|
||||||
|
setDraftCategory("general");
|
||||||
|
toast.success("הלקח נוסף");
|
||||||
|
} catch (e) {
|
||||||
|
toast.error(e instanceof Error ? e.message : "כשל בשמירה");
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
return (
|
||||||
|
<div className="space-y-4">
|
||||||
|
{/* Composer */}
|
||||||
|
<Card className="bg-surface border-rule">
|
||||||
|
<CardContent className="px-4 py-3 space-y-2">
|
||||||
|
<h4 className="text-[0.78rem] uppercase tracking-wider text-gold-deep font-semibold">
|
||||||
|
הוסף לקח להחלטה
|
||||||
|
</h4>
|
||||||
|
<Textarea
|
||||||
|
value={draftText}
|
||||||
|
onChange={(e) => setDraftText(e.target.value)}
|
||||||
|
placeholder="מה למדנו מההחלטה הזו? למשל: 'דפנה מעדיפה הוצאות מתונות (5K-10K ₪) גם בערר שהתקבל במלואו'"
|
||||||
|
rows={3}
|
||||||
|
dir="rtl"
|
||||||
|
disabled={add.isPending}
|
||||||
|
/>
|
||||||
|
<div className="flex items-center gap-2">
|
||||||
|
<Select
|
||||||
|
value={draftCategory}
|
||||||
|
onValueChange={(v) => setDraftCategory(v as DecisionLesson["category"])}
|
||||||
|
disabled={add.isPending}
|
||||||
|
dir="rtl"
|
||||||
|
>
|
||||||
|
<SelectTrigger className="w-40">
|
||||||
|
<SelectValue />
|
||||||
|
</SelectTrigger>
|
||||||
|
<SelectContent>
|
||||||
|
{CATEGORIES.map((c) => (
|
||||||
|
<SelectItem key={c.value} value={c.value}>{c.label}</SelectItem>
|
||||||
|
))}
|
||||||
|
</SelectContent>
|
||||||
|
</Select>
|
||||||
|
<div className="grow" />
|
||||||
|
<Button onClick={onAdd} disabled={add.isPending || !draftText.trim()}
|
||||||
|
className="bg-navy text-parchment hover:bg-navy-soft">
|
||||||
|
{add.isPending ? (
|
||||||
|
<Loader2 className="w-4 h-4 animate-spin me-1" />
|
||||||
|
) : (
|
||||||
|
<Plus className="w-4 h-4 me-1" />
|
||||||
|
)}
|
||||||
|
שמור לקח
|
||||||
|
</Button>
|
||||||
|
</div>
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
|
||||||
|
{/* List */}
|
||||||
|
{isPending ? (
|
||||||
|
<div className="space-y-2">
|
||||||
|
{[...Array(3)].map((_, i) => (
|
||||||
|
<Skeleton key={i} className="h-16 w-full" />
|
||||||
|
))}
|
||||||
|
</div>
|
||||||
|
) : !data || data.length === 0 ? (
|
||||||
|
<p className="text-center text-ink-muted text-sm py-6">
|
||||||
|
אין עדיין לקחים להחלטה זו. הוסף לקח ראשון מלמעלה.
|
||||||
|
</p>
|
||||||
|
) : (
|
||||||
|
<div className="space-y-2">
|
||||||
|
{data.map((lesson) => (
|
||||||
|
<LessonItem key={lesson.id} lesson={lesson} corpusId={corpusId} />
|
||||||
|
))}
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function LessonItem({
|
||||||
|
lesson, corpusId,
|
||||||
|
}: { lesson: DecisionLesson; corpusId: string }) {
|
||||||
|
const [editing, setEditing] = useState(false);
|
||||||
|
const [text, setText] = useState(lesson.lesson_text);
|
||||||
|
const [category, setCategory] = useState<DecisionLesson["category"]>(lesson.category);
|
||||||
|
const patch = usePatchLesson(corpusId);
|
||||||
|
const del = useDeleteLesson(corpusId);
|
||||||
|
|
||||||
|
const sourceBadge = SOURCE_BADGE[lesson.source];
|
||||||
|
const dirty = text !== lesson.lesson_text || category !== lesson.category;
|
||||||
|
|
||||||
|
const onSave = async () => {
// Nothing changed: leave edit mode without sending an empty PATCH.
if (!dirty) { setEditing(false); return; }
|
||||||
|
try {
|
||||||
|
await patch.mutateAsync({
|
||||||
|
id: lesson.id,
|
||||||
|
patch: dirty ? { lesson_text: text, category } : {},
|
||||||
|
});
|
||||||
|
setEditing(false);
|
||||||
|
toast.success("הלקח עודכן");
|
||||||
|
} catch (e) {
|
||||||
|
toast.error(e instanceof Error ? e.message : "כשל בעדכון");
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
const onToggleApplied = async () => {
|
||||||
|
try {
|
||||||
|
await patch.mutateAsync({
|
||||||
|
id: lesson.id,
|
||||||
|
patch: { applied_to_skill: !lesson.applied_to_skill },
|
||||||
|
});
|
||||||
|
} catch (e) {
|
||||||
|
toast.error(e instanceof Error ? e.message : "כשל בעדכון");
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
const onDelete = async () => {
|
||||||
|
if (!window.confirm("למחוק את הלקח?")) return;
|
||||||
|
try {
|
||||||
|
await del.mutateAsync(lesson.id);
|
||||||
|
toast.success("נמחק");
|
||||||
|
} catch (e) {
|
||||||
|
toast.error(e instanceof Error ? e.message : "כשל במחיקה");
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
return (
|
||||||
|
<Card className="bg-surface border-rule">
|
||||||
|
<CardContent className="px-4 py-3 space-y-2">
|
||||||
|
<div className="flex items-center gap-2 text-[0.72rem]">
|
||||||
|
<Badge variant="outline"
|
||||||
|
className="bg-rule-soft text-ink-soft">
|
||||||
|
{CATEGORIES.find((c) => c.value === lesson.category)?.label || lesson.category}
|
||||||
|
</Badge>
|
||||||
|
<Badge variant="outline" className={sourceBadge.cls}>
|
||||||
|
{sourceBadge.label}
|
||||||
|
</Badge>
|
||||||
|
{lesson.applied_to_skill && (
|
||||||
|
<Badge variant="outline"
|
||||||
|
className="bg-success-bg text-success border-success/40">
|
||||||
|
<CheckCircle2 className="w-3 h-3 me-1" />
|
||||||
|
אומץ
|
||||||
|
</Badge>
|
||||||
|
)}
|
||||||
|
<span className="grow text-ink-muted tabular-nums">
|
||||||
|
{new Date(lesson.created_at).toLocaleDateString("he-IL")}
|
||||||
|
</span>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
{editing ? (
|
||||||
|
<>
|
||||||
|
<Textarea value={text} onChange={(e) => setText(e.target.value)}
|
||||||
|
rows={3} dir="rtl" />
|
||||||
|
<div className="flex items-center gap-2">
|
||||||
|
<Select value={category}
|
||||||
|
onValueChange={(v) => setCategory(v as DecisionLesson["category"])}
|
||||||
|
dir="rtl">
|
||||||
|
<SelectTrigger className="w-40">
|
||||||
|
<SelectValue />
|
||||||
|
</SelectTrigger>
|
||||||
|
<SelectContent>
|
||||||
|
{CATEGORIES.map((c) => (
|
||||||
|
<SelectItem key={c.value} value={c.value}>{c.label}</SelectItem>
|
||||||
|
))}
|
||||||
|
</SelectContent>
|
||||||
|
</Select>
|
||||||
|
<div className="grow" />
|
||||||
|
<Button variant="ghost" size="sm"
|
||||||
|
onClick={() => { setEditing(false); setText(lesson.lesson_text); setCategory(lesson.category); }}>
|
||||||
|
ביטול
|
||||||
|
</Button>
|
||||||
|
<Button size="sm" onClick={onSave} disabled={patch.isPending}
|
||||||
|
className="bg-navy text-parchment hover:bg-navy-soft">
|
||||||
|
<Save className="w-3 h-3 me-1" />
|
||||||
|
שמור
|
||||||
|
</Button>
|
||||||
|
</div>
|
||||||
|
</>
|
||||||
|
) : (
|
||||||
|
<>
|
||||||
|
<p className="text-sm text-ink leading-relaxed whitespace-pre-wrap"
|
||||||
|
onClick={() => setEditing(true)}
|
||||||
|
style={{ cursor: "text" }}>
|
||||||
|
{lesson.lesson_text}
|
||||||
|
</p>
|
||||||
|
<div className="flex items-center gap-2">
|
||||||
|
<Button variant="ghost" size="sm" onClick={onToggleApplied}
|
||||||
|
disabled={patch.isPending}>
|
||||||
|
<Sparkles className="w-3 h-3 me-1" />
|
||||||
|
{lesson.applied_to_skill ? "בטל סימון 'אומץ'" : "סמן כ'אומץ ל-SKILL'"}
|
||||||
|
</Button>
|
||||||
|
<Button variant="ghost" size="sm" onClick={() => setEditing(true)}>
|
||||||
|
ערוך
|
||||||
|
</Button>
|
||||||
|
<div className="grow" />
|
||||||
|
<Button variant="ghost" size="sm" onClick={onDelete}
|
||||||
|
disabled={del.isPending}
|
||||||
|
className="text-danger hover:text-danger hover:bg-danger-bg">
|
||||||
|
<Trash2 className="w-3 h-3" />
|
||||||
|
</Button>
|
||||||
|
</div>
|
||||||
|
</>
|
||||||
|
)}
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
);
|
||||||
|
}
|
||||||
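Review note: `LessonItem` computes `dirty` across both editable fields and sends `lesson_text` and `category` together. As a minimal sketch (not part of this commit), a field-level diff could send only what changed and signal when there is nothing to save. The names `LessonDraft`, `LessonPatch`, and `buildLessonPatch` are hypothetical, and trimming the text before comparison is an assumption that mirrors what the composer does in `onAdd`.

```typescript
// Hypothetical shapes mirroring the two fields the edit form exposes.
type LessonCategory = "general" | "style" | "structure" | "lexicon" | "tabular";
type LessonDraft = { lesson_text: string; category: LessonCategory };
type LessonPatch = Partial<LessonDraft>;

// Returns only the fields that differ from the saved lesson,
// or null when there is nothing to send.
function buildLessonPatch(saved: LessonDraft, draft: LessonDraft): LessonPatch | null {
  const patch: LessonPatch = {};
  // Assumption: surrounding whitespace is not meaningful in a lesson.
  const text = draft.lesson_text.trim();
  if (text !== saved.lesson_text) patch.lesson_text = text;
  if (draft.category !== saved.category) patch.category = draft.category;
  return Object.keys(patch).length > 0 ? patch : null;
}
```

A caller could skip the mutation entirely when the helper returns `null`, which keeps the "updated" toast for real changes only.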
328
web-ui/src/components/training/upload-dialog.tsx
Normal file
@@ -0,0 +1,328 @@
|
|||||||
|
"use client";
|
||||||
|
|
||||||
|
/*
|
||||||
|
* Upload a Daphna decision into the style corpus, from the /training page.
|
||||||
|
*
|
||||||
|
* The flow is three explicit steps inside the same sheet:
|
||||||
|
* 1. file picker → POST /api/upload (gets sanitized filename)
|
||||||
|
* 2. preview → POST /api/training/analyze (proofread + auto-extracted meta)
|
||||||
|
* chair can correct decision_number / decision_date / subjects
|
||||||
|
* 3. commit → POST /api/training/upload (background task)
|
||||||
|
* progress watched via SSE; on completion we invalidate
|
||||||
|
* corpus + style-report so the new row appears.
|
||||||
|
*
|
||||||
|
* The Sheet UX mirrors precedent-upload-sheet.tsx: same dir="rtl", same
|
||||||
|
* loading + error patterns, same toast on success. The reason this isn't
|
||||||
|
* a single one-click upload is that style-corpus rows are write-once
|
||||||
|
* (we don't allow editing full_text), so the chair MUST see the proofread
|
||||||
|
* preview before committing — otherwise a bad OCR/proofread can silently
|
||||||
|
* pollute the style portrait.
|
||||||
|
*/
|
||||||
|
|
||||||
|
import { useEffect, useState } from "react";
|
||||||
|
import { Upload, Loader2, CheckCircle2, AlertCircle, FileText } from "lucide-react";
|
||||||
|
import { toast } from "sonner";
|
||||||
|
import { useQueryClient } from "@tanstack/react-query";
|
||||||
|
import {
|
||||||
|
Sheet, SheetContent, SheetHeader, SheetTitle, SheetDescription,
|
||||||
|
} from "@/components/ui/sheet";
|
||||||
|
import { Button } from "@/components/ui/button";
|
||||||
|
import { Input } from "@/components/ui/input";
|
||||||
|
import { Label } from "@/components/ui/label";
|
||||||
|
import { Progress } from "@/components/ui/progress";
|
||||||
|
import { Badge } from "@/components/ui/badge";
|
||||||
|
import {
|
||||||
|
trainingKeys,
|
||||||
|
useAnalyzeTraining,
|
||||||
|
useCommitTrainingUpload,
|
||||||
|
useUploadFile,
|
||||||
|
type AnalyzeTrainingResponse,
|
||||||
|
} from "@/lib/api/training";
|
||||||
|
import { useProgress } from "@/lib/api/documents";
|
||||||
|
|
||||||
|
const ACCEPT = ".pdf,.docx,.doc,.rtf,.txt,.md";
|
||||||
|
|
||||||
|
type Props = {
|
||||||
|
open: boolean;
|
||||||
|
onOpenChange: (open: boolean) => void;
|
||||||
|
};
|
||||||
|
|
||||||
|
type Stage = "pick" | "analyzing" | "preview" | "committing" | "done" | "error";
|
||||||
|
|
||||||
|
export function TrainingUploadDialog({ open, onOpenChange }: Props) {
|
||||||
|
const [stage, setStage] = useState<Stage>("pick");
|
||||||
|
const [file, setFile] = useState<File | null>(null);
|
||||||
|
const [analysis, setAnalysis] = useState<AnalyzeTrainingResponse | null>(null);
|
||||||
|
// editable copies of the auto-extracted metadata
|
||||||
|
const [decisionNumber, setDecisionNumber] = useState("");
|
||||||
|
const [decisionDate, setDecisionDate] = useState("");
|
||||||
|
const [subjectsRaw, setSubjectsRaw] = useState("");
|
||||||
|
const [title, setTitle] = useState("");
|
||||||
|
const [taskId, setTaskId] = useState<string | null>(null);
|
||||||
|
const [errorMsg, setErrorMsg] = useState("");
|
||||||
|
|
||||||
|
const uploadFile = useUploadFile();
|
||||||
|
const analyze = useAnalyzeTraining();
|
||||||
|
const commit = useCommitTrainingUpload();
|
||||||
|
const progress = useProgress(taskId);
|
||||||
|
const qc = useQueryClient();
|
||||||
|
|
||||||
|
// Reset everything when the sheet closes — important because Sheet keeps
|
||||||
|
// the component mounted between opens. The cascade-render warning is the
|
||||||
|
// intended behavior (reset is the side effect we want).
|
||||||
|
useEffect(() => {
|
||||||
|
if (open) return;
|
||||||
|
/* eslint-disable react-hooks/set-state-in-effect */
|
||||||
|
setStage("pick"); setFile(null); setAnalysis(null);
|
||||||
|
setDecisionNumber(""); setDecisionDate(""); setSubjectsRaw("");
|
||||||
|
setTitle(""); setTaskId(null); setErrorMsg("");
|
||||||
|
/* eslint-enable react-hooks/set-state-in-effect */
|
||||||
|
}, [open]);
|
||||||
|
|
||||||
|
// Watch background task. When complete, invalidate corpus + report so the
|
||||||
|
// new row + updated stats show up automatically. The setStage call here
|
||||||
|
// is the deliberate UX (success card → auto-close) — synchronizing UI
|
||||||
|
// with the external SSE stream is exactly what effects are for.
|
||||||
|
useEffect(() => {
|
||||||
|
if (!progress) return;
|
||||||
|
if (progress.status === "completed") {
|
||||||
|
qc.invalidateQueries({ queryKey: trainingKeys.corpus() });
|
||||||
|
qc.invalidateQueries({ queryKey: trainingKeys.report() });
|
||||||
|
// eslint-disable-next-line react-hooks/set-state-in-effect
|
||||||
|
setStage("done");
|
||||||
|
toast.success(`החלטה ${decisionNumber || analysis?.decision_number || ""} נוספה לקורפוס`);
|
||||||
|
const t = window.setTimeout(() => onOpenChange(false), 1500);
|
||||||
|
return () => window.clearTimeout(t);
|
||||||
|
}
|
||||||
|
if (progress.status === "failed") {
|
||||||
|
setStage("error");
|
||||||
|
setErrorMsg(progress.error || "כשל בעיבוד");
|
||||||
|
}
|
||||||
|
}, [progress, analysis, decisionNumber, qc, onOpenChange]);
|
||||||
|
|
||||||
|
const onPickFile = async (f: File | null) => {
|
||||||
|
setFile(f);
|
||||||
|
setErrorMsg("");
|
||||||
|
if (!f) return;
|
||||||
|
setStage("analyzing");
|
||||||
|
try {
|
||||||
|
const { filename } = await uploadFile.mutateAsync(f);
|
||||||
|
const result = await analyze.mutateAsync(filename);
|
||||||
|
setAnalysis(result);
|
||||||
|
setDecisionNumber(result.decision_number);
|
||||||
|
setDecisionDate(result.decision_date);
|
||||||
|
setSubjectsRaw(result.subject_categories.join(", "));
|
||||||
|
// Default title from the original filename stem (chair can override).
|
||||||
|
const stem = f.name.replace(/\.[^.]+$/, "");
|
||||||
|
setTitle(stem);
|
||||||
|
setStage("preview");
|
||||||
|
} catch (e) {
|
||||||
|
setStage("error");
|
||||||
|
setErrorMsg(e instanceof Error ? e.message : "כשל בקריאת הקובץ");
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
const onCommit = async () => {
|
||||||
|
if (!analysis) return;
|
||||||
|
setStage("committing");
|
||||||
|
setErrorMsg("");
|
||||||
|
try {
|
||||||
|
const subjects = subjectsRaw
|
||||||
|
.split(/[,،]/)
|
||||||
|
.map((s) => s.trim())
|
||||||
|
.filter(Boolean);
|
||||||
|
const res = await commit.mutateAsync({
|
||||||
|
filename: analysis.filename,
|
||||||
|
decision_number: decisionNumber.trim(),
|
||||||
|
decision_date: decisionDate || "",
|
||||||
|
subject_categories: subjects,
|
||||||
|
title: title.trim() || undefined,
|
||||||
|
});
|
||||||
|
setTaskId(res.task_id);
|
||||||
|
} catch (e) {
|
||||||
|
setStage("error");
|
||||||
|
// 409 = duplicate decision_number — surface the backend's Hebrew message.
|
||||||
|
setErrorMsg(e instanceof Error ? e.message : "כשל בהעלאה");
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
const isProcessing =
|
||||||
|
stage === "analyzing" || stage === "committing" ||
|
||||||
|
(taskId !== null && progress?.status !== "completed" && progress?.status !== "failed");
|
||||||
|
const progressStep = (progress as { step?: string } | null)?.step;
|
||||||
|
|
||||||
|
return (
|
||||||
|
<Sheet open={open} onOpenChange={onOpenChange}>
|
||||||
|
<SheetContent side="left" className="w-full sm:max-w-2xl overflow-y-auto" dir="rtl">
|
||||||
|
<SheetHeader>
|
||||||
|
<SheetTitle className="text-navy">העלאת החלטה לקורפוס הסגנון</SheetTitle>
|
||||||
|
<SheetDescription className="text-ink-muted">
|
||||||
|
הקובץ יעבור הגהה (סינון Nevo, ניקוד), חילוץ אוטומטי של מספר תיק, תאריך
|
||||||
|
ונושאים, ויוטמע ב-style_corpus עם chunks ו-embeddings. תוכל לתקן את
|
||||||
|
פרטי המטא-דאטה לפני שמירה.
|
||||||
|
</SheetDescription>
|
||||||
|
</SheetHeader>
|
||||||
|
|
||||||
|
<div className="px-6 pb-6 mt-4 space-y-4">
|
||||||
|
{/* Stage 1: pick a file */}
|
||||||
|
{stage === "pick" && (
|
||||||
|
<div className="space-y-2">
|
||||||
|
<Label htmlFor="t-file">קובץ ההחלטה (PDF / DOCX / DOC / RTF / TXT / MD)</Label>
|
||||||
|
<Input
|
||||||
|
id="t-file" type="file" accept={ACCEPT}
|
||||||
|
onChange={(e) => onPickFile(e.target.files?.[0] ?? null)}
|
||||||
|
/>
|
||||||
|
<p className="text-[0.78rem] text-ink-muted">
|
||||||
|
המערכת תחלץ מהקובץ את מספר התיק, התאריך והנושאים. תוכל לערוך
|
||||||
|
לפני השמירה.
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
{/* Stage 2: analyzing the file */}
|
||||||
|
{stage === "analyzing" && (
|
||||||
|
<div className="rounded-lg border border-rule bg-rule-soft/40 p-6 space-y-2 text-center">
|
||||||
|
<Loader2 className="w-5 h-5 animate-spin mx-auto text-navy" />
|
||||||
|
<p className="text-sm text-navy">מבצע הגהה וחילוץ מטא-דאטה…</p>
|
||||||
|
<p className="text-[0.78rem] text-ink-muted">
|
||||||
|
{file?.name}
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
{/* Stage 3: preview + editable metadata */}
|
||||||
|
{stage === "preview" && analysis && (
|
||||||
|
<form
|
||||||
|
className="space-y-4"
|
||||||
|
onSubmit={(e) => { e.preventDefault(); onCommit(); }}
|
||||||
|
>
|
||||||
|
<div className="rounded-lg border border-rule bg-surface px-4 py-3">
|
||||||
|
<h3 className="text-[0.78rem] uppercase tracking-wider text-gold-deep font-semibold mb-2">
|
||||||
|
תצוגה מקדימה של הטקסט הנקי
|
||||||
|
</h3>
|
||||||
|
<p className="text-sm text-ink leading-relaxed line-clamp-6 whitespace-pre-wrap">
|
||||||
|
{analysis.preview}
|
||||||
|
</p>
|
||||||
|
<div className="mt-2 flex items-center gap-3 text-[0.72rem] text-ink-muted tabular-nums">
|
||||||
|
<span className="flex items-center gap-1">
|
||||||
|
<FileText className="w-3 h-3" />
|
||||||
|
{analysis.chars.toLocaleString("he-IL")} תווים
|
||||||
|
</span>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div className="grid grid-cols-2 gap-3">
|
||||||
|
<div className="space-y-1">
|
||||||
|
<Label htmlFor="t-decision-number">מספר ההחלטה</Label>
|
||||||
|
<Input
|
||||||
|
id="t-decision-number"
|
||||||
|
value={decisionNumber}
|
||||||
|
onChange={(e) => setDecisionNumber(e.target.value)}
|
||||||
|
placeholder="1130-25"
|
||||||
|
dir="rtl"
|
||||||
|
/>
|
||||||
|
</div>
|
||||||
|
<div className="space-y-1">
|
||||||
|
<Label htmlFor="t-decision-date">תאריך ההחלטה</Label>
|
||||||
|
<Input
|
||||||
|
id="t-decision-date" type="date"
|
||||||
|
value={decisionDate}
|
||||||
|
onChange={(e) => setDecisionDate(e.target.value)}
|
||||||
|
/>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div className="space-y-1">
|
||||||
|
<Label htmlFor="t-title">כותרת קצרה (אופציונלי)</Label>
|
||||||
|
<Input
|
||||||
|
id="t-title" value={title}
|
||||||
|
onChange={(e) => setTitle(e.target.value)}
|
||||||
|
placeholder="ARAR-25-1130 - כרמל יצחק" dir="rtl"
|
||||||
|
/>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div className="space-y-1">
|
||||||
|
<Label htmlFor="t-subjects">נושאים (מופרדים בפסיקים)</Label>
|
||||||
|
<Input
|
||||||
|
id="t-subjects" value={subjectsRaw}
|
||||||
|
onChange={(e) => setSubjectsRaw(e.target.value)}
|
||||||
|
placeholder="חניה, קווי בניין, שימוש חורג" dir="rtl"
|
||||||
|
/>
|
||||||
|
{analysis.subject_categories.length > 0 && (
|
||||||
|
<div className="flex flex-wrap gap-1 mt-1">
|
||||||
|
<span className="text-[0.72rem] text-ink-muted">חולץ אוטומטית:</span>
|
||||||
|
{analysis.subject_categories.map((s) => (
|
||||||
|
<Badge key={s} variant="outline"
|
||||||
|
className="text-[0.7rem] bg-gold-wash text-gold-deep border-gold/40">
|
||||||
|
{s}
|
||||||
|
</Badge>
|
||||||
|
))}
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
|
||||||
|
{errorMsg && (
|
||||||
|
<div className="rounded-lg border border-danger/40 bg-danger-bg p-3 flex items-center gap-2 text-danger text-sm">
|
||||||
|
<AlertCircle className="w-4 h-4 shrink-0" />
|
||||||
|
{errorMsg}
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
<div className="flex gap-2 justify-end pt-2">
|
||||||
|
<Button type="button" variant="ghost"
|
||||||
|
onClick={() => onOpenChange(false)}
|
||||||
|
disabled={isProcessing}>
|
||||||
|
ביטול
|
||||||
|
</Button>
|
||||||
|
<Button type="submit" disabled={isProcessing || !decisionNumber.trim()}
|
||||||
|
className="bg-navy text-parchment hover:bg-navy-soft">
|
||||||
|
<Upload className="w-4 h-4 me-1" />
|
||||||
|
שמור בקורפוס
|
||||||
|
</Button>
|
||||||
|
</div>
|
||||||
|
</form>
|
||||||
|
)}
|
||||||
|
|
||||||
|
{/* Stage 4: committing — background task progress */}
|
||||||
|
{(stage === "committing" || (taskId && stage !== "done" && stage !== "error")) && (
|
||||||
|
<div className="rounded-lg border border-rule bg-rule-soft/40 p-4 space-y-2">
|
||||||
|
<div className="flex items-center gap-2 text-sm text-navy">
|
||||||
|
<Loader2 className="w-4 h-4 animate-spin" />
|
||||||
|
<span>{progressStep || "מעבד את ההחלטה לקורפוס"}</span>
|
||||||
|
</div>
|
||||||
|
<Progress value={progressStep ? 60 : 30} className="h-1.5" />
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
{/* Stage 5: success */}
|
||||||
|
{stage === "done" && (
|
||||||
|
<div className="rounded-lg border border-gold/40 bg-gold-wash p-4 flex items-center gap-2 text-gold-deep text-sm">
|
||||||
|
<CheckCircle2 className="w-4 h-4" />
|
||||||
|
ההחלטה נוספה לקורפוס בהצלחה.
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
{/* Stage 6: error (after a failed analyze or upload) */}
|
||||||
|
{stage === "error" && (
|
||||||
|
<div className="space-y-3">
|
||||||
|
<div className="rounded-lg border border-danger/40 bg-danger-bg p-4 flex items-center gap-2 text-danger text-sm">
|
||||||
|
<AlertCircle className="w-4 h-4 shrink-0" />
|
||||||
|
{errorMsg || "שגיאה לא ידועה"}
|
||||||
|
</div>
|
||||||
|
<div className="flex gap-2 justify-end">
|
||||||
|
<Button type="button" variant="ghost"
|
||||||
|
onClick={() => onOpenChange(false)}>
|
||||||
|
סגור
|
||||||
|
</Button>
|
||||||
|
<Button type="button"
|
||||||
|
onClick={() => { setStage("pick"); setErrorMsg(""); setFile(null); }}>
|
||||||
|
נסה קובץ אחר
|
||||||
|
</Button>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
</SheetContent>
|
||||||
|
</Sheet>
|
||||||
|
);
|
||||||
|
}
|
||||||
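Review note: two small transformations in `TrainingUploadDialog` are easy to check in isolation. `onCommit` splits the subjects field on either an ASCII or an Arabic comma, and `onPickFile` derives the default title by dropping the last extension from the filename. The sketch below lifts both into pure functions. The names `parseSubjects` and `titleFromFilename` are hypothetical, while the regular expressions are the ones used in the component.

```typescript
// Split a free-text subject list on "," or "،", trim each entry,
// and drop empty entries (for example from a trailing comma).
function parseSubjects(raw: string): string[] {
  return raw
    .split(/[,،]/)
    .map((s) => s.trim())
    .filter(Boolean);
}

// Remove only the final extension, so inner dots in the name survive.
function titleFromFilename(name: string): string {
  return name.replace(/\.[^.]+$/, "");
}

console.log(parseSubjects("parking, setbacks،  variance ,"));
console.log(titleFromFilename("ARAR-25-1130.final.pdf"));
```

Keeping these as standalone functions would let them be unit-tested without rendering the sheet.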
@@ -7,10 +7,13 @@
|
|||||||
* - GET /corpus → flat list of decisions for the corpus tab / compare tool
|
* - GET /corpus → flat list of decisions for the corpus tab / compare tool
|
||||||
* - GET /compare?a=UUID&b=UUID → side-by-side comparison
|
* - GET /compare?a=UUID&b=UUID → side-by-side comparison
|
||||||
* - DELETE /corpus/{id} → remove a decision from the corpus
|
* - DELETE /corpus/{id} → remove a decision from the corpus
|
||||||
|
* - POST /api/upload → multipart file → returns sanitized filename
|
||||||
|
* - POST /analyze → proofread + extract metadata for preview
|
||||||
|
* - POST /upload → commit a proofread decision to the corpus (task_id)
|
||||||
*/
|
*/
|
||||||
|
|
||||||
import { useMutation, useQuery, useQueryClient } from "@tanstack/react-query";
|
import { useMutation, useQuery, useQueryClient } from "@tanstack/react-query";
|
||||||
import { apiRequest } from "./client";
|
import { ApiError, apiRequest } from "./client";
|
||||||
|
|
||||||
export type StyleReport = {
|
export type StyleReport = {
|
||||||
corpus: {
|
corpus: {
|
||||||
@@ -69,6 +72,29 @@ export type CorpusDecision = {
|
|||||||
subject_categories: string[];
|
subject_categories: string[];
|
||||||
chars: number;
|
chars: number;
|
||||||
created_at: string;
|
created_at: string;
|
||||||
|
// Enriched metadata (added in the corpus-page upgrade).
|
||||||
|
summary: string;
|
||||||
|
outcome: string;
|
||||||
|
key_principles: string[];
|
||||||
|
appeal_subtype: string;
|
||||||
|
practice_area: string;
|
||||||
|
page_count: number;
|
||||||
|
document_id: string | null;
|
||||||
|
doc_title: string;
|
||||||
|
parties: { appellant: string; respondent: string };
|
||||||
|
legal_citation: string;
|
||||||
|
lessons_count: number;
|
||||||
|
};
|
||||||
|
|
||||||
|
export type CorpusDecisionPatch = {
|
||||||
|
decision_number?: string;
|
||||||
|
decision_date?: string;
|
||||||
|
subject_categories?: string[];
|
||||||
|
summary?: string;
|
||||||
|
outcome?: string;
|
||||||
|
key_principles?: string[];
|
||||||
|
appeal_subtype?: string;
|
||||||
|
practice_area?: string;
|
||||||
};
|
};
|
||||||
|
|
||||||
export type CompareResult = {
|
export type CompareResult = {
|
||||||
@@ -149,3 +175,407 @@ export function useDeleteCorpusEntry() {
|
|||||||
},
|
},
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// ── Style-agent chat ─────────────────────────────────────────────
|
||||||
|
|
||||||
|
export type ChatConversation = {
|
||||||
|
id: string;
|
||||||
|
title: string;
|
||||||
|
style_corpus_id: string | null;
|
||||||
|
decision_number: string;
|
||||||
|
claude_session_id: string | null;
|
||||||
|
message_count: number;
|
||||||
|
created_at: string;
|
||||||
|
last_message_at: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
export type ChatMessage = {
|
||||||
|
id: string;
|
||||||
|
role: "user" | "assistant";
|
||||||
|
content: string;
|
||||||
|
created_at: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
export type ChatHealth = {
|
||||||
|
reachable: boolean;
|
||||||
|
status?: number;
|
||||||
|
url: string;
|
||||||
|
error?: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
export const chatKeys = {
|
||||||
|
conversations: () => [...trainingKeys.all, "chat", "conversations"] as const,
|
||||||
|
conversation: (id: string) =>
|
||||||
|
[...trainingKeys.all, "chat", "conversations", id] as const,
|
||||||
|
health: () => [...trainingKeys.all, "chat", "health"] as const,
|
||||||
|
};
|
||||||
|
|
||||||
|
export function useChatConversations() {
|
||||||
|
return useQuery({
|
||||||
|
queryKey: chatKeys.conversations(),
|
||||||
|
queryFn: ({ signal }) =>
|
||||||
|
apiRequest<ChatConversation[]>("/api/training/chat/conversations", { signal }),
|
||||||
|
staleTime: 15_000,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useChatConversation(convId: string | null) {
|
||||||
|
return useQuery({
|
||||||
|
queryKey: chatKeys.conversation(convId ?? ""),
|
||||||
|
queryFn: ({ signal }) =>
|
||||||
|
apiRequest<{ conversation: ChatConversation; messages: ChatMessage[] }>(
|
||||||
|
`/api/training/chat/conversations/${encodeURIComponent(convId!)}`,
|
||||||
|
{ signal },
|
||||||
|
),
|
||||||
|
enabled: Boolean(convId),
|
||||||
|
staleTime: 5_000,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useChatHealth() {
|
||||||
|
return useQuery({
|
||||||
|
queryKey: chatKeys.health(),
|
||||||
|
queryFn: ({ signal }) =>
|
||||||
|
apiRequest<ChatHealth>("/api/training/chat/health", { signal }),
|
||||||
|
staleTime: 30_000,
|
||||||
|
retry: false,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useCreateChat() {
|
||||||
|
const qc = useQueryClient();
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: (body: { title?: string; style_corpus_id?: string | null }) =>
|
||||||
|
apiRequest<ChatConversation>("/api/training/chat/conversations", {
|
||||||
|
method: "POST",
|
||||||
|
body,
|
||||||
|
}),
|
||||||
|
onSuccess: () => {
|
||||||
|
qc.invalidateQueries({ queryKey: chatKeys.conversations() });
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useDeleteChat() {
|
||||||
|
const qc = useQueryClient();
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: (id: string) =>
|
||||||
|
apiRequest<{ deleted: boolean }>(
|
||||||
|
`/api/training/chat/conversations/${encodeURIComponent(id)}`,
|
||||||
|
{ method: "DELETE" },
|
||||||
|
),
|
||||||
|
onSuccess: () => {
|
||||||
|
qc.invalidateQueries({ queryKey: chatKeys.conversations() });
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Curator portrait ──────────────────────────────────────────────
|
||||||
|
|
||||||
|
export type CuratorPrompt = {
|
||||||
|
content: string;
|
||||||
|
filename: string;
|
||||||
|
bytes: number;
|
||||||
|
last_modified: number;
|
||||||
|
gitea_url: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
export type StyleAnalyzerPrompts = {
|
||||||
|
analysis_prompt: string;
|
||||||
|
single_decision_prompt: string;
|
||||||
|
synthesis_prompt: string;
|
||||||
|
max_input_tokens: number;
|
||||||
|
};
|
||||||
|
|
||||||
|
export type CuratorFinding = {
|
||||||
|
id: string;
|
||||||
|
lesson_text: string;
|
||||||
|
category: string;
|
||||||
|
applied_to_skill: boolean;
|
||||||
|
decision_number: string;
|
||||||
|
decision_date: string;
|
||||||
|
created_at: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
export type CuratorStats = {
|
||||||
|
total_findings: number;
|
||||||
|
decisions_with_findings: number;
|
||||||
|
decisions_total: number;
|
||||||
|
findings_applied: number;
|
||||||
|
recent_findings: CuratorFinding[];
|
||||||
|
};
|
||||||
|
|
||||||
|
export type CuratorProposalInput = {
|
||||||
|
title: string;
|
||||||
|
proposed_change: string;
|
||||||
|
rationale: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
export type CuratorProposalFile = {
|
||||||
|
filename: string;
|
||||||
|
bytes: number;
|
||||||
|
modified_at: number;
|
||||||
|
};
|
||||||
|
|
||||||
|
export const curatorKeys = {
|
||||||
|
prompt: () => [...trainingKeys.all, "curator", "prompt"] as const,
|
||||||
|
analyzerPrompt: () => [...trainingKeys.all, "curator", "analyzer-prompt"] as const,
|
||||||
|
stats: () => [...trainingKeys.all, "curator", "stats"] as const,
|
||||||
|
proposals: () => [...trainingKeys.all, "curator", "proposals"] as const,
|
||||||
|
};
|
||||||
|
|
||||||
|
export function useCuratorPrompt() {
|
||||||
|
return useQuery({
|
||||||
|
queryKey: curatorKeys.prompt(),
|
||||||
|
queryFn: ({ signal }) =>
|
||||||
|
apiRequest<CuratorPrompt>("/api/training/curator/prompt", { signal }),
|
||||||
|
staleTime: 5 * 60_000,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useStyleAnalyzerPrompts() {
|
||||||
|
return useQuery({
|
||||||
|
queryKey: curatorKeys.analyzerPrompt(),
|
||||||
|
queryFn: ({ signal }) =>
|
||||||
|
apiRequest<StyleAnalyzerPrompts>(
|
||||||
|
"/api/training/curator/style-analyzer-prompt",
|
||||||
|
{ signal },
|
||||||
|
),
|
||||||
|
staleTime: 5 * 60_000,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useCuratorStats() {
|
||||||
|
return useQuery({
|
||||||
|
queryKey: curatorKeys.stats(),
|
||||||
|
queryFn: ({ signal }) =>
|
||||||
|
apiRequest<CuratorStats>("/api/training/curator/stats", { signal }),
|
||||||
|
staleTime: 60_000,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useCuratorProposals() {
|
||||||
|
return useQuery({
|
||||||
|
queryKey: curatorKeys.proposals(),
|
||||||
|
queryFn: ({ signal }) =>
|
||||||
|
apiRequest<CuratorProposalFile[]>("/api/training/curator/proposals", { signal }),
|
||||||
|
staleTime: 30_000,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useSubmitCuratorProposal() {
|
||||||
|
const qc = useQueryClient();
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: (body: CuratorProposalInput) =>
|
||||||
|
apiRequest<{ saved: boolean; filename: string }>(
|
||||||
|
"/api/training/curator/proposals",
|
||||||
|
{ method: "POST", body },
|
||||||
|
),
|
||||||
|
onSuccess: () => {
|
||||||
|
qc.invalidateQueries({ queryKey: curatorKeys.proposals() });
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Upload flow ──────────────────────────────────────────────────
|
||||||
|
// Three-step pipeline:
|
||||||
|
// 1. useUploadFile → POST /api/upload (multipart) → { filename }
|
||||||
|
// 2. useAnalyzeTraining → POST /api/training/analyze (form) → preview + extracted metadata
|
||||||
|
// 3. useCommitTrainingUpload → POST /api/training/upload (json) → { task_id }
|
||||||
|
// Track task_id via useProgress() from documents.ts.
|
||||||
|
|
||||||
|
export type UploadFileResponse = {
|
||||||
|
filename: string; // sanitized, time-prefixed name in UPLOAD_DIR
|
||||||
|
original_name: string;
|
||||||
|
size: number;
|
||||||
|
};
|
||||||
|
|
||||||
|
export type AnalyzeTrainingResponse = {
|
||||||
|
filename: string;
|
||||||
|
clean_text: string;
|
||||||
|
preview: string;
|
||||||
|
decision_number: string;
|
||||||
|
decision_date: string; // ISO YYYY-MM-DD or ""
|
||||||
|
subject_categories: string[];
|
||||||
|
stats: Record<string, unknown>;
|
||||||
|
chars: number;
|
||||||
|
};
|
||||||
|
|
||||||
|
export type CommitTrainingRequest = {
|
||||||
|
filename: string;
|
||||||
|
decision_number: string;
|
||||||
|
decision_date: string; // YYYY-MM-DD or ""
|
||||||
|
subject_categories: string[];
|
||||||
|
title?: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
export type CommitTrainingResponse = { task_id: string };
|
||||||
|
|
||||||
|
export function useUploadFile() {
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: async (file: File): Promise<UploadFileResponse> => {
|
||||||
|
const fd = new FormData();
|
||||||
|
fd.append("file", file);
|
||||||
|
const res = await fetch("/api/upload", { method: "POST", body: fd });
|
||||||
|
const contentType = res.headers.get("content-type") ?? "";
|
||||||
|
const parsed = contentType.includes("application/json")
|
||||||
|
? await res.json().catch(() => null)
|
||||||
|
: await res.text().catch(() => null);
|
||||||
|
if (!res.ok) {
|
||||||
|
throw new ApiError(
|
||||||
|
typeof parsed === "object" && parsed && "detail" in parsed
|
||||||
|
? String((parsed as { detail: unknown }).detail)
|
||||||
|
: `Upload failed with ${res.status}`,
|
||||||
|
res.status,
|
||||||
|
parsed,
|
||||||
|
);
|
||||||
|
}
|
||||||
|
return parsed as UploadFileResponse;
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useAnalyzeTraining() {
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: async (filename: string): Promise<AnalyzeTrainingResponse> => {
|
||||||
|
const fd = new FormData();
|
||||||
|
fd.append("filename", filename);
|
||||||
|
const res = await fetch("/api/training/analyze", {
|
||||||
|
method: "POST",
|
||||||
|
body: fd,
|
||||||
|
});
|
||||||
|
const contentType = res.headers.get("content-type") ?? "";
|
||||||
|
const parsed = contentType.includes("application/json")
|
||||||
|
? await res.json().catch(() => null)
|
||||||
|
: await res.text().catch(() => null);
|
||||||
|
if (!res.ok) {
|
||||||
|
throw new ApiError(
|
||||||
|
typeof parsed === "object" && parsed && "detail" in parsed
|
||||||
|
? String((parsed as { detail: unknown }).detail)
|
||||||
|
: `Analyze failed with ${res.status}`,
|
||||||
|
res.status,
|
||||||
|
parsed,
|
||||||
|
);
|
||||||
|
}
|
||||||
|
return parsed as AnalyzeTrainingResponse;
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
// ── Per-decision lessons ─────────────────────────────────────────
|
||||||
|
|
||||||
|
export type DecisionLesson = {
|
||||||
|
id: string;
|
||||||
|
style_corpus_id: string;
|
||||||
|
lesson_text: string;
|
||||||
|
category: "style" | "structure" | "lexicon" | "tabular" | "general";
|
||||||
|
source: "manual" | "curator" | "chair" | "style_analyzer";
|
||||||
|
applied_to_skill: boolean;
|
||||||
|
created_by: string;
|
||||||
|
created_at: string;
|
||||||
|
updated_at: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
export type LessonCreate = {
|
||||||
|
lesson_text: string;
|
||||||
|
category?: DecisionLesson["category"];
|
||||||
|
source?: DecisionLesson["source"];
|
||||||
|
};
|
||||||
|
|
||||||
|
export type LessonPatch = {
|
||||||
|
lesson_text?: string;
|
||||||
|
category?: DecisionLesson["category"];
|
||||||
|
applied_to_skill?: boolean;
|
||||||
|
};
|
||||||
|
|
||||||
|
export const lessonsKeys = {
|
||||||
|
forCorpus: (corpusId: string) =>
|
||||||
|
[...trainingKeys.all, "lessons", corpusId] as const,
|
||||||
|
};
|
||||||
|
|
||||||
|
export function useCorpusLessons(corpusId: string | null) {
|
||||||
|
return useQuery({
|
||||||
|
queryKey: lessonsKeys.forCorpus(corpusId ?? ""),
|
||||||
|
queryFn: ({ signal }) =>
|
||||||
|
apiRequest<DecisionLesson[]>(
|
||||||
|
`/api/training/corpus/${encodeURIComponent(corpusId!)}/lessons`,
|
||||||
|
{ signal },
|
||||||
|
),
|
||||||
|
enabled: Boolean(corpusId),
|
||||||
|
staleTime: 30_000,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useAddLesson(corpusId: string) {
|
||||||
|
const qc = useQueryClient();
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: (body: LessonCreate) =>
|
||||||
|
apiRequest<DecisionLesson>(
|
||||||
|
`/api/training/corpus/${encodeURIComponent(corpusId)}/lessons`,
|
||||||
|
{ method: "POST", body },
|
||||||
|
),
|
||||||
|
onSuccess: () => {
|
||||||
|
qc.invalidateQueries({ queryKey: lessonsKeys.forCorpus(corpusId) });
|
||||||
|
// lessons_count on the corpus row is computed server-side, so
|
||||||
|
// invalidate the list too — otherwise the badge stays stale.
|
||||||
|
qc.invalidateQueries({ queryKey: trainingKeys.corpus() });
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function usePatchLesson(corpusId: string) {
|
||||||
|
const qc = useQueryClient();
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: ({ id, patch }: { id: string; patch: LessonPatch }) =>
|
||||||
|
apiRequest<{ updated: boolean }>(
|
||||||
|
`/api/training/lessons/${encodeURIComponent(id)}`,
|
||||||
|
{ method: "PATCH", body: patch },
|
||||||
|
),
|
||||||
|
onSuccess: () => {
|
||||||
|
qc.invalidateQueries({ queryKey: lessonsKeys.forCorpus(corpusId) });
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useDeleteLesson(corpusId: string) {
|
||||||
|
const qc = useQueryClient();
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: (id: string) =>
|
||||||
|
apiRequest<{ deleted: boolean }>(
|
||||||
|
`/api/training/lessons/${encodeURIComponent(id)}`,
|
||||||
|
{ method: "DELETE" },
|
||||||
|
),
|
||||||
|
onSuccess: () => {
|
||||||
|
qc.invalidateQueries({ queryKey: lessonsKeys.forCorpus(corpusId) });
|
||||||
|
qc.invalidateQueries({ queryKey: trainingKeys.corpus() });
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function usePatchCorpus() {
|
||||||
|
const qc = useQueryClient();
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: ({ id, patch }: { id: string; patch: CorpusDecisionPatch }) =>
|
||||||
|
apiRequest<{ updated: boolean; id: string }>(
|
||||||
|
`/api/training/corpus/${encodeURIComponent(id)}`,
|
||||||
|
{ method: "PATCH", body: patch },
|
||||||
|
),
|
||||||
|
onSuccess: () => {
|
||||||
|
qc.invalidateQueries({ queryKey: trainingKeys.corpus() });
|
||||||
|
qc.invalidateQueries({ queryKey: trainingKeys.report() });
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useCommitTrainingUpload() {
|
||||||
|
// No onSuccess invalidation here — the row only appears after the
|
||||||
|
// background task finishes. The dialog watches useProgress(task_id)
|
||||||
|
// and invalidates trainingKeys when status === "completed".
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: (body: CommitTrainingRequest) =>
|
||||||
|
apiRequest<CommitTrainingResponse>("/api/training/upload", {
|
||||||
|
method: "POST",
|
||||||
|
body,
|
||||||
|
}),
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|||||||
647
web/app.py
@@ -12,6 +12,7 @@ import subprocess
|
|||||||
import sys
|
import sys
|
||||||
import time
|
import time
|
||||||
from contextlib import asynccontextmanager
|
from contextlib import asynccontextmanager
|
||||||
|
from datetime import date as date_type
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
from uuid import UUID, uuid4
|
from uuid import UUID, uuid4
|
||||||
|
|
||||||
@@ -945,32 +946,648 @@ async def training_corpus_delete(corpus_id: str):
|
|||||||
return result
|
return result
|
||||||
|
|
||||||
|
|
||||||
|
def _format_legal_citation(decision_number: str, decision_date: str) -> str:
|
||||||
|
"""Compose the Israeli ועדת ערר citation string from corpus metadata.
|
||||||
|
|
||||||
|
Mirrors how decisions are referenced in Daphna's own writing — e.g.
|
||||||
|
"ערר 1130-25 ועדת ערר ירושלים (26.4.2026)". Empty parts are dropped
|
||||||
|
gracefully so partially populated rows still produce a readable label.
|
||||||
|
"""
|
||||||
|
if not decision_number:
|
||||||
|
return ""
|
||||||
|
parts = [f"ערר {decision_number}", "ועדת ערר ירושלים"]
|
||||||
|
if decision_date:
|
||||||
|
try:
|
||||||
|
d = date_type.fromisoformat(decision_date)
|
||||||
|
parts.append(f"({d.day}.{d.month}.{d.year})")
|
||||||
|
except ValueError:
|
||||||
|
pass
|
||||||
|
return " ".join(parts)
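A minimal sketch of the citation format described in the docstring above, restated as a standalone function so it can be run in isolation. The name `format_citation` is hypothetical; the committee name is hard-coded, as it is in the diff.

```python
from datetime import date


def format_citation(decision_number: str, decision_date: str) -> str:
    """Hypothetical standalone copy of the citation logic, for illustration."""
    if not decision_number:
        return ""
    parts = [f"ערר {decision_number}", "ועדת ערר ירושלים"]
    if decision_date:
        try:
            d = date.fromisoformat(decision_date)
            # Day and month are not zero-padded: 2026-04-26 -> 26.4.2026
            parts.append(f"({d.day}.{d.month}.{d.year})")
        except ValueError:
            pass  # unparseable date: drop it, keep the rest of the label
    return " ".join(parts)


print(format_citation("1130-25", "2026-04-26"))
print(format_citation("1130-25", "not-a-date"))
```

An invalid date degrades to a label without the parenthesised date rather than raising, which is what lets partially populated rows still render.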
|
||||||
|
|
||||||
|
|
||||||
|
_PARTIES_PATTERNS = (
|
||||||
|
# "העורר: X" or "העוררים: X". Captures up to a newline / end of stanza.
|
||||||
|
re.compile(r"העורר(?:ים|ת)?[:\s]+([^\n]{3,120})"),
|
||||||
|
re.compile(r"המבקש(?:ים|ת)?[:\s]+([^\n]{3,120})"),
|
||||||
|
re.compile(r"בעניין[:\s]+([^\n]{3,120})"),
|
||||||
|
)
|
||||||
|
_RESPONDENT_PATTERNS = (
|
||||||
|
re.compile(r"המשיב(?:ים|ה|ות)?[:\s]+([^\n]{3,120})"),
|
||||||
|
re.compile(r"נגד\s*\n+\s*([^\n]{3,120})"),
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _extract_parties(text: str) -> dict[str, str]:
|
||||||
|
"""Best-effort regex extraction of עורר/משיב from the first 5K of full_text.
|
||||||
|
|
||||||
|
We only scan the head of the document because the parties are always
|
||||||
|
declared at the top in Israeli legal decisions. The result is a hint
|
||||||
|
for display — never authoritative — so a miss returns an empty string
|
||||||
|
rather than raising.
|
||||||
|
"""
|
||||||
|
head = (text or "")[:5000]
|
||||||
|
appellant = respondent = ""
|
||||||
|
for pat in _PARTIES_PATTERNS:
|
||||||
|
m = pat.search(head)
|
||||||
|
if m:
|
||||||
|
appellant = m.group(1).strip(" .,-—")
|
||||||
|
break
|
||||||
|
for pat in _RESPONDENT_PATTERNS:
|
||||||
|
m = pat.search(head)
|
||||||
|
if m:
|
||||||
|
respondent = m.group(1).strip(" .,-—")
|
||||||
|
break
|
||||||
|
return {"appellant": appellant, "respondent": respondent}
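To see what the head-of-document scan returns, here is a small sketch that uses only the first pattern from each tuple above. The function name `extract_parties` and the sample text are made up for illustration; the real helper tries every pattern in order.

```python
import re

# First pattern of each tuple in the diff; the remaining fallbacks are omitted.
APPELLANT = re.compile(r"העורר(?:ים|ת)?[:\s]+([^\n]{3,120})")
RESPONDENT = re.compile(r"המשיב(?:ים|ה|ות)?[:\s]+([^\n]{3,120})")


def extract_parties(text: str) -> dict[str, str]:
    """Best-effort scan of the first 5000 characters; a miss yields ''."""
    head = (text or "")[:5000]
    found = {}
    for key, pat in (("appellant", APPELLANT), ("respondent", RESPONDENT)):
        m = pat.search(head)
        found[key] = m.group(1).strip(" .,-") if m else ""
    return found


sample = "ערר 1130-25\nהעורר: ישראל ישראלי\nנגד\nהמשיבה: הוועדה המקומית\n"
print(extract_parties(sample))
print(extract_parties("no parties declared here"))
```

Because the result is only a display hint, a document with no recognisable party lines simply produces two empty strings.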
|
||||||
|
|
||||||
|
|
||||||
@app.get("/api/training/corpus")
|
@app.get("/api/training/corpus")
|
||||||
async def training_corpus_list():
|
async def training_corpus_list():
|
||||||
"""List all decisions currently in the style corpus."""
|
"""List all decisions currently in the style corpus, with enriched metadata.
|
||||||
|
|
||||||
|
Joins to ``documents`` via the ``style_corpus.document_id`` FK. Legacy
rows where ``document_id IS NULL`` get no match from the LEFT JOIN, so
they are returned with ``page_count`` 0 and an empty ``doc_title``.
|
||||||
|
"""
|
||||||
pool = await db.get_pool()
|
pool = await db.get_pool()
|
||||||
async with pool.acquire() as conn:
|
async with pool.acquire() as conn:
|
||||||
rows = await conn.fetch(
|
rows = await conn.fetch(
|
||||||
"SELECT id, decision_number, decision_date, subject_categories, "
|
"""
|
||||||
" length(full_text) as chars, created_at "
|
SELECT sc.id,
|
||||||
"FROM style_corpus "
|
sc.decision_number,
|
||||||
"ORDER BY created_at DESC"
|
sc.decision_date,
|
||||||
|
sc.subject_categories,
|
||||||
|
length(sc.full_text) AS chars,
|
||||||
|
substring(sc.full_text from 1 for 5000) AS head_text,
|
||||||
|
sc.summary,
|
||||||
|
sc.outcome,
|
||||||
|
sc.key_principles,
|
||||||
|
sc.appeal_subtype,
|
||||||
|
sc.practice_area,
|
||||||
|
sc.document_id,
|
||||||
|
sc.created_at,
|
||||||
|
d.page_count AS page_count,
|
||||||
|
d.title AS doc_title
|
||||||
|
FROM style_corpus sc
|
||||||
|
LEFT JOIN documents d ON d.id = sc.document_id
|
||||||
|
ORDER BY sc.created_at DESC
|
||||||
|
"""
|
||||||
)
|
)
|
||||||
return [
|
lessons_counts = await db.count_decision_lessons_per_corpus()
|
||||||
{
|
out = []
|
||||||
|
for r in rows:
|
||||||
|
cats = r["subject_categories"]
|
||||||
|
if isinstance(cats, str):
|
||||||
|
try:
|
||||||
|
cats = json.loads(cats)
|
||||||
|
except json.JSONDecodeError:
|
||||||
|
cats = []
|
||||||
|
kp = r["key_principles"]
|
||||||
|
if isinstance(kp, str):
|
||||||
|
try:
|
||||||
|
kp = json.loads(kp)
|
||||||
|
except json.JSONDecodeError:
|
||||||
|
kp = []
|
||||||
|
decision_date = str(r["decision_date"]) if r["decision_date"] else ""
|
||||||
|
parties = _extract_parties(r["head_text"] or "")
|
||||||
|
out.append({
|
||||||
"id": str(r["id"]),
|
"id": str(r["id"]),
|
||||||
"decision_number": r["decision_number"] or "",
|
"decision_number": r["decision_number"] or "",
|
||||||
"decision_date": str(r["decision_date"]) if r["decision_date"] else "",
|
"decision_date": decision_date,
|
||||||
"subject_categories": (
|
"subject_categories": cats or [],
|
||||||
json.loads(r["subject_categories"])
|
|
||||||
if isinstance(r["subject_categories"], str)
|
|
||||||
else r["subject_categories"] or []
|
|
||||||
),
|
|
||||||
"chars": r["chars"],
|
"chars": r["chars"],
|
||||||
"created_at": r["created_at"].isoformat() if r["created_at"] else "",
|
"created_at": r["created_at"].isoformat() if r["created_at"] else "",
|
||||||
|
# ── enriched fields ──
|
||||||
|
"summary": r["summary"] or "",
|
||||||
|
"outcome": r["outcome"] or "",
|
||||||
|
"key_principles": kp or [],
|
||||||
|
"appeal_subtype": r["appeal_subtype"] or "",
|
||||||
|
"practice_area": r["practice_area"] or "",
|
||||||
|
"page_count": r["page_count"] or 0,
|
||||||
|
"document_id": str(r["document_id"]) if r["document_id"] else None,
|
||||||
|
"doc_title": r["doc_title"] or "",
|
||||||
|
"parties": parties,
|
||||||
|
"legal_citation": _format_legal_citation(r["decision_number"] or "", decision_date),
|
||||||
|
"lessons_count": lessons_counts.get(str(r["id"]), 0),
|
||||||
|
})
|
||||||
|
return out
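The handler above decodes `subject_categories` and `key_principles` with two identical try/except blocks, since the driver may hand back either a decoded list or a raw JSON string. One possible way to factor that out is sketched below; `coerce_json_list` is a hypothetical helper, and unlike the inline code it also maps non-list JSON values (for example an object) to an empty list.

```python
import json
from typing import Any


def coerce_json_list(value: Any) -> list:
    """Return a list from a value that may be a list, a JSON string, or None."""
    if isinstance(value, str):
        try:
            value = json.loads(value)
        except json.JSONDecodeError:
            return []  # malformed JSON: treat as "no items"
    # Anything that is not a list by now (None, dict, number) becomes [].
    return value if isinstance(value, list) else []


print(coerce_json_list('["zoning", "permits"]'))
print(coerce_json_list("not json"))
print(coerce_json_list(None))
```

With such a helper, the loop body would reduce to `cats = coerce_json_list(r["subject_categories"])` and the same call for `key_principles`.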
|
||||||
|
|
||||||
|
|
||||||
|
# ── Style-agent chat (delegated to legal-chat-service on host) ─────
|
||||||
|
|
||||||
|
|
||||||
|
class ChatConversationCreate(BaseModel):
|
||||||
|
title: str = "שיחה חדשה"
|
||||||
|
style_corpus_id: str | None = None # optional — scope chat to a decision
|
||||||
|
|
||||||
|
|
||||||
|
class ChatMessageRequest(BaseModel):
|
||||||
|
content: str
|
||||||
|
|
||||||
|
|
||||||
|
def _conv_to_json(row: dict) -> dict:
|
||||||
|
"""Serialize a chat_conversations row for the API."""
|
||||||
|
return {
|
||||||
|
"id": str(row["id"]),
|
||||||
|
"title": row.get("title") or "",
|
||||||
|
"style_corpus_id": str(row["style_corpus_id"]) if row.get("style_corpus_id") else None,
|
||||||
|
"decision_number": row.get("decision_number") or "",
|
||||||
|
"claude_session_id": row.get("claude_session_id"),
|
||||||
|
"message_count": row.get("message_count", 0),
|
||||||
|
"created_at": row["created_at"].isoformat() if row.get("created_at") else "",
|
||||||
|
"last_message_at": row["last_message_at"].isoformat() if row.get("last_message_at") else "",
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def _msg_to_json(row: dict) -> dict:
|
||||||
|
return {
|
||||||
|
"id": str(row["id"]),
|
||||||
|
"role": row["role"],
|
||||||
|
"content": row["content"],
|
||||||
|
"created_at": row["created_at"].isoformat() if row.get("created_at") else "",
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
@app.post("/api/training/chat/conversations")
|
||||||
|
async def chat_create_conversation(body: ChatConversationCreate):
|
||||||
|
"""Create a new style-agent chat conversation."""
|
||||||
|
corpus_uuid: UUID | None = None
|
||||||
|
if body.style_corpus_id:
|
||||||
|
try:
|
||||||
|
corpus_uuid = UUID(body.style_corpus_id)
|
||||||
|
except ValueError:
|
||||||
|
raise HTTPException(400, "invalid style_corpus_id")
|
||||||
|
row = await db.create_chat_conversation(
|
||||||
|
title=body.title.strip() or "שיחה חדשה",
|
||||||
|
style_corpus_id=corpus_uuid,
|
||||||
|
)
|
||||||
|
if not row:
|
||||||
|
raise HTTPException(500, "failed to create conversation")
|
||||||
|
return _conv_to_json(row)
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/api/training/chat/conversations")
|
||||||
|
async def chat_list_conversations(limit: int = 50):
|
||||||
|
rows = await db.list_chat_conversations(limit=limit)
|
||||||
|
return [_conv_to_json(r) for r in rows]
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/api/training/chat/conversations/{conv_id}")
|
||||||
|
async def chat_get_conversation(conv_id: str):
|
||||||
|
try:
|
||||||
|
cid = UUID(conv_id)
|
||||||
|
except ValueError:
|
||||||
|
raise HTTPException(400, "invalid conv_id")
|
||||||
|
conv = await db.get_chat_conversation(cid)
|
||||||
|
if not conv:
|
||||||
|
raise HTTPException(404, "conversation not found")
|
||||||
|
messages = await db.list_chat_messages(cid)
|
||||||
|
return {
|
||||||
|
"conversation": _conv_to_json(conv),
|
||||||
|
"messages": [_msg_to_json(m) for m in messages],
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
@app.delete("/api/training/chat/conversations/{conv_id}")
|
||||||
|
async def chat_delete_conversation(conv_id: str):
|
||||||
|
try:
|
||||||
|
cid = UUID(conv_id)
|
||||||
|
except ValueError:
|
||||||
|
raise HTTPException(400, "invalid conv_id")
|
||||||
|
result = await db.delete_chat_conversation(cid)
|
||||||
|
if not result.get("deleted"):
|
||||||
|
raise HTTPException(404, "conversation not found")
|
||||||
|
return result
|
||||||
|
|
||||||
|
|
||||||
|
@app.post("/api/training/chat/conversations/{conv_id}/messages")
|
||||||
|
async def chat_send_message(conv_id: str, body: ChatMessageRequest):
|
||||||
|
"""Send a user message; stream the assistant response as SSE.
|
||||||
|
|
||||||
|
Proxies through ``web.chat_proxy.stream_chat_message`` to the
|
||||||
|
legal-chat-service running on the host.
|
||||||
|
"""
|
||||||
|
try:
|
||||||
|
cid = UUID(conv_id)
|
||||||
|
except ValueError:
|
||||||
|
raise HTTPException(400, "invalid conv_id")
|
||||||
|
text = (body.content or "").strip()
|
||||||
|
if not text:
|
||||||
|
raise HTTPException(400, "content is required")
|
||||||
|
from web import chat_proxy
|
||||||
|
return await chat_proxy.stream_chat_message(cid, text)
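`web.chat_proxy` is not part of this excerpt, so the wire format it emits is not shown. As a rough illustration of what "stream the assistant response as SSE" means, the sketch below frames one JSON payload as a Server-Sent Events message. The `format_sse` name, the `token` event name, and the `delta` field are all assumptions, not taken from the service.

```python
import json
from typing import Optional


def format_sse(data: dict, event: Optional[str] = None) -> str:
    """Frame a JSON payload as a single Server-Sent Events message."""
    lines = []
    if event:
        lines.append(f"event: {event}")
    # ensure_ascii=False keeps Hebrew text readable on the wire.
    payload = json.dumps(data, ensure_ascii=False)
    # Each payload line needs its own "data:" prefix; json.dumps without
    # indent yields a single line, so this loop normally runs once.
    for line in payload.splitlines():
        lines.append(f"data: {line}")
    # A blank line terminates the message.
    return "\n".join(lines) + "\n\n"


print(format_sse({"delta": "שלום"}, event="token"), end="")
```

A browser `EventSource` or a fetch-based reader splits the stream on the blank line and dispatches one event per message.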
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/api/training/chat/health")
|
||||||
|
async def chat_health():
|
||||||
|
"""Probe legal-chat-service liveness from inside the container.
|
||||||
|
|
||||||
|
Useful when the UI wants to gracefully degrade ("שירות הצ'אט אינו
|
||||||
|
זמין") instead of letting messages fail mid-stream.
|
||||||
|
"""
|
||||||
|
from web import chat_proxy
|
||||||
|
try:
|
||||||
|
async with httpx.AsyncClient(timeout=httpx.Timeout(5.0)) as client:
|
||||||
|
r = await client.get(f"{chat_proxy.CHAT_SERVICE_URL}/health")
|
||||||
|
return {"reachable": r.status_code == 200, "status": r.status_code,
|
||||||
|
"url": chat_proxy.CHAT_SERVICE_URL}
|
||||||
|
except Exception as e:
|
||||||
|
return {"reachable": False, "error": str(e),
|
||||||
|
"url": chat_proxy.CHAT_SERVICE_URL}
|
||||||
|
|
||||||
|
|
||||||
|
# ── Curator portrait — read prompt + stats + accept proposals ──────
|
||||||
|
|
||||||
|
|
||||||
|
# The curator agent's prompt is symlinked into Paperclip, but the source
|
||||||
|
# lives in the legal-ai repo. Resolve via env so the container (where the
|
||||||
|
# agent file is mounted from a different path) and the host both work.
|
||||||
|
_AGENTS_DIR = Path(os.environ.get(
|
||||||
|
"AGENTS_DIR",
|
||||||
|
str(Path(__file__).resolve().parent.parent / ".claude" / "agents"),
|
||||||
|
))
|
||||||
|
_CURATOR_PROPOSALS_DIR = Path(os.environ.get(
|
||||||
|
"CURATOR_PROPOSALS_DIR",
|
||||||
|
str(Path(__file__).resolve().parent.parent / "data" / "curator-proposals"),
|
||||||
|
))
|
||||||
|
_GITEA_REPO_BASE = os.environ.get(
|
||||||
|
"GITEA_REPO_BASE",
|
||||||
|
"https://gitea.nautilus.marcusgroup.org/ezer-mishpati/legal-ai",
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/api/training/curator/prompt")
|
||||||
|
async def get_curator_prompt():
|
||||||
|
"""Return the hermes-curator agent's prompt (read-only) + Gitea source URL.
|
||||||
|
|
||||||
|
The file is the canonical source of how the curator analyzes Daphna's
|
||||||
|
final decisions. Changes go through git/Gitea, not the UI — the UI just
|
||||||
|
surfaces it for transparency.
|
||||||
|
"""
|
||||||
|
path = _AGENTS_DIR / "hermes-curator.md"
|
||||||
|
if not path.exists():
|
||||||
|
raise HTTPException(404, f"curator prompt not found at {path}")
|
||||||
|
try:
|
||||||
|
content = path.read_text(encoding="utf-8")
|
||||||
|
stat = path.stat()
|
||||||
|
except OSError as e:
|
||||||
|
raise HTTPException(500, f"failed to read curator prompt: {e}")
|
||||||
|
gitea_url = (
|
||||||
|
f"{_GITEA_REPO_BASE}/src/branch/main/.claude/agents/hermes-curator.md"
|
||||||
|
)
|
||||||
|
return {
|
||||||
|
"content": content,
|
||||||
|
"filename": path.name,
|
||||||
|
"bytes": stat.st_size,
|
||||||
|
"last_modified": stat.st_mtime,
|
||||||
|
"gitea_url": gitea_url,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/api/training/curator/style-analyzer-prompt")
|
||||||
|
async def get_style_analyzer_prompt():
|
||||||
|
"""Return the system prompt that style_analyzer.py uses to extract patterns.
|
||||||
|
|
||||||
|
Surfaces the *training-time* prompt (Claude Opus 1M context) so the
|
||||||
|
chair can compare it against the curator's post-export prompt. Both
|
||||||
|
are shown side-by-side in the curator-portrait tab.
|
||||||
|
"""
|
||||||
|
# Imported inside the handler rather than at module level, so the service
# module (which pulls in claude_session + db) is only loaded when this
# endpoint is called. The prompts are the ones defined in
# mcp-server/src/legal_mcp/services/style_analyzer.py.
|
||||||
|
try:
|
||||||
|
from legal_mcp.services import style_analyzer
|
||||||
|
return {
|
||||||
|
"analysis_prompt": style_analyzer.ANALYSIS_PROMPT,
|
||||||
|
"single_decision_prompt": style_analyzer.SINGLE_DECISION_PROMPT,
|
||||||
|
"synthesis_prompt": style_analyzer.SYNTHESIS_PROMPT,
|
||||||
|
"max_input_tokens": style_analyzer.MAX_INPUT_TOKENS,
|
||||||
|
}
|
||||||
|
except Exception as e:
|
||||||
|
raise HTTPException(500, f"failed to load style_analyzer prompt: {e}")
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/api/training/curator/stats")
|
||||||
|
async def get_curator_stats():
|
||||||
|
"""Cheap aggregate stats over decision_lessons + style_corpus.
|
||||||
|
|
||||||
|
Used by the Curator-Portrait tab to show "10 curator findings across 24
|
||||||
|
decisions". We deliberately keep this server-side and aggregate so the
|
||||||
|
UI can render a single card without fanning out N queries.
|
||||||
|
"""
|
||||||
|
pool = await db.get_pool()
|
||||||
|
async with pool.acquire() as conn:
|
||||||
|
total_lessons = await conn.fetchval(
|
||||||
|
"SELECT count(*) FROM decision_lessons WHERE source = 'curator'"
|
||||||
|
)
|
||||||
|
decisions_with_findings = await conn.fetchval(
|
||||||
|
"SELECT count(DISTINCT style_corpus_id) FROM decision_lessons "
|
||||||
|
"WHERE source = 'curator'"
|
||||||
|
)
|
||||||
|
total_corpus = await conn.fetchval("SELECT count(*) FROM style_corpus")
|
||||||
|
applied = await conn.fetchval(
|
||||||
|
"SELECT count(*) FROM decision_lessons "
|
||||||
|
"WHERE source = 'curator' AND applied_to_skill = true"
|
||||||
|
)
|
||||||
|
# Last 10 curator findings — newest first
|
||||||
|
recent_rows = await conn.fetch(
|
||||||
|
"""
|
||||||
|
SELECT dl.id, dl.lesson_text, dl.category, dl.applied_to_skill,
|
||||||
|
dl.created_at,
|
||||||
|
sc.decision_number, sc.decision_date
|
||||||
|
FROM decision_lessons dl
|
||||||
|
JOIN style_corpus sc ON sc.id = dl.style_corpus_id
|
||||||
|
WHERE dl.source = 'curator'
|
||||||
|
ORDER BY dl.created_at DESC
|
||||||
|
LIMIT 10
|
||||||
|
"""
|
||||||
|
)
|
||||||
|
return {
|
||||||
|
"total_findings": total_lessons or 0,
|
||||||
|
"decisions_with_findings": decisions_with_findings or 0,
|
||||||
|
"decisions_total": total_corpus or 0,
|
||||||
|
"findings_applied": applied or 0,
|
||||||
|
"recent_findings": [
|
||||||
|
{
|
||||||
|
"id": str(r["id"]),
|
||||||
|
"lesson_text": r["lesson_text"],
|
||||||
|
"category": r["category"],
|
||||||
|
"applied_to_skill": bool(r["applied_to_skill"]),
|
||||||
|
"decision_number": r["decision_number"] or "",
|
||||||
|
"decision_date": str(r["decision_date"]) if r["decision_date"] else "",
|
||||||
|
"created_at": r["created_at"].isoformat() if r["created_at"] else "",
|
||||||
|
}
|
||||||
|
for r in recent_rows
|
||||||
|
],
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
class CuratorProposal(BaseModel):
|
||||||
|
title: str
|
||||||
|
proposed_change: str # markdown — what to change in the prompt
|
||||||
|
rationale: str # markdown — why
|
||||||
|
|
||||||
|
|
||||||
|
@app.post("/api/training/curator/proposals")
|
||||||
|
async def create_curator_proposal(body: CuratorProposal):
|
||||||
|
"""Save a proposed change to the curator prompt as a file on disk.
|
||||||
|
|
||||||
|
No automatic commit, no overwrite — the chair (chaim) reviews the
|
||||||
|
file manually and applies it through git. This is intentional: the
|
||||||
|
prompt is too load-bearing to mutate from a web UI.
|
||||||
|
"""
|
||||||
|
title = (body.title or "").strip()
|
||||||
|
if not title:
|
||||||
|
raise HTTPException(400, "title is required")
|
||||||
|
if not body.proposed_change.strip():
|
||||||
|
raise HTTPException(400, "proposed_change is required")
|
||||||
|
|
||||||
|
_CURATOR_PROPOSALS_DIR.mkdir(parents=True, exist_ok=True)
|
||||||
|
# Slug-ish filename — strip anything that isn't a Hebrew letter, ASCII
|
||||||
|
# letter, digit, hyphen, or underscore. Hebrew letters are explicitly
|
||||||
|
# allowed because most proposals will be in Hebrew.
|
||||||
|
slug = re.sub(r"[^\w\-]+", "-", title)[:60].strip("-_") or "proposal"  # \w already matches Hebrew letters in str patterns
|
||||||
|
today = date_type.today().isoformat()
|
||||||
|
fname = f"{today}-{slug}.md"
|
||||||
|
path = _CURATOR_PROPOSALS_DIR / fname
|
||||||
|
|
||||||
|
# If a proposal with the same slug already exists today, append a
|
||||||
|
# numeric suffix so we don't silently overwrite.
|
||||||
|
idx = 2
|
||||||
|
while path.exists():
|
||||||
|
path = _CURATOR_PROPOSALS_DIR / f"{today}-{slug}-{idx}.md"
|
||||||
|
idx += 1
|
||||||
|
|
||||||
|
md = (
|
||||||
|
f"# הצעת שינוי לפרומפט hermes-curator\n\n"
|
||||||
|
f"- **תאריך:** {today}\n"
|
||||||
|
f"- **כותרת:** {title}\n\n"
|
||||||
|
f"## שינוי מוצע\n\n{body.proposed_change.strip()}\n\n"
|
||||||
|
f"## נימוק\n\n{body.rationale.strip() or '(לא ניתן)'}\n"
|
||||||
|
)
|
||||||
|
try:
|
||||||
|
path.write_text(md, encoding="utf-8")
|
||||||
|
except OSError as e:
|
||||||
|
raise HTTPException(500, f"failed to write proposal: {e}")
|
||||||
|
return {
|
||||||
|
"saved": True,
|
||||||
|
"filename": path.name,
|
||||||
|
"path": str(path),
|
||||||
|
"bytes": len(md.encode("utf-8")),
|
||||||
|
}
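The filename handling in this endpoint (slugify the title, then add a numeric suffix rather than overwrite) can be exercised on its own. This is a sketch under assumptions: `next_free_path` is a hypothetical helper, and the slug pattern here is `[^\w\-]+`, relying on `\w` matching Hebrew letters for Python 3 `str` patterns.

```python
import re
import tempfile
from pathlib import Path


def next_free_path(directory: Path, title: str, today: str) -> Path:
    """Return a dated, slugged .md path that does not exist yet."""
    # Collapse every run of non-word, non-hyphen characters into one hyphen.
    slug = re.sub(r"[^\w\-]+", "-", title)[:60].strip("-_") or "proposal"
    path = directory / f"{today}-{slug}.md"
    idx = 2
    # Same slug on the same day: try -2, -3, ... instead of overwriting.
    while path.exists():
        path = directory / f"{today}-{slug}-{idx}.md"
        idx += 1
    return path


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    first = next_free_path(root, "Tighten tone rules!", "2026-01-01")
    first.write_text("draft", encoding="utf-8")
    second = next_free_path(root, "Tighten tone rules!", "2026-01-01")
    print(first.name)
    print(second.name)
```

Note that the exists-then-write sequence is not atomic; for a single chair submitting proposals by hand that is probably acceptable, but concurrent writers would need an exclusive-create open instead.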
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/api/training/curator/proposals")
|
||||||
|
async def list_curator_proposals():
|
||||||
|
"""List proposed-change files in data/curator-proposals/, newest first."""
|
||||||
|
if not _CURATOR_PROPOSALS_DIR.exists():
|
||||||
|
return []
|
||||||
|
items = []
|
||||||
|
for p in sorted(_CURATOR_PROPOSALS_DIR.iterdir(),
|
||||||
|
key=lambda f: f.stat().st_mtime, reverse=True):
|
||||||
|
if not p.is_file() or p.suffix.lower() != ".md":
|
||||||
|
continue
|
||||||
|
stat = p.stat()
|
||||||
|
items.append({
|
||||||
|
"filename": p.name,
|
||||||
|
"bytes": stat.st_size,
|
||||||
|
"modified_at": stat.st_mtime,
|
||||||
|
})
|
||||||
|
return items
|
||||||
|
|
||||||
|
|
||||||
|
# ── Per-decision lessons (decision_lessons table) ──────────────────
|
||||||
|
|
||||||
|
|
||||||
|
class LessonCreate(BaseModel):
|
||||||
|
lesson_text: str
|
||||||
|
category: str = "general"
|
||||||
|
source: str = "manual"
|
||||||
|
|
||||||
|
|
||||||
|
class LessonPatch(BaseModel):
|
||||||
|
lesson_text: str | None = None
|
||||||
|
category: str | None = None
|
||||||
|
applied_to_skill: bool | None = None
|
||||||
|
|
||||||
|
|
||||||
|
_LESSON_CATEGORIES = {"style", "structure", "lexicon", "tabular", "general"}
|
||||||
|
_LESSON_SOURCES = {"manual", "curator", "chair", "style_analyzer"}
|
||||||
|
|
||||||
|
|
||||||
|
def _lesson_to_json(row: dict) -> dict:
|
||||||
|
return {
|
||||||
|
"id": str(row["id"]),
|
||||||
|
"style_corpus_id": str(row["style_corpus_id"]),
|
||||||
|
"lesson_text": row["lesson_text"],
|
||||||
|
"category": row["category"],
|
||||||
|
"source": row["source"],
|
||||||
|
"applied_to_skill": bool(row["applied_to_skill"]),
|
||||||
|
"created_by": row.get("created_by", ""),
|
||||||
|
"created_at": row["created_at"].isoformat() if row.get("created_at") else "",
|
||||||
|
"updated_at": row["updated_at"].isoformat() if row.get("updated_at") else "",
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/api/training/corpus/{corpus_id}/lessons")
|
||||||
|
async def list_corpus_lessons(corpus_id: str):
|
||||||
|
try:
|
||||||
|
cid = UUID(corpus_id)
|
||||||
|
except ValueError:
|
||||||
|
raise HTTPException(400, "invalid corpus_id")
|
||||||
|
rows = await db.list_decision_lessons(cid)
|
||||||
|
return [_lesson_to_json(r) for r in rows]
|
||||||
|
|
||||||
|
|
||||||
|
@app.post("/api/training/corpus/{corpus_id}/lessons")
|
||||||
|
async def add_corpus_lesson(corpus_id: str, body: LessonCreate):
|
||||||
|
try:
|
||||||
|
cid = UUID(corpus_id)
|
||||||
|
except ValueError:
|
||||||
|
raise HTTPException(400, "invalid corpus_id")
|
||||||
|
text = (body.lesson_text or "").strip()
|
||||||
|
if not text:
|
||||||
|
raise HTTPException(400, "lesson_text is required")
|
||||||
|
if body.category not in _LESSON_CATEGORIES:
|
||||||
|
raise HTTPException(400, f"invalid category; allowed: {sorted(_LESSON_CATEGORIES)}")
|
||||||
|
if body.source not in _LESSON_SOURCES:
|
||||||
|
raise HTTPException(400, f"invalid source; allowed: {sorted(_LESSON_SOURCES)}")
|
||||||
|
row = await db.add_decision_lesson(
|
||||||
|
cid, lesson_text=text, category=body.category, source=body.source,
|
||||||
|
)
|
||||||
|
if not row:
|
||||||
|
raise HTTPException(500, "failed to insert lesson")
|
||||||
|
return _lesson_to_json(row)
|
||||||
|
|
||||||
|
|
||||||
|
@app.patch("/api/training/lessons/{lesson_id}")
|
||||||
|
async def patch_corpus_lesson(lesson_id: str, body: LessonPatch):
|
||||||
|
try:
|
||||||
|
lid = UUID(lesson_id)
|
||||||
|
except ValueError:
|
||||||
|
raise HTTPException(400, "invalid lesson_id")
|
||||||
|
if body.category is not None and body.category not in _LESSON_CATEGORIES:
|
||||||
|
raise HTTPException(400, f"invalid category; allowed: {sorted(_LESSON_CATEGORIES)}")
|
||||||
|
result = await db.update_decision_lesson(
|
||||||
|
lid,
|
||||||
|
lesson_text=body.lesson_text,
|
||||||
|
category=body.category,
|
||||||
|
applied_to_skill=body.applied_to_skill,
|
||||||
|
)
|
||||||
|
if not result.get("updated"):
|
||||||
|
if result.get("reason") == "not found":
|
||||||
|
raise HTTPException(404, "lesson not found")
|
||||||
|
return result # "nothing to update" — 200 with reason
|
||||||
|
return result
|
||||||
|
|
||||||
|
|
||||||
|
@app.delete("/api/training/lessons/{lesson_id}")
|
||||||
|
async def delete_corpus_lesson(lesson_id: str):
|
||||||
|
try:
|
||||||
|
lid = UUID(lesson_id)
|
||||||
|
except ValueError:
|
||||||
|
raise HTTPException(400, "invalid lesson_id")
|
||||||
|
result = await db.delete_decision_lesson(lid)
|
||||||
|
if not result.get("deleted"):
|
||||||
|
raise HTTPException(404, "lesson not found")
|
||||||
|
return result
|
||||||
|
|
||||||
|
|
||||||
|
@app.get("/api/training/corpus/{corpus_id}/full-text")
async def training_corpus_full_text(corpus_id: str):
    """Return the proofread full_text for a single corpus row.

    Kept out of the list endpoint because full_text is large (50K-650K chars
    per decision) and the table view only needs counts. The drawer fetches
    it on demand when the chair opens the "content" tab.
    """
    try:
        cid = UUID(corpus_id)
    except ValueError:
        raise HTTPException(400, "invalid corpus_id")
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow(
            "SELECT decision_number, full_text FROM style_corpus WHERE id = $1",
            cid,
        )
    if not row:
        raise HTTPException(404, "corpus row not found")
    return {
        "id": corpus_id,
        "decision_number": row["decision_number"] or "",
        "full_text": row["full_text"] or "",
    }


class TrainingCorpusPatch(BaseModel):
    """Editable metadata fields on a style_corpus row.

    full_text is intentionally NOT editable — the corpus is write-once.
    For corrections, re-upload the decision via /api/training/upload.
    """
    decision_number: str | None = None
    decision_date: str | None = None  # ISO YYYY-MM-DD, or "" to clear
    subject_categories: list[str] | None = None
    summary: str | None = None
    outcome: str | None = None
    key_principles: list[str] | None = None
    appeal_subtype: str | None = None
    practice_area: str | None = None


@app.patch("/api/training/corpus/{corpus_id}")
async def training_corpus_patch(corpus_id: str, patch: TrainingCorpusPatch):
    """Update metadata fields on a corpus row. Only provided fields are touched."""
    try:
        cid = UUID(corpus_id)
    except ValueError:
        raise HTTPException(400, "invalid corpus_id")

    fields = patch.model_dump(exclude_none=True)
    if not fields:
        return {"updated": False, "reason": "no fields to update"}

    # Coerce decision_date "" → SQL NULL, otherwise parse as DATE.
    if "decision_date" in fields:
        v = fields["decision_date"]
        if v == "":
            fields["decision_date"] = None
        else:
            try:
                fields["decision_date"] = date_type.fromisoformat(v)
            except ValueError as e:
                raise HTTPException(400, f"invalid decision_date: {e}")

    # subject_categories + key_principles are JSONB columns.
    if "subject_categories" in fields:
        fields["subject_categories"] = json.dumps(fields["subject_categories"])
    if "key_principles" in fields:
        fields["key_principles"] = json.dumps(fields["key_principles"])

    # Build a positional UPDATE — asyncpg doesn't support named parameters.
    # Column names are TrainingCorpusPatch field names (never raw request
    # keys), so interpolating them into the SQL text is safe. $1 is the row
    # id; the first updated column therefore binds to $2.
    cols = list(fields.keys())
    set_clause = ", ".join(f"{c} = ${i + 2}" for i, c in enumerate(cols))
    values = [fields[c] for c in cols]

    pool = await db.get_pool()
    async with pool.acquire() as conn:
        result = await conn.fetchrow(
            f"UPDATE style_corpus SET {set_clause} "
            f"WHERE id = $1 "
            f"RETURNING id, decision_number, decision_date, summary, outcome",
            cid, *values,
        )
    if not result:
        raise HTTPException(404, "corpus row not found")
    return {
        "updated": True,
        "id": str(result["id"]),
        "decision_number": result["decision_number"] or "",
        "decision_date": str(result["decision_date"]) if result["decision_date"] else "",
        "summary_len": len(result["summary"] or ""),
        "outcome_len": len(result["outcome"] or ""),
    }
        }
        for r in rows
    ]


# Headers that defeat proxy buffering for SSE streams. `X-Accel-Buffering: no`
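The PATCH handler above normalises a few field types and then numbers its placeholders from `$2`, because `$1` is reserved for the row id. The following is a minimal standalone sketch of that step, not the handler itself: `build_update` and `JSONB_COLUMNS` are hypothetical names introduced for illustration, and no database connection is involved.

```python
import json
from datetime import date

# Assumption: mirrors the two JSONB columns named in the handler's comment.
JSONB_COLUMNS = {"subject_categories", "key_principles"}


def build_update(fields: dict) -> tuple[str, list]:
    """Return (sql, values) for a positional UPDATE; $1 is left for the row id."""
    prepared = {}
    for col, val in fields.items():
        if col == "decision_date":
            # "" clears the column (SQL NULL); anything else must be ISO YYYY-MM-DD.
            val = None if val == "" else date.fromisoformat(val)
        elif col in JSONB_COLUMNS:
            # JSONB values are sent as JSON text.
            val = json.dumps(val)
        prepared[col] = val
    # Placeholders start at $2 because $1 is bound to the id in the WHERE clause.
    set_clause = ", ".join(f"{c} = ${i + 2}" for i, c in enumerate(prepared))
    sql = f"UPDATE style_corpus SET {set_clause} WHERE id = $1"
    return sql, list(prepared.values())


sql, values = build_update(
    {"summary": "x", "decision_date": "2024-05-01", "key_principles": ["a"]}
)
print(sql)
print(values)
```

A caller would pass the row id first and then unpack `values`, in the same order the handler uses (`cid, *values`).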
176
web/chat_proxy.py
Normal file
@@ -0,0 +1,176 @@
"""FastAPI ↔ legal-chat-service streaming bridge.

The browser hits ``/api/training/chat/conversations/{id}/messages`` on
the legal-ai container. The container is sealed off from the host's
``claude`` CLI (intentional — see ``claude_session.py`` docstring), so
we forward each request to the pm2-managed ``legal-chat-service`` on the
host via the Docker host gateway (``host.docker.internal:8770``).

Responsibilities:
- Save the user message to ``chat_messages`` before streaming starts.
- Open an HTTP streaming connection to the host service.
- Forward each SSE event to the browser as-is, accumulating the
  assistant text and any ``session_id`` so we can persist them once
  the stream closes.
- Persist the assistant turn + the CLI's session_id at end-of-stream.
"""

from __future__ import annotations

import json
import logging
import os
from typing import AsyncIterator
from uuid import UUID

import httpx
from fastapi import HTTPException
from fastapi.responses import StreamingResponse

from legal_mcp.services import db
from web import chat_system_prompt

logger = logging.getLogger(__name__)


# legal-chat-service lives on the host. In the container we reach it via
# host.docker.internal — which requires ``extra_hosts: host.docker.internal:host-gateway``
# in the Coolify service definition. Set ``CHAT_SERVICE_URL`` to override
# (handy for local dev outside Docker).
CHAT_SERVICE_URL = os.environ.get(
    "CHAT_SERVICE_URL",
    "http://host.docker.internal:8770",
)
CHAT_SERVICE_TIMEOUT_S = float(os.environ.get("CHAT_SERVICE_TIMEOUT_S", "3600"))


_SSE_HEADERS = {
    "Cache-Control": "no-cache, no-transform",
    "X-Accel-Buffering": "no",
    "Connection": "keep-alive",
}


async def stream_chat_message(
    conversation_id: UUID,
    user_message: str,
) -> StreamingResponse:
    """Open SSE stream, forward events, persist when done.

    Returns a FastAPI StreamingResponse the route can return directly.
    """
    conv = await db.get_chat_conversation(conversation_id)
    if not conv:
        raise HTTPException(404, "conversation not found")

    # Persist the user turn immediately so a network drop doesn't lose it.
    await db.add_chat_message(
        conversation_id, role="user", content=user_message,
    )

    is_first_turn = not conv.get("claude_session_id")
    system_block: str | None = None
    if is_first_turn:
        try:
            system_block = await chat_system_prompt.build_system_prompt(
                corpus_id=conv.get("style_corpus_id"),
            )
        except Exception as e:
            logger.exception("system prompt build failed")
            raise HTTPException(500, f"system prompt failed: {e}")

    payload = {
        "prompt": user_message,
        "system": system_block,
        "resume_session_id": conv.get("claude_session_id"),
    }

    async def proxy_stream() -> AsyncIterator[bytes]:
        accumulated_text: list[str] = []
        events_log: list[dict] = []
        new_session_id: str | None = None

        try:
            timeout_cfg = httpx.Timeout(
                CHAT_SERVICE_TIMEOUT_S,
                connect=10.0,
                read=CHAT_SERVICE_TIMEOUT_S,
            )
            async with httpx.AsyncClient(timeout=timeout_cfg) as client:
                async with client.stream(
                    "POST",
                    f"{CHAT_SERVICE_URL}/chat/start",
                    json=payload,
                ) as upstream:
                    if upstream.status_code != 200:
                        body = await upstream.aread()
                        msg = body.decode("utf-8", errors="replace")[:300]
                        err = {"type": "error",
                               "message": f"chat-service {upstream.status_code}: {msg}"}
                        yield f"data: {json.dumps(err, ensure_ascii=False)}\n\n".encode("utf-8")
                        return

                    async for line in upstream.aiter_lines():
                        if not line:
                            yield b"\n"
                            continue
                        # Forward verbatim so the browser sees the same
                        # SSE framing the host emits.
                        out = line + "\n"
                        yield out.encode("utf-8")
                        # Mirror events: capture text + session_id for
                        # persistence. The line starts with "data: <json>"
                        # so we strip the prefix before parsing.
                        if line.startswith("data: "):
                            try:
                                event = json.loads(line[len("data: "):])
                            except json.JSONDecodeError:
                                continue
                            events_log.append(event)
                            t = event.get("type")
                            if t == "session_id" and event.get("value"):
                                new_session_id = event["value"]
                            elif t == "text_delta" and event.get("text"):
                                accumulated_text.append(event["text"])
                            elif t == "done" and event.get("text"):
                                if not accumulated_text:
                                    accumulated_text.append(event["text"])

        except httpx.ConnectError:
            err = {
                "type": "error",
                "message": (
                    f"לא ניתן להגיע ל-legal-chat-service בכתובת {CHAT_SERVICE_URL}. "
                    "ודא ש-pm2 מריץ אותו: `pm2 status legal-chat-service`."
                ),
            }
            yield f"data: {json.dumps(err, ensure_ascii=False)}\n\n".encode("utf-8")
            return
        except Exception as e:
            logger.exception("chat proxy failed")
            err = {"type": "error", "message": str(e)}
            yield f"data: {json.dumps(err, ensure_ascii=False)}\n\n".encode("utf-8")
            return

        # End of stream — persist the assistant turn.
        try:
            full_text = "".join(accumulated_text).strip()
            if full_text:
                await db.add_chat_message(
                    conversation_id,
                    role="assistant",
                    content=full_text,
                    raw_events=events_log,
                )
            if new_session_id:
                await db.update_chat_conversation_session_id(
                    conversation_id, new_session_id,
                )
        except Exception:
            logger.exception("failed to persist assistant turn for conv=%s", conversation_id)

    return StreamingResponse(
        proxy_stream(),
        media_type="text/event-stream",
        headers=_SSE_HEADERS,
    )
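The mirroring logic in `proxy_stream` can be exercised without a running service. The sketch below isolates it as a pure function over already-received lines. `collect_sse` is a hypothetical helper name, and the event shapes (`session_id`, `text_delta`, `done`) are assumed to be exactly those the proxy checks for. The `isinstance` guard is an addition that the proxy itself does not have.

```python
import json


def collect_sse(lines):
    """Return (assistant_text, session_id) gathered from an iterable of SSE lines."""
    chunks: list[str] = []
    session_id = None
    for line in lines:
        # Only "data: <json>" lines carry events; blank separators are skipped.
        if not line.startswith("data: "):
            continue
        try:
            event = json.loads(line[len("data: "):])
        except json.JSONDecodeError:
            continue  # malformed payloads are forwarded upstream but not mirrored
        if not isinstance(event, dict):
            continue
        kind = event.get("type")
        if kind == "session_id" and event.get("value"):
            session_id = event["value"]
        elif kind == "text_delta" and event.get("text"):
            chunks.append(event["text"])
        elif kind == "done" and event.get("text") and not chunks:
            # Fallback: use the final text only when no deltas were seen.
            chunks.append(event["text"])
    return "".join(chunks).strip(), session_id


sample = [
    'data: {"type": "session_id", "value": "abc"}',
    "",
    'data: {"type": "text_delta", "text": "Hel"}',
    'data: {"type": "text_delta", "text": "lo "}',
    "data: not-json",
    'data: {"type": "done", "text": "ignored"}',
]
print(collect_sse(sample))
```

Keeping the accumulation separate from the network loop makes the `done` fallback easy to check: it only contributes text when the stream carried no deltas.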
205
web/chat_system_prompt.py
Normal file
@@ -0,0 +1,205 @@
"""Compose the system prompt the style-chat agent receives.

The chat runs against the local ``claude`` CLI on the host (via
legal-chat-service). We assemble a once-per-conversation system block
that gives the agent everything it needs to discuss decisions in
Daphna's voice:

- The style guide (``skills/decision/SKILL.md``) — how she writes
- The lessons file (``docs/legal-decision-lessons.md``) — what we've
  learned across the corpus
- The corpus-analysis report (``docs/corpus-analysis.md``) — the
  structural map of 24+ decisions
- A summary of every style_corpus row (number, date, subjects,
  chars + summary if extracted) so the agent can reason about the
  whole corpus without us shipping all of it inline
- Optional: when the conversation is scoped to a specific decision
  (``style_corpus_id``), append its full_text so the chat can dive
  into the text directly

Sent **once**, when the conversation is first created. On subsequent
messages the legal-chat-service uses ``claude --resume <session_id>``
and the on-disk CLI session keeps the system context intact — no need
to re-ship the 100K+ chars of skills + lessons every turn.
"""

from __future__ import annotations

import logging
import os
from pathlib import Path
from uuid import UUID

from legal_mcp.services import db

logger = logging.getLogger(__name__)


# The reference files live in the repo at known paths. In the
# container they're mounted alongside the code, so resolve relative
# to the parent of the web/ directory (two levels up from this file)
# unless LEGAL_AI_REPO_ROOT overrides it.
_REPO_ROOT = Path(os.environ.get(
    "LEGAL_AI_REPO_ROOT",
    str(Path(__file__).resolve().parent.parent),
))

_SKILLS_PATH = _REPO_ROOT / "skills" / "decision" / "SKILL.md"
_LESSONS_PATH = _REPO_ROOT / "docs" / "legal-decision-lessons.md"
_CORPUS_ANALYSIS_PATH = _REPO_ROOT / "docs" / "corpus-analysis.md"


def _safe_read(path: Path, cap_chars: int = 50_000) -> str:
    """Read a file (UTF-8) or return a marker that it's missing.

    The cap protects against accidentally injecting an enormous file —
    even at 50K, a single source file is the lion's share of the
    system prompt budget.
    """
    try:
        text = path.read_text(encoding="utf-8")
    except FileNotFoundError:
        return f"(קובץ {path.name} לא נמצא בנתיב {path})"
    except OSError as e:
        logger.warning("could not read %s: %s", path, e)
        return f"(שגיאה בקריאת {path.name}: {e})"
    if len(text) > cap_chars:
        return text[:cap_chars] + f"\n\n[... חתך ב-{cap_chars:,} תווים מתוך {len(text):,}]"
    return text


async def _corpus_summary_block() -> str:
    """Compact one-row-per-decision summary the agent can scan."""
    # Local import, hoisted out of the loop: only needed when the JSONB
    # column comes back as text.
    import json as _json

    pool = await db.get_pool()
    async with pool.acquire() as conn:
        records = await conn.fetch(
            """
            SELECT decision_number, decision_date, appeal_subtype,
                   subject_categories, length(full_text) AS chars,
                   coalesce(summary, '') AS summary,
                   coalesce(outcome, '') AS outcome
            FROM style_corpus
            ORDER BY decision_date NULLS LAST
            """
        )
    if not records:
        return "(הקורפוס ריק)"

    lines = []
    for r in records:
        cats = r["subject_categories"]
        if isinstance(cats, str):
            try:
                cats = _json.loads(cats)
            except _json.JSONDecodeError:
                cats = []
        cats_str = ", ".join(cats) if cats else "—"
        date_str = str(r["decision_date"]) if r["decision_date"] else "—"
        summary = (r["summary"] or "").strip()
        outcome = (r["outcome"] or "").strip()
        head = f"- **{r['decision_number'] or '—'}** ({date_str}) [{r['appeal_subtype'] or '—'}] · {r['chars']:,} תווים"
        meta = f" נושאים: {cats_str}"
        body = ""
        if summary:
            body = f"\n תקציר: {summary}"
            if outcome:
                body += f" — תוצאה: {outcome}"
        elif outcome:
            body = f"\n תוצאה: {outcome}"
        lines.append(head + "\n" + meta + body)
    return "\n".join(lines)


async def _decision_full_text(corpus_id: UUID) -> str:
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow(
            "SELECT decision_number, decision_date, full_text "
            "FROM style_corpus WHERE id = $1",
            corpus_id,
        )
    if not row:
        return ""
    header = f"# החלטה {row['decision_number']} ({row['decision_date']})\n\n"
    return header + (row["full_text"] or "")


SYSTEM_PROMPT_HEADER = """\
אתה סוכן הסגנון של עו"ד דפנה תמיר, יו"ר ועדת הערר לתכנון ובניה — מחוז ירושלים.

תפקידך: לעזור לחיים (העוזר המקצועי של דפנה) להבין, לנתח ולחדד את הסגנון
של דפנה. אתה לא כותב החלטות חדשות; אתה דן בסגנון של החלטות קיימות,
מזהה דפוסים, מקפיד שהכותבים העתידיים (ה-writer agent) יישארו נאמנים
לקולה.

יש לך גישה ל:
1. **מדריך הסגנון** של דפנה (skills/decision/SKILL.md) — איך היא כותבת.
2. **הלקחים הגנריים** מהקורפוס (docs/legal-decision-lessons.md) — מה
למדנו לאורך 24+ החלטות. **חובה** להישען על הקבצים האלה כשאתה דן
בסגנון, ולא להמציא תובנות חדשות מהאוויר.
3. **ניתוח הקורפוס** המבני (docs/corpus-analysis.md) — מפת תוכן ופערים.
4. **רשימת ההחלטות בקורפוס** (למטה) — סקירה תמציתית של כל החלטה
שעלתה ל-style_corpus.
5. **טקסט מלא של החלטה ספציפית** (אם השיחה הוצמדה ל-style_corpus_id).

כללי תקשורת:
- כל התשובות בעברית.
- חיים יושב מולך, לא דפנה — אבל המטרה היא לחדד את הסגנון *של דפנה*.
- אם חיים שואל "האם פסקה X מתאימה לסגנון של דפנה?" — תן ניתוח מנומק
שמסתמך על SKILL.md ועל החלטות הקורפוס. אל תמציא ראיות.
- אם אתה צריך החלטה ספציפית שאין בקורפוס — הודע לחיים שיצרף אותה.
- אם חיים אומר לך משהו חדש על דפנה ("דפנה אומרת לעולם אל תפתח החלטה
במילה X") — שמור את זה בזיכרון השיחה; אם זה מצדיק תיעוד קבוע, הצע
לחיים להוסיף את זה כ-decision_lesson (POST /api/training/lessons)
או כתוספת ל-SKILL.md.
- אל תיתן לעצמך אישיות מומצאת — אתה כלי-עזר מקצועי, לא חבר.
"""


async def build_system_prompt(
    *,
    corpus_id: UUID | None = None,
    include_corpus_summary: bool = True,
) -> str:
    """Assemble the full system prompt for a new chat conversation.

    Args:
        corpus_id: When set, the full_text of that decision is appended
            so the chat can dive into the text.
        include_corpus_summary: Set False for low-context chats (e.g.
            quick "what does Daphna do at the end of a betterment-levy
            decision?" — no need to ship 24 summaries).
    """
    parts: list[str] = [SYSTEM_PROMPT_HEADER]

    parts.append("\n## מדריך הסגנון (skills/decision/SKILL.md)\n")
    parts.append(_safe_read(_SKILLS_PATH, cap_chars=40_000))

    parts.append("\n\n## לקחים מהקורפוס (docs/legal-decision-lessons.md)\n")
    parts.append(_safe_read(_LESSONS_PATH, cap_chars=30_000))

    parts.append("\n\n## ניתוח קורפוס מבני (docs/corpus-analysis.md)\n")
    parts.append(_safe_read(_CORPUS_ANALYSIS_PATH, cap_chars=15_000))

    if include_corpus_summary:
        parts.append("\n\n## רשימת ההחלטות בקורפוס הסגנון\n")
        try:
            parts.append(await _corpus_summary_block())
        except Exception as e:
            logger.warning("corpus summary failed: %s", e)
            parts.append("(שגיאה בטעינת רשימת הקורפוס)")

    if corpus_id is not None:
        parts.append("\n\n## ההחלטה הספציפית בדיון (full_text)\n")
        try:
            txt = await _decision_full_text(corpus_id)
            if txt:
                parts.append(txt[:200_000])  # hard cap
            else:
                parts.append("(לא נמצאה החלטה — בדוק את ה-corpus_id)")
        except Exception as e:
            logger.warning("decision full_text failed: %s", e)
            parts.append("(שגיאה בטעינת ההחלטה)")

    return "\n".join(parts)
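`build_system_prompt` follows one pattern throughout: each source gets a heading, is truncated to its own character budget, and the pieces are joined with newlines. The sketch below shows that pattern in isolation, with no file or database access. `cap_text` and `assemble` are hypothetical names, and the English truncation marker stands in for the Hebrew one used by `_safe_read`.

```python
def cap_text(text: str, cap_chars: int) -> str:
    """Truncate text to cap_chars and append a marker that states the cut."""
    if len(text) <= cap_chars:
        return text
    return text[:cap_chars] + f"\n\n[... truncated at {cap_chars:,} of {len(text):,} chars]"


def assemble(sections: list[tuple[str, str, int]]) -> str:
    """Join (title, body, cap) sections into one prompt string."""
    parts: list[str] = []
    for title, body, cap in sections:
        parts.append(f"\n## {title}\n")
        parts.append(cap_text(body, cap))
    return "\n".join(parts)


prompt = assemble([
    ("Style guide", "a" * 12, 10),   # exceeds its budget, so it is cut
    ("Lessons", "short text", 100),  # fits, passed through unchanged
])
print(prompt)
```

Per-section caps keep any single oversized file from crowding out the others, which is why the module gives the style guide, lessons, and corpus analysis separate budgets (40,000, 30,000, and 15,000 characters).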