The claude_session bridge had two structural defects that made any
non-trivial document extraction unreliable:
1. subprocess.run() blocks the asyncio event loop in the MCP server
for the full duration of every LLM call (60-180s typical).
2. The 120-second timeout was below the cold-cache cost of any
document over ~12K Hebrew characters. Three back-to-back timeouts
on case 8174-24 dropped 43 appellant claims on the floor.
This is phase 1 of the remediation plan: it keeps claude_session as the
engine (no switch to the Anthropic API) and restructures the code around it:
claude_session.py
• query / query_json are now async — asyncio.create_subprocess_exec
instead of subprocess.run, so MCP server can serve other coroutines
while a call is in flight.
• DEFAULT_TIMEOUT 120 → 1800 (30 min). High enough that no realistic
document hits it; bounded so a runaway never zombifies forever.
• LONG_TIMEOUT 300 → 3600 for opus block writing on full case context.
• TimeoutError now actually kills the subprocess (asyncio.wait_for
cancellation alone leaves the child running).
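The claude_session.py changes above can be sketched roughly as follows. This is a minimal illustration, not the actual module: the helper name `run_with_timeout`, its parameters, and the error handling are assumptions. Only `asyncio.create_subprocess_exec`, `asyncio.wait_for`, and the 1800-second default come from the description above.

```python
import asyncio

DEFAULT_TIMEOUT = 1800  # seconds; value taken from the commit message


async def run_with_timeout(
    argv: list[str], stdin_text: str, timeout: float = DEFAULT_TIMEOUT
) -> str:
    """Hypothetical helper: run argv without blocking the event loop."""
    proc = await asyncio.create_subprocess_exec(
        *argv,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        stdout, stderr = await asyncio.wait_for(
            proc.communicate(stdin_text.encode("utf-8")), timeout=timeout
        )
    except asyncio.TimeoutError:
        # wait_for only cancels the awaiting coroutine; the child process
        # keeps running unless it is killed and reaped explicitly.
        proc.kill()
        await proc.wait()
        raise
    if proc.returncode != 0:
        message = stderr.decode("utf-8", "replace")
        raise RuntimeError(f"exit code {proc.returncode}: {message}")
    return stdout.decode("utf-8")
```

While the `await` is pending, the event loop stays free to run other coroutines, which is the point of moving away from subprocess.run.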
claims_extractor.py
• _split_by_sections: chunks at numbered sections / Hebrew letter
headings / "פרק" markers / markdown ##, falls back to paragraph
breaks, then to hard splits. Targets 12K chars per chunk — small
enough that each chunk reliably finishes inside the timeout.
• _extract_chunk: per-chunk retry (1 attempt by default) with
structured logging on failure. Failed chunks no longer crash the
overall extraction; they're skipped with a partial-result warning.
• extract_claims_with_ai now runs chunks in parallel via
asyncio.gather bounded by a semaphore (CHUNK_CONCURRENCY=3).
For a 25K-char appeal: was sequential 150-300s, now ~70-90s.
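The chunk-and-gather flow described above might look roughly like the sketch below. It is a simplified illustration under stated assumptions: `split_paragraphs` covers only the paragraph-break and hard-split fallbacks (not the section-heading detection), `extract_all` and the `extract_one` callback are hypothetical names, and retry logic is omitted. The 12K target and the concurrency of 3 are the values quoted above.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

CHUNK_TARGET_CHARS = 12_000  # target from the commit message
CHUNK_CONCURRENCY = 3        # value from the commit message


def split_paragraphs(text: str, target: int = CHUNK_TARGET_CHARS) -> list[str]:
    """Pack paragraphs into chunks of at most `target` characters."""
    chunks: list[str] = []
    current = ""
    for para in text.split("\n\n"):
        if not para.strip():
            continue
        # Hard split: a single paragraph longer than the target.
        pieces = [para[i:i + target] for i in range(0, len(para), target)]
        for piece in pieces:
            candidate = f"{current}\n\n{piece}" if current else piece
            if len(candidate) <= target:
                current = candidate
            else:
                chunks.append(current)
                current = piece
    if current:
        chunks.append(current)
    return chunks


async def extract_all(chunks, extract_one, concurrency: int = CHUNK_CONCURRENCY):
    """Run extract_one over all chunks, at most `concurrency` at a time."""
    sem = asyncio.Semaphore(concurrency)

    async def guarded(index: int, chunk: str):
        async with sem:
            try:
                return await extract_one(chunk)
            except Exception as exc:
                # A failed chunk is logged and skipped, not fatal.
                logger.warning("chunk %d failed: %s", index, exc)
                return None

    results = await asyncio.gather(
        *(guarded(i, c) for i, c in enumerate(chunks))
    )
    claims = [claim for r in results if r is not None for claim in r]
    failed = sum(r is None for r in results)
    return claims, failed
```

Catching the exception inside the guarded coroutine, instead of letting `asyncio.gather` propagate it, is what lets the remaining chunks finish and return a partial result.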
Updated all 9 callers (claims, appraiser facts, block writer, qa
validator, brainstorm, learning loop, style analyzer × 3) to await
the now-async API.
The one-shot script scripts/extract_claims_8174.py, which was used to
recover the 43 appellant claims on case 8174-24, has been moved to
.archive/; phase 1 makes it obsolete. SCRIPTS.md updated.
Phase 2 (background-task wrapper around LLM-bound MCP tools, persistent
llm_tasks table, SSE progress) is the structural follow-up — separate PR.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
163 lines
5.2 KiB
Python
"""Learning loop: compare the draft to the final version and extract lessons.

Stage 7 of the specification:
1. Ingest the final version (signed by Dafna)
2. Compare the draft to the final version and identify changes
3. Extract lessons: new expressions, changed patterns, recurring errors
4. Update the style model
"""

from __future__ import annotations

import logging
from uuid import UUID

from legal_mcp import config
from legal_mcp.config import parse_llm_json
from legal_mcp.services import db, claude_session

logger = logging.getLogger(__name__)


def compute_diff_stats(draft_text: str, final_text: str) -> dict:
    """Compute comparison statistics between the draft and the final version."""
    draft_words = draft_text.split()
    final_words = final_text.split()

    draft_len = len(draft_words)
    final_len = len(final_words)

    # Simple word-level diff (not a full diff algorithm, but good enough for stats)
    draft_set = set(draft_words)
    final_set = set(final_words)

    common = draft_set & final_set
    added = final_set - draft_set
    removed = draft_set - final_set

    # Estimate change percentage
    if draft_len == 0:
        change_pct = 100.0
    else:
        change_pct = (len(added) + len(removed)) / max(draft_len, final_len) * 100

    return {
        "draft_words": draft_len,
        "final_words": final_len,
        "change_percent": round(change_pct, 1),
        "words_added": len(added),
        "words_removed": len(removed),
        "words_common": len(common),
    }


# Hebrew-language prompt (kept verbatim because it is sent to the model):
# asks for the substantive changes between the AI-generated draft and the
# final version edited by Dafna, a classification of each change, and
# lessons for future decisions, returned as JSON.
LESSONS_PROMPT = """אתה מנתח שינויים בהחלטות משפטיות. קיבלת טיוטה (שנוצרה ע"י AI) וגרסה סופית (שעברה עריכת דפנה).

## משימה:
1. זהה את השינויים המהותיים (לא הקלדה/פורמט)
2. סווג כל שינוי:
   - expression_change — ביטוי שהוחלף (הצע כלקח לעתיד)
   - structure_change — שינוי מבני (סדר, חלוקה)
   - content_addition — תוכן שנוסף (מה חסר?)
   - content_removal — תוכן שהוסר (מה מיותר?)
   - tone_change — שינוי טון (רשמי יותר/פחות)
   - error_fix — תיקון שגיאה עובדתית/משפטית
3. הסק לקחים שניתן להפעיל בהחלטות עתידיות

## פלט JSON:
{
  "changes": [
    {"type": "...", "description": "תיאור השינוי", "draft_text": "...", "final_text": "...", "lesson": "לקח לעתיד"}
  ],
  "new_expressions": ["ביטוי חדש שדפנה הוסיפה"],
  "overall_assessment": "הערכה כללית (1-2 משפטים)"
}
"""


async def analyze_changes(draft_text: str, final_text: str) -> dict:
    """Analyze the changes between the draft and the final version with Claude."""
    # Truncate for context window
    max_chars = 15000
    draft_sample = draft_text[:max_chars]
    final_sample = final_text[:max_chars]

    # The section markers below read "draft" and "final version" in Hebrew.
    prompt = f"""{LESSONS_PROMPT}

--- טיוטה ---
{draft_sample}

--- גרסה סופית ---
{final_sample}
"""
    result = await claude_session.query_json(prompt)
    if result is None:
        logger.warning("Failed to parse lessons response")
        return {"changes": [], "new_expressions": [], "overall_assessment": ""}
    return result


async def process_final_version(
    case_id: UUID,
    final_text: str,
) -> dict:
    """Ingest the final version, compare it to the draft, and extract lessons.

    Args:
        case_id: case identifier
        final_text: text of the final version

    Returns:
        dict with diff stats, changes, lessons
    """
    decision = await db.get_decision_by_case(case_id)
    if not decision:
        raise ValueError(f"No decision for case {case_id}")

    # Get draft text (combine all blocks)
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            """SELECT content FROM decision_blocks
               WHERE decision_id = $1 AND word_count > 0
               ORDER BY block_index""",
            UUID(decision["id"]),
        )
    draft_text = "\n\n".join(r["content"] for r in rows if r["content"])

    if not draft_text:
        raise ValueError("No draft content to compare")

    # Compute stats
    diff_stats = compute_diff_stats(draft_text, final_text)

    # Analyze changes with AI
    analysis = await analyze_changes(draft_text, final_text)

    # Store new expressions as style patterns
    for expr in analysis.get("new_expressions", []):
        if expr and len(expr) > 3:
            await db.upsert_style_pattern(
                pattern_type="characteristic_phrase",
                pattern_text=expr,
                context="למד מגרסה סופית",  # Hebrew: "learned from final version"
            )

    # Update decision status
    await db.update_decision(
        UUID(decision["id"]),
        status="final",
    )

    # Update case status
    case = await db.get_case(case_id)
    if case:
        await db.update_case(case_id, status="final")

    return {
        "diff_stats": diff_stats,
        "analysis": analysis,
        "lessons_count": len(analysis.get("changes", [])),
        "new_expressions": len(analysis.get("new_expressions", [])),
    }