fix(precedents): Anthropic SDK fallback, format() crash, UI refresh
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m31s
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 1m31s
Three fixes to the precedent library after the first end-to-end test on
403-17 surfaced runtime issues:
1. Anthropic SDK fallback in claude_session. The legal-ai Docker container
does not ship the `claude` CLI, so every halacha and metadata extraction
was failing with "Claude CLI not found." Module now tries the CLI first
(zero-cost local path) and falls back to the Anthropic SDK with
ANTHROPIC_API_KEY when the binary is absent. Default model is
claude-sonnet-4-6, overridable via CLAUDE_SDK_MODEL env. The system
message gets cache_control: ephemeral so multi-chunk runs reuse the
cached instruction prefix at ~10% read cost. Adds `anthropic` to
pyproject deps.
2. precedent_metadata_extractor crashed with KeyError because the JSON
example inside the prompt template contained literal { } characters
that str.format() interpreted as placeholders. Switched to f-string
concatenation; the prompt template no longer needs format() at all.
3. Library list query stays stale after upload because the upload
mutation's onSuccess fires when the POST returns task_id, not when
SSE reports completion. Added a second invalidate inside the SSE
watcher in PrecedentUploadSheet so the new row appears with up-to-date
chunk and halachot counts the moment processing finishes.
Halacha and metadata extractors now route the long static prompt through
the new `system=` parameter so the SDK path actually caches it; the CLI
path concatenates and behaves as before.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -28,6 +28,10 @@ _HEAD_CHARS = 12_000
|
||||
_TAIL_CHARS = 6_000
|
||||
|
||||
|
||||
# Note: this template is concatenated with f-strings at call-time rather
|
||||
# than using .format(), because the JSON example below contains '{' / '}'
|
||||
# which str.format would interpret as placeholders and crash with
|
||||
# KeyError on the field names.
|
||||
METADATA_EXTRACTION_PROMPT = """אתה מסייע משפטי בכיר. קרא את פסק הדין/ההחלטה הבא וחלץ ממנו מטא-דאטה לקטלוג הקורפוס.
|
||||
|
||||
המטרה: למלא שדות בטופס העלאה שהמשתמש הזין באופן חלקי. **אל תמציא** — אם המידע לא מופיע בטקסט, השאר ריק (מחרוזת ריקה / מערך ריק).
|
||||
@@ -51,13 +55,6 @@ METADATA_EXTRACTION_PROMPT = """אתה מסייע משפטי בכיר. קרא א
|
||||
4. **headnote** — לא מצטטים, מסכמים. סגנון נבו: ביטוי קצר אחד.
|
||||
5. **key_quote** — חייב להיות הדבקה מילולית מהקלט. אם אין ציטוט בולט — השאר ריק.
|
||||
6. **subject_tags** — 3-7 תגיות בעברית, snake_case (חניה, קווי_בניין, שיקול_דעת, פגם_פרוצדורלי, סמכות, מועדים, פגיעה_במקרקעין, ירידת_ערך, תכנית_רחביה, מימוש_במכר, וכד'). שייך לתחום של ועדת ערר תכנון ובניה.
|
||||
|
||||
## הקלט
|
||||
{context}
|
||||
|
||||
--- תחילת הטקסט ---
|
||||
{text_window}
|
||||
--- סוף הטקסט ---
|
||||
"""
|
||||
|
||||
|
||||
@@ -104,12 +101,18 @@ async def extract_metadata(case_law_id: UUID | str) -> dict:
|
||||
f"תאריך: {date_str}\n"
|
||||
f"תחום: {practice_area}"
|
||||
)
|
||||
prompt = METADATA_EXTRACTION_PROMPT.format(
|
||||
context=context, text_window=_build_text_window(full_text),
|
||||
text_window = _build_text_window(full_text)
|
||||
# Static instructions go via `system` so the SDK path can cache them
|
||||
# across uploads. Per-precedent content goes in the user prompt.
|
||||
user_msg = (
|
||||
f"## הקלט\n{context}\n\n"
|
||||
f"--- תחילת הטקסט ---\n{text_window}\n--- סוף הטקסט ---"
|
||||
)
|
||||
|
||||
try:
|
||||
result = await claude_session.query_json(prompt)
|
||||
result = await claude_session.query_json(
|
||||
user_msg, system=METADATA_EXTRACTION_PROMPT,
|
||||
)
|
||||
except Exception as e:
|
||||
logger.warning("precedent_metadata_extractor: query failed: %s", e)
|
||||
return {}
|
||||
|
||||
Reference in New Issue
Block a user