Compare commits
31 Commits
1da2a9a2cb
...
fix/fu4-co
| Author | SHA1 | Date | |
|---|---|---|---|
| 1af689a969 | |||
| 7826ff4910 | |||
| 58ab003206 | |||
| 165efc62b0 | |||
| d3c6baf9e2 | |||
| 5ad541e54c | |||
| a3454bcb57 | |||
| bb0cd7c6a2 | |||
| 0629f19d5f | |||
| f920cfc738 | |||
| c4046cc0a0 | |||
| cbc7a1e336 | |||
| a02a4e3a64 | |||
| b01722b1b4 | |||
| 1d4f214abe | |||
| 2aee398b4a | |||
| 3a05e30c8d | |||
| 7ad995aade | |||
| 9f4f8c60a4 | |||
| d32452f95c | |||
| ac3ed455cf | |||
| d359ab9884 | |||
| 1645653ba9 | |||
| f3cc9ca9d4 | |||
| af651d0135 | |||
| b197d2329c | |||
| c6e368e4f7 | |||
| 8153bc9f03 | |||
| 4892fb6e8f | |||
| b368bce690 | |||
| 1496e520fd |
@@ -181,6 +181,36 @@ python3 /home/chaim/legal-ai/scripts/notify.py \
|
||||
|
||||
---

## §7. Valid case statuses (case status flow)

The statuses you may see in `case.status` (per the "status map" in `legal-ceo.md`):

```
new → proofread → documents_ready → analyst_verified → research_complete*
    → outcome_set → direction_approved → analysis_enriched → ready_for_writing
    → drafted → qa_passed / qa_failed → exported
```

`research_complete` is a **valid status** (not an invalidated legacy value). It is set by `legal-researcher.md` step 5 when precedent research runs separately from the analyst (an advanced scenario). The CEO handles it as if it were `analyst_verified` (see the "status map" in `legal-ceo.md`).
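
As a rough illustration of the status rule in this section, the check could be sketched as follows. The tuple and helper names are hypothetical; only the status strings and the "treat `research_complete` like `analyst_verified`" rule come from the text.

```python
from typing import Optional

# Statuses listed in the flow for this section, in order.
# STATUS_FLOW and both helpers are illustrative names, not part of legal-ai.
STATUS_FLOW = (
    "new", "proofread", "documents_ready", "analyst_verified",
    "research_complete", "outcome_set", "direction_approved",
    "analysis_enriched", "ready_for_writing", "drafted",
    "qa_passed", "qa_failed", "exported",
)


def is_valid_status(status: str) -> bool:
    """True if the status appears in the documented flow."""
    return status in STATUS_FLOW


def routing_status(status: str) -> Optional[str]:
    """Return the status the CEO should route on, or None if unknown."""
    if not is_valid_status(status):
        return None
    # research_complete is handled as if it were analyst_verified.
    return "analyst_verified" if status == "research_complete" else status
```

An unknown value returns `None` instead of raising, so a caller can report it rather than fail the case.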
|
||||
|
||||
---

## §8. Routing case-law uploads to the corpus: quick flowchart

```
Chaim uploaded a case-law PDF to the case → the citation is:
├── "ערר NNNN/YY" or "בל"מ NNNN/YY"
│     → internal_decision_upload (chair_name + district required)
└── "עע"מ / בר"מ / עמ"נ / בג"ץ / ע"א / ע"פ / רע"א / רע"פ / ת"א / ת"מ"
      → precedent_library_upload (external_upload)
```

- **`internal_decision_upload`** requires: `file_path`, `case_number`, `chair_name`, `district`. `district` must be one of: ירושלים / מרכז / תל אביב / צפון / דרום / חיפה / ארצי.
- **`precedent_library_upload`** does not accept chair_name/district. If you try to upload a "ערר ..." citation through it, the citation guard rejects it.
- Full details: `legal-researcher.md`, section "Which upload tool to use".
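
The routing rule in this section can be sketched as a small prefix check. This is a minimal illustration, not the server's citation guard: the function and constant names are hypothetical, and matching on the start of the trimmed citation is an assumption.

```python
from typing import Optional

# Prefixes taken from the flowchart in this section.
INTERNAL_PREFIXES = ("ערר", 'בל"מ')
EXTERNAL_PREFIXES = (
    'עע"מ', 'בר"מ', 'עמ"נ', 'בג"ץ', 'ע"א',
    'ע"פ', 'רע"א', 'רע"פ', 'ת"א', 'ת"מ',
)


def choose_upload_tool(citation: str) -> Optional[str]:
    """Pick the upload tool by citation prefix; None means 'do not upload'."""
    text = citation.strip()
    if text.startswith(INTERNAL_PREFIXES):
        return "internal_decision_upload"
    if text.startswith(EXTERNAL_PREFIXES):
        return "precedent_library_upload"
    return None
```

Returning `None` for an unrecognized prefix leaves the decision to a human instead of guessing a stream.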
|
||||
|
||||
---

## API paths: reference to the official skill

| Action | Where in the skill |
@@ -76,6 +76,24 @@ profiles:
Authorization: Bearer $PAPERCLIP_API_KEY
{ "body": "<my findings>" }
```
5b. **Also record every finding in the legal-ai API as a decision_lesson**, so that it appears in the UI
   under the decision's "מה למדנו" ("What we learned") tab in the corpus. Prerequisite: first find the `style_corpus_id`
   that matches the decision's `decision_number` (`GET /api/training/corpus`, then filter).
   For each finding:
   ```
   POST https://legal-ai.nautilus.marcusgroup.org/api/training/corpus/{corpus_id}/lessons
   Content-Type: application/json
   {
     "lesson_text": "<summary of the finding: what I saw + a suggestion, in one line>",
     "category": "<style|structure|lexicon|tabular|general>",
     "source": "curator"
   }
   ```
   Mapping of finding tags to `category`:
   - `[סגנון]` → `style`
   - `[מבנה]` → `structure`
   - `[לקסיקון משפטי]` → `lexicon`
   - `[טבלאי]` → `tabular`
6. Close the issue (status=done) after writing the comment
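
For step 5b, the tag-to-category mapping and the request body could be assembled as in the sketch below. The helper name is hypothetical, and falling back to `general` for an unmapped tag is an assumption (the text lists `general` as a category but does not say when to use it).

```python
from typing import Dict

# Tag → category pairs exactly as listed in step 5b.
TAG_TO_CATEGORY: Dict[str, str] = {
    "[סגנון]": "style",
    "[מבנה]": "structure",
    "[לקסיקון משפטי]": "lexicon",
    "[טבלאי]": "tabular",
}


def build_lesson_payload(tag: str, lesson_text: str) -> Dict[str, str]:
    """Build the JSON body for POST .../corpus/{corpus_id}/lessons."""
    return {
        "lesson_text": lesson_text,
        # Assumption: unmapped tags fall back to "general".
        "category": TAG_TO_CATEGORY.get(tag, "general"),
        "source": "curator",
    }
```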

## Comment format

@@ -63,6 +63,26 @@ tools:
- The TAMA 38 laws, pinui-binui (evacuation and reconstruction), and urban renewal
- Appeals committees: planning and building, and betterment levy (jurisdiction, composition, procedure)

## Taxonomy: two namespaces for `practice_area`

⚠️ **You must know this before writing practice_area to any MCP tool or creating a new case.**

There are two different namespaces:

| Axis | Values | Where used |
|------|--------|--------------|
| **A. Multi-tenant (legacy/routing)** | `appeals_committee`, `national_insurance`, `labor_law` | Tenant selection. Agents in the appeals committee always use `appeals_committee` |
| **B. Domain (DB + filters)** | `rishuy_uvniya`, `betterment_levy`, `compensation_197` | **DB columns + every filter in `search_precedent_library` / `search_internal_decisions`** |

**Golden rule: in every call to a tool that searches or writes to the corpus, use Axis B only:**
- 1xxx → `rishuy_uvniya`
- 8xxx → `betterment_levy`
- 9xxx → `compensation_197`
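
The prefix rule above can be sketched as a lookup on the first digit of the case number. This is an illustration only: the function name is hypothetical, and reading the classification from the first digit found in the string is an assumption about how "1xxx / 8xxx / 9xxx" is applied.

```python
import re
from typing import Optional

# Axis B values keyed by the leading digit of the case number.
PREFIX_TO_PRACTICE_AREA = {
    "1": "rishuy_uvniya",
    "8": "betterment_levy",
    "9": "compensation_197",
}


def practice_area_for_case(case_number: str) -> Optional[str]:
    """Map a case number such as '1017-03-26' to its Axis B practice_area."""
    match = re.search(r"\d", case_number)  # first digit anywhere in the string
    if match is None:
        return None
    return PREFIX_TO_PRACTICE_AREA.get(match.group())
```

A case number outside the three ranges yields `None`, so the caller can stop and ask instead of writing an Axis A value.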

**Creating a new case (`case_create`):** In the DB, the `cases.practice_area` column is enforced by a CHECK constraint to Axis B values (or empty). Writing `appeals_committee` to `cases.practice_area` is **forbidden** and will be rejected. If you are unsure which axis an existing case uses, call `case_get` first and check.

**Identifying בל"מ (request for extension of time):** If the subject of a document/case contains "בקשה להארכת מועד" (request for extension of time) or the prefix "בל\"מ", this is a distinct classification (especially in 8xxx cases). Extract it during analysis and record it in `appeal_subtype` as one of the accepted classifications. A בל"מ is mainly procedural, so its analysis differs: usually there is a single threshold question (whether to grant the extension) rather than a substantive discussion. Flag this in the output so the writer knows to choose a short template.

## Critical distinction: 3 types of extracted items

| Type (claim_type) | What it is | Who said it |
@@ -181,8 +201,8 @@ tools:
| Case classification | practice_area |
|------------|---------------|
| 1xxx (licensing and building) | `rishuy_uvniya` |
-| 8xxx (betterment levy) | `histael_hashbacha` |
-| 9xxx (compensation under s. 197) | `pitsuim_197` |
+| 8xxx (betterment levy) | `betterment_levy` |
+| 9xxx (compensation under s. 197) | `compensation_197` |

If the issue is named in a known `appeal_subtype` (such as "שימוש חורג" (non-conforming use), "חריגות בנייה" (building deviations), "סטייה ניכרת" (substantial deviation)), add `appeal_subtype` to the filter. Narrowing early beats widening late.

@@ -435,12 +455,12 @@ X שאלות עומדות להכרעה:
### 8א. Case-law verification
Scan the chair's positions and identify every case-law reference (בג"ץ, עע"מ, עת"מ, ע"א, ערר, etc.).
For each judgment mentioned:
-1. Search the **authoritative corpus** (`search_precedent_library`) first; this is mandatory. It holds approved halachot with a supporting_quote ready for citation.
+1. Search the **authoritative corpus** (`search_precedent_library`) first; this is mandatory. It holds approved halachot with a supporting_quote ready for citation. The corpus also includes halachot from uploaded appeals-committee decisions (internal_committee).
2. Search the Daphna canon (`search_decisions`, `find_similar_cases`)
3. Search the case documents (`search_case_documents`); the judgment may be quoted in the pleadings
4. **If found in precedent_library**: cite the exact citation+supporting_quote from the corpus.
5. **If found only in the case documents**: mark "Source: pleadings, requires verification against the corpus".
-6. **If not found at all**: mark "requires external verification" + draft search instructions.
+6. **If not found at all**: first **retry with context** (not the name alone): add content terms or a case number to the query. A case name alone (`"אגסי"`) is not a reliable key; it may return whoever cites the case rather than the case itself. Only if that is also empty, mark "requires external verification" + draft search instructions.

Add to section "7א. Corpus queries" every additional query run in pass 2.

@@ -18,6 +18,8 @@ tools:
|
||||
- mcp__legal-ai__list_chair_feedback
|
||||
- mcp__legal-ai__search_case_documents
|
||||
- mcp__legal-ai__search_precedent_library
|
||||
- mcp__legal-ai__search_internal_decisions
|
||||
- mcp__legal-ai__internal_decision_upload
|
||||
- mcp__legal-ai__workflow_status
|
||||
- mcp__legal-ai__processing_status
|
||||
- mcp__legal-ai__get_metrics
|
||||
@@ -78,6 +80,48 @@ tools:
| `docs/daphna-procedural-patterns.md` | Procedural templates (interim decision, return to the appraiser) | CEO + writer (8xxx only) |
| `docs/voice-1130-25.md` | In-depth example | writer (for a complex 1xxx case) |

## Taxonomy: two namespaces for `practice_area` (must know)

⚠️ **Critical before you write practice_area to any MCP tool: the system defines two different namespaces:**

| Axis | Values | Where used |
|------|--------|--------------|
| **A. Multi-tenant (legacy, routing)** | `appeals_committee`, `national_insurance`, `labor_law` | Only for selecting the tenant at the product level. Agents in the appeals committee always use `appeals_committee` |
| **B. Domain (DB columns + filters)** | `rishuy_uvniya`, `betterment_levy`, `compensation_197` | **Every call to `search_precedent_library` / `search_internal_decisions` / `precedent_library_upload` / `internal_decision_upload`**; this is the governing namespace |

**Automatic conversion:** `to_db_practice_area(multi_tenant_pa, appeal_subtype)` converts Axis A → Axis B (internal use only).

**Iron rules for MCP tools:**
- In every call to a tool that searches or writes to the case-law corpus, **use Axis B values only**:
  - 1xxx (licensing and building) → `rishuy_uvniya`
  - 8xxx (betterment levy) → `betterment_levy`
  - 9xxx (compensation under s. 197) → `compensation_197`
- **Never** pass `appeals_committee` as `practice_area` to `search_precedent_library`; it returns 0 results (the corpus is stored in Axis B).
- The DB constraint `cases_practice_area_check` enforces that a case's practice_area is one of the three Axis B values (or empty).

## New MCP tools (June 2026): required reading

### `internal_decision_upload`: uploading an appeals-committee decision to the corpus

Decisions of other appeals committees (`source_kind='internal_committee'`) go **only** through this tool, not through `precedent_library_upload` (the citation guard rejects them).

**Signature (all four fields are required):**
```
internal_decision_upload(
    file_path=...,      # full path to a PDF/DOCX/RTF/TXT/MD
    case_number=...,    # "ערר 1024-25" / "בל\"מ 8126/25" / etc.
    chair_name=...,     # chair's name, required (for selective search)
    district=...,       # ירושלים / מרכז / תל אביב / צפון / דרום / חיפה / ארצי
    ...                 # case_name, court, decision_date, practice_area, etc. are optional
)
```

**Who actually uses it:** `legal-researcher` (see `legal-researcher.md`). The CEO only needs to know it exists: if a researcher reports a failed upload of an appeals-committee decision, the CEO checks that chair_name + district were supplied.

### `search_internal_decisions`: searching appeals-committee decisions

`search_decisions` = Daphna's decisions only (style corpus). `search_internal_decisions` = all appeals committees in all districts, with `chair_name` and `district` filters. The CEO uses this tool in advanced routing scenarios; usually the researcher and the analyst are the main users.

## Your agents

| Agent | Agent ID | Role |
@@ -597,7 +641,7 @@ ls data/cases/$CASE_NUMBER/documents/research/analysis-and-research.md
| `proofread` | Proofreader | → create an issue for the legal analyst (see template below) |
| `documents_ready` | Analyst | → Stage A (completeness + negative + methodology checks). If it passes → update to `analyst_verified` |
| `analyst_verified` | CEO (after Stage A) | → Stage B (summary + outcome question to Chaim). The analyst has already done the research as part of the analysis; do not create an issue for the researcher. |
-| `research_complete` | (analyst: legacy, or a special scenario with a researcher) | → Stage B (summary + outcome question to Chaim). In the regular flow the analyst does not set this status, only `documents_ready`. If you see this status, check whether `analysis-and-research.md` exists before §B. |
+| `research_complete` | Analyst / precedent researcher (valid status: legacy + advanced scenarios) | → Stage B (summary + outcome question to Chaim). **This is a valid status**, not an error. In the regular flow the analyst sets `documents_ready`, but if the researcher ran separately (`legal-researcher.md` step 5) it updates the case to `research_complete`. If you see this status, check that both `analysis-and-research.md` and `precedent-research.md` exist, then continue to §B as usual. |
| `outcome_set` | CEO (after Chaim chose) | → Is there claim_handling? If not → Stage B continued (bundle/skip table). If so → Stage C |
| `direction_approved` | CEO (after Chaim approved) | → create an issue for the analyst (c26e9439) for pass 2: deepening the analysis and verifying case law |
| `analysis_enriched` | Analyst (pass 2) | → Stage D2: create an issue for the writer (7ed8686f) |

@@ -15,7 +15,9 @@ tools:
|
||||
- mcp__legal-ai__workflow_status
|
||||
- mcp__legal-ai__search_case_documents
|
||||
- mcp__legal-ai__search_precedent_library
|
||||
- mcp__legal-ai__search_internal_decisions
|
||||
- mcp__legal-ai__precedent_library_get
|
||||
- mcp__legal-ai__precedent_list
|
||||
- mcp__legal-ai__halacha_review
|
||||
---
|
||||
|
||||
@@ -145,6 +147,39 @@ tools:
- Is there a relevant personal precedent of hers? If so, was it referenced (saving / rejection / distinction)?
- **External case-law citations in block י**: for every citation (`citation` + `supporting_quote`) that appears, search `search_precedent_library` (with the relevant subject_tag) and verify that the citation exists in the corpus and that the halacha was approved. A citation that does not match an approved halacha = critical.

### 9. Attaching case law to the DB (`precedent_attach`): critical

For every case-law citation in block י (external or internal_committee), **there must be a record in `case_precedents`**, created through the researcher's `precedent_attach`.

**Check method:**
1. Run `precedent_list(case_number)` to get the list of all citations recorded in the DB.
2. Scan block י (and the threshold claims) and identify every case-law citation (citation + quote).
3. **For each citation**: verify that it appears in `precedent_list`. If it is missing → `qa = fail` (critical, blocks export). Report which citations were not recorded.
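
The comparison in step 3 amounts to a set difference between the citations found in the text and those returned by `precedent_list`. A minimal sketch follows; the helper names are hypothetical, and exact matching after whitespace normalization is an assumption (real citations may need looser matching).

```python
from typing import Iterable, List


def _normalize(citation: str) -> str:
    """Collapse runs of whitespace so trivial spacing differences match."""
    return " ".join(citation.split())


def unattached_citations(
    cited_in_block: Iterable[str], attached_in_db: Iterable[str]
) -> List[str]:
    """Citations that appear in the draft but have no DB record."""
    attached = {_normalize(c) for c in attached_in_db}
    return [c for c in cited_in_block if _normalize(c) not in attached]


def qa_verdict(cited_in_block: Iterable[str], attached_in_db: Iterable[str]) -> str:
    """'fail' if any cited precedent is missing from the DB list, else 'pass'."""
    return "fail" if unattached_citations(cited_in_block, attached_in_db) else "pass"
```

The list returned by `unattached_citations` is what the report in step 3 would enumerate.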

**Why this matters:** the DOCX exporter and the Hermes curator read from `case_precedents`. A citation that exists only in the text and not in the DB misses the at-export-time validation and the Hermes analysis.

### 10. Full source reference in citations: warning

For every case-law citation in block י, verify that it includes:
- **Full case number** (not just "פלוני נ' פלמוני")
- **Court instance** (Supreme / administrative / district / magistrates / appeals committee)
- **Date / `פורסם בנבו`** ("published in Nevo") or `פורסם ב-` ("published in")
- **`page_reference`** when it is a long quotation from a judgment

If one of the first three is missing → **`qa = warning`**; report to Chaim in a comment and suggest filling it in. (Not blocking: not every judgment has pagination.)
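
The warning rule in check 10 could be sketched as below. The dictionary keys (`case_number`, `court`, `published`) are hypothetical field names chosen for the example; the text only names the three required elements, not how a citation is represented.

```python
from typing import Dict, List

# Hypothetical keys for the three elements whose absence triggers a warning.
REQUIRED_FIELDS = ("case_number", "court", "published")


def missing_citation_fields(citation: Dict[str, str]) -> List[str]:
    """Return the required elements that are absent or blank."""
    return [f for f in REQUIRED_FIELDS if not citation.get(f, "").strip()]


def citation_qa(citation: Dict[str, str]) -> str:
    """'warning' if any required element is missing, otherwise 'ok'."""
    return "warning" if missing_citation_fields(citation) else "ok"
```

`page_reference` is left out of the required set on purpose, matching the note that its absence does not block.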

### 11. Case status validity (status_validity): sanity check

Check `case_get(case_number).status`; it should be one of the valid values. The overall flow:

```
new → proofread → documents_ready → analyst_verified → research_complete (legacy/optional)
    → outcome_set → direction_approved → analysis_enriched → ready_for_writing
    → drafted (you are here!) → qa_passed / qa_failed → exported
```

⚠️ **`research_complete` is a valid status** (not a bug, not a bare legacy value). In `legal-researcher.md` step 5 it is the status the researcher sets when research completes. If a case in this state is sent to you before `drafted`, report it; do not fail it.

#### Acceptance template (from `daphna-acceptance-architecture.md`, if outcome = acceptance)
- Is the reason for acceptance clear: internal defect / remand / amendments / substantive 8xxx / assessment?
- Does the chosen template (A/B/C/D/E) fit the reason?
@@ -165,6 +200,9 @@ tools:
| **Corpus queries** | **critical** | **blocks export** |
| Methodology | critical | blocks export |
| **Daphna's voice** | **critical** | **blocks export** |
| **Attaching case law to the DB** | **critical** | **blocks export** |
| Full source reference | warning | reported, not blocking |
| Status validity | sanity | report only |

## Workflow

@@ -21,6 +21,8 @@ tools:
|
||||
- mcp__legal-ai__precedent_list
|
||||
- mcp__legal-ai__precedent_search_library
|
||||
- mcp__legal-ai__search_precedent_library
|
||||
- mcp__legal-ai__internal_decision_upload
|
||||
- mcp__legal-ai__precedent_library_upload
|
||||
- mcp__legal-ai__precedent_library_get
|
||||
- mcp__legal-ai__precedent_library_list
|
||||
- mcp__legal-ai__precedent_extract_halachot
|
||||
@@ -28,6 +30,9 @@ tools:
|
||||
- mcp__legal-ai__precedent_process_pending
|
||||
- mcp__legal-ai__halacha_review
|
||||
- mcp__legal-ai__halachot_pending
|
||||
- mcp__legal-ai__missing_precedent_create
|
||||
- mcp__legal-ai__missing_precedent_list
|
||||
- mcp__legal-ai__missing_precedent_close
|
||||
- mcp__legal-ai__workflow_status
|
||||
---
|
||||
|
||||
@@ -70,6 +75,92 @@ tools:

Appeal briefs, answers, and responses are handled by the "legal analyst" agent.

## ⚠️ Required reading: which upload tool to use for each type of case law

When you upload case law to the authoritative corpus, **there are two different streams** and they are **not interchangeable**. A mistake here harms the whole system.

### Decision flowchart: which tool?

```
Does the citation start with "ערר" or "בל"מ" (an appeals-committee decision)?
├── Yes → internal_decision_upload ✅ (chair_name + district required)
└── No →
    Does it start with עע"מ / בר"מ / עמ"נ / בג"ץ / ע"א / ע"פ / רע"א / רע"פ / ת"א / ת"מ
    (a judgment of an administrative/Supreme/district/magistrates court)?
    ├── Yes → precedent_library_upload ✅ (external_upload)
    └── No → report to Chaim: unrecognized citation, do not upload
```

### Stream A: `precedent_library_upload` (external)

For judgments of judicial instances: Supreme (בג"ץ/ע"א/רע"א/ע"פ/רע"פ/דנ"א), administrative (עע"מ/בר"מ/עמ"נ), district (ת"א/ת"מ), magistrates.

```python
mcp__legal-ai__precedent_library_upload(
    file_path="/path/to/file.pdf",
    citation="עע\"מ 3911/19 פלוני נ' הוועדה המקומית רמת גן (פורסם בנבו, 12.07.2023)",
    case_name="פלוני נ' הוועדה המקומית רמת גן",
    court="בית המשפט העליון",
    decision_date="2023-07-12",
    practice_area="rishuy_uvniya",   # Axis B only
    subject_tags=["שימוש חורג", "מגרש מסחרי"],
)
```

**The tool stores `source_kind='external_upload'`.** Citation guard: if you try to upload a citation that starts with "ערר" or "בל\"מ", the tool **rejects** it with an error and points to `internal_decision_upload`.

### Stream B: `internal_decision_upload` (internal_committee), **mandatory for part of the case law**

For decisions of **appeals committees** from all districts (ירושלים, מרכז, תל אביב, צפון, דרום, חיפה, ארצי). Covers both a regular ערר and a בל"מ.

```python
mcp__legal-ai__internal_decision_upload(
    file_path="/path/to/file.pdf",
    case_number="ערר (ועדות ערר - תכנון ובנייה ירושלים) 1110/20",
    chair_name="שרית אריאלי",        # required!
    district="ירושלים",               # required! one of 7
    case_name="פלוני נ' הוועדה המקומית מודיעין",
    court="ועדת הערר לתכנון ובנייה — מחוז ירושלים",
    decision_date="2020-11-15",
    practice_area="rishuy_uvniya",    # Axis B
    appeal_subtype="building_permit",
    proceeding_type="ערר",            # 'ערר' / 'בל"מ', see below
    subject_tags=["שימוש חורג"],
    is_binding=False,                 # always False: horizontal persuasion, not binding
)
```

**Required fields (the tool rejects the call without them):**
- `file_path`
- `case_number`
- `chair_name`: without it, selective search by panel is impossible
- `district`: valid values are **ירושלים / מרכז / תל אביב / צפון / דרום / חיפה / ארצי** ("תל-אביב" with a hyphen is also accepted)

**Recommended field: `proceeding_type`:**
- `"ערר"`: main appeal proceeding (PDF title: "ערר (ועדות ערר ...) NNNN/YY")
- `'בל"מ'`: request for extension of time to file an appeal (title: "בל\"מ NNNN/YY" or subject "בקשה להארכת מועד להגשת ערר")
- Both types can share the same case number (for example, 8047/23 exists both as a ערר and as a בל"מ).
- The PDF's main title always states the type explicitly; read it from there rather than guessing.
- If left empty, the tool derives it automatically from appeal_subtype (`extension_request_*` → 'בל"מ') or from the text pattern. Explicit is better.

**The tool stores `source_kind='internal_committee'`.** The DB constraint `case_law_internal_district_check` enforces `district NOT NULL` for internal_committee records.

### If chair_name or district is missing from the PDF

- Search the text: "בפני: עו\"ד X" / "יו\"ר הוועדה: X" / "מחוז ירושלים" / the district name in the title
- If you cannot identify it, **do not guess**. Report to Chaim in a comment: "Found an appeal-decision PDF without a clear chair_name/district; manual entry required". Continue with the rest of the work.


### 2 parallel search layers

After the correct uploads:

| Tool | Purpose | When |
|-----|------|-----|
| `search_precedent_library` | Search **external** case law (Supreme/administrative/district) | Every central issue; mandatory |
| `search_internal_decisions` | Search **appeals-committee** decisions (all districts) | When the issue is procedural or there is no Supreme Court halacha |

Both accept the same filters: `practice_area` (Axis B), `subject_tag`, etc. `search_internal_decisions` additionally accepts `district` and `chair_name`.

## Workflow

### Step 1: Orientation
@@ -104,8 +195,8 @@ tools:
| Case classification | practice_area |
|------------|---------------|
| 1xxx (licensing and building) | `rishuy_uvniya` |
-| 8xxx (betterment levy) | `histael_hashbacha` |
-| 9xxx (compensation under s. 197) | `pitsuim_197` |
+| 8xxx (betterment levy) | `betterment_levy` |
+| 9xxx (compensation under s. 197) | `compensation_197` |

If the issue is covered by a known `appeal_subtype` (such as "שימוש חורג", "סטייה ניכרת"), add `appeal_subtype` to the filter.

@@ -136,7 +227,7 @@ search_precedent_library(
```
search_internal_decisions(
    query="...",
-    practice_area="histael_hashbacha",  # rishuy_uvniya / betterment_levy / compensation_197
+    practice_area="betterment_levy",    # rishuy_uvniya / betterment_levy / compensation_197
    district="ירושלים",                  # empty = all districts
    chair_name="",                       # empty = all chairs; "דפנה תמיר" = Daphna only (equivalent to search_decisions)
    limit=5
@@ -178,6 +269,42 @@ search_internal_decisions(

**Minimum:** queries to the authoritative corpus = number of central issues identified.

#### 2ב.4א. Locating a specific decision by name: protocol before declaring "not in the corpus" ⚠️

A case name alone (for example `"אגסי"`) is **not a reliable search key**. The semantic embedding and the lexical index are built on the content of the halacha/paragraph, so a name-only query may return decisions that **cite** the case rather than the case itself. Before declaring that a decision is not in the corpus:

1. **Add context to the query**: not `"אגסי"` but `"אגסי פטור 19(ג)(1) שתי דירות 140 מ"ר"`, or search by **case number** (`"ערר 81002-01-21"`).
2. **Search both corpora**: `search_precedent_library` **and** `search_internal_decisions`. ערר/בל"מ decisions uploaded by the chair are stored as `internal_committee` and surface in the internal search.
3. **To verify existence / browse**: `precedent_library_list(search="<name>", source_kind="all_committees")`. The default, `external_upload`, **hides** uploaded appeals-committee decisions; `all_committees` or `internal_committee` is required.
4. Only if **all** the attempts above come back empty, declare "not in the corpus" and move on to 2ב.5.

#### 2ב.5. Documenting missing case law (`missing_precedent_create`): mandatory

**When to call:** for every citation the parties brought (in an appeal brief / response / committee response) **that was not found in the corpus** after a structured search per protocol 2ב.4א (`search_precedent_library` + `search_internal_decisions` + `precedent_search_library`, including a query with context or a case number).

**Why this matters:**
- The writer knows not to rely on case law that is not in the DB ("claimed to exist" ≠ "verified")
- The chair sees on a dedicated page, `/missing-precedents`, what is awaiting upload and can close gaps with one click
- The history is preserved: we saw the citation, did not find it, waited for the upload, it was uploaded, it was closed

```python
mcp__legal-ai__missing_precedent_create(
    citation = "עע\"מ 1461/20 אנטרים אינווסטמנטס נ' הועדה המקומית ירושלים (נבו 4.5.2021)",
    case_number = "1017-03-26",            # the appeal case in which the party cited it
    cited_by_party = "permit_applicant",   # appellant/respondent/committee/permit_applicant/unknown
    cited_by_party_name = "לינדאב בע\"מ",
    legal_topic = "זכות עמידה",
    legal_issue = "זכות ערר על בקשה להיתר מוקנית רק לבעל זכות במקרקעין",
    claim_quote = "...the exact quotation from the pleading...",
    case_name = "אנטרים",                   # short name
    notes = "optional"
)
```

The tool deduplicates: an identical citation+case returns the existing record. If the citation has already been logged (even with status='closed' because the chair uploaded it in the meantime), do not create a duplicate.

**In the `precedent-research.md` document**, add a section `## ח. פסיקה חסרה בקורפוס` ("Case law missing from the corpus") listing the records created (including the returned id), so that the writer and QA can distinguish "verified from the corpus" from "reported only".

5. **Report** which precedents from the canon are relevant, which personal precedents were found, and which halachot from the authoritative corpus support the position.

### Step 3: Plan mapping

@@ -20,6 +20,7 @@ tools:
|
||||
- mcp__legal-ai__write_block
|
||||
- mcp__legal-ai__search_decisions
|
||||
- mcp__legal-ai__search_precedent_library
|
||||
- mcp__legal-ai__search_internal_decisions
|
||||
- mcp__legal-ai__precedent_library_get
|
||||
- mcp__legal-ai__precedent_library_list
|
||||
- mcp__legal-ai__halacha_review
|
||||
@@ -350,6 +351,28 @@ fi

Search by `practice_area` (rishuy_uvniya / betterment_levy / compensation_197) and by a relevant `subject_tag`. Halachot not approved by Daphna are not returned by the tool; if the search is empty, fall back to `search_decisions` only.

**Locating a decision by name:** if you are looking for a specific decision by its name (for example "אגסי"), do not search by the name alone. Add content terms or a case number (`"אגסי 19(ג)(1) 140 מ"ר"` / `"ערר 81002-01-21"`). A name-only query may return whoever cites the decision rather than the decision itself.

### ⚠️ Phrasing case-law citations in the decision's voice, by `source_kind`

Every record in the corpus carries a `source_kind` (see the output of `precedent_library_get` / `search_precedent_library` / `search_internal_decisions`). The phrasing in block י **changes by type**: not only the quotation, but also the **rhetorical role** of the judgment in the reasoning:

| source_kind | Source | Standing | Phrasing template in block י |
|-------------|------|------|----------------------|
| `external_upload` | Court (Supreme/administrative/district/magistrates) | **Authoritative: binding or highly persuasive** | "בהתאם להלכת **X** ב-עע\"מ NNNN/YY, נקבע כי..." / "כפי שהבהיר בית המשפט העליון ב-בג\"ץ NNN/YY, '...'" |
| `internal_committee` (other) | Another appeals committee | **Horizontal persuasion only, not binding** | "כפי שנקבע על-ידי כב' היו\"ר **Y** במחוז Z בערר NNNN/YY, '...'. סוגיה זו עלתה בפנינו, ואנו מסכימים עם הניתוח הנ\"ל..." |
| `internal_committee` by Daphna herself | An earlier decision by Daphna | **Self-consistency (personal jurisprudence)** | "כפי שקבעתי בעבר בערר NNNN/YY, '...'. אין מקום לסטות מכך גם בעניין שלפנינו." (personal voice, "אנחנו"/"אני", per what appears in the source corpus) |

**CREAC principle (Rule + Explanation):**
- **Rule**: only from `external_upload` (court case law) or from legislation. **Never** present another appeals committee as a "binding rule".
- **Explanation (expansion/persuasion)**: `internal_committee` can fit here, but **separately** from the rule, as additional persuasion.
- **If there is no Supreme Court halacha** and only a supporting appeals-committee decision exists, phrase it: "לעת הזו, סוגיה זו טרם נדונה בערכאות עליונות. עם זאת, כפי שנקבע ב<ערר>... מצאנו את ההנמקה משכנעת ואנו מאמצים אותה."

**Check before you write a citation:**
1. Take the `source_kind` from the output of `search_precedent_library` or `search_internal_decisions`.
2. If it is `internal_committee`, check `chair_name`. If it is דפנה תמיר → the "כפי שקבעתי בעבר" style. Otherwise → the horizontal style, noting the district.
3. Do not mix them: three different categories, three different templates.
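
The three-way choice in this check can be sketched as a small selector. The style labels and the function name are hypothetical; the decision rule itself (external court vs. another committee vs. Daphna's own earlier decision) follows the table above.

```python
# Chair whose own earlier decisions use the first-person template.
OWN_CHAIR = "דפנה תמיר"


def citation_style(source_kind: str, chair_name: str = "") -> str:
    """Pick the phrasing category for a corpus record.

    Returns one of three illustrative labels:
    'authoritative', 'horizontal_persuasion', 'self_consistency'.
    """
    if source_kind == "external_upload":
        return "authoritative"
    if source_kind == "internal_committee":
        if chair_name.strip() == OWN_CHAIR:
            return "self_consistency"
        return "horizontal_persuasion"
    # Unknown kinds are surfaced rather than silently phrased as binding.
    raise ValueError(f"unknown source_kind: {source_kind!r}")
```

Raising on an unknown `source_kind` keeps a mislabeled record from being phrased as a binding rule by default.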

### Anti-patterns: post-writing check (mandatory)

- [ ] **No numbered lists inside a paragraph** (`(1)... (2)... (3)...`); Daphna never uses them

@@ -1,5 +1,7 @@
|
||||
data/
|
||||
.claude/
|
||||
!.claude/agents/
|
||||
!.claude/agents/hermes-curator.md
|
||||
mcp-server/.venv/
|
||||
**/__pycache__/
|
||||
*.pyc
|
||||
@@ -11,7 +13,11 @@ scripts/
|
||||
skills/
|
||||
!skills/docx/
|
||||
!skills/docx/decision_template.docx
|
||||
!skills/decision/
|
||||
!skills/decision/SKILL.md
|
||||
docs/
|
||||
!docs/legal-decision-lessons.md
|
||||
!docs/corpus-analysis.md
|
||||
legacy/
|
||||
node_modules/
|
||||
.next/
|
||||
|
||||
@@ -1146,7 +1146,7 @@
|
||||
"description": "After deploy: PATCH 403-17 to set case_name='ערר 403/17', then trigger precedent_extract_halachot to test the dual-mode extraction on a non-binding committee decision.",
|
||||
"details": "",
|
||||
"testStrategy": "",
|
||||
"status": "pending",
|
||||
"status": "done",
|
||||
"dependencies": [
|
||||
"9",
|
||||
"10",
|
||||
@@ -1154,7 +1154,8 @@
|
||||
"12"
|
||||
],
|
||||
"priority": "medium",
|
||||
"subtasks": []
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T10:38:07.071897Z"
|
||||
},
|
||||
{
|
||||
"id": "14",
|
||||
@@ -1325,13 +1326,14 @@
|
||||
"description": "ה-plugin שלנו: @paperclipai/plugin-sdk@^2026.325.0, apiVersion: 1, minimumHostVersion: 2026.325.0. ה-host: 2026.428.0. ייתכן capabilities חדשות (issue.interactions.create, וכו').",
|
||||
"details": "פעולה (Phase 4 — אחרי שדרוג Paperclip stable):\n1. cd /home/chaim/plugin-legal-ai && npm view @paperclipai/plugin-sdk version\n2. אם חדשה: npm install @paperclipai/plugin-sdk@latest\n3. קריאת adapter-plugin.md המעודכן ב-paperclip repo\n4. בדיקה אם apiVersion: 2 קיים\n5. הוספת capabilities חדשות אם רלוונטי (בעיקר issue.interactions.create אחרי gap #4)\n6. npm run build && reinstall plugin\n\nתלוי בgap #19 (interactions API) — אם אנחנו רוצים שהplugin יוכל ליצור interactions, חייב capability חדש.",
|
||||
"testStrategy": "אחרי npm install: בדיקה ש-plugin עולה ב-Paperclip בלי last_error. SELECT status, last_error FROM plugins WHERE plugin_key='marcusgroup.legal-ai'.",
|
||||
"status": "pending",
|
||||
"status": "done",
|
||||
"dependencies": [
|
||||
"27",
|
||||
"19"
|
||||
],
|
||||
"priority": "low",
|
||||
"subtasks": []
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T12:19:16.180163Z"
|
||||
},
|
||||
{
|
||||
"id": "27",
|
||||
@@ -1339,10 +1341,11 @@
|
||||
"description": "כרגע אנחנו על 2026.428.0 — הגרסה היציבה האחרונה. כשיופיע stable חדש (כנראה 2026.5xx.x), לבצע שדרוג מבוקר.",
|
||||
"details": "טריגר: npm view paperclipai dist-tags.latest מחזיר משהו ≠ 2026.428.0.\n\nפעולה:\n1. קריאת releases/v2026.5xx.x.md ב-GitHub\n2. בדיקת שינויים שעלולים להשפיע (CUSTOMIZATIONS.md סעיפים: hebrew, RTL, plugin driver, heartbeat)\n3. גיבוי: pg_dump של paperclip DB + cp -r ~/.npm/_npx/43414d9b790239bb /tmp/\n4. pm2 stop paperclip\n5. rm -rf ~/.npm/_npx/43414d9b790239bb\n6. npx paperclipai@latest run (יוריד גרסה חדשה)\n7. הרצה מחדש: ~/.paperclip/hebrew/apply-hebrew.sh && ~/.paperclip/issue-link-fix/apply-issue-link-fix.sh\n8. pm2 restart paperclip\n9. בדיקה ב-pc.nautilus.marcusgroup.org: עברית + plugin פעיל + סוכן מתעורר על comment\n\nתלוי בלי dependencies (יכול להיות מבוצע בכל עת אחרי שיש stable חדש).",
|
||||
"testStrategy": "אחרי שדרוג: cat ~/.npm/_npx/43414d9b790239bb/node_modules/paperclipai/package.json | grep version → גרסה חדשה. UI עברית. test wakeup על issue.",
|
||||
"status": "pending",
|
||||
"status": "done",
|
||||
"dependencies": [],
|
||||
"priority": "low",
|
||||
"subtasks": []
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T12:19:16.180163Z"
|
||||
},
|
||||
{
|
||||
"id": "28",
|
||||
@@ -1371,13 +1374,639 @@
|
||||
"priority": "medium",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-04T17:29:25.686Z"
|
||||
},
|
||||
{
|
||||
"id": "30",
"title": "Fix 3 bugs in the catalog (practice_area + source_kind + upload route)",
"description": "CRITICAL: 3 sub-bugs. (a) Case creation tags practice_area='appeals_committee' instead of rishuy_uvniya/betterment_levy/compensation_197 by case-number prefix (1xxx/8xxx/9xxx): audit + migration for all existing cases + fix the creation path. (b) Every appeals-committee decision uploaded to case_law is marked source_kind='external_upload' instead of 'internal_committee': audit case_law for case_number starting with 'ערר' → reclassify + backfill chair_name + district retroactively. (c) POST /api/precedent-library/upload does not distinguish between the two; fix: route by citation prefix (ערר/ועדות ערר → internal_committee, otherwise → external_upload).",
"details": "See plan /home/chaim/.claude/plans/3-glimmering-oasis.md part א, task #1. Pre-requirement: delete and rerun the halachot after changing source_kind. Use the pattern from internal_decisions.py (dry_run+log_action).",
"testStrategy": "",
|
||||
"status": "done",
|
||||
"dependencies": [],
|
||||
"priority": "high",
|
||||
"subtasks": [
|
||||
{
|
||||
"id": 1,
|
||||
"title": "Audit + migration practice_area (1xxx→rishuy_uvniya, 8xxx→betterment_levy, 9xxx→compensation_197)",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 2,
|
||||
"title": "Audit + reclassify case_law source_kind external_upload → internal_committee עבור 'ערר' prefix",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 3,
|
||||
"title": "Delete + re-extract halachot עבור רשומות שעברו reclassification",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 4,
|
||||
"title": "תיקון נתיב יצירת תיק לתיוג practice_area נכון מההתחלה",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 5,
|
||||
"title": "תיקון /api/precedent-library/upload לניתוב לפי תחילית הציטוט",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 6,
|
||||
"title": "מבחני רגרסיה לכל 3 הbaגים",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 7,
|
||||
"title": "תיקון MCP `case_update` + API `PUT /api/cases/{case_number}` לתמוך בעדכון practice_area + appeal_subtype",
|
||||
"status": "done",
|
||||
"details": "התגלה ב-26/05/2026: MCP tool case_update והAPI לא מקבלים את השדה practice_area, ולכן אי-אפשר לתקן תיוג שגוי דרך הממשק. נאלצתי לעדכן ידנית ב-SQL. צריך להוסיף את השדות ל-CaseUpdateRequest ב-web/app.py וב-cases_tools.case_update בmcp-server.",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 8,
|
||||
"title": "[prevention] DB CHECK constraints: source_kind='internal_committee' ⇒ chair_name NOT NULL; cases.practice_area enum",
|
||||
"status": "done",
|
||||
"description": "מיגרציה: ALTER TABLE case_law ADD CONSTRAINT chair_required_for_internal CHECK (source_kind <> 'internal_committee' OR (chair_name IS NOT NULL AND chair_name <> '')); וכן CHECK על cases.practice_area לערכים תקינים. חייב לרוץ אחרי subtask #2 (backfill) אחרת constraint creation ייכשל.",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 9,
|
||||
"title": "[prevention] Unify practice_area taxonomy — מיפוי או מחיקה של appeals_committee מ-practice_area.py",
|
||||
"status": "done",
|
||||
"description": "ב-mcp-server/src/legal_mcp/services/practice_area.py:21 יש PRACTICE_AREAS={appeals_committee,national_insurance,labor_law} שסותר את ה-DB constraint של case_law (rishuy_uvniya/betterment_levy/compensation_197). grep מקיף לכל caller של 'appeals_committee'; להחליף במיפוי מפורש או למחוק.",
|
||||
"parentId": "undefined"
|
||||
}
|
||||
],
|
||||
"updatedAt": "2026-05-26T08:35:22.762800Z"
|
||||
},
|
||||
{
|
||||
"id": "31",
|
||||
"title": "מיצוי chair_name + district בהעלאת ועדת ערר",
|
||||
"description": "תוספת לטופס + חילוץ אוטומטי מהציטוט/text הפסיקה. רטרואקטיבי לכל הרשומות הקיימות עם source_kind='internal_committee' שהשדות בהן ריקים.",
|
||||
"details": "ראה תוכנית /home/chaim/.claude/plans/3-glimmering-oasis.md חלק א משימה #2. תלוי במשימה #30 (sub-bug ב).",
|
||||
"testStrategy": "",
|
||||
"status": "done",
|
||||
"dependencies": [
|
||||
"30"
|
||||
],
|
||||
"priority": "high",
|
||||
"subtasks": [
|
||||
{
|
||||
"id": 1,
|
||||
"title": "Backfill chair_name + district לכל 7 הרשומות החסרות (LLM extraction)",
|
||||
"status": "done",
|
||||
"description": "psql query: SELECT id, case_number FROM case_law WHERE source_kind='internal_committee' AND (chair_name IS NULL OR chair_name=''); לכל אחת — חילוץ ע\"י precedent_metadata_extractor.extract_and_apply.",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 2,
|
||||
"title": "[prevention] Validation: chair_name+district required ב-internal_decisions_upload (API+MCP)",
|
||||
"status": "done",
|
||||
"description": "ב-web/app.py:4607-4680 כיום chair_name/district = Form(\"\") (default ריק). שנה ל-required עם validation שדוחה ריק כשsource_kind='internal_committee'. הוסף enum של 6 ערכי district (ירושלים/מרכז/תל אביב/צפון/דרום/ארצי).",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 3,
|
||||
"title": "[prevention] UI dropdown ל-district בטופס העלאת החלטות ועדה (web-ui)",
|
||||
"status": "done",
|
||||
"description": "במקום free-text — Select של 6 הערכים. גם בטופס חיפוש (search_internal_decisions).",
|
||||
"parentId": "undefined"
|
||||
}
|
||||
],
|
||||
"updatedAt": "2026-05-26T08:35:22.762800Z"
|
||||
},
|
||||
{
|
||||
"id": "32",
|
||||
"title": "UI — דף עריכת פסיקה ייפתח רחב-במרכז (לא צר-בצד)",
|
||||
"description": "חוסר נוחות בעריכה. שינוי ה-Dialog/Sheet ל-Modal רחב מרכזי. רלוונטי גם להוספת שדות chair_name + district מהמשימה #31.",
|
||||
"details": "ראה תוכנית /home/chaim/.claude/plans/3-glimmering-oasis.md חלק א משימה #3.",
|
||||
"testStrategy": "",
|
||||
"status": "done",
|
||||
"dependencies": [],
|
||||
"priority": "medium",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T10:38:07.071897Z"
|
||||
},
|
||||
{
|
||||
"id": "33",
|
||||
"title": "UI — הסתרת עמודת 'שם' (case_name) בדף רשימת פסיקה",
|
||||
"description": "רוב הערכים זהים למספר התיק. להסתיר את העמודה ב-UI, לשמור עמודה ב-DB לשימוש עתידי.",
|
||||
"details": "ראה תוכנית /home/chaim/.claude/plans/3-glimmering-oasis.md חלק א משימה #4.",
|
||||
"testStrategy": "",
|
||||
"status": "done",
|
||||
"dependencies": [],
|
||||
"priority": "low",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T11:27:09.039154Z"
|
||||
},
|
||||
{
|
||||
"id": "34",
|
||||
"title": "חילוץ ציטוטי-פנים מהחלטות דפנה (internal_committee + ירושלים)",
|
||||
"description": "פאטרן: 'ונפנה להחלטות של ועדת ערר זו...', 'כפי שקבעתי בערר X', 'בדומה לעמדתי בהחלטה Y'. חילוץ אוטומטי + שמירה ב-precedent_internal_citations table שיאפשר ל-search_internal_decisions להחזיר גם את הפסיקה המוזכרת.",
|
||||
"details": "ראה תוכנית /home/chaim/.claude/plans/3-glimmering-oasis.md חלק א משימה #5. תלוי במשימה #30 (sub-bug ב) ובמשימה #31.",
|
||||
"testStrategy": "",
|
||||
"status": "done",
|
||||
"dependencies": [
|
||||
"30",
|
||||
"31"
|
||||
],
|
||||
"priority": "medium",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T10:38:07.071897Z"
|
||||
},
|
||||
{
|
||||
"id": "35",
|
||||
"title": "דף/דוח 'פסיקה חסרה בקורפוס' — פיצ'ר מלא",
|
||||
"description": "טבלת DB missing_precedents (id, citation, case_name, cited_in_case_id, cited_in_document_id, cited_by_party, legal_topic, legal_issue, claim_quote, status, linked_case_law_id, closed_at, created_at), API endpoints (POST/GET/upload/PATCH), MCP tools (missing_precedent_create/list/close), דף Next.js /missing-precedents, הוק אוטומטי במחקר ע\"י legal-researcher.",
|
||||
"details": "ראה תוכנית /home/chaim/.claude/plans/3-glimmering-oasis.md חלק א משימה #6.",
|
||||
"testStrategy": "",
|
||||
"status": "done",
|
||||
"dependencies": [],
|
||||
"priority": "high",
|
||||
"subtasks": [
|
||||
{
|
||||
"id": 1,
|
||||
"title": "Migration + model missing_precedents",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 2,
|
||||
"title": "API endpoints POST/GET/upload/PATCH /api/missing-precedents",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 3,
|
||||
"title": "MCP tools missing_precedent_create/list/close",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 4,
|
||||
"title": "Next.js page /missing-precedents עם list + detail + upload form",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 5,
|
||||
"title": "Auto-creation hook במחקר (legal-researcher יוצר רשומה כשמזהה ציטוט חסר)",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 6,
|
||||
"title": "Webhook עדכון לפלאגין Paperclip + Comment לחיים",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
}
|
||||
],
|
||||
"updatedAt": "2026-05-26T08:35:22.762800Z"
|
||||
},
|
||||
{
|
||||
"id": "36",
|
||||
"title": "כינוס פרופוזיציות לטיעונים משפטיים אמיתיים (de-dup/aggregation)",
|
||||
"description": "extract_claims מחזיר ~60 פרופוזיציות לתיק, צריך לאגד ל-~10 טיעונים משפטיים אמיתיים. טבלה חדשה legal_arguments + טבלת קישור legal_argument_propositions (M:M ל-case_claims). LLM aggregation job (Hermes/DeepSeek). API + MCP + UI display שמציג 'X טיעונים משפטיים' במקום 'Y טענות'.",
|
||||
"details": "ראה תוכנית /home/chaim/.claude/plans/3-glimmering-oasis.md חלק א משימה #7.",
|
||||
"testStrategy": "",
|
||||
"status": "done",
|
||||
"dependencies": [],
|
||||
"priority": "high",
|
||||
"subtasks": [
|
||||
{
|
||||
"id": 1,
|
||||
"title": "Migration + models legal_arguments + legal_argument_propositions",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 2,
|
||||
"title": "LLM aggregation job (Hermes/DeepSeek profile)",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 3,
|
||||
"title": "API + MCP tool aggregate_claims_to_arguments",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 4,
|
||||
"title": "UI display update — case detail page מציג טיעונים אמיתיים",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 5,
|
||||
"title": "Backfill לכל התיקים הקיימים",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
}
|
||||
],
|
||||
"updatedAt": "2026-05-26T08:35:22.762800Z"
|
||||
},
|
||||
{
|
||||
"id": "37",
|
||||
"title": "הפרדת תתי-סוגי בל\"מ לפי practice_area",
|
||||
"description": "3 ערכי appeal_subtype חדשים: extension_request_building_permit (1xxx, ס'152 - 30 ימים), extension_request_betterment_levy (8xxx, ס'14 לתוספת ג' - 45 ימים), extension_request_compensation (9xxx, ס'198(ד) - 30 ימים). 3 templates מתודולוגיים נפרדים. אוטו-זיהוי מהsubject. UI badge + filter.",
|
||||
"details": "ראה תוכנית /home/chaim/.claude/plans/3-glimmering-oasis.md חלק א משימה #8. Pre-requirement: עדכון mcp-server/src/legal_mcp/services/practice_area.py APPEALS_COMMITTEE_SUBTYPES + עדכון web/paperclip_client.py mapping appeal_subtype → company.",
|
||||
"testStrategy": "",
|
||||
"status": "done",
|
||||
"dependencies": [],
|
||||
"priority": "high",
|
||||
"subtasks": [
|
||||
{
|
||||
"id": 1,
|
||||
"title": "הוספת 3 ערכי enum ל-practice_area.py APPEALS_COMMITTEE_SUBTYPES",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 2,
|
||||
"title": "כתיבת 3 templates מתודולוגיים ב-docs/methodology/extension-request-{type}.md",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 3,
|
||||
"title": "אוטו-זיהוי בקוד יצירת תיק (subject='בקשה להארכת מועד' → קביעת subtype לפי practice_area)",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 4,
|
||||
"title": "UI badge + filter ייעודי לבל\"מ",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 5,
|
||||
"title": "עדכון web/paperclip_client.py mapping ל-company עבור 3 הערכים החדשים",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
}
|
||||
],
|
||||
"updatedAt": "2026-05-26T08:35:22.762800Z"
|
||||
},
|
||||
{
|
||||
"id": "38",
|
||||
"title": "שדרוג סוכני Paperclip להכרת השינויים מ-#30-#37",
|
||||
"description": "עדכון 7 הגדרות סוכן (CEO/analyst/researcher/writer/QA/proofreader/exporter) + HEARTBEAT.md לזיהוי המבנים החדשים. בלי זה כל הפיצ'רים נשארים זמינים אבל הסוכנים לא יודעים להשתמש בהם. כולל הוספת research_complete כ-valid case_status.",
|
||||
"details": "ראה תוכנית /home/chaim/.claude/plans/3-glimmering-oasis.md חלק א משימה #9. תלוי במשימות #30-#37.",
|
||||
"testStrategy": "",
|
||||
"status": "done",
|
||||
"dependencies": [
|
||||
"30",
|
||||
"31",
|
||||
"34",
|
||||
"35",
|
||||
"36",
|
||||
"37"
|
||||
],
|
||||
"priority": "high",
|
||||
"subtasks": [
|
||||
{
|
||||
"id": 1,
|
||||
"title": "עדכון .claude/agents/legal-ceo.md — routing + statuses + wake reasons",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 2,
|
||||
"title": "עדכון .claude/agents/legal-analyst.md — practice_area, legal_arguments, בל\"מ detection",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 3,
|
||||
"title": "עדכון .claude/agents/legal-researcher.md — 2 layers, missing_precedents, citations, בל\"מ templates",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 4,
|
||||
"title": "עדכון .claude/agents/legal-writer.md — legal_arguments view, בל\"מ templates",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 5,
|
||||
"title": "עדכון .claude/agents/legal-qa.md — בל\"מ-aware validation",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 6,
|
||||
"title": "עדכון .claude/agents/HEARTBEAT.md — כללי routing משותפים + research_complete status",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 7,
|
||||
"title": "סנכרון לכל החברות CMPA mirror — sync_agents_across_companies.py",
|
||||
"status": "done",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 8,
|
||||
"title": "[alignment] researcher docs: דרישה מפורשת שכל 'ערר' → internal_decision_upload, לעולם לא precedent_library_upload",
|
||||
"status": "done",
|
||||
"description": "בלגל-researcher.md: דוגמת קוד מפורשת + flowchart החלטה: לפי תחילית הציטוט. הסבר על השלילה של precedent_library_upload כשמדובר ב-ערר. תלוי במשימה #39 (MCP tool חדש).",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 9,
|
||||
"title": "[alignment] analyst docs: הסבר על 2 taxonomies של practice_area + מתי משתמשים בכל אחת",
|
||||
"status": "done",
|
||||
"description": "בלגל-analyst.md: טבלה ברורה — practice_area (case_law) vs practice_area (cases). מתי להעביר rishuy_uvniya ומתי appeals_committee. אחרי משימה #30.9 (taxonomy unification) — סביר שזה ייפשט.",
|
||||
"parentId": "undefined"
|
||||
},
|
||||
{
|
||||
"id": 10,
|
||||
"title": "[alignment] writer docs: הבחנה בין source_kind בציטוט (binding vs persuasive)",
|
||||
"status": "done",
|
||||
"description": "בלגל-writer.md: 'החלטת ועדת ערר אחרת ⇒ עקביות אופקית, לא הלכה מחייבת'. 'פס\"ד עליון/מנהלי ⇒ סמכותי בינדינג'. דוגמאות פרזיולוגיה מ-skills/decision/SKILL.md.",
|
||||
"parentId": "undefined"
|
||||
}
|
||||
],
|
||||
"updatedAt": "2026-05-26T07:41:47.880478Z"
|
||||
},
|
||||
{
|
||||
"id": "39",
|
||||
"title": "[ROOT CAUSE] MCP tool חדש: internal_decision_upload",
|
||||
"description": "הוספת @mcp.tool() עם chair_name+district חובה ו-source_kind='internal_committee' אוטומטי. סוגר את ה-root cause של Bug (ב) ב-#30. בלעדיו 44 רשומות חדשות יחזרו כ-external_upload תוך חודש.",
|
||||
"details": "מיקום: mcp-server/src/legal_mcp/tools/internal_decisions.py (אם לא קיים — ליצור). רישום ב-server.py סביב שורה 169 (ליד precedent_library_upload). הקריאה מנותבת ל-int_decisions_service.ingest_internal_decision (קיים ב-internal_decisions.py).",
|
||||
"testStrategy": "1. שלח tool call ללא chair_name → JSON error. 2. שלח עם chair+district → רשומה נוצרת עם source_kind='internal_committee'. 3. precedent_chunks נוצרים עם source_kind ירש.",
|
||||
"status": "done",
|
||||
"dependencies": [
|
||||
"30"
|
||||
],
|
||||
"priority": "high",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T07:41:37.260868Z"
|
||||
},
|
||||
{
|
||||
"id": "40",
|
||||
"title": "[שלב B - ROI מיידי] הפעלת VOYAGE_RERANK_ENABLED=true ב-Coolify",
|
||||
"description": "Cross-encoder rerank-2 ממומש ב-mcp-server/src/legal_mcp/services/rerank.py אבל כבוי בייצור (default=false). POC הוכיח +4.5% mean@3 ו-+11.6% practical queries (latency +702ms acceptable לזרימה האסינכרונית). 5 דקות עבודה — env change ב-Coolify.",
|
||||
"details": "mcp__coolify__env_vars set VOYAGE_RERANK_ENABLED=true. ראה web/mcp_env_catalog.py:71-72 לdescription. אופציה: rampup רק על search_precedent_library (לא על find_similar_cases — latency-sensitive).",
|
||||
"testStrategy": "search_precedent_library('היקף הסמכות') לפני/אחרי — לראות שינוי ב-ordering. בדוק שלא יותר מ-1000ms latency.",
|
||||
"status": "done",
|
||||
"dependencies": [],
|
||||
"priority": "high",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T08:08:27.953285Z"
|
||||
},
|
||||
{
|
||||
"id": "41",
|
||||
"title": "[שלב B] BM25/tsvector hybrid retrieval על precedent_chunks + halachot",
|
||||
"description": "כיום כל החיפוש הוא 100% dense (cosine). ציטוטים מספריים ('עע\"מ 1461/20') נכשלים כי semantic לא מצליח בהם. הוספת tsvector GIN + RRF merge dense+lexical = +15-25% recall על ציטוטים — קריטי לאימות פסיקה ב-3-glimmering-oasis שלב 3.",
|
||||
"details": "ALTER TABLE precedent_chunks ADD COLUMN content_tsv tsvector GENERATED ALWAYS AS (to_tsvector('simple', content)) STORED; CREATE INDEX ... USING gin (content_tsv). באותו אופן על halachot.rule_statement. ב-db.py:2357 (search_precedent_library_semantic) — להוסיף שאילתה מקבילה של websearch_to_tsquery → RRF merge עם cosine. אזהרה: postgres אינו תומך ב-'hebrew' config — simple config יעבוד אבל בלי stemming.",
|
||||
"testStrategy": "שאילתה 'עע\"מ 1461/20' לפני/אחרי. לפני: 0 hits. אחרי: 1 hit מדויק על הציטוט. וגם — שאילתות סמנטיות לא מאבדות recall.",
|
||||
"status": "done",
|
||||
"dependencies": [
|
||||
"40"
|
||||
],
|
||||
"priority": "high",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T08:08:27.953285Z"
|
||||
},
|
||||
{
|
||||
"id": "42",
|
||||
"title": "[שלב B] Query expansion via Claude Haiku — 2-3 variants per query",
|
||||
"description": "שאילתות עם abbreviations משפטיות ('בל\"מ'/'בקשה להארכת מועד') חוטפות recall. LLM expansion: שאילתה → 2-3 variants → union retrieval. +10-15% recall.",
|
||||
"details": "להוסיף שכבה ב-search_precedent_library_semantic: אם query מכיל abbreviations נפוצים (mapping פנימי) — להריץ Haiku להרחבה. cache תוצאות לפי query hash (Redis TTL 24h).",
|
||||
"testStrategy": "Eval על 20 שאילתות עם abbreviations: לפני/אחרי recall@10. צפוי +10-15%.",
|
||||
"status": "deferred",
|
||||
"dependencies": [
|
||||
"41"
|
||||
],
|
||||
"priority": "medium",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T08:08:27.953285Z"
|
||||
},
|
||||
{
|
||||
"id": "43",
|
||||
"title": "[שלב B] MMR / diversity penalty — limit 2 chunks per case_law_id",
|
||||
"description": "תוצאות חיפוש דומות מאוד זו לזו (אותה פסיקה, chunks סמוכים) — פסיקות חוזרות תופסות slots → diversity@10 נמוך. הוספת cap per case_law_id (2-3 max) או MMR אמיתי.",
|
||||
"details": "פתרון קל: SQL DISTINCT ON (case_law_id) + 2 בpost-processing. פתרון איכותי: MMR — לכל candidate, score = λ*relevance - (1-λ)*max_similarity_to_selected. λ=0.7 דיפולט.",
|
||||
"testStrategy": "כל שאילתה ב-top-10: <= 2 chunks per case_law_id. diversity score עולה.",
|
||||
"status": "done",
|
||||
"dependencies": [
|
||||
"40"
|
||||
],
|
||||
"priority": "medium",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T08:08:27.953285Z"
|
||||
},
|
||||
{
|
||||
"id": "44",
|
||||
"title": "[שלב B] HNSW migration (or lists=68 IVFFlat) + REINDEX",
|
||||
"description": "IVFFlat lists=50 עם 4,595 vectors — sub-optimal. sqrt(4595)≈68. HNSW עדיף ל-recall (אבל יותר זיכרון). שיפור +3-5% recall@10.",
|
||||
"details": "אופציה 1: REINDEX עם lists=68 (פשוט, idempotent). אופציה 2: DROP+CREATE עם HNSW (m=16, ef_construction=64) — דורש pgvector ≥0.5 ובדיקת זמן build. בדוק SELECT extversion FROM pg_extension WHERE extname='vector'.",
|
||||
"testStrategy": "EXPLAIN ANALYZE לפני/אחרי על 5 שאילתות מייצגות. בנצ'מרק recall@10 על blind set.",
|
||||
"status": "done",
|
||||
"dependencies": [],
|
||||
"priority": "medium",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T08:08:27.953285Z"
|
||||
},
|
||||
{
|
||||
"id": "45",
|
||||
"title": "[שלב B] Halacha auto-approve sweep — בדיקת 219 pending + הורדת סף ל-0.78",
|
||||
"description": "219 halachot pending review (17%) חסומות מ-search. אם dafna לא מסקר ידנית — הם מתבזבזים. dashboard batch + הורדת auto-approve threshold.",
|
||||
"details": "1. בדוק 20 דגימות אקראיות של pending — אם רובן ראויות לאישור, הורד HALACHA_AUTO_APPROVE_THRESHOLD מ-0.80 ל-0.78. 2. הוסף UI batch approval ב-/halachot עם filter pending+confidence>0.75. 3. one-shot SQL לאישור 200 halachot שעמדו בקריטריונים החדשים.",
|
||||
"testStrategy": "אחרי sweep: halachot approved יעלה מ-1,064 ל-~1,260. search_precedent_library יחזיר יותר rule-level results.",
|
||||
"status": "done",
|
||||
"dependencies": [],
|
||||
"priority": "medium",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T08:08:27.953285Z"
|
||||
},
|
||||
{
|
||||
"id": "46",
|
||||
"title": "[שלב B] Dynamic halacha boost — לפי query-rule similarity",
|
||||
"description": "כיום halacha boost = +0.05 קבוע. דינמי לפי query similarity ירוץ דייקנות (5% precision על שאילתות ספציפיות).",
|
||||
"details": "ב-db.py:2479 — score = float(d['score']) + 0.05. החלף ב-boost = 0.10 * d['score'] (proportional). או — אם rerank ON, השתמש בrerank score כbaseline (אין צורך ב-boost כלל).",
|
||||
"testStrategy": "A/B test על 10 שאילתות: precision@5 לפני/אחרי.",
|
||||
"status": "done",
|
||||
"dependencies": [
|
||||
"40"
|
||||
],
|
||||
"priority": "low",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T08:08:27.953285Z"
|
||||
},
|
||||
{
|
||||
"id": "47",
|
||||
"title": "[שלב C - prevention] Audit script periodic: detect new external_upload עם case_number של ערר",
|
||||
"description": "Drift detection: שגיאה דומה ל-Bug (ב) יכולה לחזור בעתיד. periodic check (יומי?) + alert ל-Slack/comment.",
|
||||
"details": "scripts/audit_corpus_consistency.py — בודק: 1. case_law WHERE source_kind='external_upload' AND case_number ~ '^ערר|^ARAR'. 2. case_law WHERE source_kind='internal_committee' AND chair_name IS NULL. הרצה דרך cron או scheduled task ב-Paperclip.",
|
||||
"testStrategy": "להריץ אחרי כל העלאה חדשה (וגם פעם ביום). אם מוצא drift — comment ב-Paperclip ל-CEO.",
|
||||
"status": "done",
|
||||
"dependencies": [
|
||||
"30",
|
||||
"39"
|
||||
],
|
||||
"priority": "low",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T11:27:09.039154Z"
|
||||
},
|
||||
{
|
||||
"id": "48",
|
||||
"title": "[שלב C] Parent-doc retrieval (child=300, parent=1500 tokens)",
|
||||
"description": "chunk_size=600 חותך חלק מהלכות ארוכות. parent-doc: חיפוש על child קטן (300 tokens), החזרת parent גדול (1500 tokens) ל-LLM context.",
|
||||
"details": "מיגרציה DB: precedent_chunks.parent_chunk_id (FK self). chunking pipeline משתנה ל-2 רמות. retrieval: SELECT distinct parent_chunk WHERE child_chunk matches.",
|
||||
"testStrategy": "Eval: writer מקבל context מלא יותר, לא חתוך באמצע משפט/ציטוט.",
|
||||
"status": "done",
|
||||
"dependencies": [
|
||||
"41"
|
||||
],
|
||||
"priority": "low",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T11:27:09.039154Z"
|
||||
},
|
||||
{
|
||||
"id": "49",
|
||||
"title": "[שלב C] Multimodal backfill ל-77 רשומות שנותרו",
|
||||
"description": "כיום 40/117 precedent_image_embeddings (34%). 77 רשומות נותרו ללא image embeddings. ערך נמוך כשהמסמכים digital-native, אבל קריטי לscanned PDFs.",
|
||||
"details": "scripts/multimodal_backfill.py כבר קיים. להריץ עם batch size 10 כדי לא לדפוק את Voyage rate limits. אומדן: 77×~10K tokens = ~770K tokens ($10-15).",
|
||||
"testStrategy": "אחרי backfill: COUNT(*) FROM precedent_image_embeddings ≥ 117 (מותר יותר אם יש כמה pages per pdf).",
|
||||
"status": "done",
|
||||
"dependencies": [],
|
||||
"priority": "low",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T11:27:09.039154Z"
|
||||
},
|
||||
{
|
||||
"id": "50",
|
||||
"title": "[שלב C] Closed-loop retrieval feedback + ndcg dashboard",
|
||||
"description": "אין tracking של 'what was retrieved → what writer cited'. בלי זה — אי אפשר לעדכן את ה-RAG בצורה מדודה לאורך זמן.",
|
||||
"details": "טבלה חדשה retrieval_feedback (query, candidates_retrieved JSONB, cited_in_final_decision UUID[], created_at). hooks ב-writer לדווח. dashboard חודשי עם ndcg@10.",
|
||||
"testStrategy": "אחרי 3 החלטות סופיות: SELECT count FROM retrieval_feedback ≥ 3. dashboard מציג ndcg trend.",
|
||||
"status": "done",
|
||||
"dependencies": [],
|
||||
"priority": "low",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T11:27:09.039154Z"
|
||||
},
|
||||
{
|
||||
"id": "51",
|
||||
"title": "[שלב C] Halacha quality monitoring — confidence drift, alert",
|
||||
"description": "אם prompt או model משתנה — confidence distribution יכול לזוז. בלי monitoring — דרדור איכות עובר תחת הראדר.",
|
||||
"details": "scheduled job: weekly mean confidence per practice_area. אם זז ביותר מ-0.05 — alert. dashboard ב-/halachot עם histogram.",
|
||||
"testStrategy": "אחרי 4 שבועות — לבדוק שיש 4 datapoints + alert עובד על נתון synthetic.",
|
||||
"status": "done",
|
||||
"dependencies": [],
|
||||
"priority": "low",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-26T11:27:09.039154Z"
|
||||
},
|
||||
{
|
||||
"id": "52",
|
||||
"title": "[Retrieval RC-A] הוספת case_name + case_number ל-tsvector הלקסיקלי",
|
||||
"description": "השורש האמיתי לכך שסוכן לא מאתר החלטה לפי שם (אגסי). ה-tsvector הלקסיקלי (SCHEMA_V12_SQL ב-db.py) בנוי רק מ-precedent_chunks.content ומ-halachot rule/quote/reasoning — לא משם התיק/הצד או ממספר התיק. לכן שאילתת-שם מחזירה את מי שמצטט את ההחלטה, לא את ההחלטה עצמה. לשלב את case_law.case_name + case_number באינדקס הלקסיקלי (tsvector ייעודי על case_law או setweight) כך שחיפוש לפי שם יפגע ברשומה עצמה.",
|
||||
"status": "done",
|
||||
"priority": "high",
|
||||
"dependencies": [],
|
||||
"details": "קבצים: mcp-server/src/legal_mcp/services/db.py (SCHEMA_V12_SQL ~שורה 774, search_precedent_library_lexical), hybrid_search.py (_merge_sem_lex). דורש ALTER TABLE + migration על Postgres (localhost:5433) + restart MCP server. בדיקה: search_internal_decisions('אגסי') ו-search_precedent_library('אגסי') חייבים להחזיר את אגסי (1a87efe5) בעמוד הראשון.",
|
||||
"testStrategy": "reproduction test: query='אגסי' → expect case_law_id 1a87efe5 in top-3. regression: substantive query עדיין מחזיר 0.6+ score.",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-30T11:05:36.307Z"
|
||||
},
|
||||
{
|
||||
"id": "53",
|
||||
"title": "[Retrieval RC-B] חיפוש/רשימה מאוחדים — לא לחתוך internal_committee",
|
||||
"description": "החלטות ערר/בל\"מ שמועלות נשמרות source_kind='internal_committee'. precedent_library_list ברירת מחדל external_upload ומסתיר אותן; כלי ה-MCP precedent_library_list אפילו לא חושף פרמטר source_kind, כך שסוכן לעולם לא יכול לדפדף בהן. לחשוף source_kind/all_committees בכלי ה-MCP ובמידת הצורך לאחד את שכבת ה-list/search.",
|
||||
"status": "done",
|
||||
"priority": "high",
|
||||
"dependencies": [
|
||||
"52"
|
||||
],
|
||||
"details": "קבצים: web/app.py (precedent_library_list ~5194, all_committees expansion ב-db.list_external_case_law ~2708), mcp-server tool def ל-precedent_library_list. בדיקה: precedent_library_list יכול להחזיר את אגסי כשמבקשים committees; חיפוש סמנטי כבר מאוחד (אומת).",
|
||||
"testStrategy": "precedent_library_list(source_kind='all_committees', practice_area='betterment_levy') כולל את אגסי+וינפלד. regression: ברירת מחדל external_upload עדיין מחזירה 14 ולא שוברת UI.",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-30T11:09:44.511Z"
|
||||
},
|
||||
{
|
||||
"id": "54",
|
||||
"title": "[Retrieval RC-3] הנחיית סוכנים — איתור לפי שם + שני קורפוסים",
|
||||
"description": "לעדכן הנחיות legal-analyst/researcher/writer: לאיתור החלטה ספציפית לפי שם להוסיף מונחי תוכן או מספר תיק, ולחפש בשני הקורפוסים (search_internal_decisions + search_precedent_library) לפני שמסיקים 'לא קיים בקורפוס'. כולל יצירת missing_precedent רק אחרי חיפוש כפול.",
|
||||
"status": "done",
|
||||
"priority": "medium",
|
||||
"dependencies": [
|
||||
"53"
|
||||
],
|
||||
"details": "קבצים: .claude/agents/legal-analyst.md, legal-researcher.md, legal-writer.md. אחרי שינוי skills/agent config — להריץ sync_agents_across_companies.py.",
|
||||
"testStrategy": "קריאת ההנחיות מאשרת fallback ברור; (אם אפשר) הרצת סוכן על שאילתת-שם מחזירה את ההחלטה.",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-30T11:12:44.727Z"
|
||||
},
|
||||
{
|
||||
"id": "55",
|
||||
"title": "[Retrieval RC-4] תיקון chunking — פרגמנטים זעירים",
|
||||
"description": "בתוצאות החיפוש מופיעים chunks של מילה-שתיים ('דיון','דיון וב','סיכום ו') כתוצאות מובילות. מציפים תוצאות ומורידים דירוג תוכן אמיתי. לחקור את chunker.py (פיצול לפי כותרת-סעיף שיוצר chunks ריקים) ולתקן: מינימום אורך chunk / מיזוג כותרת לגוף.",
|
||||
"status": "done",
|
||||
"priority": "medium",
|
||||
"dependencies": [
|
||||
"54"
|
||||
],
|
||||
"details": "קבצים: mcp-server/src/legal_mcp/services/chunker.py (SECTION_PATTERNS). דורש שיקול re-chunk לרשומות קיימות — לבדוק עלות מול feedback_no_reocr_retrofit (להשתמש בטקסט שמור, לא re-OCR).",
|
||||
"testStrategy": "אין chunks < N תווים בקורפוס אחרי תיקון; search_internal_decisions('אגסי') לא מציג פרגמנטי 'דיון'.",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-30T11:19:23.923Z"
|
||||
},
|
||||
{
|
||||
"id": "56",
|
||||
"title": "[Retrieval finding] halacha_filters לא מסננים source_kind — דליפה חוצת-קורפוסים",
|
||||
"description": "התגלה תוך כדי משימה 53. ב-search_precedent_library_semantic וב-search_precedent_library_lexical (db.py): chunk_filters כוללים cl.source_kind=$sk אבל halacha_filters כוללים רק review_status. תוצאה: search_precedent_library(external) מחזיר גם הלכות internal_committee, ו-search_internal_decisions(internal) מחזיר גם הלכות external. אי-עקביות: chunks מסוננים, halachot לא. כרגע זה דווקא מסייע למציאוּת (לכן לא רגרסיה), אבל לא עקבי. דורש החלטת מדיניות: או לסנן halachot גם לפי source_kind (עקבי, אך 'מסתיר' שכבות), או להשאיר מאוחד במכוון + לתעד. אם משאירים מאוחד — לעדכן docstrings של שני הכלים שזה לא 'corpus נפרד'.",
|
||||
"status": "pending",
|
||||
"priority": "low",
|
||||
"dependencies": [],
|
||||
"details": "db.py: search_precedent_library_semantic (~שורה הקודמת ל-3311), search_precedent_library_lexical (3346). שתי הפונקציות: halacha_filters=['h.review_status IN ...'] — חסר cl.source_kind. נמצא בעת בדיקת רגרסיה למשימה 53.",
|
||||
"testStrategy": "לאחר החלטה: אם מסננים — search_precedent_library('...substantive...', external) לא מחזיר case_law_id internal; אם משאירים — docstring מעודכן + טסט מאשר התנהגות מכוונת.",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-30T11:09:30.257989+00:00"
|
||||
},
|
||||
{
|
||||
"id": "57",
|
||||
"title": "[Retrieval #55 follow-up] re-chunk+re-embed של פסיקה שהוטמעה לפני תיקון ה-chunker",
|
||||
"description": "משימה 55 תיקנה את ה-chunker (עיגון כותרות + מיזוג) ומסננת את 484 הפרגמנטים בזמן query. הרמדיאציה המלאה: re-chunk מ-full_text השמור (ללא re-OCR — תואם feedback_no_reocr_retrofit) + re-embed, כדי שהתוכן יהיה נכון ולא רק מוסתר. נדחה כי זו מיגרציית-נתונים עם עלות Voyage API על ~13+ תיקים — דורש אישור עלות מ-chaim לפני הרצה. לבדוק כמה תיקים מושפעים (יש להם chunk<50) ולהריץ בקבוצות.",
|
||||
"status": "pending",
|
||||
"priority": "low",
|
||||
"dependencies": [
|
||||
"55"
|
||||
],
|
||||
"details": "מקור: case_law.full_text קיים. נתיב: chunker.chunk_document(_hierarchical) → embeddings → החלפת precedent_chunks לתיק. למחוק chunks ישנים של התיק לפני הוספה. אחרי הרצה — ניתן להסיר את פילטר ה->=50 query (אופציונלי). תיקים מושפעים: SELECT DISTINCT case_law_id WHERE length(trim(content))<50.",
|
||||
"testStrategy": "אחרי re-chunk לתיק לדוגמה: 0 chunks<50 לאותו case_law_id; search_internal_decisions עדיין מחזיר את התיק; ספירת chunks סבירה.",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-30T11:19:06.142606+00:00"
|
||||
},
|
||||
{
|
||||
"id": "58",
|
||||
"title": "[Case access] get_case_by_number שביר לפורמט — סוכן 'עיוור' למסמכי תיק",
|
||||
"description": "דווח ע\"י chaim: סוכן כתב שחסרים מסמכי תיק כי document_list החזיר ריק, אך המסמכים קיימים. שורש: get_case_by_number (db.py) עושה 'WHERE case_number=$1' התאמה מדויקת בלבד. אומת — 8137-24 מחזיר 9 מסמכים, אבל 8137/24 / 'ערר 8137-24' / רווחים / zero-pad → 'תיק לא נמצא'. הסוכן מקבל את המספר בפורמט שונה (כותרת issue, לוכסן, תחילית ערר/בל\"מ) → התאמה נכשלת → 'אין מסמכים'. משפיע על כל הכלים מבוססי case_number (document_list, extract_references, search_case_documents, get_claims, draft, וכו'). תיקון: נורמליזציה (strip prefix לתחילת ספרה, trim, '/'→'-') + fallback בשאילתה. תיקון נקודה-אחת מתקן את כל הכלים.",
|
||||
"status": "done",
|
||||
"priority": "high",
|
||||
"dependencies": [],
|
||||
"details": "db.py: get_case_by_number (~שורה לאחר get_case). להוסיף _normalize_case_number + שאילתה עם OR על replace(trim(case_number),'/','-')=norm, ORDER BY exact-first. בדיקה: כל הווריאציות של 8137-24 מחזירות 9 מסמכים.",
|
||||
"testStrategy": "document_list על 7 וריאציות פורמט של תיק קיים → כולן מחזירות את אותם מסמכים; תיק לא-קיים אמיתי עדיין מחזיר 'לא נמצא'.",
|
||||
"subtasks": [],
|
||||
"updatedAt": "2026-05-30T11:54:34.291Z"
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"version": "1.0.0",
|
||||
"lastModified": "2026-05-04T17:29:25.687Z",
|
||||
"taskCount": 29,
|
||||
"completedCount": 24,
|
||||
"lastModified": "2026-05-30T11:54:34.291Z",
|
||||
"taskCount": 58,
|
||||
"completedCount": 53,
|
||||
"tags": [
|
||||
"legal-ai"
|
||||
]
|
||||
|
||||
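Task 58 above describes its fix as a single normalization step (strip any prefix up to the first digit, trim, replace `/` with `-`) applied before the lookup. A minimal sketch of that step, assuming nothing beyond what the task text states: the function name mirrors the `_normalize_case_number` mentioned in the task details, but the body is a guess at the behavior, not the code in `db.py`, and zero-padded variants are deliberately left unhandled.

```python
import re


def normalize_case_number(raw: str) -> str:
    """Hypothetical normalizer: drop a textual prefix, trim, and use '-' as separator."""
    trimmed = raw.strip()
    first_digit = re.search(r"\d", trimmed)
    if first_digit is None:
        # No digit at all: return the trimmed input unchanged.
        return trimmed
    # Keep everything from the first digit onward, then unify the separator.
    return trimmed[first_digit.start():].replace("/", "-").strip()
```

The normalized value would then be compared against `replace(trim(case_number), '/', '-')` in the query, with exact matches ordered first, as the task details describe.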
14
CLAUDE.md
@@ -91,6 +91,16 @@
|
||||
- שינויי קוד נכנסים לתוקף אחרי `pm2 restart paperclip`
|
||||
- **אין צורך ב-Docker או Coolify**
|
||||
|
||||
**legal-chat-service**: runs **locally via pm2** (new, since April 2026):
- Port: `localhost:8770` (loopback only)
- A small aiohttp service that wraps the `claude` CLI with streaming and session continuation, and serves the "Chat" tab on the `/training` page. The container proxies requests to it via `host.docker.internal:8770`.
- Code: [mcp-server/src/legal_mcp/chat_service/](mcp-server/src/legal_mcp/chat_service/)
- Install: `pm2 start /home/chaim/legal-ai/scripts/legal-chat-service.config.cjs && pm2 save`
- Health: `curl http://127.0.0.1:8770/health` → `{"ok":true,...}`
- Code changes: `pm2 restart legal-chat-service`
- **Zero API cost**: the claude CLI uses chaim's claude.ai subscription. The underlying assumption of `claude_session.py` (local claude CLI only) is preserved; this service is the official bridge between the container and the host.
- Coolify dependency: the legal-ai Service Definition must include `extra_hosts: host.docker.internal:host-gateway` (otherwise the proxy gets a ConnectError).
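The health check above can also be scripted. The following is a rough sketch, assuming only what the list states: the endpoint is `http://127.0.0.1:8770/health` and a healthy response is a JSON object whose `ok` field is `true`. The helper names `is_healthy` and `check_service` are illustrative and are not part of the repository.

```python
import json
import urllib.request


def is_healthy(body: str) -> bool:
    """Return True when a /health response body is a JSON object with ok == true."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("ok") is True


def check_service(url: str = "http://127.0.0.1:8770/health", timeout: float = 2.0) -> bool:
    """Fetch the health endpoint; any connection failure counts as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return is_healthy(response.read().decode("utf-8"))
    except OSError:
        # URLError and socket timeouts are both OSError subclasses.
        return False
```

Separating the parsing from the network call keeps the parsing logic testable without a running service.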
|
||||
|
||||
---
|
||||
|
||||
## מבנה תיקיות
|
||||
@@ -153,12 +163,14 @@
|
||||
|
||||
הפרויקט משתמש ב-**TaskMaster AI** (MCP server) לניהול משימות מובנה:
|
||||
- **תמיד** להשתמש ב-TaskMaster לפירוק, מעקב וניהול משימות — לא ב-TASKS.md ידני
|
||||
- קובץ המשימות: `tasks/tasks.json`
|
||||
- קובץ המשימות הקנוני: `~/legal-ai/.taskmaster/tasks/tasks.json` (יחסי ל-project root, **לא** `~/.taskmaster/tasks/tasks.json`). מכיל את כל ה-tags של legal-ai (`master`, `legal-ai`).
|
||||
- פקודות עיקריות: `get_tasks`, `next_task`, `add_task`, `update_task`, `expand_task`
|
||||
- לפני התחלת עבודה → `next_task` כדי לדעת מה הבא לפי תלויות
|
||||
- אחרי סיום משימה → `update_task` עם status=done
|
||||
- משימה מורכבת → `expand_task` לפירוק לתתי-משימות
|
||||
|
||||
> **⚠️ מלכוד cwd ב-CLI:** הדגל `--tag` בוחר קבוצה לוגית *בתוך* הקובץ — הוא **לא** בוחר לאיזה `tasks.json` לכתוב. ה-CLI מאתר את הקובץ לפי ה-cwd (`<cwd>/.taskmaster/tasks/tasks.json`). תמיד `cd ~/legal-ai` לפני `task-master add-task` או כל פקודה משנה, ואז אמת ב-MCP `get_tasks` שהשינוי נחת. הרצה מ-`~/` כותבת לקובץ נטוש והמשימה לא תופיע בשאילתות MCP. כשלא בטוחים — לערוך את `~/legal-ai/.taskmaster/tasks/tasks.json` ישירות.
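The cwd pitfall in the note above can be made concrete. This is a minimal sketch, assuming the CLI resolves the file exactly as the note states (`<cwd>/.taskmaster/tasks/tasks.json`); the real CLI may apply additional lookup rules, and both function names are hypothetical.

```python
from pathlib import Path


def resolve_tasks_file(cwd: str) -> Path:
    """Path the CLI would write to, per the rule quoted in the note above."""
    return Path(cwd) / ".taskmaster" / "tasks" / "tasks.json"


def writes_to_canonical(cwd: str, project_root: str) -> bool:
    """True when running from cwd would modify the project's canonical tasks file."""
    return resolve_tasks_file(cwd) == resolve_tasks_file(project_root)
```

For example, `writes_to_canonical("/home/chaim", "/home/chaim/legal-ai")` is false, which is the abandoned-file case the note warns about.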
|
||||
|
||||
---
|
||||
|
||||
## Paperclip — כללי אינטגרציה קריטיים
|
||||
|
||||
12
Dockerfile
@@ -61,6 +61,18 @@ COPY mcp-server/src/ ./mcp-server/src/
|
||||
# (Path(__file__).resolve().parents[4] / "skills/docx/decision_template.docx")
|
||||
COPY skills/docx/decision_template.docx ./skills/docx/decision_template.docx
|
||||
|
||||
# Reference content the /training tab reads at runtime:
|
||||
# - .claude/agents/hermes-curator.md → GET /api/training/curator/prompt
|
||||
# - skills/decision/SKILL.md → system prompt for the chat
|
||||
# - docs/legal-decision-lessons.md → system prompt for the chat
|
||||
# - docs/corpus-analysis.md → system prompt for the chat
|
||||
#
|
||||
# These are read-only at runtime; chair edits go through git, not the container.
|
||||
COPY .claude/agents/hermes-curator.md ./.claude/agents/hermes-curator.md
|
||||
COPY skills/decision/SKILL.md ./skills/decision/SKILL.md
|
||||
COPY docs/legal-decision-lessons.md ./docs/legal-decision-lessons.md
|
||||
COPY docs/corpus-analysis.md ./docs/corpus-analysis.md
|
||||
|
||||
# Make mcp-server source available to web/app.py (it does sys.path.insert for legal_mcp)
|
||||
ENV PYTHONPATH=/app/mcp-server/src
|
||||
|
||||
|
||||
227
docs/methodology/extension-request-betterment_levy.md
Normal file
@@ -0,0 +1,227 @@
|
||||
# מתודולוגיה — בל"מ בהיטל השבחה (8xxx)
|
||||
|
||||
**appeal_subtype:** `extension_request_betterment_levy`
|
||||
**מסלול:** סעיף 14 לתוספת ג' לחוק התכנון והבנייה, התשכ"ה-1965
|
||||
**מועד סטטוטורי:** **45 ימים** (להבדיל מ-30 ימים ברישוי) מיום קבלת
|
||||
דרישת תשלום היטל ההשבחה (סעיף 14(א) לתוספת ג')
|
||||
|
||||
---
|
||||
|
||||
## א. מבוא — ייחודיות בל"מ בהיטל השבחה
|
||||
|
||||
בל"מ במסלול היטל השבחה שונה משמעותית מבל"מ ברישוי בכמה ממדים:
|
||||
|
||||
| Dimension | Extension request in licensing | Extension request in betterment levy |
|------|--------------|-------------------|
| Statutory deadline | 30 days | **45 days** |
| Statutory section | 152 | Section 14 of the Third Schedule |
| Parties | Broad: any holder of an adjoining or nearby right | **Narrow: only the party liable for the levy** |
| Nature of the remedy | Cancellation of a permit / change of conditions | Correction of the assessment / cancellation of the charge |
| Tone | Sometimes human (resident, environment) | Cold and professional (financial / appraisal) |
| Reliance at stake | The developer's | The authority's (distribution of revenues) |
|
||||
|
||||
הייחוד הקרדינלי: **בל"מ בהיטל השבחה דורש הוכחת טעות שמאית או בדין** —
|
||||
לא רק "טעם סביר" כמו ברישוי. הסיבה: שומת היטל ההשבחה היא מעשה מנהלי
|
||||
שקיבל תוקף, וכספים שולמו / נדרשו, ולעיתים גם חולקו. שינוי שומה דורש
|
||||
עילה מהותית.
|
||||
|
||||
---
|
||||
|
||||
## ב. מסגרת נורמטיבית
|
||||
|
||||
### שכבה א — חקיקה ראשית
|
||||
|
||||
**סעיף 14(א) לתוספת ג' לחוק התכנון והבנייה:**
|
||||
> "בעל המקרקעין החייב בהיטל השבחה ... רשאי להגיש ערר על השומה לוועדת הערר
|
||||
> לפיצויים ולהיטל השבחה ... בתוך 45 ימים מיום שהומצאה לו השומה"
|
||||
|
||||
המחוקק קבע מועד ארוך יותר (45 לעומת 30) מתוך הכרה במורכבות הסוגיה השמאית —
|
||||
הצורך לקבל חוו"ד שמאית, להתייעץ עם עו"ד מומחה למיסוי מקרקעין, ולבחון את
|
||||
חישובי השומה.
|
||||
|
||||
### שכבה ב — עליון
|
||||
|
||||
**רע"א 7669/96 עיריית נהריה נ' קמינסקי (פ"ד נב(1) 214):**
|
||||
ביסוס עקרוני של "סופיות שומה" — שינוי שומה לאחר חלוף המועד הסטטוטורי
|
||||
אינו עומד על ערעור "טעם סביר" בלבד; נדרש אינטרס ציבורי מובהק או טעות
|
||||
שמאית מהותית.
|
||||
|
||||
**עע"מ 1832/14 הרשות לפיתוח ירושלים נ' מנהל מס שבח:**
|
||||
היטל השבחה — תשלום הכפוף לסופיות שומה; קביעות שמאי בדבר ערך המקרקעין לפני
|
||||
ואחרי האירוע התכנוני הן עובדתיות-מקצועיות. שינוי דורש הצדקה חזקה.
|
||||
|
||||
### שכבה ג — ועדות ערר לפיצויים ולהיטל השבחה
|
||||
|
||||
(להוסיף תקדימים ספציפיים מקורפוס דפנה תמיר בהיטל השבחה. הקורפוס הקיים
|
||||
כולל את עררי 8xxx — לחפש דפוס "בל\"מ" או "הארכת מועד" בתוכם.)
|
||||
|
||||
---
|
||||
|
||||
## ג. תבחיני בל"מ בהיטל השבחה — חמישה תבחינים
|
||||
|
||||
| # | תבחין | אופי | משקל |
|
||||
|---|--------|------|------|
|
||||
| א | **טעות שמאית או בדין** | **תנאי סף עצמאי — ייחודי להיטל השבחה** | קריטי |
|
||||
| ב | טעם סביר לאיחור | מקדים — בדומה לרישוי, אך מחמיר | גבוה |
|
||||
| ג | אורך השיהוי | כמותי | גבוה |
|
||||
| ד | הסתמכות הרשות (חלוקת כספים) | כמותי | גבוה |
|
||||
| ה | סיכויי הערר המהותי (לכאורה) | מהותי | בינוני |
|
||||
|
||||
תבחין "אינטרס ציבורי" לא מופיע כתבחין עצמאי כאן — בהיטל השבחה האינטרס
|
||||
הציבורי נטוע בתוך הסתמכות הרשות (תבחין ד).
|
||||
|
||||
---
|
||||
|
||||
## ד. תבחין א — טעות שמאית או טעות בדין
|
||||
|
||||
### What counts as an "appraisal error"?

Not every dispute over value is an error. One of the following must be shown:

1. **A manifest computational error**: a wrong total or a wrong arithmetic operation.
2. **An invalid appraisal method**: use of an unaccepted approach (for example, capitalization at an unrealistic rate, or comparison with non-comparable transactions).
3. **Disregard of comparable properties**: overlooking data that should have been taken into account.
4. **An error in area, rights, or plan figures**: a mismatch with the land registry extract or the zoning plan (תב"ע).
|
||||
|
||||
### מה זו "טעות בדין"?
|
||||
שגיאה משפטית בעצם החיוב:
|
||||
- **חיוב על נכס שאינו "מקרקעין" לעניין החוק** (זכויות חוזיות גרידא).
|
||||
- **חיוב בגין השבחה שאינה נכנסת להגדרת "השבחה" בחוק** (לדוגמה: השבחה
|
||||
שנוצרה לפני התקופה הקובעת; השבחה מכוח תכנית שאינה תכנית מתאר).
|
||||
- **חיוב לפני התגבשות העילה** — דרישה לפני מימוש בהיתר או מכר.
|
||||
|
||||
### הוכחה דרושה
|
||||
- **חוות דעת שמאית חתומה** מאת שמאי מקרקעין מוסמך, עם נתוני השוואה.
|
||||
- **תיעוד הליך השומה המקורי** — אילו נתונים נלקחו? אילו לא?
|
||||
- **חישוב חלופי מנומק** — לא רק "אני חולק", אלא "הנה החישוב הנכון".
|
||||
|
||||
---
|
||||
|
||||
## ה. תבחין ב — טעם סביר לאיחור
|
||||
|
||||
### העקרון
|
||||
בדומה לבל"מ ברישוי, אך **קפדן יותר**:
|
||||
- מועד 45 ימים נחשב "מועד ארוך" — קשה יותר להצדיק החמצתו.
|
||||
- החייב לרוב מקבל את השומה לידיו אישית — אין סוגיית "פרסום באתר".
|
||||
- ערב פניה לעו"ד / שמאי הוא צעד צפוי וסטנדרטי.
|
||||
|
||||
### מצבי "טעם סביר" אופייניים
|
||||
| מצב | קבילות |
|
||||
|------|---------|
|
||||
| מחלת המבקש (מתועדת רפואית) | קבילה |
|
||||
| המצאה פגומה (לא לכתובת הנכונה) | קבילה — אך נטל הוכחה כבד |
|
||||
| תקופה ארוכה של בירורים מקצועיים | חלשה — לוחות זמנים אינם מוקפאים |
|
||||
| המתנה לעמדת שמאי לפני הגשת ערר | חלשה — אפשר להגיש ולתקן |
|
||||
| התכתבות עם הרשות בניסיון פשרה | חלשה — לא מקפיאה מועד |
|
||||
|
||||
### דרישת התצהיר
|
||||
**חובה** תצהיר מפורט — תאריכים, אנשי קשר, מסמכי תמיכה. ללא תצהיר —
|
||||
הטענה ריקה משפטית.
|
||||
|
||||
---
|
||||
|
||||
## ו. תבחין ג — אורך השיהוי
|
||||
|
||||
### חישוב
|
||||
| תאריך | אירוע | שיהוי מצטבר |
|
||||
|--------|--------|--------------|
|
||||
| יום 0 | המצאת השומה | 0 |
|
||||
| יום 45 | תום המועד הסטטוטורי | תום המועד |
|
||||
| יום X | הגשת הבל"מ | X-45 ימים מעבר למועד |
|
||||
|
||||
### עקרון מנחה
|
||||
- שיהוי של עד 30 ימים מעבר למועד (סה"כ 75 ימים מיום ההמצאה) — מקבל
|
||||
התייחסות עניינית אם יש טעם סביר.
|
||||
- שיהוי של מעל 90 ימים מעבר למועד — נחשב חמור; דורש הוכחה חזקה במיוחד.
|
||||
- שיהוי של מעל שנה — לרוב חוסם אלא אם מדובר בטעות חישובית גלויה.
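The guideline bands above can be sketched as a small classifier. This is an illustration only: the 45-day deadline and the 30 / 90 / 365-day thresholds come from the text, while the band labels are invented here, and the range between 31 and 90 days past the deadline is marked as unspecified because the text does not address it. "More than a year" is approximated as more than 365 days.

```python
STATUTORY_DAYS = 45  # betterment-levy deadline stated in this document


def days_past_deadline(days_since_service: int) -> int:
    """Days beyond the statutory deadline, never negative."""
    return max(0, days_since_service - STATUTORY_DAYS)


def classify_delay(days_since_service: int) -> str:
    """Map a delay to the guideline bands; labels are illustrative, not legal terms."""
    excess = days_past_deadline(days_since_service)
    if excess == 0:
        return "on_time"
    if excess <= 30:
        return "considered_if_reasonable_cause"
    if excess <= 90:
        return "unspecified_in_guideline"
    if excess <= 365:
        return "severe"
    return "usually_barred"
```

The bands are guidance for weighing the delay criterion, not an outcome; the text itself leaves room for exceptions such as a manifest computational error.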
|
||||
|
||||
### השפעת השיהוי על הסתמכות הרשות
|
||||
ככל שהזמן עובר — הסיכוי שהרשות חילקה את הכספים גבוה יותר. דרישה להחזר
|
||||
שנים לאחר התשלום פוגעת בהסתמכות הרשות בצורה מובהקת.
|
||||
|
||||
---
|
||||
|
||||
## ז. תבחין ד — הסתמכות הרשות (חלוקת הכנסות)
|
||||
|
||||
### ייחודיות לעומת בל"מ ברישוי
|
||||
ברישוי — ההסתמכות היא של היזם הפרטי. בהיטל השבחה — ההסתמכות היא של
|
||||
**הרשות הציבורית**: הכספים מועברים לקרן השבחה, מתוכננים לפרויקטים
|
||||
ציבוריים, ולעיתים אף חולקו או הוצאו.
|
||||
|
||||
### טבלת בדיקה
|
||||
| Stage | State of the funds | Effect on the extension request |
|------|------------|-----------------|
| Before payment | The liable party has not paid | Slight: no reliance by the authority |
| After payment, before distribution | Held by the committee / fund | Moderate |
| After distribution to authorities | Distributed to the municipality, developer, etc. | Significant |
| After projects are carried out | Funds have been spent | Tangible, hard to reverse |
|
||||
|
||||
### עיקרון
|
||||
**ככל שהכספים "התרחקו" מהקופה — דרישות הוכחת הטעות מחמירות.**
|
||||
|
||||
---
|
||||
|
||||
## ח. תבחין ה — סיכויי הערר המהותי (לכאורה)
|
||||
|
||||
### הבהרה מתודית
|
||||
בשלב בל"מ — בוחנים סיכויי הערר רק כדי לקבוע האם יש סיבה לפתוח את הדלת.
|
||||
הקריטריון: **האם יש "טענה לכאורה" המבוססת על תיעוד מקצועי?**
|
||||
|
||||
### סוגי טענות אופייניים
|
||||
- חישוב שגוי של "המצב הקודם" / "המצב החדש"
|
||||
- שיטת שיערוך פסולה (השוואה / הפרשי הון / היוון)
|
||||
- התעלמות מ"זכויות מותנות" שטרם התגבשו
|
||||
- חיוב כפול (הון / הכנסה / שבח)
|
||||
- אי-התאמה למיקום, שימוש, או שטח
|
||||
|
||||
### מה לא נספר כ"סיכויי הליך"
|
||||
- "אני לא מסכים לסכום" — בלי חוו"ד נגדית מבוססת.
|
||||
- טענות כלליות על "המצב הכלכלי" של המבקש.
|
||||
- טענות על "תקדים" שלא הוכרע בערכאה גבוהה יותר.
|
||||
|
||||
---
|
||||
|
||||
## ט. טבלת התאמה לעובדות (placeholder לכל תיק)
|
||||
|
||||
| תבחין | עובדה במקרה הנוכחי | כיוון |
|
||||
|--------|---------------------|-------|
|
||||
| א. טעות שמאית/בדין | [סוג הטעות הנטענת + תיעוד] | [חוסם / מאפשר] |
|
||||
| ב. טעם סביר | [מועד המצאה, פעולות, תצהיר] | [תומך / מחליש] |
|
||||
| ג. אורך השיהוי | [X ימים מעבר ל-45] | [קל / בינוני / חמור] |
|
||||
| ד. הסתמכות הרשות | [מצב הכספים: בקופה / חולק / הוצא] | [קל / משמעותי / מוחשי] |
|
||||
| ה. סיכויי הליך | [חוו"ד שמאית? חישוב חלופי?] | [לכאורה / ספקולטיבי] |
|
||||
|
||||
---
|
||||
|
||||
## י. סעיף מסקנה — מבנה אופייני
|
||||
|
||||
המבנה האופייני בבל"מ-היטל-השבחה הוא **קר ומקצועי** — מינימום רגש,
|
||||
מקסימום שמאות:
|
||||
|
||||
1. **קביעת מצב השומה.** "השומה הומצאה ביום X. הבל"מ הוגשה ביום Y."
|
||||
2. **תבחין א (טעות שמאית).** "המבקש טוען לטעות בX. בחינת המסמכים מעלה..."
|
||||
3. **אם טעות לא הוכחה — דחייה.** "בהיעדר טעות שמאית או בדין, אין יסוד
|
||||
לסטות ממועד הקבוע בחוק."
|
||||
4. **אם טעות הוכחה — מעבר לתבחינים ב-ה.**
|
||||
5. **מאזן.** "לאור איזון התבחינים..."
|
||||
6. **הכרעה.** דחייה / קבלה / החזרה לשמאי הוועדה לבחינה.
|
||||
|
||||
### לשון אופיינית לדחייה
|
||||
> "הבל"מ הוגשה X ימים לאחר תום המועד הסטטוטורי. המבקש לא הצביע על טעות
|
||||
> שמאית או בדין; הטענות הן בגדר מחלוקת על שיקול דעת מקצועי, שאינה מצדיקה
|
||||
> פתיחת שומה שקיבלה תוקף. לאור אלה, ובהינתן שהכספים שולמו וחולקו, הבל"מ
|
||||
> נדחית."
|
||||
|
||||
### לשון אופיינית לקבלה (חריגה)
|
||||
> "המבקש הצביע על טעות חישובית במספר זכויות התכנון שנלקחו בחשבון. הטעות
|
||||
> מהותית ומשפיעה על השומה. בנסיבות אלה, ועל אף השיהוי, יש מקום לפתוח את
|
||||
> השומה לדיון בערר עצמו."
|
||||
|
||||
---
|
||||
|
||||
## יא. הפניות חוצות
|
||||
|
||||
- ראה גם: `docs/methodology/extension-request-building_permit.md` (סעיף 152, 30 ימים)
|
||||
- ראה גם: `docs/methodology/extension-request-compensation.md` (סעיף 198(ד), 30 ימים)
|
||||
- ראה גם: `docs/block-schema.md` — מבנה 12 הבלוקים
|
||||
- ראה גם: `skills/decision/SKILL.md` — מדריך סגנון של דפנה
|
||||
252
docs/methodology/extension-request-building_permit.md
Normal file
@@ -0,0 +1,252 @@
|
||||
# מתודולוגיה — בל"מ ברישוי ובנייה (1xxx)
|
||||
|
||||
**appeal_subtype:** `extension_request_building_permit`
|
||||
**מסלול:** סעיף 152(א) לחוק התכנון והבנייה, התשכ"ה-1965
|
||||
**מועד סטטוטורי:** 30 ימים מיום המצאת ההחלטה (סעיף 152(ב))
|
||||
|
||||
---
|
||||
|
||||
## א. מבוא — מהותו של בל"מ ברישוי
|
||||
|
||||
An extension request (בל"מ, "בקשה להארכת מועד") is a preliminary proceeding: anyone who wishes to appeal a local committee's decision after the 30 days have passed must go through it before the appeal itself can be opened. The appeals committee must balance two opposing interests:

- **The right of access to tribunals**: every person with standing should be able to subject the local committee's decision to judicial review, especially when the decision is alleged to be flawed.
- **Finality of administrative decisions and reliance**: the developer is entitled to act on the permit granted, invest money, begin works, and not live in constant fear that the permit will be attacked years after its approval.
|
||||
|
||||
לעומת בל"מ בהיטל השבחה (סעיף 14 לתוספת ג', 45 ימים) ובל"מ בפיצויים (סעיף 198(ד),
|
||||
30 ימים אך עם סף קפדני יותר), בל"מ ברישוי משלב טון אנושי יחסית — ההסתמכות מוחשית
|
||||
(חפירה, פינוי שוכרים) והאינטרסים הציבוריים (מיגון, חיזוק) ממשיים.
|
||||
|
||||
---
|
||||
|
||||
## ב. מסגרת נורמטיבית — שלוש שכבות
|
||||
|
||||
### שכבה א — עליון: בר"מ 2340/02 הוועדה המקומית רמת השרון נ' אגא וכט, פ"ד נז(3) 385 (2003)
|
||||
|
||||
הכיר בסמכותה של ועדת הערר להאריך את המועד, בנסיבות חריגות, וקבע את הבחינה
|
||||
הדו-שלבית:
|
||||
1. **תנאי סף:** טעם סביר לאיחור.
|
||||
2. **שיקול כולל:** השוואה בין נזקי המבקש לבין הסתמכות הצד שכנגד; היקף השיהוי;
|
||||
סיכויי ההליך; אינטרס ציבורי.
|
||||
|
||||
### שכבה ב — עליון: עע"מ 317/10 שפר נ' סקאל יניב (נבו 23.8.2012)
|
||||
|
||||
הלכה מחייבת: מניין 30 הימים מתחיל **מיום הידיעה בפועל**, לא מיום הפרסום הפורמלי.
|
||||
המשמעות: גם איחור-לכאורה של חודשים יכול להיות לגיטימי אם המבקש לא ידע על ההחלטה
|
||||
בזמן אמת.
|
||||
|
||||
> "מתנגד להיתר שניתן, אשר שטח התנגדותו בפני הועדה המקומית וזו נדחתה, או שידע
|
||||
> על מתן ההיתר, צריך יהיה להגיש את הערר תוך 30 יום מיום שנודע לו על מתן ההיתר."
|
||||
|
||||
### שכבה ג — ועדת ערר ירושלים (דפנה תמיר)
|
||||
|
||||
**ערר 1009/25 מפלגת נעם נ' הוועדה המרחבית הראל (נבו 27.3.2025):**
|
||||
> "דיון בערר המבקש לבטל היתר שכבר יצא מחייב עמידה בלוח הזמנים שהדין מחייב,
|
||||
> כל חריגה מכך מחייבת בקשה להארכת מועד ועמידה בכל התנאים לכך (זכות עמידה,
|
||||
> שיהוי, הסתמכות, פגיעה וכיו'). ודוק, מחייבת בקשה להארכת מועד סדורה ומנומקת
|
||||
> ולא בדרך אגב ולא בחסות תקנות הרישוי."
|
||||
|
||||
**ערר 1112/22 ירושלים שקופה נ' ועדה מקומית ירושלים (נבו 11.5.2023):**
|
||||
> "מרחק של פחות מ-100 מ' אינו מקנה זכות התנגדות לתכנית; קל וחומר שמרחק של
|
||||
> למעלה מ-400 מ' אינו מקנה זכות התנגדות לבקשה להיתר, שכן זכות ההתנגדות לבקשה
|
||||
> להיתר (סעיף 149) צרה מזכות ההתנגדות לתכנית (סעיף 100)"
|
||||
|
||||
**בל"מ 1028/20 חלוואני (ועדת ערר ירושלים):**
|
||||
> "המועד להגשת ערר הינו 30 ימים מיום שהומצאה החלטת הועדה המקומית וכי המבקשת
|
||||
> הייתה ערה להליכי הבקשה להיתר"
|
||||
|
||||
---
|
||||
|
||||
## ג. שישה תבחינים — סדר הבחינה
|
||||
|
||||
על פי הפסיקה המצטברת, להכרעה בבל"מ-רישוי יש לבחון שישה תבחינים. הסדר חשוב:
|
||||
תבחין ו (זכות עמידה) הוא תנאי סף עצמאי — אם אין זכות עמידה אין צורך לבחון
|
||||
יתר התבחינים.
|
||||
|
||||
| # | תבחין | אופי | מקור |
|
||||
|---|--------|------|------|
|
||||
| ו | **זכות עמידה** | **תנאי סף עצמאי** | עע"מ 1461/20 אנטרים; ערר 1112/22 |
|
||||
| א | טעם סביר לאיחור | מקדים — נחוץ לפתיחת הדלת | עע"מ 317/10 שפר; בל"מ 1028/20 |
|
||||
| ב | אורך השיהוי | כמותי — חומרת ההפרה | ערר 1096/24 אנשין |
|
||||
| ג | הסתמכות + שינוי מצב לרעה | כמותי — נזק | בר"מ 2340/02 |
|
||||
| ד | סיכויי ההליך | מהותי — "לכאורה" | בר"מ 2340/02 |
|
||||
| ה | אינטרס ציבורי / חזקת תקינות | ערכי | הלכת חזקת תקינות |
|
||||
|
||||
---
|
||||
|
||||
## ד. תבחין ו — זכות עמידה (תנאי סף)
|
||||
|
||||
### מקור הזכות
|
||||
זכות הערר לפי סעיף 152 מוקנית רק למי שהוא **בעל זכות במקרקעין נשוא הבקשה
|
||||
להיתר**, לא לכל בעל עניין (עע"מ 1461/20 אנטרים).
|
||||
|
||||
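### Distance test

Per Appeal 1112/22, a distance of less than 100 m was held not to confer a right to object to a plan; a fortiori, a distance of more than 400 m confers no right to object to a permit application, even absent a line of sight.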
|
||||
|
||||
### טבלת בדיקה
|
||||
| פרמטר | להוכיח |
|
||||
|--------|---------|
|
||||
| בעל זכות בנכס נשוא הבקשה? | חוזה רכישה / נסח / שכירות מאומתת |
|
||||
| בעל זכות בנכס גובל? | מפת מדידה / נסח |
|
||||
| מרחק קו אווירי | מודד / Google Maps עם תיעוד |
|
||||
| קיומה של נצפות | תצלום פנורמי / חוו"ד מודד |
|
||||
| מעמד נציג דיירים / פינוי-בינוי | חוזה פנימי — לא יוצר זכות סטטוטורית |
|
||||
|
||||
**Warning:** an argument of "objector on behalf of the public" or "general public interest" does not confer standing. Standing must be anchored in a right in land.
|
||||
|
||||
---
|
||||
|
||||
## ה. תבחין א — טעם סביר לאיחור
|
||||
|
||||
### העיקרון
|
||||
המבקש נדרש להוכיח שלא ידע על ההחלטה בזמן אמת **ושאי-הידיעה היא סבירה** — לא רק
|
||||
שלא ידע, אלא שלא היה ניתן לצפות שיֵדע. הכלל הוא **דרך הסטטוס-קוו**: מי שהתעניין
|
||||
בנכס שכן, שהיה מודע לשלטי בנייה, או שהיה לו עניין סדור בנכס — מוחזק כיודע.
|
||||
|
||||
### דרישות הוכחה
|
||||
1. **תצהיר עובדתי** של המבקש — תאריכים מפורטים, מי אמר לו, מתי בדיוק.
|
||||
2. **הוכחת ברירת המחדל של הוועדה** — היכן הפרסום היה צריך להתבצע? האם בוצע?
|
||||
3. **שלושת התנאים המצטברים** (לפי הלכת שפר, כפי שיושמו בפסיקה לאחר מכן):
|
||||
- זכות טיעון בהליך הרישוי וזכאות לקבל פרסום.
|
||||
- פגם בהליך הפרסום בפועל.
|
||||
- הפגם פגע בזכות הטיעון.
|
||||
|
||||
### מלכודות נפוצות
|
||||
- **התכתבות עם "הדרג המקצועי" אינה מקפיאה לוחות זמנים** (בל"מ 1028/22 חמד).
|
||||
- **היעדר תצהיר → גרסת אי-הידיעה חלשה ראייתית.**
|
||||
- **ידיעה קודמת על ההליכים** (התנגדות שהוגשה, נוכחות בדיון, פניות בעבר) שוללת
|
||||
כל תירוץ של אי-ידיעה.
|
||||
|
||||
---
|
||||
|
||||
## ו. תבחין ב — אורך השיהוי
|
||||
|
||||
### שני רכיבים
|
||||
1. **שיהוי מצטבר** — הזמן שחלף מהחלטת הוועדה המקומית עד הגשת הבל"מ.
|
||||
2. **שיהוי סובייקטיבי** — הזמן שחלף מיום הידיעה הנטענת עד הגשת הבל"מ.
|
||||
|
||||
### ציר זמן לדוגמה
|
||||
| תאריך | אירוע | שיהוי מצטבר |
|
||||
|--------|--------|--------------|
|
||||
| יום 0 | פרסום הבקשה | 0 |
|
||||
| יום 30 | החלטת ועדת משנה | — |
|
||||
| יום 120 | אישרור במליאה | — |
|
||||
| יום X | ידיעה נטענת | חודשים-שנה |
|
||||
| יום X+30 | הגשת הבל"מ | +30 ימים סובייקטיבי |
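The two delay components defined above (cumulative and subjective) reduce to simple date arithmetic. A minimal sketch, with hypothetical dates and a hypothetical function name; the definitions follow the two-item list at the start of this section.

```python
from datetime import date


def delay_components(decision: date, knowledge: date, filing: date) -> tuple[int, int]:
    """Return (cumulative, subjective) delay in days.

    cumulative: from the local committee's decision to the filing of the request.
    subjective: from the claimed date of knowledge to the filing of the request.
    """
    cumulative = (filing - decision).days
    subjective = (filing - knowledge).days
    return cumulative, subjective


# Hypothetical example: decision on 1 Jan 2024, knowledge on 1 Jun 2024, filing on 1 Jul 2024.
cumulative_days, subjective_days = delay_components(
    date(2024, 1, 1), date(2024, 6, 1), date(2024, 7, 1)
)
```

Reporting both numbers side by side makes it clear whether a long cumulative delay is offset by prompt action once the applicant learned of the decision.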
|
||||
|
||||
### עקרון מנחה
|
||||
ערר 1096/24 אנשין (דפנה תמיר, 30.12.2024):
|
||||
> "בהינתן שהערר מוגש במקום בו לא הייתה לעורר זכות קנויה וברורה להגשתו, היה
|
||||
> עליו שלא להתעכב ובוודאי שלא לחכות ליום האחרון להגשת הערר"
|
||||
|
||||
**הכלל:** ככל שזכות העמידה רופפת יותר — דרישות הזריזות מחמירות.
|
||||
|
||||
---
|
||||
|
||||
## ז. תבחין ג — הסתמכות הצד שכנגד
|
||||
|
||||
### עיקרון בר"מ 2340/02 אגא וכט
|
||||
> "האם שינה הצד האחר את מצבו לרעה, האם ניתן להשיב את המצב לקדמותו"
|
||||
|
||||
### טבלת השקעות לבדיקה
|
||||
| השקעה | תיעוד נדרש |
|
||||
|--------|-----------|
|
||||
| שכר טרחת מתכננים / עו"ד / יועצים | חשבוניות / קבלות / חוזה |
|
||||
| תכנון מפורט (חניון, ממ"דים) | תכניות חתומות |
|
||||
| היתר חפירה / חפירה בפועל | היתר + תצלומים |
|
||||
| הסכמי מימון | חוזה עם בנק / משקיע |
|
||||
| פינוי שוכרים / חתימות דיירים | חוזי פינוי / הסכמות |
|
||||
| התקדמות פיזית (יסודות, שלד) | תצלומים מתועדים |
|
||||
|
||||
### "Can the prior situation be restored?"

The more advanced the stage of execution, the lower the ability to reverse it. After an excavation permit, eviction of tenants, and foundation preparation, the situation is usually physically irreversible, and at the least economically irreversible.
|
||||
|
||||
---
|
||||
|
||||
## ח. תבחין ד — סיכויי ההליך (לכאורה)
|
||||
|
||||
### הבהרה מתודית
|
||||
בשלב בל"מ, **בוחנים סיכויי הערר המהותי רק כדי לקבוע האם יש סיבה מספקת לפתוח
|
||||
את הדלת** — לא לפסוק לגוף הערר. אם המחלוקת המהותית היא קשה ומורכבת אבל ברורה
|
||||
שיש בה ממש — תבחין ד תומך בקבלת הבל"מ. אם המחלוקת תיאורטית, ספקולטיבית, או
|
||||
ברורה לזכות המשיבים — תבחין ד תומך בדחייה.
|
||||
|
||||
### סוגים אופייניים של סוגיות מהותיות בבל"מ-רישוי
|
||||
- תחולת תמ"א 38 (תקנים, מבנה קטן, איזורי סיכון רעש)
|
||||
- תוקף תכנית (פקיעה, הוראות מעבר)
|
||||
- חישוב סל זכויות (תיקון 3א, "קומה טיפוסית קיימת")
|
||||
- מעמד תכנית חדשה (102-XXXXXX) — מופקדת? מאושרת? נסיוני?
|
||||
- תנאי היתר (עמידה בתקנות, קווי בניין, חניות)
|
||||
|
||||
### דרך הבחינה
|
||||
לכל סוגיה: (1) האם ההסתמכות על תכנית / תקן בוצעה; (2) האם יש פסיקה מנחה;
|
||||
(3) האם יש מחלוקת מקצועית-עובדתית שתצריך חוות דעת.
|
||||
|
||||
---
|
||||
|
||||
## ט. תבחין ה — אינטרס ציבורי / חזקת תקינות
|
||||
|
||||
### חזקת תקינות המעשה המנהלי
|
||||
A basic principle of administrative law: every act of the committee is presumed proper until proven otherwise. The burden of proof lies on the applicant.
|
||||
|
||||
### שיקולים אופייניים בבל"מ-רישוי
|
||||
| שיקול | כיוון אופייני |
|
||||
|--------|---------------|
|
||||
| חיזוק מבני מפני רעידות אדמה | תומך ביזם |
|
||||
| ממ"דים / מיגון מפני ירי | תומך ביזם |
|
||||
| הרחבת זכויות דרך / זכויות מעבר | תועלת ציבורית |
|
||||
| חניות תת-קרקעיות (פינוי חניה מרחוב) | תועלת ציבורית |
|
||||
| תקינות הליך (פרסום, התנגדויות, דיון) | חזקת תקינות |
|
||||
| מתנגד סדרתי / בעל אינטרס נסתר | מחליש טענות המבקש |
|
||||
|
||||
---
|
||||
|
||||
## י. טבלת התאמה לעובדות (placeholder לכל תיק)
|
||||
|
||||
| תבחין | עובדה במקרה הנוכחי | כיוון |
|
||||
|--------|---------------------|-------|
|
||||
| ו. זכות עמידה | [לתאר מרחק, נצפות, זכויות בקרקע] | [חוסם / מאפשר / שאלה] |
|
||||
| א. טעם סביר | [פרסום, ידיעה, תצהיר] | [נוטה לקבלה / לדחייה] |
|
||||
| ב. אורך השיהוי | [שנים / חודשים / ימים] | [קל / בינוני / חמור] |
|
||||
| ג. הסתמכות | [השקעות מצוטטות בש"ח] | [קלה / משמעותית / מוחשית] |
|
||||
| ד. סיכויי הליך | [שאלות פתוחות vs. ברורות] | [לכאורה / ספקולטיבי] |
|
||||
| ה. אינטרס ציבורי | [שיקולים ציבוריים בולטים] | [תומך / ניטרלי / נגד] |
|
||||
|
||||
---
|
||||
|
||||
## יא. סעיף מסקנה — מבנה אופייני
|
||||
|
||||
המבנה האופייני של סעיף ההכרעה בבל"מ-רישוי הוא:
|
||||
|
||||
1. **פתיחה — איזון התבחינים בקצרה.** "בחנו את ששת התבחינים... ומצאנו..."
|
||||
2. **תבחין ו (סף).** אם זכות העמידה רופפת/חסרה — זהו לרוב המכריע.
|
||||
3. **תבחינים א-ה.** ניתוח כל אחד בקצרה, עם הפניה לפסיקה.
|
||||
4. **מסקנה כוללת.** "לאור כל האמור — הבקשה להארכת מועד נדחית / מתקבלת".
|
||||
5. **הוצאות.** אם רלוונטי — לפי סעיף 1.
|
||||
|
||||
### לשון אופיינית לדחייה (דפנה תמיר)
|
||||
> "מששה התבחינים שנבחנו — חמישה מצביעים על מסקנה אחת, וגם התבחין השישי אינו
|
||||
> תומך בקבלת הבקשה. נסיבות התיק אינן מצדיקות חריגה מהמועד הסטטוטורי."
|
||||
|
||||
### לשון אופיינית לקבלה
|
||||
> "על אף השיהוי, נסיבות אי-הידיעה מתועדות; ההסתמכות בעיקרה תכנונית ולא ביצועית;
|
||||
> ומחלוקת מהותית ממשית עומדת על הפרק. בנסיבות אלה, יש לפתוח את הדלת לערר על
|
||||
> מנת שהסוגיות יתבררו."
|
||||
|
||||
---
|
||||
|
||||
## יב. הפניות חוצות
|
||||
|
||||
- ראה גם: `docs/methodology/extension-request-betterment_levy.md` (סעיף 14, 45 ימים)
|
||||
- ראה גם: `docs/methodology/extension-request-compensation.md` (סעיף 198(ד), 30 ימים)
|
||||
- ראה גם: `docs/block-schema.md` — מבנה 12 הבלוקים
|
||||
- ראה גם: `skills/decision/SKILL.md` — מדריך סגנון של דפנה
|
||||
- דוגמאות מעובדות: `data/cases/1017-03-26/`, `data/cases/1018-03-26/`, `data/cases/1019-03-26/`
|
||||
215
docs/methodology/extension-request-compensation.md
Normal file
@@ -0,0 +1,215 @@
|
||||
# מתודולוגיה — בל"מ בפיצויים (ס' 197) (9xxx)
|
||||
|
||||
**appeal_subtype:** `extension_request_compensation`
|
||||
**מסלול:** סעיף 198(ד) לחוק התכנון והבנייה, התשכ"ה-1965
|
||||
**מועד סטטוטורי:** 30 ימים מיום החלטת הוועדה המקומית בתביעת הפיצויים
|
||||
|
||||
---
|
||||
|
||||
## א. מבוא — הייחוד של בל"מ בפיצויים
|
||||
|
||||
בל"מ בפיצויים שונה מהותית הן מבל"מ ברישוי והן מבל"מ בהיטל השבחה:
|
||||
|
||||
| ממד | בל"מ ברישוי | בל"מ היטל השבחה | בל"מ פיצויים |
|
||||
|------|--------------|------------------|----------------|
|
||||
| מועד | 30 ימים | 45 ימים | **30 ימים** |
|
||||
| סעיף | 152 | 14 לתוספת ג' | **198(ד)** |
|
||||
| מהות הסעד | ביטול היתר | תיקון שומה | **פיצויי פגיעה בזכויות קניין** |
|
||||
| נטל הוכחה | מקדים | טעות שמאית | **סף קפדני — פגיעה ממונית מוחשית** |
|
||||
| טון אופייני | מעורב | קר/שמאי | **קר, משפטי, חמור** |
|
||||
| הסתמכות | יזם / רשות | רשות (חלוקה) | **רשות + ציבור (תקציבי פיצויים)** |
|
||||
|
||||
### Why the strictest threshold?

Compensation under section 197 is **public money** intended to compensate for tangible monetary harm to land. It has three characteristics that call for strict enforcement of deadlines:

1. **Closed budgets**: the local committee allocates a budget for section 197 compensation; delay disrupts financial planning and the distribution of the budget.
2. **Effect on future planning**: a long postponement in determining the right to compensation disrupts the ability to plan further expropriation or planning proceedings.
3. **Property rights**: both parties (claimant and authority) hold clear proprietary interests. Enforcing deadlines protects both.
|
||||
|
||||
---
|
||||
|
||||
## ב. מסגרת נורמטיבית
|
||||
|
||||
### שכבה א — חקיקה ראשית
|
||||
|
||||
**סעיף 197(א) לחוק התכנון והבנייה:**
|
||||
> "נפגעו על ידי תכנית, שלא בדרך הפקעה, מקרקעין הנמצאים בתחום התכנית או
|
||||
> גובלים עמה, מי שביום תחילתה של התכנית היה בעל המקרקעין או בעל זכות בהם
|
||||
> זכאי לפיצויים מהוועדה המקומית..."
|
||||
|
||||
**סעיף 198(ד) — מועד הערר:**
|
||||
ערר על החלטת הוועדה המקומית בתביעת פיצויים מוגש לוועדת הערר תוך 30 ימים
|
||||
מיום שהומצאה ההחלטה לתובע.
|
||||
|
||||
### שכבה ב — עליון
|
||||
|
||||
**ע"א 210/88 החברה להפצת פרי הארץ נ' הוועדה המקומית כוכב יאיר (פ"ד מו(4) 627):**
|
||||
ביסוס דרישת ההוכחה לפגיעה ממונית מוחשית — לא די בטענה כללית של "ירידת ערך".
|
||||
נדרשת: (א) הוכחת מצב לפני התכנית; (ב) הוכחת מצב אחרי; (ג) הצבעה על קשר סיבתי
|
||||
ישיר; (ד) חוות דעת שמאית כמותית.
|
||||
|
||||
**עע"מ 1968/00 חברת גוש 6195 נ' הוועדה המקומית הרצליה:**
|
||||
חיזוק עקרון הסופיות בפיצויי 197 — שינוי מועדים בהליך פיצויים פוגע באינטרס
|
||||
הציבורי הספציפי של פריסת תקציבים.
|
||||
|
||||
### שכבה ג — ועדות ערר
|
||||
|
||||
(להוסיף תקדימי דפנה תמיר בעררי 9xxx — לחפש בקורפוס "בל\"מ פיצויים" או
|
||||
"הארכת מועד 197".)
|
||||
|
||||
---
|
||||
|
||||
## ג. ארבעה תבחיני בל"מ בפיצויים
|
||||
|
||||
| # | תבחין | אופי | סף |
|
||||
|---|--------|------|-----|
|
||||
| א | **פגיעה ממונית מוחשית** | תנאי סף עצמאי | קריטי |
|
||||
| ב | טעם סביר לאיחור | מקדים — קפדן | גבוה |
|
||||
| ג | אורך השיהוי | כמותי — קצר במיוחד | גבוה |
|
||||
| ד | הסתמכות הרשות (תקציב) | כמותי | גבוה |
|
||||
|
||||
לעומת בל"מ ברישוי ובהיטל השבחה — אין כאן תבחין נפרד של "סיכויי הליך";
|
||||
תבחין הפגיעה (א) משלב את שני הממדים (סיכויי הליך + עצם הזכות לפיצוי).
|
||||
|
||||
---
|
||||
|
||||
## ד. תבחין א — פגיעה ממונית מוחשית (סף הקפדני)
|
||||
|
||||
### הדרישה
|
||||
לא די בטענה לפגיעה. נדרש להוכיח, לפחות לכאורה:
|
||||
|
||||
1. **בעלות / זכות במקרקעין נשוא התביעה** — נסח טאבו, חוזה מאומת, או רישום אחר.
|
||||
2. **תכנית מאושרת שנכנסה לתוקף** — לא טיוטה, לא תב"ע מופקדת — תכנית בתוקף.
|
||||
3. **קשר סיבתי בין התכנית לפגיעה הנטענת** — לא "ירידת ערך כללית" של אזור.
|
||||
4. **חוו"ד שמאית כמותית** — מציגה את ערך הקרקע לפני ואחרי, עם נתוני השוואה.
|
||||
|
||||
### Exclusions

The following do not count as "monetary harm" for purposes of section 197:

- **Theoretical future harm**: a plan not yet in force, options not exercised.
- **Aesthetic or subjective harm**: view, neighbors, atmosphere.
- **Merely temporary harm**: disruptions during construction that do not affect long-term value.
- **Harm to land outside the plan and not adjoining it**: the territorial requirement of "within the plan area or adjoining it" is construed narrowly.
|
||||
|
||||
### דרישת ההוכחה לכאורה בשלב הבל"מ
|
||||
בשלב בל"מ אין צורך להוכיח את הפגיעה במלואה; די ב**הצגת לכאורה משכנעת**
|
||||
המבוססת על מסמכים מקצועיים. הצגה זו מאפשרת לבחון: האם יש בכלל מה לדון
|
||||
לאחר חלוף המועד?
|
||||
|
||||
---
|
||||
|
||||
## ה. תבחין ב — טעם סביר לאיחור
|
||||
|
||||
### העקרון
|
||||
In compensation cases the promptness requirement is very strict. Reasons:

1. **The claimant dealt with the committee directly**: unlike a licensing extension request, the claimant knew of the plan and acted on it (filed a claim with the local committee). Not knowing of the decision is the exception.
2. **Personal service**: the decision is served personally; there is less room for "published on the website" arguments.
3. **The claimant is represented**: a compensation claimant is usually represented by a lawyer; a lawyer's "ignorance" of a deadline is a marked evidentiary weakness.
|
||||
|
||||
### מצבי "טעם סביר" אופייניים
|
||||
| מצב | קבילות |
|
||||
|------|---------|
|
||||
| המצאה פגומה (לא לכתובת עורך הדין) | קבילה — בכפוף לתיעוד |
|
||||
| מחלת התובע (מתועדת) | קבילה |
|
||||
| תקופה ארוכה של "ניסיון להידברות" עם הוועדה | חלשה — לוחות זמנים לא מוקפאים |
|
||||
| המתנה להחלטה שיפוטית במקרה דומה | חלשה — אפשר להגיש "במקרה ש..." |
|
||||
| תקלה במשרד עורך הדין | חלשה — אחריות נשואת ייצוג |
|
||||
|
||||
### דרישות הוכחה
|
||||
- תצהיר מפורט של התובע **וגם** של עורך דינו.
|
||||
- מסמכי תמיכה (כרטיסי רישום בית חולים, אישורים רפואיים, וכו').
|
||||
- תיעוד התכתבות פנימית במשרד עורך הדין (אם רלוונטי).
|
||||
|
||||
---
|
||||
|
||||
## ו. תבחין ג — אורך השיהוי
|
||||
|
||||
### עקרונות
|
||||
- **30 ימים בלבד** = מועד קצר במיוחד.
|
||||
- כל יום מעבר מקבל ניקוד שלילי.
|
||||
- שיהוי של מעל 14 ימים מעבר למועד (סה"כ 44 ימים) — נחשב מובהק.
|
||||
- שיהוי של מעל 60 ימים מעבר (סה"כ 90 ימים) — דורש הצדקה חזקה במיוחד.
|
||||
- שיהוי של מעל 180 ימים — חוסם אלא בנסיבות חריגות (טעות בדין, גילוי מאוחר
|
||||
של עובדה מהותית).
|
||||
|
||||
### חישוב
|
||||
| תאריך | אירוע | שיהוי מצטבר |
|
||||
|--------|--------|--------------|
|
||||
| יום 0 | המצאת החלטה | 0 |
|
||||
| יום 30 | תום מועד סטטוטורי | 0 |
|
||||
| יום X | הגשת הבל"מ | X-30 |
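The thresholds in this section can be expressed as a small lookup table. This is a sketch under stated assumptions: the 30-day deadline and the 14 / 60 / 180-day thresholds are taken from the text, while the labels and the name `delay_band` are invented for illustration. Delays of 1 to 14 days past the deadline are labeled `minor`, reflecting only the statement that every extra day counts against the applicant.

```python
COMPENSATION_DEADLINE_DAYS = 30  # section 198(d) deadline stated in this document

# (days past deadline that must be exceeded, illustrative label), strictest first.
BANDS = [
    (180, "barring_unless_exceptional"),
    (60, "strong_justification_required"),
    (14, "marked"),
]


def delay_band(days_since_service: int) -> str:
    """Classify the delay X - 30 against the thresholds listed in the text."""
    excess = max(0, days_since_service - COMPENSATION_DEADLINE_DAYS)
    for threshold, label in BANDS:
        if excess > threshold:
            return label
    return "minor" if excess > 0 else "on_time"
```

Keeping the thresholds in a table rather than in nested conditions makes it straightforward to adjust them if the guideline changes.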
|
||||
|
||||
---
|
||||
|
||||
## ז. תבחין ד — הסתמכות הרשות (תקציב פיצויים)
|
||||
|
||||
### ייחוד בפיצויים
|
||||
The local committee allocates a budget for section 197 compensation according to its decisions. Delay in filing an appeal:

1. **Harms budget deployment**: the budget has been released from its allocation and moved to other purposes.
2. **Complicates proceedings not yet decided**: other landowners acted on the basis of the existing budget.
3. **Affects tenders and planning contracts**: a change in the amount of compensation affects future development decisions.
|
||||
|
||||
### טבלת בדיקה
|
||||
| Stage | State of the budget | Effect |
|------|-----------|--------|
| Before the end of the fiscal year | Budget active, allocation can be changed | Slight |
| After the fiscal year closes | Budget distributed | Moderate |
| After transfer to other purposes | Compensation requires a new source | Significant |
| After projects are carried out | Economically irreversible | Tangible |
|
||||
|
||||
---
|
||||
|
||||
## ח. טבלת התאמה לעובדות (placeholder לכל תיק)
|
||||
|
||||
| תבחין | עובדה במקרה הנוכחי | כיוון |
|
||||
|--------|---------------------|-------|
|
||||
| א. פגיעה ממונית | [חוו"ד שמאית? קשר סיבתי? תכנית בתוקף?] | [חוסם / מאפשר] |
|
||||
| ב. טעם סביר | [המצאה, ייצוג, תצהיר] | [תומך / מחליש] |
|
||||
| ג. אורך השיהוי | [X ימים מעבר ל-30] | [קל / מובהק / חמור] |
|
||||
| ד. הסתמכות הרשות | [מצב התקציב] | [קל / משמעותי / מוחשי] |
|
||||
|
||||
---
|
||||
|
||||
## I. Conclusion Section: Typical Structure

The typical structure is **rigorous, document-based, and unemotional**:

1. **Findings of fact.** "The decision was served on X. The extension request (בל"מ) was filed on Y. The delay is Z days beyond the statutory deadline."
2. **Test A (harm).** "The applicant presented / did not present an appraiser's opinion. The land lies within the plan area / borders it / lies outside it."
3. **If no prima facie harm was shown: immediate dismissal.** "Absent a prima facie showing of monetary harm, there is no basis for departing from the deadline set by law."
4. **If harm was shown: proceed to Tests B-D.**
5. **Balancing and decision.** Dismissal / grant / remand to the local committee.
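
The gate in steps 3 and 4 can be sketched as a small function. This is only an illustration under stated assumptions: the class `ExtensionRequestFacts`, the function `next_step` and the returned labels are hypothetical names, and the balancing in step 5 is deliberately left out because the text gives no formula for it.

```python
from dataclasses import dataclass


@dataclass
class ExtensionRequestFacts:
    days_beyond_deadline: int     # Z in step 1
    prima_facie_harm_shown: bool  # outcome of Test A (step 2)


def next_step(facts: ExtensionRequestFacts) -> str:
    """Apply the gate in steps 3-4: Test A decides whether Tests B-D are reached."""
    if facts.days_beyond_deadline <= 0:
        return "filed_on_time"       # no extension needed; outside the flow above
    if not facts.prima_facie_harm_shown:
        return "dismiss"             # step 3: immediate dismissal
    return "weigh_tests_b_to_d"      # step 4, followed by the balancing in step 5


print(next_step(ExtensionRequestFacts(days_beyond_deadline=20, prima_facie_harm_shown=False)))
```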
|
||||
|
||||
### Typical language for a dismissal

> "The applicant has not presented prima facie evidence of tangible monetary harm to the land he owns. The land
> lies outside the plan area and does not border it. In these circumstances, and given that the delay
> is X days beyond the short statutory deadline of 30 days, there is no room to depart
> from the deadline. The extension request is dismissed."
|
||||
|
||||
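### Typical language for a grant (highly exceptional)

> "The applicant has presented a professional appraiser's opinion showing a decline in value of approximately X% in land bordering
> the plan area. The prima facie showing is persuasive. In the exceptional circumstances of [details], and notwithstanding
> the strict threshold imposed by section 198(d), the door should be opened to a hearing on the merits."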
|
||||
|
||||
---
|
||||
|
||||
## J. Cross-References

- See also: `docs/methodology/extension-request-building_permit.md` (section 152, 30 days)
- See also: `docs/methodology/extension-request-betterment_levy.md` (section 14, 45 days)
- See also: `docs/block-schema.md`, the 12-block structure
- See also: `skills/decision/SKILL.md`, Dafna's style guide
|
||||
13
mcp-server/src/legal_mcp/chat_service/__init__.py
Normal file
@@ -0,0 +1,13 @@
|
||||
"""legal-chat-service — host-side SSE bridge to ``claude`` CLI.
|
||||
|
||||
Runs as a pm2-managed process on the host (listening on 10.0.1.1:8770 by default).
|
||||
The legal-ai FastAPI container proxies chat requests to it via
|
||||
``host.docker.internal:8770``.
|
||||
|
||||
Why a separate service:
|
||||
The chat needs real-time streaming + multi-turn session continuation
|
||||
(``claude --resume <session_id>``). The container can't run the
|
||||
claude CLI (no binary, no claude.ai credentials). Splitting this out
|
||||
keeps the architectural rule of ``claude_session.py`` intact while
|
||||
enabling the new chat feature for free (no API key).
|
||||
"""
|
||||
210
mcp-server/src/legal_mcp/chat_service/server.py
Normal file
@@ -0,0 +1,210 @@
|
||||
"""HTTP+SSE bridge from FastAPI (in container) to local claude CLI.
|
||||
|
||||
Endpoints:
|
||||
POST /chat/start — body: {prompt, system?, resume_session_id?}
|
||||
returns SSE stream of events from
|
||||
``claude_session.query_streaming``.
|
||||
REQUIRES Authorization: Bearer <secret>.
|
||||
GET /health — liveness probe (no auth — used by FastAPI for status).
|
||||
|
||||
Run with pm2:
|
||||
pm2 start scripts/legal-chat-service.config.cjs
|
||||
|
||||
Standalone for dev:
|
||||
cd ~/legal-ai/mcp-server
|
||||
LEGAL_CHAT_SHARED_SECRET=... .venv/bin/python -m legal_mcp.chat_service.server \
|
||||
--port 8770 --host 10.0.1.1
|
||||
|
||||
Security posture
|
||||
----------------
|
||||
1. Bind defaults to ``10.0.1.1`` — the host's docker0 bridge gateway.
|
||||
Containers on docker bridges (including the legal-ai container, which
|
||||
sits on the ``coolify`` network but routes to docker0 at the host)
|
||||
can reach this address; processes outside the host cannot. Binding to
|
||||
``0.0.0.0`` is permitted but discouraged (relies on the cloud-level
|
||||
firewall as the sole perimeter).
|
||||
2. ``/chat/start`` requires an ``Authorization: Bearer <LEGAL_CHAT_SHARED_SECRET>``
|
||||
header. The secret is loaded from the environment; without it set,
|
||||
the server refuses to start (no fallback to "open" mode, by design —
|
||||
the claude CLI it spawns can run arbitrary tool calls, so an
|
||||
unauthenticated /chat/start is RCE-equivalent).
|
||||
3. ``/health`` is intentionally unauthenticated so the FastAPI proxy
|
||||
can probe liveness with no token. It returns only a static OK and
|
||||
never spawns subprocesses, so it can't be abused.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import asyncio
|
||||
import json
|
||||
import logging
|
||||
import os
|
||||
import sys
|
||||
from typing import Any
|
||||
|
||||
from aiohttp import web
|
||||
|
||||
# Run-via-CLI bootstrap so ``python -m legal_mcp.chat_service.server``
|
||||
# works even when the package isn't installed (it is in the venv, but
|
||||
# this safeguard keeps the entrypoint robust).
|
||||
_pkg_root = os.path.dirname(os.path.dirname(os.path.dirname(__file__)))
|
||||
if _pkg_root not in sys.path:
|
||||
sys.path.insert(0, _pkg_root)
|
||||
|
||||
from legal_mcp.services import claude_session # noqa: E402
|
||||
|
||||
logger = logging.getLogger("legal_chat_service")
|
||||
|
||||
|
||||
# Loaded once at startup. Validated to be non-empty in main(); the handler
|
||||
# uses a constant-time compare to avoid timing oracles on a short input.
|
||||
_SHARED_SECRET: str = ""
|
||||
|
||||
|
||||
async def health(request: web.Request) -> web.Response:
|
||||
return web.json_response({"ok": True, "service": "legal-chat-service"})
|
||||
|
||||
|
||||
def _check_bearer(request: web.Request) -> web.Response | None:
|
||||
"""Validate ``Authorization: Bearer <secret>``. Returns 401 response on failure."""
|
||||
auth = request.headers.get("Authorization", "")
|
||||
expected = "Bearer " + _SHARED_SECRET
|
||||
# ``compare_digest`` defends against timing attacks. Strings of different
|
||||
# length still leak length, but for a 43-char urlsafe token that's
|
||||
# uninteresting and the auth scheme prefix anchors it anyway.
|
||||
import hmac
|
||||
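    # Compare as bytes: ``compare_digest`` raises TypeError on non-ASCII str
    # input, which would turn a malformed header into a 500 instead of a 401.
    if not auth or not hmac.compare_digest(
        auth.encode("utf-8", "surrogatepass"), expected.encode("utf-8")
    ):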
|
||||
return web.json_response(
|
||||
{"error": "unauthorized: missing or invalid Bearer token"},
|
||||
status=401,
|
||||
)
|
||||
return None
|
||||
|
||||
|
||||
async def chat_start(request: web.Request) -> web.StreamResponse:
|
||||
"""Drive ``claude_session.query_streaming`` and forward events as SSE.
|
||||
|
||||
Request body (JSON):
|
||||
prompt: str — required, user message
|
||||
system: str | None — system instructions (ignored if resuming)
|
||||
resume_session_id: str | None — continue a prior CLI session
|
||||
timeout: int = 3600 — hard timeout for the subprocess
|
||||
"""
|
||||
unauth = _check_bearer(request)
|
||||
if unauth is not None:
|
||||
return unauth
|
||||
|
||||
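    try:
        body = await request.json()
    except json.JSONDecodeError:
        return web.json_response({"error": "invalid JSON body"}, status=400)
    # A valid JSON array or scalar has no ``.get``; reject it explicitly.
    if not isinstance(body, dict):
        return web.json_response({"error": "JSON body must be an object"}, status=400)

    prompt = body.get("prompt") or ""
    if not isinstance(prompt, str) or not prompt.strip():
        return web.json_response({"error": "prompt is required"}, status=400)
    system = body.get("system")
    resume_session_id = body.get("resume_session_id")
    # A non-numeric timeout should be a client error, not an unhandled exception.
    try:
        timeout = int(body.get("timeout") or 3600)
    except (TypeError, ValueError):
        return web.json_response({"error": "timeout must be an integer"}, status=400)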
|
||||
|
||||
response = web.StreamResponse(
|
||||
status=200,
|
||||
reason="OK",
|
||||
headers={
|
||||
"Content-Type": "text/event-stream",
|
||||
"Cache-Control": "no-cache, no-transform",
|
||||
"Connection": "keep-alive",
|
||||
# X-Accel-Buffering=no defeats nginx/traefik buffering — the
|
||||
# FastAPI container proxies via httpx and forwards bytes as
|
||||
# they arrive, but the inner header is harmless and makes
|
||||
# browser-direct testing easier.
|
||||
"X-Accel-Buffering": "no",
|
||||
},
|
||||
)
|
||||
await response.prepare(request)
|
||||
|
||||
async def send_event(payload: dict[str, Any]) -> None:
|
||||
line = f"data: {json.dumps(payload, ensure_ascii=False)}\n\n"
|
||||
await response.write(line.encode("utf-8"))
|
||||
|
||||
try:
|
||||
async for event in claude_session.query_streaming(
|
||||
prompt,
|
||||
system=system,
|
||||
resume_session_id=resume_session_id,
|
||||
timeout=timeout,
|
||||
):
|
||||
await send_event(event)
|
||||
            if event.get("type") in ("done", "error"):
|
||||
break
|
||||
except asyncio.CancelledError:
|
||||
# Client disconnected — bail cleanly.
|
||||
logger.info("chat_start: client disconnected")
|
||||
except Exception as e:
|
||||
logger.exception("chat_start: streaming failed")
|
||||
try:
|
||||
await send_event({"type": "error", "message": str(e)})
|
||||
except ConnectionResetError:
|
||||
pass
|
||||
|
||||
try:
|
||||
await response.write_eof()
|
||||
except ConnectionResetError:
|
||||
pass
|
||||
return response
|
||||
|
||||
|
||||
def build_app() -> web.Application:
|
||||
app = web.Application()
|
||||
app.router.add_get("/health", health)
|
||||
app.router.add_post("/chat/start", chat_start)
|
||||
return app
|
||||
|
||||
|
||||
def main() -> int:
|
||||
parser = argparse.ArgumentParser(description="legal-chat-service")
|
||||
parser.add_argument("--port", type=int, default=8770)
|
||||
parser.add_argument(
|
||||
"--host", default="10.0.1.1",
|
||||
help=(
|
||||
"bind address. Default 10.0.1.1 = docker0 bridge gateway — "
|
||||
"reachable from containers, invisible to non-host networks. "
|
||||
"Use 127.0.0.1 for host-local dev; do not bind 0.0.0.0 "
|
||||
"without a separate perimeter firewall."
|
||||
),
|
||||
)
|
||||
parser.add_argument("--log-level", default="INFO")
|
||||
args = parser.parse_args()
|
||||
|
||||
logging.basicConfig(
|
||||
level=args.log_level.upper(),
|
||||
format="%(asctime)s %(name)s %(levelname)s %(message)s",
|
||||
)
|
||||
|
||||
secret = os.environ.get("LEGAL_CHAT_SHARED_SECRET", "").strip()
|
||||
if not secret:
|
||||
logger.error(
|
||||
"LEGAL_CHAT_SHARED_SECRET is empty; refusing to start. "
|
||||
"Set it in /home/chaim/.legal-chat-service.env (loaded by "
|
||||
"pm2) and mirror it as a Coolify env var on the legal-ai app."
|
||||
)
|
||||
return 2
|
||||
if len(secret) < 24:
|
||||
logger.error(
|
||||
"LEGAL_CHAT_SHARED_SECRET is too short (got %d chars); "
|
||||
"refusing to start. Use >=32 chars (e.g. python3 -c "
|
||||
"'import secrets; print(secrets.token_urlsafe(32))').",
|
||||
len(secret),
|
||||
)
|
||||
return 2
|
||||
global _SHARED_SECRET
|
||||
_SHARED_SECRET = secret
|
||||
|
||||
app = build_app()
|
||||
logger.info("legal-chat-service listening on %s:%d", args.host, args.port)
|
||||
web.run_app(app, host=args.host, port=args.port, print=lambda _msg: None)
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
sys.exit(main())
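
On the consuming side, a client of `/chat/start` has to split the byte stream back into events. The sketch below assumes the frames look like what `send_event` writes: a single `data: <json>` line followed by a blank line. The helper name `iter_sse_events` and the `text` event shape are hypothetical; only the `done` and `error` event types appear in the handler above.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(chunks: Iterable[bytes]) -> Iterator[dict]:
    """Yield decoded JSON payloads from a stream of ``data: ...`` SSE frames."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # A frame is complete once the blank-line terminator has arrived.
        while b"\n\n" in buffer:
            frame, buffer = buffer.split(b"\n\n", 1)
            for line in frame.split(b"\n"):
                if line.startswith(b"data: "):
                    yield json.loads(line[len(b"data: "):].decode("utf-8"))


# Network chunks may cut a frame anywhere, so the sample splits one mid-payload.
sample = [
    b'data: {"type": "text", "te',
    b'xt": "hi"}\n\ndata: {"type": "done"}\n\n',
]
events = list(iter_sse_events(sample))
print(events)
```

Buffering bytes and decoding only whole frames also keeps multi-byte UTF-8 characters intact, which matters here because the server writes Hebrew text with `ensure_ascii=False`.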
|
||||
@@ -87,6 +87,20 @@ MULTIMODAL_TEXT_WEIGHT = float(
|
||||
# concentrate weight at top ranks; higher values flatten the curve.
|
||||
MULTIMODAL_RRF_K = int(os.environ.get("MULTIMODAL_RRF_K", "60"))
|
||||
|
||||
# BM25/lexical hybrid — fuse ``ts_rank_cd`` over ``content_tsv``/
|
||||
# ``rule_tsv`` (DB schema V12) with the semantic cosine layer via RRF.
|
||||
# Recovers recall on exact-string queries that voyage embeddings blur
|
||||
# (e.g. case-number citations like "1461/20", "317/10"; rare planning
|
||||
# vocabulary). Hebrew uses the ``simple`` text-search config — no
|
||||
# stemmer needed, and numeric/punctuation tokens stay intact. When
|
||||
# disabled, hybrid search falls back to semantic-only (the previous
|
||||
# behaviour). On by default: the lexical leg is cheap (GIN index) and
# only ever *adds* candidates to RRF; it never removes a semantic hit
# from the fused list.
|
||||
BM25_HYBRID_ENABLED = (
|
||||
os.environ.get("BM25_HYBRID_ENABLED", "true").lower() == "true"
|
||||
)
|
||||
|
||||
# Halacha extraction — auto-approve threshold. Halachot with extractor
|
||||
# confidence >= this value are inserted with review_status='approved'
|
||||
# instead of 'pending_review' (so they immediately appear in
|
||||
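
The BM25/lexical comment above relies on reciprocal rank fusion (RRF). A minimal sketch of the fusion step follows, assuming each leg returns an ordered list of document ids and that the lexical leg uses the same `k` as `MULTIMODAL_RRF_K` (60); the function name `rrf_fuse` and the sample ids are illustrative, not taken from the codebase.

```python
def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse ranked id lists: each list contributes 1 / (k + rank) per document."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


semantic = ["a", "b", "c"]  # hypothetical semantic-leg ranking
lexical = ["d", "a"]        # hypothetical lexical-leg ranking
fused = rrf_fuse([semantic, lexical])
print([doc_id for doc_id, _ in fused])
```

In this sample the top semantic hit `a` stays first because both legs vote for it, while the lexical-only hit `d` moves ahead of the lower-ranked semantic hits `b` and `c`. The lexical leg never removes a candidate, but it can reorder the tail.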
@@ -118,6 +132,43 @@ def find_case_dir(case_number: str) -> Path:
|
||||
CHUNK_SIZE_TOKENS = 600
|
||||
CHUNK_OVERLAP_TOKENS = 100
|
||||
|
||||
# Parent-doc retrieval (TaskMaster #48) — hierarchical chunking + lookup.
|
||||
# When enabled:
|
||||
# - The ingest pipeline emits two tiers of precedent_chunks: small
|
||||
# "child" chunks (~300 tokens) for high-recall semantic/lexical
|
||||
# matching, and larger "parent" chunks (~1500 tokens) that contain
|
||||
# ~5 children each. Children are embedded and indexed; parents
|
||||
# carry the broader text the LLM gets back.
|
||||
# - Search runs against children, then swaps each hit for its parent
|
||||
# row before returning — so the writer sees a coherent passage
|
||||
# instead of a 300-token sliver.
|
||||
#
|
||||
# Off by default: the schema (V17) is safe to apply even when the flag
|
||||
# is false (the chunker still emits single-tier chunks and search just
|
||||
# returns them unchanged). Flip to true ONLY after the corpus has been
|
||||
# re-ingested with the hierarchical chunker — see precedent_library
|
||||
# ingest pipeline + the backfill plan in TaskMaster #48.
|
||||
PARENT_DOC_RETRIEVAL_ENABLED = (
|
||||
os.environ.get("PARENT_DOC_RETRIEVAL_ENABLED", "false").lower() == "true"
|
||||
)
|
||||
# Child chunks are what get embedded + matched. Smaller = higher recall,
|
||||
# more rows. 300 tokens (~600 chars Hebrew) is the empirical sweet spot
|
||||
# referenced in the original parent-doc literature (Anthropic, LlamaIndex).
|
||||
PARENT_DOC_CHILD_SIZE_TOKENS = int(
|
||||
os.environ.get("PARENT_DOC_CHILD_SIZE_TOKENS", "300")
|
||||
)
|
||||
# Parent chunks are what get returned to the LLM. Large enough to hold
|
||||
# a full rule statement plus the surrounding paragraph and any cited
|
||||
# authority. 1500 tokens = ~5 children at 300 each.
|
||||
PARENT_DOC_PARENT_SIZE_TOKENS = int(
|
||||
os.environ.get("PARENT_DOC_PARENT_SIZE_TOKENS", "1500")
|
||||
)
|
||||
# Child overlap — keeps neighbouring children sharing ~50 tokens so a
|
||||
# sentence on a chunk boundary still matches the natural phrasing.
|
||||
PARENT_DOC_CHILD_OVERLAP_TOKENS = int(
|
||||
os.environ.get("PARENT_DOC_CHILD_OVERLAP_TOKENS", "50")
|
||||
)
|
||||
|
||||
# External service allowlist — case materials may ONLY be sent to these domains
|
||||
ALLOWED_EXTERNAL_SERVICES = {
|
||||
"api.voyageai.com", # Voyage AI (embeddings)
|
||||
|
||||
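
The parent-doc settings above describe a two-tier scheme: match on small children, return the larger parent. The sketch below illustrates that idea on a plain token list, using the default sizes from the config. The function names and the row shape are assumptions, and the real chunker may split on sentence or section boundaries rather than fixed windows.

```python
def split_tokens(tokens: list, size: int, overlap: int = 0) -> list[list]:
    """Split a token list into windows of `size` that share `overlap` tokens."""
    if size <= overlap:
        raise ValueError("size must be larger than overlap")
    chunks, start = [], 0
    while start < len(tokens):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end of the input
        start += size - overlap
    return chunks


def build_hierarchy(tokens, parent_size=1500, child_size=300, child_overlap=50):
    """Return (parents, children); each child row records its parent's index."""
    parents = split_tokens(tokens, parent_size)
    children = [
        {"parent_idx": idx, "tokens": child}
        for idx, parent in enumerate(parents)
        for child in split_tokens(parent, child_size, child_overlap)
    ]
    return parents, children


def parents_for_hits(child_hits: list[dict]) -> list[int]:
    """Swap child hits for their parents, keeping rank order and dropping repeats."""
    return list(dict.fromkeys(hit["parent_idx"] for hit in child_hits))


parents, children = build_hierarchy(list(range(3000)))
print(len(parents), len(children))
print(parents_for_hits([children[7], children[0], children[8]]))
```

With a 50-token overlap, fixed windows give six children per 1500-token parent rather than five, so the "~5 children" figure is best read as approximate.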
@@ -53,6 +53,11 @@ mcp = FastMCP(
|
||||
from legal_mcp.tools import ( # noqa: E402
|
||||
cases, documents, search, drafting, workflow, precedents,
|
||||
precedent_library as plib,
|
||||
internal_decisions as int_tools,
|
||||
legal_arguments as la_tools,
|
||||
missing_precedents as mp_tools,
|
||||
citations as cit_tools,
|
||||
training_enrichment as train_tools,
|
||||
)
|
||||
|
||||
|
||||
@@ -196,11 +201,20 @@ async def precedent_library_list(
|
||||
precedent_level: str = "",
|
||||
source_type: str = "",
|
||||
search: str = "",
|
||||
source_kind: str = "external_upload",
|
||||
limit: int = 100,
|
||||
) -> str:
|
||||
"""רשימת הפסיקה בקורפוס הסמכותי, עם פילטרים."""
|
||||
"""רשימת הפסיקה בקורפוס, עם פילטרים.
|
||||
|
||||
source_kind: 'external_upload' (ברירת מחדל — פס"ד בתי משפט) /
|
||||
'internal_committee' (החלטות ועדות ערר ערר/בל"מ שהועלו) /
|
||||
'all_committees' (שתיהן — internal + appeals_committee).
|
||||
החלטות ערר/בל"מ שמעלים נשמרות כ-internal_committee — כדי לראותן
|
||||
ברשימה השתמש ב-source_kind='internal_committee' או 'all_committees'.
|
||||
"""
|
||||
return await plib.precedent_library_list(
|
||||
practice_area, court, precedent_level, source_type, search, limit,
|
||||
practice_area, court, precedent_level, source_type, search,
|
||||
source_kind, limit,
|
||||
)
|
||||
|
||||
|
||||
@@ -244,6 +258,18 @@ async def precedent_extract_metadata(case_law_id: str) -> str:
|
||||
return await plib.precedent_extract_metadata(case_law_id)
|
||||
|
||||
|
||||
@mcp.tool()
|
||||
async def style_corpus_enrich(corpus_id: str, overwrite: bool = False) -> str:
|
||||
"""חילוץ מטא-דאטה (summary, outcome, key_principles, appeal_subtype) להחלטה בקורפוס הסגנון של דפנה. ברירת מחדל: ממלא רק שדות ריקים. שלח `overwrite=true` כדי לרענן."""
|
||||
return await train_tools.extract_decision_metadata(corpus_id, overwrite=overwrite)
|
||||
|
||||
|
||||
@mcp.tool()
|
||||
async def style_corpus_pending_enrichment(limit: int = 50) -> str:
|
||||
"""רשימת החלטות בקורפוס הסגנון שעדיין חסרות summary/outcome/key_principles — מועמדות לחילוץ."""
|
||||
return await train_tools.list_corpus_pending_enrichment(limit)
|
||||
|
||||
|
||||
@mcp.tool()
|
||||
async def precedent_process_pending(kind: str = "metadata", limit: int = 20) -> str:
|
||||
"""ריקון תור בקשות חילוץ שנשלחו מ-UI. kind: 'metadata' או 'halacha'. מריץ extractor מקומית עם CLI על כל פריט בתור, ומנקה את הסימון אחרי הצלחה."""
|
||||
@@ -363,6 +389,28 @@ async def get_claims(
|
||||
return await documents.get_claims(case_number, party_role)
|
||||
|
||||
|
||||
# Legal arguments — aggregated (de-duped) propositions
|
||||
@mcp.tool()
|
||||
async def aggregate_claims_to_arguments(
|
||||
case_number: str,
|
||||
force: bool = False,
|
||||
) -> str:
|
||||
"""כינוס פרופוזיציות גולמיות (claims) לטיעונים משפטיים מובחנים — ~6-12 לכל צד.
|
||||
|
||||
משתמש ב-Claude headless לסיווג ואיגוד. force=True מוחק טיעונים קיימים לפני חישוב מחדש.
|
||||
"""
|
||||
return await la_tools.aggregate_claims_to_arguments(case_number, force=force)
|
||||
|
||||
|
||||
@mcp.tool()
|
||||
async def get_legal_arguments(
|
||||
case_number: str,
|
||||
party: str = "",
|
||||
) -> str:
|
||||
"""שליפת טיעונים משפטיים מאוגדים. party: appellant/respondent/committee/permit_applicant (ריק=הכל)."""
|
||||
return await la_tools.get_legal_arguments(case_number, party)
|
||||
|
||||
|
||||
# References
|
||||
@mcp.tool()
|
||||
async def extract_references(
|
||||
@@ -422,6 +470,7 @@ async def search_internal_decisions(
|
||||
chair_name: str = "",
|
||||
limit: int = 10,
|
||||
include_halachot: bool = True,
|
||||
include_cited_by: bool = False,
|
||||
) -> str:
|
||||
"""חיפוש בהחלטות ועדות ערר לתכנון ובנייה (כל המחוזות).
|
||||
|
||||
@@ -436,9 +485,13 @@ async def search_internal_decisions(
|
||||
chair_name: שם יו"ר הוועדה לסינון. ריק = כל היו"רים
|
||||
limit: מספר תוצאות מקסימלי
|
||||
include_halachot: האם לכלול הלכות שחולצו
|
||||
include_cited_by: True = הוסף תוצאות עקיפות — לכל hit הוסף גם החלטות
|
||||
שהוא מצטט (מתוך citation graph). שימושי לחיפוש "כל הקשור ל-X"
|
||||
כשרוצים להרחיב מעבר לטקסט המקורי. default False.
|
||||
"""
|
||||
return await search.search_internal_decisions(
|
||||
query, practice_area, appeal_subtype, district, chair_name, limit, include_halachot,
|
||||
include_cited_by=include_cited_by,
|
||||
)
|
||||
|
||||
|
||||
@@ -662,6 +715,183 @@ async def internal_decision_enrich(
|
||||
return _json.dumps(result, ensure_ascii=False, indent=2)
|
||||
|
||||
|
||||
@mcp.tool()
|
||||
async def internal_decision_upload(
|
||||
file_path: str,
|
||||
case_number: str,
|
||||
chair_name: str,
|
||||
district: str,
|
||||
case_name: str = "",
|
||||
court: str = "",
|
||||
decision_date: str = "",
|
||||
practice_area: str = "",
|
||||
appeal_subtype: str = "",
|
||||
subject_tags: list[str] | None = None,
|
||||
summary: str = "",
|
||||
is_binding: bool = False,
|
||||
) -> str:
    """Upload an appeals-committee decision (internal_committee) to the authoritative corpus.

    Required fields: file_path, case_number, chair_name, district.
    The decision is saved via ingest_internal_decision and is tagged source_kind='internal_committee' automatically.
    Valid district values: ירושלים / מרכז / תל אביב / צפון / דרום / חיפה / ארצי.

    Unlike precedent_library_upload (which always saves as external_upload),
    this tool is the authoritative path for appeals-committee decisions and enforces chair_name + district.
    """
|
||||
return await int_tools.internal_decision_upload(
|
||||
file_path=file_path,
|
||||
case_number=case_number,
|
||||
chair_name=chair_name,
|
||||
district=district,
|
||||
case_name=case_name,
|
||||
court=court,
|
||||
decision_date=decision_date,
|
||||
practice_area=practice_area,
|
||||
appeal_subtype=appeal_subtype,
|
||||
subject_tags=subject_tags,
|
||||
summary=summary,
|
||||
is_binding=is_binding,
|
||||
)
|
||||
|
||||
|
||||
# ── Missing precedents (TaskMaster #35) ───────────────────────────
|
||||
|
||||
|
||||
@mcp.tool()
|
||||
async def missing_precedent_create(
|
||||
citation: str,
|
||||
case_number: str = "",
|
||||
cited_in_document_id: str = "",
|
||||
cited_by_party: str = "unknown",
|
||||
cited_by_party_name: str = "",
|
||||
legal_topic: str = "",
|
||||
legal_issue: str = "",
|
||||
claim_quote: str = "",
|
||||
case_name: str = "",
|
||||
notes: str = "",
|
||||
) -> str:
|
||||
"""תיעוד פסיקה שצוטטה בכתבי הטענות אך אינה בקורפוס.
|
||||
|
||||
שימוש: סוכן המחקר (legal-researcher) קורא לזה כשהוא מזהה ציטוט שלא
|
||||
ניתן לאמת מול הקורפוס. הרשומה נשארת 'open' עד שהיו"ר מעלה את הפסיקה.
|
||||
cited_by_party: appellant / respondent / committee / permit_applicant / unknown.
|
||||
דה-דופ אוטומטי: ציטוט+תיק זהים → מחזיר את הרשומה הקיימת.
|
||||
"""
|
||||
return await mp_tools.missing_precedent_create(
|
||||
citation=citation,
|
||||
case_number=case_number,
|
||||
cited_in_document_id=cited_in_document_id,
|
||||
cited_by_party=cited_by_party,
|
||||
cited_by_party_name=cited_by_party_name,
|
||||
legal_topic=legal_topic,
|
||||
legal_issue=legal_issue,
|
||||
claim_quote=claim_quote,
|
||||
case_name=case_name,
|
||||
notes=notes,
|
||||
)
|
||||
|
||||
|
||||
@mcp.tool()
|
||||
async def missing_precedent_list(
|
||||
case_number: str = "",
|
||||
status: str = "open",
|
||||
legal_topic: str = "",
|
||||
limit: int = 50,
|
||||
) -> str:
|
||||
"""רשימת פסיקות חסרות לתיק או בכלל. status: open/uploaded/closed/irrelevant.
|
||||
|
||||
שימוש: היו"ר רואה מה ממתין להעלאה; הסוכן מאשר שלא יוצר כפילויות.
|
||||
"""
|
||||
return await mp_tools.missing_precedent_list(
|
||||
case_number=case_number,
|
||||
status=status,
|
||||
legal_topic=legal_topic,
|
||||
limit=limit,
|
||||
)
|
||||
|
||||
|
||||
@mcp.tool()
|
||||
async def missing_precedent_close(
|
||||
id: str,
|
||||
linked_case_law_id: str = "",
|
||||
notes: str = "",
|
||||
status: str = "closed",
|
||||
) -> str:
|
||||
"""סגירת רשומת פסיקה חסרה לאחר העלאה לקורפוס.
|
||||
|
||||
status: closed (הועלה ונקשר) / uploaded (הועלה, ממתין לקישור) /
|
||||
irrelevant (היו"ר החליט שזה לא רלוונטי לקורפוס).
|
||||
"""
|
||||
return await mp_tools.missing_precedent_close(
|
||||
id=id,
|
||||
linked_case_law_id=linked_case_law_id,
|
||||
notes=notes,
|
||||
status=status,
|
||||
)
|
||||
|
||||
|
||||
# ── Internal citations graph (TaskMaster #34) ─────────────────────
|
||||
|
||||
|
||||
@mcp.tool()
|
||||
async def extract_internal_citations(
|
||||
case_law_id: str = "",
|
||||
chair_name: str = "",
|
||||
limit: int = 0,
|
||||
) -> str:
    """Extract internal citations from appeals-committee decisions and store them in the citation graph.

    Uses Hebrew regex patterns ("ונפנה ל…", "כפי שקבעתי…", "ראה החלטתי…")
    to detect references between decisions. If case_law_id is given, runs on a
    single row (useful after an upload). If chair_name is given, runs on all
    decisions by that chair. If both are empty, runs on the entire
    internal_committee corpus.

    Idempotent: can be re-run without creating duplicates. Citations that point
    to decisions not yet in the corpus are stored as unlinked
    (cited_case_law_id=NULL) and show up in list_internal_citations, so the
    chair can decide whether to upload them.
    """
|
||||
return await cit_tools.extract_internal_citations(
|
||||
case_law_id=case_law_id,
|
||||
chair_name=chair_name,
|
||||
limit=limit,
|
||||
)
|
||||
|
||||
|
||||
@mcp.tool()
|
||||
async def list_internal_citations(
|
||||
case_law_id: str = "",
|
||||
linked_only: bool = False,
|
||||
limit: int = 50,
|
||||
) -> str:
    """List the outgoing citations of a decision (what the decision cites).

    Used to get a picture of the case law a decision relied on.
    linked_only=True keeps only citations that were matched to a case_law row in the corpus.
    """
|
||||
return await cit_tools.list_internal_citations(
|
||||
case_law_id=case_law_id,
|
||||
linked_only=linked_only,
|
||||
limit=limit,
|
||||
)
|
||||
|
||||
|
||||
@mcp.tool()
|
||||
async def list_incoming_citations(
|
||||
case_law_id: str = "",
|
||||
limit: int = 50,
|
||||
) -> str:
|
||||
"""רשימת ציטוטים נכנסים אל החלטה (אילו החלטות מצטטות אותה).
|
||||
|
||||
שימוש: רוצים לדעת אילו החלטות של דפנה (או של ועדות אחרות) הסתמכו
|
||||
על פסק דין מסוים — מעבירים את ה-case_law_id של פסק הדין.
|
||||
"""
|
||||
return await cit_tools.list_incoming_citations(
|
||||
case_law_id=case_law_id,
|
||||
limit=limit,
|
||||
)
|
||||
|
||||
|
||||
@mcp.tool()
|
||||
async def record_chair_feedback(
|
||||
case_number: str,
|
||||
|
||||
@@ -250,8 +250,19 @@ async def extract_appraiser_facts(case_id: UUID) -> dict:
|
||||
|
||||
conflicts = await db.detect_appraiser_conflicts(case_id)
|
||||
|
||||
# Don't swallow extractor failures: if every appraisal errored and no
|
||||
# facts were extracted, surface that as a distinct status instead of
|
||||
# the misleading "completed, 0 facts" we used to return — the caller
|
||||
# (and the UI) need to know that nothing actually ran.
|
||||
all_errored = (
|
||||
total_facts == 0
|
||||
and by_doc
|
||||
and all(d.get("status") == "error" for d in by_doc)
|
||||
)
|
||||
status = "extraction_failed" if all_errored else "completed"
|
||||
|
||||
return {
|
||||
"status": "completed",
|
||||
"status": status,
|
||||
"appraisal_count": len(appraisals),
|
||||
"total_facts": total_facts,
|
||||
"conflicts": conflicts,
|
||||
|
||||
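
The status logic in this hunk depends on a detail that is easy to miss: `all()` over an empty sequence is `True`, so the `by_doc` check is what keeps a case with no appraisals from being reported as a failure. A standalone sketch follows, with the hypothetical helper name `summarize_status` and an explicit `bool()` so the intermediate value is a real boolean rather than a list:

```python
def summarize_status(total_facts: int, by_doc: list[dict]) -> str:
    """Return 'extraction_failed' only when every document errored and nothing was extracted."""
    all_errored = (
        total_facts == 0
        and bool(by_doc)  # guards against all([]) being vacuously True
        and all(d.get("status") == "error" for d in by_doc)
    )
    return "extraction_failed" if all_errored else "completed"


print(summarize_status(0, [{"status": "error"}, {"status": "error"}]))
print(summarize_status(0, []))
```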
358
mcp-server/src/legal_mcp/services/argument_aggregator.py
Normal file
@@ -0,0 +1,358 @@
|
||||
"""כינוס פרופוזיציות לטיעונים משפטיים מובחנים — argument de-duplication.
|
||||
|
||||
Workflow:
|
||||
1. ``claims_extractor`` extracts ~20-30 raw propositions per litigation
|
||||
brief into the ``claims`` table.
|
||||
2. This module groups those raw propositions, per party, into 6-12
|
||||
distinct legal arguments via Claude headless (`claude_session`).
|
||||
3. The result is stored in ``legal_arguments`` plus ``legal_argument_
|
||||
propositions`` (M:M join) so we keep traceability back to the source
|
||||
claims.
|
||||
|
||||
Manually de-duping 184 propositions in 3 cases yielded 82 arguments
|
||||
(~27/case); see ``data/cases/{1017,1018,1019}-03-26/documents/research/
|
||||
legal-arguments.md`` for the gold standard.
|
||||
|
||||
**Architectural constraint**: ``claude_session`` only works from the local
|
||||
MCP server (Claude CLI is not installed in the FastAPI container). Calls
|
||||
from ``web/`` must go through MCP tools; calls from MCP tools land here
|
||||
directly.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import logging
|
||||
from uuid import UUID
|
||||
|
||||
from legal_mcp.services import claude_session, db
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
# Allowed enum values mirror the DB CHECK constraints.
|
||||
ALLOWED_PARTIES = {"appellant", "respondent", "committee", "permit_applicant", "unknown"}
|
||||
ALLOWED_PRIORITIES = {"threshold", "substantive", "procedural", "relief"}
|
||||
|
||||
# Hebrew labels for the prompt (Claude needs context in the same
|
||||
# language as the source material).
|
||||
PARTY_LABELS_HE = {
|
||||
"appellant": "עוררים",
|
||||
"respondent": "משיבים",
|
||||
"committee": "ועדה מקומית",
|
||||
"permit_applicant": "מבקשי היתר",
|
||||
"unknown": "צד לא מזוהה",
|
||||
}
|
||||
|
||||
|
||||
AGGREGATE_PROMPT_TEMPLATE = """אתה מנתח כתבי טענות בתחום תכנון ובנייה (ועדת ערר).
|
||||
|
||||
לפניך {n} פרופוזיציות גולמיות שחולצו ממסמכי {party_he} בתיק ערר.
|
||||
מטרתך: לקבץ אותן ל-{target_min}-{target_max} **טיעונים משפטיים מובחנים**
|
||||
(ארגומנטים אמיתיים, לא חזרה מילולית של הפרופוזיציות).
|
||||
|
||||
## כללי איגוד:
|
||||
1. **טיעון אמיתי = רעיון משפטי אחד** — לא רשימה של פרופוזיציות, אלא טענה משפטית עצמאית.
|
||||
2. **מקבצים פרופוזיציות שתומכות באותו רעיון משפטי** — גם אם הניסוח שלהן שונה.
|
||||
3. **מפרידים בין סוגי טענות**:
|
||||
- **threshold** = טענות סף (זכות עמידה, סמכות, מועדים, שיהוי)
|
||||
- **substantive** = טענות מהותיות (תחולת חוק, פרשנות, חישוב)
|
||||
- **procedural** = פגמי הליך (פרסום, פרוטוקול, ניגוד עניינים)
|
||||
- **relief** = סעדים מבוקשים / סיכומים
|
||||
4. **כותרת קצרה ובהירה** — תיאורית, לא משפטית מפורטת. 5-15 מילים.
|
||||
5. **גוף הטיעון בפסקה אחת** — 3-7 שורות עברית, נאמן למקור.
|
||||
6. **שמירת ה-claim_ids המקוריים** — לכל טיעון, רשום אילו פרופוזיציות תומכות בו.
|
||||
|
||||
## פלט:
|
||||
החזר JSON בלבד (ללא markdown, ללא הסברים), array של אובייקטים:
|
||||
```
|
||||
[
|
||||
{{
|
||||
"title": "כותרת קצרה של הטיעון",
|
||||
"body": "גוף הטיעון בפסקה אחת",
|
||||
"topic": "סוגיה משפטית קצרה (לדוגמה: 'זכות עמידה', 'תחולת תמ\\"א 38')",
|
||||
"priority": "threshold|substantive|procedural|relief",
|
||||
"claim_ids": ["uuid-1", "uuid-2"]
|
||||
}}
|
||||
]
|
||||
```
|
||||
|
||||
## הפרופוזיציות:
|
||||
{propositions_json}
|
||||
"""
|
||||
|
||||
|
||||
def _build_prompt(party: str, propositions: list[dict]) -> str:
|
||||
"""Compose the per-party aggregation prompt."""
|
||||
n = len(propositions)
|
||||
    # Conservative target: roughly one argument per 2-4 propositions, with
    # both ends clamped so the requested range stays within 4-12.
    target_min = max(4, min(11, n // 4))
    target_max = max(target_min + 1, min(12, n // 2 + 1))
|
||||
|
||||
party_he = PARTY_LABELS_HE.get(party, party)
|
||||
# Strip noise from propositions for the prompt — Claude only needs
|
||||
# the id and the text to do the grouping.
|
||||
compact = [
|
||||
{"id": str(p["id"]), "text": p["claim_text"]}
|
||||
for p in propositions
|
||||
]
|
||||
propositions_json = json.dumps(compact, ensure_ascii=False, indent=2)
|
||||
|
||||
return AGGREGATE_PROMPT_TEMPLATE.format(
|
||||
n=n,
|
||||
party_he=party_he,
|
||||
target_min=target_min,
|
||||
target_max=target_max,
|
||||
propositions_json=propositions_json,
|
||||
)
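
To see how the requested range behaves as the number of propositions grows, here is a small sketch of a variant with both ends clamped, so the upper bound never exceeds 12. The helper name `target_range` is illustrative, and the clamp on the lower bound assumes the intent is the 4-12 window named in the comment.

```python
def target_range(n: int) -> tuple[int, int]:
    """Return (target_min, target_max) for n raw propositions, kept within 4-12."""
    target_min = max(4, min(11, n // 4))
    target_max = max(target_min + 1, min(12, n // 2 + 1))
    return target_min, target_max


for n in (8, 30, 100):
    print(n, target_range(n))
```

Without the `min(11, ...)` clamp, a party with 100 propositions would be asked for 25-26 arguments, well above the intended ceiling.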
|
||||
|
||||
|
||||
def _normalize_argument(raw: dict, fallback_topic: str = "") -> dict | None:
|
||||
"""Validate & normalize a single argument dict from Claude.
|
||||
|
||||
Returns None if the row is unusable (missing required fields).
|
||||
"""
|
||||
if not isinstance(raw, dict):
|
||||
return None
|
||||
title = (raw.get("title") or "").strip()
|
||||
body = (raw.get("body") or "").strip()
|
||||
if not title or not body:
|
||||
return None
|
||||
priority = raw.get("priority", "substantive")
|
||||
if priority not in ALLOWED_PRIORITIES:
|
||||
priority = "substantive"
|
||||
topic = (raw.get("topic") or fallback_topic or "").strip() or None
|
||||
claim_ids_raw = raw.get("claim_ids") or []
|
||||
claim_ids: list[UUID] = []
|
||||
if isinstance(claim_ids_raw, list):
|
||||
for cid in claim_ids_raw:
|
||||
try:
|
||||
claim_ids.append(UUID(str(cid)))
|
||||
except (ValueError, TypeError):
|
||||
continue
|
||||
return {
|
||||
"title": title,
|
||||
"body": body,
|
||||
"topic": topic,
|
||||
"priority": priority,
|
||||
"claim_ids": claim_ids,
|
||||
}
|
||||
|
||||
|
||||
async def _aggregate_party(
|
||||
party: str, propositions: list[dict],
|
||||
) -> list[dict]:
|
||||
"""Ask Claude to group one party's propositions; return normalized rows."""
|
||||
if not propositions:
|
||||
return []
|
||||
prompt = _build_prompt(party, propositions)
|
||||
|
||||
try:
|
||||
raw_result = await claude_session.query_json(prompt)
|
||||
except RuntimeError as e:
|
||||
# Surface CLI-unavailable specifically so the caller can report
|
||||
# cleanly instead of crashing the whole job.
|
||||
raise RuntimeError(
|
||||
f"argument_aggregator: claude_session.query_json failed for party "
|
||||
f"'{party}': {e}"
|
||||
) from e
|
||||
|
||||
if not isinstance(raw_result, list):
|
||||
logger.warning(
|
||||
"argument_aggregator: Claude returned non-list (%s) for party '%s'",
|
||||
type(raw_result).__name__, party,
|
||||
)
|
||||
return []
|
||||
|
||||
out: list[dict] = []
|
||||
for entry in raw_result:
|
||||
norm = _normalize_argument(entry)
|
||||
if norm:
|
||||
out.append(norm)
|
||||
return out
|
||||
|
||||
|
||||
async def aggregate_claims_to_arguments(
    case_id: UUID, force: bool = False,
) -> dict:
    """For a given case, group existing claims into distinct legal arguments.

    Args:
        case_id: The case UUID.
        force: If True, delete existing ``legal_arguments`` for the case
            before aggregating. Otherwise short-circuit if any rows exist.

    Returns:
        A summary dict:
        ``{"status": "completed"|"completed_with_errors"|"skipped"|"no_claims"|"llm_unavailable",
        "by_party": {party: count}, "total": int, "message": ...}``
        ``by_party`` and ``propositions_processed`` are present only for the
        two "completed" statuses; ``message`` only for the other three.
        ``errors`` (a list of strings) is added when at least one party
        failed, in which case the status is "completed_with_errors".
    """
|
||||
pool = await db.get_pool()
|
||||
|
||||
async with pool.acquire() as conn:
|
||||
existing = await conn.fetchval(
|
||||
"SELECT COUNT(*) FROM legal_arguments WHERE case_id = $1",
|
||||
case_id,
|
||||
)
|
||||
if existing and not force:
|
||||
return {
|
||||
"status": "skipped",
|
||||
"message": f"Found {existing} existing arguments. Use force=True to re-run.",
|
||||
"total": existing,
|
||||
}
|
||||
|
||||
if force and existing:
|
||||
await conn.execute(
|
||||
"DELETE FROM legal_arguments WHERE case_id = $1", case_id,
|
||||
)
|
||||
|
||||
# Pull all claims for this case, grouped by party.
|
||||
rows = await conn.fetch(
|
||||
"""SELECT id, party_role, claim_text, claim_index, source_document
|
||||
FROM claims
|
||||
WHERE case_id = $1
|
||||
ORDER BY party_role, claim_index""",
|
||||
case_id,
|
||||
)
|
||||
|
||||
if not rows:
|
||||
return {
|
||||
"status": "no_claims",
|
||||
"message": "No claims found for this case. Run extract_claims first.",
|
||||
"total": 0,
|
||||
}
|
||||
|
||||
# Group propositions by party.
|
||||
by_party: dict[str, list[dict]] = {}
|
||||
for r in rows:
|
||||
party = r["party_role"]
|
||||
# Map deprecated 'appraiser' or unknown labels to 'unknown'.
|
||||
if party not in ALLOWED_PARTIES:
|
||||
party = "unknown"
|
||||
by_party.setdefault(party, []).append(dict(r))
|
||||
|
||||
party_counts: dict[str, int] = {}
|
||||
inserted = 0
|
||||
errors: list[str] = []
|
||||
|
||||
for party, props in by_party.items():
|
||||
try:
|
||||
arguments = await _aggregate_party(party, props)
|
||||
except RuntimeError as e:
|
||||
# Most likely cause: Claude CLI not installed (running from
|
||||
# the container). Don't crash — record the gap and continue.
|
||||
msg = str(e)
|
||||
if "Claude CLI not found" in msg:
|
||||
return {
|
||||
"status": "llm_unavailable",
|
||||
"message": (
|
||||
"Claude CLI not available. This service must run from "
|
||||
"the local MCP server (not the FastAPI container)."
|
||||
),
|
||||
"total": 0,
|
||||
}
|
||||
errors.append(f"{party}: {msg}")
|
||||
continue
|
||||
|
||||
if not arguments:
|
||||
party_counts[party] = 0
|
||||
continue
|
||||
|
||||
async with pool.acquire() as conn:
|
||||
async with conn.transaction():
|
||||
for idx, arg in enumerate(arguments):
|
||||
arg_id = await conn.fetchval(
|
||||
"""INSERT INTO legal_arguments
|
||||
(case_id, party, argument_index, argument_title,
|
||||
argument_body, legal_topic, priority)
|
||||
VALUES ($1, $2, $3, $4, $5, $6, $7)
|
||||
RETURNING id""",
|
||||
case_id,
|
||||
party,
|
||||
idx + 1,
|
||||
arg["title"],
|
||||
arg["body"],
|
||||
arg["topic"],
|
||||
arg["priority"],
|
||||
)
|
||||
for cid in arg["claim_ids"]:
|
||||
try:
|
||||
await conn.execute(
|
||||
"""INSERT INTO legal_argument_propositions
|
||||
(argument_id, claim_id)
|
||||
VALUES ($1, $2)
|
||||
ON CONFLICT DO NOTHING""",
|
||||
arg_id, cid,
|
||||
)
|
||||
except Exception as e: # noqa: BLE001
|
||||
# Likely FK violation if the LLM hallucinated
|
||||
# a claim_id. Log and continue.
|
||||
logger.warning(
|
||||
"argument_aggregator: skipped bad claim_id %s for arg %s: %s",
|
||||
cid, arg_id, e,
|
||||
)
|
||||
inserted += 1
|
||||
party_counts[party] = len(arguments)
|
||||
|
||||
result: dict = {
|
||||
"status": "completed",
|
||||
"total": inserted,
|
||||
"by_party": party_counts,
|
||||
"propositions_processed": len(rows),
|
||||
}
|
||||
if errors:
|
||||
result["errors"] = errors
|
||||
result["status"] = "completed_with_errors"
|
||||
return result
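A caller has to branch on the `status` field of the summary dict. The following is a sketch of one way to do that, assuming only the keys visible in the function above; `describe_aggregation` is a hypothetical helper, not part of the module:

```python
def describe_aggregation(result: dict) -> str:
    """Turn the summary dict into a one-line report (hypothetical helper)."""
    status = result.get("status")
    total = result.get("total", 0)
    if status == "completed":
        return f"ok: {total} arguments"
    if status == "completed_with_errors":
        # "errors" holds one string per party whose LLM call failed.
        failed = len(result.get("errors", []))
        return f"partial: {total} arguments, {failed} parties failed"
    if status in ("skipped", "no_claims", "llm_unavailable"):
        return f"{status}: {result.get('message', '')}"
    return f"unexpected status: {status!r}"


report = describe_aggregation(
    {"status": "completed_with_errors", "total": 4, "errors": ["unknown: boom"]}
)
```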
|
||||
|
||||
|
||||
async def get_legal_arguments(
|
||||
case_id: UUID, party: str = "",
|
||||
) -> list[dict]:
|
||||
"""Return aggregated legal arguments for a case, optionally filtered by party.
|
||||
|
||||
Each row includes ``supporting_claims`` (list of source claim_ids).
|
||||
"""
|
||||
pool = await db.get_pool()
|
||||
async with pool.acquire() as conn:
|
||||
if party and party in ALLOWED_PARTIES:
|
||||
rows = await conn.fetch(
|
||||
"""SELECT id, case_id, party, argument_index, argument_title,
|
||||
argument_body, legal_topic, priority, cited_precedents,
|
||||
created_at, updated_at
|
||||
FROM legal_arguments
|
||||
WHERE case_id = $1 AND party = $2
|
||||
ORDER BY priority, argument_index""",
|
||||
case_id, party,
|
||||
)
|
||||
else:
|
||||
rows = await conn.fetch(
|
||||
"""SELECT id, case_id, party, argument_index, argument_title,
|
||||
argument_body, legal_topic, priority, cited_precedents,
|
||||
created_at, updated_at
|
||||
FROM legal_arguments
|
||||
WHERE case_id = $1
|
||||
ORDER BY party, priority, argument_index""",
|
||||
case_id,
|
||||
)
|
||||
|
||||
# Pull supporting claim ids for each argument in one round-trip.
|
||||
arg_ids = [r["id"] for r in rows]
|
||||
supporting: dict[UUID, list[str]] = {}
|
||||
if arg_ids:
|
||||
joins = await conn.fetch(
|
||||
"""SELECT argument_id, claim_id
|
||||
FROM legal_argument_propositions
|
||||
WHERE argument_id = ANY($1::uuid[])""",
|
||||
arg_ids,
|
||||
)
|
||||
for j in joins:
|
||||
supporting.setdefault(j["argument_id"], []).append(str(j["claim_id"]))
|
||||
|
||||
out: list[dict] = []
|
||||
for r in rows:
|
||||
d = dict(r)
|
||||
d["id"] = str(d["id"])
|
||||
d["case_id"] = str(d["case_id"])
|
||||
d["supporting_claims"] = supporting.get(r["id"], [])
|
||||
out.append(d)
|
||||
return out
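The single round-trip for supporting claims works because the join rows are grouped in memory afterwards. A small sketch of that grouping step, using plain dicts in place of database records (`group_supporting` is a hypothetical name):

```python
from collections import defaultdict


def group_supporting(join_rows: list[dict]) -> dict:
    """Group claim ids by argument id, preserving row order (hypothetical)."""
    supporting = defaultdict(list)
    for row in join_rows:
        # Claim ids are stringified, as in get_legal_arguments.
        supporting[row["argument_id"]].append(str(row["claim_id"]))
    return dict(supporting)


grouped = group_supporting(
    [
        {"argument_id": "a1", "claim_id": 10},
        {"argument_id": "a2", "claim_id": 11},
        {"argument_id": "a1", "claim_id": 12},
    ]
)
```

Arguments with no join rows are simply absent from the mapping, which is why the function above reads it with `supporting.get(r["id"], [])`.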
|
||||
@@ -1,4 +1,14 @@
-"""Legal document chunker - splits text into sections and chunks for RAG."""
+"""Legal document chunker - splits text into sections and chunks for RAG.
+
+The default :func:`chunk_document` emits a single tier of overlapping
+chunks (legacy single-tier indexing). :func:`chunk_document_hierarchical`
+emits two tiers — small "child" chunks for retrieval matching, plus
+larger "parent" chunks that supply broader context to the LLM (parent-
+doc retrieval, TaskMaster #48). The hierarchical variant lives
+alongside the legacy one so callers can opt in via
+``config.PARENT_DOC_RETRIEVAL_ENABLED`` without breaking existing
+single-tier code paths.
+"""
 
 from __future__ import annotations
 
@@ -87,13 +97,32 @@ def _assign_pages(chunks: list[Chunk], text: str, page_offsets: list[int]) -> No
         pos = idx + max(1, len(c.content) // 2)
 
 
+# A section shorter than this (stripped chars) is not a real section — it's
+# an artifact of a header keyword matched mid-text. Such a fragment is merged
+# into the preceding section rather than emitted as its own chunk. See #55:
+# unanchored keywords like "דיון"/"החלטה"/"מסקנה" appearing inside a sentence
+# used to carve tiny boundary chunks ("דיון). במסגרת ה") that polluted search.
+MIN_SECTION_CHARS = 60
+
+
 def _split_into_sections(text: str) -> list[tuple[str, str]]:
-    """Split text into (section_type, text) pairs based on Hebrew headers."""
+    """Split text into (section_type, text) pairs based on Hebrew headers.
+
+    Header keywords are matched only at the **start of a line** (after
+    optional whitespace / list numbering like ``5.`` or ``ג.``). A real
+    section header in these decisions sits on its own line; anchoring to
+    the line start prevents common words ("דיון", "החלטה", "מסקנה") that
+    appear mid-sentence from being treated as section boundaries — which
+    previously produced tiny fragment chunks (#55).
+    """
     # Find all section headers and their positions
     markers: list[tuple[int, str]] = []
 
     for pattern, section_type in SECTION_PATTERNS:
-        for match in re.finditer(pattern, text):
+        # ^ + MULTILINE: line start only. Optional leading spaces/tabs and an
+        # optional ordinal prefix ("5.", "5)", "ג.") before the keyword.
+        anchored = rf"^[ \t]*(?:\d+[.)]\s*|[א-ת][.)]\s*)?(?:{pattern})"
+        for match in re.finditer(anchored, text, re.MULTILINE):
             markers.append((match.start(), section_type))
 
     if not markers:
@@ -110,11 +139,18 @@ def _split_into_sections(text: str) -> list[tuple[str, str]]:
|
||||
if intro_text:
|
||||
sections.append(("intro", intro_text))
|
||||
|
||||
-    # Each section
+    # Each section. A section whose text is too short to stand alone is
+    # merged into the previous section (keeping the previous type) so a
+    # near-adjacent pair of headers can't produce a fragment chunk.
     for i, (pos, section_type) in enumerate(markers):
         end = markers[i + 1][0] if i + 1 < len(markers) else len(text)
         section_text = text[pos:end].strip()
-        if section_text:
+        if not section_text:
+            continue
+        if len(section_text) < MIN_SECTION_CHARS and sections:
+            prev_type, prev_text = sections[-1]
+            sections[-1] = (prev_type, f"{prev_text}\n{section_text}")
+        else:
             sections.append((section_type, section_text))
 
     return sections
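The two changes to `_split_into_sections` (line-start anchoring and merging of short sections) can be tried in isolation. The sketch below reuses the anchored regex and the `MIN_SECTION_CHARS` threshold from the diff; the helper names `find_headers` and `merge_short` and the sample text are made up for illustration:

```python
import re

MIN_SECTION_CHARS = 60  # same threshold as in the diff


def find_headers(text: str, keyword: str, anchored: bool) -> list[int]:
    """Return match offsets for a header keyword (hypothetical helper)."""
    if anchored:
        # Line start, optional indent, optional ordinal ("5.", "5)", "ג.").
        rx = rf"^[ \t]*(?:\d+[.)]\s*|[א-ת][.)]\s*)?(?:{keyword})"
        return [m.start() for m in re.finditer(rx, text, re.MULTILINE)]
    return [m.start() for m in re.finditer(keyword, text)]


def merge_short(sections: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Fold too-short sections into the previous one, keeping its type."""
    merged: list[tuple[str, str]] = []
    for section_type, body in sections:
        if len(body) < MIN_SECTION_CHARS and merged:
            prev_type, prev_body = merged[-1]
            merged[-1] = (prev_type, f"{prev_body}\n{body}")
        else:
            merged.append((section_type, body))
    return merged


sample = "רקע\nבמסגרת הדיון שנערך בוועדה נשמעו הצדדים.\nדיון\nג. דיון והכרעה\n"
loose = find_headers(sample, "דיון", anchored=False)
strict = find_headers(sample, "דיון", anchored=True)
merged = merge_short(
    [("facts", "x" * 100), ("discussion", "short"), ("ruling", "y" * 80)]
)
```

The unanchored search also hits the word inside the sentence on the second line, while the anchored one keeps only the two real header lines. In `merge_short`, the five-character "discussion" fragment is appended to the preceding "facts" section instead of becoming its own chunk.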
|
||||
@@ -162,3 +198,152 @@ def _split_section(text: str, chunk_size: int, overlap: int) -> list[str]:
|
||||
def _estimate_tokens(text: str) -> int:
    """Rough token estimate for Hebrew text (``len(text) // 2``, i.e. ~2 chars per token)."""
    return max(1, len(text) // 2)
|
||||
|
||||
|
||||
# ── Parent-doc retrieval (TaskMaster #48) ────────────────────────────
|
||||
# Hierarchical chunker. It emits two kinds of chunk:
|
||||
# * each "child" carries the smaller text used for embedding/search
|
||||
# * each "parent" is shared by ~5 consecutive children (1500/300)
|
||||
# The list is FLAT — both parents and children live in the same return
|
||||
# list, distinguished by ``role``. A child's ``parent_local_id`` points
|
||||
# back to its parent's ``local_id``, so the ingest pipeline can resolve
|
||||
# the FK after the parent row is INSERTed and its DB UUID is known.
|
||||
#
|
||||
# Parents are built FIRST (one window of ``parent_size`` tokens per
|
||||
# section, sliding by the parent window — no overlap between parents),
|
||||
# then each parent is sub-divided into overlapping children. This keeps
|
||||
# the parent boundary aligned with semantic sections (so a "discussion"
|
||||
# parent doesn't contain stray "ruling" prose) while still allowing
|
||||
# child overlap for recall.
|
||||
|
||||
|
||||
@dataclass
|
||||
class HierarchicalChunk:
|
||||
"""One chunk in the two-tier hierarchy.
|
||||
|
||||
Both children and parents share this shape; ``role`` distinguishes
|
||||
them. Children get an embedding at ingest time; parents do not —
|
||||
they exist only to carry context back to the LLM at retrieval time.
|
||||
|
||||
``local_id`` is a stable in-batch identifier (sequential int) used
|
||||
only by the ingest pipeline to wire children to their parent's DB
|
||||
UUID after the parent INSERT returns. It is NOT persisted.
|
||||
"""
|
||||
|
||||
content: str
|
||||
role: str # 'child' | 'parent'
|
||||
section_type: str = "other"
|
||||
page_number: int | None = None
|
||||
chunk_index: int = 0
|
||||
local_id: int = -1
|
||||
parent_local_id: int | None = None
|
||||
|
||||
|
||||
def chunk_document_hierarchical(
|
||||
text: str,
|
||||
child_size: int = config.PARENT_DOC_CHILD_SIZE_TOKENS,
|
||||
parent_size: int = config.PARENT_DOC_PARENT_SIZE_TOKENS,
|
||||
overlap: int = config.PARENT_DOC_CHILD_OVERLAP_TOKENS,
|
||||
page_offsets: list[int] | None = None,
|
||||
) -> list[HierarchicalChunk]:
|
||||
"""Split a document into a two-tier (child, parent) hierarchy.
|
||||
|
||||
Returns a flat list where each element is either a parent or a
|
||||
child. Children carry ``parent_local_id`` pointing back to their
|
||||
parent's ``local_id``. Caller (ingest pipeline) must insert parents
|
||||
first, capture their DB UUIDs by ``local_id``, then insert children
|
||||
with the resolved UUID in ``parent_chunk_id``.
|
||||
|
||||
Args:
|
||||
text: full document text.
|
||||
child_size: child chunk size in tokens (≈ 300 by default).
|
||||
parent_size: parent chunk size in tokens (≈ 1500 by default).
|
||||
Parents contain ``parent_size // child_size`` children on
|
||||
average.
|
||||
overlap: child-to-child overlap inside a parent (≈ 50 tokens).
|
||||
Parents themselves do not overlap each other.
|
||||
page_offsets: PDF page offsets for tagging chunks with page #.
|
||||
|
||||
Notes:
|
||||
* Parents respect section boundaries (header detection from
|
||||
:data:`SECTION_PATTERNS`). A "facts" parent will not include
|
||||
"ruling" text.
|
||||
* Empty text returns an empty list.
|
||||
* Both child and parent rows are tagged with the page of their
|
||||
first character.
|
||||
"""
|
||||
if not text.strip():
|
||||
return []
|
||||
if child_size <= 0 or parent_size <= 0:
|
||||
raise ValueError("child_size and parent_size must be positive")
|
||||
if child_size > parent_size:
|
||||
raise ValueError("child_size must be <= parent_size")
|
||||
|
||||
sections = _split_into_sections(text)
|
||||
out: list[HierarchicalChunk] = []
|
||||
parent_idx = 0 # global parent ordinal (chunk_index for parents)
|
||||
child_idx = 0 # global child ordinal (chunk_index for children)
|
||||
local_id = 0 # sequential id within this document
|
||||
|
||||
for section_type, section_text in sections:
|
||||
# Step 1: split section into parent-sized windows (no overlap).
|
||||
parent_texts = _split_section(section_text, parent_size, overlap=0)
|
||||
for parent_text in parent_texts:
|
||||
parent_local = local_id
|
||||
local_id += 1
|
||||
parent_chunk = HierarchicalChunk(
|
||||
content=parent_text,
|
||||
role="parent",
|
||||
section_type=section_type,
|
||||
chunk_index=parent_idx,
|
||||
local_id=parent_local,
|
||||
parent_local_id=None,
|
||||
)
|
||||
out.append(parent_chunk)
|
||||
parent_idx += 1
|
||||
|
||||
# Step 2: sub-divide this parent into overlapping children.
|
||||
child_texts = _split_section(parent_text, child_size, overlap)
|
||||
for ch_text in child_texts:
|
||||
ch = HierarchicalChunk(
|
||||
content=ch_text,
|
||||
role="child",
|
||||
section_type=section_type,
|
||||
chunk_index=child_idx,
|
||||
local_id=local_id,
|
||||
parent_local_id=parent_local,
|
||||
)
|
||||
out.append(ch)
|
||||
local_id += 1
|
||||
child_idx += 1
|
||||
|
||||
if page_offsets:
|
||||
_assign_pages_hierarchical(out, text, page_offsets)
|
||||
return out
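The docstring asks the ingest pipeline to insert parents first and then wire children to the resulting DB UUIDs. A minimal sketch of that two-pass resolution follows, assuming a trimmed-down chunk type and using `uuid4()` as a stand-in for `INSERT ... RETURNING id`; the names `MiniChunk` and `resolve_parent_ids` and the row dict keys are hypothetical:

```python
from dataclasses import dataclass
from typing import Optional
from uuid import UUID, uuid4


@dataclass
class MiniChunk:
    """Reduced stand-in for HierarchicalChunk (illustration only)."""
    content: str
    role: str  # 'child' | 'parent'
    local_id: int
    parent_local_id: Optional[int] = None


def resolve_parent_ids(chunks: list[MiniChunk]) -> tuple[list[dict], list[dict]]:
    """Pass 1 assigns ids to parents, pass 2 resolves each child's parent."""
    db_id_by_local: dict[int, UUID] = {}
    parent_rows: list[dict] = []
    for c in chunks:
        if c.role == "parent":
            db_id = uuid4()  # assumption: the real pipeline gets this from the DB
            db_id_by_local[c.local_id] = db_id
            parent_rows.append({"id": db_id, "content": c.content})
    child_rows: list[dict] = []
    for c in chunks:
        if c.role == "child":
            child_rows.append(
                {
                    "content": c.content,
                    "parent_chunk_id": db_id_by_local[c.parent_local_id],
                }
            )
    return parent_rows, child_rows


flat = [
    MiniChunk("parent text", "parent", local_id=0),
    MiniChunk("child a", "child", local_id=1, parent_local_id=0),
    MiniChunk("child b", "child", local_id=2, parent_local_id=0),
]
parents, children = resolve_parent_ids(flat)
```

Because `local_id` is only an in-batch key, the mapping `db_id_by_local` can be discarded once the children are written.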
|
||||
|
||||
|
||||
def _assign_pages_hierarchical(
|
||||
chunks: list[HierarchicalChunk],
|
||||
text: str,
|
||||
page_offsets: list[int],
|
||||
) -> None:
|
||||
"""Page-tag both children and parents.
|
||||
|
||||
Same forward-scan strategy as :func:`_assign_pages` but works on
|
||||
the hierarchical list. Parents may span pages; we tag them with
|
||||
the page of their first character (matches how the multimodal
|
||||
retriever joins on page numbers).
|
||||
"""
|
||||
from legal_mcp.services.extractor import page_at_offset
|
||||
pos = 0
|
||||
for c in chunks:
|
||||
idx = text.find(c.content, pos)
|
||||
if idx < 0:
|
||||
idx = text.find(c.content)
|
||||
if idx < 0:
|
||||
continue
|
||||
c.page_number = page_at_offset(idx, page_offsets)
|
||||
# Advance past halfway — children share text with their parent
|
||||
# and with each other (overlap), so a small forward step lets
|
||||
# the next find() still pick up the right occurrence.
|
||||
pos = idx + max(1, len(c.content) // 4)
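The forward-scan page tagging can be sketched without the real extractor. The `page_at_offset` below is a hypothetical stand-in: it assumes `page_offsets` holds the start offset of each page and that pages are numbered from 1, which the diff does not confirm.

```python
from bisect import bisect_right
from typing import Optional


def page_at_offset(offset: int, page_offsets: list[int]) -> int:
    """Assumed semantics: 1-based page whose start offset is <= offset."""
    return max(1, bisect_right(page_offsets, offset))


def tag_pages(
    contents: list[str], text: str, page_offsets: list[int]
) -> list[Optional[int]]:
    """Find each chunk scanning forward; fall back to a search from the start."""
    pages: list[Optional[int]] = []
    pos = 0
    for content in contents:
        idx = text.find(content, pos)
        if idx < 0:
            idx = text.find(content)  # fallback: chunk starts before pos
        if idx < 0:
            pages.append(None)  # chunk text not found verbatim
            continue
        pages.append(page_at_offset(idx, page_offsets))
        pos = idx + max(1, len(content) // 4)
    return pages


text = "a" * 100 + "b" * 150 + "c" * 50
pages = tag_pages(["a" * 40, "b" * 40, "zzz", "c" * 40], text, [0, 100, 250])
```

The fallback search matters in the hierarchical case: a parent and its first child start at the same offset, so once `pos` has moved past the parent's start, the child is only found by the search from the beginning.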
|
||||
|
||||
434
mcp-server/src/legal_mcp/services/citation_extractor.py
Normal file
@@ -0,0 +1,434 @@
|
||||
"""Internal citation graph extractor (TaskMaster #34).
|
||||
|
||||
When Daphna (or any other internal_committee chair) cites another committee
|
||||
decision inside the body of a ruling, she uses fairly stable phrases:
|
||||
|
||||
"ונפנה לערר 1110/20 ירושלים שקופה …"
|
||||
"כפי שקבעתי בערר 1041/24 …"
|
||||
"בדומה לעמדתי בהחלטה ערר 8048/24 …"
|
||||
"כפי שנקבע במחוז ת\"א בערר 1234/20 …"
|
||||
"ראה החלטתי בערר 1015-01-24 …"
|
||||
|
||||
This module scans the ``full_text`` of internal-committee ``case_law`` rows,
|
||||
extracts those citations via regex, tries to link each cited case_number to a
|
||||
row already in ``case_law`` (any source_kind), and stores the result in
|
||||
``precedent_internal_citations``. Unresolved citations are kept with
|
||||
``cited_case_law_id = NULL`` so the chair can see what's missing from the
|
||||
corpus (and ``search_internal_decisions`` can surface "cited but absent" gaps).
|
||||
|
||||
The result is a *citation graph* that downstream tools (search, researcher
|
||||
agent) can join on to surface "decisions cited by this one" alongside
|
||||
keyword/semantic hits — without re-running an LLM on every query.
|
||||
|
||||
Patterns are *intentionally* permissive: we accept stray Hebrew quote marks
|
||||
(both straight ``"`` and curly ``״``), optional district parens, and several
|
||||
trigger phrases. Repeated matches of the same citation are de-duplicated
downstream by the ``UNIQUE (source_case_law_id, cited_case_number)``
constraint and by case-number normalization (see ``_normalize_case_number``).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import re
|
||||
from typing import Iterator
|
||||
from uuid import UUID
|
||||
|
||||
from legal_mcp.services import db
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
# ── Patterns ─────────────────────────────────────────────────────────
|
||||
#
|
||||
# Two pattern families:
|
||||
# 1. Appeals-committee citations ("ערר" / "בל\"מ") — primary target.
|
||||
# These are the ones we resolve against ``case_law``.
|
||||
# 2. Court rulings ("עע\"מ", "בר\"מ", "עמ\"נ", "ע\"א", "בג\"ץ", "רע\"א").
|
||||
# Stored as unlinked rows by default, so the researcher knows the
|
||||
# decision quotes a higher court.
|
||||
#
|
||||
# Trigger words ("ונפנה", "כפי שקבעתי", "בדומה ל…", "ראה החלטתי",
|
||||
# "כפי שנקבע") are *optional* — many citations appear without one (Daphna
|
||||
# often introduces a quote with just "כפי שצוין בערר…"). We therefore
|
||||
# match the citation core (prefix + number) and capture the surrounding
|
||||
# sentence as context.
|
||||
#
|
||||
# Regex notes:
|
||||
# * Hebrew gershayim/quotation: both straight (") and curly (״) are
|
||||
# accepted via the character class [\"״].
|
||||
# * Case numbers can be NNNN/YY, NNNN-YY, or NNNN-MM-YY (the third form
|
||||
# is the Nevo "filed" format: 1015-01-24 means file #1015 of Jan 2024).
|
||||
# * Optional district paren: ערר (ועדות ערר - תכנון ובנייה ירושלים)
#   1110/20. We allow up to 80 chars of parenthetical content
#   (see ``_DISTRICT_PAREN``).
|
||||
# * \b doesn't behave well with Hebrew, so we anchor by whitespace or
|
||||
# punctuation lookarounds.
|
||||
|
||||
_TRIGGER = (
|
||||
r"(?:ונפנה\s+ל|"
|
||||
r"כפי\s+ש(?:קבעתי|נקבע|פסקתי)\s+ב|"
|
||||
r"בדומה\s+ל(?:עמדתי\s+ב)?|"
|
||||
r"ראה\s+(?:את\s+)?(?:החלטתי\s+ב|פסיקת\s+ה?ועדה\s+ב)?|"
|
||||
r"בעניין\s+|"
|
||||
r"בהחלטת(?:י|ה|נו)?\s+ב?)?"
|
||||
)
|
||||
|
||||
# Optional district / committee parenthetical between the prefix and the
# case number. Matches things like "(ועדות ערר - תכנון ובנייה ירושלים)"
# or "(ירושלים)" or "(מרכז)". Up to 80 chars to be safe. Actual
# parentheses are required (the `\(` and `\)` are NOT optional); otherwise
# the regex greedily absorbs the next sentence's content and skips
# intermediate citations like "ראה גם ערר 1041/24 …\nכפי שקבעתי בערר (…) 1110/20".
|
||||
_DISTRICT_PAREN = r"(?:\s*\([^)\n]{0,80}\)\s*)?"
|
||||
|
||||
# Case-number core: 3-5 digits, optional separator and 2-4 digits (and
|
||||
# optional third group for the NNNN-MM-YY format).
|
||||
_NUM_RX = r"(\d{3,5}(?:[-/]\d{2,4}(?:[-/]\d{2,4})?)?)"
|
||||
|
||||
_PATTERNS = [
|
||||
# 1. Appeals-committee — ערר / בל"מ
|
||||
(
|
||||
"appeals_committee",
|
||||
re.compile(
|
||||
_TRIGGER
|
||||
+ r"(ערר|בל[\"״]מ)"
|
||||
+ _DISTRICT_PAREN
|
||||
+ r"\s*"
|
||||
+ _NUM_RX,
|
||||
re.UNICODE,
|
||||
),
|
||||
),
|
||||
# 2. Higher courts — עע"מ, בר"מ, עמ"נ, ע"א, בג"ץ, רע"א, דנ"א, בש"א
|
||||
(
|
||||
"court_ruling",
|
||||
re.compile(
|
||||
_TRIGGER
|
||||
+ r"(עע[\"״]מ|בר[\"״]מ|עמ[\"״]נ|ע[\"״]א|בג[\"״]ץ|רע[\"״]א|דנ[\"״]א|בש[\"״]א)"
|
||||
+ r"\s*"
|
||||
+ _NUM_RX,
|
||||
re.UNICODE,
|
||||
),
|
||||
),
|
||||
]
|
||||
|
||||
|
||||
# Context window for storing the match (characters before/after).
|
||||
_CTX_BEFORE = 120
|
||||
_CTX_AFTER = 240
|
||||
|
||||
|
||||
def _normalize_case_number(raw: str) -> str:
    """Normalize a case number for matching.

    The same case can appear in the corpus as "1110/20", "1110-20" or
    "ערר 1110/20". The three-part Nevo file format ("1110-01-20") keeps
    its middle group here and is handled separately at lookup time.
    We canonicalize by:
      * stripping every character that is not a digit or a separator
      * unifying "/" → "-"
      * trimming leading/trailing separators
    The result is used only for matching, never for display.
    """
    cleaned = re.sub(r"[^\d/\-]", "", raw or "")
    return cleaned.replace("/", "-").strip("-")
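A few inputs make the normalization concrete. The function body below is copied from `_normalize_case_number` so the snippet runs on its own; the sample strings are illustrative:

```python
import re


def normalize_case_number(raw: str) -> str:
    """Standalone copy of the normalization shown above."""
    cleaned = re.sub(r"[^\d/\-]", "", raw or "")
    return cleaned.replace("/", "-").strip("-")


examples = [
    "1110/20",
    "1110-20",
    "ערר 1110/20",
    "ערר (ועדות ערר - תכנון ובנייה ירושלים) 1110/20",
    "1015-01-24",
    "",
]
normalized = [normalize_case_number(e) for e in examples]
```

The fourth example shows why the final `strip("-")` is needed: the hyphen inside the parenthetical survives the character filter and would otherwise leave a leading dash.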
|
||||
|
||||
|
||||
def extract_citations_from_text(text: str) -> Iterator[dict]:
    """Yield citation dicts extracted from ``text``.

    Each dict has:
        prefix:           matched prefix (ערר / בל\"מ / עע\"מ / …)
        case_number:      raw number as captured
        case_number_norm: normalized (slashes → dashes, digits and dashes only)
        raw:              the full matched span
        context:          up to ``_CTX_BEFORE`` chars before and ``_CTX_AFTER``
                          chars after the match (whitespace normalized)
        pattern_kind:     'appeals_committee' or 'court_ruling'

    Citations are de-duplicated per ``(pattern_kind, case_number_norm)``;
    only the first occurrence is yielded.
    """
    if not text:
        return
    seen: set[tuple[str, str]] = set()
    for kind, pattern in _PATTERNS:
        for m in pattern.finditer(text):
            # The `_TRIGGER` is wrapped in (?:...) so it does not add a
            # capture group; group(1) is the prefix, group(2) is the number.
            prefix = (m.group(1) or "").strip()
            number = (m.group(2) or "").strip()
            if not prefix or not number:
                continue
            norm = _normalize_case_number(number)
            if not norm:
                continue
            key = (kind, norm)
            if key in seen:
                continue
            seen.add(key)

            start = max(0, m.start() - _CTX_BEFORE)
            end = min(len(text), m.end() + _CTX_AFTER)
            context = text[start:end].replace("\n", " ").strip()
            context = re.sub(r"\s+", " ", context)

            yield {
                "prefix": prefix,
                "case_number": number,
                "case_number_norm": norm,
                "raw": m.group(0).strip(),
                "context": context[:1000],
                "pattern_kind": kind,
            }
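To see what the appeals-committee pattern captures, here is a reduced sketch. It reuses the `_DISTRICT_PAREN` and `_NUM_RX` strings from the module but leaves out the optional trigger phrase, since the citation core matches with or without it; `find_appeal_citations` and the sample sentence are made up for illustration:

```python
import re

# Copied from the module; only the optional trigger prefix is omitted.
DISTRICT_PAREN = r"(?:\s*\([^)\n]{0,80}\)\s*)?"
NUM_RX = r"(\d{3,5}(?:[-/]\d{2,4}(?:[-/]\d{2,4})?)?)"
APPEALS_RX = re.compile(r'(ערר|בל["״]מ)' + DISTRICT_PAREN + r"\s*" + NUM_RX)


def find_appeal_citations(text: str) -> list[tuple[str, str]]:
    """Return (prefix, case_number) pairs in order of appearance."""
    return [(m.group(1), m.group(2)) for m in APPEALS_RX.finditer(text)]


sample = 'כפי שקבעתי בערר 1041/24, וראה גם ערר (ירושלים) 1110/20 וכן בל"מ 5001/23.'
citations = find_appeal_citations(sample)
```

The second citation shows the district parenthetical being skipped between the prefix and the number, and the third shows the straight-quote form of the abbreviation.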
|
||||
|
||||
|
||||
async def _resolve_case_law_id(case_number_norm: str) -> UUID | None:
    """Try to resolve a normalized citation to an existing case_law row.

    Strategy: a single substring match. The corpus often stores the full
    Nevo header ("ערר (ועדות ערר - תכנון ובנייה ירושלים) 1110/20 …"), so we
    search by ``case_number ILIKE '%1110/20%' OR case_number ILIKE '%1110-20%'``.
    This also covers rows whose ``case_number`` is exactly the short form.
    When several rows match, internal-committee rows win, then the
    shortest ``case_number``.

    Note: a substring match can over-match, e.g. "110/20" is contained
    in "1110/20".

    Returns None if no row matches.
    """
    if not case_number_norm:
        return None
    pool = await db.get_pool()
    # Build the two raw forms (with slash and with dash) for substring match.
    parts = case_number_norm.split("-")
    if len(parts) >= 2:
        slash_form = "/".join(parts[:2]) if len(parts) == 2 else parts[0] + "/" + parts[-1]
    else:
        slash_form = case_number_norm
    dash_form = case_number_norm

    async with pool.acquire() as conn:
        # Substring match on either form (covers full Nevo headers and short forms).
        row = await conn.fetchrow(
            """
            SELECT id FROM case_law
            WHERE case_number ILIKE $1 OR case_number ILIKE $2
            ORDER BY (source_kind = 'internal_committee') DESC,
                     LENGTH(case_number) ASC
            LIMIT 1
            """,
            f"%{slash_form}%",
            f"%{dash_form}%",
        )
    return UUID(str(row["id"])) if row else None
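The two search forms are derived purely from the normalized number, so that step can be checked without a database. A sketch with the same branching as the function above (`candidate_forms` is a hypothetical name):

```python
def candidate_forms(case_number_norm: str) -> tuple[str, str]:
    """Return (slash_form, dash_form) used for the two ILIKE patterns."""
    parts = case_number_norm.split("-")
    if len(parts) == 2:
        slash_form = "/".join(parts)
    elif len(parts) > 2:
        # Nevo NNNN-MM-YY: drop the month, keep file number and year.
        slash_form = parts[0] + "/" + parts[-1]
    else:
        slash_form = case_number_norm
    return slash_form, case_number_norm


forms = [candidate_forms(n) for n in ("1110-20", "1015-01-24", "1110")]
```

For the three-part Nevo form, the slash variant collapses to file number and year, so a row stored as "1015/24" can still be found.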
|
||||
|
||||
|
||||
async def extract_and_store(case_law_id: UUID) -> dict:
|
||||
"""Extract citations from a single ``case_law`` row's ``full_text``,
|
||||
resolve them against the corpus, and INSERT into
|
||||
``precedent_internal_citations`` (ON CONFLICT DO NOTHING).
|
||||
|
||||
Returns: {extracted: N, linked: M, new: K, skipped: S}
|
||||
extracted — total distinct citations found in the text
|
||||
linked — how many resolved to an existing case_law row
|
||||
new — rows actually inserted (not pre-existing)
|
||||
skipped — citations skipped (self-citation, already stored)
|
||||
"""
|
||||
pool = await db.get_pool()
|
||||
async with pool.acquire() as conn:
|
||||
row = await conn.fetchrow(
|
||||
"SELECT id, case_number, full_text FROM case_law WHERE id = $1",
|
||||
case_law_id,
|
||||
)
|
||||
if not row:
|
||||
return {"extracted": 0, "linked": 0, "new": 0, "skipped": 0, "error": "not_found"}
|
||||
|
||||
text = row["full_text"] or ""
|
||||
own_norm = _normalize_case_number(row["case_number"] or "")
|
||||
|
||||
extracted = 0
|
||||
linked = 0
|
||||
new_count = 0
|
||||
skipped = 0
|
||||
|
||||
for cit in extract_citations_from_text(text):
|
||||
extracted += 1
|
||||
if cit["case_number_norm"] == own_norm:
|
||||
# Self-citation (e.g. document headers repeating the case number).
|
||||
skipped += 1
|
||||
continue
|
||||
|
||||
cited_id = await _resolve_case_law_id(cit["case_number_norm"])
|
||||
if cited_id is not None and cited_id == case_law_id:
|
||||
skipped += 1
|
||||
continue
|
||||
if cited_id is not None:
|
||||
linked += 1
|
||||
|
||||
async with pool.acquire() as conn:
|
||||
result = await conn.execute(
|
||||
"""
|
||||
INSERT INTO precedent_internal_citations (
|
||||
source_case_law_id, cited_case_number, cited_case_law_id,
|
||||
match_context, match_pattern, confidence
|
||||
)
|
||||
VALUES ($1, $2, $3, $4, $5, $6)
|
||||
ON CONFLICT (source_case_law_id, cited_case_number) DO NOTHING
|
||||
""",
|
||||
case_law_id,
|
||||
f"{cit['prefix']} {cit['case_number']}",
|
||||
cited_id,
|
||||
cit["context"],
|
||||
cit["pattern_kind"],
|
||||
0.90 if cited_id is not None else 0.75,
|
||||
)
|
||||
# asyncpg execute returns 'INSERT 0 N' — N is rows inserted.
|
||||
try:
|
||||
n_inserted = int(result.split()[-1])
|
||||
except (ValueError, IndexError):
|
||||
n_inserted = 0
|
||||
if n_inserted == 1:
|
||||
new_count += 1
|
||||
else:
|
||||
skipped += 1
|
||||
|
||||
return {
|
||||
"extracted": extracted,
|
||||
"linked": linked,
|
||||
"new": new_count,
|
||||
"skipped": skipped,
|
||||
}
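The `new` versus `skipped` counters depend on parsing the command status string that asyncpg's `execute` returns (for example `INSERT 0 1`). A standalone sketch of that parsing step, with `rows_inserted` as a hypothetical helper name:

```python
def rows_inserted(status: str) -> int:
    """Parse the trailing row count from a status tag like 'INSERT 0 1'."""
    try:
        return int(status.split()[-1])
    except (ValueError, IndexError):
        # Empty or unexpected status: treat as nothing inserted.
        return 0


counts = [rows_inserted(s) for s in ("INSERT 0 1", "INSERT 0 0", "", "garbage")]
```

With `ON CONFLICT ... DO NOTHING`, a conflicting row produces `INSERT 0 0`, which is how an already-stored citation ends up counted as skipped.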
|
||||
|
||||
|
||||
async def extract_all_internal_committee(
|
||||
chair_name_filter: str = "",
|
||||
limit: int = 0,
|
||||
) -> dict:
|
||||
"""Run extraction over every internal-committee row in ``case_law``.
|
||||
|
||||
Args:
|
||||
chair_name_filter: if non-empty, restrict to rows where chair_name
|
||||
matches (exact match). Useful for running on Daphna only.
|
||||
limit: hard cap on number of rows processed (0 = no cap).
|
||||
|
||||
Returns: summary dict with per-row counts and aggregate totals.
|
||||
"""
|
||||
pool = await db.get_pool()
|
||||
conditions = ["source_kind = 'internal_committee'", "full_text <> ''"]
|
||||
params: list = []
|
||||
if chair_name_filter:
|
||||
conditions.append("chair_name = $1")
|
||||
params.append(chair_name_filter)
|
||||
where = " WHERE " + " AND ".join(conditions)
|
||||
limit_clause = f" LIMIT {int(limit)}" if limit and limit > 0 else ""
|
||||
sql = f"SELECT id, case_number FROM case_law{where} ORDER BY created_at{limit_clause}"
|
||||
|
||||
async with pool.acquire() as conn:
|
||||
rows = await conn.fetch(sql, *params)
|
||||
|
||||
totals = {
|
||||
"processed": 0,
|
||||
"extracted": 0,
|
||||
"linked": 0,
|
||||
"new": 0,
|
||||
"skipped": 0,
|
||||
"failed": 0,
|
||||
"chair_name_filter": chair_name_filter,
|
||||
"row_count": len(rows),
|
||||
}
|
||||
|
||||
for r in rows:
|
||||
try:
|
||||
stats = await extract_and_store(UUID(str(r["id"])))
|
||||
totals["processed"] += 1
|
||||
totals["extracted"] += stats.get("extracted", 0)
|
||||
totals["linked"] += stats.get("linked", 0)
|
||||
totals["new"] += stats.get("new", 0)
|
||||
totals["skipped"] += stats.get("skipped", 0)
|
||||
except Exception as e:
|
||||
logger.exception("citation extraction failed for %s: %s", r["case_number"], e)
|
||||
totals["failed"] += 1
|
||||
|
||||
return totals
|
||||
|
||||
|
||||
async def list_citations_for_case_law(
|
||||
case_law_id: UUID,
|
||||
linked_only: bool = False,
|
||||
) -> list[dict]:
|
||||
"""Return all citations *from* the given case_law row (outgoing edges)."""
|
||||
pool = await db.get_pool()
|
||||
where = "pic.source_case_law_id = $1"
|
||||
if linked_only:
|
||||
where += " AND pic.cited_case_law_id IS NOT NULL"
|
||||
sql = f"""
|
||||
SELECT pic.id::text AS id,
|
||||
pic.cited_case_number,
|
||||
pic.cited_case_law_id::text AS cited_case_law_id,
|
||||
pic.match_context,
|
||||
pic.match_pattern,
|
||||
pic.confidence::float AS confidence,
|
||||
pic.created_at,
|
||||
cl.case_number AS target_case_number,
|
||||
cl.case_name AS target_case_name,
|
||||
cl.chair_name AS target_chair_name,
|
||||
cl.district AS target_district
|
||||
FROM precedent_internal_citations pic
|
||||
LEFT JOIN case_law cl ON cl.id = pic.cited_case_law_id
|
||||
WHERE {where}
|
||||
ORDER BY pic.created_at
|
||||
"""
|
||||
async with pool.acquire() as conn:
|
||||
rows = await conn.fetch(sql, case_law_id)
|
||||
return [dict(r) for r in rows]
|
||||
|
||||
|
||||
async def list_citations_to_case_law(case_law_id: UUID) -> list[dict]:
|
||||
"""Return all citations *to* the given case_law row (incoming edges).
|
||||
|
||||
Useful for "which Daphna decisions cite this ruling?" queries.
|
||||
"""
|
||||
pool = await db.get_pool()
|
||||
sql = """
|
||||
SELECT pic.id::text AS id,
|
||||
pic.source_case_law_id::text AS source_case_law_id,
|
||||
pic.cited_case_number,
|
||||
pic.match_context,
|
||||
pic.match_pattern,
|
||||
pic.confidence::float AS confidence,
|
||||
pic.created_at,
|
||||
cl.case_number AS source_case_number,
|
||||
cl.case_name AS source_case_name,
|
||||
cl.chair_name AS source_chair_name,
|
||||
cl.district AS source_district
|
||||
FROM precedent_internal_citations pic
|
||||
JOIN case_law cl ON cl.id = pic.source_case_law_id
|
||||
WHERE pic.cited_case_law_id = $1
|
||||
ORDER BY pic.created_at DESC
|
||||
"""
|
||||
async with pool.acquire() as conn:
|
||||
rows = await conn.fetch(sql, case_law_id)
|
||||
return [dict(r) for r in rows]
|
||||
|
||||
|
||||
async def get_cited_case_law_ids(source_case_law_ids: list[UUID]) -> dict[str, list[str]]:
|
||||
"""Bulk-fetch outgoing citation case_law_ids for the given source rows.
|
||||
|
||||
Returns: {source_case_law_id (str): [cited_case_law_id (str), ...]} —
|
||||
only including linked (resolved) citations.
|
||||
|
||||
Used by search.search_internal_decisions(include_cited_by=True) to
|
||||
expand result sets with the precedents the hits themselves cite,
|
||||
without running a separate roundtrip per row.
|
||||
"""
|
||||
if not source_case_law_ids:
|
||||
return {}
|
||||
pool = await db.get_pool()
|
||||
async with pool.acquire() as conn:
|
||||
rows = await conn.fetch(
|
||||
"""
|
||||
SELECT source_case_law_id::text AS source_id,
|
||||
cited_case_law_id::text AS cited_id
|
||||
FROM precedent_internal_citations
|
||||
WHERE source_case_law_id = ANY($1::uuid[])
|
||||
AND cited_case_law_id IS NOT NULL
|
||||
""",
|
||||
list(source_case_law_ids),
|
||||
)
|
||||
out: dict[str, list[str]] = {}
|
||||
for r in rows:
|
||||
out.setdefault(r["source_id"], []).append(r["cited_id"])
|
||||
return out
|
||||
@@ -142,3 +142,175 @@ async def query_json(
|
||||
"""
|
||||
raw = await query(prompt, timeout=timeout, system=system)
|
||||
return parse_llm_json(raw)
|
||||
|
||||
|
||||
# ── Streaming + session continuation ────────────────────────────────
|
||||
|
||||
|
||||
async def query_streaming(
|
||||
prompt: str,
|
||||
*,
|
||||
system: str | None = None,
|
||||
resume_session_id: str | None = None,
|
||||
timeout: int = LONG_TIMEOUT,
|
||||
cwd: str | None = None,
|
||||
):
|
||||
"""Stream Claude's response as an async iterator of events.
|
||||
|
||||
Wraps `claude -p --output-format=stream-json` (newline-delimited JSON
|
||||
objects from the CLI) and translates each line into a small, stable
|
||||
shape that the chat service / SSE proxy can forward without leaking
|
||||
CLI internals to the browser.
|
||||
|
||||
Event shapes yielded:
|
||||
{"type": "session_id", "value": "<uuid>"} # first event, used for resume
|
||||
{"type": "text_delta", "text": "<partial>"} # incremental assistant text
|
||||
{"type": "tool_use", "name": "...", "input": {...}}
|
||||
{"type": "error", "message": "..."}
|
||||
{"type": "done", "text": "<full response>"}
|
||||
|
||||
The CLI emits a richer stream; we project to this minimal set so the
|
||||
front-end can stay stable across CLI upgrades.
|
||||
|
||||
Args:
|
||||
prompt: The user message to send.
|
||||
system: Optional system instructions (used only when starting a
|
||||
fresh conversation — when resume_session_id is set, the
|
||||
session already carries its system prompt).
|
||||
resume_session_id: Continue a prior conversation. When given,
|
||||
we don't re-send the system prompt; the CLI loads the
|
||||
entire conversation history from disk.
|
||||
timeout: Hard ceiling on the subprocess.
|
||||
cwd: Working directory for the subprocess — defaults to the
|
||||
host's HOME so claude.ai credentials resolve correctly.
|
||||
"""
|
||||
if resume_session_id:
|
||||
# When resuming, system is already baked into the on-disk session
|
||||
# — sending it again would be a no-op at best and confuse the
|
||||
# conversation at worst.
|
||||
full_prompt = prompt
|
||||
cmd = [
|
||||
"claude", "-p",
|
||||
"--output-format", "stream-json",
|
||||
"--verbose",
|
||||
"--resume", resume_session_id,
|
||||
]
|
||||
else:
|
||||
full_prompt = f"{system}\n\n{prompt}" if system else prompt
|
||||
cmd = [
|
||||
"claude", "-p",
|
||||
"--output-format", "stream-json",
|
||||
"--verbose",
|
||||
]
|
||||
|
||||
if len(full_prompt) > 200_000:
|
||||
logger.warning(
|
||||
"Streaming: large prompt (%d chars) — may hit CLI input limits",
|
||||
len(full_prompt),
|
||||
)
|
||||
|
||||
try:
|
||||
proc = await asyncio.create_subprocess_exec(
|
||||
*cmd,
|
||||
stdin=asyncio.subprocess.PIPE,
|
||||
stdout=asyncio.subprocess.PIPE,
|
||||
stderr=asyncio.subprocess.PIPE,
|
||||
cwd=cwd,
|
||||
)
|
||||
except FileNotFoundError:
|
||||
yield {
|
||||
"type": "error",
|
||||
"message": (
|
||||
"Claude CLI not found on host — legal-chat-service must "
|
||||
"run where the `claude` binary is installed (Daphna's host, "
|
||||
"not the legal-ai container)."
|
||||
),
|
||||
}
|
||||
return
|
||||
|
||||
assert proc.stdin is not None # for type checkers
|
||||
assert proc.stdout is not None
|
||||
|
||||
# Send the prompt and close stdin so the CLI knows the user message
|
||||
# is complete.
|
||||
try:
|
||||
proc.stdin.write(full_prompt.encode("utf-8"))
|
||||
await proc.stdin.drain()
|
||||
proc.stdin.close()
|
||||
except BrokenPipeError:
|
||||
# CLI exited before reading the prompt — drain stderr and bail.
|
||||
stderr_b = await proc.stderr.read() if proc.stderr else b""
|
||||
yield {
|
||||
"type": "error",
|
||||
"message": f"Claude CLI closed stdin early: {stderr_b.decode('utf-8', errors='replace')[:300]}",
|
||||
}
|
||||
return
|
||||
|
||||
accumulated_text: list[str] = []
|
||||
session_id_emitted = False
|
||||
deadline = asyncio.get_event_loop().time() + timeout
|
||||
try:
|
||||
while True:
|
||||
remaining = deadline - asyncio.get_event_loop().time()
|
||||
if remaining <= 0:
|
||||
yield {"type": "error", "message": f"timed out after {timeout}s"}
|
||||
break
|
||||
try:
|
||||
line_b = await asyncio.wait_for(proc.stdout.readline(), timeout=remaining)
|
||||
except asyncio.TimeoutError:
|
||||
yield {"type": "error", "message": f"stream timed out after {timeout}s"}
|
||||
break
|
||||
if not line_b:
|
||||
break
|
||||
line = line_b.decode("utf-8", errors="replace").strip()
|
||||
if not line:
|
||||
continue
|
||||
try:
|
||||
event = json.loads(line)
|
||||
except json.JSONDecodeError:
|
||||
# Stray non-JSON line from CLI — surface a snippet for debug.
|
||||
logger.debug("non-JSON stream line: %s", line[:120])
|
||||
continue
|
||||
|
||||
# The CLI's stream-json emits several event types. We only
|
||||
# care about the ones the chat service forwards.
|
||||
t = event.get("type")
|
||||
if not session_id_emitted:
|
||||
sid = event.get("session_id")
|
||||
if sid:
|
||||
session_id_emitted = True
|
||||
yield {"type": "session_id", "value": sid}
|
||||
|
||||
if t == "assistant":
|
||||
# event["message"]["content"] is a list of blocks; we extract
|
||||
# text blocks and tool_use blocks.
|
||||
msg = event.get("message") or {}
|
||||
for block in msg.get("content") or []:
|
||||
btype = block.get("type")
|
||||
if btype == "text":
|
||||
text = block.get("text") or ""
|
||||
if text:
|
||||
accumulated_text.append(text)
|
||||
yield {"type": "text_delta", "text": text}
|
||||
elif btype == "tool_use":
|
||||
yield {
|
||||
"type": "tool_use",
|
||||
"name": block.get("name") or "",
|
||||
"input": block.get("input") or {},
|
||||
}
|
||||
elif t == "result":
|
||||
# Final synthesized result line from the CLI — we already
|
||||
# delivered the deltas, so just stop here.
|
||||
break
|
||||
finally:
|
||||
if proc.returncode is None:
|
||||
try:
|
||||
proc.kill()
|
||||
except ProcessLookupError:
|
||||
pass
|
||||
try:
|
||||
await proc.wait()
|
||||
except Exception:
|
||||
pass
|
||||
|
||||
yield {"type": "done", "text": "".join(accumulated_text)}
|
||||
|
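A minimal sketch of how a caller might consume the event shapes listed in the `query_streaming` docstring. `fake_stream` is a hypothetical stand-in for the real generator (which needs the `claude` binary on the host), and `collect` is an illustrative helper that is not part of this diff.

```python
import asyncio
from typing import AsyncIterator


async def fake_stream() -> AsyncIterator[dict]:
    """Hypothetical stand-in that yields the documented event shapes."""
    yield {"type": "session_id", "value": "00000000-0000-0000-0000-000000000000"}
    yield {"type": "text_delta", "text": "Hello, "}
    yield {"type": "tool_use", "name": "search", "input": {"q": "1461/20"}}
    yield {"type": "text_delta", "text": "world."}
    yield {"type": "done", "text": "Hello, world."}


async def collect(events: AsyncIterator[dict]) -> dict:
    """Fold a stream of events into one summary dict."""
    summary = {"session_id": None, "text": "", "tools": [], "errors": []}
    async for ev in events:
        kind = ev.get("type")
        if kind == "session_id":
            summary["session_id"] = ev["value"]   # keep for a later resume
        elif kind == "text_delta":
            summary["text"] += ev["text"]         # incremental assistant text
        elif kind == "tool_use":
            summary["tools"].append(ev["name"])
        elif kind == "error":
            summary["errors"].append(ev["message"])
        elif kind == "done":
            break
    return summary


result = asyncio.run(collect(fake_stream()))
```

A consumer should not rely on `done` always arriving: in the code above, the "CLI not found" and "closed stdin early" paths yield a single `error` event and return without a `done` event.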
||||
File diff suppressed because it is too large
@@ -109,16 +109,30 @@ _HEBREW_ABBREV_FIXES: dict[str, str] = {
|
||||
'מייר': 'מ"ר',
|
||||
'יחייד': 'יח"ד',
|
||||
'בייכ': 'ב"כ',
|
||||
# Patterns where double-yod (יי) substitutes for gershayim (״) in born-digital PDFs
|
||||
'בליימ': 'בל"מ', # בקשה להארכת מועד (request for extension of time); appears in RTL legal docs
|
||||
'תמייא': 'תמ"א', # תכנית מתאר ארצית (national outline plan)
|
||||
}
|
||||
|
||||
_ABBREV_PATTERN = re.compile(
|
||||
'|'.join(re.escape(k) for k in sorted(_HEBREW_ABBREV_FIXES, key=len, reverse=True))
|
||||
)
|
||||
|
||||
# Matches Hebrew law year abbreviations where gershayim was encoded as double-yod.
|
||||
# e.g. תשכייה → תשכ"ה, תשנייב → תשנ"ב
|
||||
_HEBREW_YEAR_RE = re.compile(r'(תש[א-ת]+)יי([א-ת])')
|
||||
|
||||
|
||||
def _fix_hebrew_quotes(text: str) -> str:
|
||||
"""Fix known Hebrew abbreviation quote replacements from Google Vision OCR."""
|
||||
return _ABBREV_PATTERN.sub(lambda m: _HEBREW_ABBREV_FIXES[m.group()], text)
|
||||
"""Fix known Hebrew abbreviation quote replacements.
|
||||
|
||||
Applied to both Google Vision OCR output and direct PyMuPDF extraction —
|
||||
some born-digital PDFs encode gershayim (״) as double-yod (יי), producing
|
||||
the same corruption patterns as OCR.
|
||||
"""
|
||||
text = _ABBREV_PATTERN.sub(lambda m: _HEBREW_ABBREV_FIXES[m.group()], text)
|
||||
text = _HEBREW_YEAR_RE.sub(r'\1"\2', text)
|
||||
return text
|
||||
|
||||
|
||||
# ── Extraction ───────────────────────────────────────────────────
|
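The year-abbreviation fix added to `_fix_hebrew_quotes` can be tried in isolation. The sketch below reuses the same regular expression as `_HEBREW_YEAR_RE`; the names `HEBREW_YEAR_RE` and `fix_year` and the sample sentence are illustrative assumptions, not code from the repository.

```python
import re

# Same pattern as _HEBREW_YEAR_RE in the diff: a prefix starting with תש,
# a double-yod standing in for gershayim, then one closing letter.
HEBREW_YEAR_RE = re.compile(r'(תש[א-ת]+)יי([א-ת])')


def fix_year(text: str) -> str:
    """Replace the double-yod with an ASCII double quote."""
    return HEBREW_YEAR_RE.sub(r'\1"\2', text)


fixed = fix_year("חוק התכנון והבנייה, תשכייה-1965")
short = fix_year("תשנייב")
```

The pattern is not anchored to word boundaries, so any run of letters that contains תש followed later by a double-yod and another letter would be rewritten as well. It may be worth checking the substitution against real documents before relying on it for ordinary words.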
||||
@@ -189,7 +203,7 @@ async def _extract_pdf(path: Path) -> tuple[str, int, list[int]]:
|
||||
text = page.get_text().strip()
|
||||
|
||||
if len(text) > 50 and _text_quality_ok(text):
|
||||
pages_text.append(text)
|
||||
pages_text.append(_fix_hebrew_quotes(text))
|
||||
logger.debug("Page %d: direct extraction (%d chars, quality OK)", page_num + 1, len(text))
|
||||
else:
|
||||
reason = "insufficient text" if len(text) <= 50 else "low quality OCR layer"
|
||||
|
||||
@@ -4,6 +4,8 @@ Layered on top of ``rerank.maybe_rerank``. When ``MULTIMODAL_ENABLED`` is
|
||||
true the result comes from a weighted merge of:
|
||||
|
||||
• text side: cosine on chunks → optional rerank-2 cross-encoder
|
||||
(precedent search additionally fuses ``ts_rank_cd`` lexical results
|
||||
via RRF before this step — see ``BM25_HYBRID_ENABLED``)
|
||||
• image side: cosine on per-page voyage-multimodal-3 embeddings
|
||||
|
||||
rerank-2 is a *text* cross-encoder, so image-side rows are NOT passed
|
||||
@@ -15,6 +17,14 @@ visual-heavy content still appears in results.
|
||||
When ``MULTIMODAL_ENABLED`` is false this module degenerates to plain
|
||||
``rerank.maybe_rerank`` — callers can wrap unconditionally and let env
|
||||
control behaviour.
|
||||
|
||||
BM25/lexical leg (V12 + ``BM25_HYBRID_ENABLED``):
|
||||
``search_precedent_library_hybrid`` runs ``search_precedent_library_lexical``
|
||||
in parallel with the semantic side and fuses the two by rank via RRF.
|
||||
This recovers exact-string recall (case-number citations like "1461/20",
|
||||
rare planning terms) that voyage embeddings blur. The fused list is
|
||||
then handed to rerank-2 (if enabled) and to the image RRF (if
|
||||
multimodal is enabled) exactly as before.
|
||||
"""
|
||||
from __future__ import annotations
|
||||
|
||||
@@ -91,16 +101,28 @@ async def search_precedent_library_hybrid(
|
||||
source_kind: str = "external_upload",
|
||||
district: str = "",
|
||||
chair_name: str = "",
|
||||
max_per_case_law: int = 2,
|
||||
) -> list[dict]:
|
||||
"""Hybrid wrapper for precedent-library search.
|
||||
|
||||
source_kind='external_upload' → court rulings (default)
|
||||
source_kind='internal_committee' → appeals-committee decisions
|
||||
max_per_case_law: MMR-style diversity cap — at most N hits per
|
||||
case_law_id in the final ranked list (default 2). Prevents a
|
||||
single precedent from monopolizing the result list when many of
|
||||
its chunks/halachot are individually relevant.
|
||||
|
||||
When ``config.BM25_HYBRID_ENABLED`` is true (default) ``_base`` fuses
|
||||
semantic cosine + lexical ``ts_rank_cd`` via RRF before handing the
|
||||
candidates to rerank-2 (if enabled) and the image merge (if
|
||||
multimodal is enabled).
|
||||
"""
|
||||
fetch_k = max(limit, config.VOYAGE_RERANK_FETCH_K) if config.MULTIMODAL_ENABLED else limit
|
||||
# Fetch deeper so diversity dedup still leaves enough candidates.
|
||||
fetch_k = max(limit * max(max_per_case_law, 1), config.VOYAGE_RERANK_FETCH_K) \
|
||||
if config.MULTIMODAL_ENABLED else max(limit * max(max_per_case_law, 1), limit)
|
||||
|
||||
async def _base(limit: int) -> list[dict]:
|
||||
return await db.search_precedent_library_semantic(
|
||||
sem_rows = await db.search_precedent_library_semantic(
|
||||
query_embedding=query_text_embedding,
|
||||
practice_area=practice_area,
|
||||
court=court,
|
||||
@@ -114,12 +136,39 @@ async def search_precedent_library_hybrid(
|
||||
district=district,
|
||||
chair_name=chair_name,
|
||||
)
|
||||
if not config.BM25_HYBRID_ENABLED:
|
||||
return sem_rows
|
||||
# Fetch lexical at 2× depth so RRF has reserves at the tail.
|
||||
lex_limit = max(limit * 2, limit)
|
||||
try:
|
||||
lex_rows = await db.search_precedent_library_lexical(
|
||||
query=query,
|
||||
practice_area=practice_area,
|
||||
court=court,
|
||||
precedent_level=precedent_level,
|
||||
appeal_subtype=appeal_subtype,
|
||||
is_binding=is_binding,
|
||||
subject_tag=subject_tag,
|
||||
source_kind=source_kind,
|
||||
district=district,
|
||||
chair_name=chair_name,
|
||||
limit=lex_limit,
|
||||
include_halachot=include_halachot,
|
||||
)
|
||||
except Exception as e:
|
||||
logger.warning(
|
||||
"Hybrid precedent: lexical side failed, semantic only: %s", e,
|
||||
)
|
||||
return sem_rows
|
||||
if not lex_rows:
|
||||
return sem_rows
|
||||
return _merge_sem_lex(sem_rows, lex_rows, limit=limit)
|
||||
|
||||
text_results = await rerank.maybe_rerank(
|
||||
query=query, base_search=_base, limit=fetch_k,
|
||||
)
|
||||
if not config.MULTIMODAL_ENABLED:
|
||||
return text_results[:limit]
|
||||
return _diversify_by_case_law(text_results, limit, max_per_case_law)
|
||||
|
||||
try:
|
||||
query_img_emb = await embeddings.embed_query_for_multimodal(query)
|
||||
@@ -134,13 +183,128 @@ async def search_precedent_library_hybrid(
|
||||
)
|
||||
except Exception as e:
|
||||
logger.warning("Hybrid: image side failed, returning text only: %s", e)
|
||||
return text_results[:limit]
|
||||
return _diversify_by_case_law(text_results, limit, max_per_case_law)
|
||||
|
||||
merged = _merge(
|
||||
text_results, img_rows,
|
||||
id_field="case_law_id",
|
||||
text_weight=config.MULTIMODAL_TEXT_WEIGHT,
|
||||
)
|
||||
return _diversify_by_case_law(merged, limit, max_per_case_law)
|
||||
|
||||
|
||||
def _diversify_by_case_law(
|
||||
rows: list[dict],
|
||||
limit: int,
|
||||
max_per_case_law: int,
|
||||
) -> list[dict]:
|
||||
"""MMR-style diversity cap: at most ``max_per_case_law`` rows per
|
||||
case_law_id in the final list. Preserves input order (which is the
|
||||
relevance ranking) — for each row, include it only if we haven't
|
||||
reached the cap for its case_law_id yet.
|
||||
|
||||
Set max_per_case_law<=0 to disable (returns rows[:limit] unchanged).
|
||||
"""
|
||||
if max_per_case_law <= 0 or not rows:
|
||||
return rows[:limit]
|
||||
counts: dict[str, int] = {}
|
||||
out: list[dict] = []
|
||||
for r in rows:
|
||||
clid = str(r.get("case_law_id") or "")
|
||||
if not clid:
|
||||
out.append(r)
|
||||
if len(out) >= limit:
|
||||
break
|
||||
continue
|
||||
n = counts.get(clid, 0)
|
||||
if n < max_per_case_law:
|
||||
out.append(r)
|
||||
counts[clid] = n + 1
|
||||
if len(out) >= limit:
|
||||
break
|
||||
return out
|
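The effect of the per-precedent cap is easiest to see on a toy ranked list. This is a compact sketch of the same idea as `_diversify_by_case_law`, assuming rows are already sorted by relevance; `cap_per_source` and the sample rows are hypothetical names used only for illustration.

```python
from collections import Counter


def cap_per_source(rows: list[dict], limit: int, max_per: int) -> list[dict]:
    """Keep ranked order, allowing at most max_per rows per case_law_id."""
    if max_per <= 0:
        return rows[:limit]
    seen: Counter = Counter()
    kept: list[dict] = []
    for row in rows:
        key = str(row.get("case_law_id") or "")
        # Rows without an id are never capped, matching the diff's behaviour.
        if key and seen[key] >= max_per:
            continue
        if key:
            seen[key] += 1
        kept.append(row)
        if len(kept) >= limit:
            break
    return kept


ranked = [
    {"case_law_id": "A", "chunk_id": "a1"},
    {"case_law_id": "A", "chunk_id": "a2"},
    {"case_law_id": "A", "chunk_id": "a3"},
    {"case_law_id": "B", "chunk_id": "b1"},
    {"case_law_id": "C", "chunk_id": "c1"},
]
top = [r["chunk_id"] for r in cap_per_source(ranked, limit=4, max_per=2)]
```

With a cap of 2, the third chunk of precedent `A` is skipped and lower-ranked rows from `B` and `C` fill the list. Strictly speaking this is a per-source cap rather than maximal marginal relevance, since it never compares rows to each other.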
||||
|
||||
|
||||
def _row_key(r: dict) -> tuple[str, str]:
|
||||
"""Stable identity for sem/lex RRF.
|
||||
|
||||
Halachot rows have ``halacha_id``; chunk rows have ``chunk_id``.
|
||||
Returns ``(type, id)`` so a halacha and a chunk with the same UUID
|
||||
(extremely unlikely, but distinct namespaces) don't collide.
|
||||
"""
|
||||
typ = str(r.get("type") or "")
|
||||
rid = r.get("halacha_id") if typ == "halacha" else r.get("chunk_id")
|
||||
return (typ, str(rid or ""))
|
||||
|
||||
|
||||
def _merge_sem_lex(
|
||||
sem_rows: list[dict],
|
||||
lex_rows: list[dict],
|
||||
*,
|
||||
limit: int,
|
||||
) -> list[dict]:
|
||||
"""RRF fusion of semantic + lexical precedent results.
|
||||
|
||||
Why RRF (and not weighted score sum): cosine similarities (~0.4-0.7)
|
||||
and ``ts_rank_cd`` values (often 0.001-0.5, query-length-dependent)
|
||||
live on completely different scales — a weighted sum would let one
|
||||
side dominate by accident. RRF combines by *rank*, so a row that
|
||||
tops one list and is mid-pack in the other gets a robust boost.
|
||||
|
||||
Per row::
|
||||
|
||||
rrf_score = 1 / (k + sem_rank) + 1 / (k + lex_rank)
|
||||
|
||||
A row that appears in only one list contributes that list's term
|
||||
only. Output is sorted by combined score, with extra debug fields
|
||||
(``sem_score``, ``sem_rank``, ``lex_score``, ``lex_rank``) attached
|
||||
so callers and tests can inspect why a row ranked where it did.
|
||||
|
||||
The row payload (``content``, ``rule_statement``, ``case_*`` joins,
|
||||
etc.) is taken from the semantic-side row when available — the two
|
||||
sources return identical column shapes, but semantic rows carry the
|
||||
confidence-boosted ``score`` that the rest of the pipeline expects.
|
||||
"""
|
||||
k = config.MULTIMODAL_RRF_K
|
||||
sem_rank_by_key: dict[tuple, int] = {}
|
||||
sem_row_by_key: dict[tuple, dict] = {}
|
||||
for rank, r in enumerate(sem_rows, 1):
|
||||
key = _row_key(r)
|
||||
if not key[1]:
|
||||
continue
|
||||
sem_rank_by_key[key] = rank
|
||||
sem_row_by_key[key] = r
|
||||
|
||||
lex_rank_by_key: dict[tuple, int] = {}
|
||||
lex_row_by_key: dict[tuple, dict] = {}
|
||||
for rank, r in enumerate(lex_rows, 1):
|
||||
key = _row_key(r)
|
||||
if not key[1]:
|
||||
continue
|
||||
lex_rank_by_key[key] = rank
|
||||
lex_row_by_key[key] = r
|
||||
|
||||
all_keys = set(sem_rank_by_key) | set(lex_rank_by_key)
|
||||
merged: list[dict] = []
|
||||
for key in all_keys:
|
||||
sem_rank = sem_rank_by_key.get(key)
|
||||
lex_rank = lex_rank_by_key.get(key)
|
||||
base = sem_row_by_key.get(key) or lex_row_by_key.get(key)
|
||||
if base is None:
|
||||
continue
|
||||
d = dict(base)
|
||||
sem_term = 1.0 / (k + sem_rank) if sem_rank else 0.0
|
||||
lex_term = 1.0 / (k + lex_rank) if lex_rank else 0.0
|
||||
d["sem_score"] = float(sem_row_by_key[key]["score"]) \
|
||||
if key in sem_row_by_key else 0.0
|
||||
d["sem_rank"] = sem_rank or 0
|
||||
d["lex_score"] = float(lex_row_by_key[key]["score"]) \
|
||||
if key in lex_row_by_key else 0.0
|
||||
d["lex_rank"] = lex_rank or 0
|
||||
d["score"] = sem_term + lex_term
|
||||
merged.append(d)
|
||||
|
||||
merged.sort(key=lambda x: -float(x["score"]))
|
||||
return merged[:limit]
|
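A small worked example of the rank fusion described in the `_merge_sem_lex` docstring. The sketch fuses plain id lists rather than full rows, and `k = 60` is an assumed value (a common RRF default); the actual `config.MULTIMODAL_RRF_K` is not shown in this diff. `rrf_fuse` is a hypothetical helper.

```python
def rrf_fuse(sem_ids: list[str], lex_ids: list[str], k: int = 60) -> list[tuple[str, float]]:
    """Fuse two ranked id lists by reciprocal rank."""
    scores: dict[str, float] = {}
    for ranked in (sem_ids, lex_ids):
        for rank, rid in enumerate(ranked, 1):
            # An id that appears in only one list gets only that list's term.
            scores[rid] = scores.get(rid, 0.0) + 1.0 / (k + rank)
    # Highest combined score first; ties broken by id for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


fused = rrf_fuse(sem_ids=["x", "y", "z"], lex_ids=["y", "w"])
order = [rid for rid, _ in fused]
```

Here `y` is second semantically and first lexically, so it scores 1/62 + 1/61 and overtakes `x`, which tops only the semantic list with 1/61. In the diff, keys are collected into a set before sorting, so rows with exactly equal fused scores may come out in a different relative order between runs; the sketch breaks ties by id to keep the output reproducible.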
||||
|
||||
|
||||
|
||||
@@ -24,6 +24,7 @@ from uuid import UUID, uuid4
|
||||
|
||||
from legal_mcp import config
|
||||
from legal_mcp.services import chunker, db, embeddings, extractor
|
||||
from legal_mcp.services.practice_area import derive_proceeding_type
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -86,11 +87,13 @@ async def ingest_internal_decision(
|
||||
text: str | None = None,
|
||||
document_id: UUID | None = None,
|
||||
queue_halachot: bool = True,
|
||||
proceeding_type: str = "",
|
||||
) -> dict:
|
||||
"""Ingest an appeals-committee decision into the internal corpus.
|
||||
|
||||
Either file_path or text must be provided.
|
||||
If district is empty, it is inferred from court.
|
||||
If proceeding_type is empty, it is derived from appeal_subtype/case_name.
|
||||
Returns: {"status": "completed", "case_law_id": "...", "chunks": N}
|
||||
"""
|
||||
if not file_path and not text:
|
||||
@@ -99,6 +102,9 @@ async def ingest_internal_decision(
|
||||
raise ValueError("case_number is required")
|
||||
|
||||
resolved_district = district.strip() or _district_from_court(court)
|
||||
resolved_proc = proceeding_type.strip() or derive_proceeding_type(
|
||||
appeal_subtype=appeal_subtype, subject=case_name,
|
||||
)
|
||||
|
||||
if file_path:
|
||||
src = Path(file_path)
|
||||
@@ -133,29 +139,68 @@ async def ingest_internal_decision(
|
||||
summary=summary.strip(),
|
||||
is_binding=is_binding,
|
||||
document_id=document_id,
|
||||
proceeding_type=resolved_proc,
|
||||
)
|
||||
case_law_id = UUID(str(record["id"]))
|
||||
|
||||
try:
|
||||
chunks = chunker.chunk_document(raw_text, page_offsets=page_offsets)
|
||||
if not chunks:
|
||||
await db.set_case_law_extraction_status(case_law_id, "completed")
|
||||
await db.set_case_law_halacha_status(case_law_id, "completed")
|
||||
return {"status": "completed", "case_law_id": str(case_law_id), "chunks": 0}
|
||||
# Parent-doc retrieval (TaskMaster #48) — same gated branch as
|
||||
# ingest_precedent. Internal committee decisions are typically
|
||||
# longer than external court rulings (full transcript + ruling),
|
||||
# so the parent-doc benefit is even larger here.
|
||||
if config.PARENT_DOC_RETRIEVAL_ENABLED:
|
||||
h_chunks = chunker.chunk_document_hierarchical(
|
||||
raw_text, page_offsets=page_offsets,
|
||||
)
|
||||
if not h_chunks:
|
||||
await db.set_case_law_extraction_status(case_law_id, "completed")
|
||||
await db.set_case_law_halacha_status(case_law_id, "completed")
|
||||
return {"status": "completed", "case_law_id": str(case_law_id), "chunks": 0}
|
||||
children = [c for c in h_chunks if c.role == "child"]
|
||||
parents = [c for c in h_chunks if c.role == "parent"]
|
||||
child_vectors = await embeddings.embed_texts(
|
||||
[c.content for c in children], input_type="document",
|
||||
)
|
||||
chunk_dicts: list[dict] = []
|
||||
for p in parents:
|
||||
chunk_dicts.append({
|
||||
"role": "parent", "local_id": p.local_id, "parent_local_id": None,
|
||||
"chunk_index": p.chunk_index, "content": p.content,
|
||||
"section_type": p.section_type, "page_number": p.page_number,
|
||||
"embedding": None,
|
||||
})
|
||||
for c, v in zip(children, child_vectors):
|
||||
chunk_dicts.append({
|
||||
"role": "child", "local_id": c.local_id,
|
||||
"parent_local_id": c.parent_local_id,
|
||||
"chunk_index": c.chunk_index, "content": c.content,
|
||||
"section_type": c.section_type, "page_number": c.page_number,
|
||||
"embedding": v,
|
||||
})
|
||||
counts = await db.store_precedent_chunks_hierarchical(
|
||||
case_law_id, chunk_dicts,
|
||||
)
|
||||
stored = counts["children"]
|
||||
else:
|
||||
chunks = chunker.chunk_document(raw_text, page_offsets=page_offsets)
|
||||
if not chunks:
|
||||
await db.set_case_law_extraction_status(case_law_id, "completed")
|
||||
await db.set_case_law_halacha_status(case_law_id, "completed")
|
||||
return {"status": "completed", "case_law_id": str(case_law_id), "chunks": 0}
|
||||
|
||||
chunk_texts = [c.content for c in chunks]
|
||||
chunk_vectors = await embeddings.embed_texts(chunk_texts, input_type="document")
|
||||
chunk_dicts = [
|
||||
{
|
||||
"chunk_index": c.chunk_index,
|
||||
"content": c.content,
|
||||
"section_type": c.section_type,
|
||||
"page_number": c.page_number,
|
||||
"embedding": v,
|
||||
}
|
||||
for c, v in zip(chunks, chunk_vectors)
|
||||
]
|
||||
stored = await db.store_precedent_chunks(case_law_id, chunk_dicts)
|
||||
chunk_texts = [c.content for c in chunks]
|
||||
chunk_vectors = await embeddings.embed_texts(chunk_texts, input_type="document")
|
||||
chunk_dicts = [
|
||||
{
|
||||
"chunk_index": c.chunk_index,
|
||||
"content": c.content,
|
||||
"section_type": c.section_type,
|
||||
"page_number": c.page_number,
|
||||
"embedding": v,
|
||||
}
|
||||
for c, v in zip(chunks, chunk_vectors)
|
||||
]
|
||||
stored = await db.store_precedent_chunks(case_law_id, chunk_dicts)
|
||||
|
||||
await db.set_case_law_extraction_status(case_law_id, "completed")
|
||||
await db.set_case_law_halacha_status(case_law_id, "pending")
|
||||
|
||||
@@ -2,14 +2,34 @@
|
||||
|
||||
Two orthogonal axes used to separate legal domains across the system:
|
||||
|
||||
practice_area — top-level domain (multi-tenant axis). Examples:
|
||||
appeals_committee, national_insurance, labor_law.
|
||||
appeal_subtype — refines within a domain. For appeals_committee:
|
||||
building_permit (1xxx), betterment_levy (8xxx),
|
||||
compensation_197 (9xxx), unknown.
|
||||
practice_area — top-level domain. **Two taxonomies coexist** (see below).
|
||||
appeal_subtype — refines within a domain.
|
||||
|
||||
Both columns are denormalized into documents/chunks/decisions/style_corpus
|
||||
so vector searches can filter cheaply.
|
||||
⚠️ TWO TAXONOMIES — DO NOT CONFUSE
|
||||
==================================
|
||||
|
||||
A. **Multi-tenant axis** (legacy, used in routing logic):
|
||||
- ``appeals_committee`` — the legal-ai instance for Daphna's committee
|
||||
- ``national_insurance`` — future / hypothetical other tenants
|
||||
- ``labor_law`` — future
|
||||
When this axis is used, ``appeal_subtype`` carries the actual domain:
|
||||
``building_permit`` (1xxx), ``betterment_levy`` (8xxx),
|
||||
``compensation_197`` (9xxx).
|
||||
|
||||
B. **Domain axis** (DB columns ``case_law.practice_area``,
|
||||
``cases.practice_area`` — what tests, validators, and CHECK constraints
|
||||
actually use):
|
||||
- ``rishuy_uvniya``: רישוי ובנייה (licensing and building, 1xxx)
|
||||
- ``betterment_levy``: היטל השבחה (betterment levy, 8xxx)
|
||||
- ``compensation_197``: פיצויים סעיף 197 (compensation under section 197, 9xxx)
|
||||
|
||||
Use ``to_db_practice_area(multi_tenant_pa, appeal_subtype)`` to convert
|
||||
from axis A to axis B before writing to the DB.
|
||||
|
||||
Background: TaskMaster #30 (sub-bug ב) — many ``case_law`` rows stored
|
||||
``appeals_committee`` (axis A) where they should have stored a domain
|
||||
value (axis B). The migration backfill plus CHECK constraints close the
|
||||
gap, and this module now validates **both** namespaces.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
@@ -18,19 +38,58 @@ import re
|
||||
|
||||
# ── Enums ──────────────────────────────────────────────────────────
|
||||
|
||||
PRACTICE_AREAS: set[str] = {
|
||||
# Multi-tenant axis (legacy)
|
||||
MULTI_TENANT_PRACTICE_AREAS: set[str] = {
|
||||
"appeals_committee",
|
||||
"national_insurance",
|
||||
"labor_law",
|
||||
}
|
||||
|
||||
# Domain axis (matches DB constraints on case_law/cases)
|
||||
DOMAIN_PRACTICE_AREAS: set[str] = {
|
||||
"rishuy_uvniya",
|
||||
"betterment_levy",
|
||||
"compensation_197",
|
||||
}
|
||||
|
||||
# Union — what ``validate()`` accepts for backward-compat.
|
||||
# Empty string is permitted because the DB CHECK constraint allows it as
|
||||
# a "not yet classified" sentinel (e.g. when auto-derivation fails on an
|
||||
# unrecognized case_number format).
|
||||
PRACTICE_AREAS: set[str] = MULTI_TENANT_PRACTICE_AREAS | DOMAIN_PRACTICE_AREAS | {""}
|
||||
|
||||
APPEALS_COMMITTEE_SUBTYPES: set[str] = {
|
||||
"building_permit",
|
||||
"betterment_levy",
|
||||
"compensation_197",
|
||||
# בל"מ: request to extend the deadline for filing an appeal. Separate tracks per domain:
|
||||
"extension_request_building_permit", # 1xxx: section 152, 30 days
|
||||
"extension_request_betterment_levy", # 8xxx: section 14 of the Third Schedule, 45 days
|
||||
"extension_request_compensation", # 9xxx: section 198(d), 30 days
|
||||
"unknown",
|
||||
}
|
||||
|
||||
# בל"מ subtypes: easy to detect by prefix
|
||||
BLAM_SUBTYPES: set[str] = {
|
||||
"extension_request_building_permit",
|
||||
"extension_request_betterment_levy",
|
||||
"extension_request_compensation",
|
||||
}
|
||||
|
||||
# Mapping: domain → בל"מ subtype
|
||||
_DOMAIN_TO_BLAM_SUBTYPE: dict[str, str] = {
|
||||
"rishuy_uvniya": "extension_request_building_permit",
|
||||
"betterment_levy": "extension_request_betterment_levy",
|
||||
"compensation_197": "extension_request_compensation",
|
||||
}
|
||||
|
||||
# Mapping: first digit → בל"מ subtype (same structure as _APPEALS_COMMITTEE_DIGIT_TO_SUBTYPE)
|
||||
_APPEALS_COMMITTEE_DIGIT_TO_BLAM = {
|
||||
"1": "extension_request_building_permit",
|
||||
"8": "extension_request_betterment_levy",
|
||||
"9": "extension_request_compensation",
|
||||
}
|
||||
|
||||
DEFAULT_PRACTICE_AREA = "appeals_committee"
|
||||
|
||||
# Subtypes per practice_area (extend when adding domains)
|
||||
@@ -38,8 +97,74 @@ SUBTYPES_BY_AREA: dict[str, set[str]] = {
|
||||
"appeals_committee": APPEALS_COMMITTEE_SUBTYPES,
|
||||
"national_insurance": {"unknown"},
|
||||
"labor_law": {"unknown"},
|
||||
# Domain values — subtype is implicit in the value itself
|
||||
"rishuy_uvniya": {"building_permit", "extension_request_building_permit", "unknown"},
|
||||
"betterment_levy": {"betterment_levy", "extension_request_betterment_levy", "unknown"},
|
||||
"compensation_197": {"compensation_197", "extension_request_compensation", "unknown"},
|
||||
# Empty (unclassified) — allow any of the appeals_committee subtypes
|
||||
"": APPEALS_COMMITTEE_SUBTYPES,
|
||||
}
|
||||
|
||||
# Mapping: appeal_subtype → domain_pa (applies when the multi-tenant practice_area is appeals_committee)
|
||||
_SUBTYPE_TO_DOMAIN: dict[str, str] = {
|
||||
"building_permit": "rishuy_uvniya",
|
||||
"betterment_levy": "betterment_levy",
|
||||
"compensation_197": "compensation_197",
|
||||
"extension_request_building_permit": "rishuy_uvniya",
|
||||
"extension_request_betterment_levy": "betterment_levy",
|
||||
"extension_request_compensation": "compensation_197",
|
||||
}
|
||||
|
||||
|
||||
# Regex for detecting "בקשה להארכת מועד" (request for extension of time) in the
|
||||
# appeal subject, covering common variants. Case-insensitive; tolerates smart and plain quotes.
|
||||
_BLAM_SUBJECT_PATTERNS = (
|
||||
re.compile(r"בקשה\s+להארכת\s+מועד", re.IGNORECASE),
|
||||
re.compile(r"בל[\"״״]מ", re.IGNORECASE), # בל"מ with quote variants
|
||||
re.compile(r"הארכת\s+מועד\s+להגשת", re.IGNORECASE),
|
||||
)
|
||||
|
||||
|
||||
def is_blam_subject(subject: str) -> bool:
|
||||
r"""True iff subject indicates a בל"מ (extension-of-time request).
|
||||
|
||||
Detects: "בקשה להארכת מועד" (request for extension of time), "בל\"מ", "הארכת מועד להגשת..." (extension of time to file...)
|
||||
|
||||
Examples:
|
||||
>>> is_blam_subject("בל\"מ אלחנן ברלינגר נ' לינדאב")
|
||||
True
|
||||
>>> is_blam_subject("בקשה להארכת מועד להגשת ערר")
|
||||
True
|
||||
>>> is_blam_subject("היתר בנייה ברחוב X")
|
||||
False
|
||||
"""
|
||||
if not subject:
|
||||
return False
|
||||
return any(p.search(subject) for p in _BLAM_SUBJECT_PATTERNS)
|
||||
|
||||
|
||||
def to_db_practice_area(practice_area: str, appeal_subtype: str = "") -> str:
|
||||
"""Convert a multi-tenant practice_area + appeal_subtype to the
|
||||
domain value stored in DB columns (case_law/cases).
|
||||
|
||||
Returns ``""`` when the input cannot be mapped — callers should
|
||||
handle this rather than letting ``""`` propagate silently to the DB.
|
||||
|
||||
Examples:
|
||||
>>> to_db_practice_area("appeals_committee", "building_permit")
|
||||
'rishuy_uvniya'
|
||||
>>> to_db_practice_area("rishuy_uvniya")
|
||||
'rishuy_uvniya'
|
||||
>>> to_db_practice_area("appeals_committee")
|
||||
''
|
||||
"""
|
||||
pa = (practice_area or "").strip()
|
||||
if pa in DOMAIN_PRACTICE_AREAS:
|
||||
return pa
|
||||
if pa == "appeals_committee":
|
||||
return _SUBTYPE_TO_DOMAIN.get((appeal_subtype or "").strip(), "")
|
||||
return ""
|
||||
|
||||
|
||||
# ── Derivation ─────────────────────────────────────────────────────
|
||||
|
||||
@@ -55,14 +180,28 @@ _CASE_NUM = re.compile(r"(?:ARAR[-\s]*\d{2}[-\s]*(?:\d{2}[-\s]*)?)(\d{4})", re.I
|
||||
_PLAIN_NUM = re.compile(r"(\d{4})")
|
||||
|
||||
|
||||
_DOMAIN_TO_SUBTYPE: dict[str, str] = {
|
||||
"rishuy_uvniya": "building_permit",
|
||||
"betterment_levy": "betterment_levy",
|
||||
"compensation_197": "compensation_197",
|
||||
}
|
||||
|
||||
|
||||
def derive_subtype(case_number: str, practice_area: str = DEFAULT_PRACTICE_AREA) -> str:
|
||||
"""Infer the appeal_subtype from case_number.
|
||||
|
||||
For appeals_committee, the convention is:
|
||||
For appeals_committee (axis A), the convention is:
|
||||
1xxx → building_permit, 8xxx → betterment_levy, 9xxx → compensation_197.
|
||||
|
||||
For domain values (axis B — rishuy_uvniya/betterment_levy/compensation_197),
|
||||
the subtype is implicit in the practice_area itself — we map directly
|
||||
without parsing the case number.
|
||||
|
||||
Handles multiple formats: ARAR-25-8126, 8126/25, 1170, ערר 1024-25.
|
||||
"""
|
||||
# Axis B: practice_area is already a domain value — map directly.
|
||||
if practice_area in DOMAIN_PRACTICE_AREAS:
|
||||
return _DOMAIN_TO_SUBTYPE.get(practice_area, "unknown")
|
||||
if practice_area != "appeals_committee":
|
||||
return "unknown"
|
||||
cn = case_number or ""
|
||||
@@ -77,6 +216,94 @@ def derive_subtype(case_number: str, practice_area: str = DEFAULT_PRACTICE_AREA)
|
||||
return _APPEALS_COMMITTEE_DIGIT_TO_SUBTYPE.get(first_digit, "unknown")
|
||||
|
||||
|
||||
def derive_subtype_with_blam(
|
||||
case_number: str,
|
||||
subject: str = "",
|
||||
practice_area: str = DEFAULT_PRACTICE_AREA,
|
||||
) -> str:
|
||||
r"""Like ``derive_subtype()`` but also detects בל"מ from the subject.
|
||||
|
||||
If ``subject`` indicates a בקשה להארכת מועד, the returned subtype is
|
||||
one of the ``extension_request_*`` values (chosen per case_number /
|
||||
practice_area). Otherwise behaviour matches ``derive_subtype()``.
|
||||
|
||||
Examples:
|
||||
>>> derive_subtype_with_blam("1017-03-26", "בל\"מ ברלינגר נ' לינדאב")
|
||||
'extension_request_building_permit'
|
||||
>>> derive_subtype_with_blam("8500-25", "בקשה להארכת מועד")
|
||||
'extension_request_betterment_levy'
|
||||
>>> derive_subtype_with_blam("1033-25", "ערר על החלטת ועדה")
|
||||
'building_permit'
|
||||
"""
|
||||
base = derive_subtype(case_number, practice_area)
|
||||
if not is_blam_subject(subject):
|
||||
return base
|
||||
# subject says it's בל"מ — return the matching extension_request_* variant.
|
||||
# For domain practice_area (axis B), use the direct mapping.
|
||||
if practice_area in DOMAIN_PRACTICE_AREAS:
|
||||
return _DOMAIN_TO_BLAM_SUBTYPE.get(practice_area, base)
|
||||
# For appeals_committee (axis A), derive from case_number digit.
|
||||
if practice_area == "appeals_committee":
|
||||
cn = case_number or ""
|
||||
m = _CASE_NUM.search(cn) or _PLAIN_NUM.search(cn)
|
||||
if m:
|
||||
first_digit = m.group(1)[0]
|
||||
blam = _APPEALS_COMMITTEE_DIGIT_TO_BLAM.get(first_digit)
|
||||
if blam:
|
||||
return blam
|
||||
return base
|
||||
|
||||
|
||||
def is_blam_subtype(appeal_subtype: str) -> bool:
|
||||
"""True iff appeal_subtype is one of the extension_request_* variants.
|
||||
|
||||
Useful for UI badges and routing logic that need to detect בל"מ cases
|
||||
regardless of which domain they belong to.
|
||||
"""
|
||||
return appeal_subtype in BLAM_SUBTYPES
|
||||
|
||||
|
||||
def derive_proceeding_type(*, appeal_subtype: str = "", subject: str = "") -> str:
|
||||
"""Return 'בל"מ' / 'ערר' for appeals-committee decisions/cases.
|
||||
|
||||
Priority: explicit subtype prefix → subject regex → default 'ערר'.
|
||||
"""
|
||||
if appeal_subtype and appeal_subtype.startswith("extension_request_"):
|
||||
return 'בל"מ'
|
||||
if subject and is_blam_subject(subject):
|
||||
return 'בל"מ'
|
||||
return "ערר"
|
||||
|
||||
|
||||
def derive_domain_practice_area(case_number: str) -> str:
|
||||
"""Map a case_number prefix to a domain practice_area (axis B).
|
||||
|
||||
Returns:
|
||||
``"rishuy_uvniya"`` for 1xxx, ``"betterment_levy"`` for 8xxx,
|
||||
``"compensation_197"`` for 9xxx, or ``""`` when the prefix is
|
||||
unrecognized (caller decides the fallback).
|
||||
|
||||
Examples:
|
||||
>>> derive_domain_practice_area("8126/25")
|
||||
'betterment_levy'
|
||||
>>> derive_domain_practice_area("1170")
|
||||
'rishuy_uvniya'
|
||||
>>> derive_domain_practice_area("ARAR-24-01-9007")
|
||||
'compensation_197'
|
||||
>>> derive_domain_practice_area("foo")
|
||||
''
|
||||
"""
|
||||
cn = case_number or ""
|
||||
m = _CASE_NUM.search(cn) or _PLAIN_NUM.search(cn)
|
||||
if not m:
|
||||
return ""
|
||||
first_digit = m.group(1)[0]
|
||||
subtype = _APPEALS_COMMITTEE_DIGIT_TO_SUBTYPE.get(first_digit)
|
||||
if not subtype:
|
||||
return ""
|
||||
return _SUBTYPE_TO_DOMAIN.get(subtype, "")
|
||||
|
||||
|
||||
# ── Validation ─────────────────────────────────────────────────────
|
||||
|
||||
|
||||
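Review note: the helpers in this hunk all rest on the same "first digit of the four-digit serial" convention. The following is a minimal, self-contained sketch of that convention, assuming the two regexes match the ones visible in the hunk header and that the mapping tables mirror what the docstrings describe. The names `first_digit`, `domain_for`, `DIGIT_TO_SUBTYPE` and `SUBTYPE_TO_DOMAIN` are illustrative, not the module's actual constants.

```python
import re

# Assumed to match the patterns visible in the diff; the real module may differ.
CASE_NUM = re.compile(r"(?:ARAR[-\s]*\d{2}[-\s]*(?:\d{2}[-\s]*)?)(\d{4})", re.I)
PLAIN_NUM = re.compile(r"(\d{4})")

# Hypothetical tables, reconstructed from the docstrings (1xxx / 8xxx / 9xxx).
DIGIT_TO_SUBTYPE = {
    "1": "building_permit",
    "8": "betterment_levy",
    "9": "compensation_197",
}
SUBTYPE_TO_DOMAIN = {
    "building_permit": "rishuy_uvniya",
    "betterment_levy": "betterment_levy",
    "compensation_197": "compensation_197",
}


def first_digit(case_number: str) -> str:
    """Return the leading digit of the 4-digit serial, or '' if none is found."""
    cn = case_number or ""
    m = CASE_NUM.search(cn) or PLAIN_NUM.search(cn)
    return m.group(1)[0] if m else ""


def domain_for(case_number: str) -> str:
    """Map a case number to a domain practice_area, '' when unrecognized."""
    subtype = DIGIT_TO_SUBTYPE.get(first_digit(case_number), "")
    return SUBTYPE_TO_DOMAIN.get(subtype, "")
```

One property worth keeping in mind: the plain fallback takes the first four-digit run it finds, so an input that puts a four-digit year ahead of the serial (for example `2025/1024`) would be classified by the year's first digit and come back unrecognized.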
@@ -99,6 +326,20 @@ def validate(practice_area: str, appeal_subtype: str | None) -> None:

 def is_override(case_number: str, practice_area: str, appeal_subtype: str) -> bool:
     """True iff the user-supplied subtype disagrees with what derive_subtype
-    would have produced (and the derived value is not 'unknown')."""
+    would have produced (and the derived value is not 'unknown').
+
+    Note: בל"מ variants (extension_request_*) are NOT considered overrides
+    of their parent domain — extension_request_building_permit on a 1xxx
+    case is consistent with the case-number convention.
+    """
     derived = derive_subtype(case_number, practice_area)
-    return derived != "unknown" and derived != appeal_subtype
+    if derived == "unknown":
+        return False
+    if derived == appeal_subtype:
+        return False
+    # בל"מ variants of the same domain are not overrides.
+    if appeal_subtype in BLAM_SUBTYPES:
+        # extension_request_building_permit ↔ building_permit (1xxx) — same domain
+        if _SUBTYPE_TO_DOMAIN.get(appeal_subtype) == _SUBTYPE_TO_DOMAIN.get(derived):
+            return False
+    return True
||||
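Review note: the new `is_override` only works if `_SUBTYPE_TO_DOMAIN` also has entries for the `extension_request_*` keys, which this diff does not show. A small sketch of the intended decision table, under that assumption. The table contents are hypothetical, and the function takes the already-derived subtype instead of a case number to stay self-contained.

```python
# Hypothetical tables: the real BLAM_SUBTYPES / _SUBTYPE_TO_DOMAIN are not shown here.
SUBTYPE_TO_DOMAIN = {
    "building_permit": "rishuy_uvniya",
    "extension_request_building_permit": "rishuy_uvniya",
    "betterment_levy": "betterment_levy",
    "extension_request_betterment_levy": "betterment_levy",
}
BLAM_SUBTYPES = {s for s in SUBTYPE_TO_DOMAIN if s.startswith("extension_request_")}


def is_override(derived: str, supplied: str) -> bool:
    """True when the supplied subtype contradicts the derived one."""
    if derived == "unknown" or derived == supplied:
        return False
    if supplied in BLAM_SUBTYPES:
        # Same domain as the derived subtype: consistent, not an override.
        if SUBTYPE_TO_DOMAIN.get(supplied) == SUBTYPE_TO_DOMAIN.get(derived):
            return False
    return True
```

If the extension keys were missing from the table, both `.get()` calls would have to return the same value for the check to pass, so a same-domain extension request would be reported as an override.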
@@ -116,6 +116,18 @@ async def ingest_precedent(
         raise FileNotFoundError(f"file not found: {src}")
     if not citation.strip():
         raise ValueError("citation is required")
+    # Citation guard at service level (catches both MCP and HTTP API paths).
+    # Appeals-committee decisions must go through ingest_internal_decision
+    # which records chair_name+district. The MCP wrapper has the same guard
+    # for an earlier, friendlier error message — but this is the source of
+    # truth. See TaskMaster #30(ב) and DB constraint case_law_external_arar_check.
+    _norm = citation.strip()
+    if _norm.startswith(("ערר ", "ערר(", "בל\"מ ", "בל\"מ(", "ARAR ")):
+        raise ValueError(
+            "ציטוט שמתחיל ב-'ערר' או 'בל\"מ' הוא החלטת ועדת ערר. "
+            "השתמש ב-internal_decision_upload (דורש chair_name + district), "
+            "לא ב-precedent_library_upload."
+        )
     if practice_area not in _VALID_PRACTICE_AREAS:
         raise ValueError(f"invalid practice_area: {practice_area!r}")
     if source_type not in _VALID_SOURCE_TYPES:
||||
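Review note: the guard above is a plain prefix test on the stripped citation. A minimal sketch of the same routing decision as a pure function, useful for unit-testing the prefixes in isolation. The prefixes and the two tool names are taken from this diff; `route_upload` and `COMMITTEE_PREFIXES` are hypothetical names.

```python
# Prefixes copied from the guard in this hunk; anything else is treated as external.
COMMITTEE_PREFIXES = ("ערר ", "ערר(", 'בל"מ ', 'בל"מ(', "ARAR ")


def route_upload(citation: str) -> str:
    """Return the upload tool name a citation should go through."""
    norm = citation.strip()
    if not norm:
        raise ValueError("citation is required")
    if norm.startswith(COMMITTEE_PREFIXES):
        return "internal_decision_upload"
    return "precedent_library_upload"
```

Because `str.startswith` is case-sensitive and literal, a citation written as `ARAR-25-8126` (hyphen instead of space) would not be caught by these prefixes; whether that form can reach this path is not clear from the diff.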
@@ -160,34 +172,100 @@ async def ingest_precedent(
     case_law_id = UUID(str(record["id"]))

     try:
-        await progress("chunking", 40, f"מחלק את הטקסט ל-chunks ({page_count} עמ')")
-        chunks = chunker.chunk_document(text, page_offsets=page_offsets)
-        if not chunks:
-            await db.set_case_law_extraction_status(case_law_id, "completed")
-            await db.set_case_law_halacha_status(case_law_id, "completed")
-            await progress("completed", 100, "אין טקסט לעיבוד")
-            return {
-                "status": "completed",
-                "case_law_id": str(case_law_id),
-                "chunks": 0,
-                "halachot": 0,
-            }
+        # Parent-doc retrieval (TaskMaster #48): when enabled, emit
+        # two tiers (parents + children). Only children are embedded
+        # and indexed; parents carry retrieval context. When disabled,
+        # fall back to legacy single-tier chunking — identical
+        # behaviour to pre-V17.
+        if config.PARENT_DOC_RETRIEVAL_ENABLED:
+            await progress(
+                "chunking", 40,
+                f"מחלק את הטקסט ל-chunks היררכיים ({page_count} עמ')",
+            )
+            h_chunks = chunker.chunk_document_hierarchical(
+                text, page_offsets=page_offsets,
+            )
+            if not h_chunks:
+                await db.set_case_law_extraction_status(case_law_id, "completed")
+                await db.set_case_law_halacha_status(case_law_id, "completed")
+                await progress("completed", 100, "אין טקסט לעיבוד")
+                return {
+                    "status": "completed",
+                    "case_law_id": str(case_law_id),
+                    "chunks": 0,
+                    "halachot": 0,
+                }

-        await progress("embedding", 55, f"מייצר embeddings ל-{len(chunks)} chunks")
-        chunk_texts = [c.content for c in chunks]
-        chunk_vectors = await embeddings.embed_texts(chunk_texts, input_type="document")
+            children = [c for c in h_chunks if c.role == "child"]
+            parents = [c for c in h_chunks if c.role == "parent"]
+            await progress(
+                "embedding", 55,
+                f"מייצר embeddings ל-{len(children)} children "
+                f"({len(parents)} parents)",
+            )
+            child_texts = [c.content for c in children]
+            child_vectors = await embeddings.embed_texts(
+                child_texts, input_type="document",
+            )
+            # Build flat dict list for the two-pass writer.
+            chunk_dicts: list[dict] = []
+            for p in parents:
+                chunk_dicts.append({
+                    "role": "parent",
+                    "local_id": p.local_id,
+                    "parent_local_id": None,
+                    "chunk_index": p.chunk_index,
+                    "content": p.content,
+                    "section_type": p.section_type,
+                    "page_number": p.page_number,
+                    "embedding": None,
+                })
+            for c, v in zip(children, child_vectors):
+                chunk_dicts.append({
+                    "role": "child",
+                    "local_id": c.local_id,
+                    "parent_local_id": c.parent_local_id,
+                    "chunk_index": c.chunk_index,
+                    "content": c.content,
+                    "section_type": c.section_type,
+                    "page_number": c.page_number,
+                    "embedding": v,
+                })
+            counts = await db.store_precedent_chunks_hierarchical(
+                case_law_id, chunk_dicts,
+            )
+            stored_chunks = counts["children"]
+        else:
+            await progress(
+                "chunking", 40, f"מחלק את הטקסט ל-chunks ({page_count} עמ')",
+            )
+            chunks = chunker.chunk_document(text, page_offsets=page_offsets)
+            if not chunks:
+                await db.set_case_law_extraction_status(case_law_id, "completed")
+                await db.set_case_law_halacha_status(case_law_id, "completed")
+                await progress("completed", 100, "אין טקסט לעיבוד")
+                return {
+                    "status": "completed",
+                    "case_law_id": str(case_law_id),
+                    "chunks": 0,
+                    "halachot": 0,
+                }

-        chunk_dicts = [
-            {
-                "chunk_index": c.chunk_index,
-                "content": c.content,
-                "section_type": c.section_type,
-                "page_number": c.page_number,
-                "embedding": v,
-            }
-            for c, v in zip(chunks, chunk_vectors)
-        ]
-        stored_chunks = await db.store_precedent_chunks(case_law_id, chunk_dicts)
+            await progress("embedding", 55, f"מייצר embeddings ל-{len(chunks)} chunks")
+            chunk_texts = [c.content for c in chunks]
+            chunk_vectors = await embeddings.embed_texts(chunk_texts, input_type="document")
+
+            chunk_dicts = [
+                {
+                    "chunk_index": c.chunk_index,
+                    "content": c.content,
+                    "section_type": c.section_type,
+                    "page_number": c.page_number,
+                    "embedding": v,
+                }
+                for c, v in zip(chunks, chunk_vectors)
+            ]
+            stored_chunks = await db.store_precedent_chunks(case_law_id, chunk_dicts)

         # Multimodal page-image embeddings (V9). Gated by feature flag.
         # Non-fatal: text path already succeeded. Only PDFs.
||||
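Review note: the hierarchical branch flattens two tiers into one list, with parents carrying no embedding and children paired with vectors by position. A reduced sketch of that step, assuming a stand-in `HChunk` dataclass (the real chunk type is not shown in this diff) and only a subset of the fields. The explicit length check is an addition of this sketch, since `zip()` would otherwise drop trailing children silently if the embedder returned fewer vectors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HChunk:
    """Stand-in for the chunker's hierarchical chunk; field names follow the diff."""
    role: str                       # "parent" or "child"
    local_id: str
    parent_local_id: Optional[str]
    content: str


def build_chunk_dicts(h_chunks: list[HChunk], child_vectors: list[list[float]]) -> list[dict]:
    """Parents first with no embedding, then children paired with their vectors."""
    parents = [c for c in h_chunks if c.role == "parent"]
    children = [c for c in h_chunks if c.role == "child"]
    if len(children) != len(child_vectors):
        # zip() would silently truncate; fail loudly in this sketch instead.
        raise ValueError("one vector per child chunk is required")
    rows = [
        {"role": "parent", "local_id": p.local_id, "parent_local_id": None,
         "content": p.content, "embedding": None}
        for p in parents
    ]
    rows += [
        {"role": "child", "local_id": c.local_id, "parent_local_id": c.parent_local_id,
         "content": c.content, "embedding": v}
        for c, v in zip(children, child_vectors)
    ]
    return rows
```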
@@ -455,6 +533,7 @@ async def list_precedents(
     precedent_level: str = "",
     source_type: str = "",
     search: str = "",
+    source_kind: str = "external_upload",
     limit: int = 100,
     offset: int = 0,
 ) -> list[dict]:
@@ -464,6 +543,7 @@ async def list_precedents(
         precedent_level=precedent_level,
         source_type=source_type,
         search=search,
+        source_kind=source_kind,
         limit=limit,
         offset=offset,
     )
@@ -3,7 +3,9 @@
 Runs after chunking. Reads the precedent's full_text and asks Claude to
 fill in the metadata fields that an upload form usually leaves empty:
 short case_name, summary, headnote, key_quote, subject_tags,
-appeal_subtype, decision_date, precedent_level, court.
+appeal_subtype, decision_date, precedent_level, court — plus
+chair_name + district for internal_committee rows (which the upload
+path stamps with PLACEHOLDER_PENDING_EXTRACTION when missing).

 Caller policy: only empty user-supplied fields are filled. Anything the
 chair already typed in the upload form is preserved. This is enforced
@@ -22,6 +24,12 @@ from legal_mcp.services import claude_session, db
 logger = logging.getLogger(__name__)


+# Sentinel inserted by the upload endpoint when a committee row is created
+# without chair_name/district (the DB CHECK forces non-empty). Treated as
+# empty by ``apply_to_record`` so LLM-extracted values overwrite it.
+PLACEHOLDER_PENDING_EXTRACTION = "(טרם חולץ)"
+
+
 # The prompt is short — we only need the first 12K chars of the ruling
 # (header + opening of discussion is enough for naming + summary). For
 # subject tags we sample the discussion section too.
@@ -50,8 +58,12 @@ METADATA_EXTRACTION_PROMPT = """אתה מסייע משפטי בכיר. קרא א
|
||||
"decision_date_iso": "YYYY-MM-DD — תאריך מתן ההחלטה כפי שמופיע בטקסט (בכותרת או בחתימה הסופית). אם לא ניתן לזהות במדויק — מחרוזת ריקה.",
|
||||
"precedent_level": "אחד מ-4: 'עליון' / 'מנהלי' / 'ועדת_ערר_ארצית' / 'ועדת_ערר_מחוזית'. בחר לפי הערכאה שמסומנת בכותרת הפסק. אם לא ברור — מחרוזת ריקה.",
|
||||
"source_type": "אחד מ-2: 'court_ruling' (פסק דין של בית משפט — עליון/מנהלי) / 'appeals_committee' (החלטה של ועדת ערר). אם לא ברור — מחרוזת ריקה.",
|
||||
"proceeding_type": "אחד מ-2 (רק להחלטות ועדת ערר): 'ערר' (הליך ערר עיקרי על החלטת ועדה מקומית) / 'בל\\\"מ' (בקשה להארכת מועד להגשת ערר). זהה דרך כותרת המסמך: 'ערר (ועדות ערר ...) NNNN/YY' → 'ערר'; 'בל\\\"מ NNNN/YY' או נושא 'בקשה להארכת מועד להגשת ערר' → 'בל\\\"מ'. בפסיקת בית משפט (לא ועדת ערר) — מחרוזת ריקה.",
|
||||
"court": "שם הערכאה כפי שהוא מופיע בכותרת (למשל 'בית המשפט העליון', 'בית המשפט המחוזי בירושלים בשבתו כבית משפט לעניינים מנהליים', 'ועדת הערר לתכנון ובניה פיצויים והיטלי השבחה — מחוז ירושלים'). מחרוזת ריקה אם לא ניתן לזהות.",
-"case_number_clean": "מספר הערר/תיק כפי שמופיע בכותרת — רק הספרות והאלכסון, למשל '1062/24' או '8031/21'. ללא המילה 'ערר', ללא שם הצדדים, ללא סוגריים. אם יש כמה עררים מאוחדים — הרשום הראשון. מחרוזת ריקה אם לא ניתן לזהות."
+"case_number_clean": "מספר הערר/תיק כפי שמופיע בכותרת — רק הספרות והאלכסון, למשל '1062/24' או '8031/21'. ללא המילה 'ערר', ללא שם הצדדים, ללא סוגריים. אם יש כמה עררים מאוחדים — הרשום הראשון. מחרוזת ריקה אם לא ניתן לזהות.",
"chair_name": "שם יו\\\"ר ההרכב — רלוונטי **רק להחלטות ועדת ערר**, לא לפסקי בית משפט. חפש בכותרת/חתימה: 'עו\\\"ד דפנה תמיר, יו\\\"ר ועדת הערר', 'בפני: עו\\\"ד פלוני אלמוני (יו\\\"ר)'. השאר שם פרטי+משפחה בלי תוארים ('עו\\\"ד', 'אדריכל'). אם זה פסק דין של בית משפט — מחרוזת ריקה.",
|
||||
"district": "מחוז ועדת הערר — רלוונטי **רק להחלטות ועדת ערר**. ערכים מותרים: 'ירושלים', 'תל אביב', 'מרכז', 'חיפה', 'צפון', 'דרום', 'ארצית'. זהה מהכותרת ('ועדת הערר לתכנון ובניה — מחוז ירושלים' → 'ירושלים'; 'ועדות ערר - תכנון ובנייה תל אביב-יפו' → 'תל אביב'). אם זה פסק דין של בית משפט — מחרוזת ריקה.",
|
||||
"citation_formatted": "המראה מקום המלא לפי **כללי הציטוט האחיד**, בפורמט Markdown — שמות הצדדים בלבד מוקפים בכפול-כוכבית (`**…**`), הכל השאר רגיל. ראה כללים מפורטים בסעיף 12 למטה."
|
||||
}
|
||||
|
||||
## כללי איכות
|
||||
@@ -65,6 +77,24 @@ METADATA_EXTRACTION_PROMPT = """אתה מסייע משפטי בכיר. קרא א
|
||||
8. **precedent_level** — קבע לפי הערכאה: בית המשפט העליון = "עליון"; בית משפט מחוזי בשבתו כבית משפט לעניינים מנהליים = "מנהלי"; ועדת ערר ארצית = "ועדת_ערר_ארצית"; ועדת ערר מחוזית (כמו ועדות תכנון ובניה ירושלים/מחוז המרכז וכד') = "ועדת_ערר_מחוזית". השתמש ב-underscore כפי שמופיע — לא ברווח.
|
||||
9. **source_type** — שני ערכים בלבד: "court_ruling" כשהמסמך הוא פסק דין/החלטה של בית משפט (עליון/בג"ץ/מנהלי/מחוזי); "appeals_committee" כשהמסמך הוא החלטה של ועדת ערר (ארצית או מחוזית). זה משלים את `precedent_level` — שני השדות צריכים להיות תואמים.
|
||||
10. **court** — מהכותרת הראשית של הפסק. ניסוח מלא (לא קיצור). מחרוזת ריקה אם לא ניתן לזהות.
|
||||
11. **proceeding_type** — חובה לזהות עבור החלטות ועדת ערר; ריק עבור פסיקת בית משפט. הסימן הברור: בכותרת הראשונה של המסמך כתוב "ערר (ועדות ערר ...) NNNN/YY" → 'ערר'; "בל\"מ NNNN/YY" או הנושא "בקשה להארכת מועד להגשת ערר" → 'בל\"מ'. שני הסוגים יכולים לחלוק אותו מספר תיק — לכן חשוב להבחין מפורשות.
|
||||
12. **chair_name / district** — חובה למלא רק עבור החלטות ועדת ערר (source_type='appeals_committee'). chair_name נמצא בכותרת ("בפני: עו\"ד פלוני אלמוני, יו\"ר") או בחתימה. district = מחוז הוועדה, מתוך רשימה סגורה. עבור פסקי בית משפט — שני השדות ריקים.
|
||||
13. **citation_formatted — כללי הציטוט האחיד הישראלי**. הרכב את המראה מקום במחרוזת אחת בפורמט Markdown, **כשרק שמות הצדדים מודגשים** (מוקפים ב-`**…**`). כל השאר — קיצור הערכאה, סוגריים של הרכב/מחוז, מספר תיק, מאגר/תאריך — **רגיל ללא הדגשה**.
|
||||
|
||||
תבניות לסוגי פסיקה:
|
||||
* **בית משפט עליון — לא פורסם:** `ע"א 1234/56 **פלוני נ' אלמוני** (נבו 1.2.3456)`
|
||||
* **בית משפט עליון — פורסם:** `ע"א 1234/56 **פלוני נ' אלמוני**, פ"ד יב(3) 456 (1990)`
|
||||
* **בית משפט מנהלי:** `עת"מ (י-ם) 1234/56 **פלוני נ' הוועדה** (נבו 1.2.3456)` — "(י-ם)" / "(ת"א)" / וכד' = קיצור המחוז
|
||||
* **ועדת ערר תכנון ובנייה (מחוזית):** `ערר (ועדות ערר - תכנון ובנייה ת"א-יפו) 81002-01-21 **אברהם אגסי נ' הועדה המקומית לתכנון ובנייה תל אביב** (נבו 25.9.2025)`
|
||||
* **בל"מ (בקשה להארכת מועד):** `בל"מ (ועדות ערר - ירושלים) 1028/20 **חלוואני ריאד נ' רשות הרישוי - הוועדה המקומית ירושלים** (נבו 7.1.2021)`
|
||||
* **ועדת ערר ארצית:** `ערר ארצי 8047/23 **פלוני נ' אלמוני** (נבו 1.2.3456)`
|
||||
|
||||
כללים:
|
||||
- **הצדדים מודגשים בלבד** — כל השאר רגיל. אל תדגיש את "ע"א" / "ערר" / מספר התיק / "(נבו ...)" / "פ"ד".
|
||||
- הצדדים = מי שמופיע **בין מספר התיק לבין הסוגריים הסופיים** (תאריך/מאגר), כלומר "[עורר/מבקש] נ' [משיב]".
|
||||
- תאריך בסוגריים סופיים בפורמט עברי "(נבו 25.9.2025)" — יום.חודש.שנה ללא אפסים מובילים.
|
||||
- אם המאגר הוא נבו והפסיקה לא פורסמה ב-פ"ד — השתמש ב-"(נבו DATE)". אם פורסמה ב-פ"ד — הוסף את ההפניה הפורמלית אחרי הצדדים: `..., פ"ד יב(3) 456 (1990)`.
|
||||
- אם לא ניתן לזהות איזשהו רכיב במדויק — השאר את **כל** השדה ריק. אל תניח / תמציא.
|
||||
"""
|
||||
|
||||
|
||||
@@ -160,10 +190,30 @@ async def extract_metadata(case_law_id: UUID | str) -> dict:
         st = result["source_type"].strip()
         if st in {"court_ruling", "appeals_committee"}:
             out["source_type"] = st
+    if isinstance(result.get("proceeding_type"), str):
+        pt = result["proceeding_type"].strip()
+        if pt in {"ערר", 'בל"מ', ""}:
+            out["proceeding_type"] = pt
     if isinstance(result.get("court"), str):
         out["court"] = result["court"].strip()
     if isinstance(result.get("case_number_clean"), str):
         out["case_number_clean"] = result["case_number_clean"].strip()
+    if isinstance(result.get("chair_name"), str):
+        out["chair_name"] = result["chair_name"].strip()
+    if isinstance(result.get("district"), str):
+        d = result["district"].strip()
+        # Closed enum for districts — anything else is dropped to avoid
+        # silently storing free-text in what callers treat as a filter facet.
+        if d in {"ירושלים", "תל אביב", "מרכז", "חיפה", "צפון", "דרום", "ארצית"}:
+            out["district"] = d
+    if isinstance(result.get("citation_formatted"), str):
+        cf = result["citation_formatted"].strip()
+        # Sanity check: a valid citation should contain at least one bold
+        # marker pair (the parties) AND a closing paren (the reporter/date).
+        # If the LLM returned a half-formed string, drop it rather than
+        # store junk that the UI then has to special-case.
+        if cf.count("**") >= 2 and ")" in cf:
+            out["citation_formatted"] = cf
     return out

||||
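Review note: the two new validators in this hunk (closed district enum, citation sanity check) can be exercised without the LLM call. A minimal sketch that mirrors them as a pure function. The enum values and the `**` / `)` checks are copied from the diff; `clean_llm_fields` and `ALLOWED_DISTRICTS` are hypothetical names.

```python
# Values copied from the closed enum in this hunk.
ALLOWED_DISTRICTS = {"ירושלים", "תל אביב", "מרכז", "חיפה", "צפון", "דרום", "ארצית"}


def clean_llm_fields(result: dict) -> dict:
    """Keep only district / citation_formatted values that pass the checks."""
    out = {}
    d = result.get("district")
    if isinstance(d, str) and d.strip() in ALLOWED_DISTRICTS:
        out["district"] = d.strip()
    cf = result.get("citation_formatted")
    if isinstance(cf, str):
        cf = cf.strip()
        # At least one bold pair (the parties) and a closing paren (reporter/date).
        if cf.count("**") >= 2 and ")" in cf:
            out["citation_formatted"] = cf
    return out
```

One thing that may deserve a second look: the agent instructions earlier in this compare view list the national district as `ארצי`, while the enum here accepts only `ארצית`. A value spelled the former way would be dropped by this check.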
@@ -267,11 +317,41 @@ async def apply_to_record(
         if c:
             fields_to_update["court"] = c

+    # proceeding_type — only fill for internal_committee rows (the field is
+    # meaningless for court rulings, which we keep as '').
+    if not (record.get("proceeding_type") or "").strip():
+        pt = (suggested.get("proceeding_type") or "").strip()
+        if pt and (record.get("source_kind") == "internal_committee"):
+            fields_to_update["proceeding_type"] = pt
+
     if overwrite_case_number:
         cn = (suggested.get("case_number_clean") or "").strip()
         if cn:
             fields_to_update["case_number"] = cn

+    # citation_formatted — full citation per Israeli citation rules. Only
+    # fill if empty; user edits in /precedents/[id] are preserved.
+    if not (record.get("citation_formatted") or "").strip():
+        s = (suggested.get("citation_formatted") or "").strip()
+        if s:
+            fields_to_update["citation_formatted"] = s
+
+    # chair_name / district — only for internal_committee rows. The DB CHECK
+    # forces these to be non-empty, so the upload endpoint stamps the row
+    # with "(טרם חולץ)" as a placeholder. Treat that placeholder as empty
+    # so the LLM-extracted value can overwrite it.
+    if record.get("source_kind") == "internal_committee":
+        cur_chair = (record.get("chair_name") or "").strip()
+        if cur_chair in ("", PLACEHOLDER_PENDING_EXTRACTION):
+            s = (suggested.get("chair_name") or "").strip()
+            if s:
+                fields_to_update["chair_name"] = s
+        cur_district = (record.get("district") or "").strip()
+        if cur_district in ("", PLACEHOLDER_PENDING_EXTRACTION):
+            s = (suggested.get("district") or "").strip()
+            if s:
+                fields_to_update["district"] = s
+
     if not fields_to_update:
         return {"updated": False, "fields": []}

||||
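Review note: the "placeholder counts as empty" rule is the part of this hunk most likely to regress, so it may be worth a focused test. A minimal sketch of the rule as a pure function, assuming the sentinel value shown in the diff. `fill_if_pending` is a hypothetical helper, not something this PR adds.

```python
PLACEHOLDER = "(טרם חולץ)"  # same sentinel value as in the diff


def fill_if_pending(record: dict, suggested: dict,
                    fields=("chair_name", "district")) -> dict:
    """Return only the fields whose current value is empty or the sentinel."""
    updates = {}
    for name in fields:
        current = (record.get(name) or "").strip()
        proposal = (suggested.get(name) or "").strip()
        # A user-typed value is preserved; empty or sentinel values are replaced.
        if current in ("", PLACEHOLDER) and proposal:
            updates[name] = proposal
    return updates
```

Usage: a row stamped with the sentinel for `chair_name` but carrying a real `district` gets only `chair_name` filled, which is the behaviour the comment in the hunk describes.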
195
mcp-server/src/legal_mcp/services/style_metadata_extractor.py
Normal file
@@ -0,0 +1,195 @@
+"""Auto-extract per-decision metadata for a style_corpus row.
+
+Populates the fields that the upload flow leaves empty — summary, outcome,
+key_principles, appeal_subtype, practice_area — by asking Claude (via the
+local CLI session) to read the proofread full_text and return a structured
+JSON blob.
+
+Caller policy (``apply_to_corpus``): by default we **only fill empty
+columns**, so chair-edited values are preserved across re-runs. The chair
+can force a refresh by passing ``overwrite=True``.
+
+Why this is a separate module from ``precedent_metadata_extractor``:
+that one fills the *external* case_law corpus (court rulings, third-party
+committee decisions). This one fills the *style* corpus — Daphna's own
+decisions used to teach the writer the in-house voice. The two corpora
+have different schemas, different prompts, and different downstream
+consumers, so coupling them would have been the wrong shortcut.
+"""
+
+from __future__ import annotations
+
+import logging
+from uuid import UUID
+
+from legal_mcp.services import claude_session, db
+
+logger = logging.getLogger(__name__)
+
+
+# A single decision typically runs 200K-650K chars. We sample the head
+# (where outcome + parties + framing live) and the tail (where the
+# operative ruling sits). Picking from both edges keeps the prompt under
+# 60K chars — comfortable for any Claude tier.
+_HEAD_CHARS = 25_000
+_TAIL_CHARS = 15_000
+
+
+def _build_text_window(full_text: str) -> str:
+    if len(full_text) <= _HEAD_CHARS + _TAIL_CHARS:
+        return full_text
+    head = full_text[:_HEAD_CHARS]
+    tail = full_text[-_TAIL_CHARS:]
+    return (
+        f"{head}\n\n"
+        f"[... חתך: {len(full_text) - _HEAD_CHARS - _TAIL_CHARS:,} תווים מהאמצע "
+        f"הושמטו — שמרנו על ההתחלה (טענות + רקע) ועל הסוף (הכרעה + הוצאות) ...]"
+        f"\n\n{tail}"
+    )
+
||||
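Review note: a quick way to sanity-check the head/tail budget is to compute how much text is kept and how much is elided for a given decision length. The sketch below reuses the two constants from the diff; `window_bounds` is a hypothetical helper, and its "kept" figure excludes the short elision marker that `_build_text_window` inserts between the two slices.

```python
HEAD_CHARS = 25_000  # same values as _HEAD_CHARS / _TAIL_CHARS in the diff
TAIL_CHARS = 15_000


def window_bounds(n_chars: int, head: int = HEAD_CHARS,
                  tail: int = TAIL_CHARS) -> tuple[int, int]:
    """Return (kept, omitted) character counts for a text of n_chars."""
    if n_chars <= head + tail:
        # Short texts pass through whole.
        return n_chars, 0
    return head + tail, n_chars - head - tail
```

For the 200K to 650K range quoted in the comment, this works out to 40,000 characters kept and between 160,000 and 610,000 omitted, so most of each decision's middle (typically the discussion) never reaches the model.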
+# Static instructions — go via ``system`` so the SDK path can cache them
+# across batch enrichment runs (24+ decisions in one pass).
METADATA_PROMPT = """אתה מסייע משפטי שמקטלג את הקורפוס הסגנוני של דפנה תמיר (יו"ר ועדת ערר).
|
||||
|
||||
תפקידך: לקרוא החלטה אחת ולחלץ מטא-דאטה ל-style_corpus — שדות שהמשתמש לא הזין בעת ההעלאה.
|
||||
|
||||
**אל תמציא**. אם המידע לא מופיע בטקסט, השאר מחרוזת ריקה או מערך ריק. אסור להסיק עובדות שלא כתובות.
|
||||
|
||||
## פלט נדרש
|
||||
|
||||
החזר JSON אחד (object אחד — לא array, לא markdown, לא הסברים):
|
||||
|
||||
{
|
||||
"summary": "תקציר עניני ב-2-3 משפטים: מי העורר, מה דרש, מה הוכרע. סגנון יבש, ניטרלי, ללא שיפוט. דוגמה: 'ערר על דחיית בקשה להיתר לתוספת מרפסת בקומה ג׳. דפנה קיבלה את הערר חלקית — אישרה את המרפסת בהקטנה ל-12 מ״ר.'",
|
||||
|
||||
"outcome": "התוצאה התמציתית. אחד מאלה (או צירוף קצר): 'קבלה' / 'קבלה חלקית' / 'דחייה' / 'הסתלקות' / 'החזרה לוועדה המקומית'. אם זה לא ברור — מחרוזת ריקה.",
|
||||
|
||||
"key_principles": [
|
||||
"עיקרון משפטי 1 שעולה מההחלטה — משפט אחד, ניסוח מופשט. למשל 'שיקול דעת מוגבל לחריגות בנייה קטנות'.",
|
||||
"עיקרון 2",
|
||||
"..."
|
||||
],
|
||||
|
||||
"appeal_subtype": "תת-סוג ערר. ערכים מותרים: 'building_permit' (היתר בנייה / רישוי), 'betterment_levy' (היטל השבחה), 'compensation_197' (פיצויים ס׳ 197), 'use_change' (שימוש חורג), 'tama_38' (תמ\\"א 38), או מחרוזת ריקה אם לא ברור.",
|
||||
|
||||
"practice_area": "תחום משפט גנרי. ברירת מחדל: 'appeals_committee'. אם זה במובהק 'planning_law' — סמן.",
|
||||
|
||||
"parties_appellant": "שם העורר/ים המרכזיים בהחלטה (אחד או כמה, מופרדים בפסיק). אם זו החלטה מאוחדת — שם הצד המוביל. השאר ריק אם לא ניתן לזהות במדויק.",
|
||||
|
||||
"parties_respondent": "שם המשיב/ים. ברירת מחדל לעררי 1xxx ו-8xxx: 'הוועדה המקומית לתכנון ובניה ירושלים' או דומה. השאר ריק אם לא ברור."
|
||||
}
|
||||
|
||||
## כללי איכות
|
||||
|
||||
1. **summary** — חייב להזכיר את התוצאה. בלי 'בית המשפט קבע ש...' (אנחנו לא בית משפט). בלי הערכת אישית.
|
||||
2. **outcome** — קבלה / קבלה חלקית / דחייה / הסתלקות / החזרה לוועדה המקומית. אם דפנה הכריעה חלקית — 'קבלה חלקית'. אסור 'התקבל' או 'נדחה' בלשון פעולה — רק שם פעולה.
|
||||
3. **key_principles** — 2-5 עקרונות מקסימום. כל אחד משפט אחד. לא ציטוטים מילוליים, אלא תמצות העיקרון.
|
||||
4. **appeal_subtype** — תמיד פעולה אחת. אם החלטה מערבת כמה תת-סוגים — בחר את העיקרי.
|
||||
5. **parties_appellant / parties_respondent** — שם בלבד, בלי 'נ׳' או 'נגד'.
|
||||
|
||||
החזר רק את ה-JSON. אל תכתוב שום דבר לפניו או אחריו.
|
||||
"""
+
+
+async def extract_decision_metadata(corpus_id: UUID | str) -> dict:
+    """Run Claude over the row's full_text and return suggested fields.
+
+    Does NOT touch the DB. The caller decides what to apply.
+    """
+    if isinstance(corpus_id, str):
+        corpus_id = UUID(corpus_id)
+    row = await db.get_style_corpus_row(corpus_id)
+    if not row:
+        return {}
+    full_text = (row.get("full_text") or "").strip()
+    if not full_text:
+        return {}
+
+    context = (
+        f"מספר החלטה: {row.get('decision_number') or '—'}\n"
+        f"תאריך: {row.get('decision_date') or '—'}\n"
+        f"תת-סוג נוכחי: {row.get('appeal_subtype') or '—'}\n"
+        f"נושאים מתויגים: {row.get('subject_categories') or '—'}"
+    )
+    window = _build_text_window(full_text)
+    user_msg = (
+        f"## הקלט\n{context}\n\n"
+        f"--- תחילת ההחלטה ---\n{window}\n--- סוף ההחלטה ---"
+    )
+
+    try:
+        result = await claude_session.query_json(user_msg, system=METADATA_PROMPT)
+    except Exception as e:
+        logger.warning("style_metadata_extractor: query failed: %s", e)
+        return {}
+
+    if not isinstance(result, dict):
+        logger.warning(
+            "style_metadata_extractor: expected JSON object, got %s",
+            type(result).__name__,
+        )
+        return {}
+
+    out: dict = {}
+    if isinstance(result.get("summary"), str):
+        out["summary"] = result["summary"].strip()
+    if isinstance(result.get("outcome"), str):
+        out["outcome"] = result["outcome"].strip()
+    kp = result.get("key_principles") or []
+    if isinstance(kp, list):
+        out["key_principles"] = [str(p).strip() for p in kp if str(p).strip()]
+    if isinstance(result.get("appeal_subtype"), str):
+        st = result["appeal_subtype"].strip()
+        # Open enum — but log values outside the documented list so we can
+        # tighten the prompt later if needed.
+        known = {
+            "building_permit", "betterment_levy", "compensation_197",
+            "use_change", "tama_38", "",
+        }
+        if st not in known:
+            logger.info("style_metadata: unknown appeal_subtype=%r (kept)", st)
+        out["appeal_subtype"] = st
+    if isinstance(result.get("practice_area"), str):
+        out["practice_area"] = result["practice_area"].strip()
+    # Parties: not stored in the schema today, but worth surfacing in the
+    # extractor's return value so callers (and the UI's drawer) can display
+    # them. The list endpoint extracts via regex; LLM output is the
+    # higher-quality fallback when regex fails.
+    if isinstance(result.get("parties_appellant"), str):
+        out["parties_appellant"] = result["parties_appellant"].strip()
+    if isinstance(result.get("parties_respondent"), str):
+        out["parties_respondent"] = result["parties_respondent"].strip()
+    return out
+
+
+async def extract_and_apply(
+    corpus_id: UUID | str, *, overwrite: bool = False,
+) -> dict:
+    """Convenience: extract → apply → return summary of what changed.
+
+    Idempotent under default ``overwrite=False`` — re-runs only fill empty
+    fields. Use ``overwrite=True`` to refresh values the chair (or a prior
+    extraction) already wrote.
+    """
+    if isinstance(corpus_id, str):
+        corpus_id = UUID(corpus_id)
+    suggested = await extract_decision_metadata(corpus_id)
+    if not suggested:
+        return {"extracted": False, "applied": False, "reason": "no suggestion"}
+
+    update_result = await db.update_style_corpus_metadata(
+        corpus_id,
+        summary=suggested.get("summary"),
+        outcome=suggested.get("outcome"),
+        key_principles=suggested.get("key_principles"),
+        appeal_subtype=suggested.get("appeal_subtype"),
+        practice_area=suggested.get("practice_area"),
+        overwrite=overwrite,
+    )
+    return {
+        "extracted": True,
+        "applied": update_result.get("updated", False),
+        "fields_set": update_result.get("fields", []),
+        "suggested": suggested,
+    }
391
mcp-server/src/legal_mcp/services/telemetry.py
Normal file
@@ -0,0 +1,391 @@
+"""RAG retrieval telemetry — closed-loop feedback (TaskMaster #50).
+
+Logs every semantic search call so we can compute nDCG@10 over time,
+spot retrieval drift, and feed the rerank training set.
+
+Design notes
+------------
+- **All writes are fire-and-forget**: callers wrap us in ``try/except``
+  but we also swallow our own DB errors so a telemetry hiccup can never
+  fail a search. The log itself is also written via a detached task —
+  the search returns to the caller immediately and the row lands in
+  the DB on the side.
+
+- **search_decisions / search_case_documents** return document chunks
+  from active cases, not ``case_law`` rows. Their telemetry rows leave
+  ``top_case_law_ids`` empty; nDCG aggregation ignores them.
+
+- **Auto-inferred feedback**: once a final decision is exported, we
+  scan its ``decision_paragraphs.citations`` JSONB, pull the
+  ``case_law_id`` values, and mark them as ``relevance_score=3`` on
+  any search_log for the same case where the precedent appeared in
+  the top-K. This gives us a "cited == relevant" ground truth signal
+  without asking the chair to label results by hand.
+"""
+from __future__ import annotations
+
+import asyncio
+import logging
+from typing import Any, Iterable
+from uuid import UUID
+
+from legal_mcp.services import db
+
+logger = logging.getLogger(__name__)
+
+
+_VALID_SOURCES = {"cited_in_decision", "chair_marked", "auto_inferred"}
+
+
+def _coerce_case_law_ids(results: Iterable[Any], limit: int = 10) -> list[UUID]:
+    """Pull up to ``limit`` ``case_law_id`` UUIDs from search results.
+
+    Tolerates rows missing the field, non-UUID strings, and ``None``
+    values. Preserves order (= ranking).
+    """
+    out: list[UUID] = []
+    seen: set[str] = set()
+    for r in results:
+        if len(out) >= limit:
+            break
+        if not isinstance(r, dict):
+            continue
+        raw = r.get("case_law_id")
+        if raw is None:
+            continue
+        s = str(raw)
+        if s in seen:
+            continue
+        try:
+            out.append(UUID(s))
+            seen.add(s)
+        except (ValueError, AttributeError):
+            continue
+    return out
+
||||
async def _insert_log(
|
||||
*,
|
||||
search_type: str,
|
||||
query: str,
|
||||
practice_area: str | None,
|
||||
case_id: UUID | None,
|
||||
user_agent: str | None,
|
||||
result_count: int,
|
||||
top_case_law_ids: list[UUID],
|
||||
duration_ms: int | None,
|
||||
) -> UUID | None:
|
||||
try:
|
||||
pool = await db.get_pool()
|
||||
async with pool.acquire() as conn:
|
||||
row = await conn.fetchrow(
|
||||
"""
|
||||
INSERT INTO search_logs (
|
||||
search_type, query, practice_area, case_id,
|
||||
user_agent, result_count, top_case_law_ids,
|
||||
duration_ms
|
||||
) VALUES ($1, $2, $3, $4, $5, $6, $7, $8)
|
||||
RETURNING id
|
||||
""",
|
||||
search_type,
|
||||
query[:2000], # guard against pathologically long queries
|
||||
practice_area or None,
|
||||
case_id,
|
||||
user_agent or None,
|
||||
int(result_count),
|
||||
top_case_law_ids or None,
|
||||
duration_ms,
|
||||
)
|
||||
return row["id"] if row else None
|
||||
except Exception:
|
||||
logger.exception("telemetry.log_search: insert failed (swallowed)")
|
||||
return None
|
||||
|
||||
|
||||
async def log_search(
|
||||
*,
|
||||
search_type: str,
|
||||
query: str,
|
||||
results: Iterable[dict],
|
||||
duration_ms: int | None = None,
|
||||
practice_area: str | None = None,
|
||||
case_id: UUID | str | None = None,
|
||||
user_agent: str | None = None,
|
||||
) -> UUID | None:
|
||||
"""Record a search call. Never raises.
|
||||
|
||||
Args:
|
||||
search_type: one of 'precedent_library', 'internal_decisions',
|
||||
'decisions', 'case_documents', 'similar_cases'.
|
||||
query: the raw user query.
|
||||
results: iterable of result dicts. We pull ``case_law_id`` from
|
||||
the first 10 to populate ``top_case_law_ids``.
|
||||
duration_ms: search latency in milliseconds.
|
||||
practice_area: optional filter applied to the search.
|
||||
case_id: optional case context (when the search was scoped to
|
||||
or triggered from a specific case).
|
||||
user_agent: 'writer' / 'researcher' / 'analyst' / 'manual'.
|
||||
|
||||
Returns:
|
||||
The ``search_logs.id`` UUID if the row was written, else None.
|
||||
Most callers ignore this; auto-inference uses it later via
|
||||
``infer_relevance_from_citations``.
|
||||
"""
|
||||
# Snapshot results immediately — callers may keep iterating.
|
||||
    snapshot = list(results)  # always copy, so later caller mutations cannot affect the log
|
||||
top_ids = _coerce_case_law_ids(snapshot, limit=10)
|
||||
|
||||
case_uuid: UUID | None
|
||||
if case_id is None:
|
||||
case_uuid = None
|
||||
elif isinstance(case_id, UUID):
|
||||
case_uuid = case_id
|
||||
else:
|
||||
try:
|
||||
case_uuid = UUID(str(case_id))
|
||||
except (ValueError, AttributeError):
|
||||
case_uuid = None
|
||||
|
||||
return await _insert_log(
|
||||
search_type=search_type,
|
||||
query=query,
|
||||
practice_area=practice_area,
|
||||
case_id=case_uuid,
|
||||
user_agent=user_agent,
|
||||
result_count=len(snapshot),
|
||||
top_case_law_ids=top_ids,
|
||||
duration_ms=duration_ms,
|
||||
)
|
||||
|
||||
|
||||
def log_search_bg(
|
||||
*,
|
||||
search_type: str,
|
||||
query: str,
|
||||
results: Iterable[dict],
|
||||
duration_ms: int | None = None,
|
||||
practice_area: str | None = None,
|
||||
case_id: UUID | str | None = None,
|
||||
user_agent: str | None = None,
|
||||
) -> None:
|
||||
"""Fire-and-forget variant. Schedules the insert as a detached task.
|
||||
|
||||
Use this from hot search paths so the caller returns to the user
|
||||
immediately. Errors are logged inside ``log_search``.
|
||||
"""
|
||||
# Snapshot eagerly so the caller can mutate/iterate results freely.
|
||||
    snapshot = list(results)
|
||||
try:
|
||||
loop = asyncio.get_running_loop()
|
||||
except RuntimeError:
|
||||
# No running loop — caller is sync. Best-effort: skip telemetry.
|
||||
return
|
||||
loop.create_task(
|
||||
log_search(
|
||||
search_type=search_type,
|
||||
query=query,
|
||||
results=snapshot,
|
||||
duration_ms=duration_ms,
|
||||
practice_area=practice_area,
|
||||
case_id=case_id,
|
||||
user_agent=user_agent,
|
||||
)
|
||||
)
|
||||
|
||||
|
||||
# ──────────────────────────────────────────────────────────────────────
|
||||
# Auto-inferred relevance feedback
|
||||
# ──────────────────────────────────────────────────────────────────────
|
||||
|
||||
|
||||
def _extract_citations_from_jsonb(citations: Any) -> list[UUID]:
|
||||
"""Parse ``decision_paragraphs.citations`` JSONB into UUID list.
|
||||
|
||||
Stored shape: ``[{"case_law_id": "...", "text": "...", "type": ...}]``.
|
||||
Tolerates string form (asyncpg returns it as JSON string when the
|
||||
column registration didn't auto-decode).
|
||||
"""
|
||||
import json as _json
|
||||
|
||||
if not citations:
|
||||
return []
|
||||
if isinstance(citations, (bytes, bytearray)):
|
||||
try:
|
||||
citations = _json.loads(citations.decode("utf-8"))
|
||||
except (ValueError, UnicodeDecodeError):
|
||||
return []
|
||||
elif isinstance(citations, str):
|
||||
try:
|
||||
citations = _json.loads(citations)
|
||||
except ValueError:
|
||||
return []
|
||||
|
||||
if not isinstance(citations, list):
|
||||
return []
|
||||
|
||||
out: list[UUID] = []
|
||||
seen: set[str] = set()
|
||||
for item in citations:
|
||||
if not isinstance(item, dict):
|
||||
continue
|
||||
raw = item.get("case_law_id")
|
||||
if not raw:
|
||||
continue
|
||||
s = str(raw)
|
||||
if s in seen:
|
||||
continue
|
||||
try:
|
||||
out.append(UUID(s))
|
||||
seen.add(s)
|
||||
except (ValueError, AttributeError):
|
||||
continue
|
||||
return out
|
||||
|
||||
|
||||
async def _gather_cited_case_law_ids(case_id: UUID) -> list[UUID]:
|
||||
"""Pull every distinct ``case_law_id`` cited anywhere in the case's
|
||||
decision paragraphs.
|
||||
"""
|
||||
pool = await db.get_pool()
|
||||
async with pool.acquire() as conn:
|
||||
rows = await conn.fetch(
|
||||
"""
|
||||
SELECT dp.citations
|
||||
FROM decision_paragraphs dp
|
||||
JOIN decision_blocks db ON db.id = dp.block_id
|
||||
JOIN decisions d ON d.id = db.decision_id
|
||||
WHERE d.case_id = $1
|
||||
AND dp.citations IS NOT NULL
|
||||
AND jsonb_array_length(dp.citations) > 0
|
||||
""",
|
||||
case_id,
|
||||
)
|
||||
seen: set[str] = set()
|
||||
out: list[UUID] = []
|
||||
for r in rows:
|
||||
for clid in _extract_citations_from_jsonb(r["citations"]):
|
||||
s = str(clid)
|
||||
if s not in seen:
|
||||
seen.add(s)
|
||||
out.append(clid)
|
||||
return out
|
||||
|
||||
|
||||
async def infer_relevance_from_citations(
|
||||
case_id: UUID | str,
|
||||
*,
|
||||
relevance_score: int = 3,
|
||||
feedback_source: str = "cited_in_decision",
|
||||
) -> dict:
|
||||
"""For each precedent cited in the case's draft, write a relevance
|
||||
row against every search_log where that precedent appeared in the
|
||||
top-K for the same case.
|
||||
|
||||
Idempotent: the ``UNIQUE(search_log_id, case_law_id, feedback_source)``
|
||||
constraint on ``search_relevance_feedback`` prevents duplicates.
|
||||
|
||||
Returns:
|
||||
``{"cited_precedents": int, "feedback_rows_inserted": int,
|
||||
"searches_matched": int}``.
|
||||
"""
|
||||
if relevance_score not in (0, 1, 2, 3):
|
||||
raise ValueError("relevance_score must be in 0..3")
|
||||
if feedback_source not in _VALID_SOURCES:
|
||||
raise ValueError(f"feedback_source must be one of {_VALID_SOURCES!r}")
|
||||
|
||||
case_uuid = case_id if isinstance(case_id, UUID) else UUID(str(case_id))
|
||||
|
||||
cited = await _gather_cited_case_law_ids(case_uuid)
|
||||
if not cited:
|
||||
return {
|
||||
"cited_precedents": 0,
|
||||
"feedback_rows_inserted": 0,
|
||||
"searches_matched": 0,
|
||||
}
|
||||
|
||||
pool = await db.get_pool()
|
||||
inserted = 0
|
||||
matched_searches: set[str] = set()
|
||||
|
||||
async with pool.acquire() as conn:
|
||||
# For each cited precedent, find all logs where it appeared in
|
||||
# top_case_law_ids for this case, and record its rank.
|
||||
for clid in cited:
|
||||
rows = await conn.fetch(
|
||||
"""
|
||||
SELECT id, top_case_law_ids
|
||||
FROM search_logs
|
||||
WHERE case_id = $1
|
||||
AND top_case_law_ids IS NOT NULL
|
||||
AND $2 = ANY(top_case_law_ids)
|
||||
""",
|
||||
case_uuid,
|
||||
clid,
|
||||
)
|
||||
for row in rows:
|
||||
top_ids = row["top_case_law_ids"] or []
|
||||
# asyncpg returns uuid[] as list[UUID]
|
||||
try:
|
||||
rank = top_ids.index(clid) + 1
|
||||
except ValueError:
|
||||
continue
|
||||
result = await conn.execute(
|
||||
"""
|
||||
INSERT INTO search_relevance_feedback (
|
||||
search_log_id, case_law_id, rank,
|
||||
relevance_score, feedback_source
|
||||
) VALUES ($1, $2, $3, $4, $5)
|
||||
ON CONFLICT (search_log_id, case_law_id, feedback_source)
|
||||
DO NOTHING
|
||||
""",
|
||||
row["id"],
|
||||
clid,
|
||||
rank,
|
||||
relevance_score,
|
||||
feedback_source,
|
||||
)
|
||||
# ``execute`` returns 'INSERT 0 1' or 'INSERT 0 0' for
|
||||
# the no-op path; count only the writes.
|
||||
if result.endswith(" 1"):
|
||||
inserted += 1
|
||||
matched_searches.add(str(row["id"]))
|
||||
|
||||
return {
|
||||
"cited_precedents": len(cited),
|
||||
"feedback_rows_inserted": inserted,
|
||||
"searches_matched": len(matched_searches),
|
||||
}
|
||||
|
||||
|
||||
async def infer_relevance_for_all_finalized_cases(limit: int | None = None) -> dict:
    """Bulk-run auto-inference for every case whose draft is final/exported.

    Useful for back-filling after V18 schema lands and a few decisions
    have already been written. Cases with no cited precedents are still
    counted in ``cases_processed`` but contribute zero to the other totals.
    """
    pool = await db.get_pool()
    sql = """
        SELECT DISTINCT c.id
        FROM cases c
        JOIN decisions d ON d.case_id = c.id
        WHERE c.status IN ('final', 'exported')
    """
    # Bind the LIMIT parameter only when the clause is actually appended, so
    # the placeholder count and the argument count always agree (a negative
    # ``limit`` previously passed an argument with no matching ``$1``).
    args: list[int] = []
    if limit is not None and limit > 0:
        sql += " LIMIT $1"
        args.append(limit)
    async with pool.acquire() as conn:
        rows = await conn.fetch(sql, *args)

    totals = {
        "cases_processed": 0,
        "cited_precedents": 0,
        "feedback_rows_inserted": 0,
        "searches_matched": 0,
    }
    for r in rows:
        stats = await infer_relevance_from_citations(r["id"])
        totals["cases_processed"] += 1
        totals["cited_precedents"] += stats["cited_precedents"]
        totals["feedback_rows_inserted"] += stats["feedback_rows_inserted"]
        totals["searches_matched"] += stats["searches_matched"]
    return totals
|
||||
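
The telemetry module's docstring refers to nDCG aggregation over the feedback rows (a 1-based `rank` plus a `relevance_score` in 0..3). Below is a minimal sketch of how such rows could be turned into nDCG@k, assuming the linear-gain form DCG = Σ rel_i / log2(i + 1). The function name `ndcg_at_k` and its input shape are illustrative assumptions, and the project's actual aggregation may differ, in particular in how it picks the ideal ordering.

```python
import math


def ndcg_at_k(rel_by_rank: dict[int, int], k: int = 10) -> float:
    """nDCG@k from a {1-based rank: relevance 0..3} mapping.

    Ranks without feedback count as relevance 0. The ideal ordering is
    taken over the labelled results only (an assumption of this sketch).
    """
    gains = [rel_by_rank.get(rank, 0) for rank in range(1, k + 1)]
    dcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))
    ideal = sorted(gains, reverse=True)
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg > 0 else 0.0
```

With "cited == relevant" feedback alone, every labelled result has the same score, so the metric mainly rewards searches that ranked the eventually cited precedents near the top.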
@@ -128,8 +128,9 @@ async def case_create(
|
||||
hearing_date: str = "",
|
||||
notes: str = "",
|
||||
expected_outcome: str = "",
|
||||
-    practice_area: str = "appeals_committee",
+    practice_area: str = "",
     appeal_subtype: str = "",
+    proceeding_type: str = "",
|
||||
) -> str:
|
||||
"""יצירת תיק ערר חדש.
|
||||
|
||||
@@ -145,9 +146,12 @@ async def case_create(
|
||||
         hearing_date: hearing date (YYYY-MM-DD)
         notes: notes
         expected_outcome: expected outcome (rejection/partial_acceptance/full_acceptance/betterment_levy)
-        practice_area: legal domain (appeals_committee / national_insurance / labor_law)
+        practice_area: legal domain, given as a domain value (rishuy_uvniya / betterment_levy /
+            compensation_197). Empty or "appeals_committee" = inferred
+            automatically from the case number (1xxx→licensing, 8xxx→betterment levy, 9xxx→197)
         appeal_subtype: appeal type (building_permit / betterment_levy / compensation_197).
             Empty = inferred automatically from the case number
+        proceeding_type: 'ערר' / 'בל"מ'. Empty = inferred from appeal_subtype/subject.
|
||||
"""
|
||||
from datetime import date as date_type
|
||||
|
||||
@@ -155,12 +159,27 @@ async def case_create(
|
||||
if hearing_date:
|
||||
h_date = date_type.fromisoformat(hearing_date)
|
||||
|
||||
-    # Resolve appeal_subtype: explicit override > auto-derive > 'unknown'
-    derived_subtype = pa.derive_subtype(case_number, practice_area)
+    # Auto-derive practice_area when missing or set to the legacy multi-tenant
+    # value. The DB's cases_practice_area_check rejects 'appeals_committee',
+    # so we MUST map it to a domain value before INSERT. If derivation fails
+    # (unknown case number format), fall back to '' which the constraint allows.
+    if not practice_area or practice_area == "appeals_committee":
+        practice_area = pa.derive_domain_practice_area(case_number)
+
+    # Resolve appeal_subtype: explicit override > auto-derive > 'unknown'.
+    # derive_subtype_with_blam inspects the subject to detect בל"מ
+    # (בקשה להארכת מועד) and returns an extension_request_* variant when
+    # appropriate. Falls back to regular derive_subtype when subject is empty.
+    derived_subtype = pa.derive_subtype_with_blam(case_number, subject, practice_area)
     if not appeal_subtype:
         appeal_subtype = derived_subtype
     pa.validate(practice_area, appeal_subtype)

+    # proceeding_type: explicit override > derived from subtype/subject > 'ערר'
+    resolved_proc = proceeding_type.strip() or pa.derive_proceeding_type(
+        appeal_subtype=appeal_subtype, subject=subject,
+    )
+
case = await db.create_case(
|
||||
case_number=case_number,
|
||||
title=title,
|
||||
@@ -175,6 +194,7 @@ async def case_create(
|
||||
expected_outcome=expected_outcome,
|
||||
practice_area=practice_area,
|
||||
appeal_subtype=appeal_subtype,
|
||||
proceeding_type=resolved_proc,
|
||||
)
|
||||
|
||||
# If the user overrode the case-number convention (e.g. case 8500 marked
|
||||
@@ -278,6 +298,7 @@ async def case_update(
|
||||
respondents: list[str] | None = None,
|
||||
property_address: str = "",
|
||||
permit_number: str = "",
|
||||
proceeding_type: str = "",
|
||||
) -> str:
|
||||
"""עדכון פרטי תיק.
|
||||
|
||||
@@ -295,6 +316,7 @@ async def case_update(
|
||||
respondents: רשימת משיבים חדשה
|
||||
property_address: כתובת נכס חדשה
|
||||
permit_number: מספר תכנית/בקשה חדש
|
||||
proceeding_type: 'ערר' / 'בל"מ' — ריק = ללא שינוי
|
||||
"""
|
||||
from datetime import date as date_type
|
||||
|
||||
@@ -326,9 +348,15 @@ async def case_update(
|
||||
if notes:
|
||||
fields["notes"] = notes
|
||||
     if hearing_date:
-        fields["hearing_date"] = date_type.fromisoformat(hearing_date)
+        try:
+            fields["hearing_date"] = date_type.fromisoformat(hearing_date)
+        except ValueError as exc:
+            raise ValueError(f"Invalid hearing_date format: {hearing_date!r}") from exc
     if decision_date:
-        fields["decision_date"] = date_type.fromisoformat(decision_date)
+        try:
+            fields["decision_date"] = date_type.fromisoformat(decision_date)
+        except ValueError as exc:
+            raise ValueError(f"Invalid decision_date format: {decision_date!r}") from exc
|
||||
if tags is not None:
|
||||
fields["tags"] = tags
|
||||
if expected_outcome:
|
||||
@@ -341,6 +369,12 @@ async def case_update(
|
||||
fields["property_address"] = property_address
|
||||
if permit_number:
|
||||
fields["permit_number"] = permit_number
|
||||
if proceeding_type:
|
||||
if proceeding_type not in {"ערר", 'בל"מ'}:
|
||||
raise ValueError(
|
||||
f"proceeding_type לא תקין: {proceeding_type!r}. ערכים תקפים: ערר / בל\"מ"
|
||||
)
|
||||
fields["proceeding_type"] = proceeding_type
|
||||
|
||||
updated = await db.update_case(UUID(case["id"]), **fields)
|
||||
|
||||
|
||||
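
The `case_create` docstring describes a case-number convention (1xxx → `rishuy_uvniya`, 8xxx → `betterment_levy`, 9xxx → `compensation_197`). Below is a minimal sketch of such a mapping. It is not the actual `pa.derive_domain_practice_area`, whose implementation does not appear in this diff. The helper name `derive_domain_from_case_number`, the choice to look at the first run of digits, and the four-digit requirement are all assumptions. The fallback to `""` follows the comment in `case_create`.

```python
import re

# Leading digit of the four-digit case number -> domain practice area.
_PREFIX_TO_DOMAIN = {
    "1": "rishuy_uvniya",
    "8": "betterment_levy",
    "9": "compensation_197",
}


def derive_domain_from_case_number(case_number: str) -> str:
    """Hypothetical helper: map a case number to a domain value.

    Returns '' when the number does not follow the assumed convention.
    """
    match = re.search(r"\d+", case_number or "")
    if not match:
        return ""
    digits = match.group()
    if len(digits) != 4:  # assumption: the convention applies to 4-digit numbers
        return ""
    return _PREFIX_TO_DOMAIN.get(digits[0], "")
```

Returning an empty string rather than raising keeps case creation working for unfamiliar number formats, which matches the stated intent of letting the DB constraint accept `''`.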
135
mcp-server/src/legal_mcp/tools/citations.py
Normal file
@@ -0,0 +1,135 @@
|
||||
"""MCP tools for the internal-decisions citation graph (TaskMaster #34).
|
||||
|
||||
The citation graph captures pointers between Daphna's (and other internal
committee chairs') decisions: when one ruling cites another,
``precedent_internal_citations`` records the edge, resolved against
``case_law`` when the cited row exists and kept as a stub when it doesn't.
|
||||
|
||||
Three tools:
|
||||
|
||||
- ``extract_internal_citations`` — run regex extraction on one row (by id) or
|
||||
on every internal-committee row filtered by chair (e.g. Daphna only).
|
||||
Idempotent: re-running does not duplicate rows (ON CONFLICT DO NOTHING).
|
||||
- ``list_internal_citations`` — outgoing edges from a source row. Optional
|
||||
``linked_only`` filter for rows resolved to existing case_law UUIDs.
|
||||
- ``list_incoming_citations`` — incoming edges to a target row ("which
|
||||
Daphna decisions cite this ruling?").
|
||||
|
||||
These tools are *manual triggers*. The pipeline runs them after a new
|
||||
internal-decision upload, but the chair / researcher can also re-run on
|
||||
demand (for example after fixing OCR or after uploading a previously-
|
||||
missing decision so that newer rows now link to it).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
from uuid import UUID
|
||||
|
||||
from legal_mcp.services import citation_extractor
|
||||
|
||||
|
||||
def _ok(payload) -> str:
|
||||
return json.dumps(payload, ensure_ascii=False, indent=2, default=str)
|
||||
|
||||
|
||||
def _err(msg: str) -> str:
|
||||
return json.dumps({"error": msg}, ensure_ascii=False)
|
||||
|
||||
|
||||
async def extract_internal_citations(
|
||||
case_law_id: str = "",
|
||||
chair_name: str = "",
|
||||
limit: int = 0,
|
||||
) -> str:
|
    """Extract internal citations from appeals-committee decisions and store
    them in precedent_internal_citations.

    Args:
        case_law_id: UUID of a specific decision. When provided, exactly one
            row is processed (use this right after an upload). When empty and
            chair_name is also empty, the tool runs over every
            internal_committee decision.
        chair_name: Chair name (e.g. 'דפנה תמיר'). Filters the batch.
            Empty = all chairs. Cannot be combined with case_law_id.
        limit: Upper bound on the number of rows processed (0 = unlimited).
            Useful for testing.

    The tool is idempotent: ON CONFLICT DO NOTHING on
    (source_case_law_id, cited_case_number).
    Returns statistics: extracted, linked, new, skipped, failed.
    """
|
||||
if case_law_id.strip() and chair_name.strip():
|
||||
return _err("יש לספק case_law_id או chair_name, לא שניהם")
|
||||
|
||||
if case_law_id.strip():
|
||||
try:
|
||||
cl_uuid = UUID(case_law_id.strip())
|
||||
except ValueError:
|
||||
return _err("case_law_id לא תקין")
|
||||
try:
|
||||
stats = await citation_extractor.extract_and_store(cl_uuid)
|
||||
except Exception as e:
|
||||
return _err(str(e))
|
||||
return _ok(stats)
|
||||
|
||||
try:
|
||||
stats = await citation_extractor.extract_all_internal_committee(
|
||||
chair_name_filter=chair_name.strip(),
|
||||
limit=int(limit) if limit else 0,
|
||||
)
|
||||
except Exception as e:
|
||||
return _err(str(e))
|
||||
return _ok(stats)
|
||||
|
||||
|
||||
async def list_internal_citations(
|
||||
case_law_id: str = "",
|
||||
linked_only: bool = False,
|
||||
limit: int = 50,
|
||||
) -> str:
|
    """List the outgoing citations of a decision (what this decision cites).

    Args:
        case_law_id: UUID of the case_law row (required).
        linked_only: True = only citations linked to an existing case_law
            row in the corpus.
        limit: Upper bound on the number of results (default 50).

    Returns: JSON with a list of citations, including
        target_case_number/name/chair when they are linked. With
        linked_only=False, unlinked citations come back with
        cited_case_law_id=null and can be uploaded via
        internal_decision_upload.
    """
|
||||
if not case_law_id.strip():
|
||||
return _err("case_law_id חובה")
|
||||
try:
|
||||
cl_uuid = UUID(case_law_id.strip())
|
||||
except ValueError:
|
||||
return _err("case_law_id לא תקין")
|
||||
try:
|
||||
rows = await citation_extractor.list_citations_for_case_law(
|
||||
cl_uuid, linked_only=bool(linked_only),
|
||||
)
|
||||
except Exception as e:
|
||||
return _err(str(e))
|
||||
return _ok({"items": rows[: max(1, int(limit))], "count": len(rows)})
|
||||
|
||||
|
||||
async def list_incoming_citations(
|
||||
case_law_id: str = "",
|
||||
limit: int = 50,
|
||||
) -> str:
|
    """List the incoming citations of a decision (which decisions cite it).

    Use: to find out which of Daphna's decisions relied on a given ruling,
    pass that ruling's case_law_id.

    Args:
        case_law_id: UUID of the target case_law row (required).
        limit: Upper bound on the number of results.
    """
|
||||
if not case_law_id.strip():
|
||||
return _err("case_law_id חובה")
|
||||
try:
|
||||
cl_uuid = UUID(case_law_id.strip())
|
||||
except ValueError:
|
||||
return _err("case_law_id לא תקין")
|
||||
try:
|
||||
rows = await citation_extractor.list_citations_to_case_law(cl_uuid)
|
||||
except Exception as e:
|
||||
return _err(str(e))
|
||||
return _ok({"items": rows[: max(1, int(limit))], "count": len(rows)})
|
||||
116
mcp-server/src/legal_mcp/tools/internal_decisions.py
Normal file
@@ -0,0 +1,116 @@
|
||||
"""MCP tools for the Internal Decisions corpus.
|
||||
|
||||
Decisions of appeals committees (ועדות ערר) live in the same physical
|
||||
``case_law`` table as court rulings but are distinguished by
|
||||
``source_kind='internal_committee'`` and must carry ``chair_name`` +
|
||||
``district``.
|
||||
|
||||
The existing ``precedent_library_upload`` MCP tool always stores
|
||||
``source_kind='external_upload'`` and does not accept chair/district —
|
||||
which is why **44+ existing appeals-committee decisions were tagged
|
||||
wrong**. This wrapper is the authoritative ingestion path for committee
|
||||
decisions and enforces the required metadata at the tool boundary.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
|
||||
from legal_mcp.services import internal_decisions as int_svc
|
||||
|
||||
# Valid Hebrew district names (matches _COURT_TO_DISTRICT in service)
|
||||
VALID_DISTRICTS = {"ירושלים", "מרכז", "תל אביב", "תל-אביב", "צפון", "דרום", "חיפה", "ארצי"}
|
||||
|
||||
# proceeding_type — ערר vs בל"מ. The service can derive it from
|
||||
# appeal_subtype/subject if left empty, so this stays optional at the API.
|
||||
VALID_PROCEEDING_TYPES = {"ערר", 'בל"מ'}
|
||||
|
||||
|
||||
def _ok(payload) -> str:
|
||||
return json.dumps(payload, ensure_ascii=False, indent=2, default=str)
|
||||
|
||||
|
||||
def _err(msg: str) -> str:
|
||||
return json.dumps({"error": msg}, ensure_ascii=False)
|
||||
|
||||
|
||||
async def internal_decision_upload(
|
||||
file_path: str,
|
||||
case_number: str,
|
||||
chair_name: str,
|
||||
district: str,
|
||||
case_name: str = "",
|
||||
court: str = "",
|
||||
decision_date: str = "",
|
||||
practice_area: str = "",
|
||||
appeal_subtype: str = "",
|
||||
subject_tags: list[str] | None = None,
|
||||
summary: str = "",
|
||||
is_binding: bool = False,
|
||||
proceeding_type: str = "",
|
||||
) -> str:
|
    """Upload an appeals-committee decision (internal_committee) to the
    authoritative corpus.

    Required: file_path, case_number, chair_name, district.
    The tool enforces chair_name+district so the record cannot be saved
    in the broken legacy mode (external_upload with empty chair/district).

    Args:
        file_path: Full path to a PDF/DOCX/RTF/TXT/MD file.
        case_number: Appeal number ("ערר (ועדות ערר - תכנון ובנייה ירושלים) 1110/20 ...").
        chair_name: Name of the committee chair (required).
        district: District (ירושלים/מרכז/תל אביב/צפון/דרום/חיפה/ארצי), required.
        case_name: Short name.
        court: Forum ("ועדת הערר לתכנון ובנייה — מחוז ירושלים").
        decision_date: ISO date (YYYY-MM-DD), optional.
        practice_area: rishuy_uvniya / betterment_levy / compensation_197.
        appeal_subtype: building_permit, etc.
        subject_tags: Subject tags.
        is_binding: Usually False (one appeals committee does not bind
            another; its decisions are horizontally persuasive).
        proceeding_type: 'ערר' or 'בל"מ'. If empty, derived from
            appeal_subtype/case_name.

    Returns: JSON with case_law_id, number of chunks, halachot_pending.
    """
|
||||
if not file_path.strip():
|
||||
return _err("file_path חובה")
|
||||
if not case_number.strip():
|
||||
return _err("case_number חובה")
|
||||
if not chair_name.strip():
|
||||
return _err(
|
||||
"chair_name חובה. החלטות ועדת ערר חייבות שם יו\"ר — "
|
||||
"בלעדיו ההחלטה לא ניתנת לחיפוש סלקטיבי לפי הרכב."
|
||||
)
|
||||
if not district.strip():
|
||||
return _err(
|
||||
"district חובה. ערכים תקפים: " + ", ".join(sorted(VALID_DISTRICTS))
|
||||
)
|
||||
if district.strip() not in VALID_DISTRICTS:
|
||||
return _err(
|
||||
f"district לא תקין: {district!r}. ערכים תקפים: "
|
||||
+ ", ".join(sorted(VALID_DISTRICTS))
|
||||
)
|
||||
if proceeding_type.strip() and proceeding_type.strip() not in VALID_PROCEEDING_TYPES:
|
||||
return _err(
|
||||
f"proceeding_type לא תקין: {proceeding_type!r}. ערכים תקפים: "
|
||||
+ ", ".join(sorted(VALID_PROCEEDING_TYPES))
|
||||
)
|
||||
|
||||
try:
|
||||
result = await int_svc.ingest_internal_decision(
|
||||
case_number=case_number,
|
||||
case_name=case_name,
|
||||
court=court,
|
||||
decision_date=decision_date or None,
|
||||
chair_name=chair_name,
|
||||
district=district,
|
||||
practice_area=practice_area,
|
||||
appeal_subtype=appeal_subtype,
|
||||
subject_tags=subject_tags or [],
|
||||
summary=summary,
|
||||
is_binding=is_binding,
|
||||
file_path=file_path,
|
||||
proceeding_type=proceeding_type,
|
||||
)
|
||||
except Exception as e:
|
||||
return _err(str(e))
|
||||
return _ok(result)
|
||||
83
mcp-server/src/legal_mcp/tools/legal_arguments.py
Normal file
@@ -0,0 +1,83 @@
|
||||
"""MCP tools — aggregated legal arguments (claim de-duplication)."""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
from uuid import UUID
|
||||
|
||||
from legal_mcp.services import argument_aggregator, db
|
||||
|
||||
|
||||
async def aggregate_claims_to_arguments(
|
||||
case_number: str,
|
||||
force: bool = False,
|
||||
) -> str:
|
    """Aggregate raw propositions into distinct legal arguments.

    Args:
        case_number: The appeal case number.
        force: True = delete existing arguments and recompute.
    """
|
||||
case = await db.get_case_by_number(case_number)
|
||||
if not case:
|
||||
return json.dumps(
|
||||
{"status": "error", "message": f"תיק {case_number} לא נמצא."},
|
||||
ensure_ascii=False, indent=2,
|
||||
)
|
||||
|
||||
case_id = UUID(case["id"])
|
||||
result = await argument_aggregator.aggregate_claims_to_arguments(
|
||||
case_id, force=force,
|
||||
)
|
||||
result["case_number"] = case_number
|
||||
return json.dumps(result, ensure_ascii=False, indent=2, default=str)
|
||||
|
||||
|
||||
async def get_legal_arguments(
|
||||
case_number: str,
|
||||
party: str = "",
|
||||
) -> str:
|
    """Fetch the aggregated legal arguments for a case.

    Args:
        case_number: The appeal case number.
        party: Filter by party (appellant/respondent/committee/permit_applicant).
            Empty = all parties.
    """
|
||||
case = await db.get_case_by_number(case_number)
|
||||
if not case:
|
||||
return json.dumps(
|
||||
{"status": "error", "message": f"תיק {case_number} לא נמצא."},
|
||||
ensure_ascii=False, indent=2,
|
||||
)
|
||||
|
||||
case_id = UUID(case["id"])
|
||||
args = await argument_aggregator.get_legal_arguments(case_id, party=party)
|
||||
|
||||
if not args:
|
||||
return json.dumps({
|
||||
"status": "empty",
|
||||
"case_number": case_number,
|
||||
"message": "לא נמצאו טיעונים מאוגדים. הרץ aggregate_claims_to_arguments תחילה.",
|
||||
"arguments": [],
|
||||
}, ensure_ascii=False, indent=2)
|
||||
|
||||
# Group by party for nicer display.
|
||||
party_he = {
|
||||
"appellant": "עוררים",
|
||||
"respondent": "משיבים",
|
||||
"committee": "ועדה מקומית",
|
||||
"permit_applicant": "מבקשי היתר",
|
||||
"unknown": "צד לא מזוהה",
|
||||
}
|
||||
by_party: dict[str, list[dict]] = {}
|
||||
for a in args:
|
||||
label = party_he.get(a["party"], a["party"])
|
||||
by_party.setdefault(label, []).append(a)
|
||||
|
||||
return json.dumps({
|
||||
"status": "ok",
|
||||
"case_number": case_number,
|
||||
"total": len(args),
|
||||
"by_party": by_party,
|
||||
}, ensure_ascii=False, indent=2, default=str)
|
||||
210
mcp-server/src/legal_mcp/tools/missing_precedents.py
Normal file
@@ -0,0 +1,210 @@
|
||||
"""MCP tools for the missing-precedents log.
|
||||
|
||||
When a researcher (or chair) finds a citation in a party brief that
|
||||
isn't yet in the precedent_library, they record it here so:
|
||||
|
||||
1. The gap is visible in the UI (the chair can see all open citations
|
||||
that need to be uploaded).
|
||||
2. The writer agent doesn't try to use a precedent that isn't in the
|
||||
corpus — it knows the gap is being tracked.
|
||||
3. The chair has a clean closing workflow: upload the actual decision
|
||||
via the precedent library / internal-decisions, then link it here.
|
||||
|
||||
Three tools:
|
||||
- ``missing_precedent_create`` — log a new gap (researcher / chair).
|
||||
- ``missing_precedent_list`` — list open gaps (optionally filtered).
|
||||
- ``missing_precedent_close`` — close a gap (chair workflow).
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
from uuid import UUID
|
||||
|
||||
from legal_mcp.services import db
|
||||
|
||||
|
||||
def _ok(payload) -> str:
|
||||
return json.dumps(payload, ensure_ascii=False, indent=2, default=str)
|
||||
|
||||
|
||||
def _err(msg: str) -> str:
|
||||
return json.dumps({"error": msg}, ensure_ascii=False)
|
||||
|
||||
|
||||
async def _resolve_case_id(case_number: str) -> UUID | None:
|
||||
"""Translate a human case_number (e.g. '1017-03-26') to a UUID."""
|
||||
if not case_number or not case_number.strip():
|
||||
return None
|
||||
row = await db.get_case_by_number(case_number.strip())
|
||||
if not row:
|
||||
return None
|
||||
return UUID(row["id"])
|
||||
|
||||
|
||||
async def missing_precedent_create(
|
||||
citation: str,
|
||||
case_number: str = "",
|
||||
cited_in_document_id: str = "",
|
||||
cited_by_party: str = "unknown",
|
||||
cited_by_party_name: str = "",
|
||||
legal_topic: str = "",
|
||||
legal_issue: str = "",
|
||||
claim_quote: str = "",
|
||||
case_name: str = "",
|
||||
notes: str = "",
|
||||
) -> str:
|
    """Record a precedent that was cited but is not in the corpus. The agent
    creates a row when it finds a citation that cannot be verified against
    the corpus; the chair closes it after uploading the document.

    Args:
        citation: The full citation (required).
        case_number: The appeal case in which the precedent was cited
            (e.g. '1017-03-26').
        cited_in_document_id: UUID of the document where the citation
            appears (optional).
        cited_by_party: appellant / respondent / committee / permit_applicant / unknown.
        cited_by_party_name: Name of the party (so it is clear who cited it).
        legal_topic: Short legal topic (e.g. "זכות עמידה", standing).
        legal_issue: Detailed legal question.
        claim_quote: The quotation from the pleading.
        case_name: Short name of the missing ruling.
        notes: Free-form notes.

    Returns: JSON of the created row (including id), the existing row with
        ``_duplicate: true`` when the same citation is already logged for
        the case, or an error.
    """
|
||||
if not citation.strip():
|
||||
return _err("citation חובה")
|
||||
|
||||
case_id = None
|
||||
if case_number:
|
||||
case_id = await _resolve_case_id(case_number)
|
||||
if case_id is None:
|
||||
return _err(f"תיק לא נמצא: {case_number}")
|
||||
|
||||
doc_uuid: UUID | None = None
|
||||
if cited_in_document_id.strip():
|
||||
try:
|
||||
doc_uuid = UUID(cited_in_document_id.strip())
|
||||
except ValueError:
|
||||
return _err("cited_in_document_id לא תקין")
|
||||
|
||||
party = cited_by_party.strip() or "unknown"
|
||||
if party not in db.ALLOWED_MP_PARTIES:
|
||||
return _err(
|
||||
f"cited_by_party לא תקין. ערכים תקפים: "
|
||||
f"{', '.join(sorted(db.ALLOWED_MP_PARTIES))}"
|
||||
)
|
||||
|
||||
# Deduplication: if a row already exists for the same citation in
|
||||
# the same case, return that one rather than creating a duplicate.
|
||||
existing = await db.find_missing_precedent_by_citation(
|
||||
citation=citation.strip(),
|
||||
case_id=case_id,
|
||||
)
|
||||
if existing:
|
||||
return _ok({**existing, "_duplicate": True})
|
||||
|
||||
try:
|
||||
row = await db.create_missing_precedent(
|
||||
citation=citation.strip(),
|
||||
case_name=case_name.strip() or None,
|
||||
cited_in_case_id=case_id,
|
||||
cited_in_document_id=doc_uuid,
|
||||
cited_by_party=party,
|
||||
cited_by_party_name=cited_by_party_name.strip() or None,
|
||||
legal_topic=legal_topic.strip() or None,
|
||||
legal_issue=legal_issue.strip() or None,
|
||||
claim_quote=claim_quote.strip() or None,
|
||||
notes=notes.strip() or None,
|
||||
)
|
||||
except Exception as e:
|
||||
return _err(str(e))
|
||||
return _ok(row)
|
||||
|
||||
|
||||
async def missing_precedent_list(
|
||||
case_number: str = "",
|
||||
status: str = "open",
|
||||
legal_topic: str = "",
|
||||
limit: int = 50,
|
||||
) -> str:
|
    """List missing precedents. Default = open rows only.

    Args:
        case_number: filter by the appeal case in which they were cited.
        status: open / uploaded / closed / irrelevant (empty = all).
        legal_topic: filter by legal topic (substring).
        limit: maximum number of results.

    Returns: JSON with a list of rows, plus linked_case_law_number for rows that were closed.
|
||||
"""
|
||||
case_id = None
|
||||
if case_number:
|
||||
case_id = await _resolve_case_id(case_number)
|
||||
if case_id is None:
|
||||
return _err(f"תיק לא נמצא: {case_number}")
|
||||
|
||||
s = status.strip() or None
|
||||
if s and s not in db.ALLOWED_MP_STATUS:
|
||||
return _err(
|
||||
f"status לא תקין. ערכים תקפים: "
|
||||
f"{', '.join(sorted(db.ALLOWED_MP_STATUS))}"
|
||||
)
|
||||
try:
|
||||
rows = await db.list_missing_precedents(
|
||||
status=s,
|
||||
case_id=case_id,
|
||||
legal_topic=legal_topic.strip() or None,
|
||||
limit=max(1, min(int(limit), 500)),
|
||||
)
|
||||
except Exception as e:
|
||||
return _err(str(e))
|
||||
return _ok({"items": rows, "count": len(rows)})
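The `limit=max(1, min(int(limit), 500))` expression above coerces and clamps the caller's limit before it reaches the database. A minimal sketch of that step as a standalone helper follows; `clamp_limit` and its `lo`/`hi` parameters are hypothetical names used for illustration and do not appear in this diff.

```python
def clamp_limit(limit, lo: int = 1, hi: int = 500) -> int:
    """Coerce ``limit`` to int and clamp it into [lo, hi].

    Hypothetical helper mirroring ``max(1, min(int(limit), 500))``.
    ``int()`` raises ValueError for non-numeric strings, so a caller
    would still need to handle that case.
    """
    return max(lo, min(int(limit), hi))


if __name__ == "__main__":
    # Zero is raised to the floor, large values are capped at the ceiling.
    print(clamp_limit(0), clamp_limit(50), clamp_limit(10_000))
```

A helper along these lines might also be worth applying where a raw `limit` is passed straight into a `LIMIT $1` query, though whether that is needed depends on how the tool is called.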
|
||||
|
||||
|
||||
async def missing_precedent_close(
|
||||
id: str,
|
||||
linked_case_law_id: str = "",
|
||||
notes: str = "",
|
||||
status: str = "closed",
|
||||
) -> str:
|
    """Close a missing-precedent row. Default = 'closed' plus a link to case_law.

    Args:
        id: UUID of the row.
        linked_case_law_id: UUID of the precedent uploaded via precedent_library / internal_decisions.
        notes: closing notes (e.g. "not relevant" for status='irrelevant').
        status: closed / uploaded / irrelevant.

    Returns: JSON of the updated row.
|
||||
"""
|
||||
try:
|
||||
mp_id = UUID(id.strip())
|
||||
except ValueError:
|
||||
return _err("id לא תקין")
|
||||
|
||||
cl_uuid: UUID | None = None
|
||||
if linked_case_law_id.strip():
|
||||
try:
|
||||
cl_uuid = UUID(linked_case_law_id.strip())
|
||||
except ValueError:
|
||||
return _err("linked_case_law_id לא תקין")
|
||||
|
||||
status_clean = status.strip() or "closed"
|
||||
if status_clean not in db.ALLOWED_MP_STATUS:
|
||||
return _err(
|
||||
f"status לא תקין. ערכים תקפים: "
|
||||
f"{', '.join(sorted(db.ALLOWED_MP_STATUS))}"
|
||||
)
|
||||
|
||||
try:
|
||||
row = await db.close_missing_precedent(
|
||||
mp_id=mp_id,
|
||||
linked_case_law_id=cl_uuid,
|
||||
notes=notes.strip() or None,
|
||||
status=status_clean,
|
||||
)
|
||||
except Exception as e:
|
||||
return _err(str(e))
|
||||
if row is None:
|
||||
return _err("רשומה לא נמצאה")
|
||||
return _ok(row)
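Both `missing_precedent_create` and `missing_precedent_close` repeat the same pattern for optional UUID arguments: strip the string, treat blank as "not provided", and reject anything that does not parse. The sketch below shows one way that pattern could be factored out; `parse_optional_uuid` is a hypothetical name and is not part of this change.

```python
from __future__ import annotations

from uuid import UUID


def parse_optional_uuid(raw: str) -> tuple[UUID | None, bool]:
    """Parse an optional UUID string (hypothetical helper).

    Returns ``(value, ok)``:
      * blank input   -> ``(None, True)``  (argument simply not provided)
      * valid UUID    -> ``(UUID(...), True)``
      * anything else -> ``(None, False)`` (caller should return an error)
    """
    text = raw.strip()
    if not text:
        return None, True
    try:
        return UUID(text), True
    except ValueError:
        return None, False


if __name__ == "__main__":
    print(parse_optional_uuid(""))
    print(parse_optional_uuid("not-a-uuid"))
    print(parse_optional_uuid(" 12345678-1234-5678-1234-567812345678 "))
```

Returning an explicit `ok` flag keeps "blank" and "invalid" distinguishable, which matches how the tools above return an error envelope only for the invalid case.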
|
||||
@@ -18,9 +18,10 @@ the chair approves them — per project review policy.
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
import time
|
||||
from uuid import UUID
|
||||
|
||||
-from legal_mcp.services import db, precedent_library
+from legal_mcp.services import db, precedent_library, telemetry
|
||||
|
||||
|
||||
def _ok(payload) -> str:
|
||||
@@ -63,6 +64,18 @@ async def precedent_library_upload(
|
||||
"""
|
||||
if not citation.strip():
|
||||
return _err("citation חובה")
|
||||
# Citation guard: appeals-committee decisions must go through
|
||||
# internal_decision_upload (with chair_name + district). The legacy
|
||||
# path always stored source_kind='external_upload' and left
|
||||
# chair_name/district empty — see TaskMaster #30(ב).
|
||||
_norm = citation.strip()
|
||||
_committee_prefixes = ("ערר ", "ערר(", "ערר ", "בל\"מ ", "בל\"מ(", "ARAR ")
|
||||
if any(_norm.startswith(p) for p in _committee_prefixes):
|
||||
return _err(
|
||||
"ציטוט שמתחיל ב-'ערר' או 'בל\"מ' הוא החלטת ועדת ערר. "
|
||||
"השתמש ב-internal_decision_upload (דורש chair_name + district), "
|
||||
"לא ב-precedent_library_upload."
|
||||
)
|
||||
try:
|
||||
result = await precedent_library.ingest_precedent(
|
||||
file_path=file_path,
|
||||
@@ -90,6 +103,7 @@ async def precedent_library_list(
|
||||
precedent_level: str = "",
|
||||
source_type: str = "",
|
||||
search: str = "",
|
||||
source_kind: str = "external_upload",
|
||||
limit: int = 100,
|
||||
) -> str:
|
    """List precedents in the authoritative corpus, with filters."""
|
||||
@@ -99,6 +113,7 @@ async def precedent_library_list(
|
||||
precedent_level=precedent_level,
|
||||
source_type=source_type,
|
||||
search=search,
|
||||
source_kind=source_kind,
|
||||
limit=limit,
|
||||
)
|
||||
return _ok(rows)
|
||||
@@ -248,8 +263,10 @@ async def search_precedent_library(
|
||||
"""
|
||||
if not query or len(query.strip()) < 2:
|
||||
return json.dumps([], ensure_ascii=False)
|
||||
q = query.strip()
|
||||
t0 = time.perf_counter()
|
||||
results = await precedent_library.search_library(
|
||||
-        query=query.strip(),
+        query=q,
|
||||
practice_area=practice_area,
|
||||
court=court,
|
||||
precedent_level=precedent_level,
|
||||
@@ -259,6 +276,15 @@ async def search_precedent_library(
|
||||
limit=limit,
|
||||
include_halachot=include_halachot,
|
||||
)
|
||||
elapsed_ms = int((time.perf_counter() - t0) * 1000)
|
||||
telemetry.log_search_bg(
|
||||
search_type="precedent_library",
|
||||
query=q,
|
||||
results=results,
|
||||
duration_ms=elapsed_ms,
|
||||
practice_area=practice_area or None,
|
||||
user_agent="unknown",
|
||||
)
|
||||
return _ok(results)
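Each search tool in this change wraps its backend call with the same `time.perf_counter()` pair before handing `duration_ms` to `telemetry.log_search_bg`. The timing half of that pattern can be sketched as a small async helper; `timed` is a hypothetical name, and the telemetry call itself is deliberately left out because its full signature is not shown in this diff.

```python
import asyncio
import time


async def timed(awaitable):
    """Await ``awaitable`` and return ``(result, elapsed_ms)``.

    Hypothetical helper: measures wall-clock time with the monotonic
    ``perf_counter`` and truncates to whole milliseconds, as the tools
    in this diff do inline.
    """
    t0 = time.perf_counter()
    result = await awaitable
    elapsed_ms = int((time.perf_counter() - t0) * 1000)
    return result, elapsed_ms


if __name__ == "__main__":
    # asyncio.sleep stands in for a real search call here.
    hits, ms = asyncio.run(timed(asyncio.sleep(0.01, result=["hit"])))
    print(hits, ms)
```

Note that in `search_decisions` the timer starts after `embeddings.embed_query`, so the logged duration there appears to cover the hybrid search only, not the embedding step.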
|
||||
|
||||
|
||||
|
||||
@@ -4,9 +4,10 @@ from __future__ import annotations
|
||||
|
||||
import json
|
||||
import logging
|
||||
import time
|
||||
from uuid import UUID
|
||||
|
||||
-from legal_mcp.services import db, embeddings, hybrid_search
+from legal_mcp.services import db, embeddings, hybrid_search, telemetry
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
@@ -30,11 +31,16 @@ async def search_decisions(
|
||||
        case_number: if provided, practice_area/subtype are inferred automatically from the case
|
||||
"""
|
||||
# Auto-resolve practice_area from case_number if available
|
||||
resolved_case_id: UUID | None = None
|
||||
if case_number and not practice_area:
|
||||
case = await db.get_case_by_number(case_number)
|
||||
if case:
|
||||
practice_area = case.get("practice_area") or ""
|
||||
appeal_subtype = appeal_subtype or (case.get("appeal_subtype") or "")
|
||||
try:
|
||||
resolved_case_id = UUID(case["id"])
|
||||
except (KeyError, ValueError, TypeError):
|
||||
resolved_case_id = None
|
||||
|
||||
if not practice_area:
|
||||
logger.warning(
|
||||
@@ -43,6 +49,7 @@ async def search_decisions(
|
||||
)
|
||||
|
||||
query_emb = await embeddings.embed_query(query)
|
||||
t0 = time.perf_counter()
|
||||
results = await hybrid_search.search_documents_hybrid(
|
||||
query=query,
|
||||
query_text_embedding=query_emb,
|
||||
@@ -51,6 +58,16 @@ async def search_decisions(
|
||||
practice_area=practice_area or None,
|
||||
appeal_subtype=appeal_subtype or None,
|
||||
)
|
||||
elapsed_ms = int((time.perf_counter() - t0) * 1000)
|
||||
telemetry.log_search_bg(
|
||||
search_type="decisions",
|
||||
query=query,
|
||||
results=results,
|
||||
duration_ms=elapsed_ms,
|
||||
practice_area=practice_area or None,
|
||||
case_id=resolved_case_id,
|
||||
user_agent="unknown",
|
||||
)
|
||||
|
||||
if not results:
|
||||
return "לא נמצאו תוצאות."
|
||||
@@ -87,13 +104,24 @@ async def search_case_documents(
|
||||
if not case:
|
||||
return f"תיק {case_number} לא נמצא."
|
||||
|
||||
case_uuid = UUID(case["id"])
|
||||
query_emb = await embeddings.embed_query(query)
|
||||
# Restricted to case_id — practice_area filter would be redundant.
|
||||
t0 = time.perf_counter()
|
||||
results = await hybrid_search.search_documents_hybrid(
|
||||
query=query,
|
||||
query_text_embedding=query_emb,
|
||||
limit=limit,
|
||||
-        case_id=UUID(case["id"]),
+        case_id=case_uuid,
|
||||
)
|
||||
elapsed_ms = int((time.perf_counter() - t0) * 1000)
|
||||
telemetry.log_search_bg(
|
||||
search_type="case_documents",
|
||||
query=query,
|
||||
results=results,
|
||||
duration_ms=elapsed_ms,
|
||||
case_id=case_uuid,
|
||||
user_agent="unknown",
|
||||
)
|
||||
|
||||
if not results:
|
||||
@@ -130,11 +158,16 @@ async def find_similar_cases(
|
||||
        appeal_subtype: appeal subtype to filter by
        case_number: if provided, practice_area/subtype are inferred automatically from the case
|
||||
"""
|
||||
resolved_case_id: UUID | None = None
|
||||
if case_number and not practice_area:
|
||||
case = await db.get_case_by_number(case_number)
|
||||
if case:
|
||||
practice_area = case.get("practice_area") or ""
|
||||
appeal_subtype = appeal_subtype or (case.get("appeal_subtype") or "")
|
||||
try:
|
||||
resolved_case_id = UUID(case["id"])
|
||||
except (KeyError, ValueError, TypeError):
|
||||
resolved_case_id = None
|
||||
|
||||
if not practice_area:
|
||||
logger.warning(
|
||||
@@ -145,6 +178,7 @@ async def find_similar_cases(
|
||||
query_emb = await embeddings.embed_query(description)
|
||||
# Even with rerank we ask for ``limit*3`` so the dedup-by-case
|
||||
# step downstream still has enough rows to pick the best per case.
|
||||
t0 = time.perf_counter()
|
||||
results = await hybrid_search.search_documents_hybrid(
|
||||
query=description,
|
||||
query_text_embedding=query_emb,
|
||||
@@ -152,6 +186,16 @@ async def find_similar_cases(
|
||||
practice_area=practice_area or None,
|
||||
appeal_subtype=appeal_subtype or None,
|
||||
)
|
||||
elapsed_ms = int((time.perf_counter() - t0) * 1000)
|
||||
telemetry.log_search_bg(
|
||||
search_type="similar_cases",
|
||||
query=description,
|
||||
results=results,
|
||||
duration_ms=elapsed_ms,
|
||||
practice_area=practice_area or None,
|
||||
case_id=resolved_case_id,
|
||||
user_agent="unknown",
|
||||
)
|
||||
|
||||
if not results:
|
||||
return "לא נמצאו תיקים דומים."
|
||||
@@ -189,6 +233,7 @@ async def search_internal_decisions(
|
||||
chair_name: str = "",
|
||||
limit: int = 10,
|
||||
include_halachot: bool = True,
|
||||
include_cited_by: bool = False,
|
||||
) -> str:
|
    """Search decisions of the planning-and-building appeals committees (all districts).
|
||||
|
||||
@@ -200,42 +245,145 @@ async def search_internal_decisions(
|
||||
        chair_name: name of the committee chair to filter by. Empty = all chairs
        limit: maximum number of results
        include_halachot: whether to include extracted halachot
        include_cited_by: True = after the main search, append the decisions that the
            primary hits cite (from precedent_internal_citations). Defaults to False
            so existing callers are not broken. match_type='cited_by' marks a
            secondary result.
|
||||
"""
|
||||
from legal_mcp.services import internal_decisions as int_svc
|
||||
|
||||
# Bump the limit a bit when we're expanding via citations — the
|
||||
# citation step is cheap and a few extra primary hits make the
|
||||
# expansion more useful.
|
||||
primary_limit = limit if not include_cited_by else max(limit, limit * 2)
|
||||
|
||||
t0 = time.perf_counter()
|
||||
results = await int_svc.search_internal(
|
||||
query,
|
||||
practice_area=practice_area,
|
||||
appeal_subtype=appeal_subtype,
|
||||
district=district,
|
||||
chair_name=chair_name,
|
||||
-        limit=limit,
+        limit=primary_limit,
|
||||
include_halachot=include_halachot,
|
||||
)
|
||||
elapsed_ms = int((time.perf_counter() - t0) * 1000)
|
||||
telemetry.log_search_bg(
|
||||
search_type="internal_decisions",
|
||||
query=query,
|
||||
results=results,
|
||||
duration_ms=elapsed_ms,
|
||||
practice_area=practice_area or None,
|
||||
user_agent="unknown",
|
||||
)
|
||||
|
||||
if not results:
|
||||
return "לא נמצאו החלטות ועדת ערר רלוונטיות."
|
||||
|
||||
# Cap primary results back to ``limit`` (we over-fetched only to seed
|
||||
# the citation expansion below — the user asked for ``limit`` items).
|
||||
primary = results[:limit]
|
||||
|
||||
formatted = []
|
||||
-    for r in results:
-        entry = {
-            "score": round(float(r["score"]), 4),
-            "type": r.get("type", "passage"),
-            "case_number": r.get("case_number"),
-            "case_name": r.get("case_name"),
-            "court": r.get("court"),
-            "district": r.get("district"),
-            "chair_name": r.get("chair_name"),
-            "decision_date": r.get("decision_date"),
-        }
-        if r.get("type") == "halacha":
-            entry["rule"] = r.get("rule_statement")
-            entry["quote"] = r.get("supporting_quote")
-            entry["rule_type"] = r.get("rule_type")
-        else:
-            entry["content"] = r.get("content", "")
-            entry["section"] = r.get("section_type")
-            entry["page"] = r.get("page_number")
-        formatted.append(entry)
|
||||
seen_case_law_ids: set[str] = set()
|
||||
for r in primary:
|
||||
clid = str(r.get("case_law_id") or "")
|
||||
if clid:
|
||||
seen_case_law_ids.add(clid)
|
||||
formatted.append(_format_internal_row(r, match_type="primary"))
|
||||
|
||||
if include_cited_by and seen_case_law_ids:
|
||||
from uuid import UUID
|
||||
from legal_mcp.services import citation_extractor
|
||||
|
||||
try:
|
||||
source_uuids = [UUID(s) for s in seen_case_law_ids]
|
||||
cited_map = await citation_extractor.get_cited_case_law_ids(source_uuids)
|
||||
except Exception as e:
|
||||
logger.warning("include_cited_by lookup failed: %s", e)
|
||||
cited_map = {}
|
||||
|
||||
# Flatten + dedup the cited case_law_ids that aren't already in
|
||||
# the primary set.
|
||||
cited_ids: set[str] = set()
|
||||
for ids in cited_map.values():
|
||||
for cid in ids:
|
||||
if cid and cid not in seen_case_law_ids:
|
||||
cited_ids.add(cid)
|
||||
|
||||
if cited_ids:
|
||||
cited_rows = await _fetch_case_law_summaries(list(cited_ids))
|
||||
for row in cited_rows:
|
||||
formatted.append(_format_internal_row(row, match_type="cited_by"))
|
||||
|
||||
return json.dumps(formatted, ensure_ascii=False, indent=2)
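The `include_cited_by` path above does two things: it over-fetches primary hits, then flattens the citation map into the set of cited IDs that are not already among the primary results. The following sketch isolates both steps. The function names are hypothetical, and the shape of `cited_map` (source ID mapped to a list of cited ID strings) is an assumption inferred from how the loop above iterates it.

```python
def primary_fetch_limit(limit: int, include_cited_by: bool) -> int:
    """How many primary hits to request (mirrors the expression in the diff)."""
    return limit if not include_cited_by else max(limit, limit * 2)


def collect_cited_ids(cited_map: dict[str, list[str]], seen: set[str]) -> set[str]:
    """Flatten ``cited_map`` and drop empty IDs and IDs already in ``seen``.

    Assumed shape: {source_case_law_id: [cited_case_law_id, ...]}.
    """
    return {
        cid
        for ids in cited_map.values()
        for cid in ids
        if cid and cid not in seen
    }


if __name__ == "__main__":
    seen = {"a", "b"}
    cited = {"a": ["b", "c"], "b": ["c", "d", ""]}
    print(sorted(collect_cited_ids(cited, seen)))
    print(primary_fetch_limit(10, True), primary_fetch_limit(10, False))
```

For any positive `limit`, `max(limit, limit * 2)` is simply `limit * 2`; the `max` only matters for zero or negative values.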
|
||||
|
||||
|
||||
def _format_internal_row(r: dict, *, match_type: str = "primary") -> dict:
|
||||
"""Shape an internal-decision hit (or a cited_by stub) for the MCP response."""
|
||||
entry: dict = {
|
||||
"score": round(float(r.get("score", 0.0)), 4),
|
||||
"type": r.get("type", "passage"),
|
||||
"case_number": r.get("case_number"),
|
||||
"case_name": r.get("case_name"),
|
||||
"court": r.get("court"),
|
||||
"district": r.get("district"),
|
||||
"chair_name": r.get("chair_name"),
|
||||
"decision_date": r.get("decision_date"),
|
||||
"match_type": match_type,
|
||||
}
|
||||
if r.get("type") == "halacha":
|
||||
entry["rule"] = r.get("rule_statement")
|
||||
entry["quote"] = r.get("supporting_quote")
|
||||
entry["rule_type"] = r.get("rule_type")
|
||||
else:
|
||||
entry["content"] = r.get("content", "")
|
||||
entry["section"] = r.get("section_type")
|
||||
entry["page"] = r.get("page_number")
|
||||
return entry
|
||||
|
||||
|
||||
async def _fetch_case_law_summaries(case_law_ids: list[str]) -> list[dict]:
|
||||
"""Pull lightweight metadata for a set of case_law UUIDs (cited-by stubs).
|
||||
|
||||
Doesn't pull chunks/halachot — the goal is to surface the existence of
|
||||
the related precedent, not to repeat search. The caller can drill in
|
||||
via search_internal_decisions with chair_name+case_number if they want
|
||||
full passages.
|
||||
"""
|
||||
from uuid import UUID
|
||||
pool = await db.get_pool()
|
||||
uuid_list = []
|
||||
for s in case_law_ids:
|
||||
try:
|
||||
uuid_list.append(UUID(s))
|
||||
except ValueError:
|
||||
continue
|
||||
if not uuid_list:
|
||||
return []
|
||||
async with pool.acquire() as conn:
|
||||
rows = await conn.fetch(
|
||||
"""
|
||||
SELECT id::text AS case_law_id,
|
||||
case_number,
|
||||
case_name,
|
||||
court,
|
||||
district,
|
||||
chair_name,
|
||||
date AS decision_date,
|
||||
headnote AS content
|
||||
FROM case_law
|
||||
WHERE id = ANY($1::uuid[])
|
||||
""",
|
||||
uuid_list,
|
||||
)
|
||||
out: list[dict] = []
|
||||
for r in rows:
|
||||
d = dict(r)
|
||||
if d.get("decision_date") is not None:
|
||||
d["decision_date"] = d["decision_date"].isoformat()
|
||||
# Stub rows show up with score 0 — they're not ranked, they're context.
|
||||
d["score"] = 0.0
|
||||
d["type"] = "passage"
|
||||
out.append(d)
|
||||
return out
|
||||
|
||||
85
mcp-server/src/legal_mcp/tools/training_enrichment.py
Normal file
@@ -0,0 +1,85 @@
|
||||
"""MCP tool wrappers for the style_corpus metadata-enrichment flow.
|
||||
|
||||
The actual extractor lives in
|
||||
``legal_mcp.services.style_metadata_extractor``; this module just exposes
|
||||
it as MCP tools that the chair (or a future automation) can call from
|
||||
Claude Code.
|
||||
|
||||
Why these tools matter: the upload pipeline (`/api/training/upload` →
|
||||
`_process_proofread_training`) inserts a style_corpus row with
|
||||
``summary=''``, ``outcome=''``, ``key_principles=[]`` because LLM
|
||||
extraction can't run from the FastAPI container (no claude CLI there).
|
||||
This module fills that gap — call it from the host, where ``claude``
|
||||
CLI is available, and the row gets enriched.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import json
|
||||
from uuid import UUID
|
||||
|
||||
from legal_mcp.services import db, style_metadata_extractor
|
||||
|
||||
|
||||
def _ok(payload) -> str:
|
||||
return json.dumps({"ok": True, **payload}, ensure_ascii=False, default=str)
|
||||
|
||||
|
||||
def _err(msg: str) -> str:
|
||||
return json.dumps({"ok": False, "error": msg}, ensure_ascii=False)
|
||||
|
||||
|
||||
async def extract_decision_metadata(corpus_id: str, overwrite: bool = False) -> str:
|
    """Extract metadata (summary, outcome, key_principles, appeal_subtype) for a decision in the style corpus.

    The default ``overwrite=False`` fills only empty fields. Pass ``overwrite=true``
    to refresh values that were already written.
|
||||
"""
|
||||
try:
|
||||
cid = UUID(corpus_id)
|
||||
except ValueError:
|
||||
return _err("corpus_id לא תקין")
|
||||
try:
|
||||
result = await style_metadata_extractor.extract_and_apply(cid, overwrite=overwrite)
|
||||
except Exception as e:
|
||||
return _err(str(e))
|
||||
return _ok(result)
|
||||
|
||||
|
||||
async def list_corpus_pending_enrichment(limit: int = 50) -> str:
|
    """List style_corpus rows that are missing summary/outcome/key_principles: candidates for enrichment."""
|
||||
pool = await db.get_pool()
|
||||
async with pool.acquire() as conn:
|
||||
rows = await conn.fetch(
|
||||
"""
|
||||
SELECT id, decision_number, decision_date,
|
||||
length(full_text) AS chars,
|
||||
coalesce(summary, '') = '' AS missing_summary,
|
||||
coalesce(outcome, '') = '' AS missing_outcome,
|
||||
coalesce(jsonb_array_length(key_principles), 0) = 0 AS missing_principles
|
||||
FROM style_corpus
|
||||
WHERE coalesce(summary, '') = ''
|
||||
OR coalesce(outcome, '') = ''
|
||||
OR coalesce(jsonb_array_length(key_principles), 0) = 0
|
||||
ORDER BY decision_date NULLS LAST
|
||||
LIMIT $1
|
||||
""",
|
||||
limit,
|
||||
)
|
||||
items = [
|
||||
{
|
||||
"corpus_id": str(r["id"]),
|
||||
"decision_number": r["decision_number"] or "",
|
||||
"decision_date": str(r["decision_date"]) if r["decision_date"] else "",
|
||||
"chars": r["chars"],
|
||||
"missing": [
|
||||
f for f, v in (
|
||||
("summary", r["missing_summary"]),
|
||||
("outcome", r["missing_outcome"]),
|
||||
("key_principles", r["missing_principles"]),
|
||||
) if v
|
||||
],
|
||||
}
|
||||
for r in rows
|
||||
]
|
||||
return _ok({"count": len(items), "items": items})
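The query above computes three `missing_*` booleans in SQL and the list comprehension turns them into a `missing` list of field names. A pure-Python sketch of the same rule, useful for reasoning about which rows count as pending, is shown here; `missing_fields` is a hypothetical helper and assumes a row is a plain dict whose `key_principles` is a list or `None`.

```python
def missing_fields(row: dict) -> list[str]:
    """Return the enrichment fields that are empty on ``row``.

    Mirrors the SQL rule: NULL or '' counts as missing for the text
    fields, NULL or an empty array counts as missing for key_principles.
    """
    checks = (
        ("summary", not (row.get("summary") or "")),
        ("outcome", not (row.get("outcome") or "")),
        ("key_principles", len(row.get("key_principles") or []) == 0),
    )
    return [name for name, is_missing in checks if is_missing]


if __name__ == "__main__":
    print(missing_fields({"summary": "", "outcome": "accepted", "key_principles": []}))
```

A row is listed as pending when this list is non-empty, which corresponds to the `OR` of the three conditions in the `WHERE` clause.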
|
||||
276
mcp-server/tests/test_corpus_constraints.py
Normal file
@@ -0,0 +1,276 @@
|
||||
"""Regression tests for Stage-A corpus integrity fixes (TaskMaster #30, #31).
|
||||
|
||||
These tests document the bugs that were closed in Stage A so they don't
|
||||
regress quietly. Each test maps to a real bug or constraint:
|
||||
|
||||
1. DB CHECK ``cases_practice_area_check`` rejects the legacy
|
||||
``'appeals_committee'`` value — only domain values (rishuy_uvniya /
|
||||
betterment_levy / compensation_197) and ``''`` are allowed.
|
||||
(Bug: many ``cases`` rows stored ``'appeals_committee'`` instead of
|
||||
the domain.)
|
||||
|
||||
2. DB CHECK ``case_law_internal_chair_check`` and
|
||||
``case_law_internal_district_check`` reject internal_committee rows
|
||||
with empty chair_name/district.
|
||||
(Bug: 6 records had source_kind='external_upload' but were really
|
||||
internal committee decisions; the flip to internal_committee in
|
||||
Stage A.2 surfaced the missing chair/district fields.)
|
||||
|
||||
3. DB CHECK ``case_law_external_arar_check`` rejects external_upload
|
||||
rows whose case_number starts with ``"ערר"`` or ``"בל\\"מ"`` —
|
||||
committee decisions must go through internal_decision_upload, not
|
||||
precedent_library_upload.
|
||||
(Bug: the legacy upload path stored everything as external_upload,
|
||||
including appeal-committee decisions; the citation guard now
|
||||
redirects them.)
|
||||
|
||||
4. MCP tool ``precedent_library_upload`` returns an ``_err`` envelope
|
||||
when the citation starts with ``"ערר"`` (citation guard, not DB
|
||||
constraint — fires before INSERT to surface a helpful error).
|
||||
|
||||
These tests connect to the live local Postgres (port 5433) — they do not
|
||||
mock asyncpg. Run with::
|
||||
|
||||
pytest mcp-server/tests/test_corpus_constraints.py -v
|
||||
|
||||
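If neither ``DATABASE_URL`` nor ``LEGAL_AI_DATABASE_URL`` is set, ``_dsn()`` falls back to the hard-coded local DSN, so the skip in the ``dsn`` fixture only fires if that fallback is removed.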
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import asyncio
|
||||
import json
|
||||
import os
|
||||
from uuid import uuid4
|
||||
|
||||
import asyncpg
|
||||
import pytest
|
||||
|
||||
|
||||
def _dsn() -> str | None:
|
||||
return (
|
||||
os.environ.get("DATABASE_URL")
|
||||
or os.environ.get("LEGAL_AI_DATABASE_URL")
|
||||
or "postgresql://legal_ai:od0ASJZFYibOlWK59krLvvETmgqwlXe8@localhost:5433/legal_ai"
|
||||
)
|
||||
|
||||
|
||||
@pytest.fixture()
|
||||
def dsn() -> str:
|
||||
d = _dsn()
|
||||
if not d:
|
||||
pytest.skip("No DATABASE_URL set; skipping live-DB regression tests")
|
||||
return d
|
||||
|
||||
|
||||
@pytest.fixture()
|
||||
def event_loop():
|
||||
"""Provide a fresh event loop per test so asyncpg doesn't leak across cases."""
|
||||
loop = asyncio.new_event_loop()
|
||||
try:
|
||||
yield loop
|
||||
finally:
|
||||
loop.close()
|
||||
|
||||
|
||||
def _run(loop, coro):
|
||||
return loop.run_until_complete(coro)
|
||||
|
||||
|
||||
# ── 1. cases.practice_area CHECK ─────────────────────────────────────
|
||||
|
||||
|
||||
def test_cases_rejects_appeals_committee_practice_area(dsn: str, event_loop) -> None:
|
||||
"""``cases.practice_area = 'appeals_committee'`` must violate the CHECK."""
|
||||
|
||||
async def attempt() -> None:
|
||||
conn = await asyncpg.connect(dsn)
|
||||
try:
|
||||
with pytest.raises(asyncpg.exceptions.CheckViolationError):
|
||||
await conn.execute(
|
||||
"""INSERT INTO cases (id, case_number, title, practice_area)
|
||||
VALUES ($1, $2, $3, $4)""",
|
||||
uuid4(), f"TEST-{uuid4().hex[:8]}", "regression-test",
|
||||
"appeals_committee",
|
||||
)
|
||||
finally:
|
||||
await conn.close()
|
||||
|
||||
_run(event_loop, attempt())
|
||||
|
||||
|
||||
def test_cases_accepts_domain_practice_area(dsn: str, event_loop) -> None:
|
||||
"""Sanity check: rishuy_uvniya / betterment_levy / compensation_197
|
||||
+ empty string must be accepted."""
|
||||
|
||||
async def attempt() -> None:
|
||||
conn = await asyncpg.connect(dsn)
|
||||
try:
|
||||
tx = conn.transaction()
|
||||
await tx.start()
|
||||
try:
|
||||
for value in ("rishuy_uvniya", "betterment_levy",
|
||||
"compensation_197", ""):
|
||||
await conn.execute(
|
||||
"""INSERT INTO cases (id, case_number, title, practice_area)
|
||||
VALUES ($1, $2, $3, $4)""",
|
||||
uuid4(), f"TEST-{uuid4().hex[:8]}",
|
||||
f"regression-{value or 'empty'}", value,
|
||||
)
|
||||
finally:
|
||||
await tx.rollback()
|
||||
finally:
|
||||
await conn.close()
|
||||
|
||||
_run(event_loop, attempt())
|
||||
|
||||
|
||||
# ── 2. case_law internal_committee chair/district CHECK ─────────────
|
||||
|
||||
|
||||
def test_case_law_internal_requires_chair_and_district(dsn: str, event_loop) -> None:
|
||||
"""``case_law`` rows with ``source_kind='internal_committee'`` must have
|
||||
non-empty ``chair_name`` AND ``district``."""
|
||||
|
||||
async def attempt_missing_chair() -> None:
|
||||
conn = await asyncpg.connect(dsn)
|
||||
try:
|
||||
with pytest.raises(asyncpg.exceptions.CheckViolationError):
|
||||
await conn.execute(
|
||||
"""INSERT INTO case_law (id, case_number, case_name,
|
||||
source_kind, district, chair_name)
|
||||
VALUES ($1, $2, $3, $4, $5, $6)""",
|
||||
uuid4(), f"ערר {uuid4().hex[:6]}",
|
||||
"test internal w/o chair",
|
||||
"internal_committee", "ירושלים", "",
|
||||
)
|
||||
finally:
|
||||
await conn.close()
|
||||
|
||||
async def attempt_missing_district() -> None:
|
||||
conn = await asyncpg.connect(dsn)
|
||||
try:
|
||||
with pytest.raises(asyncpg.exceptions.CheckViolationError):
|
||||
await conn.execute(
|
||||
"""INSERT INTO case_law (id, case_number, case_name,
|
||||
source_kind, district, chair_name)
|
||||
VALUES ($1, $2, $3, $4, $5, $6)""",
|
||||
uuid4(), f"ערר {uuid4().hex[:6]}",
|
||||
"test internal w/o district",
|
||||
"internal_committee", "", "עו\"ד דפנה תמיר",
|
||||
)
|
||||
finally:
|
||||
await conn.close()
|
||||
|
||||
_run(event_loop, attempt_missing_chair())
|
||||
_run(event_loop, attempt_missing_district())
|
||||
|
||||
|
||||
# ── 3. case_law external_upload + ערר citation CHECK ────────────────
|
||||
|
||||
|
||||
def test_case_law_external_upload_rejects_arar_citation(dsn: str, event_loop) -> None:
|
||||
"""``case_law`` rows with ``source_kind='external_upload'`` cannot have
|
||||
a ``case_number`` that starts with ``"ערר"`` or ``"בל\"מ"`` — those
|
||||
are committee decisions and must use ``source_kind='internal_committee'``."""
|
||||
|
||||
async def attempt_arar() -> None:
|
||||
conn = await asyncpg.connect(dsn)
|
||||
try:
|
||||
with pytest.raises(asyncpg.exceptions.CheckViolationError):
|
||||
await conn.execute(
|
||||
"""INSERT INTO case_law (id, case_number, case_name,
|
||||
source_kind)
|
||||
VALUES ($1, $2, $3, $4)""",
|
||||
uuid4(), "ערר 1170/24 חיים נ' ועדה",
|
||||
"test external arar", "external_upload",
|
||||
)
|
||||
finally:
|
||||
await conn.close()
|
||||
|
||||
async def attempt_balam() -> None:
|
||||
conn = await asyncpg.connect(dsn)
|
||||
try:
|
||||
with pytest.raises(asyncpg.exceptions.CheckViolationError):
|
||||
await conn.execute(
|
||||
"""INSERT INTO case_law (id, case_number, case_name,
|
||||
source_kind)
|
||||
VALUES ($1, $2, $3, $4)""",
|
||||
uuid4(), 'בל"מ 1234/25 פלוני',
|
||||
"test external balam", "external_upload",
|
||||
)
|
||||
finally:
|
||||
await conn.close()
|
||||
|
||||
_run(event_loop, attempt_arar())
|
||||
_run(event_loop, attempt_balam())
|
||||
|
||||
|
||||
# ── 4. MCP precedent_library_upload citation guard ──────────────────
|
||||
|
||||
|
||||
def test_mcp_precedent_upload_rejects_arar_citation() -> None:
|
||||
"""The MCP tool ``precedent_library_upload`` must short-circuit
|
||||
citations that start with ``"ערר"`` / ``"בל\"מ"`` and return an
|
||||
``_err`` envelope (a helpful message redirecting to
|
||||
``internal_decision_upload``), without touching the DB."""
|
||||
|
||||
from legal_mcp.tools import precedent_library as tools
|
||||
|
||||
async def call(citation: str) -> dict:
|
||||
# file_path won't be touched because the guard fires first.
|
||||
return json.loads(
|
||||
await tools.precedent_library_upload(
|
||||
file_path="/nonexistent",
|
||||
citation=citation,
|
||||
)
|
||||
)
|
||||
|
||||
loop = asyncio.new_event_loop()
|
||||
try:
|
||||
for citation in (
|
||||
"ערר 1170/24 חיים נ' ועדה",
|
||||
'בל"מ 1234/25 פלוני',
|
||||
"ARAR 8126-25 ב. קרן-נכסים",
|
||||
):
|
||||
result = loop.run_until_complete(call(citation))
|
||||
assert "error" in result, (
|
||||
f"expected guard to reject {citation!r}, got {result!r}"
|
||||
)
|
||||
# The error message should mention internal_decision_upload so
|
||||
# the caller knows the alternative path.
|
||||
assert "internal_decision_upload" in result["error"], (
|
||||
f"error message should redirect to internal_decision_upload, "
|
||||
f"got {result['error']!r}"
|
||||
)
|
||||
finally:
|
||||
loop.close()
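The guard exercised by this test is a plain prefix check on the stripped citation. The sketch below reproduces it in isolation so the accepted and rejected shapes are easy to see. `is_committee_citation` is a hypothetical name; the prefixes are copied from the tuple in `precedent_library_upload`, with the entry that appears twice in the rendered diff listed once.

```python
# Prefixes that mark an appeals-committee decision (copied from the diff).
COMMITTEE_PREFIXES = ("ערר ", "ערר(", 'בל"מ ', 'בל"מ(', "ARAR ")


def is_committee_citation(citation: str) -> bool:
    """True when the citation should go through internal_decision_upload.

    ``str.startswith`` accepts a tuple, so one call covers every prefix.
    Each prefix ends in a space or '(', so a citation such as "ערר1170"
    with no separator would not be caught by this check.
    """
    return citation.strip().startswith(COMMITTEE_PREFIXES)


if __name__ == "__main__":
    for c in ("ערר 1170/24 חיים נ' ועדה", 'בל"מ 1234/25 פלוני', 'עע"מ 1234/20'):
        print(is_committee_citation(c))
```

Because the check runs before any file or database access, the test above can pass a nonexistent `file_path` and still expect the error envelope.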
|
||||
|
||||
|
||||
def test_practice_area_module_invariants() -> None:
|
||||
"""Quick guard that the ``practice_area`` service module exposes the
|
||||
helpers tools and tests depend on, and that derivation is consistent
|
||||
with the case-number convention (1xxx/8xxx/9xxx)."""
|
||||
|
||||
from legal_mcp.services import practice_area as pa
|
||||
|
||||
# Domain mapping is consistent with the case-number prefix convention.
|
||||
assert pa.derive_domain_practice_area("1170") == "rishuy_uvniya"
|
||||
assert pa.derive_domain_practice_area("8126/25") == "betterment_levy"
|
||||
assert pa.derive_domain_practice_area("9001") == "compensation_197"
|
||||
assert pa.derive_domain_practice_area("ARAR-25-8126") == "betterment_levy"
|
||||
# Unparseable input → empty (caller decides fallback).
|
||||
assert pa.derive_domain_practice_area("foo") == ""
|
||||
assert pa.derive_domain_practice_area("") == ""
|
||||
|
||||
# Empty practice_area is valid (DB allows it as 'unclassified').
|
||||
pa.validate("", "unknown")
|
||||
pa.validate("rishuy_uvniya", "building_permit")
|
||||
pa.validate("betterment_levy", "betterment_levy")
|
||||
|
||||
# appeals_committee (axis A) is still recognised for backward-compat.
|
||||
pa.validate("appeals_committee", "building_permit")
|
||||
|
||||
# is_override returns False when subtype matches derivation.
|
||||
assert pa.is_override("1170", "rishuy_uvniya", "building_permit") is False
|
||||
assert pa.is_override("8126", "betterment_levy", "betterment_levy") is False
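The assertions above pin down the observable behaviour of `derive_domain_practice_area` without showing its implementation. One way to satisfy those same cases is sketched below. This is an illustration only: `derive_domain_sketch`, the regex, and the lookup table are assumptions, and the real `practice_area` module may well work differently.

```python
import re

# Leading digit of the four-digit case number -> domain (1xxx/8xxx/9xxx convention).
_PREFIX_TO_DOMAIN = {
    "1": "rishuy_uvniya",
    "8": "betterment_levy",
    "9": "compensation_197",
}

# A run of exactly four digits, not part of a longer digit run.
_FOUR_DIGITS = re.compile(r"(?<!\d)\d{4}(?!\d)")


def derive_domain_sketch(case_number: str) -> str:
    """Return the domain for the first four-digit group, or '' if none applies."""
    match = _FOUR_DIGITS.search(case_number)
    if match is None:
        return ""
    return _PREFIX_TO_DOMAIN.get(match.group()[0], "")


if __name__ == "__main__":
    for raw in ("1170", "8126/25", "9001", "ARAR-25-8126", "foo", ""):
        print(repr(raw), "->", repr(derive_domain_sketch(raw)))
```

Matching only exact four-digit groups is what lets `"ARAR-25-8126"` skip the two-digit year and resolve on `8126`; inputs that contain another four-digit group first would need a stricter rule.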
|
||||
97
mcp-server/tests/test_precedent_corpus_isolation.py
Normal file
@@ -0,0 +1,97 @@
"""Regression test for GAP-10 / INV-RET1: corpus separation enforced on
EVERY precedent-library query path — including the halacha sub-query.

Bug: ``search_precedent_library_semantic`` and
``search_precedent_library_lexical`` filtered the *chunk* sub-query by
``cl.source_kind`` but NOT the *halacha* sub-query. So an external
(``source_kind='external_upload'``) search leaked internal-committee
halachot, and an internal search leaked external-ruling halachot — a
cross-corpus contamination of the rule-level results.

Fix: the same ``cl.source_kind = '<kind>'`` predicate that gates the
chunk query now also gates the halacha query, in BOTH functions.

This test runs fully OFFLINE — it monkeypatches ``db.get_pool`` with a
fake pool that captures every SQL string passed to ``fetch`` instead of
hitting Postgres. It asserts the captured halacha SQL carries the
source_kind predicate identical to the chunk SQL.
"""

from __future__ import annotations

import asyncio

import pytest

from legal_mcp.services import db


class _FakePool:
    """Captures SQL passed to ``fetch``; returns no rows."""

    def __init__(self) -> None:
        self.queries: list[str] = []

    async def fetch(self, sql: str, *args) -> list:  # noqa: ANN002
        self.queries.append(sql)
        return []


def _classify(queries: list[str]) -> tuple[str, str]:
    """Return (halacha_sql, chunk_sql) from the captured queries."""
    halacha = next(q for q in queries if "FROM halachot h" in q)
    chunk = next(q for q in queries if "FROM precedent_chunks pc" in q)
    return halacha, chunk


@pytest.fixture()
def fake_pool(monkeypatch: pytest.MonkeyPatch) -> _FakePool:
    pool = _FakePool()

    async def _get_pool() -> _FakePool:
        return pool

    monkeypatch.setattr(db, "get_pool", _get_pool)
    return pool


@pytest.mark.parametrize("source_kind", ["external_upload", "internal_committee"])
def test_semantic_halacha_query_is_source_kind_scoped(
    fake_pool: _FakePool, source_kind: str
) -> None:
    asyncio.run(
        db.search_precedent_library_semantic(
            query_embedding=[0.0] * 8,
            source_kind=source_kind,
            include_halachot=True,
            limit=5,
        )
    )
    halacha_sql, chunk_sql = _classify(fake_pool.queries)
    predicate = f"cl.source_kind = '{source_kind}'"
    assert predicate in chunk_sql, "chunk query must be source_kind-scoped (precondition)"
    assert predicate in halacha_sql, (
        "halacha query MUST carry the same source_kind predicate as the "
        "chunk query — otherwise cross-corpus halacha leakage (GAP-10)"
    )


@pytest.mark.parametrize("source_kind", ["external_upload", "internal_committee"])
def test_lexical_halacha_query_is_source_kind_scoped(
    fake_pool: _FakePool, source_kind: str
) -> None:
    asyncio.run(
        db.search_precedent_library_lexical(
            query="zoning setback",
            source_kind=source_kind,
            include_halachot=True,
            limit=5,
        )
    )
    halacha_sql, chunk_sql = _classify(fake_pool.queries)
    predicate = f"cl.source_kind = '{source_kind}'"
    assert predicate in chunk_sql, "chunk query must be source_kind-scoped (precondition)"
    assert predicate in halacha_sql, (
        "halacha query MUST carry the same source_kind predicate as the "
        "chunk query — otherwise cross-corpus halacha leakage (GAP-10)"
    )
|
||||
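The capture technique used by the test above can be sketched in isolation. This is a minimal illustration, not the project's code: `FakePool`, `search`, and `captured_queries` are hypothetical names, and the SQL strings are placeholders rather than the real queries in `legal_mcp.services.db`. It only shows how a fake pool that records SQL lets a test assert that two sub-queries share one predicate without touching Postgres.

```python
import asyncio


class FakePool:
    """Records every SQL string passed to fetch() and returns no rows."""

    def __init__(self) -> None:
        self.queries: list[str] = []

    async def fetch(self, sql: str, *args: object) -> list:
        self.queries.append(sql)
        return []


async def search(pool: FakePool, source_kind: str) -> None:
    """Hypothetical two-query search; stands in for the real db function."""
    predicate = f"cl.source_kind = '{source_kind}'"
    # Both sub-queries are built from the same predicate, which is the
    # property the regression test checks.
    await pool.fetch(f"SELECT 1 FROM halachot h JOIN case_law cl ON true WHERE {predicate}")
    await pool.fetch(f"SELECT 1 FROM precedent_chunks pc JOIN case_law cl ON true WHERE {predicate}")


def captured_queries(source_kind: str) -> list[str]:
    """Run the search against a fake pool and return the SQL it issued."""
    pool = FakePool()
    asyncio.run(search(pool, source_kind))
    return pool.queries
```

Because the fake pool returns an empty list, the function under test still runs to completion, and the assertions operate purely on the captured strings.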
@@ -12,6 +12,7 @@
|
||||
| `sync_missing_agent_skills.py` | python | "Fail-safe" script that adds `paperclipSkillSync` to `הגהת מסמכים` (document proofreader) and `מנתח משפטי` (legal analyst), which missed the historical sync (Gap #28). Supports `--verify`/`--dry-run`/`--apply`. Automatic backup to `agents-pre-skill-sync-*.sql`. Requires `PAPERCLIP_BOARD_API_KEY` (Infisical /paperclip in the nautilus env). Idempotent. | One-time (run 2026-05-04). Kept for reference |
| `sync_agents_across_companies.py` | python | **Syncs agents from CMP (1xxx, master) to CMPA (8xxx, mirror)** (Gap #25). Compares adapter_config (model/timeout/instructions/skills/etc), runtime_config (heartbeat), and top-level fields (budget/metadata/icon/title/role). Automatically filters out local skills that do not exist in the mirror. Uses subset logic (the mirror may hold more skills because the API adds required runtime skills). Supports `--verify`/`--dry-run`/`--apply [--only NAME]`. Automatic backup. Requires `PAPERCLIP_BOARD_API_KEY`. **Run after every settings change in CMP.** **⚠ If `adapter_type` differs between CMP and CMPA, the script skips the agent with a warning. When switching adapters (e.g. to `deepseek_local`), both companies must be updated manually before the sync.** | Manual, after every change |
| `fix_paperclipai_skills_drift.py` | python | One-time script (run 2026-05-04) that cleaned up drift in `paperclipai/*` skills between CMP and CMPA. Removed `paperclip-dev` from all 14 agents and ensured that `paperclip-converting-plans-to-tasks` exists only on the CEO and the analyst. Supports `--apply` (default: dry-run). Requires `PAPERCLIP_BOARD_API_KEY`. Kept for reference in case the drift recurs. | One-time (done) |
| `test_retrieval_by_name.py` | python | Retrieval-by-name test (#52/RC-A): verifies that `search_precedent_library`/`search_internal_decisions` rank the decision itself (אגסי) above the decisions that cite it, plus regressions for substantive queries. Run: `DOTENV_PATH=/home/chaim/.env DATA_DIR=.../data mcp-server/.venv/bin/python scripts/test_retrieval_by_name.py` (exit 0 = passed). | Manual, after any change to the search layer |
| `auto-sync-cases.sh` | bash | Syncs appeal case files to Gitea; runs every minute | `* * * * *` (cron) |
| `backup-db.sh` | bash | Daily PostgreSQL backup to `data/backups/` (gzip) | To be scheduled: `0 2 * * *` |
| `restore-db.sh` | bash | Restores the DB from a backup (companion to backup-db.sh) | Manual |
|
||||
@@ -28,6 +29,14 @@
|
||||
| `voyage_rerank_corpus_poc.py` | python | POC #5: voyage-3 vs rerank-2 on the full corpus (785 docs). Verdict: +4.5% mean@3 overall, +11.6% on P queries (practical) | One-time benchmark; approved stage B |
| `multimodal_backfill.py` | python | Backfills voyage-multimodal-3 page embeddings for existing case documents. Idempotent (skips by default), forces `MULTIMODAL_ENABLED=true` for the run, runs from the container. Stage C; see `docs/voyage-upgrades-plan.md` | Manual, per case (`python multimodal_backfill.py 8174-24 8137-24`) |
| `backfill_chunk_pages.py` | python | Backfills `page_number` in existing `document_chunks`. The legacy chunker did not track pages, so `page_number=NULL` blocks the multimodal hybrid boost (text+image join on the same page). Re-extracts every PDF (re-OCR if needed, ~$0.0015/page), computes page_offsets, and updates the chunks. Idempotent | Manual, per case (`python backfill_chunk_pages.py 8174-24 8137-24`) |
| `audit_corpus_integrity.py` | python | Periodic corpus-consistency check: 3 read-only SQL checks on `case_law` and `cases`: (A) `external_upload` rows with an internal prefix `ערר`/`בל"מ`; (B) `internal_committee` rows missing `chair_name`/`district`; (C) `cases.practice_area` outside {`rishuy_uvniya`, `betterment_levy`, `compensation_197`, `''`}. Appends to a cumulative log at `data/logs/corpus_integrity_audit.log` and, when violations are found, sends a wakeup to the CEO in Paperclip (best-effort, only if `PAPERCLIP_API_URL`+`PAPERCLIP_API_KEY` are set). Flag: `--no-notify`. Idempotent, exits 0. **Daily cron at 07:00**: `0 7 * * * /home/chaim/legal-ai/mcp-server/.venv/bin/python /home/chaim/legal-ai/scripts/audit_corpus_integrity.py` | `0 7 * * *` (cron) |
| `backfill_legal_arguments.py` | python | Backfills `legal_arguments` for cases with existing `claims` (TaskMaster #36). Groups raw propositions into distinct legal arguments (~6-12 per side) via `argument_aggregator.aggregate_claims_to_arguments` (Claude CLI). Dry-run by default; supports `--apply`/`--force`/`--case <num>...`. **Must run from the local machine** (not the container): `claude_session` requires the Claude CLI | Manual, per case (`python scripts/backfill_legal_arguments.py --apply --case 1017-03-26`) |
| `upload_blam_decisions.py` | python | One-time (2026-05-26): uploaded 2 בל"מ decisions to `case_law` (8126/24 סופר נוח, 8047/23 הרנון) by calling `ingest_internal_decision` directly, bypassing the MCP server, which had not yet been reloaded after `proceeding_type` was added. **Do not run again** | One-time; move to `.archive/` when convenient |
| `process_pending_blam.py` | python | One-time (2026-05-26): ran metadata + halacha extraction on the 2 בל"מ decisions uploaded by `upload_blam_decisions.py`. Bypasses MCP (same reason). **Do not run again** | One-time; move to `.archive/` when convenient |
| `compute_ndcg.py` | python | Computes nDCG@10 over `search_relevance_feedback` (TaskMaster #50, Stage C). Aggregates by `search_type` and by week, including top-cited case_law and coverage %. Flags: `--k 10`, `--weeks 12`, `--pretty`. Read-only, JSON output. Also used by `GET /api/admin/rag-metrics` (imported inline), so a signature change in `compute()` will break the endpoint | Manual / future cron for weekly reporting |
| `backfill_multimodal_precedents.py` | python | Backfills voyage-multimodal-3 page embeddings for `case_law` rows (external_upload + internal_committee) that lack `precedent_image_embeddings`. Builds a file index from `data/precedent-library/` and `data/internal-decisions/` and tries to match on case-number tokens (including parts-match for the various Nevo doc-id formats). Skips rows with no source file or with MD only (PyMuPDF does not render MD). Supports `--dry-run` (default) / `--apply` / `--only external_upload\|internal_committee` / `--limit N`. Runs in the container (has `/data` + Voyage env). **Run on 2026-05-26**: 70 missing → 26 backfilled (503 pages, ~$0.21 voyage tokens), 44 with no source file. Can be re-run after more PDF/DOCX files are uploaded to the library | Manual |
| `monitor_halacha_quality.py` | python | Monitors halacha-extraction quality. Checks drift of `avg(confidence)` between a historical baseline and the most recent window. Returns JSON metrics + an alert on stderr if drift > threshold (default 5%). 2 series: trusted (approved+published) and all_extracted. Supports `--window N` / `--threshold X` / `--min-sample N` / `--silent` / `--exit-on-alert`. Runs in the container or locally with `mcp-server/.venv` (no LLM dependency, SQL only). **Recommended schedule**: `0 8 * * 1` (Mondays 08:00, weekly) | `0 8 * * 1` (to be scheduled) |
| `audit_training_corpus.py` | python | Audit of `style_corpus`: for each decision, reports which metadata fields are populated (`summary`/`outcome`/`key_principles`/`appeal_subtype`/`subject_categories`) and its link to `documents` (FK + chunks + embeddings). Produces `data/audit/corpus-YYYY-MM-DD.json` + a console summary. Requires `POSTGRES_URL` or POSTGRES_*. No external dependencies besides asyncpg. **Runs from the local machine** (not the container): direct connection to Postgres :5433 | Manual / preparatory step before metadata enrichment |
|
||||
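The `compute_ndcg.py` row above refers to nDCG@10. As a reminder of what that metric measures, here is a minimal sketch of the textbook formula, assuming linear gain and a log2 rank discount. The function names are illustrative, and the real `compute()` in the script may use a different gain function or aggregation.

```python
import math


def dcg_at_k(relevances: list[float], k: int = 10) -> float:
    """Discounted cumulative gain over the first k results (linear gain)."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances[:k], start=1)
    )


def ndcg_at_k(relevances: list[float], k: int = 10) -> float:
    """DCG normalised by the DCG of the ideal (descending) ordering."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0
```

A ranking that already lists the most relevant results first scores 1.0; a query with no relevant feedback at all scores 0.0 under this convention.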
|
||||
## `.archive/` directory: completed scripts
|
||||
|
||||
|
||||
281
scripts/audit_corpus_integrity.py
Normal file
@@ -0,0 +1,281 @@
"""Periodic corpus-integrity audit.

Runs a set of read-only SQL checks against the legal-ai DB to detect rows
that violate domain constraints which are *not* enforced by the schema
(or were added after the constraint was put in place).

Checks performed:

  A. ``case_law`` rows with ``source_kind='external_upload'`` whose
     ``case_number`` starts with the Hebrew prefixes ``ערר`` / ``בל"מ``.
     Internal committee decisions belong to ``source_kind='internal_committee'``.

  B. ``case_law`` rows with ``source_kind='internal_committee'`` that
     lack a ``chair_name`` and/or ``district``. Internal decisions must
     carry both.

  C. ``cases`` rows with a ``practice_area`` outside the closed set
     {``rishuy_uvniya``, ``betterment_levy``, ``compensation_197``, ``''``}.

Output:

  * Appends a timestamped block to ``data/logs/corpus_integrity_audit.log``.
  * If hits are found AND env ``PAPERCLIP_API_URL`` + ``PAPERCLIP_API_KEY``
    are set, posts a CEO wakeup comment via ``POST /api/agents/{ceo}/wakeup``
    (best-effort, never fails the script).
  * Always exits 0 unless an unexpected error occurs (so cron stays quiet).

Cron suggestion (daily 07:00):

    0 7 * * * /home/chaim/legal-ai/mcp-server/.venv/bin/python \\
        /home/chaim/legal-ai/scripts/audit_corpus_integrity.py

Idempotent. Read-only on the DB.
"""
from __future__ import annotations

import argparse
import asyncio
import logging
import os
import sys
from datetime import datetime, timezone
from pathlib import Path

# Load ~/.env so POSTGRES_* / PAPERCLIP_* are picked up when run from cron.
ENV_PATH = os.path.expanduser("~/.env")
if os.path.isfile(ENV_PATH):
    with open(ENV_PATH, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                k, v = line.split("=", 1)
                os.environ.setdefault(k, v)

import asyncpg  # noqa: E402

try:
    import httpx  # noqa: E402
except ImportError:  # httpx is part of the legal-ai venv; not required for DB checks
    httpx = None  # type: ignore[assignment]


REPO_ROOT = Path(__file__).resolve().parent.parent
LOG_PATH = REPO_ROOT / "data" / "logs" / "corpus_integrity_audit.log"

CHECK_A_SQL = (
    "SELECT id, case_number FROM case_law "
    "WHERE source_kind = 'external_upload' AND case_number ~ '^ערר|^בל\"מ' "
    "ORDER BY case_number"
)
CHECK_B_SQL = (
    "SELECT id, case_number, chair_name, district FROM case_law "
    "WHERE source_kind = 'internal_committee' "
    "AND (chair_name IS NULL OR chair_name = '' "
    " OR district IS NULL OR district = '') "
    "ORDER BY case_number"
)
CHECK_C_SQL = (
    "SELECT id, case_number, practice_area FROM cases "
    "WHERE practice_area IS NOT NULL "
    "AND practice_area NOT IN ('rishuy_uvniya', 'betterment_levy', "
    " 'compensation_197', '') "
    "ORDER BY case_number"
)


logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s",
)
logger = logging.getLogger("audit_corpus_integrity")


def _pg_url() -> str:
    """Resolve POSTGRES URL from env, falling back to discrete vars."""
    url = os.environ.get("POSTGRES_URL")
    if url:
        return url
    pg_host = os.environ.get("POSTGRES_HOST", "127.0.0.1")
    pg_port = int(os.environ.get("POSTGRES_PORT", "5433"))
    pg_user = os.environ.get("POSTGRES_USER", "legal_ai")
    pg_pw = os.environ.get("POSTGRES_PASSWORD", "")
    pg_db = os.environ.get("POSTGRES_DB", "legal_ai")
    if not pg_pw:
        raise SystemExit("POSTGRES_PASSWORD / POSTGRES_URL not set")
    return f"postgres://{pg_user}:{pg_pw}@{pg_host}:{pg_port}/{pg_db}"


async def _run_check(conn: asyncpg.Connection, sql: str) -> list[dict]:
    rows = await conn.fetch(sql)
    return [dict(r) for r in rows]


async def _resolve_ceo_agent_id() -> str | None:
    """Best-effort: look up the CEO agent UUID for CMP via the API.

    Returns None if PAPERCLIP env is missing or the lookup fails.
    """
    base_url = os.environ.get("PAPERCLIP_API_URL")
    api_key = os.environ.get("PAPERCLIP_API_KEY")
    if not (base_url and api_key and httpx is not None):
        return None
    try:
        async with httpx.AsyncClient(timeout=5.0) as client:
            r = await client.get(
                f"{base_url}/api/agents",
                headers={"Authorization": f"Bearer {api_key}"},
            )
            r.raise_for_status()
            payload = r.json()
            items = payload if isinstance(payload, list) else payload.get("items", [])
            for item in items:
                # Look for a CMP-side CEO (master); the CMPA mirror has a different id.
                title = (item.get("title") or "").lower()
                role = (item.get("role") or "").lower()
                if "ceo" in title or "ceo" in role or "מנכ" in title:
                    return item.get("id")
    except Exception as e:
        logger.warning("CEO lookup failed: %s", e)
    return None


async def _notify_ceo(summary: str) -> bool:
    """Post a wakeup comment to the CEO agent. Returns True on best-effort success."""
    base_url = os.environ.get("PAPERCLIP_API_URL")
    api_key = os.environ.get("PAPERCLIP_API_KEY")
    if not (base_url and api_key and httpx is not None):
        logger.info("Paperclip env not set — skipping CEO wakeup")
        return False
    ceo_id = await _resolve_ceo_agent_id()
    if not ceo_id:
        logger.info("Could not resolve CEO agent id — skipping wakeup")
        return False
    try:
        async with httpx.AsyncClient(timeout=5.0) as client:
            r = await client.post(
                f"{base_url}/api/agents/{ceo_id}/wakeup",
                headers={
                    "Authorization": f"Bearer {api_key}",
                    "Content-Type": "application/json",
                },
                json={
                    "source": "automation",
                    "triggerDetail": "audit_corpus_integrity",
                    "reason": "corpus integrity audit found violations",
                    "payload": {"summary": summary},
                },
            )
            r.raise_for_status()
            logger.info("Notified CEO (agent_id=%s)", ceo_id)
            return True
    except Exception as e:
        logger.warning("CEO wakeup failed: %s", e)
        return False


def _format_report(
    a_hits: list[dict],
    b_hits: list[dict],
    c_hits: list[dict],
    ts: datetime,
) -> str:
    parts: list[str] = []
    parts.append(f"=== Corpus integrity audit @ {ts.isoformat()} ===")
    parts.append("")
    parts.append(
        f"Check A (case_law external_upload with internal-style "
        f"case_number prefix): {len(a_hits)} hit(s)"
    )
    for row in a_hits[:50]:
        parts.append(f" - id={row['id']} case_number={row['case_number']!r}")
    if len(a_hits) > 50:
        parts.append(f" ... ({len(a_hits) - 50} more truncated)")
    parts.append("")
    parts.append(
        f"Check B (case_law internal_committee missing chair_name/district): "
        f"{len(b_hits)} hit(s)"
    )
    for row in b_hits[:50]:
        parts.append(
            f" - id={row['id']} case_number={row['case_number']!r} "
            f"chair_name={row.get('chair_name')!r} district={row.get('district')!r}"
        )
    if len(b_hits) > 50:
        parts.append(f" ... ({len(b_hits) - 50} more truncated)")
    parts.append("")
    parts.append(
        f"Check C (cases.practice_area outside closed set): {len(c_hits)} hit(s)"
    )
    for row in c_hits[:50]:
        parts.append(
            f" - id={row['id']} case_number={row['case_number']!r} "
            f"practice_area={row.get('practice_area')!r}"
        )
    if len(c_hits) > 50:
        parts.append(f" ... ({len(c_hits) - 50} more truncated)")
    parts.append("")
    return "\n".join(parts)


async def main(args: argparse.Namespace) -> int:
    pg_url = _pg_url()
    conn = await asyncpg.connect(pg_url)
    try:
        a_hits = await _run_check(conn, CHECK_A_SQL)
        b_hits = await _run_check(conn, CHECK_B_SQL)
        c_hits = await _run_check(conn, CHECK_C_SQL)
    finally:
        await conn.close()

    total = len(a_hits) + len(b_hits) + len(c_hits)
    ts = datetime.now(timezone.utc)
    report = _format_report(a_hits, b_hits, c_hits, ts)

    # Always write to log (creates dir + file if missing).
    LOG_PATH.parent.mkdir(parents=True, exist_ok=True)
    with LOG_PATH.open("a", encoding="utf-8") as f:
        f.write(report)
        f.write("\n")

    # Echo to stdout so cron mail / manual run shows the result.
    print(report)

    if total == 0:
        logger.info("clean: no integrity violations found")
        return 0

    logger.warning(
        "found %d total violation(s) (A=%d, B=%d, C=%d)",
        total, len(a_hits), len(b_hits), len(c_hits),
    )

    if args.notify:
        summary_lines = [
            "ה-audit היומי על הקורפוס מצא הפרות:",
            f"- Check A (external_upload עם prefix פנימי): {len(a_hits)}",
            f"- Check B (internal_committee חסר chair/district): {len(b_hits)}",
            f"- Check C (cases.practice_area לא תקין): {len(c_hits)}",
            "",
            f"פירוט מלא: {LOG_PATH}",
        ]
        await _notify_ceo("\n".join(summary_lines))

    return 0


if __name__ == "__main__":
    parser = argparse.ArgumentParser(description=__doc__)
    parser.add_argument(
        "--no-notify",
        dest="notify",
        action="store_false",
        help="Don't post a CEO wakeup even if hits are found",
    )
    parser.set_defaults(notify=True)
    args = parser.parse_args()
    try:
        rc = asyncio.run(main(args))
    except KeyboardInterrupt:
        sys.exit(130)
    sys.exit(rc)
|
||||
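Check A in the script above is a SQL regex match on `case_number`. The same rule can be sketched in plain Python, for example to unit-test the routing logic without a database. This is an illustration only: `INTERNAL_PREFIX` and `misfiled_external` are hypothetical names, and the rows are assumed to be dicts shaped like the `case_law` columns the SQL selects.

```python
import re

# Mirrors the SQL pattern '^ערר|^בל"מ': a case number that starts with one of
# the internal-committee prefixes.
INTERNAL_PREFIX = re.compile(r'^(?:ערר|בל"מ)')


def misfiled_external(rows: list[dict]) -> list[dict]:
    """Return external_upload rows whose case_number carries an internal prefix."""
    return [
        row
        for row in rows
        if row["source_kind"] == "external_upload"
        and INTERNAL_PREFIX.match(row["case_number"] or "")
    ]
```

Rows with a NULL `case_number` are treated as non-matching here, which matches how the SQL `~` operator behaves on NULL input.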
196
scripts/audit_training_corpus.py
Executable file
@@ -0,0 +1,196 @@
#!/usr/bin/env python
"""Audit the style_corpus table — list each decision with what's populated and what's missing.

Produces a JSON report at data/audit/corpus-YYYY-MM-DD.json so we can see at a glance
which corpus entries lack summary/outcome/key_principles/appeal_subtype/chunks/embeddings.

Run with the mcp-server venv (has asyncpg):
    POSTGRES_URL=postgres://... ./mcp-server/.venv/bin/python scripts/audit_training_corpus.py

Without POSTGRES_URL, falls back to the per-field env vars used by web/mcp-server config.
"""
from __future__ import annotations

import asyncio
import json
import os
import re
import sys
from datetime import UTC, date, datetime
from pathlib import Path

import asyncpg


def _build_dsn() -> str:
    if url := os.environ.get("POSTGRES_URL"):
        return url
    return (
        f"postgres://{os.environ.get('POSTGRES_USER', 'legal_ai')}:"
        f"{os.environ.get('POSTGRES_PASSWORD', '')}@"
        f"{os.environ.get('POSTGRES_HOST', '127.0.0.1')}:"
        f"{os.environ.get('POSTGRES_PORT', '5433')}/"
        f"{os.environ.get('POSTGRES_DB', 'legal_ai')}"
    )


async def audit() -> dict:
    dsn = _build_dsn()
    conn = await asyncpg.connect(dsn)
    try:
        rows = await conn.fetch(
            """
            SELECT id, decision_number, decision_date, subject_categories,
                   length(full_text) AS chars,
                   summary,
                   outcome,
                   key_principles,
                   practice_area,
                   appeal_subtype,
                   document_id,
                   created_at
            FROM style_corpus
            ORDER BY decision_date NULLS LAST, decision_number
            """
        )

        # Chunk + embedding counts for each related document — by direct FK first,
        # then by title-match for legacy rows where style_corpus.document_id is NULL.
        chunk_counts = await conn.fetch(
            """
            SELECT d.id AS doc_id, d.title,
                   count(c.id) AS chunks,
                   count(c.embedding) FILTER (WHERE c.embedding IS NOT NULL) AS chunks_with_emb
            FROM documents d
            LEFT JOIN document_chunks c ON c.document_id = d.id
            WHERE d.title LIKE '[קורפוס]%' OR d.id IN (SELECT document_id FROM style_corpus WHERE document_id IS NOT NULL)
            GROUP BY d.id, d.title
            """
        )

    finally:
        await conn.close()

    by_doc_id = {r["doc_id"]: r for r in chunk_counts}

    # Index corpus documents by every digit cluster in their title so we can
    # match against style_corpus.decision_number regardless of formatting
    # (e.g. style_corpus has "1109-25" but title may say "ARAR-25-1109" or
    # "ערר 1109-25"). Each digit run >=3 chars becomes a key.
    by_digit: dict[str, dict] = {}
    for r in chunk_counts:
        title = r["title"] or ""
        for tok in re.findall(r"\d{3,}", title):
            by_digit.setdefault(tok, r)

    decisions = []
    gaps_total = {
        "summary": 0, "outcome": 0, "key_principles": 0,
        "appeal_subtype": 0, "subject_categories": 0,
        "chunks": 0, "embeddings": 0, "document_id": 0,
    }

    for row in rows:
        cats = row["subject_categories"]
        if isinstance(cats, str):
            try:
                cats = json.loads(cats)
            except json.JSONDecodeError:
                cats = []
        cats = cats or []

        kp = row["key_principles"]
        if isinstance(kp, str):
            try:
                kp = json.loads(kp)
            except json.JSONDecodeError:
                kp = []
        kp = kp or []

        # Resolve chunks: prefer FK, fall back to digit-cluster match on decision_number.
        chunks = 0
        chunks_with_emb = 0
        if row["document_id"] and row["document_id"] in by_doc_id:
            r = by_doc_id[row["document_id"]]
            chunks = r["chunks"]
            chunks_with_emb = r["chunks_with_emb"]
        elif row["decision_number"]:
            for tok in re.findall(r"\d{3,}", row["decision_number"]):
                if tok in by_digit:
                    r = by_digit[tok]
                    chunks = r["chunks"]
                    chunks_with_emb = r["chunks_with_emb"]
                    break

        missing = []
        if not row["summary"]:
            missing.append("summary")
            gaps_total["summary"] += 1
        if not row["outcome"]:
            missing.append("outcome")
            gaps_total["outcome"] += 1
        if not kp:
            missing.append("key_principles")
            gaps_total["key_principles"] += 1
        if not row["appeal_subtype"]:
            missing.append("appeal_subtype")
            gaps_total["appeal_subtype"] += 1
        if not cats:
            missing.append("subject_categories")
            gaps_total["subject_categories"] += 1
        if chunks == 0:
            missing.append("chunks")
            gaps_total["chunks"] += 1
        elif chunks_with_emb < chunks:
            missing.append(f"embeddings({chunks_with_emb}/{chunks})")
            gaps_total["embeddings"] += 1
        if row["document_id"] is None:
            missing.append("document_id")
            gaps_total["document_id"] += 1

        decisions.append({
            "id": str(row["id"]),
            "decision_number": row["decision_number"] or "",
            "decision_date": row["decision_date"].isoformat() if row["decision_date"] else None,
            "chars": row["chars"],
            "subject_categories": cats,
            "practice_area": row["practice_area"] or "",
            "appeal_subtype": row["appeal_subtype"] or "",
            "summary_len": len(row["summary"] or ""),
            "outcome_len": len(row["outcome"] or ""),
            "key_principles_count": len(kp),
            "chunks": chunks,
            "chunks_with_embeddings": chunks_with_emb,
            "document_id": str(row["document_id"]) if row["document_id"] else None,
            "missing": missing,
            "created_at": row["created_at"].isoformat() if row["created_at"] else None,
        })

    return {
        "generated_at": datetime.now(UTC).isoformat(),
        "total_decisions": len(decisions),
        "gaps_total": gaps_total,
        "decisions": decisions,
    }


async def main() -> int:
    report = await audit()
    out_dir = Path(__file__).resolve().parents[1] / "data" / "audit"
    out_dir.mkdir(parents=True, exist_ok=True)
    today = date.today().isoformat()
    out_file = out_dir / f"corpus-{today}.json"
    out_file.write_text(json.dumps(report, ensure_ascii=False, indent=2), encoding="utf-8")

    # Console summary
    print(f"Total decisions: {report['total_decisions']}")
    print("Gaps by field (count of decisions missing it):")
    for field, n in report["gaps_total"].items():
        bar = "█" * min(n, 60)
        print(f" {field:25s} {n:3d} {bar}")
    print(f"\nReport written to {out_file}")
    return 0


if __name__ == "__main__":
    sys.exit(asyncio.run(main()))
|
||||
164
scripts/backfill_legal_arguments.py
Executable file
@@ -0,0 +1,164 @@
#!/usr/bin/env python3
"""Backfill aggregated legal_arguments for existing cases.

For every case that has rows in ``claims`` but none in ``legal_arguments``,
run ``argument_aggregator.aggregate_claims_to_arguments``.

Usage (must use mcp-server venv — pgvector + asyncpg are vendored there):
    PY=/home/chaim/legal-ai/mcp-server/.venv/bin/python

    # Default = dry-run (lists what would be processed):
    $PY scripts/backfill_legal_arguments.py

    # Process all cases that need it:
    $PY scripts/backfill_legal_arguments.py --apply

    # Re-aggregate even cases that already have arguments:
    $PY scripts/backfill_legal_arguments.py --apply --force

    # Only process specific cases:
    $PY scripts/backfill_legal_arguments.py --apply --case 1017-03-26 1018-03-26

The script must run from the local dev machine (not the container) because
``argument_aggregator`` calls ``claude_session`` which needs the Claude CLI.
"""

from __future__ import annotations

import argparse
import asyncio
import os
import sys
from pathlib import Path
from uuid import UUID

# Make the mcp-server source importable as ``legal_mcp``.
REPO_ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(REPO_ROOT / "mcp-server" / "src"))

# Default DB connection (overridable via env / .env on the dev box).
if "POSTGRES_URL" not in os.environ:
    pg_user = os.environ.get("POSTGRES_USER", "legal_ai")
    pg_pw = os.environ.get("POSTGRES_PASSWORD", "")
    pg_host = os.environ.get("POSTGRES_HOST", "127.0.0.1")
    pg_port = os.environ.get("POSTGRES_PORT", "5433")
    pg_db = os.environ.get("POSTGRES_DB", "legal_ai")
    os.environ["POSTGRES_URL"] = (
        f"postgres://{pg_user}:{pg_pw}@{pg_host}:{pg_port}/{pg_db}"
    )


async def _list_cases_needing_backfill(force: bool) -> list[dict]:
    """Find cases that have claims but no aggregated arguments (or all,
    when ``force`` is True)."""
    from legal_mcp.services import db

    pool = await db.get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            """
            SELECT c.id, c.case_number, c.status,
                   COUNT(DISTINCT cl.id) AS claim_count,
                   COUNT(DISTINCT la.id) AS arg_count
            FROM cases c
            LEFT JOIN claims cl ON cl.case_id = c.id
            LEFT JOIN legal_arguments la ON la.case_id = c.id
            WHERE c.archived_at IS NULL
            GROUP BY c.id, c.case_number, c.status
            HAVING COUNT(DISTINCT cl.id) > 0
            ORDER BY c.case_number
            """
        )
    out: list[dict] = []
    for r in rows:
        d = dict(r)
        if force or d["arg_count"] == 0:
            out.append(d)
    return out


async def _process_case(case: dict, force: bool) -> dict:
    from legal_mcp.services import argument_aggregator

    case_id = UUID(str(case["id"]))
    case_number = case["case_number"]
    print(
        f"[backfill] {case_number}: {case['claim_count']} claims, "
        f"{case['arg_count']} existing args — aggregating (force={force})...",
        flush=True,
    )
    try:
        result = await argument_aggregator.aggregate_claims_to_arguments(
            case_id, force=force,
        )
    except Exception as e:  # noqa: BLE001
        return {
            "case_number": case_number,
            "status": "error",
            "error": str(e),
        }
    print(
        f"[backfill] {case_number}: status={result.get('status')} "
        f"total={result.get('total')} by_party={result.get('by_party')}",
        flush=True,
    )
    return {"case_number": case_number, **result}


async def main() -> int:
    parser = argparse.ArgumentParser(
        description="Backfill legal_arguments for cases with extracted claims.",
    )
    parser.add_argument(
        "--apply", action="store_true",
        help="Actually run aggregation (default: dry-run).",
    )
    parser.add_argument(
        "--force", action="store_true",
        help="Re-aggregate even cases that already have arguments.",
    )
    parser.add_argument(
        "--case", nargs="*", default=[],
        help="Only process these case numbers (e.g. --case 1017-03-26 1018-03-26).",
    )
    args = parser.parse_args()

    cases = await _list_cases_needing_backfill(force=args.force)
    if args.case:
        wanted = set(args.case)
        cases = [c for c in cases if c["case_number"] in wanted]

    if not cases:
        print("[backfill] No cases need processing.")
        return 0

    print(f"[backfill] {len(cases)} case(s) to process:")
    for c in cases:
        print(
            f" - {c['case_number']:<14} status={c['status']:<20} "
            f"claims={c['claim_count']:<4} args={c['arg_count']}",
        )

    if not args.apply:
        print("\n[backfill] dry-run — pass --apply to actually run.")
        return 0

    print()
    results: list[dict] = []
    for case in cases:
        r = await _process_case(case, force=args.force)
        results.append(r)

    print("\n[backfill] === Summary ===")
    for r in results:
        print(
            f" {r['case_number']:<14} status={r.get('status', 'unknown'):<22} "
            f"total={r.get('total', 0)}",
        )

    errors = [r for r in results if r.get("status") == "error"]
    return 1 if errors else 0


if __name__ == "__main__":
    sys.exit(asyncio.run(main()))
|
||||
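The script above defaults to a dry run and only executes when `--apply` is passed, then exits non-zero if any case failed. A minimal sketch of that gating pattern is shown below. The names `build_parser`, `plan`, and `exit_code` are hypothetical and introduced only for illustration; the real script also queries the database for candidate cases, which is omitted here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Same three flags as the backfill script; everything else is simplified.
    parser = argparse.ArgumentParser(description="dry-run-by-default sketch")
    parser.add_argument("--apply", action="store_true")
    parser.add_argument("--force", action="store_true")
    parser.add_argument("--case", nargs="*", default=[])
    return parser


def plan(argv: list, cases: list) -> tuple:
    """Return (selected_cases, will_run) for the given command line."""
    args = build_parser().parse_args(argv)
    if args.case:
        wanted = set(args.case)
        cases = [c for c in cases if c["case_number"] in wanted]
    # Without --apply nothing is executed: the caller only prints the plan.
    return cases, args.apply


def exit_code(results: list) -> int:
    """Non-zero when at least one per-case result reports an error."""
    return 1 if any(r.get("status") == "error" for r in results) else 0
```

With this shape, forgetting a flag can only produce a listing, never a write, which is the safer failure mode for a backfill.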
475
scripts/backfill_multimodal_precedents.py
Normal file
@@ -0,0 +1,475 @@
"""Multimodal backfill for the precedent library: fills voyage-multimodal-3
page embeddings for case_law rows (external_upload + internal_committee)
that don't have them yet.

Background
----------
Roughly 77 case_law rows (70 as of 2026-05-26) were ingested before
``MULTIMODAL_ENABLED=true`` was permanently turned on, so they have only
text chunks and no per-page image embeddings. Retrieval blends text and
image scores, so the image side of the blend silently degrades for these
rows.

Strategy
--------
Most rows have no PDF (they were ingested via text or are MD-only). The
script:

1. Lists every case_law row with ``source_kind in (external_upload,
   internal_committee)`` that is missing image embeddings.
2. Tries to find a staged file by matching token-rich substrings of the
   case_number against filenames under ``data/precedent-library/`` and
   ``data/internal-decisions/``.
3. If the file is a PDF or DOCX (both renderable by PyMuPDF/fitz),
   renders pages at ``MULTIMODAL_DPI``, embeds via voyage-multimodal-3
   in batches of 50, and stores rows into ``precedent_image_embeddings``.
4. Skips rows whose only candidate file is .md (PyMuPDF can't render
   markdown) or rows with no staged file.

Designed to run inside the FastAPI/MCP container (where ``/data/...``
exists and Voyage env vars are present). Locally, it falls back to
``/home/chaim/legal-ai/data/...`` via the host-path entries in
``SEARCH_ROOTS``.

Usage::

    # Inside container (Coolify):
    docker exec -it <container> /opt/api/.venv/bin/python \\
        /opt/api/scripts/backfill_multimodal_precedents.py --dry-run
    # then:
    docker exec -it <container> /opt/api/.venv/bin/python \\
        /opt/api/scripts/backfill_multimodal_precedents.py --apply

Notes
-----
- Token cost: voyage-multimodal-3 averages ~3-4K tokens per dense legal
  page. 70 rows * ~30 pages avg = ~2,100 pages = ~7M tokens, or about
  $0.84 at the $0.12 per 1M tokens rate used by the ``--dry-run``
  estimate.
- Estimate-only mode (``--dry-run``) prints the matched files and
  page counts without calling Voyage or touching the DB.
- Idempotent: ``store_precedent_image_embeddings`` does a per-record
  DELETE+INSERT, and the outer loop also skips rows that already have
  rows in ``precedent_image_embeddings``.
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import asyncio
|
||||
import logging
|
||||
import os
|
||||
import re
|
||||
import sys
|
||||
import time
|
||||
from pathlib import Path
|
||||
from uuid import UUID
|
||||
|
||||
import fitz # PyMuPDF
|
||||
|
||||
|
||||
def _setup_paths():
|
||||
"""Ensure mcp-server src is on path even when run as a standalone script.
|
||||
|
||||
Works both from host (``/home/chaim/legal-ai/scripts/...``) and from
|
||||
inside the container (``/app/mcp-server/src``).
|
||||
"""
|
||||
here = Path(__file__).resolve().parent
|
||||
candidates = [
|
||||
here.parent / "mcp-server" / "src", # host
|
||||
Path("/app/mcp-server/src"), # container
|
||||
]
|
||||
for c in candidates:
|
||||
if c.is_dir() and str(c) not in sys.path:
|
||||
sys.path.insert(0, str(c))
|
||||
|
||||
|
||||
_setup_paths()
|
||||
# Force multimodal on for this script regardless of env — backfill is
|
||||
# the entire point. The deploy-time default stays whatever Coolify sets.
|
||||
os.environ["MULTIMODAL_ENABLED"] = "true"
|
||||
|
||||
from legal_mcp import config # noqa: E402
|
||||
from legal_mcp.services import db, embeddings, extractor # noqa: E402
|
||||
|
||||
logging.basicConfig(
|
||||
level=logging.INFO,
|
||||
format="%(asctime)s [%(levelname)s] %(message)s",
|
||||
)
|
||||
logger = logging.getLogger("backfill_multimodal_precedents")
|
||||
|
||||
|
||||
# ───────────────────────── file matching ─────────────────────────
|
||||
|
||||
# Roots to search for staged precedent files. Both paths are tried; the
|
||||
# first that exists wins. ``/data/`` is the in-container mount;
|
||||
# ``/home/chaim/legal-ai/data/`` is the host path.
|
||||
SEARCH_ROOTS = [
|
||||
Path("/data/precedent-library"),
|
||||
Path("/data/internal-decisions"),
|
||||
Path("/home/chaim/legal-ai/data/precedent-library"),
|
||||
Path("/home/chaim/legal-ai/data/internal-decisions"),
|
||||
]
|
||||
|
||||
# Extensions we can render with PyMuPDF (fitz). MD and TXT cannot be
|
||||
# rendered as page images, so we skip them.
|
||||
RENDERABLE_EXTS = {".pdf", ".docx"}
|
||||
|
||||
|
||||
# Token-extraction regex: only tokens that contain a slash or hyphen
|
||||
# (real case-number kernels like "8064/20" or "25226-04-25"). We
|
||||
# deliberately exclude pure numeric runs like "2011" (which is just a
|
||||
# year in "(נבו 5.4.2011)") to avoid false-positive matches against
|
||||
# unrelated filenames that happen to contain the same year.
|
||||
_NUMBER_TOKEN = re.compile(r"\d+[-/]\d+(?:[-/]\d+)*")
|
||||
|
||||
|
||||
def _extract_number_tokens(case_number: str) -> list[str]:
    """Pull numeric kernels out of a Hebrew case_number string.

    Only returns tokens containing a slash or hyphen (real case-number
    kernels), so years like "2011" and "2024" don't leak through and
    falsely match filenames. Dotted dates such as "5.4.2011" never match
    ``_NUMBER_TOKEN`` at all, because "." is not an accepted separator.

    >>> _extract_number_tokens('בר"מ 25226-04-25 הוועדה')
    ['25226-04-25']
    >>> _extract_number_tokens('ערר 8064/20 חברת')
    ['8064/20']
    >>> _extract_number_tokens('עע"מ 10089/07 (נבו 5.4.2011)')
    ['10089/07']
    """
    # Defensive filter: the regex cannot produce dotted tokens, but keep
    # only the N/N or N-N-... forms explicitly in case it is ever widened.
    tokens = _NUMBER_TOKEN.findall(case_number)
    return [t for t in tokens if "." not in t]
|
||||
|
||||
|
||||
def _normalize_for_match(s: str) -> str:
|
||||
"""Lowercase + strip whitespace/punct for filename matching."""
|
||||
return re.sub(r"[\s/_-]+", "", s.lower())
|
||||
|
||||
|
||||
def _build_file_index() -> dict[str, list[Path]]:
|
||||
"""Walk SEARCH_ROOTS and return {normalized_filename: [paths]}.
|
||||
|
||||
Only renderable extensions are included.
|
||||
"""
|
||||
idx: dict[str, list[Path]] = {}
|
||||
for root in SEARCH_ROOTS:
|
||||
if not root.is_dir():
|
||||
continue
|
||||
for p in root.rglob("*"):
|
||||
if not p.is_file():
|
||||
continue
|
||||
if p.suffix.lower() not in RENDERABLE_EXTS:
|
||||
continue
|
||||
if "thumbnails" in p.parts:
|
||||
continue
|
||||
key = _normalize_for_match(p.name)
|
||||
idx.setdefault(key, []).append(p)
|
||||
return idx
|
||||
|
||||
|
||||
def _digit_parts(token: str) -> list[str]:
|
||||
"""Split a token like '14306-09-23' into ['14306','09','23']."""
|
||||
return [p for p in re.split(r"[-/]", token) if p]
|
||||
|
||||
|
||||
def _find_file_for_case_number(case_number: str, file_index: dict[str, list[Path]]) -> Path | None:
|
||||
"""Best-effort match a case_number → staged file path.
|
||||
|
||||
Two strategies:
|
||||
|
||||
1. **Direct contiguous match** — token normalized (e.g. "8064/20"
|
||||
→ "806420") appears as substring of the filename normalized.
|
||||
2. **Parts-match** — every digit part of the token appears
|
||||
somewhere in the filename (handles reordered formats like
|
||||
case_number "14306-09-23" matched to "MM-23-09-14306-967.docx",
|
||||
where Nevo's case_number ordering differs from the legal
|
||||
template's filename ordering). Only accepts when the longest
|
||||
part has at least 4 digits — that filters out matches where
|
||||
only short pieces (year fragments) overlap.
|
||||
|
||||
Returns the first match found, preferring PDFs over DOCX.
|
||||
"""
|
||||
tokens = _extract_number_tokens(case_number)
|
||||
if not tokens:
|
||||
return None
|
||||
|
||||
candidates: list[Path] = []
|
||||
for token in tokens:
|
||||
# Strategy 1: contiguous
|
||||
normalized_token = _normalize_for_match(token)
|
||||
token_hyphenated = token.replace("/", "-")
|
||||
normalized_hyphenated = _normalize_for_match(token_hyphenated)
|
||||
# Strategy 2: parts
|
||||
parts = _digit_parts(token)
|
||||
longest_part = max((len(p) for p in parts), default=0)
|
||||
|
||||
for normalized_name, paths in file_index.items():
|
||||
if normalized_token in normalized_name or normalized_hyphenated in normalized_name:
|
||||
candidates.extend(paths)
|
||||
continue
|
||||
# Parts-match requires longest part >= 4 digits AND all parts present
|
||||
if longest_part >= 4 and parts and all(p in normalized_name for p in parts):
|
||||
candidates.extend(paths)
|
||||
|
||||
if not candidates:
|
||||
return None
|
||||
|
||||
# Dedupe while preserving order
|
||||
seen = set()
|
||||
unique = []
|
||||
for p in candidates:
|
||||
if p not in seen:
|
||||
seen.add(p)
|
||||
unique.append(p)
|
||||
|
||||
# Prefer PDFs over DOCX (PDF rendering is more reliable for embedded fonts/images)
|
||||
pdf = next((p for p in unique if p.suffix.lower() == ".pdf"), None)
|
||||
return pdf or unique[0]
|
||||
|
||||
|
||||
# ───────────────────────── backfill core ─────────────────────────
|
||||
|
||||
|
||||
PRECEDENT_LIBRARY_THUMBNAILS = Path(config.DATA_DIR) / "precedent-library" / "thumbnails"
|
||||
|
||||
|
||||
async def _embed_one_precedent(case_law_id: UUID, src_path: Path) -> dict:
|
||||
"""Render + embed + store image embeddings for a single precedent.
|
||||
|
||||
Mirrors ``precedent_library._embed_precedent_pages`` but takes any
|
||||
fitz-renderable file (PDF or DOCX).
|
||||
"""
|
||||
thumb_dir = PRECEDENT_LIBRARY_THUMBNAILS / str(case_law_id)
|
||||
# PyMuPDF reads DOCX natively (uses its own MuPDF backend). We use
|
||||
# the same renderer as the live pipeline for consistency.
|
||||
rendered = await asyncio.to_thread(
|
||||
extractor.render_pages_for_multimodal,
|
||||
src_path,
|
||||
config.MULTIMODAL_DPI,
|
||||
config.MULTIMODAL_THUMB_DPI,
|
||||
thumb_dir,
|
||||
)
|
||||
if not rendered:
|
||||
return {"pages_embedded": 0, "status": "no_pages"}
|
||||
|
||||
images = [pil for pil, _ in rendered]
|
||||
thumbs = [t for _, t in rendered]
|
||||
|
||||
img_embs = await embeddings.embed_images(images)
|
||||
|
||||
page_records = []
|
||||
for i, (emb, thumb) in enumerate(zip(img_embs, thumbs)):
|
||||
rel_thumb = None
|
||||
if thumb is not None:
|
||||
try:
|
||||
rel_thumb = str(thumb.relative_to(config.DATA_DIR))
|
||||
except ValueError:
|
||||
rel_thumb = str(thumb)
|
||||
page_records.append({
|
||||
"page_number": i + 1,
|
||||
"embedding": emb,
|
||||
"image_thumbnail_path": rel_thumb,
|
||||
})
|
||||
|
||||
stored = await db.store_precedent_image_embeddings(
|
||||
case_law_id, page_records, model_name=config.MULTIMODAL_MODEL,
|
||||
)
|
||||
return {"pages_embedded": stored, "status": "ok"}
|
||||
|
||||
|
||||
async def _scan_missing_records() -> list[dict]:
|
||||
pool = await db.get_pool()
|
||||
rows = await pool.fetch(
|
||||
"""
|
||||
SELECT id, case_number, source_kind, length(full_text) AS text_len
|
||||
FROM case_law cl
|
||||
WHERE NOT EXISTS (
|
||||
SELECT 1 FROM precedent_image_embeddings ppi
|
||||
WHERE ppi.case_law_id = cl.id
|
||||
)
|
||||
AND cl.source_kind IN ('external_upload', 'internal_committee')
|
||||
ORDER BY cl.source_kind, cl.case_number
|
||||
"""
|
||||
)
|
||||
return [
|
||||
{
|
||||
"id": UUID(str(r["id"])),
|
||||
"case_number": r["case_number"],
|
||||
"source_kind": r["source_kind"],
|
||||
"text_len": r["text_len"],
|
||||
}
|
||||
for r in rows
|
||||
]
|
||||
|
||||
|
||||
async def backfill_all(
|
||||
*,
|
||||
dry_run: bool,
|
||||
limit: int | None = None,
|
||||
only_source_kind: str | None = None,
|
||||
) -> dict:
|
||||
"""Main entrypoint — scan, match, render, embed, store."""
|
||||
await db.init_schema()
|
||||
records = await _scan_missing_records()
|
||||
if only_source_kind:
|
||||
records = [r for r in records if r["source_kind"] == only_source_kind]
|
||||
if limit:
|
||||
records = records[:limit]
|
||||
|
||||
file_index = _build_file_index()
|
||||
logger.info("Indexed %d renderable files under %s",
|
||||
sum(len(v) for v in file_index.values()),
|
||||
", ".join(str(r) for r in SEARCH_ROOTS if r.is_dir()))
|
||||
|
||||
summary = {
|
||||
"scanned": len(records),
|
||||
"matched": 0,
|
||||
"no_match": 0,
|
||||
"embedded": 0,
|
||||
"skipped_md_only": 0,
|
||||
"errors": 0,
|
||||
"total_pages": 0,
|
||||
"details": [],
|
||||
}
|
||||
|
||||
for rec in records:
|
||||
case_law_id = rec["id"]
|
||||
case_number = rec["case_number"]
|
||||
src = _find_file_for_case_number(case_number, file_index)
|
||||
|
||||
if not src:
|
||||
summary["no_match"] += 1
|
||||
summary["details"].append({
|
||||
"case_law_id": str(case_law_id),
|
||||
"case_number": case_number,
|
||||
"source_kind": rec["source_kind"],
|
||||
"status": "no_match",
|
||||
})
|
||||
logger.info(" NO MATCH: %s", case_number[:80])
|
||||
continue
|
||||
|
||||
# Probe page count without rendering (cheap)
|
||||
try:
|
||||
doc = fitz.open(str(src))
|
||||
page_count = len(doc)
|
||||
doc.close()
|
||||
except Exception as e:
|
||||
summary["errors"] += 1
|
||||
summary["details"].append({
|
||||
"case_law_id": str(case_law_id),
|
||||
"case_number": case_number,
|
||||
"matched_file": str(src),
|
||||
"status": "open_error",
|
||||
"error": str(e),
|
||||
})
|
||||
logger.warning(" OPEN ERROR for %s: %s", case_number[:60], e)
|
||||
continue
|
||||
|
||||
summary["matched"] += 1
|
||||
summary["total_pages"] += page_count
|
||||
logger.info(" MATCHED: %s -> %s (%d pages)",
|
||||
case_number[:60], src.name, page_count)
|
||||
|
||||
if dry_run:
|
||||
summary["details"].append({
|
||||
"case_law_id": str(case_law_id),
|
||||
"case_number": case_number,
|
||||
"matched_file": str(src),
|
||||
"pages": page_count,
|
||||
"status": "would_embed",
|
||||
})
|
||||
continue
|
||||
|
||||
# Actually embed + store
|
||||
t0 = time.time()
|
||||
try:
|
||||
result = await _embed_one_precedent(case_law_id, src)
|
||||
elapsed = time.time() - t0
|
||||
summary["embedded"] += 1
|
||||
summary["details"].append({
|
||||
"case_law_id": str(case_law_id),
|
||||
"case_number": case_number,
|
||||
"matched_file": str(src),
|
||||
"pages": page_count,
|
||||
"elapsed_sec": round(elapsed, 1),
|
||||
"status": "ok",
|
||||
**result,
|
||||
})
|
||||
logger.info(" EMBEDDED %d pages in %.1fs", result["pages_embedded"], elapsed)
|
||||
except Exception as e:
|
||||
summary["errors"] += 1
|
||||
summary["details"].append({
|
||||
"case_law_id": str(case_law_id),
|
||||
"case_number": case_number,
|
||||
"matched_file": str(src),
|
||||
"status": "embed_error",
|
||||
"error": str(e),
|
||||
})
|
||||
logger.exception(" EMBED ERROR for %s", case_number[:60])
|
||||
|
||||
return summary
|
||||
|
||||
|
||||
# ───────────────────────── CLI ─────────────────────────
|
||||
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(
|
||||
description="Backfill voyage-multimodal-3 embeddings for case_law records "
|
||||
"(external_upload + internal_committee) missing them.",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--dry-run", action="store_true",
|
||||
help="Only scan + match; do not call Voyage or write to DB.",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--apply", action="store_true",
|
||||
help="Render, embed, and store. Implies not --dry-run.",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--limit", type=int, default=None,
|
||||
help="Max number of records to process (debugging).",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--only", choices=["external_upload", "internal_committee"], default=None,
|
||||
help="Restrict to a single source_kind.",
|
||||
)
|
||||
args = parser.parse_args()
|
||||
|
||||
if not args.apply and not args.dry_run:
|
||||
# Default to dry_run for safety.
|
||||
args.dry_run = True
|
||||
|
||||
logger.info(
|
||||
"Mode=%s MULTIMODAL_MODEL=%s DPI=%d THUMB_DPI=%d",
|
||||
"DRY-RUN" if args.dry_run else "APPLY",
|
||||
config.MULTIMODAL_MODEL, config.MULTIMODAL_DPI, config.MULTIMODAL_THUMB_DPI,
|
||||
)
|
||||
|
||||
summary = asyncio.run(
|
||||
backfill_all(
|
||||
dry_run=args.dry_run,
|
||||
limit=args.limit,
|
||||
only_source_kind=args.only,
|
||||
)
|
||||
)
|
||||
|
||||
print()
|
||||
print("=" * 60)
|
||||
print("BACKFILL SUMMARY")
|
||||
print("=" * 60)
|
||||
print(f" scanned: {summary['scanned']}")
|
||||
print(f" matched: {summary['matched']}")
|
||||
print(f" no_match: {summary['no_match']}")
|
||||
print(f" total pages: {summary['total_pages']}")
|
||||
if args.dry_run:
|
||||
# Cost estimate: ~3.5K tokens/page * $0.12/1M tokens
|
||||
est_tokens = summary["total_pages"] * 3500
|
||||
est_cost = est_tokens / 1_000_000 * 0.12
|
||||
print(f" est. tokens: ~{est_tokens:,} (~${est_cost:.2f})")
|
||||
else:
|
||||
print(f" embedded: {summary['embedded']}")
|
||||
print(f" errors: {summary['errors']}")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
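The two filename-matching strategies described in `_find_file_for_case_number` can be tried in isolation. The sketch below is a simplified, standalone approximation: `match_kind` is a hypothetical helper that checks one filename at a time and reports which strategy fired, whereas the script itself walks an index of files and prefers PDFs.

```python
from __future__ import annotations

import re

NUMBER_TOKEN = re.compile(r"\d+[-/]\d+(?:[-/]\d+)*")


def normalize(s: str) -> str:
    """Lowercase and drop whitespace, slashes, underscores and hyphens."""
    return re.sub(r"[\s/_-]+", "", s.lower())


def match_kind(case_number: str, filename: str) -> str | None:
    """Return 'contiguous', 'parts', or None for one candidate filename."""
    name = normalize(filename)
    for token in NUMBER_TOKEN.findall(case_number):
        # Strategy 1: the whole token, separators removed, is a substring.
        if normalize(token) in name:
            return "contiguous"
        # Strategy 2: every digit part appears somewhere in the name, and
        # the longest part has at least 4 digits.
        parts = [p for p in re.split(r"[-/]", token) if p]
        if max(len(p) for p in parts) >= 4 and all(p in name for p in parts):
            return "parts"
    return None
```

Because strategy 2 uses plain substring checks, short parts such as "09" or "23" can also be satisfied by digits inside unrelated longer runs (for example "1090" or "1234"), so a dry run is worth reviewing before `--apply`.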
313
scripts/compute_ndcg.py
Executable file
@@ -0,0 +1,313 @@
|
||||
#!/usr/bin/env python3
"""Compute nDCG@k (default k=10) over the RAG retrieval feedback table
(TaskMaster #50).

Outputs aggregated metrics as JSON:

    {
      "generated_at": "2026-05-26T12:34:56+00:00",
      "k": 10,
      "window_weeks": int | null,
      "summary": {
        "total_searches_logged": int,
        "total_searches_with_feedback": int,
        "feedback_coverage_pct": float,
        "avg_ndcg_at_k": float | null
      },
      "by_search_type": [
        {"search_type": "precedent_library",
         "searches_logged": int,
         "searches_with_feedback": int,
         "avg_ndcg_at_k": float | null},
        ...
      ],
      "by_week": [
        {"week_start": "2026-05-18",
         "search_type": "precedent_library",
         "avg_ndcg_at_k": float},
        ...
      ],
      "top_cited_case_law": [
        {"case_law_id": "...", "case_number": "...",
         "case_name": "...", "cite_count": int},
        ...
      ]
    }

Run:
    python ~/legal-ai/scripts/compute_ndcg.py
    python ~/legal-ai/scripts/compute_ndcg.py --weeks 12 --k 10
    python ~/legal-ai/scripts/compute_ndcg.py --pretty
"""
|
||||
from __future__ import annotations
|
||||
|
||||
import argparse
|
||||
import asyncio
|
||||
import json
|
||||
import math
|
||||
import os
|
||||
import sys
|
||||
from datetime import datetime, timezone
|
||||
from pathlib import Path
|
||||
|
||||
import asyncpg
|
||||
|
||||
# Allow running as a standalone script — no package install required.
|
||||
REPO_ROOT = Path(__file__).resolve().parent.parent
|
||||
sys.path.insert(0, str(REPO_ROOT / "mcp-server" / "src"))
|
||||
|
||||
|
||||
def _postgres_url() -> str:
|
||||
"""Resolve POSTGRES_URL the same way the MCP server does."""
|
||||
url = os.environ.get("POSTGRES_URL")
|
||||
if url:
|
||||
return url
|
||||
user = os.environ.get("POSTGRES_USER", "legal_ai")
|
||||
pw = os.environ.get("POSTGRES_PASSWORD", "")
|
||||
host = os.environ.get("POSTGRES_HOST", "127.0.0.1")
|
||||
port = os.environ.get("POSTGRES_PORT", "5433")
|
||||
db = os.environ.get("POSTGRES_DB", "legal_ai")
|
||||
return f"postgres://{user}:{pw}@{host}:{port}/{db}"
|
||||
|
||||
|
||||
def dcg(relevances: list[int]) -> float:
    """Discounted Cumulative Gain at the length of ``relevances``.

    Uses the "gain = 2^rel - 1" form so high-relevance hits get
    significantly more weight than marginal ones. This is the
    exponential-gain variant common in web-search and learning-to-rank
    evaluation.
    """
    total = 0.0
    for i, rel in enumerate(relevances, start=1):
        gain = (2 ** rel) - 1
        total += gain / math.log2(i + 1)
    return total


def ndcg_at_k(rel_at_rank: dict[int, int], k: int) -> float | None:
    """Compute nDCG@k.

    Args:
        rel_at_rank: ``{rank (1-based): relevance_score (0..3)}``.
            Ranks above ``k`` are ignored. Missing ranks count as 0.
        k: cutoff.

    Returns:
        nDCG in [0,1], or ``None`` if there's nothing to score
        (no relevant hits in the top-k -> IDCG = 0).
    """
    actual = [rel_at_rank.get(r, 0) for r in range(1, k + 1)]
    if not any(actual):
        return None
    ideal = sorted(actual, reverse=True)
    idcg = dcg(ideal)
    if idcg == 0:
        return None
    return dcg(actual) / idcg
|
||||
|
||||
|
||||
async def _fetch_feedback_rows(conn: asyncpg.Connection, weeks: int | None) -> list[dict]:
|
||||
"""Pull all (search_log_id, rank, relevance_score, search_type, created_at)
|
||||
rows where there's at least one feedback row.
|
||||
|
||||
Restricting to recent weeks keeps the scan cheap on a growing log.
|
||||
"""
|
||||
where = ""
|
||||
params: list = []
|
||||
if weeks is not None and weeks > 0:
|
||||
where = "WHERE sl.created_at >= NOW() - ($1::int * INTERVAL '1 week')"
|
||||
params.append(weeks)
|
||||
sql = f"""
|
||||
SELECT sl.id::text AS search_log_id,
|
||||
sl.search_type AS search_type,
|
||||
sl.created_at AS created_at,
|
||||
srf.rank AS rank,
|
||||
srf.relevance_score AS relevance_score
|
||||
FROM search_relevance_feedback srf
|
||||
JOIN search_logs sl ON sl.id = srf.search_log_id
|
||||
{where}
|
||||
"""
|
||||
rows = await conn.fetch(sql, *params)
|
||||
return [dict(r) for r in rows]
|
||||
|
||||
|
||||
async def _fetch_corpus_totals(conn: asyncpg.Connection, weeks: int | None) -> dict[str, int]:
|
||||
"""Total search_logs count (overall and by type) — used for coverage %."""
|
||||
where = ""
|
||||
params: list = []
|
||||
if weeks is not None and weeks > 0:
|
||||
where = "WHERE created_at >= NOW() - ($1::int * INTERVAL '1 week')"
|
||||
params.append(weeks)
|
||||
total_row = await conn.fetchrow(
|
||||
f"SELECT COUNT(*) AS n FROM search_logs {where}",
|
||||
*params,
|
||||
)
|
||||
by_type = await conn.fetch(
|
||||
f"SELECT search_type, COUNT(*) AS n FROM search_logs {where} GROUP BY search_type",
|
||||
*params,
|
||||
)
|
||||
return {
|
||||
"_total": int(total_row["n"]) if total_row else 0,
|
||||
**{r["search_type"]: int(r["n"]) for r in by_type},
|
||||
}
|
||||
|
||||
|
||||
async def _fetch_top_cited(conn: asyncpg.Connection, limit: int = 20) -> list[dict]:
|
||||
"""Most-cited case_law (from auto-inferred feedback)."""
|
||||
rows = await conn.fetch(
|
||||
"""
|
||||
SELECT cl.id::text AS case_law_id,
|
||||
cl.case_number AS case_number,
|
||||
cl.case_name AS case_name,
|
||||
COUNT(*) AS cite_count
|
||||
FROM search_relevance_feedback srf
|
||||
JOIN case_law cl ON cl.id = srf.case_law_id
|
||||
WHERE srf.feedback_source = 'cited_in_decision'
|
||||
GROUP BY cl.id, cl.case_number, cl.case_name
|
||||
ORDER BY COUNT(*) DESC
|
||||
LIMIT $1
|
||||
""",
|
||||
limit,
|
||||
)
|
||||
return [dict(r) for r in rows]
|
||||
|
||||
|
||||
def _aggregate(
|
||||
feedback_rows: list[dict],
|
||||
k: int,
|
||||
) -> tuple[dict[str, float], dict[tuple[str, str], float], int]:
|
||||
"""Group feedback by search_log, compute per-log nDCG, then aggregate
|
||||
by search_type and by (week, search_type)."""
|
||||
by_log: dict[str, dict] = {}
|
||||
for row in feedback_rows:
|
||||
slid = row["search_log_id"]
|
||||
if slid not in by_log:
|
||||
by_log[slid] = {
|
||||
"search_type": row["search_type"],
|
||||
"created_at": row["created_at"],
|
||||
"rels": {},
|
||||
}
|
||||
rank = int(row["rank"])
|
||||
if 1 <= rank <= k:
|
||||
by_log[slid]["rels"][rank] = int(row["relevance_score"])
|
||||
|
||||
type_ndcg: dict[str, list[float]] = {}
|
||||
week_ndcg: dict[tuple[str, str], list[float]] = {}
|
||||
total_logs_with_feedback = 0
|
||||
for entry in by_log.values():
|
||||
score = ndcg_at_k(entry["rels"], k)
|
||||
if score is None:
|
||||
continue
|
||||
total_logs_with_feedback += 1
|
||||
type_ndcg.setdefault(entry["search_type"], []).append(score)
|
||||
week_start = entry["created_at"].date()
|
||||
# Round down to ISO week Monday.
|
||||
week_start = week_start.fromordinal(
|
||||
week_start.toordinal() - week_start.weekday()
|
||||
)
|
||||
wkey = (week_start.isoformat(), entry["search_type"])
|
||||
week_ndcg.setdefault(wkey, []).append(score)
|
||||
|
||||
type_avg = {t: sum(v) / len(v) for t, v in type_ndcg.items() if v}
|
||||
week_avg = {k_: sum(v) / len(v) for k_, v in week_ndcg.items() if v}
|
||||
return type_avg, week_avg, total_logs_with_feedback
|
||||
|
||||
|
||||
async def compute(weeks: int | None, k: int) -> dict:
|
||||
conn = await asyncpg.connect(_postgres_url())
|
||||
try:
|
||||
fb_rows = await _fetch_feedback_rows(conn, weeks)
|
||||
totals = await _fetch_corpus_totals(conn, weeks)
|
||||
top_cited = await _fetch_top_cited(conn)
|
||||
finally:
|
||||
await conn.close()
|
||||
|
||||
type_avg, week_avg, logs_scored = _aggregate(fb_rows, k)
|
||||
|
||||
total_logs = totals.get("_total", 0)
|
||||
# Overall mean is a micro-average over per-log scores. type_avg is already
# collapsed per type, so redo the per-log calculation here.
|
||||
by_log_overall: dict[str, dict[int, int]] = {}
|
||||
log_to_type: dict[str, str] = {}
|
||||
for row in fb_rows:
|
||||
slid = row["search_log_id"]
|
||||
by_log_overall.setdefault(slid, {})
|
||||
rank = int(row["rank"])
|
||||
if 1 <= rank <= k:
|
||||
by_log_overall[slid][rank] = int(row["relevance_score"])
|
||||
log_to_type[slid] = row["search_type"]
|
||||
per_log_scores: list[float] = []
|
||||
for slid, rels in by_log_overall.items():
|
||||
s = ndcg_at_k(rels, k)
|
||||
if s is not None:
|
||||
per_log_scores.append(s)
|
||||
overall_avg = (sum(per_log_scores) / len(per_log_scores)) if per_log_scores else None
|
||||
|
||||
by_search_type = []
|
||||
for t, totals_n in sorted(totals.items()):
|
||||
if t == "_total":
|
||||
continue
|
||||
by_search_type.append({
|
||||
"search_type": t,
|
||||
"searches_logged": totals_n,
|
||||
"searches_with_feedback": sum(
|
||||
1 for slid, tp in log_to_type.items() if tp == t
|
||||
),
|
||||
"avg_ndcg_at_k": round(type_avg[t], 4) if t in type_avg else None,
|
||||
})
|
||||
|
||||
by_week = [
|
||||
{
|
||||
"week_start": week,
|
||||
"search_type": stype,
|
||||
"avg_ndcg_at_k": round(score, 4),
|
||||
}
|
||||
for (week, stype), score in sorted(week_avg.items())
|
||||
]
|
||||
|
||||
return {
|
||||
"generated_at": datetime.now(timezone.utc).isoformat(),
|
||||
"k": k,
|
||||
"window_weeks": weeks,
|
||||
"summary": {
|
||||
"total_searches_logged": total_logs,
|
||||
"total_searches_with_feedback": logs_scored,
|
||||
"feedback_coverage_pct": (
|
||||
round(100 * logs_scored / total_logs, 2) if total_logs else 0.0
|
||||
),
|
||||
"avg_ndcg_at_k": round(overall_avg, 4) if overall_avg is not None else None,
|
||||
},
|
||||
"by_search_type": by_search_type,
|
||||
"by_week": by_week,
|
||||
"top_cited_case_law": [
|
||||
{**r, "cite_count": int(r["cite_count"])} for r in top_cited
|
||||
],
|
||||
}
|
||||
|
||||
|
||||
def main() -> int:
|
||||
p = argparse.ArgumentParser(description="Compute nDCG@k from search_relevance_feedback")
|
||||
p.add_argument("--k", type=int, default=10, help="cutoff (default: 10)")
|
||||
p.add_argument(
|
||||
"--weeks",
|
||||
type=int,
|
||||
default=None,
|
||||
help="restrict to the last N weeks (default: all time)",
|
||||
)
|
||||
p.add_argument("--pretty", action="store_true", help="indented JSON output")
|
||||
args = p.parse_args()
|
||||
|
||||
result = asyncio.run(compute(weeks=args.weeks, k=args.k))
|
||||
indent = 2 if args.pretty else None
|
||||
print(json.dumps(result, ensure_ascii=False, indent=indent, default=str))
|
||||
return 0
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
raise SystemExit(main())
|
||||
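A small worked example may make the scoring in `compute_ndcg.py` easier to check by hand. The sketch below restates `dcg` and `ndcg_at_k` compactly with the same exponential gain and log2 discount; it is a standalone illustration, not a replacement for the script's functions.

```python
from __future__ import annotations

import math


def dcg(relevances: list[int]) -> float:
    # gain = 2^rel - 1, discounted by log2(rank + 1) with 1-based ranks
    return sum(
        (2 ** rel - 1) / math.log2(i + 1)
        for i, rel in enumerate(relevances, start=1)
    )


def ndcg_at_k(rel_at_rank: dict[int, int], k: int) -> float | None:
    actual = [rel_at_rank.get(r, 0) for r in range(1, k + 1)]
    if not any(actual):
        return None  # nothing relevant in the top-k, so IDCG would be 0
    ideal = sorted(actual, reverse=True)
    return dcg(actual) / dcg(ideal)


# Ranks 1 and 3 were judged relevant (scores 3 and 2); rank 2 was not.
# DCG  = 7/log2(2) + 0 + 3/log2(4) = 7 + 1.5 = 8.5
# IDCG = 7/log2(2) + 3/log2(3)     (ideal order is [3, 2, 0])
score = ndcg_at_k({1: 3, 3: 2}, k=3)
```

Because the ideal ranking is built from the same judged top-k list, a relevant document that was never retrieved does not lower the score; the metric measures ordering quality within what was returned.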
73
scripts/legal-chat-service.config.cjs
Normal file
@@ -0,0 +1,73 @@
|
||||
/**
 * pm2 ecosystem entry for legal-chat-service: the host-side SSE bridge
 * to ``claude`` CLI that powers the /training chat tab.
 *
 * Security: the service spawns the claude CLI on behalf of any caller
 * that hits /chat/start. claude tools include Bash, Read, Edit, so an
 * unauthenticated request to /chat/start is effectively RCE-equivalent.
 * Two defenses, both required:
 *   1. Bind to 10.0.1.1 (docker0 bridge gateway): only host + containers
 *      on docker bridges can reach the socket; nothing outside the host.
 *   2. Bearer token auth: secret loaded from /home/chaim/.legal-chat-service.env
 *      (chmod 600) and mirrored in Coolify as LEGAL_CHAT_SHARED_SECRET.
 *      The service refuses to start without the secret set.
 *
 * Why pm2:
 *   - Auto-restart if the process dies (claude CLI subprocess failures
 *     should never leave the service in a half-dead state).
 *   - Log rotation matches paperclip's behavior so the chair sees
 *     consistent log paths under ~/.pm2/logs/.
 *
 * Install (once):
 *   pm2 start /home/chaim/legal-ai/scripts/legal-chat-service.config.cjs
 *   pm2 save
 *
 * Smoke test:
 *   curl http://10.0.1.1:8770/health
 *   # → {"ok":true,"service":"legal-chat-service"}
 *
 * Update:
 *   pm2 restart legal-chat-service --update-env
 *
 * Stop:
 *   pm2 stop legal-chat-service
 */
|
||||
const fs = require("fs");

// Load LEGAL_CHAT_SHARED_SECRET from a chmod 600 file off the repo.
// The same value is mirrored in Coolify as the LEGAL_CHAT_SHARED_SECRET
// env var so the FastAPI proxy sends a matching Authorization header.
// Migrate to Infisical (/_GUIDELINES) once the MCP server is back.
const ENV_FILE = "/home/chaim/.legal-chat-service.env";
const env = {
  HOME: "/home/chaim",
  PATH: "/home/chaim/.local/bin:/usr/local/bin:/usr/bin:/bin",
  PYTHONUNBUFFERED: "1",
};
try {
  const text = fs.readFileSync(ENV_FILE, "utf8");
  for (const line of text.split("\n")) {
    if (!line || line.trim().startsWith("#")) continue;
    const m = line.match(/^\s*([A-Z_][A-Z0-9_]*)\s*=\s*(.*?)\s*$/);
    if (m) env[m[1]] = m[2];
  }
} catch (e) {
  console.error(`legal-chat-service: failed to load ${ENV_FILE}: ${e.message}`);
  console.error("Service will refuse to start without LEGAL_CHAT_SHARED_SECRET.");
}

module.exports = {
  apps: [
    {
      name: "legal-chat-service",
      cwd: "/home/chaim/legal-ai/mcp-server",
      script: "/home/chaim/legal-ai/mcp-server/.venv/bin/python",
      args: "-m legal_mcp.chat_service.server --port 8770 --host 10.0.1.1",
      env,
      restart_delay: 5000,
      max_restarts: 10,
      autorestart: true,
      max_memory_restart: "500M",
    },
  ],
};
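The env-file loop in the config above can be restated in Python to make its parsing rules easier to check. This is only an illustrative sketch: `parse_env_text` is a hypothetical helper name, not part of the repository, and it assumes the same rules as the JavaScript regex (uppercase keys only, values taken verbatim, surrounding quotes not stripped).

```python
import re

# Same pattern as the JavaScript regex in the config: an uppercase key,
# an equals sign, and a value with surrounding whitespace trimmed.
_ENV_LINE = re.compile(r"^\s*([A-Z_][A-Z0-9_]*)\s*=\s*(.*?)\s*$")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments.

    Hypothetical helper for illustration; quotes around a value are
    kept as part of the value, exactly as the config's loop does.
    """
    env: dict[str, str] = {}
    for line in text.split("\n"):
        if not line or line.strip().startswith("#"):
            continue
        m = _ENV_LINE.match(line)
        if m:
            env[m.group(1)] = m.group(2)
    return env


sample = (
    "# comment\n"
    "LEGAL_CHAT_SHARED_SECRET = abc123 \n"
    "lowercase=ignored\n"
    'QUOTED="kept as-is"\n'
)
print(parse_env_text(sample))
```

One consequence worth noting: a secret written as `KEY="value"` in the env file would reach the service with the quotes included, so the file should hold bare values.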
278
scripts/monitor_halacha_quality.py
Normal file
@@ -0,0 +1,278 @@
"""Halacha extraction quality monitor.

Tracks ``avg(confidence)`` of halachot extracted by the LLM pipeline
over time and emits an alert when the recent-window average drops more
than a configurable threshold below the lifetime baseline.

Intended schedule: weekly cron, e.g. ``0 8 * * 1`` (Monday 08:00).

Output: an indented JSON payload to stdout (suitable for piping
into ``notify.py`` or a webhook), plus a human-readable alert text
on stderr when drift is detected.

Usage
-----

::

    # Default: weekly window, 5% drop threshold (relative)
    python scripts/monitor_halacha_quality.py

    # Custom window/threshold:
    python scripts/monitor_halacha_quality.py --window 14 --threshold 0.03

    # Only emit JSON, no stderr alert:
    python scripts/monitor_halacha_quality.py --silent
"""
from __future__ import annotations

import argparse
import asyncio
import json
import os
import sys
from datetime import datetime, timezone
from pathlib import Path


def _setup_paths():
    """Make ``legal_mcp`` importable when run from anywhere."""
    here = Path(__file__).resolve().parent
    candidates = [
        here.parent / "mcp-server" / "src",  # host
        Path("/app/mcp-server/src"),  # container
    ]
    for c in candidates:
        if c.is_dir() and str(c) not in sys.path:
            sys.path.insert(0, str(c))


_setup_paths()

from legal_mcp.services import db  # noqa: E402


# Statuses considered "trusted": the baseline is computed only over
# halachot whose extraction the chair has accepted. ``pending_review``
# is the queue waiting for review; its average tends to be lower than
# the trusted average because anything obviously bad gets rejected
# before approval. So we track BOTH series and alert on either one
# drifting:
#   1. Trusted baseline (approved+published): drift here means the
#      extractor's "best output" quality is degrading.
#   2. All extracted: drift here means raw extractor accuracy is down.
TRUSTED_STATUSES = ("approved", "published")


async def _collect_metrics(window_days: int) -> dict:
    pool = await db.get_pool()

    # Lifetime baselines
    lifetime_all = await pool.fetchrow(
        "SELECT count(*) AS n, AVG(confidence) AS avg_conf FROM halachot"
    )
    lifetime_trusted = await pool.fetchrow(
        """
        SELECT count(*) AS n, AVG(confidence) AS avg_conf
        FROM halachot
        WHERE review_status = ANY($1::text[])
        """,
        list(TRUSTED_STATUSES),
    )

    # Recent window. ``window_days`` is cast to int before interpolation,
    # so only a plain integer can reach the SQL text.
    recent_all = await pool.fetchrow(
        f"""
        SELECT count(*) AS n, AVG(confidence) AS avg_conf
        FROM halachot
        WHERE created_at > NOW() - INTERVAL '{int(window_days)} days'
        """
    )
    recent_trusted = await pool.fetchrow(
        f"""
        SELECT count(*) AS n, AVG(confidence) AS avg_conf
        FROM halachot
        WHERE created_at > NOW() - INTERVAL '{int(window_days)} days'
          AND review_status = ANY($1::text[])
        """,
        list(TRUSTED_STATUSES),
    )

    # Pending-review queue (extractor outputs that haven't been reviewed
    # yet), sometimes the canary that catches drift earliest. This covers
    # the whole pending queue; it is not limited to the recent window.
    pending_recent = await pool.fetchrow(
        """
        SELECT count(*) AS n, AVG(confidence) AS avg_conf
        FROM halachot
        WHERE review_status = 'pending_review'
        """
    )

    def _f(rec, key: str) -> float | None:
        v = rec[key]
        if v is None:
            return None
        return float(v)

    def _i(rec, key: str) -> int:
        v = rec[key]
        return int(v) if v is not None else 0

    return {
        "window_days": int(window_days),
        "lifetime_all_count": _i(lifetime_all, "n"),
        "lifetime_all_avg": _f(lifetime_all, "avg_conf"),
        "lifetime_trusted_count": _i(lifetime_trusted, "n"),
        "lifetime_trusted_avg": _f(lifetime_trusted, "avg_conf"),
        "recent_all_count": _i(recent_all, "n"),
        "recent_all_avg": _f(recent_all, "avg_conf"),
        "recent_trusted_count": _i(recent_trusted, "n"),
        "recent_trusted_avg": _f(recent_trusted, "avg_conf"),
        "pending_review_count": _i(pending_recent, "n"),
        "pending_review_avg": _f(pending_recent, "avg_conf"),
    }


def _drift(baseline: float | None, recent: float | None) -> float | None:
    """Return relative drift as a positive number when recent < baseline.

    The result is negative when recent > baseline, and ``None`` when
    either average is missing or the baseline is not positive.

    >>> round(_drift(0.85, 0.80), 4)  # 5.88% drop
    0.0588
    """
    if baseline is None or recent is None or baseline <= 0:
        return None
    return (baseline - recent) / baseline


def _evaluate(metrics: dict, threshold: float, min_sample: int) -> dict:
    """Decide whether any series is drifting below threshold."""
    alerts: list[dict] = []
    series = [
        (
            "trusted",
            metrics["lifetime_trusted_avg"],
            metrics["recent_trusted_avg"],
            metrics["recent_trusted_count"],
        ),
        (
            "all_extracted",
            metrics["lifetime_all_avg"],
            metrics["recent_all_avg"],
            metrics["recent_all_count"],
        ),
    ]
    for name, baseline, recent, recent_n in series:
        d = _drift(baseline, recent)
        entry = {
            "series": name,
            "baseline": baseline,
            "recent": recent,
            "recent_n": recent_n,
            "drift": d,
            "alert": False,
            "reason": None,
        }
        if recent_n < min_sample:
            entry["reason"] = f"recent_n={recent_n} below min_sample={min_sample}"
        elif d is None:
            entry["reason"] = "missing baseline or recent average"
        elif d >= threshold:
            entry["alert"] = True
            entry["reason"] = (
                f"drift {d:.1%} >= threshold {threshold:.1%} "
                f"(baseline={baseline:.3f}, recent={recent:.3f}, n={recent_n})"
            )
        else:
            entry["reason"] = (
                f"drift {d:.1%} < threshold {threshold:.1%} — within tolerance"
            )
        alerts.append(entry)

    any_alert = any(a["alert"] for a in alerts)
    return {"alert": any_alert, "series": alerts}


def _format_alert_text(metrics: dict, decision: dict) -> str:
    lines = [
        f"Halacha quality alert — window={metrics['window_days']}d",
        "",
    ]
    for s in decision["series"]:
        sym = "ALERT" if s["alert"] else "ok"
        baseline = f"{s['baseline']:.3f}" if s["baseline"] is not None else "—"
        recent = f"{s['recent']:.3f}" if s["recent"] is not None else "—"
        drift = f"{s['drift']:.1%}" if s["drift"] is not None else "—"
        lines.append(
            f"  [{sym}] {s['series']}: baseline={baseline} recent={recent} "
            f"drift={drift} n={s['recent_n']}"
        )
        if s["reason"]:
            lines.append(f"      {s['reason']}")
    return "\n".join(lines)


async def run(
    *,
    window_days: int,
    threshold: float,
    min_sample: int,
) -> dict:
    metrics = await _collect_metrics(window_days)
    decision = _evaluate(metrics, threshold, min_sample)
    return {
        "generated_at": datetime.now(timezone.utc).isoformat(),
        "window_days": window_days,
        "threshold_rel": threshold,
        "min_sample": min_sample,
        "metrics": metrics,
        "decision": decision,
    }


def main():
    parser = argparse.ArgumentParser(
        description="Monitor halacha extraction quality (confidence drift)."
    )
    parser.add_argument(
        "--window", type=int, default=7,
        help="Recent window in days (default: 7).",
    )
    parser.add_argument(
        "--threshold", type=float, default=0.05,
        help="Relative drop alert threshold (default: 0.05 = 5%%).",
    )
    parser.add_argument(
        "--min-sample", type=int, default=5,
        help="Minimum halachot in window to evaluate (default: 5). "
             "Below this, the series is reported but not alerted on.",
    )
    parser.add_argument(
        "--silent", action="store_true",
        help="Suppress stderr alert text; only print JSON.",
    )
    parser.add_argument(
        "--exit-on-alert", action="store_true",
        help="Exit with status 1 when an alert fires (default: always exit 0).",
    )
    args = parser.parse_args()

    report = asyncio.run(
        run(
            window_days=args.window,
            threshold=args.threshold,
            min_sample=args.min_sample,
        )
    )

    # JSON to stdout
    print(json.dumps(report, ensure_ascii=False, indent=2))

    if report["decision"]["alert"] and not args.silent:
        print("", file=sys.stderr)
        print(_format_alert_text(report["metrics"], report["decision"]), file=sys.stderr)

    if args.exit_on_alert and report["decision"]["alert"]:
        sys.exit(1)


if __name__ == "__main__":
    main()
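The alert rule in the script above can be checked by hand without a database. The sketch below restates it in isolation; `relative_drift` and `should_alert` are hypothetical stand-ins for `_drift` and the per-series branch of `_evaluate`, using the script's default threshold (0.05) and minimum sample (5), and the sample averages are made up for illustration.

```python
def relative_drift(baseline, recent):
    """Relative drop of `recent` below `baseline`; None if not computable."""
    if baseline is None or recent is None or baseline <= 0:
        return None
    return (baseline - recent) / baseline


def should_alert(baseline, recent, recent_n, threshold=0.05, min_sample=5):
    """True only when the window has enough rows and the drop meets the threshold."""
    if recent_n < min_sample:
        return False  # too few halachot in the window to judge
    drift = relative_drift(baseline, recent)
    return drift is not None and drift >= threshold


# Illustrative numbers only, not measurements from the real corpus.
print(should_alert(0.85, 0.80, recent_n=12))  # drop of about 5.9%
print(should_alert(0.85, 0.82, recent_n=12))  # drop of about 3.5%
print(should_alert(0.85, 0.60, recent_n=3))   # large drop, tiny sample
```

Because the threshold is relative, the same absolute drop counts for more against a low baseline than against a high one; that appears to be the intent of the `threshold_rel` field in the report.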
53
scripts/process_pending_blam.py
Normal file
@@ -0,0 +1,53 @@
"""One-shot: run pending metadata + halacha extraction on the 2 בל"מ
decisions uploaded today (8126/24 + 8047/23). Bypasses MCP because the
running MCP server has stale code; calls the services directly with the
updated local copy.

Run from /home/chaim/legal-ai with the venv:
    POSTGRES_URL=... .venv/bin/python scripts/process_pending_blam.py
"""

import asyncio
import os
import sys

sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "mcp-server", "src"))

from legal_mcp.services import db
from legal_mcp.services import precedent_library


async def main():
    # Queue metadata extraction too (ingest_internal_decision only queues
    # halacha; metadata fills headnote/summary/key_quote and now also
    # confirms proceeding_type via the new prompt field).
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            "SELECT id, case_number FROM case_law "
            "WHERE case_number IN ('8126/24','8047/23') "
            "  AND source_kind = 'internal_committee'"
        )
        for r in rows:
            await conn.execute(
                "UPDATE case_law SET metadata_extraction_requested_at = NOW() "
                "WHERE id = $1",
                r["id"],
            )
            print(f"queued metadata for {r['case_number']} ({r['id']})")

    print("\n→ running metadata extraction…")
    meta_result = await precedent_library.process_pending_extractions(
        kind="metadata", limit=10,
    )
    print(meta_result)

    print("\n→ running halacha extraction…")
    halacha_result = await precedent_library.process_pending_extractions(
        kind="halacha", limit=10,
    )
    print(halacha_result)


if __name__ == "__main__":
    asyncio.run(main())
89
scripts/test_retrieval_by_name.py
Normal file
@@ -0,0 +1,89 @@
#!/usr/bin/env python
"""Repro + regression test for retrieval-by-name (RC-A, tasks #52).

Bug: searching the precedent corpus by a bare case NAME ("אגסי") fails to
surface the decision itself, because the lexical tsvector covers only chunk
content + halacha text, not case_name / case_number. A name query therefore
matches decisions that *cite* the case, not the case.

Run with the MCP venv:
  DOTENV_PATH=/home/chaim/.env DATA_DIR=/home/chaim/legal-ai/data \
    mcp-server/.venv/bin/python scripts/test_retrieval_by_name.py

Exit 0 = all assertions pass. Non-zero = failure (prints what was found).
"""
import asyncio
import sys

sys.path.insert(0, "/home/chaim/legal-ai/mcp-server/src")

from legal_mcp.services import embeddings, hybrid_search  # noqa: E402

AGASI_ID = "1a87efe5-6e13-4ed4-a9ec-3f2f7d61e4ec"
# Vinfeld CITES Agasi (its halacha quote names אגסי) but is NOT Agasi.
# An exact name match must rank the case itself above any case citing it.
VINFELD_ID = "bd5d849c-c15f-43c3-96ab-d44337af9cb5"
NAME_QUERY = "אגסי"
SUBSTANTIVE_QUERY = 'פטור היטל השבחה לפי סעיף 19(ג)(1) שתי דירות 140 מ"ר אחת מושכרת'


def _ids(rows):
    return [str(r.get("case_law_id")) for r in rows]


def _rank_of(rows, cid):
    """Return the 1-based rank of ``cid`` in ``rows``, or None if absent."""
    for i, r in enumerate(rows, 1):
        if str(r.get("case_law_id")) == cid:
            return i
    return None


async def _search(query, source_kind, limit=10):
    query_emb = await embeddings.embed_query(query)
    return await hybrid_search.search_precedent_library_hybrid(
        query,
        query_emb,
        source_kind=source_kind,
        limit=limit,
        include_halachot=True,
    )


async def main():
    results = {"pass": [], "fail": []}

    # 1) THE BUG: bare-name query must rank the case ITSELF (Agasi) above any
    #    case that merely CITES it (Vinfeld), and within the top 3.
    rows = await _search(NAME_QUERY, "internal_committee", limit=10)
    a_rank = _rank_of(rows, AGASI_ID)
    v_rank = _rank_of(rows, VINFELD_ID)
    ok = bool(a_rank) and a_rank <= 3 and (v_rank is None or a_rank < v_rank)
    msg = (f"[name/internal] query='{NAME_QUERY}' -> Agasi rank={a_rank}, "
           f"Vinfeld(citer) rank={v_rank} (top ids: {_ids(rows)[:5]})")
    (results["pass"] if ok else results["fail"]).append(msg)

    # 2) REGRESSION: substantive query must still find Agasi within the top 8.
    #    The top score is reported for context only; it is not asserted on.
    rows = await _search(SUBSTANTIVE_QUERY, "internal_committee", limit=10)
    rank = _rank_of(rows, AGASI_ID)
    top_score = float(rows[0]["score"]) if rows else 0.0
    msg = f"[substantive/internal] Agasi rank={rank}, top_score={top_score:.3f}"
    (results["pass"] if rank and rank <= 8 else results["fail"]).append(msg)

    # 3) REGRESSION: substantive query in the full precedent library still works
    #    (Vinfeld/נווה שלום etc. should surface; only asserts that at least
    #    3 rows come back, not what they contain).
    rows = await _search(SUBSTANTIVE_QUERY, "external_upload", limit=10)
    msg = f"[substantive/external] returned {len(rows)} rows (top ids: {_ids(rows)[:3]})"
    (results["pass"] if len(rows) >= 3 else results["fail"]).append(msg)

    print("\n=== PASS ===")
    for m in results["pass"]:
        print("  ✓", m)
    print("=== FAIL ===")
    for m in results["fail"]:
        print("  ✗", m)

    return 1 if results["fail"] else 0


if __name__ == "__main__":
    sys.exit(asyncio.run(main()))
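The pass criterion of check 1 can be exercised offline with synthetic result rows, which is handy when the embedding service or database is not reachable. In this sketch `rank_of` mirrors the script's `_rank_of`, while `name_query_passes` and the short ids `"agasi"`, `"vinfeld"`, `"other"` are hypothetical placeholders for illustration, not real `case_law_id` values.

```python
def rank_of(rows, case_id):
    """1-based position of case_id in rows, or None when it is absent."""
    for i, row in enumerate(rows, 1):
        if str(row.get("case_law_id")) == case_id:
            return i
    return None


def name_query_passes(rows, target_id, citer_id, top_k=3):
    """The target must be in the top_k and ranked above the citing case."""
    target_rank = rank_of(rows, target_id)
    citer_rank = rank_of(rows, citer_id)
    return (
        target_rank is not None
        and target_rank <= top_k
        and (citer_rank is None or target_rank < citer_rank)
    )


# Synthetic rows: the desired ordering, then the ordering the bug produces.
rows_fixed = [{"case_law_id": "agasi"}, {"case_law_id": "vinfeld"}]
rows_buggy = [{"case_law_id": "vinfeld"}, {"case_law_id": "other"}]

print(name_query_passes(rows_fixed, "agasi", "vinfeld"))  # True
print(name_query_passes(rows_buggy, "agasi", "vinfeld"))  # False
```

Requiring both conditions (top 3 and above the citer) guards against a fix that merely adds the named case somewhere in the result list while the citing decision still outranks it.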
60
scripts/upload_blam_decisions.py
Normal file
@@ -0,0 +1,60 @@
"""One-shot uploader for the 2 new בל"מ decisions Chaim staged in
data/precedents/incoming/. Bypasses MCP because the running MCP server
was started before SCHEMA_V15 + proceeding_type wiring landed.

Run from /home/chaim/legal-ai with the venv:
    POSTGRES_URL=... .venv/bin/python scripts/upload_blam_decisions.py
"""

import asyncio
import os
import sys

sys.path.insert(0, os.path.join(os.path.dirname(__file__), "..", "mcp-server", "src"))

from legal_mcp.services import internal_decisions as svc

DECISIONS = [
    {
        "file_path": "/home/chaim/legal-ai/data/precedents/incoming/ARAR-24-8126.pdf",
        "case_number": "8126/24",
        "chair_name": "דפנה תמיר",
        "district": "ירושלים",
        "case_name": "הוועדה המקומית ירושלים נ' סופר נוח",
        "court": "ועדת הערר לתכנון ובנייה — מחוז ירושלים",
        "decision_date": "2024-07-07",
        "practice_area": "betterment_levy",
        "appeal_subtype": "extension_request_betterment_levy",
        "proceeding_type": 'בל"מ',
        "subject_tags": ["בקשה_להארכת_מועד", "היטל_השבחה"],
        "summary": "",
        "is_binding": False,
    },
    {
        "file_path": "/home/chaim/legal-ai/data/precedents/incoming/ARAR-23-8047-3.docx",
        "case_number": "8047/23",
        "chair_name": "דפנה תמיר",
        "district": "ירושלים",
        "case_name": 'עזבון אליהו הרנון ז"ל נ\' הוועדה המקומית ירושלים',
        "court": "ועדת הערר לתכנון ובנייה — מחוז ירושלים",
        "decision_date": "2025-09-29",
        "practice_area": "betterment_levy",
        "appeal_subtype": "extension_request_betterment_levy",
        "proceeding_type": 'בל"מ',
        "subject_tags": ["בקשה_להארכת_מועד", "היטל_השבחה"],
        "summary": "",
        "is_binding": False,
    },
]


async def main():
    for d in DECISIONS:
        print(f"→ uploading {d['case_number']} ({d['proceeding_type']})")
        result = await svc.ingest_internal_decision(**d)
        print(f"  ✓ case_law_id={result.get('case_law_id')} chunks={result.get('chunks')}")
    print("done.")


if __name__ == "__main__":
    asyncio.run(main())
@@ -14,6 +14,7 @@ import { StatusGuide } from "@/components/cases/status-guide";
|
||||
import { StatusChanger } from "@/components/cases/status-changer";
|
||||
import { DocumentsPanel } from "@/components/cases/documents-panel";
|
||||
import { DraftsPanel } from "@/components/cases/drafts-panel";
|
||||
import { LegalArgumentsPanel } from "@/components/cases/legal-arguments-panel";
|
||||
import { AgentActivityFeed } from "@/components/cases/agent-activity-feed";
|
||||
import { AgentStatusWidget } from "@/components/cases/agent-status-widget";
|
||||
import { UploadSheet } from "@/components/documents/upload-sheet";
|
||||
@@ -77,6 +78,9 @@ export default function CaseDetailPage({
|
||||
<div className="flex items-center justify-between gap-3 mb-1 flex-wrap">
|
||||
<TabsList className="bg-rule-soft/60">
|
||||
<TabsTrigger value="overview">סקירה</TabsTrigger>
|
||||
<TabsTrigger value="arguments">
|
||||
טיעונים
|
||||
</TabsTrigger>
|
||||
<TabsTrigger value="drafts">
|
||||
טיוטות והערות
|
||||
</TabsTrigger>
|
||||
@@ -139,6 +143,10 @@ export default function CaseDetailPage({
|
||||
<DocumentsPanel data={data} />
|
||||
</TabsContent>
|
||||
|
||||
<TabsContent value="arguments" className="mt-5">
|
||||
<LegalArgumentsPanel caseNumber={caseNumber} />
|
||||
</TabsContent>
|
||||
|
||||
<TabsContent value="drafts" className="mt-5">
|
||||
<DraftsPanel
|
||||
caseNumber={caseNumber}
|
||||
|
||||
161
web-ui/src/app/missing-precedents/page.tsx
Normal file
@@ -0,0 +1,161 @@
"use client";

import { useState } from "react";
import Link from "next/link";
import { AppShell } from "@/components/app-shell";
import { Card, CardContent } from "@/components/ui/card";
import { Tabs, TabsContent, TabsList, TabsTrigger } from "@/components/ui/tabs";
import { Badge } from "@/components/ui/badge";
import { Input } from "@/components/ui/input";
import {
  useMissingPrecedents,
  type MissingPrecedentStatus,
} from "@/lib/api/missing-precedents";
import { MissingPrecedentsTable } from "@/components/missing-precedents/missing-precedents-table";

/**
 * Missing-precedents page (TaskMaster #35).
 *
 * Surfaces citations that party briefs invoke but which aren't yet in the
 * precedent_library. Four tabs by status plus an "all" tab; each tab uses
 * the same table component with a different filter. Drawer (sheet) opens
 * on row click with metadata + upload form that routes to
 * internal_decision_upload (ערר/בל"מ citations) or
 * precedent_library_upload (court rulings).
 */
function StatusBadge({ status, count }: { status: MissingPrecedentStatus; count: number }) {
  if (!count) return null;
  const variants: Record<MissingPrecedentStatus, string> = {
    open: "bg-gold-wash text-gold-deep border-gold/40",
    uploaded: "bg-rule-soft text-ink-muted border-rule",
    closed: "bg-emerald-50 text-emerald-800 border-emerald-300/60",
    irrelevant: "bg-rule-soft text-ink-muted border-rule",
  };
  return (
    <Badge
      variant="outline"
      className={`ms-1 text-[0.65rem] ${variants[status]}`}
    >
      {count}
    </Badge>
  );
}

export default function MissingPrecedentsPage() {
  const [caseNumber, setCaseNumber] = useState("");
  const [legalTopic, setLegalTopic] = useState("");

  // Fetch a single row only to read the per-status counts for the tab badges.
  const counts = useMissingPrecedents({ limit: 1 });
  const byStatus = counts.data?.by_status ?? {};

  return (
    <AppShell>
      <section className="space-y-6">
        <header>
          <nav className="text-[0.78rem] text-ink-muted mb-1">
            <Link href="/" className="hover:text-gold-deep">בית</Link>
            <span aria-hidden> · </span>
            <span className="text-navy">פסיקה חסרה בקורפוס</span>
          </nav>
          <h1 className="text-navy mb-0">פסיקה חסרה בקורפוס</h1>
          <p className="text-ink-muted text-sm mt-1 max-w-3xl">
            פסיקות שצוטטו בכתבי הטענות אך אינן עדיין בקורפוס. סוכן המחקר רושם
            פערים אוטומטית; היו"ר סוגר אותם על־ידי העלאת המסמך — ניתוב
            אוטומטי בין הקורפוס הסמכותי (פסקי דין) להחלטות ועדות ערר.
          </p>
        </header>

        <div className="h-[2px] bg-gradient-to-l from-transparent via-gold to-transparent" />

        <Card className="bg-surface border-rule shadow-sm">
          <CardContent className="px-6 py-5 space-y-5">
            {/* Shared filters */}
            <div className="flex items-end gap-3 flex-wrap">
              <div className="flex-1 min-w-[200px]">
                <label className="text-[0.78rem] text-ink-muted">תיק (מספר ערר)</label>
                <Input
                  value={caseNumber}
                  onChange={(e) => setCaseNumber(e.target.value)}
                  placeholder="1017-03-26"
                  dir="rtl"
                />
              </div>
              <div className="flex-1 min-w-[200px]">
                <label className="text-[0.78rem] text-ink-muted">נושא משפטי</label>
                <Input
                  value={legalTopic}
                  onChange={(e) => setLegalTopic(e.target.value)}
                  placeholder="זכות עמידה"
                  dir="rtl"
                />
              </div>
            </div>

            <Tabs defaultValue="open" dir="rtl">
              <TabsList className="bg-rule-soft/60">
                <TabsTrigger value="open">
                  פתוחות
                  <StatusBadge status="open" count={byStatus.open ?? 0} />
                </TabsTrigger>
                <TabsTrigger value="uploaded">
                  הועלו
                  <StatusBadge status="uploaded" count={byStatus.uploaded ?? 0} />
                </TabsTrigger>
                <TabsTrigger value="closed">
                  נסגרו
                  <StatusBadge status="closed" count={byStatus.closed ?? 0} />
                </TabsTrigger>
                <TabsTrigger value="irrelevant">
                  לא רלוונטי
                  <StatusBadge
                    status="irrelevant"
                    count={byStatus.irrelevant ?? 0}
                  />
                </TabsTrigger>
                <TabsTrigger value="all">הכל</TabsTrigger>
              </TabsList>

              <TabsContent value="open" className="mt-4">
                <MissingPrecedentsTable
                  status="open"
                  caseNumber={caseNumber.trim() || undefined}
                  legalTopic={legalTopic.trim() || undefined}
                />
              </TabsContent>

              <TabsContent value="uploaded" className="mt-4">
                <MissingPrecedentsTable
                  status="uploaded"
                  caseNumber={caseNumber.trim() || undefined}
                  legalTopic={legalTopic.trim() || undefined}
                />
              </TabsContent>

              <TabsContent value="closed" className="mt-4">
                <MissingPrecedentsTable
                  status="closed"
                  caseNumber={caseNumber.trim() || undefined}
                  legalTopic={legalTopic.trim() || undefined}
                />
              </TabsContent>

              <TabsContent value="irrelevant" className="mt-4">
                <MissingPrecedentsTable
                  status="irrelevant"
                  caseNumber={caseNumber.trim() || undefined}
                  legalTopic={legalTopic.trim() || undefined}
                />
              </TabsContent>

              <TabsContent value="all" className="mt-4">
                <MissingPrecedentsTable
                  caseNumber={caseNumber.trim() || undefined}
                  legalTopic={legalTopic.trim() || undefined}
                />
              </TabsContent>
            </Tabs>
          </CardContent>
        </Card>
      </section>
    </AppShell>
  );
}
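The page comment above says the upload form routes ערר/בל"מ citations to `internal_decision_upload` and court rulings to `precedent_library_upload`. A minimal Python sketch of that routing rule follows. It assumes the decision is made on the citation's leading token; `route_upload` and `INTERNAL_PREFIXES` are hypothetical names, and the real citation guard in the service may apply stricter checks.

```python
# Prefixes that mark an appeals-committee decision rather than a court
# ruling. The two values come from the page comment; the real guard may
# recognise more forms.
INTERNAL_PREFIXES = ("ערר", 'בל"מ')


def route_upload(citation: str) -> str:
    """Pick the upload tool for a citation based on its leading token."""
    text = citation.strip()
    if text.startswith(INTERNAL_PREFIXES):
        return "internal_decision_upload"
    return "precedent_library_upload"


print(route_upload("ערר 8126/24"))
print(route_upload('בל"מ 8047/23'))
print(route_upload('עע"מ 1234/20'))
```

Keeping the rule in one small function makes it easy to add further prefixes later without touching the form itself.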
@@ -2,14 +2,24 @@
|
||||
|
||||
import { use, useState } from "react";
|
||||
import Link from "next/link";
|
||||
import { Pencil } from "lucide-react";
|
||||
import { Pencil, Check, X } from "lucide-react";
|
||||
import { toast } from "sonner";
|
||||
import { AppShell } from "@/components/app-shell";
|
||||
import { Card, CardContent } from "@/components/ui/card";
|
||||
import { Button } from "@/components/ui/button";
|
||||
import { Badge } from "@/components/ui/badge";
|
||||
import { Skeleton } from "@/components/ui/skeleton";
|
||||
import { usePrecedent } from "@/lib/api/precedent-library";
|
||||
import { Textarea } from "@/components/ui/textarea";
|
||||
import {
|
||||
usePrecedent,
|
||||
useUpdatePrecedent,
|
||||
type Precedent,
|
||||
} from "@/lib/api/precedent-library";
|
||||
import { PrecedentEditSheet } from "@/components/precedents/precedent-edit-sheet";
|
||||
import {
|
||||
FormattedCitation,
|
||||
CitationCopyButton,
|
||||
} from "@/components/precedents/formatted-citation";
|
||||
import { ExtractedHalachotSection } from "@/components/precedents/extracted-halachot";
|
||||
import { RelatedCasesSection } from "@/components/precedents/link-related-dialog";
|
||||
|
||||
@@ -34,6 +44,9 @@ export default function PrecedentDetailPage({
|
||||
const { id } = use(params);
|
||||
const [editing, setEditing] = useState(false);
|
||||
const { data, isPending, error } = usePrecedent(id);
|
||||
const update = useUpdatePrecedent();
|
||||
const [editingCitation, setEditingCitation] = useState(false);
|
||||
const [citationDraft, setCitationDraft] = useState("");
|
||||
|
||||
return (
|
||||
<AppShell>
|
||||
@@ -80,6 +93,36 @@ export default function PrecedentDetailPage({
|
||||
</Button>
|
||||
</div>
|
||||
|
||||
{/* Citation per Israeli unified citation rules. The LLM
|
||||
extractor composes this from the document; the chair
|
||||
can override below. */}
|
||||
<CitationBlock
|
||||
precedent={data as Precedent}
|
||||
editing={editingCitation}
|
||||
draft={citationDraft}
|
||||
onStartEdit={() => {
|
||||
setCitationDraft(data.citation_formatted ?? "");
|
||||
setEditingCitation(true);
|
||||
}}
|
||||
onCancel={() => setEditingCitation(false)}
|
||||
onChange={setCitationDraft}
|
||||
onSave={async () => {
|
||||
try {
|
||||
await update.mutateAsync({
|
||||
id,
|
||||
patch: { citation_formatted: citationDraft.trim() },
|
||||
});
|
||||
toast.success("מראה מקום עודכן");
|
||||
setEditingCitation(false);
|
||||
} catch (e) {
|
||||
toast.error(
|
||||
e instanceof Error ? e.message : "שמירה נכשלה",
|
||||
);
|
||||
}
|
||||
}}
|
||||
saving={update.isPending}
|
||||
/>
|
||||
|
||||
<div className="flex items-center gap-2 flex-wrap">
|
||||
{data.practice_area ? (
|
||||
<Badge variant="outline" className="text-[0.7rem]">
|
||||
@@ -178,3 +221,109 @@ export default function PrecedentDetailPage({
|
||||
</AppShell>
|
||||
);
|
||||
}
|
||||
|
||||
function CitationBlock({
  precedent,
  editing,
  draft,
  onStartEdit,
  onCancel,
  onChange,
  onSave,
  saving,
}: {
  precedent: Precedent;
  editing: boolean;
  draft: string;
  onStartEdit: () => void;
  onCancel: () => void;
  onChange: (v: string) => void;
  onSave: () => void;
  saving: boolean;
}) {
  const citation = (precedent.citation_formatted ?? "").trim();

  // Edit mode: textarea with save/cancel.
  if (editing) {
    return (
      <div className="rounded-md border border-gold/40 bg-gold-wash/30 p-3 space-y-2">
        <div className="flex items-center justify-between gap-2">
          <span className="text-[0.78rem] font-semibold text-navy">
            עריכת מראה מקום
          </span>
          <span className="text-[0.7rem] text-ink-muted">
            הקף את שמות הצדדים בכפול-כוכבית <code className="font-mono">**שם**</code> להדגשה
          </span>
        </div>
        <Textarea
          value={draft}
          onChange={(e) => onChange(e.target.value)}
          rows={3}
          dir="rtl"
          className="font-mono text-sm"
          placeholder="ערר (ועדות ערר ...) 1234/24 **עורר נ' הוועדה המקומית** (נבו 1.2.2025)"
          disabled={saving}
        />
        <div className="flex items-center gap-2">
          <Button
            size="sm"
            onClick={onSave}
            disabled={saving || !draft.trim()}
            className="bg-navy text-parchment hover:bg-navy-soft"
          >
            <Check className="w-3.5 h-3.5 me-1" />
            שמור
          </Button>
          <Button
            size="sm"
            variant="outline"
            onClick={onCancel}
            disabled={saving}
          >
            <X className="w-3.5 h-3.5 me-1" />
            ביטול
          </Button>
        </div>
      </div>
    );
  }

  // No citation extracted yet: offer manual entry.
  if (!citation) {
    return (
      <div className="rounded-md border border-dashed border-rule bg-rule-soft/30 p-3 flex items-center justify-between gap-2">
        <span className="text-[0.78rem] text-ink-muted">
          מראה מקום (כללי הציטוט האחיד) — טרם חולץ
        </span>
        <Button size="sm" variant="outline" onClick={onStartEdit}>
          <Pencil className="w-3.5 h-3.5 me-1" />
          הוסף ידנית
        </Button>
      </div>
    );
  }

  // Read-only view with copy and edit actions.
  return (
    <div className="rounded-md border border-rule bg-parchment-50 p-3 space-y-1.5">
      <div className="flex items-center justify-between gap-2">
        <span className="text-[0.7rem] uppercase tracking-wide text-ink-muted">
          מראה מקום
        </span>
        <div className="flex items-center gap-1.5">
          <CitationCopyButton citation={citation} size="xs" />
          <button
            type="button"
            onClick={onStartEdit}
            title="ערוך מראה מקום"
            aria-label="ערוך מראה מקום"
            className="inline-flex items-center justify-center rounded-md border border-rule bg-surface hover:bg-rule-soft/50 text-ink-muted hover:text-navy h-7 w-7"
          >
            <Pencil className="w-3.5 h-3.5" />
          </button>
        </div>
      </div>
      <FormattedCitation
        citation={citation}
        className="block text-navy text-sm leading-relaxed"
      />
    </div>
  );
}
@@ -1,30 +1,49 @@
|
||||
"use client";
|
||||
|
||||
import { useState } from "react";
|
||||
import Link from "next/link";
|
||||
import { Upload } from "lucide-react";
|
||||
import { AppShell } from "@/components/app-shell";
|
||||
import { Button } from "@/components/ui/button";
|
||||
import { Card, CardContent } from "@/components/ui/card";
|
||||
import { Tabs, TabsContent, TabsList, TabsTrigger } from "@/components/ui/tabs";
|
||||
import { StyleReportPanel } from "@/components/training/style-report-panel";
|
||||
import { CorpusPanel } from "@/components/training/corpus-panel";
|
||||
import { ComparePanel } from "@/components/training/compare-panel";
|
||||
import { CuratorPortraitPanel } from "@/components/training/curator-portrait-panel";
|
||||
import { ChatPanel } from "@/components/training/chat-panel";
|
||||
import { TrainingUploadDialog } from "@/components/training/upload-dialog";
|
||||
|
||||
export default function TrainingPage() {
|
||||
const [uploadOpen, setUploadOpen] = useState(false);
|
||||
|
||||
return (
|
||||
<AppShell>
|
||||
<section className="space-y-6">
|
||||
<header>
|
||||
<nav className="text-[0.78rem] text-ink-muted mb-1">
|
||||
<Link href="/" className="hover:text-gold-deep">בית</Link>
|
||||
<span aria-hidden> · </span>
|
||||
<span className="text-navy">אימון סגנון</span>
|
||||
</nav>
|
||||
<h1 className="text-navy mb-0">הפורטרט הסגנוני של דפנה</h1>
|
||||
<p className="text-ink-muted text-sm mt-1 max-w-2xl">
|
||||
לוח בקרה של קורפוס האימון — סטטיסטיקות, אנטומיית החלטה ממוצעת,
|
||||
ביטויי חתימה, וכלי השוואה בין שתי החלטות.
|
||||
</p>
|
||||
<header className="flex items-start justify-between gap-4 flex-wrap">
|
||||
<div>
|
||||
<nav className="text-[0.78rem] text-ink-muted mb-1">
|
||||
<Link href="/" className="hover:text-gold-deep">בית</Link>
|
||||
<span aria-hidden> · </span>
|
||||
<span className="text-navy">אימון סגנון</span>
|
||||
</nav>
|
||||
<h1 className="text-navy mb-0">הפורטרט הסגנוני של דפנה</h1>
|
||||
<p className="text-ink-muted text-sm mt-1 max-w-2xl">
|
||||
לוח בקרה של קורפוס האימון — סטטיסטיקות, אנטומיית החלטה ממוצעת,
|
||||
ביטויי חתימה, וכלי השוואה בין שתי החלטות.
|
||||
</p>
|
||||
</div>
|
||||
<Button
|
||||
onClick={() => setUploadOpen(true)}
|
||||
className="bg-navy text-parchment hover:bg-navy-soft shrink-0"
|
||||
>
|
||||
<Upload className="w-4 h-4 me-1" />
|
||||
העלה החלטה
|
||||
</Button>
|
||||
</header>
|
||||
|
||||
<TrainingUploadDialog open={uploadOpen} onOpenChange={setUploadOpen} />
|
||||
|
||||
<div className="h-[2px] bg-gradient-to-l from-transparent via-gold to-transparent" />
|
||||
|
||||
<Card className="bg-surface border-rule shadow-sm">
|
||||
@@ -34,6 +53,8 @@ export default function TrainingPage() {
|
||||
<TabsTrigger value="report">פורטרט סגנון</TabsTrigger>
|
||||
<TabsTrigger value="corpus">קורפוס</TabsTrigger>
|
||||
<TabsTrigger value="compare">השוואה</TabsTrigger>
|
||||
<TabsTrigger value="curator">הסוכן</TabsTrigger>
|
||||
<TabsTrigger value="chat">שיחה</TabsTrigger>
|
||||
</TabsList>
|
||||
|
||||
<TabsContent value="report" className="mt-5">
|
||||
@@ -47,6 +68,14 @@ export default function TrainingPage() {
|
||||
<TabsContent value="compare" className="mt-5">
|
||||
<ComparePanel />
|
||||
</TabsContent>
|
||||
|
||||
<TabsContent value="curator" className="mt-5">
|
||||
<CuratorPortraitPanel />
|
||||
</TabsContent>
|
||||
|
||||
<TabsContent value="chat" className="mt-5">
|
||||
<ChatPanel />
|
||||
</TabsContent>
|
||||
</Tabs>
|
||||
</CardContent>
|
||||
</Card>
|
||||
|
||||
@@ -15,6 +15,7 @@ import {
|
||||
} from "@/components/ui/dropdown-menu";
|
||||
import { GlobalSearch } from "@/components/global-search";
|
||||
import { headerSubtitle } from "@/components/header-context";
|
||||
import { useMissingPrecedentsOpenCount } from "@/lib/api/missing-precedents";
|
||||
|
||||
/**
|
||||
* Ezer Mishpati navigation shell — two-row header.
|
||||
@@ -45,9 +46,10 @@ const NAV_GROUPS: NavGroup[] = [
|
||||
{
|
||||
id: "knowledge",
|
||||
items: [
|
||||
{ href: "/precedents", label: "ספריית פסיקה" },
|
||||
{ href: "/training", label: "אימון סגנון" },
|
||||
{ href: "/methodology", label: "מתודולוגיה" },
|
||||
{ href: "/precedents", label: "ספריית פסיקה" },
|
||||
{ href: "/missing-precedents", label: "פסיקה חסרה" },
|
||||
{ href: "/training", label: "אימון סגנון" },
|
||||
{ href: "/methodology", label: "מתודולוגיה" },
|
||||
],
|
||||
},
|
||||
];
|
||||
@@ -240,7 +242,8 @@ function NavLink({ item, active }: { item: NavItem; active: boolean }) {
|
||||
: "text-parchment/80 hover:text-parchment hover:bg-navy-soft/60"}
|
||||
`}
|
||||
>
|
||||
{item.label}
|
||||
<span>{item.label}</span>
|
||||
{item.href === "/missing-precedents" ? <MissingPrecedentsBadge /> : null}
|
||||
{active && (
|
||||
<span
|
||||
className="absolute -bottom-[19px] inset-x-2 h-[2px] bg-gold"
|
||||
@@ -250,3 +253,18 @@ function NavLink({ item, active }: { item: NavItem; active: boolean }) {
|
||||
</Link>
|
||||
);
|
||||
}
|
||||
|
||||
/* Small open-count badge next to "פסיקה חסרה" — only renders when >0
|
||||
* so the nav stays quiet in normal operation. */
|
||||
function MissingPrecedentsBadge() {
|
||||
const { data: openCount } = useMissingPrecedentsOpenCount();
|
||||
if (!openCount) return null;
|
||||
return (
|
||||
<span
|
||||
className="ms-1 inline-flex items-center justify-center min-w-[1.25rem] h-4 px-1 rounded-full bg-gold text-navy text-[0.65rem] font-semibold"
|
||||
aria-label={`${openCount} פסיקות חסרות פתוחות`}
|
||||
>
|
||||
{openCount}
|
||||
</span>
|
||||
);
|
||||
}
|
||||
|
||||
@@ -12,17 +12,35 @@ const BUCKETS: Bucket[] = [
|
||||
{ key: "compensation_197", label: "פיצויים (ס׳ 197)", color: "var(--color-warn)" },
|
||||
];
|
||||
|
||||
/* For chart aggregation, collapse בל"מ variants back to their parent
|
||||
* domain — building_permit / betterment_levy / compensation_197. The
|
||||
* dedicated בל"מ filter in the cases table handles the cross-cutting view. */
|
||||
function collapseBlam(s: AppealSubtype): AppealSubtype {
|
||||
if (s === "extension_request_building_permit") return "building_permit";
|
||||
if (s === "extension_request_betterment_levy") return "betterment_levy";
|
||||
if (s === "extension_request_compensation") return "compensation_197";
|
||||
return s;
|
||||
}
|
||||
|
||||
export function subtypeOf(c: Case): AppealSubtype {
|
||||
return c.appeal_subtype && c.appeal_subtype !== "unknown"
|
||||
const raw = c.appeal_subtype && c.appeal_subtype !== "unknown"
|
||||
? c.appeal_subtype
|
||||
: deriveSubtype(c.case_number);
|
||||
return collapseBlam(raw);
|
||||
}
|
||||
|
||||
export function AppealTypeBars({ cases }: { cases?: Case[] }) {
|
||||
/* All seven subtypes initialized to 0 — subtypeOf() collapses בל"מ
|
||||
* variants back to their parent domain, so the extension_request_*
|
||||
* counters will remain 0 in practice; they exist here to satisfy the
|
||||
* Record<AppealSubtype, number> type. */
|
||||
const counts: Record<AppealSubtype, number> = {
|
||||
building_permit: 0,
|
||||
betterment_levy: 0,
|
||||
compensation_197: 0,
|
||||
extension_request_building_permit: 0,
|
||||
extension_request_betterment_levy: 0,
|
||||
extension_request_compensation: 0,
|
||||
unknown: 0,
|
||||
};
|
||||
(cases ?? []).forEach((c) => {
|
||||
|
||||
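The collapsing step in the hunk above can be tried in isolation. The following is a minimal sketch, not the project's code: the `AppealSubtype` union is a stand-in rebuilt from the counter keys shown in `AppealTypeBars`, and `BLAM_PARENT` and `countBySubtype` are hypothetical names introduced only for illustration.

```typescript
// Stand-in for the project's AppealSubtype type (assumption: the member
// list is copied from the counters initialised in AppealTypeBars).
type AppealSubtype =
  | "building_permit"
  | "betterment_levy"
  | "compensation_197"
  | "extension_request_building_permit"
  | "extension_request_betterment_levy"
  | "extension_request_compensation"
  | "unknown";

// Hypothetical lookup table: each extension-request variant maps to its parent domain.
const BLAM_PARENT: Partial<Record<AppealSubtype, AppealSubtype>> = {
  extension_request_building_permit: "building_permit",
  extension_request_betterment_levy: "betterment_levy",
  extension_request_compensation: "compensation_197",
};

// Same behaviour as the if-chain in the diff: variants collapse, everything else passes through.
function collapseBlam(s: AppealSubtype): AppealSubtype {
  return BLAM_PARENT[s] ?? s;
}

// Hypothetical helper: count subtypes after collapsing, as the chart aggregation would.
function countBySubtype(
  subtypes: AppealSubtype[],
): Partial<Record<AppealSubtype, number>> {
  const counts: Partial<Record<AppealSubtype, number>> = {};
  for (const s of subtypes) {
    const key = collapseBlam(s);
    counts[key] = (counts[key] ?? 0) + 1;
  }
  return counts;
}
```

With this shape, an extension request on a building permit is counted under `building_permit`, which is why the `extension_request_*` counters in the component are expected to stay at 0.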
@@ -17,7 +17,10 @@ import {
|
||||
} from "@/components/ui/select";
|
||||
import { PartiesField } from "@/components/wizard/parties-field";
|
||||
import { useUpdateCase } from "@/lib/api/cases";
|
||||
import { caseUpdateSchema, expectedOutcomes, type CaseUpdateInput } from "@/lib/schemas/case";
|
||||
import {
|
||||
caseUpdateSchema, expectedOutcomes, proceedingTypes,
|
||||
type CaseUpdateInput,
|
||||
} from "@/lib/schemas/case";
|
||||
import type { CaseDetail } from "@/lib/api/cases";
|
||||
|
||||
/*
|
||||
@@ -47,6 +50,7 @@ export function CaseEditDialog({ data }: { data: CaseDetail }) {
|
||||
respondents: data.respondents ?? [],
|
||||
property_address: data.property_address ?? "",
|
||||
permit_number: data.permit_number ?? "",
|
||||
proceeding_type: data.proceeding_type ?? "ערר",
|
||||
},
|
||||
});
|
||||
|
||||
@@ -63,6 +67,7 @@ export function CaseEditDialog({ data }: { data: CaseDetail }) {
|
||||
respondents: data.respondents ?? [],
|
||||
property_address: data.property_address ?? "",
|
||||
permit_number: data.permit_number ?? "",
|
||||
proceeding_type: data.proceeding_type ?? "ערר",
|
||||
});
|
||||
}, [open, data, form]);
|
||||
|
||||
@@ -104,6 +109,37 @@ export function CaseEditDialog({ data }: { data: CaseDetail }) {
|
||||
<FieldError message={form.formState.errors.subject?.message} />
|
||||
</div>
|
||||
|
||||
<div>
|
||||
<Label className="text-navy">סוג תיק</Label>
|
||||
<Controller
|
||||
control={form.control}
|
||||
name="proceeding_type"
|
||||
render={({ field }) => (
|
||||
<Select
|
||||
value={field.value ?? "ערר"}
|
||||
onValueChange={(v) =>
|
||||
field.onChange(v as CaseUpdateInput["proceeding_type"])
|
||||
}
|
||||
dir="rtl"
|
||||
>
|
||||
<SelectTrigger className="mt-1">
|
||||
<SelectValue />
|
||||
</SelectTrigger>
|
||||
<SelectContent>
|
||||
{proceedingTypes.map((p) => (
|
||||
<SelectItem key={p.value} value={p.value}>
|
||||
{p.label}
|
||||
</SelectItem>
|
||||
))}
|
||||
</SelectContent>
|
||||
</Select>
|
||||
)}
|
||||
/>
|
||||
<p className="text-[0.7rem] text-ink-muted mt-1">
|
||||
ערר = הליך עיקרי; בל"מ = בקשה להארכת מועד להגשת ערר
|
||||
</p>
|
||||
</div>
|
||||
|
||||
<div className="h-px bg-rule" />
|
||||
|
||||
<Controller
|
||||
|
||||
@@ -8,6 +8,7 @@ import { CreateRepoButton } from "@/components/cases/create-repo-button";
|
||||
import {
|
||||
PRACTICE_AREA_LABELS,
|
||||
APPEAL_SUBTYPE_LABELS,
|
||||
isBlamSubtype,
|
||||
} from "@/lib/practice-area";
|
||||
import type { CaseDetail } from "@/lib/api/cases";
|
||||
|
||||
@@ -40,7 +41,7 @@ export function CaseHeader({ data }: { data?: CaseDetail }) {
|
||||
<div className="space-y-2">
|
||||
<div className="flex items-center gap-3 flex-wrap">
|
||||
<span className="font-display text-[2rem] font-black text-navy leading-none tabular-nums">
|
||||
ערר {data?.case_number ?? "—"}
|
||||
{data?.proceeding_type ?? "ערר"} {data?.case_number ?? "—"}
|
||||
</span>
|
||||
{data?.status && <StatusBadge status={data.status} />}
|
||||
{data?.archived_at && (
|
||||
@@ -62,6 +63,15 @@ export function CaseHeader({ data }: { data?: CaseDetail }) {
|
||||
)}
|
||||
</Badge>
|
||||
)}
|
||||
{(data?.proceeding_type === 'בל"מ' || isBlamSubtype(data?.appeal_subtype)) && (
|
||||
<Badge
|
||||
variant="outline"
|
||||
className="rounded-full px-2.5 py-0.5 text-[0.72rem] font-bold bg-warn/10 text-warn-deep border-warn/40"
|
||||
title="בקשה להארכת מועד להגשת ערר"
|
||||
>
|
||||
בל"מ
|
||||
</Badge>
|
||||
)}
|
||||
{data?.case_number && (
|
||||
<CaseArchiveAction
|
||||
caseNumber={data.case_number}
|
||||
|
||||
@@ -16,7 +16,12 @@ import {
|
||||
} from "@/components/ui/table";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Skeleton } from "@/components/ui/skeleton";
|
||||
import { Badge } from "@/components/ui/badge";
|
||||
import {
|
||||
Select, SelectContent, SelectItem, SelectTrigger, SelectValue,
|
||||
} from "@/components/ui/select";
|
||||
import { StatusBadge } from "@/components/cases/status-badge";
|
||||
import { isBlamSubtype } from "@/lib/practice-area";
|
||||
import type { Case } from "@/lib/api/cases";
|
||||
|
||||
function formatDate(iso?: string) {
|
||||
@@ -49,8 +54,17 @@ const columns: ColumnDef<Case>[] = [
|
||||
accessorKey: "title",
|
||||
header: "כותרת",
|
||||
cell: ({ row }) => (
|
||||
<div className="text-ink max-w-[420px] truncate" title={row.original.title}>
|
||||
{row.original.title}
|
||||
<div className="text-ink max-w-[420px] truncate flex items-center gap-2" title={row.original.title}>
|
||||
{(row.original.proceeding_type === 'בל"מ' || isBlamSubtype(row.original.appeal_subtype)) && (
|
||||
<Badge
|
||||
variant="outline"
|
||||
className="rounded-full px-1.5 py-0 text-[0.65rem] font-bold bg-warn/10 text-warn-deep border-warn/40 shrink-0"
|
||||
title="בקשה להארכת מועד להגשת ערר"
|
||||
>
|
||||
בל"מ
|
||||
</Badge>
|
||||
)}
|
||||
<span className="truncate">{row.original.title}</span>
|
||||
</div>
|
||||
),
|
||||
},
|
||||
@@ -94,8 +108,17 @@ export function CasesTable({
|
||||
{ id: "updated_at", desc: true },
|
||||
]);
|
||||
const [globalFilter, setGlobalFilter] = useState("");
|
||||
/* "all" = all cases; "blam" = only בל"מ; "regular" = exclude בל"מ */
|
||||
const [blamFilter, setBlamFilter] = useState<"all" | "blam" | "regular">("all");
|
||||
|
||||
const data = useMemo(() => cases ?? [], [cases]);
|
||||
const data = useMemo(() => {
|
||||
const all = cases ?? [];
|
||||
const isBlam = (c: Case) =>
|
||||
c.proceeding_type === 'בל"מ' || isBlamSubtype(c.appeal_subtype);
|
||||
if (blamFilter === "blam") return all.filter(isBlam);
|
||||
if (blamFilter === "regular") return all.filter((c) => !isBlam(c));
|
||||
return all;
|
||||
}, [cases, blamFilter]);
|
||||
|
||||
const table = useReactTable({
|
||||
data,
|
||||
@@ -126,6 +149,20 @@ export function CasesTable({
|
||||
className="max-w-sm bg-surface"
|
||||
dir="rtl"
|
||||
/>
|
||||
<Select
|
||||
value={blamFilter}
|
||||
onValueChange={(v) => setBlamFilter(v as "all" | "blam" | "regular")}
|
||||
dir="rtl"
|
||||
>
|
||||
<SelectTrigger className="w-40 bg-surface">
|
||||
<SelectValue />
|
||||
</SelectTrigger>
|
||||
<SelectContent>
|
||||
<SelectItem value="all">כל התיקים</SelectItem>
|
||||
<SelectItem value="blam">בל"מ בלבד</SelectItem>
|
||||
<SelectItem value="regular">ערר רגיל בלבד</SelectItem>
|
||||
</SelectContent>
|
||||
</Select>
|
||||
<span className="text-sm text-ink-muted me-auto">
|
||||
{table.getFilteredRowModel().rows.length} תיקים
|
||||
</span>
|
||||
|
||||
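The three-way filter added to `CasesTable` above can be sketched outside React as a pure function. This is an illustrative sketch under assumptions: `CaseRow` is a trimmed-down stand-in for the project's `Case` type, `applyBlamFilter` is a hypothetical name, and the body of `isBlamSubtype` is a guess (a prefix check on `extension_request_`), since the real helper in `@/lib/practice-area` is not shown in this diff.

```typescript
type BlamFilter = "all" | "blam" | "regular";

// Trimmed-down stand-in for the project's Case type (assumption).
interface CaseRow {
  title: string;
  proceeding_type?: string;
  appeal_subtype?: string;
}

// Assumption: the real isBlamSubtype is not shown in the diff; this guess
// treats every "extension_request_*" subtype as an extension-request case.
function isBlamSubtype(subtype?: string): boolean {
  return !!subtype && subtype.indexOf("extension_request_") === 0;
}

// Mirrors the predicate in the diff: either signal marks the case.
function isBlam(c: CaseRow): boolean {
  return c.proceeding_type === 'בל"מ' || isBlamSubtype(c.appeal_subtype);
}

// Hypothetical pure version of the useMemo body in CasesTable.
function applyBlamFilter(cases: CaseRow[] | undefined, filter: BlamFilter): CaseRow[] {
  const all = cases ?? [];
  if (filter === "blam") return all.filter(isBlam);
  if (filter === "regular") return all.filter((c) => !isBlam(c));
  return all;
}
```

Keeping the predicate in one place means the table badge, the header badge, and the filter cannot drift apart on what counts as an extension-request case.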
@@ -269,6 +269,26 @@ function PostSaveView({
|
||||
</div>
|
||||
)}
|
||||
|
||||
{extractResult?.status === "queued" && (
|
||||
<div className="rounded-md border border-info/30 bg-info-bg px-2.5 py-2 text-[0.72rem] text-ink space-y-0.5">
|
||||
<p>
|
||||
<strong>נשלח לאנליטיקאי.</strong> ה-issue נפתח ב-Paperclip והחילוץ
|
||||
ירוץ ברקע. תראה comment בעברית עם התוצאה כשהוא יסיים — לרוב כמה
|
||||
דקות.
|
||||
</p>
|
||||
</div>
|
||||
)}
|
||||
|
||||
{extractResult?.status === "skipped" && (
|
||||
<div className="rounded-md border border-warn/40 bg-warn-bg px-2.5 py-2 text-[0.72rem] text-ink space-y-0.5">
|
||||
<p>
|
||||
<strong>לא ניתן להפעיל אוטומטית</strong> ({extractResult.reason}).
|
||||
הפעל ידנית מ-Claude Code:
|
||||
<code className="ms-1 select-all">mcp__legal-ai__extract_appraiser_facts</code>
|
||||
</p>
|
||||
</div>
|
||||
)}
|
||||
|
||||
{extractResult?.status === "no_appraisals" && (
|
||||
<p className="text-[0.72rem] text-ink-muted">
|
||||
אין בתיק מסמכים מתויגים כ-שומה.
|
||||
@@ -320,8 +340,8 @@ function PostSaveView({
|
||||
|
||||
{pending && (
|
||||
<p className="text-[0.68rem] text-ink-muted leading-tight">
|
||||
החילוץ יכול להימשך כמה דקות — שומות ארוכות עוברות ניתוח פסקה אחר
|
||||
פסקה ע"י המודל.
|
||||
שולח לאנליטיקאי דרך Paperclip — לוקח שנייה. החילוץ עצמו ירוץ אצל
|
||||
האנליטיקאי וייתן comment כשיסיים.
|
||||
</p>
|
||||
)}
|
||||
</div>
|
||||
|
||||
@@ -6,6 +6,7 @@ import { Progress } from "@/components/ui/progress";
|
||||
import {
|
||||
Dialog,
|
||||
DialogContent,
|
||||
DialogDescription,
|
||||
DialogHeader,
|
||||
DialogTitle,
|
||||
DialogFooter,
|
||||
@@ -127,6 +128,7 @@ function DocumentPreviewDialog({
|
||||
<DialogContent className="sm:max-w-2xl max-h-[80vh] flex flex-col" dir="rtl">
|
||||
<DialogHeader>
|
||||
<DialogTitle className="text-right">{displayName}</DialogTitle>
|
||||
<DialogDescription className="sr-only">תצוגה מקדימה של תוכן המסמך</DialogDescription>
|
||||
</DialogHeader>
|
||||
<div className="flex-1 overflow-hidden">
|
||||
{loading && (
|
||||
@@ -184,6 +186,7 @@ function DeleteConfirmDialog({
|
||||
<DialogContent dir="rtl">
|
||||
<DialogHeader>
|
||||
<DialogTitle className="text-right">מחיקת מסמך</DialogTitle>
|
||||
<DialogDescription className="sr-only">אישור מחיקת המסמך מהתיק</DialogDescription>
|
||||
</DialogHeader>
|
||||
<p className="text-sm text-ink-muted text-right">
|
||||
האם למחוק את המסמך <strong>“{displayName}”</strong>?
|
||||
|
||||
@@ -8,6 +8,7 @@ import { Textarea } from "@/components/ui/textarea";
|
||||
import {
|
||||
Dialog,
|
||||
DialogContent,
|
||||
DialogDescription,
|
||||
DialogHeader,
|
||||
DialogTitle,
|
||||
DialogTrigger,
|
||||
@@ -323,6 +324,7 @@ export function DraftsPanel({
|
||||
<DialogContent className="sm:max-w-sm" dir="rtl">
|
||||
<DialogHeader>
|
||||
<DialogTitle>מחיקת טיוטה</DialogTitle>
|
||||
<DialogDescription className="sr-only">אישור מחיקת קובץ הטיוטה</DialogDescription>
|
||||
</DialogHeader>
|
||||
<p className="text-sm text-ink-muted">
|
||||
למחוק את הקובץ{" "}
|
||||
@@ -493,6 +495,7 @@ function NewCaseFeedbackDialog({ caseNumber }: { caseNumber: string }) {
|
||||
<DialogContent className="sm:max-w-lg" dir="rtl">
|
||||
<DialogHeader>
|
||||
<DialogTitle>הערת יו״ר — תיק {caseNumber}</DialogTitle>
|
||||
<DialogDescription className="sr-only">הוספת הערת יו״ר על בלוק בהחלטה</DialogDescription>
|
||||
</DialogHeader>
|
||||
<form onSubmit={handleSubmit} className="space-y-4 mt-2">
|
||||
<div className="grid grid-cols-2 gap-3">
|
||||
|
||||
222
web-ui/src/components/cases/legal-arguments-panel.tsx
Normal file
@@ -0,0 +1,222 @@
|
||||
"use client";
|
||||
|
||||
import { useMemo } from "react";
|
||||
import {
|
||||
Accordion,
|
||||
AccordionContent,
|
||||
AccordionItem,
|
||||
AccordionTrigger,
|
||||
} from "@/components/ui/accordion";
|
||||
import { Badge } from "@/components/ui/badge";
|
||||
import { Button } from "@/components/ui/button";
|
||||
import { Card, CardContent } from "@/components/ui/card";
|
||||
import { Skeleton } from "@/components/ui/skeleton";
|
||||
import {
|
||||
PARTY_LABELS_HE,
|
||||
PRIORITY_LABELS_HE,
|
||||
PRIORITY_ORDER,
|
||||
useAggregateArguments,
|
||||
useLegalArguments,
|
||||
type LegalArgument,
|
||||
type LegalArgumentParty,
|
||||
type LegalArgumentPriority,
|
||||
} from "@/lib/api/legal-arguments";
|
||||
import { toast } from "sonner";
|
||||
import { Loader2, RefreshCw, Sparkles } from "lucide-react";
|
||||
|
||||
const PRIORITY_BADGE_TONE: Record<LegalArgumentPriority, string> = {
|
||||
threshold: "bg-danger-bg/60 text-danger-strong border-danger/40",
|
||||
substantive: "bg-gold-soft/50 text-navy border-gold/40",
|
||||
procedural: "bg-rule-soft text-ink border-rule",
|
||||
relief: "bg-emerald-50 text-emerald-900 border-emerald-200",
|
||||
};
|
||||
|
||||
function groupByPriority(
|
||||
args: LegalArgument[],
|
||||
): Record<LegalArgumentPriority, LegalArgument[]> {
|
||||
const out: Record<LegalArgumentPriority, LegalArgument[]> = {
|
||||
threshold: [],
|
||||
substantive: [],
|
||||
procedural: [],
|
||||
relief: [],
|
||||
};
|
||||
for (const a of args) {
|
||||
(out[a.priority] ?? out.substantive).push(a);
|
||||
}
|
||||
for (const key of PRIORITY_ORDER) {
|
||||
out[key].sort((x, y) => x.argument_index - y.argument_index);
|
||||
}
|
||||
return out;
|
||||
}
|
||||
|
||||
type PartySectionProps = {
|
||||
party: LegalArgumentParty;
|
||||
args: LegalArgument[];
|
||||
};
|
||||
|
||||
function PartySection({ party, args }: PartySectionProps) {
|
||||
const grouped = useMemo(() => groupByPriority(args), [args]);
|
||||
return (
|
||||
<div className="space-y-3">
|
||||
<div className="flex items-baseline justify-between border-b border-rule pb-2">
|
||||
<h3 className="text-navy text-base font-semibold">
|
||||
{PARTY_LABELS_HE[party] ?? party}
|
||||
</h3>
|
||||
<span className="text-ink-muted text-xs">
|
||||
{args.length} טיעונים
|
||||
</span>
|
||||
</div>
|
||||
{PRIORITY_ORDER.map((priority) => {
|
||||
const list = grouped[priority];
|
||||
if (!list?.length) return null;
|
||||
return (
|
||||
<div key={priority} className="space-y-1">
|
||||
<div className="flex items-center gap-2">
|
||||
<Badge
|
||||
variant="outline"
|
||||
className={`${PRIORITY_BADGE_TONE[priority]} text-xs`}
|
||||
>
|
||||
{PRIORITY_LABELS_HE[priority]}
|
||||
</Badge>
|
||||
<span className="text-ink-muted text-xs">
|
||||
{list.length} טיעונים
|
||||
</span>
|
||||
</div>
|
||||
<Accordion type="multiple" className="rounded-md border border-rule bg-surface">
|
||||
{list.map((arg) => (
|
||||
<AccordionItem key={arg.id} value={arg.id} className="px-3">
|
||||
<AccordionTrigger className="text-start">
|
||||
<div className="flex flex-1 flex-col items-start gap-1">
|
||||
<span className="text-navy text-sm font-medium leading-tight">
|
||||
{arg.argument_index}. {arg.argument_title}
|
||||
</span>
|
||||
{arg.legal_topic && (
|
||||
<span className="text-ink-muted text-xs">
|
||||
{arg.legal_topic}
|
||||
</span>
|
||||
)}
|
||||
</div>
|
||||
</AccordionTrigger>
|
||||
<AccordionContent>
|
||||
<div className="space-y-2 px-1">
|
||||
<p className="text-ink leading-relaxed whitespace-pre-line">
|
||||
{arg.argument_body}
|
||||
</p>
|
||||
{arg.supporting_claims.length > 0 && (
|
||||
<p className="text-ink-muted text-xs">
|
||||
מסתמך על {arg.supporting_claims.length} פרופוזיציות
|
||||
גולמיות.
|
||||
</p>
|
||||
)}
|
||||
</div>
|
||||
</AccordionContent>
|
||||
</AccordionItem>
|
||||
))}
|
||||
</Accordion>
|
||||
</div>
|
||||
);
|
||||
})}
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
type LegalArgumentsPanelProps = {
|
||||
caseNumber: string;
|
||||
};
|
||||
|
||||
export function LegalArgumentsPanel({ caseNumber }: LegalArgumentsPanelProps) {
|
||||
const { data, isPending, isError, error } = useLegalArguments(caseNumber);
|
||||
const aggregate = useAggregateArguments(caseNumber);
|
||||
|
||||
const parties = useMemo<LegalArgumentParty[]>(() => {
|
||||
if (!data?.by_party) return [];
|
||||
const order: LegalArgumentParty[] = [
|
||||
"appellant",
|
||||
"respondent",
|
||||
"committee",
|
||||
"permit_applicant",
|
||||
"unknown",
|
||||
];
|
||||
return order.filter((p) => (data.by_party[p]?.length ?? 0) > 0);
|
||||
}, [data]);
|
||||
|
||||
const handleAggregate = (force: boolean) => {
|
||||
aggregate.mutate(force, {
|
||||
onSuccess: () => {
|
||||
toast.success(
|
||||
force
|
||||
? "הופעלה חזרה חישוב טיעונים (force). יסתיים תוך דקה."
|
||||
: "הופעל חישוב טיעונים. רענן בעוד דקה.",
|
||||
);
|
||||
},
|
||||
onError: (e) => toast.error(`שגיאה: ${(e as Error).message}`),
|
||||
});
|
||||
};
|
||||
|
||||
return (
|
||||
<Card className="bg-surface border-rule shadow-sm">
|
||||
<CardContent className="px-6 py-5 space-y-4">
|
||||
<div className="flex items-center justify-between flex-wrap gap-3">
|
||||
<div>
|
||||
<h2 className="text-navy text-base font-semibold">
|
||||
טיעונים משפטיים
|
||||
</h2>
|
||||
<p className="text-ink-muted text-xs mt-0.5">
|
||||
טיעונים מאוגדים מתוך הפרופוזיציות הגולמיות, מקובצים לפי צד וקדימות.
|
||||
</p>
|
||||
</div>
|
||||
<div className="flex items-center gap-2">
|
||||
<Button
|
||||
variant="outline"
|
||||
size="sm"
|
||||
disabled={aggregate.isPending}
|
||||
onClick={() => handleAggregate(false)}
|
||||
>
|
||||
{aggregate.isPending ? (
|
||||
<Loader2 className="w-3.5 h-3.5 animate-spin me-1.5" />
|
||||
) : (
|
||||
<Sparkles className="w-3.5 h-3.5 me-1.5" />
|
||||
)}
|
||||
חשב טיעונים
|
||||
</Button>
|
||||
<Button
|
||||
variant="ghost"
|
||||
size="sm"
|
||||
disabled={aggregate.isPending || !data?.total}
|
||||
onClick={() => handleAggregate(true)}
|
||||
title="חישוב מחדש (מוחק טיעונים קיימים)"
|
||||
>
|
||||
<RefreshCw className="w-3.5 h-3.5" />
|
||||
</Button>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{isPending ? (
|
||||
<div className="space-y-2">
|
||||
<Skeleton className="h-6 w-48" />
|
||||
<Skeleton className="h-20 w-full" />
|
||||
<Skeleton className="h-20 w-full" />
|
||||
</div>
|
||||
) : isError ? (
|
||||
<p className="text-danger text-sm">
|
||||
שגיאה בטעינת טיעונים: {(error as Error).message}
|
||||
</p>
|
||||
) : !data?.total ? (
|
||||
<p className="text-ink-muted text-sm">
|
||||
אין טיעונים מאוגדים עדיין. לחץ "חשב טיעונים" כדי להריץ את ה-aggregator.
|
||||
</p>
|
||||
) : (
|
||||
<div className="space-y-6">
|
||||
{parties.map((party) => (
|
||||
<PartySection
|
||||
key={party}
|
||||
party={party}
|
||||
args={data.by_party[party] ?? []}
|
||||
/>
|
||||
))}
|
||||
</div>
|
||||
)}
|
||||
</CardContent>
|
||||
</Card>
|
||||
);
|
||||
}
|
||||
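The grouping logic in `groupByPriority` above has two details worth seeing in isolation: arguments with an unrecognised priority fall back to the `substantive` bucket, and each bucket is sorted by `argument_index`. The sketch below reproduces that behaviour with reduced types. The `Argument` interface is a stand-in for `LegalArgument`, and the contents of `PRIORITY_ORDER` are an assumption (the real constant lives in `@/lib/api/legal-arguments` and is not shown in this diff).

```typescript
type Priority = "threshold" | "substantive" | "procedural" | "relief";

// Reduced stand-in for LegalArgument (assumption: only the fields used here).
interface Argument {
  id: string;
  priority: Priority;
  argument_index: number;
}

// Assumption: the actual order is defined elsewhere in the project.
const PRIORITY_ORDER: Priority[] = ["threshold", "substantive", "procedural", "relief"];

function groupByPriority(args: Argument[]): Record<Priority, Argument[]> {
  const out: Record<Priority, Argument[]> = {
    threshold: [],
    substantive: [],
    procedural: [],
    relief: [],
  };
  for (const a of args) {
    // Data from the API may carry a priority outside the union at runtime;
    // such rows land in "substantive" instead of throwing.
    (out[a.priority] ?? out.substantive).push(a);
  }
  for (const key of PRIORITY_ORDER) {
    out[key].sort((x, y) => x.argument_index - y.argument_index);
  }
  return out;
}
```

The fallback trades strictness for resilience: a new priority value introduced on the server shows up under an existing heading rather than breaking the panel.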
@@ -0,0 +1,414 @@
|
||||
"use client";
|
||||
|
||||
import { useEffect, useState } from "react";
|
||||
import { Upload, Save, Loader2, CheckCircle2 } from "lucide-react";
|
||||
import { toast } from "sonner";
|
||||
import {
|
||||
Sheet, SheetContent, SheetHeader, SheetTitle, SheetDescription,
|
||||
} from "@/components/ui/sheet";
|
||||
import { Button } from "@/components/ui/button";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Label } from "@/components/ui/label";
|
||||
import { Textarea } from "@/components/ui/textarea";
|
||||
import { Badge } from "@/components/ui/badge";
|
||||
import { Skeleton } from "@/components/ui/skeleton";
|
||||
import {
|
||||
Select, SelectContent, SelectItem, SelectTrigger, SelectValue,
|
||||
} from "@/components/ui/select";
|
||||
import {
|
||||
useMissingPrecedent,
|
||||
useUpdateMissingPrecedent,
|
||||
useUploadMissingPrecedent,
|
||||
STATUS_LABELS,
|
||||
type MissingPrecedentPatch,
|
||||
} from "@/lib/api/missing-precedents";
|
||||
import {
|
||||
PRACTICE_AREAS, PRECEDENT_LEVELS, DISTRICTS,
|
||||
} from "@/components/precedents/practice-area";
|
||||
|
||||
type Props = {
|
||||
id: string | null;
|
||||
onOpenChange: (open: boolean) => void;
|
||||
};
|
||||
|
||||
const ACCEPT = ".pdf,.docx,.doc,.rtf,.txt,.md";
|
||||
|
||||
function isCommitteeCitation(citation: string): boolean {
|
||||
const norm = citation.trim();
|
||||
return /^(ערר[\s(]|בל"מ[\s(]|ARAR )/.test(norm);
|
||||
}
|
||||
|
||||
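As a quick check of what the citation pattern above accepts, here is a standalone sketch. The regular expression is copied from `isCommitteeCitation`; `sourceTypeFor` is a hypothetical helper that mirrors the `source_type` choice made later in `handleUpload`.

```typescript
// Copied from the diff: a citation counts as a committee decision when it
// starts with "ערר" or "בל"מ" followed by whitespace or "(", or with "ARAR ".
function isCommitteeCitation(citation: string): boolean {
  const norm = citation.trim();
  return /^(ערר[\s(]|בל"מ[\s(]|ARAR )/.test(norm);
}

// Hypothetical helper mirroring the source_type selection in handleUpload.
function sourceTypeFor(citation: string): "appeals_committee" | "court_ruling" {
  return isCommitteeCitation(citation) ? "appeals_committee" : "court_ruling";
}
```

Note that the pattern requires a character after the prefix, so a bare prefix with nothing following it is routed as a court ruling under this sketch.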
export function MissingPrecedentDetailDrawer({ id, onOpenChange }: Props) {
|
||||
const open = id !== null;
|
||||
const { data: mp, isPending } = useMissingPrecedent(id);
|
||||
const update = useUpdateMissingPrecedent();
|
||||
const upload = useUploadMissingPrecedent();
|
||||
|
||||
// The only chair-editable field on the missing-precedent is `notes` —
|
||||
// free-text. Everything else (citation, who-cited-whom, status) is set
|
||||
// when the row was detected, and updates automatically when the file
|
||||
// is uploaded. The metadata of the *uploaded* precedent (case_name,
|
||||
// chair, district, …) is auto-extracted by the LLM and lives on the
|
||||
// case_law row, not here.
|
||||
const [notes, setNotes] = useState("");
|
||||
|
||||
// Upload form fields.
|
||||
const [file, setFile] = useState<File | null>(null);
|
||||
const [decisionDate, setDecisionDate] = useState("");
|
||||
const [court, setCourt] = useState("");
|
||||
const [practiceArea, setPracticeArea] = useState<string>("");
|
||||
const [appealSubtype, setAppealSubtype] = useState("");
|
||||
const [precedentLevel, setPrecedentLevel] = useState("");
|
||||
const [chairName, setChairName] = useState("");
|
||||
const [district, setDistrict] = useState("");
|
||||
const [committeeCaseNumber, setCommitteeCaseNumber] = useState("");
|
||||
const [summary, setSummary] = useState("");
|
||||
|
||||
// Sync form from record when it loads or id changes.
|
||||
const [syncedId, setSyncedId] = useState<string | null>(null);
|
||||
if (mp && mp.id !== syncedId) {
|
||||
setSyncedId(mp.id);
|
||||
setNotes(mp.notes ?? "");
|
||||
}
|
||||
|
||||
// Reset on close. The lint rule disabled below flags the cascading render,
|
||||
// but wiping the form when the drawer closes is the intended behavior.
|
||||
useEffect(() => {
|
||||
if (open) return;
|
||||
// eslint-disable-next-line react-hooks/set-state-in-effect
|
||||
setFile(null);
|
||||
setSyncedId(null);
|
||||
setDecisionDate(""); setCourt(""); setPracticeArea("");
|
||||
setAppealSubtype(""); setPrecedentLevel(""); setChairName("");
|
||||
setDistrict(""); setCommitteeCaseNumber(""); setSummary("");
|
||||
}, [open]);
|
||||
|
||||
const handleSaveNotes = async () => {
|
||||
if (!mp) return;
|
||||
const patch: MissingPrecedentPatch = { notes };
|
||||
try {
|
||||
await update.mutateAsync({ id: mp.id, patch });
|
||||
toast.success("הערות נשמרו");
|
||||
} catch (e) {
|
||||
toast.error("שמירה נכשלה");
|
||||
console.error(e);
|
||||
}
|
||||
};
|
||||
|
||||
const isCommittee = mp ? isCommitteeCitation(mp.citation) : false;
|
||||
|
||||
const handleUpload = async (e: React.FormEvent) => {
|
||||
e.preventDefault();
|
||||
if (!mp || !file) {
|
||||
toast.error("בחר קובץ");
|
||||
return;
|
||||
}
|
||||
try {
|
||||
await upload.mutateAsync({
|
||||
id: mp.id,
|
||||
file,
|
||||
case_number: isCommittee ? committeeCaseNumber || undefined : undefined,
|
||||
chair_name: isCommittee ? chairName || undefined : undefined,
|
||||
district: isCommittee ? district || undefined : undefined,
|
||||
court: court || undefined,
|
||||
decision_date: decisionDate || undefined,
|
||||
practice_area: practiceArea || undefined,
|
||||
appeal_subtype: appealSubtype || undefined,
|
||||
precedent_level: precedentLevel || undefined,
|
||||
source_type: isCommittee ? "appeals_committee" : "court_ruling",
|
||||
summary: summary || undefined,
|
||||
});
|
||||
toast.success(
|
||||
"הקובץ הועלה. חילוץ המטא־דאטה (שם, ערכאה, תאריך, יו״ר, מחוז…) מתבצע ברקע ויסתיים בתוך כדקה.",
|
||||
);
|
||||
onOpenChange(false);
|
||||
} catch (e: unknown) {
|
||||
const msg =
|
||||
e instanceof Error
|
||||
? e.message
|
||||
: typeof e === "string"
|
||||
? e
|
||||
: "כשל העלאה";
|
||||
toast.error(msg);
|
||||
console.error(e);
|
||||
}
|
||||
};
|
||||
|
||||
return (
|
||||
<Sheet open={open} onOpenChange={onOpenChange}>
|
||||
<SheetContent
|
||||
side="left"
|
||||
className="w-full sm:max-w-2xl overflow-y-auto"
|
||||
>
|
||||
<SheetHeader className="space-y-1">
|
||||
<SheetTitle className="text-navy">
|
||||
פסיקה חסרה
|
||||
{mp ? (
|
||||
<Badge
|
||||
variant="outline"
|
||||
className="ms-2 align-middle"
|
||||
>
|
||||
{STATUS_LABELS[mp.status]}
|
||||
</Badge>
|
||||
) : null}
|
||||
</SheetTitle>
|
||||
<SheetDescription>
|
||||
פרטים מלאים והעלאת הפסיקה לקורפוס.
|
||||
</SheetDescription>
|
||||
</SheetHeader>
|
||||
|
||||
{isPending || !mp ? (
|
||||
<div className="space-y-3 px-6 py-4">
|
||||
<Skeleton className="h-4 w-3/4" />
|
||||
<Skeleton className="h-4 w-2/3" />
|
||||
<Skeleton className="h-4 w-1/2" />
|
||||
</div>
|
||||
) : (
|
||||
<div className="space-y-6 px-6 py-4">
|
||||
{/* ── Citation block (read-only) ── */}
|
||||
<section className="space-y-2">
|
||||
<div className="text-[0.78rem] text-ink-muted">מראה מקום</div>
|
||||
<div className="text-sm text-navy font-medium bg-rule-soft/40 rounded-md px-3 py-2 leading-relaxed">
|
||||
{mp.citation}
|
||||
</div>
|
||||
{mp.claim_quote ? (
|
||||
<>
|
||||
<div className="text-[0.78rem] text-ink-muted mt-3">ציטוט מכתב הטענות</div>
|
||||
<div className="text-xs text-ink bg-gold-wash/30 border-s-2 border-gold rounded-md px-3 py-2 leading-relaxed">
|
||||
{mp.claim_quote}
|
||||
</div>
|
||||
</>
|
||||
) : null}
|
||||
</section>
|
||||
|
||||
{/* ── Linked record (if closed) ── */}
|
||||
{mp.linked_case_law_id ? (
|
||||
<section className="space-y-1 bg-emerald-50 border border-emerald-200 rounded-lg p-3">
|
||||
<div className="flex items-center gap-2 text-emerald-800 font-medium text-sm">
|
||||
<CheckCircle2 className="w-4 h-4" />
|
||||
מקושר ל
|
||||
</div>
|
||||
<div className="text-sm text-emerald-900 truncate">
|
||||
{mp.linked_case_law_name || "—"}
|
||||
</div>
|
||||
<div className="text-[0.72rem] text-emerald-700 truncate">
|
||||
{mp.linked_case_law_number}
|
||||
</div>
|
||||
</section>
|
||||
) : null}
|
||||
|
||||
{/* ── Notes (only chair-editable field; everything else is
|
||||
auto-detected or auto-extracted from the file). ── */}
|
||||
<section className="space-y-2">
|
||||
<Label htmlFor="notes" className="text-sm font-semibold text-navy">
|
||||
הערות
|
||||
</Label>
|
||||
<p className="text-[0.72rem] text-ink-muted leading-relaxed">
|
||||
שדה חופשי — לדוגמה: מי מצטט (הוועדה / העורר / המשיב) ובאיזה הקשר.
|
||||
שאר השדות (שם, ערכאה, יו״ר, מחוז, תאריך, תת־סוג, תקציר) יחולצו
|
||||
אוטומטית מהקובץ בעת ההעלאה.
|
||||
</p>
|
||||
<Textarea
|
||||
id="notes"
|
||||
value={notes}
|
||||
onChange={(e) => setNotes(e.target.value)}
|
||||
rows={3}
|
||||
dir="rtl"
|
||||
/>
|
||||
<Button
|
||||
onClick={handleSaveNotes}
|
||||
disabled={update.isPending}
|
||||
variant="outline"
|
||||
size="sm"
|
||||
className="border-rule"
|
||||
>
|
||||
{update.isPending ? (
|
||||
<Loader2 className="w-4 h-4 me-1 animate-spin" />
|
||||
) : (
|
||||
<Save className="w-4 h-4 me-1" />
|
||||
)}
|
||||
שמור הערות
|
||||
</Button>
|
||||
</section>
|
||||
|
||||
{/* ── Upload section ── */}
|
||||
{!mp.linked_case_law_id ? (
|
||||
<section className="space-y-3 border-t border-rule pt-5">
|
||||
<h3 className="text-sm font-semibold text-navy">
|
||||
העלאת הפסיקה לקורפוס
|
||||
</h3>
|
||||
<div className="text-[0.78rem] text-ink-muted leading-relaxed">
|
||||
ניתוב אוטומטי לפי הציטוט:
|
||||
<strong className="text-navy">
|
||||
{isCommittee ? "החלטת ועדת ערר (internal)" : "פסק דין (library)"}
|
||||
</strong>
|
||||
<br />
|
||||
שדות נוספים (שם, ערכאה, תאריך, יו״ר, מחוז, תת־סוג) יחולצו אוטומטית
|
||||
מהקובץ ברקע.
|
||||
</div>
|
||||
|
||||
<form onSubmit={handleUpload} className="space-y-3">
|
||||
<div>
|
||||
<Label htmlFor="file">קובץ (PDF / DOCX / RTF / TXT / MD)</Label>
|
||||
<Input
|
||||
id="file"
|
||||
type="file"
|
||||
accept={ACCEPT}
|
||||
onChange={(e) => setFile(e.target.files?.[0] ?? null)}
|
||||
required
|
||||
/>
|
||||
</div>
|
||||
|
||||
<details className="group rounded-md border border-rule bg-rule-soft/30">
|
||||
<summary className="cursor-pointer select-none px-3 py-2 text-[0.78rem] text-ink-muted hover:text-navy">
|
||||
אופציונלי — דריסה ידנית של שדות שיחולצו אוטומטית
|
||||
</summary>
|
||||
<div className="space-y-3 border-t border-rule px-3 py-3">
|
||||
|
||||
<div className="grid grid-cols-2 gap-3">
|
||||
<div>
|
||||
<Label htmlFor="court">ערכאה</Label>
|
||||
<Input
|
||||
id="court"
|
||||
value={court}
|
||||
onChange={(e) => setCourt(e.target.value)}
|
||||
placeholder="בית המשפט העליון"
|
||||
dir="rtl"
|
||||
/>
|
||||
</div>
|
||||
<div>
|
||||
<Label htmlFor="decision_date">תאריך</Label>
|
||||
<Input
|
||||
id="decision_date"
|
||||
type="date"
|
||||
value={decisionDate}
|
||||
onChange={(e) => setDecisionDate(e.target.value)}
|
||||
/>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div className="grid grid-cols-2 gap-3">
|
||||
<div>
|
||||
<Label htmlFor="practice_area">תחום</Label>
|
||||
<Select value={practiceArea} onValueChange={setPracticeArea}>
|
||||
<SelectTrigger>
|
||||
<SelectValue placeholder="ללא" />
|
||||
</SelectTrigger>
|
||||
<SelectContent>
|
||||
{PRACTICE_AREAS.map((a) => (
|
||||
<SelectItem key={a.value} value={a.value}>
|
||||
{a.label}
|
||||
</SelectItem>
|
||||
))}
|
||||
</SelectContent>
|
||||
</Select>
|
||||
</div>
|
||||
<div>
|
||||
<Label htmlFor="appeal_subtype">תת־סוג</Label>
|
||||
<Input
|
||||
id="appeal_subtype"
|
||||
value={appealSubtype}
|
||||
onChange={(e) => setAppealSubtype(e.target.value)}
|
||||
placeholder="זכות עמידה"
|
||||
dir="rtl"
|
||||
/>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{isCommittee ? (
|
||||
<>
|
||||
<div className="grid grid-cols-2 gap-3">
|
||||
<div>
|
||||
<Label htmlFor="chair_name">יו״ר</Label>
|
||||
<Input
|
||||
id="chair_name"
|
||||
value={chairName}
|
||||
onChange={(e) => setChairName(e.target.value)}
|
||||
placeholder="דפנה תמיר"
|
||||
dir="rtl"
|
||||
/>
|
||||
</div>
|
||||
<div>
|
||||
<Label htmlFor="district">מחוז</Label>
|
||||
<Select value={district} onValueChange={setDistrict}>
|
||||
<SelectTrigger>
|
||||
<SelectValue placeholder="בחר" />
|
||||
</SelectTrigger>
|
||||
<SelectContent>
|
||||
{DISTRICTS.map((d) => (
|
||||
<SelectItem key={d.value} value={d.value}>
|
||||
{d.label}
|
||||
</SelectItem>
|
||||
))}
|
||||
</SelectContent>
|
||||
</Select>
|
||||
</div>
|
||||
</div>
|
||||
<div>
|
||||
<Label htmlFor="committee_case_number">
|
||||
מספר ערר (לציטוט הקטן)
|
||||
</Label>
|
||||
<Input
|
||||
id="committee_case_number"
|
||||
value={committeeCaseNumber}
|
||||
onChange={(e) => setCommitteeCaseNumber(e.target.value)}
|
||||
placeholder="ערר 1112/22 ..."
|
||||
dir="rtl"
|
||||
/>
|
||||
</div>
|
||||
</>
|
||||
) : (
|
||||
<div>
|
||||
<Label htmlFor="precedent_level">רמת תקדים</Label>
|
||||
<Select
|
||||
value={precedentLevel}
|
||||
onValueChange={setPrecedentLevel}
|
||||
>
|
||||
<SelectTrigger>
|
||||
<SelectValue placeholder="ללא" />
|
||||
</SelectTrigger>
|
||||
<SelectContent>
|
||||
{PRECEDENT_LEVELS.map((l) => (
|
||||
<SelectItem key={l.value} value={l.value}>
|
||||
{l.label}
|
||||
</SelectItem>
|
||||
))}
|
||||
</SelectContent>
|
||||
</Select>
|
||||
</div>
|
||||
)}
|
||||
|
||||
<div>
|
||||
<Label htmlFor="summary">תקציר</Label>
|
||||
<Textarea
|
||||
id="summary"
|
||||
value={summary}
|
||||
onChange={(e) => setSummary(e.target.value)}
|
||||
rows={2}
|
||||
dir="rtl"
|
||||
/>
|
||||
</div>
|
||||
</div>
|
||||
</details>
|
||||
|
||||
<Button
|
||||
type="submit"
|
||||
disabled={!file || upload.isPending}
|
||||
className="bg-navy text-parchment hover:bg-navy-soft"
|
||||
>
|
||||
{upload.isPending ? (
|
||||
<Loader2 className="w-4 h-4 me-1 animate-spin" />
|
||||
) : (
|
||||
<Upload className="w-4 h-4 me-1" />
|
||||
)}
|
||||
העלאה וסגירה
|
||||
</Button>
|
||||
</form>
|
||||
</section>
|
||||
) : null}
|
||||
</div>
|
||||
)}
|
||||
</SheetContent>
|
||||
</Sheet>
|
||||
);
|
||||
}
|
||||
@@ -0,0 +1,223 @@
|
||||
"use client";
|
||||
|
||||
import { useState } from "react";
|
||||
import { Trash2, Upload, Pencil, ExternalLink } from "lucide-react";
|
||||
import { toast } from "sonner";
|
||||
import Link from "next/link";
|
||||
import {
|
||||
Table, TableBody, TableCell, TableHead, TableHeader, TableRow,
|
||||
} from "@/components/ui/table";
|
||||
import { Button } from "@/components/ui/button";
|
||||
import { Badge } from "@/components/ui/badge";
|
||||
import { Skeleton } from "@/components/ui/skeleton";
|
||||
import {
|
||||
useMissingPrecedents,
|
||||
useDeleteMissingPrecedent,
|
||||
CITED_BY_PARTY_LABELS,
|
||||
STATUS_LABELS,
|
||||
type MissingPrecedent,
|
||||
type MissingPrecedentStatus,
|
||||
} from "@/lib/api/missing-precedents";
|
||||
import { MissingPrecedentDetailDrawer } from "./missing-precedent-detail-drawer";
|
||||
|
||||
function formatDate(iso: string | null) {
  if (!iso) return "—";
  // An unparseable string yields an Invalid Date instead of throwing,
  // so check the timestamp rather than relying on try/catch.
  const d = new Date(iso);
  return Number.isNaN(d.getTime()) ? iso : d.toLocaleDateString("he-IL");
}
|
||||
|
||||
function StatusBadge({ status }: { status: MissingPrecedentStatus }) {
|
||||
const variants: Record<MissingPrecedentStatus, string> = {
|
||||
open: "bg-gold-wash text-gold-deep border-gold/40",
|
||||
uploaded: "bg-rule-soft text-ink-muted border-rule",
|
||||
closed: "bg-emerald-50 text-emerald-800 border-emerald-300/60",
|
||||
irrelevant: "bg-rule-soft text-ink-muted border-rule line-through",
|
||||
};
|
||||
return (
|
||||
<Badge variant="outline" className={variants[status]}>
|
||||
{STATUS_LABELS[status]}
|
||||
</Badge>
|
||||
);
|
||||
}
|
||||
|
||||
function TableSkeleton({ cols }: { cols: number }) {
|
||||
return (
|
||||
<>
|
||||
{Array.from({ length: 4 }).map((_, i) => (
|
||||
<TableRow key={i} className="border-rule">
|
||||
{Array.from({ length: cols }).map((__, j) => (
|
||||
<TableCell key={j}>
|
||||
<Skeleton className="h-4 w-full" />
|
||||
</TableCell>
|
||||
))}
|
||||
</TableRow>
|
||||
))}
|
||||
</>
|
||||
);
|
||||
}
|
||||
|
||||
type Props = {
|
||||
status?: MissingPrecedentStatus | "";
|
||||
caseNumber?: string;
|
||||
legalTopic?: string;
|
||||
};
|
||||
|
||||
export function MissingPrecedentsTable({ status, caseNumber, legalTopic }: Props) {
|
||||
const [openId, setOpenId] = useState<string | null>(null);
|
||||
const { data, isPending, error } = useMissingPrecedents({
|
||||
status: status === "" ? undefined : status,
|
||||
caseNumber,
|
||||
legalTopic,
|
||||
limit: 200,
|
||||
});
|
||||
const del = useDeleteMissingPrecedent();
|
||||
|
||||
const handleDelete = async (mp: MissingPrecedent) => {
|
||||
if (!confirm(`למחוק את הרשומה? ${mp.case_name || mp.citation.slice(0, 60)}...`)) {
|
||||
return;
|
||||
}
|
||||
try {
|
||||
await del.mutateAsync(mp.id);
|
||||
toast.success("הרשומה נמחקה");
|
||||
} catch (e) {
|
||||
toast.error("מחיקה נכשלה");
|
||||
console.error(e);
|
||||
}
|
||||
};
|
||||
|
||||
if (error) {
|
||||
return (
|
||||
<div className="rounded bg-danger-bg border border-danger/40 px-6 py-4 text-danger text-center text-sm">
|
||||
{error.message}
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
return (
|
||||
<>
|
||||
<div className="rounded-lg border border-rule bg-surface shadow-sm overflow-hidden">
|
||||
<Table>
|
||||
<TableHeader className="bg-rule-soft/60">
|
||||
<TableRow className="border-rule">
|
||||
<TableHead className="text-navy text-right">פסיקה</TableHead>
|
||||
<TableHead className="text-navy text-right">נושא</TableHead>
|
||||
<TableHead className="text-navy text-right">תיק</TableHead>
|
||||
<TableHead className="text-navy text-right">צד מצטט</TableHead>
|
||||
<TableHead className="text-navy text-right">סטטוס</TableHead>
|
||||
<TableHead className="text-navy text-right">נוצר</TableHead>
|
||||
<TableHead className="text-navy" />
|
||||
</TableRow>
|
||||
</TableHeader>
|
||||
<TableBody>
|
||||
{isPending ? (
|
||||
<TableSkeleton cols={7} />
|
||||
) : !data?.items.length ? (
|
||||
<TableRow className="border-rule">
|
||||
<TableCell colSpan={7} className="text-center text-ink-muted py-8">
|
||||
אין פסיקות חסרות בקריטריונים הנוכחיים.
|
||||
</TableCell>
|
||||
</TableRow>
|
||||
) : (
|
||||
data.items.map((mp) => (
|
||||
<TableRow
|
||||
key={mp.id}
|
||||
className="border-rule hover:bg-rule-soft/30 cursor-pointer"
|
||||
onClick={() => setOpenId(mp.id)}
|
||||
>
|
||||
<TableCell className="max-w-[440px]">
|
||||
<div className="text-sm text-navy font-medium truncate">
|
||||
{mp.case_name || mp.citation.split(" ").slice(0, 6).join(" ")}
|
||||
</div>
|
||||
<div className="text-[0.72rem] text-ink-muted truncate" dir="rtl">
|
||||
{mp.citation}
|
||||
</div>
|
||||
</TableCell>
|
||||
<TableCell>
|
||||
<span className="text-sm text-ink">{mp.legal_topic || "—"}</span>
|
||||
</TableCell>
|
||||
<TableCell>
|
||||
{mp.cited_in_case_number ? (
|
||||
<Link
|
||||
href={`/cases/${encodeURIComponent(mp.cited_in_case_number)}`}
|
||||
onClick={(e) => e.stopPropagation()}
|
||||
className="text-sm text-navy hover:text-gold-deep inline-flex items-center gap-1"
|
||||
>
|
||||
{mp.cited_in_case_number}
|
||||
<ExternalLink className="w-3 h-3" />
|
||||
</Link>
|
||||
) : (
|
||||
<span className="text-ink-muted text-sm">—</span>
|
||||
)}
|
||||
</TableCell>
|
||||
<TableCell className="text-sm text-ink">
|
||||
{mp.cited_by_party
|
||||
? CITED_BY_PARTY_LABELS[mp.cited_by_party]
|
||||
: "—"}
|
||||
{mp.cited_by_party_name ? (
|
||||
<div className="text-[0.7rem] text-ink-muted truncate max-w-[160px]">
|
||||
{mp.cited_by_party_name}
|
||||
</div>
|
||||
) : null}
|
||||
</TableCell>
|
||||
<TableCell>
|
||||
<StatusBadge status={mp.status} />
|
||||
{mp.linked_case_law_number ? (
|
||||
<div className="text-[0.7rem] text-emerald-700 mt-1">
|
||||
↳ {mp.linked_case_law_name || mp.linked_case_law_number}
|
||||
</div>
|
||||
) : null}
|
||||
</TableCell>
|
||||
<TableCell className="text-[0.78rem] text-ink-muted">
|
||||
{formatDate(mp.created_at)}
|
||||
</TableCell>
|
||||
<TableCell className="text-end">
|
||||
<div className="flex items-center justify-end gap-1">
|
||||
<Button
|
||||
variant="ghost"
|
||||
size="sm"
|
||||
onClick={(e) => {
|
||||
e.stopPropagation();
|
||||
setOpenId(mp.id);
|
||||
}}
|
||||
title={mp.status === "open" ? "העלאה" : "פרטים"}
|
||||
>
|
||||
{mp.status === "open" ? (
|
||||
<Upload className="w-4 h-4" />
|
||||
) : (
|
||||
<Pencil className="w-4 h-4" />
|
||||
)}
|
||||
</Button>
|
||||
<Button
|
||||
variant="ghost"
|
||||
size="sm"
|
||||
onClick={(e) => {
|
||||
e.stopPropagation();
|
||||
handleDelete(mp);
|
||||
}}
|
||||
disabled={del.isPending}
|
||||
className="text-danger hover:text-danger"
|
||||
title="מחיקה"
|
||||
>
|
||||
<Trash2 className="w-4 h-4" />
|
||||
</Button>
|
||||
</div>
|
||||
</TableCell>
|
||||
</TableRow>
|
||||
))
|
||||
)}
|
||||
</TableBody>
|
||||
</Table>
|
||||
</div>
|
||||
|
||||
<MissingPrecedentDetailDrawer
|
||||
id={openId}
|
||||
onOpenChange={(open) => {
|
||||
if (!open) setOpenId(null);
|
||||
}}
|
||||
/>
|
||||
</>
|
||||
);
|
||||
}
|
||||
116
web-ui/src/components/precedents/formatted-citation.tsx
Normal file
@@ -0,0 +1,116 @@
|
||||
"use client";
|
||||
|
||||
// Rendering helpers for the formal Israeli citation ("כללי הציטוט האחיד").
|
||||
//
|
||||
// Backend stores the citation as Markdown: parties' names wrapped in
|
||||
// **double asterisks**, everything else regular. These helpers:
|
||||
// 1. Render the citation with <strong> for the bold ranges.
|
||||
// 2. Copy it to the clipboard as BOTH text/html (so Word/Docs paste
|
||||
// with bold preserved) and text/plain (which keeps the markers
|
||||
// so the markdown survives a plain-text paste).
|
||||
|
||||
import { useState } from "react";
|
||||
import { Check, Copy } from "lucide-react";
|
||||
import { toast } from "sonner";
|
||||
|
||||
function parseSegments(md: string): Array<{ bold: boolean; text: string }> {
|
||||
const out: Array<{ bold: boolean; text: string }> = [];
|
||||
const re = /\*\*([^*]+)\*\*/g;
|
||||
let last = 0;
|
||||
let m: RegExpExecArray | null;
|
||||
while ((m = re.exec(md)) !== null) {
|
||||
if (m.index > last) out.push({ bold: false, text: md.slice(last, m.index) });
|
||||
out.push({ bold: true, text: m[1] });
|
||||
last = re.lastIndex;
|
||||
}
|
||||
if (last < md.length) out.push({ bold: false, text: md.slice(last) });
|
||||
return out;
|
||||
}
|
||||
|
||||
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}
|
||||
|
||||
export function FormattedCitation({
|
||||
citation,
|
||||
className,
|
||||
}: {
|
||||
citation: string;
|
||||
className?: string;
|
||||
}) {
|
||||
const segments = parseSegments(citation);
|
||||
return (
|
||||
<span className={className} dir="rtl">
|
||||
{segments.map((s, i) =>
|
||||
s.bold ? (
|
||||
<strong key={i} className="font-semibold">
|
||||
{s.text}
|
||||
</strong>
|
||||
) : (
|
||||
<span key={i}>{s.text}</span>
|
||||
),
|
||||
)}
|
||||
</span>
|
||||
);
|
||||
}
|
||||
|
||||
export function CitationCopyButton({
|
||||
citation,
|
||||
size = "sm",
|
||||
}: {
|
||||
citation: string;
|
||||
size?: "sm" | "xs";
|
||||
}) {
|
||||
const [copied, setCopied] = useState(false);
|
||||
|
||||
async function handleCopy() {
|
||||
const segments = parseSegments(citation);
|
||||
const html = segments
|
||||
.map((s) =>
|
||||
s.bold
|
||||
? `<strong>${escapeHtml(s.text)}</strong>`
|
||||
: escapeHtml(s.text),
|
||||
)
|
||||
.join("");
|
||||
const wrappedHtml = `<span dir="rtl">${html}</span>`;
|
||||
try {
|
||||
const cb = navigator.clipboard;
|
||||
if (typeof ClipboardItem !== "undefined" && cb && "write" in cb) {
|
||||
const item = new ClipboardItem({
|
||||
"text/html": new Blob([wrappedHtml], { type: "text/html" }),
|
||||
"text/plain": new Blob([citation], { type: "text/plain" }),
|
||||
});
|
||||
await cb.write([item]);
|
||||
} else {
|
||||
await cb.writeText(citation);
|
||||
}
|
||||
setCopied(true);
|
||||
toast.success("המראה מקום הועתק (עם הדגשה לצדדים)");
|
||||
window.setTimeout(() => setCopied(false), 1800);
|
||||
} catch (err) {
|
||||
console.error("citation copy failed", err);
|
||||
toast.error("העתקה נכשלה");
|
||||
}
|
||||
}
|
||||
|
||||
const dims = size === "xs" ? "h-7 w-7" : "h-8 w-8";
|
||||
return (
|
||||
<button
|
||||
type="button"
|
||||
onClick={handleCopy}
|
||||
title={copied ? "הועתק" : "העתק לפי כללי הציטוט (עם הדגשה לצדדים)"}
|
||||
aria-label={copied ? "הועתק" : "העתק מראה מקום"}
|
||||
className={`inline-flex items-center justify-center rounded-md border border-rule bg-surface hover:bg-rule-soft/50 text-ink-muted hover:text-navy ${dims}`}
|
||||
>
|
||||
{copied ? (
|
||||
<Check className="w-3.5 h-3.5 text-emerald-600" />
|
||||
) : (
|
||||
<Copy className="w-3.5 h-3.5" />
|
||||
)}
|
||||
</button>
|
||||
);
|
||||
}
|
||||
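The two helpers in `formatted-citation.tsx` can be exercised without React. The following is a minimal sketch, assuming the same regex and escaping rules as the file above; `toClipboardHtml` is a hypothetical name introduced here for illustration (the diff builds this string inline inside `handleCopy`), and the sample citation is made up.

```typescript
type Segment = { bold: boolean; text: string };

// Split a Markdown citation into bold / regular runs (same regex as the diff).
function parseSegments(md: string): Segment[] {
  const out: Segment[] = [];
  const re = /\*\*([^*]+)\*\*/g;
  let last = 0;
  let m: RegExpExecArray | null;
  while ((m = re.exec(md)) !== null) {
    if (m.index > last) out.push({ bold: false, text: md.slice(last, m.index) });
    out.push({ bold: true, text: m[1] });
    last = re.lastIndex;
  }
  if (last < md.length) out.push({ bold: false, text: md.slice(last) });
  return out;
}

// Escape the characters that matter inside HTML text and attribute values.
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: builds the text/html clipboard payload.
function toClipboardHtml(citation: string): string {
  const html = parseSegments(citation)
    .map((s) => (s.bold ? `<strong>${escapeHtml(s.text)}</strong>` : escapeHtml(s.text)))
    .join("");
  return `<span dir="rtl">${html}</span>`;
}

const sample = "CA 1/20 **A & B v. C** (2021)";
console.log(parseSegments(sample));
console.log(toClipboardHtml(sample));
```

Escaping each segment before wrapping it in `<strong>` is what keeps a party name containing `&` or `<` from corrupting the pasted HTML; the plain-text flavor keeps the raw `**` markers untouched.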
@@ -25,6 +25,10 @@ import {
|
||||
import { PRACTICE_AREAS, practiceAreaShort } from "./practice-area";
|
||||
import { PrecedentUploadSheet } from "./precedent-upload-sheet";
|
||||
import { PrecedentEditSheet } from "./precedent-edit-sheet";
|
||||
import {
|
||||
FormattedCitation,
|
||||
CitationCopyButton,
|
||||
} from "./formatted-citation";
|
||||
|
||||
function formatDate(iso: string | null) {
|
||||
if (!iso) return "—";
|
||||
@@ -152,11 +156,28 @@ function CourtRow({ p, onEdit }: { p: Precedent; onEdit: (id: string) => void })
|
||||
className="font-semibold text-navy text-right whitespace-normal break-words min-w-[280px] max-w-[420px] py-3"
|
||||
dir="rtl"
|
||||
>
|
||||
<Link href={`/precedents/${p.id}`} className="hover:underline hover:text-gold-deep" dir="auto">
|
||||
{cleanCitation(p.case_number)}
|
||||
</Link>
|
||||
<div className="flex items-start justify-between gap-2">
|
||||
<Link
|
||||
href={`/precedents/${p.id}`}
|
||||
className="hover:underline hover:text-gold-deep block min-w-0"
|
||||
dir="auto"
|
||||
>
|
||||
{p.citation_formatted ? (
|
||||
<FormattedCitation
|
||||
citation={p.citation_formatted}
|
||||
className="block leading-snug"
|
||||
/>
|
||||
) : (
|
||||
cleanCitation(p.case_number)
|
||||
)}
|
||||
</Link>
|
||||
{p.citation_formatted ? (
|
||||
<CitationCopyButton citation={p.citation_formatted} size="xs" />
|
||||
) : null}
|
||||
</div>
|
||||
</TableCell>
|
||||
<TableCell className="text-ink whitespace-normal break-words max-w-[260px] py-3">
|
||||
{/* Column "שם / ערכאה" hidden by request (case_name often equals case_number prefix). Keep field in DB; restore by un-hiding. */}
|
||||
<TableCell className="hidden text-ink whitespace-normal break-words max-w-[260px] py-3">
|
||||
<div className="font-medium">{cleanCitation(p.case_name)}</div>
|
||||
{p.court ? <div className="text-[0.72rem] text-ink-muted">{p.court}</div> : null}
|
||||
</TableCell>
|
||||
@@ -236,11 +257,28 @@ function CommitteeRow({ p, onEdit }: { p: Precedent; onEdit: (id: string) => voi
|
||||
className="font-semibold text-navy text-right whitespace-normal break-words min-w-[200px] max-w-[320px] py-3"
|
||||
dir="rtl"
|
||||
>
|
||||
<Link href={`/precedents/${p.id}`} className="hover:underline hover:text-gold-deep" dir="auto">
|
||||
{cleanCitation(p.case_number)}
|
||||
</Link>
|
||||
<div className="flex items-start justify-between gap-2">
|
||||
<Link
|
||||
href={`/precedents/${p.id}`}
|
||||
className="hover:underline hover:text-gold-deep block min-w-0"
|
||||
dir="auto"
|
||||
>
|
||||
{p.citation_formatted ? (
|
||||
<FormattedCitation
|
||||
citation={p.citation_formatted}
|
||||
className="block leading-snug"
|
||||
/>
|
||||
) : (
|
||||
cleanCitation(p.case_number)
|
||||
)}
|
||||
</Link>
|
||||
{p.citation_formatted ? (
|
||||
<CitationCopyButton citation={p.citation_formatted} size="xs" />
|
||||
) : null}
|
||||
</div>
|
||||
</TableCell>
|
||||
<TableCell className="text-ink whitespace-normal break-words max-w-[220px] py-3">
|
||||
{/* Column "שם" hidden by request (case_name often equals case_number prefix). Keep field in DB; restore by un-hiding. */}
|
||||
<TableCell className="hidden text-ink whitespace-normal break-words max-w-[220px] py-3">
|
||||
<div className="font-medium">{cleanCitation(p.case_name)}</div>
|
||||
</TableCell>
|
||||
<TableCell className="text-ink-muted text-[0.78rem]">
|
||||
@@ -367,7 +405,8 @@ export function LibraryListPanel() {
|
||||
<TableHeader className="bg-rule-soft/60">
|
||||
<TableRow className="border-rule">
|
||||
<TableHead className="text-navy text-right">מס׳ / מראה מקום</TableHead>
|
||||
<TableHead className="text-navy text-right">שם / ערכאה</TableHead>
|
||||
{/* "שם / ערכאה" hidden by request — see CourtRow */}
|
||||
<TableHead className="hidden text-navy text-right">שם / ערכאה</TableHead>
|
||||
<TableHead className="text-navy text-right">תאריך</TableHead>
|
||||
<TableHead className="text-navy text-right">תחום</TableHead>
|
||||
<TableHead className="text-navy text-right">רמה</TableHead>
|
||||
@@ -415,7 +454,8 @@ export function LibraryListPanel() {
|
||||
<TableHeader className="bg-rule-soft/60">
|
||||
<TableRow className="border-rule">
|
||||
<TableHead className="text-navy text-right">מספר ערר</TableHead>
|
||||
<TableHead className="text-navy text-right">שם</TableHead>
|
||||
{/* "שם" hidden by request — see CommitteeRow */}
|
||||
<TableHead className="hidden text-navy text-right">שם</TableHead>
|
||||
<TableHead className="text-navy text-right">מחוז</TableHead>
|
||||
<TableHead className="text-navy text-right">יו״ר</TableHead>
|
||||
<TableHead className="text-navy text-right">תאריך</TableHead>
|
||||
|
||||
@@ -6,6 +6,7 @@ import { toast } from "sonner";
|
||||
import {
|
||||
Dialog,
|
||||
DialogContent,
|
||||
DialogDescription,
|
||||
DialogHeader,
|
||||
DialogTitle,
|
||||
} from "@/components/ui/dialog";
|
||||
@@ -68,6 +69,7 @@ function LinkDialog({ caseId, currentRelated, open, onOpenChange }: DialogProps)
|
||||
<DialogContent className="max-w-lg" dir="rtl">
|
||||
<DialogHeader>
|
||||
<DialogTitle className="text-navy">קשר החלטה קשורה</DialogTitle>
|
||||
<DialogDescription className="sr-only">חיפוש וקישור תקדים או החלטה קשורה לתיק</DialogDescription>
|
||||
</DialogHeader>
|
||||
|
||||
<div className="space-y-3">
|
||||
|
||||
@@ -24,6 +24,21 @@ export const SOURCE_TYPES = [
|
||||
{ value: "appeals_committee", label: "החלטת ועדת ערר" },
|
||||
] as const;
|
||||
|
||||
/**
|
||||
* Districts for ועדות ערר. The chair's committee is ירושלים; the rest
|
||||
* are listed so that uploaded precedents from peer committees can be
|
||||
* filed correctly. Order matches what's displayed in the UI dropdown.
|
||||
*/
|
||||
export const DISTRICTS = [
|
||||
{ value: "ירושלים", label: "ירושלים" },
|
||||
{ value: "מרכז", label: "מרכז" },
|
||||
{ value: "תל אביב", label: "תל אביב" },
|
||||
{ value: "צפון", label: "צפון" },
|
||||
{ value: "דרום", label: "דרום" },
|
||||
{ value: "חיפה", label: "חיפה" },
|
||||
{ value: "ארצי", label: "ארצי" },
|
||||
] as const;
|
||||
|
||||
export function practiceAreaLabel(value: string | null | undefined): string {
|
||||
if (!value) return "—";
|
||||
const match = PRACTICE_AREAS.find((p) => p.value === value);
|
||||
|
||||
@@ -4,8 +4,8 @@ import { useState } from "react";
|
||||
import { Save, Sparkles } from "lucide-react";
|
||||
import { toast } from "sonner";
|
||||
import {
|
||||
Sheet, SheetContent, SheetHeader, SheetTitle, SheetDescription,
|
||||
} from "@/components/ui/sheet";
|
||||
Dialog, DialogContent, DialogHeader, DialogTitle, DialogDescription,
|
||||
} from "@/components/ui/dialog";
|
||||
import { Button } from "@/components/ui/button";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Label } from "@/components/ui/label";
|
||||
@@ -22,7 +22,7 @@ import {
|
||||
type SourceType,
|
||||
} from "@/lib/api/precedent-library";
|
||||
import {
|
||||
PRACTICE_AREAS, PRECEDENT_LEVELS, SOURCE_TYPES, appealSubtypeLabel,
|
||||
PRACTICE_AREAS, PRECEDENT_LEVELS, SOURCE_TYPES, DISTRICTS, appealSubtypeLabel,
|
||||
} from "./practice-area";
|
||||
import { ExtractedHalachotSection } from "./extracted-halachot";
|
||||
|
||||
@@ -36,8 +36,11 @@ type Props = {
|
||||
* happened in the background. */
|
||||
type FormState = {
|
||||
citation: string;
|
||||
citation_formatted: string;
|
||||
case_name: string;
|
||||
court: string;
|
||||
district: string;
|
||||
chair_name: string;
|
||||
decision_date: string;
|
||||
practice_area: PracticeArea;
|
||||
appeal_subtype: string;
|
||||
@@ -51,8 +54,9 @@ type FormState = {
|
||||
};
|
||||
|
||||
const EMPTY: FormState = {
|
||||
citation: "", case_name: "", court: "", decision_date: "",
|
||||
practice_area: "", appeal_subtype: "", source_type: "",
|
||||
citation: "", citation_formatted: "",
|
||||
case_name: "", court: "", district: "", chair_name: "",
|
||||
decision_date: "", practice_area: "", appeal_subtype: "", source_type: "",
|
||||
precedent_level: "", is_binding: true, subject_tags: "",
|
||||
summary: "", headnote: "", key_quote: "",
|
||||
};
|
||||
@@ -73,8 +77,11 @@ export function PrecedentEditSheet({ caseLawId, onOpenChange }: Props) {
|
||||
setSyncedRecordId(record.id as string);
|
||||
setForm({
|
||||
citation: record.case_number || "",
|
||||
citation_formatted: record.citation_formatted || "",
|
||||
case_name: record.case_name || "",
|
||||
court: record.court || "",
|
||||
district: record.district || "",
|
||||
chair_name: record.chair_name || "",
|
||||
decision_date: record.date ? record.date.slice(0, 10) : "",
|
||||
practice_area: (record.practice_area || "") as PracticeArea,
|
||||
appeal_subtype: appealSubtypeLabel(record.appeal_subtype),
|
||||
@@ -93,8 +100,11 @@ export function PrecedentEditSheet({ caseLawId, onOpenChange }: Props) {
|
||||
if (!caseLawId) return;
|
||||
try {
|
||||
const patch: Record<string, unknown> = {
|
||||
citation_formatted: form.citation_formatted.trim(),
|
||||
case_name: form.case_name.trim(),
|
||||
court: form.court.trim(),
|
||||
district: form.district.trim(),
|
||||
chair_name: form.chair_name.trim(),
|
||||
practice_area: form.practice_area || undefined,
|
||||
appeal_subtype: form.appeal_subtype.trim(),
|
||||
source_type: form.source_type || undefined,
|
||||
@@ -130,17 +140,20 @@ export function PrecedentEditSheet({ caseLawId, onOpenChange }: Props) {
|
||||
};
|
||||
|
||||
return (
|
||||
<Sheet open={open} onOpenChange={(o) => { if (!o) onOpenChange(false); }}>
|
||||
<SheetContent side="left" className="w-full sm:max-w-2xl overflow-y-auto" dir="rtl">
|
||||
<SheetHeader>
|
||||
<SheetTitle className="text-navy">עריכת פרטי פסיקה</SheetTitle>
|
||||
<SheetDescription className="text-ink-muted">
|
||||
כל השדות ניתנים לעריכה חוץ ממראה המקום (מזהה ייחודי).
|
||||
<Dialog open={open} onOpenChange={(o) => { if (!o) onOpenChange(false); }}>
|
||||
<DialogContent
|
||||
className="sm:max-w-4xl max-h-[90vh] overflow-y-auto p-0"
|
||||
dir="rtl"
|
||||
>
|
||||
<DialogHeader className="px-6 pt-6">
|
||||
<DialogTitle className="text-navy">עריכת פרטי פסיקה</DialogTitle>
|
||||
<DialogDescription className="text-ink-muted">
|
||||
כל השדות ניתנים לעריכה חוץ ממספר התיק (מזהה ייחודי במערכת).
|
||||
כפתור "חלץ מטא-דאטה" שולח בקשה לתור מקומי שאני מרוקן
|
||||
מ-Claude Code (ה-LLM רץ מקומית עם <code>claude session</code>,
|
||||
לא ב-API).
|
||||
</SheetDescription>
|
||||
</SheetHeader>
|
||||
</DialogDescription>
|
||||
</DialogHeader>
|
||||
|
||||
{isPending || !record ? (
|
||||
<div className="px-6 pb-6 mt-4 space-y-3">
|
||||
@@ -151,7 +164,7 @@ export function PrecedentEditSheet({ caseLawId, onOpenChange }: Props) {
|
||||
<form onSubmit={onSubmit} className="px-6 pb-6 space-y-4 mt-4">
|
||||
<div className="rounded-lg border border-rule bg-rule-soft/40 p-3 flex items-start gap-3">
|
||||
<div className="flex-1 min-w-0">
|
||||
<div className="text-[0.78rem] text-ink-muted">מראה מקום (לא ניתן לעריכה)</div>
|
||||
<div className="text-[0.78rem] text-ink-muted">מספר תיק (מזהה ייחודי — לא ניתן לעריכה)</div>
|
||||
<div className="text-navy font-mono text-sm break-all" dir="ltr">
|
||||
{record.case_number}
|
||||
</div>
|
||||
@@ -168,6 +181,26 @@ export function PrecedentEditSheet({ caseLawId, onOpenChange }: Props) {
|
||||
</Button>
|
||||
</div>
|
||||
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="citation-formatted">
|
||||
מראה מקום (כללי הציטוט האחיד)
|
||||
</Label>
|
||||
<Textarea
|
||||
id="citation-formatted"
|
||||
value={form.citation_formatted}
|
||||
onChange={(e) =>
|
||||
setForm({ ...form, citation_formatted: e.target.value })
|
||||
}
|
||||
rows={3}
|
||||
dir="rtl"
|
||||
className="font-mono text-sm"
|
||||
placeholder="ערר (ועדות ערר ...) 1234/24 **עורר נ' הוועדה המקומית** (נבו 1.2.2025)"
|
||||
/>
|
||||
<p className="text-[0.7rem] text-ink-muted">
|
||||
הקף את שמות הצדדים בכפול-כוכבית <code className="font-mono">**שם**</code> להדגשה. שדה זה משמש את כפתור ההעתקה בעמוד הפסיקה.
|
||||
</p>
|
||||
</div>
|
||||
|
||||
<div className="grid grid-cols-2 gap-3">
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="case-name">שם קצר</Label>
|
||||
@@ -180,6 +213,30 @@ export function PrecedentEditSheet({ caseLawId, onOpenChange }: Props) {
|
||||
<Input id="court" value={form.court}
|
||||
onChange={(e) => setForm({ ...form, court: e.target.value })} />
|
||||
</div>
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="district">מחוז</Label>
|
||||
<Select value={form.district || "_none"}
|
||||
onValueChange={(v) => setForm({ ...form, district: v === "_none" ? "" : v })}>
|
||||
<SelectTrigger id="district"><SelectValue placeholder="—" /></SelectTrigger>
|
||||
<SelectContent>
|
||||
<SelectItem value="_none">—</SelectItem>
|
||||
{DISTRICTS.map((d) => (
|
||||
<SelectItem key={d.value} value={d.value}>{d.label}</SelectItem>
|
||||
))}
|
||||
{/* Preserve legacy free-text values that don't match any
|
||||
known district (e.g. older imports with typos). */}
|
||||
{form.district && !DISTRICTS.some((d) => d.value === form.district) && (
|
||||
<SelectItem value={form.district}>{form.district}</SelectItem>
|
||||
)}
|
||||
</SelectContent>
|
||||
</Select>
|
||||
</div>
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="chair-name">יו"ר</Label>
|
||||
<Input id="chair-name" value={form.chair_name}
|
||||
onChange={(e) => setForm({ ...form, chair_name: e.target.value })}
|
||||
placeholder="עו״ד דפנה תמיר" />
|
||||
</div>
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="date">תאריך</Label>
|
||||
<Input id="date" type="date" value={form.decision_date}
|
||||
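The district `<Select>` in this hunk combines a sentinel value for "no district" with a fallback item for legacy free-text values. A minimal sketch of that logic as pure functions follows. `districtOptions`, `toSelectValue`, and `fromSelectValue` are hypothetical names, the `"(none)"` label is a placeholder chosen here, and the district list is copied from `practice-area.ts` in this diff.

```typescript
const DISTRICTS = [
  { value: "ירושלים", label: "ירושלים" },
  { value: "מרכז", label: "מרכז" },
  { value: "תל אביב", label: "תל אביב" },
  { value: "צפון", label: "צפון" },
  { value: "דרום", label: "דרום" },
  { value: "חיפה", label: "חיפה" },
  { value: "ארצי", label: "ארצי" },
] as const;

type Option = { value: string; label: string };

// Sentinel used because the select cannot represent an empty string as a value
// (assumption mirrored from the "_none" item in the diff).
const NONE = "_none";

// Build the option list; keep a stored value that matches no known district
// so that opening and saving the form does not silently drop it.
function districtOptions(current: string): Option[] {
  const opts: Option[] = [{ value: NONE, label: "(none)" }, ...DISTRICTS];
  if (current && !DISTRICTS.some((d) => d.value === current)) {
    opts.push({ value: current, label: current });
  }
  return opts;
}

// Map between the stored form value and the select's value.
function toSelectValue(stored: string): string {
  return stored || NONE;
}

function fromSelectValue(selected: string): string {
  return selected === NONE ? "" : selected;
}

console.log(districtOptions("חיפה").length);
console.log(districtOptions("ירושלם").map((o) => o.value));
```

Appending the unknown value as its own option is a conservative choice: the record keeps displaying what is stored until someone deliberately picks a listed district.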
@@ -287,7 +344,7 @@ export function PrecedentEditSheet({ caseLawId, onOpenChange }: Props) {
|
||||
</div>
|
||||
</>
|
||||
)}
|
||||
</SheetContent>
|
||||
</Sheet>
|
||||
</DialogContent>
|
||||
</Dialog>
|
||||
);
|
||||
}
|
||||
|
||||
@@ -16,8 +16,9 @@ import {
|
||||
} from "@/components/ui/select";
|
||||
import { Progress } from "@/components/ui/progress";
|
||||
import {
|
||||
useUploadPrecedent, libraryKeys,
|
||||
type PracticeArea, type SourceType,
|
||||
useUploadPrecedent, useUploadInternalDecision, libraryKeys,
|
||||
isCommitteeCitation, COMMITTEE_DISTRICTS,
|
||||
type PracticeArea, type SourceType, type CommitteeDistrict,
|
||||
} from "@/lib/api/precedent-library";
|
||||
import { useProgress } from "@/lib/api/documents";
|
||||
import {
|
||||
@@ -45,8 +46,15 @@ export function PrecedentUploadSheet({ open, onOpenChange }: Props) {
|
||||
const [headnote, setHeadnote] = useState("");
|
||||
const [isBinding, setIsBinding] = useState(true);
|
||||
|
||||
// Appeals-committee decisions go to /api/internal-decisions/upload and
|
||||
// require chair_name + district. Routing is by citation prefix.
|
||||
const [chairName, setChairName] = useState("");
|
||||
const [district, setDistrict] = useState<CommitteeDistrict | "">("");
|
||||
const isCommittee = isCommitteeCitation(citation);
|
||||
|
||||
const [taskId, setTaskId] = useState<string | null>(null);
|
||||
const upload = useUploadPrecedent();
|
||||
const uploadInternal = useUploadInternalDecision();
|
||||
const progress = useProgress(taskId);
|
||||
const qc = useQueryClient();
|
||||
|
||||
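The comment in this hunk says routing is decided by the citation prefix. The real `isCommitteeCitation` lives in `@/lib/api/precedent-library` and is not shown in this diff, so the following is only a sketch under the assumption that it follows the routing note in §8 of the agent documentation (citations starting with "ערר" or "בל"מ" go to the internal-decision upload, everything else to the precedent library). `looksLikeCommitteeCitation` and `pickUploadTarget` are hypothetical names.

```typescript
// Hypothetical stand-in for isCommitteeCitation; prefixes taken from the
// routing flowchart in the agent docs, including the gershayim spelling.
const COMMITTEE_PREFIXES = ["ערר", 'בל"מ', "בל״מ"];

function looksLikeCommitteeCitation(citation: string): boolean {
  const trimmed = citation.trim();
  // Require a space after the prefix so a longer word is not matched by accident.
  return COMMITTEE_PREFIXES.some((p) => trimmed.startsWith(p + " "));
}

type UploadTarget = "internal_decision_upload" | "precedent_library_upload";

function pickUploadTarget(citation: string): UploadTarget {
  return looksLikeCommitteeCitation(citation)
    ? "internal_decision_upload"
    : "precedent_library_upload";
}

console.log(pickUploadTarget("ערר 1112/22"));
console.log(pickUploadTarget('עע"מ 1234/20'));
```

Deriving the target from the citation alone means the form never has to ask which endpoint to use; it only needs to reveal the extra `chair_name` and `district` fields when the committee branch is taken.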
@@ -63,6 +71,8 @@ export function PrecedentUploadSheet({ open, onOpenChange }: Props) {
     setPracticeArea(""); setAppealSubtype(""); setSubjectTags("");
     // eslint-disable-next-line react-hooks/set-state-in-effect
     setHeadnote(""); setIsBinding(true); setTaskId(null);
+    // eslint-disable-next-line react-hooks/set-state-in-effect
+    setChairName(""); setDistrict("");
   }, [open]);
 
   // Auto-close on completion + refresh library list/stats so the new
@@ -93,15 +103,39 @@ export function PrecedentUploadSheet({ open, onOpenChange }: Props) {
       toast.error("מראה המקום (citation) חובה");
       return;
     }
-    if (!practiceArea) {
-      toast.error("בחר תחום משפט");
-      return;
-    }
+    const tags = subjectTags
+      .split(",")
+      .map((t) => t.trim())
+      .filter(Boolean);
+
     try {
-      const tags = subjectTags
-        .split(",")
-        .map((t) => t.trim())
-        .filter(Boolean);
+      if (isCommittee) {
+        if (!chairName.trim()) {
+          toast.error("שם יו\"ר חובה להחלטת ועדת ערר");
+          return;
+        }
+        if (!district) {
+          toast.error("מחוז חובה להחלטת ועדת ערר");
+          return;
+        }
+        const res = await uploadInternal.mutateAsync({
+          file,
+          case_number: citation.trim(),
+          chair_name: chairName.trim(),
+          district,
+          case_name: caseName.trim(),
+          court: court.trim(),
+          decision_date: decisionDate || undefined,
+          practice_area: practiceArea,
+          appeal_subtype: appealSubtype.trim(),
+          subject_tags: tags,
+          is_binding: isBinding,
+          summary: headnote.trim(),
+        });
+        setTaskId(res.task_id);
+        return;
+      }
+
       const res = await upload.mutateAsync({
         file,
         citation: citation.trim(),
@@ -123,6 +157,7 @@ export function PrecedentUploadSheet({ open, onOpenChange }: Props) {
   };
 
   const isProcessing = taskId !== null && progress?.status !== "completed" && progress?.status !== "failed";
+  const isSubmitting = upload.isPending || uploadInternal.isPending;
   const stage = (progress as { stage?: string; percent?: number; step?: string } | null)?.stage;
   const percent = (progress as { percent?: number } | null)?.percent ?? 0;
 
@@ -132,8 +167,9 @@ export function PrecedentUploadSheet({ open, onOpenChange }: Props) {
         <SheetHeader>
           <SheetTitle className="text-navy">העלאת פסיקה לקורפוס הסמכותי</SheetTitle>
           <SheetDescription className="text-ink-muted">
-            הקובץ יעבור חילוץ טקסט, יצירת embeddings, וחילוץ הלכות אוטומטי.
-            ההלכות יחכו לאישורך לפני שהן זמינות לסוכני הכתיבה.
+            הקובץ יעבור חילוץ טקסט, embeddings, וחילוץ אוטומטי של מטא־דאטה
+            (שם, ערכאה, תאריך, תחום, תת־סוג, תגיות) והלכות. ההלכות ימתינו
+            לאישורך לפני שיהיו זמינות לסוכני הכתיבה.
           </SheetDescription>
         </SheetHeader>
 
@@ -157,106 +193,150 @@ export function PrecedentUploadSheet({ open, onOpenChange }: Props) {
|
||||
placeholder={`עע"מ 3975/22 ב. קרן-נכסים נ' ועדה מקומית`}
|
||||
disabled={isProcessing} dir="rtl"
|
||||
/>
|
||||
{isCommittee && (
|
||||
<p className="text-[0.72rem] text-ink-muted">
|
||||
זוהתה כהחלטת ועדת ערר — נדרשים שם יו"ר ומחוז (למטה).
|
||||
</p>
|
||||
)}
|
||||
</div>
|
||||
|
||||
{/* Two-col grid */}
|
||||
<div className="grid grid-cols-2 gap-3">
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="case-name">שם קצר</Label>
|
||||
<Input id="case-name" value={caseName}
|
||||
onChange={(e) => setCaseName(e.target.value)}
|
||||
placeholder="ב. קרן-נכסים" disabled={isProcessing} />
|
||||
{isCommittee && (
|
||||
<div className="grid grid-cols-2 gap-3 rounded-md border border-gold/40 bg-gold-wash/40 p-3">
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="chair-name">שם יו"ר (חובה)</Label>
|
||||
<Input
|
||||
id="chair-name" value={chairName}
|
||||
onChange={(e) => setChairName(e.target.value)}
|
||||
placeholder='עו"ד פלוני אלמוני'
|
||||
disabled={isProcessing} dir="rtl"
|
||||
/>
|
||||
</div>
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="district">מחוז (חובה)</Label>
|
||||
<Select
|
||||
value={district || "_none"}
|
||||
onValueChange={(v) =>
|
||||
setDistrict(v === "_none" ? "" : (v as CommitteeDistrict))
|
||||
}
|
||||
disabled={isProcessing}
|
||||
>
|
||||
<SelectTrigger><SelectValue placeholder="—" /></SelectTrigger>
|
||||
<SelectContent>
|
||||
<SelectItem value="_none">—</SelectItem>
|
||||
{COMMITTEE_DISTRICTS.map((d) => (
|
||||
<SelectItem key={d} value={d}>{d}</SelectItem>
|
||||
))}
|
||||
</SelectContent>
|
||||
</Select>
|
||||
</div>
|
||||
</div>
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="court">ערכאה</Label>
|
||||
<Input id="court" value={court}
|
||||
onChange={(e) => setCourt(e.target.value)}
|
||||
placeholder='בית משפט עליון / בג"ץ / מנהלי / ועדת ערר'
|
||||
disabled={isProcessing} />
|
||||
</div>
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="date">תאריך החלטה</Label>
|
||||
<Input id="date" type="date" value={decisionDate}
|
||||
onChange={(e) => setDecisionDate(e.target.value)}
|
||||
disabled={isProcessing} />
|
||||
</div>
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="appeal-subtype">תת-סוג (חופשי)</Label>
|
||||
<Input id="appeal-subtype" value={appealSubtype}
|
||||
onChange={(e) => setAppealSubtype(e.target.value)}
|
||||
placeholder="שימוש חורג / סופיות ההחלטה" disabled={isProcessing} />
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Practice area (required radio) */}
|
||||
<div className="space-y-1">
|
||||
<Label>תחום משפט (חובה)</Label>
|
||||
<div className="flex gap-4 flex-wrap">
|
||||
{PRACTICE_AREAS.map((a) => (
|
||||
<label key={a.value} className="flex items-center gap-2 cursor-pointer">
|
||||
<input
|
||||
type="radio" name="practice_area" value={a.value}
|
||||
checked={practiceArea === a.value}
|
||||
onChange={() => setPracticeArea(a.value as PracticeArea)}
|
||||
disabled={isProcessing}
|
||||
/>
|
||||
<span className="text-sm text-ink">{a.label}</span>
|
||||
</label>
|
||||
))}
|
||||
</div>
|
||||
</div>
|
||||
<details className="group rounded-md border border-rule bg-rule-soft/30">
|
||||
<summary className="cursor-pointer select-none px-3 py-2 text-[0.78rem] text-ink-muted hover:text-navy">
|
||||
אופציונלי — דריסה ידנית של שדות שיחולצו אוטומטית מהמסמך
|
||||
</summary>
|
||||
<div className="space-y-3 border-t border-rule px-3 py-3">
|
||||
<div className="grid grid-cols-2 gap-3">
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="case-name">שם קצר</Label>
|
||||
<Input id="case-name" value={caseName}
|
||||
onChange={(e) => setCaseName(e.target.value)}
|
||||
placeholder="ב. קרן-נכסים" disabled={isProcessing} />
|
||||
</div>
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="court">ערכאה</Label>
|
||||
<Input id="court" value={court}
|
||||
onChange={(e) => setCourt(e.target.value)}
|
||||
placeholder='בית משפט עליון / בג"ץ / מנהלי / ועדת ערר'
|
||||
disabled={isProcessing} />
|
||||
</div>
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="date">תאריך החלטה</Label>
|
||||
<Input id="date" type="date" value={decisionDate}
|
||||
onChange={(e) => setDecisionDate(e.target.value)}
|
||||
disabled={isProcessing} />
|
||||
</div>
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="appeal-subtype">תת-סוג (חופשי)</Label>
|
||||
<Input id="appeal-subtype" value={appealSubtype}
|
||||
onChange={(e) => setAppealSubtype(e.target.value)}
|
||||
placeholder="שימוש חורג / סופיות ההחלטה" disabled={isProcessing} />
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div className="grid grid-cols-2 gap-3">
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="source-type">סוג מקור</Label>
|
||||
<Select value={sourceType || "_none"}
|
||||
onValueChange={(v) => setSourceType(v === "_none" ? "" : v as SourceType)}
|
||||
disabled={isProcessing}>
|
||||
<SelectTrigger><SelectValue placeholder="—" /></SelectTrigger>
|
||||
<SelectContent>
|
||||
<SelectItem value="_none">—</SelectItem>
|
||||
{SOURCE_TYPES.map((s) => (
|
||||
<SelectItem key={s.value} value={s.value}>{s.label}</SelectItem>
|
||||
<div className="space-y-1">
|
||||
<Label>תחום משפט</Label>
|
||||
<div className="flex gap-4 flex-wrap">
|
||||
{PRACTICE_AREAS.map((a) => (
|
||||
<label key={a.value} className="flex items-center gap-2 cursor-pointer">
|
||||
<input
|
||||
type="radio" name="practice_area" value={a.value}
|
||||
checked={practiceArea === a.value}
|
||||
onChange={() => setPracticeArea(a.value as PracticeArea)}
|
||||
disabled={isProcessing}
|
||||
/>
|
||||
<span className="text-sm text-ink">{a.label}</span>
|
||||
</label>
|
||||
))}
|
||||
</SelectContent>
|
||||
</Select>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
{!isCommittee && (
|
||||
<div className="grid grid-cols-2 gap-3">
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="source-type">סוג מקור</Label>
|
||||
<Select value={sourceType || "_none"}
|
||||
onValueChange={(v) => setSourceType(v === "_none" ? "" : v as SourceType)}
|
||||
disabled={isProcessing}>
|
||||
<SelectTrigger><SelectValue placeholder="—" /></SelectTrigger>
|
||||
<SelectContent>
|
||||
<SelectItem value="_none">—</SelectItem>
|
||||
{SOURCE_TYPES.map((s) => (
|
||||
<SelectItem key={s.value} value={s.value}>{s.label}</SelectItem>
|
||||
))}
|
||||
</SelectContent>
|
||||
</Select>
|
||||
</div>
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="precedent-level">רמת תקדים</Label>
|
||||
<Select value={precedentLevel || "_none"}
|
||||
onValueChange={(v) => setPrecedentLevel(v === "_none" ? "" : v)}
|
||||
disabled={isProcessing}>
|
||||
<SelectTrigger><SelectValue placeholder="—" /></SelectTrigger>
|
||||
<SelectContent>
|
||||
<SelectItem value="_none">—</SelectItem>
|
||||
{PRECEDENT_LEVELS.map((l) => (
|
||||
<SelectItem key={l.value} value={l.value}>{l.label}</SelectItem>
|
||||
))}
|
||||
</SelectContent>
|
||||
</Select>
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="tags">תגיות נושא (מופרדות בפסיקים)</Label>
|
||||
<Input id="tags" value={subjectTags}
|
||||
onChange={(e) => setSubjectTags(e.target.value)}
|
||||
placeholder="חניה, קווי בניין, שיקול דעת" disabled={isProcessing} />
|
||||
</div>
|
||||
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="headnote">תקציר / headnote</Label>
|
||||
<Textarea id="headnote" value={headnote} rows={2}
|
||||
onChange={(e) => setHeadnote(e.target.value)}
|
||||
placeholder="תקציר חופשי שיוצג ברשימה" disabled={isProcessing} dir="rtl" />
|
||||
</div>
|
||||
|
||||
<label className="flex items-center gap-2 cursor-pointer">
|
||||
<input type="checkbox" checked={isBinding}
|
||||
onChange={(e) => setIsBinding(e.target.checked)}
|
||||
disabled={isProcessing} />
|
||||
<span className="text-sm">הלכה מחייבת</span>
|
||||
</label>
|
||||
</div>
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="precedent-level">רמת תקדים</Label>
|
||||
<Select value={precedentLevel || "_none"}
|
||||
onValueChange={(v) => setPrecedentLevel(v === "_none" ? "" : v)}
|
||||
disabled={isProcessing}>
|
||||
<SelectTrigger><SelectValue placeholder="—" /></SelectTrigger>
|
||||
<SelectContent>
|
||||
<SelectItem value="_none">—</SelectItem>
|
||||
{PRECEDENT_LEVELS.map((l) => (
|
||||
<SelectItem key={l.value} value={l.value}>{l.label}</SelectItem>
|
||||
))}
|
||||
</SelectContent>
|
||||
</Select>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="tags">תגיות נושא (מופרדות בפסיקים)</Label>
|
||||
<Input id="tags" value={subjectTags}
|
||||
onChange={(e) => setSubjectTags(e.target.value)}
|
||||
placeholder="חניה, קווי בניין, שיקול דעת" disabled={isProcessing} />
|
||||
</div>
|
||||
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="headnote">תקציר / headnote (אופציונלי)</Label>
|
||||
<Textarea id="headnote" value={headnote} rows={2}
|
||||
onChange={(e) => setHeadnote(e.target.value)}
|
||||
placeholder="תקציר חופשי שיוצג ברשימה" disabled={isProcessing} dir="rtl" />
|
||||
</div>
|
||||
|
||||
<label className="flex items-center gap-2 cursor-pointer">
|
||||
<input type="checkbox" checked={isBinding}
|
||||
onChange={(e) => setIsBinding(e.target.checked)}
|
||||
disabled={isProcessing} />
|
||||
<span className="text-sm">הלכה מחייבת</span>
|
||||
</label>
|
||||
</details>
|
||||
|
||||
{isProcessing && (
|
||||
<div className="rounded-lg border border-rule bg-rule-soft/40 p-4 space-y-2">
|
||||
@@ -284,11 +364,11 @@ export function PrecedentUploadSheet({ open, onOpenChange }: Props) {
 
           <div className="flex gap-2 justify-end pt-2">
             <Button type="button" variant="ghost"
-              onClick={() => onOpenChange(false)} disabled={upload.isPending}>
+              onClick={() => onOpenChange(false)} disabled={isSubmitting}>
               ביטול
             </Button>
             <Button type="submit"
-              disabled={upload.isPending || isProcessing}
+              disabled={isSubmitting || isProcessing}
               className="bg-navy text-parchment hover:bg-navy-soft">
               <Upload className="w-4 h-4 me-1" />
               העלה
|
||||
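Note on the citation-prefix routing used by `PrecedentUploadSheet`: `isCommitteeCitation` is imported from `@/lib/api/precedent-library`, and its body is not part of this diff. The following is only a minimal sketch of how such a check might look, assuming the two committee prefixes named in the upload-routing flowchart (`ערר` and `בל"מ`). The names `COMMITTEE_PREFIXES`, `isCommitteeCitationSketch` and `routeUpload` are hypothetical and may not match the real helper.

```typescript
// Hypothetical sketch; the real isCommitteeCitation is not shown in this diff.
// Assumed prefixes: appeals-committee citations start with "ערר" or 'בל"מ'.
const COMMITTEE_PREFIXES = ["ערר", 'בל"מ'] as const;

type UploadRoute = "internal_decision_upload" | "precedent_library_upload";

// True when the citation begins with one of the committee prefixes,
// either as the whole string or followed by a space and a case number.
function isCommitteeCitationSketch(citation: string): boolean {
  const trimmed = citation.trim();
  return COMMITTEE_PREFIXES.some(
    (prefix) => trimmed === prefix || trimmed.startsWith(prefix + " "),
  );
}

// Committee decisions need chair_name + district, so they take the
// internal-decision route; everything else goes to the precedent library.
function routeUpload(citation: string): UploadRoute {
  return isCommitteeCitationSketch(citation)
    ? "internal_decision_upload"
    : "precedent_library_upload";
}
```

Matching on "prefix plus space" rather than a bare `startsWith` is a deliberate choice in this sketch, so that a longer word that merely begins with the same letters is not routed as a committee decision.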
434
web-ui/src/components/training/chat-panel.tsx
Normal file
@@ -0,0 +1,434 @@
|
||||
"use client";
|
||||
|
||||
/*
|
||||
* Style-agent chat panel — the new "שיחה" tab on /training.
|
||||
*
|
||||
* Layout: two columns.
|
||||
* - Sidebar: list of conversations + "+ שיחה חדשה" button
|
||||
* - Main: thread of messages + composer with SSE streaming
|
||||
*
|
||||
* Each message is persisted to the legal-ai DB; the LLM call goes
|
||||
* out via FastAPI → host's legal-chat-service → claude CLI. There
|
||||
* is no API cost — the claude CLI uses Daphna's claude.ai
|
||||
* subscription via the host's auth.
|
||||
*
|
||||
* Health gate: if /api/training/chat/health reports the host service
|
||||
* is unreachable, the composer is replaced by a setup notice telling
|
||||
* the chair to start the pm2 service.
|
||||
*/
|
||||
|
||||
import { useEffect, useRef, useState } from "react";
|
||||
import {
|
||||
Send, Plus, Trash2, Loader2, MessageSquare, Sparkles, AlertTriangle,
|
||||
} from "lucide-react";
|
||||
import { toast } from "sonner";
|
||||
import { Card, CardContent } from "@/components/ui/card";
|
||||
import { Button } from "@/components/ui/button";
|
||||
import { Textarea } from "@/components/ui/textarea";
|
||||
import { ScrollArea } from "@/components/ui/scroll-area";
|
||||
import { Badge } from "@/components/ui/badge";
|
||||
import { Skeleton } from "@/components/ui/skeleton";
|
||||
import {
|
||||
Select, SelectContent, SelectItem, SelectTrigger, SelectValue,
|
||||
} from "@/components/ui/select";
|
||||
import {
|
||||
chatKeys,
|
||||
useChatConversation,
|
||||
useChatConversations,
|
||||
useChatHealth,
|
||||
useCorpus,
|
||||
useCreateChat,
|
||||
useDeleteChat,
|
||||
type ChatMessage,
|
||||
} from "@/lib/api/training";
|
||||
import { useQueryClient } from "@tanstack/react-query";
|
||||
|
||||
export function ChatPanel() {
|
||||
const [activeId, setActiveId] = useState<string | null>(null);
|
||||
const health = useChatHealth();
|
||||
|
||||
return (
|
||||
<div className="grid gap-4 lg:grid-cols-[280px_1fr]">
|
||||
<ConversationsSidebar activeId={activeId} onSelect={setActiveId} />
|
||||
<div className="space-y-3">
|
||||
{health.data && !health.data.reachable && (
|
||||
<ChatServiceWarning health={health.data} />
|
||||
)}
|
||||
{activeId ? (
|
||||
<ChatThread convId={activeId} />
|
||||
) : (
|
||||
<Card className="bg-rule-soft/40 border-rule">
|
||||
<CardContent className="px-6 py-10 text-center text-ink-muted text-sm space-y-2">
|
||||
<MessageSquare className="w-8 h-8 mx-auto opacity-50" />
|
||||
<p>בחר שיחה קיימת או פתח חדשה כדי להתחיל לדבר עם סוכן הסגנון.</p>
|
||||
<p className="text-[0.78rem]">
|
||||
הסוכן רץ על claude CLI מקומי דרך legal-chat-service. אין עלות API.
|
||||
</p>
|
||||
</CardContent>
|
||||
</Card>
|
||||
)}
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
// ── Sidebar: list + new ────────────────────────────────────────────
|
||||
|
||||
function ConversationsSidebar({
|
||||
activeId, onSelect,
|
||||
}: {
|
||||
activeId: string | null;
|
||||
onSelect: (id: string | null) => void;
|
||||
}) {
|
||||
const { data: convs, isPending } = useChatConversations();
|
||||
const { data: corpus } = useCorpus();
|
||||
const create = useCreateChat();
|
||||
const del = useDeleteChat();
|
||||
const [creating, setCreating] = useState(false);
|
||||
const [newTitle, setNewTitle] = useState("");
|
||||
const [newCorpusId, setNewCorpusId] = useState<string>("__none__");
|
||||
|
||||
const onCreate = async () => {
|
||||
try {
|
||||
const conv = await create.mutateAsync({
|
||||
title: newTitle.trim() || "שיחה חדשה",
|
||||
style_corpus_id: newCorpusId === "__none__" ? null : newCorpusId,
|
||||
});
|
||||
onSelect(conv.id);
|
||||
setCreating(false);
|
||||
setNewTitle("");
|
||||
setNewCorpusId("__none__");
|
||||
} catch (e) {
|
||||
toast.error(e instanceof Error ? e.message : "כשל ביצירת שיחה");
|
||||
}
|
||||
};
|
||||
|
||||
const onDelete = async (id: string) => {
|
||||
if (!window.confirm("למחוק את השיחה? פעולה זו לא ניתנת לביטול.")) return;
|
||||
try {
|
||||
await del.mutateAsync(id);
|
||||
if (activeId === id) onSelect(null);
|
||||
toast.success("השיחה נמחקה");
|
||||
} catch (e) {
|
||||
toast.error(e instanceof Error ? e.message : "כשל במחיקה");
|
||||
}
|
||||
};
|
||||
|
||||
return (
|
||||
<Card className="bg-surface border-rule">
|
||||
<CardContent className="px-3 py-3 space-y-2">
|
||||
{!creating ? (
|
||||
<Button
|
||||
onClick={() => setCreating(true)}
|
||||
className="w-full bg-navy text-parchment hover:bg-navy-soft"
|
||||
size="sm"
|
||||
>
|
||||
<Plus className="w-4 h-4 me-1" />
|
||||
שיחה חדשה
|
||||
</Button>
|
||||
) : (
|
||||
<div className="space-y-2 border border-rule rounded p-2 bg-rule-soft/30">
|
||||
<Textarea
|
||||
value={newTitle}
|
||||
onChange={(e) => setNewTitle(e.target.value)}
|
||||
placeholder="כותרת לשיחה (אופציונלי)"
|
||||
rows={2} dir="rtl"
|
||||
/>
|
||||
<Select value={newCorpusId} onValueChange={setNewCorpusId} dir="rtl">
|
||||
<SelectTrigger>
|
||||
<SelectValue placeholder="צמד להחלטה (אופציונלי)" />
|
||||
</SelectTrigger>
|
||||
<SelectContent className="max-h-[300px]">
|
||||
<SelectItem value="__none__">— שיחה כללית —</SelectItem>
|
||||
{corpus?.map((c) => (
|
||||
<SelectItem key={c.id} value={c.id}>
|
||||
{c.decision_number || "—"}
|
||||
{c.decision_date ? ` · ${c.decision_date}` : ""}
|
||||
</SelectItem>
|
||||
))}
|
||||
</SelectContent>
|
||||
</Select>
|
||||
<div className="flex gap-1 justify-end">
|
||||
<Button variant="ghost" size="sm"
|
||||
onClick={() => { setCreating(false); setNewTitle(""); setNewCorpusId("__none__"); }}>
|
||||
ביטול
|
||||
</Button>
|
||||
<Button size="sm" onClick={onCreate} disabled={create.isPending}
|
||||
className="bg-navy text-parchment hover:bg-navy-soft">
|
||||
צור
|
||||
</Button>
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
|
||||
<ScrollArea className="h-[520px]">
|
||||
<ul className="space-y-1">
|
||||
{isPending && (
|
||||
<>
|
||||
<Skeleton className="h-12 w-full" />
|
||||
<Skeleton className="h-12 w-full" />
|
||||
</>
|
||||
)}
|
||||
{convs?.length === 0 && (
|
||||
<p className="text-center text-ink-muted text-[0.78rem] py-6">
|
||||
אין עדיין שיחות
|
||||
</p>
|
||||
)}
|
||||
{convs?.map((c) => {
|
||||
const active = c.id === activeId;
|
||||
return (
|
||||
<li key={c.id}>
|
||||
<button
|
||||
onClick={() => onSelect(c.id)}
|
||||
className={
|
||||
"w-full text-end rounded-md px-2 py-2 transition " +
|
||||
(active
|
||||
? "bg-gold-wash border border-gold/40"
|
||||
: "hover:bg-rule-soft/60 border border-transparent")
|
||||
}
|
||||
>
|
||||
<div className="text-sm text-navy font-semibold truncate">
|
||||
{c.title}
|
||||
</div>
|
||||
<div className="flex items-center gap-1 text-[0.7rem] text-ink-muted">
|
||||
{c.decision_number && (
|
||||
<Badge variant="outline"
|
||||
className="text-[0.65rem] bg-info-bg text-info border-info/40">
|
||||
{c.decision_number}
|
||||
</Badge>
|
||||
)}
|
||||
<span className="tabular-nums">{c.message_count}</span>
|
||||
<MessageSquare className="w-3 h-3" />
|
||||
<span className="grow text-end">
|
||||
{new Date(c.last_message_at).toLocaleDateString("he-IL")}
|
||||
</span>
|
||||
<button
|
||||
onClick={(e) => { e.stopPropagation(); onDelete(c.id); }}
|
||||
className="hover:text-danger"
|
||||
aria-label="מחק שיחה"
|
||||
>
|
||||
<Trash2 className="w-3 h-3" />
|
||||
</button>
|
||||
</div>
|
||||
</button>
|
||||
</li>
|
||||
);
|
||||
})}
|
||||
</ul>
|
||||
</ScrollArea>
|
||||
</CardContent>
|
||||
</Card>
|
||||
);
|
||||
}
|
||||
|
||||
// ── Thread + composer ──────────────────────────────────────────────
|
||||
|
||||
function ChatThread({ convId }: { convId: string }) {
|
||||
const { data, isPending } = useChatConversation(convId);
|
||||
const qc = useQueryClient();
|
||||
const [draft, setDraft] = useState("");
|
||||
const [streaming, setStreaming] = useState(false);
|
||||
const [streamingText, setStreamingText] = useState("");
|
||||
const [streamError, setStreamError] = useState("");
|
||||
const scrollRef = useRef<HTMLDivElement | null>(null);
|
||||
|
||||
/* Auto-scroll to bottom when new messages arrive. */
|
||||
useEffect(() => {
|
||||
const el = scrollRef.current;
|
||||
if (!el) return;
|
||||
el.scrollTo({ top: el.scrollHeight, behavior: "smooth" });
|
||||
}, [data?.messages.length, streamingText]);
|
||||
|
||||
const onSend = async () => {
|
||||
const text = draft.trim();
|
||||
if (!text || streaming) return;
|
||||
setDraft("");
|
||||
setStreaming(true);
|
||||
setStreamingText("");
|
||||
setStreamError("");
|
||||
|
||||
try {
|
||||
const res = await fetch(
|
||||
`/api/training/chat/conversations/${encodeURIComponent(convId)}/messages`,
|
||||
{
|
||||
method: "POST",
|
||||
headers: { "Content-Type": "application/json" },
|
||||
body: JSON.stringify({ content: text }),
|
||||
},
|
||||
);
|
||||
if (!res.ok || !res.body) {
|
||||
const body = await res.text();
|
||||
throw new Error(`HTTP ${res.status}: ${body.slice(0, 200)}`);
|
||||
}
|
||||
// Parse SSE line-by-line. EventSource would be cleaner but it
|
||||
// doesn't support POST bodies; the manual reader is small.
|
||||
const reader = res.body.getReader();
|
||||
const decoder = new TextDecoder();
|
||||
let buffer = "";
|
||||
let accumulated = "";
|
||||
while (true) {
|
||||
const { value, done } = await reader.read();
|
||||
if (done) break;
|
||||
buffer += decoder.decode(value, { stream: true });
|
||||
let nl: number;
|
||||
while ((nl = buffer.indexOf("\n\n")) !== -1) {
|
||||
const event = buffer.slice(0, nl);
|
||||
buffer = buffer.slice(nl + 2);
|
||||
if (!event.startsWith("data: ")) continue;
|
||||
try {
|
||||
const payload = JSON.parse(event.slice("data: ".length));
|
||||
if (payload.type === "text_delta" && payload.text) {
|
||||
accumulated += payload.text;
|
||||
setStreamingText(accumulated);
|
||||
} else if (payload.type === "error") {
|
||||
setStreamError(String(payload.message || "שגיאה לא ידועה"));
|
||||
} else if (payload.type === "done") {
|
||||
if (payload.text && !accumulated) {
|
||||
accumulated = payload.text;
|
||||
setStreamingText(accumulated);
|
||||
}
|
||||
}
|
||||
} catch {
|
||||
/* ignore non-JSON */
|
||||
}
|
||||
}
|
||||
}
|
||||
} catch (e) {
|
||||
setStreamError(e instanceof Error ? e.message : "שגיאה בשיחה");
|
||||
} finally {
|
||||
setStreaming(false);
|
||||
setStreamingText("");
|
||||
// Refetch the conversation so the persisted assistant turn shows up.
|
||||
qc.invalidateQueries({ queryKey: chatKeys.conversation(convId) });
|
||||
qc.invalidateQueries({ queryKey: chatKeys.conversations() });
|
||||
}
|
||||
};
|
||||
|
||||
if (isPending) return <Skeleton className="h-[560px] w-full" />;
|
||||
if (!data) return null;
|
||||
|
||||
return (
|
||||
<Card className="bg-surface border-rule">
|
||||
<CardContent className="px-4 py-3 space-y-3">
|
||||
<header className="flex items-center gap-2 border-b border-rule pb-2">
|
||||
<Sparkles className="w-4 h-4 text-gold-deep" />
|
||||
<h3 className="text-navy font-semibold grow">{data.conversation.title}</h3>
|
||||
{data.conversation.decision_number && (
|
||||
<Badge variant="outline" className="bg-info-bg text-info border-info/40">
|
||||
{data.conversation.decision_number}
|
||||
</Badge>
|
||||
)}
|
||||
</header>
|
||||
|
||||
<div ref={scrollRef} className="h-[440px] overflow-y-auto space-y-3 pe-1">
|
||||
{data.messages.length === 0 && !streaming && (
|
||||
<p className="text-center text-ink-muted text-sm py-8">
|
||||
התחל בשאלה — למשל: "מה מאפיין את הפתיחות של דפנה בעררי 1xxx?"
|
||||
</p>
|
||||
)}
|
||||
{data.messages.map((m) => <MessageBubble key={m.id} message={m} />)}
|
||||
{streaming && (
|
||||
<MessageBubble
|
||||
message={{
|
||||
id: "streaming",
|
||||
role: "assistant",
|
||||
content: streamingText || "(מקליד…)",
|
||||
created_at: "",
|
||||
}}
|
||||
isStreaming
|
||||
/>
|
||||
)}
|
||||
{streamError && (
|
||||
<div className="rounded-lg border border-danger/40 bg-danger-bg p-3 text-danger text-sm">
|
||||
{streamError}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
|
||||
<div className="border-t border-rule pt-3 space-y-2">
|
||||
<Textarea
|
||||
value={draft}
|
||||
onChange={(e) => setDraft(e.target.value)}
|
||||
placeholder="שאל את הסוכן… (Shift+Enter לשורה חדשה)"
|
||||
rows={3} dir="rtl"
|
||||
disabled={streaming}
|
||||
onKeyDown={(e) => {
|
||||
if (e.key === "Enter" && !e.shiftKey) {
|
||||
e.preventDefault();
|
||||
void onSend();
|
||||
}
|
||||
}}
|
||||
/>
|
||||
<div className="flex items-center gap-2">
|
||||
<p className="text-[0.72rem] text-ink-muted grow">
|
||||
{data.conversation.claude_session_id
|
||||
? "שיחה ממשיכה (--resume) — אין צורך לטעון מחדש את ה-system prompt"
|
||||
: "שיחה חדשה — system prompt ייטען (שני מסמכי ייחוס + רשימת קורפוס)"}
|
||||
</p>
|
||||
<Button onClick={onSend} disabled={streaming || !draft.trim()}
|
||||
className="bg-navy text-parchment hover:bg-navy-soft">
|
||||
{streaming ? (
|
||||
<Loader2 className="w-4 h-4 animate-spin me-1" />
|
||||
) : (
|
||||
<Send className="w-4 h-4 me-1" />
|
||||
)}
|
||||
שלח
|
||||
</Button>
|
||||
</div>
|
||||
</div>
|
||||
</CardContent>
|
||||
</Card>
|
||||
);
|
||||
}
|
||||
|
||||
function MessageBubble({
|
||||
message, isStreaming = false,
|
||||
}: { message: ChatMessage; isStreaming?: boolean }) {
|
||||
const isUser = message.role === "user";
|
||||
return (
|
||||
<div className={isUser ? "flex justify-start" : "flex justify-end"}>
|
||||
<div
|
||||
className={
|
||||
"max-w-[85%] rounded-lg px-3 py-2 text-sm leading-relaxed whitespace-pre-wrap " +
|
||||
(isUser
|
||||
? "bg-gold-wash text-ink border border-gold/40"
|
||||
: "bg-rule-soft text-ink border border-rule")
|
||||
}
|
||||
dir="rtl"
|
||||
>
|
||||
{message.content}
|
||||
{isStreaming && (
|
||||
<span className="inline-block w-1.5 h-3.5 bg-navy/60 align-middle ms-1 animate-pulse" />
|
||||
)}
|
||||
</div>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
// ── Service-down warning ──────────────────────────────────────────
|
||||
|
||||
function ChatServiceWarning({
|
||||
health,
|
||||
}: { health: { reachable: boolean; url: string; error?: string } }) {
|
||||
return (
|
||||
<Card className="bg-danger-bg border-danger/40">
|
||||
<CardContent className="px-4 py-3 space-y-1">
|
||||
<div className="flex items-center gap-2 text-danger">
|
||||
<AlertTriangle className="w-4 h-4" />
|
||||
<strong>שירות הצ'אט אינו זמין</strong>
|
||||
</div>
|
||||
<p className="text-[0.78rem] text-danger">
|
||||
לא ניתן להגיע ל-legal-chat-service בכתובת
|
||||
<code className="px-1 mx-1 bg-rule-soft rounded">{health.url}</code>.
|
||||
{health.error && (<> פירוט: <code className="px-1 bg-rule-soft rounded">{health.error}</code></>)}
|
||||
</p>
|
||||
<p className="text-[0.72rem] text-ink-muted">
|
||||
על המכונה המקומית הפעל:
|
||||
<code className="px-1 bg-rule-soft rounded">
|
||||
pm2 start /home/chaim/legal-ai/scripts/legal-chat-service.config.cjs
|
||||
</code>
|
||||
</p>
|
||||
</CardContent>
|
||||
</Card>
|
||||
);
|
||||
}
|
||||
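Note on the manual SSE reader in `ChatThread`: the inner `while` loop of `onSend` repeatedly cuts complete `\n\n`-terminated frames off the front of a text buffer and keeps any partial frame for the next chunk. A minimal sketch of that step as a pure function is given here; `drainSseBuffer` and `ChatStreamEvent` are hypothetical names, and the event shapes are inferred only from the payload fields the component reads (`type`, `text`, `message`).

```typescript
// Hypothetical event shapes, inferred from the fields read in onSend.
type ChatStreamEvent =
  | { type: "text_delta"; text: string }
  | { type: "error"; message?: string }
  | { type: "done"; text?: string };

// Splits every complete "\n\n"-terminated frame off the front of `buffer`.
// Returns the parsed events plus the unconsumed tail (a partial frame, if any).
function drainSseBuffer(buffer: string): { events: ChatStreamEvent[]; rest: string } {
  const events: ChatStreamEvent[] = [];
  let rest = buffer;
  let sep: number;
  while ((sep = rest.indexOf("\n\n")) !== -1) {
    const frame = rest.slice(0, sep);
    rest = rest.slice(sep + 2);
    // Frames without a "data: " prefix (e.g. comments) are skipped.
    if (!frame.startsWith("data: ")) continue;
    try {
      events.push(JSON.parse(frame.slice("data: ".length)) as ChatStreamEvent);
    } catch {
      // Non-JSON payloads are ignored, as in the component.
    }
  }
  return { events, rest };
}
```

Keeping the split separate from the `fetch` reader makes the framing logic testable without a network stream: the caller appends each decoded chunk to `rest` and drains again.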
402
web-ui/src/components/training/corpus-detail-drawer.tsx
Normal file
@@ -0,0 +1,402 @@
|
||||
"use client";
|
||||
|
||||
/*
|
||||
* Side-drawer for inspecting + editing a single style_corpus entry.
|
||||
*
|
||||
* Tabs:
|
||||
* - "פרטים" — show + edit the enriched metadata (decision_number, date,
|
||||
* subjects, summary, outcome, key_principles, appeal_subtype). Saving
|
||||
* issues a PATCH /api/training/corpus/{id} and invalidates the list.
|
||||
* - "תוכן" — read-only full_text view (truncated to 5K with "show more").
|
||||
* We never let the chair edit full_text from the UI; corrections happen
|
||||
* by re-uploading via the Upload dialog.
|
||||
* - "מה למדנו" — per-decision lessons (Phase 4 placeholder for now).
|
||||
* - "דפוסים" — style_patterns scoped by appeal_subtype.
|
||||
*
|
||||
* Why a Sheet, not a Dialog: the drawer needs to coexist with the corpus
|
||||
* table so the chair can scan multiple decisions without losing context.
|
||||
* Sheet (side: "left" in RTL = right edge in LTR) gives that without
|
||||
* stealing the entire viewport.
|
||||
*/
|
||||
|
||||
import { useEffect, useState } from "react";
|
||||
import { Save, FileText, Tag, Calendar, BookOpen, Loader2 } from "lucide-react";
|
||||
import { toast } from "sonner";
|
||||
import {
|
||||
Sheet, SheetContent, SheetHeader, SheetTitle, SheetDescription,
|
||||
} from "@/components/ui/sheet";
|
||||
import { Tabs, TabsContent, TabsList, TabsTrigger } from "@/components/ui/tabs";
|
||||
import { Card, CardContent } from "@/components/ui/card";
|
||||
import { Button } from "@/components/ui/button";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Label } from "@/components/ui/label";
|
||||
import { Textarea } from "@/components/ui/textarea";
|
||||
import { Badge } from "@/components/ui/badge";
|
||||
import { ScrollArea } from "@/components/ui/scroll-area";
|
||||
import {
|
||||
usePatchCorpus,
|
||||
type CorpusDecision,
|
||||
type CorpusDecisionPatch,
|
||||
} from "@/lib/api/training";
|
||||
import { LessonsTab } from "./lessons-tab";
|
||||
|
||||
type Props = {
|
||||
decision: CorpusDecision | null;
|
||||
onOpenChange: (open: boolean) => void;
|
||||
};
|
||||
|
||||
export function CorpusDetailDrawer({ decision, onOpenChange }: Props) {
|
||||
// Local editable state for the "details" tab. Re-seeds whenever the
|
||||
// selected decision changes so the form reflects the row the chair
|
||||
// clicked.
|
||||
const [draft, setDraft] = useState<CorpusDecisionPatch>({});
|
||||
const patch = usePatchCorpus();
|
||||
|
||||
/* eslint-disable react-hooks/set-state-in-effect */
|
||||
useEffect(() => {
|
||||
if (!decision) {
|
||||
setDraft({});
|
||||
return;
|
||||
}
|
||||
setDraft({
|
||||
decision_number: decision.decision_number,
|
||||
decision_date: decision.decision_date,
|
||||
subject_categories: decision.subject_categories,
|
||||
summary: decision.summary,
|
||||
outcome: decision.outcome,
|
||||
key_principles: decision.key_principles,
|
||||
appeal_subtype: decision.appeal_subtype,
|
||||
practice_area: decision.practice_area,
|
||||
});
|
||||
}, [decision]);
|
||||
/* eslint-enable react-hooks/set-state-in-effect */
|
||||
|
||||
const open = decision !== null;
|
||||
if (!decision) return null;
|
||||
|
||||
// Diff against the originally loaded row — only PATCH fields the chair
|
||||
// actually changed, so concurrent edits to other fields stay intact.
|
||||
const diff: CorpusDecisionPatch = {};
|
||||
if (draft.decision_number !== decision.decision_number)
|
||||
diff.decision_number = draft.decision_number;
|
||||
if (draft.decision_date !== decision.decision_date)
|
||||
diff.decision_date = draft.decision_date;
|
||||
if (draft.summary !== decision.summary)
|
||||
diff.summary = draft.summary;
|
||||
if (draft.outcome !== decision.outcome)
|
||||
diff.outcome = draft.outcome;
|
||||
if (draft.appeal_subtype !== decision.appeal_subtype)
|
||||
diff.appeal_subtype = draft.appeal_subtype;
|
||||
if (draft.practice_area !== decision.practice_area)
|
||||
diff.practice_area = draft.practice_area;
|
||||
if (
|
||||
JSON.stringify(draft.subject_categories) !==
|
||||
JSON.stringify(decision.subject_categories)
|
||||
)
|
||||
diff.subject_categories = draft.subject_categories;
|
||||
if (
|
||||
JSON.stringify(draft.key_principles) !==
|
||||
JSON.stringify(decision.key_principles)
|
||||
)
|
||||
diff.key_principles = draft.key_principles;
|
||||
|
||||
const isDirty = Object.keys(diff).length > 0;
|
||||
|
||||
const onSave = async () => {
|
||||
if (!isDirty) return;
|
||||
try {
|
||||
await patch.mutateAsync({ id: decision.id, patch: diff });
|
||||
toast.success("המטא-דאטה עודכן");
|
||||
} catch (e) {
|
||||
toast.error(e instanceof Error ? e.message : "כשל בשמירה");
|
||||
}
|
||||
};
|
||||
|
||||
const setSubjects = (raw: string) =>
|
||||
setDraft((d) => ({
|
||||
...d,
|
||||
subject_categories: raw.split(/[,،]/).map((s) => s.trim()).filter(Boolean),
|
||||
}));
|
||||
const setPrinciples = (raw: string) =>
|
||||
setDraft((d) => ({
|
||||
...d,
|
||||
key_principles: raw.split("\n").map((s) => s.trim()).filter(Boolean),
|
||||
}));
|
||||
|
||||
return (
|
||||
<Sheet open={open} onOpenChange={onOpenChange}>
|
||||
<SheetContent side="left" className="w-full sm:max-w-3xl overflow-y-auto" dir="rtl">
|
||||
<SheetHeader>
|
||||
<SheetTitle className="text-navy flex items-center gap-2">
|
||||
<BookOpen className="w-4 h-4 shrink-0" />
|
||||
{decision.legal_citation || decision.decision_number || "—"}
|
||||
</SheetTitle>
|
||||
<SheetDescription className="text-ink-muted">
|
||||
{decision.doc_title || "החלטה בקורפוס הסגנוני"}
|
||||
</SheetDescription>
|
||||
</SheetHeader>
|
||||
|
||||
{/* Summary strip — fast-scan info, always visible above the tabs. */}
|
||||
<div className="px-6 mt-3 grid grid-cols-2 md:grid-cols-4 gap-3 text-[0.78rem]">
|
||||
<DataPoint icon={<Calendar className="w-3 h-3" />} label="תאריך"
|
||||
value={decision.decision_date || "—"} />
|
||||
<DataPoint icon={<FileText className="w-3 h-3" />} label="תווים"
|
||||
value={`${(decision.chars / 1000).toFixed(1)}K`} />
|
||||
<DataPoint icon={<FileText className="w-3 h-3" />} label="עמודים"
|
||||
value={decision.page_count > 0 ? String(decision.page_count) : "—"} />
|
||||
<DataPoint icon={<Tag className="w-3 h-3" />} label="תת-סוג"
|
||||
value={decision.appeal_subtype || "—"} />
|
||||
</div>
|
||||
|
||||
<div className="px-6 pb-6 mt-4">
|
||||
<Tabs defaultValue="details" dir="rtl">
|
||||
<TabsList className="bg-rule-soft/60">
|
||||
<TabsTrigger value="details">פרטים</TabsTrigger>
|
||||
<TabsTrigger value="content">תוכן</TabsTrigger>
|
||||
<TabsTrigger value="lessons">מה למדנו</TabsTrigger>
|
||||
<TabsTrigger value="patterns">דפוסים</TabsTrigger>
|
||||
</TabsList>
|
||||
|
||||
{/* ── Tab: editable metadata ─────────────────────────── */}
|
||||
<TabsContent value="details" className="mt-4 space-y-4">
|
||||
<div className="grid grid-cols-2 gap-3">
|
||||
<Field label="מספר ההחלטה">
|
||||
<Input value={draft.decision_number ?? ""}
|
||||
onChange={(e) => setDraft((d) => ({ ...d, decision_number: e.target.value }))}
|
||||
dir="rtl" />
|
||||
</Field>
|
||||
<Field label="תאריך">
|
||||
<Input type="date" value={draft.decision_date ?? ""}
|
||||
onChange={(e) => setDraft((d) => ({ ...d, decision_date: e.target.value }))} />
|
||||
</Field>
|
||||
</div>
|
||||
|
||||
<Field label="נושאים (מופרדים בפסיקים)">
|
||||
<Input value={(draft.subject_categories ?? []).join(", ")}
|
||||
onChange={(e) => setSubjects(e.target.value)} dir="rtl" />
|
||||
{decision.subject_categories.length > 0 && (
|
||||
<div className="flex flex-wrap gap-1 mt-1">
|
||||
{decision.subject_categories.map((s) => (
|
||||
<Badge key={s} variant="outline"
|
||||
className="text-[0.7rem] bg-gold-wash text-gold-deep border-gold/40">
|
||||
{s}
|
||||
</Badge>
|
||||
))}
|
||||
</div>
|
||||
)}
|
||||
</Field>
|
||||
|
||||
<div className="grid grid-cols-2 gap-3">
|
||||
<Field label="תת-סוג ערר">
|
||||
<Input value={draft.appeal_subtype ?? ""}
|
||||
onChange={(e) => setDraft((d) => ({ ...d, appeal_subtype: e.target.value }))}
|
||||
placeholder="building_permit / betterment_levy / compensation_197"
|
||||
dir="rtl" />
|
||||
</Field>
|
||||
<Field label="תחום משפט">
|
||||
<Input value={draft.practice_area ?? ""}
|
||||
onChange={(e) => setDraft((d) => ({ ...d, practice_area: e.target.value }))}
|
||||
dir="rtl" />
|
||||
</Field>
|
||||
</div>
|
||||
|
||||
<Field label="תקציר (summary)">
|
||||
<Textarea value={draft.summary ?? ""} rows={3}
|
||||
onChange={(e) => setDraft((d) => ({ ...d, summary: e.target.value }))}
|
||||
placeholder="תקציר חופשי — מי, מה, איך הוכרע"
|
||||
dir="rtl" />
|
||||
</Field>
|
||||
|
||||
<Field label="התוצאה (outcome)">
|
||||
<Textarea value={draft.outcome ?? ""} rows={2}
|
||||
onChange={(e) => setDraft((d) => ({ ...d, outcome: e.target.value }))}
|
||||
placeholder="קבלה / קבלה חלקית / דחייה — בקצרה"
|
||||
dir="rtl" />
|
||||
</Field>
|
||||
|
||||
<Field label="עקרונות מרכזיים (שורה לכל אחד)">
|
||||
<Textarea value={(draft.key_principles ?? []).join("\n")} rows={4}
|
||||
onChange={(e) => setPrinciples(e.target.value)}
|
||||
placeholder={"דוגמה:\nשיקול דעת מוגבל לחריגות קטנות\nריפוי פגם רק בנסיבות חריגות"}
|
||||
dir="rtl" />
|
||||
</Field>
|
||||
|
||||
{decision.parties.appellant && (
|
||||
<Card className="bg-rule-soft/40 border-rule">
|
||||
<CardContent className="px-4 py-3 text-[0.78rem] text-ink-soft">
|
||||
<p><strong className="text-navy">עורר/ת:</strong> {decision.parties.appellant}</p>
|
||||
{decision.parties.respondent && (
|
||||
<p className="mt-1"><strong className="text-navy">משיב/ה:</strong> {decision.parties.respondent}</p>
|
||||
)}
|
||||
<p className="mt-2 text-ink-muted text-[0.72rem]">
|
||||
(חולץ אוטומטית מתחילת הטקסט — תקן ע"י עריכת ה-full_text במקור.)
|
||||
</p>
|
||||
</CardContent>
|
||||
</Card>
|
||||
)}
|
||||
|
||||
<div className="flex items-center justify-end gap-2 pt-2 border-t border-rule">
|
||||
<Button variant="ghost" onClick={() => onOpenChange(false)}>
|
||||
סגור
|
||||
</Button>
|
||||
<Button onClick={onSave} disabled={!isDirty || patch.isPending}
|
||||
className="bg-navy text-parchment hover:bg-navy-soft">
|
||||
{patch.isPending ? (
|
||||
<Loader2 className="w-4 h-4 animate-spin me-1" />
|
||||
) : (
|
||||
<Save className="w-4 h-4 me-1" />
|
||||
)}
|
||||
שמור שינויים
|
||||
</Button>
|
||||
</div>
|
||||
</TabsContent>
|
||||
|
||||
{/* ── Tab: full_text (read-only) ─────────────────────── */}
|
||||
<TabsContent value="content" className="mt-4">
|
||||
<Card className="bg-surface border-rule">
|
||||
<CardContent className="px-4 py-3">
|
||||
<p className="text-[0.72rem] text-ink-muted mb-2">
|
||||
{decision.chars.toLocaleString("he-IL")} תווים · קריאה בלבד
|
||||
</p>
|
||||
<ScrollArea className="h-[480px] pe-2">
|
||||
<p className="text-sm text-ink leading-relaxed whitespace-pre-wrap">
|
||||
<FullTextLazy id={decision.id} />
|
||||
</p>
|
||||
</ScrollArea>
|
||||
</CardContent>
|
||||
</Card>
|
||||
</TabsContent>
|
||||
|
||||
{/* ── Tab: lessons (per-decision) ────────────────────── */}
|
||||
<TabsContent value="lessons" className="mt-4">
|
||||
<LessonsTab corpusId={decision.id} />
|
||||
</TabsContent>
|
||||
|
||||
{/* ── Tab: patterns scoped by appeal_subtype ─────────── */}
|
||||
<TabsContent value="patterns" className="mt-4">
|
||||
<PatternsForSubtype subtype={decision.appeal_subtype} />
|
||||
</TabsContent>
|
||||
</Tabs>
|
||||
</div>
|
||||
</SheetContent>
|
||||
</Sheet>
|
||||
);
|
||||
}
|
||||
|
||||
// ── helpers ────────────────────────────────────────────────────────
|
||||
|
||||
function DataPoint({
|
||||
icon, label, value,
|
||||
}: { icon: React.ReactNode; label: string; value: string }) {
|
||||
return (
|
||||
<div className="flex items-center gap-1 text-ink-muted">
|
||||
{icon}
|
||||
<span>{label}:</span>
|
||||
<span className="font-semibold text-navy tabular-nums truncate">{value}</span>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
function Field({
|
||||
label, children,
|
||||
}: { label: string; children: React.ReactNode }) {
|
||||
return (
|
||||
<div className="space-y-1">
|
||||
<Label className="text-[0.78rem]">{label}</Label>
|
||||
{children}
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
/* The corpus-list endpoint deliberately doesn't return full_text (too big).
 * We fetch it on demand only when the content tab opens.
 *
 * Implementation note: we don't have a dedicated /api/training/corpus/{id}
 * GET endpoint yet. As a thin stopgap we call the planned `/full-text`
 * route with a plain fetch(); if the endpoint isn't deployed yet, the UI
 * just shows the fallback message instead of crashing. The full-text
 * endpoint lands with the next backend deploy.
 */
|
||||
function FullTextLazy({ id }: { id: string }) {
|
||||
const [text, setText] = useState<string>("");
|
||||
const [loading, setLoading] = useState(true);
|
||||
const [error, setError] = useState("");
|
||||
|
||||
/* eslint-disable react-hooks/set-state-in-effect */
|
||||
useEffect(() => {
|
||||
let cancelled = false;
|
||||
setLoading(true);
|
||||
setError("");
|
||||
fetch(`/api/training/corpus/${encodeURIComponent(id)}/full-text`)
|
||||
.then((r) => (r.ok ? r.json() : Promise.reject(new Error(`HTTP ${r.status}`))))
|
||||
.then((d: { full_text: string }) => {
|
||||
if (cancelled) return;
|
||||
setText(d.full_text || "");
|
||||
})
|
||||
.catch((e: Error) => {
|
||||
if (cancelled) return;
|
||||
setError(e.message);
|
||||
})
|
||||
.finally(() => !cancelled && setLoading(false));
|
||||
return () => { cancelled = true; };
|
||||
}, [id]);
|
||||
/* eslint-enable react-hooks/set-state-in-effect */
|
||||
|
||||
if (loading) return <span className="text-ink-muted">טוען…</span>;
|
||||
if (error) return <span className="text-ink-muted">לא נמצא ({error})</span>;
|
||||
return text;
|
||||
}
|
||||
|
||||
function PatternsForSubtype({ subtype }: { subtype: string }) {
|
||||
// Filtered patterns endpoint isn't built yet, so we fall back to /patterns.
// No client-side filtering happens here yet: the list below is corpus-wide
// (first 4 types, 6 items each) and `subtype` only drives the notice.
// Better filtering ships in the metadata-enrichment iteration.
|
||||
const [data, setData] = useState<Record<string, { pattern_text: string; frequency: number }[]> | null>(null);
|
||||
const [loading, setLoading] = useState(true);
|
||||
|
||||
useEffect(() => {
|
||||
let cancelled = false;
|
||||
fetch("/api/training/patterns")
|
||||
.then((r) => (r.ok ? r.json() : Promise.reject(new Error(`HTTP ${r.status}`))))
|
||||
.then((d: { by_type: Record<string, { pattern_text: string; frequency: number }[]> }) => {
|
||||
if (!cancelled) setData(d.by_type);
|
||||
})
|
||||
.catch(() => !cancelled && setData({}))
|
||||
.finally(() => !cancelled && setLoading(false));
|
||||
return () => { cancelled = true; };
|
||||
}, []);
|
||||
|
||||
if (loading) return <p className="text-ink-muted text-sm text-center py-6">טוען…</p>;
|
||||
if (!data || Object.keys(data).length === 0) {
|
||||
return <p className="text-ink-muted text-sm text-center py-6">אין דפוסים שמורים — הרץ ניתוח סגנון.</p>;
|
||||
}
|
||||
|
||||
return (
|
||||
<div className="space-y-3">
|
||||
{subtype && (
|
||||
<p className="text-[0.78rem] text-ink-muted">
|
||||
דפוסים בכלל הקורפוס. סינון לפי תת-סוג {subtype} ייושם בעדכון הבא.
|
||||
</p>
|
||||
)}
|
||||
{Object.entries(data).slice(0, 4).map(([type, items]) => (
|
||||
<Card key={type} className="bg-surface border-rule">
|
||||
<CardContent className="px-4 py-3">
|
||||
<h4 className="text-[0.78rem] uppercase tracking-wider text-gold-deep font-semibold mb-2">
|
||||
{type}
|
||||
</h4>
|
||||
<ul className="space-y-1 text-sm text-ink">
|
||||
{items.slice(0, 6).map((p, i) => (
|
||||
<li key={i} className="flex items-start gap-2">
|
||||
<span className="text-[0.72rem] tabular-nums text-ink-muted shrink-0 mt-0.5">
|
||||
×{p.frequency}
|
||||
</span>
|
||||
<span>{p.pattern_text}</span>
|
||||
</li>
|
||||
))}
|
||||
</ul>
|
||||
</CardContent>
|
||||
</Card>
|
||||
))}
|
||||
</div>
|
||||
);
|
||||
}
|
||||
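The drawer's save path sends only the fields the chair actually changed. Below is a minimal, framework-free sketch of that diffing step. `diffPatch` and the trimmed-down `Row` type are hypothetical names used for illustration (the component writes the comparisons out field by field), the sample values are made up, and arrays are compared through `JSON.stringify` as in the drawer.

```typescript
// Hypothetical row shape for illustration; the real CorpusDecision has more fields.
type Row = {
  decision_number: string;
  summary: string;
  subject_categories: string[];
};

// Return only the keys whose draft value differs from the loaded row.
// Arrays are compared via JSON.stringify, mirroring the drawer's approach.
function diffPatch<T extends Record<string, unknown>>(
  original: T,
  draft: Partial<T>,
): Partial<T> {
  const out: Partial<T> = {};
  for (const key of Object.keys(draft) as (keyof T)[]) {
    const before = original[key];
    const after = draft[key];
    const changed =
      Array.isArray(before) || Array.isArray(after)
        ? JSON.stringify(before) !== JSON.stringify(after)
        : before !== after;
    if (changed) out[key] = after;
  }
  return out;
}

// Made-up sample data: only `summary` is edited.
const loaded: Row = {
  decision_number: "1024/23",
  summary: "",
  subject_categories: ["a", "b"],
};
const edited: Partial<Row> = { ...loaded, summary: "updated" };
const patch = diffPatch(loaded, edited);
```

An empty result plays the role of `isDirty === false`: there is nothing to PATCH, so the save button can stay disabled.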
@@ -1,6 +1,7 @@
|
||||
"use client";
|
||||
|
||||
import { Trash2 } from "lucide-react";
|
||||
import { useState } from "react";
|
||||
import { Trash2, Sparkles } from "lucide-react";
|
||||
import { toast } from "sonner";
|
||||
import {
|
||||
Table, TableBody, TableCell, TableHead, TableHeader, TableRow,
|
||||
@@ -9,12 +10,20 @@ import { Button } from "@/components/ui/button";
|
||||
import { Badge } from "@/components/ui/badge";
|
||||
import { Skeleton } from "@/components/ui/skeleton";
|
||||
import { useCorpus, useDeleteCorpusEntry, type CorpusDecision } from "@/lib/api/training";
|
||||
import { CorpusDetailDrawer } from "./corpus-detail-drawer";
|
||||
|
||||
/*
|
||||
* Corpus tab: table of all decisions currently in the style corpus, with a
|
||||
* single destructive action (remove from corpus). Uses browser confirm() for
|
||||
* the confirmation — a full shadcn AlertDialog would be overkill for an
|
||||
* admin-only destructive action with a server-side safety net.
|
||||
* Corpus tab: table of all decisions currently in the style corpus.
|
||||
*
|
||||
* Click any row → opens CorpusDetailDrawer with the enriched metadata
|
||||
* + edit UI. The trash button is now in its own narrow column and uses
|
||||
* stopPropagation so deleting a row doesn't also open the drawer.
|
||||
*
|
||||
* We use browser confirm() for the destructive action rather than a
|
||||
* full shadcn AlertDialog because this is a single admin operation
|
||||
* gated by an API-level safety net (FK cascade is best-effort but
|
||||
* style_corpus DELETE returns 404 on missing rows, so the worst case
|
||||
* is a no-op).
|
||||
*/
|
||||
|
||||
function formatChars(n: number) {
|
||||
@@ -30,9 +39,12 @@ function formatDate(iso: string) {
|
||||
}
|
||||
}
|
||||
|
||||
function Row({ item }: { item: CorpusDecision }) {
|
||||
function Row({
|
||||
item, onOpen,
|
||||
}: { item: CorpusDecision; onOpen: () => void }) {
|
||||
const del = useDeleteCorpusEntry();
|
||||
const onDelete = async () => {
|
||||
const onDelete = async (e: React.MouseEvent) => {
|
||||
e.stopPropagation();
|
||||
if (!window.confirm(`למחוק את החלטה ${item.decision_number} מהקורפוס?`)) return;
|
||||
try {
|
||||
await del.mutateAsync(item.id);
|
||||
@@ -43,7 +55,10 @@ function Row({ item }: { item: CorpusDecision }) {
|
||||
};
|
||||
|
||||
return (
|
||||
<TableRow className="border-rule hover:bg-gold-wash/30">
|
||||
<TableRow
|
||||
className="border-rule hover:bg-gold-wash/30 cursor-pointer"
|
||||
onClick={onOpen}
|
||||
>
|
||||
<TableCell className="font-semibold text-navy tabular-nums">
|
||||
{item.decision_number || "—"}
|
||||
</TableCell>
|
||||
@@ -55,20 +70,39 @@ function Row({ item }: { item: CorpusDecision }) {
|
||||
<span className="text-ink-light">—</span>
|
||||
) : (
|
||||
<div className="flex flex-wrap gap-1">
|
||||
{item.subject_categories.map((s) => (
|
||||
<Badge
|
||||
key={s}
|
||||
variant="outline"
|
||||
className="text-[0.7rem] bg-gold-wash text-gold-deep border-gold/40"
|
||||
>
|
||||
{item.subject_categories.slice(0, 3).map((s) => (
|
||||
<Badge key={s} variant="outline"
|
||||
className="text-[0.7rem] bg-gold-wash text-gold-deep border-gold/40">
|
||||
{s}
|
||||
</Badge>
|
||||
))}
|
||||
{item.subject_categories.length > 3 && (
|
||||
<span className="text-[0.7rem] text-ink-muted">
|
||||
+{item.subject_categories.length - 3}
|
||||
</span>
|
||||
)}
|
||||
</div>
|
||||
)}
|
||||
</TableCell>
|
||||
<TableCell className="text-[0.78rem] text-ink-soft">
|
||||
<div className="flex items-center gap-2">
|
||||
<span className="truncate">{item.legal_citation || "—"}</span>
|
||||
{item.lessons_count > 0 && (
|
||||
<Badge variant="outline"
|
||||
className="text-[0.7rem] bg-info-bg text-info border-info/40 shrink-0">
|
||||
<Sparkles className="w-3 h-3 me-0.5" />
|
||||
{item.lessons_count}
|
||||
</Badge>
|
||||
)}
|
||||
</div>
|
||||
</TableCell>
|
||||
<TableCell className="text-ink-soft tabular-nums">
|
||||
{formatChars(item.chars)}
|
||||
{item.page_count > 0 && (
|
||||
<span className="text-ink-muted text-[0.72rem] ms-1">
|
||||
· {item.page_count} ע׳
|
||||
</span>
|
||||
)}
|
||||
</TableCell>
|
||||
<TableCell className="text-ink-muted tabular-nums text-[0.78rem]">
|
||||
{formatDate(item.created_at)}
|
||||
@@ -91,6 +125,7 @@ function Row({ item }: { item: CorpusDecision }) {
|
||||
|
||||
export function CorpusPanel() {
|
||||
const { data, isPending, error } = useCorpus();
|
||||
const [selected, setSelected] = useState<CorpusDecision | null>(null);
|
||||
|
||||
if (error) {
|
||||
return (
|
||||
@@ -101,40 +136,50 @@ export function CorpusPanel() {
|
||||
}
|
||||
|
||||
return (
|
||||
<div className="rounded-lg border border-rule bg-surface shadow-sm overflow-hidden">
|
||||
<Table>
|
||||
<TableHeader className="bg-rule-soft/60">
|
||||
<TableRow className="border-rule">
|
||||
<TableHead className="text-navy text-right">מס׳ החלטה</TableHead>
|
||||
<TableHead className="text-navy text-right">תאריך</TableHead>
|
||||
<TableHead className="text-navy text-right">נושאים</TableHead>
|
||||
<TableHead className="text-navy text-right">תווים</TableHead>
|
||||
<TableHead className="text-navy text-right">נוסף בתאריך</TableHead>
|
||||
<TableHead className="text-navy" />
|
||||
</TableRow>
|
||||
</TableHeader>
|
||||
<TableBody>
|
||||
{isPending ? (
|
||||
[...Array(4)].map((_, i) => (
|
||||
<TableRow key={i} className="border-rule">
|
||||
{[...Array(6)].map((_, j) => (
|
||||
<TableCell key={j}>
|
||||
<Skeleton className="h-4 w-24" />
|
||||
</TableCell>
|
||||
))}
|
||||
</TableRow>
|
||||
))
|
||||
) : data?.length === 0 ? (
|
||||
<TableRow>
|
||||
<TableCell colSpan={6} className="text-center text-ink-muted py-12">
|
||||
הקורפוס ריק
|
||||
</TableCell>
|
||||
<>
|
||||
<div className="rounded-lg border border-rule bg-surface shadow-sm overflow-hidden">
|
||||
<Table>
|
||||
<TableHeader className="bg-rule-soft/60">
|
||||
<TableRow className="border-rule">
|
||||
<TableHead className="text-navy text-right">מס׳ החלטה</TableHead>
|
||||
<TableHead className="text-navy text-right">תאריך</TableHead>
|
||||
<TableHead className="text-navy text-right">נושאים</TableHead>
|
||||
<TableHead className="text-navy text-right">מראה מקום</TableHead>
|
||||
<TableHead className="text-navy text-right">תווים / עמודים</TableHead>
|
||||
<TableHead className="text-navy text-right">נוסף בתאריך</TableHead>
|
||||
<TableHead className="text-navy" />
|
||||
</TableRow>
|
||||
) : (
|
||||
data?.map((item) => <Row key={item.id} item={item} />)
|
||||
)}
|
||||
</TableBody>
|
||||
</Table>
|
||||
</div>
|
||||
</TableHeader>
|
||||
<TableBody>
|
||||
{isPending ? (
|
||||
[...Array(4)].map((_, i) => (
|
||||
<TableRow key={i} className="border-rule">
|
||||
{[...Array(7)].map((_, j) => (
|
||||
<TableCell key={j}>
|
||||
<Skeleton className="h-4 w-24" />
|
||||
</TableCell>
|
||||
))}
|
||||
</TableRow>
|
||||
))
|
||||
) : data?.length === 0 ? (
|
||||
<TableRow>
|
||||
<TableCell colSpan={7} className="text-center text-ink-muted py-12">
|
||||
הקורפוס ריק
|
||||
</TableCell>
|
||||
</TableRow>
|
||||
) : (
|
||||
data?.map((item) => (
|
||||
<Row key={item.id} item={item} onOpen={() => setSelected(item)} />
|
||||
))
|
||||
)}
|
||||
</TableBody>
|
||||
</Table>
|
||||
</div>
|
||||
|
||||
<CorpusDetailDrawer
|
||||
decision={selected}
|
||||
onOpenChange={(open) => { if (!open) setSelected(null); }}
|
||||
/>
|
||||
</>
|
||||
);
|
||||
}
|
||||
|
||||
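The table row caps the subject badges at three and renders a `+N` counter for the remainder. A small sketch of that split, assuming a hypothetical helper named `splitBadges` (the row component inlines the same `slice`/`length` logic in JSX) and made-up category names:

```typescript
// Hypothetical helper; the row component inlines the same slice/length logic.
function splitBadges(
  categories: string[],
  limit = 3,
): { visible: string[]; overflow: number } {
  return {
    visible: categories.slice(0, limit),
    overflow: Math.max(0, categories.length - limit),
  };
}

// Made-up category names, five in total.
const { visible, overflow } = splitBadges([
  "zoning", "permits", "levy", "heritage", "noise",
]);

// The overflow label is rendered only when something was cut off.
const overflowLabel = overflow > 0 ? `+${overflow}` : "";
```

Keeping the limit as a parameter makes it easy to show the full list elsewhere, as the detail drawer does, without duplicating the truncation rule.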
338
web-ui/src/components/training/curator-portrait-panel.tsx
Normal file
@@ -0,0 +1,338 @@
|
||||
"use client";
|
||||
|
||||
/*
|
||||
* Curator-Portrait tab — shows everything about the agent that learns
|
||||
* Daphna's style:
|
||||
* 1. Snapshot stats (curator findings to date, % applied)
|
||||
* 2. Recent curator findings (last 10) — linked by decision number
|
||||
* 3. The hermes-curator system prompt, rendered + linked to Gitea
|
||||
* 4. The style_analyzer training prompts (different lifecycle — runs
|
||||
* over the corpus at training time, not per-decision)
|
||||
* 5. Propose-change form — writes a markdown file to disk for chair
|
||||
* review (no auto-commit)
|
||||
*
|
||||
* The prompts are deliberately read-only here: they're symlinked into
|
||||
* Paperclip and load-bearing for every curator wake. Editing them from
|
||||
* the UI would silently fork the source of truth.
|
||||
*/
|
||||
|
||||
import { useState } from "react";
|
||||
import {
|
||||
Sparkles, ExternalLink, Send, Loader2, FileText, Brain,
|
||||
CheckCircle2, Clock,
|
||||
} from "lucide-react";
|
||||
import { toast } from "sonner";
|
||||
import { Card, CardContent } from "@/components/ui/card";
|
||||
import { Button } from "@/components/ui/button";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Label } from "@/components/ui/label";
|
||||
import { Textarea } from "@/components/ui/textarea";
|
||||
import { Badge } from "@/components/ui/badge";
|
||||
import { Skeleton } from "@/components/ui/skeleton";
|
||||
import { ScrollArea } from "@/components/ui/scroll-area";
|
||||
import { Tabs, TabsContent, TabsList, TabsTrigger } from "@/components/ui/tabs";
|
||||
import { Markdown } from "@/components/ui/markdown";
|
||||
import {
|
||||
useCuratorPrompt,
|
||||
useCuratorStats,
|
||||
useStyleAnalyzerPrompts,
|
||||
useSubmitCuratorProposal,
|
||||
} from "@/lib/api/training";
|
||||
|
||||
export function CuratorPortraitPanel() {
|
||||
return (
|
||||
<div className="space-y-6">
|
||||
<StatsCard />
|
||||
<RecentFindings />
|
||||
|
||||
<Tabs defaultValue="curator-prompt" dir="rtl">
|
||||
<TabsList className="bg-rule-soft/60">
|
||||
<TabsTrigger value="curator-prompt">פרומפט ה-Curator</TabsTrigger>
|
||||
<TabsTrigger value="analyzer-prompt">פרומפט אימון הסגנון</TabsTrigger>
|
||||
<TabsTrigger value="propose">הצעת שינוי</TabsTrigger>
|
||||
</TabsList>
|
||||
<TabsContent value="curator-prompt" className="mt-4">
|
||||
<CuratorPromptCard />
|
||||
</TabsContent>
|
||||
<TabsContent value="analyzer-prompt" className="mt-4">
|
||||
<StyleAnalyzerPromptCard />
|
||||
</TabsContent>
|
||||
<TabsContent value="propose" className="mt-4">
|
||||
<ProposeChangeForm />
|
||||
</TabsContent>
|
||||
</Tabs>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
// ── stats card ─────────────────────────────────────────────────────
|
||||
|
||||
function StatsCard() {
|
||||
const { data, isPending } = useCuratorStats();
|
||||
|
||||
if (isPending) {
|
||||
return (
|
||||
<div className="grid grid-cols-2 md:grid-cols-4 gap-3">
|
||||
{[...Array(4)].map((_, i) => <Skeleton key={i} className="h-20 w-full" />)}
|
||||
</div>
|
||||
);
|
||||
}
|
||||
if (!data) return null;
|
||||
|
||||
return (
|
||||
<div className="grid grid-cols-2 md:grid-cols-4 gap-3">
|
||||
<Kpi label="ממצאי curator" value={data.total_findings} icon={<Sparkles className="w-4 h-4" />} />
|
||||
<Kpi label="החלטות שנסקרו" value={`${data.decisions_with_findings}/${data.decisions_total}`} icon={<FileText className="w-4 h-4" />} />
|
||||
<Kpi label="ממצאים שאומצו ל-SKILL" value={data.findings_applied} icon={<CheckCircle2 className="w-4 h-4" />} />
|
||||
<Kpi label="ממוצע ממצאים להחלטה"
|
||||
value={
|
||||
data.decisions_with_findings > 0
|
||||
? (data.total_findings / data.decisions_with_findings).toFixed(1)
|
||||
: "—"
|
||||
}
|
||||
icon={<Brain className="w-4 h-4" />}
|
||||
/>
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
function Kpi({
|
||||
label, value, icon,
|
||||
}: { label: string; value: string | number; icon: React.ReactNode }) {
|
||||
return (
|
||||
<Card className="bg-surface border-rule">
|
||||
<CardContent className="px-4 py-3">
|
||||
<div className="flex items-center gap-2 text-ink-muted text-[0.78rem]">
|
||||
{icon}
|
||||
<span>{label}</span>
|
||||
</div>
|
||||
<p className="text-2xl text-navy font-semibold tabular-nums mt-1">{value}</p>
|
||||
</CardContent>
|
||||
</Card>
|
||||
);
|
||||
}
|
||||
|
||||
// ── recent findings ────────────────────────────────────────────────
|
||||
|
||||
function RecentFindings() {
|
||||
const { data, isPending } = useCuratorStats();
|
||||
|
||||
if (isPending) {
|
||||
return <Skeleton className="h-40 w-full" />;
|
||||
}
|
||||
if (!data || data.recent_findings.length === 0) {
|
||||
return (
|
||||
<Card className="bg-rule-soft/40 border-rule">
|
||||
<CardContent className="px-6 py-5 text-center text-ink-muted text-sm">
|
||||
אין עדיין ממצאים של ה-Curator. הוא מופעל אוטומטית כאשר דפנה מסמנת
|
||||
החלטה כסופית (mark-final), ושומר את ממצאיו כ-decision_lessons עם
|
||||
source="curator".
|
||||
</CardContent>
|
||||
</Card>
|
||||
);
|
||||
}
|
||||
|
||||
return (
|
||||
<Card className="bg-surface border-rule">
|
||||
<CardContent className="px-4 py-3">
|
||||
<h3 className="text-[0.78rem] uppercase tracking-wider text-gold-deep font-semibold mb-3">
|
||||
ממצאים אחרונים של ה-Curator
|
||||
</h3>
|
||||
<ul className="space-y-2">
|
||||
{data.recent_findings.map((f) => (
|
||||
<li key={f.id} className="border-b border-rule pb-2 last:border-0 last:pb-0">
|
||||
<div className="flex items-center gap-2 text-[0.72rem] mb-1">
|
||||
<Badge variant="outline"
|
||||
className="bg-info-bg text-info border-info/40">
|
||||
{f.category}
|
||||
</Badge>
|
||||
<span className="text-navy font-semibold tabular-nums">
|
||||
{f.decision_number || "—"}
|
||||
</span>
|
||||
{f.applied_to_skill && (
|
||||
<Badge variant="outline"
|
||||
className="bg-success-bg text-success border-success/40">
|
||||
<CheckCircle2 className="w-3 h-3 me-0.5" />
|
||||
אומץ
|
||||
</Badge>
|
||||
)}
|
||||
<span className="grow text-ink-muted text-end">
|
||||
<Clock className="w-3 h-3 inline me-1" />
|
||||
{new Date(f.created_at).toLocaleDateString("he-IL")}
|
||||
</span>
|
||||
</div>
|
||||
<p className="text-sm text-ink leading-relaxed">{f.lesson_text}</p>
|
||||
</li>
|
||||
))}
|
||||
</ul>
|
||||
</CardContent>
|
||||
</Card>
|
||||
);
|
||||
}
|
||||
|
||||
// ── prompts ────────────────────────────────────────────────────────
|
||||
|
||||
function CuratorPromptCard() {
|
||||
const { data, isPending, error } = useCuratorPrompt();
|
||||
|
||||
if (isPending) return <Skeleton className="h-96 w-full" />;
|
||||
if (error) {
|
||||
return (
|
||||
<Card className="bg-danger-bg border-danger/40">
|
||||
<CardContent className="px-6 py-4 text-danger">{error.message}</CardContent>
|
||||
</Card>
|
||||
);
|
||||
}
|
||||
if (!data) return null;
|
||||
|
||||
return (
|
||||
<Card className="bg-surface border-rule">
|
||||
<CardContent className="px-5 py-4 space-y-3">
|
||||
<div className="flex items-center justify-between gap-2 flex-wrap">
|
||||
<div>
|
||||
<h3 className="text-navy font-semibold">{data.filename}</h3>
|
||||
<p className="text-[0.72rem] text-ink-muted">
|
||||
{data.bytes.toLocaleString("he-IL")} בייטים ·
|
||||
עודכן: {new Date(data.last_modified * 1000).toLocaleString("he-IL")}
|
||||
</p>
|
||||
</div>
|
||||
<Button asChild variant="outline" size="sm">
|
||||
<a href={data.gitea_url} target="_blank" rel="noopener noreferrer">
|
||||
<ExternalLink className="w-3 h-3 me-1" />
|
||||
ערוך ב-Gitea
|
||||
</a>
|
||||
</Button>
|
||||
</div>
|
||||
<ScrollArea className="h-[520px] pe-2 border border-rule rounded p-3 bg-rule-soft/30">
|
||||
<Markdown content={data.content} />
|
||||
</ScrollArea>
|
||||
</CardContent>
|
||||
</Card>
|
||||
);
|
||||
}
|
||||
|
||||
function StyleAnalyzerPromptCard() {
|
||||
const { data, isPending } = useStyleAnalyzerPrompts();
|
||||
|
||||
if (isPending) return <Skeleton className="h-96 w-full" />;
|
||||
if (!data) return null;
|
||||
|
||||
return (
|
||||
<Card className="bg-surface border-rule">
|
||||
<CardContent className="px-5 py-4 space-y-3">
|
||||
<div>
|
||||
<h3 className="text-navy font-semibold">פרומפטים של style_analyzer.py</h3>
|
||||
<p className="text-[0.72rem] text-ink-muted">
|
||||
רץ ב-Claude Opus (1M context, עד {data.max_input_tokens.toLocaleString("he-IL")} tokens
|
||||
input) דרך claude CLI מקומי — חינמי, ללא API. נקרא ע"י
|
||||
<code className="px-1 mx-1 bg-rule-soft rounded">POST /api/training/analyze-style</code>
|
||||
ומכניס דפוסים ל-<code className="px-1 bg-rule-soft rounded">style_patterns</code>.
|
||||
</p>
|
||||
</div>
|
||||
|
||||
<Tabs defaultValue="analysis" dir="rtl">
|
||||
<TabsList className="bg-rule-soft/60">
|
||||
<TabsTrigger value="analysis">Single-pass (כל הקורפוס)</TabsTrigger>
|
||||
<TabsTrigger value="single">Multi-pass (החלטה אחת)</TabsTrigger>
|
||||
<TabsTrigger value="synthesis">Synthesis</TabsTrigger>
|
||||
</TabsList>
|
||||
<TabsContent value="analysis" className="mt-3">
|
||||
<PromptBlock content={data.analysis_prompt} />
|
||||
</TabsContent>
|
||||
<TabsContent value="single" className="mt-3">
|
||||
<PromptBlock content={data.single_decision_prompt} />
|
||||
</TabsContent>
|
||||
<TabsContent value="synthesis" className="mt-3">
|
||||
<PromptBlock content={data.synthesis_prompt} />
|
||||
</TabsContent>
|
||||
</Tabs>
|
||||
</CardContent>
|
||||
</Card>
|
||||
);
|
||||
}
|
||||
|
||||
function PromptBlock({ content }: { content: string }) {
|
||||
return (
|
||||
<ScrollArea className="h-[420px] pe-2 border border-rule rounded p-3 bg-rule-soft/30">
|
||||
<pre className="text-[0.78rem] whitespace-pre-wrap font-mono text-ink leading-relaxed"
|
||||
dir="rtl">
|
||||
{content}
|
||||
</pre>
|
||||
</ScrollArea>
|
||||
);
|
||||
}
|
||||
|
||||
// ── propose change form ────────────────────────────────────────────
|
||||
|
||||
function ProposeChangeForm() {
|
||||
const [title, setTitle] = useState("");
|
||||
const [proposedChange, setProposedChange] = useState("");
|
||||
const [rationale, setRationale] = useState("");
|
||||
const submit = useSubmitCuratorProposal();
|
||||
|
||||
const onSubmit = async (e: React.FormEvent) => {
|
||||
e.preventDefault();
|
||||
if (!title.trim() || !proposedChange.trim()) {
|
||||
toast.error("חובה כותרת ושינוי מוצע");
|
||||
return;
|
||||
}
|
||||
try {
|
||||
const r = await submit.mutateAsync({
|
||||
title: title.trim(),
|
||||
proposed_change: proposedChange.trim(),
|
||||
rationale: rationale.trim(),
|
||||
});
|
||||
toast.success(`נשמרה הצעה: ${r.filename}`);
|
||||
setTitle(""); setProposedChange(""); setRationale("");
|
||||
} catch (e) {
|
||||
toast.error(e instanceof Error ? e.message : "כשל בשמירה");
|
||||
}
|
||||
};
|
||||
|
||||
return (
|
||||
<Card className="bg-surface border-rule">
|
||||
<CardContent className="px-5 py-4">
|
||||
<h3 className="text-navy font-semibold mb-2">הצעת שינוי לפרומפט ה-Curator</h3>
|
||||
<p className="text-[0.78rem] text-ink-muted mb-4">
|
||||
ההצעה תישמר כקובץ Markdown ב-
|
||||
<code className="px-1 bg-rule-soft rounded">data/curator-proposals/</code>.
|
||||
חיים יבחן ויאשר ידנית — אין שינוי אוטומטי בפרומפט.
|
||||
</p>
|
||||
<form onSubmit={onSubmit} className="space-y-3">
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="proposal-title">כותרת השינוי</Label>
|
||||
<Input id="proposal-title" value={title}
|
||||
onChange={(e) => setTitle(e.target.value)}
|
||||
placeholder="לדוגמה: הוסף קטגוריה [צ׳קליסט תוכן] לממצאי ה-curator"
|
||||
dir="rtl" />
|
||||
</div>
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="proposal-change">השינוי המוצע (Markdown)</Label>
|
||||
<Textarea id="proposal-change" value={proposedChange} rows={6}
|
||||
onChange={(e) => setProposedChange(e.target.value)}
|
||||
placeholder={"תאר במדויק מה לשנות. אפשר להעתיק את הקטע הקיים ולסמן ב-strikethrough + להוסיף את החדש."}
|
||||
dir="rtl" />
|
||||
</div>
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="proposal-rationale">נימוק</Label>
|
||||
<Textarea id="proposal-rationale" value={rationale} rows={3}
|
||||
onChange={(e) => setRationale(e.target.value)}
|
||||
placeholder="למה השינוי הזה חשוב? איזה בעיה הוא פותר?"
|
||||
dir="rtl" />
|
||||
</div>
|
||||
<div className="flex justify-end">
|
||||
<Button type="submit" disabled={submit.isPending}
|
||||
className="bg-navy text-parchment hover:bg-navy-soft">
|
||||
{submit.isPending ? (
|
||||
<Loader2 className="w-4 h-4 animate-spin me-1" />
|
||||
) : (
|
||||
<Send className="w-4 h-4 me-1" />
|
||||
)}
|
||||
שלח הצעה
|
||||
</Button>
|
||||
</div>
|
||||
</form>
|
||||
</CardContent>
|
||||
</Card>
|
||||
);
|
||||
}
|
||||
267
web-ui/src/components/training/lessons-tab.tsx
Normal file
@@ -0,0 +1,267 @@
|
||||
"use client";
|
||||
|
||||
/*
|
||||
* Per-decision lessons editor — lives inside CorpusDetailDrawer's
|
||||
* "מה למדנו" tab. Lessons are persisted in the decision_lessons table
|
||||
* (one-to-many on style_corpus) and consumed by hermes-curator and
|
||||
* future style_analyzer runs as context.
|
||||
*
|
||||
* The chair can:
|
||||
* - Add a lesson typed manually (category = "general" by default)
|
||||
* - Edit / delete existing lessons
|
||||
* - Mark a lesson as "applied_to_skill" (informational — doesn't
|
||||
* auto-commit anything to SKILL.md; chair still curates that file
|
||||
* manually in git).
|
||||
*
|
||||
* Lessons from the curator arrive with source="curator" and are visually
|
||||
* distinguished by a badge so the chair can audit auto-suggestions.
|
||||
*/
|
||||
|
||||
import { useState } from "react";
|
||||
import { Plus, Save, Trash2, Loader2, CheckCircle2, Sparkles } from "lucide-react";
|
||||
import { toast } from "sonner";
|
||||
import { Button } from "@/components/ui/button";
|
||||
import { Card, CardContent } from "@/components/ui/card";
|
||||
import { Textarea } from "@/components/ui/textarea";
|
||||
import { Badge } from "@/components/ui/badge";
|
||||
import { Skeleton } from "@/components/ui/skeleton";
|
||||
import {
|
||||
Select, SelectContent, SelectItem, SelectTrigger, SelectValue,
|
||||
} from "@/components/ui/select";
|
||||
import {
|
||||
useAddLesson,
|
||||
useCorpusLessons,
|
||||
useDeleteLesson,
|
||||
usePatchLesson,
|
||||
type DecisionLesson,
|
||||
} from "@/lib/api/training";
|
||||
|
||||
const CATEGORIES = [
|
||||
{ value: "general", label: "כללי" },
|
||||
{ value: "style", label: "סגנון" },
|
||||
{ value: "structure", label: "מבנה" },
|
||||
{ value: "lexicon", label: "לקסיקון" },
|
||||
{ value: "tabular", label: "טבלאי" },
|
||||
] as const;
|
||||
|
||||
const SOURCE_BADGE: Record<DecisionLesson["source"], { label: string; cls: string }> = {
|
||||
manual: { label: "ידני", cls: "bg-rule-soft text-ink-soft" },
|
||||
chair: { label: "יו״ר", cls: "bg-gold-wash text-gold-deep" },
|
||||
curator: { label: "Curator", cls: "bg-info-bg text-info" },
|
||||
style_analyzer: { label: "Analyzer", cls: "bg-success-bg text-success" },
|
||||
};
|
||||
|
||||
export function LessonsTab({ corpusId }: { corpusId: string }) {
|
||||
const { data, isPending } = useCorpusLessons(corpusId);
|
||||
const add = useAddLesson(corpusId);
|
||||
const [draftText, setDraftText] = useState("");
|
||||
const [draftCategory, setDraftCategory] = useState<DecisionLesson["category"]>("general");
|
||||
|
||||
const onAdd = async () => {
|
||||
const text = draftText.trim();
|
||||
if (!text) return;
|
||||
try {
|
||||
await add.mutateAsync({ lesson_text: text, category: draftCategory });
|
||||
setDraftText("");
|
||||
setDraftCategory("general");
|
||||
toast.success("הלקח נוסף");
|
||||
} catch (e) {
|
||||
toast.error(e instanceof Error ? e.message : "כשל בשמירה");
|
||||
}
|
||||
};
|
||||
|
||||
return (
|
||||
<div className="space-y-4">
|
||||
{/* Composer */}
|
||||
<Card className="bg-surface border-rule">
|
||||
<CardContent className="px-4 py-3 space-y-2">
|
||||
<h4 className="text-[0.78rem] uppercase tracking-wider text-gold-deep font-semibold">
|
||||
הוסף לקח להחלטה
|
||||
</h4>
|
||||
<Textarea
|
||||
value={draftText}
|
||||
onChange={(e) => setDraftText(e.target.value)}
|
||||
placeholder="מה למדנו מההחלטה הזו? למשל: 'דפנה מעדיפה הוצאות מתונות (5K-10K ₪) גם בערר שהתקבל במלואו'"
|
||||
rows={3}
|
||||
dir="rtl"
|
||||
disabled={add.isPending}
|
||||
/>
|
||||
<div className="flex items-center gap-2">
|
||||
<Select
|
||||
value={draftCategory}
|
||||
onValueChange={(v) => setDraftCategory(v as DecisionLesson["category"])}
|
||||
disabled={add.isPending}
|
||||
dir="rtl"
|
||||
>
|
||||
<SelectTrigger className="w-40">
|
||||
<SelectValue />
|
||||
</SelectTrigger>
|
||||
<SelectContent>
|
||||
{CATEGORIES.map((c) => (
|
||||
<SelectItem key={c.value} value={c.value}>{c.label}</SelectItem>
|
||||
))}
|
||||
</SelectContent>
|
||||
</Select>
|
||||
<div className="grow" />
|
||||
<Button onClick={onAdd} disabled={add.isPending || !draftText.trim()}
|
||||
className="bg-navy text-parchment hover:bg-navy-soft">
|
||||
{add.isPending ? (
|
||||
<Loader2 className="w-4 h-4 animate-spin me-1" />
|
||||
) : (
|
||||
<Plus className="w-4 h-4 me-1" />
|
||||
)}
|
||||
שמור לקח
|
||||
</Button>
|
||||
</div>
|
||||
</CardContent>
|
||||
</Card>
|
||||
|
||||
{/* List */}
|
||||
{isPending ? (
|
||||
<div className="space-y-2">
|
||||
{[...Array(3)].map((_, i) => (
|
||||
<Skeleton key={i} className="h-16 w-full" />
|
||||
))}
|
||||
</div>
|
||||
) : !data || data.length === 0 ? (
|
||||
<p className="text-center text-ink-muted text-sm py-6">
|
||||
אין עדיין לקחים להחלטה זו. הוסף לקח ראשון מלמעלה.
|
||||
</p>
|
||||
) : (
|
||||
<div className="space-y-2">
|
||||
{data.map((lesson) => (
|
||||
<LessonItem key={lesson.id} lesson={lesson} corpusId={corpusId} />
|
||||
))}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
);
|
||||
}
|
||||
|
||||
function LessonItem({
|
||||
lesson, corpusId,
|
||||
}: { lesson: DecisionLesson; corpusId: string }) {
|
||||
const [editing, setEditing] = useState(false);
|
||||
const [text, setText] = useState(lesson.lesson_text);
|
||||
const [category, setCategory] = useState<DecisionLesson["category"]>(lesson.category);
|
||||
const patch = usePatchLesson(corpusId);
|
||||
const del = useDeleteLesson(corpusId);
|
||||
|
||||
const sourceBadge = SOURCE_BADGE[lesson.source];
|
||||
const dirty = text !== lesson.lesson_text || category !== lesson.category;
|
||||
|
||||
const onSave = async () => {
// Nothing changed: leave edit mode without sending an empty PATCH.
if (!dirty) {
setEditing(false);
return;
}
try {
await patch.mutateAsync({
id: lesson.id,
patch: { lesson_text: text, category },
});
setEditing(false);
toast.success("הלקח עודכן");
} catch (e) {
toast.error(e instanceof Error ? e.message : "כשל בעדכון");
}
};
|
||||
|
||||
const onToggleApplied = async () => {
|
||||
try {
|
||||
await patch.mutateAsync({
|
||||
id: lesson.id,
|
||||
patch: { applied_to_skill: !lesson.applied_to_skill },
|
||||
});
|
||||
} catch (e) {
|
||||
toast.error(e instanceof Error ? e.message : "כשל בעדכון");
|
||||
}
|
||||
};
|
||||
|
||||
const onDelete = async () => {
|
||||
if (!window.confirm("למחוק את הלקח?")) return;
|
||||
try {
|
||||
await del.mutateAsync(lesson.id);
|
||||
toast.success("נמחק");
|
||||
} catch (e) {
|
||||
toast.error(e instanceof Error ? e.message : "כשל במחיקה");
|
||||
}
|
||||
};
|
||||
|
||||
return (
|
||||
<Card className="bg-surface border-rule">
|
||||
<CardContent className="px-4 py-3 space-y-2">
|
||||
<div className="flex items-center gap-2 text-[0.72rem]">
|
||||
<Badge variant="outline"
|
||||
className="bg-rule-soft text-ink-soft">
|
||||
{CATEGORIES.find((c) => c.value === lesson.category)?.label || lesson.category}
|
||||
</Badge>
|
||||
<Badge variant="outline" className={sourceBadge.cls}>
|
||||
{sourceBadge.label}
|
||||
</Badge>
|
||||
{lesson.applied_to_skill && (
|
||||
<Badge variant="outline"
|
||||
className="bg-success-bg text-success border-success/40">
|
||||
<CheckCircle2 className="w-3 h-3 me-1" />
|
||||
אומץ
|
||||
</Badge>
|
||||
)}
|
||||
<span className="grow text-ink-muted tabular-nums">
|
||||
{new Date(lesson.created_at).toLocaleDateString("he-IL")}
|
||||
</span>
|
||||
</div>
|
||||
|
||||
{editing ? (
|
||||
<>
|
||||
<Textarea value={text} onChange={(e) => setText(e.target.value)}
|
||||
rows={3} dir="rtl" />
|
||||
<div className="flex items-center gap-2">
|
||||
<Select value={category}
|
||||
onValueChange={(v) => setCategory(v as DecisionLesson["category"])}
|
||||
dir="rtl">
|
||||
<SelectTrigger className="w-40">
|
||||
<SelectValue />
|
||||
</SelectTrigger>
|
||||
<SelectContent>
|
||||
{CATEGORIES.map((c) => (
|
||||
<SelectItem key={c.value} value={c.value}>{c.label}</SelectItem>
|
||||
))}
|
||||
</SelectContent>
|
||||
</Select>
|
||||
<div className="grow" />
|
||||
<Button variant="ghost" size="sm"
|
||||
onClick={() => { setEditing(false); setText(lesson.lesson_text); setCategory(lesson.category); }}>
|
||||
ביטול
|
||||
</Button>
|
||||
<Button size="sm" onClick={onSave} disabled={patch.isPending}
|
||||
className="bg-navy text-parchment hover:bg-navy-soft">
|
||||
<Save className="w-3 h-3 me-1" />
|
||||
שמור
|
||||
</Button>
|
||||
</div>
|
||||
</>
|
||||
) : (
|
||||
<>
|
||||
<p className="text-sm text-ink leading-relaxed whitespace-pre-wrap"
|
||||
onClick={() => setEditing(true)}
|
||||
style={{ cursor: "text" }}>
|
||||
{lesson.lesson_text}
|
||||
</p>
|
||||
<div className="flex items-center gap-2">
|
||||
<Button variant="ghost" size="sm" onClick={onToggleApplied}
|
||||
disabled={patch.isPending}>
|
||||
<Sparkles className="w-3 h-3 me-1" />
|
||||
{lesson.applied_to_skill ? "בטל סימון 'אומץ'" : "סמן כ'אומץ ל-SKILL'"}
|
||||
</Button>
|
||||
<Button variant="ghost" size="sm" onClick={() => setEditing(true)}>
|
||||
ערוך
|
||||
</Button>
|
||||
<div className="grow" />
|
||||
<Button variant="ghost" size="sm" onClick={onDelete}
|
||||
disabled={del.isPending}
|
||||
className="text-danger hover:text-danger hover:bg-danger-bg">
|
||||
<Trash2 className="w-3 h-3" />
|
||||
</Button>
|
||||
</div>
|
||||
</>
|
||||
)}
|
||||
</CardContent>
|
||||
</Card>
|
||||
);
|
||||
}
|
||||
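The dirty check in `LessonItem` compares the draft text and category against the saved lesson before patching. A minimal sketch of that idea as a standalone helper follows. `buildLessonPatch`, `LessonFields`, and the `Category` union are hypothetical names introduced for illustration (the real `DecisionLesson` type lives in `@/lib/api/training` and is not shown in this diff), and the category values are assumed from the `CATEGORIES` list above.

```typescript
// Hypothetical helper, not part of the diff: builds a minimal PATCH body.
// The Category union is an assumption based on the CATEGORIES constant.
type Category = "general" | "style" | "structure" | "lexicon" | "tabular";

interface LessonFields {
  lesson_text: string;
  category: Category;
}

// Returns only the fields that differ from the saved lesson,
// or null when nothing changed so the caller can skip the request.
function buildLessonPatch(
  saved: LessonFields,
  draft: LessonFields,
): Partial<LessonFields> | null {
  const patch: Partial<LessonFields> = {};
  if (draft.lesson_text !== saved.lesson_text) {
    patch.lesson_text = draft.lesson_text;
  }
  if (draft.category !== saved.category) {
    patch.category = draft.category;
  }
  return Object.keys(patch).length > 0 ? patch : null;
}
```

Returning `null` for an unchanged draft lets a save handler close the editor without a network round trip, while a changed draft sends only the fields that actually differ.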
328
web-ui/src/components/training/upload-dialog.tsx
Normal file
@@ -0,0 +1,328 @@
|
||||
"use client";
|
||||
|
||||
/*
|
||||
* Upload a Daphna decision into the style corpus, from the /training page.
|
||||
*
|
||||
* The flow is three explicit steps inside the same sheet:
|
||||
* 1. file picker → POST /api/upload (gets sanitized filename)
|
||||
* 2. preview → POST /api/training/analyze (proofread + auto-extracted meta)
|
||||
* chair can correct decision_number / decision_date / subjects
|
||||
* 3. commit → POST /api/training/upload (background task)
|
||||
* progress watched via SSE; on completion we invalidate
|
||||
* corpus + style-report so the new row appears.
|
||||
*
|
||||
* The Sheet UX mirrors precedent-upload-sheet.tsx: same dir="rtl", same
|
||||
* loading + error patterns, same toast on success. The reason this isn't
|
||||
* a single one-click upload is that style-corpus rows are write-once
|
||||
* (we don't allow editing full_text), so the chair MUST see the proofread
|
||||
* preview before committing — otherwise a bad OCR/proofread can silently
|
||||
* pollute the style portrait.
|
||||
*/
|
||||
|
||||
import { useEffect, useState } from "react";
|
||||
import { Upload, Loader2, CheckCircle2, AlertCircle, FileText } from "lucide-react";
|
||||
import { toast } from "sonner";
|
||||
import { useQueryClient } from "@tanstack/react-query";
|
||||
import {
|
||||
Sheet, SheetContent, SheetHeader, SheetTitle, SheetDescription,
|
||||
} from "@/components/ui/sheet";
|
||||
import { Button } from "@/components/ui/button";
|
||||
import { Input } from "@/components/ui/input";
|
||||
import { Label } from "@/components/ui/label";
|
||||
import { Progress } from "@/components/ui/progress";
|
||||
import { Badge } from "@/components/ui/badge";
|
||||
import {
|
||||
trainingKeys,
|
||||
useAnalyzeTraining,
|
||||
useCommitTrainingUpload,
|
||||
useUploadFile,
|
||||
type AnalyzeTrainingResponse,
|
||||
} from "@/lib/api/training";
|
||||
import { useProgress } from "@/lib/api/documents";
|
||||
|
||||
const ACCEPT = ".pdf,.docx,.doc,.rtf,.txt,.md";
|
||||
|
||||
type Props = {
|
||||
open: boolean;
|
||||
onOpenChange: (open: boolean) => void;
|
||||
};
|
||||
|
||||
type Stage = "pick" | "analyzing" | "preview" | "committing" | "done" | "error";
|
||||
|
||||
export function TrainingUploadDialog({ open, onOpenChange }: Props) {
|
||||
const [stage, setStage] = useState<Stage>("pick");
|
||||
const [file, setFile] = useState<File | null>(null);
|
||||
const [analysis, setAnalysis] = useState<AnalyzeTrainingResponse | null>(null);
|
||||
// editable copies of the auto-extracted metadata
|
||||
const [decisionNumber, setDecisionNumber] = useState("");
|
||||
const [decisionDate, setDecisionDate] = useState("");
|
||||
const [subjectsRaw, setSubjectsRaw] = useState("");
|
||||
const [title, setTitle] = useState("");
|
||||
const [taskId, setTaskId] = useState<string | null>(null);
|
||||
const [errorMsg, setErrorMsg] = useState("");
|
||||
|
||||
const uploadFile = useUploadFile();
|
||||
const analyze = useAnalyzeTraining();
|
||||
const commit = useCommitTrainingUpload();
|
||||
const progress = useProgress(taskId);
|
||||
const qc = useQueryClient();
|
||||
|
||||
// Reset everything when the sheet closes. This matters because Sheet keeps
// the component mounted between opens. The set-state-in-effect lint rule is
// disabled below on purpose: the reset is exactly the side effect we want.
|
||||
useEffect(() => {
|
||||
if (open) return;
|
||||
/* eslint-disable react-hooks/set-state-in-effect */
|
||||
setStage("pick"); setFile(null); setAnalysis(null);
|
||||
setDecisionNumber(""); setDecisionDate(""); setSubjectsRaw("");
|
||||
setTitle(""); setTaskId(null); setErrorMsg("");
|
||||
/* eslint-enable react-hooks/set-state-in-effect */
|
||||
}, [open]);
|
||||
|
||||
// Watch background task. When complete, invalidate corpus + report so the
|
||||
// new row + updated stats show up automatically. The setStage call here
|
||||
// is the deliberate UX (success card → auto-close) — synchronizing UI
|
||||
// with the external SSE stream is exactly what effects are for.
|
||||
useEffect(() => {
|
||||
if (!progress) return;
|
||||
if (progress.status === "completed") {
|
||||
qc.invalidateQueries({ queryKey: trainingKeys.corpus() });
|
||||
qc.invalidateQueries({ queryKey: trainingKeys.report() });
|
||||
// eslint-disable-next-line react-hooks/set-state-in-effect
|
||||
setStage("done");
|
||||
toast.success(`החלטה ${decisionNumber || analysis?.decision_number || ""} נוספה לקורפוס`);
|
||||
const t = window.setTimeout(() => onOpenChange(false), 1500);
|
||||
return () => window.clearTimeout(t);
|
||||
}
|
||||
if (progress.status === "failed") {
|
||||
setStage("error");
|
||||
setErrorMsg(progress.error || "כשל בעיבוד");
|
||||
}
|
||||
}, [progress, analysis, decisionNumber, qc, onOpenChange]);
|
||||
|
||||
const onPickFile = async (f: File | null) => {
|
||||
setFile(f);
|
||||
setErrorMsg("");
|
||||
if (!f) return;
|
||||
setStage("analyzing");
|
||||
try {
|
||||
const { filename } = await uploadFile.mutateAsync(f);
|
||||
const result = await analyze.mutateAsync(filename);
|
||||
setAnalysis(result);
|
||||
setDecisionNumber(result.decision_number);
|
||||
setDecisionDate(result.decision_date);
|
||||
setSubjectsRaw(result.subject_categories.join(", "));
|
||||
// Default title from the original filename stem (chair can override).
|
||||
const stem = f.name.replace(/\.[^.]+$/, "");
|
||||
setTitle(stem);
|
||||
setStage("preview");
|
||||
} catch (e) {
|
||||
setStage("error");
|
||||
setErrorMsg(e instanceof Error ? e.message : "כשל בקריאת הקובץ");
|
||||
}
|
||||
};
|
||||
|
||||
const onCommit = async () => {
|
||||
if (!analysis) return;
|
||||
setStage("committing");
|
||||
setErrorMsg("");
|
||||
try {
|
||||
const subjects = subjectsRaw
|
||||
.split(/[,،]/)
|
||||
.map((s) => s.trim())
|
||||
.filter(Boolean);
|
||||
const res = await commit.mutateAsync({
|
||||
filename: analysis.filename,
|
||||
decision_number: decisionNumber.trim(),
|
||||
decision_date: decisionDate || "",
|
||||
subject_categories: subjects,
|
||||
title: title.trim() || undefined,
|
||||
});
|
||||
setTaskId(res.task_id);
|
||||
} catch (e) {
|
||||
setStage("error");
|
||||
// 409 = duplicate decision_number — surface the backend's Hebrew message.
|
||||
setErrorMsg(e instanceof Error ? e.message : "כשל בהעלאה");
|
||||
}
|
||||
};
|
||||
|
||||
const isProcessing =
|
||||
stage === "analyzing" || stage === "committing" ||
|
||||
(taskId !== null && progress?.status !== "completed" && progress?.status !== "failed");
|
||||
const progressStep = (progress as { step?: string } | null)?.step;
|
||||
|
||||
return (
|
||||
<Sheet open={open} onOpenChange={onOpenChange}>
|
||||
<SheetContent side="left" className="w-full sm:max-w-2xl overflow-y-auto" dir="rtl">
|
||||
<SheetHeader>
|
||||
<SheetTitle className="text-navy">העלאת החלטה לקורפוס הסגנון</SheetTitle>
|
||||
<SheetDescription className="text-ink-muted">
|
||||
הקובץ יעבור הגהה (סינון Nevo, ניקוד), חילוץ אוטומטי של מספר תיק, תאריך
|
||||
ונושאים, ויוטמע ב-style_corpus עם chunks ו-embeddings. תוכל לתקן את
|
||||
פרטי המטא-דאטה לפני שמירה.
|
||||
</SheetDescription>
|
||||
</SheetHeader>
|
||||
|
||||
<div className="px-6 pb-6 mt-4 space-y-4">
|
||||
{/* Stage 1: pick a file */}
|
||||
{stage === "pick" && (
|
||||
<div className="space-y-2">
|
||||
<Label htmlFor="t-file">קובץ ההחלטה (PDF / DOCX / DOC / RTF / TXT / MD)</Label>
|
||||
<Input
|
||||
id="t-file" type="file" accept={ACCEPT}
|
||||
onChange={(e) => onPickFile(e.target.files?.[0] ?? null)}
|
||||
/>
|
||||
<p className="text-[0.78rem] text-ink-muted">
|
||||
המערכת תחלץ מהקובץ את מספר התיק, התאריך והנושאים. תוכל לערוך
|
||||
לפני השמירה.
|
||||
</p>
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Stage 2: analyzing the file */}
|
||||
{stage === "analyzing" && (
|
||||
<div className="rounded-lg border border-rule bg-rule-soft/40 p-6 space-y-2 text-center">
|
||||
<Loader2 className="w-5 h-5 animate-spin mx-auto text-navy" />
|
||||
<p className="text-sm text-navy">מבצע הגהה וחילוץ מטא-דאטה…</p>
|
||||
<p className="text-[0.78rem] text-ink-muted">
|
||||
{file?.name}
|
||||
</p>
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Stage 3: preview + editable metadata */}
|
||||
{stage === "preview" && analysis && (
|
||||
<form
|
||||
className="space-y-4"
|
||||
onSubmit={(e) => { e.preventDefault(); onCommit(); }}
|
||||
>
|
||||
<div className="rounded-lg border border-rule bg-surface px-4 py-3">
|
||||
<h3 className="text-[0.78rem] uppercase tracking-wider text-gold-deep font-semibold mb-2">
|
||||
תצוגה מקדימה של הטקסט הנקי
|
||||
</h3>
|
||||
<p className="text-sm text-ink leading-relaxed line-clamp-6 whitespace-pre-wrap">
|
||||
{analysis.preview}
|
||||
</p>
|
||||
<div className="mt-2 flex items-center gap-3 text-[0.72rem] text-ink-muted tabular-nums">
|
||||
<span className="flex items-center gap-1">
|
||||
<FileText className="w-3 h-3" />
|
||||
{analysis.chars.toLocaleString("he-IL")} תווים
|
||||
</span>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div className="grid grid-cols-2 gap-3">
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="t-decision-number">מספר ההחלטה</Label>
|
||||
<Input
|
||||
id="t-decision-number"
|
||||
value={decisionNumber}
|
||||
onChange={(e) => setDecisionNumber(e.target.value)}
|
||||
placeholder="1130-25"
|
||||
dir="rtl"
|
||||
/>
|
||||
</div>
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="t-decision-date">תאריך ההחלטה</Label>
|
||||
<Input
|
||||
id="t-decision-date" type="date"
|
||||
value={decisionDate}
|
||||
onChange={(e) => setDecisionDate(e.target.value)}
|
||||
/>
|
||||
</div>
|
||||
</div>
|
||||
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="t-title">כותרת קצרה (אופציונלי)</Label>
|
||||
<Input
|
||||
id="t-title" value={title}
|
||||
onChange={(e) => setTitle(e.target.value)}
|
||||
placeholder="ARAR-25-1130 - כרמל יצחק" dir="rtl"
|
||||
/>
|
||||
</div>
|
||||
|
||||
<div className="space-y-1">
|
||||
<Label htmlFor="t-subjects">נושאים (מופרדים בפסיקים)</Label>
|
||||
<Input
|
||||
id="t-subjects" value={subjectsRaw}
|
||||
onChange={(e) => setSubjectsRaw(e.target.value)}
|
||||
placeholder="חניה, קווי בניין, שימוש חורג" dir="rtl"
|
||||
/>
|
||||
{analysis.subject_categories.length > 0 && (
|
||||
<div className="flex flex-wrap gap-1 mt-1">
|
||||
<span className="text-[0.72rem] text-ink-muted">חולץ אוטומטית:</span>
|
||||
{analysis.subject_categories.map((s) => (
|
||||
<Badge key={s} variant="outline"
|
||||
className="text-[0.7rem] bg-gold-wash text-gold-deep border-gold/40">
|
||||
{s}
|
||||
</Badge>
|
||||
))}
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
|
||||
{errorMsg && (
|
||||
<div className="rounded-lg border border-danger/40 bg-danger-bg p-3 flex items-center gap-2 text-danger text-sm">
|
||||
<AlertCircle className="w-4 h-4 shrink-0" />
|
||||
{errorMsg}
|
||||
</div>
|
||||
)}
|
||||
|
||||
<div className="flex gap-2 justify-end pt-2">
|
||||
<Button type="button" variant="ghost"
|
||||
onClick={() => onOpenChange(false)}
|
||||
disabled={isProcessing}>
|
||||
ביטול
|
||||
</Button>
|
||||
<Button type="submit" disabled={isProcessing || !decisionNumber.trim()}
|
||||
className="bg-navy text-parchment hover:bg-navy-soft">
|
||||
<Upload className="w-4 h-4 me-1" />
|
||||
שמור בקורפוס
|
||||
</Button>
|
||||
</div>
|
||||
</form>
|
||||
)}
|
||||
|
||||
{/* Stage 4: committing — background task progress */}
|
||||
{(stage === "committing" || (taskId && stage !== "done" && stage !== "error")) && (
|
||||
<div className="rounded-lg border border-rule bg-rule-soft/40 p-4 space-y-2">
|
||||
<div className="flex items-center gap-2 text-sm text-navy">
|
||||
<Loader2 className="w-4 h-4 animate-spin" />
|
||||
<span>{progressStep || "מעבד את ההחלטה לקורפוס"}</span>
|
||||
</div>
|
||||
<Progress value={progressStep ? 60 : 30} className="h-1.5" />
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Stage 5: success */}
|
||||
{stage === "done" && (
|
||||
<div className="rounded-lg border border-gold/40 bg-gold-wash p-4 flex items-center gap-2 text-gold-deep text-sm">
|
||||
<CheckCircle2 className="w-4 h-4" />
|
||||
ההחלטה נוספה לקורפוס בהצלחה.
|
||||
</div>
|
||||
)}
|
||||
|
||||
{/* Stage 6: error (after a failed analyze or upload) */}
|
||||
{stage === "error" && (
|
||||
<div className="space-y-3">
|
||||
<div className="rounded-lg border border-danger/40 bg-danger-bg p-4 flex items-center gap-2 text-danger text-sm">
|
||||
<AlertCircle className="w-4 h-4 shrink-0" />
|
||||
{errorMsg || "שגיאה לא ידועה"}
|
||||
</div>
|
||||
<div className="flex gap-2 justify-end">
|
||||
<Button type="button" variant="ghost"
|
||||
onClick={() => onOpenChange(false)}>
|
||||
סגור
|
||||
</Button>
|
||||
<Button type="button"
|
||||
onClick={() => { setStage("pick"); setErrorMsg(""); setFile(null); }}>
|
||||
נסה קובץ אחר
|
||||
</Button>
|
||||
</div>
|
||||
</div>
|
||||
)}
|
||||
</div>
|
||||
</SheetContent>
|
||||
</Sheet>
|
||||
);
|
||||
}
|
||||
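The subjects field in `onCommit` is turned into a list by splitting on commas, trimming, and dropping empty entries. The same chain can be sketched as a pure function that is easy to test in isolation. `parseSubjects` is a hypothetical name, not something defined in this diff.

```typescript
// Hypothetical helper mirroring the split/trim/filter chain in onCommit.
function parseSubjects(raw: string): string[] {
  return raw
    .split(/[,،]/) // ASCII comma or Arabic comma (U+060C)
    .map((s) => s.trim()) // drop whitespace around each entry
    .filter(Boolean); // discard entries that are empty after trimming
}
```

Because empty entries are filtered out, a trailing comma or a doubled comma in the input does not produce blank subjects in the request body.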
66
web-ui/src/components/ui/accordion.tsx
Normal file
@@ -0,0 +1,66 @@
|
||||
"use client"
|
||||
|
||||
import * as React from "react"
|
||||
import { Accordion as AccordionPrimitive } from "radix-ui"
|
||||
import { ChevronDown } from "lucide-react"
|
||||
|
||||
import { cn } from "@/lib/utils"
|
||||
|
||||
function Accordion({
|
||||
...props
|
||||
}: React.ComponentProps<typeof AccordionPrimitive.Root>) {
|
||||
return <AccordionPrimitive.Root data-slot="accordion" {...props} />
|
||||
}
|
||||
|
||||
function AccordionItem({
|
||||
className,
|
||||
...props
|
||||
}: React.ComponentProps<typeof AccordionPrimitive.Item>) {
|
||||
return (
|
||||
<AccordionPrimitive.Item
|
||||
data-slot="accordion-item"
|
||||
className={cn("border-b border-rule last:border-b-0", className)}
|
||||
{...props}
|
||||
/>
|
||||
)
|
||||
}
|
||||
|
||||
function AccordionTrigger({
|
||||
className,
|
||||
children,
|
||||
...props
|
||||
}: React.ComponentProps<typeof AccordionPrimitive.Trigger>) {
|
||||
return (
|
||||
<AccordionPrimitive.Header className="flex">
|
||||
<AccordionPrimitive.Trigger
|
||||
data-slot="accordion-trigger"
|
||||
className={cn(
|
||||
"focus-visible:ring-ring/50 flex flex-1 items-start justify-between gap-4 rounded-md py-4 text-start text-sm font-medium transition-all outline-none hover:underline focus-visible:ring-[3px] disabled:pointer-events-none disabled:opacity-50 [&[data-state=open]>svg]:rotate-180",
|
||||
className
|
||||
)}
|
||||
{...props}
|
||||
>
|
||||
{children}
|
||||
<ChevronDown className="text-ink-muted pointer-events-none size-4 shrink-0 translate-y-0.5 transition-transform duration-200" />
|
||||
</AccordionPrimitive.Trigger>
|
||||
</AccordionPrimitive.Header>
|
||||
)
|
||||
}
|
||||
|
||||
function AccordionContent({
|
||||
className,
|
||||
children,
|
||||
...props
|
||||
}: React.ComponentProps<typeof AccordionPrimitive.Content>) {
|
||||
return (
|
||||
<AccordionPrimitive.Content
|
||||
data-slot="accordion-content"
|
||||
className="data-[state=closed]:animate-accordion-up data-[state=open]:animate-accordion-down overflow-hidden text-sm"
|
||||
{...props}
|
||||
>
|
||||
<div className={cn("pt-0 pb-4", className)}>{children}</div>
|
||||
</AccordionPrimitive.Content>
|
||||
)
|
||||
}
|
||||
|
||||
export { Accordion, AccordionItem, AccordionTrigger, AccordionContent }
|
||||
@@ -16,11 +16,11 @@ import {
|
||||
import { PartiesField } from "@/components/wizard/parties-field";
|
||||
import { useCreateCase } from "@/lib/api/cases";
|
||||
import {
|
||||
- caseCreateSchema, expectedOutcomes,
+ caseCreateSchema, expectedOutcomes, proceedingTypes,
|
||||
type CaseCreateInput,
|
||||
} from "@/lib/schemas/case";
|
||||
import {
|
||||
- PRACTICE_AREAS, APPEAL_SUBTYPES, deriveSubtype,
+ PRACTICE_AREAS, APPEAL_SUBTYPES, deriveSubtypeWithBlam,
|
||||
type AppealSubtype,
|
||||
} from "@/lib/practice-area";
|
||||
|
||||
@@ -35,7 +35,7 @@ type StepKey = (typeof STEPS)[number]["key"];
|
||||
/* Fields validated at each step — lets the user fix just what's on screen
|
||||
* before moving forward, instead of surfacing all errors from page 1. */
|
||||
const STEP_FIELDS: Record<StepKey, (keyof CaseCreateInput)[]> = {
|
||||
- basics: ["case_number", "title", "practice_area", "appeal_subtype"],
+ basics: ["case_number", "title", "proceeding_type", "practice_area", "appeal_subtype"],
|
||||
parties: ["appellants", "respondents"],
|
||||
details: ["subject", "hearing_date", "expected_outcome", "notes", "property_address", "permit_number"],
|
||||
};
|
||||
@@ -66,6 +66,7 @@ export function CaseWizard() {
|
||||
expected_outcome: "",
|
||||
practice_area: "appeals_committee",
|
||||
appeal_subtype: "unknown",
|
||||
proceeding_type: "ערר",
|
||||
},
|
||||
});
|
||||
|
||||
@@ -74,17 +75,36 @@ export function CaseWizard() {
|
||||
* stop the moment they manually pick a value from the dropdown. Mirrors
|
||||
* the wireSubtypeAutofill() behaviour of the vanilla UI
|
||||
* (legal-ai/web/static/index.html around line 2770).
|
||||
*
|
||||
* proceeding_type follows the same pattern: if the user hasn't picked
|
||||
* a value yet, we derive 'בל"מ' whenever the subtype lands on an
|
||||
* extension_request_* value.
|
||||
*/
|
||||
const userTouchedSubtype = useRef(false);
|
||||
const userTouchedProceeding = useRef(false);
|
||||
const caseNumber = form.watch("case_number");
|
||||
const practiceArea = form.watch("practice_area");
|
||||
const subject = form.watch("subject");
|
||||
const appealSubtype = form.watch("appeal_subtype");
|
||||
useEffect(() => {
|
||||
if (userTouchedSubtype.current) return;
|
||||
- const derived = deriveSubtype(caseNumber, practiceArea);
+ /* derive_subtype_with_blam picks extension_request_* when subject
+ * matches "בקשה להארכת מועד" / "בל\"מ" / "הארכת מועד להגשת". */
+ const derived = deriveSubtypeWithBlam(caseNumber, subject, practiceArea);
|
||||
if (derived !== form.getValues("appeal_subtype")) {
|
||||
form.setValue("appeal_subtype", derived, { shouldValidate: false });
|
||||
}
|
||||
- }, [caseNumber, practiceArea, form]);
+ }, [caseNumber, practiceArea, subject, form]);
|
||||
|
||||
/* proceeding_type follows appeal_subtype when the user hasn't picked
|
||||
* one explicitly — extension_request_* always implies 'בל"מ'. */
|
||||
useEffect(() => {
|
||||
if (userTouchedProceeding.current) return;
|
||||
const proc = appealSubtype.startsWith("extension_request_") ? 'בל"מ' : "ערר";
|
||||
if (proc !== form.getValues("proceeding_type")) {
|
||||
form.setValue("proceeding_type", proc, { shouldValidate: false });
|
||||
}
|
||||
}, [appealSubtype, form]);
|
||||
|
||||
const stepIndex = STEPS.findIndex((s) => s.key === step);
|
||||
const isLast = stepIndex === STEPS.length - 1;
|
||||
@@ -159,6 +179,39 @@ export function CaseWizard() {
|
||||
<Input id="title" {...form.register("title")} className="mt-1" />
|
||||
<FieldError message={form.formState.errors.title?.message} />
|
||||
</div>
|
||||
<div>
|
||||
<Label className="text-navy">
|
||||
סוג תיק <span className="text-danger">*</span>
|
||||
</Label>
|
||||
<Controller
|
||||
control={form.control}
|
||||
name="proceeding_type"
|
||||
render={({ field }) => (
|
||||
<Select
|
||||
value={field.value}
|
||||
onValueChange={(v) => {
|
||||
userTouchedProceeding.current = true;
|
||||
field.onChange(v as CaseCreateInput["proceeding_type"]);
|
||||
}}
|
||||
dir="rtl"
|
||||
>
|
||||
<SelectTrigger className="mt-1">
|
||||
<SelectValue />
|
||||
</SelectTrigger>
|
||||
<SelectContent>
|
||||
{proceedingTypes.map((p) => (
|
||||
<SelectItem key={p.value} value={p.value}>
|
||||
{p.label}
|
||||
</SelectItem>
|
||||
))}
|
||||
</SelectContent>
|
||||
</Select>
|
||||
)}
|
||||
/>
|
||||
<p className="text-[0.7rem] text-ink-muted mt-1">
|
||||
ערר = הליך עיקרי; בל"מ = בקשה להארכת מועד להגשת ערר
|
||||
</p>
|
||||
</div>
|
||||
<div className="grid grid-cols-2 gap-3">
|
||||
<div>
|
||||
<Label className="text-navy">תחום משפטי</Label>
|
||||
|
||||
@@ -54,6 +54,8 @@ export type Case = {
|
||||
respondents?: string[] | null;
|
||||
property_address?: string | null;
|
||||
permit_number?: string | null;
|
||||
/* 'ערר' = regular appeal, 'בל"מ' = extension-of-time request */
|
||||
proceeding_type?: "ערר" | 'בל"מ';
|
||||
};
|
||||
|
||||
export type CaseDocument = {
|
||||
|
||||
@@ -160,6 +160,23 @@ export type ExtractAppraiserFactsResponse =
|
||||
appraisal_count: number;
|
||||
missing: { document_id: string; title: string; current_side: string }[];
|
||||
message: string;
|
||||
}
|
||||
| {
|
||||
// The chair clicked the button; backend created a child Paperclip
|
||||
// issue assigned to the legal-analyst, which will run the MCP tool
|
||||
// on the host (where the Claude CLI lives) and post results back.
|
||||
status: "queued";
|
||||
sub_issue_id: string;
|
||||
analyst_id: string;
|
||||
main_issue_id: string;
|
||||
}
|
||||
| {
|
||||
// No analyst route was available (no API key / no analyst configured /
|
||||
// no Paperclip issue linked to the case). Non-fatal — the chair can
|
||||
// still trigger extraction manually from Claude Code.
|
||||
status: "skipped";
|
||||
reason: "no_api_key" | "no_analyst" | "no_issue" | string;
|
||||
company_id?: string;
|
||||
};
|
||||
|
||||
async function extractAppraiserFacts(
|
||||
|
||||
111
web-ui/src/lib/api/legal-arguments.ts
Normal file
@@ -0,0 +1,111 @@
|
||||
/**
 * Legal Arguments domain: aggregated propositions (claim de-dup).
 *
 * Each raw "claim" is an extracted proposition from a litigation brief;
 * the LLM-driven aggregator groups them by party into 6-12 distinct
 * legal arguments. These hooks expose the read + trigger endpoints.
 */
|
||||
|
||||
import { useMutation, useQuery, useQueryClient } from "@tanstack/react-query";
|
||||
import { apiRequest } from "./client";
|
||||
|
||||
export type LegalArgumentParty =
|
||||
| "appellant"
|
||||
| "respondent"
|
||||
| "committee"
|
||||
| "permit_applicant"
|
||||
| "unknown";
|
||||
|
||||
export type LegalArgumentPriority =
|
||||
| "threshold"
|
||||
| "substantive"
|
||||
| "procedural"
|
||||
| "relief";
|
||||
|
||||
export type LegalArgument = {
|
||||
id: string;
|
||||
case_id: string;
|
||||
party: LegalArgumentParty;
|
||||
argument_index: number;
|
||||
argument_title: string;
|
||||
argument_body: string;
|
||||
legal_topic: string | null;
|
||||
priority: LegalArgumentPriority;
|
||||
cited_precedents?: string[] | null;
|
||||
created_at?: string;
|
||||
updated_at?: string;
|
||||
supporting_claims: string[];
|
||||
};
|
||||
|
||||
export type LegalArgumentsResponse = {
|
||||
case_number: string;
|
||||
total: number;
|
||||
by_party: Partial<Record<LegalArgumentParty, LegalArgument[]>>;
|
||||
arguments: LegalArgument[];
|
||||
};
|
||||
|
||||
export const legalArgumentsKeys = {
|
||||
all: ["legal-arguments"] as const,
|
||||
byCase: (caseNumber: string) =>
|
||||
[...legalArgumentsKeys.all, caseNumber] as const,
|
||||
};
|
||||
|
||||
export function useLegalArguments(caseNumber: string | undefined) {
|
||||
return useQuery({
|
||||
queryKey: legalArgumentsKeys.byCase(caseNumber ?? ""),
|
||||
queryFn: ({ signal }) =>
|
||||
apiRequest<LegalArgumentsResponse>(
|
||||
`/api/cases/${caseNumber}/legal-arguments`,
|
||||
{ signal },
|
||||
),
|
||||
enabled: Boolean(caseNumber),
|
||||
staleTime: 10_000,
|
||||
});
|
||||
}
|
||||
|
||||
export type AggregateArgumentsResult = {
|
||||
status: "started" | string;
|
||||
case_number: string;
|
||||
force: boolean;
|
||||
message: string;
|
||||
};
|
||||
|
||||
export function useAggregateArguments(caseNumber: string | undefined) {
|
||||
const qc = useQueryClient();
|
||||
return useMutation({
|
||||
mutationFn: (force: boolean = false) =>
|
||||
apiRequest<AggregateArgumentsResult>(
|
||||
`/api/cases/${caseNumber}/aggregate-arguments${force ? "?force=true" : ""}`,
|
||||
{ method: "POST" },
|
||||
),
|
||||
onSuccess: () => {
|
||||
if (caseNumber) {
|
||||
qc.invalidateQueries({
|
||||
queryKey: legalArgumentsKeys.byCase(caseNumber),
|
||||
});
|
||||
}
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
export const PARTY_LABELS_HE: Record<LegalArgumentParty, string> = {
|
||||
appellant: "עוררים",
|
||||
respondent: "משיבים",
|
||||
committee: "ועדה מקומית",
|
||||
permit_applicant: "מבקשי היתר",
|
||||
unknown: "צד לא מזוהה",
|
||||
};
|
||||
|
||||
export const PRIORITY_LABELS_HE: Record<LegalArgumentPriority, string> = {
|
||||
threshold: "סף",
|
||||
substantive: "מהותי",
|
||||
procedural: "פגם הליך",
|
||||
relief: "סעד",
|
||||
};
|
||||
|
||||
export const PRIORITY_ORDER: LegalArgumentPriority[] = [
|
||||
"threshold",
|
||||
"substantive",
|
||||
"procedural",
|
||||
"relief",
|
||||
];
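A minimal sketch of how `PRIORITY_ORDER` might be used to order arguments for display. The helper name `sortByPriority`, the trimmed-down `ArgumentLike` type, and the sample titles are assumptions made for illustration; none of them appear in the diff.

```typescript
// Local copies of the relevant types so the sketch stands alone.
type LegalArgumentPriority = "threshold" | "substantive" | "procedural" | "relief";

type ArgumentLike = {
  argument_index: number;
  argument_title: string;
  priority: LegalArgumentPriority;
};

const PRIORITY_ORDER: LegalArgumentPriority[] = [
  "threshold",
  "substantive",
  "procedural",
  "relief",
];

// Hypothetical helper (not part of the diff): returns a new array ordered by
// PRIORITY_ORDER, then by argument_index within the same priority.
function sortByPriority<T extends ArgumentLike>(args: readonly T[]): T[] {
  const rank = (p: LegalArgumentPriority): number => PRIORITY_ORDER.indexOf(p);
  return args
    .slice() // copy, so the caller's array (e.g. cached query data) is not mutated
    .sort(
      (a, b) =>
        rank(a.priority) - rank(b.priority) ||
        a.argument_index - b.argument_index,
    );
}

// Made-up sample data, for illustration only.
const sample: ArgumentLike[] = [
  { argument_index: 2, argument_title: "Relief sought", priority: "relief" },
  { argument_index: 1, argument_title: "Merits", priority: "substantive" },
  { argument_index: 0, argument_title: "Standing", priority: "threshold" },
];

const sortedTitles = sortByPriority(sample).map((a) => a.argument_title);
// sortedTitles is ["Standing", "Merits", "Relief sought"]
```

Copying before sorting matters here because `Array.prototype.sort` sorts in place, and data returned by a query hook is usually shared with other components.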
|
||||
277
web-ui/src/lib/api/missing-precedents.ts
Normal file
@@ -0,0 +1,277 @@
|
||||
/**
 * Missing precedents: citations the parties raised that are not yet in the
 * corpus.
 *
 * Lifecycle: the researcher logs the gap (status 'open') → the chair uploads
 * the decision via the dialog → POST /upload routes to
 * internal_decision_upload (ערר/בל"מ) or precedent_library_upload (court
 * rulings) → status flips to 'closed' with linked_case_law_id set.
 *
 * Endpoints touched:
 *  - POST   /api/missing-precedents              create (JSON body)
 *  - GET    /api/missing-precedents?status=open  list (filters)
 *  - GET    /api/missing-precedents/{id}         detail
 *  - PATCH  /api/missing-precedents/{id}         metadata edit
 *  - DELETE /api/missing-precedents/{id}         remove
 *  - POST   /api/missing-precedents/{id}/upload  multipart upload + close
 */
|
||||
|
||||
import { useMutation, useQuery, useQueryClient } from "@tanstack/react-query";
|
||||
import { ApiError, apiRequest } from "./client";
|
||||
|
||||
export type CitedByParty =
|
||||
| "appellant"
|
||||
| "respondent"
|
||||
| "committee"
|
||||
| "permit_applicant"
|
||||
| "unknown";
|
||||
|
||||
export type MissingPrecedentStatus =
|
||||
| "open"
|
||||
| "uploaded"
|
||||
| "closed"
|
||||
| "irrelevant";
|
||||
|
||||
export type MissingPrecedent = {
|
||||
id: string;
|
||||
citation: string;
|
||||
case_name: string | null;
|
||||
cited_in_case_id: string | null;
|
||||
cited_in_case_number: string | null; // joined
|
||||
cited_in_document_id: string | null;
|
||||
cited_by_party: CitedByParty | null;
|
||||
cited_by_party_name: string | null;
|
||||
legal_topic: string | null;
|
||||
legal_issue: string | null;
|
||||
claim_quote: string | null;
|
||||
status: MissingPrecedentStatus;
|
||||
linked_case_law_id: string | null;
|
||||
linked_case_law_number: string | null;
|
||||
linked_case_law_name: string | null;
|
||||
closed_at: string | null;
|
||||
created_at: string;
|
||||
updated_at: string;
|
||||
notes: string | null;
|
||||
};
|
||||
|
||||
export type MissingPrecedentListResponse = {
|
||||
items: MissingPrecedent[];
|
||||
count: number;
|
||||
by_status: Partial<Record<MissingPrecedentStatus, number>>;
|
||||
total_open: number;
|
||||
};
|
||||
|
||||
export type MissingPrecedentCreateInput = {
|
||||
citation: string;
|
||||
case_number?: string;
|
||||
cited_in_document_id?: string;
|
||||
cited_by_party?: CitedByParty;
|
||||
cited_by_party_name?: string;
|
||||
legal_topic?: string;
|
||||
legal_issue?: string;
|
||||
claim_quote?: string;
|
||||
case_name?: string;
|
||||
notes?: string;
|
||||
};
|
||||
|
||||
export type MissingPrecedentPatch = Partial<{
|
||||
legal_topic: string;
|
||||
legal_issue: string;
|
||||
notes: string;
|
||||
cited_by_party: CitedByParty;
|
||||
cited_by_party_name: string;
|
||||
case_name: string;
|
||||
status: MissingPrecedentStatus;
|
||||
citation: string;
|
||||
claim_quote: string;
|
||||
}>;
|
||||
|
||||
export type MissingPrecedentFilters = {
|
||||
status?: MissingPrecedentStatus | "";
|
||||
caseNumber?: string;
|
||||
caseId?: string;
|
||||
legalTopic?: string;
|
||||
limit?: number;
|
||||
};
|
||||
|
||||
export const missingPrecedentKeys = {
|
||||
all: ["missing-precedents"] as const,
|
||||
list: (filters: MissingPrecedentFilters) =>
|
||||
[...missingPrecedentKeys.all, "list", filters] as const,
|
||||
detail: (id: string) => [...missingPrecedentKeys.all, "detail", id] as const,
|
||||
};
|
||||
|
||||
export function useMissingPrecedents(filters: MissingPrecedentFilters = {}) {
|
||||
return useQuery({
|
||||
queryKey: missingPrecedentKeys.list(filters),
|
||||
queryFn: ({ signal }) => {
|
||||
const p = new URLSearchParams();
|
||||
if (filters.status) p.set("status", filters.status);
|
||||
if (filters.caseNumber) p.set("case_number", filters.caseNumber);
|
||||
if (filters.caseId) p.set("case_id", filters.caseId);
|
||||
if (filters.legalTopic) p.set("legal_topic", filters.legalTopic);
|
||||
if (filters.limit) p.set("limit", String(filters.limit));
|
||||
const qs = p.toString();
|
||||
return apiRequest<MissingPrecedentListResponse>(
|
||||
`/api/missing-precedents${qs ? `?${qs}` : ""}`,
|
||||
{ signal },
|
||||
);
|
||||
},
|
||||
staleTime: 15_000,
|
||||
});
|
||||
}
|
||||
|
||||
/** Counter for the sidebar / nav badge — open rows only. */
|
||||
export function useMissingPrecedentsOpenCount() {
|
||||
return useQuery({
|
||||
queryKey: [...missingPrecedentKeys.all, "open-count"] as const,
|
||||
queryFn: ({ signal }) =>
|
||||
apiRequest<MissingPrecedentListResponse>(
|
||||
"/api/missing-precedents?status=open&limit=1",
|
||||
{ signal },
|
||||
),
|
||||
staleTime: 30_000,
|
||||
select: (data) => data.total_open,
|
||||
});
|
||||
}
|
||||
|
||||
export function useMissingPrecedent(id: string | null) {
|
||||
return useQuery({
|
||||
queryKey: missingPrecedentKeys.detail(id ?? ""),
|
||||
queryFn: ({ signal }) =>
|
||||
apiRequest<MissingPrecedent>(
|
||||
`/api/missing-precedents/${encodeURIComponent(id!)}`,
|
||||
{ signal },
|
||||
),
|
||||
enabled: Boolean(id),
|
||||
staleTime: 15_000,
|
||||
});
|
||||
}
|
||||
|
||||
export function useCreateMissingPrecedent() {
|
||||
const qc = useQueryClient();
|
||||
return useMutation({
|
||||
mutationFn: (input: MissingPrecedentCreateInput) =>
|
||||
apiRequest<MissingPrecedent>("/api/missing-precedents", {
|
||||
method: "POST",
|
||||
body: input,
|
||||
}),
|
||||
onSuccess: () => {
|
||||
qc.invalidateQueries({ queryKey: missingPrecedentKeys.all });
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
export function useUpdateMissingPrecedent() {
|
||||
const qc = useQueryClient();
|
||||
return useMutation({
|
||||
mutationFn: ({ id, patch }: { id: string; patch: MissingPrecedentPatch }) =>
|
||||
apiRequest<MissingPrecedent>(
|
||||
`/api/missing-precedents/${encodeURIComponent(id)}`,
|
||||
{ method: "PATCH", body: patch },
|
||||
),
|
||||
onSuccess: (_, { id }) => {
|
||||
qc.invalidateQueries({ queryKey: missingPrecedentKeys.detail(id) });
|
||||
qc.invalidateQueries({ queryKey: missingPrecedentKeys.all });
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
export function useDeleteMissingPrecedent() {
|
||||
const qc = useQueryClient();
|
||||
return useMutation({
|
||||
mutationFn: (id: string) =>
|
||||
apiRequest<{ deleted: boolean }>(
|
||||
`/api/missing-precedents/${encodeURIComponent(id)}`,
|
||||
{ method: "DELETE" },
|
||||
),
|
||||
onSuccess: () => {
|
||||
qc.invalidateQueries({ queryKey: missingPrecedentKeys.all });
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
export type MissingPrecedentUploadInput = {
|
||||
id: string;
|
||||
file: File;
|
||||
case_number?: string;
|
||||
chair_name?: string;
|
||||
district?: string;
|
||||
case_name?: string;
|
||||
court?: string;
|
||||
decision_date?: string;
|
||||
practice_area?: string;
|
||||
appeal_subtype?: string;
|
||||
subject_tags?: string[];
|
||||
is_binding?: boolean;
|
||||
headnote?: string;
|
||||
summary?: string;
|
||||
precedent_level?: string;
|
||||
source_type?: string;
|
||||
};
|
||||
|
||||
export function useUploadMissingPrecedent() {
|
||||
const qc = useQueryClient();
|
||||
return useMutation({
|
||||
mutationFn: async (input: MissingPrecedentUploadInput) => {
|
||||
const fd = new FormData();
|
||||
fd.append("file", input.file);
|
||||
if (input.case_number) fd.append("case_number", input.case_number);
|
||||
if (input.chair_name) fd.append("chair_name", input.chair_name);
|
||||
if (input.district) fd.append("district", input.district);
|
||||
if (input.case_name) fd.append("case_name", input.case_name);
|
||||
if (input.court) fd.append("court", input.court);
|
||||
if (input.decision_date) fd.append("decision_date", input.decision_date);
|
||||
if (input.practice_area) fd.append("practice_area", input.practice_area);
|
||||
if (input.appeal_subtype) fd.append("appeal_subtype", input.appeal_subtype);
|
||||
if (input.subject_tags && input.subject_tags.length) {
|
||||
fd.append("subject_tags", JSON.stringify(input.subject_tags));
|
||||
}
|
||||
fd.append("is_binding", String(input.is_binding ?? true));
|
||||
if (input.headnote) fd.append("headnote", input.headnote);
|
||||
if (input.summary) fd.append("summary", input.summary);
|
||||
if (input.precedent_level)
|
||||
fd.append("precedent_level", input.precedent_level);
|
||||
if (input.source_type) fd.append("source_type", input.source_type);
|
||||
|
||||
const res = await fetch(
|
||||
`/api/missing-precedents/${encodeURIComponent(input.id)}/upload`,
|
||||
{ method: "POST", body: fd },
|
||||
);
|
||||
const parsed = await res.json().catch(() => null);
|
||||
if (!res.ok) {
|
||||
throw new ApiError(
|
||||
`Upload failed with ${res.status}`,
|
||||
res.status,
|
||||
parsed,
|
||||
);
|
||||
}
|
||||
return parsed as {
|
||||
missing_precedent: MissingPrecedent;
|
||||
case_law_id: string;
|
||||
route: "internal_committee" | "external_upload";
|
||||
};
|
||||
},
|
||||
onSuccess: (_, { id }) => {
|
||||
qc.invalidateQueries({ queryKey: missingPrecedentKeys.detail(id) });
|
||||
qc.invalidateQueries({ queryKey: missingPrecedentKeys.all });
|
||||
qc.invalidateQueries({ queryKey: ["precedent-library"] });
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
/** Hebrew labels for display. */
|
||||
export const CITED_BY_PARTY_LABELS: Record<CitedByParty, string> = {
|
||||
appellant: "עורר",
|
||||
respondent: "משיב",
|
||||
committee: "ועדה",
|
||||
permit_applicant: "מבקש היתר",
|
||||
unknown: "לא ידוע",
|
||||
};
|
||||
|
||||
export const STATUS_LABELS: Record<MissingPrecedentStatus, string> = {
|
||||
open: "פתוח",
|
||||
uploaded: "הועלה",
|
||||
closed: "נסגר",
|
||||
irrelevant: "לא רלוונטי",
|
||||
};
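The upload endpoint reports which `route` it took. A client-side guess at that route could be sketched as below, reusing the prefix list from `COMMITTEE_PREFIXES` in `precedent-library.ts` (same diff). The helpers `guessUploadRoute` and `missingCommitteeFields` are hypothetical, and the sketch assumes the server applies the same prefix rule and requires `chair_name` and `district` on the committee route; the server remains the authority.

```typescript
// Prefixes copied from COMMITTEE_PREFIXES in precedent-library.ts.
const COMMITTEE_PREFIXES = ["ערר ", "ערר(", 'בל"מ ', 'בל"מ(', "ARAR "];

type UploadRoute = "internal_committee" | "external_upload";

// Hypothetical client-side mirror of the server's routing decision.
function guessUploadRoute(citation: string): UploadRoute {
  const c = citation.replace(/^\s+/, ""); // same effect as trimStart()
  // indexOf(p) === 0 is equivalent to startsWith(p)
  return COMMITTEE_PREFIXES.some((p) => c.indexOf(p) === 0)
    ? "internal_committee"
    : "external_upload";
}

// Hypothetical pre-flight check: list the fields a committee upload would
// still need before the form is submitted. Court rulings need neither.
function missingCommitteeFields(
  citation: string,
  fields: { chair_name?: string; district?: string },
): string[] {
  if (guessUploadRoute(citation) !== "internal_committee") return [];
  const missing: string[] = [];
  if (!fields.chair_name) missing.push("chair_name");
  if (!fields.district) missing.push("district");
  return missing;
}
```

A dialog could call `missingCommitteeFields(row.citation, formValues)` to decide whether to show the chair and district inputs, rather than waiting for the server to reject the upload.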
|
||||
@@ -48,6 +48,7 @@ export type Precedent = {
|
||||
source_kind: string;
|
||||
chair_name: string | null;
|
||||
district: string | null;
|
||||
citation_formatted: string;
|
||||
extraction_status: string;
|
||||
halacha_extraction_status: string;
|
||||
metadata_extraction_requested_at: string | null;
|
||||
@@ -354,6 +355,85 @@ export function useUploadPrecedent() {
|
||||
});
|
||||
}
|
||||
|
||||
// Valid Hebrew districts for appeals-committee decisions. Mirrors
|
||||
// VALID_DISTRICTS in mcp-server/src/legal_mcp/tools/internal_decisions.py —
|
||||
// keep in sync with the service-side guard.
|
||||
export const COMMITTEE_DISTRICTS = [
|
||||
"ירושלים",
|
||||
"תל אביב",
|
||||
"מרכז",
|
||||
"חיפה",
|
||||
"צפון",
|
||||
"דרום",
|
||||
"ארצי",
|
||||
] as const;
|
||||
|
||||
export type CommitteeDistrict = (typeof COMMITTEE_DISTRICTS)[number];
|
||||
|
||||
// A citation that targets internal_decision_upload, not the external library.
|
||||
// Matches the prefix list in precedent_library service (ערר/בל"מ/ARAR).
|
||||
const COMMITTEE_PREFIXES = ["ערר ", "ערר(", "בל\"מ ", "בל\"מ(", "ARAR "];
|
||||
|
||||
export function isCommitteeCitation(citation: string): boolean {
|
||||
const c = citation.trimStart();
|
||||
return COMMITTEE_PREFIXES.some((p) => c.startsWith(p));
|
||||
}
|
||||
|
||||
export type InternalDecisionUploadInput = {
|
||||
file: File;
|
||||
case_number: string;
|
||||
chair_name: string;
|
||||
district: CommitteeDistrict | string;
|
||||
case_name?: string;
|
||||
court?: string;
|
||||
decision_date?: string;
|
||||
practice_area?: PracticeArea;
|
||||
appeal_subtype?: string;
|
||||
subject_tags?: string[];
|
||||
is_binding?: boolean;
|
||||
summary?: string;
|
||||
};
|
||||
|
||||
export function useUploadInternalDecision() {
|
||||
const qc = useQueryClient();
|
||||
return useMutation({
|
||||
mutationFn: async (input: InternalDecisionUploadInput) => {
|
||||
const fd = new FormData();
|
||||
fd.append("file", input.file);
|
||||
fd.append("case_number", input.case_number);
|
||||
fd.append("chair_name", input.chair_name);
|
||||
fd.append("district", input.district);
|
||||
if (input.case_name) fd.append("case_name", input.case_name);
|
||||
if (input.court) fd.append("court", input.court);
|
||||
if (input.decision_date) fd.append("decision_date", input.decision_date);
|
||||
if (input.practice_area) fd.append("practice_area", input.practice_area);
|
||||
if (input.appeal_subtype)
|
||||
fd.append("appeal_subtype", input.appeal_subtype);
|
||||
if (input.subject_tags && input.subject_tags.length)
|
||||
fd.append("subject_tags", JSON.stringify(input.subject_tags));
|
||||
fd.append("is_binding", String(input.is_binding ?? false));
|
||||
if (input.summary) fd.append("summary", input.summary);
|
||||
|
||||
const res = await fetch("/api/internal-decisions/upload", {
|
||||
method: "POST",
|
||||
body: fd,
|
||||
});
|
||||
const parsed = await res.json().catch(() => null);
|
||||
if (!res.ok) {
|
||||
throw new ApiError(
|
||||
`Upload failed with ${res.status}`,
|
||||
res.status,
|
||||
parsed,
|
||||
);
|
||||
}
|
||||
return parsed as { task_id: string };
|
||||
},
|
||||
onSuccess: () => {
|
||||
qc.invalidateQueries({ queryKey: libraryKeys.all });
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
export function useDeletePrecedent() {
|
||||
const qc = useQueryClient();
|
||||
return useMutation({
|
||||
@@ -414,6 +494,9 @@ export type PrecedentPatch = Partial<{
|
||||
source_type: SourceType;
|
||||
precedent_level: string;
|
||||
is_binding: boolean;
|
||||
district: string;
|
||||
chair_name: string;
|
||||
citation_formatted: string;
|
||||
}>;
|
||||
|
||||
export function useUpdatePrecedent() {
|
||||
|
||||
@@ -7,10 +7,13 @@
|
||||
* - GET /corpus → flat list of decisions for the corpus tab / compare tool
|
||||
* - GET /compare?a=UUID&b=UUID → side-by-side comparison
|
||||
* - DELETE /corpus/{id} → remove a decision from the corpus
|
||||
* - POST /api/upload → multipart file → returns sanitized filename
|
||||
* - POST /analyze → proofread + extract metadata for preview
|
||||
* - POST /upload → commit a proofread decision to the corpus (task_id)
|
||||
*/
|
||||
|
||||
import { useMutation, useQuery, useQueryClient } from "@tanstack/react-query";
|
||||
-import { apiRequest } from "./client";
+import { ApiError, apiRequest } from "./client";
|
||||
|
||||
export type StyleReport = {
|
||||
corpus: {
|
||||
@@ -69,6 +72,29 @@ export type CorpusDecision = {
|
||||
subject_categories: string[];
|
||||
chars: number;
|
||||
created_at: string;
|
||||
// Enriched metadata (added in the corpus-page upgrade).
|
||||
summary: string;
|
||||
outcome: string;
|
||||
key_principles: string[];
|
||||
appeal_subtype: string;
|
||||
practice_area: string;
|
||||
page_count: number;
|
||||
document_id: string | null;
|
||||
doc_title: string;
|
||||
parties: { appellant: string; respondent: string };
|
||||
legal_citation: string;
|
||||
lessons_count: number;
|
||||
};
|
||||
|
||||
export type CorpusDecisionPatch = {
|
||||
decision_number?: string;
|
||||
decision_date?: string;
|
||||
subject_categories?: string[];
|
||||
summary?: string;
|
||||
outcome?: string;
|
||||
key_principles?: string[];
|
||||
appeal_subtype?: string;
|
||||
practice_area?: string;
|
||||
};
|
||||
|
||||
export type CompareResult = {
|
||||
@@ -149,3 +175,407 @@ export function useDeleteCorpusEntry() {
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
// ── Style-agent chat ─────────────────────────────────────────────
|
||||
|
||||
export type ChatConversation = {
|
||||
id: string;
|
||||
title: string;
|
||||
style_corpus_id: string | null;
|
||||
decision_number: string;
|
||||
claude_session_id: string | null;
|
||||
message_count: number;
|
||||
created_at: string;
|
||||
last_message_at: string;
|
||||
};
|
||||
|
||||
export type ChatMessage = {
|
||||
id: string;
|
||||
role: "user" | "assistant";
|
||||
content: string;
|
||||
created_at: string;
|
||||
};
|
||||
|
||||
export type ChatHealth = {
|
||||
reachable: boolean;
|
||||
status?: number;
|
||||
url: string;
|
||||
error?: string;
|
||||
};
|
||||
|
||||
export const chatKeys = {
|
||||
conversations: () => [...trainingKeys.all, "chat", "conversations"] as const,
|
||||
conversation: (id: string) =>
|
||||
[...trainingKeys.all, "chat", "conversations", id] as const,
|
||||
health: () => [...trainingKeys.all, "chat", "health"] as const,
|
||||
};
|
||||
|
||||
export function useChatConversations() {
|
||||
return useQuery({
|
||||
queryKey: chatKeys.conversations(),
|
||||
queryFn: ({ signal }) =>
|
||||
apiRequest<ChatConversation[]>("/api/training/chat/conversations", { signal }),
|
||||
staleTime: 15_000,
|
||||
});
|
||||
}
|
||||
|
||||
export function useChatConversation(convId: string | null) {
|
||||
return useQuery({
|
||||
queryKey: chatKeys.conversation(convId ?? ""),
|
||||
queryFn: ({ signal }) =>
|
||||
apiRequest<{ conversation: ChatConversation; messages: ChatMessage[] }>(
|
||||
`/api/training/chat/conversations/${encodeURIComponent(convId!)}`,
|
||||
{ signal },
|
||||
),
|
||||
enabled: Boolean(convId),
|
||||
staleTime: 5_000,
|
||||
});
|
||||
}
|
||||
|
||||
export function useChatHealth() {
|
||||
return useQuery({
|
||||
queryKey: chatKeys.health(),
|
||||
queryFn: ({ signal }) =>
|
||||
apiRequest<ChatHealth>("/api/training/chat/health", { signal }),
|
||||
staleTime: 30_000,
|
||||
retry: false,
|
||||
});
|
||||
}
|
||||
|
||||
export function useCreateChat() {
|
||||
const qc = useQueryClient();
|
||||
return useMutation({
|
||||
mutationFn: (body: { title?: string; style_corpus_id?: string | null }) =>
|
||||
apiRequest<ChatConversation>("/api/training/chat/conversations", {
|
||||
method: "POST",
|
||||
body,
|
||||
}),
|
||||
onSuccess: () => {
|
||||
qc.invalidateQueries({ queryKey: chatKeys.conversations() });
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
export function useDeleteChat() {
|
||||
const qc = useQueryClient();
|
||||
return useMutation({
|
||||
mutationFn: (id: string) =>
|
||||
apiRequest<{ deleted: boolean }>(
|
||||
`/api/training/chat/conversations/${encodeURIComponent(id)}`,
|
||||
{ method: "DELETE" },
|
||||
),
|
||||
onSuccess: () => {
|
||||
qc.invalidateQueries({ queryKey: chatKeys.conversations() });
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
// ── Curator portrait ──────────────────────────────────────────────
|
||||
|
||||
export type CuratorPrompt = {
|
||||
content: string;
|
||||
filename: string;
|
||||
bytes: number;
|
||||
last_modified: number;
|
||||
gitea_url: string;
|
||||
};
|
||||
|
||||
export type StyleAnalyzerPrompts = {
|
||||
analysis_prompt: string;
|
||||
single_decision_prompt: string;
|
||||
synthesis_prompt: string;
|
||||
max_input_tokens: number;
|
||||
};
|
||||
|
||||
export type CuratorFinding = {
|
||||
id: string;
|
||||
lesson_text: string;
|
||||
category: string;
|
||||
applied_to_skill: boolean;
|
||||
decision_number: string;
|
||||
decision_date: string;
|
||||
created_at: string;
|
||||
};
|
||||
|
||||
export type CuratorStats = {
|
||||
total_findings: number;
|
||||
decisions_with_findings: number;
|
||||
decisions_total: number;
|
||||
findings_applied: number;
|
||||
recent_findings: CuratorFinding[];
|
||||
};
|
||||
|
||||
export type CuratorProposalInput = {
|
||||
title: string;
|
||||
proposed_change: string;
|
||||
rationale: string;
|
||||
};
|
||||
|
||||
export type CuratorProposalFile = {
|
||||
filename: string;
|
||||
bytes: number;
|
||||
modified_at: number;
|
||||
};
|
||||
|
||||
export const curatorKeys = {
|
||||
prompt: () => [...trainingKeys.all, "curator", "prompt"] as const,
|
||||
analyzerPrompt: () => [...trainingKeys.all, "curator", "analyzer-prompt"] as const,
|
||||
stats: () => [...trainingKeys.all, "curator", "stats"] as const,
|
||||
proposals: () => [...trainingKeys.all, "curator", "proposals"] as const,
|
||||
};
|
||||
|
||||
export function useCuratorPrompt() {
|
||||
return useQuery({
|
||||
queryKey: curatorKeys.prompt(),
|
||||
queryFn: ({ signal }) =>
|
||||
apiRequest<CuratorPrompt>("/api/training/curator/prompt", { signal }),
|
||||
staleTime: 5 * 60_000,
|
||||
});
|
||||
}
|
||||
|
||||
export function useStyleAnalyzerPrompts() {
|
||||
return useQuery({
|
||||
queryKey: curatorKeys.analyzerPrompt(),
|
||||
queryFn: ({ signal }) =>
|
||||
apiRequest<StyleAnalyzerPrompts>(
|
||||
"/api/training/curator/style-analyzer-prompt",
|
||||
{ signal },
|
||||
),
|
||||
staleTime: 5 * 60_000,
|
||||
});
|
||||
}
|
||||
|
||||
export function useCuratorStats() {
|
||||
return useQuery({
|
||||
queryKey: curatorKeys.stats(),
|
||||
queryFn: ({ signal }) =>
|
||||
apiRequest<CuratorStats>("/api/training/curator/stats", { signal }),
|
||||
staleTime: 60_000,
|
||||
});
|
||||
}
|
||||
|
||||
export function useCuratorProposals() {
|
||||
return useQuery({
|
||||
queryKey: curatorKeys.proposals(),
|
||||
queryFn: ({ signal }) =>
|
||||
apiRequest<CuratorProposalFile[]>("/api/training/curator/proposals", { signal }),
|
||||
staleTime: 30_000,
|
||||
});
|
||||
}
|
||||
|
||||
export function useSubmitCuratorProposal() {
|
||||
const qc = useQueryClient();
|
||||
return useMutation({
|
||||
mutationFn: (body: CuratorProposalInput) =>
|
||||
apiRequest<{ saved: boolean; filename: string }>(
|
||||
"/api/training/curator/proposals",
|
||||
{ method: "POST", body },
|
||||
),
|
||||
onSuccess: () => {
|
||||
qc.invalidateQueries({ queryKey: curatorKeys.proposals() });
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
// ── Upload flow ──────────────────────────────────────────────────
// Three-step pipeline:
//   1. useUploadFile           → POST /api/upload (multipart)       → { filename }
//   2. useAnalyzeTraining      → POST /api/training/analyze (form)  → preview + extracted metadata
//   3. useCommitTrainingUpload → POST /api/training/upload (json)   → { task_id }
// Track task_id via useProgress() from documents.ts.
|
||||
|
||||
export type UploadFileResponse = {
|
||||
filename: string; // sanitized, time-prefixed name in UPLOAD_DIR
|
||||
original_name: string;
|
||||
size: number;
|
||||
};
|
||||
|
||||
export type AnalyzeTrainingResponse = {
|
||||
filename: string;
|
||||
clean_text: string;
|
||||
preview: string;
|
||||
decision_number: string;
|
||||
decision_date: string; // ISO YYYY-MM-DD or ""
|
||||
subject_categories: string[];
|
||||
stats: Record<string, unknown>;
|
||||
chars: number;
|
||||
};
|
||||
|
||||
export type CommitTrainingRequest = {
|
||||
filename: string;
|
||||
decision_number: string;
|
||||
decision_date: string; // YYYY-MM-DD or ""
|
||||
subject_categories: string[];
|
||||
title?: string;
|
||||
};
|
||||
|
||||
export type CommitTrainingResponse = { task_id: string };
|
||||
|
||||
export function useUploadFile() {
|
||||
return useMutation({
|
||||
mutationFn: async (file: File): Promise<UploadFileResponse> => {
|
||||
const fd = new FormData();
|
||||
fd.append("file", file);
|
||||
const res = await fetch("/api/upload", { method: "POST", body: fd });
|
||||
const contentType = res.headers.get("content-type") ?? "";
|
||||
const parsed = contentType.includes("application/json")
|
||||
? await res.json().catch(() => null)
|
||||
: await res.text().catch(() => null);
|
||||
if (!res.ok) {
|
||||
throw new ApiError(
|
||||
typeof parsed === "object" && parsed && "detail" in parsed
|
||||
? String((parsed as { detail: unknown }).detail)
|
||||
: `Upload failed with ${res.status}`,
|
||||
res.status,
|
||||
parsed,
|
||||
);
|
||||
}
|
||||
return parsed as UploadFileResponse;
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
export function useAnalyzeTraining() {
|
||||
return useMutation({
|
||||
mutationFn: async (filename: string): Promise<AnalyzeTrainingResponse> => {
|
||||
const fd = new FormData();
|
||||
fd.append("filename", filename);
|
||||
const res = await fetch("/api/training/analyze", {
|
||||
method: "POST",
|
||||
body: fd,
|
||||
});
|
||||
const contentType = res.headers.get("content-type") ?? "";
|
||||
const parsed = contentType.includes("application/json")
|
||||
? await res.json().catch(() => null)
|
||||
: await res.text().catch(() => null);
|
||||
if (!res.ok) {
|
||||
throw new ApiError(
|
||||
typeof parsed === "object" && parsed && "detail" in parsed
|
||||
? String((parsed as { detail: unknown }).detail)
|
||||
: `Analyze failed with ${res.status}`,
|
||||
res.status,
|
||||
parsed,
|
||||
);
|
||||
}
|
||||
return parsed as AnalyzeTrainingResponse;
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
// ── Per-decision lessons ─────────────────────────────────────────
|
||||
|
||||
export type DecisionLesson = {
|
||||
id: string;
|
||||
style_corpus_id: string;
|
||||
lesson_text: string;
|
||||
category: "style" | "structure" | "lexicon" | "tabular" | "general";
|
||||
source: "manual" | "curator" | "chair" | "style_analyzer";
|
||||
applied_to_skill: boolean;
|
||||
created_by: string;
|
||||
created_at: string;
|
||||
updated_at: string;
|
||||
};
|
||||
|
||||
export type LessonCreate = {
|
||||
lesson_text: string;
|
||||
category?: DecisionLesson["category"];
|
||||
source?: DecisionLesson["source"];
|
||||
};
|
||||
|
||||
export type LessonPatch = {
|
||||
lesson_text?: string;
|
||||
category?: DecisionLesson["category"];
|
||||
applied_to_skill?: boolean;
|
||||
};
|
||||
|
||||
export const lessonsKeys = {
|
||||
forCorpus: (corpusId: string) =>
|
||||
[...trainingKeys.all, "lessons", corpusId] as const,
|
||||
};
|
||||
|
||||
export function useCorpusLessons(corpusId: string | null) {
|
||||
return useQuery({
|
||||
queryKey: lessonsKeys.forCorpus(corpusId ?? ""),
|
||||
queryFn: ({ signal }) =>
|
||||
apiRequest<DecisionLesson[]>(
|
||||
`/api/training/corpus/${encodeURIComponent(corpusId!)}/lessons`,
|
||||
{ signal },
|
||||
),
|
||||
enabled: Boolean(corpusId),
|
||||
staleTime: 30_000,
|
||||
});
|
||||
}
|
||||
|
||||
export function useAddLesson(corpusId: string) {
|
||||
const qc = useQueryClient();
|
||||
return useMutation({
|
||||
mutationFn: (body: LessonCreate) =>
|
||||
apiRequest<DecisionLesson>(
|
||||
`/api/training/corpus/${encodeURIComponent(corpusId)}/lessons`,
|
||||
{ method: "POST", body },
|
||||
),
|
||||
onSuccess: () => {
|
||||
qc.invalidateQueries({ queryKey: lessonsKeys.forCorpus(corpusId) });
|
||||
// lessons_count on the corpus row is computed server-side, so
|
||||
// invalidate the list too — otherwise the badge stays stale.
|
||||
qc.invalidateQueries({ queryKey: trainingKeys.corpus() });
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
export function usePatchLesson(corpusId: string) {
|
||||
const qc = useQueryClient();
|
||||
return useMutation({
|
||||
mutationFn: ({ id, patch }: { id: string; patch: LessonPatch }) =>
|
||||
apiRequest<{ updated: boolean }>(
|
||||
`/api/training/lessons/${encodeURIComponent(id)}`,
|
||||
{ method: "PATCH", body: patch },
|
||||
),
|
||||
onSuccess: () => {
|
||||
qc.invalidateQueries({ queryKey: lessonsKeys.forCorpus(corpusId) });
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
export function useDeleteLesson(corpusId: string) {
|
||||
const qc = useQueryClient();
|
||||
return useMutation({
|
||||
mutationFn: (id: string) =>
|
||||
apiRequest<{ deleted: boolean }>(
|
||||
`/api/training/lessons/${encodeURIComponent(id)}`,
|
||||
{ method: "DELETE" },
|
||||
),
|
||||
onSuccess: () => {
|
||||
qc.invalidateQueries({ queryKey: lessonsKeys.forCorpus(corpusId) });
|
||||
qc.invalidateQueries({ queryKey: trainingKeys.corpus() });
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
export function usePatchCorpus() {
|
||||
const qc = useQueryClient();
|
||||
return useMutation({
|
||||
mutationFn: ({ id, patch }: { id: string; patch: CorpusDecisionPatch }) =>
|
||||
apiRequest<{ updated: boolean; id: string }>(
|
||||
`/api/training/corpus/${encodeURIComponent(id)}`,
|
||||
{ method: "PATCH", body: patch },
|
||||
),
|
||||
onSuccess: () => {
|
||||
qc.invalidateQueries({ queryKey: trainingKeys.corpus() });
|
||||
qc.invalidateQueries({ queryKey: trainingKeys.report() });
|
||||
},
|
||||
});
|
||||
}
|
||||
|
||||
export function useCommitTrainingUpload() {
|
||||
// No onSuccess invalidation here — the row only appears after the
|
||||
// background task finishes. The dialog watches useProgress(task_id)
|
||||
// and invalidates trainingKeys when status === "completed".
|
||||
return useMutation({
|
||||
mutationFn: (body: CommitTrainingRequest) =>
|
||||
apiRequest<CommitTrainingResponse>("/api/training/upload", {
|
||||
method: "POST",
|
||||
body,
|
||||
}),
|
||||
});
|
||||
}
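`useUploadFile` and `useAnalyzeTraining` above repeat the same logic for turning a failed response body into an error message. A shared helper might look like the sketch below; the name `errorMessageFrom` is an assumption, not something defined in this diff.

```typescript
// Hypothetical helper (not in the diff): derive an error message from a parsed
// response body, preferring its `detail` field when the body is an object.
function errorMessageFrom(parsed: unknown, fallback: string): string {
  if (typeof parsed === "object" && parsed !== null && "detail" in parsed) {
    return String((parsed as { detail: unknown }).detail);
  }
  // Plain-text bodies, null (JSON parse failure), and objects without
  // `detail` all fall back to the caller-supplied message.
  return fallback;
}

// Usage inside a mutationFn would then read, for example:
//   throw new ApiError(
//     errorMessageFrom(parsed, `Upload failed with ${res.status}`),
//     res.status,
//     parsed,
//   );
```

As in the original code, a `detail` that is not a string (for instance an array of validation errors) is passed through `String()`, which gives a less readable message; formatting that case is left out of this sketch.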
|
||||
|
||||
@@ -777,6 +777,48 @@ export interface paths {
|
||||
patch?: never;
|
||||
trace?: never;
|
||||
};
|
||||
"/api/cases/{case_number}/aggregate-arguments": {
|
||||
parameters: {
|
||||
query?: never;
|
||||
header?: never;
|
||||
path?: never;
|
||||
cookie?: never;
|
||||
};
|
||||
get?: never;
|
||||
put?: never;
|
||||
/**
|
||||
* Api Aggregate Arguments
|
||||
* @description Aggregate raw claims into distinct legal arguments via Claude.
|
||||
*
|
||||
* Runs as a BackgroundTask because the LLM pass can take 30-90 seconds.
|
||||
*/
|
||||
post: operations["api_aggregate_arguments_api_cases__case_number__aggregate_arguments_post"];
|
||||
delete?: never;
|
||||
options?: never;
|
||||
head?: never;
|
||||
patch?: never;
|
||||
trace?: never;
|
||||
};
|
||||
"/api/cases/{case_number}/legal-arguments": {
|
||||
parameters: {
|
||||
query?: never;
|
||||
header?: never;
|
||||
path?: never;
|
||||
cookie?: never;
|
||||
};
|
||||
/**
|
||||
* Api Get Legal Arguments
|
||||
* @description Return aggregated legal arguments for a case, grouped by party.
|
||||
*/
|
||||
get: operations["api_get_legal_arguments_api_cases__case_number__legal_arguments_get"];
|
||||
put?: never;
|
||||
post?: never;
|
||||
delete?: never;
|
||||
options?: never;
|
||||
head?: never;
|
||||
patch?: never;
|
||||
trace?: never;
|
||||
};
|
||||
"/api/cases/{case_number}/direction": {
|
||||
parameters: {
|
||||
query?: never;
|
||||
@@ -2255,6 +2297,75 @@ export interface paths {
|
||||
patch: operations["halacha_update_api_halachot__halacha_id__patch"];
|
||||
trace?: never;
|
||||
};
|
||||
"/api/missing-precedents": {
|
||||
parameters: {
|
||||
query?: never;
|
||||
header?: never;
|
||||
path?: never;
|
||||
cookie?: never;
|
||||
};
|
||||
/**
|
||||
* Missing Precedents List
|
||||
* @description List missing precedents, optionally filtered by status / case.
|
||||
*/
|
||||
get: operations["missing_precedents_list_api_missing_precedents_get"];
|
||||
put?: never;
|
||||
/**
|
||||
* Missing Precedent Create
|
||||
* @description Log a new missing precedent (status='open'). Dedupes by
|
||||
* (citation, cited_in_case_id) — duplicate POST returns the existing row.
|
||||
*/
|
||||
post: operations["missing_precedent_create_api_missing_precedents_post"];
|
||||
delete?: never;
|
||||
options?: never;
|
||||
head?: never;
|
||||
patch?: never;
|
||||
trace?: never;
|
||||
};
|
||||
"/api/missing-precedents/{mp_id}": {
|
||||
parameters: {
|
||||
query?: never;
|
||||
header?: never;
|
||||
path?: never;
|
||||
cookie?: never;
|
||||
};
|
||||
/** Missing Precedent Get */
|
||||
get: operations["missing_precedent_get_api_missing_precedents__mp_id__get"];
|
||||
put?: never;
|
||||
post?: never;
|
||||
/** Missing Precedent Delete */
|
||||
delete: operations["missing_precedent_delete_api_missing_precedents__mp_id__delete"];
|
||||
options?: never;
|
||||
head?: never;
|
||||
/** Missing Precedent Update */
|
||||
patch: operations["missing_precedent_update_api_missing_precedents__mp_id__patch"];
|
||||
trace?: never;
|
||||
};
|
||||
"/api/missing-precedents/{mp_id}/upload": {
|
||||
parameters: {
|
||||
query?: never;
|
||||
header?: never;
|
||||
path?: never;
|
||||
cookie?: never;
|
||||
};
|
||||
get?: never;
|
||||
put?: never;
|
||||
/**
|
||||
* Missing Precedent Upload
|
||||
* @description Upload the decision file behind a missing-precedent and link it.
|
||||
*
|
||||
* Routes to ingest_internal_decision if the citation looks like a
|
||||
* committee decision (ערר / בל"מ prefix), otherwise to ingest_precedent.
|
||||
* Once the case_law row is created, the missing_precedents row is marked
|
||||
* status='closed' with linked_case_law_id pointing to the new row.
|
||||
*/
|
||||
post: operations["missing_precedent_upload_api_missing_precedents__mp_id__upload_post"];
|
||||
delete?: never;
|
||||
options?: never;
|
||||
head?: never;
|
||||
patch?: never;
|
||||
trace?: never;
|
||||
};
|
||||
}
|
||||
export type webhooks = Record<string, never>;
|
||||
export interface components {
|
||||
@@ -2388,6 +2499,81 @@ export interface components {
|
||||
*/
|
||||
summary: string;
|
||||
};
|
||||
/** Body_missing_precedent_upload_api_missing_precedents__mp_id__upload_post */
|
||||
Body_missing_precedent_upload_api_missing_precedents__mp_id__upload_post: {
|
||||
/** File */
|
||||
file: string;
|
||||
/**
|
||||
* Case Number
|
||||
* @default
|
||||
*/
|
||||
case_number: string;
|
||||
/**
|
||||
* Chair Name
|
||||
* @default
|
||||
*/
|
||||
chair_name: string;
|
||||
/**
|
||||
* District
|
||||
* @default
|
||||
*/
|
||||
district: string;
|
||||
/**
|
||||
* Case Name
|
||||
* @default
|
||||
*/
|
||||
case_name: string;
|
||||
/**
|
||||
* Court
|
||||
* @default
|
||||
*/
|
||||
court: string;
|
||||
/**
|
||||
* Decision Date
|
||||
* @default
|
||||
*/
|
||||
decision_date: string;
|
||||
/**
|
||||
* Practice Area
|
||||
* @default
|
||||
*/
|
||||
practice_area: string;
|
||||
/**
|
||||
* Appeal Subtype
|
||||
* @default
|
||||
*/
|
||||
appeal_subtype: string;
|
||||
/**
|
||||
* Subject Tags
|
||||
* @default []
|
||||
*/
|
||||
subject_tags: string;
|
||||
/**
|
||||
* Is Binding
|
||||
* @default true
|
||||
*/
|
||||
is_binding: boolean;
|
||||
/**
|
||||
* Headnote
|
||||
* @default
|
||||
*/
|
||||
headnote: string;
|
||||
/**
|
||||
* Summary
|
||||
* @default
|
||||
*/
|
||||
summary: string;
|
||||
/**
|
||||
* Precedent Level
|
||||
* @default
|
||||
*/
|
||||
precedent_level: string;
|
||||
/**
|
||||
* Source Type
|
||||
* @default
|
||||
*/
|
||||
source_type: string;
|
||||
};
|
||||
/** Body_precedent_library_upload_api_precedent_library_upload_post */
|
||||
Body_precedent_library_upload_api_precedent_library_upload_post: {
|
||||
/** File */
|
||||
@@ -2507,7 +2693,7 @@ export interface components {
|
||||
expected_outcome: string;
|
||||
/**
|
||||
* Practice Area
|
||||
* @default appeals_committee
|
||||
* @default
|
||||
*/
|
||||
practice_area: string;
|
||||
/**
|
||||
@@ -2515,6 +2701,11 @@ export interface components {
|
||||
* @default
|
||||
*/
|
||||
appeal_subtype: string;
|
||||
/**
|
||||
* Proceeding Type
|
||||
* @default
|
||||
*/
|
||||
proceeding_type: string;
|
||||
};
|
||||
/** CaseUpdateRequest */
|
||||
CaseUpdateRequest: {
|
||||
@@ -2555,6 +2746,25 @@ export interface components {
|
||||
* @default
|
||||
*/
|
||||
expected_outcome: string;
|
||||
/** Appellants */
|
||||
appellants?: string[] | null;
|
||||
/** Respondents */
|
||||
respondents?: string[] | null;
|
||||
/**
|
||||
* Property Address
|
||||
* @default
|
||||
*/
|
||||
property_address: string;
|
||||
/**
|
||||
* Permit Number
|
||||
* @default
|
||||
*/
|
||||
permit_number: string;
|
||||
/**
|
||||
* Proceeding Type
|
||||
* @default
|
||||
*/
|
||||
proceeding_type: string;
|
||||
};
|
||||
/** ChairPositionRequest */
|
||||
ChairPositionRequest: {
|
||||
@@ -2681,6 +2891,57 @@ export interface components {
|
||||
/** Value */
|
||||
value: unknown;
|
||||
};
|
||||
/** MissingPrecedentCreate */
|
||||
MissingPrecedentCreate: {
|
||||
/** Citation */
|
||||
citation: string;
|
||||
/**
|
||||
* Case Number
|
||||
* @default
|
||||
*/
|
||||
case_number: string;
|
||||
/** Cited In Document Id */
|
||||
cited_in_document_id?: string | null;
|
||||
/**
|
||||
* Cited By Party
|
||||
* @default unknown
|
||||
* @enum {string}
|
||||
*/
|
||||
cited_by_party: "appellant" | "respondent" | "committee" | "permit_applicant" | "unknown";
|
||||
/** Cited By Party Name */
|
||||
cited_by_party_name?: string | null;
|
||||
/** Legal Topic */
|
||||
legal_topic?: string | null;
|
||||
/** Legal Issue */
|
||||
legal_issue?: string | null;
|
||||
/** Claim Quote */
|
||||
claim_quote?: string | null;
|
||||
/** Case Name */
|
||||
case_name?: string | null;
|
||||
/** Notes */
|
||||
notes?: string | null;
|
||||
};
|
||||
/** MissingPrecedentPatch */
|
||||
MissingPrecedentPatch: {
|
||||
/** Legal Topic */
|
||||
legal_topic?: string | null;
|
||||
/** Legal Issue */
|
||||
legal_issue?: string | null;
|
||||
/** Notes */
|
||||
notes?: string | null;
|
||||
/** Cited By Party */
|
||||
cited_by_party?: ("appellant" | "respondent" | "committee" | "permit_applicant" | "unknown") | null;
|
||||
/** Cited By Party Name */
|
||||
cited_by_party_name?: string | null;
|
||||
/** Case Name */
|
||||
case_name?: string | null;
|
||||
/** Status */
|
||||
status?: ("open" | "uploaded" | "closed" | "irrelevant") | null;
|
||||
/** Citation */
|
||||
citation?: string | null;
|
||||
/** Claim Quote */
|
||||
claim_quote?: string | null;
|
||||
};
|
||||
/** OutcomeRequest */
|
||||
OutcomeRequest: {
|
||||
/** Outcome */
|
||||
@@ -2768,6 +3029,10 @@ export interface components {
|
||||
precedent_level?: string | null;
|
||||
/** Is Binding */
|
||||
is_binding?: boolean | null;
|
||||
/** District */
|
||||
district?: string | null;
|
||||
/** Chair Name */
|
||||
chair_name?: string | null;
|
||||
};
|
||||
/** ReviseRequest */
|
||||
ReviseRequest: {
|
||||
@@ -3908,6 +4173,72 @@ export interface operations {
|
||||
};
|
||||
};
|
||||
};
|
||||
api_aggregate_arguments_api_cases__case_number__aggregate_arguments_post: {
|
||||
parameters: {
|
||||
query?: {
|
||||
force?: boolean;
|
||||
};
|
||||
header?: never;
|
||||
path: {
|
||||
case_number: string;
|
||||
};
|
||||
cookie?: never;
|
||||
};
|
||||
requestBody?: never;
|
||||
responses: {
|
||||
/** @description Successful Response */
|
||||
200: {
|
||||
headers: {
|
||||
[name: string]: unknown;
|
||||
};
|
||||
content: {
|
||||
"application/json": unknown;
|
||||
};
|
||||
};
|
||||
/** @description Validation Error */
|
||||
422: {
|
||||
headers: {
|
||||
[name: string]: unknown;
|
||||
};
|
||||
content: {
|
||||
"application/json": components["schemas"]["HTTPValidationError"];
|
||||
};
|
||||
};
|
||||
};
|
||||
};
|
||||
api_get_legal_arguments_api_cases__case_number__legal_arguments_get: {
|
||||
parameters: {
|
||||
query?: {
|
||||
party?: string;
|
||||
};
|
||||
header?: never;
|
||||
path: {
|
||||
case_number: string;
|
||||
};
|
||||
cookie?: never;
|
||||
};
|
||||
requestBody?: never;
|
||||
responses: {
|
||||
/** @description Successful Response */
|
||||
200: {
|
||||
headers: {
|
||||
[name: string]: unknown;
|
||||
};
|
||||
content: {
|
||||
"application/json": unknown;
|
||||
};
|
||||
};
|
||||
/** @description Validation Error */
|
||||
422: {
|
||||
headers: {
|
||||
[name: string]: unknown;
|
||||
};
|
||||
content: {
|
||||
"application/json": components["schemas"]["HTTPValidationError"];
|
||||
};
|
||||
};
|
||||
};
|
||||
};
|
||||
api_set_direction_api_cases__case_number__direction_post: {
|
||||
parameters: {
|
||||
query?: never;
|
||||
@@ -6333,4 +6664,205 @@ export interface operations {
|
||||
};
|
||||
};
|
||||
};
|
||||
missing_precedents_list_api_missing_precedents_get: {
|
||||
parameters: {
|
||||
query?: {
|
||||
status?: string;
|
||||
case_id?: string;
|
||||
case_number?: string;
|
||||
legal_topic?: string;
|
||||
limit?: number;
|
||||
offset?: number;
|
||||
};
|
||||
header?: never;
|
||||
path?: never;
|
||||
cookie?: never;
|
||||
};
|
||||
requestBody?: never;
|
||||
responses: {
|
||||
/** @description Successful Response */
|
||||
200: {
|
||||
headers: {
|
||||
[name: string]: unknown;
|
||||
};
|
||||
content: {
|
||||
"application/json": unknown;
|
||||
};
|
||||
};
|
||||
/** @description Validation Error */
|
||||
422: {
|
||||
headers: {
|
||||
[name: string]: unknown;
|
||||
};
|
||||
content: {
|
||||
"application/json": components["schemas"]["HTTPValidationError"];
|
||||
};
|
||||
};
|
||||
};
|
||||
};
|
||||
missing_precedent_create_api_missing_precedents_post: {
|
||||
parameters: {
|
||||
query?: never;
|
||||
header?: never;
|
||||
path?: never;
|
||||
cookie?: never;
|
||||
};
|
||||
requestBody: {
|
||||
content: {
|
||||
"application/json": components["schemas"]["MissingPrecedentCreate"];
|
||||
};
|
||||
};
|
||||
responses: {
|
||||
/** @description Successful Response */
|
||||
200: {
|
||||
headers: {
|
||||
[name: string]: unknown;
|
||||
};
|
||||
content: {
|
||||
"application/json": unknown;
|
||||
};
|
||||
};
|
||||
/** @description Validation Error */
|
||||
422: {
|
||||
headers: {
|
||||
[name: string]: unknown;
|
||||
};
|
||||
content: {
|
||||
"application/json": components["schemas"]["HTTPValidationError"];
|
||||
};
|
||||
};
|
||||
};
|
||||
};
|
||||
missing_precedent_get_api_missing_precedents__mp_id__get: {
|
||||
parameters: {
|
||||
query?: never;
|
||||
header?: never;
|
||||
path: {
|
||||
mp_id: string;
|
||||
};
|
||||
cookie?: never;
|
||||
};
|
||||
requestBody?: never;
|
||||
responses: {
|
||||
/** @description Successful Response */
|
||||
200: {
|
||||
headers: {
|
||||
[name: string]: unknown;
|
||||
};
|
||||
content: {
|
||||
"application/json": unknown;
|
||||
};
|
||||
};
|
||||
/** @description Validation Error */
|
||||
422: {
|
||||
headers: {
|
||||
[name: string]: unknown;
|
||||
};
|
||||
content: {
|
||||
"application/json": components["schemas"]["HTTPValidationError"];
|
||||
};
|
||||
};
|
||||
};
|
||||
};
|
||||
missing_precedent_delete_api_missing_precedents__mp_id__delete: {
|
||||
parameters: {
|
||||
query?: never;
|
||||
header?: never;
|
||||
path: {
|
||||
mp_id: string;
|
||||
};
|
||||
cookie?: never;
|
||||
};
|
||||
requestBody?: never;
|
||||
responses: {
|
||||
/** @description Successful Response */
|
||||
200: {
|
||||
headers: {
|
||||
[name: string]: unknown;
|
||||
};
|
||||
content: {
|
||||
"application/json": unknown;
|
||||
};
|
||||
};
|
||||
/** @description Validation Error */
|
||||
422: {
|
||||
headers: {
|
||||
[name: string]: unknown;
|
||||
};
|
||||
content: {
|
||||
"application/json": components["schemas"]["HTTPValidationError"];
|
||||
};
|
||||
};
|
||||
};
|
||||
};
|
||||
missing_precedent_update_api_missing_precedents__mp_id__patch: {
|
||||
parameters: {
|
||||
query?: never;
|
||||
header?: never;
|
||||
path: {
|
||||
mp_id: string;
|
||||
};
|
||||
cookie?: never;
|
||||
};
|
||||
requestBody: {
|
||||
content: {
|
||||
"application/json": components["schemas"]["MissingPrecedentPatch"];
|
||||
};
|
||||
};
|
||||
responses: {
|
||||
/** @description Successful Response */
|
||||
200: {
|
||||
headers: {
|
||||
[name: string]: unknown;
|
||||
};
|
||||
content: {
|
||||
"application/json": unknown;
|
||||
};
|
||||
};
|
||||
/** @description Validation Error */
|
||||
422: {
|
||||
headers: {
|
||||
[name: string]: unknown;
|
||||
};
|
||||
content: {
|
||||
"application/json": components["schemas"]["HTTPValidationError"];
|
||||
};
|
||||
};
|
||||
};
|
||||
};
|
||||
missing_precedent_upload_api_missing_precedents__mp_id__upload_post: {
|
||||
parameters: {
|
||||
query?: never;
|
||||
header?: never;
|
||||
path: {
|
||||
mp_id: string;
|
||||
};
|
||||
cookie?: never;
|
||||
};
|
||||
requestBody: {
|
||||
content: {
|
||||
"multipart/form-data": components["schemas"]["Body_missing_precedent_upload_api_missing_precedents__mp_id__upload_post"];
|
||||
};
|
||||
};
|
||||
responses: {
|
||||
/** @description Successful Response */
|
||||
200: {
|
||||
headers: {
|
||||
[name: string]: unknown;
|
||||
};
|
||||
content: {
|
||||
"application/json": unknown;
|
||||
};
|
||||
};
|
||||
/** @description Validation Error */
|
||||
422: {
|
||||
headers: {
|
||||
[name: string]: unknown;
|
||||
};
|
||||
content: {
|
||||
"application/json": components["schemas"]["HTTPValidationError"];
|
||||
};
|
||||
};
|
||||
};
|
||||
};
|
||||
}
|
||||
|
||||
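The routing rule in the `Missing Precedent Upload` description above (a citation with a ערר / בל"מ prefix goes to `ingest_internal_decision`, anything else to `ingest_precedent`) can be sketched as below. This is only an illustration under assumptions: `route_upload` and `COMMITTEE_PREFIXES` are hypothetical names, and the real server-side citation check is not part of this diff and may be more lenient.

```python
# Hypothetical sketch of the citation-based upload routing.
# The two prefixes come from the endpoint description; the gershayim
# variant (״) is an assumed extra spelling of the same abbreviation.
COMMITTEE_PREFIXES = ("ערר", 'בל"מ', "בל״מ")


def route_upload(citation: str) -> str:
    """Return the name of the ingest path a citation would be sent to."""
    normalized = citation.strip()
    if normalized.startswith(COMMITTEE_PREFIXES):
        # Committee decision: handled by the internal-decision ingest.
        return "ingest_internal_decision"
    # Everything else is treated as an external precedent.
    return "ingest_precedent"
```

Under these assumptions, `route_upload("ערר 1234/21")` selects the internal-decision path, while a court citation such as `עע"מ 5678/19` falls through to the precedent path.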
@@ -18,6 +18,10 @@ export type AppealSubtype =
  | "building_permit"
  | "betterment_levy"
  | "compensation_197"
  /* בל"מ: a request to extend the deadline for filing an appeal (ערר). Three separate tracks. */
  | "extension_request_building_permit"
  | "extension_request_betterment_levy"
  | "extension_request_compensation"
  | "unknown";

export const PRACTICE_AREAS: ReadonlyArray<{
@@ -34,12 +38,30 @@ export const APPEAL_SUBTYPES: ReadonlyArray<{
  value: AppealSubtype;
  label: string;
}> = [
  { value: "building_permit", label: "רישוי ובנייה" },
  { value: "betterment_levy", label: "היטל השבחה" },
  { value: "compensation_197", label: "פיצויים (ס' 197)" },
  { value: "unknown", label: "לא ידוע" },
  { value: "building_permit", label: "רישוי ובנייה" },
  { value: "betterment_levy", label: "היטל השבחה" },
  { value: "compensation_197", label: "פיצויים (ס' 197)" },
  /* The "בל\"מ" prefix is dropped from these labels because the
   * proceeding_type field now drives the בל"מ badge. Keeping the domain
   * label here lets a row show "רישוי ובנייה" with a separate בל"מ
   * pill instead of double-marking it. */
  { value: "extension_request_building_permit", label: "רישוי ובנייה" },
  { value: "extension_request_betterment_levy", label: "היטל השבחה" },
  { value: "extension_request_compensation", label: "פיצויים (ס' 197)" },
  { value: "unknown", label: "לא ידוע" },
];

/* בל"מ subtypes: a subset, used for badges and filters. */
export const BLAM_SUBTYPES: ReadonlySet<AppealSubtype> = new Set([
  "extension_request_building_permit",
  "extension_request_betterment_levy",
  "extension_request_compensation",
]);

export function isBlamSubtype(s?: AppealSubtype | null): boolean {
  return s != null && BLAM_SUBTYPES.has(s);
}

export const PRACTICE_AREA_LABELS: Record<PracticeArea, string> =
  Object.fromEntries(PRACTICE_AREAS.map((p) => [p.value, p.label])) as Record<
    PracticeArea,
@@ -70,3 +92,30 @@ export function deriveSubtype(
  if (first === "9") return "compensation_197";
  return "unknown";
}

/*
 * Detect a בל"מ subject (בקשה להארכת מועד, a request for an extension
 * of time). Mirrors the Python `is_blam_subject` in practice_area.py.
 * Accepts common variants.
 */
const BLAM_SUBJECT_RE = /(?:בקשה\s+להארכת\s+מועד|בל["״]מ|הארכת\s+מועד\s+להגשת)/i;

export function isBlamSubject(subject: string): boolean {
  return Boolean(subject) && BLAM_SUBJECT_RE.test(subject);
}

/*
 * Like deriveSubtype() but also detects בל"מ from the subject. Mirrors
 * `derive_subtype_with_blam()` in practice_area.py.
 */
export function deriveSubtypeWithBlam(
  caseNumber: string,
  subject: string = "",
  practiceArea: PracticeArea = "appeals_committee",
): AppealSubtype {
  const base = deriveSubtype(caseNumber, practiceArea);
  if (!isBlamSubject(subject)) return base;
  if (base === "building_permit") return "extension_request_building_permit";
  if (base === "betterment_levy") return "extension_request_betterment_levy";
  if (base === "compensation_197") return "extension_request_compensation";
  return base;
}

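The comments above say these helpers mirror `is_blam_subject` and `derive_subtype_with_blam()` in `practice_area.py`, which this diff does not show. The following is an assumed reconstruction of that Python side, written only from the TypeScript. The regex and the three-way mapping are copied from the hunk; `with_blam` is a hypothetical helper that takes the base subtype as an argument, because the full `deriveSubtype` logic is not visible here.

```python
import re

# Same alternatives as BLAM_SUBJECT_RE in the TypeScript hunk.
BLAM_SUBJECT_RE = re.compile(
    r'(?:בקשה\s+להארכת\s+מועד|בל["״]מ|הארכת\s+מועד\s+להגשת)'
)

# Mapping taken from the three `if (base === ...)` branches.
_BLAM_BY_BASE = {
    "building_permit": "extension_request_building_permit",
    "betterment_levy": "extension_request_betterment_levy",
    "compensation_197": "extension_request_compensation",
}


def is_blam_subject(subject: str) -> bool:
    """True when the subject text looks like an extension-of-time request."""
    return bool(subject) and BLAM_SUBJECT_RE.search(subject) is not None


def with_blam(base_subtype: str, subject: str = "") -> str:
    """Upgrade a base subtype to its בל"מ variant when the subject matches."""
    if not is_blam_subject(subject):
        return base_subtype
    # Subtypes without a בל"מ counterpart (e.g. "unknown") pass through.
    return _BLAM_BY_BASE.get(base_subtype, base_subtype)
```

Using a lookup table rather than chained `if` statements keeps the pairing of each base subtype with its extension variant in one place, which may make it easier to keep the two languages in sync.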
@@ -40,6 +40,15 @@ export const expectedOutcomes = [
  { value: "betterment_levy", label: "היטל השבחה" },
] as const;

/* proceeding_type distinguishes a regular appeal (ערר) from an
 * extension-of-time request (בל"מ). The same case_number can exist as
 * both, so this is a separate axis from appeal_subtype/practice_area. */
export const proceedingTypes = [
  { value: "ערר", label: "ערר" },
  { value: 'בל"מ', label: 'בל"מ' },
] as const;
export type ProceedingType = (typeof proceedingTypes)[number]["value"];

export const caseCreateSchema = z.object({
  case_number: z
    .string()
@@ -70,8 +79,12 @@ export const caseCreateSchema = z.object({
    "building_permit",
    "betterment_levy",
    "compensation_197",
    "extension_request_building_permit",
    "extension_request_betterment_levy",
    "extension_request_compensation",
    "unknown",
  ] as const satisfies readonly AppealSubtype[]),
  proceeding_type: z.enum(["ערר", 'בל"מ'] as const),
});

export type CaseCreateInput = z.infer<typeof caseCreateSchema>;
@@ -97,6 +110,7 @@ export const caseUpdateSchema = z.object({
    .optional(),
  property_address: z.string().trim().max(200).optional(),
  permit_number: z.string().trim().max(100).optional(),
  proceeding_type: z.enum(["ערר", 'בל"מ'] as const).optional(),
});

export type CaseUpdateInput = z.infer<typeof caseUpdateSchema>;

1302
web/app.py
File diff suppressed because it is too large
201
web/chat_proxy.py
Normal file
@@ -0,0 +1,201 @@
"""FastAPI ↔ legal-chat-service streaming bridge.

The browser hits ``/api/training/chat/conversations/{id}/messages`` on
the legal-ai container. The container is sealed off from the host's
``claude`` CLI (intentional; see ``claude_session.py`` docstring), so
we forward each request to the pm2-managed ``legal-chat-service`` over
the Docker bridge (``10.0.1.1:8770`` by default; see ``CHAT_SERVICE_URL``).

Responsibilities:
- Save the user message to ``chat_messages`` before streaming starts.
- Open an HTTP streaming connection to the host service.
- Forward each SSE event to the browser as-is, accumulating the
  assistant text and any ``session_id`` so we can persist them once
  the stream closes.
- Persist the assistant turn + the CLI's session_id at end-of-stream.
"""

from __future__ import annotations

import json
import logging
import os
from typing import AsyncIterator
from uuid import UUID

import httpx
from fastapi import HTTPException
from fastapi.responses import StreamingResponse

from legal_mcp.services import db
from web import chat_system_prompt

logger = logging.getLogger(__name__)


# legal-chat-service lives on the host (pm2-managed, bound to 0.0.0.0:8770).
# From inside the container we reach it via the docker bridge gateway:
# 10.0.1.1 is the host on docker0 (the same address Paperclip uses for
# its 3100 bridge). Override with CHAT_SERVICE_URL if running outside
# Docker (local dev).
#
# Coolify's `custom_docker_run_options: --add-host=host.docker.internal:host-gateway`
# turned out NOT to apply to dockerimage-built apps as of Coolify 4.0.0,
# so the explicit IP is the reliable path. The cloud-level firewall
# (Oracle security list) keeps port 8770 unreachable from the public
# internet, matching the security posture of Paperclip's 3100.
CHAT_SERVICE_URL = os.environ.get(
    "CHAT_SERVICE_URL",
    "http://10.0.1.1:8770",
)
CHAT_SERVICE_TIMEOUT_S = float(os.environ.get("CHAT_SERVICE_TIMEOUT_S", "3600"))

# Shared secret for ``Authorization: Bearer ...`` to the chat service.
# The host-side service refuses any /chat/start without a matching token,
# so the network bind + the bearer check are two layers of defense
# against an attacker who reaches the docker bridge.
_CHAT_SHARED_SECRET = os.environ.get("LEGAL_CHAT_SHARED_SECRET", "").strip()


_SSE_HEADERS = {
    "Cache-Control": "no-cache, no-transform",
    "X-Accel-Buffering": "no",
    "Connection": "keep-alive",
}


async def stream_chat_message(
    conversation_id: UUID,
    user_message: str,
) -> StreamingResponse:
    """Open SSE stream, forward events, persist when done.

    Returns a FastAPI StreamingResponse the route can return directly.
    """
    conv = await db.get_chat_conversation(conversation_id)
    if not conv:
        raise HTTPException(404, "conversation not found")

    # Persist the user turn immediately so a network drop doesn't lose it.
    await db.add_chat_message(
        conversation_id, role="user", content=user_message,
    )

    is_first_turn = not conv.get("claude_session_id")
    system_block: str | None = None
    if is_first_turn:
        try:
            system_block = await chat_system_prompt.build_system_prompt(
                corpus_id=conv.get("style_corpus_id"),
            )
        except Exception as e:
            logger.exception("system prompt build failed")
            raise HTTPException(500, f"system prompt failed: {e}") from e

    payload = {
        "prompt": user_message,
        "system": system_block,
        "resume_session_id": conv.get("claude_session_id"),
    }

    # Surface a clean error if the secret wasn't injected; that is better
    # than a silent 401 dribbling out of the proxy.
    if not _CHAT_SHARED_SECRET:
        raise HTTPException(
            500,
            "LEGAL_CHAT_SHARED_SECRET is not set in the container env. "
            "Add it as a Coolify env var matching the value in "
            "/home/chaim/.legal-chat-service.env on the host.",
        )
    headers = {"Authorization": f"Bearer {_CHAT_SHARED_SECRET}"}

    async def proxy_stream() -> AsyncIterator[bytes]:
        accumulated_text: list[str] = []
        events_log: list[dict] = []
        new_session_id: str | None = None

        try:
            timeout_cfg = httpx.Timeout(
                CHAT_SERVICE_TIMEOUT_S,
                connect=10.0,
                read=CHAT_SERVICE_TIMEOUT_S,
            )
            async with httpx.AsyncClient(timeout=timeout_cfg) as client:
                async with client.stream(
                    "POST",
                    f"{CHAT_SERVICE_URL}/chat/start",
                    json=payload,
                    headers=headers,
                ) as upstream:
                    if upstream.status_code != 200:
                        body = await upstream.aread()
                        msg = body.decode("utf-8", errors="replace")[:300]
                        err = {"type": "error",
                               "message": f"chat-service {upstream.status_code}: {msg}"}
                        yield f"data: {json.dumps(err, ensure_ascii=False)}\n\n".encode("utf-8")
                        return

                    async for line in upstream.aiter_lines():
                        if not line:
                            yield b"\n"
                            continue
                        # Forward verbatim so the browser sees the same
                        # SSE framing the host emits.
                        out = line + "\n"
                        yield out.encode("utf-8")
                        # Mirror events: capture text + session_id for
                        # persistence. The line starts with "data: <json>"
                        # so we strip the prefix before parsing.
                        if line.startswith("data: "):
                            try:
                                event = json.loads(line[len("data: "):])
                            except json.JSONDecodeError:
                                continue
                            events_log.append(event)
                            t = event.get("type")
                            if t == "session_id" and event.get("value"):
                                new_session_id = event["value"]
                            elif t == "text_delta" and event.get("text"):
                                accumulated_text.append(event["text"])
                            elif t == "done" and event.get("text"):
                                if not accumulated_text:
                                    accumulated_text.append(event["text"])

        except httpx.ConnectError:
            err = {
                "type": "error",
                "message": (
                    f"לא ניתן להגיע ל-legal-chat-service בכתובת {CHAT_SERVICE_URL}. "
                    "ודא ש-pm2 מריץ אותו: `pm2 status legal-chat-service`."
                ),
            }
            yield f"data: {json.dumps(err, ensure_ascii=False)}\n\n".encode("utf-8")
            return
        except Exception as e:
            logger.exception("chat proxy failed")
            err = {"type": "error", "message": str(e)}
            yield f"data: {json.dumps(err, ensure_ascii=False)}\n\n".encode("utf-8")
            return

        # End of stream: persist the assistant turn.
        try:
            full_text = "".join(accumulated_text).strip()
            if full_text:
                await db.add_chat_message(
                    conversation_id,
                    role="assistant",
                    content=full_text,
                    raw_events=events_log,
                )
            if new_session_id:
                await db.update_chat_conversation_session_id(
                    conversation_id, new_session_id,
                )
        except Exception:
            logger.exception("failed to persist assistant turn for conv=%s", conversation_id)

    return StreamingResponse(
        proxy_stream(),
        media_type="text/event-stream",
        headers=_SSE_HEADERS,
    )
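The event-mirroring step inside `proxy_stream` (strip the `data: ` prefix, parse the JSON, keep the `session_id` and the text deltas) can be isolated as a small pure function, which may be convenient for unit testing without a running chat service. The sketch below assumes the event shapes seen in the proxy code (`session_id` / `text_delta` / `done`). `collect_stream` is a hypothetical name, not part of the module.

```python
import json


def collect_stream(lines):
    """Fold SSE lines into (assistant_text, session_id), as the proxy does."""
    chunks = []
    session_id = None
    for line in lines:
        # Only "data: <json>" lines carry events; blank lines are framing.
        if not line.startswith("data: "):
            continue
        try:
            event = json.loads(line[len("data: "):])
        except json.JSONDecodeError:
            continue
        kind = event.get("type")
        if kind == "session_id" and event.get("value"):
            session_id = event["value"]
        elif kind == "text_delta" and event.get("text"):
            chunks.append(event["text"])
        elif kind == "done" and event.get("text") and not chunks:
            # Use the final text only when no deltas were streamed.
            chunks.append(event["text"])
    return "".join(chunks).strip(), session_id
```

Feeding it a list of lines with one `session_id` event followed by a few `text_delta` events yields the concatenated text and the session id as a tuple. A trailing `done` event is ignored once deltas have been seen, matching the proxy's fallback behaviour.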
205
web/chat_system_prompt.py
Normal file
@@ -0,0 +1,205 @@
|
||||
"""Compose the system prompt the style-chat agent receives.
|
||||
|
||||
The chat runs against the local ``claude`` CLI on the host (via
|
||||
legal-chat-service). We assemble a once-per-conversation system block
|
||||
that gives the agent everything it needs to discuss decisions in
|
||||
Daphna's voice:
|
||||
|
||||
- The style guide (``skills/decision/SKILL.md``) — how she writes
|
||||
- The lessons file (``docs/legal-decision-lessons.md``) — what we've
|
||||
learned across the corpus
|
||||
- The corpus-analysis report (``docs/corpus-analysis.md``) — the
|
||||
structural map of 24+ decisions
|
||||
- A summary of every style_corpus row (number, date, subjects,
|
||||
chars + summary if extracted) so the agent can reason about the
|
||||
whole corpus without us shipping all of it inline
|
||||
- Optional: when the conversation is scoped to a specific decision
|
||||
(``style_corpus_id``), append its full_text so the chat can dive
|
||||
into the text directly
|
||||
|
||||
Sent **once**, when the conversation is first created. On subsequent
|
||||
messages the legal-chat-service uses ``claude --resume <session_id>``
|
||||
and the on-disk CLI session keeps the system context intact — no need
|
||||
to re-ship the 100K+ chars of skills + lessons every turn.
|
||||
"""
|
||||
|
||||
from __future__ import annotations
|
||||
|
||||
import logging
|
||||
import os
|
||||
from pathlib import Path
|
||||
from uuid import UUID
|
||||
|
||||
from legal_mcp.services import db
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
|
||||
# The reference files live in the repo at known paths. In the
|
||||
# container they're mounted alongside the code, so resolve relative
|
||||
# to web/app.py's parent.
|
||||
_REPO_ROOT = Path(os.environ.get(
|
||||
"LEGAL_AI_REPO_ROOT",
|
||||
str(Path(__file__).resolve().parent.parent),
|
||||
))
|
||||
|
||||
|
||||
_SKILLS_PATH = _REPO_ROOT / "skills" / "decision" / "SKILL.md"
|
||||
_LESSONS_PATH = _REPO_ROOT / "docs" / "legal-decision-lessons.md"
|
||||
_CORPUS_ANALYSIS_PATH = _REPO_ROOT / "docs" / "corpus-analysis.md"
|
||||
|
||||
|
||||
def _safe_read(path: Path, cap_chars: int = 50_000) -> str:
|
||||
"""Read a file (UTF-8) or return a marker that it's missing.
|
||||
|
||||
The cap protects against accidentally injecting an enormous file —
|
||||
even at 50K, a single source file is the lion's share of the
|
||||
system prompt budget.
|
||||
"""
|
||||
try:
|
||||
text = path.read_text(encoding="utf-8")
|
||||
except FileNotFoundError:
|
||||
return f"(קובץ {path.name} לא נמצא בנתיב {path})"
|
||||
except OSError as e:
|
||||
logger.warning("could not read %s: %s", path, e)
|
||||
return f"(שגיאה בקריאת {path.name}: {e})"
|
||||
if len(text) > cap_chars:
|
||||
return text[:cap_chars] + f"\n\n[... חתך ב-{cap_chars:,} תווים מתוך {len(text):,}]"
|
||||
return text
|
||||
|
||||
|
||||
async def _corpus_summary_block() -> str:
    """Compact one-row-per-decision summary the agent can scan."""
    import json as _json

    pool = await db.get_pool()
    async with pool.acquire() as conn:
        records = await conn.fetch(
            """
            SELECT decision_number, decision_date, appeal_subtype,
                   subject_categories,
                   coalesce(length(full_text), 0) AS chars,
                   coalesce(summary, '') AS summary,
                   coalesce(outcome, '') AS outcome
            FROM style_corpus
            ORDER BY decision_date NULLS LAST
            """
        )
    if not records:
        return "(הקורפוס ריק)"

    lines = []
    for r in records:
        cats = r["subject_categories"]
        if isinstance(cats, str):
            # The column may arrive as JSON text rather than a decoded list.
            try:
                cats = _json.loads(cats)
            except _json.JSONDecodeError:
                cats = []
        cats_str = ", ".join(cats) if cats else "—"
        date_str = str(r["decision_date"]) if r["decision_date"] else "—"
        summary = (r["summary"] or "").strip()
        outcome = (r["outcome"] or "").strip()
        head = f"- **{r['decision_number'] or '—'}** ({date_str}) [{r['appeal_subtype'] or '—'}] · {r['chars']:,} תווים"
        meta = f"  נושאים: {cats_str}"
        body = ""
        if summary:
            body = f"\n  תקציר: {summary}"
            if outcome:
                body += f" — תוצאה: {outcome}"
        elif outcome:
            body = f"\n  תוצאה: {outcome}"
        lines.append(head + "\n" + meta + body)
    return "\n".join(lines)


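The per-row formatting above can be exercised without a database. The sketch below is an illustration only: `format_row` is a hypothetical helper, the rows are plain dicts rather than database records, the labels are in English, and a plain hyphen stands in for the placeholder character.

```python
import json


def format_row(row: dict) -> str:
    """Render one corpus row as a two-line markdown bullet (hypothetical helper)."""
    cats = row.get("subject_categories")
    if isinstance(cats, str):
        # Decode defensively: malformed JSON falls back to "no categories".
        try:
            cats = json.loads(cats)
        except json.JSONDecodeError:
            cats = []
    cats_str = ", ".join(cats) if cats else "-"
    number = row.get("decision_number") or "-"
    chars = row.get("chars") or 0  # a NULL length is treated as zero
    head = f"- **{number}** · {chars:,} chars"
    return head + f"\n  topics: {cats_str}"


good = format_row(
    {"decision_number": "1001/24", "subject_categories": '["a", "b"]', "chars": 12345}
)
bad = format_row({"subject_categories": "not json"})
```

Treating a missing character count as zero avoids a formatting error when a row has no full text.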
async def _decision_full_text(corpus_id: UUID) -> str:
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        row = await conn.fetchrow(
            "SELECT decision_number, decision_date, full_text "
            "FROM style_corpus WHERE id = $1",
            corpus_id,
        )
    if not row:
        return ""
    header = f"# החלטה {row['decision_number']} ({row['decision_date']})\n\n"
    return header + (row["full_text"] or "")


SYSTEM_PROMPT_HEADER = """\
אתה סוכן הסגנון של עו"ד דפנה תמיר, יו"ר ועדת הערר לתכנון ובניה — מחוז ירושלים.

תפקידך: לעזור לחיים (העוזר המקצועי של דפנה) להבין, לנתח ולחדד את הסגנון
של דפנה. אתה לא כותב החלטות חדשות; אתה דן בסגנון של החלטות קיימות,
מזהה דפוסים, מקפיד שהכותבים העתידיים (ה-writer agent) יישארו נאמנים
לקולה.

יש לך גישה ל:
1. **מדריך הסגנון** של דפנה (skills/decision/SKILL.md) — איך היא כותבת.
2. **הלקחים הגנריים** מהקורפוס (docs/legal-decision-lessons.md) — מה
למדנו לאורך 24+ החלטות. **חובה** להישען על הקבצים האלה כשאתה דן
בסגנון, ולא להמציא תובנות חדשות מהאוויר.
3. **ניתוח הקורפוס** המבני (docs/corpus-analysis.md) — מפת תוכן ופערים.
4. **רשימת ההחלטות בקורפוס** (למטה) — סקירה תמציתית של כל החלטה
שעלתה ל-style_corpus.
5. **טקסט מלא של החלטה ספציפית** (אם השיחה הוצמדה ל-style_corpus_id).

כללי תקשורת:
- כל התשובות בעברית.
- חיים יושב מולך, לא דפנה — אבל המטרה היא לחדד את הסגנון *של דפנה*.
- אם חיים שואל "האם פסקה X מתאימה לסגנון של דפנה?" — תן ניתוח מנומק
שמסתמך על SKILL.md ועל החלטות הקורפוס. אל תמציא ראיות.
- אם אתה צריך החלטה ספציפית שאין בקורפוס — הודע לחיים שיצרף אותה.
- אם חיים אומר לך משהו חדש על דפנה ("דפנה אומרת לעולם אל תפתח החלטה
במילה X") — שמור את זה בזיכרון השיחה; אם זה מצדיק תיעוד קבוע, הצע
לחיים להוסיף את זה כ-decision_lesson (POST /api/training/lessons)
או כתוספת ל-SKILL.md.
- אל תיתן לעצמך אישיות מומצאת — אתה כלי-עזר מקצועי, לא חבר.
"""


async def build_system_prompt(
    *,
    corpus_id: UUID | None = None,
    include_corpus_summary: bool = True,
) -> str:
    """Assemble the full system prompt for a new chat conversation.

    Args:
        corpus_id: When set, the full_text of that decision is appended
            so the chat can dive into the text.
        include_corpus_summary: Set False for low-context chats (e.g. a
            quick "what does Daphna do at the end of a betterment-levy
            decision?" question, which does not need 24 summaries).
    """
    parts: list[str] = [SYSTEM_PROMPT_HEADER]

    parts.append("\n## מדריך הסגנון (skills/decision/SKILL.md)\n")
    parts.append(_safe_read(_SKILLS_PATH, cap_chars=40_000))

    parts.append("\n\n## לקחים מהקורפוס (docs/legal-decision-lessons.md)\n")
    parts.append(_safe_read(_LESSONS_PATH, cap_chars=30_000))

    parts.append("\n\n## ניתוח קורפוס מבני (docs/corpus-analysis.md)\n")
    parts.append(_safe_read(_CORPUS_ANALYSIS_PATH, cap_chars=15_000))

    if include_corpus_summary:
        parts.append("\n\n## רשימת ההחלטות בקורפוס הסגנון\n")
        try:
            parts.append(await _corpus_summary_block())
        except Exception as e:
            logger.warning("corpus summary failed: %s", e)
            parts.append("(שגיאה בטעינת רשימת הקורפוס)")

    if corpus_id is not None:
        parts.append("\n\n## ההחלטה הספציפית בדיון (full_text)\n")
        try:
            txt = await _decision_full_text(corpus_id)
            if txt:
                parts.append(txt[:200_000])  # hard cap
            else:
                parts.append("(לא נמצאה החלטה — בדוק את ה-corpus_id)")
        except Exception as e:
            logger.warning("decision full_text failed: %s", e)
            parts.append("(שגיאה בטעינת ההחלטה)")

    return "\n".join(parts)
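As a rough budget check, the caps passed in `build_system_prompt` can be summed to get a worst-case size for the capped sections. This is a back-of-the-envelope sketch: the names `FILE_CAPS`, `DECISION_CAP` and `worst_case_chars` are made up here, and the total deliberately excludes the header, the corpus summary, section titles and truncation markers, none of which are capped in the code above.

```python
# Caps as passed to _safe_read and the full_text slice in the diff.
FILE_CAPS = {
    "skills/decision/SKILL.md": 40_000,
    "docs/legal-decision-lessons.md": 30_000,
    "docs/corpus-analysis.md": 15_000,
}
DECISION_CAP = 200_000


def worst_case_chars(with_decision: bool) -> int:
    """Upper bound, in characters, for the capped sections only."""
    total = sum(FILE_CAPS.values())
    if with_decision:
        total += DECISION_CAP
    return total


print(worst_case_chars(with_decision=False))  # 85000
print(worst_case_chars(with_decision=True))   # 285000
```

The numbers show that pinning a conversation to a single decision can more than triple the capped portion of the prompt.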
@@ -100,6 +100,7 @@ async def emit_case_status_webhook(
            "POST",
            "/api/plugins/marcusgroup.legal-ai/webhooks/case-status",
            json={
                "eventType": "status_change",
                "caseNumber": case_number,
                "oldStatus": old_status,
                "newStatus": new_status,
@@ -114,3 +115,90 @@ async def emit_case_status_webhook(
            "emit_case_status_webhook failed for case %s (%s → %s): %s",
            case_number, old_status, new_status, exc,
        )


async def emit_missing_precedent_webhook(
    *,
    case_number: str,
    missing_precedent_id: str,
    citation: str,
    cited_by_party: str | None = None,
    cited_by_party_name: str | None = None,
    legal_topic: str | None = None,
    legal_issue: str | None = None,
    company_id: str | None = None,
    run_id: str | None = None,
) -> None:
    """Tell the plugin that a missing precedent was logged for a case.

    The plugin uses this to surface an ``askUserQuestions`` interaction
    on the linked Paperclip issue so the chair can decide whether to
    upload the cited precedent or mark it irrelevant.

    Fire-and-forget.
    """
    try:
        await pc_request(
            "POST",
            "/api/plugins/marcusgroup.legal-ai/webhooks/case-status",
            json={
                "eventType": "missing_precedent_created",
                "caseNumber": case_number,
                "companyId": company_id,
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "missingPrecedent": {
                    "id": missing_precedent_id,
                    "citation": citation,
                    "citedByParty": cited_by_party,
                    "citedByPartyName": cited_by_party_name,
                    "legalTopic": legal_topic,
                    "legalIssue": legal_issue,
                },
            },
            run_id=run_id,
            timeout=5.0,
        )
    except Exception as exc:
        logger.warning(
            "emit_missing_precedent_webhook failed for case %s (%s): %s",
            case_number, citation, exc,
        )


async def emit_export_complete_webhook(
    *,
    case_number: str,
    docx_filename: str,
    docx_title: str | None = None,
    company_id: str | None = None,
    run_id: str | None = None,
) -> None:
    """Tell the plugin that a final DOCX was exported for a case.

    The plugin uses this to attach a "final decision" document to the
    linked Paperclip issue (markdown body with a download link to the
    DOCX). Binary attachment is intentionally avoided because the SDK's
    ``documents.upsert`` accepts text only.

    Fire-and-forget.
    """
    try:
        await pc_request(
            "POST",
            "/api/plugins/marcusgroup.legal-ai/webhooks/case-status",
            json={
                "eventType": "export_complete",
                "caseNumber": case_number,
                "companyId": company_id,
                "timestamp": datetime.now(timezone.utc).isoformat(),
                "docxFilename": docx_filename,
                "docxTitle": docx_title or f"החלטה סופית — {case_number}",
            },
            run_id=run_id,
            timeout=5.0,
        )
    except Exception as exc:
        logger.warning(
            "emit_export_complete_webhook failed for case %s (%s): %s",
            case_number, docx_filename, exc,
        )

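The "fire-and-forget" contract shared by these emitters means a webhook failure is logged and swallowed, never raised to the caller. The sketch below shows that pattern in isolation. Everything in it is an assumption for illustration: `flaky_request` is a hypothetical stand-in for `pc_request` that always fails, and `emit_event` returns a bool only so the outcome is observable (the functions in the diff return `None`).

```python
import asyncio
import logging

logger = logging.getLogger("webhook_sketch")


async def flaky_request(path: str, payload: dict) -> None:
    """Hypothetical stand-in for the HTTP call; always fails."""
    raise ConnectionError("plugin unreachable")


async def emit_event(case_number: str, event_type: str) -> bool:
    """Send a webhook; log and swallow any failure so the caller never breaks."""
    try:
        await flaky_request(
            "/webhooks/case-status",
            {"eventType": event_type, "caseNumber": case_number},
        )
        return True
    except Exception as exc:
        logger.warning(
            "emit_event failed for case %s (%s): %s", case_number, event_type, exc
        )
        return False


# The caller continues normally even though the request raised.
delivered = asyncio.run(emit_event("1234/25", "export_complete"))
```

The trade-off is that a lost notification is visible only in the logs, which is usually acceptable for status pings but not for anything the workflow depends on.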
@@ -53,7 +53,18 @@ CURATOR_AGENTS = {
    COMPANIES["betterment"]: "d6f7c55d-570a-46b8-8d72-1286d07da0d8",  # CMPA curator
}

# Fallback mapping: used only when DB lookup returns no results
# Legal Analyst (מנתח משפטי) agent per company, woken from the chair UI
# when the chair finishes tagging appraisals and asks for fact extraction.
# The analyst runs `mcp__legal-ai__extract_appraiser_facts` locally (where
# the Claude CLI is present), since the FastAPI container cannot.
ANALYST_AGENTS = {
    COMPANIES["licensing"]: "c26e9439-a88a-49dc-9e67-2262c95db65c",  # CMP analyst
    COMPANIES["betterment"]: "f70fd353-6cde-46b3-8d6c-cfad12100b1b",  # CMPA analyst
}

# Fallback mapping: used only when DB lookup returns no results.
# בל"מ (extension request, extension_request_*) variants route to the same
# company as their parent domain: a licensing בל"מ goes to CMP, a
# betterment-levy בל"מ goes to CMPA, and so on.
_FALLBACK_APPEAL_TYPE_TO_COMPANY = {
    "רישוי": COMPANIES["licensing"],
    "היטל השבחה": COMPANIES["betterment"],
@@ -63,6 +74,10 @@ _FALLBACK_APPEAL_TYPE_TO_COMPANY = {
    "compensation_197": COMPANIES["betterment"],
    "compensation": COMPANIES["betterment"],
    "licensing": COMPANIES["licensing"],
    # בל"מ (extension request) subtypes: route per domain
    "extension_request_building_permit": COMPANIES["licensing"],
    "extension_request_betterment_levy": COMPANIES["betterment"],
    "extension_request_compensation": COMPANIES["betterment"],
}

# Legal-AI DB URL for reading tag_company_mappings
@@ -1010,3 +1025,107 @@ async def wake_curator_for_final(
        "curator_id": curator_id,
        "main_issue_id": main_issue_id,
    }


async def wake_analyst_for_appraiser_facts(
    case_number: str,
    company_id: str,
) -> dict:
    """Wake the legal-analyst to extract appraiser facts for this case.

    Triggered by the chair clicking "חלץ עובדות שמאיות עכשיו" ("Extract
    appraiser facts now") in the UI.
    The FastAPI container cannot run `extract_appraiser_facts` directly:
    the extractor calls `claude_session.query_json()`, which only works
    where the local `claude` CLI is present (the MCP server / agent runner
    on the host). So instead of running it inline, we create a child issue
    under the case's main Paperclip issue, assign it to the analyst of the
    correct company, and trigger a wakeup with `mutation: assignment`.
    The task-specific intent is carried by the child issue's description.
    The analyst's HEARTBEAT picks up the issue, runs the MCP tool locally,
    and reports back via a comment.

    Returns a dict shaped for the FastAPI endpoint to serialize as-is:
        {"status": "queued", "sub_issue_id", "analyst_id", "main_issue_id"}
    or  {"status": "skipped", "reason": "..."} for non-fatal early outs.
    """
    if not PAPERCLIP_BOARD_API_KEY:
        logger.warning(
            "PAPERCLIP_BOARD_API_KEY not set — cannot queue analyst wakeup for %s",
            case_number,
        )
        return {"status": "skipped", "reason": "no_api_key"}

    analyst_id = ANALYST_AGENTS.get(company_id)
    if not analyst_id:
        logger.info("No analyst configured for company %s — skipping", company_id)
        return {"status": "skipped", "reason": "no_analyst", "company_id": company_id}

    issues = await get_case_issues(case_number)
    if not issues:
        logger.warning(
            "No Paperclip issues found for case %s — cannot queue analyst", case_number,
        )
        return {"status": "skipped", "reason": "no_issue"}

    # Prefer the issue that is currently in progress; otherwise take the first.
    main_issue = next((i for i in issues if i.get("status") == "in_progress"), None) or issues[0]
    main_issue_id = main_issue["id"]

    description = (
        f"חיים תייג שומות בתיק {case_number} וביקש חילוץ עובדות שמאיות.\n\n"
        f"הרץ `mcp__legal-ai__extract_appraiser_facts(case_number=\"{case_number}\")` "
        f"וכתוב comment בעברית עם תוצאת החילוץ — מספר תכניות, מספר היתרים, "
        f"וסתירות (אם יש) בין שמאים. אם המסמכים חסרי תיוג `appraiser_side`, "
        f"דווח ב-comment על השומות החסרות וסגור את ה-issue כ-blocked."
    )
    child_resp = await pc_request(
        "POST",
        f"/api/issues/{main_issue_id}/children",
        json={
            "title": f"[ערר {case_number}] חילוץ עובדות שמאיות",
            "description": description,
            "status": "in_progress",
            "priority": "normal",
            "assigneeAgentId": analyst_id,
        },
        raise_on_error=True,
    )
    sub_issue = child_resp.json()
    sub_issue_id = sub_issue["id"]

    # Tag plugin_state so the case page surfaces this sub-issue too
    try:
        conn = await asyncpg.connect(PAPERCLIP_DB_URL)
        try:
            await _link_case_to_issue(conn, sub_issue_id, case_number)
        finally:
            await conn.close()
    except Exception as e:
        logger.warning("plugin_state link failed for sub_issue=%s: %s", sub_issue_id, e)

    wake_resp = await pc_request(
        "POST",
        f"/api/agents/{analyst_id}/wakeup",
        json={
            "source": "on_demand",
            "triggerDetail": "manual",
            "reason": f"extract_appraiser_facts_{case_number}",
            # Use "assignment", the same mutation `wake_curator_for_final`
            # sends. The HEARTBEAT recognises it; the task-specific intent
            # is conveyed by the child-issue's description, not the payload.
            "payload": {
                "issueId": sub_issue_id,
                "mutation": "assignment",
                "caseNumber": case_number,
            },
        },
        raise_on_error=True,
    )
    logger.info(
        "Analyst wakeup for case %s: sub_issue=%s analyst=%s wake=%s",
        case_number, sub_issue_id, analyst_id, wake_resp.status_code,
    )
    return {
        "status": "queued",
        "sub_issue_id": sub_issue_id,
        "analyst_id": analyst_id,
        "main_issue_id": main_issue_id,
    }
