Compare commits
79 Commits
c2fb4ca08e
...
feat/mcp-s
| SHA1 | Author | Date | |
|---|---|---|---|
| e90faa9ba4 | |||
| ae35934383 | |||
| d1e12619d4 | |||
| 1cb832473c | |||
| 89ce6c79d7 | |||
| 7e3c912899 | |||
| f418686724 | |||
| 8289b4d643 | |||
| 6c129a1350 | |||
| 320b9d3529 | |||
| 394b971856 | |||
| 1da3587334 | |||
| 272e49b6b0 | |||
| 69bdf7b30a | |||
| 2fe73fcce1 | |||
| c30c987ec2 | |||
| 562eae010a | |||
| a3ca32355a | |||
| 55a0eca070 | |||
| 796f9d5f9c | |||
| 70052b0133 | |||
| 2f05cdea2e | |||
| bd1fb61655 | |||
| f6bb46dc4a | |||
| 36f21c815e | |||
| d4496b96f1 | |||
| d12cdb1fad | |||
| 8a815ecff5 | |||
| 81ccf3a888 | |||
| 5724ed8e5b | |||
| c31fe0866b | |||
| 242f668319 | |||
| b9cdcf980d | |||
| 36e464f668 | |||
| 4d1924c7e6 | |||
| 26c3fddf41 | |||
| 688ba37d9c | |||
| b2985f88de | |||
| 01ea902156 | |||
| cca17689de | |||
| deb1a1eaf4 | |||
| f722fa45bd | |||
| cbdbc522a0 | |||
| 6c727cb5d0 | |||
| 923903217c | |||
| da0a385d9c | |||
| cb0b4b6a8b | |||
| 72c4593e74 | |||
| 789cc273ee | |||
| 1f17419ee9 | |||
| 4a9a6b7970 | |||
| 8e1384b897 | |||
| 6420fe4b0b | |||
| fc3b6b6cae | |||
| 2cfdf35191 | |||
| 5d836ca414 | |||
| 73a79ea7e8 | |||
| b51163b67c | |||
| 7ee90dce31 | |||
| a6edb75bbf | |||
| e849285806 | |||
| f7249b7807 | |||
| 5deb38f5cf | |||
| 817d6e6d8d | |||
| f256eddbb1 | |||
| 6a38789379 | |||
| fa70944ed4 | |||
| 7600810639 | |||
| 47127f1e85 | |||
| a1969dd90d | |||
| 1fbcdd0d16 | |||
| cd4eed0045 | |||
| 903fb4d140 | |||
| 28f49defff | |||
| 9bdfb05350 | |||
| 03e7d88aee | |||
| 4a297f910c | |||
| 5e4c03d0cd | |||
| 6b5d6586dc |
@@ -85,11 +85,30 @@ curl -s -X POST -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
 - Work on the task according to the instructions in your AGENTS.md
 - Use the legal tools (legal-ai MCP)
 
+### ⚠️ self-recovery: issue in `todo` with existing outputs
+
+Paperclip has a known bug: after an issue is updated to `done`, the `issue.released` mechanism returns it to `todo` within about 30 seconds (documented in `docs/paperclip-quirks.md §1`). This triggers a repeat wakeup of the same agent on a task that has already been completed.
+
+**Before you start working, check that the task has not already been done**:
+
+1. **Check outputs on disk**: `Glob` over the expected output directories (`{case_dir}/documents/research/*.md` for the researcher, `analysis-and-research.md` for the analyst, etc.)
+2. **Check outputs in the DB**: via MCP, using `precedent_list`, `get_claims`, `extract_appraiser_facts` (status=completed)
+3. **Check earlier comments on the issue**: whether the previous agent posted "הושלם בהצלחה" ("completed successfully") from a terminal state
+
+**If everything exists and is valid**: do not duplicate the work. Instead:
+- Post a short comment: "No change: all outputs exist from the previous run (X items in the DB, file Y on disk). Closing the issue."
+- `PATCH /api/issues/{id}` → `done`
+- Exit cleanly
+
+**If something is missing or changed**: work only on what is missing, not on everything from scratch.
+
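The self-recovery check added in this hunk can be sketched as a small decision function. This is only an illustration under stated assumptions: `PriorRunEvidence`, `find_outputs`, and `decide` are hypothetical names, the DB count is assumed to be supplied by the caller from the MCP tools, and the `"start"` outcome (nothing exists yet) is an assumption the instructions do not spell out.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class PriorRunEvidence:
    """What a woken agent found before doing any work (hypothetical shape)."""
    files_on_disk: list[Path] = field(default_factory=list)  # non-empty output files
    db_items: int = 0                 # items reported by the MCP tools
    completion_comment: bool = False  # an earlier comment reported success


def find_outputs(case_dir: Path, pattern: str) -> list[Path]:
    """Glob the expected output location and keep only non-empty files."""
    return sorted(
        p for p in case_dir.glob(pattern) if p.is_file() and p.stat().st_size > 0
    )


def decide(evidence: PriorRunEvidence) -> str:
    """Return "close", "resume", or "start" for the current wakeup."""
    has_files = bool(evidence.files_on_disk)
    has_rows = evidence.db_items > 0
    if has_files and has_rows and evidence.completion_comment:
        return "close"    # everything exists: comment, PATCH to done, exit
    if has_files or has_rows:
        return "resume"   # partial outputs: work only on what is missing
    return "start"        # assumption: nothing exists, so run the task normally
```

Keeping the decision separate from the glob and MCP calls makes it easy to unit-test the three outcomes without touching Paperclip.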
 ## 4. Reporting (mandatory!)
 
 **Before you finish, always:**
 
 ### 4a. Post a comment on the issue
+
+**For a short body (<500 characters, no backticks/code/paths):**
 ```bash
 curl -s -X POST -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
 -H "Content-Type: application/json" \
@@ -97,6 +116,24 @@ curl -s -X POST -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
 -d '{"body": "Summary of the work..."}'
 ```
 
+**For a long body / markdown with backticked paths / code, two separate steps are mandatory:**
+
+1. Write the JSON to a temporary file with the **Write tool** (not via a bash heredoc):
+```
+Write(file_path="/tmp/comment-{issue-id}.json",
+content=json.dumps({"body": markdown_body}, ensure_ascii=False))
+```
+
+2. Then use `curl -d @file`, which reads the file directly with no shell expansion:
+```bash
+curl -s -X POST -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
+-H "Content-Type: application/json" \
+"$PAPERCLIP_API_URL/api/issues/{issue-id}/comments" \
+-d @/tmp/comment-{issue-id}.json
+```
+
+**⚠️ Why not a bash heredoc / `python3 -c`:** backticks in markdown (`` `path/to/file` ``) are interpreted by bash as command substitution even when they sit inside a Python string. You get a misleading `Permission denied` error (`bash` tries to run the path as a command). The temp-file approach sidesteps all the shell quoting traps. Documented in `docs/paperclip-quirks.md §2`.
+
 ### 4b. Set the status: done or blocked
 
 **If the task completed successfully** (all documents extracted, all checks passed, no blockers):
|
|||||||
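The two-step comment flow from section 4a (write JSON to a temp file, then let curl read it) could look roughly like this in Python. It is a minimal sketch: `write_comment_payload` and `build_curl_argv` are hypothetical helpers, and the endpoint path is taken from the curl example in the diff. Passing the command as an argv list means no shell ever parses the markdown body.

```python
from __future__ import annotations

import json
from pathlib import Path


def write_comment_payload(issue_id: str, markdown_body: str, tmp_dir: str = "/tmp") -> Path:
    """Serialize the comment to a JSON file so backticks never reach a shell."""
    path = Path(tmp_dir) / f"comment-{issue_id}.json"
    payload = json.dumps({"body": markdown_body}, ensure_ascii=False)
    path.write_text(payload, encoding="utf-8")
    return path


def build_curl_argv(api_url: str, api_key: str, issue_id: str, payload_path: Path) -> list[str]:
    """Build the curl call as an argv list; `-d @file` makes curl read the body itself."""
    return [
        "curl", "-s", "-X", "POST",
        "-H", f"Authorization: Bearer {api_key}",
        "-H", "Content-Type: application/json",
        f"{api_url}/api/issues/{issue_id}/comments",
        "-d", f"@{payload_path}",
    ]
```

The resulting list can be handed to `subprocess.run(argv, check=True)` without `shell=True`, which is what keeps the quoting traps out of the picture.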
@@ -14,9 +14,15 @@ tools:
 - mcp__legal-ai__document_list
 - mcp__legal-ai__document_get_text
 - mcp__legal-ai__extract_claims
+- mcp__legal-ai__extract_appraiser_facts
 - mcp__legal-ai__get_claims
 - mcp__legal-ai__search_case_documents
 - mcp__legal-ai__search_decisions
+- mcp__legal-ai__search_precedent_library
+- mcp__legal-ai__precedent_library_get
+- mcp__legal-ai__precedent_library_list
+- mcp__legal-ai__halacha_review
+- mcp__legal-ai__halachot_pending
 - mcp__legal-ai__find_similar_cases
 - mcp__legal-ai__workflow_status
 - mcp__legal-ai__processing_status
@@ -30,7 +36,9 @@ tools:
|
|||||||
|
|
||||||
1. **`docs/decision-methodology.md`** — מתודולוגיה אנליטית: איך לחשוב על החלטה מעין-שיפוטית, מבנה סילוגיסטי, סדר סוגיות, טיפול בטענות
|
1. **`docs/decision-methodology.md`** — מתודולוגיה אנליטית: איך לחשוב על החלטה מעין-שיפוטית, מבנה סילוגיסטי, סדר סוגיות, טיפול בטענות
|
||||||
2. **`docs/block-schema.md`** — ארכיטקטורת 12 בלוקים
|
2. **`docs/block-schema.md`** — ארכיטקטורת 12 בלוקים
|
||||||
3. **`docs/legal-decision-lessons.md`** — לקחים מהחלטות קודמות
|
3. **`docs/daphna-block-zayin-claims.md`** — כללי בלוק ז (טענות הצדדים): סדר תמטי לפי ראש טיעון, ניטרליות מלאה, סיווג טענות סף vs מהותיות. **הניתוח שלך הוא הקלט לבלוק ז של ה-writer — אם תסווג שגוי או תפספס טענה, זה ייכשל גם בבלוק ז וגם בבלוק י.**
|
||||||
|
4. **`docs/daphna-precedent-network.md`** — לכל סוגיה משפטית, איזה תקדם מועדף של דפנה. שימושי כשעורר/משיב מסתמך על תקדם — לדעת אם זה תקדם בקאנון.
|
||||||
|
5. **`docs/legal-decision-lessons.md`** — לקחים מהחלטות קודמות
|
||||||
|
|
||||||
## שפה
|
## שפה
|
||||||
|
|
||||||
@@ -65,12 +73,15 @@ tools:
 
 ## Document types: what to extract and what not
 
-| Document type | What to extract | claim_type |
-|-----------|----------|------------|
-| Statement of appeal | **Claims**: what the appellants argue | claim |
-| Statement of response | **Responses**: what the respondents/committee answer | response |
-| Reply / supplementary argument | **Replies**: answers to the responses | reply |
-| Case law / plan / protocol / permit | **Extract nothing**: background documents only | - |
+| Document type (doc_type) | What to extract | Which tool |
+|----------------------|----------|------------|
+| `appeal` | **Claims**: what the appellants argue | `extract_claims` (claim_type=claim) |
+| `response` | **Responses**: what the respondents/committee answer | `extract_claims` (claim_type=response) |
+| `reply` / supplementary argument | **Replies**: answers to the responses | `extract_claims` (claim_type=reply) |
+| `appraisal` | **Appraiser facts**: numbers, coefficients, comparable transactions, value conclusions | `extract_appraiser_facts` |
+| `reference` / `plan` / `protocol` / `permit` / `decision` / `court_decision` | **Extract nothing**: background documents only | - |
 
+> **Critical distinction: an appraisal is not a pleading.** An appraisal (`appraisal`) is a professional expert opinion, not a legal argument. Do **not** run `extract_claims` on it; run `extract_appraiser_facts`, which extracts structured quantitative data (value, coefficients, transactions). This is essential input for blocks ז (zayin) and י (yod) of the decision. **Skipping it = incomplete output**.
+
 ## Workflow: 4 stages
 
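The doc_type table in this hunk amounts to a lookup from document type to extraction tool. A minimal sketch of that lookup follows; `EXTRACTION_PLAN`, `BACKGROUND_DOC_TYPES`, and `plan_extraction` are hypothetical names, while the doc_type values, tool names, and claim_type values come from the table itself. Raising on unknown types is an assumption made here so that a new doc_type is never skipped silently.

```python
from __future__ import annotations

# doc_type -> (tool name, claim_type or None), as listed in the table above.
EXTRACTION_PLAN: dict[str, tuple[str, str | None]] = {
    "appeal": ("extract_claims", "claim"),
    "response": ("extract_claims", "response"),
    "reply": ("extract_claims", "reply"),
    "appraisal": ("extract_appraiser_facts", None),  # never extract_claims
}

# Background material: nothing is extracted from these.
BACKGROUND_DOC_TYPES = frozenset(
    {"reference", "plan", "protocol", "permit", "decision", "court_decision"}
)


def plan_extraction(doc_type: str) -> tuple[str, str | None] | None:
    """Return (tool, claim_type) for a document, or None for background documents."""
    if doc_type in BACKGROUND_DOC_TYPES:
        return None
    try:
        return EXTRACTION_PLAN[doc_type]
    except KeyError:
        # Assumption: an unrecognized doc_type is an error, not a silent skip.
        raise ValueError(f"unknown doc_type: {doc_type!r}") from None
```

For example, `plan_extraction("appraisal")` yields the appraiser-facts tool with no claim_type, which mirrors the "an appraisal is not a pleading" rule.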
@@ -83,9 +94,10 @@ tools:
 - **The parties**: who is the appellant, who is the respondent, who is the third party
 - **The normative framework**: relevant laws, regulations, plans. **Read the normative documents in full** (not just the cited section; a word in one section is interpreted in light of other sections of the same document)
 4. Extract claims/responses/replies (`extract_claims` with the appropriate doc_type and party_hint)
-- **Large document (>15,000 characters):** split it into parts by chapter/section and extract from each part separately. Do not send a whole document of 20K+ words in a single call; it will cause a timeout.
-- **If extract_claims fails (timeout):** retry with part of the document. If it still fails, extract manually: read the text (`document_get_text`), identify the main claims, and insert them into the DB.
-5. Verify that every item is classified under the correct claim_type
+- **Large document (>15,000 characters):** since phase 1 of the analysis system, semantic chunking, parallelism, and retry are handled automatically. Even a document of 100K+ characters will run to completion. If it fails anyway, report it in the issue.
+- **Failure handling:** if `extract_claims` returned `partial=true` or 0 claims from a non-empty document, retry once. If it still fails, set the issue status to `blocked` and post a comment with the details.
+5. **Extract appraiser facts**: for every `doc_type='appraisal'` document in the case, run `extract_appraiser_facts(case_number)` (once per case; it handles all the appraisals). **Mandatory in every betterment appeal (8xxx) and compensation appeal (9xxx). Without it the writer cannot write block ז (zayin) with accurate figures.**
+6. Verify that every item is classified under the correct claim_type
 
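The new failure-handling rule (retry once on `partial=true` or on zero claims from a non-empty document, then `blocked`) can be sketched as below. The result shape, a dict with `claims` and `partial` keys, is an assumption made for illustration; the real return value of `extract_claims` may differ, and `extract` here is any callable standing in for the MCP tool.

```python
from __future__ import annotations

from typing import Callable


def extraction_failed(result: dict, document_text: str) -> bool:
    """Failure = partial result, or zero claims from a document that has text."""
    no_claims = len(result.get("claims", [])) == 0
    return bool(result.get("partial")) or (no_claims and bool(document_text.strip()))


def extract_with_one_retry(
    extract: Callable[[str], dict], document_text: str
) -> tuple[str, dict]:
    """Run the extractor, retry exactly once on failure, then report the status."""
    result = extract(document_text)
    if extraction_failed(result, document_text):
        result = extract(document_text)  # the single allowed retry
    status = "blocked" if extraction_failed(result, document_text) else "ok"
    return status, result
```

A `"blocked"` status is the signal to PATCH the issue to `blocked` and post a comment with the details, rather than falling back to manual extraction as the old instructions did.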
 ### Stage 2: In-depth analysis
 Present it in the following structure:
@@ -201,13 +213,33 @@ FROM documents d WHERE d.case_id = '{case_id}' AND d.doc_type IN ('appeal', 'res
|
|||||||
2. **פרסם comment** ב-Paperclip עם סיכום:
|
2. **פרסם comment** ב-Paperclip עם סיכום:
|
||||||
- כמה טענות חולצו (מפורט: X טענות עוררים, Y תשובות משיבים, Z תגובות)
|
- כמה טענות חולצו (מפורט: X טענות עוררים, Y תשובות משיבים, Z תגובות)
|
||||||
- **האם כל המסמכים חולצו בהצלחה** (כן/לא — אם לא, פרט מה נכשל)
|
- **האם כל המסמכים חולצו בהצלחה** (כן/לא — אם לא, פרט מה נכשל)
|
||||||
|
- **כמה עובדות שמאי חולצו** (אם יש מסמכי `appraisal`)
|
||||||
- הסוגיות המרכזיות (3-5 כותרות)
|
- הסוגיות המרכזיות (3-5 כותרות)
|
||||||
- כמה שאלות מחקר הופקו
|
- כמה שאלות מחקר הופקו
|
||||||
- המלצה לשלב הבא
|
- המלצה לשלב הבא
|
||||||
|
|
||||||
3. **עדכן סטטוס** (`case_update` עם status = `documents_ready`)
|
3. **עדכן סטטוס התיק** (`case_update` עם status = `documents_ready`)
|
||||||
|
|
||||||
4. **שלח מייל**:
|
4. **סגור את ה-issue של עצמך — חובה!** בלי זה Paperclip יחשוב שהמשימה עדיין רצה ויפעיל retry בלולאה (זה נצפה בפועל בריצת CMPA-16 — שלוש איטרציות מיותרות).
|
||||||
|
|
||||||
|
**אם הכל עבר בהצלחה (בדיקות שלב 6 + טענות + עובדות שמאי):**
|
||||||
|
```bash
|
||||||
|
curl -s -X PATCH -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
"$PAPERCLIP_API_URL/api/issues/{issue-id}" \
|
||||||
|
-d '{"status": "done"}'
|
||||||
|
```
|
||||||
|
|
||||||
|
**אם בדיקות שלב 6 נכשלו או חילוץ נכשל:**
|
||||||
|
```bash
|
||||||
|
curl -s -X PATCH -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
"$PAPERCLIP_API_URL/api/issues/{issue-id}" \
|
||||||
|
-d '{"status": "blocked"}'
|
||||||
|
```
|
||||||
|
**אסור** לסיים `done` עם פלט חסר — אם ניסיון חוזר נכשל, סטטוס = `blocked` + comment עם פירוט.
|
||||||
|
|
||||||
|
5. **שלח מייל**:
|
||||||
```bash
|
```bash
|
||||||
python3 /home/chaim/legal-ai/scripts/notify.py \
|
python3 /home/chaim/legal-ai/scripts/notify.py \
|
||||||
"ניתוח ומחקר הושלמו — ערר {case_number}" \
|
"ניתוח ומחקר הושלמו — ערר {case_number}" \
|
||||||
@@ -216,15 +248,20 @@ FROM documents d WHERE d.case_id = '{case_id}' AND d.doc_type IN ('appeal', 'res
|
|||||||
|
|
||||||
### העֵר את העוזר המשפטי (CEO) — חובה!
|
### העֵר את העוזר המשפטי (CEO) — חובה!
|
||||||
```bash
|
```bash
|
||||||
|
# CEO לפי חברה — אסור לקבע UUID, חברות שונות = CEO שונה
|
||||||
|
if [ "$PAPERCLIP_COMPANY_ID" = "8639e837-4c9d-47fa-a76b-95788d651896" ]; then
|
||||||
|
CEO_ID="cdbfa8bc-3d61-41a4-a2e7-677ec7d34562" # CMPA — היטלי השבחה
|
||||||
|
else
|
||||||
|
CEO_ID="752cebdd-6748-4a04-aacd-c7ab0294ef33" # CMP — רישוי ובניה
|
||||||
|
fi
|
||||||
|
|
||||||
curl -s -X POST -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
|
curl -s -X POST -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
|
||||||
-H "Content-Type: application/json" \
|
-H "Content-Type: application/json" \
|
||||||
"$PAPERCLIP_API_URL/api/agents/752cebdd-6748-4a04-aacd-c7ab0294ef33/wakeup" \
|
"$PAPERCLIP_API_URL/api/agents/$CEO_ID/wakeup" \
|
||||||
-d '{"reason": "מנתח משפטי סיים משימה [issue-id] בסטטוס [done/blocked]"}'
|
-d '{"source":"automation","triggerDetail":"system","reason":"מנתח משפטי סיים משימה [issue-id] בסטטוס [done/blocked]","payload":{"issueId":"[issue-id]","mutation":"agent_completion"}}'
|
||||||
```
|
```
|
||||||
אם ה-API לא עובד:
|
|
||||||
**⚠️ אסור להשתמש ב-INSERT INTO agent_wakeup_requests ישירות!** הכנסה ישירה ל-DB יוצרת רק את הבקשה בלי heartbeat_run — והסוכן לא יתעורר לעולם. **תמיד להשתמש ב-API בלבד.**
|
**⚠️ אסור להשתמש ב-INSERT INTO agent_wakeup_requests ישירות!** הכנסה ישירה ל-DB יוצרת רק את הבקשה בלי heartbeat_run — והסוכן לא יתעורר לעולם. **תמיד להשתמש ב-API בלבד.**
|
||||||
|
**⚠️ אסור לקבע UUID של CEO** — UUID שונה לכל חברה. תמיד דרך `$PAPERCLIP_COMPANY_ID`. wakeup לחברה אחרת נדחה: `Agent key cannot access another company`.
|
||||||
**אם בדיקות שלב 6 נכשלו** — סטטוס issue = "blocked", פרסם comment עם פירוט מה נכשל, שלח מייל לחיים.
|
|
||||||
|
|
||||||
## מבנה הפלט המלא — analysis-and-research.md
|
## מבנה הפלט המלא — analysis-and-research.md
|
||||||
|
|
||||||
@@ -375,23 +412,19 @@ X שאלות עומדות להכרעה:
|
|||||||
```
|
```
|
||||||
6. **העֵר את ה-CEO — חובה!**
|
6. **העֵר את ה-CEO — חובה!**
|
||||||
```bash
|
```bash
|
||||||
|
# CEO לפי חברה — אסור לקבע UUID, חברות שונות = CEO שונה
|
||||||
|
if [ "$PAPERCLIP_COMPANY_ID" = "8639e837-4c9d-47fa-a76b-95788d651896" ]; then
|
||||||
|
CEO_ID="cdbfa8bc-3d61-41a4-a2e7-677ec7d34562" # CMPA — היטלי השבחה
|
||||||
|
else
|
||||||
|
CEO_ID="752cebdd-6748-4a04-aacd-c7ab0294ef33" # CMP — רישוי ובניה
|
||||||
|
fi
|
||||||
|
|
||||||
curl -s -X POST -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
|
curl -s -X POST -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
|
||||||
-H "Content-Type: application/json" \
|
-H "Content-Type: application/json" \
|
||||||
"$PAPERCLIP_API_URL/api/agents/752cebdd-6748-4a04-aacd-c7ab0294ef33/wakeup" \
|
"$PAPERCLIP_API_URL/api/agents/$CEO_ID/wakeup" \
|
||||||
-d '{"reason": "מנתח משפטי סיים העמקת ניתוח (pass 2) [issue-id] בסטטוס [done/blocked]"}'
|
-d '{"source":"automation","triggerDetail":"system","reason":"מנתח משפטי סיים העמקת ניתוח (pass 2) [issue-id] בסטטוס [done/blocked]","payload":{"issueId":"[issue-id]","mutation":"agent_completion"}}'
|
||||||
```
|
|
||||||
אם ה-API לא עובד:
|
|
||||||
```bash
|
|
||||||
PGPASSWORD="paperclip" psql -h 127.0.0.1 -p 54329 -U paperclip -d paperclip -c "
|
|
||||||
INSERT INTO agent_wakeup_requests (company_id, agent_id, source, reason, status, requested_by_actor_type)
|
|
||||||
VALUES (
|
|
||||||
(SELECT company_id FROM agents WHERE id = '\$PAPERCLIP_AGENT_ID'),
|
|
||||||
'752cebdd-6748-4a04-aacd-c7ab0294ef33',
|
|
||||||
'agent_completion',
|
|
||||||
'מנתח משפטי סיים העמקת ניתוח (pass 2) — נדרשת בדיקה',
|
|
||||||
'queued', 'agent'
|
|
||||||
);"
|
|
||||||
```
|
```
|
||||||
|
**⚠️ אם ה-API מחזיר שגיאה — אל תיגע ב-DB.** `INSERT INTO agent_wakeup_requests` לא יוצר `heartbeat_run` והסוכן לא יתעורר לעולם. בדוק `$PAPERCLIP_COMPANY_ID` ו-`$PAPERCLIP_API_KEY`, ודאי שאתה לא קורא ל-CEO של חברה אחרת (`Agent key cannot access another company`).
|
||||||
|
|
||||||
## כללים קריטיים
|
## כללים קריטיים
|
||||||
|
|
||||||
|
|||||||
@@ -17,6 +17,7 @@ tools:
|
|||||||
- mcp__legal-ai__record_chair_feedback
|
- mcp__legal-ai__record_chair_feedback
|
||||||
- mcp__legal-ai__list_chair_feedback
|
- mcp__legal-ai__list_chair_feedback
|
||||||
- mcp__legal-ai__search_case_documents
|
- mcp__legal-ai__search_case_documents
|
||||||
|
- mcp__legal-ai__search_precedent_library
|
||||||
- mcp__legal-ai__workflow_status
|
- mcp__legal-ai__workflow_status
|
||||||
- mcp__legal-ai__processing_status
|
- mcp__legal-ai__processing_status
|
||||||
- mcp__legal-ai__get_metrics
|
- mcp__legal-ai__get_metrics
|
||||||
@@ -28,6 +29,13 @@ tools:
|
|||||||
- mcp__legal-ai__apply_user_edit
|
- mcp__legal-ai__apply_user_edit
|
||||||
- mcp__legal-ai__list_bookmarks
|
- mcp__legal-ai__list_bookmarks
|
||||||
- mcp__legal-ai__revise_draft
|
- mcp__legal-ai__revise_draft
|
||||||
|
- mcp__legal-ai__precedent_process_pending
|
||||||
|
- mcp__legal-ai__precedent_extract_halachot
|
||||||
|
- mcp__legal-ai__precedent_extract_metadata
|
||||||
|
- mcp__legal-ai__precedent_library_get
|
||||||
|
- mcp__legal-ai__precedent_library_list
|
||||||
|
- mcp__legal-ai__halacha_review
|
||||||
|
- mcp__legal-ai__halachot_pending
|
||||||
---
|
---
|
||||||
|
|
||||||
# עוזר משפטי — מנהל תהליך כתיבת החלטות
|
# עוזר משפטי — מנהל תהליך כתיבת החלטות
|
||||||
@@ -48,10 +56,24 @@ tools:
|
|||||||
|
|
||||||
| מסמך | תוכן | מתי לקרוא |
|
| מסמך | תוכן | מתי לקרוא |
|
||||||
|------|-------|-----------|
|
|------|-------|-----------|
|
||||||
|
| `docs/daphna-decision-tree.md` | **כלי הפעולה היומיומי** — עץ החלטה: מהי הראיה הניצחת? איזו תבנית? איזה אורך? | **לפני כל החלטה** |
|
||||||
| `docs/decision-methodology.md` | מתודולוגיה אנליטית — סילוגיזמים, סדר סוגיות, איזון | **לפני כל החלטה** |
|
| `docs/decision-methodology.md` | מתודולוגיה אנליטית — סילוגיזמים, סדר סוגיות, איזון | **לפני כל החלטה** |
|
||||||
| `docs/block-schema.md` | הגדרת 12 בלוקים — content model, constraints | **לפני כל החלטה** |
|
| `docs/block-schema.md` | הגדרת 12 בלוקים — content model, constraints | **לפני כל החלטה** |
|
||||||
| `docs/legal-decision-lessons.md` | לקחים מ-3 החלטות — מה עבד, מה השתנה | **לפני כל החלטה** |
|
| `docs/legal-decision-lessons.md` | לקחים מ-3 החלטות — מה עבד, מה השתנה | **לפני כל החלטה** |
|
||||||
|
|
||||||
|
### מסמכי הקול של דפנה (להפנייה לסוכנים)
|
||||||
|
|
||||||
|
הסוכנים שלך (writer, qa, researcher, analyst) קוראים את מסמכי הקול בעצמם. **התפקיד שלך**: לוודא שהם **קוראים** אותם, ולנתב את הסוכן הנכון לפי סוג התיק.
|
||||||
|
|
||||||
|
| מסמך | תפקיד | סוכן רלוונטי |
|
||||||
|
|------|--------|---------------|
|
||||||
|
| `docs/daphna-voice-fingerprint.md` | קבועי הקול | writer + qa |
|
||||||
|
| `docs/daphna-precedent-network.md` | קאנון תקדמים | researcher + writer + qa |
|
||||||
|
| `docs/daphna-architecture-by-outcome.md` | מבנה בלוק י לפי תוצאה | writer + qa |
|
||||||
|
| `docs/daphna-acceptance-architecture.md` | 5 תבניות קבלה | writer + qa (אם תוצאה = קבלה) |
|
||||||
|
| `docs/daphna-block-zayin-claims.md` | כללי בלוק ז | analyst + writer + qa |
|
||||||
|
| `docs/voice-1130-25.md` | דוגמה עמוקה | writer (אם תיק 1xxx מורכב) |
|
||||||
|
|
||||||
## הסוכנים שלך
|
## הסוכנים שלך
|
||||||
|
|
||||||
| סוכן | Agent ID | תפקיד |
|
| סוכן | Agent ID | תפקיד |
|
||||||
@@ -137,8 +159,33 @@ Paperclip automatically blocks every issue in `in_progress` that has no ru
 **Before anything else**, check the wake reason (`$PAPERCLIP_WAKE_REASON`):
 - If the reason contains `user_commented` → **jump straight to the section "Handling new comments from Chaim"**. Do not scan other cases, do not check issues, do not run a regular heartbeat. **Handle only the comment.**
 - If the reason contains `agent_completion` → jump to stage E/F depending on which agent finished
+- If the reason contains `precedent_extraction_` → **jump to the section "Automatic case-law extraction"**. Do not touch cases; this is library work.
 - Otherwise → continue to stage A (regular heartbeat)
 
+### Automatic case-law extraction
+
+Triggered when a new judgment is uploaded to the library. The issue lives in the project "ספריית פסיקה — תור חילוץ" (case-law library, extraction queue) and is assigned to you.
+
+**⚠️ MCP startup race: read this before the first call!**
+The legal-ai MCP server takes ~3-10 seconds to come up on a fresh wakeup (Python imports). If the first call to `mcp__legal-ai__*` returns `"No such tool available"`, it is a race, **not a real bug**. The correct action:
+1. Run `Bash sleep 5` to let the MCP server settle.
+2. Retry the same MCP tool.
+3. If it still fails after 2 retries, fall back to direct Python (`Bash` with `.venv/bin/python -c "from legal_mcp.tools.precedent_library import ..."`).
+
+**What to do:**
+1. Read the issue description; it states the `case_law_id` and the citation.
+2. **warmup**: first call `mcp__legal-ai__workflow_status(case_number="warmup")` (a lightweight tool that forces MCP to connect). If it fails with "No such tool available" → `Bash sleep 5` and then retry. Only once this works, continue:
+3. Run two calls:
+```
+mcp__legal-ai__precedent_process_pending(kind="metadata")
+mcp__legal-ai__precedent_process_pending(kind="halacha")
+```
+The tool processes **all** judgments in the queue: if you were woken for one and more arrived in the meantime, they are processed too.
+4. When it finishes: write a short comment on the issue (`mcp__legal-ai__precedent_process_pending` returns the result; summarize in Hebrew how many halachot were extracted, which metadata fields were completed, and the status of each judgment).
+5. Mark the issue as `done`.
+
+**Do not**: do not create execution issues in appeal cases, and do not enter the decision-writing process. This is only maintenance work on the case-law library.
+
 ### Stage A: State check (completeness, negative checks, methodology compliance)
 
 On every **regular** heartbeat (not comment routing):
@@ -494,11 +541,15 @@ Paperclip חוסם אוטומטית כל issue ב-`in_progress` שאין לו ru
|
|||||||
---
|
---
|
||||||
|
|
||||||
**תבנית issue למנתח — חובה בכל תיק:**
|
**תבנית issue למנתח — חובה בכל תיק:**
|
||||||
1. **טבלת מיפוי מסמכים** — לכל מסמך: שם, claim_type, party_role. בנה מ-`document_list`.
|
1. **טבלת מיפוי מסמכים** — לכל מסמך: שם, doc_type, פעולה נדרשת:
|
||||||
2. **רשימת מסמכים שלא לחלץ מהם** (reference, plan, decision, court_decision)
|
- `appeal` → `extract_claims` (claim_type=claim, party_role=appellant)
|
||||||
3. **הנחיה לפיצול מסמכים גדולים** — מעל 15,000 תווים → חלץ בחלקים
|
- `response` → `extract_claims` (claim_type=response, party_role=respondent/committee)
|
||||||
4. **הנחיה לשלוח wakeup ל-CEO בסיום**
|
- `reply` → `extract_claims` (claim_type=reply, party_role=permit_applicant/appellant)
|
||||||
5. **הנחיה לסיים כ-blocked אם מסמך נכשל**
|
- **`appraisal` → `extract_appraiser_facts`** (לא extract_claims! שומה אינה כתב טענות. חובה בכל תיק 8xxx/9xxx)
|
||||||
|
- `reference`/`plan`/`protocol`/`permit`/`decision`/`court_decision` → אל תחלץ — חומר רקע בלבד
|
||||||
|
2. **בדיקת השלמה** — לכל doc_type='appraisal' בתיק, וודא שה-issue אומר במפורש להריץ `extract_appraiser_facts`. בלי זה ה-writer יקבל בלוק ז ריק ממספרים.
|
||||||
|
3. **הנחיה לסגור את ה-issue ב-PATCH** — סטטוס `done` בהצלחה, `blocked` בכשל. בלי זה Paperclip יפעיל retry בלולאה (נצפה בפועל ב-CMPA-16 / 30-04-26).
|
||||||
|
4. **הנחיה לשלוח wakeup ל-CEO בסיום** (כך שאתה תידע להמשיך)
|
||||||
|
|
||||||
## סינון תיקים לפי חברה — חובה!
|
## סינון תיקים לפי חברה — חובה!
|
||||||
|
|
||||||
|
|||||||
@@ -116,15 +116,43 @@ tools:
|
|||||||
- ממצאי הבדיקה הסופית (אם היו הערות)
|
- ממצאי הבדיקה הסופית (אם היו הערות)
|
||||||
- גודל הקובץ
|
- גודל הקובץ
|
||||||
|
|
||||||
|
### סגור את ה-issue של עצמך — חובה!
|
||||||
|
|
||||||
|
בלי זה Paperclip יזהה "issue in_progress + אין execution חיה" ויפעיל auto-retry בלולאה (נצפה בפועל ב-CMPA-17 ב-30/04/26 — 4 איטרציות מיותרות עד הריגה ידנית).
|
||||||
|
|
||||||
|
**אם הכל עבר בהצלחה (כל בדיקות השלב הקודם עברו, אין כשל בפלט):**
|
||||||
|
```bash
|
||||||
|
curl -s -X PATCH -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
"$PAPERCLIP_API_URL/api/issues/{issue-id}" \
|
||||||
|
-d '{"status": "done"}'
|
||||||
|
```
|
||||||
|
|
||||||
|
**אם בדיקות נכשלו, חסר פלט, או חסר מידע קריטי:**
|
||||||
|
```bash
|
||||||
|
curl -s -X PATCH -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
|
||||||
|
-H "Content-Type: application/json" \
|
||||||
|
"$PAPERCLIP_API_URL/api/issues/{issue-id}" \
|
||||||
|
-d '{"status": "blocked"}'
|
||||||
|
```
|
||||||
|
**אסור** לסיים `done` עם פלט חסר — אם משהו נכשל, סטטוס = `blocked` + comment עם פירוט.
|
||||||
|
|
||||||
### העֵר את העוזר המשפטי (CEO) — חובה!
|
### העֵר את העוזר המשפטי (CEO) — חובה!
|
||||||
```bash
|
```bash
|
||||||
|
# CEO לפי חברה — אסור לקבע UUID, חברות שונות = CEO שונה
|
||||||
|
if [ "$PAPERCLIP_COMPANY_ID" = "8639e837-4c9d-47fa-a76b-95788d651896" ]; then
|
||||||
|
CEO_ID="cdbfa8bc-3d61-41a4-a2e7-677ec7d34562" # CMPA — היטלי השבחה
|
||||||
|
else
|
||||||
|
CEO_ID="752cebdd-6748-4a04-aacd-c7ab0294ef33" # CMP — רישוי ובניה
|
||||||
|
fi
|
||||||
|
|
||||||
curl -s -X POST -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
|
curl -s -X POST -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
|
||||||
-H "Content-Type: application/json" \
|
-H "Content-Type: application/json" \
|
||||||
"$PAPERCLIP_API_URL/api/agents/752cebdd-6748-4a04-aacd-c7ab0294ef33/wakeup" \
|
"$PAPERCLIP_API_URL/api/agents/$CEO_ID/wakeup" \
|
||||||
-d '{"reason": "מייצא טיוטה סיים משימה [issue-id] בסטטוס [done/blocked]"}'
|
-d '{"source":"automation","triggerDetail":"system","reason":"מייצא טיוטה סיים משימה [issue-id] בסטטוס [done/blocked]","payload":{"issueId":"[issue-id]","mutation":"agent_completion"}}'
|
||||||
```
|
```
|
||||||
אם ה-API לא עובד:
|
|
||||||
**⚠️ אסור להשתמש ב-INSERT INTO agent_wakeup_requests ישירות!** הכנסה ישירה ל-DB יוצרת רק את הבקשה בלי heartbeat_run — והסוכן לא יתעורר לעולם. **תמיד להשתמש ב-API בלבד.**
|
**⚠️ אסור להשתמש ב-INSERT INTO agent_wakeup_requests ישירות!** הכנסה ישירה ל-DB יוצרת רק את הבקשה בלי heartbeat_run — והסוכן לא יתעורר לעולם. **תמיד להשתמש ב-API בלבד.**
|
||||||
|
**⚠️ אסור לקבע UUID של CEO** — UUID שונה לכל חברה. תמיד דרך `$PAPERCLIP_COMPANY_ID`. wakeup לחברה אחרת נדחה: `Agent key cannot access another company`.
|
||||||
|
|
||||||
## כללים קריטיים
|
## כללים קריטיים
|
||||||
|
|
||||||
|
|||||||
@@ -69,5 +69,58 @@ tools:
 ### Stage 4: Saving
 1. **Backup**: copy the original file from `extracted/` to the `documents/backup/` directory with the `.pre-proofread.txt` suffix
 2. **Write** the corrected version to the `documents/proofread/` directory (with the same file name as in `extracted/`)
-3. Update the database: change `extraction_status` to `proofread`:
+3. Update the database: change `extraction_status` to `proofread`
+
+### Stage 5: Reporting (mandatory!)
+
+1. **Post a comment on the issue** with a summary:
+- How many documents were proofread
+- How many automatic replacements were made (per the abbreviations dictionary)
+- How many manual corrections were made
+- If problems were found that could not be fixed, list them (`[?]` markers)
+
+2. **Send an email**:
+```bash
+python3 /home/chaim/legal-ai/scripts/notify.py \
+"Proofreading complete: appeal {case_number}" \
+"Summary: X documents proofread, Y replacements, Z corrections. Your review is required."
+```
+
+### Close your own issue (mandatory!)
+
+Otherwise Paperclip detects "issue in_progress + no live execution" and triggers auto-retry in a loop (observed in practice on CMPA-17 on 30/04/26: 4 unnecessary iterations until it was killed manually).
+
+**If everything passed successfully:**
+```bash
+curl -s -X PATCH -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
+-H "Content-Type: application/json" \
+"$PAPERCLIP_API_URL/api/issues/{issue-id}" \
+-d '{"status": "done"}'
+```
+
+**If critical corrections failed or there are many `[?]` markers:**
+```bash
+curl -s -X PATCH -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
+-H "Content-Type: application/json" \
+"$PAPERCLIP_API_URL/api/issues/{issue-id}" \
+-d '{"status": "blocked"}'
+```
+**Never** finish as `done` with incomplete output. If something failed, status = `blocked` + a comment with the details.
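The two PATCH calls above differ only in the status value, so they can be sketched as one request builder. This is an illustrative sketch using Python's standard `urllib.request`; `build_status_request` is a hypothetical helper, and the URL shape and JSON body are copied from the curl examples in the diff. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_status_request(
    api_url: str, api_key: str, issue_id: str, succeeded: bool
) -> urllib.request.Request:
    """Build the PATCH that closes the agent's own issue as done or blocked."""
    status = "done" if succeeded else "blocked"  # never "done" with missing output
    body = json.dumps({"status": status}).encode("utf-8")
    return urllib.request.Request(
        f"{api_url}/api/issues/{issue_id}",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )
```

Sending it would be a matter of `urllib.request.urlopen(request)`; deriving the status from a single boolean keeps the "blocked on any failure" rule in one place.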
+
+### Wake the legal assistant (CEO) (mandatory!)
+
+```bash
+# CEO per company: never hardcode a single UUID, different companies = different CEO
+if [ "$PAPERCLIP_COMPANY_ID" = "8639e837-4c9d-47fa-a76b-95788d651896" ]; then
+CEO_ID="cdbfa8bc-3d61-41a4-a2e7-677ec7d34562" # CMPA: betterment levies
+else
+CEO_ID="752cebdd-6748-4a04-aacd-c7ab0294ef33" # CMP: licensing and construction
+fi
+
+curl -s -X POST -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
+-H "Content-Type: application/json" \
+"$PAPERCLIP_API_URL/api/agents/$CEO_ID/wakeup" \
+-d '{"source":"automation","triggerDetail":"system","reason":"Proofreader finished task [issue-id] with status [done/blocked]","payload":{"issueId":"[issue-id]","mutation":"agent_completion"}}'
+```
 **⚠️ Never use INSERT INTO agent_wakeup_requests directly!** A direct DB insert creates only the request, with no heartbeat_run, and the agent will never wake up. **Always use the API only.**
+**⚠️ Never hardcode the CEO UUID.** The UUID differs per company. Always go through `$PAPERCLIP_COMPANY_ID`. A wakeup sent to another company is rejected: `Agent key cannot access another company`.
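The per-company CEO lookup and the wakeup body from the bash snippet above can be sketched as two small functions. The UUIDs and the payload fields are copied from the diff; `ceo_for_company` and `wakeup_payload` are hypothetical names, and falling back to the CMP CEO for any other company mirrors the `else` branch of the shell code rather than a documented rule.

```python
import json

# Company UUID -> CEO agent UUID (values taken from the diff above).
CEO_BY_COMPANY = {
    "8639e837-4c9d-47fa-a76b-95788d651896": "cdbfa8bc-3d61-41a4-a2e7-677ec7d34562",  # CMPA
}
DEFAULT_CEO = "752cebdd-6748-4a04-aacd-c7ab0294ef33"  # CMP, the shell `else` branch


def ceo_for_company(company_id: str) -> str:
    """Pick the CEO agent for the company the current agent belongs to."""
    return CEO_BY_COMPANY.get(company_id, DEFAULT_CEO)


def wakeup_payload(agent_label: str, issue_id: str, status: str) -> str:
    """Serialize the wakeup body; json.dumps handles all quoting and escaping."""
    return json.dumps(
        {
            "source": "automation",
            "triggerDetail": "system",
            "reason": f"{agent_label} finished task {issue_id} with status {status}",
            "payload": {"issueId": issue_id, "mutation": "agent_completion"},
        },
        ensure_ascii=False,
    )
```

Building the body with `json.dumps` instead of a hand-written string avoids broken JSON when the reason text contains quotes or non-ASCII characters.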
|
|||||||
@@ -14,6 +14,9 @@ tools:
|
|||||||
- mcp__legal-ai__get_metrics
|
- mcp__legal-ai__get_metrics
|
||||||
- mcp__legal-ai__workflow_status
|
- mcp__legal-ai__workflow_status
|
||||||
- mcp__legal-ai__search_case_documents
|
- mcp__legal-ai__search_case_documents
|
||||||
|
- mcp__legal-ai__search_precedent_library
|
||||||
|
- mcp__legal-ai__precedent_library_get
|
||||||
|
- mcp__legal-ai__halacha_review
|
||||||
---
|
---
|
||||||
|
|
||||||
# בודק איכות — סוכן QA להחלטות ועדת ערר
|
# בודק איכות — סוכן QA להחלטות ועדת ערר
|
||||||
@@ -32,7 +35,18 @@ tools:

If an issue targets a case outside your scope, refuse and report in a comment.

## Before you start: read the voice documents

Without reading the voice documents, you cannot verify that the writer followed Daphna's style.

1. **`docs/daphna-decision-tree.md`**: operational summary. From it you reach the specific documents for each question.
2. **`docs/daphna-voice-fingerprint.md`**: the voice constants ("we" verbs, anti-patterns, linking phrases).
3. **`docs/daphna-architecture-by-outcome.md`**: structure of block י by outcome.
4. **`docs/daphna-acceptance-architecture.md`**: five acceptance templates. **Mandatory if the case is a full acceptance (not partial).**
5. **`docs/daphna-block-zayin-claims.md`**: rules for block ז (the parties' claims).
6. **`docs/daphna-precedent-network.md`**: for each legal issue, which precedent Daphna cites.

## 7 checks

### 1. Structural integrity (structural_integrity)
- All mandatory blocks exist (ה through יא)
@@ -74,6 +88,45 @@ tools:
- No "empty formulas" (sentences whose deletion changes nothing)?
- Quotations wrapped as a sandwich (introduction → quotation → analysis)?

### 8. Compliance with Daphna's voice (voice_compliance)
Based on the 6 voice documents. Check:

#### Block ז (from `daphna-block-zayin-claims.md`)
- Heading is **"תמצית טענות הצדדים"** ("Summary of the parties' claims"), not "טענות הצדדים" ("The parties' claims")?
- Each party gets a subheading (the appellants' claims / the committee's response / the permit applicants' response)?
- No numbered list `(1)... (2)...` inside a paragraph?
- No evaluative words ("בצדק" rightly, "בטעות" mistakenly, "משכנעת" persuasive)?
- No disclosure of a future conclusion ("טענה זו תידחה בהמשך", "this claim will be rejected below")?
- No long case-law quotations, only name + reference?
- Active voice ("העורר טוען", "the appellant argues") rather than passivization ("טענות העורר היו", "the appellant's claims were")?

#### Block י (from `daphna-voice-fingerprint.md` + `daphna-architecture-by-outcome.md`)
- Block י heading = **"דיון והכרעה"** ("Discussion and ruling"), fixed?
- Active "we" voice: not "הוועדה מוצאת" ("the committee finds") but "מצאנו" ("we found")?
- Every "we" verb carries a function: no "נחדד" ("let us sharpen") as a random paragraph opener?
- "אכן... אולם" ("indeed... however") pattern for claims that are rejected (not a one-sentence rejection)?
- No numbered list in the analysis?
- No sequential paragraph numbering (1., 2., 3.), an old practice abandoned from 2025 onward?
- Subheadings only if there are 3+ distinct issues (not in a single-issue case)?
- Case-law quotations in full (4-15 lines), not summaries?
- If a complex 1xxx case: philosophical framing at the opening?
- If an 8xxx case with an appraiser's ruling: a quotation from בר"מ 3644/13 is present?
- "למעלה מן הצורך" ("beyond what is necessary") for central arguments?
- No dramatic rhetoric of the parties in the ruling's voice?
- No all-or-nothing outcome in a case with substantive claims on both sides?

#### Precedents (from `daphna-precedent-network.md`)
- For each legal issue: was Daphna's preferred precedent chosen?
- Is there a relevant personal precedent of hers? If so, was it referenced (doctrinal economy / rejection / distinction)?
- **External case-law quotations in block י**: for every quotation (`citation` + `supporting_quote`) that appears, search `search_precedent_library` (with the relevant subject_tag) and verify that the quotation exists in the corpus and that the halacha was approved. A quotation that does not match an approved halacha = critical.

#### Acceptance template (from `daphna-acceptance-architecture.md`, if outcome = acceptance)
- Is the reason for acceptance clear: internal defect / remand / amendments / substantive 8xxx / assessment?
- Does the chosen template (A/B/C/D/E) fit the reason?
- Is the closing format correct for the template? (Template A: "מתבטלת" ("is annulled"); B: "תיקבע לדיון" ("will be set for a hearing") + clarification instruction; C: "בכפוף לתיקונים" ("subject to amendments"); D: "דרישת התשלום בטלה" ("the payment demand is void"); E: "השומה תושב לתיקון" ("the assessment will be returned for correction"))
- In template A: are there "הודאת צד נגדי" ("admission by the opposing party") and "השמטה רחבה" ("broad omission")?
- In template C: is there a paragraph acknowledging the committee ("פעלה נכון בקיום הדיון", "acted correctly in holding the hearing")?

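Three of the block ז checks above are mechanical enough to automate. The sketch below is an illustration only, assuming the block arrives as a heading string plus a body string; the function name, the violation labels, and the "two or more `(n)` markers per paragraph" heuristic are assumptions, and plain substring matching can over-flag longer words that contain an evaluative word.

```python
import re

# Literal strings are taken from the checklist; everything else is illustrative.
REQUIRED_HEADING = "תמצית טענות הצדדים"
EVALUATIVE_WORDS = ("בצדק", "בטעות", "משכנעת")
INLINE_NUMBERING = re.compile(r"\(\d+\)")


def check_block_zayin(heading: str, body: str) -> list[str]:
    """Return violation labels; an empty list means these three checks pass."""
    violations = []
    if heading.strip() != REQUIRED_HEADING:
        violations.append("heading")
    if any(word in body for word in EVALUATIVE_WORDS):
        violations.append("evaluative_word")
    # Two or more "(n)" markers in one paragraph are treated as an inline numbered list.
    for paragraph in body.split("\n\n"):
        if len(INLINE_NUMBERING.findall(paragraph)) >= 2:
            violations.append("inline_numbered_list")
            break
    return violations
```

The remaining checks (future-conclusion disclosure, active voice, quotation length) need judgment about meaning and are better left to the reviewing agent.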
## Severity

| Check | Severity | Meaning |
@@ -85,6 +138,7 @@ tools:
| Duplication | warning | Reported, does not block |
| Numbering | warning | Reported, does not block |
| Methodology | critical | Blocks export |
| **Daphna's voice** | **critical** | **Blocks export** |

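The severity rows translate directly into an export gate: a failed `critical` check blocks export, a failed `warning` check is only reported. The sketch below covers just the four rows visible in this hunk; the dictionary keys and the function name are illustrative, and the real table contains further checks that are not shown here.

```python
# Severity per check, copied from the visible table rows; key names are illustrative.
SEVERITY = {
    "duplication": "warning",
    "numbering": "warning",
    "methodology": "critical",
    "voice_compliance": "critical",
}


def may_export(results: dict[str, bool]) -> bool:
    """True when every critical check passed; warnings never block export."""
    return all(
        passed
        for check, passed in results.items()
        if SEVERITY.get(check) == "critical"
    )
```

Checks missing from `SEVERITY` are ignored by this sketch; a stricter variant might treat an unknown check name as an error.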
## Workflow

@@ -113,12 +167,40 @@ tools:
- Is export allowed (did all the critical checks pass)?
- Update the status to qa_review (if it failed) or drafted (if it passed)

### Close your own issue: mandatory!

Without this, Paperclip detects "issue in_progress + no live execution" and triggers auto-retry in a loop (observed in practice in CMPA-17 on 30/04/26: 4 unnecessary iterations until it was killed manually).

**If everything succeeded (all checks of the previous step passed, no failure in the output):**
```bash
curl -s -X PATCH -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
  -H "Content-Type: application/json" \
  "$PAPERCLIP_API_URL/api/issues/{issue-id}" \
  -d '{"status": "done"}'
```

**If checks failed, output is missing, or critical information is missing:**
```bash
curl -s -X PATCH -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
  -H "Content-Type: application/json" \
  "$PAPERCLIP_API_URL/api/issues/{issue-id}" \
  -d '{"status": "blocked"}'
```
**Never** finish as `done` with missing output. If anything failed, status = `blocked` + a comment with the details.

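The done/blocked decision and the PATCH call can be expressed in one place. This is a sketch under stated assumptions: it only builds the request and does not send it, the function name is hypothetical, and the fallback base URL is a placeholder rather than a real Paperclip address. The endpoint path, headers, and body are the ones shown in the curl commands.

```python
import json
import os
import urllib.request


def build_close_request(issue_id: str, all_passed: bool) -> urllib.request.Request:
    """Build (but do not send) the PATCH that closes the agent's own issue."""
    status = "done" if all_passed else "blocked"
    # Placeholder default for local experiments only; an assumption of this sketch.
    base_url = os.environ.get("PAPERCLIP_API_URL", "http://localhost")
    return urllib.request.Request(
        url=f"{base_url}/api/issues/{issue_id}",
        data=json.dumps({"status": status}).encode("utf-8"),
        method="PATCH",
        headers={
            "Authorization": f"Bearer {os.environ.get('PAPERCLIP_API_KEY', '')}",
            "Content-Type": "application/json",
        },
    )
```

Passing the result to `urllib.request.urlopen` would perform the call; a `blocked` status should still be accompanied by a comment with the details, which this sketch does not post.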
### Wake the legal assistant (CEO): mandatory!
```bash
# CEO per company: do not hardcode the UUID, different companies = different CEO
if [ "$PAPERCLIP_COMPANY_ID" = "8639e837-4c9d-47fa-a76b-95788d651896" ]; then
  CEO_ID="cdbfa8bc-3d61-41a4-a2e7-677ec7d34562"  # CMPA: betterment levies
else
  CEO_ID="752cebdd-6748-4a04-aacd-c7ab0294ef33"  # CMP: licensing and construction
fi

curl -s -X POST -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
  -H "Content-Type: application/json" \
  "$PAPERCLIP_API_URL/api/agents/$CEO_ID/wakeup" \
  -d '{"source":"automation","triggerDetail":"system","reason":"Quality checker finished task [issue-id] with status [done/blocked]","payload":{"issueId":"[issue-id]","mutation":"agent_completion"}}'
```

**⚠️ Do not run INSERT INTO agent_wakeup_requests directly!** A direct DB insert creates only the request, without a heartbeat_run, so the agent will never wake up. **Always use the API only.**

**⚠️ Do not hardcode the CEO's UUID.** The UUID differs per company. Always resolve it via `$PAPERCLIP_COMPANY_ID`. A wakeup sent to another company is rejected: `Agent key cannot access another company`.

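The company-to-CEO branch in the bash snippet amounts to a small lookup with a default. A Python sketch of the same logic follows, with the UUIDs copied from that snippet; the names `CEO_BY_COMPANY`, `DEFAULT_CEO_ID`, and `resolve_ceo_id` are hypothetical.

```python
# UUIDs copied from the bash snippet; all names here are illustrative.
CMPA_COMPANY_ID = "8639e837-4c9d-47fa-a76b-95788d651896"
CEO_BY_COMPANY = {
    CMPA_COMPANY_ID: "cdbfa8bc-3d61-41a4-a2e7-677ec7d34562",  # CMPA: betterment levies
}
DEFAULT_CEO_ID = "752cebdd-6748-4a04-aacd-c7ab0294ef33"  # CMP: licensing and construction


def resolve_ceo_id(company_id: str) -> str:
    """Mirror the bash if/else: CMPA gets its own CEO, anything else falls back to CMP."""
    return CEO_BY_COMPANY.get(company_id, DEFAULT_CEO_ID)
```

Like the bash version, this falls through to the CMP CEO for any unrecognized company ID. If a third company is ever added, raising on unknown IDs may be safer, since a wakeup sent to the wrong company is rejected anyway.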
@@ -16,6 +16,17 @@ tools:
- mcp__legal-ai__search_decisions
- mcp__legal-ai__find_similar_cases
- mcp__legal-ai__extract_references
- mcp__legal-ai__precedent_attach
- mcp__legal-ai__precedent_list
- mcp__legal-ai__precedent_search_library
- mcp__legal-ai__search_precedent_library
- mcp__legal-ai__precedent_library_get
- mcp__legal-ai__precedent_library_list
- mcp__legal-ai__precedent_extract_halachot
- mcp__legal-ai__precedent_extract_metadata
- mcp__legal-ai__precedent_process_pending
- mcp__legal-ai__halacha_review
- mcp__legal-ai__halachot_pending
- mcp__legal-ai__workflow_status
---

@@ -37,8 +48,10 @@ tools:

## Before you start: read!

1. **Daphna's precedent network**: `docs/daphna-precedent-network.md`. **Mandatory reading.** For each legal issue, Daphna has a **preferred** precedent that she cites consistently (אייזן / רוזן / שפר / הרמלין / חוף השרון / בר"מ 3644/13 גלר, etc.). Do not look for random precedents; check her canon first.
2. **Analytical methodology**: `docs/decision-methodology.md`, especially sections ד.2 (start from the language of the text), ד.3 (three sources for the major premise), and ז (quotations and case-law citations).
3. **Daphna's personal precedents**: use `search_decisions` before proposing an external precedent. If Daphna has already ruled on an identical issue, her precedent is part of the canon.
4. Lessons from previous decisions: `docs/legal-decision-lessons.md`

## Document types you handle

@@ -69,8 +82,25 @@ tools:
- **Level of the precedent**: Supreme Court / administrative court / national appeals committee / district appeals committee
- **Binding halacha or obiter dictum**
- **How it will serve the reasoning structure**: as a "rule" (major premise), as an "explanation" (Explanation in CREAC), or as an analogy
- **Is it a precedent from Daphna's canon?** (Check `docs/daphna-precedent-network.md`; if so, note that it is her preferred precedent for the issue.)
4. Extract references (`extract_references`)

### Step 2b: cross-check against Daphna's canon
After you have collected the relevant case law in the case:
1. **For each legal issue** in the case, check `daphna-precedent-network.md`:
   - Does Daphna have a preferred precedent for the issue?
   - Was it presented in the pleadings? If not, mark it as a precedent to add.
2. **Personal precedents**: run `search_decisions` in the same category as the case. If Daphna has already ruled on a similar issue:
   - If the outcome is similar: use it as a precedent for doctrinal economy ("כפי שקבענו ב-X", "as we held in X").
   - If the outcome is the opposite: note that distinguishing is **mandatory**.
3. **Authoritative case-law corpus**: `search_precedent_library` is a semantic search over halachot approved by Daphna (Supreme Court / administrative court / other appeals committees). It returns rule_statement + supporting_quote + citation, ready for quotation in block י. If the parties referred to a judgment that is not in the corpus, add it via `precedent_attach` (to the case) or via the upload interface at `/precedents` (to the permanent corpus).
4. **Report** which canon precedents are relevant, which personal precedents were found, and which halachot from the authoritative corpus provide support.

**The three sources, do not confuse them:**
- `search_decisions` = Daphna's decisions (style_corpus).
- `search_precedent_library` = authoritative external case law with approved halachot.
- `precedent_search_library` = quotations Daphna manually attached to cases in the past (case_precedents).

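The rule for personal precedents in step 2b (similar outcome: cite for doctrinal economy; opposite outcome: distinguishing is mandatory) can be sketched as a small classifier. The outcome labels `acceptance` and `rejection`, the return labels, and the `review_manually` fallback for anything else (for example a partial acceptance) are assumptions of this sketch, not part of the agent instructions.

```python
# Pairs of outcomes treated as direct opposites; an assumption of this sketch.
OPPOSITE = {"acceptance": "rejection", "rejection": "acceptance"}


def personal_precedent_use(prior_outcome: str, current_outcome: str) -> str:
    """Classify how an earlier decision by the same chair should be used."""
    if prior_outcome == current_outcome:
        return "doctrinal_economy"   # cite as "as we held in X"
    if OPPOSITE.get(prior_outcome) == current_outcome:
        return "must_distinguish"    # opposite outcome: distinguishing is mandatory
    return "review_manually"         # e.g. partial acceptance: no rule stated
```

Whether two outcomes count as "similar" is ultimately a legal judgment; the equality test here is only the simplest stand-in for it.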
### Step 3: plan mapping
1. Read the plan's provisions **in full**, not only the clause being argued
2. Identify the clauses relevant to the dispute
@@ -84,33 +114,81 @@ tools:

### Step 5: reporting (mandatory!)

1. **Save the report to disk** (mandatory: the writer and QA read directly from this file):
   ```
   {case_dir}/documents/research/precedent-research.md
   ```
   Recommended structure: procedural background → map of assessments (if relevant) → issues + verified precedents for each → recommended direction. Every precedent comes with a full citation + exact quotation + context.

2. **Record the verified precedents in the DB.** This is mandatory; otherwise the writer gets an empty list when calling `precedent_list`.

   For every judgment that passed step 2 (case-law analysis) **and has an exact quotation from the source**, call `precedent_attach`:
   ```
   mcp__legal-ai__precedent_attach(
     case_number = "8174-24",
     citation = "בר\"מ 3644/13 הוועדה המקומית גבעתיים נ' גלר (פורסם בנבו, 24.05.2017)",
     quote = "exact quotation from the judgment: the specific passage relevant to the issue",
     section_id = "issue_2"  # or "threshold_1" for a threshold claim; empty if general
   )
   ```
   Precedents you could not verify (quotation not found, only "claimed to appear in the judgment"): **do not write them to the DB**. Mark them in a comment as "requires external verification" only.

3. **Update the status**: `case_update(case_number, status='research_complete')`

4. **Send an email**:
   ```bash
   python3 /home/chaim/legal-ai/scripts/notify.py \
     "Precedent research completed: appeal {case_number}" \
     "Summary: X judgments analyzed and recorded in the DB, Y plans mapped. Your review is required before continuing."
   ```

5. **Post a comment in Paperclip** with:
   - A summary of each judgment (2-3 lines each). **State explicitly how many precedents were recorded in the DB via `precedent_attach`.**
   - Mapping of the relevant plan provisions
   - Timeline of the proceedings
   - **A structured recommendation by source of reasoning:**
     - **Text**: which plan/statute clauses are central (quote the wording)
     - **Precedent**: which judgments are strongest (noting hierarchy and status: halacha/obiter)
     - **Policy**: which planning considerations arise from the material
   - Link to the file location: `{case_dir}/documents/research/precedent-research.md`

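The rule in item 2 above (attach only precedents that were verified against an exact quotation, and flag the rest in a comment) can be sketched as a simple partition. The `Precedent` dataclass, its fields, and `split_for_db` are hypothetical names for illustration; the actual `precedent_attach` call is made through the MCP tool as shown in the list.

```python
from dataclasses import dataclass


@dataclass
class Precedent:
    citation: str
    quote: str       # exact quotation from the source; empty if none was found
    verified: bool   # True only if the quotation was located in the judgment itself


def split_for_db(precedents: list[Precedent]) -> tuple[list[Precedent], list[Precedent]]:
    """Return (to_attach, needs_external_verification)."""
    to_attach, pending = [], []
    for p in precedents:
        # Both conditions are required: verified AND a non-empty exact quotation.
        if p.verified and p.quote.strip():
            to_attach.append(p)
        else:
            pending.append(p)
    return to_attach, pending
```

The length of `to_attach` is also the number the Paperclip comment is asked to state explicitly.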
### Close your own issue: mandatory!

Without this, Paperclip detects "issue in_progress + no live execution" and triggers auto-retry in a loop (observed in practice in CMPA-17 on 30/04/26: 4 unnecessary iterations until it was killed manually).

**If everything succeeded (all checks of the previous step passed, no failure in the output):**
```bash
curl -s -X PATCH -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
  -H "Content-Type: application/json" \
  "$PAPERCLIP_API_URL/api/issues/{issue-id}" \
  -d '{"status": "done"}'
```

**If checks failed, output is missing, or critical information is missing:**
```bash
curl -s -X PATCH -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
  -H "Content-Type: application/json" \
  "$PAPERCLIP_API_URL/api/issues/{issue-id}" \
  -d '{"status": "blocked"}'
```
**Never** finish as `done` with missing output. If anything failed, status = `blocked` + a comment with the details.

### Wake the legal assistant (CEO): mandatory!
```bash
# CEO per company: do not hardcode the UUID, different companies = different CEO
if [ "$PAPERCLIP_COMPANY_ID" = "8639e837-4c9d-47fa-a76b-95788d651896" ]; then
  CEO_ID="cdbfa8bc-3d61-41a4-a2e7-677ec7d34562"  # CMPA: betterment levies
else
  CEO_ID="752cebdd-6748-4a04-aacd-c7ab0294ef33"  # CMP: licensing and construction
fi

curl -s -X POST -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
  -H "Content-Type: application/json" \
  "$PAPERCLIP_API_URL/api/agents/$CEO_ID/wakeup" \
  -d '{"source":"automation","triggerDetail":"system","reason":"Precedent researcher finished task [issue-id] with status [done/blocked]","payload":{"issueId":"[issue-id]","mutation":"agent_completion"}}'
```

**⚠️ Do not run INSERT INTO agent_wakeup_requests directly!** A direct DB insert creates only the request, without a heartbeat_run, so the agent will never wake up. **Always use the API only.**

**⚠️ Do not hardcode the CEO's UUID.** The UUID differs per company. Always resolve it via `$PAPERCLIP_COMPANY_ID`. A wakeup sent to another company is rejected: `Agent key cannot access another company`.

## Rules
- **Accuracy**: cite clause numbers, dates, and judges' names
@@ -19,6 +19,10 @@ tools:
- mcp__legal-ai__save_block_content
- mcp__legal-ai__write_block
- mcp__legal-ai__search_decisions
- mcp__legal-ai__search_precedent_library
- mcp__legal-ai__precedent_library_get
- mcp__legal-ai__precedent_library_list
- mcp__legal-ai__halacha_review
- mcp__legal-ai__search_case_documents
- mcp__legal-ai__get_style_guide
- mcp__legal-ai__workflow_status
@@ -200,15 +204,43 @@ case_update(case_number, status="drafted")
- Word count for each block
- Weight ratios (% of the document)

### Close your own issue: mandatory!

Without this, Paperclip detects "issue in_progress + no live execution" and triggers auto-retry in a loop (observed in practice in CMPA-17 on 30/04/26: 4 unnecessary iterations until it was killed manually).

**If everything succeeded (all checks of the previous step passed, no failure in the output):**
```bash
curl -s -X PATCH -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
  -H "Content-Type: application/json" \
  "$PAPERCLIP_API_URL/api/issues/{issue-id}" \
  -d '{"status": "done"}'
```

**If checks failed, output is missing, or critical information is missing:**
```bash
curl -s -X PATCH -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
  -H "Content-Type: application/json" \
  "$PAPERCLIP_API_URL/api/issues/{issue-id}" \
  -d '{"status": "blocked"}'
```
**Never** finish as `done` with missing output. If anything failed, status = `blocked` + a comment with the details.

### Wake the legal assistant (CEO): mandatory!
```bash
# CEO per company: do not hardcode the UUID, different companies = different CEO
if [ "$PAPERCLIP_COMPANY_ID" = "8639e837-4c9d-47fa-a76b-95788d651896" ]; then
  CEO_ID="cdbfa8bc-3d61-41a4-a2e7-677ec7d34562"  # CMPA: betterment levies
else
  CEO_ID="752cebdd-6748-4a04-aacd-c7ab0294ef33"  # CMP: licensing and construction
fi

curl -s -X POST -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
  -H "Content-Type: application/json" \
  "$PAPERCLIP_API_URL/api/agents/$CEO_ID/wakeup" \
  -d '{"source":"automation","triggerDetail":"system","reason":"Decision writer finished task [issue-id] with status [done/blocked]","payload":{"issueId":"[issue-id]","mutation":"agent_completion"}}'
```

**⚠️ Do not run INSERT INTO agent_wakeup_requests directly!** A direct DB insert creates only the request, without a heartbeat_run, so the agent will never wake up. **Always use the API only.**

**⚠️ Do not hardcode the CEO's UUID.** The UUID differs per company. Always resolve it via `$PAPERCLIP_COMPANY_ID`. A wakeup sent to another company is rejected: `Agent key cannot access another company`.

**If you do not update the status to drafted, the quality checker cannot run!**

@@ -313,6 +345,20 @@ curl -s -X POST -H "Authorization: Bearer $PAPERCLIP_API_KEY" \

This is not decoration. Daphna is building an ongoing personal jurisprudence. See the example in 1194-25, paras. 61, 64, 97, 98, 99: five references to 1130-25.

### Searching authoritative external case law (mandatory)

After `search_decisions`, also search **`search_precedent_library`**: the corpus of rulings by higher courts and other appeals committees, with halachot that Daphna approved. This is the only source for case-law quotations in block י under CREAC:

- **rule**: state the binding rule from `rule_statement`. Do not invent a new formulation; use the approved one.
- **explanation**: quote `supporting_quote` in full, word for word. Every quotation must include `case_number` + `court` + a pinpoint reference (`page_reference` when available).

**Distinguishing between the tools:**
- `search_decisions` = Daphna's own decisions (style, strategy, personal jurisprudence).
- `search_precedent_library` = authoritative external case law (binding or persuasive: Supreme Court, administrative courts, other appeals committees).
- `precedent_search_library` (different!) = quotations Daphna manually attached to cases in the past. Do not confuse them.

Search by `practice_area` (rishuy_uvniya / betterment_levy / compensation_197) and by a relevant `subject_tag`. Halachot that Daphna has not approved are not returned by the tool; if the search comes back empty, fall back to `search_decisions` only.

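The search order described above (approved external corpus first, the chair's own decisions as the fallback when nothing comes back) can be sketched with the two tools passed in as plain callables. The keyword arguments `practice_area` and `subject_tag` follow the names used in the text, but the real MCP tool signatures are not shown in this document, so treat them and the function name `find_authorities` as assumptions.

```python
from typing import Callable

# Stand-in type for an MCP search tool; the real signatures are an assumption.
SearchFn = Callable[..., list[dict]]


def find_authorities(
    search_precedent_library: SearchFn,
    search_decisions: SearchFn,
    practice_area: str,
    subject_tag: str,
) -> tuple[str, list[dict]]:
    """Return (source_name, hits), preferring the approved external corpus."""
    hits = search_precedent_library(practice_area=practice_area, subject_tag=subject_tag)
    if hits:
        return "search_precedent_library", hits
    # Empty result: only approved halachot are returned, so fall back.
    fallback = search_decisions(practice_area=practice_area, subject_tag=subject_tag)
    return "search_decisions", fallback
```

Returning the source name alongside the hits lets the caller keep the two kinds of authority apart when quoting, in line with the tool distinction listed above.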
### Anti-patterns: post-writing check (mandatory)

- [ ] **No numbered lists inside a paragraph** (`(1)... (2)... (3)...`): Daphna never uses them
@@ -2,7 +2,7 @@
  "master": {
    "tasks": [
      {
        "id": 32,
        "title": "Set up the development environment and basic infrastructure",
        "description": "Set up the basic development environment with Python, FastAPI, PostgreSQL, and Infisical for secrets management",
        "details": "Create a Python project with FastAPI as the API server, PostgreSQL as the database, and Infisical for secrets management. Configure Docker containers for local development. Create the directory structure: /src, /tests, /docs, /data. Define requirements.txt with all required dependencies: fastapi, uvicorn, sqlalchemy, psycopg2, python-multipart, python-docx, PyPDF2, anthropic, infisical-python. Configure environment variables through Infisical.",
@@ -14,7 +14,7 @@
        "updatedAt": "2026-04-03T08:53:33.842Z"
      },
      {
        "id": 33,
        "title": "Document ingestion and processing module",
        "description": "Develop a module that ingests PDF, DOCX, and MD files and extracts text, including OCR",
        "details": "Create a DocumentProcessor class that handles files of different types. For PDF: use PyPDF2 for regular text and pytesseract for OCR of scanned files. For DOCX: use python-docx. For MD: read directly. Add automatic detection of scanned files. Create an API endpoint POST /documents/upload that accepts files and returns the extracted text. Store each document's metadata in the database.",
@@ -28,7 +28,7 @@
        "updatedAt": "2026-04-03T09:38:55.716Z"
      },
      {
        "id": 34,
        "title": "Document classification and party identification module",
        "description": "Develop a module that classifies documents by type (appeal, answer, protocol, etc.) and identifies the parties",
        "details": "Create a DocumentClassifier class that uses the Claude API to classify documents. Define a structured prompt that identifies: document type (appeal/answer/response/protocol/plan/permit/judgment/decision), parties (appellants, respondents, committee, permit applicants), and appeal type by case number (1xxx=licensing, 8xxx=betterment, 9xxx=compensation). Create a structured data model for storing the classified information. Add validation of the classification results.",
@@ -42,7 +42,7 @@
        "updatedAt": "2026-04-03T09:43:02.411Z"
      },
      {
        "id": 35,
        "title": "Claims extraction module",
        "description": "Develop a module that extracts and summarizes claims from pleadings, by party",
        "details": "Create a ClaimsExtractor class that extracts claims from appeal and answer pleadings. Use the Claude API with a dedicated prompt that identifies claims by party and summarizes them faithfully to the source. Create a data structure that links each claim to its source document and its location within it. Add a mechanism for detecting repeated or similar claims. Store the claims in the database with links to the case and the party.",
@@ -56,7 +56,7 @@
        "updatedAt": "2026-04-03T09:45:38.799Z"
      },
      {
        "id": 36,
        "title": "Plan and case-law identification module",
        "description": "Develop a module that identifies the plans applying to the land and the case law cited in the documents",
        "details": "Create a LegalReferencesExtractor class that identifies: plans (תב\"ע, תמ\"א, local plans), cited case law (with case numbers and year), and relevant legislation. Use regex patterns to detect common patterns and the Claude API for verification and refinement. Create a local repository of plans and case law already identified. Add a mechanism for validating the identified references.",
@@ -70,7 +70,7 @@
        "updatedAt": "2026-04-03T09:48:16.636Z"
      },
      {
        "id": 37,
        "title": "Outcome entry and brainstorming interface",
        "description": "Develop a CLI for entering the outcome (rejection/acceptance/partial) and a brainstorming mechanism",
        "details": "Create a CLI interface with typer that lets Chaim enter: outcome type (rejection/acceptance/partial acceptance) and reasoning (optional). If no reasoning is entered, run the BrainstormingEngine module, which presents the main claims and proposes 2-3 possible directions. Create an interactive dialogue between Chaim and the system until an agreed direction is reached. Save the final direction document. Add a mechanism that prevents writing the discussion without an approved direction.",
@@ -85,7 +85,7 @@
        "updatedAt": "2026-04-03T09:55:06.069Z"
      },
      {
        "id": 38,
        "title": "Opening block writing engine (block ה)",
        "description": "Develop an engine for writing the opening block in Daphna's style",
        "details": "Create an OpeningBlockWriter class that writes the opening block. Analyze the opening patterns from the 7 existing decisions (\"לפנינו\" vs \"עניינה של החלטה זו\"). Create a structured prompt that adapts the opening to the appeal type and the complexity of the case. Add a mechanism for selecting the appropriate opening wording. Store opening templates in the database.",
@@ -99,7 +99,7 @@
        "updatedAt": "2026-04-03T09:58:34.296Z"
      },
      {
        "id": 39,
        "title": "Background block writing engine (block ו)",
        "description": "Develop an engine for writing the background block neutrally",
        "details": "Create a BackgroundBlockWriter class that writes a neutral background. Define neutrality rules: no quotations from the parties, no judgmental words, facts only. Create a list of forbidden words and a validation mechanism. Use information from the classified documents to build the background. Add a mechanism for setting the background length by case complexity (3%-18% of the decision).",
@@ -113,7 +113,7 @@
        "updatedAt": "2026-04-03T09:58:34.300Z"
      },
      {
        "id": 40,
        "title": "Claims block writing engine (block ז)",
        "description": "Develop an engine for writing the summary of the parties' claims in the third person",
        "details": "Create a ClaimsBlockWriter class that summarizes claims in the third person. Use the claims extracted by the claims extraction module. Ensure absolute fidelity to the source: no changing of words or abridging without indication. Create a logical structure for presenting the claims by party. Add a mechanism for linking each claim to its exact source in the document.",
@@ -127,7 +127,7 @@
|
|||||||
"updatedAt": "2026-04-03T09:58:34.303Z"
|
"updatedAt": "2026-04-03T09:58:34.303Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "41",
|
"id": 41,
|
||||||
"title": "Proceedings block writing engine (Block ח)",
|
"title": "Proceedings block writing engine (Block ח)",
|
||||||
"description": "Develop an engine that writes the proceedings block (only when there were proceedings beyond a simple hearing)",
|
"description": "Develop an engine that writes the proceedings block (only when there were proceedings beyond a simple hearing)",
|
||||||
"details": "Create a ProceduresBlockWriter class that writes a chronological record of the proceedings. Automatically detect when the block is required (site visit, supplementary arguments, interim decisions). Build a timeline of events from the documents. Ensure factual accuracy and a chronological structure. Add a mechanism that decides automatically whether the block is required.",
|
"details": "Create a ProceduresBlockWriter class that writes a chronological record of the proceedings. Automatically detect when the block is required (site visit, supplementary arguments, interim decisions). Build a timeline of events from the documents. Ensure factual accuracy and a chronological structure. Add a mechanism that decides automatically whether the block is required.",
|
||||||
@@ -141,7 +141,7 @@
|
|||||||
"updatedAt": "2026-04-03T09:58:34.305Z"
|
"updatedAt": "2026-04-03T09:58:34.305Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "42",
|
"id": 42,
|
||||||
"title": "Plans block writing engine (Block ט)",
|
"title": "Plans block writing engine (Block ט)",
|
||||||
"description": "Develop an engine that writes the plans block and the normative framework",
|
"description": "Develop an engine that writes the plans block and the normative framework",
|
||||||
"details": "Create a PlansBlockWriter class that handles the listing of plans. Define decision rules for when a separate chapter is required (planning complexity, a legal question such as Section 152). Use the plan information identified by the plan-identification module. Create a hierarchical structure of the plans (national, district, local). Add a mechanism that sets the required level of detail.",
|
"details": "Create a PlansBlockWriter class that handles the listing of plans. Define decision rules for when a separate chapter is required (planning complexity, a legal question such as Section 152). Use the plan information identified by the plan-identification module. Create a hierarchical structure of the plans (national, district, local). Add a mechanism that sets the required level of detail.",
|
||||||
@@ -155,7 +155,7 @@
|
|||||||
"updatedAt": "2026-04-03T09:58:34.308Z"
|
"updatedAt": "2026-04-03T09:58:34.308Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "43",
|
"id": 43,
|
||||||
"title": "Discussion block writing engine (Block י) - the core of the system",
|
"title": "Discussion block writing engine (Block י) - the core of the system",
|
||||||
"description": "Develop the central writing engine for the discussion block using the CREAC method",
|
"description": "Develop the central writing engine for the discussion block using the CREAC method",
|
||||||
"details": "Create a DiscussionBlockWriter class, the core of the system. Implement the CREAC method: conclusion up front, legal rule, explanation, application to the case, conclusion. Ensure that every claim from Block ז is answered. Use the direction set in the brainstorming stage. Add a mechanism for avoiding duplication and for references to earlier blocks. Order the reasoning logically by importance.",
|
"details": "Create a DiscussionBlockWriter class, the core of the system. Implement the CREAC method: conclusion up front, legal rule, explanation, application to the case, conclusion. Ensure that every claim from Block ז is answered. Use the direction set in the brainstorming stage. Add a mechanism for avoiding duplication and for references to earlier blocks. Order the reasoning logically by importance.",
|
||||||
@@ -169,7 +169,7 @@
|
|||||||
"updatedAt": "2026-04-03T09:58:34.311Z"
|
"updatedAt": "2026-04-03T09:58:34.311Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "44",
|
"id": 44,
|
||||||
"title": "Summary block writing engine (Block יא)",
|
"title": "Summary block writing engine (Block יא)",
|
||||||
"description": "Develop an engine that writes the summary block with operative orders",
|
"description": "Develop an engine that writes the summary block with operative orders",
|
||||||
"details": "Create a SummaryBlockWriter class that writes operative orders. Derive the orders from the discussion written in Block י. Ensure an exact match with the ruling that was set. Create a clear structure for the orders (what is accepted, what is rejected, what the conditions are). Add a mechanism that validates consistency between the discussion and the summary.",
|
"details": "Create a SummaryBlockWriter class that writes operative orders. Derive the orders from the discussion written in Block י. Ensure an exact match with the ruling that was set. Create a clear structure for the orders (what is accepted, what is rejected, what the conditions are). Add a mechanism that validates consistency between the discussion and the summary.",
|
||||||
@@ -183,7 +183,7 @@
|
|||||||
"updatedAt": "2026-04-03T09:58:34.313Z"
|
"updatedAt": "2026-04-03T09:58:34.313Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "45",
|
"id": 45,
|
||||||
"title": "Formatted DOCX export engine",
|
"title": "Formatted DOCX export engine",
|
||||||
"description": "Develop an engine that exports the decision to a formatted DOCX file in Hebrew RTL",
|
"description": "Develop an engine that exports the decision to a formatted DOCX file in Hebrew RTL",
|
||||||
"details": "Create a DocxExporter class that produces a formatted DOCX. Set the David font, RTL direction, styled headings, and continuous section numbering. Create a base DOCX template with the formatting settings. Add a mechanism for marking image placeholders (GIS, plan drawing, site visit). Ensure full support for Hebrew and RTL direction. Create a hierarchical structure of headings and sections.",
|
"details": "Create a DocxExporter class that produces a formatted DOCX. Set the David font, RTL direction, styled headings, and continuous section numbering. Create a base DOCX template with the formatting settings. Add a mechanism for marking image placeholders (GIS, plan drawing, site visit). Ensure full support for Hebrew and RTL direction. Create a hierarchical structure of headings and sections.",
|
||||||
@@ -197,7 +197,7 @@
|
|||||||
"updatedAt": "2026-04-03T10:12:36.842Z"
|
"updatedAt": "2026-04-03T10:12:36.842Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "46",
|
"id": 46,
|
||||||
"title": "Quality control and validation mechanism",
|
"title": "Quality control and validation mechanism",
|
||||||
"description": "Develop a quality-control mechanism that validates the decision before output",
|
"description": "Develop a quality-control mechanism that validates the decision before output",
|
||||||
"details": "Create a QualityController class that checks: zero hallucinations (every reference checked against the supplied documents), a response to every claim, a neutral background (no judgmental wording), block weights within the golden ratios ±10%, and quotations faithful to the source. Produce a detailed validation report. Add a mechanism that blocks output when a critical validation fails.",
|
"details": "Create a QualityController class that checks: zero hallucinations (every reference checked against the supplied documents), a response to every claim, a neutral background (no judgmental wording), block weights within the golden ratios ±10%, and quotations faithful to the source. Produce a detailed validation report. Add a mechanism that blocks output when a critical validation fails.",
|
||||||
@@ -211,7 +211,7 @@
|
|||||||
"updatedAt": "2026-04-03T10:14:00.311Z"
|
"updatedAt": "2026-04-03T10:14:00.311Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "47",
|
"id": 47,
|
||||||
"title": "Learning loop module",
|
"title": "Learning loop module",
|
||||||
"description": "Develop a module that ingests the final version and compares it with the draft for learning",
|
"description": "Develop a module that ingests the final version and compares it with the draft for learning",
|
||||||
"details": "Create a LearningLoop class that receives the final version signed by Dafna. Compare the draft with the final version and identify the differences. Extract lessons: new expressions, changed patterns, recurring errors. Update the style model based on the lessons. Produce a learning report for Chaim. Store the lessons in the database for future improvement.",
|
"details": "Create a LearningLoop class that receives the final version signed by Dafna. Compare the draft with the final version and identify the differences. Extract lessons: new expressions, changed patterns, recurring errors. Update the style model based on the lessons. Produce a learning report for Chaim. Store the lessons in the database for future improvement.",
|
||||||
@@ -225,7 +225,7 @@
|
|||||||
"updatedAt": "2026-04-03T10:15:14.639Z"
|
"updatedAt": "2026-04-03T10:15:14.639Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "48",
|
"id": 48,
|
||||||
"title": "Success metrics and dashboard module",
|
"title": "Success metrics and dashboard module",
|
||||||
"description": "Develop a module that measures KPIs and provides a tracking dashboard",
|
"description": "Develop a module that measures KPIs and provides a tracking dashboard",
|
||||||
"details": "Create a MetricsTracker class that measures: percent change (draft compared with final version), time to draft (end to end), zero hallucinations (count of invalid references), a response to every claim, block weights, and a neutral background. Create a simple dashboard that shows the metrics over time. Add alerts when a metric drops below its minimum threshold.",
|
"details": "Create a MetricsTracker class that measures: percent change (draft compared with final version), time to draft (end to end), zero hallucinations (count of invalid references), a response to every claim, block weights, and a neutral background. Create a simple dashboard that shows the metrics over time. Add alerts when a metric drops below its minimum threshold.",
|
||||||
@@ -239,7 +239,7 @@
|
|||||||
"updatedAt": "2026-04-03T10:16:10.708Z"
|
"updatedAt": "2026-04-03T10:16:10.708Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "49",
|
"id": 49,
|
||||||
"title": "Secrets management and security mechanism",
|
"title": "Secrets management and security mechanism",
|
||||||
"description": "Implement a full security mechanism with Infisical and secrets management",
|
"description": "Implement a full security mechanism with Infisical and secrets management",
|
||||||
"details": "Configure Infisical to manage all secrets: Anthropic API key, database connection strings, encryption keys. Create an encryption mechanism for case materials in the database. Define access policy and permissions. Create an audit log mechanism for all operations. Ensure that case materials are not sent to any external service other than the Anthropic API.",
|
"details": "Configure Infisical to manage all secrets: Anthropic API key, database connection strings, encryption keys. Create an encryption mechanism for case materials in the database. Define access policy and permissions. Create an audit log mechanism for all operations. Ensure that case materials are not sent to any external service other than the Anthropic API.",
|
||||||
@@ -253,7 +253,7 @@
|
|||||||
"updatedAt": "2026-04-03T10:17:43.954Z"
|
"updatedAt": "2026-04-03T10:17:43.954Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "50",
|
"id": 50,
|
||||||
"title": "Backup and restore mechanism",
|
"title": "Backup and restore mechanism",
|
||||||
"description": "Implement an automatic daily backup and restore mechanism for the database",
|
"description": "Implement an automatic daily backup and restore mechanism for the database",
|
||||||
"details": "Create an automatic daily backup script for the PostgreSQL database. Set up a cron job that runs the backup at night. Create a restore-from-backup mechanism. Store the backups in a secure location. Add a mechanism that verifies backup integrity. Document the backup and restore procedures.",
|
"details": "Create an automatic daily backup script for the PostgreSQL database. Set up a cron job that runs the backup at night. Create a restore-from-backup mechanism. Store the backups in a secure location. Add a mechanism that verifies backup integrity. Document the backup and restore procedures.",
|
||||||
@@ -267,7 +267,7 @@
|
|||||||
"updatedAt": "2026-04-03T10:18:18.247Z"
|
"updatedAt": "2026-04-03T10:18:18.247Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "51",
|
"id": 51,
|
||||||
"title": "Full CLI and documentation",
|
"title": "Full CLI and documentation",
|
||||||
"description": "Develop a full CLI with all required commands and comprehensive documentation",
|
"description": "Develop a full CLI with all required commands and comprehensive documentation",
|
||||||
"details": "Create a comprehensive CLI with typer that covers: uploading documents, entering the outcome, brainstorming, generating a draft, entering the final version, and displaying metrics. Add detailed help for every command. Write comprehensive user documentation with usage examples. Add an input validation mechanism. Create an error-handling mechanism with clear error messages in Hebrew.",
|
"details": "Create a comprehensive CLI with typer that covers: uploading documents, entering the outcome, brainstorming, generating a draft, entering the final version, and displaying metrics. Add detailed help for every command. Write comprehensive user documentation with usage examples. Add an input validation mechanism. Create an error-handling mechanism with clear error messages in Hebrew.",
|
||||||
@@ -282,7 +282,7 @@
|
|||||||
"updatedAt": "2026-04-03T10:19:20.241Z"
|
"updatedAt": "2026-04-03T10:19:20.241Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "52",
|
"id": 52,
|
||||||
"title": "Integration tests and certification test",
|
"title": "Integration tests and certification test",
|
||||||
"description": "Create a comprehensive test suite and a certification test on a real case",
|
"description": "Create a comprehensive test suite and a certification test on a real case",
|
||||||
"details": "Create end-to-end integration tests for the whole process. Test with the Hecht case (a case that already has a final decision): compare the draft the system produces with the final decision. Measure the gap and verify that it is below 10%. Create a structured certification test to pass before operational use. Add performance tests that verify the system produces a draft within one working day.",
|
"details": "Create end-to-end integration tests for the whole process. Test with the Hecht case (a case that already has a final decision): compare the draft the system produces with the final decision. Measure the gap and verify that it is below 10%. Create a structured certification test to pass before operational use. Add performance tests that verify the system produces a draft within one working day.",
|
||||||
@@ -296,7 +296,7 @@
|
|||||||
"updatedAt": "2026-04-04T07:50:59.998Z"
|
"updatedAt": "2026-04-04T07:50:59.998Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "53",
|
"id": 53,
|
||||||
"title": "Add stage 6 - Dafna's review - to the functional requirements",
|
"title": "Add stage 6 - Dafna's review - to the functional requirements",
|
||||||
"description": "Define the Dafna review stage that is missing from the functional requirements, including the workflow and interfaces",
|
"description": "Define the Dafna review stage that is missing from the functional requirements, including the workflow and interfaces",
|
||||||
"details": "The functional requirements must define: (1) how Dafna receives the draft in DOCX format, (2) how she returns comments and corrections (an interface or a structured format), (3) who uploads the final version to the learning loop. Includes defining API endpoints for receiving the draft and returning comments, and a mechanism for updating the model based on the feedback.",
|
"details": "The functional requirements must define: (1) how Dafna receives the draft in DOCX format, (2) how she returns comments and corrections (an interface or a structured format), (3) who uploads the final version to the learning loop. Includes defining API endpoints for receiving the draft and returning comments, and a mechanism for updating the model based on the feedback.",
|
||||||
@@ -308,7 +308,7 @@
|
|||||||
"updatedAt": "2026-04-02T20:58:19.827Z"
|
"updatedAt": "2026-04-02T20:58:19.827Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "54",
|
"id": 54,
|
||||||
"title": "Replace the 'zero hallucinations' requirement with a grounding and validation mechanism",
|
"title": "Replace the 'zero hallucinations' requirement with a grounding and validation mechanism",
|
||||||
"description": "Replace the unrealistic zero-hallucinations requirement with an advanced grounding mechanism and an automatic validation system",
|
"description": "Replace the unrealistic zero-hallucinations requirement with an advanced grounding mechanism and an automatic validation system",
|
||||||
"details": "Implement a grounding mechanism that links every reference to a specific source document, with citation tracking. Develop an automatic validation system that checks every quotation/reference against the supplied documents. Define the metric: rate of references that fail validation = 0. Includes a mechanism for flagging suspicious references and requiring manual approval.",
|
"details": "Implement a grounding mechanism that links every reference to a specific source document, with citation tracking. Develop an automatic validation system that checks every quotation/reference against the supplied documents. Define the metric: rate of references that fail validation = 0. Includes a mechanism for flagging suspicious references and requiring manual approval.",
|
||||||
@@ -320,7 +320,7 @@
|
|||||||
"updatedAt": "2026-04-02T20:58:55.741Z"
|
"updatedAt": "2026-04-02T20:58:55.741Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "55",
|
"id": 55,
|
||||||
"title": "Add context window overflow management",
|
"title": "Add context window overflow management",
|
||||||
"description": "Develop a mechanism for handling complex cases that exceed the model's context window",
|
"description": "Develop a mechanism for handling complex cases that exceed the model's context window",
|
||||||
"details": "Implement measurement of material size in tokens, plus a smart chunking strategy and/or summarization of long documents. Define an alert threshold for approaching the context window limit. Develop an algorithm that prioritizes documents and decides which parts to include in the current context.",
|
"details": "Implement measurement of material size in tokens, plus a smart chunking strategy and/or summarization of long documents. Define an alert threshold for approaching the context window limit. Develop an algorithm that prioritizes documents and decides which parts to include in the current context.",
|
||||||
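Reviewer note on task 55: the prioritization and alert-threshold steps could look roughly like the sketch below. It is only an illustration under stated assumptions: the `Doc` fields, the "lower number means higher priority" convention, and the 80% warning ratio are hypothetical and do not come from the task file. Token counts are assumed to be measured upstream.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    name: str
    tokens: int    # token count measured upstream (hypothetical field)
    priority: int  # lower number = more important (hypothetical convention)


def select_documents(docs, context_limit, warn_ratio=0.8):
    """Greedily pick documents by priority until the token budget is exhausted.

    Returns (selected docs, tokens used, warning flag). The warning flag is
    raised once usage reaches warn_ratio of the context limit.
    """
    selected, used = [], 0
    for doc in sorted(docs, key=lambda d: d.priority):
        # Skip any document that would push the total past the limit.
        if used + doc.tokens <= context_limit:
            selected.append(doc)
            used += doc.tokens
    warning = used >= warn_ratio * context_limit
    return selected, used, warning


docs = [
    Doc("appeal", 500, 1),
    Doc("response", 400, 2),
    Doc("appendix", 300, 3),
]
chosen, used, warning = select_documents(docs, context_limit=1000)
print([d.name for d in chosen], used, warning)
```

A skipped document would be a natural candidate for the chunking or summarization path that the task also mentions.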
@@ -332,7 +332,7 @@
|
|||||||
"updatedAt": "2026-04-02T20:59:34.704Z"
|
"updatedAt": "2026-04-02T20:59:34.704Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "56",
|
"id": 56,
|
||||||
"title": "Precise mathematical definition of 'percent change'",
|
"title": "Precise mathematical definition of 'percent change'",
|
||||||
"description": "A clear, mathematical definition of the percent-change metric with concrete examples",
|
"description": "A clear, mathematical definition of the percent-change metric with concrete examples",
|
||||||
"details": "Define the percent-change metric based on edit distance over words (not characters). Count changes: insertion, deletion, and substitution of words. Formula: (number of changes / total words in the original text) * 100. Includes detailed examples and edge cases such as word reordering, punctuation changes, and handling of new sections.",
|
"details": "Define the percent-change metric based on edit distance over words (not characters). Count changes: insertion, deletion, and substitution of words. Formula: (number of changes / total words in the original text) * 100. Includes detailed examples and edge cases such as word reordering, punctuation changes, and handling of new sections.",
|
||||||
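Reviewer note on task 56: the formula in the task (number of word-level changes divided by the total words in the original text, times 100) can be sketched as below. This is a minimal illustration, assuming plain whitespace tokenization and a standard Levenshtein distance with unit costs; the function names are hypothetical, and the task itself leaves punctuation handling and new sections as open edge cases.

```python
def word_edit_distance(original: str, revised: str) -> int:
    """Levenshtein distance over word tokens.

    Insertion, deletion, and substitution of a word each cost 1.
    Tokenization is plain whitespace splitting (an assumption).
    """
    a, b = original.split(), revised.split()
    prev = list(range(len(b) + 1))  # distance from the empty prefix of a
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(
                prev[j] + 1,         # delete word_a
                curr[j - 1] + 1,     # insert word_b
                prev[j - 1] + cost,  # keep or substitute
            ))
        prev = curr
    return prev[-1]


def percent_change(original: str, revised: str) -> float:
    """(number of changes / total words in the original text) * 100."""
    total = len(original.split())
    if total == 0:
        raise ValueError("original text has no words")
    return word_edit_distance(original, revised) / total * 100


print(percent_change("a b c d", "a x c d"))  # one substitution out of four words
print(percent_change("a b", "b a"))          # swapping two words counts as two changes
```

Under this definition a pure reordering of two adjacent words already counts as two changes, which is one of the edge cases the task asks to document.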
@@ -344,7 +344,7 @@
|
|||||||
"updatedAt": "2026-04-02T21:00:03.477Z"
|
"updatedAt": "2026-04-02T21:00:03.477Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "57",
|
"id": 57,
|
||||||
"title": "Add requirements for blocks א-ד and יב",
|
"title": "Add requirements for blocks א-ד and יב",
|
||||||
"description": "Define functional requirements for the missing blocks: header, panel, parties, and signatures",
|
"description": "Define functional requirements for the missing blocks: header, panel, parties, and signatures",
|
||||||
"details": "Define detailed requirements for Block א (case header), Block ב (tribunal panel), Block ג (identification of the parties), Block ד (additional details about the parties), and Block יב (signatures). Includes the output format, the information sources, and the processing rules for each block. Align with the standard ruling template.",
|
"details": "Define detailed requirements for Block א (case header), Block ב (tribunal panel), Block ג (identification of the parties), Block ד (additional details about the parties), and Block יב (signatures). Includes the output format, the information sources, and the processing rules for each block. Align with the standard ruling template.",
|
||||||
@@ -358,7 +358,7 @@
|
|||||||
"updatedAt": "2026-04-02T20:58:19.831Z"
|
"updatedAt": "2026-04-02T20:58:19.831Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "58",
|
"id": 58,
|
||||||
"title": "Implement an intermediate state persistence mechanism",
|
"title": "Implement an intermediate state persistence mechanism",
|
||||||
"description": "Develop a system for saving work state and recovering from system crashes",
|
"description": "Develop a system for saving work state and recovering from system crashes",
|
||||||
"details": "Implement an auto-save mechanism that saves the work state every few minutes. Save intermediate versions of every block, track the current stage in the process, and provide a recovery mechanism that resumes work from the last saved point. Includes a user interface for choosing a restore point.",
|
"details": "Implement an auto-save mechanism that saves the work state every few minutes. Save intermediate versions of every block, track the current stage in the process, and provide a recovery mechanism that resumes work from the last saved point. Includes a user interface for choosing a restore point.",
|
||||||
@@ -370,7 +370,7 @@
|
|||||||
"updatedAt": "2026-04-02T21:01:07.799Z"
|
"updatedAt": "2026-04-02T21:01:07.799Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "59",
|
"id": 59,
|
||||||
"title": "Fix the stage count in the tracking table",
|
"title": "Fix the stage count in the tracking table",
|
||||||
"description": "Update the tracking table to match the actual number of stages",
|
"description": "Update the tracking table to match the actual number of stages",
|
||||||
"details": "Update the table to state 7 stages instead of 6, including the new Dafna review stage. Update every reference to the number of stages in the requirements documents and the documentation. Verify consistency across all documents.",
|
"details": "Update the table to state 7 stages instead of 6, including the new Dafna review stage. Update every reference to the number of stages in the requirements documents and the documentation. Verify consistency across all documents.",
|
||||||
@@ -384,7 +384,7 @@
|
|||||||
"updatedAt": "2026-04-02T21:01:45.876Z"
|
"updatedAt": "2026-04-02T21:01:45.876Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "60",
|
"id": 60,
|
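||||||
"title": "Recognize an MVP for licensing and betterment levy only",
|
"title": "Recognize an MVP for licensing and betterment levy only",
|
||||||
"description": "Define a first version that covers only licensing and betterment levy, because training data for compensation is lacking",
|
"description": "Define a first version that covers only licensing and betterment levy, because training data for compensation is lacking",
|
||||||
"details": "Define an MVP that focuses on licensing and betterment levy only. Document the current limitations regarding compensation and a plan for collecting training data in the future. Define criteria for extending to compensation in future versions. Update the success metrics to reflect the limitations of the first version.",
|
"details": "Define an MVP that focuses on licensing and betterment levy only. Document the current limitations regarding compensation and a plan for collecting training data in the future. Define criteria for extending to compensation in future versions. Update the success metrics to reflect the limitations of the first version.",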
|
||||||
@@ -396,7 +396,7 @@
|
|||||||
"updatedAt": "2026-04-02T21:01:45.879Z"
|
"updatedAt": "2026-04-02T21:01:45.879Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "61",
|
"id": 61,
|
||||||
"title": "Re-examine the 98% change-rate target",
|
"title": "Re-examine the 98% change-rate target",
|
||||||
"description": "Reassess how realistic the 98% target is, based on Endsley's research on expert behavior",
|
"description": "Reassess how realistic the 98% target is, based on Endsley's research on expert behavior",
|
||||||
"details": "Research-based analysis of expert behavior and experts' tendency to make changes. Define a more realistic target that accounts for psychological factors. Propose alternative success metrics such as the rate of significant changes or expert satisfaction. Includes defining a baseline from historical data, if available.",
|
"details": "Research-based analysis of expert behavior and experts' tendency to make changes. Define a more realistic target that accounts for psychological factors. Propose alternative success metrics such as the rate of significant changes or expert satisfaction. Includes defining a baseline from historical data, if available.",
|
||||||
@@ -408,7 +408,7 @@
|
|||||||
"updatedAt": "2026-04-02T21:02:13.446Z"
|
"updatedAt": "2026-04-02T21:02:13.446Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "62",
|
"id": 62,
|
||||||
"title": "Define the learning loop mechanism",
|
"title": "Define the learning loop mechanism",
|
||||||
"description": "Develop a mechanism for updating the model based on feedback from Dafna and users",
|
"description": "Develop a mechanism for updating the model based on feedback from Dafna and users",
|
||||||
"details": "Define the model update strategy: fine-tuning vs. prompt engineering vs. updating the RAG. Implement structured feedback collection, processing of the data into a format suitable for training, and an automatic or semi-automatic update process. Includes an A/B testing mechanism for evaluating improvements.",
|
"details": "Define the model update strategy: fine-tuning vs. prompt engineering vs. updating the RAG. Implement structured feedback collection, processing of the data into a format suitable for training, and an automatic or semi-automatic update process. Includes an A/B testing mechanism for evaluating improvements.",
|
||||||
@@ -423,7 +423,7 @@
|
|||||||
"updatedAt": "2026-04-02T21:02:32.651Z"
|
"updatedAt": "2026-04-02T21:02:32.651Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "63",
|
"id": 63,
|
||||||
"title": "Add protection against prompt injection",
|
"title": "Add protection against prompt injection",
|
||||||
"description": "Implement a mechanism that protects against prompt injection from external source documents",
|
"description": "Implement a mechanism that protects against prompt injection from external source documents",
|
||||||
"details": "Develop a filtering and sanitization mechanism for input documents that detects prompt injection attempts. Implement validation of document content, separation between system instructions and document content, and flagging of suspicious documents. Includes a blocklist of dangerous patterns.",
|
"details": "Develop a filtering and sanitization mechanism for input documents that detects prompt injection attempts. Implement validation of document content, separation between system instructions and document content, and flagging of suspicious documents. Includes a blocklist of dangerous patterns.",
|
||||||
@@ -437,7 +437,7 @@
|
|||||||
"updatedAt": "2026-04-02T21:02:49.768Z"
|
"updatedAt": "2026-04-02T21:02:49.768Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "64",
|
"id": 64,
|
||||||
"title": "Add a back-flows mechanism to the process",
|
"title": "Add a back-flows mechanism to the process",
|
||||||
"description": "Implement the ability to go back in the process to edit earlier blocks or change direction",
|
"description": "Implement the ability to go back in the process to edit earlier blocks or change direction",
|
||||||
"details": "Develop an interface for returning to earlier stages of the process. Provide a mechanism for editing completed blocks, automatic updating of dependent blocks, and change tracking. Includes warnings to the user about the effect of changes on other blocks and the option to undo actions.",
|
"details": "Develop an interface for returning to earlier stages of the process. Provide a mechanism for editing completed blocks, automatic updating of dependent blocks, and change tracking. Includes warnings to the user about the effect of changes on other blocks and the option to undo actions.",
|
||||||
@@ -451,7 +451,7 @@
|
|||||||
"updatedAt": "2026-04-02T21:01:07.801Z"
|
"updatedAt": "2026-04-02T21:01:07.801Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "65",
|
"id": 65,
|
||||||
"title": "Add a QA/validation stage before sending to Dafna",
|
"title": "Add a QA/validation stage before sending to Dafna",
|
||||||
"description": "Implement an automatic checklist and a QA mechanism before the final output",
|
"description": "Implement an automatic checklist and a QA mechanism before the final output",
|
||||||
"details": "Develop an automatic checklist that checks the completeness of all blocks, format validity, the presence of all required components, and internal consistency. Provide validation of quotations and references, a language-quality check, and warnings about potential problems. Includes a detailed QA report for the user.",
|
"details": "Develop an automatic checklist that checks the completeness of all blocks, format validity, the presence of all required components, and internal consistency. Provide validation of quotations and references, a language-quality check, and warnings about potential problems. Includes a detailed QA report for the user.",
|
||||||
@@ -466,7 +466,7 @@
|
|||||||
"updatedAt": "2026-04-02T21:03:09.658Z"
|
"updatedAt": "2026-04-02T21:03:09.658Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "66",
|
"id": 66,
|
||||||
"title": "Implement block version management",
|
"title": "Implement block version management",
|
||||||
"description": "Develop a version management system for each block separately",
|
"description": "Develop a version management system for each block separately",
|
||||||
"details": "Implement version control for each block separately: change history, comparison between versions, and the ability to revert a specific block to an earlier version. Includes a graphical interface that displays the differences between versions, and metadata for every change (time, user, reason).",
|
"details": "Implement version control for each block separately: change history, comparison between versions, and the ability to revert a specific block to an earlier version. Includes a graphical interface that displays the differences between versions, and metadata for every change (time, user, reason).",
|
||||||
@@ -480,7 +480,7 @@
|
|||||||
"updatedAt": "2026-04-02T21:04:33.961Z"
|
"updatedAt": "2026-04-02T21:04:33.961Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "67",
|
"id": 67,
|
||||||
"title": "Handle case consolidation",
|
"title": "Handle case consolidation",
|
||||||
"description": "Develop a mechanism for handling consolidated cases, as in the Arieli 1078+1083 case",
|
"description": "Develop a mechanism for handling consolidated cases, as in the Arieli 1078+1083 case",
|
||||||
"details": "Implement logic for identifying related cases and an automatic or semi-automatic consolidation mechanism. Handle overlapping information, resolve conflicts, and preserve the links between the consolidated cases. Includes a user interface for approving and editing the proposed consolidation.",
|
"details": "Implement logic for identifying related cases and an automatic or semi-automatic consolidation mechanism. Handle overlapping information, resolve conflicts, and preserve the links between the consolidated cases. Includes a user interface for approving and editing the proposed consolidation.",
|
||||||
@@ -495,7 +495,7 @@
|
|||||||
"updatedAt": "2026-04-02T21:04:33.964Z"
|
"updatedAt": "2026-04-02T21:04:33.964Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "68",
|
"id": 68,
|
||||||
"title": "Fix the brainstorming LOA",
|
"title": "Fix the brainstorming LOA",
|
||||||
"description": "Correct the brainstorming level of automation from level ג to level ב",
|
"description": "Correct the brainstorming level of automation from level ג to level ב",
|
||||||
"details": "Update the level of automation (LOA) defined for the brainstorming process from level ג (full automation) to level ב (automation with human oversight). Update all relevant documents and interfaces. Ensure alignment with the required level of review.",
|
"details": "Update the level of automation (LOA) defined for the brainstorming process from level ג (full automation) to level ב (automation with human oversight). Update all relevant documents and interfaces. Ensure alignment with the required level of review.",
|
||||||
@@ -507,7 +507,7 @@
|
|||||||
"updatedAt": "2026-04-02T21:04:33.967Z"
|
"updatedAt": "2026-04-02T21:04:33.967Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "69",
|
"id": 69,
|
||||||
"title": "Define brainstorming as optional",
|
"title": "Define brainstorming as optional",
|
||||||
"description": "Change the brainstorming definition to optional, including in cases where reasoning already exists",
|
"description": "Change the brainstorming definition to optional, including in cases where reasoning already exists",
|
||||||
"details": "Update the logic so that brainstorming is optional in all cases, including when basic reasoning exists. Add an option for the user to choose whether to run brainstorming or skip it. Update the user interface and the requirements accordingly.",
|
"details": "Update the logic so that brainstorming is optional in all cases, including when basic reasoning exists. Add an option for the user to choose whether to run brainstorming or skip it. Update the user interface and the requirements accordingly.",
|
||||||
@@ -521,7 +521,7 @@
|
|||||||
"updatedAt": "2026-04-02T21:04:33.969Z"
|
"updatedAt": "2026-04-02T21:04:33.969Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "70",
|
"id": 70,
|
||||||
"title": "Add structural neutrality",
|
"title": "Add structural neutrality",
|
||||||
"description": "Extend the neutrality requirements from lexical to structural",
|
"description": "Extend the neutrality requirements from lexical to structural",
|
||||||
"details": "Define rules for structural neutrality in addition to lexical neutrality: the order in which arguments are presented, the relative length of sections, the placement of information, and the structure of the ruling. Develop an automatic check that detects structural bias and warns the user. Includes guidelines for balanced writing.",
|
"details": "Define rules for structural neutrality in addition to lexical neutrality: the order in which arguments are presented, the relative length of sections, the placement of information, and the structure of the ruling. Develop an automatic check that detects structural bias and warns the user. Includes guidelines for balanced writing.",
|
||||||
@@ -535,7 +535,7 @@
|
|||||||
"updatedAt": "2026-04-02T21:04:33.973Z"
|
"updatedAt": "2026-04-02T21:04:33.973Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "71",
|
"id": 71,
|
||||||
"title": "Parasuraman 4-stage mapping",
|
"title": "Parasuraman 4-stage mapping",
|
||||||
"description": "Extend the mapping from LOA only to all 4 stages of the Parasuraman model",
|
"description": "Extend the mapping from LOA only to all 4 stages of the Parasuraman model",
|
||||||
"details": "Fully map the process onto Parasuraman's 4 stages: Information acquisition, Information analysis, Decision selection, Action implementation. Define the level of automation for each stage separately rather than a single overall LOA. Update the documentation and the requirements accordingly.",
|
"details": "Fully map the process onto Parasuraman's 4 stages: Information acquisition, Information analysis, Decision selection, Action implementation. Define the level of automation for each stage separately rather than a single overall LOA. Update the documentation and the requirements accordingly.",
|
||||||
@@ -549,7 +549,7 @@
|
|||||||
"updatedAt": "2026-04-02T21:04:33.976Z"
|
"updatedAt": "2026-04-02T21:04:33.976Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "72",
|
"id": 72,
|
||||||
"title": "Define per-block performance requirements and synchronous/asynchronous processing",
|
"title": "Define per-block performance requirements and synchronous/asynchronous processing",
|
||||||
"description": "Define detailed performance requirements for each block and choose between synchronous and asynchronous processing",
|
"description": "Define detailed performance requirements for each block and choose between synchronous and asynchronous processing",
|
||||||
"details": "Define a specific SLA for each block: maximum response times, required throughput, and availability rate. Decide on the processing architecture: synchronous for critical blocks, asynchronous for heavy blocks. Implement performance monitoring and alerts when the standards are exceeded.",
|
"details": "Define a specific SLA for each block: maximum response times, required throughput, and availability rate. Decide on the processing architecture: synchronous for critical blocks, asynchronous for heavy blocks. Implement performance monitoring and alerts when the standards are exceeded.",
|
||||||
@@ -563,7 +563,7 @@
|
|||||||
"updatedAt": "2026-04-02T21:04:33.980Z"
|
"updatedAt": "2026-04-02T21:04:33.980Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "73",
|
"id": 73,
|
||||||
"title": "Extend the DB schema for the full process",
|
"title": "Extend the DB schema for the full process",
|
||||||
"description": "Add the missing fields and tables needed to support the full legal decision-writing process",
|
"description": "Add the missing fields and tables needed to support the full legal decision-writing process",
|
||||||
"details": "In db.py:\n1. Add fields to the decisions table:\n - direction_doc JSONB - stores the direction document\n - outcome_reasoning TEXT - reasoning for the outcome\n2. Extend the status enum in the cases table to 13 values:\n ['new', 'uploading', 'processing', 'documents_ready', 'outcome_set', 'brainstorming', 'direction_approved', 'drafting', 'qa_review', 'drafted', 'exported', 'reviewed', 'final']\n3. Create a new qa_results table:\n - id SERIAL PRIMARY KEY\n - case_number VARCHAR REFERENCES cases\n - validation_type VARCHAR\n - passed BOOLEAN\n - errors JSONB\n - created_at TIMESTAMP\n4. Implement as a migration with Alembic",
|
"details": "In db.py:\n1. Add fields to the decisions table:\n - direction_doc JSONB - stores the direction document\n - outcome_reasoning TEXT - reasoning for the outcome\n2. Extend the status enum in the cases table to 13 values:\n ['new', 'uploading', 'processing', 'documents_ready', 'outcome_set', 'brainstorming', 'direction_approved', 'drafting', 'qa_review', 'drafted', 'exported', 'reviewed', 'final']\n3. Create a new qa_results table:\n - id SERIAL PRIMARY KEY\n - case_number VARCHAR REFERENCES cases\n - validation_type VARCHAR\n - passed BOOLEAN\n - errors JSONB\n - created_at TIMESTAMP\n4. Implement as a migration with Alembic",
|
||||||
@@ -575,7 +575,7 @@
|
|||||||
"updatedAt": "2026-04-03T08:54:55.256Z"
|
"updatedAt": "2026-04-03T08:54:55.256Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "74",
|
"id": 74,
|
||||||
"title": "הוספת 5 API endpoints חדשים ב-MCP server",
|
"title": "הוספת 5 API endpoints חדשים ב-MCP server",
|
||||||
"description": "יצירת endpoints חדשים לתמיכה בתהליך כתיבת ההחלטות",
|
"description": "יצירת endpoints חדשים לתמיכה בתהליך כתיבת ההחלטות",
|
||||||
"details": "בקובץ server.py או בקבצי API:\n1. POST /api/cases/{case_number}/outcome\n - קבלת: {outcome: string, reasoning: string}\n - שמירה ב-DB\n - עדכון סטטוס ל-outcome_set\n2. GET /api/cases/{case_number}/claims\n - החזרת טענות מחולצות מה-JSONB\n3. POST /api/cases/{case_number}/direction\n - קבלת מסמך כיוון כ-JSON\n - שמירה בשדה direction_doc\n - עדכון סטטוס ל-direction_approved\n4. POST /api/cases/{case_number}/qa\n - הרצת בדיקות QA\n - שמירה בטבלת qa_results\n - החזרת תוצאות\n5. POST /api/cases/{case_number}/learn\n - הפעלת לולאת למידה\n - עדכון מודלים/פרמטרים",
|
"details": "בקובץ server.py או בקבצי API:\n1. POST /api/cases/{case_number}/outcome\n - קבלת: {outcome: string, reasoning: string}\n - שמירה ב-DB\n - עדכון סטטוס ל-outcome_set\n2. GET /api/cases/{case_number}/claims\n - החזרת טענות מחולצות מה-JSONB\n3. POST /api/cases/{case_number}/direction\n - קבלת מסמך כיוון כ-JSON\n - שמירה בשדה direction_doc\n - עדכון סטטוס ל-direction_approved\n4. POST /api/cases/{case_number}/qa\n - הרצת בדיקות QA\n - שמירה בטבלת qa_results\n - החזרת תוצאות\n5. POST /api/cases/{case_number}/learn\n - הפעלת לולאת למידה\n - עדכון מודלים/פרמטרים",
|
||||||
@@ -589,7 +589,7 @@
|
|||||||
"updatedAt": "2026-04-03T08:55:56.839Z"
|
"updatedAt": "2026-04-03T08:55:56.839Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "75",
|
"id": 75,
|
||||||
"title": "הוספת 8 tools חדשים לפלאגין Paperclip",
|
"title": "הוספת 8 tools חדשים לפלאגין Paperclip",
|
||||||
"description": "הרחבת הפלאגין עם כלים חדשים לאינטראקציה עם המערכת המשפטית",
|
"description": "הרחבת הפלאגין עם כלים חדשים לאינטראקציה עם המערכת המשפטית",
|
||||||
"details": "1. בקובץ src/worker.ts - הוספת 8 tools:\n - legal_document_upload: העלאת מסמך\n - legal_document_list: רשימת מסמכים\n - legal_document_text: קריאת טקסט ממסמך\n - legal_search_case: חיפוש תיק\n - legal_find_similar: מציאת תקדימים\n - legal_set_outcome: הגדרת תוצאה\n - legal_get_claims: קבלת טענות\n - legal_style_guide: קבלת הנחיות סגנון\n\n2. בקובץ src/legal-api.ts - יישום 8 methods:\n ```typescript\n async uploadDocument(caseNumber: string, file: File) {...}\n async listDocuments(caseNumber: string) {...}\n async getDocumentText(docId: string) {...}\n async searchCase(query: string) {...}\n async findSimilar(caseNumber: string) {...}\n async setOutcome(caseNumber: string, outcome: string, reasoning: string) {...}\n async getClaims(caseNumber: string) {...}\n async getStyleGuide() {...}\n ```\n\n3. בקובץ plugin.json - עדכון manifest",
|
"details": "1. בקובץ src/worker.ts - הוספת 8 tools:\n - legal_document_upload: העלאת מסמך\n - legal_document_list: רשימת מסמכים\n - legal_document_text: קריאת טקסט ממסמך\n - legal_search_case: חיפוש תיק\n - legal_find_similar: מציאת תקדימים\n - legal_set_outcome: הגדרת תוצאה\n - legal_get_claims: קבלת טענות\n - legal_style_guide: קבלת הנחיות סגנון\n\n2. בקובץ src/legal-api.ts - יישום 8 methods:\n ```typescript\n async uploadDocument(caseNumber: string, file: File) {...}\n async listDocuments(caseNumber: string) {...}\n async getDocumentText(docId: string) {...}\n async searchCase(query: string) {...}\n async findSimilar(caseNumber: string) {...}\n async setOutcome(caseNumber: string, outcome: string, reasoning: string) {...}\n async getClaims(caseNumber: string) {...}\n async getStyleGuide() {...}\n ```\n\n3. בקובץ plugin.json - עדכון manifest",
|
||||||
@@ -603,7 +603,7 @@
|
|||||||
"updatedAt": "2026-04-03T08:59:27.838Z"
|
"updatedAt": "2026-04-03T08:59:27.838Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "76",
|
"id": 76,
|
||||||
"title": "שיפור status sync ב-Paperclip",
|
"title": "שיפור status sync ב-Paperclip",
|
||||||
"description": "מיפוי מלא של 13 סטטוסים והוספת comments מפורטים",
|
"description": "מיפוי מלא של 13 סטטוסים והוספת comments מפורטים",
|
||||||
"details": "1. עדכון מיפוי סטטוסים:\n ```javascript\n const statusMapping = {\n 'new': 'תיק חדש',\n 'uploading': 'העלאת מסמכים',\n 'processing': 'עיבוד מסמכים',\n 'documents_ready': 'מסמכים מוכנים',\n 'outcome_set': 'תוצאה הוגדרה',\n 'brainstorming': 'גיבוש כיוון',\n 'direction_approved': 'כיוון אושר',\n 'drafting': 'כתיבת החלטה',\n 'qa_review': 'בדיקת איכות',\n 'drafted': 'טיוטה מוכנה',\n 'exported': 'יוצאה ל-DOCX',\n 'reviewed': 'נבדקה ע\"י עו\"ד',\n 'final': 'סופית'\n }\n ```\n\n2. הוספת comments אוטומטיים ב-Paperclip:\n - בכל מעבר סטטוס\n - עם timestamp\n - עם פירוט הפעולה\n\n3. עדכון job sync-case-status",
|
"details": "1. עדכון מיפוי סטטוסים:\n ```javascript\n const statusMapping = {\n 'new': 'תיק חדש',\n 'uploading': 'העלאת מסמכים',\n 'processing': 'עיבוד מסמכים',\n 'documents_ready': 'מסמכים מוכנים',\n 'outcome_set': 'תוצאה הוגדרה',\n 'brainstorming': 'גיבוש כיוון',\n 'direction_approved': 'כיוון אושר',\n 'drafting': 'כתיבת החלטה',\n 'qa_review': 'בדיקת איכות',\n 'drafted': 'טיוטה מוכנה',\n 'exported': 'יוצאה ל-DOCX',\n 'reviewed': 'נבדקה ע\"י עו\"ד',\n 'final': 'סופית'\n }\n ```\n\n2. הוספת comments אוטומטיים ב-Paperclip:\n - בכל מעבר סטטוס\n - עם timestamp\n - עם פירוט הפעולה\n\n3. עדכון job sync-case-status",
|
||||||
@@ -617,7 +617,7 @@
|
|||||||
"updatedAt": "2026-04-03T09:00:19.243Z"
|
"updatedAt": "2026-04-03T09:00:19.243Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "77",
|
"id": 77,
|
||||||
"title": "כתיבת SOUL.md לסוכנים",
|
"title": "כתיבת SOUL.md לסוכנים",
|
||||||
"description": "יצירת קבצי הנחיות לסוכני AI בעברית",
|
"description": "יצירת קבצי הנחיות לסוכני AI בעברית",
|
||||||
"details": "1. CEO Agent SOUL.md:\n ```markdown\n # CEO Agent - סוכן מנהל\n \n ## תפקיד\n ניהול תהליך כתיבת החלטה משפטית מקצה לקצה\n \n ## הנחיות\n - עבוד בעברית תמיד\n - נהל את התהליך לפי 13 הסטטוסים\n - התרע לחיים במקרים: תקלה טכנית, החלטה מורכבת, חריגה מזמנים\n - וודא שכל שלב הושלם לפני מעבר לבא\n \n ## מיפוי סטטוסים\n [רשימת 13 סטטוסים עם הסבר לכל אחד]\n ```\n\n2. Case Analyst Agent SOUL.md:\n ```markdown\n # Case Analyst - סוכן מנתח\n \n ## תפקיד\n ניתוח מסמכים משפטיים וחילוץ מידע\n \n ## הנחיות\n - נתח מסמכים בעברית\n - חלץ טענות מרכזיות\n - זהה תקדימים רלוונטיים\n - סכם עובדות מהותיות\n ```",
|
"details": "1. CEO Agent SOUL.md:\n ```markdown\n # CEO Agent - סוכן מנהל\n \n ## תפקיד\n ניהול תהליך כתיבת החלטה משפטית מקצה לקצה\n \n ## הנחיות\n - עבוד בעברית תמיד\n - נהל את התהליך לפי 13 הסטטוסים\n - התרע לחיים במקרים: תקלה טכנית, החלטה מורכבת, חריגה מזמנים\n - וודא שכל שלב הושלם לפני מעבר לבא\n \n ## מיפוי סטטוסים\n [רשימת 13 סטטוסים עם הסבר לכל אחד]\n ```\n\n2. Case Analyst Agent SOUL.md:\n ```markdown\n # Case Analyst - סוכן מנתח\n \n ## תפקיד\n ניתוח מסמכים משפטיים וחילוץ מידע\n \n ## הנחיות\n - נתח מסמכים בעברית\n - חלץ טענות מרכזיות\n - זהה תקדימים רלוונטיים\n - סכם עובדות מהותיות\n ```",
|
||||||
@@ -629,7 +629,7 @@
|
|||||||
"updatedAt": "2026-04-03T08:57:14.984Z"
|
"updatedAt": "2026-04-03T08:57:14.984Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "78",
|
"id": 78,
|
||||||
"title": "יישום skill /brainstorm",
|
"title": "יישום skill /brainstorm",
|
||||||
"description": "יצירת skill לגיבוש כיוון ההחלטה בשיתוף עם המשתמש",
|
"description": "יצירת skill לגיבוש כיוון ההחלטה בשיתוף עם המשתמש",
|
||||||
"details": "בקובץ skills/brainstorm.ts:\n```typescript\nexport async function brainstorm(caseNumber: string) {\n // שלב 1: הצגת טענות מרכזיות\n const claims = await api.getClaims(caseNumber);\n displayClaims(claims);\n \n // שלב 2: הצעת 2-3 כיוונים\n const directions = generateDirections(claims);\n displayDirections(directions);\n \n // שלב 3: דיון אינטראקטיבי\n let approved = false;\n while (!approved) {\n const feedback = await getUserFeedback();\n if (feedback.type === 'approve') {\n approved = true;\n } else {\n directions = refineDirections(directions, feedback);\n }\n }\n \n // שלב 4: יצירת מסמך כיוון\n const directionDoc = {\n mainDirection: directions.selected,\n keyPoints: directions.keyPoints,\n precedents: directions.precedents,\n approvedBy: 'user',\n timestamp: new Date()\n };\n \n // שלב 5: שמירה ועדכון סטטוס\n await api.saveDirection(caseNumber, directionDoc);\n}\n```",
|
"details": "בקובץ skills/brainstorm.ts:\n```typescript\nexport async function brainstorm(caseNumber: string) {\n // שלב 1: הצגת טענות מרכזיות\n const claims = await api.getClaims(caseNumber);\n displayClaims(claims);\n \n // שלב 2: הצעת 2-3 כיוונים\n const directions = generateDirections(claims);\n displayDirections(directions);\n \n // שלב 3: דיון אינטראקטיבי\n let approved = false;\n while (!approved) {\n const feedback = await getUserFeedback();\n if (feedback.type === 'approve') {\n approved = true;\n } else {\n directions = refineDirections(directions, feedback);\n }\n }\n \n // שלב 4: יצירת מסמך כיוון\n const directionDoc = {\n mainDirection: directions.selected,\n keyPoints: directions.keyPoints,\n precedents: directions.precedents,\n approvedBy: 'user',\n timestamp: new Date()\n };\n \n // שלב 5: שמירה ועדכון סטטוס\n await api.saveDirection(caseNumber, directionDoc);\n}\n```",
|
||||||
@@ -643,7 +643,7 @@
|
|||||||
"updatedAt": "2026-04-03T10:16:24.667Z"
|
"updatedAt": "2026-04-03T10:16:24.667Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "79",
|
"id": 79,
|
||||||
"title": "שיפור skill /draft-decision לכתיבה בלוק-אחרי-בלוק",
|
"title": "שיפור skill /draft-decision לכתיבה בלוק-אחרי-בלוק",
|
||||||
"description": "שדרוג מ-stub לכתיבה מלאה עם 12 בלוקים",
|
"description": "שדרוג מ-stub לכתיבה מלאה עם 12 בלוקים",
|
||||||
"details": "בקובץ skills/draft-decision.ts:\n```typescript\nconst BLOCKS = [\n {id: 'ה', name: 'כותרת', temperature: 0.3},\n {id: 'ו', name: 'פתיח', temperature: 0.5},\n {id: 'ז', name: 'רקע', temperature: 0.4},\n {id: 'ח', name: 'טענות הצדדים', temperature: 0.3},\n {id: 'ט', name: 'תמצית', temperature: 0.6},\n {id: 'י', name: 'דיון והכרעה', temperature: 0.7, model: 'opus'},\n {id: 'יא', name: 'סוף דבר', temperature: 0.5}\n];\n\nexport async function draftDecision(caseNumber: string) {\n const direction = await api.getDirection(caseNumber);\n const lastBlock = await getLastCompletedBlock(caseNumber);\n \n for (let i = getBlockIndex(lastBlock) + 1; i < BLOCKS.length; i++) {\n const block = BLOCKS[i];\n \n // כתיבת בלוק\n const content = await writeBlock(block, {\n direction,\n previousBlocks: await getPreviousBlocks(caseNumber, i),\n temperature: block.temperature,\n model: block.model || 'default'\n });\n \n // שמירה מיידית\n await saveBlock(caseNumber, block.id, content);\n \n // בלוק י - CREAC + thinking\n if (block.id === 'י') {\n await applyCREAC(content);\n await addThinkingTags(content);\n }\n }\n}\n\n// Recovery function\nexport async function recoverDraft(caseNumber: string) {\n const lastBlock = await getLastCompletedBlock(caseNumber);\n return draftDecision(caseNumber); // ממשיך מאיפה שנפל\n}\n```",
|
"details": "בקובץ skills/draft-decision.ts:\n```typescript\nconst BLOCKS = [\n {id: 'ה', name: 'כותרת', temperature: 0.3},\n {id: 'ו', name: 'פתיח', temperature: 0.5},\n {id: 'ז', name: 'רקע', temperature: 0.4},\n {id: 'ח', name: 'טענות הצדדים', temperature: 0.3},\n {id: 'ט', name: 'תמצית', temperature: 0.6},\n {id: 'י', name: 'דיון והכרעה', temperature: 0.7, model: 'opus'},\n {id: 'יא', name: 'סוף דבר', temperature: 0.5}\n];\n\nexport async function draftDecision(caseNumber: string) {\n const direction = await api.getDirection(caseNumber);\n const lastBlock = await getLastCompletedBlock(caseNumber);\n \n for (let i = getBlockIndex(lastBlock) + 1; i < BLOCKS.length; i++) {\n const block = BLOCKS[i];\n \n // כתיבת בלוק\n const content = await writeBlock(block, {\n direction,\n previousBlocks: await getPreviousBlocks(caseNumber, i),\n temperature: block.temperature,\n model: block.model || 'default'\n });\n \n // שמירה מיידית\n await saveBlock(caseNumber, block.id, content);\n \n // בלוק י - CREAC + thinking\n if (block.id === 'י') {\n await applyCREAC(content);\n await addThinkingTags(content);\n }\n }\n}\n\n// Recovery function\nexport async function recoverDraft(caseNumber: string) {\n const lastBlock = await getLastCompletedBlock(caseNumber);\n return draftDecision(caseNumber); // ממשיך מאיפה שנפל\n}\n```",
|
||||||
@@ -658,7 +658,7 @@
|
|||||||
"updatedAt": "2026-04-03T10:16:24.670Z"
|
"updatedAt": "2026-04-03T10:16:24.670Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "80",
|
"id": 80,
|
||||||
"title": "יישום skill /qa-validate",
|
"title": "יישום skill /qa-validate",
|
||||||
"description": "בדיקות איכות אוטומטיות על ההחלטה",
|
"description": "בדיקות איכות אוטומטיות על ההחלטה",
|
||||||
"details": "בקובץ skills/qa-validate.ts:\n```typescript\nexport async function qaValidate(caseNumber: string) {\n const decision = await api.getDecision(caseNumber);\n const documents = await api.getDocuments(caseNumber);\n const claims = await api.getClaims(caseNumber);\n \n const checks = [\n {\n name: 'grounding_check',\n fn: () => validateGrounding(decision, documents),\n critical: true\n },\n {\n name: 'claims_coverage',\n fn: () => validateClaimsCoverage(decision, claims),\n critical: true\n },\n {\n name: 'neutral_background',\n fn: () => validateNeutrality(decision.background),\n critical: false\n },\n {\n name: 'weights_range',\n fn: () => validateWeightsInRange(decision),\n critical: true\n },\n {\n name: 'sequential_numbering',\n fn: () => validateNumbering(decision),\n critical: false\n },\n {\n name: 'definitions',\n fn: () => validateDefinitions(decision),\n critical: false\n }\n ];\n \n const results = [];\n let hasErrors = false;\n \n for (const check of checks) {\n const result = await check.fn();\n results.push({...result, name: check.name});\n if (!result.passed && check.critical) {\n hasErrors = true;\n }\n }\n \n // שמירת תוצאות\n await api.saveQAResults(caseNumber, results);\n \n // חסימת ייצוא אם יש שגיאות קריטיות\n if (hasErrors) {\n await api.blockExport(caseNumber);\n throw new Error('QA failed - export blocked');\n }\n \n return results;\n}\n```",
|
"details": "בקובץ skills/qa-validate.ts:\n```typescript\nexport async function qaValidate(caseNumber: string) {\n const decision = await api.getDecision(caseNumber);\n const documents = await api.getDocuments(caseNumber);\n const claims = await api.getClaims(caseNumber);\n \n const checks = [\n {\n name: 'grounding_check',\n fn: () => validateGrounding(decision, documents),\n critical: true\n },\n {\n name: 'claims_coverage',\n fn: () => validateClaimsCoverage(decision, claims),\n critical: true\n },\n {\n name: 'neutral_background',\n fn: () => validateNeutrality(decision.background),\n critical: false\n },\n {\n name: 'weights_range',\n fn: () => validateWeightsInRange(decision),\n critical: true\n },\n {\n name: 'sequential_numbering',\n fn: () => validateNumbering(decision),\n critical: false\n },\n {\n name: 'definitions',\n fn: () => validateDefinitions(decision),\n critical: false\n }\n ];\n \n const results = [];\n let hasErrors = false;\n \n for (const check of checks) {\n const result = await check.fn();\n results.push({...result, name: check.name});\n if (!result.passed && check.critical) {\n hasErrors = true;\n }\n }\n \n // שמירת תוצאות\n await api.saveQAResults(caseNumber, results);\n \n // חסימת ייצוא אם יש שגיאות קריטיות\n if (hasErrors) {\n await api.blockExport(caseNumber);\n throw new Error('QA failed - export blocked');\n }\n \n return results;\n}\n```",
|
||||||
@@ -672,7 +672,7 @@
|
|||||||
"updatedAt": "2026-04-03T10:16:24.673Z"
|
"updatedAt": "2026-04-03T10:16:24.673Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "81",
|
"id": 81,
|
||||||
"title": "אינטגרציה E2E וחיבור Paperclip events",
|
"title": "אינטגרציה E2E וחיבור Paperclip events",
|
||||||
"description": "חיבור מלא בין Paperclip ל-Claude Code עם trigger אוטומטי",
|
"description": "חיבור מלא בין Paperclip ל-Claude Code עם trigger אוטומטי",
|
||||||
"details": "1. חיבור Paperclip events:\n```javascript\n// בקובץ paperclip-integration.js\npaperclip.on('issue.comment.created', async (event) => {\n if (event.comment.includes('/draft')) {\n await claudeCode.trigger('draft-decision', {\n caseNumber: event.issue.number\n });\n }\n});\n```\n\n2. E2E test על תיק הכט:\n```javascript\ntest('full flow - Hecht case', async () => {\n // העלאת חומרים\n await uploadDocuments('hecht', ['doc1.pdf', 'doc2.pdf']);\n \n // הזנת תוצאה\n await setOutcome('hecht', 'rejected', 'אין עילה');\n \n // כתיבה\n await triggerDraft('hecht');\n await waitForStatus('drafted');\n \n // QA\n const qaResults = await runQA('hecht');\n expect(qaResults.passed).toBe(true);\n \n // ייצוא\n const docx = await exportToDocx('hecht');\n \n // השוואה\n const similarity = await compareToFinal(docx, 'hecht-final.docx');\n expect(similarity).toBeGreaterThan(0.9);\n});\n```",
|
"details": "1. חיבור Paperclip events:\n```javascript\n// בקובץ paperclip-integration.js\npaperclip.on('issue.comment.created', async (event) => {\n if (event.comment.includes('/draft')) {\n await claudeCode.trigger('draft-decision', {\n caseNumber: event.issue.number\n });\n }\n});\n```\n\n2. E2E test על תיק הכט:\n```javascript\ntest('full flow - Hecht case', async () => {\n // העלאת חומרים\n await uploadDocuments('hecht', ['doc1.pdf', 'doc2.pdf']);\n \n // הזנת תוצאה\n await setOutcome('hecht', 'rejected', 'אין עילה');\n \n // כתיבה\n await triggerDraft('hecht');\n await waitForStatus('drafted');\n \n // QA\n const qaResults = await runQA('hecht');\n expect(qaResults.passed).toBe(true);\n \n // ייצוא\n const docx = await exportToDocx('hecht');\n \n // השוואה\n const similarity = await compareToFinal(docx, 'hecht-final.docx');\n expect(similarity).toBeGreaterThan(0.9);\n});\n```",
|
||||||
@@ -691,7 +691,7 @@
|
|||||||
"updatedAt": "2026-04-03T10:19:26.776Z"
|
"updatedAt": "2026-04-03T10:19:26.776Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "82",
|
"id": 82,
|
||||||
"title": "מבחן הסמכה",
|
"title": "מבחן הסמכה",
|
||||||
"description": "בדיקת המערכת על תיק עם החלטה קיימת והשוואת איכות",
|
"description": "בדיקת המערכת על תיק עם החלטה קיימת והשוואת איכות",
|
||||||
"details": "שלב ב - בדיקה על תיק עם החלטה:\n```javascript\nexport async function certificationTest() {\n // בחירת תיק עם החלטה סופית\n const testCase = await selectTestCase();\n \n // הסתרת ההחלטה המקורית\n await hideOriginalDecision(testCase.number);\n \n // הרצת המערכת\n await runFullFlow(testCase.number);\n \n // השוואה\n const draft = await getDecision(testCase.number);\n const original = testCase.originalDecision;\n \n const comparison = {\n structure: compareStructure(draft, original),\n content: compareContent(draft, original),\n reasoning: compareReasoning(draft, original),\n outcome: compareOutcome(draft, original)\n };\n \n // חישוב ציון כולל\n const score = calculateScore(comparison);\n \n // בדיקת סף - 90%\n if (score < 0.9) {\n throw new Error(`Score ${score} is below threshold`);\n }\n \n return {score, comparison};\n}\n\n// שלב ג - תיק חי\nexport async function liveTest() {\n const liveCase = await getLiveCase();\n await runFullFlow(liveCase.number);\n \n // שליחה לדפנה לבדיקה\n await sendForReview('dafna@law.firm', liveCase.number);\n}\n```",
|
"details": "שלב ב - בדיקה על תיק עם החלטה:\n```javascript\nexport async function certificationTest() {\n // בחירת תיק עם החלטה סופית\n const testCase = await selectTestCase();\n \n // הסתרת ההחלטה המקורית\n await hideOriginalDecision(testCase.number);\n \n // הרצת המערכת\n await runFullFlow(testCase.number);\n \n // השוואה\n const draft = await getDecision(testCase.number);\n const original = testCase.originalDecision;\n \n const comparison = {\n structure: compareStructure(draft, original),\n content: compareContent(draft, original),\n reasoning: compareReasoning(draft, original),\n outcome: compareOutcome(draft, original)\n };\n \n // חישוב ציון כולל\n const score = calculateScore(comparison);\n \n // בדיקת סף - 90%\n if (score < 0.9) {\n throw new Error(`Score ${score} is below threshold`);\n }\n \n return {score, comparison};\n}\n\n// שלב ג - תיק חי\nexport async function liveTest() {\n const liveCase = await getLiveCase();\n await runFullFlow(liveCase.number);\n \n // שליחה לדפנה לבדיקה\n await sendForReview('dafna@law.firm', liveCase.number);\n}\n```",
|
||||||
@@ -705,7 +705,7 @@
|
|||||||
"updatedAt": "2026-04-03T10:19:26.779Z"
|
"updatedAt": "2026-04-03T10:19:26.779Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "83",
|
"id": 83,
|
||||||
"title": "Phase 1 — Project setup (legal-ai UI rewrite)",
|
"title": "Phase 1 — Project setup (legal-ai UI rewrite)",
|
||||||
"description": "הקמת scaffold של Next.js עם TypeScript + Tailwind v4 + App Router ב-web-ui/. התקנת כל התלויות: @tanstack/react-query, @tanstack/react-table, react-hook-form, @hookform/resolvers, zod, lucide-react, react-dropzone, openapi-typescript. העברת design-system.css tokens (navy/gold/parchment, Heebo) ל-Tailwind theme דרך @theme ו-CSS variables. הגדרת RTL עברית עם Heebo via next/font/google. בניית AppShell עם navy header + gold rule + nav.",
|
"description": "הקמת scaffold של Next.js עם TypeScript + Tailwind v4 + App Router ב-web-ui/. התקנת כל התלויות: @tanstack/react-query, @tanstack/react-table, react-hook-form, @hookform/resolvers, zod, lucide-react, react-dropzone, openapi-typescript. העברת design-system.css tokens (navy/gold/parchment, Heebo) ל-Tailwind theme דרך @theme ו-CSS variables. הגדרת RTL עברית עם Heebo via next/font/google. בניית AppShell עם navy header + gold rule + nav.",
|
||||||
"status": "done",
|
"status": "done",
|
||||||
@@ -801,7 +801,7 @@
|
|||||||
"updatedAt": "2026-04-11T13:50:47.941Z"
|
"updatedAt": "2026-04-11T13:50:47.941Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "84",
|
"id": 84,
|
||||||
"title": "Phase 2 — API client + generated TypeScript types",
|
"title": "Phase 2 — API client + generated TypeScript types",
|
||||||
"description": "Add npm run api:types script that runs openapi-typescript against FastAPI's /openapi.json -> src/lib/api/types.ts. Build lib/api/client.ts (typed fetch wrapper + TanStack Query client with default retry/staleTime). Create one lib/api/<domain>.ts per endpoint category (cases, upload, compose, training, system), each exporting typed useQuery/useMutation hooks. Build lib/sse.ts as EventSource -> Query cache adapter. Plan: ~/.claude/plans/joyful-marinating-sutton.md.",
|
"description": "Add npm run api:types script that runs openapi-typescript against FastAPI's /openapi.json -> src/lib/api/types.ts. Build lib/api/client.ts (typed fetch wrapper + TanStack Query client with default retry/staleTime). Create one lib/api/<domain>.ts per endpoint category (cases, upload, compose, training, system), each exporting typed useQuery/useMutation hooks. Build lib/sse.ts as EventSource -> Query cache adapter. Plan: ~/.claude/plans/joyful-marinating-sutton.md.",
|
||||||
"details": "See full plan at ~/.claude/plans/joyful-marinating-sutton.md for architecture, critical files, risks, and open questions. This task is phase 2 of 7 in the legal-ai UI rewrite from vanilla HTML to Next.js 15 + shadcn/ui.",
|
"details": "See full plan at ~/.claude/plans/joyful-marinating-sutton.md for architecture, critical files, risks, and open questions. This task is phase 2 of 7 in the legal-ai UI rewrite from vanilla HTML to Next.js 15 + shadcn/ui.",
|
||||||
@@ -815,7 +815,7 @@
|
|||||||
"updatedAt": "2026-04-11T15:51:34.020Z"
|
"updatedAt": "2026-04-11T15:51:34.020Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "85",
|
"id": 85,
|
||||||
"title": "Phase 3 — Core read views (home, case detail, compose)",
|
"title": "Phase 3 — Core read views (home, case detail, compose)",
|
||||||
"description": "Port the 3 highest-value screens. Use the frontend-design Claude Code skill to generate layout + composition, passing design tokens (navy/gold/parchment, Heebo), editorial voice, and typed API hooks. Use shadcn Card/Badge/Tabs/Sheet/ScrollArea as primitives. Port the custom donut chart into <DonutChart> component. TanStack Query staleTime:5000 for case detail replaces manual 5s polling. Plan: ~/.claude/plans/joyful-marinating-sutton.md.",
|
"description": "Port the 3 highest-value screens. Use the frontend-design Claude Code skill to generate layout + composition, passing design tokens (navy/gold/parchment, Heebo), editorial voice, and typed API hooks. Use shadcn Card/Badge/Tabs/Sheet/ScrollArea as primitives. Port the custom donut chart into <DonutChart> component. TanStack Query staleTime:5000 for case detail replaces manual 5s polling. Plan: ~/.claude/plans/joyful-marinating-sutton.md.",
|
||||||
"details": "See full plan at ~/.claude/plans/joyful-marinating-sutton.md for architecture, critical files, risks, and open questions. This task is phase 3 of 7 in the legal-ai UI rewrite from vanilla HTML to Next.js 15 + shadcn/ui.",
|
"details": "See full plan at ~/.claude/plans/joyful-marinating-sutton.md for architecture, critical files, risks, and open questions. This task is phase 3 of 7 in the legal-ai UI rewrite from vanilla HTML to Next.js 15 + shadcn/ui.",
|
||||||
@@ -829,7 +829,7 @@
|
|||||||
"updatedAt": "2026-04-11T16:09:18.006Z"
|
"updatedAt": "2026-04-11T16:09:18.006Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "86",
|
"id": 86,
|
||||||
"title": "Phase 4 — Forms and wizards (new case, upload, inline edits)",
|
"title": "Phase 4 — Forms and wizards (new case, upload, inline edits)",
|
||||||
"description": "Port new case wizard, bulk upload, inline forms on case detail. Use react-hook-form + zod with schemas in lib/schemas/<entity>.ts. Build shared <WizardShell> from shadcn Card + Progress + Tabs. Build <DropZone> (react-dropzone + shadcn). Integrate SSE for upload progress via lib/sse.ts. Plan: ~/.claude/plans/joyful-marinating-sutton.md.",
|
"description": "Port new case wizard, bulk upload, inline forms on case detail. Use react-hook-form + zod with schemas in lib/schemas/<entity>.ts. Build shared <WizardShell> from shadcn Card + Progress + Tabs. Build <DropZone> (react-dropzone + shadcn). Integrate SSE for upload progress via lib/sse.ts. Plan: ~/.claude/plans/joyful-marinating-sutton.md.",
|
||||||
"details": "See full plan at ~/.claude/plans/joyful-marinating-sutton.md for architecture, critical files, risks, and open questions. This task is phase 4 of 7 in the legal-ai UI rewrite from vanilla HTML to Next.js 15 + shadcn/ui.",
|
"details": "See full plan at ~/.claude/plans/joyful-marinating-sutton.md for architecture, critical files, risks, and open questions. This task is phase 4 of 7 in the legal-ai UI rewrite from vanilla HTML to Next.js 15 + shadcn/ui.",
|
||||||
@@ -843,7 +843,7 @@
|
|||||||
"updatedAt": "2026-04-11T16:25:55.569Z"
|
"updatedAt": "2026-04-11T16:25:55.569Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "87",
|
"id": 87,
|
||||||
"title": "Phase 5 — Secondary screens (compare, training, style report, skills, diagnostics)",
|
"title": "Phase 5 — Secondary screens (compare, training, style report, skills, diagnostics)",
|
||||||
"description": "Port the remaining 5 views. Use TanStack Table for training corpus and diagnostics lists. Port any charts/visualizations from current index.html. Plan: ~/.claude/plans/joyful-marinating-sutton.md.",
|
"description": "Port the remaining 5 views. Use TanStack Table for training corpus and diagnostics lists. Port any charts/visualizations from current index.html. Plan: ~/.claude/plans/joyful-marinating-sutton.md.",
|
||||||
"details": "See full plan at ~/.claude/plans/joyful-marinating-sutton.md for architecture, critical files, risks, and open questions. This task is phase 5 of 7 in the legal-ai UI rewrite from vanilla HTML to Next.js 15 + shadcn/ui.",
|
"details": "See full plan at ~/.claude/plans/joyful-marinating-sutton.md for architecture, critical files, risks, and open questions. This task is phase 5 of 7 in the legal-ai UI rewrite from vanilla HTML to Next.js 15 + shadcn/ui.",
|
||||||
@@ -857,7 +857,7 @@
|
|||||||
"updatedAt": "2026-04-11T17:33:42.976Z"
|
"updatedAt": "2026-04-11T17:33:42.976Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "88",
|
"id": 88,
|
||||||
"title": "Phase 6 — Polish & testing",
|
"title": "Phase 6 — Polish & testing",
|
||||||
"description": "Accessibility pass (keyboard nav, aria-label on RTL icons, focus trap in modals). Error boundaries + toast notifications for failed mutations. Loading states for every query. Cross-browser smoke test (Chrome, Firefox, Safari) + mobile device test. Document E2E smoke test script in web-ui/README.md. Plan: ~/.claude/plans/joyful-marinating-sutton.md.",
|
"description": "Accessibility pass (keyboard nav, aria-label on RTL icons, focus trap in modals). Error boundaries + toast notifications for failed mutations. Loading states for every query. Cross-browser smoke test (Chrome, Firefox, Safari) + mobile device test. Document E2E smoke test script in web-ui/README.md. Plan: ~/.claude/plans/joyful-marinating-sutton.md.",
|
||||||
"details": "See full plan at ~/.claude/plans/joyful-marinating-sutton.md for architecture, critical files, risks, and open questions. This task is phase 6 of 7 in the legal-ai UI rewrite from vanilla HTML to Next.js 15 + shadcn/ui.",
|
"details": "See full plan at ~/.claude/plans/joyful-marinating-sutton.md for architecture, critical files, risks, and open questions. This task is phase 6 of 7 in the legal-ai UI rewrite from vanilla HTML to Next.js 15 + shadcn/ui.",
|
||||||
@@ -871,7 +871,7 @@
|
|||||||
"updatedAt": "2026-04-11T17:44:08.337Z"
|
"updatedAt": "2026-04-11T17:44:08.337Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "89",
|
"id": 89,
|
||||||
"title": "Phase 7 — Deployment & cutover",
|
"title": "Phase 7 — Deployment & cutover",
|
||||||
"description": "Add multi-stage Dockerfile for web-ui/ (Node 20 build -> nginx serve of out/). Add web-ui as new app in Coolify project pointing to staging subdomain legal-ai-next.nautilus.marcusgroup.org. Run full smoke test against staging. Cutover: DNS flip legal-ai.nautilus.marcusgroup.org to new app, keep old on rollback subdomain for 1 week. Follow-up PR removes legal-ai/web/static/index.html + design-system.css once stable. Plan: ~/.claude/plans/joyful-marinating-sutton.md.",
|
"description": "Add multi-stage Dockerfile for web-ui/ (Node 20 build -> nginx serve of out/). Add web-ui as new app in Coolify project pointing to staging subdomain legal-ai-next.nautilus.marcusgroup.org. Run full smoke test against staging. Cutover: DNS flip legal-ai.nautilus.marcusgroup.org to new app, keep old on rollback subdomain for 1 week. Follow-up PR removes legal-ai/web/static/index.html + design-system.css once stable. Plan: ~/.claude/plans/joyful-marinating-sutton.md.",
|
||||||
"details": "See full plan at ~/.claude/plans/joyful-marinating-sutton.md for architecture, critical files, risks, and open questions. This task is phase 7 of 7 in the legal-ai UI rewrite from vanilla HTML to Next.js 15 + shadcn/ui.",
|
"details": "See full plan at ~/.claude/plans/joyful-marinating-sutton.md for architecture, critical files, risks, and open questions. This task is phase 7 of 7 in the legal-ai UI rewrite from vanilla HTML to Next.js 15 + shadcn/ui.",
|
||||||
@@ -884,7 +884,7 @@
|
|||||||
"subtasks": []
|
"subtasks": []
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "90",
|
"id": 90,
|
||||||
"title": "Phase 4.5 — Practice area integration",
|
"title": "Phase 4.5 — Practice area integration",
|
||||||
"description": "Add practice_area + appeal_subtype to the wizard, types, schema, case header, and cases table. Gap identified after backend commit 26d09d6 (multi-tenant axis) — new Next.js UI has zero integration while vanilla UI is fully wired. Plan: ~/.claude/plans/woolly-cooking-graham.md",
|
"description": "Add practice_area + appeal_subtype to the wizard, types, schema, case header, and cases table. Gap identified after backend commit 26d09d6 (multi-tenant axis) — new Next.js UI has zero integration while vanilla UI is fully wired. Plan: ~/.claude/plans/woolly-cooking-graham.md",
|
||||||
"details": "",
|
"details": "",
|
||||||
@@ -898,7 +898,7 @@
|
|||||||
"updatedAt": "2026-04-11T17:15:57.831Z"
|
"updatedAt": "2026-04-11T17:15:57.831Z"
|
||||||
},
|
},
|
||||||
{
|
{
|
||||||
"id": "91",
|
"id": 91,
|
||||||
"title": "Precedent attachment in compose screen",
|
"title": "Precedent attachment in compose screen",
|
||||||
"description": "Add case_precedents table + FastAPI endpoints + MCP tools + Next.js compose UI for attaching legal precedents (quote + citation + optional archived PDF) to threshold_claims/issues and to the case as a whole. Plan: ~/.claude/plans/woolly-cooking-graham.md",
|
"description": "Add case_precedents table + FastAPI endpoints + MCP tools + Next.js compose UI for attaching legal precedents (quote + citation + optional archived PDF) to threshold_claims/issues and to the case as a whole. Plan: ~/.claude/plans/woolly-cooking-graham.md",
|
||||||
"details": "",
|
"details": "",
|
||||||
@@ -974,5 +974,197 @@
|
|||||||
"updated": "2026-04-13T14:20:54.888Z",
|
"updated": "2026-04-13T14:20:54.888Z",
|
||||||
"description": "Tasks for master context"
|
"description": "Tasks for master context"
|
||||||
}
|
}
|
||||||
|
},
|
||||||
|
"legal-ai": {
|
||||||
|
"tasks": [
|
||||||
|
{
|
||||||
|
"id": "1",
|
||||||
|
"title": "V7 schema: precedent library + halachot tables",
|
||||||
|
"description": "Add SCHEMA_V7_SQL to db.py: extend case_law with source_kind/document_id/extraction_status/halacha_extraction_status/practice_area (CHECK constraint for 3 areas)/appeal_subtype/headnote. Create precedent_chunks table with vector(1024). Create halachot table with vector(1024), review_status, practice_areas array. Add IVFFlat indexes. Register V7 in init_schema().",
|
||||||
|
"details": "",
|
||||||
|
"testStrategy": "",
|
||||||
|
"status": "done",
|
||||||
|
"dependencies": [],
|
||||||
|
"priority": "high",
|
||||||
|
"subtasks": [],
|
||||||
|
"updatedAt": "2026-05-03T08:17:59.928Z"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "2",
|
||||||
|
"title": "Chunker: add court ruling section patterns",
|
||||||
|
"description": "Extend services/chunker.py SECTION_PATTERNS with 4 patterns for external court rulings: פסק דין→ruling, נימוקים→legal_analysis, סוף דבר→conclusion, העובדות הצריכות לעניין→facts",
|
||||||
|
"details": "",
|
||||||
|
"testStrategy": "",
|
||||||
|
"status": "done",
|
||||||
|
"dependencies": [
|
||||||
|
"1"
|
||||||
|
],
|
||||||
|
"priority": "medium",
|
||||||
|
"subtasks": [],
|
||||||
|
"updatedAt": "2026-05-03T08:18:33.239Z"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "3",
|
||||||
|
"title": "Service: halacha_extractor.py",
|
||||||
|
"description": "New service that runs claude_session.query_json() over chunks where section_type IN (legal_analysis, ruling, conclusion). Concurrency=3, retry=1. Validates supporting_quote with substring check after Hebrew normalization. All halachot inserted with review_status=pending_review (no auto-publish). Embeds rule_statement+reasoning_summary via Voyage. Uses Hebrew prompt from plan appendix א. Idempotent on case_law_id.",
|
||||||
|
"details": "",
|
||||||
|
"testStrategy": "",
|
||||||
|
"status": "done",
|
||||||
|
"dependencies": [
|
||||||
|
"1",
|
||||||
|
"2"
|
||||||
|
],
|
||||||
|
"priority": "high",
|
||||||
|
"subtasks": [],
|
||||||
|
"updatedAt": "2026-05-03T08:22:12.392Z"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "4",
|
||||||
|
"title": "Service: precedent_library.py orchestrator",
|
||||||
|
"description": "New service with ingest_precedent(file_path, citation, court, decision_date, source_type, precedent_level, practice_area, appeal_subtype, subject_tags, case_name, task_id) that orchestrates: extract_text → proofread → INSERT case_law (source_kind=external_upload) → chunk → embed → store precedent_chunks → halacha_extractor.extract → embed halachot → publish progress. Plus delete_precedent (cascading), list_precedents(filters), get_precedent(id), search_library(query, filters, limit) merging chunks+approved-halachot ranked.",
|
||||||
|
"details": "",
|
||||||
|
"testStrategy": "",
|
||||||
|
"status": "done",
|
||||||
|
"dependencies": [
|
||||||
|
"1",
|
||||||
|
"2",
|
||||||
|
"3"
|
||||||
|
],
|
||||||
|
"priority": "high",
|
||||||
|
"subtasks": [],
|
||||||
|
"updatedAt": "2026-05-03T08:23:33.235Z"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "5",
|
||||||
|
"title": "MCP tools: precedent_library + halacha_review",
|
||||||
|
"description": "Create mcp-server/src/legal_mcp/tools/precedent_library.py with tools: precedent_library_upload, precedent_library_list, precedent_library_get, precedent_library_delete, precedent_extract_halachot, search_precedent_library (semantic, returns merged halachot+chunks), halacha_review (approve/reject). Register all in server.py. Do NOT modify existing precedent_search_library or search_decisions.",
|
||||||
|
"details": "",
|
||||||
|
"testStrategy": "",
|
||||||
|
"status": "done",
|
||||||
|
"dependencies": [
|
||||||
|
"4"
|
||||||
|
],
|
||||||
|
"priority": "high",
|
||||||
|
"subtasks": [],
|
||||||
|
"updatedAt": "2026-05-03T08:25:07.439Z"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "6",
|
||||||
|
"title": "FastAPI endpoints under /api/precedent-library",
|
||||||
|
"description": "Add to web/app.py: POST /api/precedent-library/upload (multipart), GET /api/precedent-library (filters), GET /api/precedent-library/{id}, PATCH /api/precedent-library/{id}, DELETE /api/precedent-library/{id}, POST /api/precedent-library/{id}/extract-halachot, GET /api/precedent-library/search, GET /api/halachot?status=pending_review, PATCH /api/halachot/{id}, GET /api/precedent-library/stats. Reuse existing /api/progress/{task_id} SSE.",
|
||||||
|
"details": "",
|
||||||
|
"testStrategy": "",
|
||||||
|
"status": "done",
|
||||||
|
"dependencies": [
|
||||||
|
"5"
|
||||||
|
],
|
||||||
|
"priority": "high",
|
||||||
|
"subtasks": [],
|
||||||
|
"updatedAt": "2026-05-03T08:26:21.860Z"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "7",
|
||||||
|
"title": "UI: /precedents page with 4 tabs",
|
||||||
|
"description": "New web-ui/src/app/precedents/page.tsx with tabs: Library (table+filters+upload), Semantic Search, Pending Review (PRIMARY - bulk approval UX with J/K nav, A/R/E shortcuts, side-by-side rule_statement vs supporting_quote, badge count), Stats. New components in web-ui/src/components/precedents/: precedent-upload-sheet, precedent-list-table, precedent-search-panel, precedent-detail-panel, halacha-review-card. New hooks in web-ui/src/lib/api/precedent-library.ts. Add nav link in app-shell.tsx.",
|
||||||
|
"details": "",
|
||||||
|
"testStrategy": "",
|
||||||
|
"status": "done",
|
||||||
|
"dependencies": [
|
||||||
|
"6"
|
||||||
|
],
|
||||||
|
"priority": "high",
|
||||||
|
"subtasks": [],
|
||||||
|
"updatedAt": "2026-05-03T08:34:00.548Z"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "8",
|
||||||
|
"title": "Agent integration: legal-writer + 3 others",
|
||||||
|
"description": "Update .claude/agents/legal-writer.md (PRIMARY) — add mcp__legal-ai__search_precedent_library to tools and prompt section explaining when to use it for CREAC rule+explanation in block י. Update legal-researcher.md, legal-analyst.md, legal-ceo.md, legal-qa.md to add the tool. Update skills/decision/SKILL.md with section explaining the 3 corpora (style_corpus, case_precedents, precedent_library).",
|
||||||
|
"details": "",
|
||||||
|
"testStrategy": "",
|
||||||
|
"status": "done",
|
||||||
|
"dependencies": [
|
||||||
|
"5"
|
||||||
|
],
|
||||||
|
"priority": "medium",
|
||||||
|
"subtasks": [],
|
||||||
|
"updatedAt": "2026-05-03T08:36:24.711Z"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "9",
|
||||||
|
"title": "Service: precedent_metadata_extractor.py",
|
||||||
|
"description": "LLM-based extractor that auto-fills empty metadata fields after upload: short case_name (e.g. 'אהרון ברק' from long citation), summary (2-3 sentences), headnote, key_quote, subject_tags array, appeal_subtype. Reuses claude_session.query_json. Returns dict; caller decides which empty fields to merge (never overrides user values).",
|
||||||
|
"details": "",
|
||||||
|
"testStrategy": "",
|
||||||
|
"status": "done",
|
||||||
|
"dependencies": [],
|
||||||
|
"priority": "high",
|
||||||
|
"subtasks": [],
|
||||||
|
"updatedAt": "2026-05-03T10:19:15.105Z"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "10",
|
||||||
|
"title": "Halacha extractor: dual mode (binding vs persuasive)",
|
||||||
|
"description": "Update halacha_extractor.py prompt to branch on is_binding: binding=true → strict halacha extraction (current). binding=false → extract reasoning principles, applications of established halachot, persuasive conclusions. New rule_types: 'application' (applying known rule to facts), 'persuasive' (committee's reasoning citable as authority). Schema unchanged (rule_type already TEXT).",
|
||||||
|
"details": "",
|
||||||
|
"testStrategy": "",
|
||||||
|
"status": "done",
|
||||||
|
"dependencies": [],
|
||||||
|
"priority": "high",
|
||||||
|
"subtasks": [],
|
||||||
|
"updatedAt": "2026-05-03T10:19:15.117Z"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "11",
|
||||||
|
"title": "Ingest pipeline: add metadata extraction stage",
|
||||||
|
"description": "In services/precedent_library.py:ingest_precedent, after halacha extraction, run metadata_extractor and PATCH the case_law row with auto-filled fields (only those left empty by user). Publish progress 'extracting_metadata'.",
|
||||||
|
"details": "",
|
||||||
|
"testStrategy": "",
|
||||||
|
"status": "done",
|
||||||
|
"dependencies": [
|
||||||
|
"9"
|
||||||
|
],
|
||||||
|
"priority": "high",
|
||||||
|
"subtasks": [],
|
||||||
|
"updatedAt": "2026-05-03T10:19:15.128Z"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "12",
|
||||||
|
"title": "UI: precedent edit sheet",
|
||||||
|
"description": "Add edit button to library-list-panel rows that opens a Sheet with all editable fields (case_name, citation, court, date, practice_area, appeal_subtype, subject_tags, summary, headnote, key_quote, source_type, precedent_level, is_binding). Pre-populated from current values. Submit calls PATCH /api/precedent-library/{id} via useUpdatePrecedent. After save, invalidate library list query.",
|
||||||
|
"details": "",
|
||||||
|
"testStrategy": "",
|
||||||
|
"status": "done",
|
||||||
|
"dependencies": [],
|
||||||
|
"priority": "high",
|
||||||
|
"subtasks": [],
|
||||||
|
"updatedAt": "2026-05-03T10:19:15.134Z"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"id": "13",
|
||||||
|
"title": "Test on 403-17: fix metadata + re-extract",
|
||||||
|
"description": "After deploy: PATCH 403-17 to set case_name='ערר 403/17', then trigger precedent_extract_halachot to test the dual-mode extraction on a non-binding committee decision.",
|
||||||
|
"details": "",
|
||||||
|
"testStrategy": "",
|
||||||
|
"status": "pending",
|
||||||
|
"dependencies": [
|
||||||
|
"9",
|
||||||
|
"10",
|
||||||
|
"11",
|
||||||
|
"12"
|
||||||
|
],
|
||||||
|
"priority": "medium",
|
||||||
|
"subtasks": []
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"metadata": {
|
||||||
|
"version": "1.0.0",
|
||||||
|
"lastModified": "2026-05-03T10:19:15.134Z",
|
||||||
|
"taskCount": 13,
|
||||||
|
"completedCount": 12,
|
||||||
|
"tags": [
|
||||||
|
"legal-ai"
|
||||||
|
]
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -50,6 +50,8 @@
|
|||||||
| [`docs/new-company-setup-guide.md`](docs/new-company-setup-guide.md) | Setup guide for a new company (CMPA): skills, corpus, style analysis | Before adding a new company or appeal type |
|
| [`docs/new-company-setup-guide.md`](docs/new-company-setup-guide.md) | Setup guide for a new company (CMPA): skills, corpus, style analysis | Before adding a new company or appeal type |
|
||||||
| [`docs/audit-report.md`](docs/audit-report.md) | System audit report | General background |
|
| [`docs/audit-report.md`](docs/audit-report.md) | System audit report | General background |
|
||||||
| [`docs/case-migration-tracker.md`](docs/case-migration-tracker.md) | Migration tracker for existing cases | For tracking |
|
| [`docs/case-migration-tracker.md`](docs/case-migration-tracker.md) | Migration tracker for existing cases | For tracking |
|
||||||
|
| [`docs/case-deletion-runbook.md`](docs/case-deletion-runbook.md) | Full runbook for deleting a case: legal-ai DB + disk + Paperclip + Gitea, FK ordering, fallback to direct SQL | Before a full reset of a case (testing, or a case opened by mistake) |
|
||||||
|
| [`docs/paperclip-quirks.md`](docs/paperclip-quirks.md) | Known Paperclip pitfalls: `issue.released` flipping done→todo, bash backtick trap, CEO auto-block, wakeup via DB | Before blaming an agent bug on a skill, check first whether it is Paperclip-side |
|
||||||
| [`docs/decision-block-mapping.md`](docs/decision-block-mapping.md) | Block mapping for decisions: how the 12 blocks are reflected in the DOCX | For orientation in the structure |
|
| [`docs/decision-block-mapping.md`](docs/decision-block-mapping.md) | Block mapping for decisions: how the 12 blocks are reflected in the DOCX | For orientation in the structure |
|
||||||
| [`docs/memory.md`](docs/memory.md) | General context: skills, completed projects, vault structure | For general orientation |
|
| [`docs/memory.md`](docs/memory.md) | General context: skills, completed projects, vault structure | For general orientation |
|
||||||
| [`skills/decision/SKILL.md`](skills/decision/SKILL.md) | Dafna's full style guide: tone, structure, phrasing, methodology | **Before writing any decision** |
|
| [`skills/decision/SKILL.md`](skills/decision/SKILL.md) | Dafna's full style guide: tone, structure, phrasing, methodology | **Before writing any decision** |
|
||||||
|
|||||||
@@ -40,7 +40,7 @@ Local (developer machine, pm2):
|
|||||||
|
|
||||||
External:
|
External:
|
||||||
← Claude API (Opus 4.7 for agents)
|
← Claude API (Opus 4.7 for agents)
|
||||||
← Voyage AI (voyage-3-large, 1024-dim embeddings)
|
← Voyage AI (voyage-3, 1024-dim embeddings)
|
||||||
← Infisical (secret management)
|
← Infisical (secret management)
|
||||||
← Gmail SMTP (agent notifications)
|
← Gmail SMTP (agent notifications)
|
||||||
```
|
```
|
||||||
@@ -59,7 +59,7 @@ External:
|
|||||||
- Runs OCR (Google Vision) if the PDF has no text
|
- Runs OCR (Google Vision) if the PDF has no text
|
||||||
- Runs the proofreader to remove Nevo artifacts
|
- Runs the proofreader to remove Nevo artifacts
|
||||||
- Extracts the text into `documents.extracted_text`
|
- Extracts the text into `documents.extracted_text`
|
||||||
- Splits the text into chunks of ~500 words, computes embeddings (voyage-3-large, 1024D), and stores them in `document_chunks`
|
- Splits the text into chunks of ~500 words, computes embeddings (voyage-3, 1024D), and stores them in `document_chunks`
|
||||||
4. Case status: `new` → `proofread`
|
4. Case status: `new` → `proofread`
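
The chunking step above can be sketched as a plain word-window splitter. This is only an illustration under stated assumptions: whitespace tokenization and a hard 500-word window. The project's own `services/chunker.py` is section-aware (it uses `SECTION_PATTERNS`), which this sketch ignores, and `chunk_words` is a hypothetical name.

```python
def chunk_words(text: str, target_words: int = 500) -> list[str]:
    """Split text into consecutive chunks of at most `target_words` words.

    Assumption: words are whitespace-separated tokens; no section awareness.
    """
    if target_words <= 0:
        raise ValueError("target_words must be positive")
    words = text.split()
    return [
        " ".join(words[i:i + target_words])
        for i in range(0, len(words), target_words)
    ]
```

Each returned chunk would then be embedded once at ingestion and stored alongside its text.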
|
||||||
|
|
||||||
### Step 2: legal analysis (legal-researcher + analyst)
|
### Step 2: legal analysis (legal-researcher + analyst)
|
||||||
@@ -223,7 +223,7 @@ legal-qa runs 6 quality checks:
|
|||||||
`case_law`, `statutory_provisions`, `transition_phrases`, `lessons_learned`, `style_corpus`, `style_patterns`
|
`case_law`, `statutory_provisions`, `transition_phrases`, `lessons_learned`, `style_corpus`, `style_patterns`
|
||||||
|
|
||||||
### Layer 4: Semantic Search (RAG)
|
### Layer 4: Semantic Search (RAG)
|
||||||
`document_embeddings`, `paragraph_embeddings`, `case_law_embeddings` (pgvector 1024-dim, voyage-3-large)
|
`document_embeddings`, `paragraph_embeddings`, `case_law_embeddings` (pgvector 1024-dim, voyage-3)
|
||||||
|
|
||||||
### Layer 5 — Multi-tenancy
|
### Layer 5 — Multi-tenancy
|
||||||
`companies`, `tag_company_mappings` (appeal_subtype → company_id)
|
`companies`, `tag_company_mappings` (appeal_subtype → company_id)
|
||||||
@@ -283,7 +283,9 @@ legal-qa runs 6 quality checks:
|
|||||||
## Main technologies
|
## Main technologies
|
||||||
|
|
||||||
- **Database**: PostgreSQL 15 + pgvector 0.8.1
|
- **Database**: PostgreSQL 15 + pgvector 0.8.1
|
||||||
- **Embeddings**: Voyage AI (`voyage-3-large`, 1024-dim)
|
- **Embeddings**: Voyage AI (`voyage-3`, 1024-dim) + cross-encoder rerank (`rerank-2`)
|
||||||
|
- bi-encoder: voyage-3 for every chunk (one-time, at ingestion)
|
||||||
|
- cross-encoder: rerank-2 for every query (top-50 → top-K), behind the feature flag `VOYAGE_RERANK_ENABLED`
|
||||||
- **Agents**: Claude Opus 4.7 (via Paperclip pm2)
|
- **Agents**: Claude Opus 4.7 (via Paperclip pm2)
|
||||||
- **DOCX manipulation**: `python-docx` 1.2+ and `lxml` 5.2+ (XML surgery)
|
- **DOCX manipulation**: `python-docx` 1.2+ and `lxml` 5.2+ (XML surgery)
|
||||||
- **Frontend**: Next.js + TanStack Query + Tailwind
|
- **Frontend**: Next.js + TanStack Query + Tailwind
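
The two-stage retrieval described under **Embeddings** (bi-encoder shortlist, then cross-encoder rerank behind `VOYAGE_RERANK_ENABLED`) might look roughly like the sketch below. It makes no Voyage API calls: `rerank_score` is a hypothetical stand-in for the cross-encoder, the function names are invented for illustration, and the way the flag is parsed from the environment is an assumption.

```python
import math
import os
from typing import Callable, Optional, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_vec: Sequence[float],
    chunks: Sequence[tuple[str, Sequence[float]]],
    rerank_score: Callable[[str], float],  # hypothetical cross-encoder stand-in
    top_k: int = 5,
    candidates: int = 50,
    rerank_enabled: Optional[bool] = None,
) -> list[str]:
    if rerank_enabled is None:
        # Assumption: the flag is read as a truthy string from the environment.
        flag = os.environ.get("VOYAGE_RERANK_ENABLED", "")
        rerank_enabled = flag.lower() in {"1", "true", "yes"}
    # Stage 1: cheap vector similarity over precomputed chunk embeddings.
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    shortlist = [text for text, _ in ranked[:candidates]]
    # Stage 2: optional, more expensive rescoring of the shortlist only.
    if rerank_enabled:
        shortlist = sorted(shortlist, key=rerank_score, reverse=True)
    return shortlist[:top_k]
```

With the flag off, the function degrades to plain vector search, which is the point of putting the reranker behind a feature flag.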
|
||||||
|
|||||||
179
docs/case-deletion-runbook.md
Normal file
@@ -0,0 +1,179 @@
|
|||||||
|
# Case deletion runbook
|
||||||
|
|
||||||
|
> **When to use:** a full reset of a case (for end-to-end testing), deleting a case that was opened by mistake, or cleaning up before re-uploading documents.
|
||||||
|
>
|
||||||
|
> **Important:** calling `DELETE /api/cases` alone is **not enough**: it only handles the legal-ai side (DB + on-disk dir). A case lives in 4 systems at once, and all of them must be cleaned together.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Where a case's state lives
|
||||||
|
|
||||||
|
| System | What is stored | How to clean it |
|
||||||
|
|---|---|---|
|
||||||
|
| **legal-ai DB** (port 5433) | `cases` + `documents` + `document_chunks` + `claims` + `appraiser_facts` + `decisions` + `qa_results` + `case_precedents` | API DELETE (cascades over FKs) |
|
||||||
|
| **legal-ai disk** | `/data/cases/{N}/` inside the container; holds drafts/, documents/, .git/ | API with `remove_files=true` (`shutil.rmtree` inside the container) |
|
||||||
|
| **Paperclip DB** (port 54329) | `projects` + `issues` + `issue_comments` + `agent_wakeup_requests` + `heartbeat_runs` (audit) + 6 or more other tables | Manual SQL (no API) |
|
||||||
|
| **Gitea** | repo `cases/{N}`, if one was created at case-create | Gitea API |
|
||||||
|
|
||||||
|
The API does not touch Paperclip or Gitea because both are external systems that sit entirely outside the legal-ai DB. This is documented explicitly in the docstring of [`services/db.py:delete_case`](../mcp-server/src/legal_mcp/services/db.py).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Full deletion procedure, step by step
|
||||||
|
|
||||||
|
Set the case number in a variable before you start:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
CASE_NUMBER=8174-24
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 1: legal-ai (DB + disk)
|
||||||
|
|
||||||
|
```bash
|
||||||
|
curl -s -X DELETE \
|
||||||
|
"https://legal-ai.nautilus.marcusgroup.org/api/cases?case_number=${CASE_NUMBER}&remove_files=true" \
|
||||||
|
-w "\nhttp=%{http_code}\n"
|
||||||
|
```
|
||||||
|
|
||||||
|
Expected result: `200` with `{"deleted": true, "removed_files": true, ...}`.
|
||||||
|
|
||||||
|
What this does behind the scenes:
|
||||||
|
1. `DELETE FROM cases`: triggers **CASCADE** on 7 tables and **SET NULL** on `audit_log` and `chair_feedback`.
|
||||||
|
2. `shutil.rmtree(/data/cases/{N})`: removes the whole directory, including `.git`.
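
The difference between the two FK behaviors in item 1 can be reproduced in a few lines. This is a minimal sketch that uses an in-memory SQLite database as a stand-in for the real PostgreSQL instance; the column name `case_id` and the reduced two-child schema are assumptions made for illustration only.

```python
import sqlite3


def demo_delete_case() -> tuple[int, list]:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
    conn.executescript("""
        CREATE TABLE cases (id INTEGER PRIMARY KEY, case_number TEXT);
        CREATE TABLE documents (
            id INTEGER PRIMARY KEY,
            case_id INTEGER REFERENCES cases(id) ON DELETE CASCADE
        );
        CREATE TABLE audit_log (
            id INTEGER PRIMARY KEY,
            case_id INTEGER REFERENCES cases(id) ON DELETE SET NULL
        );
        INSERT INTO cases VALUES (1, '8174-24');
        INSERT INTO documents VALUES (10, 1);
        INSERT INTO audit_log VALUES (20, 1);
    """)
    conn.execute("DELETE FROM cases WHERE case_number = ?", ("8174-24",))
    remaining_docs = conn.execute("SELECT count(*) FROM documents").fetchone()[0]
    audit_rows = conn.execute("SELECT id, case_id FROM audit_log").fetchall()
    conn.close()
    return remaining_docs, audit_rows
```

The CASCADE child row disappears with the case, while the SET NULL row survives with its reference cleared, which is why audit history outlives the deletion.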
|
||||||
|
|
||||||
|
> **Note:** before [commit `903fb4d`](https://gitea.nautilus.marcusgroup.org/ezer-mishpati/legal-ai/commit/903fb4d), this endpoint returned 500 because `db.delete_case` was not defined. If you hit a 500 on an older version, use the direct SQL (see Fallback at the end).
|
||||||
|
|
||||||
|
### Step 2: Paperclip
|
||||||
|
|
||||||
|
There is no API. Use direct SQL:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
PGPASSWORD=paperclip psql -h localhost -p 54329 -U paperclip -d paperclip <<SQL
|
||||||
|
BEGIN;
|
||||||
|
|
||||||
|
-- 1. Find all issues of the project (by name)
|
||||||
|
CREATE TEMP TABLE _issue_ids AS
|
||||||
|
SELECT i.id, i.identifier
|
||||||
|
FROM issues i
|
||||||
|
JOIN projects p ON i.project_id = p.id
|
||||||
|
WHERE p.name LIKE '%${CASE_NUMBER}%';
|
||||||
|
|
||||||
|
SELECT identifier FROM _issue_ids ORDER BY identifier; -- sanity check before deleting
|
||||||
|
|
||||||
|
-- 2. Delete the blockers: NO ACTION FKs (an issue cannot be deleted while these rows reference it)
|
||||||
|
DELETE FROM issue_comments WHERE issue_id IN (SELECT id FROM _issue_ids);
|
||||||
|
DELETE FROM cost_events WHERE issue_id IN (SELECT id FROM _issue_ids);
|
||||||
|
DELETE FROM finance_events WHERE issue_id IN (SELECT id FROM _issue_ids);
|
||||||
|
DELETE FROM feedback_votes WHERE issue_id IN (SELECT id FROM _issue_ids);
|
||||||
|
DELETE FROM issue_inbox_archives WHERE issue_id IN (SELECT id FROM _issue_ids);
|
||||||
|
DELETE FROM issue_read_states WHERE issue_id IN (SELECT id FROM _issue_ids);
|
||||||
|
|
||||||
|
-- 3. Delete the issues. CASCADE handles 7 more tables:
|
||||||
|
-- issue_approvals, issue_attachments, issue_documents,
|
||||||
|
-- issue_execution_decisions, issue_labels, issue_relations,
|
||||||
|
-- issue_work_products
|
||||||
|
DELETE FROM issues WHERE id IN (SELECT id FROM _issue_ids);
|
||||||
|
|
||||||
|
-- 4. Break the FK from heartbeat_runs so that wakeup_requests can be deleted.
|
||||||
|
-- heartbeat_runs rows are kept as an unlinked audit log.
|
||||||
|
UPDATE heartbeat_runs
|
||||||
|
SET wakeup_request_id = NULL
|
||||||
|
WHERE wakeup_request_id IN (
|
||||||
|
SELECT id FROM agent_wakeup_requests
|
||||||
|
WHERE payload->>'issueId' IN (SELECT id::text FROM _issue_ids)
|
||||||
|
);
|
||||||
|
|
||||||
|
DELETE FROM agent_wakeup_requests
|
||||||
|
WHERE payload->>'issueId' IN (SELECT id::text FROM _issue_ids);
|
||||||
|
|
||||||
|
-- 5. Delete project-level blockers (NO ACTION FKs to projects)
|
||||||
|
DELETE FROM cost_events WHERE project_id IN (SELECT id FROM projects WHERE name LIKE '%${CASE_NUMBER}%');
|
||||||
|
DELETE FROM finance_events WHERE project_id IN (SELECT id FROM projects WHERE name LIKE '%${CASE_NUMBER}%');
|
||||||
|
|
||||||
|
-- 6. Delete the project. CASCADE handles:
|
||||||
|
-- execution_workspaces, project_goals, project_workspaces, routines
|
||||||
|
DELETE FROM projects WHERE name LIKE '%${CASE_NUMBER}%' RETURNING id, name;
|
||||||
|
|
||||||
|
COMMIT;
|
||||||
|
SQL
|
||||||
|
```
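
The ordering constraint in step 4 of the SQL above (null out `heartbeat_runs.wakeup_request_id` before deleting from `agent_wakeup_requests`) can be demonstrated in isolation. The sketch below uses in-memory SQLite rather than the Paperclip PostgreSQL database, and the two tables are reduced to the columns needed to show the effect; both are assumptions for illustration.

```python
import sqlite3


def demo_wakeup_delete() -> tuple[bool, list]:
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE agent_wakeup_requests (id INTEGER PRIMARY KEY);
        CREATE TABLE heartbeat_runs (
            id INTEGER PRIMARY KEY,
            wakeup_request_id INTEGER
                REFERENCES agent_wakeup_requests(id) ON DELETE NO ACTION
        );
        INSERT INTO agent_wakeup_requests VALUES (1);
        INSERT INTO heartbeat_runs VALUES (100, 1);
    """)
    blocked = False
    try:
        # A direct delete fails: heartbeat_runs still references the row.
        conn.execute("DELETE FROM agent_wakeup_requests WHERE id = 1")
    except sqlite3.IntegrityError:
        blocked = True
        conn.rollback()
    # Break the reference first, then the delete goes through.
    conn.execute(
        "UPDATE heartbeat_runs SET wakeup_request_id = NULL "
        "WHERE wakeup_request_id = 1"
    )
    conn.execute("DELETE FROM agent_wakeup_requests WHERE id = 1")
    conn.commit()
    runs = conn.execute(
        "SELECT id, wakeup_request_id FROM heartbeat_runs"
    ).fetchall()
    conn.close()
    return blocked, runs
```

The run row is still present afterwards, only detached, which matches the runbook's intent of keeping `heartbeat_runs` as an audit trail.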
|
||||||
|
|
||||||
|
> **Why doesn't Paperclip have a delete API?** Because it is a multi-user system and deletion is destructive by nature; Paperclip prefers `archive` (`projects.archived_at`). We want a real delete only for test environments.
|
||||||
|
|
||||||
|
### Step 3: Gitea (if a repo was created)
|
||||||
|
|
||||||
|
```bash
|
||||||
|
GITEA_TOKEN=$(infisical secrets get GITEA__API_TOKEN --silent || \
|
||||||
|
echo "$GITEA_TOKEN") # read from Infisical, falling back to ENV
|
||||||
|
|
||||||
|
curl -s -X DELETE \
|
||||||
|
-H "Authorization: token ${GITEA_TOKEN}" \
|
||||||
|
"https://gitea.nautilus.marcusgroup.org/api/v1/repos/cases/${CASE_NUMBER}" \
|
||||||
|
-w "http=%{http_code}\n"
|
||||||
|
```
|
||||||
|
|
||||||
|
Expected result: `204` (deleted) or `404` (never created).
|
||||||
|
|
||||||
|
### Step 4: verify the cleanup
|
||||||
|
|
||||||
|
```bash
|
||||||
|
echo "=== legal-ai ==="
|
||||||
|
PGPASSWORD=$LEGAL_AI_PG psql -h localhost -p 5433 -U legal_ai -d legal_ai -t -c "
|
||||||
|
SELECT count(*) FROM cases WHERE case_number = '${CASE_NUMBER}';
|
||||||
|
" # → 0
|
||||||
|
|
||||||
|
ls /home/chaim/legal-ai/data/cases/${CASE_NUMBER} 2>&1 | head -1
|
||||||
|
# → "No such file or directory"
|
||||||
|
|
||||||
|
echo "=== Paperclip ==="
|
||||||
|
PGPASSWORD=paperclip psql -h localhost -p 54329 -U paperclip -d paperclip -t -c "
|
||||||
|
SELECT 'projects:'||count(*) FROM projects WHERE name LIKE '%${CASE_NUMBER}%'
|
||||||
|
UNION ALL SELECT 'issues:'||count(*) FROM issues WHERE title LIKE '%${CASE_NUMBER}%'
|
||||||
|
UNION ALL SELECT 'comments:'||count(*) FROM issue_comments WHERE body LIKE '%${CASE_NUMBER}%'
|
||||||
|
UNION ALL SELECT 'wakeups:'||count(*) FROM agent_wakeup_requests WHERE payload::text LIKE '%${CASE_NUMBER}%';
|
||||||
|
" # → all 0
|
||||||
|
|
||||||
|
echo "=== Gitea ==="
|
||||||
|
curl -s -H "Authorization: token ${GITEA_TOKEN}" \
|
||||||
|
"https://gitea.nautilus.marcusgroup.org/api/v1/repos/cases/${CASE_NUMBER}" \
|
||||||
|
| python3 -c "import json,sys; d=json.load(sys.stdin); print(d.get('full_name','NOT FOUND'))"
|
||||||
|
# → NOT FOUND
|
||||||
|
```
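
If the Paperclip check above is ever scripted, the `label:count` rows it prints could be parsed and turned into a pass/fail signal. The following is a rough sketch only: the function names are hypothetical, and it assumes the output keeps the `projects:0` style produced by the query with `psql -t`.

```python
def parse_counts(psql_output: str) -> dict[str, int]:
    """Parse lines such as ' projects:0' into {'projects': 0}."""
    counts: dict[str, int] = {}
    for line in psql_output.splitlines():
        label, sep, value = line.strip().partition(":")
        if sep and value.strip().isdigit():
            counts[label] = int(value.strip())
    return counts


def leftovers(counts: dict[str, int]) -> list[str]:
    """Return the labels that still have rows; an empty list means clean."""
    return sorted(label for label, n in counts.items() if n != 0)
```

An empty `leftovers(...)` result corresponds to the "all 0" expectation noted in the block above.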
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Fallback: if the API is broken
|
||||||
|
|
||||||
|
If the API DELETE fails for some reason (we have seen this before, with the missing `delete_case`), run a DELETE directly against the DB. The FK constraints do the rest:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
PGPASSWORD=$LEGAL_AI_PG psql -h localhost -p 5433 -U legal_ai -d legal_ai -c "
|
||||||
|
DELETE FROM cases WHERE case_number = '${CASE_NUMBER}' RETURNING case_number, title;
|
||||||
|
"
|
||||||
|
```
|
||||||
|
|
||||||
|
Then remove the directory from disk. It is owned by `root` because the container runs as root, so you will need `sudo`:
|
||||||
|
|
||||||
|
```bash
|
||||||
|
sudo rm -rf /home/chaim/legal-ai/data/cases/${CASE_NUMBER}
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Lessons learned along the way
|
||||||
|
|
||||||
|
1. **`heartbeat_runs.wakeup_request_id`** is the only trap. It is a NO ACTION FK, so it blocks deleting `agent_wakeup_requests`. The fix: `UPDATE ... SET wakeup_request_id = NULL` before the delete. The runs themselves are kept as an audit log (nothing is lost).
|
||||||
|
|
||||||
|
2. **Project "name" in Paperclip**: by convention it starts with "ערר {N}", so `LIKE '%{N}%'` is enough. If several cases contain the same number, tighten this to a full match or select by `id`.
|
||||||
|
|
||||||
|
3. **Container ↔ host file ownership**: files the container creates (including the case directory) belong to `root`. Deleting them from the host requires `sudo`, or `docker exec`, or going through the API (which runs `rmtree` inside the container).
|
||||||
|
|
||||||
|
4. **`audit_log` and `chair_feedback` remain**: their FK is SET NULL, to preserve history even after the case is deleted. If you need a complete purge of that history, delete those rows manually.
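
Note 2 above mentions that a substring match can hit more than one case. A pre-flight check along the following lines could flag that before any delete runs. It is a sketch under assumptions: `like_contains` only emulates `LIKE '%N%'` for case numbers without wildcard characters, the boundary rule (no adjacent word character or hyphen) is an illustrative choice, and the project names are made up.

```python
import re


def like_contains(name: str, case_number: str) -> bool:
    """Emulate SQL LIKE '%N%' for a case number without % or _ in it."""
    return case_number in name


def exact_case_match(name: str, case_number: str) -> bool:
    """Match the case number only as a standalone token in the name."""
    pattern = rf"(?<![\w-]){re.escape(case_number)}(?![\w-])"
    return re.search(pattern, name) is not None


def overmatched(names: list[str], case_number: str) -> list[str]:
    """Names the LIKE filter would hit even though they are other cases."""
    return [
        n for n in names
        if like_contains(n, case_number) and not exact_case_match(n, case_number)
    ]
```

For example, with case number `403-17`, a project named "ערר 1403-17" is caught by the `LIKE` filter but rejected by the stricter match, so the script should stop and ask for an `id` instead.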
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## TODO: automation
|
||||||
|
|
||||||
|
This runbook can be turned into a script, `scripts/delete-case.sh`, that takes `CASE_NUMBER` and runs all 4 steps with a confirmation prompt. It has not been implemented yet; for now the work is manual.
|
||||||
|
|
||||||
|
Whoever implements it: mark the script as `destructive` in SCRIPTS.md and require `--confirm` or an interactive prompt. It must never run without explicit approval.
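
One possible shape for that confirmation guard is sketched below. It is written in Python rather than as the proposed shell script, purely to illustrate the rule; the function names and the choice to make the operator retype the case number at the prompt are assumptions, not an agreed design.

```python
import argparse
from typing import Callable, Sequence


def parse_args(argv: Sequence[str]) -> argparse.Namespace:
    parser = argparse.ArgumentParser(
        description="Delete a case from all systems (destructive)."
    )
    parser.add_argument("case_number")
    parser.add_argument(
        "--confirm",
        action="store_true",
        help="skip the interactive prompt; required for non-interactive runs",
    )
    return parser.parse_args(argv)


def is_confirmed(
    args: argparse.Namespace, ask: Callable[[str], str] = input
) -> bool:
    """Return True only on an explicit flag or a retyped case number."""
    if args.confirm:
        return True
    typed = ask(f"Type {args.case_number} to delete it from all 4 systems: ")
    return typed.strip() == args.case_number
```

Retyping the case number, instead of answering y/n, makes it harder to confirm a deletion for the wrong case by reflex.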
|
||||||
@@ -252,3 +252,136 @@ Total: ~340,000 words of source material.
|
|||||||
Intermediate extraction documents also saved:
|
Intermediate extraction documents also saved:
|
||||||
- `docs/fjc-principles-extraction.md` — 38 principles from FJC
|
- `docs/fjc-principles-extraction.md` — 38 principles from FJC
|
||||||
- `docs/garner-methodology-extraction.md` — ~50 principles from Garner/Scalia
|
- `docs/garner-methodology-extraction.md` — ~50 principles from Garner/Scalia
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Lessons from הר הבשן 1033-25 (April 2026)
|
||||||
|
|
||||||
|
### Source
|
||||||
|
- Final decision: `data/cases/1033-25/exports/עריכה-v2.docx`
|
||||||
|
- Our draft (v6): `data/cases/1033-25/exports/טיוטה-v6.docx`
|
||||||
|
- Intermediate edit (v1): `data/cases/1033-25/exports/עריכה-v1.docx`
|
||||||
|
- Date: April 2026
|
||||||
|
- Result: Full acceptance (קבלה מלאה)
|
||||||
|
- Word counts: Draft 2,126 → Final 2,299 (+8%)
|
||||||
|
- Discussion section: Draft 960 words (19 paras) → Final 1,099 words (23 paras) (+14%)
|
||||||
|
|
||||||
|
### What Our Draft Got Right
|
||||||
|
- **12-block structure preserved** — all blocks in correct order, headings identical
|
||||||
|
- **Opening formula** — bottom-line opening "מצאנו כי דין הערר להתקבל" (mode A adapted for acceptance) — used and kept
|
||||||
|
- **Threshold claims treatment** — all 3 threshold claims handled correctly with same reasoning
|
||||||
|
- **Central argument flow** — committee's own conditions → shadow plan → not feasible → appeal accepted — this was the exact structure Dafna kept
|
||||||
|
- **Background neutrality** — facts-only background passed final review (no party quotes, no value words)
|
||||||
|
- **Most paragraphs kept verbatim** — blocks ו (background), ז (claims), and most of ח (procedures) were kept nearly word-for-word
|
||||||
|
- **Transition phrases** — "ונוסיף", "הנה כי כן", "הדברים מתחדדים שעה שנזכיר כי" — all used correctly and retained
|
||||||
|
- **Direct quote from licensing rep** — "נכון, אני מסכימה, התבקשו הרחבות..." — kept verbatim
|
||||||
|
- **"מסקנת ביניים"** technique — used correctly and retained
|
||||||
|
- **"למען הסדר הטוב"** — correct usage for remaining claims section
|
||||||
|
|
||||||
|
### What the Final Version Changed — Critical Gaps
|
||||||
|
|
||||||
|
#### 20. Over-Doctrinal: Abstract Legal Framework Removed Entirely
|
||||||
|
- **Draft:** Had a 101-word "נבאר" paragraph explaining the general legal authority of committees to require uniform building plans, covering advisory vs. mandatory annexes and administrative review processes — pure CREAC doctrine.
|
||||||
|
- **Final:** Completely deleted. Went straight from conclusion ("מסקנתנו היא שהבקשה אינה עומדת") to factual evidence (shadow plan is theoretical).
|
||||||
|
- **Lesson:** In "clean acceptance" cases where the committee's OWN conditions provide the anchor for the decision, skip the doctrinal framework. The committee said "show us X", the applicant didn't show X — no need to explain WHY committees can require X. CREAC is for contested legal rules, not for applying a committee's own explicitly-stated conditions. This is the most important lesson from this case: **match doctrinal depth to legal uncertainty**.
|
||||||
|
|
||||||
|
#### 21. Background Enhanced with "ודוק" Foreshadowing
|
||||||
|
- **Draft:** Simple description of the permit application: "ופורסמה כנדרש לפי סעיף 149 לחוק"
|
||||||
|
- **Final:** Added 2 sentences after the permit description: "ודוק, בהתאם להוראות התכנית נספח הבינוי מחייב לגבי מספר הקומות המירבי ובכל הנוגע לדרישה להכנת תכנית אחידה הרי שזו מכח שלביות הביצוע של התכנית. על מנת לסטות מהוראות אלו התבקשו ההקלות."
|
||||||
|
- **Lesson:** Dafna plants analytical seeds in the background. This "ודוק" paragraph in the background isn't neutrality-violating — it's explaining how plan provisions work as a matter of technical fact. But it foreshadows the fulcrum of the entire analysis (the reliefs are from MANDATORY provisions, not from advisory guidance). The background reader already understands what's at stake before reaching the discussion. **Rule**: when the decision hinges on a technical planning distinction, explain that distinction in the background (as fact, not as argument).
|
||||||
|
|
||||||
|
#### 22. Procedures Section: Specific Dates → Summary Narrative
|
||||||
|
- **Draft:** Listed specific dates and documents: "ביום 05.02.2026 ניתנה החלטת ביניים... הודעת עמדה מטעם העוררת גלנסקי מיום 23.02.2026, תגובת גבי אינגרם מיום 08.02.2026, ותגובת מבקשת ההיתר מיום 25.02.2026"
|
||||||
|
- **Final:** Generalized: "לאחר מועד זה הוגשו בקשות, עדכונים ותגובות מטעם הצדדים לגבי ניסיון להגיע לידי הסכמות, וגם בניסיון לתכנן בקשה שונה ומכל מקום ועדת הערר אפשרה מרחב של זמן בתקווה כי ההחלטה תתייתר"
|
||||||
|
- **Lesson:** For post-hearing procedural history that didn't change the outcome, Dafna prefers summary narrative over chronological detail. The intermediate decisions, update letters, and their specific dates don't matter to the reader — what matters is the narrative arc: "we gave them time to agree, they didn't, now we decide." Also: "ועדת הערר אפשרה מרחב של זמן בתקווה כי ההחלטה תתייתר" — this signals judicial patience and good faith before ruling.
|
||||||
|
|
||||||
|
#### 23. Concrete Evidence Added: Specific Permits in Buildings 5, 7, 11
|
||||||
|
- **Draft:** General statement that expansions were done ("הרחבות אלו, שחלקן כבר בוצעו וחלקן אושרו...")
|
||||||
|
- **Final:** Added an entire new paragraph: "להלן כדוגמא מתוך היתרי הבניה בבתים מספר 5, 7, ו-11 (בניינים סמוכים ואף צמודים לזה מושא הערר), בהם התבקשו ואושרו תוספות בניה בהתאם להוראות התכנית בקומה ב' (מפלס 5.80+). משזכויות הבניה נוצלו כאמור, הרי שלא תהיה בידם האפשרות לנצל וליישם את הרחבת הבניה באופן דומה לזה המתבקש בענייננו, מה שיגרום לבית 13 להיות חריג לסביבתו" — with accompanying images of the permits.
|
||||||
|
- **Lesson:** In acceptance decisions where you're overturning a committee, provide specific factual evidence that makes the conclusion inevitable. Not "other buildings already expanded" but "HERE are permits 5, 7, 11 showing exactly what was approved at level +5.80, making it physically impossible for the shadow plan to be implemented." The word "חריג לסביבתו" appears here as factual consequence, not as value judgment.
|
||||||
|
|
||||||
|
#### 24. Plan-Provision Integration Paragraphs Added (נחדד + מקל וחומר)
|
||||||
|
- **Draft:** None of this content existed
|
||||||
|
- **Final:** Two new paragraphs:
|
||||||
|
- F13: "נחדד כי בהתאם להוראות התכנית נספח הבינוי מחייב לגבי מספר הקומות, ולכך מתווספת גם הוראת השלביות והדרישה להכנת תכנית אחידה לכל הבניין. ברי כי הכוונה לתכנית הממחישה ומבטיחה כי ההרחבות מושא התכנית יוכלו להתממש לגבי כלל בעלי הזכויות ובאופן המייצר מופע מקובל."
|
||||||
|
- F14: "הדברים מתחדדים ביתר שאת שעה שמבוקשת הקלה שמשמעותה חריגה מהוראות התכנית שאז בוודאי מקל וחומר נכון להכין תכנית אחידה."
|
||||||
|
- **Lesson:** Where the draft used abstract doctrine, Dafna uses specific plan provisions. The "מקל וחומר" argument is new and powerful: if a uniform plan is required even for plan-conforming construction, then all the more so for construction that deviates from the plan. This replaces the general legal framework with a specific, irrefutable logical argument anchored in THIS plan's provisions.
|
||||||
|
|
||||||
|
#### 25. Counter-Factual Reasoning: "Approved by Mistake" + "Barren Discussion"
|
||||||
|
- **Draft:** Simple statement: "לאחר שהתברר בדיון בפנינו כי תכנית הצל אינה ישימה" followed by intermediate conclusion
|
||||||
|
- **Final:** Added entirely new reasoning: "תכנית הצל אושרה מתוך טעות כי הרי לא נוכל להניח כי אושרה למראית עין וברי כי הועדה המקומית ביקשה להבטיח זכויות של אחרים והשתלבות בסביבה. במקום בו התכנית אינה ישימה דיון בה הינו דיון עקר."
|
||||||
|
- **Lesson:** The "benefit of the doubt" technique — assume the committee acted in good faith (they didn't knowingly approve a hollow document), then show that this good-faith assumption actually STRENGTHENS the reversal (if they thought it was real, and it's not, then they were misled). "דיון עקר" = "barren discussion" — a phrase that shuts down any further argument about the shadow plan's merits. This is a new rhetorical move not seen in previous decisions.
|
||||||
|
|
||||||
|
#### 26. Engineer Counter-Factual: "Had He Known..." (Two New Paragraphs)
|
||||||
|
- **Draft:** Nothing about the engineer after the discussion section
|
||||||
|
- **Final:** Two new paragraphs (F18-F19) adding meta-reasoning about the engineer's opinion:
|
||||||
|
- "חוות דעתו של מהנדס הוועדה כי התכנון המבוקש חורג לסביבתו נבחנה לאור תכנית הצל שהוגשה ומשזו הוגשה בחסר חוו"ד הגורם המקצועי נותרה גם היא בחסר."
|
||||||
|
- "ונציין כי חוו"ד מהנדס הוועדה ניתנה במקום בו היה סבור כי תכנית הצל ישימה ובהינתן כך קבע כי הינה עדיין חורגת לסביבה... היה והייתה מוצגת תכנית צל המאגדת את ההיתרים שאושרו וממחישה את חריגות הבניה במרחב, ניתן לשער כי חוו"ד המהנדס הייתה החלטית יותר"
|
||||||
|
- **Lesson:** In acceptance decisions where you're overturning a committee that had professional support, explain WHY the professional got it wrong (or rather, why his analysis was based on faulty premises). The counter-factual "had the engineer known the shadow plan was not feasible, his opposition would have been even stronger" turns the committee's own professional opinion into evidence FOR the reversal. This is Dafna's way of being respectful to professionals while still overturning their conclusions.
|
||||||
|
|
||||||
|
#### 27. "לא נעלם מעינינו" Acknowledge-Before-Reject Removed
|
||||||
|
- **Draft:** Had a 66-word paragraph: "לא נעלם מעינינו כי נספח הבינוי הוגדר כ'מנחה' ולא כ'מחייב'... אולם אף בנספח מנחה, סטייה מהותית... אינה עניין טכני אלא שינוי מהותי"
|
||||||
|
- **Final:** Completely removed
|
||||||
|
- **Lesson:** The "אכן...אולם" or "לא נעלם מעינינו" pattern is for REJECTING an appeal — you need to show you considered the losing side's best argument. In ACCEPTANCE, the losing side is the committee/permit applicant, and the analysis already shows their conditions weren't met. No need to acknowledge the other side's argument when the factual record speaks for itself. **Rule**: "acknowledge-before-reject" = only in rejection decisions or on specific issues where you rule against a party. Don't use it prophylactically.
|
||||||
|
|
||||||
|
#### 28. Committee Response: Personal Circumstances Added
|
||||||
|
- **Draft:** Missing entirely — no mention of "פסק הלכתי" or "נסיבות אישיות חריגות"
|
||||||
|
- **Final:** Added new paragraph in committee response section: "בין השיקולים ששקלו חברי הוועדה נלקחו בחשבון גם נסיבות אישיות חריגות של מבקשת ההיתר, ובכללן פסק הלכתי שהוצג בפני הוועדה, שלפיו בנות מתבגרות אינן יכולות להתגורר באותו מפלס עם שאר בני המשפחה"
|
||||||
|
- **Lesson:** If a committee considered unusual factors (religious rulings, personal hardship), document them in the claims section for completeness, even if they're not addressed in the discussion. Omitting them would create a gap for judicial review — a judge reading the protocol would wonder why the decision doesn't mention them. Including them in the claims section without addressing them in the discussion implicitly signals: "we noted this but it doesn't change the planning analysis."
|
||||||
|
|
||||||
|
#### 29. Opening Precision: Permit Number and Phrasing
|
||||||
|
- **Draft:** "בקשה להיתר שמספרה" (placeholder — number missing!), "בהקלה לתוספת קומה"
|
||||||
|
- **Final:** "בקשה להיתר מס' 20230614", "בקשה הכוללת הקלות 'הקלה לתוספת קומה ללא תכנית אחידה וללא אדריכלות חוץ'"
|
||||||
|
- **Lesson:** (a) Never leave placeholders — "שמספרה" without the actual number is a production error. (b) The permit number is a legal identifier that must appear in the opening. (c) The phrasing "בקשה הכוללת הקלות" (application that includes reliefs) is more precise than "בהקלה" (with a relief). Also: the relief description is quoted in quotation marks from the official publication.
|
||||||
|
|
||||||
|
#### 30. "ונפרט;" Not "נפרט."
|
||||||
|
- **Draft:** "נפרט." (period)
|
||||||
|
- **Final:** "ונפרט;" (ו prefix + semicolon)
|
||||||
|
- **Lesson:** The transition from conclusion to detail uses "ו" prefix (connecting) and semicolon (flowing into the detail), not a period (which creates a full stop). This was already documented in the voice fingerprint ("מעבר עם נקודה-פסיק") but the draft didn't apply it. This confirms: **semicolons before elaboration are not optional — they are Dafna's standard punctuation for transitions into detail**.
|
||||||
|
|
||||||
|
#### 31. Summary: No Forward-Looking Guidance to Losing Party
|
||||||
|
- **Draft:** Had a forward-looking paragraph: "ככל שמבקשת ההיתר תבקש להגיש בקשה מחודשת עליה לעמוד בדרישות התכנית, לרבות הצגת תכנית אחידה ישימה לכל הבניין כנדרש"
|
||||||
|
- **Final:** Replaced with simple restatement: "על כן, הבקשה להיתר לא עמדה בתנאים שהוועדה המקומית עצמה קבעה בהחלטתה מיום 8.7.2024."
|
||||||
|
- **Lesson:** Dafna does NOT give advice to the losing party in the summary. The decision says what was decided, not what the applicant should do next. Forward-looking guidance would be an advisory opinion outside the scope of the decision. Also note: the final added "ולמעשה היא אינה ממחישה את המצב הפיזי והתכנוני 'האמיתי'" — a new phrase capturing the essence of why the shadow plan fails (it doesn't reflect reality).
|
||||||
|
|
||||||
|
#### 32. Unit vs. Extension: Deference to Committee, Not Independent Analysis
|
||||||
|
- **Draft:** "ניתן לקבל בדוחק את עמדת מבקשת ההיתר כי מדובר בתוספת לדירה קיימת", which voices the appeals committee's own hesitant opinion on the point
|
||||||
|
- **Final:** "עולה כי הועדה המקומית דנה בכך וקבעה כי מדובר ביחידת דיור אחת שבנייתה מיועדת לשימוש בן משפחה... אין אנו מוצאים להתערב בכך ראשית כי הדבר מקדים את זמנו... ושנית ככל שתאושר בניה זו יש לוודא כי לא תבנה יח"ד נוספת"
|
||||||
|
- **Lesson:** When a secondary issue was resolved by the committee and you're not overturning THAT specific finding, use deference ("אין אנו מוצאים להתערב") rather than expressing your own opinion ("ניתן לקבל בדוחק"). The final also adds a CONDITION ("יש לוודא כי לא תבנה יח"ד נוספת") — practical safeguard rather than theoretical analysis.
|
||||||
|
|
||||||
|
#### 33. No Expenses in Full Acceptance
|
||||||
|
- **Draft:** No mention of expenses
|
||||||
|
- **Final:** No mention of expenses
|
||||||
|
- **Lesson confirmed:** In full acceptance of an appeal by neighbor-appellants against a permit applicant, Dafna does not award expenses to either side. This contrasts with rejection (הכט: appellants pay expenses). The pattern emerges: expenses = only in rejection. Acceptance or partial acceptance = no expenses order.
|
||||||
|
|
||||||
|
### New Transition Phrases Discovered
|
||||||
|
- **"ונפרט;"** — correct form (ו + semicolon, not "נפרט.")
|
||||||
|
- **"דיון בה הינו דיון עקר"** — declaring a point moot
|
||||||
|
- **"אושרה מתוך טעות כי הרי לא נוכל להניח כי אושרה למראית עין"** — benefit-of-the-doubt construction
|
||||||
|
- **"ונציין כי חוו"ד... ניתנה במקום בו היה סבור כי..."** — counter-factual about professional opinion
|
||||||
|
- **"להלן כדוגמא מתוך"** — introducing specific documentary evidence
|
||||||
|
- **"ברי כי הכוונה ל..."** — explaining legislative intent of plan provisions
|
||||||
|
- **"מה שיגרום לבית 13 להיות חריג לסביבתו"** — factual consequence language
|
||||||
|
- **"ועדת הערר אפשרה מרחב של זמן בתקווה כי ההחלטה תתייתר"** — explaining judicial patience
|
||||||
|
|
||||||
|
### Meta-Lesson
|
||||||
|
This is the first "clean acceptance" in our training data (הכט = rejection, בית הכרם = partial acceptance). The key insight: **the draft was too careful**. It built a doctrinal framework (CREAC) as if it needed to justify overturning the committee from first principles, when in reality the committee's OWN conditions provided all the justification needed. Dafna's approach to acceptance:
|
||||||
|
|
||||||
|
1. **Anchor in the committee's own conditions** — no need for external legal authority
|
||||||
|
2. **Show concrete evidence** the conditions weren't met (specific permits, images)
|
||||||
|
3. **Explain WHY the committee was misled** (shadow plan approved by mistake)
|
||||||
|
4. **Counter-factual reasoning** about what professionals would have said with correct information
|
||||||
|
5. **No abstract doctrine needed** when the facts are clear
|
||||||
|
|
||||||
|
The draft's biggest structural error was adding the "נבאר" doctrinal paragraph and the "לא נעלם מעינינו" acknowledge-before-reject. Both are tools for CONTESTED or REJECTED cases. In a clean acceptance, the facts lead directly to the conclusion.
|
||||||
|
|
||||||
|
### Applied To
|
||||||
|
- [ ] Update SKILL.md: add "clean acceptance" track — skip doctrine, anchor in committee's conditions
|
||||||
|
- [ ] Update SKILL.md: "acknowledge-before-reject" only in rejection/contested issues
|
||||||
|
- [ ] Update SKILL.md: no forward-looking guidance in summary
|
||||||
|
- [ ] Update SKILL.md: "ודוק" foreshadowing in background for technical planning distinctions
|
||||||
|
- [ ] Update SKILL.md: counter-factual reasoning about professional opinions
|
||||||
|
- [ ] Update SKILL.md: procedures section — summary narrative for post-hearing history
|
||||||
|
- [ ] Update voice-fingerprint: add new transition phrases
|
||||||
|
- [ ] Update architecture-by-outcome: add "clean acceptance" archetype
|
||||||
|
- [ ] Fix agent opening punctuation: "ונפרט;" not "נפרט."
|
||||||
|
|||||||
157
docs/paperclip-quirks.md
Normal file
@@ -0,0 +1,157 @@
|
|||||||
|
# Paperclip Quirks: Known Pitfalls

> **Context:** what Paperclip does on its own, underneath our agents, that we have to work around or live with.
>
> Each pitfall is documented with:
> 1. What actually happens
> 2. Empirical evidence from the logs
> 3. The impact on our pipeline
> 4. Workaround / fix / acceptance
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 1. `issue.released` flips `done` back to `todo`

### What happens

After an agent sends `PATCH /api/issues/{id}` with `status: done`, **Paperclip performs an additional action named `issue.released`** a short time later (38 seconds in the case logged below). `issue.released` has an undocumented side effect that resets the status to `todo`.
|
||||||
|
|
||||||
|
### Empirical evidence: case 8174-24, CMPA-18 (30/04/26)

From `activity_log`:

```
ts        | action              | actor_type | details
----------+---------------------+------------+----------------------------------------
18:14:49  | issue.comment_added | agent      | comment by researcher
18:14:57  | issue.updated       | agent      | {"status": "done", "_previous": {"status": "in_progress"}}
18:15:35  | issue.released      | agent      | ← here
```

State of the `issues` table after the `released` event (`updated_at` matches its timestamp, 38 seconds after the update to `done`):

```
identifier | status | updated_at
CMPA-18    | todo   | 18:15:35
```

The status went from `done` back to `todo` although no agent or user requested it.
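
The pattern in the log above can be spotted mechanically. The following is a minimal sketch, assuming the log rows have already been reduced to `(ts, action, status)` tuples; that tuple layout and the function name `find_reverts` are illustrative choices, not part of Paperclip or of our scripts.

```python
from datetime import datetime

def find_reverts(events):
    """Return (done_ts, released_ts, delay_seconds) for each done -> released pair.

    `events` is a list of (ts, action, status) tuples in log order, where
    `status` is the new status for issue.updated rows and None otherwise.
    This layout is a simplification made for the sketch.
    """
    fmt = "%H:%M:%S"
    reverts = []
    done_at = None
    for ts, action, status in events:
        if action == "issue.updated":
            # Remember the latest transition into done; any other update clears it.
            done_at = ts if status == "done" else None
        elif action == "issue.released" and done_at is not None:
            delta = datetime.strptime(ts, fmt) - datetime.strptime(done_at, fmt)
            reverts.append((done_at, ts, int(delta.total_seconds())))
            done_at = None
    return reverts

# The three rows from the CMPA-18 log.
LOG = [
    ("18:14:49", "issue.comment_added", None),
    ("18:14:57", "issue.updated", "done"),
    ("18:15:35", "issue.released", None),
]
print(find_reverts(LOG))  # [('18:14:57', '18:15:35', 38)]
```

Running a check like this over a day of `activity_log` rows would show how often the revert happens and how stable the delay is.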
|
||||||
|
|
||||||
|
### Impact on our pipeline

Paperclip treats an issue in `todo` as "there is work to do" → immediately triggers a wakeup for the relevant agent → the agent runs again with a full prompt cache (~$0.10-0.50 per run) → looks around and realizes the work is already done → closes the issue again → `issue.released` fires again ⇒ a potential loop.

### Workaround on our side (no Paperclip fix required)

Our agent **already does this successfully today** when it sees an issue in `todo` with existing outputs:

1. Checks that the expected files exist (`Glob /documents/research/*.md`)
2. Checks that the DB is populated (`mcp__legal-ai__precedent_list`, `get_claims`, etc.)
3. If everything exists → does no duplicate work → writes an "אין שינוי" ("no change") comment → `PATCH issue → done`
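
The three steps above can be sketched as a small decision function. This is only an illustration under stated assumptions: the DB check is reduced to a row count passed in by the caller, the names `should_skip_work` and `recovery_action` are hypothetical, and the `in_progress` fallback for the "outputs missing" branch is an assumption rather than documented agent behavior.

```python
import glob
import os
import tempfile

def should_skip_work(research_dir, db_row_count):
    """True when the outputs of a previous run already exist.

    research_dir  - directory expected to hold the research markdown files
    db_row_count  - number of rows the DB check returned (e.g. precedents);
                    how that number is obtained is outside this sketch
    """
    files = glob.glob(os.path.join(research_dir, "*.md"))
    return bool(files) and db_row_count > 0

def recovery_action(research_dir, db_row_count):
    # Mirrors step 3: no duplicate work, just comment and re-close the issue.
    if should_skip_work(research_dir, db_row_count):
        return {"comment": "אין שינוי", "status": "done"}
    # Assumption: otherwise the agent proceeds with the real work.
    return {"comment": None, "status": "in_progress"}

with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "research.md"), "w").close()
    print(recovery_action(d, db_row_count=9))
```

Keeping the check free of side effects makes it cheap to run first thing in every wakeup, before any expensive tool call.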
|
||||||
|
|
||||||
|
**Evidence:** in the repeat run (PID 309786 on 30/04/26 18:15:54), the researcher's analyzer recognized within 90 seconds that all 9 precedents and the file exist, and closed the issue again with `PATCH → done`. That run cost about $0.20: not free, but not a loop.

### If you want to dig deeper

`issue.released` is recorded in `activity_log` with `actor_type=agent` but without an `agent_id` that says who triggered it. It is not written by our scripts (we do not call any such endpoint). A possible source:
- Paperclip's `executionLockedAt` / `executionWorkspaceId` mechanism, which releases resources after a run ends and resets the status at the same time

The right place to close the bug is **in Paperclip itself**: fix `issue.released` so that it does not overwrite a terminal status such as `done`. Until that is fixed upstream, we live with self-recovery.

### Status

- **Not fixed in Paperclip** (as of 30/04/26)
- **Handled on our side** via self-recovery in the agent's skill (HEARTBEAT.md §4-recovery)
- **Cost to track**: every self-recovery run adds ~$0.20 to the case
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 2. Bash backtick trap when building a comment body through curl

### What happens

The agent builds a complex pipeline to post a comment with long markdown:

```bash
curl ... -d "$(python3 -c "
import json
body = '''## Title
📁 File: `/path/to/file.md`
'''
print(json.dumps({'body': body}))")"
```

The `bash` process that evaluates the outer `$(...)` sees the unescaped backticks (`` ` ``) inside the Python string and interprets them **as bash command substitution**. It tries to run `/path/to/file.md` as a command, and because the file is not executable it returns:

```
/bin/bash: line 56: /path/to/file.md: Permission denied
```
|
||||||
|
|
||||||
|
### The misleading part

The `Permission denied` message is **not** really a permissions problem:
- `ls -la` shows the file is owned by `chaim:chaim` with mode `-rw-r--r--`
- a manual `touch` on the same path succeeds
- the Write tool had already written this very file successfully a minute earlier

### Why it happens specifically with document paths

Backticks are common markdown syntax for quoting paths: `` `/home/chaim/...` ``. In markdown output that is correct, but when the agent embeds the markdown inside a double-quoted bash command substitution or an unquoted heredoc, the shell executes the backticked text.
|
||||||
|
|
||||||
|
### Fix: the "write to a temp file, then `curl -d @file`" pattern

Instead of:

```bash
curl ... -d "$(python3 -c "...long body with backticks...")"
```

do:

```python
# 1. Write the body to a temp file with the Write tool (no bash quoting at all)
Write("/tmp/comment.json", json.dumps({"body": markdown_body}))
```

```bash
# 2. curl then reads from the file; the shell performs no expansion on the contents
curl -s -X POST -H "Authorization: Bearer $PAPERCLIP_API_KEY" \
  -H "Content-Type: application/json" \
  "$PAPERCLIP_API_URL/api/issues/{issue-id}/comments" \
  -d @/tmp/comment.json
```

With `-d @file`, curl reads the body from the file **without any shell processing**: no shell, no quoting, no backticks-as-commands. Note that `-d` strips carriage returns and newlines while reading the file; that is harmless for the single-line output of `json.dumps`, and `--data-binary @file` sends the bytes untouched if they ever matter. Reading from a file also allows bodies of 10K+ characters without hitting the ARG_MAX limit.
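
To make the file-based pattern concrete outside the agent's Write tool, here is a minimal plain-Python sketch of step 1. The helper name `write_comment_payload` and the temp-file location are illustrative assumptions; the point is only that `json.dumps` produces one line in which backticks are ordinary characters.

```python
import json
import os
import tempfile

def write_comment_payload(markdown_body, path):
    """Serialize {"body": ...} to `path` and return the JSON text written."""
    payload = json.dumps({"body": markdown_body})
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(payload)
    return payload

body = "## Title\n📁 File: `/path/to/file.md`\n"
path = os.path.join(tempfile.gettempdir(), "comment.json")
text = write_comment_payload(body, path)

# json.dumps escapes the newlines, so the file holds a single line and the
# backticks are plain characters that no shell ever parses.
with open(path, encoding="utf-8") as fh:
    restored = json.load(fh)["body"]
print(restored == body)
```

Because the payload is a single line, the newline stripping that `curl -d @file` performs does not alter it.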
|
||||||
|
|
||||||
|
### Status

- **Documented in HEARTBEAT.md** with an explicit instruction to use Write + `-d @file` for bodies over 500 characters
- **Historical impact**: before the fix, the run on CMPA-18 (30/04/26) succeeded (curl really did run), but the `Permission denied` line in the log was confusing and triggered an investigation. Now that the cause is known, that log line can be ignored.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 3. CEO main issue auto-block in `in_progress`

### What happens

A CEO that ends its turn (after posting a comment such as "waiting for agent Y to finish") and leaves the issue in `in_progress` is auto-blocked by Paperclip within one minute ("live execution disappeared"). The status jumps to `blocked`, and a manual wakeup is needed to continue.

### Workaround

The CEO should move the issue to `in_review` (not `in_progress`) when it is waiting on an external resource (another agent, the chair). This is documented in the CLAUDE.md memory: `feedback_paperclip_enums.md`.

### Status

- **Fixed in `legal-ceo.md`** (commit a1969dd)
- Observed working on CMPA-15 on 30/04/26: the CEO moved to `in_review` correctly
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 4. Wakeup via direct DB insert ≠ wakeup via the API

### What happens

A manual `INSERT INTO agent_wakeup_requests` that bypasses `POST /api/agents/{id}/wakeup` creates a wakeup record but **does not create a `heartbeat_run`**. Without a `heartbeat_run`, the Paperclip runtime does not detect that there is anything to run → the agent never wakes up.

### Workaround

Always use the API. All of our skills are documented with this warning.
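
For scripts that need to trigger a wakeup, a minimal sketch of building the API call with the standard library might look like the following. The endpoint path comes from the text above and the Bearer header mirrors the curl examples in this repository; the empty JSON body, the function name, and the example host are assumptions, since the request schema is not documented here.

```python
import json
import urllib.request

def build_wakeup_request(api_url, api_key, agent_id, payload=None):
    """Build (but do not send) POST {api_url}/api/agents/{agent_id}/wakeup."""
    url = f"{api_url.rstrip('/')}/api/agents/{agent_id}/wakeup"
    # Assumption: an empty JSON object is an acceptable body.
    data = json.dumps(payload or {}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_wakeup_request("http://paperclip.example", "test-key", "agent-123")
print(req.get_method(), req.full_url)
# Sending is left to the caller, e.g. urllib.request.urlopen(req)
```

Going through the API, rather than inserting the row directly, is what lets Paperclip create the matching `heartbeat_run`.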
|
||||||
|
|
||||||
|
### Status

- **Fixed in all skills** (CLAUDE.md memory: `reference_paperclip_wakeup.md`)
|
||||||
38
docs/runbooks/coolify-mcp-settings-volumes.md
Normal file
@@ -0,0 +1,38 @@
|
|||||||
|
<!-- docs/runbooks/coolify-mcp-settings-volumes.md -->
|
||||||
|
# Coolify Volume Mounts for the MCP Settings Page

## Background

The **Registrations** tab on the `/settings` page reads MCP registrations from:
- `~/.claude.json` (host)
- `~/.paperclip/instances/*/mcp.json` (host)

The legal-ai container needs read access to these files through volume mounts.
Without the mounts, the endpoint returns `error: "host_path_unavailable"` and the tab shows an unavailability message.
|
||||||
|
|
||||||
|
## Instructions

1. Open the Coolify UI: `http://158.178.131.193:8000`.
2. Navigate to the application: legal-ai (UUID `gyjo0mtw2c42ej3xxvbz8zio`).
3. **Storages** tab → **Add Storage**.
4. Add two mounts:

| Source path (host) | Destination path (container) | Mode |
|---|---|---|
| `/home/chaim/.claude.json` | `/host/.claude.json` | `ro` |
| `/home/chaim/.paperclip` | `/host/.paperclip` | `ro` |

5. Save and click **Redeploy**.
|
||||||
|
|
||||||
|
## Verification

After the redeploy:
```bash
curl -s https://legal-ai.nautilus.marcusgroup.org/api/settings/mcp/registrations | jq
```
It should return `"error": null` and a list of registrations.
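
If the check needs to run from a script instead of by eye, a minimal sketch of the assertion could look like this. The `error` field is named in this runbook; the `registrations` key for the list is an assumption about the response shape, as is the function name.

```python
def check_registrations_response(payload):
    """Return a list of problems; an empty list means the check passed."""
    problems = []
    if payload.get("error") is not None:
        problems.append(f"endpoint reported error: {payload['error']}")
    regs = payload.get("registrations")  # hypothetical key name
    if not isinstance(regs, list):
        problems.append("no registrations list in response")
    return problems

healthy = {"error": None, "registrations": [{"server_name": "legal-ai"}]}
unmounted = {"error": "host_path_unavailable"}
print(check_registrations_response(healthy))
print(check_registrations_response(unmounted))
```

The second case is what the endpoint is expected to report when the volume mounts are missing.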
|
||||||
|
|
||||||
|
## Security note

The mounts are read-only. The endpoint does not return env values (only key names),
and does not allow updating the files.
|
||||||
2158
docs/superpowers/plans/2026-05-04-mcp-settings-page.md
Normal file
File diff suppressed because it is too large
336
docs/superpowers/specs/2026-05-04-mcp-settings-page-design.md
Normal file
@@ -0,0 +1,336 @@
|
|||||||
|
# MCP Settings Page: Specification

**Date:** 2026-05-04
**Status:** Draft → awaiting user approval
**Context:** Extending `/settings` in web-ui with information about the legal-ai MCP server (env vars, tools, registrations).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## 1. Goal

Give the chair / system administrator one central place to view (and edit, where it is safe) the full configuration state of the MCP server, without switching between the Infisical UI, the Coolify UI, and local configuration files.
|
||||||
|
|
||||||
|
## 2. Scope

**In scope:**
- View + edit non-secret env vars, save to Infisical, manual Coolify redeploy.
- View (read-only) secret env vars, with a drift indicator between Infisical and the container.
- View (read-only) the list of tools the MCP server exposes (dynamic introspection).
- View (read-only) MCP registrations in the Claude Code and Paperclip configuration files.

**Out of scope (maybe later):**
- Enable/disable of individual tools.
- Editing `~/.claude.json` or `~/.paperclip/...` from the UI.
- New auth/RBAC (uses the page's existing auth; there is none at present).
- Secrets management, which stays in the Infisical UI.
- Auto-redeploy after save (the user clicks Redeploy manually).
|
||||||
|
|
||||||
|
## 3. Architecture

### 3.1 Page structure (Frontend)

`/settings` becomes a tab-based page (`shadcn/Tabs`):

| Tab | Content | Status |
|---|---|---|
| Paperclip | The existing content: Tag mappings + Companies | Existing, no logic changes |
| Environment | MCP server env vars, Infisical / Container | New, editable |
| Tools | List of the MCP server's tools | New, read-only |
| Registrations | MCP registrations in Claude Code and Paperclip | New, read-only |

Default tab: `Paperclip`.
|
||||||
|
|
||||||
|
### 3.2 Backend layer (FastAPI in `web/app.py`)

#### New endpoints

| Path | Method | Description |
|---|---|---|
| `/api/settings/mcp/env` | GET | Returns the merged list of env vars |
| `/api/settings/mcp/env/{key}` | PATCH | Updates a value in Infisical (non-secret keys only) |
| `/api/settings/mcp/env/redeploy` | POST | Triggers a Coolify redeploy |
| `/api/settings/mcp/tools` | GET | Returns the list of MCP server tools |
| `/api/settings/mcp/registrations` | GET | Returns MCP registrations from `/host/.claude.json` and from `/host/.paperclip/instances/*/mcp.json` |
|
||||||
|
|
||||||
|
#### Env var catalog

New file: `web/mcp_env_catalog.py`

```python
from dataclasses import dataclass
from typing import Literal, Any

EnvType = Literal["bool", "int", "float", "string", "enum"]
EnvCategory = Literal["multimodal", "rerank", "halacha", "credentials", "connection", "general"]

@dataclass(frozen=True)
class EnvSpec:
    key: str
    category: EnvCategory
    type: EnvType
    description: str
    is_secret: bool
    is_editable: bool
    default: Any = None
    min: float | None = None
    max: float | None = None
    enum_values: list[str] | None = None

ENV_CATALOG: dict[str, EnvSpec] = {
    # multimodal
    "MULTIMODAL_ENABLED": EnvSpec("MULTIMODAL_ENABLED", "multimodal", "bool",
        "הפעלת page-image embeddings", False, True, default=False),
    "MULTIMODAL_MODEL": EnvSpec("MULTIMODAL_MODEL", "multimodal", "string",
        "מודל multimodal של Voyage", False, True, default="voyage-multimodal-3"),
    "MULTIMODAL_DPI": EnvSpec("MULTIMODAL_DPI", "multimodal", "int",
        "DPI ל-rendering של עמוד למודל", False, True, default=144, min=72, max=300),
    "MULTIMODAL_THUMB_DPI": EnvSpec("MULTIMODAL_THUMB_DPI", "multimodal", "int",
        "DPI ל-thumbnail בתצוגה", False, True, default=96, min=72, max=200),
    "MULTIMODAL_TEXT_WEIGHT": EnvSpec("MULTIMODAL_TEXT_WEIGHT", "multimodal", "float",
        "משקל text vs image ב-RRF", False, True, default=0.5, min=0.0, max=1.0),
    "MULTIMODAL_RRF_K": EnvSpec("MULTIMODAL_RRF_K", "multimodal", "int",
        "RRF damping constant", False, True, default=60, min=1, max=200),
    # rerank
    "VOYAGE_RERANK_ENABLED": EnvSpec("VOYAGE_RERANK_ENABLED", "rerank", "bool",
        "הפעלת cross-encoder rerank", False, True, default=False),
    "VOYAGE_RERANK_MODEL": EnvSpec("VOYAGE_RERANK_MODEL", "rerank", "string",
        "מודל rerank", False, True, default="rerank-2"),
    "VOYAGE_RERANK_FETCH_K": EnvSpec("VOYAGE_RERANK_FETCH_K", "rerank", "int",
        "מספר candidates לפני rerank", False, True, default=50, min=10, max=200),
    # halacha
    "HALACHA_AUTO_APPROVE_THRESHOLD": EnvSpec("HALACHA_AUTO_APPROVE_THRESHOLD",
        "halacha", "float", "סף confidence ל-auto-approve",
        False, True, default=0.80, min=0.0, max=1.0),
    # general
    "VOYAGE_MODEL": EnvSpec("VOYAGE_MODEL", "general", "string",
        "מודל embedding ראשי", False, True, default="voyage-law-2"),
    "AUDIT_ENABLED": EnvSpec("AUDIT_ENABLED", "general", "bool",
        "הפעלת audit log", False, True, default=True),
    # credentials (read-only, masked)
    "VOYAGE_API_KEY": EnvSpec("VOYAGE_API_KEY", "credentials", "string",
        "Voyage AI API key", True, False),
    "GOOGLE_CLOUD_VISION_API_KEY": EnvSpec("GOOGLE_CLOUD_VISION_API_KEY",
        "credentials", "string", "Google Cloud Vision API key", True, False),
    "INFISICAL_TOKEN": EnvSpec("INFISICAL_TOKEN", "credentials", "string",
        "Infisical SDK token", True, False),
    # connection (read-only; dangerous to change at runtime)
    "POSTGRES_URL": EnvSpec("POSTGRES_URL", "connection", "string",
        "PostgreSQL connection URL", True, False),
    "REDIS_URL": EnvSpec("REDIS_URL", "connection", "string",
        "Redis connection URL", False, False),
    "DATA_DIR": EnvSpec("DATA_DIR", "connection", "string",
        "Data directory path", False, False),
}
```

Source: `mcp-server/src/legal_mcp/config.py`. Any key that is not in the catalog is not displayed (whitelist policy).
|
||||||
|
|
||||||
|
#### Response shape of `GET /api/settings/mcp/env`

```json
{
  "vars": [
    {
      "key": "MULTIMODAL_ENABLED",
      "category": "multimodal",
      "type": "bool",
      "description": "הפעלת page-image embeddings",
      "is_secret": false,
      "is_editable": true,
      "default": false,
      "infisical_value": "true",
      "container_value": "true",
      "drift": false,
      "min": null, "max": null, "enum_values": null
    },
    {
      "key": "VOYAGE_API_KEY",
      "category": "credentials",
      "type": "string",
      "description": "Voyage AI API key",
      "is_secret": true,
      "is_editable": false,
      "infisical_value": "****",
      "container_value": "****",
      "drift": false
    }
  ],
  "infisical_environment": "dev",
  "coolify_app_uuid": "gyjo0mtw2c42ej3xxvbz8zio",
  "errors": []
}
```

- `infisical_value`: read through `InfisicalSDKClient.get_secret(...)`. On error → `null`, and the error is appended to `errors`.
- `container_value`: `os.environ.get(key)`. If unset → `null`.
- `drift`: `infisical_value != container_value` (after normalization of bool/int/float; secrets are never compared as raw values, only by hash).
- For a secret: both values are returned masked (`"****" + last_4`); the drift comparison uses the hash only.
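
The drift rules above can be sketched as follows. This is a minimal illustration, not the planned implementation: the set of accepted truthy spellings, the choice of SHA-256, and the helper names are assumptions made for the example.

```python
import hashlib

def normalize(value, type_):
    """Bring both sides to a comparable form; None stays None."""
    if value is None:
        return None
    text = str(value).strip()
    if type_ == "bool":
        # Assumption: these spellings count as true.
        return text.lower() in ("1", "true", "yes", "on")
    if type_ == "int":
        return int(text)
    if type_ == "float":
        return float(text)
    return text

def digest(value):
    # Secrets are compared by hash only, never as raw strings.
    return None if value is None else hashlib.sha256(str(value).encode("utf-8")).hexdigest()

def mask(value):
    return None if value is None else "****" + str(value)[-4:]

def env_row(type_, is_secret, infisical_value, container_value):
    if is_secret:
        drift = digest(infisical_value) != digest(container_value)
        return {"infisical_value": mask(infisical_value),
                "container_value": mask(container_value),
                "drift": drift}
    drift = normalize(infisical_value, type_) != normalize(container_value, type_)
    return {"infisical_value": infisical_value,
            "container_value": container_value,
            "drift": drift}

print(env_row("bool", False, "true", "True"))
print(env_row("string", True, "sk-abcd1234", "sk-abcd1234"))
```

Normalizing before comparing avoids false drift badges for values that differ only in spelling, such as `"true"` versus `"True"` or `"0.5"` versus `"0.50"`.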
|
||||||
|
|
||||||
|
#### Save flow in `PATCH /api/settings/mcp/env/{key}`

1. Validation: the key exists in the catalog and `is_editable=true`. Otherwise → 400.
2. Validation by type: int/float within range, bool converted from string, enum within the allowed values.
3. Write to Infisical:
   ```python
   client.update_secret(
       project_id=INFISICAL_PROJECT_ID,
       environment_slug=INFISICAL_ENV,  # "dev" by default
       secret_path="/legal-ai",
       secret_name=key,
       secret_value=str(value),
   )
   ```
4. Audit log: `logger.info("mcp_env_update", extra={"key": key, "value": value if not is_secret else "[masked]"})`.
5. Response: `{"ok": true, "requires_redeploy": true, "message": "נשמר ב-Infisical. נדרש redeploy."}` (the message reads "Saved in Infisical. Redeploy required.").
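
Steps 1 and 2 of the save flow can be sketched as a single coercion function. This is a hedged illustration: `Spec` is a trimmed stand-in for `EnvSpec`, `ValidationError` and `coerce` are hypothetical names, and the accepted bool spellings are an assumption.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Spec:  # trimmed stand-in for EnvSpec: only the fields validation needs
    type: str
    is_editable: bool = True
    min: Optional[float] = None
    max: Optional[float] = None
    enum_values: tuple = ()

class ValidationError(ValueError):
    """Would map to an HTTP 400 in the endpoint."""

def coerce(spec, raw):
    if not spec.is_editable:
        raise ValidationError("key is not editable")
    if spec.type == "bool":
        text = str(raw).strip().lower()
        if text in ("true", "1"):
            return True
        if text in ("false", "0"):
            return False
        raise ValidationError(f"not a bool: {raw!r}")
    if spec.type in ("int", "float"):
        try:
            value = int(raw) if spec.type == "int" else float(raw)
        except (TypeError, ValueError):
            raise ValidationError(f"not a {spec.type}: {raw!r}") from None
        if spec.min is not None and value < spec.min:
            raise ValidationError(f"{value} is below min {spec.min}")
        if spec.max is not None and value > spec.max:
            raise ValidationError(f"{value} is above max {spec.max}")
        return value
    if spec.type == "enum" and raw not in spec.enum_values:
        raise ValidationError(f"{raw!r} is not one of {spec.enum_values}")
    return str(raw)

print(coerce(Spec("int", min=72, max=300), "144"))
print(coerce(Spec("bool"), "True"))
```

Returning the coerced value, rather than a boolean verdict, lets the endpoint pass `str(value)` to Infisical in one canonical spelling.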
|
||||||
|
|
||||||
|
#### Redeploy flow in `POST /api/settings/mcp/env/redeploy`

1. Call the Coolify API: `POST /api/v1/deploy?uuid=gyjo0mtw2c42ej3xxvbz8zio&force=false`.
2. Token: `COOLIFY_API_TOKEN` (from Infisical).
3. Polling: call `/api/v1/deployments/{deployment_uuid}` every 5 seconds until `status="finished"` or `status="failed"` (max 10 minutes).
4. The UI shows a live status (kept simple: a spinner plus a status message; no streaming needed).
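
The polling in step 3 can be sketched with the HTTP call injected as a callable, so the loop itself is testable without Coolify. The function name, the `"timeout"` return value, and the intermediate status names used in the demo are assumptions; only `finished`, `failed`, the 5-second interval, and the 10-minute cap come from the steps above.

```python
import time

def wait_for_deployment(fetch_status, interval=5.0, timeout=600.0,
                        sleep=time.sleep, clock=time.monotonic):
    """Poll until the deployment finishes, fails, or the timeout elapses.

    fetch_status - callable returning the current status string; in the real
                   endpoint it would wrap the Coolify deployments call
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in ("finished", "failed"):
            return status
        if clock() >= deadline:
            return "timeout"
        sleep(interval)

# Demo with canned statuses and no real sleeping.
statuses = iter(["queued", "in_progress", "finished"])
print(wait_for_deployment(lambda: next(statuses), sleep=lambda s: None))
```

Injecting `sleep` and `clock` keeps the defaults faithful to the spec while letting tests run instantly.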
|
||||||
|
|
||||||
|
#### Tools introspection in `GET /api/settings/mcp/tools`

```python
from legal_mcp.server import mcp  # FastMCP instance

async def api_mcp_tools():
    tools = await mcp.list_tools()  # FastMCP API
    return {
        "tools": [
            {
                "name": t.name,
                "description": t.description,
                "module": _module_for_tool(t.name),  # from tools/__init__.py
                "params_schema": t.inputSchema,
                "source_location": _source_location(t),  # f"{file}:{line}"
            }
            for t in tools
        ]
    }
```

`_module_for_tool` and `_source_location` are written in `web/mcp_introspection.py` using `inspect.getfile()` and `inspect.getsourcelines()`.
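
As a rough idea of what the `inspect`-based helper could look like, here is a minimal sketch that takes a plain Python callable. How the callable behind a registered tool is obtained from the `t` object is not specified here, so that lookup is left out; the function name and the `None` fallback are assumptions.

```python
import inspect
import json

def source_location(func):
    """Return "file:line" for a Python callable, or None if no source is available."""
    try:
        target = inspect.unwrap(func)  # look through functools.wraps decorators
        path = inspect.getsourcefile(target) or inspect.getfile(target)
        _, first_line = inspect.getsourcelines(target)
    except (TypeError, OSError):
        return None
    return f"{path}:{first_line}"

print(source_location(json.dumps))  # a stdlib function with Python source
print(source_location(len))         # builtins have no Python source
```

Returning `None` instead of raising keeps one tool without source from breaking the whole tools listing.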
|
||||||
|
|
||||||
|
#### Registrations in `GET /api/settings/mcp/registrations`

Reads:
1. `/host/.claude.json`: under `mcpServers` or `projects.<path>.mcpServers`.
2. `/host/.paperclip/instances/*/mcp.json`: each instance separately.

For each registration: `{client, instance_name?, server_name, command, args, cwd, env_keys}`.
- `env_keys`: names only, never values.
- If command/args contain sensitive paths, they are shown as-is (they are not secrets).
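
The `.claude.json` half of this reader can be sketched on an already-parsed dict. This is an illustration under assumptions: the function name, the `"claude-code"` client label, and the fallback of `cwd` to the project path are hypothetical choices, and the sample data is invented.

```python
def extract_registrations(claude_config, client="claude-code"):
    """Flatten mcpServers from the top level and from each project entry."""
    found = []

    def collect(servers, cwd=None):
        for name, cfg in (servers or {}).items():
            found.append({
                "client": client,
                "server_name": name,
                "command": cfg.get("command"),
                "args": cfg.get("args", []),
                # Assumption: fall back to the project path when no cwd is set.
                "cwd": cfg.get("cwd", cwd),
                # Names only; the values never leave this function.
                "env_keys": sorted((cfg.get("env") or {}).keys()),
            })

    collect(claude_config.get("mcpServers"))
    for project_path, project in (claude_config.get("projects") or {}).items():
        collect(project.get("mcpServers"), cwd=project_path)
    return found

SAMPLE = {
    "mcpServers": {
        "legal-ai": {"command": "python", "args": ["-m", "legal_mcp"],
                     "env": {"VOYAGE_API_KEY": "secret-value"}},
    },
    "projects": {
        "/home/user/proj": {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}},
    },
}
regs = extract_registrations(SAMPLE)
print(regs)
```

Dropping env values at the extraction step, rather than in the serializer, means a later change to the response format cannot leak them by accident.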
|
||||||
|
|
||||||
|
#### Coolify config: required volume mounts

Before the feature goes to production, make sure the following is configured in Coolify (UUID `gyjo0mtw2c42ej3xxvbz8zio`):

```yaml
volumes:
  - /home/chaim/.claude.json:/host/.claude.json:ro
  - /home/chaim/.paperclip:/host/.paperclip:ro
```

The implementation includes a script or operational instruction for adding the mounts (not part of the project code; it is a configuration change).
|
||||||
|
|
||||||
|
### 3.3 Frontend layer

#### Existing file: `web-ui/src/lib/api/settings.ts`

Extended with new hooks:

```ts
// New calls
export function useMcpEnv() { /* GET /api/settings/mcp/env */ }
export function useUpdateMcpEnv() { /* PATCH /api/settings/mcp/env/{key} */ }
export function useMcpRedeploy() { /* POST /api/settings/mcp/env/redeploy */ }
export function useMcpTools() { /* GET /api/settings/mcp/tools */ }
export function useMcpRegistrations() { /* GET /api/settings/mcp/registrations */ }
```

#### New component files under `web-ui/src/app/settings/_components/`

```
_components/
├── paperclip-tab.tsx        ← existing content moved from page.tsx
├── environment-tab.tsx      ← list of groups + EnvVarRow
├── env-var-row.tsx          ← a single env var row
├── env-var-editor.tsx       ← input controls by type
├── tools-tab.tsx            ← table + drawer
├── tool-detail-drawer.tsx   ← tool details
├── registrations-tab.tsx    ← cards by client
└── drift-badge.tsx          ← visual badge
```

`page.tsx` becomes responsible only for the Tabs and the wrapper.
|
||||||
|
|
||||||
|
#### חוויית עריכת env var
|
||||||
|
|
||||||
|
לחיצה על שורה → התרחבות (accordion) → הצגת editor + שני ערכים (Infisical / Container) + כפתור "שמור".
|
||||||
|
|
||||||
|
לחיצה על "שמור":
|
||||||
|
1. PATCH → toast הצלחה: "נשמר ב-Infisical. לחץ Redeploy כדי להחיל בקונטיינר."
|
||||||
|
2. השורה מסומנת כ-"pending redeploy" עד ה-redeploy הבא.
|
||||||
|
3. כפתור "Redeploy now" קבוע בתחתית הטאב, מודגש כשיש שינויים pending.
|
||||||
|
|
||||||
|
#### חוויית Tools
|
||||||
|
|
||||||
|
טבלה לפי module. שורה → drawer מימין עם schema + תיאור + מיקום בקוד.
|
||||||
|
|
||||||
|
#### חוויית Registrations
|
||||||
|
|
||||||
|
כרטיס לכל client (Claude Code, Paperclip) → פירוט הרישום: command/args/cwd/env_keys.
|
||||||
|
|
||||||
|
## 4. טיפול בשגיאות
|
||||||
|
|
||||||
|
| תרחיש | התנהגות |
|
||||||
|
|---|---|
|
||||||
|
| Infisical לא זמין | `errors: ["infisical_unreachable"]` ב-GET. ערך infisical = null. UI מציג `?` במקום הערך + tooltip |
|
||||||
|
| Coolify redeploy נכשל | toast עם פרטי השגיאה. ערך נשמר ב-Infisical, מסומן pending |
|
||||||
|
| volume mount חסר ב-Coolify | endpoint registrations מחזיר `{registrations: [], error: "host_path_unavailable"}`. UI מציג הודעה |
|
||||||
|
| ניסיון עריכה של secret | 400 עם הודעה ברורה |
|
||||||
|
| ערך לא חוקי לפי type | 400 עם הודעת ולידציה ספציפית |
|
||||||
|
| FastMCP introspection נכשלת | 500. לוג שגיאה. UI מציג fallback |
|
||||||
|
|
||||||
|
## 5. בטיחות
|
||||||
|
|
||||||
|
- **לא להציג ערכי secret** — ה-API מחזיר תמיד `****<last_4>` עבור secrets.
|
||||||
|
- **Drift detection לא חושף** — השוואה על hash, לא על ערך גולמי.
|
||||||
|
- **PATCH על secret חסום ב-server** — לא רק ב-UI.
|
||||||
|
- **No raw `os.environ` dump** — ה-endpoint מחזיר רק keys ב-catalog.
|
||||||
|
- **Audit log** — כל PATCH מתועד ל-`logger.info` (key + ערך אם לא-סודי).
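
The first two rules in this list (masked secrets, hash-based drift detection) can be sketched as below. The function names are hypothetical and SHA-256 is an assumption; the spec only says the comparison is done on a hash.

```python
import hashlib
import hmac


def mask_secret(value: str) -> str:
    """Show at most the last four characters, e.g. '****1234'."""
    return "****" + value[-4:] if len(value) > 4 else "****"


def fingerprint(value: str) -> str:
    """Hex digest used for comparison instead of the raw value."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def has_drift(infisical_value: str, container_value: str) -> bool:
    """True when the two sides differ; only digests are compared."""
    return not hmac.compare_digest(
        fingerprint(infisical_value), fingerprint(container_value)
    )
```

Values of four characters or fewer are masked entirely, so a short secret is never shown in full.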
|
||||||
|
|
||||||
|
## 6. שלבי מימוש (overview ל-plan)
|
||||||
|
|
||||||
|
1. Catalog + endpoint `GET /api/settings/mcp/env` (ללא עריכה).
|
||||||
|
2. UI טאב Environment — read-only עם drift badges.
|
||||||
|
3. PATCH endpoint + UI editor.
|
||||||
|
4. Redeploy endpoint + UI button.
|
||||||
|
5. Tools introspection + UI.
|
||||||
|
6. Volume mounts הוראה (manual Coolify config) + Registrations endpoint + UI.
|
||||||
|
7. בדיקות ידניות end-to-end.
|
||||||
|
|
||||||
|
## 7. שאלות פתוחות (להבהרה לפני plan)
|
||||||
|
|
||||||
|
- **סביבת Infisical** — `dev`? `nautilus`? להחליט סופית. ברירת מחדל ב-spec: `dev`. ייתכן ויהיה ניתן לקבוע ב-env var (`INFISICAL_ENV`).
|
||||||
|
- **Path ב-Infisical** — `/legal-ai`? `/legal-ai/mcp`? להחליט לפי `_GUIDELINES/SAVE_SECRET_RULES`.
|
||||||
|
- **Auth** — אין כרגע על `/settings`. להוסיף לפחות "are you sure" dialog לפני PATCH של ערך משמעותי?
|
||||||
|
|
||||||
|
## 8. בדיקות
|
||||||
|
|
||||||
|
**ידני (אין test suite ל-frontend):**
|
||||||
|
- ✓ פתיחת `/settings` — Paperclip tab עובד כקודם.
|
||||||
|
- ✓ Environment tab — מציג env vars מקבץ catalog בלבד.
|
||||||
|
- ✓ Drift detection — שינוי ידני של env בקונטיינר → drift badge מופיע.
|
||||||
|
- ✓ עריכת `MULTIMODAL_TEXT_WEIGHT` ל-`0.7` → נשמר ב-Infisical.
|
||||||
|
- ✓ Redeploy → ערך חדש נכנס לתוקף בקונטיינר.
|
||||||
|
- ✓ ניסיון עריכת `VOYAGE_API_KEY` → חסום + הודעה.
|
||||||
|
- ✓ Tools tab — מציג את כל ה-tools של legal_mcp.
|
||||||
|
- ✓ Registrations tab — מציג את `~/.claude.json` ו-Paperclip instances.
|
||||||
|
|
||||||
|
**Backend tests** ב-`web/tests/` (אם קיימים — אחרת לדלג):
|
||||||
|
- catalog rejects unknown key
|
||||||
|
- PATCH על secret נחסם
|
||||||
|
- ולידציה של min/max
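
The three backend checks listed above could be exercised against a validator along these lines. It is a sketch only: the `CatalogEntry` shape, its field names and the 0 to 1 range for `MULTIMODAL_TEXT_WEIGHT` are assumptions made for illustration, not the project's catalog.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CatalogEntry:
    key: str
    type: str  # "float", "int", "bool" or "str"
    secret: bool = False
    minimum: Optional[float] = None
    maximum: Optional[float] = None


def validate_update(catalog: dict[str, CatalogEntry], key: str, raw: str) -> str:
    """Return the accepted raw value or raise ValueError (to be mapped to HTTP 400)."""
    entry = catalog.get(key)
    if entry is None:
        raise ValueError(f"unknown key: {key}")
    if entry.secret:
        raise ValueError(f"{key} is a secret and cannot be edited here")
    if entry.type in ("float", "int"):
        # float()/int() raise ValueError themselves on non-numeric input.
        number = float(raw) if entry.type == "float" else int(raw)
        if entry.minimum is not None and number < entry.minimum:
            raise ValueError(f"{key} must be >= {entry.minimum}")
        if entry.maximum is not None and number > entry.maximum:
            raise ValueError(f"{key} must be <= {entry.maximum}")
    elif entry.type == "bool" and raw.lower() not in ("true", "false"):
        raise ValueError(f"{key} must be 'true' or 'false'")
    return raw


# Hypothetical catalog with one editable value and one secret.
CATALOG = {
    "MULTIMODAL_TEXT_WEIGHT": CatalogEntry("MULTIMODAL_TEXT_WEIGHT", "float", minimum=0.0, maximum=1.0),
    "VOYAGE_API_KEY": CatalogEntry("VOYAGE_API_KEY", "str", secret=True),
}
```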
|
||||||
409
docs/voyage-upgrades-plan.md
Normal file
@@ -0,0 +1,409 @@
|
|||||||
|
# שדרוגי Voyage — תכנית מפורטת
|
||||||
|
|
||||||
|
תכנית 3-שלבית לשדרוג שכבת ה-retrieval של עוזר משפטי. שלב A מבוצע
|
||||||
|
בתאריך התכנית; שלבים B ו-C ממתינים לשיחה החדשה.
|
||||||
|
|
||||||
|
**הקשר**: Voyage = חיפוש (find), Claude = הבנה+כתיבה (read+write). שני
|
||||||
|
המנועים מנותקים ארכיטקטונית — שינוי שכבת ה-retrieval לא משפיע על קלוד
|
||||||
|
עצמו, רק על איזה chunks מגיעים אליו לקריאה.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## שלב A — מעבר ל-voyage-3 (✅ מבוצע)
|
||||||
|
|
||||||
|
### למה voyage-3 ולא voyage-law-2?
|
||||||
|
|
||||||
|
Benchmark על 3 שאילתות עברית-משפטית עם passages אמיתיים מהקורפוס:
|
||||||
|
|
||||||
|
| מודל | Perfect orderings | Total Separation |
|
||||||
|
|---|---|---|
|
||||||
|
| **voyage-3** | **3/3** | **+0.483** |
|
||||||
|
| voyage-3.5 | 3/3 | +0.278 |
|
||||||
|
| voyage-law-2 *(היה)* | 3/3 | +0.238 |
|
||||||
|
| voyage-4 | 2/3 | +0.423 |
|
||||||
|
| voyage-4-large | 2/3 | +0.353 |
|
||||||
|
|
||||||
|
voyage-3 **מנצח כפול** — דירוג מושלם + מרווחים גדולים פי-2 מ-voyage-law-2.
|
||||||
|
מימד נשאר 1024 → אין שינוי schema.
|
||||||
|
|
||||||
|
### מה בוצע
|
||||||
|
|
||||||
|
1. **Coolify env**: `VOYAGE_MODEL=voyage-3` בקונטיינר
|
||||||
|
2. **Local env (`~/.env`)**: `VOYAGE_MODEL=voyage-3`
|
||||||
|
3. **Re-embed של 5 טבלאות** באמצעות `scripts/reembed_voyage.py`:
|
||||||
|
- `document_chunks` — מסמכי תיקים (~6K rows)
|
||||||
|
- `paragraph_embeddings` — קורפוס סגנון (כעת ריק)
|
||||||
|
- `case_law_embeddings` — stubs מצוטטים אוטו'
|
||||||
|
- `precedent_chunks` — פסיקה שהועלתה (~385)
|
||||||
|
- `halachot.embedding` — 400 הלכות (rule_statement + reasoning)
|
||||||
|
4. **MCP server restart** — טעינה מחדש של `embeddings.py` עם המודל החדש
|
||||||
|
|
||||||
|
### Verification
|
||||||
|
|
||||||
|
- `search_precedent_library` על "תכנית רחביה" → 403/17 holding ראשון
|
||||||
|
- `search_decisions` על "השבחה" → תוצאות עקביות
|
||||||
|
- ה-counts בטבלאות לא ירדו (כל row עודכן, לא נמחק)
|
||||||
|
|
||||||
|
### Rollback אם משהו נשבר
|
||||||
|
|
||||||
|
- `VOYAGE_MODEL=voyage-law-2` ב-Coolify + `~/.env`
|
||||||
|
- הרצה מחדש של `scripts/reembed_voyage.py` (חוזרים לקודם)
|
||||||
|
- 10 דקות סך-הכל
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## שלב B — voyage-rerank-2 (Cross-encoder reranking)
|
||||||
|
|
||||||
|
> **שינוי מהותי מהתכנית המקורית.** המקור היה ל-context-3. POC רחב
|
||||||
|
> (4 בנצ'מרקים) הראה ש-context-3 לא משפר עקבית, ובחלק מהמקרים מציג
|
||||||
|
> רגרסיה. במקום זאת, **rerank-2** (cross-encoder) הצליח לתת שיפור של
|
||||||
|
> +4.5% mean@3 על קורפוס מלא של 785 docs, **+11.6% על שאילתות
|
||||||
|
> מעשיות** (P-category — בדיוק התרחיש של legal-writer/legal-researcher),
|
||||||
|
> בלי שינוי schema, בלי re-embed, ובלי double storage.
|
||||||
|
|
||||||
|
### למה rerank-2 ולא context-3?
|
||||||
|
|
||||||
|
POC #4 (אהרון ברק, 18 שאילתות, claude-haiku-4-5 כ-judge):
|
||||||
|
|
||||||
|
| Retriever | mean@3 | mean@5 | MRR |
|
||||||
|
|---|---|---|---|
|
||||||
|
| voyage-3 (baseline) | 3.278 | 3.300 | 0.741 |
|
||||||
|
| **voyage-3 + rerank-2** | **3.574** | **3.467** | **0.769** |
|
||||||
|
| voyage-context-3 (windowed) | 3.481 | 3.378 | 0.685 |
|
||||||
|
|
||||||
|
POC #5 (קורפוס מלא 785 docs, 12 שאילתות):
|
||||||
|
|
||||||
|
| Retriever | mean@3 | קטגוריה P (practical) |
|
||||||
|
|---|---|---|
|
||||||
|
| voyage-3 | 4.306 | 3.78 |
|
||||||
|
| **voyage-3 + rerank-2** | **4.500 (+4.5%)** | **4.22 (+11.6%)** |
|
||||||
|
|
||||||
|
context-3 גם נכשל בקטגוריות keyword שהן 60%+ מהשאילתות בפועל אצל דפנה.
|
||||||
|
|
||||||
|
### איך rerank-2 עובד
|
||||||
|
|
||||||
|
Two-stage retrieval:
1. **Bi-encoder stage (as today)**: voyage-3 embeds the query and returns the
   top-50 chunks by cosine similarity on `pgvector` (fast, ~390ms).
2. **Cross-encoder stage (new)**: rerank-2 receives a `(query, document)` pair for
   each of the 50 documents and returns a more precise relevance score.
   The reranker sees the query and the document together through full attention,
   whereas the bi-encoder only computes the cosine between two independently produced embeddings.
3. Return: the top-K (10) after reranking.

**Latency cost**: +702ms (bi-encoder = 393ms → with rerank = 1095ms).

**Token cost**: zero for storage (computation is per query only).
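
The two stages can be sketched without any network calls. In the sketch below `rerank_fn` is a hypothetical stand-in for the Voyage rerank call, and the in-memory cosine scan stands in for the `pgvector` query; only the fetch-then-rerank shape is taken from the description above.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def two_stage_search(
    query_vec: Vector,
    doc_vecs: Sequence[Vector],
    rerank_fn: Callable[[list[int]], list[tuple[int, float]]],
    fetch_k: int = 50,
    top_k: int = 10,
) -> list[tuple[int, float]]:
    """Stage 1: cosine top-`fetch_k`. Stage 2: rerank those, keep top-`top_k`."""
    by_cosine = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
    candidates = by_cosine[:fetch_k]
    reranked = sorted(rerank_fn(candidates), key=lambda pair: pair[1], reverse=True)
    return reranked[:top_k]


# Toy data: the fake reranker prefers doc 1, and would prefer doc 2 even more.
doc_vecs = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
fake_scores = {0: 0.2, 1: 0.9, 2: 0.99}
result = two_stage_search(
    [1.0, 0.0], doc_vecs,
    rerank_fn=lambda idxs: [(i, fake_scores[i]) for i in idxs],
    fetch_k=2, top_k=1,
)
```

With `fetch_k=2`, doc 2 never reaches the reranker, so `result` holds doc 1. This is why the fetch depth bounds the recall of the second stage.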
|
||||||
|
|
||||||
|
### תכנית יישום
|
||||||
|
|
||||||
|
#### B.1 — `voyage_rerank()` ב-`embeddings.py`
|
||||||
|
|
||||||
|
```python
async def voyage_rerank(
    query: str, documents: list[str], top_k: int = 10,
) -> list[tuple[int, float]]:
    """Cross-encoder rerank via Voyage. Returns [(orig_index, score), ...]."""
    if not documents:
        return []
    client = _get_client()
    result = client.rerank(
        query=query, documents=documents,
        model=config.VOYAGE_RERANK_MODEL,  # "rerank-2"
        top_k=top_k,
    )
    return [(r.index, r.relevance_score) for r in result.results]
```
|
||||||
|
|
||||||
|
#### B.2 — Feature flag ב-`config.py`
|
||||||
|
|
||||||
|
```python
VOYAGE_RERANK_MODEL = os.environ.get("VOYAGE_RERANK_MODEL", "rerank-2")
VOYAGE_RERANK_ENABLED = (
    os.environ.get("VOYAGE_RERANK_ENABLED", "false").lower() == "true"
)
VOYAGE_RERANK_FETCH_K = int(os.environ.get("VOYAGE_RERANK_FETCH_K", "50"))
```
|
||||||
|
|
||||||
|
הdefault הוא `false` — הקוד יישמר אך לא יורץ עד שיופעל ידנית.
|
||||||
|
|
||||||
|
#### B.3 — אינטגרציה ב-3 search functions
|
||||||
|
|
||||||
|
In `db.py`:
- `search_similar` (document_chunks): add a `rerank: bool = False` parameter.
  When True, fetch the top `VOYAGE_RERANK_FETCH_K` candidates instead of `limit`,
  pass them through rerank, and return the top `limit`.
- `search_precedent_library_semantic`: same change. One nuance: today
  halachot receive a +0.05 boost. With rerank enabled the boost is dropped and
  rerank is applied to the merged set (chunks + halachot together), on the
  expectation that the cross-encoder ranks them correctly without an
  artificial boost.
- `search_similar_paragraphs` / `search_similar_case_law` (the style
  corpus): same change.

In `tools/search.py`, all the tools (`search_decisions`, `search_case_documents`,
`find_similar_cases`, `search_precedent_library`) pass
`rerank=config.VOYAGE_RERANK_ENABLED` to the DB calls.
|
||||||
|
|
||||||
|
#### B.4 — Schema
|
||||||
|
|
||||||
|
אין שינוי. אותם vectors, אותו pgvector.
|
||||||
|
|
||||||
|
#### B.5 — Rollout
|
||||||
|
|
||||||
|
1. Code change + push + deploy with the feature flag set to `false`
2. Verify that the baseline keeps working (no regression)
3. Enable manually: `VOYAGE_RERANK_ENABLED=true` in the Coolify env
4. Real queries from Daphna / the agents: observation
5. On regression: kill switch within seconds (back to `false`)
6. If everything holds up: make `true` the in-code default after one stable week
|
||||||
|
|
||||||
|
#### B.6 — Tier check
|
||||||
|
|
||||||
|
Voyage Tier 1: 2M TPM, 2000 RPM ל-rerank-2. עומס שלנו (~עשרות
|
||||||
|
queries בשעה במקרה רגיל) — מתחת ל-1% מהמכסה.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## שלב C — voyage-multimodal-3 (✅ בוצע 2026-05-03)
|
||||||
|
|
||||||
|
> **תיקון שם המודל מהתכנית המקורית**: השם הסופי הוא
|
||||||
|
> `voyage-multimodal-3` (לא 3.5). הוצמד לזה ש-POC #3 הריץ.
|
||||||
|
|
||||||
|
### מצב סופי בייצור
|
||||||
|
|
||||||
|
- `MULTIMODAL_ENABLED=true` ב-Coolify env
|
||||||
|
- Schema V9 ב-DB (document_image_embeddings + precedent_image_embeddings)
|
||||||
|
- 419 page-image embeddings על 8174-24 (146) + 8137-24 (273)
|
||||||
|
- 819 text chunks קיבלו page_number (100% retrofit)
|
||||||
|
- RRF hybrid merge עם boost text+image פעיל
|
||||||
|
|
||||||
|
### שינויים מהתכנית המקורית — שני תיקונים אמפיריים
|
||||||
|
|
||||||
|
1. **Score scaling — Reciprocal Rank Fusion במקום weighted sum.**
|
||||||
|
ה-cosine של voyage-3 (~0.4-0.5) שיטתית גבוה מ-voyage-multimodal-3
|
||||||
|
(~0.20-0.25). A/B ראשון על 7 שאילתות הראה: עם 0.65/0.35 weighted
|
||||||
|
sum ו-MULTIMODAL_ENABLED=true, **0** image rows הופיעו ב-top-5,
|
||||||
|
image side פשוט הוצף. עברנו ל-RRF (`rrf_score = w / (k + rank)`)
|
||||||
|
שעמיד לסקיילים שונים. תוצאה: 5/5 results עם image contribution
|
||||||
|
בכל שאילתה.
|
||||||
|
|
||||||
|
2. **Page tracking — chunker חדש + retrofit ל-819 chunks קיימים.**
|
||||||
|
ה-chunker הישן זרק את ה-page_number של chunks. בלעדיו ה-boost
|
||||||
|
text+image (join על `(document_id, page_number)`) לא יכול לפעול.
|
||||||
|
נוסף `page_offsets` ל-`extractor.extract_text` (משלשה במקום זוג —
|
||||||
|
מעודכן ב-6 callers); chunker מקבל אותו ומסמן page לכל chunk לפי
|
||||||
|
offset של התווים הראשונים שלו. retrofit ל-chunks קיימים
|
||||||
|
(`scripts/backfill_chunk_pages.py`) עובד **בלי re-OCR** —
|
||||||
|
משתמש ב-stored extracted_text כמקור (matches existing chunk
|
||||||
|
content verbatim) ו-PyMuPDF direct text reads כעיגוני page
|
||||||
|
boundaries; pages סרוקים ללא טקסט ישיר עוברים אינטרפולציה.
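
The page-tracking rule in item 2 (a chunk takes the page of its first characters) amounts to a binary search over page start offsets. A minimal sketch, assuming `page_offsets[i]` holds the character offset at which page `i + 1` begins and that the list is sorted and starts at 0; the actual structure returned by `extractor.extract_text` may differ.

```python
from bisect import bisect_right


def page_for_offset(page_offsets: list[int], char_offset: int) -> int:
    """Return the 1-based page containing `char_offset`."""
    if char_offset < 0:
        raise ValueError("char_offset must be non-negative")
    # bisect_right counts the pages whose start is <= char_offset.
    return max(1, bisect_right(page_offsets, char_offset))


# Hypothetical offsets: page 1 starts at 0, page 2 at 1200, page 3 at 2500.
page_offsets = [0, 1200, 2500]
first_chunk_page = page_for_offset(page_offsets, 0)
boundary_chunk_page = page_for_offset(page_offsets, 1200)
```

A chunk that starts exactly on a page boundary is assigned to the page that begins there.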
|
||||||
|
|
||||||
|
### למה NOT לעשות re-OCR ב-retrofit
|
||||||
|
|
||||||
|
ניסיון ראשון השתמש ב-`extractor.extract_text` להפיק page_offsets
|
||||||
|
חדשים. תוצאה: 1/29 chunks נמצאו (28 not found), כי OCR של Google
|
||||||
|
Vision לא דטרמיניסטי — ה-OCR החדש שונה מה-OCR שהפיק את ה-chunks
|
||||||
|
המקוריים. הגרסה החדשה משתמשת ב-stored `documents.extracted_text`
|
||||||
|
שמתאים לחלוטין לתוכן ה-chunks. עלות: $0 (לעומת ~$0.0015/page).
|
||||||
|
|
||||||
|
### Files שהשתנו (יחסית למה שהמסמך הזה תיכנן)
|
||||||
|
|
||||||
|
קוד שנכתב/שונה (5 commits, 242f668 → 8a815ec):
|
||||||
|
- `mcp-server/src/legal_mcp/config.py` — flags MULTIMODAL_*
|
||||||
|
- `mcp-server/src/legal_mcp/services/extractor.py` — render + page_offsets
|
||||||
|
- `mcp-server/src/legal_mcp/services/embeddings.py` — embed_images
|
||||||
|
- `mcp-server/src/legal_mcp/services/db.py` — schema V9 + 4 store/search funcs
|
||||||
|
- `mcp-server/src/legal_mcp/services/chunker.py` — page tracking
|
||||||
|
- `mcp-server/src/legal_mcp/services/processor.py` — ingest integration
|
||||||
|
- `mcp-server/src/legal_mcp/services/precedent_library.py` — same
|
||||||
|
- `mcp-server/src/legal_mcp/services/hybrid_search.py` — חדש, RRF orchestrator
|
||||||
|
- `mcp-server/src/legal_mcp/tools/search.py` — wired to hybrid
|
||||||
|
- `mcp-server/src/legal_mcp/tools/documents.py` + `tools/workflow.py` + `web/app.py` — extract_text triple unpack
|
||||||
|
- `scripts/multimodal_backfill.py` + `scripts/backfill_chunk_pages.py` — חדשים
|
||||||
|
|
||||||
|
### מה נשאר (deferred)
|
||||||
|
|
||||||
|
- UI thumbnails בתוצאות חיפוש (לא חוסם — דפנה מקבלת page numbers)
|
||||||
|
- Backfill על שאר הקורפוס (מעבר ל-2 התיקים): לא דחוף, אפשר per-case
|
||||||
|
- `text_weight` תיאום: כרגע 0.5 (vanilla RRF). אם דפנה תגיד שהיא רואה
|
||||||
|
יותר מדי image-influence, מעלים ל-0.55-0.6 דרך env בלי deploy.
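
To make the effect of `text_weight` concrete, here is a sketch of weighted Reciprocal Rank Fusion using the `w / (k + rank)` form quoted earlier. It assumes the image side receives `1 - text_weight` and that ranks are 1-based; the function name is hypothetical and the production `hybrid_search.py` may differ, for example in how the text+image page boost is applied.

```python
from collections import defaultdict
from typing import Hashable, Sequence


def weighted_rrf(
    text_ranked: Sequence[Hashable],
    image_ranked: Sequence[Hashable],
    text_weight: float = 0.5,
    k: int = 60,
) -> list[tuple[Hashable, float]]:
    """Fuse two best-first ID lists using w / (k + rank), with 1-based ranks."""
    scores: dict[Hashable, float] = defaultdict(float)
    sides = ((text_weight, text_ranked), (1.0 - text_weight, image_ranked))
    for weight, ranked in sides:
        for rank, item in enumerate(ranked, start=1):
            scores[item] += weight / (k + rank)
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)


text_hits = ["a", "b", "c"]
image_hits = ["c", "d"]
balanced = weighted_rrf(text_hits, image_hits)                   # text_weight = 0.5
text_only = weighted_rrf(text_hits, image_hits, text_weight=1.0)
```

With balanced weights, item `c` comes first because it appears in both lists; with `text_weight=1.0` the image list contributes nothing and `a` leads. Because only ranks are used, the different cosine scales of the two models do not matter.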
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## שלב C המקורי (תכנון, לרפרנס)
|
||||||
|
|
||||||
|
### הבעיה שהוא פותר
|
||||||
|
|
||||||
|
תיקים סרוקים ודוחות שמאי מאבדים מידע ב-OCR:
|
||||||
|
- ✗ פריסת טבלאות (שורות נתונים מתבלגנות)
|
||||||
|
- ✗ חתימות וחותמות
|
||||||
|
- ✗ דיאגרמות, מפות, תרשימים אדריכליים
|
||||||
|
- ✗ נוסחאות מתמטיות
|
||||||
|
|
||||||
|
OCR קיים (Google Cloud Vision) ממיר תמונות לטקסט אבל מטפל בעמוד כשורה-
|
||||||
|
אחר-שורה. תוצאה: בדוח שמאי "שווי לפני | שווי אחרי | ≈ 1.5M ש"ח" הופך
|
||||||
|
ל-"שווי לפני שווי אחרי 1.5M ש"ח" — חיפוש "שומה ל-1.5M" לא תמיד מוצא.
|
||||||
|
|
||||||
|
### מה voyage-multimodal-3.5 עושה
|
||||||
|
|
||||||
|
API: `client.multimodal_embed(inputs=[[image, text?], ...])`. מקבל
|
||||||
|
תמונה (PIL Image או URL) ומחזיר embedding שכולל:
|
||||||
|
- את הטקסט שעל העמוד
|
||||||
|
- את **המבנה הוויזואלי** (טבלה, חתימה, מיקומי גוש)
|
||||||
|
- תרשימים ודיאגרמות
|
||||||
|
|
||||||
|
Searchable יחד עם text embeddings — query טקסטואלית רגילה מוצאת גם
|
||||||
|
פסקאות עם טבלה רלוונטית.
|
||||||
|
|
||||||
|
### תכנית יישום
|
||||||
|
|
||||||
|
#### C.1 — Schema חדש
|
||||||
|
|
||||||
|
```sql
|
||||||
|
CREATE TABLE document_image_embeddings (
|
||||||
|
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
|
||||||
|
document_id UUID REFERENCES documents(id) ON DELETE CASCADE,
|
||||||
|
page_number INTEGER NOT NULL,
|
||||||
|
image_thumbnail_path TEXT, -- לסרגל תוצאות חיפוש
|
||||||
|
embedding vector(1024),
|
||||||
|
created_at TIMESTAMPTZ DEFAULT now()
|
||||||
|
);
|
||||||
|
CREATE INDEX idx_doc_img_emb_vec
|
||||||
|
ON document_image_embeddings USING ivfflat (embedding vector_cosine_ops);
|
||||||
|
|
||||||
|
CREATE TABLE precedent_image_embeddings (
|
||||||
|
id UUID PRIMARY KEY DEFAULT uuid_generate_v4(),
|
||||||
|
case_law_id UUID REFERENCES case_law(id) ON DELETE CASCADE,
|
||||||
|
page_number INTEGER NOT NULL,
|
||||||
|
image_thumbnail_path TEXT,
|
||||||
|
embedding vector(1024),
|
||||||
|
created_at TIMESTAMPTZ DEFAULT now()
|
||||||
|
);
|
||||||
|
CREATE INDEX idx_prec_img_emb_vec
|
||||||
|
ON precedent_image_embeddings USING ivfflat (embedding vector_cosine_ops);
|
||||||
|
```
|
||||||
|
|
||||||
|
#### C.2 — Pipeline שינוי
|
||||||
|
|
||||||
|
חדש ב-`extractor.py`:
|
||||||
|
```python
import fitz  # PyMuPDF


async def render_pages_as_images(pdf_path: str) -> list[bytes]:
    """PyMuPDF render of each page → PNG bytes for multimodal embedding."""
    images: list[bytes] = []
    # The context manager closes the document even if rendering fails.
    with fitz.open(pdf_path) as doc:
        for page in doc:
            pix = page.get_pixmap(dpi=144)  # 144 dpi: decent resolution for embeddings
            images.append(pix.tobytes("png"))
    return images
```
|
||||||
|
|
||||||
|
חדש ב-`embeddings.py`:
|
||||||
|
```python
async def embed_images(images: list[bytes], input_type: str = "document") -> list[list[float]]:
    """Embed page images via voyage-multimodal-3.5."""
    from PIL import Image
    import io
    pil_images = [Image.open(io.BytesIO(img)) for img in images]
    response = _get_client().multimodal_embed(
        inputs=[[img] for img in pil_images],
        model="voyage-multimodal-3.5",
        input_type=input_type,
    )
    return response.embeddings
```
|
||||||
|
|
||||||
|
#### C.3 — Integration ב-ingest pipelines
|
||||||
|
|
||||||
|
`processor.py:process_document` (תיק):
|
||||||
|
```python
|
||||||
|
# אחרי extract+chunk+embed הטקסטואלי:
|
||||||
|
images = await extractor.render_pages_as_images(file_path)
|
||||||
|
img_embs = await embeddings.embed_images(images)
|
||||||
|
await db.store_document_image_embeddings(document_id, img_embs, thumbnails)
|
||||||
|
```
|
||||||
|
|
||||||
|
`precedent_library.py:ingest_precedent`: אותו pattern, על
|
||||||
|
`precedent_image_embeddings`.
|
||||||
|
|
||||||
|
#### C.4 — Hybrid search
|
||||||
|
|
||||||
|
חדש ב-`db.py:search_precedent_library_hybrid`:
|
||||||
|
```python
async def search_precedent_library_hybrid(query, limit=10):
    query_emb = await embeddings.embed_query(query)
    query_img_emb = await embeddings.embed_query_for_multimodal(query)

    text_results = ...   # cosine on precedent_chunks (top 30)
    image_results = ...  # cosine on precedent_image_embeddings (top 30)

    # Merge: weighted score (text 0.6, image 0.4; tunable)
    merged = {}
    for r in text_results:
        merged[r.case_law_id] = r.score * 0.6
    for r in image_results:
        merged[r.case_law_id] = merged.get(r.case_law_id, 0) + r.score * 0.4

    return sorted(merged.items(), key=lambda x: -x[1])[:limit]
```
|
||||||
|
|
||||||
|
#### C.5 — UI: thumbnails בתוצאות חיפוש
|
||||||
|
|
||||||
|
ב-`/precedents` חיפוש סמנטי, התוצאות עם רכיב image יציגו thumbnail
|
||||||
|
קטן של העמוד. לחיצה תפתח את ה-PDF במקום הרלוונטי.
|
||||||
|
|
||||||
|
#### C.6 — סדר עדיפויות לדיגום
|
||||||
|
|
||||||
|
1. **דוחות שמאי** — הזכייה הגדולה (טבלאות = ערכים מספריים שכרגע
|
||||||
|
הולכים לאיבוד ב-OCR)
|
||||||
|
2. **תיקים סרוקים ישנים** — שיפור ה-recall של חיפוש
|
||||||
|
3. **פסיקה עם דיאגרמות** (תרשימי גוש/חלקה) — minor
|
||||||
|
|
||||||
|
#### C.7 — עלות + tier
|
||||||
|
|
||||||
|
voyage-multimodal-3.5 is a separate product. For documents embedded page by page:
- Average case: 50-200 pages
- 100 cases = 5,000-20,000 pages
- Free tier: 200M tokens/month, but multimodal usage is metered differently
  (an image consumes roughly 1000-2000 tokens per page)

Estimate: 100 cases × 100 pages × 1500 tokens = 15M tokens, comfortably
within the free tier. The actual usage cap still needs to be checked in the Voyage docs.
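
The estimate can be re-run with other per-page figures. The sketch below uses the plan's 1,500 tokens/page and, for comparison, the roughly 3.5K tokens/page at 144 DPI noted in the `config.py` comment elsewhere in this change. The 200M free-tier figure is the one quoted in this plan and should be confirmed against the Voyage docs.

```python
def estimated_tokens(cases: int, pages_per_case: int, tokens_per_page: int) -> int:
    """Total tokens needed to embed every page once."""
    return cases * pages_per_case * tokens_per_page


FREE_TIER_TOKENS = 200_000_000  # figure quoted in this plan, not verified here

plan_estimate = estimated_tokens(100, 100, 1_500)
poc_estimate = estimated_tokens(100, 100, 3_500)
```

Under either per-page figure, 100 cases of 100 pages stay below the quoted free-tier allowance.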
|
||||||
|
|
||||||
|
#### C.8 — שלבים מומלצים
|
||||||
|
|
||||||
|
1. **POC** — תיק אחד עם דו"ח שמאי. embed → search → השוואה לתוצאות
|
||||||
|
טקסט-בלבד.
|
||||||
|
2. **A/B test** — חצי מהתיקים החדשים עם multimodal, חצי בלי. 4
|
||||||
|
שבועות בדיקה — האם דפנה מוצאת תוצאות מדויקות יותר?
|
||||||
|
3. **Rollout** — אם המבחן חיובי, לעבד את הקורפוס הקיים ברקע
|
||||||
|
|
||||||
|
### החלטות שנשארו פתוחות
|
||||||
|
|
||||||
|
- ✋ DPI לרינדור: 144 (סביר), 200 (איכות), 96 (מהיר)?
|
||||||
|
- ✋ נשמור thumbnails ב-disk או רק את ה-embeddings?
|
||||||
|
- ✋ משקלות hybrid search: 0.6/0.4 או יותר נטוי לטקסט?
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## רצף עבודה בשיחה החדשה
|
||||||
|
|
||||||
|
> 1. Open `docs/voyage-upgrades-plan.md` (this document)
> 2. If A succeeded (verify in the Coolify env), continue to B (context-3)
> 3. **B.5 first**: benchmark before a large re-embed
> 4. If B succeeds, move on to C, but in three careful steps (POC → A/B → rollout)
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## נספח: רשימה של קבצים שנגעו ב-Voyage היום
|
||||||
|
|
||||||
|
קוד שנכתב/שונה:
|
||||||
|
- `scripts/reembed_voyage.py` — חדש, סקריפט re-embed
|
||||||
|
- `~/.env` — `VOYAGE_MODEL=voyage-3`
|
||||||
|
- Coolify env (legal-ai app) — `VOYAGE_MODEL=voyage-3`
|
||||||
|
|
||||||
|
Files that need no change (CONFIRM):
- `mcp-server/src/legal_mcp/services/embeddings.py`: reads `config.VOYAGE_MODEL`
- `mcp-server/src/legal_mcp/config.py`: defaults to voyage-law-2, but the env
  in Coolify and locally takes precedence
- All agents (legal-writer, etc.): do not call Voyage directly
|
||||||
|
|
||||||
|
עבור B + C: השינויים במסמך הזה (לא מבוצעים עדיין).
|
||||||
@@ -20,6 +20,7 @@ dependencies = [
|
|||||||
"fastapi>=0.115.0",
|
"fastapi>=0.115.0",
|
||||||
"uvicorn[standard]>=0.30.0",
|
"uvicorn[standard]>=0.30.0",
|
||||||
"httpx>=0.27.0",
|
"httpx>=0.27.0",
|
||||||
|
"infisicalsdk>=1.0.0",
|
||||||
]
|
]
|
||||||
|
|
||||||
[build-system]
|
[build-system]
|
||||||
|
|||||||
@@ -47,6 +47,57 @@ VOYAGE_API_KEY = os.environ.get("VOYAGE_API_KEY", "")
|
|||||||
VOYAGE_MODEL = os.environ.get("VOYAGE_MODEL", "voyage-law-2")
|
VOYAGE_MODEL = os.environ.get("VOYAGE_MODEL", "voyage-law-2")
|
||||||
VOYAGE_DIMENSIONS = 1024
|
VOYAGE_DIMENSIONS = 1024
|
||||||
|
|
||||||
|
# Rerank — cross-encoder second-stage. Off by default; flip with env to
|
||||||
|
# enable across all semantic search tools (search_decisions,
|
||||||
|
# search_case_documents, find_similar_cases, search_precedent_library).
|
||||||
|
VOYAGE_RERANK_MODEL = os.environ.get("VOYAGE_RERANK_MODEL", "rerank-2")
|
||||||
|
VOYAGE_RERANK_ENABLED = (
|
||||||
|
os.environ.get("VOYAGE_RERANK_ENABLED", "false").lower() == "true"
|
||||||
|
)
|
||||||
|
# How many candidates to fetch from bi-encoder before reranking.
|
||||||
|
# 50 was the depth used in the POC; balances recall vs rerank cost.
|
||||||
|
VOYAGE_RERANK_FETCH_K = int(os.environ.get("VOYAGE_RERANK_FETCH_K", "50"))
|
||||||
|
|
||||||
|
# Multimodal — page-image embeddings via voyage-multimodal-3. Off by
|
||||||
|
# default; flip with env to enable per-page image embedding during
|
||||||
|
# ingestion + hybrid (text+image) ranking at search time. POC #3
|
||||||
|
# validated on an 89-page appraisal PDF (38s, 312K tokens, recovered
|
||||||
|
# table structure + image-only scanned pages that text-OCR misses).
|
||||||
|
MULTIMODAL_ENABLED = (
|
||||||
|
os.environ.get("MULTIMODAL_ENABLED", "false").lower() == "true"
|
||||||
|
)
|
||||||
|
MULTIMODAL_MODEL = os.environ.get("MULTIMODAL_MODEL", "voyage-multimodal-3")
|
||||||
|
# Render DPI for the image fed to the embedder. POC used 144 — sweet
|
||||||
|
# spot between embedding quality and tokens/page (144 ≈ 3.5K tok/page).
|
||||||
|
MULTIMODAL_DPI = int(os.environ.get("MULTIMODAL_DPI", "144"))
|
||||||
|
# Separate, lower DPI for the JPEG thumbnail saved to disk for UI
|
||||||
|
# preview. ~96dpi → ~20KB/page; ingestion-time, no re-render at view.
|
||||||
|
MULTIMODAL_THUMB_DPI = int(os.environ.get("MULTIMODAL_THUMB_DPI", "96"))
|
||||||
|
# Hybrid merge: Reciprocal Rank Fusion (RRF) bias for the *text* side.
|
||||||
|
# voyage-3 cosine scores (~0.4-0.5) and voyage-multimodal-3 scores
|
||||||
|
# (~0.20-0.25) live on different scales; a direct weighted sum lets
|
||||||
|
# text always dominate. RRF is rank-based and robust to that. The
|
||||||
|
# weight here biases the contribution of each side: 0.5 = balanced
|
||||||
|
# (vanilla RRF), >0.5 favours text, <0.5 favours image. Tunable per
|
||||||
|
# env without redeploy.
|
||||||
|
MULTIMODAL_TEXT_WEIGHT = float(
|
||||||
|
os.environ.get("MULTIMODAL_TEXT_WEIGHT", "0.5")
|
||||||
|
)
|
||||||
|
# RRF damping constant. Standard literature value is 60: lower values
|
||||||
|
# concentrate weight at top ranks; higher values flatten the curve.
|
||||||
|
MULTIMODAL_RRF_K = int(os.environ.get("MULTIMODAL_RRF_K", "60"))
|
||||||
|
|
||||||
|
# Halacha extraction — auto-approve threshold. Halachot with extractor
|
||||||
|
# confidence >= this value are inserted with review_status='approved'
|
||||||
|
# instead of 'pending_review' (so they immediately appear in
|
||||||
|
# search_precedent_library). Set to a value > 1.0 to disable auto-approval.
|
||||||
|
# 0.80 baseline: 89% of historical extractions land here, manual spot-check
|
||||||
|
# of 10 random samples confirmed quality. Tunable via env if drift is
|
||||||
|
# observed (e.g. raise to 0.90 if false-positives appear).
|
||||||
|
HALACHA_AUTO_APPROVE_THRESHOLD = float(
|
||||||
|
os.environ.get("HALACHA_AUTO_APPROVE_THRESHOLD", "0.80")
|
||||||
|
)
|
||||||
|
|
||||||
# Google Cloud Vision (OCR for scanned PDFs)
|
# Google Cloud Vision (OCR for scanned PDFs)
|
||||||
GOOGLE_CLOUD_VISION_API_KEY = os.environ.get("GOOGLE_CLOUD_VISION_API_KEY", "")
|
GOOGLE_CLOUD_VISION_API_KEY = os.environ.get("GOOGLE_CLOUD_VISION_API_KEY", "")
|
||||||
|
|
||||||
|
|||||||
@@ -23,12 +23,17 @@ logger = logging.getLogger("legal_mcp")
|
|||||||
|
|
||||||
@asynccontextmanager
|
@asynccontextmanager
|
||||||
async def lifespan(server: FastMCP) -> AsyncIterator[None]:
|
async def lifespan(server: FastMCP) -> AsyncIterator[None]:
|
||||||
"""Initialize DB schema on startup, close pool on shutdown."""
|
"""Server startup is now non-blocking.
|
||||||
from legal_mcp.services.db import close_pool, init_schema
|
|
||||||
|
|
||||||
logger.info("Initializing database schema...")
|
Schema init was moved out of the lifespan to fix a race where Claude Code
|
||||||
await init_schema()
|
would call a tool before `tools/list` had been answered — manifesting as
|
||||||
logger.info("Ezer Mishpati MCP server ready")
|
"No such tool available". Lifespan now returns immediately so the MCP
|
||||||
|
handshake completes in milliseconds; the schema is initialized lazily on
|
||||||
|
the first DB access via services/db.get_pool().
|
||||||
|
"""
|
||||||
|
from legal_mcp.services.db import close_pool
|
||||||
|
|
||||||
|
logger.info("Ezer Mishpati MCP server ready (schema init deferred)")
|
||||||
try:
|
try:
|
||||||
yield
|
yield
|
||||||
finally:
|
finally:
|
||||||
@@ -47,6 +52,7 @@ mcp = FastMCP(
|
|||||||
|
|
||||||
from legal_mcp.tools import ( # noqa: E402
|
from legal_mcp.tools import ( # noqa: E402
|
||||||
cases, documents, search, drafting, workflow, precedents,
|
cases, documents, search, drafting, workflow, precedents,
|
||||||
|
precedent_library as plib,
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|
||||||
@@ -142,10 +148,126 @@ async def precedent_remove(precedent_id: str) -> str:
|
|||||||
async def precedent_search_library(
|
async def precedent_search_library(
|
||||||
query: str, practice_area: str = "", limit: int = 10,
|
query: str, practice_area: str = "", limit: int = 10,
|
||||||
) -> str:
|
) -> str:
|
||||||
"""חיפוש בספרייה הרוחבית של ציטוטים שנצברו בין תיקים."""
|
"""חיפוש בציטוטים שדפנה צירפה ידנית לתיקים בעבר (case_precedents).
|
||||||
|
שונה מ-search_precedent_library שמחפש בקורפוס הפסיקה הסמכותית."""
|
||||||
return await precedents.precedent_search_library(query, practice_area, limit)
|
return await precedents.precedent_search_library(query, practice_area, limit)
|
||||||
|
|
||||||
|
|
||||||
|
# ── External Precedent Library — authoritative case-law corpus ─────
|
||||||
|
# Distinct from precedent_search_library above (chair-attached quotes)
|
||||||
|
# and from search_decisions (Daphna's style corpus).
|
||||||
|
|
||||||
|
|
||||||
|
@mcp.tool()
|
||||||
|
async def precedent_library_upload(
|
||||||
|
file_path: str,
|
||||||
|
citation: str,
|
||||||
|
case_name: str = "",
|
||||||
|
court: str = "",
|
||||||
|
decision_date: str = "",
|
||||||
|
source_type: str = "",
|
||||||
|
precedent_level: str = "",
|
||||||
|
practice_area: str = "",
|
||||||
|
appeal_subtype: str = "",
|
||||||
|
subject_tags: list[str] | None = None,
|
||||||
|
is_binding: bool = True,
|
||||||
|
headnote: str = "",
|
||||||
|
summary: str = "",
|
||||||
|
) -> str:
|
||||||
|
"""העלאת פסיקה חיצונית (פס"ד / החלטה של ועדה אחרת) לקורפוס הסמכותי. מחלץ הלכות אוטומטית — כולן ממתינות לאישור היו"ר. practice_area: rishuy_uvniya / betterment_levy / compensation_197."""
|
||||||
|
return await plib.precedent_library_upload(
|
||||||
|
file_path, citation, case_name, court, decision_date,
|
||||||
|
source_type, precedent_level, practice_area, appeal_subtype,
|
||||||
|
subject_tags, is_binding, headnote, summary,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@mcp.tool()
|
||||||
|
async def precedent_library_list(
|
||||||
|
practice_area: str = "",
|
||||||
|
court: str = "",
|
||||||
|
precedent_level: str = "",
|
||||||
|
source_type: str = "",
|
||||||
|
search: str = "",
|
||||||
|
limit: int = 100,
|
||||||
|
) -> str:
|
||||||
|
"""רשימת הפסיקה בקורפוס הסמכותי, עם פילטרים."""
|
||||||
|
return await plib.precedent_library_list(
|
||||||
|
practice_area, court, precedent_level, source_type, search, limit,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@mcp.tool()
|
||||||
|
async def precedent_library_get(case_law_id: str) -> str:
|
||||||
|
"""פסיקה ספציפית בקורפוס + רשימת ההלכות שחולצו ממנה (כולל ממתינות לאישור)."""
|
||||||
|
return await plib.precedent_library_get(case_law_id)
|
||||||
|
|
||||||
|
|
||||||
|
@mcp.tool()
|
||||||
|
async def precedent_library_delete(case_law_id: str) -> str:
|
||||||
|
"""מחיקת פסיקה מהקורפוס (cascade: chunks + halachot)."""
|
||||||
|
return await plib.precedent_library_delete(case_law_id)
|
||||||
|
|
||||||
|
|
||||||
|
@mcp.tool()
|
||||||
|
async def precedent_extract_halachot(case_law_id: str) -> str:
|
||||||
|
"""הרצה מחדש של חילוץ הלכות לפסיקה קיימת. ההלכות הקיימות נמחקות, החדשות חוזרות לסטטוס pending_review."""
|
||||||
|
return await plib.precedent_extract_halachot(case_law_id)
|
||||||
|
|
||||||
|
|
||||||
|
@mcp.tool()
|
||||||
|
async def precedent_extract_metadata(case_law_id: str) -> str:
|
||||||
|
"""Extract metadata (short case_name, summary, headnote, key_quote, subject_tags, appeal_subtype, date, level, court, source_type) from the text. Fills only empty fields."""
|
||||||
|
return await plib.precedent_extract_metadata(case_law_id)
|
||||||
|
|
||||||
|
|
||||||
|
@mcp.tool()
|
||||||
|
async def precedent_process_pending(kind: str = "metadata", limit: int = 20) -> str:
|
||||||
|
"""Drain the queue of extraction requests submitted from the UI. kind: 'metadata' or 'halacha'. Runs the extractor locally via the CLI on each queued item and clears the flag after success."""
|
||||||
|
return await plib.precedent_process_pending(kind, limit)
|
||||||
|
|
||||||
|
|
||||||
|
@mcp.tool()
|
||||||
|
async def search_precedent_library(
|
||||||
|
query: str,
|
||||||
|
practice_area: str = "",
|
||||||
|
court: str = "",
|
||||||
|
precedent_level: str = "",
|
||||||
|
appeal_subtype: str = "",
|
||||||
|
subject_tag: str = "",
|
||||||
|
limit: int = 10,
|
||||||
|
include_halachot: bool = True,
|
||||||
|
) -> str:
|
||||||
|
"""Semantic search over the authoritative case-law corpus. Returns halachot (approved only) plus text passages. Use when legal-writer needs to cite binding case law in block י (CREAC: rule + explanation)."""
|
||||||
|
return await plib.search_precedent_library(
|
||||||
|
query, practice_area, court, precedent_level, appeal_subtype,
|
||||||
|
None, subject_tag, limit, include_halachot,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@mcp.tool()
|
||||||
|
async def halacha_review(
|
||||||
|
halacha_id: str,
|
||||||
|
status: str,
|
||||||
|
reviewer: str = "דפנה",
|
||||||
|
rule_statement: str = "",
|
||||||
|
reasoning_summary: str = "",
|
||||||
|
subject_tags: list[str] | None = None,
|
||||||
|
practice_areas: list[str] | None = None,
|
||||||
|
) -> str:
|
||||||
|
"""Approve / reject / edit an automatically extracted halacha. status: pending_review / approved / rejected / published."""
|
||||||
|
return await plib.halacha_review(
|
||||||
|
halacha_id, status, reviewer, rule_statement, reasoning_summary,
|
||||||
|
subject_tags, practice_areas,
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
@mcp.tool()
|
||||||
|
async def halachot_pending(limit: int = 100) -> str:
|
||||||
|
"""Queue of halachot pending approval."""
|
||||||
|
return await plib.halachot_pending(limit)
|
||||||
|
|
||||||
|
|
||||||
# Documents
|
# Documents
|
||||||
@mcp.tool()
|
@mcp.tool()
|
||||||
async def document_upload(
|
async def document_upload(
|
||||||
|
|||||||
@@ -103,7 +103,7 @@ async def extract_facts_from_document(
|
|||||||
f"שמאי: {appraiser_name}{chunk_label}\n\n"
|
f"שמאי: {appraiser_name}{chunk_label}\n\n"
|
||||||
f"--- תחילת שומה ---\n{chunk}\n--- סוף שומה ---"
|
f"--- תחילת שומה ---\n{chunk}\n--- סוף שומה ---"
|
||||||
)
|
)
|
||||||
result = claude_session.query_json(prompt, timeout=180)
|
result = await claude_session.query_json(prompt)
|
||||||
if not isinstance(result, list):
|
if not isinstance(result, list):
|
||||||
logger.warning(
|
logger.warning(
|
||||||
"extract_facts_from_document: chunk %d returned non-list (%s) for doc=%s",
|
"extract_facts_from_document: chunk %d returned non-list (%s) for doc=%s",
|
||||||
|
|||||||
@@ -380,7 +380,7 @@ async def write_block(
|
|||||||
# Call Claude via Claude Code session (no API)
|
# Call Claude via Claude Code session (no API)
|
||||||
model_key = block_cfg["model"]
|
model_key = block_cfg["model"]
|
||||||
timeout = claude_session.LONG_TIMEOUT if model_key == "opus" else claude_session.DEFAULT_TIMEOUT
|
timeout = claude_session.LONG_TIMEOUT if model_key == "opus" else claude_session.DEFAULT_TIMEOUT
|
||||||
content = claude_session.query(prompt, timeout=timeout)
|
content = await claude_session.query(prompt, timeout=timeout)
|
||||||
|
|
||||||
return _build_result(block_id, content, block_cfg)
|
return _build_result(block_id, content, block_cfg)
|
||||||
|
|
||||||
|
|||||||
@@ -134,14 +134,14 @@ async def generate_directions(
|
|||||||
{doc_context or '(אין מסמכים בתיק)'}
|
{doc_context or '(אין מסמכים בתיק)'}
|
||||||
"""
|
"""
|
||||||
|
|
||||||
result = claude_session.query_json(user_content, timeout=120)
|
result = await claude_session.query_json(user_content)
|
||||||
if result is None:
|
if result is None:
|
||||||
logger.warning("Failed to parse brainstorm response: %s", raw[:300])
|
logger.warning("Failed to parse brainstorm response")
|
||||||
return {
|
return {
|
||||||
"key_claims": [],
|
"key_claims": [],
|
||||||
"directions": [],
|
"directions": [],
|
||||||
"recommended_order": "",
|
"recommended_order": "",
|
||||||
"raw_response": raw,
|
"raw_response": "",
|
||||||
}
|
}
|
||||||
|
|
||||||
return result
|
return result
|
||||||
|
|||||||
@@ -7,14 +7,16 @@ from dataclasses import dataclass, field
|
|||||||
|
|
||||||
from legal_mcp import config
|
from legal_mcp import config
|
||||||
|
|
||||||
# Hebrew legal section headers
|
# Hebrew legal section headers.
|
||||||
|
# Covers both appeals committee decisions and external court rulings —
|
||||||
|
# court rulings use slightly different vocabulary (פסק דין, נימוקים, סוף דבר).
|
||||||
SECTION_PATTERNS = [
|
SECTION_PATTERNS = [
|
||||||
(r"רקע\s*עובדתי|רקע\s*כללי|העובדות|הרקע", "facts"),
|
(r"רקע\s*עובדתי|רקע\s*כללי|העובדות|הרקע", "facts"),
|
||||||
(r"טענות\s*העוררי[םן]|טענות\s*המערערי[םן]|עיקר\s*טענות\s*העוררי[םן]", "appellant_claims"),
|
(r"טענות\s*העוררי[םן]|טענות\s*המערערי[םן]|עיקר\s*טענות\s*העוררי[םן]", "appellant_claims"),
|
||||||
(r"טענות\s*המשיבי[םן]|תשובת\s*המשיבי[םן]|עיקר\s*טענות\s*המשיבי[םן]", "respondent_claims"),
|
(r"טענות\s*המשיבי[םן]|תשובת\s*המשיבי[םן]|עיקר\s*טענות\s*המשיבי[םן]", "respondent_claims"),
|
||||||
(r"דיון\s*והכרעה|דיון|הכרעה|ניתוח\s*משפטי|המסגרת\s*המשפטית", "legal_analysis"),
|
(r"דיון\s*והכרעה|דיון|הכרעה|ניתוח\s*משפטי|המסגרת\s*המשפטית|נימוקים", "legal_analysis"),
|
||||||
(r"מסקנ[הות]|סיכום", "conclusion"),
|
(r"מסקנ[הות]|סיכום|סוף\s*דבר", "conclusion"),
|
||||||
(r"החלטה|לפיכך\s*אני\s*מחליט|התוצאה", "ruling"),
|
(r"פסק[- ]?דין|החלטה|לפיכך\s*אני\s*מחליט|התוצאה", "ruling"),
|
||||||
(r"מבוא|פתיחה|לפניי", "intro"),
|
(r"מבוא|פתיחה|לפניי", "intro"),
|
||||||
]
|
]
|
||||||
|
|
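A minimal sketch of how the extended patterns might classify a header line, assuming a first-match-wins scan. The body of `_split_into_sections` is not shown in this hunk, so `classify_header` is a hypothetical helper, and only the three entries changed here are reproduced.

```python
import re
from typing import Optional

# Subset of SECTION_PATTERNS: only the three entries changed in this hunk.
SECTION_PATTERNS = [
    (r"דיון\s*והכרעה|דיון|הכרעה|ניתוח\s*משפטי|המסגרת\s*המשפטית|נימוקים", "legal_analysis"),
    (r"מסקנ[הות]|סיכום|סוף\s*דבר", "conclusion"),
    (r"פסק[- ]?דין|החלטה|לפיכך\s*אני\s*מחליט|התוצאה", "ruling"),
]


def classify_header(line: str) -> Optional[str]:
    """Hypothetical helper: return the first section type whose pattern matches."""
    for pattern, section_type in SECTION_PATTERNS:
        if re.search(pattern, line):
            return section_type
    return None
```

Because the scan stops at the first match, list order matters whenever a header could satisfy more than one pattern.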
||||||
@@ -31,8 +33,15 @@ def chunk_document(
|
|||||||
text: str,
|
text: str,
|
||||||
chunk_size: int = config.CHUNK_SIZE_TOKENS,
|
chunk_size: int = config.CHUNK_SIZE_TOKENS,
|
||||||
overlap: int = config.CHUNK_OVERLAP_TOKENS,
|
overlap: int = config.CHUNK_OVERLAP_TOKENS,
|
||||||
|
page_offsets: list[int] | None = None,
|
||||||
) -> list[Chunk]:
|
) -> list[Chunk]:
|
||||||
"""Split a legal document into chunks, respecting section boundaries."""
|
"""Split a legal document into chunks, respecting section boundaries.
|
||||||
|
|
||||||
|
When ``page_offsets`` is supplied (from a PDF extraction), each chunk
|
||||||
|
is tagged with the page number of its first character — used by the
|
||||||
|
multimodal hybrid retriever to join (text chunk, image at same page)
|
||||||
|
and surface text+image matches.
|
||||||
|
"""
|
||||||
if not text.strip():
|
if not text.strip():
|
||||||
return []
|
return []
|
||||||
|
|
||||||
@@ -50,9 +59,34 @@ def chunk_document(
|
|||||||
))
|
))
|
||||||
idx += 1
|
idx += 1
|
||||||
|
|
||||||
|
if page_offsets:
|
||||||
|
_assign_pages(chunks, text, page_offsets)
|
||||||
return chunks
|
return chunks
|
||||||
|
|
||||||
|
|
||||||
|
def _assign_pages(chunks: list[Chunk], text: str, page_offsets: list[int]) -> None:
|
||||||
|
"""Locate each chunk's first character in ``text`` and tag with the
|
||||||
|
page that contains that offset. Mutates chunks in-place.
|
||||||
|
|
||||||
|
Chunks have overlap so we search forward from a position slightly
|
||||||
|
past the previous chunk's start. Falls back to a global search if
|
||||||
|
the forward scan misses (rare — happens only when overlap is bigger
|
||||||
|
than the advance distance below).
|
||||||
|
"""
|
||||||
|
from legal_mcp.services.extractor import page_at_offset
|
||||||
|
pos = 0
|
||||||
|
for c in chunks:
|
||||||
|
idx = text.find(c.content, pos)
|
||||||
|
if idx < 0:
|
||||||
|
idx = text.find(c.content)
|
||||||
|
if idx < 0:
|
||||||
|
continue
|
||||||
|
c.page_number = page_at_offset(idx, page_offsets)
|
||||||
|
# advance past the chunk's halfway point — overlap is < 50% so
|
||||||
|
# the next chunk's starting point will be after this cursor.
|
||||||
|
pos = idx + max(1, len(c.content) // 2)
|
||||||
|
|
||||||
|
|
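`_assign_pages` relies on `page_at_offset`, whose body is not part of this diff. A minimal sketch of what such a lookup could look like, assuming `page_offsets[i]` is the character offset at which page `i + 1` begins (sorted ascending, first entry 0). The function below is a hypothetical stand-in, not the implementation in `legal_mcp.services.extractor`.

```python
from bisect import bisect_right


def page_at_offset(offset: int, page_offsets: list[int]) -> int:
    """Hypothetical stand-in: 1-based page whose start offset is <= ``offset``.

    Assumes ``page_offsets[i]`` is the character offset at which page
    ``i + 1`` begins, sorted ascending, with ``page_offsets[0] == 0``.
    """
    # bisect_right counts how many page starts are <= offset; that count
    # is the 1-based page number. Clamp to 1 for defensive handling.
    return max(1, bisect_right(page_offsets, offset))
```

A binary search keeps the per-chunk lookup logarithmic in the page count, which is probably irrelevant for typical decisions but costs nothing.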
||||||
def _split_into_sections(text: str) -> list[tuple[str, str]]:
|
def _split_into_sections(text: str) -> list[tuple[str, str]]:
|
||||||
"""Split text into (section_type, text) pairs based on Hebrew headers."""
|
"""Split text into (section_type, text) pairs based on Hebrew headers."""
|
||||||
# Find all section headers and their positions
|
# Find all section headers and their positions
|
||||||
|
|||||||
@@ -7,6 +7,7 @@
|
|||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
import logging
|
import logging
|
||||||
import re
|
import re
|
||||||
from uuid import UUID
|
from uuid import UUID
|
||||||
@@ -17,6 +18,21 @@ from legal_mcp.services import db, claude_session
|
|||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
# Each chunk targets ~12K chars (≈3K tokens of Hebrew). Smaller than the
|
||||||
|
# previous 25K because:
|
||||||
|
# • A single ``claude -p`` call on a 25K-char Hebrew prompt with cold
|
||||||
|
# cache routinely hit ~150-180s. 12K chunks finish in ~60-90s.
|
||||||
|
# • Per-chunk retry costs less when chunks are smaller.
|
||||||
|
# • Parallel chunks benefit more — see CHUNK_CONCURRENCY.
|
||||||
|
CHUNK_TARGET_CHARS = 12000
|
||||||
|
|
||||||
|
# How many chunks to send to Claude in parallel. Each subprocess holds
|
||||||
|
# ~300 MB RSS plus its own MCP stack; concurrency=3 keeps the box usable.
|
||||||
|
CHUNK_CONCURRENCY = 3
|
||||||
|
|
||||||
|
# How many retry attempts per failed chunk before giving up on it.
|
||||||
|
CHUNK_RETRY_ATTEMPTS = 1
|
||||||
|
|
||||||
|
|
||||||
EXTRACT_CLAIMS_PROMPT = """אתה מנתח מסמכים משפטיים בתחום תכנון ובניה. תפקידך לחלץ טענות מכתב טענות.
|
EXTRACT_CLAIMS_PROMPT = """אתה מנתח מסמכים משפטיים בתחום תכנון ובניה. תפקידך לחלץ טענות מכתב טענות.
|
||||||
|
|
||||||
@@ -43,6 +59,103 @@ EXTRACT_CLAIMS_PROMPT = """אתה מנתח מסמכים משפטיים בתחו
|
|||||||
"""
|
"""
|
||||||
|
|
||||||
|
|
||||||
|
# Section markers we treat as natural chunk boundaries when present.
|
||||||
|
# Hebrew legal briefs almost always use numbered sections like "10." or
|
||||||
|
# letter-section headings (".א", ".ב"). Splitting between sections keeps
|
||||||
|
# every chunk a self-contained argumentative unit.
|
||||||
|
_SECTION_BOUNDARY_RE = re.compile(
|
||||||
|
r"\n\s*("
|
||||||
|
r"\d+\.\s+\S" # numbered section: "10. טענות"
|
||||||
|
r"|[א-ת]\.\s+\S" # Hebrew letter section: "א. רקע"
|
||||||
|
r"|##\s+\S" # markdown heading
|
||||||
|
r"|פרק\s+\S" # "פרק" headings
|
||||||
|
r")"
|
||||||
|
)
|
||||||
|
|
||||||
|
|
||||||
|
def _split_by_sections(text: str, target: int = CHUNK_TARGET_CHARS) -> list[str]:
|
||||||
|
"""Split a long document into roughly ``target``-sized chunks at section
|
||||||
|
boundaries. Falls back to paragraph breaks, then to hard splits if a
|
||||||
|
section happens to be larger than ``target`` on its own.
|
||||||
|
"""
|
||||||
|
if len(text) <= target:
|
||||||
|
return [text]
|
||||||
|
|
||||||
|
boundaries = [m.start() for m in _SECTION_BOUNDARY_RE.finditer(text)]
|
||||||
|
boundaries = [0, *boundaries, len(text)]
|
||||||
|
|
||||||
|
chunks: list[str] = []
|
||||||
|
start = 0
|
||||||
|
for cut in boundaries[1:]:
|
||||||
|
# Greedy: skip boundaries until one lies at least ``target`` chars past
|
||||||
|
# ``start``, then cut there (so a chunk may overshoot ``target``).
|
||||||
|
if cut - start < target:
|
||||||
|
continue
|
||||||
|
end = cut
|
||||||
|
if end - start > target * 1.5:
|
||||||
|
# Section group exceeds 1.5× target — fall back to paragraph
|
||||||
|
# break inside it to avoid one chunk being far too big.
|
||||||
|
soft = text.rfind("\n\n", start, start + target)
|
||||||
|
if soft > start + target // 2:
|
||||||
|
end = soft
|
||||||
|
chunks.append(text[start:end].strip())
|
||||||
|
start = end
|
||||||
|
if start < len(text):
|
||||||
|
chunks.append(text[start:].strip())
|
||||||
|
|
||||||
|
# Hard splits for any chunk that is still too large (rare, but
|
||||||
|
# documents without any section markers can fall through).
|
||||||
|
final: list[str] = []
|
||||||
|
for c in chunks:
|
||||||
|
if len(c) <= target * 1.5:
|
||||||
|
final.append(c)
|
||||||
|
continue
|
||||||
|
for i in range(0, len(c), target):
|
||||||
|
final.append(c[i:i + target])
|
||||||
|
return [c for c in final if c.strip()]
|
||||||
|
|
||||||
|
|
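To see which lines the boundary regex treats as section starts, here is a small self-contained check. The regex is copied from the hunk above; the sample text is made up for illustration.

```python
import re

# Copied from the hunk above.
_SECTION_BOUNDARY_RE = re.compile(
    r"\n\s*("
    r"\d+\.\s+\S"       # numbered section
    r"|[א-ת]\.\s+\S"    # Hebrew letter section
    r"|##\s+\S"         # markdown heading
    r"|פרק\s+\S"        # "פרק" headings
    r")"
)

# Illustrative sample: two numbered sections, one markdown heading,
# and two ordinary lines that should not count as boundaries.
sample = "intro\n1. Background\nbody text\n2. Claims\n## Heading\nplain line"
starts = [m.start() for m in _SECTION_BOUNDARY_RE.finditer(sample)]
```

Each match begins at the newline that precedes the header, so a slice taken at that offset starts with the newline; the `.strip()` calls in `_split_by_sections` remove it.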
||||||
|
async def _extract_chunk(
|
||||||
|
chunk: str,
|
||||||
|
chunk_index: int,
|
||||||
|
chunk_total: int,
|
||||||
|
context: str,
|
||||||
|
) -> tuple[int, list[dict] | None]:
|
||||||
|
"""Run extraction on one chunk with retry. Returns ``(chunk_index, claims_or_None)``.
|
||||||
|
|
||||||
|
None means the chunk failed both the initial call and every retry
|
||||||
|
(caller can use this to mark the result as partial).
|
||||||
|
"""
|
||||||
|
chunk_label = f" (חלק {chunk_index + 1}/{chunk_total})" if chunk_total > 1 else ""
|
||||||
|
prompt = (
|
||||||
|
f"{EXTRACT_CLAIMS_PROMPT}\n\n"
|
||||||
|
f"{context}{chunk_label}\n\n"
|
||||||
|
f"--- תחילת מסמך ---\n{chunk}\n--- סוף מסמך ---"
|
||||||
|
)
|
||||||
|
last_err: Exception | None = None
|
||||||
|
for attempt in range(CHUNK_RETRY_ATTEMPTS + 1):
|
||||||
|
try:
|
||||||
|
claims = await claude_session.query_json(prompt)
|
||||||
|
except Exception as e:
|
||||||
|
last_err = e
|
||||||
|
logger.warning(
|
||||||
|
"extract_claims chunk %d/%d attempt %d raised: %s",
|
||||||
|
chunk_index + 1, chunk_total, attempt + 1, e,
|
||||||
|
)
|
||||||
|
continue
|
||||||
|
if isinstance(claims, list):
|
||||||
|
return chunk_index, claims
|
||||||
|
logger.warning(
|
||||||
|
"extract_claims chunk %d/%d attempt %d returned non-list (%s)",
|
||||||
|
chunk_index + 1, chunk_total, attempt + 1, type(claims).__name__,
|
||||||
|
)
|
||||||
|
logger.error(
|
||||||
|
"extract_claims chunk %d/%d failed after %d attempts: %s",
|
||||||
|
chunk_index + 1, chunk_total, CHUNK_RETRY_ATTEMPTS + 1, last_err,
|
||||||
|
)
|
||||||
|
return chunk_index, None
|
||||||
|
|
||||||
|
|
||||||
async def extract_claims_with_ai(
|
async def extract_claims_with_ai(
|
||||||
text: str,
|
text: str,
|
||||||
doc_type: str = "appeal",
|
doc_type: str = "appeal",
|
||||||
@@ -50,68 +163,62 @@ async def extract_claims_with_ai(
|
|||||||
) -> list[dict]:
|
) -> list[dict]:
|
||||||
"""Extract claims from a pleading using Claude.
|
"""Extract claims from a pleading using Claude.
|
||||||
|
|
||||||
|
Splits ``text`` at section boundaries, runs every chunk through
|
||||||
|
Claude in parallel (bounded by ``CHUNK_CONCURRENCY``), retries each
|
||||||
|
failed chunk once, and merges the results in original document order.
|
||||||
|
Failed chunks are logged but don't block the overall extraction —
|
||||||
|
we return what we got and surface the gap via the logs.
|
||||||
|
|
||||||
Args:
|
Args:
|
||||||
text: the document text
|
text: the document text
|
||||||
doc_type: the document type (appeal/response)
|
doc_type: the document type (appeal/response)
|
||||||
party_hint: a hint about the party's identity (if known)
|
party_hint: a hint about the party's identity (if known)
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
A list of claims with party_role, claim_text, topic
|
A list of claims with party_role, claim_text, topic, claim_index.
|
||||||
"""
|
"""
|
||||||
context = f"סוג המסמך: {doc_type}"
|
context = f"סוג המסמך: {doc_type}"
|
||||||
if party_hint:
|
if party_hint:
|
||||||
context += f"\nהצד המגיש: {party_hint}"
|
context += f"\nהצד המגיש: {party_hint}"
|
||||||
|
|
||||||
# For very long documents, split into chunks and merge results
|
chunks = _split_by_sections(text)
|
||||||
max_chars_per_call = 25000
|
if len(chunks) > 1:
|
||||||
chunks = []
|
logger.info(
|
||||||
if len(text) > max_chars_per_call:
|
"extract_claims: split %d chars into %d chunks (target=%d, concurrency=%d)",
|
||||||
# Split at paragraph boundaries
|
len(text), len(chunks), CHUNK_TARGET_CHARS, CHUNK_CONCURRENCY,
|
||||||
pos = 0
|
|
||||||
while pos < len(text):
|
|
||||||
end = min(pos + max_chars_per_call, len(text))
|
|
||||||
if end < len(text):
|
|
||||||
# Find paragraph break near the limit
|
|
||||||
break_pos = text.rfind("\n\n", pos, end)
|
|
||||||
if break_pos > pos + max_chars_per_call // 2:
|
|
||||||
end = break_pos
|
|
||||||
chunks.append(text[pos:end])
|
|
||||||
pos = end
|
|
||||||
logger.info("Document split into %d chunks (%d chars total)", len(chunks), len(text))
|
|
||||||
else:
|
|
||||||
chunks = [text]
|
|
||||||
|
|
||||||
all_claims = []
|
|
||||||
|
|
||||||
for i, chunk in enumerate(chunks):
|
|
||||||
chunk_label = f" (חלק {i+1}/{len(chunks)})" if len(chunks) > 1 else ""
|
|
||||||
prompt = (
|
|
||||||
f"{EXTRACT_CLAIMS_PROMPT}\n\n"
|
|
||||||
f"{context}{chunk_label}\n\n"
|
|
||||||
f"--- תחילת מסמך ---\n{chunk}\n--- סוף מסמך ---"
|
|
||||||
)
|
)
|
||||||
claims = claude_session.query_json(prompt, timeout=120)
|
|
||||||
if claims is None:
|
|
||||||
logger.warning("Failed to parse claims for chunk %d: %s", i, raw[:200])
|
|
||||||
continue
|
|
||||||
if isinstance(claims, list):
|
|
||||||
all_claims.extend(claims)
|
|
||||||
|
|
||||||
claims = all_claims
|
sem = asyncio.Semaphore(CHUNK_CONCURRENCY)
|
||||||
|
|
||||||
|
async def _bounded(idx: int, c: str) -> tuple[int, list[dict] | None]:
|
||||||
|
async with sem:
|
||||||
|
return await _extract_chunk(c, idx, len(chunks), context)
|
||||||
|
|
||||||
|
results = await asyncio.gather(*[_bounded(i, c) for i, c in enumerate(chunks)])
|
||||||
|
|
||||||
|
# Merge in original order. Skip chunks that failed entirely.
|
||||||
|
failed = [i for i, r in results if r is None]
|
||||||
|
if failed:
|
||||||
|
logger.warning(
|
||||||
|
"extract_claims: %d/%d chunks failed (indices=%s) — returning partial result",
|
||||||
|
len(failed), len(chunks), failed,
|
||||||
|
)
|
||||||
|
merged: list[dict] = []
|
||||||
|
for idx, claims in sorted(results, key=lambda x: x[0]):
|
||||||
if not claims:
|
if not claims:
|
||||||
return []
|
continue
|
||||||
|
merged.extend(claims)
|
||||||
|
|
||||||
if not isinstance(claims, list):
|
# Add claim_index and drop entries missing required fields.
|
||||||
return []
|
cleaned: list[dict] = []
|
||||||
|
for i, claim in enumerate(merged):
|
||||||
# Add claim_index
|
if not isinstance(claim, dict):
|
||||||
for i, claim in enumerate(claims):
|
continue
|
||||||
claim["claim_index"] = i
|
|
||||||
# Validate required fields
|
|
||||||
if "party_role" not in claim or "claim_text" not in claim:
|
if "party_role" not in claim or "claim_text" not in claim:
|
||||||
continue
|
continue
|
||||||
|
claim["claim_index"] = i
|
||||||
return [c for c in claims if "party_role" in c and "claim_text" in c]
|
cleaned.append(claim)
|
||||||
|
return cleaned
|
||||||
|
|
||||||
|
|
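The bounded fan-out used in the new `extract_claims_with_ai` can be sketched in isolation. This is an illustration under stated assumptions: `run_bounded` and its `worker` are hypothetical names, and `asyncio.sleep` stands in for the Claude call.

```python
import asyncio

CHUNK_CONCURRENCY = 3  # same value as in the diff


async def run_bounded(items: list[str], limit: int = CHUNK_CONCURRENCY) -> tuple[list[str], int]:
    """Process ``items`` with at most ``limit`` workers in flight.

    Returns the results in input order plus the peak concurrency observed.
    """
    sem = asyncio.Semaphore(limit)
    in_flight = 0
    peak = 0

    async def worker(idx: int, item: str) -> tuple[int, str]:
        nonlocal in_flight, peak
        async with sem:
            in_flight += 1
            peak = max(peak, in_flight)
            await asyncio.sleep(0.01)  # stand-in for the Claude call
            in_flight -= 1
            return idx, item.upper()

    results = await asyncio.gather(*(worker(i, it) for i, it in enumerate(items)))
    # gather already returns results in input order; sorting by index is
    # a defensive step that mirrors the merge loop in the diff.
    ordered = [value for _, value in sorted(results, key=lambda r: r[0])]
    return ordered, peak
```

One thing to note about the merged output in the diff: `claim_index` is taken from the position in the merged list before invalid entries are dropped, so the indices in the returned list can have gaps.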
||||||
def _infer_claim_type(doc_type: str, source_name: str) -> str:
|
def _infer_claim_type(doc_type: str, source_name: str) -> str:
|
||||||
|
|||||||
@@ -1,27 +1,53 @@
|
|||||||
"""Claude Code session bridge — runs prompts via `claude -p` instead of API.
|
"""Claude Code session bridge — runs prompts via the local `claude` CLI.
|
||||||
|
|
||||||
All LLM calls in the project should use this module instead of calling
|
All LLM calls in legal-ai go through this module. We shell out to the local
|
||||||
the Anthropic API directly. This uses the local Claude Code CLI which
|
Claude Code CLI which uses the developer's claude.ai session — zero direct
|
||||||
runs on the user's claude.ai session — zero API cost.
|
API cost.
|
||||||
|
|
||||||
|
**Architectural rule (do not violate):** this module only works when invoked
|
||||||
|
from the local MCP server (the Python process at
|
||||||
|
`/home/chaim/legal-ai/mcp-server/`, launched per `~/.claude.json`). It will
|
||||||
|
**not** work when called from the legal-ai Docker container — that container
|
||||||
|
has no `claude` CLI and no claude.ai session. Any code path under `web/`
|
||||||
|
(FastAPI) that calls this module — directly or via an extractor like
|
||||||
|
`halacha_extractor`, `claims_extractor`, `precedent_metadata_extractor`,
|
||||||
|
`block_writer`, `qa_validator`, `learning_loop`, `local_classifier`,
|
||||||
|
`appraiser_facts_extractor`, `brainstorm`, `style_analyzer` — is wrong.
|
||||||
|
LLM-dependent operations must be exposed as MCP tools and triggered from
|
||||||
|
agents (or the chair via Claude Code), where this module runs locally with
|
||||||
|
CLI access.
|
||||||
|
|
||||||
|
Async history: originally synchronous (``subprocess.run``) with a 120 s
|
||||||
|
timeout. That broke for large legal documents — sync subprocess stalled the
|
||||||
|
asyncio loop, and 120 s was far too short for cold-cache Hebrew prompts
|
||||||
|
(case 8174-24 hit three timeouts in a row). Fixed by going async with a
|
||||||
|
30-minute ceiling.
|
||||||
"""
|
"""
|
||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
import json
|
import json
|
||||||
import logging
|
import logging
|
||||||
import subprocess
|
|
||||||
from pathlib import Path
|
|
||||||
|
|
||||||
from legal_mcp.config import parse_llm_json
|
from legal_mcp.config import parse_llm_json
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
# Default timeout for claude -p calls (seconds)
|
# Default ceiling for any single ``claude -p`` invocation, in seconds.
|
||||||
DEFAULT_TIMEOUT = 120
|
# 30 min covers any single-document call we make in practice (chunking
|
||||||
LONG_TIMEOUT = 300 # For complex tasks like block writing
|
# handles the rest); the bound exists only to prevent runaway zombies.
|
||||||
|
DEFAULT_TIMEOUT = 1800
|
||||||
|
LONG_TIMEOUT = 3600 # opus block writing on full case context
|
||||||
|
|
||||||
|
|
||||||
def query(prompt: str, timeout: int = DEFAULT_TIMEOUT, max_turns: int = 1) -> str:
|
async def query(
|
||||||
|
prompt: str,
|
||||||
|
timeout: int = DEFAULT_TIMEOUT,
|
||||||
|
max_turns: int = 1,
|
||||||
|
*,
|
||||||
|
system: str | None = None,
|
||||||
|
) -> str:
|
||||||
"""Send a prompt to Claude Code headless and return the text response.
|
"""Send a prompt to Claude Code headless and return the text response.
|
||||||
|
|
||||||
Passes the prompt via stdin (not argv) to avoid the OS ARG_MAX limit —
|
Passes the prompt via stdin (not argv) to avoid the OS ARG_MAX limit —
|
||||||
@@ -29,15 +55,23 @@ def query(prompt: str, timeout: int = DEFAULT_TIMEOUT, max_turns: int = 1) -> st
|
|||||||
|
|
||||||
Args:
|
Args:
|
||||||
prompt: The prompt to send.
|
prompt: The prompt to send.
|
||||||
timeout: Max seconds to wait.
|
timeout: Max seconds before the subprocess is killed.
|
||||||
max_turns: Max conversation turns (1 = single response).
|
max_turns: Max conversation turns (1 = single response).
|
||||||
|
system: Optional repeated-instruction text. Prepended to ``prompt``
|
||||||
|
for the CLI; we don't pass it as a separate arg because the
|
||||||
|
CLI doesn't expose API-level caching. The parameter exists so
|
||||||
|
extractors can structure their calls cleanly today, and to make
|
||||||
|
a future SDK-backed path drop-in.
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
The text response from Claude.
|
The text response from Claude.
|
||||||
|
|
||||||
Raises:
|
Raises:
|
||||||
RuntimeError: If claude CLI is not available or fails.
|
RuntimeError: if the CLI is unavailable (e.g., called from the
|
||||||
|
container — see module docstring), or fails, or times out.
|
||||||
"""
|
"""
|
||||||
|
full_prompt = f"{system}\n\n{prompt}" if system else prompt
|
||||||
|
|
||||||
cmd = [
|
cmd = [
|
||||||
"claude", "-p",
|
"claude", "-p",
|
||||||
"--output-format", "json",
|
"--output-format", "json",
|
||||||
@@ -45,23 +79,40 @@ def query(prompt: str, timeout: int = DEFAULT_TIMEOUT, max_turns: int = 1) -> st
|
|||||||
]
|
]
|
||||||
|
|
||||||
try:
|
try:
|
||||||
result = subprocess.run(
|
proc = await asyncio.create_subprocess_exec(
|
||||||
cmd,
|
*cmd,
|
||||||
input=prompt,
|
stdin=asyncio.subprocess.PIPE,
|
||||||
capture_output=True,
|
stdout=asyncio.subprocess.PIPE,
|
||||||
text=True,
|
stderr=asyncio.subprocess.PIPE,
|
||||||
timeout=timeout,
|
|
||||||
)
|
)
|
||||||
except FileNotFoundError:
|
except FileNotFoundError:
|
||||||
raise RuntimeError("Claude CLI not found. Install Claude Code or add 'claude' to PATH.")
|
raise RuntimeError(
|
||||||
except subprocess.TimeoutExpired:
|
"Claude CLI not found. This module only works when invoked "
|
||||||
|
"from the local MCP server — see the architectural rule in "
|
||||||
|
"the module docstring. If this error came from a FastAPI "
|
||||||
|
"endpoint in the container, refactor the call into an MCP "
|
||||||
|
"tool that the chair triggers from Claude Code."
|
||||||
|
)
|
||||||
|
|
||||||
|
try:
|
||||||
|
stdout_b, stderr_b = await asyncio.wait_for(
|
||||||
|
proc.communicate(input=full_prompt.encode("utf-8")),
|
||||||
|
timeout=timeout,
|
||||||
|
)
|
||||||
|
except asyncio.TimeoutError:
|
||||||
|
# wait_for cancellation alone leaves the child running.
|
||||||
|
try:
|
||||||
|
proc.kill()
|
||||||
|
await proc.wait()
|
||||||
|
except ProcessLookupError:
|
||||||
|
pass
|
||||||
raise RuntimeError(f"Claude CLI timed out after {timeout}s")
|
raise RuntimeError(f"Claude CLI timed out after {timeout}s")
|
||||||
|
|
||||||
if result.returncode != 0:
|
if proc.returncode != 0:
|
||||||
stderr = result.stderr.strip()[:500] if result.stderr else "unknown error"
|
stderr = stderr_b.decode("utf-8", errors="replace").strip()[:500] or "unknown error"
|
||||||
raise RuntimeError(f"Claude CLI failed (exit {result.returncode}): {stderr}")
|
raise RuntimeError(f"Claude CLI failed (exit {proc.returncode}): {stderr}")
|
||||||
|
|
||||||
stdout = result.stdout.strip()
|
stdout = stdout_b.decode("utf-8", errors="replace").strip()
|
||||||
if not stdout:
|
if not stdout:
|
||||||
raise RuntimeError("Claude CLI returned empty response")
|
raise RuntimeError("Claude CLI returned empty response")
|
||||||
|
|
||||||
@@ -75,10 +126,15 @@ def query(prompt: str, timeout: int = DEFAULT_TIMEOUT, max_turns: int = 1) -> st
|
|||||||
return stdout
|
return stdout
|
||||||
|
|
||||||
|
|
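The timeout handling in the async `query` follows a common pattern: wrap `communicate()` in `asyncio.wait_for`, and on timeout kill and reap the child explicitly. A minimal standalone sketch, assuming a generic command in place of `claude -p`; `run_with_stdin`, `ECHO_UPPER`, and `SLEEPER` are hypothetical names used only for illustration.

```python
import asyncio
import sys


async def run_with_stdin(cmd: list[str], payload: str, timeout: float) -> str:
    """Feed ``payload`` to ``cmd`` on stdin; kill the child if it overruns."""
    proc = await asyncio.create_subprocess_exec(
        *cmd,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    try:
        out, _err = await asyncio.wait_for(
            proc.communicate(input=payload.encode("utf-8")),
            timeout=timeout,
        )
    except asyncio.TimeoutError:
        # Cancelling communicate() does not stop the child, so kill and reap it.
        try:
            proc.kill()
            await proc.wait()
        except ProcessLookupError:
            pass
        raise RuntimeError(f"timed out after {timeout}s")
    return out.decode("utf-8", errors="replace")


# Hypothetical child commands that stand in for ``claude -p``.
ECHO_UPPER = [sys.executable, "-c", "import sys; sys.stdout.write(sys.stdin.read().upper())"]
SLEEPER = [sys.executable, "-c", "import time; time.sleep(30)"]
```

Sending the payload on stdin rather than argv matches the rationale in the `query` docstring: long prompts would otherwise run into the OS argument-length limit.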
||||||
def query_json(prompt: str, timeout: int = DEFAULT_TIMEOUT) -> dict | list | None:
|
async def query_json(
|
||||||
|
prompt: str,
|
||||||
|
timeout: int = DEFAULT_TIMEOUT,
|
||||||
|
*,
|
||||||
|
system: str | None = None,
|
||||||
|
) -> dict | list | None:
|
||||||
"""Send a prompt and parse the response as JSON.
|
"""Send a prompt and parse the response as JSON.
|
||||||
|
|
||||||
Uses parse_llm_json for robust parsing (handles markdown wrapping, truncation).
|
Uses parse_llm_json for robust parsing (handles markdown wrapping, truncation).
|
||||||
"""
|
"""
|
||||||
raw = query(prompt, timeout=timeout)
|
raw = await query(prompt, timeout=timeout, system=system)
|
||||||
return parse_llm_json(raw)
|
return parse_llm_json(raw)
|
||||||
|
|||||||
File diff suppressed because it is too large
@@ -15,47 +15,112 @@ from docx import Document
|
|||||||
from docx.enum.text import WD_ALIGN_PARAGRAPH
|
from docx.enum.text import WD_ALIGN_PARAGRAPH
|
||||||
from docx.oxml import OxmlElement
|
from docx.oxml import OxmlElement
|
||||||
from docx.oxml.ns import qn
|
from docx.oxml.ns import qn
|
||||||
from docx.shared import Cm, Pt, RGBColor
|
|
||||||
|
|
||||||
from legal_mcp import config
|
from legal_mcp import config
|
||||||
from legal_mcp.services import db
|
from legal_mcp.services import db
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
# ── Constants ─────────────────────────────────────────────────────
|
# Path to the converted decision template. Carries David font, RTL, margins,
|
||||||
|
# and styles (Title / Heading 1-2 / Normal / Quote / List Paragraph).
|
||||||
FONT_NAME = "David"
|
# Populated once by `scripts/convert_decision_template.py` from `.dotx`.
|
||||||
FONT_SIZE_BODY = Pt(12)
|
TEMPLATE_PATH = (
|
||||||
FONT_SIZE_TITLE = Pt(16)
|
Path(__file__).resolve().parents[4]
|
||||||
FONT_SIZE_HEADING = Pt(14)
|
/ "skills" / "docx" / "decision_template.docx"
|
||||||
LINE_SPACING = 1.5
|
)
|
||||||
PAGE_MARGIN = Cm(2.5)
|
|
||||||
|
|
||||||
|
|
||||||
# ── RTL helpers ───────────────────────────────────────────────────
|
# ── RTL helpers ───────────────────────────────────────────────────
|
||||||
|
# Three layers of RTL are required (per skills/docx/SKILL.md):
|
||||||
|
# 1. Section: <w:bidi/> in sectPr (inherited from template)
|
||||||
|
# 2. Paragraph: <w:bidi/> directly in pPr — paragraph direction
|
||||||
|
# 3. Run: <w:rtl/> in rPr — tells Word to use cs (complex-script) font
|
||||||
|
# Without explicit font on run, Hebrew can render in the ascii slot
|
||||||
|
# (Times New Roman) — so we also force David on all four font slots.
|
||||||
|
|
||||||
def _set_rtl_paragraph(paragraph) -> None:
|
HEBREW_FONT = "David"
|
||||||
"""Set paragraph-level RTL properties."""
|
|
||||||
pPr = paragraph._element.get_or_add_pPr()
|
|
||||||
|
def _mark_run_rtl(run) -> None:
|
||||||
|
"""Force David font on all four slots, then add <w:rtl/>."""
|
||||||
|
rPr = run._r.get_or_add_rPr()
|
||||||
|
if rPr.find(qn("w:rFonts")) is None:
|
||||||
|
fonts = OxmlElement("w:rFonts")
|
||||||
|
fonts.set(qn("w:ascii"), HEBREW_FONT)
|
||||||
|
fonts.set(qn("w:hAnsi"), HEBREW_FONT)
|
||||||
|
fonts.set(qn("w:cs"), HEBREW_FONT)
|
||||||
|
fonts.set(qn("w:eastAsia"), HEBREW_FONT)
|
||||||
|
rPr.insert(0, fonts)
|
||||||
|
if rPr.find(qn("w:rtl")) is None:
|
||||||
|
rPr.append(OxmlElement("w:rtl"))
|
||||||
|
|
||||||
|
|
||||||
|
def _mark_paragraph_rtl(paragraph) -> None:
|
||||||
|
"""Add <w:bidi/> directly to pPr (paragraph direction) and <w:rtl/>
|
||||||
|
to the paragraph-mark rPr (affects trailing ¶ glyph)."""
|
||||||
|
pPr = paragraph._p.get_or_add_pPr()
|
||||||
|
# (2) <w:bidi/> directly in pPr — paragraph direction
|
||||||
|
if pPr.find(qn("w:bidi")) is None:
|
||||||
bidi = OxmlElement("w:bidi")
|
bidi = OxmlElement("w:bidi")
|
||||||
bidi.set(qn("w:val"), "1")
|
pstyle = pPr.find(qn("w:pStyle"))
|
||||||
pPr.append(bidi)
|
if pstyle is not None:
|
||||||
|
pstyle.addnext(bidi)
|
||||||
|
else:
|
||||||
|
pPr.insert(0, bidi)
|
||||||
|
# paragraph-mark rPr gets <w:rtl/> so ¶ inherits RTL too
|
||||||
|
rPr = pPr.find(qn("w:rPr"))
|
||||||
|
if rPr is None:
|
||||||
|
rPr = OxmlElement("w:rPr")
|
||||||
|
pPr.append(rPr)
|
||||||
|
if rPr.find(qn("w:rtl")) is None:
|
||||||
|
rPr.append(OxmlElement("w:rtl"))
|
||||||
|
|
||||||
|
|
||||||
def _set_rtl_run(run) -> None:
|
def _set_paragraph_jc(paragraph, value: str) -> None:
|
||||||
"""Set run-level RTL properties."""
|
"""Force <w:jc w:val="..."/> on a paragraph, overriding style-inherited jc.
|
||||||
rPr = run._element.get_or_add_rPr()
|
|
||||||
rtl = OxmlElement("w:rtl")
|
Needed because Heading 3 in the template ships with jc=center; we want
|
||||||
rtl.set(qn("w:val"), "1")
|
body headings fully justified (jc=both), like Normal.
|
||||||
rPr.append(rtl)
|
"""
|
||||||
|
pPr = paragraph._p.get_or_add_pPr()
|
||||||
|
existing = pPr.find(qn("w:jc"))
|
||||||
|
if existing is not None:
|
||||||
|
pPr.remove(existing)
|
||||||
|
jc = OxmlElement("w:jc")
|
||||||
|
jc.set(qn("w:val"), value)
|
||||||
|
pPr.append(jc)
|
||||||
|
|
||||||
|
|
||||||
def _set_rtl_section(section) -> None:
|
def _suppress_paragraph_numbering(paragraph) -> None:
|
||||||
"""Set section-level RTL (bidi)."""
|
"""Kill any style-inherited auto-numbering on this paragraph.
|
||||||
sectPr = section._sectPr
|
|
||||||
bidi = OxmlElement("w:bidi")
|
Heading styles linked to outline lists can auto-inject א./ב./ג. markers
|
||||||
bidi.set(qn("w:val"), "1")
|
in some Word versions even when the style we read doesn't show numPr.
|
||||||
sectPr.append(bidi)
|
Setting numId=0 explicitly removes the paragraph from any list.
|
||||||
|
"""
|
||||||
|
pPr = paragraph._p.get_or_add_pPr()
|
||||||
|
existing = pPr.find(qn("w:numPr"))
|
||||||
|
if existing is not None:
|
||||||
|
pPr.remove(existing)
|
||||||
|
numPr = OxmlElement("w:numPr")
|
||||||
|
ilvl = OxmlElement("w:ilvl")
|
||||||
|
ilvl.set(qn("w:val"), "0")
|
||||||
|
numId = OxmlElement("w:numId")
|
||||||
|
numId.set(qn("w:val"), "0")
|
||||||
|
numPr.append(ilvl)
|
||||||
|
numPr.append(numId)
|
||||||
|
pPr.append(numPr)
|
||||||
|
|
||||||
|
|
||||||
|
def _clear_body(doc) -> None:
|
||||||
|
"""Remove all paragraphs in the document body while keeping sectPr.
|
||||||
|
|
||||||
|
The template ships with sample paragraphs we don't want. Section
|
||||||
|
properties (page size, margins, bidi) stay intact.
|
||||||
|
"""
|
||||||
|
body = doc.element.body
|
||||||
|
for p in list(body.findall(qn("w:p"))):
|
||||||
|
body.remove(p)
|
||||||
|
|
||||||
|
|
||||||
# ── Bookmark helpers ──────────────────────────────────────────────
|
# ── Bookmark helpers ──────────────────────────────────────────────
|
||||||
@@ -109,61 +174,109 @@ def _wrap_block_with_bookmarks(doc, block_name: str,
|
|||||||
_insert_bookmark_end(last_new, bm_id)
|
_insert_bookmark_end(last_new, bm_id)
|
||||||
|
|
||||||
|
|
||||||
def _add_paragraph(doc, text: str, style: str = "Normal",
|
# ── Content cleanup ──────────────────────────────────────────────
|
||||||
bold: bool = False, font_size=None,
|
|
||||||
alignment=None, space_after: Pt | None = None) -> None:
|
|
||||||
"""Add an RTL paragraph with David font."""
|
|
||||||
para = doc.add_paragraph()
|
|
||||||
_set_rtl_paragraph(para)
|
|
||||||
|
|
||||||
if alignment:
|
# Em-dash (—, U+2014) and en-dash (–, U+2013) — per chair's no-dash policy,
|
||||||
|
# strip from body text. Surrounding spaces collapse.
|
||||||
|
_DASH_RE = re.compile(r"\s*[—–]\s*")
|
||||||
|
_MULTI_SPACE_RE = re.compile(r" {2,}")
|
||||||
|
|
||||||
|
|
||||||
|
def _strip_dashes(text: str) -> str:
|
||||||
|
"""Remove em/en-dashes and collapse surrounding whitespace."""
|
||||||
|
text = _DASH_RE.sub(" ", text)
|
||||||
|
return _MULTI_SPACE_RE.sub(" ", text).strip()
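A minimal standalone sketch of the dash cleanup above, assuming only the two regexes shown in the diff. The function is renamed `strip_dashes` and the dashes are written as `\u2014` / `\u2013` escapes for readability; the sample strings are illustrative, not taken from the project.

```python
import re

# Same patterns as in the diff, with the dash characters spelled as escapes.
_DASH_RE = re.compile(r"\s*[\u2014\u2013]\s*")
_MULTI_SPACE_RE = re.compile(r" {2,}")


def strip_dashes(text: str) -> str:
    """Replace each em/en-dash (plus surrounding whitespace) with one space."""
    text = _DASH_RE.sub(" ", text)
    return _MULTI_SPACE_RE.sub(" ", text).strip()


# Illustrative inputs (assumed, not from the case data).
samples = {
    "claim \u2014 response": "claim response",
    "pages 3\u20135": "pages 3 5",          # numeric ranges lose their dash too
    "\u2014 leading dash": "leading dash",  # edge whitespace is stripped
}
for raw, expected in samples.items():
    assert strip_dashes(raw) == expected
```

Note the second sample: an en-dash inside a numeric range is also replaced by a space, which may or may not be what the no-dash policy intends for citations and page ranges.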
|
||||||
|
|
||||||
|
|
||||||
|
# Numbered paragraph: "1. content", "23. content" — auto-numbered via
|
||||||
|
# List Paragraph style so order reflects emission, not literal prefix.
|
||||||
|
_NUM_PREFIX_RE = re.compile(r"^(\d+)\.\s+(.*)$", re.DOTALL)
|
||||||
|
|
||||||
|
|
||||||
|
# Markdown inline bold — `**...**`
|
||||||
|
_INLINE_BOLD_RE = re.compile(r"\*\*([^\n*]+?)\*\*")
|
||||||
|
|
||||||
|
|
||||||
|
def _add_runs_with_inline_bold(paragraph, text: str, *, bold_all: bool = False) -> None:
|
||||||
|
"""Split text on `**...**` markers, alternating plain and bold runs.
|
||||||
|
|
||||||
|
Keeps `**טענה חשובה**` rendering as bold instead of leaving literal
|
||||||
|
asterisks. When bold_all is True, every run is bold (used for headings
|
||||||
|
that still carry inline-bold markup).
|
||||||
|
"""
|
||||||
|
pos = 0
|
||||||
|
for m in _INLINE_BOLD_RE.finditer(text):
|
||||||
|
if m.start() > pos:
|
||||||
|
plain = paragraph.add_run(text[pos:m.start()])
|
||||||
|
if bold_all:
|
||||||
|
plain.bold = True
|
||||||
|
_mark_run_rtl(plain)
|
||||||
|
run_bold = paragraph.add_run(m.group(1))
|
||||||
|
run_bold.bold = True
|
||||||
|
_mark_run_rtl(run_bold)
|
||||||
|
pos = m.end()
|
||||||
|
if pos < len(text):
|
||||||
|
tail = paragraph.add_run(text[pos:])
|
||||||
|
if bold_all:
|
||||||
|
tail.bold = True
|
||||||
|
_mark_run_rtl(tail)
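The run-splitting logic above can be checked without python-docx. The sketch below is an assumption-labelled stand-in: `split_inline_bold` is a hypothetical helper that returns `(segment, is_bold)` pairs where the real function would call `paragraph.add_run(...)`; the regex is the one from the diff.

```python
import re

_INLINE_BOLD_RE = re.compile(r"\*\*([^\n*]+?)\*\*")


def split_inline_bold(text: str) -> list[tuple[str, bool]]:
    """Return (segment, is_bold) pairs in reading order."""
    segments: list[tuple[str, bool]] = []
    pos = 0
    for m in _INLINE_BOLD_RE.finditer(text):
        if m.start() > pos:
            # Plain text between the previous match and this one.
            segments.append((text[pos:m.start()], False))
        segments.append((m.group(1), True))
        pos = m.end()
    if pos < len(text):
        # Trailing plain text after the last bold span.
        segments.append((text[pos:], False))
    return segments


print(split_inline_bold("The **key claim** was rejected."))
# [('The ', False), ('key claim', True), (' was rejected.', False)]
```

An unmatched `**` produces no regex match, so the whole string comes back as a single plain segment and the asterisks stay literal.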
|
||||||
|
|
||||||
|
|
||||||
|
def _add_styled_paragraph(doc, text: str, style: str = "Normal",
|
||||||
|
bold: bool = False,
|
||||||
|
alignment=None):
|
||||||
|
"""Add a paragraph using a template style.
|
||||||
|
|
||||||
|
Font, size, RTL direction and spacing all come from the style
|
||||||
|
definition in the template — we only pick the style by name.
|
||||||
|
Renders `**...**` markdown as inline bold runs.
|
||||||
|
|
||||||
|
Returns the paragraph so callers can apply further overrides.
|
||||||
|
"""
|
||||||
|
para = doc.add_paragraph(style=style)
|
||||||
|
_mark_paragraph_rtl(para)
|
||||||
|
|
||||||
|
if alignment is not None:
|
||||||
para.alignment = alignment
|
para.alignment = alignment
|
||||||
else:
|
|
||||||
para.alignment = WD_ALIGN_PARAGRAPH.RIGHT
|
|
||||||
|
|
||||||
run = para.add_run(text)
|
if text:
|
||||||
run.font.name = FONT_NAME
|
_add_runs_with_inline_bold(para, text, bold_all=bold)
|
||||||
run.font.size = font_size or FONT_SIZE_BODY
|
|
||||||
run.bold = bold
|
|
||||||
_set_rtl_run(run)
|
|
||||||
|
|
||||||
# Line spacing
|
return para
|
||||||
pf = para.paragraph_format
|
|
||||||
pf.line_spacing = LINE_SPACING
|
|
||||||
if space_after is not None:
|
|
||||||
pf.space_after = space_after
|
|
||||||
|
|
||||||
|
|
||||||
def _add_centered_paragraph(doc, text: str, bold: bool = True,
|
def _add_centered_paragraph(doc, text: str, *, bold: bool = True,
|
||||||
font_size=None) -> None:
|
style: str = "Normal") -> None:
|
||||||
"""Add centered RTL paragraph."""
|
_add_styled_paragraph(doc, text, style=style, bold=bold,
|
||||||
_add_paragraph(doc, text, bold=bold, font_size=font_size,
|
|
||||||
alignment=WD_ALIGN_PARAGRAPH.CENTER)
|
alignment=WD_ALIGN_PARAGRAPH.CENTER)
|
||||||
|
|
||||||
|
|
||||||
|
def _add_heading(doc, text: str, *, style: str) -> None:
|
||||||
|
"""Heading with overrides: jc=both (overrides style-center / style-left)
|
||||||
|
and suppressed auto-numbering (so style-linked outline lists don't inject
|
||||||
|
א./ב./ג. — chair manages markers manually in content)."""
|
||||||
|
para = doc.add_paragraph(style=style)
|
||||||
|
_mark_paragraph_rtl(para)
|
||||||
|
_set_paragraph_jc(para, "both")
|
||||||
|
_suppress_paragraph_numbering(para)
|
||||||
|
if text:
|
||||||
|
_add_runs_with_inline_bold(para, text)
|
||||||
|
|
||||||
|
|
||||||
def _add_blockquote(doc, text: str) -> None:
|
def _add_blockquote(doc, text: str) -> None:
|
||||||
"""Add indented blockquote paragraph."""
|
"""Indented quote using the template's Quote style."""
|
||||||
para = doc.add_paragraph()
|
_add_styled_paragraph(doc, text, style="Quote")
|
||||||
_set_rtl_paragraph(para)
|
|
||||||
para.alignment = WD_ALIGN_PARAGRAPH.RIGHT
|
|
||||||
|
|
||||||
run = para.add_run(text)
|
|
||||||
run.font.name = FONT_NAME
|
|
||||||
run.font.size = Pt(11)
|
|
||||||
run.italic = True
|
|
||||||
_set_rtl_run(run)
|
|
||||||
|
|
||||||
pf = para.paragraph_format
|
|
||||||
pf.left_indent = Cm(1.5)
|
|
||||||
pf.right_indent = Cm(1.5)
|
|
||||||
pf.line_spacing = LINE_SPACING
|
|
||||||
|
|
||||||
|
|
||||||
def _add_image_placeholder(doc, description: str) -> None:
|
def _add_image_placeholder(doc, description: str) -> None:
|
||||||
"""Add image placeholder box."""
|
_add_styled_paragraph(doc, f"[{description}]", style="Normal",
|
||||||
_add_paragraph(doc, f"[{description}]",
|
alignment=WD_ALIGN_PARAGRAPH.CENTER)
|
||||||
alignment=WD_ALIGN_PARAGRAPH.CENTER,
|
|
||||||
font_size=Pt(10))
|
|
||||||
|
def _add_spacer(doc) -> None:
|
||||||
|
"""Add an empty paragraph as a visual spacer."""
|
||||||
|
para = doc.add_paragraph(style="Normal")
|
||||||
|
_mark_paragraph_rtl(para)
|
||||||
|
|
||||||
|
|
||||||
# ── Main export ───────────────────────────────────────────────────
|
# ── Main export ───────────────────────────────────────────────────
|
||||||
@@ -241,16 +354,14 @@ async def export_decision(
|
|||||||
else:
|
else:
|
||||||
ordered_blocks = list(rows)
|
ordered_blocks = list(rows)
|
||||||
|
|
||||||
# Create document
|
if not TEMPLATE_PATH.exists():
|
||||||
doc = Document()
|
raise FileNotFoundError(
|
||||||
|
f"Template not found at {TEMPLATE_PATH}. "
|
||||||
|
"Run scripts/convert_decision_template.py first."
|
||||||
|
)
|
||||||
|
|
||||||
# Set page margins
|
doc = Document(str(TEMPLATE_PATH))
|
||||||
for section in doc.sections:
|
_clear_body(doc)
|
||||||
section.top_margin = PAGE_MARGIN
|
|
||||||
section.bottom_margin = PAGE_MARGIN
|
|
||||||
section.left_margin = PAGE_MARGIN
|
|
||||||
section.right_margin = PAGE_MARGIN
|
|
||||||
_set_rtl_section(section)
|
|
||||||
|
|
||||||
# Write blocks with bookmarks wrapping each block (anchors for revisions)
|
# Write blocks with bookmarks wrapping each block (anchors for revisions)
|
||||||
bm_counter = [_BOOKMARK_ID_START]
|
bm_counter = [_BOOKMARK_ID_START]
|
||||||
@@ -291,93 +402,132 @@ async def export_decision(
|
|||||||
|
|
||||||
|
|
||||||
def _write_block_to_docx(doc, block_id: str, title: str, content: str) -> None:
|
def _write_block_to_docx(doc, block_id: str, title: str, content: str) -> None:
|
||||||
"""Write a single block to the DOCX document."""
|
"""Write a single block to the DOCX document using template styles."""
|
||||||
# Header blocks (א-ד)
|
# Header blocks (א-ד)
|
||||||
if block_id == "block-alef":
|
if block_id == "block-alef":
|
||||||
for line in content.split("\n"):
|
for line in content.split("\n"):
|
||||||
if line.strip():
|
if line.strip():
|
||||||
_add_centered_paragraph(doc, line.strip(), bold=True, font_size=FONT_SIZE_HEADING)
|
_add_styled_paragraph(doc, line.strip(), style="Heading 1",
|
||||||
|
alignment=WD_ALIGN_PARAGRAPH.CENTER)
|
||||||
return
|
return
|
||||||
|
|
||||||
if block_id == "block-bet":
|
if block_id == "block-bet":
|
||||||
_add_paragraph(doc, "", space_after=Pt(6)) # spacer
|
_add_spacer(doc)
|
||||||
for line in content.split("\n"):
|
for line in content.split("\n"):
|
||||||
if line.strip():
|
if line.strip():
|
||||||
_add_centered_paragraph(doc, line.strip(), bold=False, font_size=FONT_SIZE_BODY)
|
_add_centered_paragraph(doc, line.strip(), bold=False)
|
||||||
return
|
return
|
||||||
|
|
||||||
if block_id == "block-gimel":
|
if block_id == "block-gimel":
|
||||||
_add_paragraph(doc, "", space_after=Pt(6))
|
_add_spacer(doc)
|
||||||
lines = content.split("\n")
|
for line in content.split("\n"):
|
||||||
for line in lines:
|
|
||||||
stripped = line.strip()
|
stripped = line.strip()
|
||||||
if not stripped:
|
if not stripped:
|
||||||
continue
|
continue
|
||||||
if stripped == "נגד":
|
if stripped == "נגד":
|
||||||
_add_centered_paragraph(doc, "— נגד —", bold=True, font_size=FONT_SIZE_BODY)
|
_add_centered_paragraph(doc, "— נגד —", bold=True)
|
||||||
else:
|
else:
|
||||||
_add_centered_paragraph(doc, stripped, bold=False, font_size=FONT_SIZE_BODY)
|
_add_centered_paragraph(doc, stripped, bold=False)
|
||||||
return
|
return
|
||||||
|
|
||||||
if block_id == "block-dalet":
|
if block_id == "block-dalet":
|
||||||
_add_paragraph(doc, "", space_after=Pt(12)) # spacer
|
_add_spacer(doc)
|
||||||
_add_centered_paragraph(doc, "החלטה", bold=True, font_size=FONT_SIZE_TITLE)
|
# Avoid style=Title: its rFonts use theme fonts (majorHAnsi / majorBidi)
|
||||||
_add_paragraph(doc, "", space_after=Pt(12))
|
# and 28pt size — renders Hebrew oversized and in the wrong face.
|
||||||
|
# Heading 1 carries David and proper RTL, bold + center gives the
|
||||||
|
# same visual weight.
|
||||||
|
para = _add_styled_paragraph(doc, "החלטה", style="Heading 1",
|
||||||
|
alignment=WD_ALIGN_PARAGRAPH.CENTER,
|
||||||
|
bold=True)
|
||||||
|
_suppress_paragraph_numbering(para)
|
||||||
|
_add_spacer(doc)
|
||||||
return
|
return
|
||||||
|
|
||||||
if block_id == "block-yod-bet":
|
if block_id == "block-yod-bet":
|
||||||
_add_paragraph(doc, "", space_after=Pt(24)) # spacer
|
_add_spacer(doc)
|
||||||
for line in content.split("\n"):
|
for line in content.split("\n"):
|
||||||
if line.strip():
|
if line.strip():
|
||||||
_add_centered_paragraph(doc, line.strip(), bold=False, font_size=FONT_SIZE_BODY)
|
_add_centered_paragraph(doc, line.strip(), bold=False)
|
||||||
return
|
return
|
||||||
|
|
||||||
# Content blocks (ה-יא) — parse paragraphs
|
# Content blocks (ה-יא) — parse paragraphs
|
||||||
paragraphs = content.split("\n")
|
for para_text in content.split("\n"):
|
||||||
for para_text in paragraphs:
|
stripped = _strip_dashes(para_text.strip())
|
||||||
stripped = para_text.strip()
|
|
||||||
if not stripped:
|
if not stripped:
|
||||||
continue
|
continue
|
||||||
|
|
||||||
# Section headings (e.g., "תמצית טענות הצדדים", "טענות העוררים")
|
# Markdown H1/H2/H3 → template heading styles
|
||||||
if _is_section_heading(stripped):
|
md_heading = re.match(r"^(#{1,6})\s+(.*)$", stripped)
|
||||||
_add_paragraph(doc, stripped, bold=True, font_size=FONT_SIZE_HEADING,
|
if md_heading:
|
||||||
space_after=Pt(6))
|
level = len(md_heading.group(1))
|
||||||
|
heading_text = md_heading.group(2).strip()
|
||||||
|
style = f"Heading {min(level, 3)}"
|
||||||
|
_add_heading(doc, heading_text, style=style)
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Standalone `**...**` line — treat as a sub-heading (Heading 3)
|
||||||
|
stand_bold = re.match(r"^\*\*([^\n*]+?)\*\*$", stripped)
|
||||||
|
if stand_bold:
|
||||||
|
_add_heading(doc, stand_bold.group(1).strip(), style="Heading 3")
|
||||||
|
continue
|
||||||
|
|
||||||
|
if _is_section_heading(stripped):
|
||||||
|
_add_heading(doc, stripped, style="Heading 2")
|
||||||
continue
|
continue
|
||||||
|
|
||||||
# Blockquotes (indented quotes from protocols/rulings)
|
|
||||||
if stripped.startswith('"') or stripped.startswith("״") or stripped.startswith(">"):
|
if stripped.startswith('"') or stripped.startswith("״") or stripped.startswith(">"):
|
||||||
clean = stripped.lstrip(">").strip().strip('"').strip("״").strip('"')
|
clean = stripped.lstrip(">").strip().strip('"').strip("״").strip('"')
|
||||||
_add_blockquote(doc, clean)
|
_add_blockquote(doc, clean)
|
||||||
continue
|
continue
|
||||||
|
|
||||||
# Image placeholders
|
if "📷" in stripped or (stripped.startswith("[") and "תמונה" in stripped):
|
||||||
if "📷" in stripped or stripped.startswith("[") and "תמונה" in stripped:
|
|
||||||
_add_image_placeholder(doc, stripped.strip("[]📷 "))
|
_add_image_placeholder(doc, stripped.strip("[]📷 "))
|
||||||
continue
|
continue
|
||||||
|
|
||||||
# Regular numbered paragraph or plain text
|
# Numbered body paragraph ("1. text") → List Paragraph with auto-num.
|
||||||
_add_paragraph(doc, stripped)
|
# The literal prefix is dropped; Word renders "1. 2. 3. ..." via numId.
|
||||||
|
num_match = _NUM_PREFIX_RE.match(stripped)
|
||||||
|
if num_match:
|
||||||
|
body_text = num_match.group(2).strip()
|
||||||
|
_add_styled_paragraph(doc, body_text, style="List Paragraph")
|
||||||
|
continue
|
||||||
|
|
||||||
|
_add_styled_paragraph(doc, stripped, style="Normal")
|
||||||
|
|
||||||
|
|
||||||
def _is_section_heading(text: str) -> bool:
|
_SECTION_HEADING_PATTERNS = [
|
||||||
"""Detect section headings in decision text."""
|
re.compile(p) for p in (
|
||||||
heading_patterns = [
|
# Block-level titles
|
||||||
|
r"^פתח\s+דבר",
|
||||||
|
r"^רקע\s+עובדתי",
|
||||||
r"^תמצית\s+טענות",
|
r"^תמצית\s+טענות",
|
||||||
|
r"^טענות\s+הצדדים",
|
||||||
r"^טענות\s+העוררי",
|
r"^טענות\s+העוררי",
|
||||||
|
r"^טענות\s+המשיב",
|
||||||
r"^עמדת\s+הוועדה",
|
r"^עמדת\s+הוועדה",
|
||||||
r"^עמדת\s+מבקשי",
|
r"^עמדת\s+מבקשי",
|
||||||
r"^ההליכים\s+בפני",
|
r"^ההליכים\s+בפני",
|
||||||
|
r"^הליכים\s+בפני",
|
||||||
r"^דיון\s+והכרעה",
|
r"^דיון\s+והכרעה",
|
||||||
r"^סוף\s+דבר",
|
r"^סוף\s+דבר",
|
||||||
r"^סיכום",
|
r"^סיכום",
|
||||||
r"^פתח\s+דבר",
|
# Subsection titles produced by legal-writer inside block-vav/block-tet
|
||||||
|
r"^המצב\s+התכנוני",
|
||||||
|
r"^הליכי\s+הרישוי",
|
||||||
|
r"^שומת\s+ההשבחה",
|
||||||
|
r"^הליך\s+השומה",
|
||||||
|
r"^הגשת\s+הערר",
|
||||||
|
r"^תכניות\s+מתאר",
|
||||||
|
r"^תכניות\s+מפורטות",
|
||||||
r"^תכניות\s+חלות",
|
r"^תכניות\s+חלות",
|
||||||
]
|
r"^תכניות\s+החלות",
|
||||||
for pattern in heading_patterns:
|
r"^מדיניות\s+מהנדס",
|
||||||
if re.search(pattern, text):
|
r"^היתרי\s+בני",
|
||||||
return True
|
r"^היתר\s+בני",
|
||||||
# Short bold-like lines (under 60 chars, not numbered)
|
)
|
||||||
if len(text) < 60 and not re.match(r"^\d+\.", text):
|
]
|
||||||
return False
|
|
||||||
return False
|
|
||||||
|
def _is_section_heading(text: str) -> bool:
|
||||||
|
"""Detect legal-decision section headings — mapped to Heading 2 style."""
|
||||||
|
return any(p.search(text) for p in _SECTION_HEADING_PATTERNS)
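A small sketch of how the precompiled, `^`-anchored patterns behave, using three of the patterns from the list above. The function name and test strings are assumptions for illustration.

```python
import re

# A subset of the patterns from the diff, compiled once at import time.
_PATTERNS = [
    re.compile(p)
    for p in (r"^דיון\s+והכרעה", r"^סוף\s+דבר", r"^סיכום")
]


def is_section_heading(text: str) -> bool:
    """True when the line starts with one of the known heading phrases."""
    return any(p.search(text) for p in _PATTERNS)


assert is_section_heading("דיון והכרעה")
# Anchored at line start, so a numbered body paragraph does not match.
assert not is_section_heading("1. דיון והכרעה")
# No length check: a long line that merely starts with a heading word matches.
assert is_section_heading("סיכום" + " " + "x" * 80)
```

The last assertion shows the trade-off of pure prefix matching: a body sentence that happens to open with a heading word would be styled as Heading 2. The length guard in the old version had no effect either, since both of its branches returned `False`.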
|
||||||
|
|||||||
@@ -3,19 +3,31 @@
|
|||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import logging
|
import logging
|
||||||
|
from typing import TYPE_CHECKING
|
||||||
import voyageai
|
|
||||||
|
|
||||||
from legal_mcp import config
|
from legal_mcp import config
|
||||||
|
|
||||||
|
if TYPE_CHECKING:
|
||||||
|
import voyageai
|
||||||
|
from PIL import Image as PILImage
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
_client: voyageai.Client | None = None
|
# voyageai is imported lazily inside _get_client to keep MCP server startup
|
||||||
|
# fast — loading voyageai eagerly costs ~450ms and Claude Code's first tool
|
||||||
|
# call can hit a "No such tool available" race if the server isn't ready yet.
|
||||||
|
_client: "voyageai.Client | None" = None
|
||||||
|
|
||||||
|
# Per-call cap for multimodal_embed. POC ran 89 pages (~312K tokens)
|
||||||
|
# in a single call comfortably; 50 leaves safe headroom for densely-
|
||||||
|
# OCR'd legal pages where tokens/page can exceed 4K.
|
||||||
|
_MULTIMODAL_BATCH_SIZE = 50
|
||||||
|
|
||||||
|
|
||||||
def _get_client() -> voyageai.Client:
|
def _get_client() -> "voyageai.Client":
|
||||||
global _client
|
global _client
|
||||||
if _client is None:
|
if _client is None:
|
||||||
|
import voyageai
|
||||||
_client = voyageai.Client(api_key=config.VOYAGE_API_KEY)
|
_client = voyageai.Client(api_key=config.VOYAGE_API_KEY)
|
||||||
return _client
|
return _client
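The lazy-import pattern above can be sketched in isolation. This example assumes nothing about `voyageai`; the stdlib `decimal` module stands in for the heavy dependency, and `get_context` is a hypothetical name.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Seen only by type checkers; never executed at runtime.
    import decimal

# String annotation, so the name need not exist at import time.
_ctx: "decimal.Context | None" = None


def get_context() -> "decimal.Context":
    """Create the shared object on first use, importing its module lazily."""
    global _ctx
    if _ctx is None:
        import decimal  # deferred: the import cost is paid on first call only

        _ctx = decimal.Context(prec=10)
    return _ctx


first = get_context()
assert first is get_context()  # later calls reuse the cached instance
```

The module itself imports quickly, and the first caller pays the import cost once. The flip side is that a missing or broken dependency surfaces at first use rather than at startup.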
|
||||||
|
|
||||||
@@ -53,3 +65,65 @@ async def embed_query(query: str) -> list[float]:
|
|||||||
"""Embed a single search query."""
|
"""Embed a single search query."""
|
||||||
results = await embed_texts([query], input_type="query")
|
results = await embed_texts([query], input_type="query")
|
||||||
return results[0]
|
return results[0]
|
||||||
|
|
||||||
|
|
||||||
|
async def embed_images(
|
||||||
|
images: "list[PILImage.Image]",
|
||||||
|
input_type: str = "document",
|
||||||
|
) -> list[list[float]]:
|
||||||
|
"""Embed page images via voyage-multimodal-3.
|
||||||
|
|
||||||
|
Each input is a single PIL.Image (one page = one embedding).
|
||||||
|
Returns a list of 1024-dim vectors, one per input image, in order.
|
||||||
|
Batches at ``_MULTIMODAL_BATCH_SIZE`` to stay within Voyage's
|
||||||
|
per-request limits on dense legal pages.
|
||||||
|
"""
|
||||||
|
if not images:
|
||||||
|
return []
|
||||||
|
client = _get_client()
|
||||||
|
out: list[list[float]] = []
|
||||||
|
for i in range(0, len(images), _MULTIMODAL_BATCH_SIZE):
|
||||||
|
batch = images[i : i + _MULTIMODAL_BATCH_SIZE]
|
||||||
|
result = client.multimodal_embed(
|
||||||
|
inputs=[[img] for img in batch],
|
||||||
|
model=config.MULTIMODAL_MODEL,
|
||||||
|
input_type=input_type,
|
||||||
|
truncation=True,
|
||||||
|
)
|
||||||
|
out.extend(result.embeddings)
|
||||||
|
return out
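The batching loop in `embed_images` can be exercised with a fake embedder. In this sketch `embed_in_batches` and `fake_embed` are hypothetical names; no Voyage call is made.

```python
from typing import Callable, Sequence


def embed_in_batches(
    items: Sequence,
    embed_batch: Callable[[Sequence], list[list[float]]],
    batch_size: int = 50,
) -> list[list[float]]:
    """Call embed_batch on consecutive slices and concatenate the results."""
    out: list[list[float]] = []
    for i in range(0, len(items), batch_size):
        out.extend(embed_batch(items[i : i + batch_size]))
    return out


calls: list[int] = []


def fake_embed(batch: Sequence) -> list[list[float]]:
    """Stand-in embedder: records the batch size, returns 1-dim vectors."""
    calls.append(len(batch))
    return [[float(x)] for x in batch]


vectors = embed_in_batches(list(range(120)), fake_embed)
assert calls == [50, 50, 20]                      # last batch is the remainder
assert [v[0] for v in vectors] == list(range(120))  # input order is preserved
```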
|
||||||
|
|
||||||
|
|
||||||
|
async def embed_query_for_multimodal(query: str) -> list[float]:
|
||||||
|
"""Embed a text query in the multimodal vector space, so it can be
|
||||||
|
cosine-compared against page-image embeddings."""
|
||||||
|
client = _get_client()
|
||||||
|
result = client.multimodal_embed(
|
||||||
|
inputs=[[query]],
|
||||||
|
model=config.MULTIMODAL_MODEL,
|
||||||
|
input_type="query",
|
||||||
|
)
|
||||||
|
return result.embeddings[0]
|
||||||
|
|
||||||
|
|
||||||
|
async def voyage_rerank(
|
||||||
|
query: str, documents: list[str], top_k: int | None = None,
|
||||||
|
) -> list[tuple[int, float]]:
|
||||||
|
"""Cross-encoder rerank via Voyage. Returns [(orig_index, score), ...]
|
||||||
|
sorted by relevance. Each tuple's index refers to the position in the
|
||||||
|
*input* documents list (not a DB row id) — caller maps it back.
|
||||||
|
|
||||||
|
Used as a second stage after bi-encoder retrieval: fetch top-N
|
||||||
|
candidates with cosine, then rerank to get top-K with cross-encoder
|
||||||
|
attention over (query, doc).
|
||||||
|
"""
|
||||||
|
if not documents:
|
||||||
|
return []
|
||||||
|
client = _get_client()
|
||||||
|
result = client.rerank(
|
||||||
|
query=query,
|
||||||
|
documents=documents,
|
||||||
|
model=config.VOYAGE_RERANK_MODEL,
|
||||||
|
top_k=top_k,
|
||||||
|
)
|
||||||
|
return [(r.index, float(r.relevance_score)) for r in result.results]
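Because `voyage_rerank` returns positions in the input list rather than database ids, the caller has to map them back. A minimal sketch of that step, assuming the candidates are held as dicts; `apply_rerank`, the `chunk_id` values and the scores are invented for illustration.

```python
def apply_rerank(
    candidates: list[dict], ranked: list[tuple[int, float]]
) -> list[dict]:
    """Reorder candidates by (orig_index, score) pairs and attach the score."""
    return [{**candidates[i], "rerank_score": score} for i, score in ranked]


# Hypothetical first-stage results, in cosine order.
candidates = [
    {"chunk_id": "c-17", "text": "first candidate"},
    {"chunk_id": "c-03", "text": "second candidate"},
    {"chunk_id": "c-42", "text": "third candidate"},
]
# Hypothetical reranker output with top_k=2: index 2 first, then index 0.
ranked = [(2, 0.91), (0, 0.40)]

reranked = apply_rerank(candidates, ranked)
assert [r["chunk_id"] for r in reranked] == ["c-42", "c-17"]
```

The documents passed to the reranker must be built from `candidates` in the same order, otherwise the indices point at the wrong rows.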
|
||||||
|
|||||||
@@ -9,29 +9,35 @@ Post-processing: Hebrew abbreviation quote fixer.
|
|||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
import asyncio
|
import asyncio
|
||||||
|
import io
|
||||||
import logging
|
import logging
|
||||||
import re
|
import re
|
||||||
import subprocess
|
import subprocess
|
||||||
import tempfile
|
import tempfile
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
|
from typing import TYPE_CHECKING
|
||||||
|
|
||||||
import fitz # PyMuPDF
|
import fitz # PyMuPDF
|
||||||
|
from PIL import Image
|
||||||
from docx import Document as DocxDocument
|
from docx import Document as DocxDocument
|
||||||
from google.cloud import vision
|
|
||||||
from striprtf.striprtf import rtf_to_text
|
from striprtf.striprtf import rtf_to_text
|
||||||
|
|
||||||
from legal_mcp import config
|
from legal_mcp import config
|
||||||
|
|
||||||
|
if TYPE_CHECKING:
|
||||||
|
from google.cloud import vision
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
# ── Google Cloud Vision client ───────────────────────────────────
|
# ── Google Cloud Vision client (imported lazily — saves ~550ms at MCP startup) ──
|
||||||
|
|
||||||
_vision_client: vision.ImageAnnotatorClient | None = None
|
_vision_client: "vision.ImageAnnotatorClient | None" = None
|
||||||
|
|
||||||
|
|
||||||
def _get_vision_client() -> vision.ImageAnnotatorClient:
|
def _get_vision_client() -> "vision.ImageAnnotatorClient":
|
||||||
global _vision_client
|
global _vision_client
|
||||||
if _vision_client is None:
|
if _vision_client is None:
|
||||||
|
from google.cloud import vision
|
||||||
_vision_client = vision.ImageAnnotatorClient(
|
_vision_client = vision.ImageAnnotatorClient(
|
||||||
client_options={"api_key": config.GOOGLE_CLOUD_VISION_API_KEY}
|
client_options={"api_key": config.GOOGLE_CLOUD_VISION_API_KEY}
|
||||||
)
|
)
|
||||||
@@ -118,12 +124,22 @@ def _fix_hebrew_quotes(text: str) -> str:
|
|||||||
# ── Extraction ───────────────────────────────────────────────────
|
# ── Extraction ───────────────────────────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
async def extract_text(file_path: str) -> tuple[str, int]:
|
# Separator used when joining per-page text. Constant so chunker /
|
||||||
|
# retrofit can reproduce the join when computing page offsets.
|
||||||
|
PAGE_SEPARATOR = "\n\n"
|
||||||
|
|
||||||
|
|
||||||
|
async def extract_text(file_path: str) -> tuple[str, int, list[int] | None]:
|
||||||
"""Extract text from a document file.
|
"""Extract text from a document file.
|
||||||
|
|
||||||
Returns:
|
Returns:
|
||||||
Tuple of (extracted_text, page_count).
|
``(text, page_count, page_offsets)`` where:
|
||||||
page_count is 0 for non-PDF files.
|
- ``text``: concatenated extracted text
|
||||||
|
- ``page_count``: number of pages (0 for non-PDF)
|
||||||
|
- ``page_offsets``: ``page_offsets[i]`` = char start offset of
|
||||||
|
page (i+1) inside ``text``. ``None`` for non-PDFs (where the
|
||||||
|
notion of pages doesn't apply). Used by the chunker to assign
|
||||||
|
a ``page_number`` to each chunk.
|
||||||
"""
|
"""
|
||||||
path = Path(file_path)
|
path = Path(file_path)
|
||||||
suffix = path.suffix.lower()
|
suffix = path.suffix.lower()
|
||||||
@@ -131,18 +147,34 @@ async def extract_text(file_path: str) -> tuple[str, int]:
|
|||||||
if suffix == ".pdf":
|
if suffix == ".pdf":
|
||||||
return await _extract_pdf(path)
|
return await _extract_pdf(path)
|
||||||
elif suffix == ".docx":
|
elif suffix == ".docx":
|
||||||
return _extract_docx(path), 0
|
return _extract_docx(path), 0, None
|
||||||
elif suffix == ".doc":
|
elif suffix == ".doc":
|
||||||
return _extract_doc(path), 0
|
return _extract_doc(path), 0, None
|
||||||
elif suffix == ".rtf":
|
elif suffix == ".rtf":
|
||||||
return _extract_rtf(path), 0
|
return _extract_rtf(path), 0, None
|
||||||
elif suffix in (".txt", ".md"):
|
elif suffix in (".txt", ".md"):
|
||||||
return path.read_text(encoding="utf-8"), 0
|
return path.read_text(encoding="utf-8"), 0, None
|
||||||
else:
|
else:
|
||||||
raise ValueError(f"Unsupported file type: {suffix}")
|
raise ValueError(f"Unsupported file type: {suffix}")
|
||||||
|
|
||||||
|
|
||||||
async def _extract_pdf(path: Path) -> tuple[str, int]:
|
def _join_pages(pages_text: list[str]) -> tuple[str, list[int]]:
|
||||||
|
"""Join per-page text with PAGE_SEPARATOR while recording the start
|
||||||
|
offset of each page in the joined output."""
|
||||||
|
offsets: list[int] = []
|
||||||
|
parts: list[str] = []
|
||||||
|
cursor = 0
|
||||||
|
for i, pg in enumerate(pages_text):
|
||||||
|
offsets.append(cursor)
|
||||||
|
parts.append(pg)
|
||||||
|
cursor += len(pg)
|
||||||
|
if i < len(pages_text) - 1:
|
||||||
|
parts.append(PAGE_SEPARATOR)
|
||||||
|
cursor += len(PAGE_SEPARATOR)
|
||||||
|
return "".join(parts), offsets
|
||||||
|
|
||||||
|
|
||||||
|
async def _extract_pdf(path: Path) -> tuple[str, int, list[int]]:
|
||||||
"""Extract text from PDF.
|
"""Extract text from PDF.
|
||||||
|
|
||||||
Try direct text first, fall back to Google Cloud Vision for scanned
|
Try direct text first, fall back to Google Cloud Vision for scanned
|
||||||
@@ -170,11 +202,32 @@ async def _extract_pdf(path: Path) -> tuple[str, int]:
|
|||||||
pages_text.append(ocr_text)
|
pages_text.append(ocr_text)
|
||||||
|
|
||||||
doc.close()
|
doc.close()
|
||||||
return "\n\n".join(pages_text), page_count
|
joined, offsets = _join_pages(pages_text)
|
||||||
|
return joined, page_count, offsets
|
||||||
|
|
||||||
|
|
||||||
|
def page_at_offset(offset: int, page_offsets: list[int]) -> int:
|
||||||
|
"""Look up the page number containing a given char offset.
|
||||||
|
|
||||||
|
page_offsets[i] is the start of page (i+1) in the joined text;
|
||||||
|
a chunk starting at ``offset`` belongs to the highest-indexed page
|
||||||
|
whose start is ``<= offset``. Returns 1-based page number.
|
||||||
|
"""
|
||||||
|
if not page_offsets:
|
||||||
|
return 1
|
||||||
|
# Linear scan is fine — page_offsets is short (≤ ~200 for our PDFs).
|
||||||
|
page = 1
|
||||||
|
for i, start in enumerate(page_offsets):
|
||||||
|
if start <= offset:
|
||||||
|
page = i + 1
|
||||||
|
else:
|
||||||
|
break
|
||||||
|
return page
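A worked example of the offset bookkeeping, as a sketch rather than the module's code. `join_pages` mirrors `_join_pages`, while `page_at_offset` here swaps the linear scan for `bisect_right`, which should give the same answers on a sorted offset list. The page strings are made up.

```python
from bisect import bisect_right

PAGE_SEPARATOR = "\n\n"


def join_pages(pages_text: list[str]) -> tuple[str, list[int]]:
    """Join pages and record where each page starts in the joined text."""
    offsets: list[int] = []
    cursor = 0
    for pg in pages_text:
        offsets.append(cursor)
        cursor += len(pg) + len(PAGE_SEPARATOR)
    return PAGE_SEPARATOR.join(pages_text), offsets


def page_at_offset(offset: int, page_offsets: list[int]) -> int:
    """1-based page whose start is the largest one <= offset."""
    if not page_offsets:
        return 1
    return max(1, bisect_right(page_offsets, offset))


text, offsets = join_pages(["abc", "de", "fghi"])
assert text == "abc\n\nde\n\nfghi"
assert offsets == [0, 5, 9]
assert page_at_offset(text.index("de"), offsets) == 2
assert page_at_offset(4, offsets) == 1      # inside the separator after page 1
assert page_at_offset(10_000, offsets) == 3  # past the end maps to the last page
```

Offsets that fall inside a separator are attributed to the preceding page, which matches the "highest-indexed page whose start is `<= offset`" rule in the docstring.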
|
||||||
|
|
||||||
|
|
||||||
def _ocr_with_google_vision(image_bytes: bytes, page_num: int) -> str:
|
def _ocr_with_google_vision(image_bytes: bytes, page_num: int) -> str:
|
||||||
"""OCR a single page image using Google Cloud Vision API."""
|
"""OCR a single page image using Google Cloud Vision API."""
|
||||||
|
from google.cloud import vision # lazy: keeps MCP startup fast
|
||||||
client = _get_vision_client()
|
client = _get_vision_client()
|
||||||
image = vision.Image(content=image_bytes)
|
image = vision.Image(content=image_bytes)
|
||||||
|
|
||||||
@@ -220,6 +273,65 @@ def _extract_rtf(path: Path) -> str:
|
|||||||
return rtf_to_text(rtf_content)
|
return rtf_to_text(rtf_content)
|
||||||
|
|
||||||
|
|
||||||
|
# ── Multimodal page rendering (V9) ───────────────────────────────
|
||||||
|
|
||||||
|
|
||||||
|
def _pixmap_to_pil(pix: fitz.Pixmap) -> Image.Image:
|
||||||
|
"""Convert a PyMuPDF pixmap to PIL.Image (RGB) without going through
|
||||||
|
PNG bytes. Faster than tobytes('png') → Image.open()."""
|
||||||
|
if pix.alpha:
|
||||||
|
# Drop alpha channel — voyage multimodal expects RGB.
|
||||||
|
pix = fitz.Pixmap(pix, 0)
|
||||||
|
return Image.frombytes("RGB", (pix.width, pix.height), pix.samples)
|
||||||
|
|
||||||
|
|
||||||
|
def render_pages_for_multimodal(
|
||||||
|
pdf_path: str | Path,
|
||||||
|
embed_dpi: int,
|
||||||
|
thumb_dpi: int | None = None,
|
||||||
|
thumbnail_dir: Path | None = None,
|
||||||
|
) -> list[tuple[Image.Image, Path | None]]:
|
||||||
|
"""Render each PDF page as PIL.Image at ``embed_dpi`` for the
|
||||||
|
multimodal embedder, and optionally save a smaller JPEG thumbnail
|
||||||
|
at ``thumb_dpi`` to ``thumbnail_dir`` for UI preview.
|
||||||
|
|
||||||
|
Returns ``[(pil_image, thumb_path_or_None), ...]`` in page order.
|
||||||
|
The full-DPI image stays in memory only — only the thumbnail is
|
||||||
|
persisted to disk.
|
||||||
|
"""
|
||||||
|
src = Path(pdf_path)
|
||||||
|
if not src.is_file():
|
||||||
|
raise FileNotFoundError(f"PDF not found: {src}")
|
||||||
|
if thumbnail_dir is not None:
|
||||||
|
thumbnail_dir.mkdir(parents=True, exist_ok=True)
|
||||||
|
|
||||||
|
out: list[tuple[Image.Image, Path | None]] = []
|
||||||
|
doc = fitz.open(str(src))
|
||||||
|
try:
|
||||||
|
for page_idx, page in enumerate(doc):
|
||||||
|
page_num = page_idx + 1
|
||||||
|
pix = page.get_pixmap(dpi=embed_dpi)
|
||||||
|
img = _pixmap_to_pil(pix)
|
||||||
|
|
||||||
|
thumb_path: Path | None = None
|
||||||
|
if thumbnail_dir is not None and thumb_dpi:
|
||||||
|
thumb_path = thumbnail_dir / f"p{page_num:03d}.jpg"
|
||||||
|
# Downsample the same render rather than re-rendering
|
||||||
|
# with PyMuPDF — far faster.
|
||||||
|
ratio = thumb_dpi / embed_dpi
|
||||||
|
thumb_size = (
|
||||||
|
max(1, int(img.width * ratio)),
|
||||||
|
max(1, int(img.height * ratio)),
|
||||||
|
)
|
||||||
|
thumb = img.resize(thumb_size, Image.Resampling.LANCZOS)
|
||||||
|
thumb.save(thumb_path, "JPEG", quality=75, optimize=True)
|
||||||
|
|
||||||
|
out.append((img, thumb_path))
|
||||||
|
finally:
|
||||||
|
doc.close()
|
||||||
|
return out
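The thumbnail dimensions follow directly from the DPI ratio. A small arithmetic sketch, with `thumbnail_size` as a hypothetical helper and the pixel dimensions chosen only as an example:

```python
def thumbnail_size(
    width: int, height: int, embed_dpi: int, thumb_dpi: int
) -> tuple[int, int]:
    """Scale pixel dimensions by thumb_dpi / embed_dpi, never below 1 px."""
    ratio = thumb_dpi / embed_dpi
    return (max(1, int(width * ratio)), max(1, int(height * ratio)))


# A 1190x1684 render at 144 DPI halves to 595x842 at 72 DPI.
assert thumbnail_size(1190, 1684, embed_dpi=144, thumb_dpi=72) == (595, 842)
# The max(1, ...) guard keeps degenerate pages from producing a 0-px image.
assert thumbnail_size(1, 1, embed_dpi=144, thumb_dpi=72) == (1, 1)
```

`int()` truncates rather than rounds, so a thumbnail can come out one pixel smaller than the exact ratio would give.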
|
||||||
|
|
||||||
|
|
||||||
# ── Nevo preamble stripping ──────────────────────────────────────
|
# ── Nevo preamble stripping ──────────────────────────────────────
|
||||||
|
|
||||||
_NEVO_MARKERS = ("ספרות:", "חקיקה שאוזכרה:", "מיני-רציו:", "פסקי דין שאוזכרו:",
|
_NEVO_MARKERS = ("ספרות:", "חקיקה שאוזכרה:", "מיני-רציו:", "פסקי דין שאוזכרו:",
|
||||||
|
|||||||
208
mcp-server/src/legal_mcp/services/git_sync.py
Normal file
@@ -0,0 +1,208 @@
|
|||||||
|
"""Git sync helpers for case repos.
|
||||||
|
|
||||||
|
Each case lives in its own git repo with a Gitea remote. The remote URL
|
||||||
|
embeds an auth token (https://chaim:TOKEN@host/...). When the token is
|
||||||
|
rotated in Infisical, pushes from repos created with the old token fail
|
||||||
|
quietly (logged only at WARNING level). ``commit_and_push``
|
||||||
|
re-injects the *current* token into the existing origin URL on every
|
||||||
|
call, so push survives token rotation.
|
||||||
|
|
||||||
|
This module also runs a periodic ``sweep_loop`` that catches files
|
||||||
|
written outside the API path (most importantly: agents writing research
|
||||||
|
artefacts directly to the case dir). The full case repo is the user's
|
||||||
|
backup, so anything in the dir must end up on Gitea.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import logging
|
||||||
|
import os
|
||||||
|
import subprocess
|
||||||
|
from pathlib import Path
|
||||||
|
|
||||||
|
from legal_mcp import config
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
def _gitea_token() -> str:
|
||||||
|
return os.environ.get("GITEA_ACCESS_TOKEN") or os.environ.get("GITEA_TOKEN", "")
|
||||||
|
|
||||||
|
|
||||||
|
def _git_env(case_dir: str | Path | None = None) -> dict:
|
||||||
|
env = {
|
||||||
|
"GIT_AUTHOR_NAME": "Ezer Mishpati",
|
||||||
|
"GIT_AUTHOR_EMAIL": "legal@local",
|
||||||
|
"GIT_COMMITTER_NAME": "Ezer Mishpati",
|
||||||
|
"GIT_COMMITTER_EMAIL": "legal@local",
|
||||||
|
"PATH": os.environ.get("PATH", "/usr/bin:/bin"),
|
||||||
|
"GIT_TERMINAL_PROMPT": "0",
|
||||||
|
}
|
||||||
|
if case_dir is not None:
|
||||||
|
# Trust the case dir even when the running uid differs from the
|
||||||
|
# owner (prod container is uniform-root, but host runs may not be).
|
||||||
|
env["GIT_CONFIG_COUNT"] = "1"
|
||||||
|
env["GIT_CONFIG_KEY_0"] = "safe.directory"
|
||||||
|
env["GIT_CONFIG_VALUE_0"] = str(case_dir)
|
||||||
|
return env
|
||||||
|
|
||||||
|
|
||||||
|
def _refresh_remote_url(case_dir: Path, env: dict) -> bool:
|
||||||
|
result = subprocess.run(
|
||||||
|
["git", "remote", "get-url", "origin"],
|
||||||
|
cwd=case_dir, capture_output=True, text=True, env=env,
|
||||||
|
)
|
||||||
|
if result.returncode != 0:
|
||||||
|
return False
|
||||||
|
current_url = result.stdout.strip()
|
||||||
|
if "@" in current_url and current_url.startswith("https://"):
|
||||||
|
bare_url = "https://" + current_url.split("@", 1)[1]
|
||||||
|
else:
|
||||||
|
bare_url = current_url
|
||||||
|
token = _gitea_token()
|
||||||
|
if not token:
|
||||||
|
return True # Push without auth — will fail, but caller decides what to do
|
||||||
|
auth_url = bare_url.replace("https://", f"https://chaim:{token}@")
|
||||||
|
if auth_url != current_url:
|
||||||
|
subprocess.run(
|
||||||
|
["git", "remote", "set-url", "origin", auth_url],
|
||||||
|
cwd=case_dir, capture_output=True, env=env,
|
||||||
|
)
|
||||||
|
return True
|
||||||
|
|
||||||
|
|
||||||
|
def commit_and_push(case_dir: str | Path, message: str) -> bool:
|
||||||
|
"""Stage, commit, refresh origin URL with current token, and push.
|
||||||
|
|
||||||
|
Best-effort: on failure logs at WARNING and returns False, but never
|
||||||
|
raises. Continues to push even if the commit was a no-op (in case
|
||||||
|
earlier commits are unpushed).
|
||||||
|
"""
|
||||||
|
case_dir = Path(case_dir)
|
||||||
|
if not (case_dir / ".git").exists():
|
||||||
|
return False
|
||||||
|
|
||||||
|
env = _git_env(case_dir)
|
||||||
|
|
||||||
|
subprocess.run(["git", "add", "."], cwd=case_dir, capture_output=True, env=env)
|
||||||
|
commit = subprocess.run(
|
||||||
|
["git", "commit", "-m", message],
|
||||||
|
cwd=case_dir, capture_output=True, text=True, env=env,
|
||||||
|
)
|
||||||
|
if commit.returncode != 0 and "nothing to commit" not in commit.stdout:
|
||||||
|
logger.warning("Git commit failed in %s: %s", case_dir, commit.stderr or commit.stdout)
|
||||||
|
|
||||||
|
if not _refresh_remote_url(case_dir, env):
|
||||||
|
logger.warning("No origin remote configured in %s — skipping push", case_dir)
|
||||||
|
return False
|
||||||
|
|
||||||
|
push = subprocess.run(
|
||||||
|
["git", "push"],
|
||||||
|
cwd=case_dir, capture_output=True, text=True, env=env,
|
||||||
|
)
|
||||||
|
if push.returncode != 0:
|
||||||
|
logger.warning("Git push failed in %s: %s", case_dir, push.stderr)
|
||||||
|
return False
|
||||||
|
return True
|
||||||
|
|
||||||
|
|
||||||
|
# ── Periodic sweep ────────────────────────────────────────────────
|
||||||
|
#
|
||||||
|
# The user's expectation is that "anything I or an agent puts into a case
|
||||||
|
# dir ends up on Gitea". Explicit commit_and_push calls cover the API
|
||||||
|
# write paths, but agents write research/draft files directly to disk.
|
||||||
|
# A short periodic sweep is the safety net.
|
||||||
|
|
||||||
|
_SWEEP_INTERVAL_SEC = 30
|
||||||
|
|
||||||
|
|
||||||
|
def _porcelain_changes(case_dir: Path, env: dict) -> list[str]:
|
||||||
|
"""Return list of `git status --porcelain` lines, or [] if clean/error."""
|
||||||
|
res = subprocess.run(
|
||||||
|
["git", "status", "--porcelain"],
|
||||||
|
cwd=case_dir, capture_output=True, text=True, env=env,
|
||||||
|
)
|
||||||
|
if res.returncode != 0:
|
||||||
|
return []
|
||||||
|
return [ln for ln in res.stdout.splitlines() if ln.strip()]
|
||||||
|
|
||||||
|
|
||||||
|
def _auto_message(changes: list[str]) -> str:
|
||||||
|
"""Build a Hebrew commit message from porcelain output.
|
||||||
|
|
||||||
|
Groups by top-level subdir under the case dir so a sweep that picks up
|
||||||
|
one DOCX export plus one research file produces a useful summary
|
||||||
|
instead of "auto-sync".
|
||||||
|
"""
|
||||||
|
groups: dict[str, int] = {}
|
||||||
|
sample: dict[str, str] = {}
|
||||||
|
for line in changes:
|
||||||
|
path = line[3:].strip().strip('"')
|
||||||
|
if "->" in path: # rename
|
||||||
|
path = path.split("->", 1)[1].strip().strip('"')
|
||||||
|
first = path.split("/", 1)[0]
|
||||||
|
groups[first] = groups.get(first, 0) + 1
|
||||||
|
sample.setdefault(first, path)
|
||||||
|
|
||||||
|
label_map = {
|
||||||
|
"documents": "מסמכים",
|
||||||
|
"drafts": "טיוטות",
|
||||||
|
"exports": "גרסאות",
|
||||||
|
"case.json": "מטא",
|
||||||
|
"notes.md": "הערות",
|
||||||
|
}
|
||||||
|
parts: list[str] = []
|
||||||
|
for top, count in groups.items():
|
||||||
|
label = label_map.get(top, top)
|
||||||
|
parts.append(f"{label} ({count})" if count > 1 else label)
|
||||||
|
summary = " · ".join(parts) or "שינויים"
|
||||||
|
return f"אוטו: {summary}"
|
||||||
|
|
||||||
|
|
||||||
|
def sweep_once() -> dict:
|
||||||
|
"""Walk every case dir and commit+push any dirty changes.
|
||||||
|
|
||||||
|
Synchronous (subprocess-based) but cheap: `git status --porcelain` on
|
||||||
|
a clean dir typically takes a few milliseconds. Returns a small report
|
||||||
|
suitable for logging.
|
||||||
|
"""
|
||||||
|
base: Path = config.CASES_DIR
|
||||||
|
if not base.exists():
|
||||||
|
return {"checked": 0, "synced": 0, "errors": 0}
|
||||||
|
|
||||||
|
checked = synced = errors = 0
|
||||||
|
for case_dir in base.iterdir():
|
||||||
|
if not case_dir.is_dir() or not (case_dir / ".git").exists():
|
||||||
|
continue
|
||||||
|
checked += 1
|
||||||
|
changes = _porcelain_changes(case_dir, _git_env(case_dir))
|
||||||
|
if not changes:
|
||||||
|
continue
|
||||||
|
msg = _auto_message(changes)
|
||||||
|
ok = commit_and_push(case_dir, msg)
|
||||||
|
if ok:
|
||||||
|
synced += 1
|
||||||
|
logger.info("auto-sync committed %d change(s) in %s", len(changes), case_dir.name)
|
||||||
|
else:
|
||||||
|
errors += 1
|
||||||
|
return {"checked": checked, "synced": synced, "errors": errors}
|
||||||
|
|
||||||
|
|
||||||
|
async def sweep_loop(interval_sec: int = _SWEEP_INTERVAL_SEC) -> None:
|
||||||
|
"""Background task: run sweep_once forever every interval_sec.
|
||||||
|
|
||||||
|
Cancellation-safe; logs and continues on transient errors.
|
||||||
|
"""
|
||||||
|
logger.info("git_sync.sweep_loop started (interval=%ds)", interval_sec)
|
||||||
|
while True:
|
||||||
|
try:
|
||||||
|
await asyncio.sleep(interval_sec)
|
||||||
|
# Run the sync subprocess work in a thread to avoid blocking
|
||||||
|
# the FastAPI event loop.
|
||||||
|
await asyncio.to_thread(sweep_once)
|
||||||
|
except asyncio.CancelledError:
|
||||||
|
logger.info("git_sync.sweep_loop cancelled")
|
||||||
|
raise
|
||||||
|
except Exception as exc:
|
||||||
|
logger.warning("git_sync sweep iteration failed: %s", exc)
|
||||||
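A note on the token refresh in `git_sync.py`: `_refresh_remote_url` strips the old credentials and re-adds the current ones with string `split`/`replace`. The same idea can be sketched with `urllib.parse`, which also percent-encodes the token. This is an illustrative sketch only, not the module's code: the helper name `inject_token` and the host `git.example.com` are hypothetical and were introduced for this example.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def inject_token(url: str, user: str, token: str) -> str:
    """Return `url` with its userinfo replaced by `user:token`.

    Hypothetical helper for illustration. Only https:// remotes are
    rewritten; SSH-style remotes and the empty-token case are returned
    unchanged.
    """
    parts = urlsplit(url)
    if parts.scheme != "https" or not token:
        return url
    # Drop any existing "user:old-token@" prefix, keeping host[:port] as-is.
    hostport = parts.netloc.rsplit("@", 1)[-1]
    # Percent-encode so characters such as "/" or "@" in a token cannot
    # change how the URL is parsed.
    netloc = f"{quote(user, safe='')}:{quote(token, safe='')}@{hostport}"
    return urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    stale = "https://chaim:OLD@git.example.com/org/case.git"
    print(inject_token(stale, "chaim", "NEW"))
```

Because the rewrite is idempotent, running it before every push is harmless when the token has not changed, which matches how `commit_and_push` calls its own refresh helper on each invocation.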
473
mcp-server/src/legal_mcp/services/halacha_extractor.py
Normal file
@@ -0,0 +1,473 @@
|
|||||||
|
"""Extract binding legal rules (הלכות) from external court rulings.
|
||||||
|
|
||||||
|
Runs Claude (via the local headless ``claude -p`` bridge) over the
|
||||||
|
legal_analysis / ruling / conclusion chunks of a precedent, returns a
|
||||||
|
structured list of halachot, validates each one against the source text,
|
||||||
|
embeds the rule statement, and stores everything as ``pending_review`` in
|
||||||
|
the ``halachot`` table.
|
||||||
|
|
||||||
|
Extraction is idempotent: every call to ``extract(case_law_id)`` first
|
||||||
|
deletes any rows previously stored for that precedent.
|
||||||
|
|
||||||
|
Trust model:
|
||||||
|
Per chair decision, NO halacha is auto-published. Every extracted
|
||||||
|
halacha enters with ``review_status='pending_review'``. The chair
|
||||||
|
approves/rejects via the UI, and only ``approved`` (or ``published``)
|
||||||
|
rows are visible to ``search_precedent_library`` and the writing
|
||||||
|
agents.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import logging
|
||||||
|
import re
|
||||||
|
from uuid import UUID
|
||||||
|
|
||||||
|
from legal_mcp import config
|
||||||
|
from legal_mcp.config import parse_llm_json
|
||||||
|
from legal_mcp.services import claude_session, db, embeddings, proofreader
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
# Concurrency model mirrors claims_extractor — each ``claude -p`` subprocess
|
||||||
|
# holds ~300 MB RSS, so we cap parallel chunks to keep the box healthy.
|
||||||
|
CHUNK_CONCURRENCY = 3
|
||||||
|
CHUNK_RETRY_ATTEMPTS = 1
|
||||||
|
|
||||||
|
# If at least this fraction of chunks crash and the precedent yields zero
|
||||||
|
# halachot, treat the run as `extraction_failed` rather than `no_halachot`.
|
||||||
|
# Picked at 0.5 so a precedent that genuinely has no holdings (e.g. a remand
|
||||||
|
# ruling that just sends the case back) isn't misflagged just because a few
|
||||||
|
# chunks timed out, while a real rate-limit storm — which kills nearly every
|
||||||
|
# call — is correctly distinguished and re-tried by the caller.
|
||||||
|
EXTRACTION_FAILURE_THRESHOLD = 0.5
|
||||||
|
|
||||||
|
# Sections from which to extract. facts/intro/appellant_claims/respondent_claims
|
||||||
|
# never contain holdings, only positions, so we skip them.
|
||||||
|
EXTRACTABLE_SECTIONS = ("legal_analysis", "ruling", "conclusion")
|
||||||
|
|
||||||
|
|
||||||
|
# Two prompts — choose by source's is_binding flag.
|
||||||
|
#
|
||||||
|
# The binding prompt extracts strict halachot (rules a future panel MUST
|
||||||
|
# follow). It rejects obiter dicta, factual findings, and citations of
|
||||||
|
# other rulings that the present court only mentioned in passing.
|
||||||
|
#
|
||||||
|
# The persuasive prompt is for sources that don't establish binding law
|
||||||
|
# (most appeals committee decisions, district courts on planning matters,
|
||||||
|
# etc.). For those, the value is in **how the panel reasoned and applied**
|
||||||
|
# established law to facts — not in new halachot. The user explicitly
|
||||||
|
# wants to be able to cite "another committee reached the same conclusion"
|
||||||
|
# even though it is not binding.
|
||||||
|
#
|
||||||
|
# The schema's rule_type field accepts six values:
|
||||||
|
# binding | interpretive | procedural | obiter | application | persuasive
|
||||||
|
|
||||||
|
HALACHA_EXTRACTION_PROMPT_BINDING = """אתה משפטן בכיר המתמחה בדיני תכנון ובניה (ועדות ערר, היטל השבחה, פיצויים לפי סעיף 197 לחוק התכנון והבניה). תפקידך: לחלץ הלכות מחייבות מתוך פסק דין/החלטה משפטית של ערכאה עליונה (עליון / מנהלי).
|
||||||
|
|
||||||
|
## הגדרות מחייבות
|
||||||
|
|
||||||
|
הלכה (binding rule) = כלל משפטי שהפסק קובע או מאמץ ומיישם, באופן שניתן להסתמך עליו בהחלטות עתידיות.
|
||||||
|
|
||||||
|
לא-הלכה (אין לחלץ):
|
||||||
|
- אמרת אגב (obiter dicta) — הערות שאינן הכרחיות להכרעה.
|
||||||
|
- ממצאים עובדתיים ספציפיים לתיק ("העורר לא הוכיח X").
|
||||||
|
- ציטוטי הלכות מפסקי דין אחרים שלא אומצו במפורש בפסק זה.
|
||||||
|
- הצהרות על דין קיים שאינן מיושמות בהכרעה.
|
||||||
|
|
||||||
|
הבחנה קריטית: כאשר הפסק מצטט הלכה מפסק קודם, חלץ אותה רק אם בית המשפט בפסק הנוכחי **מאמץ ומחיל** אותה (לא רק מזכיר אותה ברקע).
|
||||||
|
|
||||||
|
## תחומים אפשריים (practice_areas) — תחומי ועדת הערר בלבד
|
||||||
|
- rishuy_uvniya — רישוי ובניה (תיקי 1xxx: היתרים, שימוש חורג, תכניות, קווי בניין, גובה, חניה)
|
||||||
|
- betterment_levy — היטל השבחה (תיקי 8xxx: שומה, מערכות, תכניות המקנות בה, מועד קובע, סופיות ההחלטה)
|
||||||
|
- compensation_197 — פיצויים לפי ס' 197 (תיקי 9xxx: פגיעה במקרקעין, ירידת ערך, ס' 200/פטור)
|
||||||
|
|
||||||
|
הלכה אחת יכולה לחול על כמה תחומים — practice_areas הוא array ולא string יחיד.
|
||||||
|
|
||||||
|
## סוגי הלכה (rule_type)
|
||||||
|
- binding — הלכה מחייבת שהוחלה על התיק.
|
||||||
|
- interpretive — פרשנות סעיף חוק/תכנית שאומצה.
|
||||||
|
- procedural — כלל פרוצדורלי (סמכות, מועדים, הליכי שמיעה).
|
||||||
|
- obiter — אמרת אגב חשובה (חלץ רק אם משמעותית; סמן confidence נמוך).
|
||||||
|
|
||||||
|
## פלט נדרש
|
||||||
|
החזר JSON array בלבד, ללא markdown, ללא הסברים. דוגמה:
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"rule_statement": "ניסוח הכלל בלשון משפטית מדויקת בגוף שלישי, 1-3 משפטים.",
|
||||||
|
"rule_type": "binding",
|
||||||
|
"reasoning_summary": "תמצית ההיגיון: למה בית המשפט הגיע לכלל הזה (1-2 משפטים).",
|
||||||
|
"supporting_quote": "ציטוט מילולי מדויק מהפסק התומך בכלל. חייב להופיע מילה במילה בטקסט הקלט.",
|
||||||
|
"page_reference": "פס' 12 / עמ' 8 — ככל שניתן לזהות מהקלט.",
|
||||||
|
"practice_areas": ["betterment_levy"],
|
||||||
|
"subject_tags": ["מועד_קביעת_שומה", "סופיות_ההחלטה"],
|
||||||
|
"cites": ["עע\\"מ 3975/22"],
|
||||||
|
"confidence": 0.85
|
||||||
|
}
|
||||||
|
]
|
||||||
|
|
||||||
|
## כללי איכות
|
||||||
|
1. **נאמנות מוחלטת לציטוט** — supporting_quote חייב להיות הדבקה מדויקת מהקלט. אם אין ציטוט מתאים — אל תמציא הלכה.
|
||||||
|
2. **מספר הלכות** — פסק רגיל מכיל 1-4 הלכות מחייבות. אל תמתח את הרשימה. אם אין הלכה — החזר [].
|
||||||
|
3. **לא לפצל יתר על המידה** — אם שני סעיפים מבטאים את אותו עיקרון, אחד את הניסוח.
|
||||||
|
4. **שפה** — rule_statement בעברית משפטית מקצועית, לא צמצום מילולי של הציטוט.
|
||||||
|
5. **subject_tags** — 2-5 תגיות בעברית, snake_case (חניה, קווי_בניין, שיקול_דעת, פגם_פרוצדורלי, סמכות, מועדים, פגיעה_במקרקעין, ירידת_ערך).
|
||||||
|
6. **confidence** — 0..1. מתחת ל-0.7 = ספק לגבי היות זה הלכה מחייבת.
|
||||||
|
"""
|
||||||
|
|
||||||
|
|
||||||
|
HALACHA_EXTRACTION_PROMPT_PERSUASIVE = """אתה משפטן בכיר המתמחה בדיני תכנון ובניה. תפקידך: לחלץ עקרונות, יישומים ומסקנות מתוך החלטה של ועדת ערר אחרת או של בית משפט שאינו ערכאה עליונה לסוגיה.
|
||||||
|
|
||||||
|
## חשוב — מה לחלץ ומה לא
|
||||||
|
|
||||||
|
המקור הזה **אינו** מקור להלכות מחייבות חדשות (binding rules). הלכות מחייבות מגיעות מהעליון/מנהלי. עם זאת, יש כאן ערך משמעותי שצריך לחלץ — איך הפנל הזה ניתח ויישם את הדין הקיים. כשנכתוב החלטה עתידית, נצטט מהמקור הזה כ"גם ועדת הערר ב-X הגיעה למסקנה דומה" — לא כסמכות מחייבת, אלא כתמיכה משכנעת.
|
||||||
|
|
||||||
|
**יש לחלץ:**
|
||||||
|
- **יישום של הלכה ידועה** (rule_type=`application`) — הפנל החיל הלכה ידועה (של עליון/מנהלי) על עובדות הנידונות. תצטט את ניסוח הכלל **כפי שהוצג כאן** (לא בהכרח כפי שנקבע במקור) ואת התוצאה.
|
||||||
|
- **עקרון פרשני שאומץ** (rule_type=`interpretive`) — איך הפנל פירש סעיף חוק / תכנית, באופן שניתן לאמץ.
|
||||||
|
- **כלל פרוצדורלי** (rule_type=`procedural`) — קביעות בנושאי סמכות, מועדים, הליך.
|
||||||
|
- **מסקנה מנומקת ומשכנעת** (rule_type=`persuasive`) — מסקנה שלמה של הפנל בסוגיה, עם ההיגיון התומך, ניתנת לציטוט כאסמכתא משכנעת.
|
||||||
|
|
||||||
|
**אין לחלץ:**
|
||||||
|
- ממצאים עובדתיים ספציפיים לתיק ("העורר לא הוכיח X").
|
||||||
|
- ציטוטים מפסקי דין אחרים ללא ניתוח של הפנל.
|
||||||
|
- אמרות אגב חסרות חשיבות.
|
||||||
|
|
||||||
|
## תחומים אפשריים (practice_areas) — תחומי ועדת הערר בלבד
|
||||||
|
- rishuy_uvniya — רישוי ובניה (תיקי 1xxx: היתרים, שימוש חורג, תכניות, קווי בניין, גובה, חניה)
|
||||||
|
- betterment_levy — היטל השבחה (תיקי 8xxx: שומה, מערכות, תכניות המקנות בה, מועד קובע, סופיות ההחלטה)
|
||||||
|
- compensation_197 — פיצויים לפי ס' 197 (תיקי 9xxx: פגיעה במקרקעין, ירידת ערך, ס' 200/פטור)
|
||||||
|
|
||||||
|
## פלט נדרש
|
||||||
|
החזר JSON array בלבד, ללא markdown, ללא הסברים:
|
||||||
|
[
|
||||||
|
{
|
||||||
|
"rule_statement": "ניסוח הכלל / המסקנה / היישום בלשון משפטית מדויקת, 1-3 משפטים.",
|
||||||
|
"rule_type": "application",
|
||||||
|
"reasoning_summary": "תמצית ההיגיון של הפנל (1-2 משפטים).",
|
||||||
|
"supporting_quote": "ציטוט מילולי מדויק מהקלט שתומך בכלל. חייב להופיע מילה במילה.",
|
||||||
|
"page_reference": "פס' 12 / עמ' 8 — ככל שניתן לזהות.",
|
||||||
|
"practice_areas": ["betterment_levy"],
|
||||||
|
"subject_tags": ["מועד_קביעת_שומה", "תכנית_רחביה"],
|
||||||
|
"cites": ["עע\\"מ 3975/22"],
|
||||||
|
"confidence": 0.85
|
||||||
|
}
|
||||||
|
]
|
||||||
|
|
||||||
|
## כללי איכות
|
||||||
|
1. **נאמנות מוחלטת לציטוט** — supporting_quote חייב להיות הדבקה מדויקת מהקלט. אם אין ציטוט מתאים — אל תוסיף את ההלכה.
|
||||||
|
2. **מספר הלכות** — החלטה ארוכה של ועדת ערר יכולה להניב 2-8 פריטים (יישומים + מסקנות). אם אין מה לחלץ — החזר [].
|
||||||
|
3. **rule_type מדויק** — application = יישום הלכה ידועה. interpretive = פרשנות. procedural = פרוצדורה. persuasive = מסקנה כללית בעלת ערך כאסמכתא.
|
||||||
|
4. **לא לפצל יתר על המידה** — שני סעיפים זהים מבחינה רעיונית = פריט אחד.
|
||||||
|
5. **שפה** — עברית משפטית מקצועית, גוף שלישי.
|
||||||
|
6. **subject_tags** — 2-5 תגיות בעברית, snake_case.
|
||||||
|
7. **confidence** — 0..1. דייק.
|
||||||
|
"""
|
||||||
|
|
||||||
|
|
||||||
|
_VALID_PRACTICE_AREAS = {"rishuy_uvniya", "betterment_levy", "compensation_197"}
|
||||||
|
_VALID_RULE_TYPES = {
|
||||||
|
"binding", "interpretive", "procedural", "obiter",
|
||||||
|
"application", "persuasive",
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
def _normalize_for_comparison(text: str) -> str:
|
||||||
|
"""Normalize Hebrew text for substring matching.
|
||||||
|
|
||||||
|
Collapses whitespace and unifies the half-dozen Hebrew quote-mark
|
||||||
|
variants. Uses ``proofreader._fix_hebrew_quotes`` for the quote part
|
||||||
|
so we stay consistent with the proofreader pipeline.
|
||||||
|
"""
|
||||||
|
fixed = proofreader._fix_hebrew_quotes(text)
|
||||||
|
# Collapse all whitespace (newlines, tabs, multiple spaces) to a single space.
|
||||||
|
return re.sub(r"\s+", " ", fixed).strip()
|
||||||
|
|
||||||
|
|
||||||
|
def _verify_quote(supporting_quote: str, full_text: str) -> bool:
|
||||||
|
"""Return True if ``supporting_quote`` appears verbatim in ``full_text``
|
||||||
|
after Hebrew quote/whitespace normalization.
|
||||||
|
|
||||||
|
The LLM occasionally alters a leading/trailing word of the quote, so
|
||||||
|
for quotes of 30+ normalized characters we also accept a match on the
|
||||||
|
inner ~90% (5% trimmed from each end, at least 2 characters per side).
|
||||||
|
"""
|
||||||
|
if not supporting_quote.strip():
|
||||||
|
return False
|
||||||
|
normalized_quote = _normalize_for_comparison(supporting_quote)
|
||||||
|
normalized_text = _normalize_for_comparison(full_text)
|
||||||
|
if not normalized_quote:
|
||||||
|
return False
|
||||||
|
if normalized_quote in normalized_text:
|
||||||
|
return True
|
||||||
|
# Fallback: try the inner 90% of the quote (drops boundary trim).
|
||||||
|
if len(normalized_quote) >= 30:
|
||||||
|
trim = max(2, len(normalized_quote) // 20)
|
||||||
|
inner = normalized_quote[trim:-trim]
|
||||||
|
if inner and inner in normalized_text:
|
||||||
|
return True
|
||||||
|
return False
|
||||||
|
|
||||||
|
|
||||||
|
def _coerce_halacha(raw: dict, is_binding: bool = True) -> dict | None:
|
||||||
|
"""Validate and normalize one LLM-returned halacha dict.
|
||||||
|
|
||||||
|
Returns ``None`` if the entry is missing required fields. ``is_binding``
|
||||||
|
sets the fallback rule_type for a missing or unknown value (``binding``
|
||||||
|
for binding sources, ``persuasive`` otherwise) and downgrades an explicit
|
||||||
|
``binding`` from a non-binding source to ``persuasive`` (never pretend an
|
||||||
|
appeals committee created halacha).
|
||||||
|
"""
|
||||||
|
if not isinstance(raw, dict):
|
||||||
|
return None
|
||||||
|
rule_statement = (raw.get("rule_statement") or "").strip()
|
||||||
|
supporting_quote = (raw.get("supporting_quote") or "").strip()
|
||||||
|
if not rule_statement or not supporting_quote:
|
||||||
|
return None
|
||||||
|
|
||||||
|
default_rule_type = "binding" if is_binding else "persuasive"
|
||||||
|
rule_type = (raw.get("rule_type") or default_rule_type).strip().lower()
|
||||||
|
if rule_type not in _VALID_RULE_TYPES:
|
||||||
|
rule_type = default_rule_type
|
||||||
|
# Guard: don't let a non-binding source produce 'binding' rule_type
|
||||||
|
if not is_binding and rule_type == "binding":
|
||||||
|
rule_type = "persuasive"
|
||||||
|
|
||||||
|
practice_areas_raw = raw.get("practice_areas") or []
|
||||||
|
if isinstance(practice_areas_raw, str):
|
||||||
|
practice_areas_raw = [practice_areas_raw]
|
||||||
|
practice_areas = [p for p in practice_areas_raw if p in _VALID_PRACTICE_AREAS]
|
||||||
|
|
||||||
|
subject_tags_raw = raw.get("subject_tags") or []
|
||||||
|
if isinstance(subject_tags_raw, str):
|
||||||
|
subject_tags_raw = [subject_tags_raw]
|
||||||
|
subject_tags = [str(t).strip() for t in subject_tags_raw if str(t).strip()]
|
||||||
|
|
||||||
|
cites_raw = raw.get("cites") or []
|
||||||
|
if isinstance(cites_raw, str):
|
||||||
|
cites_raw = [cites_raw]
|
||||||
|
cites = [str(c).strip() for c in cites_raw if str(c).strip()]
|
||||||
|
|
||||||
|
try:
|
||||||
|
confidence = float(raw.get("confidence", 0.0))
|
||||||
|
except (TypeError, ValueError):
|
||||||
|
confidence = 0.0
|
||||||
|
confidence = max(0.0, min(1.0, confidence))
|
||||||
|
|
||||||
|
return {
|
||||||
|
"rule_statement": rule_statement,
|
||||||
|
"rule_type": rule_type,
|
||||||
|
"reasoning_summary": (raw.get("reasoning_summary") or "").strip(),
|
||||||
|
"supporting_quote": supporting_quote,
|
||||||
|
"page_reference": (raw.get("page_reference") or "").strip(),
|
||||||
|
"practice_areas": practice_areas,
|
||||||
|
"subject_tags": subject_tags,
|
||||||
|
"cites": cites,
|
||||||
|
"confidence": confidence,
|
||||||
|
}
|
||||||
|
|
||||||
|
|
||||||
|
async def _extract_chunk(
|
||||||
|
chunk_text: str,
|
||||||
|
section_type: str,
|
||||||
|
chunk_index: int,
|
||||||
|
chunk_total: int,
|
||||||
|
context: str,
|
||||||
|
is_binding: bool,
|
||||||
|
) -> tuple[list[dict], bool]:
|
||||||
|
"""Run the halacha extractor on one chunk with retry.
|
||||||
|
|
||||||
|
Returns ``(halachot, succeeded)`` so the caller can distinguish "Claude
|
||||||
|
said there are no halachot here" (`(_, True)`) from "every attempt
|
||||||
|
crashed/timed out" (`(_, False)`). Without this distinction a precedent
|
||||||
|
that hit a rate-limit storm looks identical to one that genuinely has no
|
||||||
|
halachot — and gets silently marked `no_halachot`.
|
||||||
|
|
||||||
|
The prompt branches on ``is_binding`` so non-binding sources (other
|
||||||
|
appeals committees, district courts) yield application/persuasive
|
||||||
|
entries rather than a forced 0-result strict halacha pass.
|
||||||
|
"""
|
||||||
|
base_prompt = (
|
||||||
|
HALACHA_EXTRACTION_PROMPT_BINDING if is_binding
|
||||||
|
else HALACHA_EXTRACTION_PROMPT_PERSUASIVE
|
||||||
|
)
|
||||||
|
chunk_label = f" (חלק {chunk_index + 1}/{chunk_total})" if chunk_total > 1 else ""
|
||||||
|
# Pass the static instruction prompt as `system` so the SDK path can cache
|
||||||
|
# it (5-min ephemeral). Only the per-chunk content varies via `prompt`.
|
||||||
|
user_msg = (
|
||||||
|
f"## הקלט\n"
|
||||||
|
f"סוג קטע: {section_type}\n"
|
||||||
|
f"{context}{chunk_label}\n\n"
|
||||||
|
f"--- תחילת הטקסט ---\n{chunk_text}\n--- סוף הטקסט ---"
|
||||||
|
)
|
||||||
|
last_err: Exception | None = None
|
||||||
|
for attempt in range(CHUNK_RETRY_ATTEMPTS + 1):
|
||||||
|
try:
|
||||||
|
result = await claude_session.query_json(user_msg, system=base_prompt)
|
||||||
|
except Exception as e:
|
||||||
|
last_err = e
|
||||||
|
logger.warning(
|
||||||
|
"halacha_extractor chunk %d/%d attempt %d raised: %s",
|
||||||
|
chunk_index + 1, chunk_total, attempt + 1, e,
|
||||||
|
)
|
||||||
|
continue
|
||||||
|
if isinstance(result, list):
|
||||||
|
return result, True
|
||||||
|
logger.warning(
|
||||||
|
"halacha_extractor chunk %d/%d attempt %d returned non-list (%s)",
|
||||||
|
chunk_index + 1, chunk_total, attempt + 1, type(result).__name__,
|
||||||
|
)
|
||||||
|
logger.error(
|
||||||
|
"halacha_extractor chunk %d/%d failed after %d attempts: %s",
|
||||||
|
chunk_index + 1, chunk_total, CHUNK_RETRY_ATTEMPTS + 1, last_err,
|
||||||
|
)
|
||||||
|
return [], False
|
||||||
|
|
||||||
|
|
||||||
|
async def extract(case_law_id: UUID | str) -> dict:
|
||||||
|
"""Extract halachot from an uploaded precedent and store them.
|
||||||
|
|
||||||
|
Idempotent: replaces any existing halachot for this case_law_id.
|
||||||
|
All inserted rows start as ``review_status='pending_review'``.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
``{"status": "...", "extracted": N, "verified": M, "stored": K, ...}``
|
||||||
|
"""
|
||||||
|
if isinstance(case_law_id, str):
|
||||||
|
case_law_id = UUID(case_law_id)
|
||||||
|
|
||||||
|
record = await db.get_case_law(case_law_id)
|
||||||
|
if not record:
|
||||||
|
return {"status": "not_found", "extracted": 0, "stored": 0}
|
||||||
|
|
||||||
|
is_binding = bool(record.get("is_binding"))
|
||||||
|
|
||||||
|
# Try the targeted sections first (legal_analysis / ruling / conclusion).
|
||||||
|
# If the chunker labeled everything as 'other' (common when a ruling
|
||||||
|
# uses non-standard headings or the section markers aren't bracketed
|
||||||
|
# cleanly), fall back to ALL chunks — better to over-include than to
|
||||||
|
# silently skip a ruling that has reasoning under an unexpected label.
|
||||||
|
chunks = await db.list_precedent_chunks(
|
||||||
|
case_law_id, section_types=EXTRACTABLE_SECTIONS,
|
||||||
|
)
|
||||||
|
if not chunks:
|
||||||
|
chunks = await db.list_precedent_chunks(case_law_id)
|
||||||
|
if chunks:
|
||||||
|
logger.info(
|
||||||
|
"halacha_extractor: case_law=%s — no targeted sections, "
|
||||||
|
"falling back to all %d chunks",
|
||||||
|
case_law_id, len(chunks),
|
||||||
|
)
|
||||||
|
if not chunks:
|
||||||
|
await db.set_case_law_halacha_status(case_law_id, "completed")
|
||||||
|
return {"status": "no_chunks", "extracted": 0, "stored": 0}
|
||||||
|
|
||||||
|
await db.set_case_law_halacha_status(case_law_id, "processing")
|
||||||
|
await db.delete_halachot(case_law_id)
|
||||||
|
|
||||||
|
citation = record.get("case_number", "")
|
||||||
|
court = record.get("court", "")
|
||||||
|
date_str = str(record.get("date") or "")
|
||||||
|
context = f"מקור: {citation} — {court}, {date_str}"
|
||||||
|
|
||||||
|
sem = asyncio.Semaphore(CHUNK_CONCURRENCY)
|
||||||
|
|
||||||
|
async def _bounded(idx: int, chunk_row: dict) -> tuple[list[dict], bool]:
|
||||||
|
async with sem:
|
||||||
|
return await _extract_chunk(
|
||||||
|
chunk_row["content"], chunk_row["section_type"],
|
||||||
|
idx, len(chunks), context, is_binding,
|
||||||
|
)
|
||||||
|
|
||||||
|
chunk_results = await asyncio.gather(
|
||||||
|
*[_bounded(i, c) for i, c in enumerate(chunks)]
|
||||||
|
)
|
||||||
|
raw_halachot: list[dict] = []
|
||||||
|
failed_chunks = 0
|
||||||
|
for items, ok in chunk_results:
|
||||||
|
raw_halachot.extend(items)
|
||||||
|
if not ok:
|
||||||
|
failed_chunks += 1
|
||||||
|
|
||||||
|
# If most chunks failed (rate limit storm, claude_session crash, etc.)
|
||||||
|
# do NOT touch the DB status — leave it 'processing' so the caller can
|
||||||
|
# retry without the request falling out of the queue. The caller
|
||||||
|
# (`process_pending_extractions`) is responsible for either retrying or
|
||||||
|
# finalising the status as 'failed' after retries are exhausted. This
|
||||||
|
# is the bug that produced 317/10's silent `no_halachot` after a
|
||||||
|
# 129-chunk neighbour saturated the API.
|
||||||
|
failure_rate = failed_chunks / len(chunks) if chunks else 0
|
||||||
|
if failure_rate >= EXTRACTION_FAILURE_THRESHOLD and not raw_halachot:
|
||||||
|
logger.error(
|
||||||
|
"halacha_extractor: case_law=%s extraction_failed — "
|
||||||
|
"%d/%d chunks failed (rate=%.0f%%), no halachot retrieved. "
|
||||||
|
"DB status left as 'processing' for caller-level retry.",
|
||||||
|
case_law_id, failed_chunks, len(chunks), failure_rate * 100,
|
||||||
|
)
|
||||||
|
return {
|
||||||
|
"status": "extraction_failed",
|
||||||
|
"extracted": 0,
|
||||||
|
"stored": 0,
|
||||||
|
"failed_chunks": failed_chunks,
|
||||||
|
"total_chunks": len(chunks),
|
||||||
|
}
|
||||||
|
|
||||||
|
if not raw_halachot:
|
||||||
|
await db.set_case_law_halacha_status(case_law_id, "completed")
|
||||||
|
return {
|
||||||
|
"status": "no_halachot",
|
||||||
|
"extracted": 0,
|
||||||
|
"stored": 0,
|
||||||
|
"failed_chunks": failed_chunks,
|
||||||
|
"total_chunks": len(chunks),
|
||||||
|
}
|
||||||
|
|
||||||
|
# Validate against the full text of the precedent for the quote check.
|
||||||
|
full_text = record.get("full_text") or ""
|
||||||
|
|
||||||
|
cleaned: list[dict] = []
|
||||||
|
for raw in raw_halachot:
|
||||||
|
coerced = _coerce_halacha(raw, is_binding=is_binding)
|
||||||
|
if coerced is None:
|
||||||
|
continue
|
||||||
|
coerced["quote_verified"] = _verify_quote(
|
||||||
|
coerced["supporting_quote"], full_text,
|
||||||
|
)
|
||||||
|
cleaned.append(coerced)
|
||||||
|
|
||||||
|
if not cleaned:
|
||||||
|
await db.set_case_law_halacha_status(case_law_id, "completed")
|
||||||
|
return {"status": "no_valid_halachot", "extracted": len(raw_halachot), "stored": 0}
|
||||||
|
|
||||||
|
# Embed rule_statement + reasoning_summary so semantic search hits the
|
||||||
|
# rule directly rather than the surrounding chunk centroid.
|
||||||
|
embed_inputs = [
|
||||||
|
f"{h['rule_statement']} — {h['reasoning_summary']}".strip(" —")
|
||||||
|
for h in cleaned
|
||||||
|
]
|
||||||
|
try:
|
||||||
|
vectors = await embeddings.embed_texts(embed_inputs, input_type="document")
|
||||||
|
except Exception as e:
|
||||||
|
logger.error("halacha_extractor: embeddings failed: %s", e)
|
||||||
|
vectors = [None] * len(cleaned)
|
||||||
|
|
||||||
|
for halacha, vec in zip(cleaned, vectors):
|
||||||
|
halacha["embedding"] = vec
|
||||||
|
|
||||||
|
stored = await db.store_halachot(case_law_id, cleaned)
|
||||||
|
|
||||||
|
verified = sum(1 for h in cleaned if h["quote_verified"])
|
||||||
|
await db.set_case_law_halacha_status(case_law_id, "completed")
|
||||||
|
|
||||||
|
logger.info(
|
||||||
|
"halacha_extractor: case_law=%s extracted=%d cleaned=%d verified=%d stored=%d",
|
||||||
|
case_law_id, len(raw_halachot), len(cleaned), verified, stored,
|
||||||
|
)
|
||||||
|
return {
|
||||||
|
"status": "completed",
|
||||||
|
"extracted": len(raw_halachot),
|
||||||
|
"valid": len(cleaned),
|
||||||
|
"verified": verified,
|
||||||
|
"stored": stored,
|
||||||
|
}
|
||||||
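A note on the quote check in `halacha_extractor.py`: `_verify_quote` normalizes both strings, tries an exact substring match, and for longer quotes retries with the inner ~90% of the quote. The sketch below reproduces that flow in a self-contained form. It is an approximation under stated assumptions: the real quote normalization lives in `proofreader._fix_hebrew_quotes`, which is not shown in this diff, so the translation table here is a stand-in, and the names `normalize` and `verify_quote` are hypothetical.

```python
import re

# Assumed stand-in for proofreader._fix_hebrew_quotes: map common curly and
# Hebrew quote marks to plain ASCII quotes. The real table may differ.
_QUOTE_TABLE = str.maketrans({
    "\u201c": '"', "\u201d": '"', "\u201e": '"', "\u05f4": '"',
    "\u2018": "'", "\u2019": "'", "\u05f3": "'",
})


def normalize(text: str) -> str:
    """Unify quote marks and collapse all whitespace runs to one space."""
    return re.sub(r"\s+", " ", text.translate(_QUOTE_TABLE)).strip()


def verify_quote(quote: str, source: str, min_len: int = 30) -> bool:
    """Return True if `quote` appears in `source` after normalization.

    Fallback: for quotes of at least `min_len` characters, trim 5% (at
    least 2 characters) from each end and accept a match on the rest.
    """
    q, s = normalize(quote), normalize(source)
    if not q:
        return False
    if q in s:
        return True
    if len(q) >= min_len:
        trim = max(2, len(q) // 20)
        inner = q[trim:-trim]
        return bool(inner) and inner in s
    return False
```

The length floor matters: without it, a short quote trimmed on both sides could shrink to a common word and match almost any source text.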
215
mcp-server/src/legal_mcp/services/hybrid_search.py
Normal file
@@ -0,0 +1,215 @@
"""Hybrid (text + image) search wrappers.

Layered on top of ``rerank.maybe_rerank``. When ``MULTIMODAL_ENABLED`` is
true the result comes from a weighted merge of:

  • text side:  cosine on chunks → optional rerank-2 cross-encoder
  • image side: cosine on per-page voyage-multimodal-3 embeddings

rerank-2 is a *text* cross-encoder, so image-side rows are NOT passed
through it; they keep their cosine score and merge alongside the
(possibly reranked) text rows. Image-only pages with no overlapping
text chunk are surfaced as ``match_type='image'`` so scanned-only or
visual-heavy content still appears in results.

When ``MULTIMODAL_ENABLED`` is false this module degenerates to plain
``rerank.maybe_rerank``; callers can wrap unconditionally and let env
control behaviour.
"""
from __future__ import annotations

import logging
from typing import Any
from uuid import UUID

from legal_mcp import config
from legal_mcp.services import db, embeddings, rerank

logger = logging.getLogger(__name__)

async def search_documents_hybrid(
    query: str,
    query_text_embedding: list[float],
    *,
    limit: int,
    case_id: UUID | None = None,
    section_type: str | None = None,
    practice_area: str | None = None,
    appeal_subtype: str | None = None,
) -> list[dict]:
    """Hybrid wrapper for document-chunk search (search_decisions /
    search_case_documents / find_similar_cases)."""
    fetch_k = max(limit, config.VOYAGE_RERANK_FETCH_K) if config.MULTIMODAL_ENABLED else limit
    text_results = await rerank.maybe_rerank(
        query=query,
        base_search=lambda **kw: db.search_similar(
            query_embedding=query_text_embedding, **kw,
        ),
        limit=fetch_k,
        case_id=case_id,
        section_type=section_type,
        practice_area=practice_area,
        appeal_subtype=appeal_subtype,
    )
    if not config.MULTIMODAL_ENABLED:
        return text_results[:limit]

    try:
        query_img_emb = await embeddings.embed_query_for_multimodal(query)
        img_rows = await db.search_document_images_similar(
            query_img_emb,
            limit=fetch_k,
            case_id=case_id,
            practice_area=practice_area,
            appeal_subtype=appeal_subtype,
        )
    except Exception as e:
        logger.warning("Hybrid: image side failed, returning text only: %s", e)
        return text_results[:limit]

    merged = _merge(
        text_results, img_rows,
        id_field="document_id",
        text_weight=config.MULTIMODAL_TEXT_WEIGHT,
    )
    return merged[:limit]

async def search_precedent_library_hybrid(
    query: str,
    query_text_embedding: list[float],
    *,
    limit: int,
    practice_area: str = "",
    court: str = "",
    precedent_level: str = "",
    appeal_subtype: str = "",
    is_binding: bool | None = None,
    subject_tag: str = "",
    include_halachot: bool = True,
) -> list[dict]:
    """Hybrid wrapper for precedent-library search."""
    fetch_k = max(limit, config.VOYAGE_RERANK_FETCH_K) if config.MULTIMODAL_ENABLED else limit

    async def _base(limit: int) -> list[dict]:
        return await db.search_precedent_library_semantic(
            query_embedding=query_text_embedding,
            practice_area=practice_area,
            court=court,
            precedent_level=precedent_level,
            appeal_subtype=appeal_subtype,
            is_binding=is_binding,
            subject_tag=subject_tag,
            limit=limit,
            include_halachot=include_halachot,
        )

    text_results = await rerank.maybe_rerank(
        query=query, base_search=_base, limit=fetch_k,
    )
    if not config.MULTIMODAL_ENABLED:
        return text_results[:limit]

    try:
        query_img_emb = await embeddings.embed_query_for_multimodal(query)
        img_rows = await db.search_precedent_images_similar(
            query_img_emb,
            limit=fetch_k,
            practice_area=practice_area,
            court=court,
            precedent_level=precedent_level,
            appeal_subtype=appeal_subtype,
            is_binding=is_binding,
        )
    except Exception as e:
        logger.warning("Hybrid: image side failed, returning text only: %s", e)
        return text_results[:limit]

    merged = _merge(
        text_results, img_rows,
        id_field="case_law_id",
        text_weight=config.MULTIMODAL_TEXT_WEIGHT,
    )
    return merged[:limit]

def _merge(
    text_rows: list[dict],
    img_rows: list[dict],
    id_field: str,
    text_weight: float,
) -> list[dict]:
    """Reciprocal Rank Fusion of text + image rows.

    Why RRF: voyage-3 cosine scores (~0.4-0.5) and voyage-multimodal-3
    scores (~0.2-0.25) live on different scales; a direct weighted
    sum lets text always dominate. RRF combines by *rank* in each list,
    making the merge robust to score-scale differences.

    Per item::

        rrf_score = text_weight  / (k + text_rank)
                  + image_weight / (k + image_rank)

    A row that appears in only one list contributes that list's term
    only. Rows joined at ``(id_field, page_number)`` get both terms,
    surfaced as ``match_type='text+image'`` with the thumbnail attached.

    Halachot in precedent rows have no page_number; they remain
    text-only under RRF (the case-level image boost is dropped: RRF
    works on rank, not raw scores).
    """
    from legal_mcp import config as _cfg
    img_weight = 1.0 - text_weight
    k = _cfg.MULTIMODAL_RRF_K

    # Index image rows by their join key for boost detection.
    img_rank_by_key: dict[tuple, int] = {}
    img_row_by_key: dict[tuple, dict] = {}
    for rank, r in enumerate(img_rows, 1):
        key = (str(r[id_field]), r.get("page_number"))
        img_rank_by_key[key] = rank
        img_row_by_key[key] = r

    seen_image_keys: set = set()
    merged: list[dict] = []
    for rank, r in enumerate(text_rows, 1):
        rid = str(r[id_field])
        page = r.get("page_number")
        key = (rid, page) if page is not None else None
        img_rank = img_rank_by_key.get(key) if key else None
        text_term = text_weight / (k + rank)
        image_term = img_weight / (k + img_rank) if img_rank else 0.0
        d = dict(r)
        d["text_score"] = float(r.get("score", 0.0))
        d["text_rank"] = rank
        if img_rank:
            img_hit = img_row_by_key[key]
            d["image_score"] = float(img_hit.get("score", 0.0))
            d["image_rank"] = img_rank
            d["image_thumbnail_path"] = img_hit.get("image_thumbnail_path")
            d["match_type"] = "text+image"
            seen_image_keys.add(key)
        else:
            d["image_score"] = 0.0
            d["match_type"] = "text"
        d["score"] = text_term + image_term
        merged.append(d)

    for rank, r in enumerate(img_rows, 1):
        key = (str(r[id_field]), r.get("page_number"))
        if key in seen_image_keys:
            continue
        d = dict(r)
        d["text_score"] = 0.0
        d["image_score"] = float(r.get("score", 0.0))
        d["image_rank"] = rank
        d["score"] = img_weight / (k + rank)
        d["match_type"] = "image"
        d["content"] = ""
        d["section_type"] = "image"
        merged.append(d)

    merged.sort(key=lambda x: -float(x["score"]))
    return merged
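The weighted Reciprocal Rank Fusion described in the `_merge` docstring can be illustrated in isolation. This is a simplified sketch, not the module's implementation: it works on bare keys rather than row dicts, assumes each key appears at most once per list, and the defaults `text_weight=0.7` and `k=60` are assumed values (the real ones come from `config.MULTIMODAL_TEXT_WEIGHT` and `config.MULTIMODAL_RRF_K`, which are not shown in this diff). The name `rrf_merge` is hypothetical.

```python
from typing import Hashable


def rrf_merge(
    text_keys: list[Hashable],
    image_keys: list[Hashable],
    text_weight: float = 0.7,  # assumed value, not taken from the config
    k: int = 60,               # assumed value for MULTIMODAL_RRF_K
) -> list[tuple[Hashable, float]]:
    """Weighted Reciprocal Rank Fusion over two ranked key lists."""
    image_weight = 1.0 - text_weight
    scores: dict[Hashable, float] = {}
    # Ranks are 1-based, matching enumerate(rows, 1) in _merge.
    for rank, key in enumerate(text_keys, 1):
        scores[key] = scores.get(key, 0.0) + text_weight / (k + rank)
    for rank, key in enumerate(image_keys, 1):
        scores[key] = scores.get(key, 0.0) + image_weight / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda kv: -kv[1])


# "c" is third in the text list but first in the image list, so it
# collects both terms and moves ahead of the text-only keys.
ranked = rrf_merge(["a", "b", "c"], ["c", "d"])
```

Because only ranks enter the formula, the differing cosine scales of the two embedding models do not affect the fused order.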
@@ -90,10 +90,10 @@ async def analyze_changes(draft_text: str, final_text: str) -> dict:
 --- גרסה סופית ---
 {final_sample}
 """
-    result = claude_session.query_json(prompt, timeout=120)
+    result = await claude_session.query_json(prompt)
     if result is None:
         logger.warning("Failed to parse lessons response")
-        return {"changes": [], "new_expressions": [], "overall_assessment": raw[:200]}
+        return {"changes": [], "new_expressions": [], "overall_assessment": ""}
     return result
539
mcp-server/src/legal_mcp/services/precedent_library.py
Normal file
@@ -0,0 +1,539 @@
"""Orchestrator for the External Precedent Library.

Ingest pipeline (one upload):
    file → extract_text → proofread → INSERT case_law (source_kind='external_upload')
         → chunk → embed → store precedent_chunks
         → halacha_extractor.extract → embed halachot → store halachot
         → set extraction_status='completed'

Progress is reported via a caller-supplied async callback so the
web layer can pipe updates into the existing Redis ProgressStore /
SSE plumbing without this module knowing about Redis.
"""

from __future__ import annotations

import asyncio
import logging
import re
import shutil
from datetime import date
from pathlib import Path
from typing import Awaitable, Callable
from uuid import UUID, uuid4

from legal_mcp import config
from legal_mcp.services import chunker, db, embeddings, extractor, hybrid_search, rerank  # noqa: F401

# Note: halacha_extractor and precedent_metadata_extractor are NOT imported
# at module load. They are imported lazily inside the dedicated re-extract
# entry points so that `ingest_precedent` (called from the FastAPI container,
# where `claude` CLI is unavailable) cannot accidentally pull them in. See
# the architectural rule in services/claude_session.py.

logger = logging.getLogger(__name__)


ProgressCb = Callable[[str, int, str], Awaitable[None]]


PRECEDENT_LIBRARY_DIR = Path(config.DATA_DIR) / "precedent-library"


_VALID_PRACTICE_AREAS = {"", "rishuy_uvniya", "betterment_levy", "compensation_197"}
_VALID_SOURCE_TYPES = {"", "court_ruling", "appeals_committee"}
_VALID_PRECEDENT_LEVELS = {
    "", "עליון", "מנהלי", "ועדת_ערר_ארצית", "ועדת_ערר_מחוזית",
    "supreme", "administrative", "national_appeals_committee", "district_appeals_committee",
}

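The `ProgressCb` alias describes an async callable that receives a status string, a percentage, and a message. As a rough illustration of how a caller might supply one, the sketch below records updates in memory instead of piping them to Redis or SSE. `make_recorder` and `fake_ingest` are hypothetical names introduced here, and `fake_ingest` only stands in for a pipeline that reports progress.

```python
import asyncio
from typing import Awaitable, Callable

ProgressCb = Callable[[str, int, str], Awaitable[None]]


def make_recorder() -> tuple[ProgressCb, list[tuple[str, int, str]]]:
    """Return a callback plus the list it appends updates to."""
    updates: list[tuple[str, int, str]] = []

    async def record(status: str, percent: int, msg: str) -> None:
        updates.append((status, percent, msg))

    return record, updates


async def fake_ingest(progress: ProgressCb) -> None:
    # Stand-in for a pipeline; only the reporting pattern is shown.
    await progress("staging", 5, "copying file")
    await progress("completed", 100, "done")


callback, seen = make_recorder()
asyncio.run(fake_ingest(callback))
```

Keeping the callback as a plain async callable means the orchestrator needs no knowledge of where the updates end up.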
async def _noop_progress(_status: str, _percent: int, _msg: str) -> None:
    return None


def _safe_filename(name: str) -> str:
    """Strip path separators and unsafe chars from a user-provided name."""
    base = Path(name).name
    return re.sub(r"[^\w.\-+א-ת ]", "_", base) or f"upload-{uuid4().hex[:8]}"


def _stage_file(src_path: Path, source_type: str) -> Path:
    """Copy the uploaded file into data/precedent-library/<source_type>/.

    Returns the destination path. Source file is not deleted (caller decides).
    """
    sub = source_type if source_type in {"court_ruling", "appeals_committee"} else "other"
    dest_dir = PRECEDENT_LIBRARY_DIR / sub
    dest_dir.mkdir(parents=True, exist_ok=True)
    safe_name = _safe_filename(src_path.name)
    dest = dest_dir / f"{uuid4().hex[:8]}_{safe_name}"
    shutil.copy2(src_path, dest)
    return dest


def _coerce_date(value) -> date | None:
    if value is None or value == "":
        return None
    if isinstance(value, date):
        return value
    if isinstance(value, str):
        try:
            return date.fromisoformat(value[:10])
        except ValueError:
            return None
    return None

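To make the sanitising step in `_safe_filename` concrete, here is a small standalone sketch of the same idea. It reuses the character class from the source but swaps `Path` for `PurePosixPath`, an assumption made only so the behaviour is the same on every platform; the names `safe_filename` and `_UNSAFE` are hypothetical.

```python
import re
from pathlib import PurePosixPath
from uuid import uuid4

# Same character class as the source: word characters, dot, hyphen, plus,
# Hebrew letters, and space are kept; everything else becomes "_".
_UNSAFE = re.compile(r"[^\w.\-+א-ת ]")


def safe_filename(name: str) -> str:
    """Drop directory components, then replace unsafe characters."""
    base = PurePosixPath(name).name  # keeps only the final path component
    # An empty result falls back to a random "upload-xxxxxxxx" name.
    return _UNSAFE.sub("_", base) or f"upload-{uuid4().hex[:8]}"
```

Taking the final path component first is what neutralises inputs such as `../../etc/passwd`; the regex then only has to deal with characters inside a single file name.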
async def ingest_precedent(
    *,
    file_path: str | Path,
    citation: str,
    case_name: str = "",
    court: str = "",
    decision_date=None,
    source_type: str = "",
    precedent_level: str = "",
    practice_area: str = "",
    appeal_subtype: str = "",
    subject_tags: list[str] | None = None,
    is_binding: bool = True,
    headnote: str = "",
    summary: str = "",
    document_id: UUID | None = None,
    progress: ProgressCb | None = None,
) -> dict:
    """Ingest a single uploaded precedent through the full pipeline.

    Required: file_path + citation. Everything else has a sensible default.

    Returns:
        ``{"status": "...", "case_law_id": "...", "chunks": N, "halachot": M}``
    """
    progress = progress or _noop_progress
    src = Path(file_path)
    if not src.is_file():
        raise FileNotFoundError(f"file not found: {src}")
    if not citation.strip():
        raise ValueError("citation is required")
    if practice_area not in _VALID_PRACTICE_AREAS:
        raise ValueError(f"invalid practice_area: {practice_area!r}")
    if source_type not in _VALID_SOURCE_TYPES:
        raise ValueError(f"invalid source_type: {source_type!r}")

    await progress("staging", 5, "מעתיק את הקובץ לאחסון")

    staged = _stage_file(src, source_type)

    await progress("extracting", 15, "מחלץ טקסט מהקובץ")
    try:
        text, page_count, page_offsets = await extractor.extract_text(str(staged))
    except Exception as e:
        await progress("failed", 100, f"כשל בחילוץ טקסט: {e}")
        raise

    text = (text or "").strip()
    if not text:
        await progress("failed", 100, "לא נמצא טקסט בקובץ")
        raise ValueError("no extractable text in file")

    # Strip any Nevo preamble that might wrap court rulings downloaded from Nevo.
    text = extractor.strip_nevo_preamble(text)

    await progress("storing_metadata", 25, "שומר את הפסיקה במסד הנתונים")
    record = await db.create_external_case_law(
        case_number=citation.strip(),
        case_name=case_name.strip() or citation.strip(),
        full_text=text,
        court=court.strip(),
        decision_date=_coerce_date(decision_date),
        practice_area=practice_area,
        appeal_subtype=appeal_subtype.strip(),
        subject_tags=list(subject_tags or []),
        summary=summary.strip(),
        headnote=headnote.strip(),
        source_type=source_type,
        precedent_level=precedent_level,
        is_binding=is_binding,
        document_id=document_id,
    )
    case_law_id = UUID(str(record["id"]))

    try:
        await progress("chunking", 40, f"מחלק את הטקסט ל-chunks ({page_count} עמ')")
        chunks = chunker.chunk_document(text, page_offsets=page_offsets)
        if not chunks:
            await db.set_case_law_extraction_status(case_law_id, "completed")
            await db.set_case_law_halacha_status(case_law_id, "completed")
            await progress("completed", 100, "אין טקסט לעיבוד")
            return {
                "status": "completed",
                "case_law_id": str(case_law_id),
                "chunks": 0,
                "halachot": 0,
            }

        await progress("embedding", 55, f"מייצר embeddings ל-{len(chunks)} chunks")
        chunk_texts = [c.content for c in chunks]
        chunk_vectors = await embeddings.embed_texts(chunk_texts, input_type="document")

        chunk_dicts = [
            {
                "chunk_index": c.chunk_index,
                "content": c.content,
                "section_type": c.section_type,
                "page_number": c.page_number,
                "embedding": v,
            }
            for c, v in zip(chunks, chunk_vectors)
        ]
        stored_chunks = await db.store_precedent_chunks(case_law_id, chunk_dicts)

        # Multimodal page-image embeddings (V9). Gated by feature flag.
        # Non-fatal: text path already succeeded. Only PDFs.
        if config.MULTIMODAL_ENABLED and page_count > 0 and staged.suffix.lower() == ".pdf":
            try:
                await progress(
                    "embedding_images", 70,
                    f"מטמיע {page_count} עמודי תמונה (multimodal)",
                )
                await _embed_precedent_pages(case_law_id, staged, page_count)
            except Exception as e:
                logger.warning("Precedent multimodal embedding failed (non-fatal): %s", e)

        # Pipeline split: the container does the non-LLM half (extract +
        # chunk + embed + store). LLM-driven extraction (metadata, halachot)
        # runs separately via the MCP tool `precedent_process_pending` from
        # local Claude Code, where `claude` CLI is available.
        #
        # We auto-queue both extractions so the chair doesn't need to click
        # any button: the moment they (or I) run `precedent_process_pending`
        # in chat, both kinds get processed.
        await db.set_case_law_extraction_status(case_law_id, "completed")
        await db.set_case_law_halacha_status(case_law_id, "pending")
        await db.request_metadata_extraction(case_law_id)
        await db.request_halacha_extraction(case_law_id)

        await progress(
            "completed",
            100,
            f"הוכנס לספרייה: {stored_chunks} chunks. "
            f"חילוץ הלכות ומטא-דאטה ממתינים בתור — "
            f"להפעיל מ-Claude Code: precedent_process_pending.",
        )

        return {
            "status": "completed",
            "case_law_id": str(case_law_id),
            "chunks": stored_chunks,
            "halachot": 0,
            "halachot_pending": True,
            "metadata_filled": [],
            "pages": page_count,
        }

    except Exception as e:
        logger.exception("precedent_library.ingest_precedent failed: %s", e)
        await db.set_case_law_extraction_status(case_law_id, "failed")
        await progress("failed", 100, f"כשל בעיבוד: {e}")
        raise

async def reextract_halachot(
    case_law_id: UUID | str,
    progress: ProgressCb | None = None,
) -> dict:
    """Re-run the halacha extractor on an existing precedent. Idempotent.

    **MCP-tool-only path.** This function calls into ``halacha_extractor``,
    which calls ``claude_session``; the local CLI is required. Invoking
    this from the FastAPI container will raise ``Claude CLI not found``.
    See the architectural rule in ``services/claude_session.py``.
    """
    from legal_mcp.services import halacha_extractor

    progress = progress or _noop_progress
    if isinstance(case_law_id, str):
        case_law_id = UUID(case_law_id)

    record = await db.get_case_law(case_law_id)
    if not record or record.get("source_kind") != "external_upload":
        raise ValueError("precedent not found or not chair-uploaded")

    await progress("extracting_halachot", 50, "מחלץ הלכות מחדש")
    result = await halacha_extractor.extract(case_law_id)
    await progress(
        "completed",
        100,
        f"הופקו {result.get('stored', 0)} הלכות (ממתינות לאישור)",
    )
    return result

# Wait this many seconds between precedents in a multi-precedent run.
# Anthropic rate-limits across the org, so back-to-back extractions of large
# rulings (e.g. 129 chunks for one, then 79 for another) can spill the second
# precedent into a 429 storm. Observed 2026-05-03: 1110/20 succeeded with 9
# halachot, 317/10 immediately after returned silent no_halachot.
INTER_PRECEDENT_COOLDOWN_SEC = 30

# How many times to retry a precedent that came back as 'extraction_failed'
# (i.e. >50% chunks crashed). Each retry waits PRECEDENT_RETRY_COOLDOWN_SEC,
# which is longer than the inter-precedent cooldown above.
PRECEDENT_RETRY_ATTEMPTS = 1
PRECEDENT_RETRY_COOLDOWN_SEC = 60

async def process_pending_extractions(kind: str = "metadata", limit: int = 20) -> dict:
    """Drain the extraction queue (UI-button-stamped requests).

    The button in the web UI cannot run claude_session itself (it lives in
    the container, no CLI). It just stamps ``metadata_extraction_requested_at``
    on the row. This function, called from local Claude Code via the MCP
    tool, picks each stamped row up, runs the extractor, and clears the
    timestamp.

    Sequencing: precedents are processed serially (never in parallel) and
    each is followed by a short cooldown so the Anthropic rate-limit
    counter has time to drain before the next big precedent starts. If
    halacha extraction comes back as ``extraction_failed`` we retry the
    same precedent once with a longer cooldown, matching the empirical
    pattern where the second precedent in a back-to-back run gets
    rate-limited but recovers after a brief pause.

    Args:
        kind: 'metadata' or 'halacha'.
        limit: max rows to process this run.
    """
    from legal_mcp.services import halacha_extractor, precedent_metadata_extractor

    if kind not in {"metadata", "halacha"}:
        raise ValueError("kind must be 'metadata' or 'halacha'")

    pending = await db.list_pending_extraction_requests(kind=kind, limit=limit)
    if not pending:
        return {"status": "no_pending", "kind": kind, "processed": 0, "results": []}

    async def _run_once(cid: UUID) -> dict:
        if kind == "metadata":
            return await precedent_metadata_extractor.extract_and_apply(cid)
        return await halacha_extractor.extract(cid)

    results: list[dict] = []
    processed = 0
    for idx, row in enumerate(pending):
        if idx > 0:
            await asyncio.sleep(INTER_PRECEDENT_COOLDOWN_SEC)
        cid = UUID(str(row["id"]))
        attempts = 0
        result: dict = {}
        try:
            result = await _run_once(cid)
            # Retry only on systematic extraction failure (rate-limit storm).
            # Don't retry on 'no_halachot': that means Claude looked and
            # genuinely found nothing.
            while (
                result.get("status") == "extraction_failed"
                and attempts < PRECEDENT_RETRY_ATTEMPTS
            ):
                attempts += 1
                logger.warning(
                    "process_pending_extractions: %s returned extraction_failed "
                    "(%d/%d chunks crashed), retry %d/%d after %ds cooldown",
                    cid,
                    result.get("failed_chunks", 0),
                    result.get("total_chunks", 0),
                    attempts, PRECEDENT_RETRY_ATTEMPTS,
                    PRECEDENT_RETRY_COOLDOWN_SEC,
                )
                await asyncio.sleep(PRECEDENT_RETRY_COOLDOWN_SEC)
                result = await _run_once(cid)

            # Finalise: success or terminal failure both clear the request
            # so the queue moves on. (Use 'failed' DB state for terminal
            # extraction_failed so the UI shows the warning chip.)
            if kind == "halacha" and result.get("status") == "extraction_failed":
                await db.set_case_law_halacha_status(cid, "failed")
            await db.clear_extraction_request(cid, kind=kind)
            processed += 1
            results.append({
                "case_law_id": str(cid),
                "case_number": row.get("case_number", ""),
                "status": result.get("status", "unknown"),
                "fields": result.get("fields", []),
                "stored": result.get("stored", 0),
                "retry_attempts": attempts,
            })
        except Exception as e:
            logger.exception("process_pending_extractions failed for %s: %s", cid, e)
            results.append({
                "case_law_id": str(cid),
                "case_number": row.get("case_number", ""),
                "status": "failed",
                "error": str(e),
                "retry_attempts": attempts,
            })
            # Don't clear the request; it stays for the next run.

    return {
        "status": "completed",
        "kind": kind,
        "processed": processed,
        "total_pending": len(pending),
        "results": results,
    }

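The sequencing rule in `process_pending_extractions` (serial processing, a cooldown between items, and a bounded retry after `extraction_failed`) can be sketched independently of the database and the extractors. This is an illustrative reduction under stated assumptions: `drain`, `Runner`, and the injected `sleep` parameter are hypothetical names added here so the example runs instantly, and the demo runner `flaky` simply fails once for item `"b"`.

```python
import asyncio
from typing import Awaitable, Callable

Runner = Callable[[str], Awaitable[dict]]


async def drain(
    ids: list[str],
    run_once: Runner,
    cooldown: float = 30.0,
    retry_cooldown: float = 60.0,
    retry_attempts: int = 1,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> list[dict]:
    """Process ids serially, pausing between items and retrying failures."""
    results = []
    for idx, item_id in enumerate(ids):
        if idx > 0:
            await sleep(cooldown)  # pause before every item except the first
        attempts = 0
        result = await run_once(item_id)
        while result.get("status") == "extraction_failed" and attempts < retry_attempts:
            attempts += 1
            await sleep(retry_cooldown)  # longer pause before the retry
            result = await run_once(item_id)
        results.append(
            {"id": item_id, "status": result.get("status"), "retry_attempts": attempts}
        )
    return results


async def _demo() -> tuple[list[dict], list[float]]:
    calls: dict[str, int] = {}
    naps: list[float] = []

    async def flaky(item_id: str) -> dict:
        calls[item_id] = calls.get(item_id, 0) + 1
        failed = item_id == "b" and calls[item_id] == 1
        return {"status": "extraction_failed" if failed else "completed"}

    async def fake_sleep(seconds: float) -> None:
        naps.append(seconds)  # record the pause instead of waiting

    return await drain(["a", "b"], flaky, sleep=fake_sleep), naps


demo_results, demo_naps = asyncio.run(_demo())
```

Injecting the sleep function is a choice made for this sketch only; the source calls `asyncio.sleep` directly.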
async def reextract_metadata(
    case_law_id: UUID | str,
    progress: ProgressCb | None = None,
) -> dict:
    """Re-run metadata extraction on an existing precedent.

    Only fills empty fields (subject_tags, summary, headnote, key_quote,
    appeal_subtype, and case_name when it equals the citation). User
    values are preserved.

    **MCP-tool-only path**: same constraint as :func:`reextract_halachot`.
    """
    from legal_mcp.services import precedent_metadata_extractor

    progress = progress or _noop_progress
    if isinstance(case_law_id, str):
        case_law_id = UUID(case_law_id)

    record = await db.get_case_law(case_law_id)
    if not record or record.get("source_kind") != "external_upload":
        raise ValueError("precedent not found or not chair-uploaded")

    await progress("extracting_metadata", 40, "מחלץ מטא-דאטה (תקציר, תגיות)")
    result = await precedent_metadata_extractor.extract_and_apply(case_law_id)
    fields = result.get("fields") or []
    msg = (
        f"מולאו {len(fields)} שדות: {', '.join(fields)}"
        if fields
        else "לא נמצא מה למלא (כל השדות מאוכלסים או לא ניתן לחלץ)"
    )
    await progress("completed", 100, msg)
    return result

async def delete_precedent(case_law_id: UUID | str) -> bool:
    """Delete a precedent and cascade chunks + halachot."""
    if isinstance(case_law_id, str):
        case_law_id = UUID(case_law_id)
    return await db.delete_case_law(case_law_id)


async def get_precedent(case_law_id: UUID | str) -> dict | None:
    """Get a precedent with its halachot attached."""
    if isinstance(case_law_id, str):
        case_law_id = UUID(case_law_id)
    record = await db.get_case_law(case_law_id)
    if not record:
        return None
    record["halachot"] = await db.list_halachot(case_law_id=case_law_id, limit=500)
    return record


async def list_precedents(
    practice_area: str = "",
    court: str = "",
    precedent_level: str = "",
    source_type: str = "",
    search: str = "",
    limit: int = 100,
    offset: int = 0,
) -> list[dict]:
    return await db.list_external_case_law(
        practice_area=practice_area,
        court=court,
        precedent_level=precedent_level,
        source_type=source_type,
        search=search,
        limit=limit,
        offset=offset,
    )

async def search_library(
    query: str,
    practice_area: str = "",
    court: str = "",
    precedent_level: str = "",
    appeal_subtype: str = "",
    is_binding: bool | None = None,
    subject_tag: str = "",
    limit: int = 10,
    include_halachot: bool = True,
) -> list[dict]:
    """Semantic search merging halachot (rule-level) and chunks (passage-level).

    Only ``approved`` / ``published`` halachot are returned, per chair-review
    policy. Chunks are returned regardless of halacha review status.

    When ``VOYAGE_RERANK_ENABLED`` is set, results are passed through
    voyage rerank-2 (cross-encoder). The +0.05 halacha boost from
    ``search_precedent_library_semantic`` is preserved before rerank
    but the rerank scores ultimately decide the order.
    """
    if not query.strip():
        return []
    query_vec = await embeddings.embed_query(query)

    return await hybrid_search.search_precedent_library_hybrid(
        query=query,
        query_text_embedding=query_vec,
        limit=limit,
        practice_area=practice_area,
        court=court,
        precedent_level=precedent_level,
        appeal_subtype=appeal_subtype,
        is_binding=is_binding,
        subject_tag=subject_tag,
        include_halachot=include_halachot,
    )

async def _embed_precedent_pages(
    case_law_id: UUID,
    pdf_path: Path,
    page_count: int,
) -> dict:
    """Render precedent PDF pages → embed via voyage-multimodal → store.

    Thumbnails go to
    ``data/precedent-library/thumbnails/{case_law_id}/p{N:03d}.jpg``.
    """
    thumb_dir = PRECEDENT_LIBRARY_DIR / "thumbnails" / str(case_law_id)
    rendered = await asyncio.to_thread(
        extractor.render_pages_for_multimodal,
        pdf_path,
        config.MULTIMODAL_DPI,
        config.MULTIMODAL_THUMB_DPI,
        thumb_dir,
    )
    images = [pil for pil, _ in rendered]
    thumbs = [t for _, t in rendered]
    img_embs = await embeddings.embed_images(images)

    page_records = []
    for i, (emb, thumb) in enumerate(zip(img_embs, thumbs)):
        rel_thumb = None
        if thumb is not None:
            try:
                rel_thumb = str(thumb.relative_to(config.DATA_DIR))
            except ValueError:
                rel_thumb = str(thumb)
        page_records.append({
            "page_number": i + 1,
            "embedding": emb,
            "image_thumbnail_path": rel_thumb,
        })
    stored = await db.store_precedent_image_embeddings(
        case_law_id, page_records, model_name=config.MULTIMODAL_MODEL,
    )
    logger.info(
        "Multimodal: stored %d page-image embeddings for case_law %s",
        stored, case_law_id,
    )
    return {"pages_embedded": stored}
@@ -0,0 +1,270 @@
|
|||||||
|
"""Auto-extract precedent metadata from a freshly-uploaded ruling.
|
||||||
|
|
||||||
|
Runs after chunking. Reads the precedent's full_text and asks Claude to
|
||||||
|
fill in the metadata fields that an upload form usually leaves empty:
|
||||||
|
short case_name, summary, headnote, key_quote, subject_tags,
|
||||||
|
appeal_subtype, decision_date, precedent_level, court.
|
||||||
|
|
||||||
|
Caller policy: only empty user-supplied fields are filled. Anything the
|
||||||
|
chair already typed in the upload form is preserved. This is enforced
|
||||||
|
in ``apply_to_record``.
|
||||||
|
"""
|
||||||
|
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import logging
|
||||||
|
from datetime import date as date_type
|
||||||
|
from uuid import UUID
|
||||||
|
|
||||||
|
from legal_mcp.config import parse_llm_json
|
||||||
|
from legal_mcp.services import claude_session, db
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
|
||||||
|
# The prompt is short — we only need the first 12K chars of the ruling
|
||||||
|
# (header + opening of discussion is enough for naming + summary). For
|
||||||
|
# subject tags and the outcome we also sample the last 6K chars (the tail).
|
||||||
|
_HEAD_CHARS = 12_000
|
||||||
|
_TAIL_CHARS = 6_000
|
||||||
|
|
||||||
|
|
||||||
|
# Note: this template is passed verbatim as the system prompt rather
|
||||||
|
# than using .format(), because the JSON example below contains '{' / '}'
|
||||||
|
# which str.format would interpret as placeholders and crash with
|
||||||
|
# KeyError on the field names.
|
||||||
|
METADATA_EXTRACTION_PROMPT = """אתה מסייע משפטי בכיר. קרא את פסק הדין/ההחלטה הבא וחלץ ממנו מטא-דאטה לקטלוג הקורפוס.
|
||||||
|
|
||||||
|
המטרה: למלא שדות בטופס העלאה שהמשתמש הזין באופן חלקי. **אל תמציא** — אם המידע לא מופיע בטקסט, השאר ריק (מחרוזת ריקה / מערך ריק).
|
||||||
|
|
||||||
|
## פלט נדרש
|
||||||
|
החזר JSON אחד (object — לא array) בפורמט הבא, ללא markdown וללא הסברים:
|
||||||
|
|
||||||
|
{
|
||||||
|
"case_name_short": "שם קצר ל-3-6 מילים (למשל 'אהרון ברק' או 'ב. קרן-נכסים'). אל תכלול מספר תיק. שם המבקש/העורר העיקרי. אם זו החלטה מאוחדת — שם הצד המוביל.",
|
||||||
|
"appeal_subtype": "תת-סוג ספציפי בתוך תחום המשפט (למשל 'תכנית רחביה', 'מימוש במכר', 'תמ\\"א 38', 'שימוש חורג', 'סופיות ההחלטה'). מילה אחת או צירוף קצר.",
|
||||||
|
"summary": "תקציר עניני 2-3 משפטים: מה הייתה השאלה, מה הוכרע. בלי שיפוט.",
|
||||||
|
"headnote": "headnote בסגנון נבו: 1-2 משפטים שמסכמים את העיקרון שנקבע/יושם בפסק. למשל 'תכנית רחביה — היטל השבחה במימוש במכר — אין לחייב כשהזכויות צפות'.",
|
||||||
|
"key_quote": "ציטוט מילולי בודד, 30-100 מילים, שמייצג את לב הפסק. חייב להופיע מילה במילה בטקסט. אם אין ציטוט מתאים — מחרוזת ריקה.",
|
||||||
|
"subject_tags": ["תגיות", "נושא", "בעברית"],
|
||||||
|
"decision_date_iso": "YYYY-MM-DD — תאריך מתן ההחלטה כפי שמופיע בטקסט (בכותרת או בחתימה הסופית). אם לא ניתן לזהות במדויק — מחרוזת ריקה.",
|
||||||
|
"precedent_level": "אחד מ-4: 'עליון' / 'מנהלי' / 'ועדת_ערר_ארצית' / 'ועדת_ערר_מחוזית'. בחר לפי הערכאה שמסומנת בכותרת הפסק. אם לא ברור — מחרוזת ריקה.",
|
||||||
|
"source_type": "אחד מ-2: 'court_ruling' (פסק דין של בית משפט — עליון/מנהלי) / 'appeals_committee' (החלטה של ועדת ערר). אם לא ברור — מחרוזת ריקה.",
|
||||||
|
"court": "שם הערכאה כפי שהוא מופיע בכותרת (למשל 'בית המשפט העליון', 'בית המשפט המחוזי בירושלים בשבתו כבית משפט לעניינים מנהליים', 'ועדת הערר לתכנון ובניה פיצויים והיטלי השבחה — מחוז ירושלים'). מחרוזת ריקה אם לא ניתן לזהות."
|
||||||
|
}
|
||||||
|
|
||||||
|
## כללי איכות
|
||||||
|
1. **case_name_short** — שם בולט וקצר. בלי 'נ\\'' / 'נגד' / מספרי תיק.
|
||||||
|
2. **appeal_subtype** — אופציונלי. אם הסוגיה רחבה ולא מסווגת — השאר ריק.
|
||||||
|
3. **summary** — תיאור ניטרלי, גוף שלישי.
|
||||||
|
4. **headnote** — לא מצטטים, מסכמים. סגנון נבו: ביטוי קצר אחד.
|
||||||
|
5. **key_quote** — חייב להיות הדבקה מילולית מהקלט. אם אין ציטוט בולט — השאר ריק.
|
||||||
|
6. **subject_tags** — 3-7 תגיות בעברית, snake_case (חניה, קווי_בניין, שיקול_דעת, פגם_פרוצדורלי, סמכות, מועדים, פגיעה_במקרקעין, ירידת_ערך, תכנית_רחביה, מימוש_במכר, וכד'). שייך לתחום של ועדת ערר תכנון ובניה.
|
||||||
|
7. **decision_date_iso** — תאריך מדויק בלבד. אם בטקסט יש "ניתנה היום, ט' באלול תשפ"ב, 5 בספטמבר 2022" — הפלט: "2022-09-05".
|
||||||
|
8. **precedent_level** — קבע לפי הערכאה: בית המשפט העליון = "עליון"; בית משפט מחוזי בשבתו כבית משפט לעניינים מנהליים = "מנהלי"; ועדת ערר ארצית = "ועדת_ערר_ארצית"; ועדת ערר מחוזית (כמו ועדות תכנון ובניה ירושלים/מחוז המרכז וכד') = "ועדת_ערר_מחוזית". השתמש ב-underscore כפי שמופיע — לא ברווח.
|
||||||
|
9. **source_type** — שני ערכים בלבד: "court_ruling" כשהמסמך הוא פסק דין/החלטה של בית משפט (עליון/בג"ץ/מנהלי/מחוזי); "appeals_committee" כשהמסמך הוא החלטה של ועדת ערר (ארצית או מחוזית). זה משלים את `precedent_level` — שני השדות צריכים להיות תואמים.
|
||||||
|
10. **court** — מהכותרת הראשית של הפסק. ניסוח מלא (לא קיצור). מחרוזת ריקה אם לא ניתן לזהות.
|
||||||
|
"""
|
||||||
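The comment above the prompt explains that `.format()` is avoided because of the literal braces in the JSON example. A small sketch of that failure mode, using a hypothetical one-line `TEMPLATE` rather than the real prompt:

```python
# Hypothetical miniature of a prompt that embeds a JSON example.
TEMPLATE = 'Return JSON like {"summary": "..."} for the text below.'


def render_with_format(text: str) -> str:
    # str.format treats {"summary": "..."} as a replacement field.
    return (TEMPLATE + "\n{text}").format(text=text)


def render_by_concatenation(text: str) -> str:
    # An f-string only interpolates its own braces; TEMPLATE is inserted as-is.
    return f"{TEMPLATE}\n{text}"


try:
    render_with_format("ruling text")
    format_error = None
except KeyError as exc:
    format_error = exc

safe = render_by_concatenation("ruling text")
```

Doubling every brace (`{{` / `}}`) would also work with `.format()`, but it makes a long JSON example harder to read and maintain.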
|
|
||||||
|
|
||||||
|
def _build_text_window(full_text: str) -> str:
    """Return the head + tail of the ruling, with a marker if truncated.

    Most rulings have the parties/subject in the head and the conclusion
    in the tail; the middle is the discussion which is captured via the
    halacha extractor independently. Sending head+tail keeps the prompt
    cheap while preserving naming and conclusion context.
    """
    if len(full_text) <= _HEAD_CHARS + _TAIL_CHARS:
        return full_text
    return (
        full_text[:_HEAD_CHARS]
        + "\n\n[... חלק האמצע הושמט עקב אורך — ראה את החלק האחרון של הפסק להלן ...]\n\n"
        + full_text[-_TAIL_CHARS:]
    )
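The head-plus-tail truncation can be tried on toy input. The sketch below assumes a hypothetical `head_tail_window` with explicit sizes and a short ASCII marker in place of the module constants and the Hebrew marker.

```python
def head_tail_window(text: str, head: int, tail: int, marker: str = "\n[...]\n") -> str:
    """Keep the first `head` and last `tail` characters of a long text.

    Assumes tail > 0: text[-0:] would return the whole string.
    """
    if len(text) <= head + tail:
        return text  # short enough: send everything
    return text[:head] + marker + text[-tail:]


short = head_tail_window("abcdef", head=4, tail=4)
long = head_tail_window("0123456789" * 3, head=5, tail=3)
```

Here `short` comes back unchanged, while `long` keeps only its first five and last three characters around the marker.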
|
|
||||||
|
|
||||||
|
async def extract_metadata(case_law_id: UUID | str) -> dict:
|
||||||
|
"""Run metadata extraction. Returns a dict with the suggested values.
|
||||||
|
|
||||||
|
Does NOT write to the DB — caller decides what to merge.
|
||||||
|
"""
|
||||||
|
if isinstance(case_law_id, str):
|
||||||
|
case_law_id = UUID(case_law_id)
|
||||||
|
|
||||||
|
record = await db.get_case_law(case_law_id)
|
||||||
|
if not record:
|
||||||
|
return {}
|
||||||
|
full_text = (record.get("full_text") or "").strip()
|
||||||
|
if not full_text:
|
||||||
|
return {}
|
||||||
|
|
||||||
|
citation = record.get("case_number") or ""
|
||||||
|
court = record.get("court") or ""
|
||||||
|
date_str = str(record.get("date") or "")
|
||||||
|
practice_area = record.get("practice_area") or ""
|
||||||
|
|
||||||
|
context = (
|
||||||
|
f"מראה מקום: {citation}\n"
|
||||||
|
f"ערכאה: {court}\n"
|
||||||
|
f"תאריך: {date_str}\n"
|
||||||
|
f"תחום: {practice_area}"
|
||||||
|
)
|
||||||
|
text_window = _build_text_window(full_text)
|
||||||
|
# Static instructions go via `system` so the SDK path can cache them
|
||||||
|
# across uploads. Per-precedent content goes in the user prompt.
|
||||||
|
user_msg = (
|
||||||
|
f"## הקלט\n{context}\n\n"
|
||||||
|
f"--- תחילת הטקסט ---\n{text_window}\n--- סוף הטקסט ---"
|
||||||
|
)
|
||||||
|
|
||||||
|
try:
|
||||||
|
result = await claude_session.query_json(
|
||||||
|
user_msg, system=METADATA_EXTRACTION_PROMPT,
|
||||||
|
)
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning("precedent_metadata_extractor: query failed: %s", e)
|
||||||
|
return {}
|
||||||
|
|
||||||
|
if not isinstance(result, dict):
|
||||||
|
logger.warning(
|
||||||
|
"precedent_metadata_extractor: expected dict, got %s",
|
||||||
|
type(result).__name__,
|
||||||
|
)
|
||||||
|
return {}
|
||||||
|
|
||||||
|
# Normalize keys / types
|
||||||
|
out: dict = {}
|
||||||
|
if isinstance(result.get("case_name_short"), str):
|
||||||
|
out["case_name_short"] = result["case_name_short"].strip()
|
||||||
|
if isinstance(result.get("appeal_subtype"), str):
|
||||||
|
out["appeal_subtype"] = result["appeal_subtype"].strip()
|
||||||
|
if isinstance(result.get("summary"), str):
|
||||||
|
out["summary"] = result["summary"].strip()
|
||||||
|
if isinstance(result.get("headnote"), str):
|
||||||
|
out["headnote"] = result["headnote"].strip()
|
||||||
|
if isinstance(result.get("key_quote"), str):
|
||||||
|
out["key_quote"] = result["key_quote"].strip()
|
||||||
|
tags = result.get("subject_tags") or []
|
||||||
|
if isinstance(tags, list):
|
||||||
|
out["subject_tags"] = [str(t).strip() for t in tags if str(t).strip()]
|
||||||
|
if isinstance(result.get("decision_date_iso"), str):
|
||||||
|
out["decision_date_iso"] = result["decision_date_iso"].strip()
|
||||||
|
if isinstance(result.get("precedent_level"), str):
|
||||||
|
# Validate against the closed enum used elsewhere in the system
|
||||||
|
lvl = result["precedent_level"].strip()
|
||||||
|
if lvl in {"עליון", "מנהלי", "ועדת_ערר_ארצית", "ועדת_ערר_מחוזית"}:
|
||||||
|
out["precedent_level"] = lvl
|
||||||
|
if isinstance(result.get("source_type"), str):
|
||||||
|
st = result["source_type"].strip()
|
||||||
|
if st in {"court_ruling", "appeals_committee"}:
|
||||||
|
out["source_type"] = st
|
||||||
|
if isinstance(result.get("court"), str):
|
||||||
|
out["court"] = result["court"].strip()
|
||||||
|
return out
|
||||||
|
|
||||||
|
|
||||||
|
async def apply_to_record(
|
||||||
|
case_law_id: UUID | str,
|
||||||
|
suggested: dict,
|
||||||
|
) -> dict:
|
||||||
|
"""Merge suggested metadata into the case_law row, filling ONLY empty fields.
|
||||||
|
|
||||||
|
Empty rules:
|
||||||
|
- string field == "" → fill from suggested
|
||||||
|
- list field == [] → fill from suggested
|
||||||
|
- if suggested key is missing or empty, skip
|
||||||
|
|
||||||
|
case_name has special handling: if the current case_name equals the
|
||||||
|
case_number (a tell-tale sign of the upload form sending the long
|
||||||
|
citation into both fields), treat it as empty and overwrite.
|
||||||
|
"""
|
||||||
|
if isinstance(case_law_id, str):
|
||||||
|
case_law_id = UUID(case_law_id)
|
||||||
|
record = await db.get_case_law(case_law_id)
|
||||||
|
if not record:
|
||||||
|
return {"updated": False, "fields": []}
|
||||||
|
|
||||||
|
fields_to_update: dict = {}
|
||||||
|
|
||||||
|
cur_case_name = (record.get("case_name") or "").strip()
|
||||||
|
cur_case_number = (record.get("case_number") or "").strip()
|
||||||
|
suggested_case_name = (suggested.get("case_name_short") or "").strip()
|
||||||
|
if suggested_case_name and (
|
||||||
|
not cur_case_name or cur_case_name == cur_case_number
|
||||||
|
):
|
||||||
|
fields_to_update["case_name"] = suggested_case_name
|
||||||
|
|
||||||
|
if not (record.get("appeal_subtype") or "").strip():
|
||||||
|
s = (suggested.get("appeal_subtype") or "").strip()
|
||||||
|
if s:
|
||||||
|
fields_to_update["appeal_subtype"] = s
|
||||||
|
|
||||||
|
if not (record.get("summary") or "").strip():
|
||||||
|
s = (suggested.get("summary") or "").strip()
|
||||||
|
if s:
|
||||||
|
fields_to_update["summary"] = s
|
||||||
|
|
||||||
|
if not (record.get("headnote") or "").strip():
|
||||||
|
s = (suggested.get("headnote") or "").strip()
|
||||||
|
if s:
|
||||||
|
fields_to_update["headnote"] = s
|
||||||
|
|
||||||
|
if not (record.get("key_quote") or "").strip():
|
||||||
|
s = (suggested.get("key_quote") or "").strip()
|
||||||
|
if s:
|
||||||
|
fields_to_update["key_quote"] = s
|
||||||
|
|
||||||
|
cur_tags = record.get("subject_tags") or []
|
||||||
|
if not cur_tags:
|
||||||
|
sug_tags = suggested.get("subject_tags") or []
|
||||||
|
if sug_tags:
|
||||||
|
fields_to_update["subject_tags"] = sug_tags
|
||||||
|
|
||||||
|
# decision_date — only fill if currently null. The DB column is DATE,
|
||||||
|
# so we parse the LLM's ISO string into a date object before passing
|
||||||
|
# it to update_case_law (asyncpg won't coerce a string to DATE).
|
||||||
|
if record.get("date") is None:
|
||||||
|
iso = (suggested.get("decision_date_iso") or "").strip()
|
||||||
|
if iso:
|
||||||
|
try:
|
||||||
|
fields_to_update["date"] = date_type.fromisoformat(iso[:10])
|
||||||
|
except ValueError:
|
||||||
|
logger.debug(
|
||||||
|
"metadata_extractor: ignoring invalid decision_date_iso=%r",
|
||||||
|
iso,
|
||||||
|
)
|
||||||
|
|
||||||
|
if not (record.get("precedent_level") or "").strip():
|
||||||
|
lvl = (suggested.get("precedent_level") or "").strip()
|
||||||
|
if lvl:
|
||||||
|
fields_to_update["precedent_level"] = lvl
|
||||||
|
|
||||||
|
if not (record.get("source_type") or "").strip():
|
||||||
|
st = (suggested.get("source_type") or "").strip()
|
||||||
|
if st:
|
||||||
|
fields_to_update["source_type"] = st
|
||||||
|
|
||||||
|
if not (record.get("court") or "").strip():
|
||||||
|
c = (suggested.get("court") or "").strip()
|
||||||
|
if c:
|
||||||
|
fields_to_update["court"] = c
|
||||||
|
|
||||||
|
if not fields_to_update:
|
||||||
|
return {"updated": False, "fields": []}
|
||||||
|
|
||||||
|
await db.update_case_law(case_law_id, **fields_to_update)
|
||||||
|
return {"updated": True, "fields": list(fields_to_update.keys())}
|
||||||
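The "fill only empty fields" policy can be reduced to a few lines. This is a sketch under simplifying assumptions: a hypothetical `merge_empty_only` that handles two string fields and the date, and returns the updates instead of writing to a database.

```python
from datetime import date


def merge_empty_only(record: dict, suggested: dict) -> dict:
    """Return only the updates that fill currently empty fields."""
    updates = {}
    for key in ("summary", "court"):
        current = (record.get(key) or "").strip()
        proposal = (suggested.get(key) or "").strip()
        if not current and proposal:
            updates[key] = proposal
    if record.get("date") is None:
        iso = (suggested.get("decision_date_iso") or "").strip()
        if iso:
            try:
                # Parse to a date object, as a DATE column would expect.
                updates["date"] = date.fromisoformat(iso[:10])
            except ValueError:
                pass  # malformed date from the model: leave the field empty
    return updates


updates = merge_empty_only(
    {"summary": "typed by the user", "court": "", "date": None},
    {"summary": "model summary", "court": "Supreme Court",
     "decision_date_iso": "2022-09-05"},
)
bad = merge_empty_only({"date": None}, {"decision_date_iso": "05/09/2022"})
```

The user-typed `summary` survives, the empty `court` and `date` are filled, and a non-ISO date is dropped rather than raising.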
|
|
||||||
|
|
||||||
|
async def extract_and_apply(case_law_id: UUID | str) -> dict:
|
||||||
|
"""Convenience wrapper: extract → merge into row → return summary."""
|
||||||
|
suggested = await extract_metadata(case_law_id)
|
||||||
|
if not suggested:
|
||||||
|
return {"status": "no_metadata", "fields": []}
|
||||||
|
result = await apply_to_record(case_law_id, suggested)
|
||||||
|
return {
|
||||||
|
"status": "completed" if result["updated"] else "no_changes",
|
||||||
|
"fields": result["fields"],
|
||||||
|
"suggested": suggested,
|
||||||
|
}
|
||||||
@@ -2,10 +2,12 @@
|
|||||||
|
|
||||||
from __future__ import annotations
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
import logging
|
import logging
|
||||||
from pathlib import Path
|
from pathlib import Path
|
||||||
from uuid import UUID
|
from uuid import UUID
|
||||||
|
|
||||||
|
from legal_mcp import config
|
||||||
from legal_mcp.services import chunker, db, embeddings, extractor, references_extractor
|
from legal_mcp.services import chunker, db, embeddings, extractor, references_extractor
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
@@ -30,7 +32,7 @@ async def process_document(document_id: UUID, case_id: UUID) -> dict:
|
|||||||
try:
|
try:
|
||||||
# Step 1: Extract text
|
# Step 1: Extract text
|
||||||
logger.info("Extracting text from %s", doc["file_path"])
|
logger.info("Extracting text from %s", doc["file_path"])
|
||||||
text, page_count = await extractor.extract_text(doc["file_path"])
|
text, page_count, page_offsets = await extractor.extract_text(doc["file_path"])
|
||||||
|
|
||||||
await db.update_document(
|
await db.update_document(
|
||||||
document_id,
|
document_id,
|
||||||
@@ -68,9 +70,9 @@ async def process_document(document_id: UUID, case_id: UUID) -> dict:
|
|||||||
except Exception as e:
|
except Exception as e:
|
||||||
logger.warning("Classification failed (non-fatal): %s", e)
|
logger.warning("Classification failed (non-fatal): %s", e)
|
||||||
|
|
||||||
# Step 2: Chunk
|
# Step 2: Chunk (page_offsets propagates page_number into chunks)
|
||||||
logger.info("Chunking document (%d chars)", len(text))
|
logger.info("Chunking document (%d chars)", len(text))
|
||||||
chunks = chunker.chunk_document(text)
|
chunks = chunker.chunk_document(text, page_offsets=page_offsets)
|
||||||
|
|
||||||
if not chunks:
|
if not chunks:
|
||||||
await db.update_document(document_id, extraction_status="completed")
|
await db.update_document(document_id, extraction_status="completed")
|
||||||
@@ -95,6 +97,21 @@ async def process_document(document_id: UUID, case_id: UUID) -> dict:
|
|||||||
|
|
||||||
stored = await db.store_chunks(document_id, case_id, chunk_dicts)
|
stored = await db.store_chunks(document_id, case_id, chunk_dicts)
|
||||||
|
|
||||||
|
# Step 4.5: Multimodal page-image embeddings (V9). Gated by
|
||||||
|
# MULTIMODAL_ENABLED. Renders each PDF page → embeds via
|
||||||
|
# voyage-multimodal-3 → stores per-page row with thumbnail.
|
||||||
|
# Non-fatal on failure (text path already succeeded).
|
||||||
|
multimodal_result = {"pages_embedded": 0}
|
||||||
|
if config.MULTIMODAL_ENABLED and page_count > 0:
|
||||||
|
try:
|
||||||
|
pdf_path = Path(doc["file_path"])
|
||||||
|
if pdf_path.suffix.lower() == ".pdf":
|
||||||
|
multimodal_result = await _embed_document_pages(
|
||||||
|
document_id, case_id, pdf_path, page_count,
|
||||||
|
)
|
||||||
|
except Exception as e:
|
||||||
|
logger.warning("Multimodal embedding failed (non-fatal): %s", e)
|
||||||
|
|
||||||
# Step 5: Extract references (plans, case law, legislation) — non-fatal
|
# Step 5: Extract references (plans, case law, legislation) — non-fatal
|
||||||
refs_result = {"plans": 0, "case_law": 0, "case_law_linked": 0, "legislation": 0}
|
refs_result = {"plans": 0, "case_law": 0, "case_law_linked": 0, "legislation": 0}
|
||||||
try:
|
try:
|
||||||
@@ -124,9 +141,63 @@ async def process_document(document_id: UUID, case_id: UUID) -> dict:
|
|||||||
"case_law": refs_result["case_law"],
|
"case_law": refs_result["case_law"],
|
||||||
"legislation": refs_result["legislation"],
|
"legislation": refs_result["legislation"],
|
||||||
},
|
},
|
||||||
|
"multimodal": multimodal_result,
|
||||||
}
|
}
|
||||||
|
|
||||||
except Exception as e:
|
except Exception as e:
|
||||||
logger.exception("Document processing failed: %s", e)
|
logger.exception("Document processing failed: %s", e)
|
||||||
await db.update_document(document_id, extraction_status="failed")
|
await db.update_document(document_id, extraction_status="failed")
|
||||||
return {"status": "failed", "error": str(e)}
|
return {"status": "failed", "error": str(e)}
|
||||||
|
|
||||||
|
|
||||||
|
async def _embed_document_pages(
|
||||||
|
document_id: UUID,
|
||||||
|
case_id: UUID,
|
||||||
|
pdf_path: Path,
|
||||||
|
page_count: int,
|
||||||
|
) -> dict:
|
||||||
|
"""Render PDF pages → embed via voyage-multimodal → store per-page rows.
|
||||||
|
|
||||||
|
Thumbnails are saved under
|
||||||
|
``data/cases/{case_number}/thumbnails/{document_id}/p{N:03d}.jpg``
|
||||||
|
so the UI can show small previews next to image-side search hits.
|
||||||
|
"""
|
||||||
|
# Layout: data/cases/{case_number}/documents/originals/{file}.pdf
|
||||||
|
# → case_dir = pdf_path.parent.parent.parent
|
||||||
|
case_dir = pdf_path.parent.parent.parent
|
||||||
|
thumb_dir = case_dir / "thumbnails" / str(document_id)
|
||||||
|
|
||||||
|
logger.info("Multimodal: rendering %d pages @ %ddpi", page_count, config.MULTIMODAL_DPI)
|
||||||
|
rendered = await asyncio.to_thread(
|
||||||
|
extractor.render_pages_for_multimodal,
|
||||||
|
pdf_path,
|
||||||
|
config.MULTIMODAL_DPI,
|
||||||
|
config.MULTIMODAL_THUMB_DPI,
|
||||||
|
thumb_dir,
|
||||||
|
)
|
||||||
|
images = [pil for pil, _ in rendered]
|
||||||
|
thumb_paths = [thumb for _, thumb in rendered]
|
||||||
|
|
||||||
|
logger.info("Multimodal: embedding %d pages via %s", len(images), config.MULTIMODAL_MODEL)
|
||||||
|
img_embs = await embeddings.embed_images(images)
|
||||||
|
|
||||||
|
page_records = []
|
||||||
|
for i, (emb, thumb) in enumerate(zip(img_embs, thumb_paths)):
|
||||||
|
rel_thumb = None
|
||||||
|
if thumb is not None:
|
||||||
|
try:
|
||||||
|
rel_thumb = str(thumb.relative_to(config.DATA_DIR))
|
||||||
|
except ValueError:
|
||||||
|
rel_thumb = str(thumb)
|
||||||
|
page_records.append({
|
||||||
|
"page_number": i + 1,
|
||||||
|
"embedding": emb,
|
||||||
|
"image_thumbnail_path": rel_thumb,
|
||||||
|
})
|
||||||
|
|
||||||
|
stored = await db.store_document_image_embeddings(
|
||||||
|
document_id, case_id, page_records,
|
||||||
|
model_name=config.MULTIMODAL_MODEL,
|
||||||
|
)
|
||||||
|
logger.info("Multimodal: stored %d page-image embeddings", stored)
|
||||||
|
return {"pages_embedded": stored, "model": config.MULTIMODAL_MODEL}
|
||||||
|
|||||||
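Both page-embedding helpers push the blocking render step through `asyncio.to_thread`. A minimal sketch of that pattern, with a hypothetical `render_pages` stand-in that returns strings instead of PIL images and paths:

```python
import asyncio
import threading


def render_pages(page_count: int) -> list[tuple[str, str]]:
    """Stand-in for a blocking renderer; returns (image, thumbnail) pairs."""
    worker = threading.current_thread().name
    return [(f"img-{n}@{worker}", f"thumb-{n}") for n in range(1, page_count + 1)]


async def embed_pages(page_count: int) -> dict:
    # Run the blocking call in a worker thread so the event loop stays free.
    rendered = await asyncio.to_thread(render_pages, page_count)
    images = [img for img, _ in rendered]
    thumbs = [thumb for _, thumb in rendered]
    return {"images": images, "thumbs": thumbs}


result = asyncio.run(embed_pages(3))
```

The image labels record the thread name, which makes it visible that the render did not run on the main thread.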
@@ -144,9 +144,9 @@ async def check_claims_coverage(blocks: list[dict], claims: list[dict]) -> dict:
 ## בלוק הדיון:
 {discussion}"""
 
-    parsed = claude_session.query_json(prompt, timeout=120)
+    parsed = await claude_session.query_json(prompt)
     if parsed is None:
-        logger.warning("Failed to parse claims check: %s", raw[:300])
+        logger.warning("Failed to parse claims check")
         # Fallback: assume all covered (don't block export on parse failure)
         return {"name": "claims_coverage", "passed": True,
                 "errors": ["שגיאה בפענוח תוצאות — לא ניתן לבדוק"], "severity": "warning"}
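The hunk above adds the `await` that an async `query_json` needs. A small sketch of what goes wrong without it, assuming a hypothetical `query_json` stand-in: the un-awaited call yields a coroutine object, which is never `None`, so an `if parsed is None` guard would not fire.

```python
import asyncio
import inspect


async def query_json(prompt: str):
    """Stand-in for an async LLM call that returns parsed JSON."""
    await asyncio.sleep(0)
    return {"covered": True}


async def check():
    pending = query_json("prompt")   # no await: a coroutine object, not a dict
    was_coroutine = inspect.iscoroutine(pending)
    not_none = pending is not None   # a None check would pass silently
    parsed = await pending           # awaiting yields the actual result
    return was_coroutine, not_none, parsed


was_coroutine, not_none, parsed = asyncio.run(check())
```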
103
mcp-server/src/legal_mcp/services/rerank.py
Normal file
@@ -0,0 +1,103 @@
|
|||||||
|
"""Optional cross-encoder reranking layer for semantic search.
|
||||||
|
|
||||||
|
Wraps a base search function with two-stage retrieval:
|
||||||
|
1. fetch ``VOYAGE_RERANK_FETCH_K`` candidates via the bi-encoder (cosine)
|
||||||
|
2. pass them to voyage rerank-2, return top-``limit``
|
||||||
|
|
||||||
|
When the feature flag is off (or ``force_rerank=False``) the helper just
|
||||||
|
calls the base function with ``limit`` and returns its results unchanged
|
||||||
|
— so callers can wrap unconditionally and let env control behaviour.
|
||||||
|
|
||||||
|
The helper extracts the rerank text from each row using the first
|
||||||
|
non-empty field among ``content``, ``rule_statement``,
|
||||||
|
``reasoning_summary``, ``supporting_quote`` (matches the schema used by ``search_similar``
|
||||||
|
and ``search_precedent_library_semantic``).
|
||||||
|
|
||||||
|
Decision validated by POC #5 (785-doc precedent corpus, 12 queries):
|
||||||
|
- mean@3: 4.306 → 4.500 (+4.5%)
|
||||||
|
- practical-category queries: 3.78 → 4.22 (+11.6%)
|
||||||
|
- latency: +702ms per query
|
||||||
|
"""
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import logging
|
||||||
|
from collections.abc import Awaitable, Callable
|
||||||
|
from typing import Any
|
||||||
|
|
||||||
|
from legal_mcp import config
|
||||||
|
from legal_mcp.services import embeddings
|
||||||
|
|
||||||
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
|
SearchFn = Callable[..., Awaitable[list[dict]]]
|
||||||
|
|
||||||
|
|
||||||
|
def _rerank_text(row: dict) -> str:
    """First non-empty text field that voyage rerank should see."""
    for key in ("content", "rule_statement", "reasoning_summary",
                "supporting_quote"):
        v = row.get(key)
        if v:
            return str(v)
    return ""
|
|
||||||
|
|
||||||
|
async def maybe_rerank(
|
||||||
|
query: str,
|
||||||
|
base_search: SearchFn,
|
||||||
|
limit: int,
|
||||||
|
*,
|
||||||
|
force_rerank: bool | None = None,
|
||||||
|
fetch_k: int | None = None,
|
||||||
|
**base_kwargs: Any,
|
||||||
|
) -> list[dict]:
|
||||||
|
"""Two-stage retrieval helper.
|
||||||
|
|
||||||
|
Args:
|
||||||
|
query: original query string (needed for the rerank API).
|
||||||
|
base_search: any async function that takes ``limit=…`` and the
|
||||||
|
other ``base_kwargs`` and returns ``list[dict]``.
|
||||||
|
limit: final number of results to return.
|
||||||
|
force_rerank: override the env flag. ``None`` → use config.
|
||||||
|
fetch_k: override the bi-encoder fetch depth.
|
||||||
|
**base_kwargs: forwarded to ``base_search``.
|
||||||
|
|
||||||
|
Returns:
|
||||||
|
List of dict rows. When rerank is active, each row's ``score``
|
||||||
|
is replaced with the rerank-2 relevance score (0..1).
|
||||||
|
"""
|
||||||
|
enabled = (config.VOYAGE_RERANK_ENABLED
|
||||||
|
if force_rerank is None else force_rerank)
|
||||||
|
if not enabled:
|
||||||
|
return await base_search(limit=limit, **base_kwargs)
|
||||||
|
|
||||||
|
depth = fetch_k or config.VOYAGE_RERANK_FETCH_K
|
||||||
|
candidates = await base_search(limit=depth, **base_kwargs)
|
||||||
|
if not candidates:
|
||||||
|
return []
|
||||||
|
|
||||||
|
texts = [_rerank_text(c) for c in candidates]
|
||||||
|
# Drop candidates with empty rerank text (shouldn't happen but be safe)
|
||||||
|
keep = [(i, t) for i, t in enumerate(texts) if t]
|
||||||
|
if not keep:
|
||||||
|
logger.warning("rerank: all candidates empty, falling back to base")
|
||||||
|
return candidates[:limit]
|
||||||
|
keep_idx = [i for i, _ in keep]
|
||||||
|
keep_texts = [t for _, t in keep]
|
||||||
|
|
||||||
|
try:
|
||||||
|
ranked = await embeddings.voyage_rerank(
|
||||||
|
query, keep_texts, top_k=limit,
|
||||||
|
)
|
||||||
|
except Exception as e:
|
||||||
|
# Fail open — if Voyage rerank is down, return bi-encoder ordering
|
||||||
|
logger.warning("rerank failed, falling back to base: %s", e)
|
||||||
|
return candidates[:limit]
|
||||||
|
|
||||||
|
out: list[dict] = []
|
||||||
|
for keep_pos, score in ranked:
|
||||||
|
orig_idx = keep_idx[keep_pos]
|
||||||
|
row = dict(candidates[orig_idx])
|
||||||
|
row["score"] = float(score)
|
||||||
|
out.append(row)
|
||||||
|
return out
|
||||||
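The index bookkeeping and fail-open behaviour of the helper above can be exercised without a rerank service. The sketch assumes a hypothetical `rerank_with_fallback` that takes the reranker as an argument and two toy rerankers; it is not the module's API.

```python
import asyncio


async def rerank_with_fallback(query, candidates, limit, rerank):
    """Rerank non-empty candidates; fall back to the original order on error.

    `rerank` is a hypothetical async callable returning (position, score)
    pairs, where position indexes the list of texts it was given.
    """
    keep = [(i, c["content"]) for i, c in enumerate(candidates) if c.get("content")]
    if not keep:
        return candidates[:limit]
    try:
        ranked = await rerank(query, [text for _, text in keep], limit)
    except Exception:
        return candidates[:limit]  # fail open: keep the first-stage ordering
    out = []
    for pos, score in ranked:
        row = dict(candidates[keep[pos][0]])  # map back to the original row
        row["score"] = float(score)
        out.append(row)
    return out


async def fake_rerank(query, texts, top_k):
    # Toy scorer: longer text ranks higher.
    scored = sorted(enumerate(texts), key=lambda p: len(p[1]), reverse=True)
    return [(i, len(t) / 10) for i, t in scored[:top_k]]


async def broken_rerank(query, texts, top_k):
    raise RuntimeError("service unavailable")


rows = [{"id": "a", "content": "xx"}, {"id": "b", "content": ""},
        {"id": "c", "content": "xxxx"}]
ranked = asyncio.run(rerank_with_fallback("q", rows, 2, fake_rerank))
fallback = asyncio.run(rerank_with_fallback("q", rows, 2, broken_rerank))
```

Copying each row with `dict(...)` before overwriting `score` leaves the first-stage results untouched, which keeps the fallback path safe to return.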
@@ -159,7 +159,7 @@ async def _analyze_single_pass(rows, appeal_subtype: str = "") -> dict:
|
|||||||
decisions_text += f"\n\n--- החלטה {row['decision_number'] or 'ללא מספר'} ---\n"
|
decisions_text += f"\n\n--- החלטה {row['decision_number'] or 'ללא מספר'} ---\n"
|
||||||
decisions_text += row["full_text"]
|
decisions_text += row["full_text"]
|
||||||
|
|
||||||
raw = claude_session.query(
|
raw = await claude_session.query(
|
||||||
ANALYSIS_PROMPT.format(decisions=decisions_text),
|
ANALYSIS_PROMPT.format(decisions=decisions_text),
|
||||||
timeout=claude_session.LONG_TIMEOUT,
|
timeout=claude_session.LONG_TIMEOUT,
|
||||||
)
|
)
|
||||||
@@ -176,7 +176,7 @@ async def _analyze_multi_pass(rows, appeal_subtype: str = "") -> dict:
|
|||||||
decision_text = f"--- החלטה {row['decision_number'] or 'ללא מספר'} ---\n"
|
decision_text = f"--- החלטה {row['decision_number'] or 'ללא מספר'} ---\n"
|
||||||
decision_text += row["full_text"]
|
decision_text += row["full_text"]
|
||||||
|
|
||||||
raw = claude_session.query(
|
raw = await claude_session.query(
|
||||||
SINGLE_DECISION_PROMPT.format(decision=decision_text),
|
SINGLE_DECISION_PROMPT.format(decision=decision_text),
|
||||||
timeout=claude_session.LONG_TIMEOUT,
|
timeout=claude_session.LONG_TIMEOUT,
|
||||||
)
|
)
|
||||||
@@ -189,7 +189,7 @@ async def _analyze_multi_pass(rows, appeal_subtype: str = "") -> dict:
|
|||||||
return {"error": "לא הצלחתי לחלץ דפוסים מההחלטות"}
|
return {"error": "לא הצלחתי לחלץ דפוסים מההחלטות"}
|
||||||
|
|
||||||
# Pass 2: Synthesize across all decisions
|
# Pass 2: Synthesize across all decisions
|
||||||
raw = claude_session.query(
|
raw = await claude_session.query(
|
||||||
SYNTHESIS_PROMPT.format(
|
SYNTHESIS_PROMPT.format(
|
||||||
num_decisions=len(rows),
|
num_decisions=len(rows),
|
||||||
patterns=json.dumps(all_patterns, ensure_ascii=False, indent=2),
|
patterns=json.dumps(all_patterns, ensure_ascii=False, indent=2),
|
||||||
|
|||||||
@@ -13,7 +13,7 @@ from uuid import UUID
|
|||||||
import httpx
|
import httpx
|
||||||
|
|
||||||
from legal_mcp import config
|
from legal_mcp import config
|
||||||
from legal_mcp.services import audit, db, practice_area as pa
|
from legal_mcp.services import audit, db, git_sync, practice_area as pa
|
||||||
|
|
||||||
logger = logging.getLogger(__name__)
|
logger = logging.getLogger(__name__)
|
||||||
|
|
||||||
@@ -28,12 +28,17 @@ def _gitea_token() -> str:
|
|||||||
return os.environ.get("GITEA_ACCESS_TOKEN") or os.environ.get("GITEA_TOKEN", "")
|
return os.environ.get("GITEA_ACCESS_TOKEN") or os.environ.get("GITEA_TOKEN", "")
|
||||||
|
|
||||||
|
|
||||||
async def _setup_gitea_remote(case_number: str, title: str, case_dir: Path) -> bool:
|
async def _setup_gitea_remote(case_number: str, title: str, case_dir: Path) -> dict:
|
||||||
"""Create Gitea repo and configure git remote. Best-effort — returns False on failure."""
|
"""Create Gitea repo and configure git remote.
|
||||||
|
|
||||||
|
Returns a dict with: ok (bool), url (str|None), error (str|None).
|
||||||
|
Never raises — failures are reported via the dict so callers can surface
|
||||||
|
them to the UI instead of silently swallowing them.
|
||||||
|
"""
|
||||||
token = _gitea_token()
|
token = _gitea_token()
|
||||||
if not token:
|
if not token:
|
||||||
logger.info("No GITEA_TOKEN — skipping Gitea repo creation for %s", case_number)
|
logger.info("No GITEA_TOKEN — skipping Gitea repo creation for %s", case_number)
|
||||||
return False
|
return {"ok": False, "url": None, "error": "no_token"}
|
||||||
|
|
||||||
try:
|
try:
|
||||||
async with httpx.AsyncClient(verify=False, timeout=30) as client:
|
async with httpx.AsyncClient(verify=False, timeout=30) as client:
|
||||||
@@ -59,8 +64,9 @@ async def _setup_gitea_remote(case_number: str, title: str, case_dir: Path) -> b
|
|||||||
repo = resp.json()
|
repo = resp.json()
|
||||||
|
|
||||||
clone_url = repo.get("clone_url", "")
|
clone_url = repo.get("clone_url", "")
|
||||||
|
html_url = repo.get("html_url", "")
|
||||||
if not clone_url:
|
if not clone_url:
|
||||||
return False
|
return {"ok": False, "url": None, "error": "no_clone_url"}
|
||||||
|
|
||||||
auth_url = clone_url.replace("https://", f"https://chaim:{token}@")
|
auth_url = clone_url.replace("https://", f"https://chaim:{token}@")
|
||||||
|
|
||||||
@@ -94,15 +100,20 @@ async def _setup_gitea_remote(case_number: str, title: str, case_dir: Path) -> b
|
|||||||
cwd=case_dir, capture_output=True, text=True, env=git_env,
|
cwd=case_dir, capture_output=True, text=True, env=git_env,
|
||||||
)
|
)
|
||||||
if push.returncode != 0:
|
if push.returncode != 0:
|
||||||
logger.warning("Gitea push failed for %s: %s", case_number, push.stderr)
|
stderr = push.stderr.strip()
|
||||||
return False
|
logger.warning("Gitea push failed for %s: %s", case_number, stderr)
|
||||||
|
return {"ok": False, "url": html_url or None, "error": f"push_failed: {stderr[:200]}"}
|
||||||
|
|
||||||
logger.info("Gitea repo created and pushed for %s", case_number)
|
logger.info("Gitea repo created and pushed for %s", case_number)
|
||||||
return True
|
return {"ok": True, "url": html_url or None, "error": None}
|
||||||
|
|
||||||
|
except httpx.HTTPStatusError as exc:
|
||||||
|
msg = f"http_{exc.response.status_code}"
|
||||||
|
logger.warning("Gitea setup failed for %s: %s", case_number, msg)
|
||||||
|
return {"ok": False, "url": None, "error": msg}
|
||||||
except Exception as exc:
|
except Exception as exc:
|
||||||
logger.warning("Gitea setup failed for %s: %s", case_number, exc)
|
logger.warning("Gitea setup failed for %s: %s", case_number, exc)
|
||||||
return False
|
return {"ok": False, "url": None, "error": f"{type(exc).__name__}: {exc}"[:200]}
|
||||||
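The switch from a bare `bool` to an `{ok, url, error}` dict can be illustrated for the push step alone. This is a sketch with a hypothetical `push_result` function and an example URL; the field names follow the docstring in the diff.

```python
def push_result(returncode: int, stderr: str, html_url: str) -> dict:
    """Report a push outcome as {ok, url, error} instead of raising."""
    if returncode != 0:
        detail = stderr.strip()[:200]  # cap the message shown to the caller
        return {"ok": False, "url": html_url or None,
                "error": f"push_failed: {detail}"}
    return {"ok": True, "url": html_url or None, "error": None}


ok = push_result(0, "", "https://gitea.example/org/case-1")
failed = push_result(1, "  remote: denied\n", "")
```

A caller can store the dict on the case payload and let a UI decide whether to show a retry action, which a plain `False` could not support.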
|
|
||||||
|
|
||||||
async def case_create(
|
async def case_create(
|
||||||
@@ -214,11 +225,10 @@ async def case_create(
|
|||||||
except Exception:
|
except Exception:
|
||||||
pass # git not available — non-critical
|
pass # git not available — non-critical
|
||||||
|
|
||||||
# Create Gitea repo and configure remote (best-effort)
|
# Create Gitea repo and configure remote — surface result so callers can
|
||||||
try:
|
# show failures (e.g. stale token) and offer a retry button instead of
|
||||||
await _setup_gitea_remote(case_number, title, case_dir)
|
# silently producing a case with no remote.
|
||||||
except Exception:
|
case["gitea"] = await _setup_gitea_remote(case_number, title, case_dir)
|
||||||
pass # Gitea not available — non-critical
|
|
||||||
|
|
||||||
return json.dumps(case, default=str, ensure_ascii=False, indent=2)
|
return json.dumps(case, default=str, ensure_ascii=False, indent=2)
|
||||||
|
|
||||||
@@ -315,21 +325,13 @@ async def case_update(
 
     updated = await db.update_case(UUID(case["id"]), **fields)
 
-    # Git commit the update (best-effort)
+    # Git commit + push the update (best-effort)
     try:
         case_dir = config.find_case_dir(case_number)
         if case_dir.exists():
             case_json = case_dir / "case.json"
             case_json.write_text(json.dumps(updated, default=str, ensure_ascii=False, indent=2))
-            subprocess.run(["git", "add", "case.json"], cwd=case_dir, capture_output=True)
-            subprocess.run(
-                ["git", "commit", "-m", f"עדכון תיק: {', '.join(fields.keys())}"],
-                cwd=case_dir,
-                capture_output=True,
-                env={"GIT_AUTHOR_NAME": "Ezer Mishpati", "GIT_AUTHOR_EMAIL": "legal@local",
-                     "GIT_COMMITTER_NAME": "Ezer Mishpati", "GIT_COMMITTER_EMAIL": "legal@local",
-                     "PATH": "/usr/bin:/bin"},
-            )
+            git_sync.commit_and_push(case_dir, f"עדכון תיק: {', '.join(fields.keys())}")
     except Exception:
         pass # git not available — non-critical
 
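Reviewer note (a sketch, not part of this commit): the body of `git_sync.commit_and_push` is not shown in this part of the diff. The version below is only a guess at what such a helper might look like, built from the inline `subprocess.run` calls it replaces. The `run` parameter, the `git add -A` staging, the bare `git push`, and the boolean return value are all assumptions made for illustration.

```python
import os
import subprocess
from pathlib import Path
from typing import Callable

# Identity values copied from the removed inline calls in the diff.
GIT_IDENTITY = {
    "GIT_AUTHOR_NAME": "Ezer Mishpati",
    "GIT_AUTHOR_EMAIL": "legal@local",
    "GIT_COMMITTER_NAME": "Ezer Mishpati",
    "GIT_COMMITTER_EMAIL": "legal@local",
}


def commit_and_push(repo_dir: Path, message: str, run: Callable = subprocess.run) -> bool:
    """Stage, commit and push in order; stop at the first failing step.

    Hypothetical stand-in for git_sync.commit_and_push. Returns True only
    if every git command exits with status 0, and never raises on a
    missing git binary.
    """
    env = {**os.environ, **GIT_IDENTITY}
    steps = [
        ["git", "add", "-A"],
        ["git", "commit", "-m", message],  # non-zero when there is nothing to commit
        ["git", "push"],
    ]
    try:
        for cmd in steps:
            proc = run(cmd, cwd=repo_dir, capture_output=True, env=env)
            if proc.returncode != 0:
                return False
    except OSError:
        # git is not installed or repo_dir is not accessible
        return False
    return True
```

Passing the runner as a parameter lets the helper be exercised without a real repository. Merging `os.environ` into `env` differs from the removed code, which passed a fixed `PATH` only; that is a design choice of this sketch, not something the diff states.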

@@ -4,12 +4,11 @@ from __future__ import annotations
 
 import json
 import shutil
-import subprocess
 from pathlib import Path
 from uuid import UUID
 
 from legal_mcp import config
-from legal_mcp.services import db, processor
+from legal_mcp.services import db, git_sync, processor
 
 
 async def document_upload(
@@ -67,11 +66,10 @@ async def document_upload(
         await db.update_document(UUID(doc["id"]), doc_type=classified_type)
         doc["doc_type"] = classified_type
 
-    # Git commit (best-effort — don't fail upload on git errors)
+    # Git commit + push (best-effort — don't fail upload on git errors)
     try:
         repo_dir = config.find_case_dir(case_number)
         if repo_dir.exists():
-            subprocess.run(["git", "add", "."], cwd=repo_dir, capture_output=True)
             doc_type_hebrew = {
                 "appeal": "כתב ערר",
                 "response": "תשובה",
@@ -85,14 +83,7 @@ async def document_upload(
                 "exhibit": "נספח",
                 "reference": "מסמך עזר",
             }.get(actual_doc_type, actual_doc_type)
-            subprocess.run(
-                ["git", "commit", "-m", f"הוספת {doc_type_hebrew}: {title}"],
-                cwd=repo_dir,
-                capture_output=True,
-                env={"GIT_AUTHOR_NAME": "Ezer Mishpati", "GIT_AUTHOR_EMAIL": "legal@local",
-                     "GIT_COMMITTER_NAME": "Ezer Mishpati", "GIT_COMMITTER_EMAIL": "legal@local",
-                     "PATH": "/usr/bin:/bin"},
-            )
+            git_sync.commit_and_push(repo_dir, f"הוספת {doc_type_hebrew}: {title}")
     except Exception:
         pass # git not available in container — non-critical
 
@@ -153,7 +144,7 @@ async def document_upload_training(
     shutil.copy2(str(source), str(dest))
 
     # Extract text and strip Nevo preamble
-    text, page_count = await extractor.extract_text(str(dest))
+    text, page_count, _ = await extractor.extract_text(str(dest))
     text = extractor.strip_nevo_preamble(text)
 
     # Parse date

@@ -7,7 +7,7 @@ from pathlib import Path
 from uuid import UUID
 
 from legal_mcp import config
-from legal_mcp.services import db, embeddings, research_md
+from legal_mcp.services import db, embeddings, git_sync, research_md
 from legal_mcp.services.lessons import (
     CITATION_GUIDANCE,
     DECISION_TEMPLATES,
@@ -403,6 +403,9 @@ async def export_docx(case_number: str, output_path: str = "") -> str:
     path = await docx_exporter.export_decision(case_id, output_path or None)
     # Register this export as the new source of truth
     await db.set_active_draft_path(case_id, path)
+    case_dir = config.find_case_dir(case_number)
+    if case_dir.exists():
+        git_sync.commit_and_push(case_dir, f"ייצוא DOCX: {Path(path).name}")
     return json.dumps({
         "status": "completed",
         "path": path,
@@ -528,6 +531,9 @@ async def export_interim_draft(case_number: str, output_path: str = "") -> str:
         case_id, output_path or None, mode="interim",
     )
     await db.set_active_draft_path(case_id, path)
+    case_dir = config.find_case_dir(case_number)
+    if case_dir.exists():
+        git_sync.commit_and_push(case_dir, f"טיוטת ביניים: {Path(path).name}")
     return json.dumps({
         "status": "completed",
         "mode": "interim",
@@ -571,6 +577,9 @@ async def apply_user_edit(case_number: str, edit_filename: str) -> str:
     try:
         retrofit_result = docx_retrofit.retrofit_bookmarks(edit_path)
         await db.set_active_draft_path(case_id, str(edit_path))
+        case_dir = config.find_case_dir(case_number)
+        if case_dir.exists():
+            git_sync.commit_and_push(case_dir, f"גרסת עריכה: {edit_path.name}")
         return json.dumps({
             "status": "completed",
             "active_draft_path": str(edit_path),
@@ -681,6 +690,12 @@ async def revise_draft(case_number: str, revisions_json: str,
         active_path, output_path, revisions, author=author,
     )
     await db.set_active_draft_path(case_id, str(output_path))
+    case_dir = config.find_case_dir(case_number)
+    if case_dir.exists():
+        git_sync.commit_and_push(
+            case_dir,
+            f"revise: טיוטה-v{next_ver} ({result.applied} שינויים, {result.failed} נכשלו)",
+        )
     return json.dumps({
         "status": "completed",
         "output_path": str(output_path),
264
mcp-server/src/legal_mcp/tools/precedent_library.py
Normal file
@@ -0,0 +1,264 @@
+"""MCP tools for the External Precedent Library.
+
+This is distinct from:
+
+- ``precedents`` (case_precedents table) — chair-attached quotes scoped to
+  a specific case section. Use ``precedent_search_library`` for that.
+- ``style_corpus`` (Daphna's prior decisions) — searched via
+  ``search_decisions`` for style/voice.
+
+The precedent library is the **authoritative law** corpus: external court
+rulings and other appeals committees' decisions, with halachot extracted
+and reviewed by the chair.
+
+All halachot enter as ``pending_review`` and are invisible to search until
+the chair approves them — per project review policy.
+"""
+
+from __future__ import annotations
+
+import json
+from uuid import UUID
+
+from legal_mcp.services import db, precedent_library
+
+
+def _ok(payload) -> str:
+    return json.dumps(payload, ensure_ascii=False, indent=2, default=str)
+
+
+def _err(msg: str) -> str:
+    return json.dumps({"error": msg}, ensure_ascii=False)
+
+
+async def precedent_library_upload(
+    file_path: str,
+    citation: str,
+    case_name: str = "",
+    court: str = "",
+    decision_date: str = "",
+    source_type: str = "",
+    precedent_level: str = "",
+    practice_area: str = "",
+    appeal_subtype: str = "",
+    subject_tags: list[str] | None = None,
+    is_binding: bool = True,
+    headnote: str = "",
+    summary: str = "",
+) -> str:
+    """העלאת פסיקה חיצונית לקורפוס הסמכותי + חילוץ הלכות אוטומטי.
+
+    Args:
+        file_path: נתיב מלא לקובץ PDF/DOCX/RTF/TXT/MD.
+        citation: מראה המקום ("עע\\"מ 3975/22 ב. קרן-נכסים נ' ועדה מקומית").
+        case_name: שם קצר.
+        court: ערכאה (עליון / מנהלי / ועדת ערר ארצית / ועדת ערר מחוזית).
+        decision_date: ISO date (YYYY-MM-DD), אופציונלי.
+        source_type: court_ruling / appeals_committee.
+        precedent_level: עליון / מנהלי / ועדת_ערר_ארצית / ועדת_ערר_מחוזית.
+        practice_area: rishuy_uvniya / betterment_levy / compensation_197.
+        subject_tags: תגיות נושא (חניה, קווי_בניין, וכד').
+
+    Returns: JSON עם case_law_id, מספר chunks, מספר הלכות שנכנסו לתור אישור.
+    """
+    if not citation.strip():
+        return _err("citation חובה")
+    try:
+        result = await precedent_library.ingest_precedent(
+            file_path=file_path,
+            citation=citation,
+            case_name=case_name,
+            court=court,
+            decision_date=decision_date or None,
+            source_type=source_type,
+            precedent_level=precedent_level,
+            practice_area=practice_area,
+            appeal_subtype=appeal_subtype,
+            subject_tags=subject_tags or [],
+            is_binding=is_binding,
+            headnote=headnote,
+            summary=summary,
+        )
+    except Exception as e:
+        return _err(str(e))
+    return _ok(result)
+
+
+async def precedent_library_list(
+    practice_area: str = "",
+    court: str = "",
+    precedent_level: str = "",
+    source_type: str = "",
+    search: str = "",
+    limit: int = 100,
+) -> str:
+    """רשימה של פסיקה בקורפוס הסמכותי, עם פילטרים."""
+    rows = await precedent_library.list_precedents(
+        practice_area=practice_area,
+        court=court,
+        precedent_level=precedent_level,
+        source_type=source_type,
+        search=search,
+        limit=limit,
+    )
+    return _ok(rows)
+
+
+async def precedent_library_get(case_law_id: str) -> str:
+    """פסיקה ספציפית עם כל ההלכות שלה (כולל ממתינות לאישור)."""
+    try:
+        cid = UUID(case_law_id)
+    except ValueError:
+        return _err("case_law_id לא תקין")
+    record = await precedent_library.get_precedent(cid)
+    if not record:
+        return _err("פסיקה לא נמצאה")
+    return _ok(record)
+
+
+async def precedent_library_delete(case_law_id: str) -> str:
+    """מחיקת פסיקה מהקורפוס. cascade: chunks + halachot."""
+    try:
+        cid = UUID(case_law_id)
+    except ValueError:
+        return _err("case_law_id לא תקין")
+    ok = await precedent_library.delete_precedent(cid)
+    return _ok({"deleted": ok, "case_law_id": case_law_id})
+
+
+async def precedent_extract_halachot(case_law_id: str) -> str:
+    """הרצה מחדש של חילוץ ההלכות לפסיקה קיימת. הלכות קודמות נמחקות."""
+    try:
+        cid = UUID(case_law_id)
+    except ValueError:
+        return _err("case_law_id לא תקין")
+    try:
+        result = await precedent_library.reextract_halachot(cid)
+    except Exception as e:
+        return _err(str(e))
+    return _ok(result)
+
+
+async def precedent_extract_metadata(case_law_id: str) -> str:
+    """חילוץ מטא-דאטה (case_name קצר, summary, headnote, key_quote, subject_tags, appeal_subtype, date, level, court, source_type) מהטקסט. ממלא רק שדות ריקים — לא דורס מה שכבר הוזן."""
+    try:
+        cid = UUID(case_law_id)
+    except ValueError:
+        return _err("case_law_id לא תקין")
+    try:
+        result = await precedent_library.reextract_metadata(cid)
+    except Exception as e:
+        return _err(str(e))
+    return _ok(result)
+
+
+async def precedent_process_pending(kind: str = "metadata", limit: int = 20) -> str:
+    """ריקון תור בקשות חילוץ שנערמו ע"י כפתורי ה-UI. kind: 'metadata' או 'halacha'.
+
+    הכפתור ב-UI מסמן ב-DB שהפסיקה מבקשת חילוץ. כלי זה (שרץ מקומית עם CLI)
+    סורק את התור ומריץ את ה-extractor לכל פריט. אחרי הצלחה הסימון מתנקה.
+    """
+    if kind not in {"metadata", "halacha"}:
+        return _err("kind חייב להיות 'metadata' או 'halacha'")
+    try:
+        result = await precedent_library.process_pending_extractions(
+            kind=kind, limit=limit,
+        )
+    except Exception as e:
+        return _err(str(e))
+    return _ok(result)
+
+
+async def search_precedent_library(
+    query: str,
+    practice_area: str = "",
+    court: str = "",
+    precedent_level: str = "",
+    appeal_subtype: str = "",
+    is_binding: bool | None = None,
+    subject_tag: str = "",
+    limit: int = 10,
+    include_halachot: bool = True,
+) -> str:
+    """חיפוש סמנטי בקורפוס הפסיקה הסמכותית.
+
+    מחזיר תוצאות מעורבות: הלכות (rule-level, מאושרות בלבד) + קטעי טקסט
+    (passage-level). הלכות מקבלות boost קל בדירוג כי הן מזוקקות מראש.
+
+    Args:
+        query: שאילתת חיפוש בעברית.
+        practice_area: rishuy_uvniya / betterment_levy / compensation_197.
+        court: סינון לפי ערכאה (substring).
+        precedent_level: עליון / מנהלי / ועדת_ערר_ארצית / ועדת_ערר_מחוזית.
+        appeal_subtype: סינון לתת-סוג.
+        is_binding: True/False (None = ללא סינון).
+        subject_tag: סינון לפי תגית נושא (לדוגמה "מועד_קביעת_שומה").
+        limit: מספר תוצאות מקסימלי.
+        include_halachot: האם לכלול הלכות (ברירת מחדל: כן).
+
+    Returns: רשימה מדורגת. כל פריט הוא {"type": "halacha"|"passage", "score", ...}.
+    """
+    if not query or len(query.strip()) < 2:
+        return json.dumps([], ensure_ascii=False)
+    results = await precedent_library.search_library(
+        query=query.strip(),
+        practice_area=practice_area,
+        court=court,
+        precedent_level=precedent_level,
+        appeal_subtype=appeal_subtype,
+        is_binding=is_binding,
+        subject_tag=subject_tag,
+        limit=limit,
+        include_halachot=include_halachot,
+    )
+    return _ok(results)
+
+
+async def halacha_review(
+    halacha_id: str,
+    status: str,
+    reviewer: str = "דפנה",
+    rule_statement: str = "",
+    reasoning_summary: str = "",
+    subject_tags: list[str] | None = None,
+    practice_areas: list[str] | None = None,
+) -> str:
+    """אישור / דחייה / עריכה של הלכה שחולצה אוטומטית.
+
+    Args:
+        halacha_id: מזהה ההלכה.
+        status: pending_review / approved / rejected / published.
+        reviewer: שם המאשר (ברירת מחדל: דפנה).
+        rule_statement: עריכת ניסוח הכלל (ריק = ללא שינוי).
+        reasoning_summary: עריכת תמצית ההיגיון (ריק = ללא שינוי).
+        subject_tags: עריכת תגיות (None = ללא שינוי).
+        practice_areas: עריכת תחומים (None = ללא שינוי).
+    """
+    if status not in {"pending_review", "approved", "rejected", "published"}:
+        return _err(
+            "status לא חוקי. ערכים תקינים: "
+            "pending_review / approved / rejected / published"
+        )
+    try:
+        hid = UUID(halacha_id)
+    except ValueError:
+        return _err("halacha_id לא תקין")
+
+    row = await db.update_halacha(
+        halacha_id=hid,
+        review_status=status,
+        reviewer=reviewer,
+        rule_statement=rule_statement or None,
+        reasoning_summary=reasoning_summary or None,
+        subject_tags=subject_tags,
+        practice_areas=practice_areas,
+    )
+    if row is None:
+        return _err("הלכה לא נמצאה")
+    return _ok(row)
+
+
+async def halachot_pending(limit: int = 100) -> str:
+    """תור ההלכות הממתינות לאישור (review_status='pending_review')."""
+    rows = await db.list_halachot(review_status="pending_review", limit=limit)
+    return _ok(rows)
@@ -6,7 +6,7 @@ import json
 import logging
 from uuid import UUID
 
-from legal_mcp.services import db, embeddings
+from legal_mcp.services import db, embeddings, hybrid_search
 
 logger = logging.getLogger(__name__)
 
@@ -43,8 +43,9 @@ async def search_decisions(
         )
 
     query_emb = await embeddings.embed_query(query)
-    results = await db.search_similar(
-        query_embedding=query_emb,
+    results = await hybrid_search.search_documents_hybrid(
+        query=query,
+        query_text_embedding=query_emb,
         limit=limit,
         section_type=section_type or None,
         practice_area=practice_area or None,
@@ -58,11 +59,13 @@ async def search_decisions(
     for r in results:
         formatted.append({
             "score": round(float(r["score"]), 4),
-            "case_number": r["case_number"],
-            "document": r["document_title"],
-            "section": r["section_type"],
-            "page": r["page_number"],
-            "content": r["content"],
+            "case_number": r.get("case_number"),
+            "document": r.get("document_title"),
+            "section": r.get("section_type"),
+            "page": r.get("page_number"),
+            "content": r.get("content", ""),
+            "match_type": r.get("match_type", "text"),
+            "image_thumbnail": r.get("image_thumbnail_path"),
         })
 
     return json.dumps(formatted, ensure_ascii=False, indent=2)
@@ -86,8 +89,9 @@ async def search_case_documents(
 
     query_emb = await embeddings.embed_query(query)
     # Restricted to case_id — practice_area filter would be redundant.
-    results = await db.search_similar(
-        query_embedding=query_emb,
+    results = await hybrid_search.search_documents_hybrid(
+        query=query,
+        query_text_embedding=query_emb,
         limit=limit,
         case_id=UUID(case["id"]),
     )
@@ -99,10 +103,12 @@ async def search_case_documents(
     for r in results:
         formatted.append({
             "score": round(float(r["score"]), 4),
-            "document": r["document_title"],
-            "section": r["section_type"],
-            "page": r["page_number"],
-            "content": r["content"],
+            "document": r.get("document_title"),
+            "section": r.get("section_type"),
+            "page": r.get("page_number"),
+            "content": r.get("content", ""),
+            "match_type": r.get("match_type", "text"),
+            "image_thumbnail": r.get("image_thumbnail_path"),
         })
 
     return json.dumps(formatted, ensure_ascii=False, indent=2)
@@ -137,9 +143,12 @@ async def find_similar_cases(
         )
 
     query_emb = await embeddings.embed_query(description)
-    results = await db.search_similar(
-        query_embedding=query_emb,
-        limit=limit * 3, # Get more to deduplicate by case
+    # Even with rerank we ask for ``limit*3`` so the dedup-by-case
+    # step downstream still has enough rows to pick the best per case.
+    results = await hybrid_search.search_documents_hybrid(
+        query=description,
+        query_text_embedding=query_emb,
+        limit=limit * 3,
         practice_area=practice_area or None,
         appeal_subtype=appeal_subtype or None,
     )
@@ -147,14 +156,16 @@ async def find_similar_cases(
     if not results:
         return "לא נמצאו תיקים דומים."
 
-    # Deduplicate by case_number, keep best score per case
+    # Deduplicate by case_number, keep best score per case.
+    # image-only rows still carry case_number from the join.
     seen_cases = {}
     for r in results:
-        cn = r["case_number"]
+        cn = r.get("case_number")
+        if not cn:
+            continue
         if cn not in seen_cases or r["score"] > seen_cases[cn]["score"]:
             seen_cases[cn] = r
 
-    # Sort by score and limit
     top_cases = sorted(seen_cases.values(), key=lambda x: x["score"], reverse=True)[:limit]
 
     formatted = []
@@ -162,8 +173,9 @@ async def find_similar_cases(
         formatted.append({
             "score": round(float(r["score"]), 4),
             "case_number": r["case_number"],
-            "document": r["document_title"],
-            "relevant_section": r["content"][:500],
+            "document": r.get("document_title"),
+            "relevant_section": (r.get("content") or "")[:500],
+            "match_type": r.get("match_type", "text"),
         })
 
     return json.dumps(formatted, ensure_ascii=False, indent=2)
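Reviewer note (a sketch, not part of this commit): the dedup-by-case step in `find_similar_cases` can be read in isolation as the function below. The logic mirrors the hunk above; the function name `best_per_case` and the sample rows are hypothetical and exist only to show why the search over-fetches (`limit * 3`) and why rows without a `case_number` are skipped.

```python
def best_per_case(rows: list[dict], limit: int) -> list[dict]:
    """Keep the highest-scoring row per case_number, then take the top `limit`."""
    seen_cases: dict[str, dict] = {}
    for r in rows:
        cn = r.get("case_number")
        if not cn:
            # Rows that lack a case number cannot be grouped; skip them.
            continue
        if cn not in seen_cases or r["score"] > seen_cases[cn]["score"]:
            seen_cases[cn] = r
    return sorted(seen_cases.values(), key=lambda x: x["score"], reverse=True)[:limit]


# Made-up rows: two hits for case "A", one for "B", one without a case number.
rows = [
    {"case_number": "A", "score": 0.71},
    {"case_number": "B", "score": 0.65},
    {"case_number": "A", "score": 0.80},
    {"case_number": None, "score": 0.99},
]
top = best_per_case(rows, limit=2)
print([(r["case_number"], r["score"]) for r in top])
```

With four raw rows only two distinct cases survive, which is the reason for requesting more rows than the final `limit`.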

@@ -308,7 +308,7 @@ async def ingest_final_version(
     # Extract text from file if provided
     if file_path and not final_text:
         from legal_mcp.services import extractor
-        final_text, _ = await extractor.extract_text(file_path)
+        final_text, _, _ = await extractor.extract_text(file_path)
 
     if not final_text:
         return "לא סופק טקסט — יש לספק file_path או final_text."

@@ -13,12 +13,20 @@ from lxml import etree
 
 from legal_mcp.services.docx_exporter import (
     _BOOKMARK_ID_START,
+    HEBREW_FONT,
+    _add_styled_paragraph,
     _insert_bookmark_end,
     _insert_bookmark_start,
+    _mark_paragraph_rtl,
+    _mark_run_rtl,
+    _strip_dashes,
     _wrap_block_with_bookmarks,
+    _write_block_to_docx,
 )
 from legal_mcp.services.docx_reviser import NSMAP, _w, list_bookmarks
 
+from docx.oxml.ns import qn
+
 
 def test_insert_bookmark_helpers_create_valid_xml(tmp_path: Path) -> None:
     doc = Document()
@@ -101,3 +109,119 @@ def test_multiple_blocks_get_unique_bookmark_ids(tmp_path: Path) -> None:
 
     names = list_bookmarks(out)
     assert set(names) == {"block-alef", "block-bet", "block-gimel"}
+
+
+# ── RTL / David-font invariants ───────────────────────────────────
+# These guard against regressions where Hebrew renders LTR or in the wrong
+# font slot (Times New Roman instead of David). See plan file for context.
+
+
+def test_mark_paragraph_rtl_adds_bidi_directly_in_pPr() -> None:
+    doc = Document()
+    p = doc.add_paragraph("טקסט בעברית")
+    _mark_paragraph_rtl(p)
+    pPr = p._p.find(qn("w:pPr"))
+    assert pPr is not None
+    # <w:bidi/> must be a direct child of pPr (paragraph direction),
+    # NOT nested inside <w:rPr>.
+    assert pPr.find(qn("w:bidi")) is not None
+    # paragraph-mark rPr still gets <w:rtl/>
+    rPr = pPr.find(qn("w:rPr"))
+    assert rPr is not None and rPr.find(qn("w:rtl")) is not None
+
+
+def test_mark_run_rtl_forces_david_on_all_font_slots() -> None:
+    doc = Document()
+    p = doc.add_paragraph()
+    run = p.add_run("טקסט")
+    _mark_run_rtl(run)
+    rPr = run._r.find(qn("w:rPr"))
+    assert rPr is not None
+    fonts = rPr.find(qn("w:rFonts"))
+    assert fonts is not None
+    for slot in ("w:ascii", "w:hAnsi", "w:cs", "w:eastAsia"):
+        assert fonts.get(qn(slot)) == HEBREW_FONT, f"{slot} not {HEBREW_FONT}"
+    assert rPr.find(qn("w:rtl")) is not None
+
+
+def test_styled_paragraph_applies_bidi_and_david() -> None:
+    """End-to-end: _add_styled_paragraph produces pPr/bidi + rFonts/cs=David."""
+    doc = Document()
+    _add_styled_paragraph(doc, "פסקה עברית", style="Normal")
+    p = doc.paragraphs[-1]
+    assert p._p.find(qn("w:pPr")).find(qn("w:bidi")) is not None
+    run = p.runs[0]
+    fonts = run._r.find(qn("w:rPr")).find(qn("w:rFonts"))
+    assert fonts.get(qn("w:cs")) == HEBREW_FONT
+
+
+def test_block_dalet_does_not_use_title_style() -> None:
+    """Title style uses theme fonts and 28pt — avoid for Hebrew."""
+    doc = Document()
+    _write_block_to_docx(doc, "block-dalet", title="", content="")
+    styles_used = {p.style.name for p in doc.paragraphs}
+    assert "Title" not in styles_used, (
+        f"block-dalet should not produce a Title-styled paragraph, got {styles_used}"
+    )
+    # The 'החלטה' text must still appear somewhere
+    texts = [p.text for p in doc.paragraphs]
+    assert any("החלטה" in t for t in texts)
+
+
+# ── Heading overrides, numbered-list, dash strip ──────────────────
+
+
+def test_strip_dashes_removes_em_and_en_dashes() -> None:
+    assert _strip_dashes("תכנית 1454198 — אושרה ביום") == "תכנית 1454198 אושרה ביום"
+    assert _strip_dashes("א – ב") == "א ב"
+    assert _strip_dashes("no dash") == "no dash"
+    # Collapsed whitespace
+    assert _strip_dashes("רקע — עובדתי") == "רקע עובדתי"
+
+
+def test_heading2_gets_justified_and_no_numbering() -> None:
+    """Section heading → Heading 2 with jc=both and numId=0."""
+    doc = Document()
+    _write_block_to_docx(doc, "block-vav", title="", content="דיון והכרעה")
+    heading = next(p for p in doc.paragraphs if p.style.name == "Heading 2")
+    pPr = heading._p.find(qn("w:pPr"))
+    jc = pPr.find(qn("w:jc"))
+    assert jc is not None and jc.get(qn("w:val")) == "both"
+    numPr = pPr.find(qn("w:numPr"))
+    assert numPr is not None
+    numId = numPr.find(qn("w:numId"))
+    assert numId is not None and numId.get(qn("w:val")) == "0"
+
+
+def test_heading3_gets_justified_not_centered() -> None:
+    """Heading 3 in template has jc=center — override to jc=both."""
+    doc = Document()
+    _write_block_to_docx(doc, "block-vav", title="", content="**המצב התכנוני**")
+    heading = next(p for p in doc.paragraphs if p.style.name == "Heading 3")
+    jc = heading._p.find(qn("w:pPr")).find(qn("w:jc"))
+    assert jc is not None and jc.get(qn("w:val")) == "both"
+
+
+def test_numbered_paragraph_uses_list_paragraph_and_strips_prefix() -> None:
+    """'1. text' → List Paragraph style, literal '1. ' removed."""
+    doc = Document()
+    _write_block_to_docx(
+        doc, "block-vav", title="",
+        content="1. עניינו של ערר זה.\n2. שכונת נווה יעקב.",
+    )
+    lp = [p for p in doc.paragraphs if p.style.name == "List Paragraph"]
+    assert len(lp) == 2
+    assert lp[0].text.startswith("עניינו")
+    assert not lp[0].text.startswith("1.")
+    assert lp[1].text.startswith("שכונת")
+
+
+def test_body_content_has_no_em_dashes() -> None:
+    """Content with em-dashes is rendered without them."""
+    doc = Document()
+    _write_block_to_docx(
+        doc, "block-vav", title="",
+        content="3. תכנית 5924 — קובעת את שטחי הבנייה.",
+    )
+    texts = "\n".join(p.text for p in doc.paragraphs)
+    assert "—" not in texts
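Reviewer note (a sketch, not part of this commit): the implementation of `_strip_dashes` lives in `docx_exporter` and is not shown in this part of the diff. The function below is only one possible implementation that would satisfy the assertions in `test_strip_dashes_removes_em_and_en_dashes`. The name `strip_dashes` and the choice to leave newlines untouched are assumptions made for illustration.

```python
import re

# Em dash (U+2014) and en dash (U+2013), the two characters the test targets.
_DASHES = re.compile("[\u2014\u2013]")
# Runs of two or more spaces/tabs; newlines are deliberately not matched.
_SPACE_RUNS = re.compile(r"[ \t]{2,}")


def strip_dashes(text: str) -> str:
    """Replace em/en dashes with a space, then collapse the doubled spaces.

    Hypothetical stand-in for docx_exporter._strip_dashes.
    """
    without_dashes = _DASHES.sub(" ", text)
    return _SPACE_RUNS.sub(" ", without_dashes)


print(strip_dashes("a \u2014 b"))
```

Collapsing only spaces and tabs keeps paragraph breaks intact, which matters because block content in the tests is split on `\n`.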
114
scripts/.archive/extract_claims_8174.py
Normal file
@@ -0,0 +1,114 @@
#!/usr/bin/env python3
"""One-shot: extract appellant claims for case 8174-24.

The analyst (CMPA-13) finished but `extract_claims` timed out three times on
the main 25K-char appeal document, so we have only 19 committee/response
claims in DB and zero appellant claims. This script reruns extraction with
a higher timeout and parallel chunks.

Targets:
  • כתב ערר 18.12.24 (appeal, 25,474 chars): appellant claims
  • השלמת מסמכים תמ״א 38 (decision, 3,718 chars): supplementary appeal filing

After phase 1.1-1.3 lands, this script becomes obsolete.

Usage: /home/chaim/legal-ai/mcp-server/.venv/bin/python scripts/extract_claims_8174.py
"""

from __future__ import annotations

import asyncio
import json
import sys
import time
from pathlib import Path
from uuid import UUID

# Ensure we can import legal_mcp from this repo's mcp-server tree
sys.path.insert(0, str(Path(__file__).resolve().parent.parent / "mcp-server" / "src"))

from legal_mcp.services import claims_extractor, claude_session, db


# ── Patch claude_session to enforce a 30-min timeout floor ───────────
# The hard-coded timeout=120 in claims_extractor.extract_claims_with_ai is
# what kept failing. Force every claude_session call here to use at least
# 1800s (max() raises short timeouts; longer ones pass through unchanged).
_orig_query_json = claude_session.query_json
_orig_query = claude_session.query


def _patched_query_json(prompt: str, timeout: int = 120):
    return _orig_query_json(prompt, timeout=max(timeout, 1800))


def _patched_query(prompt: str, timeout: int = 120, max_turns: int = 1):
    return _orig_query(prompt, timeout=max(timeout, 1800), max_turns=max_turns)


claude_session.query_json = _patched_query_json
claude_session.query = _patched_query


CASE_NUMBER = "8174-24"

TARGETS = [
    # (doc_id, title hint, doc_type override, party_hint)
    ("655f96f7-d406-44ac-bb53-6b2c1ab2909c", "כתב ערר 18.12.24", "appeal", "יואל גולדמן"),
    ("13b4795a-4fb7-460e-bddf-a5d282a1a67f", "השלמת מסמכים תמ״א 38", "appeal", "יואל גולדמן"),
]


async def main() -> int:
    case = await db.get_case_by_number(CASE_NUMBER)
    if not case:
        print(f"ERROR: case {CASE_NUMBER} not found")
        return 1
    # str() keeps this working whether the driver returns a str or a UUID.
    case_id = UUID(str(case["id"]))
    print(f"=== Case {CASE_NUMBER} - {case['title']} ===")
    print()

    for doc_id, label, doc_type, party_hint in TARGETS:
        text = await db.get_document_text(UUID(doc_id))
        if not text:
            print(f"SKIP {label} - no extracted_text")
            continue

        chars = len(text)
        print(f"--- {label} ({chars:,} chars, doc_type={doc_type}) ---")
        t0 = time.monotonic()
        try:
            result = await claims_extractor.extract_and_store_claims(
                case_id=case_id,
                document_id=UUID(doc_id),
                text=text,
                doc_type=doc_type,
                party_hint=party_hint,
            )
        except Exception as e:
            print(f"  FAILED: {e}")
            continue
        dt = time.monotonic() - t0
        print(f"  done in {dt:.1f}s - {json.dumps(result, ensure_ascii=False)}")
        print()

    # Final tally
    pool = await db.get_pool()
    async with pool.acquire() as conn:
        rows = await conn.fetch(
            """SELECT party_role, claim_type, source_document, count(*) as n
               FROM claims WHERE case_id = $1
               GROUP BY 1, 2, 3 ORDER BY 1, 3""",
            case_id,
        )
    print("=== Final claims breakdown ===")
    total = 0
    for r in rows:
        n = r["n"]
        total += n
        print(f"  {r['party_role']:12} {r['claim_type']:10} ({n:3}) ← {r['source_document']}")
    print(f"  TOTAL: {total} claims")
    return 0


if __name__ == "__main__":
    sys.exit(asyncio.run(main()))
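The monkeypatch in this script raises every call's timeout to at least 1800 seconds. The same idea can be written once as a reusable wrapper. This is a minimal sketch, assuming the wrapped function accepts `timeout` as a keyword argument; `with_timeout_floor` and `fake_query` are hypothetical names for illustration and are not part of `claude_session`.

```python
import functools
from typing import Any, Callable


def with_timeout_floor(func: Callable[..., Any], floor: int) -> Callable[..., Any]:
    """Wrap func so its `timeout` keyword is never below `floor` seconds."""

    @functools.wraps(func)
    def wrapper(*args: Any, timeout: int = 120, **kwargs: Any) -> Any:
        # max() acts as a floor: short timeouts are raised, long ones kept.
        return func(*args, timeout=max(timeout, floor), **kwargs)

    return wrapper


# Hypothetical stand-in for a client call, used only to show the effect.
def fake_query(prompt: str, timeout: int = 120) -> dict:
    return {"prompt": prompt, "timeout": timeout}


patched = with_timeout_floor(fake_query, 1800)
```

One limitation of this sketch: a caller that passes `timeout` positionally would bypass the keyword handling, so it only suits call sites that use `timeout=...`.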
@@ -16,6 +16,14 @@
| `convert_decision_template.py` | python | Converts `data/training/טיוטת החלטה.dotx` → `skills/docx/decision_template.docx` for loading in python-docx | Run whenever the template is updated |
| `deploy-track-changes.sh` | bash | Syncs skills CMP↔CMPA, runs checks, and prints deploy instructions for the Track Changes architecture | Manual |
| `retrofit_case.py` | python | Retroactive retrofit: injects bookmarks into an existing file of a specific case and sets it as the active_draft | Manual (once per case) |
| `reembed_voyage.py` | python | Re-embeds all vectors in the DB with the model in `VOYAGE_MODEL` (after a model change). 5 tables, 1024 dimensions, batches of 100. See `docs/voyage-upgrades-plan.md` | Manual (after changing `VOYAGE_MODEL`) |
| `voyage_context3_poc.py` | python | POC #1: voyage-3 vs voyage-context-3 on one short ruling (קלמנוביץ, 63 chunks). Verdict: context-3 shows no consistent improvement | One-off benchmark, kept for reference |
| `voyage_context3_poc_long.py` | python | POC #2: voyage-context-3 on a long ruling (אהרון ברק, 219 chunks) with sliding windows. Verdict: context-3 does not improve on a large ruling | One-off benchmark, kept for reference |
| `voyage_multimodal_poc.py` | python | POC #3: voyage-multimodal-3 on an appraisal report (89 pages). Verdict: significant improvement for tables, plus 22 image-only pages that text OCR loses | One-off benchmark, ready for phase C |
| `voyage_rerank_judge_poc.py` | python | POC #4: voyage-3 vs rerank-2 vs context-3 on אהרון ברק, 18 queries, claude-haiku-4-5 as judge. Verdict: rerank-2 won with +9% mean@3 | One-off benchmark |
| `voyage_rerank_corpus_poc.py` | python | POC #5: voyage-3 vs rerank-2 on the full corpus (785 docs). Verdict: +4.5% mean@3 overall, +11.6% on P queries (practical) | One-off benchmark, confirmed phase B |
| `multimodal_backfill.py` | python | Backfills voyage-multimodal-3 page embeddings for documents of existing cases. Idempotent (skips by default), forces `MULTIMODAL_ENABLED=true` for the run, runs from the container. Phase C: see `docs/voyage-upgrades-plan.md` | Manual, per case (`python multimodal_backfill.py 8174-24 8137-24`) |
| `backfill_chunk_pages.py` | python | Backfills `page_number` on existing `document_chunks`. The legacy chunker did not track pages, so `page_number=NULL` blocks the multimodal hybrid boost (text+image join on the same page). Reads each PDF's direct text with PyMuPDF (no re-OCR and zero cost, per the script's docstring), computes page_offsets, and updates the chunks. Idempotent | Manual, per case (`python backfill_chunk_pages.py 8174-24 8137-24`) |

## `.archive/` directory: completed scripts

@@ -32,6 +40,7 @@
| `export-decision-docx.py` | Exports a decision to DOCX | MCP: `export_docx()` |
| `extract-citations.py` | Extracts case-law citations from block י (yod) | MCP service: `references_extractor.py` |
| `extract-claims.py` | Extracts claims from block ז (zayin) | MCP: `extract_claims()` + `claims_extractor.py` |
| `extract_claims_8174.py` | One-off: extracts the missing claims for case 8174-24 after the analyst's timeout (43 appellant claims added 30/04/26) | phase 1: `claude_session` async + 30min timeout + semantic chunking |
| `extract_all_google_vision.py` | Bulk OCR with Google Vision | MCP: `document_upload()` pipeline |
| `extract_originals.py` | Extracts text from PDF with Claude Opus | MCP service: `extractor.py` |
| `extract_originals_ocr.py` | Full OCR extraction from PDF | MCP service: `extractor.py` |
346
scripts/backfill_chunk_pages.py
Normal file
@@ -0,0 +1,346 @@
"""Backfill page_number on existing document_chunks (no re-OCR).

Why this exists: the legacy chunker did not track which page each chunk
came from. After the page-tracking fix, new uploads carry page_number
correctly, but existing chunks have ``page_number=NULL`` in the DB.
That blocks the multimodal hybrid retriever's text+image boost (which
joins (chunk, image) on (document_id, page_number)).

What it does (per case, per document):

  1. Load stored ``documents.extracted_text`` from the DB. This is
     the exact text that was used to produce the existing chunks,
     so chunk content lookups against it match verbatim.
  2. Open the PDF with PyMuPDF and call ``page.get_text()`` on each
     page (cheap, no OCR). For pages with usable direct text we get
     a clean snippet; for fully-scanned pages we get little/nothing.
  3. Anchor: for each page with a usable snippet, search the snippet
     in ``extracted_text`` to recover that page's start offset.
  4. Interpolate: for OCR-only pages with no anchor, position is
     linearly interpolated between the nearest anchored neighbors
     (or uniformly when no anchors exist at all).
  5. For every chunk row (sorted by chunk_index), find the chunk's
     content in ``extracted_text`` (verbatim match), look up the
     page from the offsets, and ``UPDATE document_chunks SET
     page_number = ?``.

Idempotent: a second run with no --force is a no-op.

Cost: zero. Runs in seconds even for the 89-page appraisal report.

Usage:
    docker cp scripts/backfill_chunk_pages.py <c>:/tmp/
    docker exec <c> python /tmp/backfill_chunk_pages.py 8174-24 8137-24
"""
from __future__ import annotations

import argparse
import asyncio
import logging
import sys
import time
from pathlib import Path
from uuid import UUID


def _setup_paths():
    here = Path(__file__).resolve().parent
    mcp_src = here.parent / "mcp-server" / "src"
    if mcp_src.is_dir() and str(mcp_src) not in sys.path:
        sys.path.insert(0, str(mcp_src))


_setup_paths()
import fitz  # PyMuPDF  # noqa: E402
from legal_mcp.services import db  # noqa: E402

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s",
)
logger = logging.getLogger("backfill_chunk_pages")


# Snippet length for page anchoring. Long enough to be unique, short
# enough to survive minor whitespace variation between PyMuPDF direct
# extraction and the stored OCR text.
ANCHOR_SNIPPET_LEN = 80
# Minimum direct-text length on a page to attempt anchoring at all.
MIN_DIRECT_LEN = 60


def _resolve_local_path(db_path: str) -> Path:
    p = Path(db_path)
    if p.is_file():
        return p
    if str(p).startswith("/data/"):
        local = Path("/home/chaim/legal-ai") / Path(*p.parts[1:])
        if local.is_file():
            return local
    return p


def _norm_whitespace(s: str) -> str:
    """Collapse runs of whitespace; helps cross-source matching where
    PyMuPDF direct extraction may differ from the stored OCR text in
    line-break placement."""
    return " ".join(s.split())


def _find_anchored_snippet(
    extracted_text: str, snippet: str, search_start: int = 0,
) -> int:
    """Search for ``snippet`` in ``extracted_text``, tolerant to
    whitespace differences. Returns the offset in the original
    extracted_text, or -1."""
    # Direct match first: fastest path
    idx = extracted_text.find(snippet, search_start)
    if idx >= 0:
        return idx
    # Whitespace-normalized fallback (searches the whole text;
    # ``search_start`` only applies to the direct match above)
    norm_text = _norm_whitespace(extracted_text)
    norm_snip = _norm_whitespace(snippet)
    if not norm_snip:
        return -1
    norm_idx = norm_text.find(norm_snip)
    if norm_idx < 0:
        return -1
    # Map the normalized offset back to the original text: walk the
    # original and count the characters that survive normalization
    # (leading whitespace is dropped, each inner whitespace run counts once).
    norm_pos = 0
    in_ws = True  # start "inside" whitespace so a leading run is not counted
    for orig_pos, ch in enumerate(extracted_text):
        if ch.isspace():
            if not in_ws:
                norm_pos += 1
                in_ws = True
            continue
        if norm_pos == norm_idx:
            return orig_pos
        in_ws = False
        norm_pos += 1
    return -1


def _compute_page_offsets(pdf_path: Path, extracted_text: str) -> list[int]:
    """Return ``page_offsets`` (start char offset of each page in
    ``extracted_text``), using direct PyMuPDF reads for anchoring and
    linear interpolation for OCR-only pages."""
    doc = fitz.open(str(pdf_path))
    n_pages = len(doc)
    anchors: list[int | None] = [None] * n_pages

    last_pos = 0
    for i, page in enumerate(doc):
        direct = page.get_text().strip()
        if len(direct) < MIN_DIRECT_LEN:
            continue
        # Take the first ANCHOR_SNIPPET_LEN chars after stripping
        snippet = direct[:ANCHOR_SNIPPET_LEN]
        pos = _find_anchored_snippet(extracted_text, snippet, last_pos)
        if pos < 0:
            # try a global search before giving up
            pos = _find_anchored_snippet(extracted_text, snippet, 0)
        if pos >= 0:
            anchors[i] = pos
            last_pos = pos
    doc.close()

    # Nothing to anchor in an empty document
    if n_pages == 0:
        return []

    # Force first page to start at 0 if not already anchored
    if anchors[0] is None:
        anchors[0] = 0

    # Fill gaps via linear interpolation between the nearest anchors;
    # extrapolate beyond the last anchor by the average page length.
    page_offsets: list[int] = [0] * n_pages
    for i in range(n_pages):
        if anchors[i] is not None:
            page_offsets[i] = anchors[i]
            continue
        # Find prev anchored
        prev_i = i - 1
        while prev_i >= 0 and anchors[prev_i] is None:
            prev_i -= 1
        # Find next anchored
        next_i = i + 1
        while next_i < n_pages and anchors[next_i] is None:
            next_i += 1
        prev_pos = anchors[prev_i] if prev_i >= 0 else 0
        if next_i < n_pages:
            next_pos = anchors[next_i]
            ratio = (i - prev_i) / (next_i - prev_i)
            page_offsets[i] = int(prev_pos + ratio * (next_pos - prev_pos))
        else:
            # Extrapolate: assume a uniform page length beyond the last
            # anchor, estimated as total text length / n_pages.
            avg = len(extracted_text) / max(1, n_pages)
            page_offsets[i] = int(prev_pos + avg * (i - prev_i))
    # Monotone-clip just in case interpolation ever goes backwards
    for i in range(1, n_pages):
        if page_offsets[i] < page_offsets[i - 1]:
            page_offsets[i] = page_offsets[i - 1]
    return page_offsets


def _page_at_offset(offset: int, page_offsets: list[int]) -> int:
    if not page_offsets:
        return 1
    page = 1
    for i, start in enumerate(page_offsets):
        if start <= offset:
            page = i + 1
        else:
            break
    return page


async def _backfill_document(
    document_id: UUID,
    title: str,
    db_file_path: str,
    force: bool,
) -> dict:
    pool = await db.get_pool()

    chunks = await pool.fetch(
        "SELECT id, chunk_index, content, page_number FROM document_chunks "
        "WHERE document_id = $1 ORDER BY chunk_index",
        document_id,
    )
    if not chunks:
        return {"status": "no_chunks"}

    n_null = sum(1 for c in chunks if c["page_number"] is None)
    if not force and n_null == 0:
        logger.info("  skip (all %d chunks already tagged): %s", len(chunks), title)
        return {"status": "skipped", "chunks": len(chunks)}

    pdf_path = _resolve_local_path(db_file_path)
    if not pdf_path.is_file():
        logger.warning("  file missing: %s (%s)", pdf_path, title)
        return {"status": "missing"}
    if pdf_path.suffix.lower() != ".pdf":
        return {"status": "not_pdf"}

    doc_row = await pool.fetchrow(
        "SELECT extracted_text FROM documents WHERE id = $1", document_id,
    )
    extracted_text = doc_row["extracted_text"] if doc_row else None
    if not extracted_text:
        return {"status": "no_extracted_text"}

    t0 = time.time()
    page_offsets = _compute_page_offsets(pdf_path, extracted_text)
    # Approximation: counts pages whose start offset advances past the
    # previous page, so interpolated pages are included in the tally.
    n_anchored = sum(1 for i in range(len(page_offsets)) if i == 0 or page_offsets[i] > page_offsets[i - 1])

    # The chunker joins paragraphs with single `\n` while extracted_text
    # has `\n\n` between pages, so verbatim search misses cross-page
    # chunks. Use the whitespace-tolerant helper that returns an offset
    # in the *original* text.
    pos = 0
    updated = 0
    not_found = 0
    for c in chunks:
        content = c["content"]
        if not content:
            continue
        # Use a unique slice from the chunk to anchor in extracted_text;
        # the chunk's first ~120 chars are enough to disambiguate
        # across the document.
        snippet = content[: min(len(content), 120)]
        idx = _find_anchored_snippet(extracted_text, snippet, pos)
        if idx < 0:
            idx = _find_anchored_snippet(extracted_text, snippet, 0)
        if idx < 0:
            not_found += 1
            continue
        page = _page_at_offset(idx, page_offsets)
        await pool.execute(
            "UPDATE document_chunks SET page_number = $1 WHERE id = $2",
            page, c["id"],
        )
        updated += 1
        pos = idx + max(1, len(content) // 2)

    elapsed = time.time() - t0
    logger.info(
        "  %s: %d pages, %d anchors, updated %d/%d chunks (%d not found) in %.2fs",
        title, len(page_offsets), n_anchored, updated, len(chunks), not_found, elapsed,
    )
    return {
        "status": "ok",
        "elapsed_sec": round(elapsed, 2),
        "pages": len(page_offsets),
        "anchors": n_anchored,
        "chunks_total": len(chunks),
        "chunks_updated": updated,
        "chunks_not_found": not_found,
    }


async def backfill_cases(case_numbers: list[str], force: bool) -> dict:
    pool = await db.get_pool()
    summary: dict = {}
    for cn in case_numbers:
        logger.info("=" * 60)
        logger.info("Case %s", cn)
        case = await db.get_case_by_number(cn)
        if not case:
            logger.warning("Case not found: %s", cn)
            summary[cn] = {"status": "case_not_found"}
            continue
        case_id = UUID(str(case["id"]))
        docs = await pool.fetch(
            "SELECT id, title, file_path FROM documents WHERE case_id = $1 ORDER BY title",
            case_id,
        )
        logger.info("  %d documents", len(docs))
        per_doc: list[dict] = []
        for d in docs:
            r = await _backfill_document(
                UUID(str(d["id"])), d["title"], d["file_path"], force,
            )
            per_doc.append({"document_id": str(d["id"]), "title": d["title"], **r})
        summary[cn] = {
            "documents_total": len(docs),
            "ok": sum(1 for r in per_doc if r["status"] == "ok"),
            "skipped": sum(1 for r in per_doc if r["status"] == "skipped"),
            "missing": sum(1 for r in per_doc if r["status"] == "missing"),
            "no_chunks": sum(1 for r in per_doc if r["status"] == "no_chunks"),
            "no_extracted_text": sum(1 for r in per_doc if r["status"] == "no_extracted_text"),
            "chunks_updated": sum(r.get("chunks_updated", 0) for r in per_doc),
            "documents": per_doc,
        }
    return summary


def main():
    parser = argparse.ArgumentParser(description="Backfill page_number on existing chunks (no OCR)")
    parser.add_argument("cases", nargs="+", help="Case numbers (e.g. 8174-24 8137-24)")
    parser.add_argument(
        "--force", action="store_true",
        help="Re-process even if all chunks already have page_number (default: skip)",
    )
    args = parser.parse_args()

    summary = asyncio.run(backfill_cases(args.cases, force=args.force))
    print()
    print("=" * 60)
    print("SUMMARY")
    print("=" * 60)
    for cn, s in summary.items():
        if s.get("status") == "case_not_found":
            print(f"  {cn}: NOT FOUND")
            continue
        print(
            f"  {cn}: {s['documents_total']} docs - "
            f"ok {s['ok']}, skipped {s['skipped']}, "
            f"missing {s['missing']}, chunks_updated {s['chunks_updated']}"
        )


if __name__ == "__main__":
    main()
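The anchoring and lookup steps in `backfill_chunk_pages.py` rest on two small techniques: a whitespace-tolerant search that still returns an offset in the original text, and a lookup from offset to 1-based page number. The sketch below shows one way to write both. It is an illustration, not the script's code: `find_ws_tolerant` and `page_at_offset` are hypothetical names, the search builds an explicit index map instead of re-walking the text, and the page lookup uses `bisect` in place of a linear scan.

```python
from bisect import bisect_right


def find_ws_tolerant(text: str, snippet: str, start: int = 0) -> int:
    """Offset in `text` (at or after `start`) where `snippet` begins,
    treating every whitespace run as a single space; -1 if not found."""
    norm_chars: list[str] = []
    orig_index: list[int] = []  # orig_index[k] = offset in text of norm_chars[k]
    in_ws = False
    for i in range(start, len(text)):
        ch = text[i]
        if ch.isspace():
            # Keep one space per run, and none before the first real char.
            if norm_chars and not in_ws:
                norm_chars.append(" ")
                orig_index.append(i)
            in_ws = True
        else:
            norm_chars.append(ch)
            orig_index.append(i)
            in_ws = False
    needle = " ".join(snippet.split())
    if not needle:
        return -1
    k = "".join(norm_chars).find(needle)
    return orig_index[k] if k >= 0 else -1


def page_at_offset(offset: int, page_offsets: list[int]) -> int:
    """1-based page whose start offset is the last one <= offset.
    Assumes page_offsets is sorted in non-decreasing order."""
    if not page_offsets:
        return 1
    return max(1, bisect_right(page_offsets, offset))
```

The index map costs extra memory proportional to the text length, which seems acceptable for documents of the size mentioned here; the binary search only pays off when a document has many pages.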
186
scripts/multimodal_backfill.py
Normal file
@@ -0,0 +1,186 @@
"""Multimodal backfill: embed page images for existing case documents.

Iterates over documents already in the DB and renders + embeds + stores
per-page voyage-multimodal-3 vectors. Skips documents that already have
image embeddings (idempotent).

Independent of the processor pipeline: does NOT re-extract text or
re-chunk; only the multimodal step.

Designed to run from inside the FastAPI/MCP container (where /data is
mounted and writable). Locally it requires sudo for the thumbnails dir
under /home/chaim/legal-ai/data/cases/...

Usage::

    # In container (Coolify):
    docker exec -it <legal-ai-container> python -m legal_mcp.cli \\
        multimodal_backfill --cases 8174-24 8137-24

    # Or as a script (sets MULTIMODAL_ENABLED=true automatically):
    /opt/api/mcp-server/.venv/bin/python /opt/api/scripts/multimodal_backfill.py 8174-24 8137-24
"""
from __future__ import annotations

import argparse
import asyncio
import logging
import os
import sys
import time
from pathlib import Path
from uuid import UUID


def _setup_paths():
    """Ensure mcp-server src is on path even when run as a standalone script."""
    here = Path(__file__).resolve().parent
    mcp_src = here.parent / "mcp-server" / "src"
    if mcp_src.is_dir() and str(mcp_src) not in sys.path:
        sys.path.insert(0, str(mcp_src))


_setup_paths()
# Force the flag on for this run regardless of env; backfill is the
# whole point of running this script. The deploy-time default stays off.
os.environ["MULTIMODAL_ENABLED"] = "true"

from legal_mcp import config  # noqa: E402
from legal_mcp.services import db, embeddings, extractor, processor  # noqa: E402

logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s [%(levelname)s] %(message)s",
)
logger = logging.getLogger("multimodal_backfill")


def _resolve_local_path(db_path: str) -> Path:
    """Map container path /data/... to host /home/chaim/legal-ai/data/...
    when running locally; pass-through when already absolute and present."""
    p = Path(db_path)
    if p.is_file():
        return p
    if str(p).startswith("/data/"):
        local = Path("/home/chaim/legal-ai") / Path(*p.parts[1:])
        if local.is_file():
            return local
    return p


async def _backfill_document(
    document_id: UUID,
    case_id: UUID,
    title: str,
    db_file_path: str,
    skip_if_exists: bool,
) -> dict:
    pool = await db.get_pool()
    if skip_if_exists:
        existing = await pool.fetchval(
            "SELECT count(*) FROM document_image_embeddings WHERE document_id = $1",
            document_id,
        )
        if existing and existing > 0:
            logger.info("  skip (%d rows already): %s", existing, title)
            return {"status": "skipped", "rows": int(existing)}

    pdf_path = _resolve_local_path(db_file_path)
    if not pdf_path.is_file():
        logger.warning("  file missing: %s (%s)", pdf_path, title)
        return {"status": "missing"}
    if pdf_path.suffix.lower() != ".pdf":
        logger.info("  not a PDF, skipping: %s", title)
        return {"status": "not_pdf"}

    page_count = await pool.fetchval(
        "SELECT page_count FROM documents WHERE id = $1", document_id,
    )
    if not page_count:
        # Open to count
        import fitz
        d = fitz.open(str(pdf_path))
        page_count = len(d)
        d.close()

    logger.info("  embedding %s (%d pages)", title, page_count)
    t0 = time.time()
    result = await processor._embed_document_pages(
        document_id, case_id, pdf_path, page_count,
    )
    elapsed = time.time() - t0
    logger.info("  done in %.1fs: %s", elapsed, result)
    return {"status": "ok", "elapsed_sec": round(elapsed, 1), **result}


async def backfill_cases(case_numbers: list[str], skip_if_exists: bool = True) -> dict:
    """Embed page images for every PDF document in the given cases."""
    await db.init_schema()  # in case schema V9 hasn't been applied
    pool = await db.get_pool()
    summary: dict = {}
    for cn in case_numbers:
        logger.info("=" * 60)
        logger.info("Case %s", cn)
        case = await db.get_case_by_number(cn)
        if not case:
            logger.warning("Case not found: %s", cn)
            summary[cn] = {"status": "case_not_found"}
            continue
        case_id = UUID(str(case["id"]))
        docs = await pool.fetch(
            "SELECT id, title, file_path FROM documents WHERE case_id = $1 ORDER BY title",
            case_id,
        )
        logger.info("  %d documents", len(docs))
        per_doc: list[dict] = []
        for d in docs:
            doc_id = UUID(str(d["id"]))
            title = d["title"]
            r = await _backfill_document(
                doc_id, case_id, title, d["file_path"], skip_if_exists,
            )
            per_doc.append({"document_id": str(doc_id), "title": title, **r})
        summary[cn] = {
            "documents_total": len(docs),
            "embedded": sum(1 for r in per_doc if r["status"] == "ok"),
            "skipped": sum(1 for r in per_doc if r["status"] == "skipped"),
            "missing": sum(1 for r in per_doc if r["status"] == "missing"),
            "not_pdf": sum(1 for r in per_doc if r["status"] == "not_pdf"),
            "documents": per_doc,
        }
    return summary


def main():
    parser = argparse.ArgumentParser(description="Multimodal backfill for case documents")
    parser.add_argument(
        "cases", nargs="+", help="Case numbers to backfill (e.g. 8174-24 8137-24)"
    )
    parser.add_argument(
        "--re-embed", action="store_true",
        help="Re-embed even if image embeddings already exist (default: skip)",
    )
    args = parser.parse_args()

    logger.info("MULTIMODAL_MODEL=%s DPI=%d THUMB_DPI=%d",
                config.MULTIMODAL_MODEL, config.MULTIMODAL_DPI, config.MULTIMODAL_THUMB_DPI)
    summary = asyncio.run(
        backfill_cases(args.cases, skip_if_exists=not args.re_embed)
    )
    print()
    print("=" * 60)
    print("SUMMARY")
    print("=" * 60)
    for cn, s in summary.items():
        if s.get("status") == "case_not_found":
            print(f"  {cn}: NOT FOUND")
            continue
        print(
            f"  {cn}: {s['documents_total']} docs - "
            f"embedded {s['embedded']}, skipped {s['skipped']}, "
            f"missing {s['missing']}, non-pdf {s['not_pdf']}"
        )


if __name__ == "__main__":
    main()
170
scripts/reembed_voyage.py
Normal file
@@ -0,0 +1,170 @@
"""Re-embed all Voyage-stored vectors with the model in env VOYAGE_MODEL.

Use after changing VOYAGE_MODEL in env (e.g. voyage-law-2 → voyage-3).
The script reads each table that stores embeddings, batches the source
text through the new model (Voyage allows 128 inputs / call), and
UPDATEs the rows in place.

Tables touched:
  - document_chunks         (content)
  - paragraph_embeddings    (joined with decision_paragraphs.content)
  - case_law_embeddings     (chunk_text)
  - precedent_chunks        (content)
  - halachot                (rule_statement + reasoning_summary)

Run from the legal-ai venv with VOYAGE_API_KEY + VOYAGE_MODEL +
POSTGRES_* set in env (or ~/.env). Idempotent: safe to re-run.

Usage:
    /home/chaim/legal-ai/mcp-server/.venv/bin/python \\
        /home/chaim/legal-ai/scripts/reembed_voyage.py
"""
from __future__ import annotations

import asyncio
import os
import sys
import time

# Load ~/.env if present
ENV_PATH = os.path.expanduser("~/.env")
if os.path.isfile(ENV_PATH):
    with open(ENV_PATH) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                k, v = line.split("=", 1)
                os.environ.setdefault(k, v)

import asyncpg  # noqa: E402
import voyageai  # noqa: E402


VOYAGE_MODEL = os.environ.get("VOYAGE_MODEL", "voyage-3")
BATCH = 100  # Voyage allows 128, leave headroom for token limits

# (table label, source-text SELECT returning (id, text),
#  UPDATE SQL with $1=embedding $2=id)
TABLES = [
    (
        "document_chunks",
        "SELECT id, content FROM document_chunks WHERE content IS NOT NULL AND content <> ''",
        "UPDATE document_chunks SET embedding = $1 WHERE id = $2",
    ),
    (
        "paragraph_embeddings",
        # paragraph_embeddings stores the embedding only; text is in decision_paragraphs
        "SELECT pe.id, dp.content "
        "FROM paragraph_embeddings pe "
        "JOIN decision_paragraphs dp ON dp.id = pe.paragraph_id "
        "WHERE dp.content IS NOT NULL AND dp.content <> ''",
        "UPDATE paragraph_embeddings SET embedding = $1 WHERE id = $2",
    ),
    (
        "case_law_embeddings",
        "SELECT id, chunk_text FROM case_law_embeddings "
        "WHERE chunk_text IS NOT NULL AND chunk_text <> ''",
        "UPDATE case_law_embeddings SET embedding = $1 WHERE id = $2",
    ),
    (
        "precedent_chunks",
        "SELECT id, content FROM precedent_chunks WHERE content IS NOT NULL AND content <> ''",
        "UPDATE precedent_chunks SET embedding = $1 WHERE id = $2",
    ),
    (
        "halachot",
        # Embed rule_statement + reasoning_summary, matching the original
        # storage in halacha_extractor.extract().
        "SELECT id, "
        "  TRIM(BOTH ' —' FROM rule_statement || ' — ' || COALESCE(reasoning_summary, '')) "
        "  AS embed_text "
        "FROM halachot WHERE rule_statement IS NOT NULL AND rule_statement <> ''",
        "UPDATE halachot SET embedding = $1 WHERE id = $2",
    ),
]


|
async def embed_batch(client, texts: list[str]) -> list[list[float]]:
    """Embed texts with Voyage using input_type='document' (for storage).

    Note: client.embed() is a synchronous call, so it blocks the event loop
    while the request is in flight.
    """
    return client.embed(texts, model=VOYAGE_MODEL, input_type="document").embeddings
|
||||||
|
|
||||||
|
|
||||||
|
async def reembed_table(
    pool: asyncpg.Pool, voyage, label: str, select_sql: str, update_sql: str,
) -> dict:
    """Re-embed every row returned by select_sql and write it back with update_sql."""
    rows = await pool.fetch(select_sql)
    n = len(rows)
    print(f"\n[{label}] {n} rows")
    if n == 0:
        return {"table": label, "rows": 0, "elapsed": 0.0}
    start = time.time()
    done = 0
    for i in range(0, n, BATCH):
        batch_rows = rows[i:i + BATCH]
        texts = [r[1] for r in batch_rows]
        ids = [r[0] for r in batch_rows]
        try:
            embeddings = await embed_batch(voyage, texts)
        except Exception as e:
            # A failed batch is skipped, so its rows keep their old embeddings.
            print(f" [{label}] batch {i // BATCH} failed: {e}", file=sys.stderr)
            continue
        # Update each row of the batch inside one transaction.
        async with pool.acquire() as conn:
            async with conn.transaction():
                for emb, rid in zip(embeddings, ids):
                    # No pgvector codec is registered on the connection, so the
                    # embedding is passed in its text form, e.g. '[0.1, 0.2]'.
                    await conn.execute(update_sql, str(emb), rid)
        done += len(batch_rows)
        elapsed = time.time() - start
        print(f" [{label}] {done}/{n} ({done/n*100:.1f}%) "
              f"elapsed={elapsed:.0f}s rate={done/max(elapsed,0.1):.1f}/s")
    elapsed = time.time() - start
    # Report the rows actually updated; skipped batches are not counted.
    return {"table": label, "rows": done, "elapsed": elapsed}
|
||||||
|
|
||||||
|
|
||||||
|
async def main():
|
||||||
|
api_key = os.environ.get("VOYAGE_API_KEY")
|
||||||
|
if not api_key:
|
||||||
|
sys.exit("VOYAGE_API_KEY not set (export it or add to ~/.env)")
|
||||||
|
|
||||||
|
pg_host = os.environ.get("POSTGRES_HOST", "127.0.0.1")
|
||||||
|
pg_port = int(os.environ.get("POSTGRES_PORT", "5433"))
|
||||||
|
pg_user = os.environ.get("POSTGRES_USER", "legal_ai")
|
||||||
|
pg_pw = os.environ.get("POSTGRES_PASSWORD", "")
|
||||||
|
pg_db = os.environ.get("POSTGRES_DB", "legal_ai")
|
||||||
|
if not pg_pw:
|
||||||
|
sys.exit("POSTGRES_PASSWORD not set")
|
||||||
|
|
||||||
|
print(f"Re-embed all tables with model: {VOYAGE_MODEL}")
|
||||||
|
print(f"DB: {pg_user}@{pg_host}:{pg_port}/{pg_db}")
|
||||||
|
|
||||||
|
voyage = voyageai.Client(api_key=api_key)
|
||||||
|
pool = await asyncpg.create_pool(
|
||||||
|
host=pg_host, port=pg_port, user=pg_user,
|
||||||
|
password=pg_pw, database=pg_db,
|
||||||
|
min_size=1, max_size=4,
|
||||||
|
)
|
||||||
|
|
||||||
|
    # `await asyncpg.create_pool(...)` already returns an initialised pool, and
    # no pgvector codec is registered: embeddings are sent as text via str(emb)
    # in reembed_table().
|
||||||
|
|
||||||
|
summary = []
|
||||||
|
try:
|
||||||
|
for label, select_sql, update_sql in TABLES:
|
||||||
|
r = await reembed_table(pool, voyage, label, select_sql, update_sql)
|
||||||
|
summary.append(r)
|
||||||
|
finally:
|
||||||
|
await pool.close()
|
||||||
|
|
||||||
|
total_rows = sum(r["rows"] for r in summary)
|
||||||
|
total_time = sum(r["elapsed"] for r in summary)
|
||||||
|
print(f"\n{'=' * 60}\nDONE — {total_rows} rows in {total_time:.0f}s")
|
||||||
|
for r in summary:
|
||||||
|
print(f" {r['table']:30s} {r['rows']:>6} rows {r['elapsed']:>5.0f}s")
|
||||||
|
print(f"\nModel: {VOYAGE_MODEL}")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
asyncio.run(main())
|
||||||
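The batching and the text-form vector parameter used by `reembed_table` can be sketched in isolation. This is a minimal illustration rather than the script's own code: `batched` and `to_pgvector_literal` are hypothetical helper names, and the literal format follows the `'[0.1,0.2,...]'` form that the POC scripts in this change parse back with `parse_pgvector`.

```python
from typing import Iterator, Sequence

# Same headroom choice as the script: below the 128-input limit it cites.
BATCH = 100


def batched(items: Sequence, size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most `size` items."""
    for i in range(0, len(items), size):
        yield items[i:i + size]


def to_pgvector_literal(vec: Sequence[float]) -> str:
    """Format a vector in pgvector's text form, e.g. '[0.1,0.2]'."""
    return "[" + ",".join(repr(float(x)) for x in vec) + "]"


if __name__ == "__main__":
    # 250 rows split into batches of at most BATCH rows each.
    print([len(b) for b in batched(list(range(250)), BATCH)])
    print(to_pgvector_literal([0.1, 0.2, 3]))
```

Building the literal explicitly, instead of relying on `str(list)`, keeps the wire format independent of how Python prints lists; whether that matters in practice depends on the driver and server versions in use.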
182
scripts/voyage_context3_poc.py
Normal file
@@ -0,0 +1,182 @@
|
|||||||
|
"""POC: Compare voyage-3 vs voyage-context-3 retrieval on a single case.

Pulls all chunks of the case identified by CASE_ID (קלמנוביץ/לויתן,
בר"מ 25226-04-25), runs them through voyage-context-3 in a single
contextualized_embed call, then runs benchmark queries and compares
rankings against the existing voyage-3 embeddings (already in the DB).
|
||||||
|
|
||||||
|
No DB writes — all comparisons in memory. Output: ranking table for each
|
||||||
|
query showing top-10 from both models side-by-side.
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
/home/chaim/legal-ai/mcp-server/.venv/bin/python \\
|
||||||
|
/home/chaim/legal-ai/scripts/voyage_context3_poc.py
|
||||||
|
"""
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import math
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
|
import time
|
||||||
|
|
||||||
|
# Load ~/.env
|
||||||
|
ENV_PATH = os.path.expanduser("~/.env")
|
||||||
|
if os.path.isfile(ENV_PATH):
|
||||||
|
with open(ENV_PATH) as f:
|
||||||
|
for line in f:
|
||||||
|
line = line.strip()
|
||||||
|
if line and not line.startswith("#") and "=" in line:
|
||||||
|
k, v = line.split("=", 1)
|
||||||
|
os.environ.setdefault(k, v)
|
||||||
|
|
||||||
|
import asyncpg # noqa: E402
|
||||||
|
import voyageai # noqa: E402
|
||||||
|
|
||||||
|
|
||||||
|
# Using קלמנוביץ/לויתן (52K chars, 63 chunks, ~18K tokens)
|
||||||
|
# — fits in single context-3 call (32K token limit per inner list).
|
||||||
|
# אהרון ברק (60K tokens) requires splitting; we'll handle that after POC.
|
||||||
|
CASE_ID = "436efd48-c8ab-49f0-b3a9-52bf15ea806d" # בר"מ 25226-04-25
|
||||||
|
CONTEXT_MODEL = "voyage-context-3"
|
||||||
|
BASELINE_MODEL = "voyage-3" # already in DB
|
||||||
|
|
||||||
|
QUERIES = [
|
||||||
|
"סמכות ועדת ערר",
|
||||||
|
"פיצויים לפי סעיף 197",
|
||||||
|
"ירידת ערך מקרקעין",
|
||||||
|
"תכנית פוגעת",
|
||||||
|
"שיקול דעת ועדה מקומית",
|
||||||
|
"חוות דעת שמאי מכריע",
|
||||||
|
"מקרקעין גובלים",
|
||||||
|
"תקופת התיישנות תביעה",
|
||||||
|
"אינטרס ציבורי בתכנון",
|
||||||
|
"דחיית תביעת פיצויים",
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def parse_pgvector(s: str) -> list[float]:
    """pgvector text format: '[0.1,0.2,...]'."""
    return [float(x) for x in s.strip("[]").split(",")]
|
||||||
|
|
||||||
|
|
||||||
|
async def main():
|
||||||
|
api_key = os.environ["VOYAGE_API_KEY"]
|
||||||
|
pg_pw = os.environ["POSTGRES_PASSWORD"]
|
||||||
|
|
||||||
|
voyage = voyageai.Client(api_key=api_key)
|
||||||
|
|
||||||
|
pool = await asyncpg.create_pool(
|
||||||
|
host="127.0.0.1", port=5433, user="legal_ai",
|
||||||
|
password=pg_pw, database="legal_ai",
|
||||||
|
min_size=1, max_size=2,
|
||||||
|
)
|
||||||
|
|
||||||
|
# 1. Pull all chunks + their existing voyage-3 embeddings
|
||||||
|
rows = await pool.fetch("""
|
||||||
|
SELECT chunk_index, content, embedding::text AS emb_text
|
||||||
|
FROM precedent_chunks
|
||||||
|
WHERE case_law_id = $1
|
||||||
|
ORDER BY chunk_index
|
||||||
|
""", CASE_ID)
|
||||||
|
    print(f"[load] {len(rows)} chunks from case {CASE_ID}")
|
||||||
|
|
||||||
|
chunks = [r["content"] for r in rows]
|
||||||
|
indices = [r["chunk_index"] for r in rows]
|
||||||
|
baseline_embs = [parse_pgvector(r["emb_text"]) for r in rows]
|
||||||
|
|
||||||
|
# 2. Embed all chunks with voyage-context-3 — single contextualized call
|
||||||
|
total_chars = sum(len(c) for c in chunks)
|
||||||
|
print(f"[context] embedding {len(chunks)} chunks, {total_chars:,} chars total")
|
||||||
|
start = time.time()
|
||||||
|
result = voyage.contextualized_embed(
|
||||||
|
inputs=[chunks], # one document = one inner list
|
||||||
|
model=CONTEXT_MODEL,
|
||||||
|
input_type="document",
|
||||||
|
)
|
||||||
|
elapsed = time.time() - start
|
||||||
|
# ContextualizedEmbeddingsObject: result.results = list of per-document
|
||||||
|
# embeddings. result.results[0].embeddings = list of chunk embeddings.
|
||||||
|
context_embs = result.results[0].embeddings
|
||||||
|
total_tokens = getattr(result, "total_tokens", "?")
|
||||||
|
print(f"[context] done in {elapsed:.1f}s — total_tokens={total_tokens}")
|
||||||
|
assert len(context_embs) == len(chunks), "embedding count mismatch"
|
||||||
|
|
||||||
|
# 3. For each query — embed twice and compare top-10
|
||||||
|
print("\n" + "=" * 100)
|
||||||
|
print(f"{'Q':<3} {'baseline (voyage-3)':<48} {'context-3':<48}")
|
||||||
|
print("=" * 100)
|
||||||
|
|
||||||
|
rank_overlaps = []
|
||||||
|
score_lifts = []
|
||||||
|
|
||||||
|
for q_idx, query in enumerate(QUERIES, 1):
|
||||||
|
# Baseline query embedding (regular embed)
|
||||||
|
q_baseline = voyage.embed(
|
||||||
|
[query], model=BASELINE_MODEL, input_type="query"
|
||||||
|
).embeddings[0]
|
||||||
|
# Context query embedding — must use contextualized_embed even for
|
||||||
|
# single-string queries (regular embed() rejects voyage-context-3).
|
||||||
|
q_context = voyage.contextualized_embed(
|
||||||
|
inputs=[[query]],
|
||||||
|
model=CONTEXT_MODEL,
|
||||||
|
input_type="query",
|
||||||
|
).results[0].embeddings[0]
|
||||||
|
|
||||||
|
# Score every chunk under both models
|
||||||
|
scores_b = sorted(
|
||||||
|
[(cosine(q_baseline, e), i) for i, e in enumerate(baseline_embs)],
|
||||||
|
reverse=True,
|
||||||
|
)
|
||||||
|
scores_c = sorted(
|
||||||
|
[(cosine(q_context, e), i) for i, e in enumerate(context_embs)],
|
||||||
|
reverse=True,
|
||||||
|
)
|
||||||
|
|
||||||
|
top10_b = [i for _, i in scores_b[:10]]
|
||||||
|
top10_c = [i for _, i in scores_c[:10]]
|
||||||
|
|
||||||
|
# Compute overlap and avg score in top-3
|
||||||
|
overlap = len(set(top10_b) & set(top10_c))
|
||||||
|
avg_b_top3 = sum(s for s, _ in scores_b[:3]) / 3
|
||||||
|
avg_c_top3 = sum(s for s, _ in scores_c[:3]) / 3
|
||||||
|
rank_overlaps.append(overlap)
|
||||||
|
score_lifts.append(avg_c_top3 - avg_b_top3)
|
||||||
|
|
||||||
|
print(f"\n[Q{q_idx}] {query}")
|
||||||
|
print(f" overlap top-10: {overlap}/10 | avg score top-3: "
|
||||||
|
f"baseline={avg_b_top3:.3f} context-3={avg_c_top3:.3f} "
|
||||||
|
f"Δ={avg_c_top3 - avg_b_top3:+.3f}")
|
||||||
|
for rank in range(5):
|
||||||
|
sb, ib = scores_b[rank]
|
||||||
|
sc, ic = scores_c[rank]
|
||||||
|
cb = chunks[ib].replace("\n", " ").strip()[:50]
|
||||||
|
cc = chunks[ic].replace("\n", " ").strip()[:50]
|
||||||
|
print(f" #{rank+1} [{indices[ib]:3d}] {sb:.3f} {cb:<55} "
|
||||||
|
f"| [{indices[ic]:3d}] {sc:.3f} {cc}")
|
||||||
|
|
||||||
|
# Summary
|
||||||
|
print("\n" + "=" * 100)
|
||||||
|
print("SUMMARY")
|
||||||
|
print("=" * 100)
|
||||||
|
avg_overlap = sum(rank_overlaps) / len(rank_overlaps)
|
||||||
|
avg_lift = sum(score_lifts) / len(score_lifts)
|
||||||
|
print(f"Avg overlap top-10: {avg_overlap:.1f}/10 "
|
||||||
|
f"(higher = models agree more)")
|
||||||
|
print(f"Avg score lift top-3 (context - baseline): {avg_lift:+.4f}")
|
||||||
|
print(f"\nNote: cosine scores are not directly comparable across models.")
|
||||||
|
print(f"What matters more is which CHUNKS bubble to the top —")
|
||||||
|
print(f"reading the actual content above tells the real story.")
|
||||||
|
|
||||||
|
await pool.close()
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
asyncio.run(main())
|
||||||
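The comparison metric in this POC (top-k overlap between two rankings of the same chunks) can be shown on toy vectors. The sketch below is an illustration under assumed data: `top_k` and `overlap_at_k` are hypothetical names, and the two-dimensional vectors stand in for the embeddings of the same three chunks under two different models.

```python
import math


def cosine(a, b):
    """Cosine similarity; 0.0 if either vector has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(query_vec, doc_vecs, k):
    """Return the indices of the k documents most similar to query_vec."""
    scored = sorted(
        ((cosine(query_vec, d), i) for i, d in enumerate(doc_vecs)),
        reverse=True,
    )
    return [i for _, i in scored[:k]]


def overlap_at_k(rank_a, rank_b):
    """Number of document indices that appear in both top-k lists."""
    return len(set(rank_a) & set(rank_b))


if __name__ == "__main__":
    # Same three chunks embedded by two (toy) models.
    model_a_docs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    model_b_docs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    query = [1.0, 0.0]
    ranks_a = top_k(query, model_a_docs, 2)
    ranks_b = top_k(query, model_b_docs, 2)
    print(ranks_a, ranks_b, overlap_at_k(ranks_a, ranks_b))
```

Overlap only measures agreement between the two models; it does not say which ranking is better, which is why the script also prints the chunk text for manual inspection.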
238
scripts/voyage_context3_poc_long.py
Normal file
@@ -0,0 +1,238 @@
|
|||||||
|
"""POC #2: voyage-3 vs voyage-context-3 on a LONG case (אהרון ברק 403/17).
|
||||||
|
|
||||||
|
Case is 178K chars / 219 chunks / ~60K tokens — too big for a single
|
||||||
|
contextualized_embed call (32K token limit per inner list). We split the
|
||||||
|
chunks into overlapping sliding windows (~80 chunks each, ~22K tokens)
|
||||||
|
and merge: each chunk gets the embedding from the window where it sits
|
||||||
|
*most centrally* (max symmetric context on both sides).
|
||||||
|
|
||||||
|
The hypothesis: voyage-context-3 should shine here because the case is
|
||||||
|
full of internal references ("ראה לעיל סעיף 13", "להבדיל מעניין X",
|
||||||
|
"תוצאת הבחינה ב-בר"מ 1975/24 שנידונה לעיל"). voyage-3 embeds chunks
|
||||||
|
in isolation; context-3 sees ~80 surrounding chunks per embedding.
|
||||||
|
|
||||||
|
No DB writes. Output: side-by-side ranking comparison + summary.
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
/home/chaim/legal-ai/mcp-server/.venv/bin/python \\
|
||||||
|
/home/chaim/legal-ai/scripts/voyage_context3_poc_long.py
|
||||||
|
"""
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import math
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
|
import time
|
||||||
|
|
||||||
|
ENV_PATH = os.path.expanduser("~/.env")
|
||||||
|
if os.path.isfile(ENV_PATH):
|
||||||
|
with open(ENV_PATH) as f:
|
||||||
|
for line in f:
|
||||||
|
line = line.strip()
|
||||||
|
if line and not line.startswith("#") and "=" in line:
|
||||||
|
k, v = line.split("=", 1)
|
||||||
|
os.environ.setdefault(k, v)
|
||||||
|
|
||||||
|
import asyncpg # noqa: E402
|
||||||
|
import voyageai # noqa: E402
|
||||||
|
|
||||||
|
|
||||||
|
CASE_ID = "e151fc25-cf12-4563-b638-a86323f8413b" # 403/17 אהרון ברק (178K chars)
|
||||||
|
CONTEXT_MODEL = "voyage-context-3"
|
||||||
|
BASELINE_MODEL = "voyage-3"
|
||||||
|
|
||||||
|
# Sliding-window split params. With 219 chunks and ~60K tokens total
|
||||||
|
# (~275 tokens/chunk average), 3 windows of 80 chunks each is ~22K tokens
|
||||||
|
# per call — comfortably under 32K.
|
||||||
|
WINDOW_SIZE = 80
|
||||||
|
WINDOW_STRIDE = 70 # overlap = WINDOW_SIZE - WINDOW_STRIDE = 10
|
||||||
|
|
||||||
|
# Mix of:
|
||||||
|
# (a) generic queries (also tested in POC #1)
|
||||||
|
# (b) queries that require *internal* document context
|
||||||
|
QUERIES = [
|
||||||
|
# generic
|
||||||
|
"תכנית רחביה הוראות בנייה",
|
||||||
|
"פיצויים לפי סעיף 197 ירידת ערך",
|
||||||
|
"השפעת תכנית על שווי מקרקעין",
|
||||||
|
"סמכות ועדת ערר לדון בפיצויים",
|
||||||
|
"תוספת זכויות בנייה כפיצוי",
|
||||||
|
# internal-context — should benefit context-3
|
||||||
|
"ההבחנה בין השבחה לפיצויים",
|
||||||
|
"מה נקבע לגבי תמ\"א 38 בפסק הדין",
|
||||||
|
"ההלכה שנקבעה בעניין רובע 3",
|
||||||
|
"כלל הנטרול של זכויות תכנוניות",
|
||||||
|
"הסכמת השופט אלרון לחוות הדעת",
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def parse_pgvector(s: str) -> list[float]:
    return [float(x) for x in s.strip("[]").split(",")]


def build_windows(n: int, size: int, stride: int) -> list[tuple[int, int]]:
    """Return list of (start, end) ranges (end exclusive) covering 0..n.

    The last window ends at n exactly and may be shorter than size.
    Overlap = size - stride.
    """
    windows = []
    start = 0
    while start < n:
        end = min(start + size, n)
        windows.append((start, end))
        if end == n:
            break
        start += stride
    return windows


def assign_chunk_to_window(
    chunk_idx: int, windows: list[tuple[int, int]],
) -> int:
    """Pick the window where chunk_idx sits most centrally (max distance to
    the nearer edge). Ties go to the earliest such window."""
    best = -1
    best_score = -1
    for w_idx, (s, e) in enumerate(windows):
        if not (s <= chunk_idx < e):
            continue
        # symmetric distance: min(distance to s, distance to e-1)
        dist = min(chunk_idx - s, (e - 1) - chunk_idx)
        if dist > best_score:
            best_score = dist
            best = w_idx
    return best
|
||||||
|
|
||||||
|
|
||||||
|
async def main():
|
||||||
|
api_key = os.environ["VOYAGE_API_KEY"]
|
||||||
|
pg_pw = os.environ["POSTGRES_PASSWORD"]
|
||||||
|
|
||||||
|
voyage = voyageai.Client(api_key=api_key)
|
||||||
|
|
||||||
|
pool = await asyncpg.create_pool(
|
||||||
|
host="127.0.0.1", port=5433, user="legal_ai",
|
||||||
|
password=pg_pw, database="legal_ai",
|
||||||
|
min_size=1, max_size=2,
|
||||||
|
)
|
||||||
|
|
||||||
|
rows = await pool.fetch("""
|
||||||
|
SELECT chunk_index, content, embedding::text AS emb_text
|
||||||
|
FROM precedent_chunks
|
||||||
|
WHERE case_law_id = $1
|
||||||
|
ORDER BY chunk_index
|
||||||
|
""", CASE_ID)
|
||||||
|
n = len(rows)
|
||||||
|
print(f"[load] {n} chunks from אהרון ברק 403/17")
|
||||||
|
|
||||||
|
chunks = [r["content"] for r in rows]
|
||||||
|
indices = [r["chunk_index"] for r in rows]
|
||||||
|
baseline_embs = [parse_pgvector(r["emb_text"]) for r in rows]
|
||||||
|
|
||||||
|
# Build windows
|
||||||
|
windows = build_windows(n, WINDOW_SIZE, WINDOW_STRIDE)
|
||||||
|
print(f"[windows] {len(windows)} windows: "
|
||||||
|
f"{', '.join(f'[{s}:{e})' for s, e in windows)}")
|
||||||
|
|
||||||
|
# Embed each window with context-3
|
||||||
|
window_embs: list[list[list[float]]] = [] # [window][chunk_in_window][dim]
|
||||||
|
total_call_tokens = 0
|
||||||
|
total_start = time.time()
|
||||||
|
for w_idx, (s, e) in enumerate(windows):
|
||||||
|
sub_chunks = chunks[s:e]
|
||||||
|
sub_chars = sum(len(c) for c in sub_chunks)
|
||||||
|
start = time.time()
|
||||||
|
result = voyage.contextualized_embed(
|
||||||
|
inputs=[sub_chunks],
|
||||||
|
model=CONTEXT_MODEL,
|
||||||
|
input_type="document",
|
||||||
|
)
|
||||||
|
elapsed = time.time() - start
|
||||||
|
toks = getattr(result, "total_tokens", 0)
|
||||||
|
total_call_tokens += toks
|
||||||
|
print(f" [window {w_idx}] [{s}:{e}) — {len(sub_chunks)} chunks, "
|
||||||
|
f"{sub_chars:,} chars, {toks} tokens — {elapsed:.1f}s")
|
||||||
|
window_embs.append(result.results[0].embeddings)
|
||||||
|
total_elapsed = time.time() - total_start
|
||||||
|
print(f"[context] all windows done in {total_elapsed:.1f}s, "
|
||||||
|
f"{total_call_tokens} total tokens")
|
||||||
|
|
||||||
|
# Merge: for each chunk, pick the embedding from its most-central window
|
||||||
|
context_embs: list[list[float]] = []
|
||||||
|
chunk_window_choice = []
|
||||||
|
for i in range(n):
|
||||||
|
w_idx = assign_chunk_to_window(i, windows)
|
||||||
|
chunk_window_choice.append(w_idx)
|
||||||
|
s, _ = windows[w_idx]
|
||||||
|
context_embs.append(window_embs[w_idx][i - s])
|
||||||
|
print(f"[merge] window distribution: "
|
||||||
|
f"{[chunk_window_choice.count(j) for j in range(len(windows))]}")
|
||||||
|
|
||||||
|
# Run queries
|
||||||
|
print("\n" + "=" * 100)
|
||||||
|
print(f"{'Q':<3} {'baseline (voyage-3)':<48} {'context-3 (windowed)':<48}")
|
||||||
|
print("=" * 100)
|
||||||
|
|
||||||
|
rank_overlaps = []
|
||||||
|
for q_idx, query in enumerate(QUERIES, 1):
|
||||||
|
q_baseline = voyage.embed(
|
||||||
|
[query], model=BASELINE_MODEL, input_type="query"
|
||||||
|
).embeddings[0]
|
||||||
|
q_context = voyage.contextualized_embed(
|
||||||
|
inputs=[[query]],
|
||||||
|
model=CONTEXT_MODEL,
|
||||||
|
input_type="query",
|
||||||
|
).results[0].embeddings[0]
|
||||||
|
|
||||||
|
scores_b = sorted(
|
||||||
|
[(cosine(q_baseline, e), i) for i, e in enumerate(baseline_embs)],
|
||||||
|
reverse=True,
|
||||||
|
)
|
||||||
|
scores_c = sorted(
|
||||||
|
[(cosine(q_context, e), i) for i, e in enumerate(context_embs)],
|
||||||
|
reverse=True,
|
||||||
|
)
|
||||||
|
|
||||||
|
top10_b = [i for _, i in scores_b[:10]]
|
||||||
|
top10_c = [i for _, i in scores_c[:10]]
|
||||||
|
overlap = len(set(top10_b) & set(top10_c))
|
||||||
|
rank_overlaps.append(overlap)
|
||||||
|
|
||||||
|
print(f"\n[Q{q_idx}] {query}")
|
||||||
|
print(f" overlap top-10: {overlap}/10 | "
|
||||||
|
f"avg score top-3: baseline="
|
||||||
|
f"{sum(s for s, _ in scores_b[:3])/3:.3f} "
|
||||||
|
f"context-3={sum(s for s, _ in scores_c[:3])/3:.3f}")
|
||||||
|
for rank in range(5):
|
||||||
|
sb, ib = scores_b[rank]
|
||||||
|
sc, ic = scores_c[rank]
|
||||||
|
cb = chunks[ib].replace("\n", " ").strip()[:50]
|
||||||
|
cc = chunks[ic].replace("\n", " ").strip()[:50]
|
||||||
|
print(f" #{rank+1} [{indices[ib]:3d}] {sb:.3f} {cb:<55} "
|
||||||
|
f"| [{indices[ic]:3d}] {sc:.3f} {cc}")
|
||||||
|
|
||||||
|
print("\n" + "=" * 100)
|
||||||
|
print("SUMMARY")
|
||||||
|
print("=" * 100)
|
||||||
|
avg = sum(rank_overlaps) / len(rank_overlaps)
|
||||||
|
print(f"Avg overlap top-10: {avg:.1f}/10")
|
||||||
|
print(f"Per-query overlap: {rank_overlaps}")
|
||||||
|
print(f"Total context-3 tokens used: {total_call_tokens:,} "
|
||||||
|
f"(in {len(windows)} calls)")
|
||||||
|
print(f"\nNote: cosine across models not directly comparable. The")
|
||||||
|
print(f"meaningful test is *which chunks bubble to the top* — read")
|
||||||
|
print(f"the actual text above to judge relevance.")
|
||||||
|
|
||||||
|
await pool.close()
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
asyncio.run(main())
|
||||||
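The sliding-window split and the "most central window" merge rule can be checked without calling the API. The sketch below re-implements both helpers as an illustration and runs them with the numbers quoted in the script (219 chunks, window size 80, stride 70); `window_distribution` is a hypothetical helper added only for this example.

```python
def build_windows(n, size, stride):
    """Return (start, end) ranges, end exclusive, covering 0..n."""
    windows = []
    start = 0
    while start < n:
        end = min(start + size, n)
        windows.append((start, end))
        if end == n:
            break
        start += stride
    return windows


def assign_chunk_to_window(chunk_idx, windows):
    """Pick the window where the chunk is farthest from the nearer edge."""
    best, best_score = -1, -1
    for w_idx, (s, e) in enumerate(windows):
        if not (s <= chunk_idx < e):
            continue
        dist = min(chunk_idx - s, (e - 1) - chunk_idx)
        if dist > best_score:  # strict '>' keeps the earliest window on ties
            best, best_score = w_idx, dist
    return best


def window_distribution(n, windows):
    """Count how many chunks are assigned to each window."""
    counts = [0] * len(windows)
    for i in range(n):
        counts[assign_chunk_to_window(i, windows)] += 1
    return counts


if __name__ == "__main__":
    windows = build_windows(219, 80, 70)
    print(windows)
    print(window_distribution(219, windows))
```

Chunks in an overlap region are embedded twice (once per window) but only one embedding is kept, so the overlap costs extra tokens in exchange for more context around chunks near a window edge.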
213
scripts/voyage_multimodal_poc.py
Normal file
@@ -0,0 +1,213 @@
|
|||||||
|
"""POC #3: voyage-3 (text) vs voyage-multimodal-3 (page images) on a
|
||||||
|
real appraisal PDF (89 pages, full of tables / signatures / numerical
|
||||||
|
data — the corpus class where multimodal should help most).
|
||||||
|
|
||||||
|
Document under test:
|
||||||
|
baf10153-d2fc-4481-b250-9fe87440ce69
|
||||||
|
"נספח - שומה מכרעת (אבלין דוידזון שמאמא) - 15.09.24"
|
||||||
|
case 8137-24, 89 pages, 2.1 MB
|
||||||
|
|
||||||
|
The pipeline:
|
||||||
|
1. Pull the existing voyage-3 text-chunk embeddings from `document_chunks`.
|
||||||
|
2. Render each PDF page → PNG (PyMuPDF, dpi=144).
|
||||||
|
3. Embed all pages via voyage-multimodal-3.
|
||||||
|
4. Run benchmark queries (mix of generic + table-specific + visual)
|
||||||
|
against both: text top-K and page top-K.
|
||||||
|
|
||||||
|
The comparison is *qualitative* — text and image embeddings are
|
||||||
|
different "spaces" returning different ID types (chunk_id vs page_num).
|
||||||
|
What we look at is whether image-based retrieval surfaces tables,
|
||||||
|
signatures, or numerical data that text-only OCR loses.
|
||||||
|
|
||||||
|
No DB writes.
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
/home/chaim/legal-ai/mcp-server/.venv/bin/python \\
|
||||||
|
/home/chaim/legal-ai/scripts/voyage_multimodal_poc.py
|
||||||
|
"""
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import io
|
||||||
|
import math
|
||||||
|
import os
|
||||||
|
import time
|
||||||
|
|
||||||
|
ENV_PATH = os.path.expanduser("~/.env")
|
||||||
|
if os.path.isfile(ENV_PATH):
|
||||||
|
with open(ENV_PATH) as f:
|
||||||
|
for line in f:
|
||||||
|
line = line.strip()
|
||||||
|
if line and not line.startswith("#") and "=" in line:
|
||||||
|
k, v = line.split("=", 1)
|
||||||
|
os.environ.setdefault(k, v)
|
||||||
|
|
||||||
|
import asyncpg # noqa: E402
|
||||||
|
import voyageai # noqa: E402
|
||||||
|
import fitz # PyMuPDF # noqa: E402
|
||||||
|
from PIL import Image # noqa: E402
|
||||||
|
|
||||||
|
|
||||||
|
DOCUMENT_ID = "baf10153-d2fc-4481-b250-9fe87440ce69"
|
||||||
|
PDF_PATH = (
|
||||||
|
"/home/chaim/legal-ai/data/cases/8137-24/documents/originals/"
|
||||||
|
"נספח - שומה מכרעת (אבלין דוידזון שמאמא) - 15.09.24.pdf"
|
||||||
|
)
|
||||||
|
TEXT_MODEL = "voyage-3"
|
||||||
|
MULTIMODAL_MODEL = "voyage-multimodal-3" # check supported: 3.5 may not exist yet
|
||||||
|
DPI = 144
|
||||||
|
# voyage-multimodal: max 1000 inputs/call, 320M pixels/call (rough),
|
||||||
|
# so 89 pages at 1240×1750 ≈ 192M pixels = single call.
|
||||||
|
|
||||||
|
QUERIES = [
|
||||||
|
# generic-textual (both should handle)
|
||||||
|
"שיטת ההיוון בשומה",
|
||||||
|
"מתודולוגיית הערכת שווי",
|
||||||
|
# table/numerical (multimodal should help)
|
||||||
|
"טבלת השוואת ערכים לפני ואחרי התכנית",
|
||||||
|
"שווי המקרקעין במצב הקודם",
|
||||||
|
"שווי המקרקעין במצב החדש",
|
||||||
|
"ירידת ערך באחוזים",
|
||||||
|
# visual elements (text-only loses)
|
||||||
|
"חתימת השמאי",
|
||||||
|
"תרשים גוש וחלקה",
|
||||||
|
"מפת מיקום הנכס",
|
||||||
|
# context-heavy
|
||||||
|
"מסקנת השמאי המכריע",
|
||||||
|
"עקרון הצפיפות בתכנית",
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def cosine(a: list[float], b: list[float]) -> float:
|
||||||
|
dot = sum(x * y for x, y in zip(a, b))
|
||||||
|
na = math.sqrt(sum(x * x for x in a))
|
||||||
|
nb = math.sqrt(sum(y * y for y in b))
|
||||||
|
return dot / (na * nb) if na and nb else 0.0
|
||||||
|
|
||||||
|
|
||||||
|
def parse_pgvector(s: str) -> list[float]:
|
||||||
|
return [float(x) for x in s.strip("[]").split(",")]
|
||||||
|
|
||||||
|
|
||||||
|
def render_pdf_pages(pdf_path: str, dpi: int) -> list[Image.Image]:
    """Render each page → PIL.Image (RGB)."""
    doc = fitz.open(pdf_path)
    images: list[Image.Image] = []
    for page in doc:
        pix = page.get_pixmap(dpi=dpi)
        png_bytes = pix.tobytes("png")
        img = Image.open(io.BytesIO(png_bytes)).convert("RGB")
        images.append(img)
    doc.close()
    return images
|
||||||
|
|
||||||
|
|
||||||
|
async def main():
|
||||||
|
api_key = os.environ["VOYAGE_API_KEY"]
|
||||||
|
pg_pw = os.environ["POSTGRES_PASSWORD"]
|
||||||
|
|
||||||
|
voyage = voyageai.Client(api_key=api_key)
|
||||||
|
|
||||||
|
# 1. Render PDF pages
|
||||||
|
print(f"[render] {PDF_PATH}")
|
||||||
|
start = time.time()
|
||||||
|
images = render_pdf_pages(PDF_PATH, DPI)
|
||||||
|
elapsed = time.time() - start
|
||||||
|
print(f"[render] {len(images)} pages in {elapsed:.1f}s, "
|
||||||
|
f"{images[0].size}px @ {DPI}dpi")
|
||||||
|
|
||||||
|
# 2. Pull existing text chunks + voyage-3 embeddings
|
||||||
|
pool = await asyncpg.create_pool(
|
||||||
|
host="127.0.0.1", port=5433, user="legal_ai",
|
||||||
|
password=pg_pw, database="legal_ai",
|
||||||
|
min_size=1, max_size=2,
|
||||||
|
)
|
||||||
|
rows = await pool.fetch("""
|
||||||
|
SELECT id, chunk_index, page_number, content,
|
||||||
|
embedding::text AS emb_text
|
||||||
|
FROM document_chunks
|
||||||
|
WHERE document_id = $1
|
||||||
|
ORDER BY chunk_index
|
||||||
|
""", DOCUMENT_ID)
|
||||||
|
print(f"[text] {len(rows)} text chunks loaded (voyage-3 in DB)")
|
||||||
|
text_contents = [r["content"] for r in rows]
|
||||||
|
text_chunk_pages = [r["page_number"] for r in rows]
|
||||||
|
text_embs = [parse_pgvector(r["emb_text"]) for r in rows]
|
||||||
|
|
||||||
|
    # 3. Multimodal embed. There is no fallback model: on an invalid request
    # the script prints the error and exits.
    target_model = MULTIMODAL_MODEL
|
||||||
|
print(f"[multimodal] embedding {len(images)} pages with {target_model}…")
|
||||||
|
start = time.time()
|
||||||
|
try:
|
||||||
|
mm_result = voyage.multimodal_embed(
|
||||||
|
inputs=[[img] for img in images], # list of single-image inputs
|
||||||
|
model=target_model,
|
||||||
|
input_type="document",
|
||||||
|
truncation=True,
|
||||||
|
)
|
||||||
|
except voyageai.error.InvalidRequestError as e:
|
||||||
|
print(f" [error] {e}")
|
||||||
|
await pool.close()
|
||||||
|
return
|
||||||
|
elapsed = time.time() - start
|
||||||
|
image_embs = mm_result.embeddings
|
||||||
|
mm_tokens = getattr(mm_result, "total_tokens", "?")
|
||||||
|
image_tokens = getattr(mm_result, "image_pixels", "?")
|
||||||
|
text_tokens_mm = getattr(mm_result, "text_tokens", "?")
|
||||||
|
print(f"[multimodal] done in {elapsed:.1f}s — "
|
||||||
|
f"total_tokens={mm_tokens} text_tokens={text_tokens_mm} "
|
||||||
|
f"image_pixels={image_tokens}")
|
||||||
|
assert len(image_embs) == len(images), "embedding count mismatch"
|
||||||
|
print(f"[multimodal] embedding dim = {len(image_embs[0])}")
|
||||||
|
|
||||||
|
# 4. Run queries
|
||||||
|
print("\n" + "=" * 100)
|
||||||
|
print("QUERY RESULTS — top-5 chunks (text/voyage-3) "
|
||||||
|
"vs top-5 pages (multimodal)")
|
||||||
|
print("=" * 100)
|
||||||
|
|
||||||
|
for q_idx, query in enumerate(QUERIES, 1):
|
||||||
|
# Text-side: voyage-3 query embedding
|
||||||
|
q_text = voyage.embed(
|
||||||
|
[query], model=TEXT_MODEL, input_type="query"
|
||||||
|
).embeddings[0]
|
||||||
|
# Multimodal-side: same model, query input_type
|
||||||
|
q_mm = voyage.multimodal_embed(
|
||||||
|
inputs=[[query]],
|
||||||
|
model=target_model,
|
||||||
|
input_type="query",
|
||||||
|
).embeddings[0]
|
||||||
|
|
||||||
|
text_scores = sorted(
|
||||||
|
[(cosine(q_text, e), i) for i, e in enumerate(text_embs)],
|
||||||
|
reverse=True,
|
||||||
|
)[:5]
|
||||||
|
mm_scores = sorted(
|
||||||
|
[(cosine(q_mm, e), i) for i, e in enumerate(image_embs)],
|
||||||
|
reverse=True,
|
||||||
|
)[:5]
|
||||||
|
|
||||||
|
print(f"\n[Q{q_idx}] {query}")
|
||||||
|
print(f" --- text (voyage-3) top-5 ---")
|
||||||
|
for s, i in text_scores:
|
||||||
|
page = text_chunk_pages[i] if text_chunk_pages[i] else "?"
|
||||||
|
preview = text_contents[i].replace("\n", " ").strip()[:70]
|
||||||
|
print(f" {s:.3f} page={page:>3} chunk={i:>3} {preview}")
|
||||||
|
print(f" --- multimodal (image-only) top-5 ---")
|
||||||
|
for s, i in mm_scores:
|
||||||
|
print(f" {s:.3f} page={i+1:>3} (image)")
|
||||||
|
|
||||||
|
# Token / cost summary
|
||||||
|
print("\n" + "=" * 100)
|
||||||
|
print("SUMMARY")
|
||||||
|
print("=" * 100)
|
||||||
|
print(f"PDF: {len(images)} pages @ {DPI}dpi → {target_model}")
|
||||||
|
print(f"Total multimodal tokens: {mm_tokens}")
|
||||||
|
print(f"Embedding dim: {len(image_embs[0])}")
|
||||||
|
print(f"Time: {elapsed:.1f}s for full doc")
|
||||||
|
|
||||||
|
await pool.close()
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
asyncio.run(main())
|
||||||
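The single-call claim in the script's comment (89 pages at roughly 1240×1750 pixels against a per-call pixel budget) is simple arithmetic that can be made explicit. The sketch below is an illustration only: the 320M-pixel budget is the rough figure quoted in the script's comment and is not verified here, and `total_pixels` and `calls_needed` are hypothetical helper names.

```python
import math

# Rough per-call pixel budget quoted in the script's comment (unverified assumption).
PIXEL_BUDGET = 320_000_000


def total_pixels(pages: int, width_px: int, height_px: int) -> int:
    """Total pixel count when every page is rendered at the same size."""
    return pages * width_px * height_px


def calls_needed(pixels: int, budget: int = PIXEL_BUDGET) -> int:
    """Minimum number of calls if each call may carry at most `budget` pixels."""
    return max(1, math.ceil(pixels / budget))


if __name__ == "__main__":
    px = total_pixels(89, 1240, 1750)
    print(f"{px:,} pixels, {calls_needed(px)} call(s)")
```

Raising the DPI grows the pixel count quadratically, so a document that fits in one call at 144 dpi may need several calls at a higher resolution.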
318
scripts/voyage_rerank_corpus_poc.py
Normal file
@@ -0,0 +1,318 @@
|
|||||||
|
"""POC #5 — full precedent_library corpus benchmark.
|
||||||
|
|
||||||
|
Tests R1 (voyage-3) vs R2 (voyage-3 + rerank-2) on the *real* corpus that
|
||||||
|
search_precedent_library queries against:
|
||||||
|
|
||||||
|
precedent_chunks — 385 rows from 3 precedent cases
|
||||||
|
halachot — 400 rule statements with reasoning summaries
|
||||||
|
|
||||||
|
Total: 785 documents. The MCP tool merges results from both tables so the
|
||||||
|
benchmark mirrors production retrieval. R3 (context-3) is dropped — it
|
||||||
|
would require windowed re-embedding of 3 cases which we already proved
|
||||||
|
doesn't help (POC #2). The question now is: does rerank-2's +9% on a
|
||||||
|
single case generalize to a heterogeneous corpus?
|
||||||
|
|
||||||
|
Also measures end-to-end latency: pure voyage-3 vs voyage-3 + rerank.
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
/home/chaim/legal-ai/mcp-server/.venv/bin/python \\
|
||||||
|
/home/chaim/legal-ai/scripts/voyage_rerank_corpus_poc.py
|
||||||
|
"""
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import json
|
||||||
|
import math
|
||||||
|
import os
|
||||||
|
import re
|
||||||
|
import subprocess
|
||||||
|
import sys
|
||||||
|
import time
|
||||||
|
from collections import defaultdict
|
||||||
|
|
||||||
|
ENV_PATH = os.path.expanduser("~/.env")
|
||||||
|
if os.path.isfile(ENV_PATH):
|
||||||
|
with open(ENV_PATH) as f:
|
||||||
|
for line in f:
|
||||||
|
line = line.strip()
|
||||||
|
if line and not line.startswith("#") and "=" in line:
|
||||||
|
k, v = line.split("=", 1)
|
||||||
|
os.environ.setdefault(k, v)
|
||||||
|
|
||||||
|
import asyncpg # noqa: E402
|
||||||
|
import voyageai # noqa: E402
|
||||||
|
|
||||||
|
|
||||||
|
TEXT_MODEL = "voyage-3"
|
||||||
|
RERANK_MODEL = "rerank-2"
|
||||||
|
JUDGE_MODEL = "claude-haiku-4-5-20251001"
|
||||||
|
TOP_VEC = 50 # voyage-3 retrieve depth
|
||||||
|
TOP_K = 10 # final returned to "agent"
|
||||||
|
JUDGE_K = 5 # how many top results to actually judge per retriever
|
||||||
|
|
||||||
|
# 12 queries spanning typical use cases by Daphna's agents:
|
||||||
|
# precedent search for citing in decision blocks י-יא.
|
||||||
|
QUERIES = [
|
||||||
|
# K — keyword
|
||||||
|
("K1", "פיצויים לפי סעיף 197"),
|
||||||
|
("K2", "תמ\"א 38 והשבחה"),
|
||||||
|
("K3", "כלל הנטרול בשמאות"),
|
||||||
|
# C — conceptual
|
||||||
|
("C1", "תכלית היטל ההשבחה"),
|
||||||
|
("C2", "מה מקנה לבעלים זכות לפיצוי"),
|
||||||
|
("C3", "ההבחנה בין השבחה לפיצויים"),
|
||||||
|
# N — narrative / context-aware
|
||||||
|
("N1", "מה נקבע לגבי תמ\"א 38 בפסיקה"),
|
||||||
|
("N2", "ההלכה לעניין נטרול ציפיות"),
|
||||||
|
("N3", "תכנית פוגעת ושומה"),
|
||||||
|
# P — practical (drafting needs — what an agent typically asks)
|
||||||
|
("P1", "פסיקה שדנה בתכנית מתאר ארצית"),
|
||||||
|
("P2", "מתי מותר לוועדה לדחות פיצויים"),
|
||||||
|
("P3", "שיקול דעת הוועדה המקומית"),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def cosine(a, b):
|
||||||
|
dot = sum(x * y for x, y in zip(a, b))
|
||||||
|
na = math.sqrt(sum(x * x for x in a))
|
||||||
|
nb = math.sqrt(sum(y * y for y in b))
|
||||||
|
return dot / (na * nb) if na and nb else 0.0
|
||||||
|
|
||||||
|
|
||||||
|
def parse_pgvector(s):
|
||||||
|
return [float(x) for x in s.strip("[]").split(",")]
|
||||||
|
|
||||||
|
|
||||||
|
BATCH_JUDGE_PROMPT = """אתה שופט רלוונטיות במשפט ישראלי.
|
||||||
|
לפניך שאילתה ומספר פסקאות מפסקי דין/הלכות. דרג כל פסקה 1-5 לפי רלוונטיות.
|
||||||
|
|
||||||
|
5 — תשובה ישירה למה שנשאל
|
||||||
|
4 — מאד רלוונטי, מכיל מידע ליבה
|
||||||
|
3 — רלוונטי חלקית, נוגע בעקיפין
|
||||||
|
2 — מעט קשור, רעש סביב הנושא
|
||||||
|
1 — לא רלוונטי בכלל
|
||||||
|
|
||||||
|
השאילתה:
|
||||||
|
{query}
|
||||||
|
|
||||||
|
הפסקאות:
|
||||||
|
{chunks_block}
|
||||||
|
|
||||||
|
החזר JSON בלבד: {{"scores": {{"<id>": <1-5>, ...}}}}
|
||||||
|
ללא טקסט נוסף, ללא ```."""
|
||||||
|
|
||||||
|
|
||||||
|
def batch_judge(query: str, items: list[tuple[str, str]]) -> dict[str, int]:
|
||||||
|
"""Judge (id, text) pairs via claude CLI. Returns {id: score}."""
|
||||||
|
blocks = []
|
||||||
|
for cid, content in items:
|
||||||
|
snippet = content.replace("\n", " ").strip()[:1500]
|
||||||
|
blocks.append(f"<id={cid}>\n{snippet}\n</id>")
|
||||||
|
prompt = BATCH_JUDGE_PROMPT.format(
|
||||||
|
query=query, chunks_block="\n\n".join(blocks))
|
||||||
|
proc = subprocess.run(
|
||||||
|
["claude", "-p", "--model", JUDGE_MODEL],
|
||||||
|
input=prompt, capture_output=True, text=True, timeout=180,
|
||||||
|
)
|
||||||
|
out = proc.stdout.strip()
|
||||||
|
out = re.sub(r"^```(?:json)?\s*", "", out)
|
||||||
|
out = re.sub(r"\s*```$", "", out)
|
||||||
|
try:
|
||||||
|
data = json.loads(out)
|
||||||
|
raw = data.get("scores", {})
|
||||||
|
return {str(k): int(v) for k, v in raw.items()
|
||||||
|
if str(v).isdigit() and 1 <= int(v) <= 5}
|
||||||
|
except (json.JSONDecodeError, ValueError, TypeError) as e:
|
||||||
|
print(f" [judge parse fail: {e}; out={out[:200]!r}]")
|
||||||
|
return {}
|
||||||
|
|
||||||
|
|
||||||
|
async def main():
|
||||||
|
voyage_key = os.environ["VOYAGE_API_KEY"]
|
||||||
|
pg_pw = os.environ["POSTGRES_PASSWORD"]
|
||||||
|
|
||||||
|
try:
|
||||||
|
subprocess.run(["claude", "--version"], capture_output=True,
|
||||||
|
text=True, timeout=10, check=True)
|
||||||
|
except (subprocess.CalledProcessError, FileNotFoundError, subprocess.TimeoutExpired):
|
||||||
|
sys.exit("claude CLI not found")
|
||||||
|
|
||||||
|
voyage = voyageai.Client(api_key=voyage_key)
|
||||||
|
|
||||||
|
pool = await asyncpg.create_pool(
|
||||||
|
host="127.0.0.1", port=5433, user="legal_ai",
|
||||||
|
password=pg_pw, database="legal_ai",
|
||||||
|
min_size=1, max_size=2,
|
||||||
|
)
|
||||||
|
|
||||||
|
# Load full corpus: precedent_chunks + halachot
|
||||||
|
pc_rows = await pool.fetch("""
|
||||||
|
SELECT 'pc:' || id::text AS doc_id,
|
||||||
|
content,
|
||||||
|
embedding::text AS emb_text
|
||||||
|
FROM precedent_chunks
|
||||||
|
WHERE content IS NOT NULL AND embedding IS NOT NULL
|
||||||
|
""")
|
||||||
|
h_rows = await pool.fetch("""
|
||||||
|
SELECT 'h:' || id::text AS doc_id,
|
||||||
|
TRIM(BOTH ' —' FROM rule_statement || ' — ' ||
|
||||||
|
COALESCE(reasoning_summary, '')) AS content,
|
||||||
|
embedding::text AS emb_text
|
||||||
|
FROM halachot
|
||||||
|
WHERE rule_statement IS NOT NULL AND embedding IS NOT NULL
|
||||||
|
""")
|
||||||
|
all_rows = list(pc_rows) + list(h_rows)
|
||||||
|
print(f"[load] corpus: {len(pc_rows)} precedent_chunks + "
|
||||||
|
f"{len(h_rows)} halachot = {len(all_rows)} total")
|
||||||
|
|
||||||
|
doc_ids = [r["doc_id"] for r in all_rows]
|
||||||
|
contents = [r["content"] for r in all_rows]
|
||||||
|
embs = [parse_pgvector(r["emb_text"]) for r in all_rows]
|
||||||
|
|
||||||
|
# Latency measurement: 5 queries, time the two pipelines
|
||||||
|
print("\n[latency] measuring 5 sample queries…")
|
||||||
|
sample = QUERIES[:5]
|
||||||
|
r1_lat = []
|
||||||
|
r2_lat = []
|
||||||
|
for _, query in sample:
|
||||||
|
# R1: voyage-3 embed + cosine top-10
|
||||||
|
t0 = time.time()
|
||||||
|
q_emb = voyage.embed([query], model=TEXT_MODEL,
|
||||||
|
input_type="query").embeddings[0]
|
||||||
|
scores = sorted([(cosine(q_emb, e), i) for i, e in enumerate(embs)],
|
||||||
|
reverse=True)[:TOP_K]
|
||||||
|
r1_lat.append(time.time() - t0)
|
||||||
|
# R2: voyage-3 embed + cosine top-50 + rerank-2 → top-10
|
||||||
|
t0 = time.time()
|
||||||
|
q_emb = voyage.embed([query], model=TEXT_MODEL,
|
||||||
|
input_type="query").embeddings[0]
|
||||||
|
cands = sorted([(cosine(q_emb, e), i) for i, e in enumerate(embs)],
|
||||||
|
reverse=True)[:TOP_VEC]
|
||||||
|
cand_texts = [contents[i] for _, i in cands]
|
||||||
|
rr = voyage.rerank(query=query, documents=cand_texts,
|
||||||
|
model=RERANK_MODEL, top_k=TOP_K)
|
||||||
|
r2_lat.append(time.time() - t0)
|
||||||
|
print(f" R1 (voyage-3 only) avg={sum(r1_lat)/5*1000:.0f}ms"
|
||||||
|
f" min={min(r1_lat)*1000:.0f} max={max(r1_lat)*1000:.0f}")
|
||||||
|
print(f" R2 (voyage-3 + rerank-2) avg={sum(r2_lat)/5*1000:.0f}ms"
|
||||||
|
f" min={min(r2_lat)*1000:.0f} max={max(r2_lat)*1000:.0f}")
|
||||||
|
print(f" Δ (rerank overhead) avg={(sum(r2_lat)-sum(r1_lat))/5*1000:.0f}ms")
|
||||||
|
|
||||||
|
# Retrieval functions
|
||||||
|
def r1_baseline(query: str, k: int = TOP_K) -> list[int]:
|
||||||
|
q = voyage.embed([query], model=TEXT_MODEL,
|
||||||
|
input_type="query").embeddings[0]
|
||||||
|
scores = sorted([(cosine(q, e), i) for i, e in enumerate(embs)],
|
||||||
|
reverse=True)
|
||||||
|
return [i for _, i in scores[:k]]
|
||||||
|
|
||||||
|
def r2_rerank(query: str, k: int = TOP_K) -> list[int]:
|
||||||
|
cands = r1_baseline(query, k=TOP_VEC)
|
||||||
|
cand_texts = [contents[i] for i in cands]
|
||||||
|
rr = voyage.rerank(query=query, documents=cand_texts,
|
||||||
|
model=RERANK_MODEL, top_k=k)
|
||||||
|
return [cands[r.index] for r in rr.results]
|
||||||
|
|
||||||
|
retrievers = [("R1-voyage3", r1_baseline),
|
||||||
|
("R2-rerank2", r2_rerank)]
|
||||||
|
|
||||||
|
print(f"\n[judge] running {len(QUERIES)} queries × 2 retrievers, "
|
||||||
|
f"top-{JUDGE_K} judged…")
|
||||||
|
|
||||||
|
all_results = []
|
||||||
|
for qid, query in QUERIES:
|
||||||
|
print(f"\n[{qid}] {query}")
|
||||||
|
retr_results = {}
|
||||||
|
for r_name, r_fn in retrievers:
|
||||||
|
try:
|
||||||
|
retr_results[r_name] = r_fn(query, k=JUDGE_K)
|
||||||
|
except Exception as e:
|
||||||
|
print(f" {r_name}: FAILED — {e}")
|
||||||
|
retr_results[r_name] = []
|
||||||
|
union = sorted({i for top in retr_results.values() for i in top})
|
||||||
|
items = [(doc_ids[i], contents[i]) for i in union]
|
||||||
|
print(f" judging {len(items)} unique docs…")
|
||||||
|
scores_map = batch_judge(query, items)
|
||||||
|
for r_name, top in retr_results.items():
|
||||||
|
scores = [scores_map.get(doc_ids[i], 0) for i in top]
|
||||||
|
mean3 = sum(scores[:3]) / 3 if len(scores) >= 3 else 0
|
||||||
|
mean5 = sum(scores) / len(scores) if scores else 0
|
||||||
|
mrr = 0.0
|
||||||
|
for r, s in enumerate(scores):
|
||||||
|
if s >= 4:
|
||||||
|
mrr = 1.0 / (r + 1)
|
||||||
|
break
|
||||||
|
print(f" {r_name}: doc_ids={[doc_ids[i][:14] for i in top]} "
|
||||||
|
f"scores={scores} m@3={mean3:.2f} m@5={mean5:.2f} "
|
||||||
|
f"MRR={mrr:.3f}")
|
||||||
|
all_results.append({
|
||||||
|
"qid": qid, "category": qid[0], "query": query,
|
||||||
|
"retriever": r_name,
|
||||||
|
"doc_ids": [doc_ids[i] for i in top],
|
||||||
|
"scores": scores, "mean3": mean3, "mean5": mean5, "mrr": mrr,
|
||||||
|
})
|
||||||
|
|
||||||
|
# Aggregate
|
||||||
|
print("\n" + "=" * 100)
|
||||||
|
print(f"AGGREGATED RESULTS: full precedent_library corpus ({len(all_rows)} docs)")
|
||||||
|
print("=" * 100)
|
||||||
|
by_r = defaultdict(lambda: {"mean3": [], "mean5": [], "mrr": []})
|
||||||
|
by_cat_r = defaultdict(lambda: {"mean3": [], "mean5": [], "mrr": []})
|
||||||
|
for r in all_results:
|
||||||
|
by_r[r["retriever"]]["mean3"].append(r["mean3"])
|
||||||
|
by_r[r["retriever"]]["mean5"].append(r["mean5"])
|
||||||
|
by_r[r["retriever"]]["mrr"].append(r["mrr"])
|
||||||
|
ck = (r["category"], r["retriever"])
|
||||||
|
by_cat_r[ck]["mean3"].append(r["mean3"])
|
||||||
|
by_cat_r[ck]["mean5"].append(r["mean5"])
|
||||||
|
by_cat_r[ck]["mrr"].append(r["mrr"])
|
||||||
|
|
||||||
|
print(f"\nOverall ({len(QUERIES)} queries):")
|
||||||
|
print(f"{'retriever':<14} {'mean@3':>8} {'mean@5':>8} {'MRR':>8}")
|
||||||
|
avg = lambda xs: sum(xs) / len(xs) if xs else 0
|
||||||
|
for r_name, _ in retrievers:
|
||||||
|
m = by_r[r_name]
|
||||||
|
print(f"{r_name:<14} {avg(m['mean3']):>8.3f} "
|
||||||
|
f"{avg(m['mean5']):>8.3f} {avg(m['mrr']):>8.3f}")
|
||||||
|
# Improvement
|
||||||
|
r1m = avg(by_r["R1-voyage3"]["mean3"])
|
||||||
|
r2m = avg(by_r["R2-rerank2"]["mean3"])
|
||||||
|
if r1m > 0:
|
||||||
|
print(f"\nR2 vs R1 improvement: "
|
||||||
|
f"mean@3 {(r2m - r1m) / r1m * 100:+.1f}%")
|
||||||
|
|
||||||
|
print(f"\nBy category:")
|
||||||
|
print(f"{'cat':<3} {'retriever':<14} {'mean@3':>8} {'mean@5':>8} "
|
||||||
|
f"{'MRR':>8}")
|
||||||
|
for cat in ["K", "C", "N", "P"]:
|
||||||
|
for r_name, _ in retrievers:
|
||||||
|
m = by_cat_r[(cat, r_name)]
|
||||||
|
if not m["mean3"]:
|
||||||
|
continue
|
||||||
|
print(f"{cat:<3} {r_name:<14} {avg(m['mean3']):>8.3f} "
|
||||||
|
f"{avg(m['mean5']):>8.3f} {avg(m['mrr']):>8.3f}")
|
||||||
|
|
||||||
|
print(f"\nPer-query winner (highest mean@3):")
|
||||||
|
print(f"{'qid':<4} {'query':<40} {'winner':<14} {'scores'}")
|
||||||
|
by_q = defaultdict(list)
|
||||||
|
for r in all_results:
|
||||||
|
by_q[r["qid"]].append(r)
|
||||||
|
for qid, results in sorted(by_q.items()):
|
||||||
|
max_s = max(r["mean3"] for r in results)
|
||||||
|
winners = [r["retriever"] for r in results if r["mean3"] == max_s]
|
||||||
|
scores = " | ".join(f"{r['retriever'][:7]}={r['mean3']:.2f}"
|
||||||
|
for r in results)
|
||||||
|
q_str = next(q for qid_, q in QUERIES if qid_ == qid)[:38]
|
||||||
|
print(f"{qid:<4} {q_str:<40} {','.join(w[:8] for w in winners):<14} "
|
||||||
|
f"{scores}")
|
||||||
|
|
||||||
|
out_path = "/tmp/voyage_rerank_corpus_results.json"
|
||||||
|
with open(out_path, "w") as f:
|
||||||
|
json.dump(all_results, f, ensure_ascii=False, indent=2)
|
||||||
|
print(f"\nSaved to {out_path}")
|
||||||
|
|
||||||
|
await pool.close()
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
asyncio.run(main())
|
||||||
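The three per-retriever numbers reported by this benchmark (mean@3, mean@5 and MRR with a relevance cut-off of 4) are easy to misread from the inline loop, so here is a minimal standalone sketch of the same arithmetic. The function name `summarize` and the `relevant_at` parameter are assumptions made for illustration; the script computes these values inline.

```python
def summarize(scores, relevant_at=4):
    """Compute mean@3, mean@5 and reciprocal rank for one ranked score list.

    `scores` holds judge ratings in rank order (1-5, or 0 where the judge
    returned nothing for that document).
    """
    mean3 = sum(scores[:3]) / 3 if len(scores) >= 3 else 0.0
    mean5 = sum(scores) / len(scores) if scores else 0.0
    rr = 0.0
    for rank, s in enumerate(scores, start=1):
        if s >= relevant_at:
            rr = 1.0 / rank  # reciprocal rank of the first relevant hit
            break
    return {"mean3": mean3, "mean5": mean5, "mrr": rr}


# Judge ratings for one query's top-5 (made-up numbers).
example = summarize([3, 5, 2, 4, 1])
```

Note that a document the judge failed to score counts as 0, which pulls the means down; a run with many parse failures is therefore not comparable to a clean one.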
361
scripts/voyage_rerank_judge_poc.py
Normal file
@@ -0,0 +1,361 @@
|
|||||||
|
"""POC #4: Comprehensive retrieval benchmark with LLM-as-judge.
|
||||||
|
|
||||||
|
Compares 3 retrievers on אהרון ברק 403/17 (219 chunks):
|
||||||
|
R1 — voyage-3 (current production baseline)
|
||||||
|
R2 — voyage-3 + voyage-rerank-2 (retrieve 50, rerank, top-10)
|
||||||
|
R3 — voyage-context-3 (windowed, from POC #2)
|
||||||
|
|
||||||
|
Judges relevance with claude-haiku-4-5 — for each (query, chunk) pair the
|
||||||
|
judge returns 1-5. Aggregates: mean relevance@3 and @5, MRR (reciprocal rank of
|
||||||
|
first 4+ chunk), per-query winner.
|
||||||
|
|
||||||
|
18 queries grouped into 3 categories so we can see *which* query types
|
||||||
|
benefit from which retriever:
|
||||||
|
K — keyword/lexical (term-heavy, specific entity)
|
||||||
|
C — conceptual (abstract idea, principle)
|
||||||
|
N — narrative/contextual (requires document-internal reference)
|
||||||
|
|
||||||
|
Usage (key passed via env, NOT stored in script):
|
||||||
|
ANTHROPIC_API_KEY=... \\
|
||||||
|
/home/chaim/legal-ai/mcp-server/.venv/bin/python \\
|
||||||
|
/home/chaim/legal-ai/scripts/voyage_rerank_judge_poc.py
|
||||||
|
"""
|
||||||
|
from __future__ import annotations
|
||||||
|
|
||||||
|
import asyncio
|
||||||
|
import json
|
||||||
|
import math
|
||||||
|
import os
|
||||||
|
import sys
|
||||||
|
import time
|
||||||
|
from collections import defaultdict
|
||||||
|
|
||||||
|
ENV_PATH = os.path.expanduser("~/.env")
|
||||||
|
if os.path.isfile(ENV_PATH):
|
||||||
|
with open(ENV_PATH) as f:
|
||||||
|
for line in f:
|
||||||
|
line = line.strip()
|
||||||
|
if line and not line.startswith("#") and "=" in line:
|
||||||
|
k, v = line.split("=", 1)
|
||||||
|
os.environ.setdefault(k, v)
|
||||||
|
|
||||||
|
import re
|
||||||
|
import subprocess
|
||||||
|
|
||||||
|
import asyncpg # noqa: E402
|
||||||
|
import voyageai # noqa: E402
|
||||||
|
|
||||||
|
|
||||||
|
CASE_ID = "e151fc25-cf12-4563-b638-a86323f8413b" # אהרון ברק 403/17
|
||||||
|
TEXT_MODEL = "voyage-3"
|
||||||
|
CONTEXT_MODEL = "voyage-context-3"
|
||||||
|
RERANK_MODEL = "rerank-2"
|
||||||
|
JUDGE_MODEL = "claude-haiku-4-5-20251001"
|
||||||
|
|
||||||
|
WINDOW_SIZE = 80
|
||||||
|
WINDOW_STRIDE = 70
|
||||||
|
|
||||||
|
# 18 queries × 3 retrievers × top-5 = up to 270 judged (query, chunk) pairs, batched into 18 CLI calls. ~$0.05 with haiku.
|
||||||
|
QUERIES = [
|
||||||
|
# K — keyword/lexical
|
||||||
|
("K1", "תכנית רחביה הוראות בנייה"),
|
||||||
|
("K2", "תמ\"א 38"),
|
||||||
|
("K3", "תכנית 9988"),
|
||||||
|
("K4", "סעיף 197 לחוק התכנון והבניה"),
|
||||||
|
("K5", "השופט גרוסקופף"),
|
||||||
|
("K6", "ועדה מקומית ירושלים"),
|
||||||
|
# C — conceptual / abstract principles
|
||||||
|
("C1", "כלל הנטרול של זכויות תכנוניות"),
|
||||||
|
("C2", "אינטרס הציבור בתכנון"),
|
||||||
|
("C3", "תכלית היטל ההשבחה"),
|
||||||
|
("C4", "תכנית פוגעת לעומת תכנית משביחה"),
|
||||||
|
("C5", "ההבחנה בין השבחה לפיצויים"),
|
||||||
|
("C6", "מהותו של היטל ההשבחה"),
|
||||||
|
# N — narrative / context-dependent
|
||||||
|
("N1", "מה נקבע לגבי תמ\"א 38 בפסק הדין"),
|
||||||
|
("N2", "מסקנת בית המשפט בעניין רובע 3"),
|
||||||
|
("N3", "ההלכה שנקבעה בעניין שמעוני"),
|
||||||
|
("N4", "ההבדל בין המקרה שלפנינו לעניין רון"),
|
||||||
|
("N5", "סוף דבר ותוצאת פסק הדין"),
|
||||||
|
("N6", "הסכמת השופטים האחרים לחוות הדעת"),
|
||||||
|
]
|
||||||
|
|
||||||
|
|
||||||
|
def cosine(a, b):
|
||||||
|
dot = sum(x * y for x, y in zip(a, b))
|
||||||
|
na = math.sqrt(sum(x * x for x in a))
|
||||||
|
nb = math.sqrt(sum(y * y for y in b))
|
||||||
|
return dot / (na * nb) if na and nb else 0.0
|
||||||
|
|
||||||
|
|
||||||
|
def parse_pgvector(s):
|
||||||
|
return [float(x) for x in s.strip("[]").split(",")]
|
||||||
|
|
||||||
|
|
||||||
|
def build_windows(n, size, stride):
|
||||||
|
out = []
|
||||||
|
s = 0
|
||||||
|
while s < n:
|
||||||
|
e = min(s + size, n)
|
||||||
|
out.append((s, e))
|
||||||
|
if e == n:
|
||||||
|
break
|
||||||
|
s += stride
|
||||||
|
return out
|
||||||
|
|
||||||
|
|
||||||
|
def central_window(idx, windows):
|
||||||
|
best, best_d = -1, -1
|
||||||
|
for w_idx, (s, e) in enumerate(windows):
|
||||||
|
if not (s <= idx < e):
|
||||||
|
continue
|
||||||
|
d = min(idx - s, (e - 1) - idx)
|
||||||
|
if d > best_d:
|
||||||
|
best_d = d
|
||||||
|
best = w_idx
|
||||||
|
return best
|
||||||
|
|
||||||
|
|
||||||
|
BATCH_JUDGE_PROMPT = """אתה שופט רלוונטיות במשפט ישראלי.
|
||||||
|
לפניך שאילתה ומספר פסקאות מפסק דין. דרג כל פסקה בנפרד 1-5 לפי רלוונטיות.
|
||||||
|
|
||||||
|
סולם:
|
||||||
|
5 — תשובה ישירה ומדויקת לשאילתה
|
||||||
|
4 — מאד רלוונטי, מכיל מידע ליבה
|
||||||
|
3 — רלוונטי חלקית, נוגע בעקיפין בנושא
|
||||||
|
2 — מעט קשור, רעש סביב הנושא
|
||||||
|
1 — לא רלוונטי בכלל
|
||||||
|
|
||||||
|
השאילתה:
|
||||||
|
{query}
|
||||||
|
|
||||||
|
הפסקאות:
|
||||||
|
{chunks_block}
|
||||||
|
|
||||||
|
החזר JSON בלבד, בפורמט: {{"scores": {{"<id>": <1-5>, ...}}}}
|
||||||
|
ללא טקסט נוסף, ללא explanations, ללא ```."""
|
||||||
|
|
||||||
|
|
||||||
|
def batch_judge(query: str,
|
||||||
|
items: list[tuple[int, str]]) -> dict[int, int]:
|
||||||
|
"""Judge a list of (chunk_idx, content) pairs in a single CLI call.
|
||||||
|
|
||||||
|
Returns: dict[chunk_idx → score 1-5]; an empty dict if the judge output fails to parse.
|
||||||
|
"""
|
||||||
|
chunks_block_lines = []
|
||||||
|
for ci, content in items:
|
||||||
|
snippet = content.replace("\n", " ").strip()[:1500]
|
||||||
|
chunks_block_lines.append(f"<id={ci}>\n{snippet}\n</id>")
|
||||||
|
prompt = BATCH_JUDGE_PROMPT.format(
|
||||||
|
query=query,
|
||||||
|
chunks_block="\n\n".join(chunks_block_lines),
|
||||||
|
)
|
||||||
|
proc = subprocess.run(
|
||||||
|
["claude", "-p", "--model", JUDGE_MODEL],
|
||||||
|
input=prompt, capture_output=True, text=True, timeout=120,
|
||||||
|
)
|
||||||
|
out = proc.stdout.strip()
|
||||||
|
# Strip ```json fences if any
|
||||||
|
out = re.sub(r"^```(?:json)?\s*", "", out)
|
||||||
|
out = re.sub(r"\s*```$", "", out)
|
||||||
|
try:
|
||||||
|
data = json.loads(out)
|
||||||
|
raw = data.get("scores", {})
|
||||||
|
return {int(k): int(v) for k, v in raw.items()
|
||||||
|
if str(v).isdigit() and 1 <= int(v) <= 5}
|
||||||
|
except (json.JSONDecodeError, ValueError, TypeError) as e:
|
||||||
|
print(f" [judge parse fail: {e}; out={out[:200]!r}]")
|
||||||
|
return {}
|
||||||
|
|
||||||
|
|
||||||
|
async def main():
|
||||||
|
voyage_key = os.environ["VOYAGE_API_KEY"]
|
||||||
|
pg_pw = os.environ["POSTGRES_PASSWORD"]
|
||||||
|
|
||||||
|
# Verify Claude CLI is available (uses OAuth from ~/.claude/.credentials)
|
||||||
|
try:
|
||||||
|
subprocess.run(["claude", "--version"], capture_output=True,
|
||||||
|
text=True, timeout=10, check=True)
|
||||||
|
except (subprocess.CalledProcessError, FileNotFoundError, subprocess.TimeoutExpired):
|
||||||
|
sys.exit("claude CLI not found or not authenticated")
|
||||||
|
|
||||||
|
voyage = voyageai.Client(api_key=voyage_key)
|
||||||
|
|
||||||
|
# Load chunks + voyage-3 embeddings
|
||||||
|
pool = await asyncpg.create_pool(
|
||||||
|
host="127.0.0.1", port=5433, user="legal_ai",
|
||||||
|
password=pg_pw, database="legal_ai",
|
||||||
|
min_size=1, max_size=2,
|
||||||
|
)
|
||||||
|
rows = await pool.fetch("""
|
||||||
|
SELECT chunk_index, content, embedding::text AS emb_text
|
||||||
|
FROM precedent_chunks
|
||||||
|
WHERE case_law_id = $1
|
||||||
|
ORDER BY chunk_index
|
||||||
|
""", CASE_ID)
|
||||||
|
chunks = [r["content"] for r in rows]
|
||||||
|
chunk_indices = [r["chunk_index"] for r in rows]
|
||||||
|
baseline_embs = [parse_pgvector(r["emb_text"]) for r in rows]
|
||||||
|
n = len(chunks)
|
||||||
|
print(f"[load] {n} chunks loaded")
|
||||||
|
|
||||||
|
# Compute context-3 (windowed) embeddings — same as POC #2
|
||||||
|
windows = build_windows(n, WINDOW_SIZE, WINDOW_STRIDE)
|
||||||
|
print(f"[context-3] embedding {len(windows)} windows…")
|
||||||
|
win_embs = []
|
||||||
|
for s, e in windows:
|
||||||
|
result = voyage.contextualized_embed(
|
||||||
|
inputs=[chunks[s:e]],
|
||||||
|
model=CONTEXT_MODEL,
|
||||||
|
input_type="document",
|
||||||
|
)
|
||||||
|
win_embs.append(result.results[0].embeddings)
|
||||||
|
context_embs = []
|
||||||
|
for i in range(n):
|
||||||
|
w = central_window(i, windows)
|
||||||
|
s, _ = windows[w]
|
||||||
|
context_embs.append(win_embs[w][i - s])
|
||||||
|
print(f"[context-3] done")
|
||||||
|
|
||||||
|
# Retrieval functions
|
||||||
|
def r1_baseline(query: str, k: int = 10) -> list[int]:
|
||||||
|
q = voyage.embed([query], model=TEXT_MODEL,
|
||||||
|
input_type="query").embeddings[0]
|
||||||
|
scores = sorted(
|
||||||
|
[(cosine(q, e), i) for i, e in enumerate(baseline_embs)],
|
||||||
|
reverse=True,
|
||||||
|
)
|
||||||
|
return [i for _, i in scores[:k]]
|
||||||
|
|
||||||
|
def r2_rerank(query: str, k: int = 10) -> list[int]:
|
||||||
|
# 1) voyage-3 retrieve top-50
|
||||||
|
cands = r1_baseline(query, k=50)
|
||||||
|
cand_texts = [chunks[i] for i in cands]
|
||||||
|
# 2) voyage-rerank-2 over the 50
|
||||||
|
rr = voyage.rerank(
|
||||||
|
query=query, documents=cand_texts,
|
||||||
|
model=RERANK_MODEL, top_k=k,
|
||||||
|
)
|
||||||
|
# rr.results: list of RerankingResult(index=..., relevance_score=...)
|
||||||
|
# `index` refers to position in cand_texts → map back to chunk idx
|
||||||
|
return [cands[r.index] for r in rr.results]
|
||||||
|
|
||||||
|
def r3_context(query: str, k: int = 10) -> list[int]:
|
||||||
|
q = voyage.contextualized_embed(
|
||||||
|
inputs=[[query]],
|
||||||
|
model=CONTEXT_MODEL,
|
||||||
|
input_type="query",
|
||||||
|
).results[0].embeddings[0]
|
||||||
|
scores = sorted(
|
||||||
|
[(cosine(q, e), i) for i, e in enumerate(context_embs)],
|
||||||
|
reverse=True,
|
||||||
|
)
|
||||||
|
return [i for _, i in scores[:k]]
|
||||||
|
|
||||||
|
retrievers = [("R1-voyage3", r1_baseline),
|
||||||
|
("R2-rerank2", r2_rerank),
|
||||||
|
("R3-context3", r3_context)]
|
||||||
|
|
||||||
|
# Run all queries × all retrievers, judging top-5 per pair.
|
||||||
|
# Strategy: for each query, gather the union of all retrievers' top-K
|
||||||
|
# and judge them in ONE batched CLI call → 18 calls total instead of 270.
|
||||||
|
all_results = []
|
||||||
|
JUDGE_TOP_K = 5
|
||||||
|
print(f"\n[judge] running {len(QUERIES)} queries × "
|
||||||
|
f"{len(retrievers)} retrievers × top-{JUDGE_TOP_K} — batched per query…")
|
||||||
|
|
||||||
|
for qid, query in QUERIES:
|
||||||
|
print(f"\n[{qid}] {query}")
|
||||||
|
# Collect retrievals first
|
||||||
|
retr_results = {}
|
||||||
|
for r_name, r_fn in retrievers:
|
||||||
|
try:
|
||||||
|
retr_results[r_name] = r_fn(query, k=JUDGE_TOP_K)
|
||||||
|
except Exception as e:
|
||||||
|
print(f" {r_name}: FAILED — {e}")
|
||||||
|
retr_results[r_name] = []
|
||||||
|
# Union of unique chunk indices to judge
|
||||||
|
union = sorted({i for top in retr_results.values() for i in top})
|
||||||
|
items = [(i, chunks[i]) for i in union]
|
||||||
|
print(f" judging {len(items)} unique chunks via batch CLI…")
|
||||||
|
scores_map = batch_judge(query, items)
|
||||||
|
# Build per-retriever score lists
|
||||||
|
for r_name, top in retr_results.items():
|
||||||
|
scores = [scores_map.get(i, 0) for i in top]
|
||||||
|
mean3 = sum(scores[:3]) / 3 if len(scores) >= 3 else 0
|
||||||
|
mean5 = sum(scores) / len(scores) if scores else 0
|
||||||
|
mrr = 0.0
|
||||||
|
for r, s in enumerate(scores):
|
||||||
|
if s >= 4:
|
||||||
|
mrr = 1.0 / (r + 1)
|
||||||
|
break
|
||||||
|
print(f" {r_name}: chunks={[chunk_indices[i] for i in top]} "
|
||||||
|
f"scores={scores} mean@3={mean3:.2f} mean@5={mean5:.2f} "
|
||||||
|
f"MRR={mrr:.3f}")
|
||||||
|
all_results.append({
|
||||||
|
"qid": qid, "category": qid[0], "query": query,
|
||||||
|
"retriever": r_name,
|
||||||
|
"chunks": [chunk_indices[i] for i in top],
|
||||||
|
"scores": scores,
|
||||||
|
"mean3": mean3, "mean5": mean5, "mrr": mrr,
|
||||||
|
})
|
||||||
|
|
||||||
|
# Aggregate
|
||||||
|
print("\n" + "=" * 100)
|
||||||
|
print("AGGREGATED RESULTS")
|
||||||
|
print("=" * 100)
|
||||||
|
|
||||||
|
by_retriever = defaultdict(lambda: {"mean3": [], "mean5": [], "mrr": []})
|
||||||
|
by_cat_retriever = defaultdict(
|
||||||
|
lambda: {"mean3": [], "mean5": [], "mrr": []})
|
||||||
|
for r in all_results:
|
||||||
|
by_retriever[r["retriever"]]["mean3"].append(r["mean3"])
|
||||||
|
by_retriever[r["retriever"]]["mean5"].append(r["mean5"])
|
||||||
|
by_retriever[r["retriever"]]["mrr"].append(r["mrr"])
|
||||||
|
cat_key = (r["category"], r["retriever"])
|
||||||
|
by_cat_retriever[cat_key]["mean3"].append(r["mean3"])
|
||||||
|
by_cat_retriever[cat_key]["mean5"].append(r["mean5"])
|
||||||
|
by_cat_retriever[cat_key]["mrr"].append(r["mrr"])
|
||||||
|
|
||||||
|
print("\nOverall (across all 18 queries):")
|
||||||
|
print(f"{'retriever':<14} {'mean@3':>8} {'mean@5':>8} {'MRR':>8}")
|
||||||
|
for r_name, _ in retrievers:
|
||||||
|
m = by_retriever[r_name]
|
||||||
|
avg = lambda xs: sum(xs) / len(xs) if xs else 0
|
||||||
|
print(f"{r_name:<14} {avg(m['mean3']):>8.3f} "
|
||||||
|
f"{avg(m['mean5']):>8.3f} {avg(m['mrr']):>8.3f}")
|
||||||
|
|
||||||
|
print("\nBy category (K=keyword, C=conceptual, N=narrative):")
|
||||||
|
print(f"{'cat':<3} {'retriever':<14} {'mean@3':>8} {'mean@5':>8} {'MRR':>8}")
|
||||||
|
for cat in ["K", "C", "N"]:
|
||||||
|
for r_name, _ in retrievers:
|
||||||
|
m = by_cat_retriever[(cat, r_name)]
|
||||||
|
avg = lambda xs: sum(xs) / len(xs) if xs else 0
|
||||||
|
print(f"{cat:<3} {r_name:<14} {avg(m['mean3']):>8.3f} "
|
||||||
|
f"{avg(m['mean5']):>8.3f} {avg(m['mrr']):>8.3f}")
|
||||||
|
|
||||||
|
print("\nPer-query winner (highest mean@3, ties shown):")
|
||||||
|
print(f"{'qid':<4} {'query':<45} {'winner':<24} {'scores'}")
|
||||||
|
by_query = defaultdict(list)
|
||||||
|
for r in all_results:
|
||||||
|
by_query[r["qid"]].append(r)
|
||||||
|
for qid, results in sorted(by_query.items()):
|
||||||
|
max_score = max(r["mean3"] for r in results)
|
||||||
|
winners = [r["retriever"] for r in results if r["mean3"] == max_score]
|
||||||
|
scores = " | ".join(f"{r['retriever'][:7]}={r['mean3']:.2f}"
|
||||||
|
for r in results)
|
||||||
|
q_str = next(q for qid_, q in QUERIES if qid_ == qid)[:42]
|
||||||
|
print(f"{qid:<4} {q_str:<45} {','.join(w[:8] for w in winners):<24} "
|
||||||
|
f"{scores}")
|
||||||
|
|
||||||
|
# Save raw results to JSON for further analysis
|
||||||
|
out_path = "/tmp/voyage_rerank_judge_results.json"
|
||||||
|
with open(out_path, "w") as f:
|
||||||
|
json.dump(all_results, f, ensure_ascii=False, indent=2)
|
||||||
|
print(f"\nRaw results saved to {out_path}")
|
||||||
|
|
||||||
|
await pool.close()
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
asyncio.run(main())
|
||||||
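The windowing used for the context-3 embeddings is the least obvious part of this script: overlapping windows cover the chunk list, and each chunk takes its embedding from the window in which it sits farthest from an edge. The sketch below restates the two helpers with indentation restored and runs them on the sizes named in the script (219 chunks, window 80, stride 70). It is an illustration of the logic, not a replacement for the file.

```python
def build_windows(n, size, stride):
    """Cover range(n) with half-open windows of `size`, stepping by `stride`."""
    out = []
    s = 0
    while s < n:
        e = min(s + size, n)
        out.append((s, e))
        if e == n:
            break
        s += stride
    return out


def central_window(idx, windows):
    """Pick the window containing idx where idx is farthest from either edge."""
    best, best_d = -1, -1
    for w_idx, (s, e) in enumerate(windows):
        if not (s <= idx < e):
            continue
        d = min(idx - s, (e - 1) - idx)
        if d > best_d:
            best_d = d
            best = w_idx
    return best


windows = build_windows(219, 80, 70)
# windows == [(0, 80), (70, 150), (140, 219)]
owner = central_window(75, windows)  # chunk 75 lies in windows 0 and 1
```

Chunk 75 is 4 positions from the end of window 0 but 5 positions into window 1, so window 1 supplies its embedding; with a 10-chunk overlap every chunk has at least that much context on one side.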
@@ -291,6 +291,24 @@ description: This skill should be used when writing legal decisions (החלטו
|
|||||||
Instead of citing each judgment separately, Daphna refers to a decision that has already consolidated the case law: "With regard to [topic], we refer to the comprehensive analysis conducted by the appeals committee in appeal [name] (published in Nevo), from which it emerges that..." followed by a long block quote (200-500 words) from the consolidating decision, which includes the references to the relevant case law. The closing: "If so, at this time, the prevailing approach is that..."
|
||||||
|
|
||||||
|
|
||||||
|
### 7.5 Three case-law sources: do not confuse them

The system separates three case-law corpora. Each serves a different purpose and has its own MCP search tool:

| Corpus | Table | Search tool | Role |
|---|---|---|---|
| Daphna's precedents (style) | `style_corpus` + `paragraph_embeddings` | `search_decisions` | Decisions Daphna herself wrote. Source for style, phrasing, and her personal jurisprudence. |
| Authoritative case-law library | `case_law` (`source_kind='external_upload'`) + `halachot` | `search_precedent_library` | Binding external case law (Supreme Court, administrative courts, other appeals committees) with halachot approved by Daphna. **The only source for citations in block י under CREAC.** |
| Manually attached quotes | `case_precedents` | `precedent_search_library` | Quotes Daphna attached to a specific case in the past. Similar to the authoritative corpus, but per-case, manual, and not run through halacha extraction. |

**The standard flow in block י:**
1. `search_decisions` first: check whether Daphna has already ruled on a similar issue (doctrinal economy / distinguishing).
2. `search_precedent_library`: find the binding rule and the supporting citation for the CREAC paragraph.
3. If the parties cited case law that is not in the corpus, Daphna uploads it through `/precedents` in the UI; halacha extraction is automatic and the extracted halachot wait for her approval.

**No invented citations.** A case-law citation must come from one of the corpora. If no approved halacha supports a point, do not invent one; state that the topic requires adding case law to the corpus.
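The lookup order for block י and the no-invention rule described above can be sketched as a small control flow. This is a hedged illustration only: the source does not show the signatures or return shapes of the MCP tools, so the two search callables, the `doc_id` key and the status strings are all hypothetical stand-ins.

```python
from typing import Callable

# Hypothetical shape: a search takes a query string and returns a list of hits.
SearchFn = Callable[[str], list]


def find_citation(issue: str,
                  search_decisions: SearchFn,
                  search_precedent_library: SearchFn) -> dict:
    """Follow the block lookup order and never fabricate a citation."""
    # Step 1: has Daphna already ruled on a similar issue?
    own_rulings = search_decisions(issue)
    # Step 2: binding rule and supporting quote for the CREAC paragraph.
    binding = search_precedent_library(issue)
    if binding:
        return {"status": "ok", "own_rulings": own_rulings,
                "citation": binding[0]}
    # Step 3: nothing approved in the corpus, so flag the gap instead of inventing.
    return {"status": "needs_corpus_addition", "own_rulings": own_rulings,
            "citation": None}


# Stub searches standing in for the real MCP tools (assumed behaviour).
hit = find_citation("TAMA 38", lambda q: [], lambda q: [{"doc_id": "h:1"}])
miss = find_citation("unknown topic", lambda q: [], lambda q: [])
```

The point of the sketch is the last branch: when the authoritative corpus returns nothing, the result carries no citation at all and signals that case law has to be added first.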
|
||||||
|
|
||||||
|
|
||||||
## 8. Writing the summary / conclusion
|
||||||
|
|
||||||
### 8.1 Appeal dismissed
|
||||||
|
|||||||
@@ -13,6 +13,10 @@ const API_ORIGIN =
|
|||||||
const nextConfig: NextConfig = {
|
const nextConfig: NextConfig = {
|
||||||
output: "standalone",
|
output: "standalone",
|
||||||
|
|
||||||
|
experimental: {
|
||||||
|
proxyClientMaxBodySize: "100mb",
|
||||||
|
},
|
||||||
|
|
||||||
async rewrites() {
|
async rewrites() {
|
||||||
return [
|
return [
|
||||||
{
|
{
|
||||||
|
|||||||
@@ -246,3 +246,24 @@
|
|||||||
color: var(--color-navy);
|
color: var(--color-navy);
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
/* ── Status pill shimmer ──────────────────────────────────────────
|
||||||
|
* Indeterminate "in progress" indicator used by precedent-library
|
||||||
|
 * StatusPill while extraction is running. A soft vertical band slides
|
||||||
|
* left-to-right across the badge background. */
|
||||||
|
@keyframes ezer-shimmer {
|
||||||
|
0% { background-position: 200% 0; }
|
||||||
|
100% { background-position: -200% 0; }
|
||||||
|
}
|
||||||
|
|
||||||
|
.shimmer-active {
|
||||||
|
background-image: linear-gradient(
|
||||||
|
90deg,
|
||||||
|
transparent 0%,
|
||||||
|
rgba(168, 124, 58, 0.18) 50%,
|
||||||
|
transparent 100%
|
||||||
|
);
|
||||||
|
background-size: 200% 100%;
|
||||||
|
background-repeat: no-repeat;
|
||||||
|
animation: ezer-shimmer 1.6s linear infinite;
|
||||||
|
}
|
||||||
|
|||||||
@@ -1,17 +1,31 @@
|
|||||||
"use client";
|
"use client";
|
||||||
|
|
||||||
|
import { useMemo } from "react";
|
||||||
import Link from "next/link";
|
import Link from "next/link";
|
||||||
import { AppShell } from "@/components/app-shell";
|
import { AppShell } from "@/components/app-shell";
|
||||||
import { KPICards } from "@/components/cases/kpi-cards";
|
import { KPICards } from "@/components/cases/kpi-cards";
|
||||||
import { StatusDonut } from "@/components/cases/status-donut";
|
import { StatusDonut } from "@/components/cases/status-donut";
|
||||||
|
import { AppealTypeBars, subtypeOf } from "@/components/cases/appeal-type-bars";
|
||||||
import { CasesTable } from "@/components/cases/cases-table";
|
import { CasesTable } from "@/components/cases/cases-table";
|
||||||
import { Card, CardContent } from "@/components/ui/card";
|
import { Card, CardContent } from "@/components/ui/card";
|
||||||
import { Button } from "@/components/ui/button";
|
import { Button } from "@/components/ui/button";
|
||||||
import { useCases } from "@/lib/api/cases";
|
import { useCases, type Case } from "@/lib/api/cases";
|
||||||
|
|
||||||
export default function HomePage() {
|
export default function HomePage() {
|
||||||
const { data, isPending, error } = useCases(true);
|
const { data, isPending, error } = useCases(true);
|
||||||
|
|
||||||
|
const { permits, levies } = useMemo(() => {
|
||||||
|
const permits: Case[] = [];
|
||||||
|
const levies: Case[] = [];
|
||||||
|
(data ?? []).forEach((c) => {
|
||||||
|
const s = subtypeOf(c);
|
||||||
|
if (s === "building_permit") permits.push(c);
|
||||||
|
else if (s === "betterment_levy" || s === "compensation_197") levies.push(c);
|
||||||
|
else permits.push(c); // fallback bucket — keep visible
|
||||||
|
});
|
||||||
|
return { permits, levies };
|
||||||
|
}, [data]);
|
||||||
|
|
||||||
return (
|
return (
|
||||||
<AppShell>
|
<AppShell>
|
||||||
<section className="space-y-8">
|
<section className="space-y-8">
|
||||||
@@ -35,25 +49,70 @@ export default function HomePage() {
|
|||||||
|
|
||||||
<KPICards cases={data} loading={isPending} />
|
<KPICards cases={data} loading={isPending} />
|
||||||
|
|
||||||
<div className="grid gap-6 lg:grid-cols-[1fr_auto]">
|
<div className="grid gap-6 lg:grid-cols-[1fr_320px]">
|
||||||
|
<div className="space-y-6 min-w-0">
|
||||||
<Card className="bg-surface border-rule shadow-sm">
|
<Card className="bg-surface border-rule shadow-sm">
|
||||||
<CardContent className="px-6 py-5">
|
<CardContent className="px-6 py-5">
|
||||||
<div className="flex items-center justify-between mb-4">
|
<div className="flex items-center justify-between gap-3 mb-4 flex-wrap">
|
||||||
<h2 className="text-navy text-xl mb-0">רשימת תיקים</h2>
|
<div className="flex items-baseline gap-3">
|
||||||
|
<h2 className="text-navy text-xl mb-0">רישוי ובנייה</h2>
|
||||||
|
<span className="text-[0.72rem] uppercase tracking-[0.08em] text-ink-muted">
|
||||||
|
עררים 1xxx
|
||||||
|
</span>
|
||||||
|
</div>
|
||||||
<span className="text-[0.72rem] uppercase tracking-[0.08em] text-ink-muted">
|
<span className="text-[0.72rem] uppercase tracking-[0.08em] text-ink-muted">
|
||||||
מעודכן חי
|
מעודכן חי
|
||||||
</span>
|
</span>
|
||||||
</div>
|
</div>
|
||||||
<CasesTable cases={data} loading={isPending} error={error} />
|
<CasesTable
|
||||||
|
cases={permits}
|
||||||
|
loading={isPending}
|
||||||
|
error={error}
|
||||||
|
emptyText="אין תיקי רישוי פעילים"
|
||||||
|
searchPlaceholder="חיפוש בעררי רישוי…"
|
||||||
|
/>
|
||||||
</CardContent>
|
</CardContent>
|
||||||
</Card>
|
</Card>
|
||||||
|
|
||||||
<Card className="bg-surface border-rule shadow-sm lg:w-[320px]">
|
<Card className="bg-surface border-rule shadow-sm">
|
||||||
|
<CardContent className="px-6 py-5">
|
||||||
|
<div className="flex items-center justify-between gap-3 mb-4 flex-wrap">
|
||||||
|
<div className="flex items-baseline gap-3">
|
||||||
|
<h2 className="text-navy text-xl mb-0">היטל השבחה ופיצויים</h2>
|
||||||
|
<span className="text-[0.72rem] uppercase tracking-[0.08em] text-ink-muted">
|
||||||
|
עררים 8xxx · 9xxx
|
||||||
|
</span>
|
||||||
|
</div>
|
||||||
|
<span className="text-[0.72rem] uppercase tracking-[0.08em] text-ink-muted">
|
||||||
|
מעודכן חי
|
||||||
|
</span>
|
||||||
|
</div>
|
||||||
|
<CasesTable
|
||||||
|
cases={levies}
|
||||||
|
loading={isPending}
|
||||||
|
error={error}
|
||||||
|
emptyText="אין תיקי היטל השבחה או פיצויים פעילים"
|
||||||
|
searchPlaceholder="חיפוש בעררי השבחה ופיצויים…"
|
||||||
|
/>
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<aside className="space-y-6 lg:sticky lg:top-6 lg:self-start">
|
||||||
|
<Card className="bg-surface border-rule shadow-sm">
|
||||||
<CardContent className="px-6 py-5">
|
<CardContent className="px-6 py-5">
|
||||||
<h2 className="text-navy text-lg mb-4">פיזור סטטוסים</h2>
|
<h2 className="text-navy text-lg mb-4">פיזור סטטוסים</h2>
|
||||||
<StatusDonut cases={data} />
|
<StatusDonut cases={data} />
|
||||||
</CardContent>
|
</CardContent>
|
||||||
</Card>
|
</Card>
|
||||||
|
|
||||||
|
<Card className="bg-surface border-rule shadow-sm">
|
||||||
|
<CardContent className="px-6 py-5">
|
||||||
|
<h2 className="text-navy text-lg mb-4">פיזור לפי תחום</h2>
|
||||||
|
<AppealTypeBars cases={data} />
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
</aside>
|
||||||
</div>
|
</div>
|
||||||
</section>
|
</section>
|
||||||
</AppShell>
|
</AppShell>
|
||||||
|
|||||||
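Review note: the home page change above splits one case list into two tables inside `useMemo`. A minimal sketch of that bucketing as a pure function, outside React, may make the fallback rule easier to test. The `CaseLike` shape and the `classify` callback are hypothetical stand-ins for the real `Case` type and `subtypeOf`, whose definitions are not part of this diff.

```typescript
// Hypothetical stand-in for the real Case type; only the fields used here.
type CaseLike = { id: string; subtype?: string };

type Buckets<T> = { permits: T[]; levies: T[] };

// Mirrors the useMemo body: levy and compensation subtypes go to `levies`;
// everything else, including unknown subtypes, stays visible in `permits`.
function partitionCases<T>(
  cases: readonly T[] | undefined,
  classify: (c: T) => string | undefined,
): Buckets<T> {
  const permits: T[] = [];
  const levies: T[] = [];
  for (const c of cases ?? []) {
    const s = classify(c);
    if (s === "betterment_levy" || s === "compensation_197") levies.push(c);
    else permits.push(c);
  }
  return { permits, levies };
}

const sample: CaseLike[] = [
  { id: "a", subtype: "building_permit" },
  { id: "b", subtype: "betterment_levy" },
  { id: "c", subtype: "compensation_197" },
  { id: "d" }, // no subtype: falls back to the permits bucket
];
const buckets = partitionCases(sample, (c) => c.subtype);
console.log(buckets.permits.map((c) => c.id), buckets.levies.map((c) => c.id));
```

Passing the classifier in as an argument is only a convenience for the sketch; the component in the diff calls `subtypeOf` directly.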
170
web-ui/src/app/precedents/[id]/page.tsx
Normal file
@@ -0,0 +1,170 @@
|
|||||||
|
"use client";
|
||||||
|
|
||||||
|
import { use, useState } from "react";
|
||||||
|
import Link from "next/link";
|
||||||
|
import { Pencil } from "lucide-react";
|
||||||
|
import { AppShell } from "@/components/app-shell";
|
||||||
|
import { Card, CardContent } from "@/components/ui/card";
|
||||||
|
import { Button } from "@/components/ui/button";
|
||||||
|
import { Badge } from "@/components/ui/badge";
|
||||||
|
import { Skeleton } from "@/components/ui/skeleton";
|
||||||
|
import { usePrecedent } from "@/lib/api/precedent-library";
|
||||||
|
import { PrecedentEditSheet } from "@/components/precedents/precedent-edit-sheet";
|
||||||
|
import { ExtractedHalachotSection } from "@/components/precedents/extracted-halachot";
|
||||||
|
|
||||||
|
const PRACTICE_AREA_LABELS: Record<string, string> = {
|
||||||
|
rishuy_uvniya: "רישוי ובנייה",
|
||||||
|
betterment_levy: "היטל השבחה",
|
||||||
|
compensation_197: "פיצויים (197)",
|
||||||
|
};
|
||||||
|
|
||||||
|
const SOURCE_TYPE_LABELS: Record<string, string> = {
|
||||||
|
court_ruling: "פסק דין",
|
||||||
|
appeals_committee: "ועדת ערר",
|
||||||
|
};
|
||||||
|
|
||||||
|
/* Next 16 breaking change: route params are now a Promise.
|
||||||
|
* The `use()` hook unwraps them inside a client component. */
|
||||||
|
export default function PrecedentDetailPage({
|
||||||
|
params,
|
||||||
|
}: {
|
||||||
|
params: Promise<{ id: string }>;
|
||||||
|
}) {
|
||||||
|
const { id } = use(params);
|
||||||
|
const [editing, setEditing] = useState(false);
|
||||||
|
const { data, isPending, error } = usePrecedent(id);
|
||||||
|
|
||||||
|
return (
|
||||||
|
<AppShell>
|
||||||
|
<section className="space-y-6" dir="rtl">
|
||||||
|
<header>
|
||||||
|
<nav className="text-[0.78rem] text-ink-muted mb-1">
|
||||||
|
<Link href="/" className="hover:text-gold-deep">בית</Link>
|
||||||
|
<span aria-hidden> · </span>
|
||||||
|
<Link href="/precedents" className="hover:text-gold-deep">ספריית פסיקה</Link>
|
||||||
|
<span aria-hidden> · </span>
|
||||||
|
<span className="text-navy">פרטי פסיקה</span>
|
||||||
|
</nav>
|
||||||
|
</header>
|
||||||
|
|
||||||
|
{error ? (
|
||||||
|
<Card className="bg-danger-bg border-danger/40">
|
||||||
|
<CardContent className="px-6 py-6 text-center space-y-3">
|
||||||
|
<p className="text-danger font-semibold">שגיאה בטעינת הפסיקה</p>
|
||||||
|
<p className="text-sm text-ink-muted">{error.message}</p>
|
||||||
|
<Button asChild variant="outline">
|
||||||
|
<Link href="/precedents">חזרה לספרייה</Link>
|
||||||
|
</Button>
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
) : isPending || !data ? (
|
||||||
|
<div className="space-y-3">
|
||||||
|
{[...Array(5)].map((_, i) => <Skeleton key={i} className="h-16 w-full" />)}
|
||||||
|
</div>
|
||||||
|
) : (
|
||||||
|
<>
|
||||||
|
<Card className="bg-surface border-rule shadow-sm">
|
||||||
|
<CardContent className="px-6 py-5 space-y-4">
|
||||||
|
<div className="flex items-start justify-between gap-3 flex-wrap">
|
||||||
|
<div className="min-w-0 flex-1">
|
||||||
|
<h1 className="text-navy text-2xl font-semibold mb-1 leading-tight">
|
||||||
|
{data.case_name || "—"}
|
||||||
|
</h1>
|
||||||
|
<div className="text-ink-muted text-sm font-mono" dir="ltr">
|
||||||
|
{data.case_number}
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<Button variant="outline" size="sm" onClick={() => setEditing(true)}>
|
||||||
|
<Pencil className="w-3.5 h-3.5 me-1" /> ערוך פרטים
|
||||||
|
</Button>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div className="flex items-center gap-2 flex-wrap">
|
||||||
|
{data.practice_area ? (
|
||||||
|
<Badge variant="outline" className="text-[0.7rem]">
|
||||||
|
{PRACTICE_AREA_LABELS[data.practice_area] ?? data.practice_area}
|
||||||
|
</Badge>
|
||||||
|
) : null}
|
||||||
|
{data.source_type ? (
|
||||||
|
<Badge variant="outline" className="text-[0.7rem]">
|
||||||
|
{SOURCE_TYPE_LABELS[data.source_type] ?? data.source_type}
|
||||||
|
</Badge>
|
||||||
|
) : null}
|
||||||
|
{data.precedent_level ? (
|
||||||
|
<Badge variant="outline" className="text-[0.7rem]">
|
||||||
|
{data.precedent_level}
|
||||||
|
</Badge>
|
||||||
|
) : null}
|
||||||
|
{data.is_binding ? (
|
||||||
|
<Badge
|
||||||
|
variant="outline"
|
||||||
|
className="text-[0.7rem] bg-gold-wash text-gold-deep border-gold/40"
|
||||||
|
>
|
||||||
|
הלכה מחייבת
|
||||||
|
</Badge>
|
||||||
|
) : null}
|
||||||
|
{data.court ? (
|
||||||
|
<span className="text-[0.78rem] text-ink-muted">{data.court}</span>
|
||||||
|
) : null}
|
||||||
|
{data.date ? (
|
||||||
|
<span className="text-[0.78rem] text-ink-muted tabular-nums" dir="ltr">
|
||||||
|
{data.date.slice(0, 10)}
|
||||||
|
</span>
|
||||||
|
) : null}
|
||||||
|
</div>
|
||||||
|
|
||||||
|
{data.headnote ? (
|
||||||
|
<div>
|
||||||
|
<h3 className="text-navy text-sm font-semibold m-0 mb-1">Headnote</h3>
|
||||||
|
<p className="text-ink-soft text-sm leading-relaxed m-0">
|
||||||
|
{data.headnote}
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
) : null}
|
||||||
|
|
||||||
|
{data.summary ? (
|
||||||
|
<div>
|
||||||
|
<h3 className="text-navy text-sm font-semibold m-0 mb-1">תקציר</h3>
|
||||||
|
<p className="text-ink-soft text-sm leading-relaxed m-0 whitespace-pre-line">
|
||||||
|
{data.summary}
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
) : null}
|
||||||
|
|
||||||
|
{(data as { key_quote?: string }).key_quote ? (
|
||||||
|
<div>
|
||||||
|
<h3 className="text-navy text-sm font-semibold m-0 mb-1">ציטוט מרכזי</h3>
|
||||||
|
<blockquote className="text-ink-soft text-sm leading-relaxed border-r-2 border-gold pr-3 m-0">
|
||||||
|
{(data as { key_quote?: string }).key_quote}
|
||||||
|
</blockquote>
|
||||||
|
</div>
|
||||||
|
) : null}
|
||||||
|
|
||||||
|
{data.subject_tags?.length ? (
|
||||||
|
<div className="flex items-center gap-1 flex-wrap pt-1">
|
||||||
|
{data.subject_tags.map((t) => (
|
||||||
|
<Badge key={t} variant="outline" className="text-[0.65rem]">
|
||||||
|
{t}
|
||||||
|
</Badge>
|
||||||
|
))}
|
||||||
|
</div>
|
||||||
|
) : null}
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
|
||||||
|
<Card className="bg-surface border-rule shadow-sm">
|
||||||
|
<CardContent className="px-6 py-5">
|
||||||
|
<ExtractedHalachotSection halachot={data.halachot ?? []} />
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
</>
|
||||||
|
)}
|
||||||
|
|
||||||
|
<PrecedentEditSheet
|
||||||
|
caseLawId={editing ? id : null}
|
||||||
|
onOpenChange={(open) => setEditing(open)}
|
||||||
|
/>
|
||||||
|
</section>
|
||||||
|
</AppShell>
|
||||||
|
);
|
||||||
|
}
|
||||||
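Review note: the detail page above leans on two small display rules, a label lookup that falls back to the raw key and a date trimmed to its first ten characters. A sketch of both as plain functions follows. `labelFor` and `isoDay` are hypothetical helper names, not part of the diff, and `isoDay` assumes the API returns an ISO 8601 string.

```typescript
const PRACTICE_AREA_LABELS: Record<string, string> = {
  rishuy_uvniya: "רישוי ובנייה",
  betterment_levy: "היטל השבחה",
  compensation_197: "פיצויים (197)",
};

// Known keys map to a display label; unknown keys are shown raw rather than hidden.
function labelFor(labels: Record<string, string>, key: string): string {
  return labels[key] ?? key;
}

// Keeps only the YYYY-MM-DD prefix (assumption: the input is an ISO 8601 timestamp).
function isoDay(date: string): string {
  return date.slice(0, 10);
}

console.log(labelFor(PRACTICE_AREA_LABELS, "betterment_levy"));
console.log(labelFor(PRACTICE_AREA_LABELS, "some_new_area"));
console.log(isoDay("2024-05-17T09:30:00Z"));
```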
96
web-ui/src/app/precedents/page.tsx
Normal file
@@ -0,0 +1,96 @@
"use client";

import Link from "next/link";
import { AppShell } from "@/components/app-shell";
import { Card, CardContent } from "@/components/ui/card";
import { Tabs, TabsContent, TabsList, TabsTrigger } from "@/components/ui/tabs";
import { Badge } from "@/components/ui/badge";
import { LibraryListPanel } from "@/components/precedents/library-list-panel";
import { LibrarySearchPanel } from "@/components/precedents/library-search-panel";
import { HalachaReviewPanel } from "@/components/precedents/halacha-review-panel";
import { LibraryStatsPanel } from "@/components/precedents/library-stats-panel";
import { useHalachotPending } from "@/lib/api/precedent-library";

/**
 * Precedent Library admin page.
 *
 * Four tabs:
 *   - ספרייה — browse all uploaded precedents (filters + upload + delete)
 *   - חיפוש סמנטי — semantic search across halachot + chunks
 *   - ממתין לאישור — chair review queue (PRIMARY tab; halachot from
 *     auto-extraction must be approved before agents can use them)
 *   - סטטיסטיקה — counts and coverage
 *
 * Distinct from /training (style corpus = Daphna's voice) and the
 * per-case precedent attacher (chair-attached quotes scoped to a case).
 */

function PendingBadge() {
  const { data } = useHalachotPending();
  const n = data?.count ?? 0;
  if (!n) return null;
  return (
    <Badge
      variant="outline"
      className="ms-1 bg-gold-wash text-gold-deep border-gold/40 text-[0.65rem]"
    >
      {n}
    </Badge>
  );
}

export default function PrecedentsPage() {
  return (
    <AppShell>
      <section className="space-y-6">
        <header>
          <nav className="text-[0.78rem] text-ink-muted mb-1">
            <Link href="/" className="hover:text-gold-deep">בית</Link>
            <span aria-hidden> · </span>
            <span className="text-navy">ספריית פסיקה</span>
          </nav>
          <h1 className="text-navy mb-0">ספריית הפסיקה הסמכותית</h1>
          <p className="text-ink-muted text-sm mt-1 max-w-3xl">
            פסיקה חיצונית — פסקי דין של ערכאות עליונות והחלטות של ועדות ערר אחרות.
            כל קובץ עובר חילוץ הלכות אוטומטי, וההלכות ממתינות לאישור היו"ר לפני
            שהן זמינות לסוכני הכתיבה (legal-writer וכו').
          </p>
        </header>

        <div className="h-[2px] bg-gradient-to-l from-transparent via-gold to-transparent" />

        <Card className="bg-surface border-rule shadow-sm">
          <CardContent className="px-6 py-5">
            <Tabs defaultValue="library" dir="rtl">
              <TabsList className="bg-rule-soft/60">
                <TabsTrigger value="library">ספרייה</TabsTrigger>
                <TabsTrigger value="search">חיפוש סמנטי</TabsTrigger>
                <TabsTrigger value="review">
                  ממתין לאישור
                  <PendingBadge />
                </TabsTrigger>
                <TabsTrigger value="stats">סטטיסטיקה</TabsTrigger>
              </TabsList>

              <TabsContent value="library" className="mt-5">
                <LibraryListPanel />
              </TabsContent>

              <TabsContent value="search" className="mt-5">
                <LibrarySearchPanel />
              </TabsContent>

              <TabsContent value="review" className="mt-5">
                <HalachaReviewPanel />
              </TabsContent>

              <TabsContent value="stats" className="mt-5">
                <LibraryStatsPanel />
              </TabsContent>
            </Tabs>
          </CardContent>
        </Card>
      </section>
    </AppShell>
  );
}
|
||||||
128
web-ui/src/app/settings/_components/blocks-tab.tsx
Normal file
@@ -0,0 +1,128 @@
|
|||||||
|
"use client";
|
||||||
|
|
||||||
|
import { Layers, AlertCircle } from "lucide-react";
|
||||||
|
import { Card, CardContent } from "@/components/ui/card";
|
||||||
|
import { Skeleton } from "@/components/ui/skeleton";
|
||||||
|
import { Badge } from "@/components/ui/badge";
|
||||||
|
import { useMcpBlocks, type McpBlock } from "@/lib/api/settings";
|
||||||
|
|
||||||
|
const GEN_TYPE_LABEL: Record<string, string> = {
|
||||||
|
"template-fill": "מילוי תבנית",
|
||||||
|
"paraphrase": "פרפרזה",
|
||||||
|
"reproduction": "שעתוק",
|
||||||
|
"guided-synthesis": "סינתזה מודרכת",
|
||||||
|
"rhetorical-construction": "בניה רטורית",
|
||||||
|
};
|
||||||
|
|
||||||
|
const GEN_TYPE_TONE: Record<string, string> = {
|
||||||
|
"template-fill": "text-ink-muted border-rule",
|
||||||
|
"paraphrase": "text-info border-info/40",
|
||||||
|
"reproduction": "text-info border-info/40",
|
||||||
|
"guided-synthesis": "text-warn border-warn/40",
|
||||||
|
"rhetorical-construction": "text-gold-deep border-gold/40",
|
||||||
|
};
|
||||||
|
|
||||||
|
function BlockRow({ block }: { block: McpBlock }) {
|
||||||
|
const isLLM = block.model !== "script";
|
||||||
|
return (
|
||||||
|
<div className="rounded-md border border-rule p-4 bg-rule-soft/20 hover:bg-rule-soft/40 transition-colors">
|
||||||
|
<div className="flex items-start gap-3">
|
||||||
|
<div className="flex-shrink-0 w-10 h-10 rounded-md bg-navy/5 border border-navy/20 flex items-center justify-center">
|
||||||
|
<span className="text-navy text-sm font-semibold tabular-nums">
|
||||||
|
{block.index}
|
||||||
|
</span>
|
||||||
|
</div>
|
||||||
|
<div className="flex-1 min-w-0 space-y-2">
|
||||||
|
<div className="flex items-center gap-2 flex-wrap">
|
||||||
|
<h3 className="text-navy font-medium">{block.title}</h3>
|
||||||
|
<code dir="ltr" className="font-mono text-[0.72rem] text-ink-muted">
|
||||||
|
{block.id}
|
||||||
|
</code>
|
||||||
|
</div>
|
||||||
|
<div className="flex items-center gap-2 flex-wrap">
|
||||||
|
<Badge
|
||||||
|
variant="outline"
|
||||||
|
className={`text-[0.7rem] ${GEN_TYPE_TONE[block.gen_type] ?? ""}`}
|
||||||
|
>
|
||||||
|
{GEN_TYPE_LABEL[block.gen_type] ?? block.gen_type}
|
||||||
|
</Badge>
|
||||||
|
<Badge variant="outline" className="text-[0.7rem] font-mono" dir="ltr">
|
||||||
|
{block.model}
|
||||||
|
</Badge>
|
||||||
|
{isLLM && block.temperature !== null && (
|
||||||
|
<Badge variant="outline" className="text-[0.7rem]">
|
||||||
|
temp <span className="tabular-nums">{block.temperature}</span>
|
||||||
|
</Badge>
|
||||||
|
)}
|
||||||
|
{block.max_tokens !== null && (
|
||||||
|
<Badge variant="outline" className="text-[0.7rem]">
|
||||||
|
max <span className="tabular-nums">{block.max_tokens}</span>
|
||||||
|
</Badge>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
{(block.creac_role || block.jwm_purpose) && (
|
||||||
|
<div className="grid grid-cols-1 md:grid-cols-2 gap-x-4 gap-y-1 text-[0.78rem] text-ink-muted pt-1">
|
||||||
|
{block.creac_role && (
|
||||||
|
<div>
|
||||||
|
<span className="text-[0.7rem] uppercase tracking-wide me-1">
|
||||||
|
CREAC:
|
||||||
|
</span>
|
||||||
|
<span dir="ltr">{block.creac_role}</span>
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
{block.jwm_purpose && (
|
||||||
|
<div>
|
||||||
|
<span className="text-[0.7rem] uppercase tracking-wide me-1">
|
||||||
|
JWM:
|
||||||
|
</span>
|
||||||
|
<span dir="ltr">{block.jwm_purpose}</span>
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
export function BlocksTab() {
|
||||||
|
const { data, isPending, error } = useMcpBlocks();
|
||||||
|
|
||||||
|
if (isPending) return <Skeleton className="h-96 w-full" />;
|
||||||
|
if (error) {
|
||||||
|
return (
|
||||||
|
<Card className="bg-surface border-danger/40">
|
||||||
|
<CardContent className="p-6 flex items-center gap-3 text-danger">
|
||||||
|
<AlertCircle className="w-5 h-5" />
|
||||||
|
<span>שגיאה בטעינת בלוקים: {error.message}</span>
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
if (!data) return null;
|
||||||
|
|
||||||
|
return (
|
||||||
|
<div className="space-y-4">
|
||||||
|
<Card className="bg-surface border-rule">
|
||||||
|
<CardContent className="px-6 py-5">
|
||||||
|
<div className="flex items-center gap-2 mb-4 text-ink-muted text-sm">
|
||||||
|
<Layers className="w-4 h-4" />
|
||||||
|
<span>
|
||||||
|
ארכיטקטורת 12 הבלוקים של החלטת ועדת ערר. מקור הסכימה:{" "}
|
||||||
|
<code dir="ltr" className="font-mono text-[0.78rem]">
|
||||||
|
docs/block-schema.md
|
||||||
|
</code>
|
||||||
|
.
|
||||||
|
</span>
|
||||||
|
</div>
|
||||||
|
<div className="space-y-3">
|
||||||
|
{data.blocks.map((b) => (
|
||||||
|
<BlockRow key={b.id} block={b} />
|
||||||
|
))}
|
||||||
|
</div>
|
||||||
|
</CardContent>
|
||||||
|
</Card>
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
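Review note: `BlockRow` above decides which badges to show from three fields. A small sketch of that rule as a pure function is given here. `BlockLike` is a hypothetical subset of `McpBlock` (the real type lives in `@/lib/api/settings` and is not shown), and `blockBadges` is an illustrative name only.

```typescript
// Hypothetical subset of McpBlock; field names follow the diff above.
type BlockLike = {
  model: string;
  temperature: number | null;
  max_tokens: number | null;
};

// Returns the badge texts in render order: model, then temperature
// (LLM-backed blocks only), then the token cap.
function blockBadges(block: BlockLike): string[] {
  const isLLM = block.model !== "script";
  const badges = [block.model];
  // `!== null` rather than a truthiness check, so a temperature of 0 still shows.
  if (isLLM && block.temperature !== null) badges.push(`temp ${block.temperature}`);
  if (block.max_tokens !== null) badges.push(`max ${block.max_tokens}`);
  return badges;
}

console.log(blockBadges({ model: "script", temperature: 0.2, max_tokens: null }));
console.log(blockBadges({ model: "example-model", temperature: 0, max_tokens: 4096 }));
```

The model name `example-model` is a placeholder; any value other than `"script"` is treated as LLM-backed.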
39
web-ui/src/app/settings/_components/drift-badge.tsx
Normal file
@@ -0,0 +1,39 @@
"use client";

import { AlertTriangle, CheckCircle2, HelpCircle } from "lucide-react";
import { Badge } from "@/components/ui/badge";

type Props = {
  drift: boolean;
  // When false, Coolify was unreachable: drift state is unknown, not "synced".
  coolifyAvailable?: boolean;
};

export function DriftBadge({ drift, coolifyAvailable = true }: Props) {
  if (!coolifyAvailable) {
    return (
      <Badge
        variant="outline"
        className="text-ink-muted border-rule gap-1"
        title="Coolify לא זמין — מצב ה-drift לא ידוע"
      >
        <HelpCircle className="w-3 h-3" />
        Unknown
      </Badge>
    );
  }
  if (drift) {
    return (
      <Badge variant="outline" className="text-warn border-warn/40 gap-1">
        <AlertTriangle className="w-3 h-3" />
        Drift
      </Badge>
    );
  }
  return (
    <Badge variant="outline" className="text-success border-success/40 gap-1">
      <CheckCircle2 className="w-3 h-3" />
      Synced
    </Badge>
  );
}
|
||||||
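Review note: `DriftBadge` above is a three-state indicator, and the order of the checks matters: an unreachable Coolify must win over whatever `drift` says. A minimal sketch of that decision as a pure function, with the hypothetical names `DriftState` and `driftState`:

```typescript
type DriftState = "unknown" | "drift" | "synced";

// Availability is checked first: without Coolify data, `drift` carries no
// information, so the state is "unknown" rather than "synced".
function driftState(drift: boolean, coolifyAvailable = true): DriftState {
  if (!coolifyAvailable) return "unknown";
  return drift ? "drift" : "synced";
}

console.log(driftState(false, false));
console.log(driftState(true));
console.log(driftState(false));
```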
76
web-ui/src/app/settings/_components/env-var-editor.tsx
Normal file
@@ -0,0 +1,76 @@
"use client";

import { Input } from "@/components/ui/input";
import { Switch } from "@/components/ui/switch";
import {
  Select,
  SelectContent,
  SelectItem,
  SelectTrigger,
  SelectValue,
} from "@/components/ui/select";
import type { McpEnvVar } from "@/lib/api/settings";

type Props = {
  spec: McpEnvVar;
  value: string;
  onChange: (v: string) => void;
  disabled?: boolean;
};

export function EnvVarEditor({ spec, value, onChange, disabled }: Props) {
  if (spec.type === "bool") {
    const checked = value === "true";
    return (
      <Switch
        checked={checked}
        onCheckedChange={(c) => onChange(c ? "true" : "false")}
        disabled={disabled}
      />
    );
  }

  if (spec.enum_values && spec.enum_values.length > 0) {
    return (
      <Select value={value} onValueChange={onChange} disabled={disabled}>
        <SelectTrigger className="w-[220px]">
          <SelectValue />
        </SelectTrigger>
        <SelectContent>
          {spec.enum_values.map((v) => (
            <SelectItem key={v} value={v}>
              {v}
            </SelectItem>
          ))}
        </SelectContent>
      </Select>
    );
  }

  if (spec.type === "int" || spec.type === "float") {
    return (
      <Input
        type="number"
        value={value}
        onChange={(e) => onChange(e.target.value)}
        min={spec.min ?? undefined}
        max={spec.max ?? undefined}
        step={spec.type === "float" ? "0.01" : "1"}
        disabled={disabled}
        className="w-[160px] text-start"
        dir="ltr"
      />
    );
  }

  return (
    <Input
      type="text"
      value={value}
      onChange={(e) => onChange(e.target.value)}
      disabled={disabled}
      className="w-[260px] text-start"
      dir="ltr"
    />
  );
}
|
||||||
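Review note: `EnvVarEditor` above picks a widget by checking the spec in a fixed order (bool, then enum, then numeric, then free text). The sketch below restates that precedence as a pure function so it can be checked without rendering. `EnvSpecLike`, `EditorKind`, `editorKind` and `boolToEnv` are hypothetical names, and `EnvSpecLike` covers only the fields the decision reads.

```typescript
// Hypothetical subset of McpEnvVar: just what the widget choice depends on.
type EnvSpecLike = { type: string; enum_values?: string[] | null };
type EditorKind = "switch" | "select" | "number" | "text";

// Same order as the component: bool wins over enum, and enum wins over numeric.
function editorKind(spec: EnvSpecLike): EditorKind {
  if (spec.type === "bool") return "switch";
  if (spec.enum_values && spec.enum_values.length > 0) return "select";
  if (spec.type === "int" || spec.type === "float") return "number";
  return "text";
}

// Env values are always strings, so the switch state is serialized explicitly.
function boolToEnv(checked: boolean): string {
  return checked ? "true" : "false";
}

console.log(editorKind({ type: "int", enum_values: ["1", "2"] }));
console.log(editorKind({ type: "float", enum_values: [] }));
console.log(boolToEnv(true));
```

One consequence of the ordering is that an `int` variable with `enum_values` renders as a select, not a number input.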
123
web-ui/src/app/settings/_components/env-var-row.tsx
Normal file
@@ -0,0 +1,123 @@
|
|||||||
|
"use client";
|
||||||
|
|
||||||
|
import { useState } from "react";
|
||||||
|
import { ExternalLink, Save, Lock } from "lucide-react";
|
||||||
|
import { Button } from "@/components/ui/button";
|
||||||
|
import { Badge } from "@/components/ui/badge";
|
||||||
|
import type { McpEnvVar } from "@/lib/api/settings";
|
||||||
|
import { useUpdateMcpEnv } from "@/lib/api/settings";
|
||||||
|
import { toast } from "sonner";
|
||||||
|
import { DriftBadge } from "./drift-badge";
|
||||||
|
import { EnvVarEditor } from "./env-var-editor";
|
||||||
|
|
||||||
|
type Props = {
|
||||||
|
spec: McpEnvVar;
|
||||||
|
coolifyAppUuid: string;
|
||||||
|
coolifyAvailable: boolean;
|
||||||
|
onPendingRedeploy: () => void;
|
||||||
|
};
|
||||||
|
|
||||||
|
export function EnvVarRow({
|
||||||
|
spec,
|
||||||
|
coolifyAppUuid,
|
||||||
|
coolifyAvailable,
|
||||||
|
onPendingRedeploy,
|
||||||
|
}: Props) {
|
||||||
|
const [draft, setDraft] = useState<string>(spec.coolify_value ?? "");
|
||||||
|
const update = useUpdateMcpEnv();
|
||||||
|
const dirty = draft !== (spec.coolify_value ?? "");
|
||||||
|
|
||||||
|
function handleSave() {
|
||||||
|
update.mutate(
|
||||||
|
{ key: spec.key, value: draft },
|
||||||
|
{
|
||||||
|
onSuccess: (res) => {
|
||||||
|
toast.success(res.message);
|
||||||
|
onPendingRedeploy();
|
||||||
|
},
|
||||||
|
onError: (err) => toast.error(`שגיאה: ${err.message}`),
|
||||||
|
},
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
const coolifyEnvUrl =
|
||||||
|
`https://coolify.nautilus.marcusgroup.org/project/applications/${coolifyAppUuid}/environment-variables`;
|
||||||
|
|
||||||
|
return (
|
||||||
|
<div className="rounded-md border border-rule p-4 bg-rule-soft/20 hover:bg-rule-soft/40 transition-colors">
|
||||||
|
<div className="flex items-start justify-between gap-3 mb-3">
|
||||||
|
<div className="flex-1 min-w-0">
|
||||||
|
<div className="flex items-center gap-2 flex-wrap">
|
||||||
|
<code className="font-mono text-sm font-medium text-navy" dir="ltr">
|
||||||
|
{spec.key}
|
||||||
|
</code>
|
||||||
|
<Badge variant="outline" className="text-[0.7rem]">
|
||||||
|
{spec.type}
|
||||||
|
</Badge>
|
||||||
|
{spec.is_secret && (
|
||||||
|
<Badge variant="outline" className="text-[0.7rem] text-warn border-warn/40 gap-1">
|
||||||
|
<Lock className="w-3 h-3" />
|
||||||
|
secret
|
||||||
|
</Badge>
|
||||||
|
)}
|
||||||
|
<DriftBadge drift={spec.drift} coolifyAvailable={coolifyAvailable} />
|
||||||
|
{spec.has_duplicates && (
|
||||||
|
<Badge variant="outline" className="text-[0.7rem] text-warn border-warn/40">
|
||||||
|
duplicates
|
||||||
|
</Badge>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
<p className="text-sm text-ink-muted mt-1">{spec.description}</p>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div className="grid grid-cols-1 md:grid-cols-2 gap-3 text-sm">
|
||||||
|
<div className="flex items-center gap-2">
|
||||||
|
<span className="text-[0.72rem] text-ink-muted w-20">Coolify:</span>
|
||||||
|
{spec.is_editable ? (
|
||||||
|
<EnvVarEditor
|
||||||
|
spec={spec}
|
||||||
|
value={draft}
|
||||||
|
onChange={setDraft}
|
||||||
|
disabled={update.isPending}
|
||||||
|
/>
|
||||||
|
) : (
|
||||||
|
<span className="font-mono text-ink" dir="ltr">
|
||||||
|
{spec.coolify_value ?? <em className="text-ink-muted">— לא מוגדר —</em>}
|
||||||
|
</span>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
<div className="flex items-center gap-2">
|
||||||
|
<span className="text-[0.72rem] text-ink-muted w-20">Container:</span>
|
||||||
|
<span className="font-mono text-ink" dir="ltr">
|
||||||
|
{spec.container_value ?? <em className="text-ink-muted">— לא מוגדר —</em>}
|
||||||
|
</span>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div className="flex items-center justify-end gap-2 mt-3">
|
||||||
|
{!spec.is_editable && (
|
||||||
|
<a
|
||||||
|
href={coolifyEnvUrl}
|
||||||
|
target="_blank"
|
||||||
|
rel="noopener noreferrer"
|
||||||
|
className="text-[0.78rem] text-gold-deep hover:underline flex items-center gap-1"
|
||||||
|
>
|
||||||
|
ערוך ב-Coolify
|
||||||
|
<ExternalLink className="w-3 h-3" />
|
||||||
|
</a>
|
||||||
|
)}
|
||||||
|
{spec.is_editable && (
|
||||||
|
<Button
|
||||||
|
size="sm"
|
||||||
|
onClick={handleSave}
|
||||||
|
disabled={!dirty || update.isPending}
|
||||||
|
>
|
||||||
|
<Save className="w-3.5 h-3.5" data-icon="inline-start" />
|
||||||
|
{update.isPending ? "שומר..." : "שמור"}
|
||||||
|
</Button>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
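Review note: the save button in `EnvVarRow` above is enabled only when the draft differs from the stored Coolify value and no mutation is in flight. A sketch of that gate follows; `isDirty` and `canSave` are hypothetical helper names, and a missing stored value is treated as the empty string, as in the component.

```typescript
// A variable that is unset in Coolify is compared as "", matching
// `spec.coolify_value ?? ""` in the component.
function isDirty(draft: string, coolifyValue: string | null | undefined): boolean {
  return draft !== (coolifyValue ?? "");
}

// Saving is allowed only for a changed draft while no update is pending.
function canSave(
  draft: string,
  coolifyValue: string | null | undefined,
  isPending: boolean,
): boolean {
  return isDirty(draft, coolifyValue) && !isPending;
}

console.log(isDirty("", null));
console.log(canSave("0.5", "0.3", false));
console.log(canSave("0.5", "0.3", true));
```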
139
web-ui/src/app/settings/_components/environment-tab.tsx
Normal file
@@ -0,0 +1,139 @@
"use client";

import { useState, useMemo } from "react";
import { RefreshCw, AlertCircle } from "lucide-react";
import { Card, CardContent } from "@/components/ui/card";
import { Button } from "@/components/ui/button";
import { Skeleton } from "@/components/ui/skeleton";
import { Badge } from "@/components/ui/badge";
import {
  useMcpEnv,
  useMcpRedeploy,
  type McpEnvVar,
  type EnvCategory,
} from "@/lib/api/settings";
import { toast } from "sonner";
import { EnvVarRow } from "./env-var-row";

const CATEGORY_LABELS: Record<EnvCategory, string> = {
  multimodal: "Multimodal",
  rerank: "Rerank",
  halacha: "Halacha",
  general: "כללי",
  credentials: "אישורים",
  connection: "חיבורים",
};

const CATEGORY_ORDER: EnvCategory[] = [
  "multimodal", "rerank", "halacha", "general", "credentials", "connection",
];

export function EnvironmentTab() {
  const { data, isPending, error } = useMcpEnv();
  const redeploy = useMcpRedeploy();
  const [pendingRedeploy, setPendingRedeploy] = useState(false);

  const grouped = useMemo(() => {
    if (!data?.vars) return new Map<EnvCategory, McpEnvVar[]>();
    const m = new Map<EnvCategory, McpEnvVar[]>();
    for (const v of data.vars) {
      const arr = m.get(v.category) ?? [];
      arr.push(v);
      m.set(v.category, arr);
    }
    return m;
  }, [data]);

  function handleRedeploy() {
    redeploy.mutate(undefined, {
      onSuccess: (res) => {
        toast.success(res.message);
        setPendingRedeploy(false);
      },
      onError: (err) => toast.error(`Redeploy נכשל: ${err.message}`),
    });
  }

  if (isPending) return <Skeleton className="h-96 w-full" />;
  if (error) {
    return (
      <Card className="bg-surface border-danger/40">
        <CardContent className="p-6 flex items-center gap-3 text-danger">
          <AlertCircle className="w-5 h-5" />
          <span>שגיאה בטעינת env vars: {error.message}</span>
        </CardContent>
      </Card>
    );
  }
  if (!data) return null;

  const coolifyAvailable = data.errors.length === 0;
  const driftCount = data.vars.filter((v) => v.drift).length;
  const duplicatesCount = data.vars.filter((v) => v.has_duplicates).length;

  return (
    <div className="space-y-4">
      <Card className="bg-surface border-rule">
        <CardContent className="px-6 py-4 flex items-center justify-between gap-4 flex-wrap">
          <div className="flex items-center gap-3 flex-wrap text-sm">
            <Badge variant="outline">
              Coolify app: <code dir="ltr" className="ms-1">{data.coolify_app_uuid.slice(0, 8)}…</code>
            </Badge>
            {driftCount > 0 && (
              <Badge variant="outline" className="text-warn border-warn/40">
                {driftCount} drift
              </Badge>
            )}
            {duplicatesCount > 0 && (
              <Badge variant="outline" className="text-warn border-warn/40">
                {duplicatesCount} duplicates
              </Badge>
            )}
            {data.errors.length > 0 && (
              <Badge variant="outline" className="text-danger border-danger/40">
                {data.errors.join(", ")}
              </Badge>
            )}
          </div>
          <Button
            onClick={handleRedeploy}
            disabled={redeploy.isPending}
            variant={pendingRedeploy ? "default" : "outline"}
            size="sm"
          >
            <RefreshCw className={redeploy.isPending ? "w-3.5 h-3.5 animate-spin" : "w-3.5 h-3.5"} data-icon="inline-start" />
            {redeploy.isPending ? "Redeploying..." : "Redeploy now"}
          </Button>
        </CardContent>
      </Card>

      {CATEGORY_ORDER.map((cat) => {
        const vars = grouped.get(cat);
        if (!vars || vars.length === 0) return null;
        return (
          <Card key={cat} className="bg-surface border-rule">
            <CardContent className="px-6 py-5">
              <h2 className="text-navy text-lg mb-4 flex items-center gap-2">
                {CATEGORY_LABELS[cat]}
                <Badge variant="outline" className="text-[0.7rem] tabular-nums">
                  {vars.length}
                </Badge>
              </h2>
              <div className="space-y-3">
                {vars.map((v) => (
                  <EnvVarRow
                    key={v.key}
                    spec={v}
                    coolifyAppUuid={data.coolify_app_uuid}
                    coolifyAvailable={coolifyAvailable}
                    onPendingRedeploy={() => setPendingRedeploy(true)}
                  />
                ))}
              </div>
            </CardContent>
          </Card>
        );
      })}
    </div>
  );
}
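The grouping step inside `EnvironmentTab` can be read on its own as a pair of pure functions. What follows is a minimal sketch, not the component's actual code: `EnvVarLike`, `groupByCategory`, `orderedSections`, and the sample keys are hypothetical names introduced for illustration, and the real `McpEnvVar` type from `@/lib/api/settings` is not shown in this diff, so its fields are assumed here.

```typescript
// Assumed, simplified shapes; the real types live in "@/lib/api/settings".
type EnvCategory =
  | "multimodal" | "rerank" | "halacha" | "general" | "credentials" | "connection";

interface EnvVarLike {
  key: string;
  category: EnvCategory;
  drift: boolean;
}

// Same display order as CATEGORY_ORDER in the component.
const CATEGORY_ORDER: EnvCategory[] = [
  "multimodal", "rerank", "halacha", "general", "credentials", "connection",
];

// Bucket variables by category, preserving input order within each bucket.
function groupByCategory(vars: EnvVarLike[]): Map<EnvCategory, EnvVarLike[]> {
  const groups = new Map<EnvCategory, EnvVarLike[]>();
  for (const v of vars) {
    const bucket = groups.get(v.category) ?? [];
    bucket.push(v);
    groups.set(v.category, bucket);
  }
  return groups;
}

// Walk the fixed order and skip empty categories, as the JSX does with `return null`.
function orderedSections(vars: EnvVarLike[]): Array<[EnvCategory, EnvVarLike[]]> {
  const groups = groupByCategory(vars);
  const sections: Array<[EnvCategory, EnvVarLike[]]> = [];
  for (const cat of CATEGORY_ORDER) {
    const bucket = groups.get(cat);
    if (bucket && bucket.length > 0) sections.push([cat, bucket]);
  }
  return sections;
}

// Made-up sample data, for illustration only.
const sample: EnvVarLike[] = [
  { key: "LOG_LEVEL", category: "general", drift: false },
  { key: "RERANK_MODEL", category: "rerank", drift: true },
  { key: "TIMEZONE", category: "general", drift: false },
];

const sections = orderedSections(sample);
const driftCount = sample.filter((v) => v.drift).length;
console.log(sections.map(([cat, vars]) => `${cat}:${vars.length}`), driftCount);
```

Iterating a fixed order array instead of the `Map` itself keeps the section order stable regardless of the order in which the API returns variables.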
225
web-ui/src/app/settings/_components/paperclip-tab.tsx
Normal file
@@ -0,0 +1,225 @@
"use client";

import { useState } from "react";
import { Plus, Trash2, Tags, Building2 } from "lucide-react";
import { Card, CardContent } from "@/components/ui/card";
import { Badge } from "@/components/ui/badge";
import { Button } from "@/components/ui/button";
import { Input } from "@/components/ui/input";
import { Skeleton } from "@/components/ui/skeleton";
import {
  Select,
  SelectContent,
  SelectItem,
  SelectTrigger,
  SelectValue,
} from "@/components/ui/select";
import {
  useTagMappings,
  usePaperclipCompanies,
  useAddTagMapping,
  useDeleteTagMapping,
} from "@/lib/api/settings";
import { APPEAL_SUBTYPES } from "@/lib/practice-area";
import { toast } from "sonner";

const TAG_SUGGESTIONS = APPEAL_SUBTYPES.filter((s) => s.value !== "unknown");

export function PaperclipTab() {
  const { data: mappings, isPending: loadingMappings } = useTagMappings();
  const { data: companies, isPending: loadingCompanies } = usePaperclipCompanies();
  const addMapping = useAddTagMapping();
  const deleteMapping = useDeleteTagMapping();

  const [tag, setTag] = useState("");
  const [tagLabel, setTagLabel] = useState("");
  const [companyId, setCompanyId] = useState("");

  function handleTagInput(value: string) {
    setTag(value);
    const match = TAG_SUGGESTIONS.find((s) => s.value === value);
    if (match) setTagLabel(match.label);
  }

  function handleAdd() {
    if (!tag || !companyId) {
      toast.error("יש לבחור תגית וחברה");
      return;
    }
    const company = companies?.find((c) => c.id === companyId);
    addMapping.mutate(
      {
        tag,
        tag_label: tagLabel,
        company_id: companyId,
        company_name: company?.name ?? "",
      },
      {
        onSuccess: () => {
          toast.success("מיפוי נוסף בהצלחה");
          setTag("");
          setTagLabel("");
          setCompanyId("");
        },
        onError: (err) => toast.error(`שגיאה: ${err.message}`),
      },
    );
  }

  function handleDelete(id: string, tag: string) {
    deleteMapping.mutate(id, {
      onSuccess: () => toast.success(`מיפוי "${tag}" נמחק`),
      onError: (err) => toast.error(`שגיאה: ${err.message}`),
    });
  }

  return (
    <div className="space-y-6">
      <Card className="bg-surface border-rule shadow-sm">
        <CardContent className="px-6 py-5">
          <h2 className="text-navy text-lg mb-3 flex items-center gap-2">
            <Building2 className="w-4 h-4" />
            חברות ב-Paperclip
          </h2>
          {loadingCompanies ? (
            <Skeleton className="h-12 w-full" />
          ) : !companies?.length ? (
            <p className="text-ink-muted text-sm">לא נמצאו חברות</p>
          ) : (
            <div className="flex flex-wrap gap-3">
              {companies.map((c) => (
                <div
                  key={c.id}
                  className="flex items-center gap-2 rounded-md bg-rule-soft/60 border border-rule px-4 py-2.5"
                >
                  <span className="text-sm font-medium text-ink">{c.name}</span>
                  <Badge variant="outline" className="text-[0.7rem] tabular-nums">
                    {c.prefix}
                  </Badge>
                </div>
              ))}
            </div>
          )}
        </CardContent>
      </Card>

      <Card className="bg-surface border-rule shadow-sm">
        <CardContent className="px-6 py-5">
          <h2 className="text-navy text-lg mb-4 flex items-center gap-2">
            <Tags className="w-4 h-4" />
            מיפוי תגיות
            <Badge variant="outline" className="text-[0.7rem] tabular-nums">
              {mappings?.length ?? 0}
            </Badge>
          </h2>

          <div className="flex flex-wrap items-end gap-3 mb-5 p-4 rounded-md bg-rule-soft/40 border border-rule">
            <div className="flex flex-col gap-1.5 min-w-[180px]">
              <label className="text-[0.72rem] text-ink-muted">תגית</label>
              <Input
                list="tag-suggestions"
                value={tag}
                onChange={(e) => handleTagInput(e.target.value)}
                placeholder="סוג ערר או תגית חופשית"
                className="w-[220px]"
              />
              <datalist id="tag-suggestions">
                {TAG_SUGGESTIONS.map((s) => (
                  <option key={s.value} value={s.value}>
                    {s.label}
                  </option>
                ))}
              </datalist>
            </div>

            <div className="flex flex-col gap-1.5 min-w-[140px]">
              <label className="text-[0.72rem] text-ink-muted">תווית</label>
              <Input
                value={tagLabel}
                onChange={(e) => setTagLabel(e.target.value)}
                placeholder="שם לתצוגה"
                className="w-[160px]"
              />
            </div>

            <div className="flex flex-col gap-1.5 min-w-[200px]">
              <label className="text-[0.72rem] text-ink-muted">
                חברה ב-Paperclip
              </label>
              <Select value={companyId} onValueChange={setCompanyId}>
                <SelectTrigger className="w-[240px]">
                  <SelectValue placeholder="בחר חברה" />
                </SelectTrigger>
                <SelectContent>
                  {companies?.map((c) => (
                    <SelectItem key={c.id} value={c.id}>
                      {c.name} ({c.prefix})
                    </SelectItem>
                  ))}
                </SelectContent>
              </Select>
            </div>

            <Button
              onClick={handleAdd}
              disabled={addMapping.isPending || !tag || !companyId}
              size="default"
            >
              <Plus className="w-4 h-4" data-icon="inline-start" />
              {addMapping.isPending ? "שומר..." : "הוסף מיפוי"}
            </Button>
          </div>

          {loadingMappings ? (
            <Skeleton className="h-32 w-full" />
          ) : !mappings?.length ? (
            <p className="text-ink-muted text-sm">
              אין מיפויים. הוסף מיפוי כדי שתיקים חדשים ישויכו אוטומטית
              לפרויקט בחברה הנכונה.
            </p>
          ) : (
            <div className="overflow-x-auto">
              <table className="w-full text-sm">
                <thead>
                  <tr className="border-b border-rule text-ink-muted text-[0.72rem] uppercase tracking-wider">
                    <th className="text-start py-2 px-3 font-medium">Tag</th>
                    <th className="text-start py-2 px-3 font-medium">Label</th>
                    <th className="text-start py-2 px-3 font-medium">Company</th>
                    <th className="py-2 px-3 w-12" />
                  </tr>
                </thead>
                <tbody>
                  {mappings.map((m) => (
                    <tr
                      key={m.id}
                      className="border-b border-rule/60 hover:bg-rule-soft/40 transition-colors"
                    >
                      <td className="py-2.5 px-3">
                        <Badge variant="outline" className="text-[0.75rem] font-mono">
                          {m.tag}
                        </Badge>
                      </td>
                      <td className="py-2.5 px-3 text-ink">{m.tag_label}</td>
                      <td className="py-2.5 px-3 text-ink">{m.company_name}</td>
                      <td className="py-2.5 px-3">
                        <Button
                          variant="ghost"
                          size="icon-xs"
                          onClick={() => handleDelete(m.id, m.tag)}
                          disabled={deleteMapping.isPending}
                          title="מחק מיפוי"
                        >
                          <Trash2 className="w-3.5 h-3.5 text-danger" />
                        </Button>
                      </td>
                    </tr>
                  ))}
                </tbody>
              </table>
            </div>
          )}
        </CardContent>
      </Card>
    </div>
  );
}
134
web-ui/src/app/settings/_components/registrations-tab.tsx
Normal file
@@ -0,0 +1,134 @@
"use client";

import { Plug, AlertCircle } from "lucide-react";
import { Card, CardContent } from "@/components/ui/card";
import { Skeleton } from "@/components/ui/skeleton";
import { Badge } from "@/components/ui/badge";
import { useMcpRegistrations } from "@/lib/api/settings";

export function RegistrationsTab() {
  const { data, isPending, error } = useMcpRegistrations();

  if (isPending) return <Skeleton className="h-64 w-full" />;
  if (error) {
    return (
      <Card className="bg-surface border-danger/40">
        <CardContent className="p-6 flex items-center gap-3 text-danger">
          <AlertCircle className="w-5 h-5" />
          <span>שגיאה: {error.message}</span>
        </CardContent>
      </Card>
    );
  }
  if (!data) return null;

  if (data.error === "host_path_unavailable") {
    return (
      <Card className="bg-surface border-warn/40">
        <CardContent className="p-6">
          <div className="flex items-center gap-3 text-warn mb-2">
            <AlertCircle className="w-5 h-5" />
            <span className="font-medium">תיקיית /host לא זמינה בקונטיינר</span>
          </div>
          <p className="text-sm text-ink-muted mb-2">
            כדי להציג רישומי MCP, יש להוסיף volume mounts ב-Coolify.
            ראה runbook ב-
            <code dir="ltr" className="mx-1">
              docs/runbooks/coolify-mcp-settings-volumes.md
            </code>
          </p>
          {data.message && (
            <p className="text-sm text-ink-muted">{data.message}</p>
          )}
        </CardContent>
      </Card>
    );
  }

  if (!data.registrations.length) {
    return (
      <Card className="bg-surface border-rule">
        <CardContent className="p-6 text-ink-muted text-sm">
          לא נמצאו רישומי MCP.
        </CardContent>
      </Card>
    );
  }

  // Group by client
  const groups = new Map<string, typeof data.registrations>();
  for (const r of data.registrations) {
    const arr = groups.get(r.client) ?? [];
    arr.push(r);
    groups.set(r.client, arr);
  }

  return (
    <div className="space-y-4">
      <div className="flex items-center gap-2 text-sm text-ink-muted">
        <Plug className="w-4 h-4" />
        סה"כ {data.registrations.length} רישומים
      </div>
      {[...groups.entries()].map(([client, regs]) => (
        <Card key={client} className="bg-surface border-rule">
          <CardContent className="px-6 py-5">
            <h2 className="text-navy text-lg mb-4 flex items-center gap-2">
              {client}
              <Badge variant="outline" className="text-[0.7rem]">
                {regs.length}
              </Badge>
            </h2>
            <div className="space-y-3">
              {regs.map((r, i) => (
                <div
                  key={`${r.server_name}-${i}`}
                  className="rounded-md border border-rule bg-rule-soft/20 p-4 space-y-2 text-sm"
                >
                  <div className="flex items-center gap-2 mb-1">
                    <code dir="ltr" className="font-mono font-medium text-navy">
                      {r.server_name}
                    </code>
                    <Badge variant="outline" className="text-[0.7rem]" dir="ltr">
                      {r.transport}
                    </Badge>
                  </div>
                  <div className="grid grid-cols-1 md:grid-cols-[100px_1fr] gap-x-3 gap-y-1.5 text-[0.82rem]">
                    <span className="text-ink-muted">command:</span>
                    <code dir="ltr" className="font-mono text-ink break-all">
                      {r.command || "—"}
                    </code>
                    <span className="text-ink-muted">args:</span>
                    <code dir="ltr" className="font-mono text-ink break-all">
                      {r.args.length ? JSON.stringify(r.args) : "[]"}
                    </code>
                    <span className="text-ink-muted">cwd:</span>
                    <code dir="ltr" className="font-mono text-ink break-all">
                      {r.cwd || "—"}
                    </code>
                    <span className="text-ink-muted">env keys:</span>
                    <div className="flex flex-wrap gap-1">
                      {r.env_keys.length === 0 ? (
                        <span className="text-ink-muted">—</span>
                      ) : (
                        r.env_keys.map((k) => (
                          <Badge
                            key={k}
                            variant="outline"
                            className="text-[0.7rem] font-mono"
                            dir="ltr"
                          >
                            {k}
                          </Badge>
                        ))
                      )}
                    </div>
                  </div>
                </div>
              ))}
            </div>
          </CardContent>
        </Card>
      ))}
    </div>
  );
}
65
web-ui/src/app/settings/_components/tool-detail-drawer.tsx
Normal file
@@ -0,0 +1,65 @@
"use client";

import {
  Sheet,
  SheetContent,
  SheetHeader,
  SheetTitle,
  SheetDescription,
} from "@/components/ui/sheet";
import { Badge } from "@/components/ui/badge";
import type { McpTool } from "@/lib/api/settings";

type Props = {
  tool: McpTool | null;
  open: boolean;
  onOpenChange: (o: boolean) => void;
};

export function ToolDetailDrawer({ tool, open, onOpenChange }: Props) {
  return (
    <Sheet open={open} onOpenChange={onOpenChange}>
      <SheetContent dir="rtl" side="left" className="sm:max-w-xl overflow-y-auto">
        {tool && (
          <>
            <SheetHeader>
              <SheetTitle dir="ltr" className="font-mono text-navy">
                {tool.name}
              </SheetTitle>
              <SheetDescription>{tool.description || "—"}</SheetDescription>
            </SheetHeader>
            <div className="space-y-4 mt-4 px-4 pb-6">
              <div>
                <div className="text-[0.72rem] text-ink-muted uppercase mb-1">
                  Module
                </div>
                <Badge variant="outline" className="font-mono" dir="ltr">
                  {tool.module}
                </Badge>
              </div>
              <div>
                <div className="text-[0.72rem] text-ink-muted uppercase mb-1">
                  Source
                </div>
                <code dir="ltr" className="text-xs text-ink break-all">
                  {tool.source_location || "—"}
                </code>
              </div>
              <div>
                <div className="text-[0.72rem] text-ink-muted uppercase mb-1">
                  Parameters Schema
                </div>
                <pre
                  dir="ltr"
                  className="text-xs bg-rule-soft/40 border border-rule rounded-md p-3 overflow-x-auto"
                >
                  {JSON.stringify(tool.params_schema, null, 2)}
                </pre>
              </div>
            </div>
          </>
        )}
      </SheetContent>
    </Sheet>
  );
}
83
web-ui/src/app/settings/_components/tools-tab.tsx
Normal file
@@ -0,0 +1,83 @@
"use client";

import { useState, useMemo } from "react";
import { Wrench, AlertCircle } from "lucide-react";
import { Card, CardContent } from "@/components/ui/card";
import { Skeleton } from "@/components/ui/skeleton";
import { Badge } from "@/components/ui/badge";
import { useMcpTools, type McpTool } from "@/lib/api/settings";
import { ToolDetailDrawer } from "./tool-detail-drawer";

export function ToolsTab() {
  const { data, isPending, error } = useMcpTools();
  const [selected, setSelected] = useState<McpTool | null>(null);
  const [open, setOpen] = useState(false);

  const grouped = useMemo(() => {
    if (!data?.tools) return new Map<string, McpTool[]>();
    const m = new Map<string, McpTool[]>();
    for (const t of data.tools) {
      const mod = t.module.split(".").pop() || "other";
      const arr = m.get(mod) ?? [];
      arr.push(t);
      m.set(mod, arr);
    }
    return m;
  }, [data]);

  if (isPending) return <Skeleton className="h-96 w-full" />;
  if (error) {
    return (
      <Card className="bg-surface border-danger/40">
        <CardContent className="p-6 flex items-center gap-3 text-danger">
          <AlertCircle className="w-5 h-5" />
          <span>שגיאה בטעינת tools: {error.message}</span>
        </CardContent>
      </Card>
    );
  }
  if (!data) return null;

  return (
    <div className="space-y-4">
      <div className="flex items-center gap-2 text-sm text-ink-muted">
        <Wrench className="w-4 h-4" />
        סה"כ {data.count} tools
      </div>
      {[...grouped.entries()].sort().map(([mod, tools]) => (
        <Card key={mod} className="bg-surface border-rule">
          <CardContent className="px-6 py-5">
            <h2 className="text-navy text-lg mb-3 flex items-center gap-2">
              <code dir="ltr">{mod}</code>
              <Badge variant="outline" className="text-[0.7rem]">
                {tools.length}
              </Badge>
            </h2>
            <div className="grid grid-cols-1 md:grid-cols-2 gap-2">
              {tools.map((t) => (
                <button
                  key={t.name}
                  onClick={() => {
                    setSelected(t);
                    setOpen(true);
                  }}
                  className="text-start rounded-md border border-rule px-3 py-2 hover:bg-rule-soft/40 transition-colors"
                >
                  <code dir="ltr" className="font-mono text-sm text-navy">
                    {t.name}
                  </code>
                  {t.description && (
                    <p className="text-[0.78rem] text-ink-muted mt-0.5 line-clamp-2">
                      {t.description}
                    </p>
                  )}
                </button>
              ))}
            </div>
          </CardContent>
        </Card>
      ))}
      <ToolDetailDrawer tool={selected} open={open} onOpenChange={setOpen} />
    </div>
  );
}
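`ToolsTab` groups tools by the last dotted segment of `module` and renders the groups in sorted order. The sketch below isolates that step under stated assumptions: `ToolLike`, `groupByModuleSuffix`, `sortedGroups`, and the sample tool names are hypothetical, and the real `McpTool` type is not shown in this diff. It also swaps the component's bare `.sort()` for an explicit key comparator; that is a suggestion for readability, not what the commit does.

```typescript
// Assumed, simplified shape; the real McpTool type lives in "@/lib/api/settings".
interface ToolLike {
  name: string;
  module: string;
}

// Bucket tools by the last dotted segment of `module`; an empty segment falls back to "other".
function groupByModuleSuffix(tools: ToolLike[]): Map<string, ToolLike[]> {
  const groups = new Map<string, ToolLike[]>();
  for (const t of tools) {
    const mod = t.module.split(".").pop() || "other";
    const bucket = groups.get(mod) ?? [];
    bucket.push(t);
    groups.set(mod, bucket);
  }
  return groups;
}

// Sort entries by group name only. A bare `.sort()` would compare the string form
// of each whole [name, tools] pair, which works but hides the intent.
function sortedGroups(groups: Map<string, ToolLike[]>): Array<[string, ToolLike[]]> {
  return Array.from(groups.entries()).sort(([a], [b]) => a.localeCompare(b));
}

// Made-up sample data, for illustration only.
const sampleTools: ToolLike[] = [
  { name: "search_cases", module: "server.tools.search" },
  { name: "list_documents", module: "server.tools.documents" },
  { name: "search_precedents", module: "server.tools.search" },
  { name: "ping", module: "" },
];

const groupNames = sortedGroups(groupByModuleSuffix(sampleTools)).map(([mod]) => mod);
console.log(groupNames);
```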
@@ -1,80 +1,16 @@
|
|||||||
"use client";
|
"use client";
|
||||||
|
|
||||||
import { useState } from "react";
|
|
||||||
import Link from "next/link";
|
import Link from "next/link";
|
||||||
import { Plus, Trash2, Tags, Building2 } from "lucide-react";
|
import { Server, Wrench, Plug, Building2, Layers } from "lucide-react";
|
||||||
import { AppShell } from "@/components/app-shell";
|
import { AppShell } from "@/components/app-shell";
|
||||||
import { Card, CardContent } from "@/components/ui/card";
|
import { Tabs, TabsContent, TabsList, TabsTrigger } from "@/components/ui/tabs";
|
||||||
import { Badge } from "@/components/ui/badge";
|
import { PaperclipTab } from "./_components/paperclip-tab";
|
||||||
import { Button } from "@/components/ui/button";
|
import { EnvironmentTab } from "./_components/environment-tab";
|
||||||
import { Input } from "@/components/ui/input";
|
import { ToolsTab } from "./_components/tools-tab";
|
||||||
import { Skeleton } from "@/components/ui/skeleton";
|
import { RegistrationsTab } from "./_components/registrations-tab";
|
||||||
import {
|
import { BlocksTab } from "./_components/blocks-tab";
|
||||||
Select,
|
|
||||||
SelectContent,
|
|
||||||
SelectItem,
|
|
||||||
SelectTrigger,
|
|
||||||
SelectValue,
|
|
||||||
} from "@/components/ui/select";
|
|
||||||
import {
|
|
||||||
useTagMappings,
|
|
||||||
usePaperclipCompanies,
|
|
||||||
useAddTagMapping,
|
|
||||||
useDeleteTagMapping,
|
|
||||||
} from "@/lib/api/settings";
|
|
||||||
import { APPEAL_SUBTYPES } from "@/lib/practice-area";
|
|
||||||
import { toast } from "sonner";
|
|
||||||
|
|
||||||
const TAG_SUGGESTIONS = APPEAL_SUBTYPES.filter((s) => s.value !== "unknown");
|
|
||||||
|
|
||||||
export default function SettingsPage() {
|
export default function SettingsPage() {
|
||||||
const { data: mappings, isPending: loadingMappings } = useTagMappings();
|
|
||||||
const { data: companies, isPending: loadingCompanies } = usePaperclipCompanies();
|
|
||||||
const addMapping = useAddTagMapping();
|
|
||||||
const deleteMapping = useDeleteTagMapping();
|
|
||||||
|
|
||||||
const [tag, setTag] = useState("");
|
|
||||||
const [tagLabel, setTagLabel] = useState("");
|
|
||||||
const [companyId, setCompanyId] = useState("");
|
|
||||||
|
|
||||||
function handleTagInput(value: string) {
|
|
||||||
setTag(value);
|
|
||||||
const match = TAG_SUGGESTIONS.find((s) => s.value === value);
|
|
||||||
if (match) setTagLabel(match.label);
|
|
||||||
}
|
|
||||||
|
|
||||||
function handleAdd() {
|
|
||||||
if (!tag || !companyId) {
|
|
||||||
toast.error("יש לבחור תגית וחברה");
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
const company = companies?.find((c) => c.id === companyId);
|
|
||||||
addMapping.mutate(
|
|
||||||
{
|
|
||||||
tag,
|
|
||||||
tag_label: tagLabel,
|
|
||||||
company_id: companyId,
|
|
||||||
company_name: company?.name ?? "",
|
|
||||||
},
|
|
||||||
{
|
|
||||||
onSuccess: () => {
|
|
||||||
toast.success("מיפוי נוסף בהצלחה");
|
|
||||||
setTag("");
|
|
||||||
setTagLabel("");
|
|
||||||
setCompanyId("");
|
|
||||||
},
|
|
||||||
onError: (err) => toast.error(`שגיאה: ${err.message}`),
|
|
||||||
},
|
|
||||||
);
|
|
||||||
}
|
|
||||||
|
|
||||||
function handleDelete(id: string, tag: string) {
|
|
||||||
deleteMapping.mutate(id, {
|
|
||||||
onSuccess: () => toast.success(`מיפוי "${tag}" נמחק`),
|
|
||||||
onError: (err) => toast.error(`שגיאה: ${err.message}`),
|
|
||||||
});
|
|
||||||
}
|
|
||||||
|
|
||||||
return (
|
return (
|
||||||
<AppShell>
|
<AppShell>
|
||||||
<section className="space-y-6">
|
<section className="space-y-6">
|
||||||
@@ -88,164 +24,42 @@ export default function SettingsPage() {
|
|||||||
</nav>
|
</nav>
|
||||||
<h1 className="text-navy mb-0">הגדרות</h1>
|
<h1 className="text-navy mb-0">הגדרות</h1>
|
||||||
<p className="text-ink-muted text-sm mt-1 max-w-2xl">
|
<p className="text-ink-muted text-sm mt-1 max-w-2xl">
|
||||||
ניהול מיפוי תגיות ערר לחברות ב-Paperclip. כל תיק חדש ישויך
|
תצורת המערכת, MCP server, ו-Paperclip integration.
|
||||||
אוטומטית לפרויקט בחברה הנכונה לפי סוג הערר.
|
|
||||||
</p>
|
</p>
|
||||||
</header>
|
</header>
|
||||||
|
|
||||||
<div className="h-[2px] bg-gradient-to-l from-transparent via-gold to-transparent" />
|
<div className="h-[2px] bg-gradient-to-l from-transparent via-gold to-transparent" />
|
||||||
|
|
||||||
{/* Companies overview */}
|
<Tabs defaultValue="paperclip" className="space-y-4">
|
||||||
<Card className="bg-surface border-rule shadow-sm">
|
<TabsList>
|
||||||
<CardContent className="px-6 py-5">
|
<TabsTrigger value="paperclip">
|
||||||
<h2 className="text-navy text-lg mb-3 flex items-center gap-2">
|
<Building2 className="w-4 h-4" data-icon="inline-start" />
|
||||||
<Building2 className="w-4 h-4" />
|
Paperclip
|
||||||
חברות ב-Paperclip
|
</TabsTrigger>
|
||||||
</h2>
|
<TabsTrigger value="environment">
|
||||||
{loadingCompanies ? (
|
<Server className="w-4 h-4" data-icon="inline-start" />
|
||||||
<Skeleton className="h-12 w-full" />
|
Environment
|
||||||
) : !companies?.length ? (
|
</TabsTrigger>
|
||||||
<p className="text-ink-muted text-sm">לא נמצאו חברות</p>
|
<TabsTrigger value="tools">
|
||||||
) : (
|
<Wrench className="w-4 h-4" data-icon="inline-start" />
|
||||||
<div className="flex flex-wrap gap-3">
|
Tools
|
||||||
{companies.map((c) => (
|
</TabsTrigger>
|
||||||
<div
|
<TabsTrigger value="blocks">
|
||||||
key={c.id}
|
<Layers className="w-4 h-4" data-icon="inline-start" />
|
||||||
className="flex items-center gap-2 rounded-md bg-rule-soft/60 border border-rule px-4 py-2.5"
|
Blocks
|
||||||
>
|
</TabsTrigger>
|
||||||
<span className="text-sm font-medium text-ink">{c.name}</span>
|
<TabsTrigger value="registrations">
|
||||||
<Badge variant="outline" className="text-[0.7rem] tabular-nums">
|
<Plug className="w-4 h-4" data-icon="inline-start" />
|
||||||
{c.prefix}
|
Registrations
|
||||||
</Badge>
|
</TabsTrigger>
|
||||||
</div>
|
</TabsList>
|
||||||
))}
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
</CardContent>
|
|
||||||
</Card>
|
|
||||||
|
|
||||||
{/* Tag mappings */}
|
<TabsContent value="paperclip"><PaperclipTab /></TabsContent>
|
||||||
<Card className="bg-surface border-rule shadow-sm">
|
<TabsContent value="environment"><EnvironmentTab /></TabsContent>
|
||||||
<CardContent className="px-6 py-5">
|
<TabsContent value="tools"><ToolsTab /></TabsContent>
|
||||||
<h2 className="text-navy text-lg mb-4 flex items-center gap-2">
|
<TabsContent value="blocks"><BlocksTab /></TabsContent>
|
||||||
<Tags className="w-4 h-4" />
|
<TabsContent value="registrations"><RegistrationsTab /></TabsContent>
|
||||||
מיפוי תגיות
|
</Tabs>
|
||||||
<Badge variant="outline" className="text-[0.7rem] tabular-nums">
|
|
||||||
{mappings?.length ?? 0}
|
|
||||||
</Badge>
|
|
||||||
</h2>
|
|
||||||
|
|
||||||
{/* Add form */}
|
|
||||||
<div className="flex flex-wrap items-end gap-3 mb-5 p-4 rounded-md bg-rule-soft/40 border border-rule">
|
|
||||||
<div className="flex flex-col gap-1.5 min-w-[180px]">
|
|
||||||
<label className="text-[0.72rem] text-ink-muted">
|
|
||||||
תגית
|
|
||||||
</label>
|
|
||||||
<Input
|
|
||||||
list="tag-suggestions"
|
|
||||||
value={tag}
|
|
||||||
onChange={(e) => handleTagInput(e.target.value)}
|
|
||||||
placeholder="סוג ערר או תגית חופשית"
|
|
||||||
className="w-[220px]"
|
|
||||||
/>
|
|
||||||
<datalist id="tag-suggestions">
|
|
||||||
{TAG_SUGGESTIONS.map((s) => (
|
|
||||||
<option key={s.value} value={s.value}>
|
|
||||||
{s.label}
|
|
||||||
</option>
|
|
||||||
))}
|
|
||||||
</datalist>
|
|
||||||
</div>
|
|
||||||
|
|
||||||
<div className="flex flex-col gap-1.5 min-w-[140px]">
|
|
||||||
<label className="text-[0.72rem] text-ink-muted">תווית</label>
|
|
||||||
<Input
|
|
||||||
value={tagLabel}
|
|
||||||
onChange={(e) => setTagLabel(e.target.value)}
|
|
||||||
placeholder="שם לתצוגה"
|
|
||||||
className="w-[160px]"
|
|
||||||
/>
|
|
||||||
</div>
|
|
||||||
|
|
||||||
<div className="flex flex-col gap-1.5 min-w-[200px]">
|
|
||||||
<label className="text-[0.72rem] text-ink-muted">
|
|
||||||
חברה ב-Paperclip
|
|
||||||
</label>
|
|
||||||
<Select value={companyId} onValueChange={setCompanyId}>
|
|
||||||
<SelectTrigger className="w-[240px]">
|
|
||||||
<SelectValue placeholder="בחר חברה" />
|
|
||||||
</SelectTrigger>
|
|
||||||
<SelectContent>
|
|
||||||
{companies?.map((c) => (
|
|
||||||
<SelectItem key={c.id} value={c.id}>
|
|
||||||
{c.name} ({c.prefix})
|
|
||||||
</SelectItem>
|
|
||||||
))}
|
|
||||||
</SelectContent>
|
|
||||||
</Select>
|
|
||||||
</div>
|
|
||||||
|
|
||||||
<Button
|
|
||||||
onClick={handleAdd}
|
|
||||||
disabled={addMapping.isPending || !tag || !companyId}
|
|
||||||
size="default"
|
|
||||||
>
|
|
||||||
<Plus className="w-4 h-4" data-icon="inline-start" />
|
|
||||||
{addMapping.isPending ? "שומר..." : "הוסף מיפוי"}
|
|
||||||
</Button>
|
|
||||||
</div>
|
|
||||||
|
|
||||||
{/* Table */}
|
|
||||||
{loadingMappings ? (
|
|
||||||
<Skeleton className="h-32 w-full" />
|
|
||||||
) : !mappings?.length ? (
|
|
||||||
<p className="text-ink-muted text-sm">
|
|
||||||
אין מיפויים. הוסף מיפוי כדי שתיקים חדשים ישויכו אוטומטית
|
|
||||||
לפרויקט בחברה הנכונה.
|
|
||||||
</p>
|
|
||||||
) : (
|
|
||||||
<div className="overflow-x-auto">
|
|
||||||
<table className="w-full text-sm">
|
|
||||||
<thead>
|
|
||||||
<tr className="border-b border-rule text-ink-muted text-[0.72rem] uppercase tracking-wider">
|
|
||||||
<th className="text-start py-2 px-3 font-medium">Tag</th>
|
|
||||||
<th className="text-start py-2 px-3 font-medium">Label</th>
|
|
||||||
<th className="text-start py-2 px-3 font-medium">Company</th>
|
|
||||||
<th className="py-2 px-3 w-12" />
|
|
||||||
</tr>
|
|
||||||
</thead>
|
|
||||||
<tbody>
|
|
||||||
{mappings.map((m) => (
|
|
||||||
<tr
|
|
||||||
key={m.id}
|
|
||||||
className="border-b border-rule/60 hover:bg-rule-soft/40 transition-colors"
|
|
||||||
>
|
|
||||||
<td className="py-2.5 px-3">
|
|
||||||
<Badge variant="outline" className="text-[0.75rem] font-mono">
|
|
||||||
{m.tag}
|
|
||||||
</Badge>
|
|
||||||
</td>
|
|
||||||
<td className="py-2.5 px-3 text-ink">{m.tag_label}</td>
|
|
||||||
<td className="py-2.5 px-3 text-ink">{m.company_name}</td>
|
|
||||||
<td className="py-2.5 px-3">
|
|
||||||
<Button
|
|
||||||
variant="ghost"
|
|
||||||
size="icon-xs"
|
|
||||||
onClick={() => handleDelete(m.id, m.tag)}
|
|
||||||
disabled={deleteMapping.isPending}
|
|
||||||
title="מחק מיפוי"
|
|
||||||
>
|
|
||||||
<Trash2 className="w-3.5 h-3.5 text-danger" />
|
|
||||||
</Button>
|
|
||||||
</td>
|
|
||||||
</tr>
|
|
||||||
))}
|
|
||||||
</tbody>
|
|
||||||
</table>
|
|
||||||
</div>
|
|
||||||
)}
|
|
||||||
</CardContent>
|
|
||||||
</Card>
|
|
||||||
</section>
|
</section>
|
||||||
</AppShell>
|
</AppShell>
|
||||||
);
|
);
|
||||||
|
|||||||
@@ -3,35 +3,70 @@
|
|||||||
import type { ReactNode } from "react";
|
import type { ReactNode } from "react";
|
||||||
import Link from "next/link";
|
import Link from "next/link";
|
||||||
import { usePathname } from "next/navigation";
|
import { usePathname } from "next/navigation";
|
||||||
|
import { ChevronDown, Settings } from "lucide-react";
|
||||||
|
|
||||||
|
import {
|
||||||
|
DropdownMenu,
|
||||||
|
DropdownMenuContent,
|
||||||
|
DropdownMenuItem,
|
||||||
|
DropdownMenuLabel,
|
||||||
|
DropdownMenuSeparator,
|
||||||
|
DropdownMenuTrigger,
|
||||||
|
} from "@/components/ui/dropdown-menu";
|
||||||
|
import { GlobalSearch } from "@/components/global-search";
|
||||||
|
import { headerSubtitle } from "@/components/header-context";
|
||||||
|
|
||||||
/**
|
/**
|
||||||
* Ezer Mishpati navigation shell.
|
* Ezer Mishpati navigation shell — two-row header.
|
||||||
*
|
*
|
||||||
* Editorial/judicial aesthetic:
|
* Row 1 (brand): logo + dynamic context subtitle · global search · agent boards
|
||||||
* - Navy header with a gold hairline rule (border-b-3)
|
* Row 2 (nav): work group · knowledge group · admin dropdown
|
||||||
* - Parchment/cream body background (set on <body> via globals.css)
|
*
|
||||||
|
* Editorial/judicial aesthetic preserved:
|
||||||
|
* - Navy background with a gold hairline rule (border-b-3)
|
||||||
|
* - Parchment text, gold accents on hover/active
|
||||||
* - Hebrew RTL throughout (set on <html> in layout.tsx)
|
* - Hebrew RTL throughout (set on <html> in layout.tsx)
|
||||||
*
|
* - Active item gets `aria-current="page"` and a gold underline anchored
|
||||||
* Nav items pick up an `aria-current="page"` and a gold underline when
|
* to the bottom border, so screen readers announce the section and
|
||||||
* the current route matches, so screen readers announce the active
|
* sighted users see where they are.
|
||||||
* section and sighted users can see where they are.
|
|
||||||
*/
|
*/
|
||||||
|
|
||||||
type NavItem = {
|
type NavItem = { href: string; label: string };
|
||||||
href: string;
|
type NavGroup = { id: string; items: NavItem[] };
|
||||||
label: string;
|
|
||||||
};
|
|
||||||
|
|
||||||
const NAV_ITEMS: NavItem[] = [
|
const NAV_GROUPS: NavGroup[] = [
|
||||||
|
{
|
||||||
|
id: "work",
|
||||||
|
items: [
|
||||||
{ href: "/", label: "בית" },
|
{ href: "/", label: "בית" },
|
||||||
{ href: "/archive", label: "ארכיון" },
|
{ href: "/archive", label: "ארכיון" },
|
||||||
|
],
|
||||||
|
},
|
||||||
|
{
|
||||||
|
id: "knowledge",
|
||||||
|
items: [
|
||||||
|
{ href: "/precedents", label: "ספריית פסיקה" },
|
||||||
{ href: "/training", label: "אימון סגנון" },
|
{ href: "/training", label: "אימון סגנון" },
|
||||||
{ href: "/methodology", label: "מתודולוגיה" },
|
{ href: "/methodology", label: "מתודולוגיה" },
|
||||||
|
],
|
||||||
|
},
|
||||||
|
];
|
||||||
|
|
||||||
|
const ADMIN_ITEMS: NavItem[] = [
|
||||||
{ href: "/skills", label: "מיומנויות" },
|
{ href: "/skills", label: "מיומנויות" },
|
||||||
{ href: "/diagnostics", label: "אבחון" },
|
{ href: "/diagnostics", label: "אבחון" },
|
||||||
{ href: "/settings", label: "הגדרות" },
|
{ href: "/settings", label: "הגדרות" },
|
||||||
];
|
];
|
||||||
|
|
||||||
|
type AgentBoard = { prefix: string; label: string; hint: string };
|
||||||
|
|
||||||
|
const AGENT_BOARDS: AgentBoard[] = [
|
||||||
|
{ prefix: "CMP", label: "רישוי ובניה", hint: "תיקי 1xxx" },
|
||||||
|
{ prefix: "CMPA", label: "היטלי השבחה", hint: "תיקי 8xxx / 9xxx" },
|
||||||
|
];
|
||||||
|
|
||||||
|
const PAPERCLIP_BASE = "https://pc.nautilus.marcusgroup.org";
|
||||||
|
|
||||||
function isActive(pathname: string, href: string): boolean {
|
function isActive(pathname: string, href: string): boolean {
|
||||||
if (href === "/") return pathname === "/";
|
if (href === "/") return pathname === "/";
|
||||||
return pathname === href || pathname.startsWith(`${href}/`);
|
return pathname === href || pathname.startsWith(`${href}/`);
|
||||||
@@ -39,56 +74,148 @@ function isActive(pathname: string, href: string): boolean {
|
|||||||
|
|
||||||
export function AppShell({ children }: { children: ReactNode }) {
|
export function AppShell({ children }: { children: ReactNode }) {
|
||||||
const pathname = usePathname();
|
const pathname = usePathname();
|
||||||
|
const subtitle = headerSubtitle(pathname);
|
||||||
|
const adminActive = ADMIN_ITEMS.some((i) => isActive(pathname, i.href));
|
||||||
|
|
||||||
return (
|
return (
|
||||||
<>
|
<>
|
||||||
<header
|
<header
|
||||||
className="
|
className="
|
||||||
relative z-10 flex items-center gap-4
|
relative z-10 flex flex-col
|
||||||
px-10 py-[18px]
|
|
||||||
bg-navy text-parchment
|
bg-navy text-parchment
|
||||||
border-b-[3px] border-gold
|
border-b-[3px] border-gold
|
||||||
shadow-md
|
shadow-md
|
||||||
"
|
"
|
||||||
>
|
>
|
||||||
<Link href="/" className="flex items-baseline gap-3 hover:text-parchment">
|
{/* ─── Row 1 — brand bar (3-column grid) ─── */}
|
||||||
<span className="font-display text-[1.45rem] font-bold tracking-[0.02em] text-parchment">
|
{/* Side columns flex 1fr each so the search column stays centered on
|
||||||
|
the viewport regardless of how wide the brand or agent labels grow. */}
|
||||||
|
<div className="grid grid-cols-[minmax(0,1fr)_minmax(280px,460px)_minmax(0,1fr)] items-center gap-4 px-10 pt-[14px] pb-2">
|
||||||
|
<Link
|
||||||
|
href="/"
|
||||||
|
className="flex items-baseline gap-3 hover:text-parchment min-w-0 justify-self-start"
|
||||||
|
>
|
||||||
|
<span className="font-display text-[1.45rem] font-bold tracking-[0.02em] text-parchment whitespace-nowrap">
|
||||||
עוזר משפטי
|
עוזר משפטי
|
||||||
</span>
|
</span>
|
||||||
<span className="text-gold-soft text-sm font-medium">ניהול תיקים</span>
|
<span
|
||||||
|
className="text-gold-soft text-sm font-medium truncate"
|
||||||
|
aria-live="polite"
|
||||||
|
>
|
||||||
|
{subtitle}
|
||||||
|
</span>
|
||||||
</Link>
|
</Link>
|
||||||
|
|
||||||
<nav
|
<div className="w-full justify-self-center">
|
||||||
className="me-auto flex items-center gap-1"
|
<GlobalSearch />
|
||||||
aria-label="ניווט ראשי"
|
</div>
|
||||||
|
|
||||||
|
<DropdownMenu>
|
||||||
|
<DropdownMenuTrigger
|
||||||
|
className="
|
||||||
|
justify-self-end flex items-baseline gap-2 px-3 py-1.5 rounded
|
||||||
|
transition-colors outline-none
|
||||||
|
text-parchment/80 hover:text-parchment hover:bg-navy-soft/60
|
||||||
|
focus-visible:ring-2 focus-visible:ring-gold/60
|
||||||
|
data-[state=open]:bg-navy-soft/80 data-[state=open]:text-parchment
|
||||||
|
"
|
||||||
|
aria-label="ניהול סוכנים — בחר ועדה"
|
||||||
>
|
>
|
||||||
{NAV_ITEMS.map((item) => {
|
<span className="font-display text-[1.45rem] font-bold tracking-[0.02em] text-parchment whitespace-nowrap">
|
||||||
const active = isActive(pathname, item.href);
|
ניהול סוכנים
|
||||||
return (
|
</span>
|
||||||
<Link
|
<ChevronDown className="size-4 self-center text-gold-soft" aria-hidden="true" />
|
||||||
key={item.href}
|
</DropdownMenuTrigger>
|
||||||
href={item.href}
|
|
||||||
aria-current={active ? "page" : undefined}
|
<DropdownMenuContent align="end" sideOffset={10} className="min-w-[240px]">
|
||||||
|
<DropdownMenuLabel className="text-xs text-muted-foreground text-center">
|
||||||
|
Paperclip פתח דאשבורד
|
||||||
|
</DropdownMenuLabel>
|
||||||
|
<DropdownMenuSeparator />
|
||||||
|
{AGENT_BOARDS.map((board) => (
|
||||||
|
<DropdownMenuItem key={board.prefix} asChild>
|
||||||
|
<a
|
||||||
|
href={`${PAPERCLIP_BASE}/${board.prefix}/dashboard`}
|
||||||
|
target="_blank"
|
||||||
|
rel="noreferrer noopener"
|
||||||
|
className="flex flex-col gap-0.5 cursor-pointer py-1.5"
|
||||||
|
>
|
||||||
|
<span className="font-medium whitespace-nowrap">{board.label}</span>
|
||||||
|
<span className="text-xs text-muted-foreground tracking-wide whitespace-nowrap">
|
||||||
|
<span className="font-mono">{board.prefix}</span> · {board.hint}
|
||||||
|
</span>
|
||||||
|
</a>
|
||||||
|
</DropdownMenuItem>
|
||||||
|
))}
|
||||||
|
</DropdownMenuContent>
|
||||||
|
</DropdownMenu>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
{/* ─── Row 2 — section nav ─── */}
|
||||||
|
<div className="flex items-center gap-3 px-10 pt-1 pb-[18px]">
|
||||||
|
<nav className="flex items-center gap-3" aria-label="ניווט ראשי">
|
||||||
|
{NAV_GROUPS.map((group, idx) => (
|
||||||
|
<div key={group.id} className="flex items-center">
|
||||||
|
{idx > 0 && (
|
||||||
|
<span
|
||||||
|
className="mx-2 h-4 w-px bg-parchment/20"
|
||||||
|
aria-hidden="true"
|
||||||
|
/>
|
||||||
|
)}
|
||||||
|
<div className="flex items-center gap-1">
|
||||||
|
{group.items.map((item) => (
|
||||||
|
<NavLink key={item.href} item={item} active={isActive(pathname, item.href)} />
|
||||||
|
))}
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
))}
|
||||||
|
</nav>
|
||||||
|
|
||||||
|
<DropdownMenu>
|
||||||
|
<DropdownMenuTrigger
|
||||||
className={`
|
className={`
|
||||||
relative px-3 py-1.5 rounded text-sm transition-colors
|
relative ms-auto shrink-0 flex items-center gap-1.5
|
||||||
${
|
px-3 py-1.5 rounded text-sm transition-colors outline-none
|
||||||
active
|
focus-visible:ring-2 focus-visible:ring-gold/60
|
||||||
|
${adminActive
|
||||||
? "text-parchment font-semibold bg-navy-soft/80"
|
? "text-parchment font-semibold bg-navy-soft/80"
|
||||||
: "text-parchment/80 hover:text-parchment hover:bg-navy-soft/60"
|
: "text-parchment/80 hover:text-parchment hover:bg-navy-soft/60"}
|
||||||
}
|
data-[state=open]:bg-navy-soft/80 data-[state=open]:text-parchment
|
||||||
`}
|
`}
|
||||||
|
aria-label="הגדרות מערכת"
|
||||||
>
|
>
|
||||||
{item.label}
|
<Settings className="size-4" aria-hidden="true" />
|
||||||
{active && (
|
<ChevronDown className="size-3" aria-hidden="true" />
|
||||||
|
{adminActive && (
|
||||||
<span
|
<span
|
||||||
className="absolute -bottom-[19px] inset-x-2 h-[2px] bg-gold"
|
className="absolute -bottom-[19px] inset-x-2 h-[2px] bg-gold"
|
||||||
aria-hidden="true"
|
aria-hidden="true"
|
||||||
/>
|
/>
|
||||||
)}
|
)}
|
||||||
|
</DropdownMenuTrigger>
|
||||||
|
|
||||||
|
<DropdownMenuContent align="end" sideOffset={10} className="min-w-[180px]">
|
||||||
|
<DropdownMenuLabel className="text-xs text-muted-foreground">
|
||||||
|
מערכת
|
||||||
|
</DropdownMenuLabel>
|
||||||
|
<DropdownMenuSeparator />
|
||||||
|
{ADMIN_ITEMS.map((item) => {
|
||||||
|
const active = isActive(pathname, item.href);
|
||||||
|
return (
|
||||||
|
<DropdownMenuItem key={item.href} asChild>
|
||||||
|
<Link
|
||||||
|
href={item.href}
|
||||||
|
aria-current={active ? "page" : undefined}
|
||||||
|
className={`cursor-pointer ${active ? "font-semibold" : ""}`}
|
||||||
|
>
|
||||||
|
{item.label}
|
||||||
</Link>
|
</Link>
|
||||||
|
</DropdownMenuItem>
|
||||||
);
|
);
|
||||||
})}
|
})}
|
||||||
</nav>
|
</DropdownMenuContent>
|
||||||
|
</DropdownMenu>
|
||||||
|
</div>
|
||||||
</header>
|
</header>
|
||||||
|
|
||||||
<main
|
<main
|
||||||
@@ -100,3 +227,26 @@ export function AppShell({ children }: { children: ReactNode }) {
|
|||||||
</>
|
</>
|
||||||
);
|
);
|
||||||
}
|
}
|
||||||
|
|
||||||
|
function NavLink({ item, active }: { item: NavItem; active: boolean }) {
|
||||||
|
return (
|
||||||
|
<Link
|
||||||
|
href={item.href}
|
||||||
|
aria-current={active ? "page" : undefined}
|
||||||
|
className={`
|
||||||
|
relative px-3 py-1.5 rounded text-sm transition-colors
|
||||||
|
${active
|
||||||
|
? "text-parchment font-semibold bg-navy-soft/80"
|
||||||
|
: "text-parchment/80 hover:text-parchment hover:bg-navy-soft/60"}
|
||||||
|
`}
|
||||||
|
>
|
||||||
|
{item.label}
|
||||||
|
{active && (
|
||||||
|
<span
|
||||||
|
className="absolute -bottom-[19px] inset-x-2 h-[2px] bg-gold"
|
||||||
|
aria-hidden="true"
|
||||||
|
/>
|
||||||
|
)}
|
||||||
|
</Link>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|||||||
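The `isActive` helper in this file drives both `aria-current` on each nav link and the `adminActive` flag on the settings dropdown. A minimal standalone sketch of that matching rule follows, so the edge cases can be checked outside Next.js. The `anyActive` helper, the English labels, and the sample paths are illustrative assumptions and are not part of the diff.

```typescript
// Standalone copy of the route-matching rule shown in the diff.
// No Next.js imports: the pathname is passed in instead of read from usePathname().
type NavItem = { href: string; label: string };

function isActive(pathname: string, href: string): boolean {
  // The home link matches only the exact root; otherwise every path
  // would count as being "under /".
  if (href === "/") return pathname === "/";
  // Exact match, or a nested route below href (e.g. "/settings/mcp").
  return pathname === href || pathname.startsWith(`${href}/`);
}

// Hypothetical helper mirroring how `adminActive` is derived in AppShell.
function anyActive(pathname: string, items: NavItem[]): boolean {
  return items.some((i) => isActive(pathname, i.href));
}

// Same hrefs as ADMIN_ITEMS in the diff; labels translated for the sketch.
const ADMIN_ITEMS: NavItem[] = [
  { href: "/skills", label: "Skills" },
  { href: "/diagnostics", label: "Diagnostics" },
  { href: "/settings", label: "Settings" },
];

console.log(anyActive("/settings/mcp", ADMIN_ITEMS));
```

The trailing slash in the prefix check is what keeps a sibling route such as `/settingsx` from lighting up the `/settings` entry.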
55
web-ui/src/components/cases/appeal-type-bars.tsx
Normal file
@@ -0,0 +1,55 @@
|
|||||||
"use client";

import { deriveSubtype } from "@/lib/practice-area";
import type { AppealSubtype } from "@/lib/practice-area";
import type { Case } from "@/lib/api/cases";

type Bucket = { key: AppealSubtype; label: string; color: string };

const BUCKETS: Bucket[] = [
  { key: "building_permit", label: "רישוי ובנייה", color: "var(--color-info)" },
  { key: "betterment_levy", label: "היטל השבחה", color: "var(--color-gold)" },
  { key: "compensation_197", label: "פיצויים (ס׳ 197)", color: "var(--color-warn)" },
];

export function subtypeOf(c: Case): AppealSubtype {
  return c.appeal_subtype && c.appeal_subtype !== "unknown"
    ? c.appeal_subtype
    : deriveSubtype(c.case_number);
}

export function AppealTypeBars({ cases }: { cases?: Case[] }) {
  const counts: Record<AppealSubtype, number> = {
    building_permit: 0,
    betterment_levy: 0,
    compensation_197: 0,
    unknown: 0,
  };
  (cases ?? []).forEach((c) => {
    counts[subtypeOf(c)] += 1;
  });
  const max = Math.max(1, ...BUCKETS.map((b) => counts[b.key]));

  return (
    <ul className="flex flex-col gap-3">
      {BUCKETS.map((b) => {
        const n = counts[b.key];
        const widthPct = (n / max) * 100;
        return (
          <li key={b.key} className="space-y-1.5">
            <div className="flex items-baseline justify-between gap-2 text-sm">
              <span className="text-ink-soft truncate">{b.label}</span>
              <span className="text-ink font-semibold tabular-nums">{n}</span>
            </div>
            <div className="h-2 rounded-full bg-rule-soft/60 overflow-hidden">
              <div
                className="h-full rounded-full transition-[width] duration-500"
                style={{ width: `${widthPct}%`, background: b.color }}
              />
            </div>
          </li>
        );
      })}
    </ul>
  );
}
|
||||||
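The bar chart in `appeal-type-bars.tsx` counts cases per subtype and scales each bar against the largest bucket, with `Math.max(1, ...)` guarding the empty case. The sketch below isolates that arithmetic without React. `CaseLike`, `countBySubtype`, and `widthPercents` are hypothetical names, and the fallback deriver is passed in as a parameter because the real `deriveSubtype` is not shown in this diff.

```typescript
type AppealSubtype =
  | "building_permit"
  | "betterment_levy"
  | "compensation_197"
  | "unknown";

// Assumed minimal shape; the real Case type lives in "@/lib/api/cases".
type CaseLike = { case_number: string; appeal_subtype?: AppealSubtype };

// The three buckets the component renders ("unknown" is counted but not drawn).
const SHOWN = ["building_permit", "betterment_levy", "compensation_197"] as const;

function countBySubtype(
  cases: CaseLike[],
  derive: (caseNumber: string) => AppealSubtype, // stand-in for deriveSubtype
): Record<AppealSubtype, number> {
  const counts: Record<AppealSubtype, number> = {
    building_permit: 0,
    betterment_levy: 0,
    compensation_197: 0,
    unknown: 0,
  };
  for (const c of cases) {
    // Prefer the stored subtype; fall back to deriving it from the case number.
    const s =
      c.appeal_subtype && c.appeal_subtype !== "unknown"
        ? c.appeal_subtype
        : derive(c.case_number);
    counts[s] += 1;
  }
  return counts;
}

function widthPercents(counts: Record<AppealSubtype, number>): number[] {
  // Clamp the divisor to at least 1 so an empty list yields 0% bars, not NaN.
  const max = Math.max(1, ...SHOWN.map((k) => counts[k]));
  return SHOWN.map((k) => (counts[k] / max) * 100);
}
```

Scaling against the largest bucket rather than the total means the longest bar always fills the track, which suits a ranking view more than a share-of-total view.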
@@ -4,6 +4,7 @@ import { Badge } from "@/components/ui/badge";
|
|||||||
import { StatusBadge } from "@/components/cases/status-badge";
|
import { StatusBadge } from "@/components/cases/status-badge";
|
||||||
import { SyncIndicator } from "@/components/cases/sync-indicator";
|
import { SyncIndicator } from "@/components/cases/sync-indicator";
|
||||||
import { CaseArchiveAction } from "@/components/cases/case-archive-action";
|
import { CaseArchiveAction } from "@/components/cases/case-archive-action";
|
||||||
|
import { CreateRepoButton } from "@/components/cases/create-repo-button";
|
||||||
import {
|
import {
|
||||||
PRACTICE_AREA_LABELS,
|
PRACTICE_AREA_LABELS,
|
||||||
APPEAL_SUBTYPE_LABELS,
|
APPEAL_SUBTYPE_LABELS,
|
||||||
@@ -67,6 +68,7 @@ export function CaseHeader({ data }: { data?: CaseDetail }) {
|
|||||||
archivedAt={data.archived_at}
|
archivedAt={data.archived_at}
|
||||||
/>
|
/>
|
||||||
)}
|
)}
|
||||||
|
<CreateRepoButton data={data} />
|
||||||
</div>
|
</div>
|
||||||
<h1 className="text-navy text-xl font-bold leading-snug max-w-2xl mb-0">
|
<h1 className="text-navy text-xl font-bold leading-snug max-w-2xl mb-0">
|
||||||
{data?.title ?? "טוען…"}
|
{data?.title ?? "טוען…"}
|
||||||
|
|||||||
@@ -17,7 +17,6 @@ import {
|
|||||||
import { Input } from "@/components/ui/input";
|
import { Input } from "@/components/ui/input";
|
||||||
import { Skeleton } from "@/components/ui/skeleton";
|
import { Skeleton } from "@/components/ui/skeleton";
|
||||||
import { StatusBadge } from "@/components/cases/status-badge";
|
import { StatusBadge } from "@/components/cases/status-badge";
|
||||||
import { APPEAL_SUBTYPE_LABELS } from "@/lib/practice-area";
|
|
||||||
import type { Case } from "@/lib/api/cases";
|
import type { Case } from "@/lib/api/cases";
|
||||||
|
|
||||||
function formatDate(iso?: string) {
|
function formatDate(iso?: string) {
|
||||||
@@ -60,15 +59,6 @@ const columns: ColumnDef<Case>[] = [
|
|||||||
header: "סטטוס",
|
header: "סטטוס",
|
||||||
cell: ({ row }) => <StatusBadge status={row.original.status} />,
|
cell: ({ row }) => <StatusBadge status={row.original.status} />,
|
||||||
},
|
},
|
||||||
{
|
|
||||||
accessorKey: "appeal_subtype",
|
|
||||||
header: "תחום",
|
|
||||||
cell: ({ row }) => {
|
|
||||||
const s = row.original.appeal_subtype;
|
|
||||||
if (!s || s === "unknown") return <span className="text-ink-muted">—</span>;
|
|
||||||
return <span className="text-ink-soft text-sm">{APPEAL_SUBTYPE_LABELS[s]}</span>;
|
|
||||||
},
|
|
||||||
},
|
|
||||||
{
|
{
|
||||||
accessorKey: "document_count",
|
accessorKey: "document_count",
|
||||||
header: "מסמכים",
|
header: "מסמכים",
|
||||||
@@ -91,10 +81,14 @@ export function CasesTable({
|
|||||||
cases,
|
cases,
|
||||||
loading,
|
loading,
|
||||||
error,
|
error,
|
||||||
|
emptyText = "עדיין אין תיקי ערר",
|
||||||
|
searchPlaceholder = "חיפוש לפי מס׳ ערר או כותרת…",
|
||||||
}: {
|
}: {
|
||||||
cases?: Case[];
|
cases?: Case[];
|
||||||
loading?: boolean;
|
loading?: boolean;
|
||||||
error?: Error | null;
|
error?: Error | null;
|
||||||
|
emptyText?: string;
|
||||||
|
searchPlaceholder?: string;
|
||||||
}) {
|
}) {
|
||||||
const [sorting, setSorting] = useState<SortingState>([
|
const [sorting, setSorting] = useState<SortingState>([
|
||||||
{ id: "updated_at", desc: true },
|
{ id: "updated_at", desc: true },
|
||||||
@@ -128,7 +122,7 @@ export function CasesTable({
|
|||||||
<Input
|
<Input
|
||||||
value={globalFilter}
|
value={globalFilter}
|
||||||
onChange={(e) => setGlobalFilter(e.target.value)}
|
onChange={(e) => setGlobalFilter(e.target.value)}
|
||||||
placeholder="חיפוש לפי מס׳ ערר או כותרת…"
|
placeholder={searchPlaceholder}
|
||||||
className="max-w-sm bg-surface"
|
className="max-w-sm bg-surface"
|
||||||
dir="rtl"
|
dir="rtl"
|
||||||
/>
|
/>
|
||||||
@@ -176,7 +170,7 @@ export function CasesTable({
|
|||||||
<TableRow>
|
<TableRow>
|
||||||
<TableCell colSpan={columns.length} className="text-center text-ink-muted py-12">
|
<TableCell colSpan={columns.length} className="text-center text-ink-muted py-12">
|
||||||
<div className="text-gold text-2xl mb-2" aria-hidden>❦</div>
|
<div className="text-gold text-2xl mb-2" aria-hidden>❦</div>
|
||||||
{globalFilter ? "אין תיקים תואמים לחיפוש" : "עדיין אין תיקי ערר"}
|
{globalFilter ? "אין תיקים תואמים לחיפוש" : emptyText}
|
||||||
</TableCell>
|
</TableCell>
|
||||||
</TableRow>
|
</TableRow>
|
||||||
) : (
|
) : (
|
||||||
|
|||||||
59
web-ui/src/components/cases/create-repo-button.tsx
Normal file
@@ -0,0 +1,59 @@
|
|||||||
"use client";

import { toast } from "sonner";
import { Cloud, Loader2 } from "lucide-react";
import { Button } from "@/components/ui/button";
import {
  useCreateGiteaRepo,
  useGitStatus,
  type CaseDetail,
} from "@/lib/api/cases";

export function CreateRepoButton({ data }: { data?: CaseDetail }) {
  const { data: gitStatus } = useGitStatus(data?.case_number);
  const createRepo = useCreateGiteaRepo(data?.case_number);

  if (!data?.case_number || !gitStatus) return null;

  /* Show only when something is actually missing — repo, remote, or
   * unpushed commits. If everything is in sync, the SyncIndicator already
   * communicates that. */
  const needsRepo = gitStatus.error === "no_repo" || !gitStatus.has_remote;
  if (!needsRepo) return null;

  function handleClick() {
    if (!data) return;
    createRepo.mutate(
      { title: data.title ?? "", description: data.subject ?? "" },
      {
        onSuccess: (res) =>
          toast.success(
            res.pushed
              ? `הריפו נוצר ב-Gitea: ${res.repo_url}`
              : `הריפו נוצר אך ה-push נכשל. בדוק את הלוגים של ה-backend.`,
          ),
        onError: (err) =>
          toast.error(
            err instanceof Error ? err.message : "יצירת הריפו נכשלה",
          ),
      },
    );
  }

  return (
    <Button
      variant="outline"
      size="sm"
      onClick={handleClick}
      disabled={createRepo.isPending}
      title="יצירת ריפו ב-Gitea וקישור התיק המקומי אליו"
    >
      {createRepo.isPending ? (
        <Loader2 className="me-1 h-3.5 w-3.5 animate-spin" />
      ) : (
        <Cloud className="me-1 h-3.5 w-3.5" />
      )}
      צור ריפו ב-Gitea
    </Button>
  );
}
|
||||||
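`CreateRepoButton` renders only when the git status says a repo or remote is missing. A small sketch of that visibility rule as a pure function is given below. `GitStatusLike` and the function name `needsRepo` are assumptions made for illustration; only the two fields the diff actually reads are modelled. Note that the code comment in the diff also mentions unpushed commits, but the condition visible in this hunk checks only `error` and `has_remote`.

```typescript
// Assumed subset of the payload returned by useGitStatus; the real type
// is defined in "@/lib/api/cases" and is not shown in this diff.
type GitStatusLike = { error?: string; has_remote?: boolean };

function needsRepo(status: GitStatusLike | undefined): boolean {
  // Status not loaded yet: the component returns null, so report false.
  if (!status) return false;
  // Same condition as in the diff: no local repo, or no remote configured.
  return status.error === "no_repo" || !status.has_remote;
}

console.log(needsRepo({ has_remote: true }));
```

Keeping the rule in a pure function like this would make it straightforward to unit-test separately from the mutation and toast wiring.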
@@ -359,7 +359,7 @@ export function DocumentsPanel({
|
|||||||
</div>
|
</div>
|
||||||
)}
|
)}
|
||||||
|
|
||||||
<div className="max-h-[70vh] overflow-y-auto" dir="rtl">
|
<div className="max-h-[70vh] overflow-y-auto overflow-x-hidden" dir="rtl">
|
||||||
<ul className="divide-y divide-rule" dir="rtl">
|
<ul className="divide-y divide-rule" dir="rtl">
|
||||||
{sorted.map((doc) => (
|
{sorted.map((doc) => (
|
||||||
<DocumentRow key={doc.id} doc={doc} caseNumber={caseNumber} />
|
<DocumentRow key={doc.id} doc={doc} caseNumber={caseNumber} />
|
||||||
|
|||||||
@@ -49,14 +49,14 @@ export function KPICards({ cases, loading }: { cases?: Case[]; loading?: boolean
|
|||||||
${TONE_STYLES[b.tone]}
|
${TONE_STYLES[b.tone]}
|
||||||
`}
|
`}
|
||||||
>
|
>
|
||||||
<CardContent className="px-5 py-4 flex flex-col gap-1">
|
<CardContent className="px-5 py-3 flex flex-col gap-0.5">
|
||||||
<span className="text-[0.72rem] uppercase tracking-[0.08em] text-ink-muted">
|
<span className="text-[0.72rem] uppercase tracking-[0.08em] text-ink-muted">
|
||||||
{b.label}
|
{b.label}
|
||||||
</span>
|
</span>
|
||||||
<span className="font-display text-[2.3rem] font-black leading-none">
|
<span className="font-display text-[1.7rem] font-black leading-none">
|
||||||
{loading ? "—" : b.value}
|
{loading ? "—" : b.value}
|
||||||
</span>
|
</span>
|
||||||
<span className="text-[0.78rem] text-ink-muted mt-0.5">{b.caption}</span>
|
<span className="text-[0.74rem] text-ink-muted">{b.caption}</span>
|
||||||
</CardContent>
|
</CardContent>
|
||||||
</Card>
|
</Card>
|
||||||
))}
|
))}
|
||||||
|
|||||||
@@ -55,29 +55,32 @@ export function StatusDonut({ cases }: { cases?: Case[] }) {
|
|||||||
.join(", ")})`;
|
.join(", ")})`;
|
||||||
|
|
||||||
return (
|
return (
|
||||||
<div className="flex items-center gap-6">
|
<div className="flex flex-col items-center gap-5">
|
||||||
<div
|
<div
|
||||||
className="relative w-[140px] h-[140px] rounded-full shadow-sm"
|
className="relative w-[150px] h-[150px] rounded-full shadow-sm"
|
||||||
style={{ background }}
|
style={{ background }}
|
||||||
aria-label="פיזור תיקים לפי סטטוס"
|
aria-label="פיזור תיקים לפי סטטוס"
|
||||||
>
|
>
|
||||||
<div className="absolute inset-[18px] bg-surface rounded-full flex flex-col items-center justify-center">
|
<div className="absolute inset-[20px] bg-surface rounded-full flex flex-col items-center justify-center">
|
||||||
<span className="font-display text-2xl font-black text-navy leading-none">
|
<span className="font-display text-3xl font-black text-navy leading-none tabular-nums">
|
||||||
{total}
|
{total}
|
||||||
</span>
|
</span>
|
||||||
<span className="text-[0.7rem] text-ink-muted mt-1">תיקים</span>
|
<span className="text-[0.7rem] text-ink-muted mt-1 tracking-wide">תיקים</span>
|
||||||
</div>
|
</div>
|
||||||
</div>
|
</div>
|
||||||
|
|
||||||
<ul className="flex flex-col gap-1.5 text-sm">
|
<ul className="grid grid-cols-1 gap-1.5 text-sm w-full">
|
||||||
{(Object.keys(GROUP_META) as GroupKey[]).map((k) => (
|
{(Object.keys(GROUP_META) as GroupKey[]).map((k) => (
|
||||||
<li key={k} className="flex items-center gap-2">
|
<li
|
||||||
|
key={k}
|
||||||
|
className="grid grid-cols-[auto_1fr_auto] items-center gap-2 py-1 px-2 rounded-md hover:bg-rule-soft/40 transition-colors"
|
||||||
|
>
|
||||||
<span
|
<span
|
||||||
className="inline-block w-2.5 h-2.5 rounded-full"
|
className="inline-block w-2.5 h-2.5 rounded-full shrink-0"
|
||||||
style={{ background: GROUP_META[k].color }}
|
style={{ background: GROUP_META[k].color }}
|
||||||
/>
|
/>
|
||||||
<span className="text-ink-soft">{GROUP_META[k].label}</span>
|
<span className="text-ink-soft truncate">{GROUP_META[k].label}</span>
|
||||||
<span className="text-ink-muted tabular-nums me-auto ms-1">
|
<span className="text-ink font-semibold tabular-nums">
|
||||||
{counts[k]}
|
{counts[k]}
|
||||||
</span>
|
</span>
|
||||||
</li>
|
</li>
|
||||||
|
|||||||
@@ -43,6 +43,7 @@ function statusLabel(event: ProgressEvent | null): string {
|
|||||||
if (event.status === "processing")
|
if (event.status === "processing")
|
||||||
return event.step ? `בעיבוד · ${event.step}` : "בעיבוד";
|
return event.step ? `בעיבוד · ${event.step}` : "בעיבוד";
|
||||||
if (event.status === "completed") return "הושלם";
|
if (event.status === "completed") return "הושלם";
|
||||||
|
if (event.status === "unknown") return "הושלם";
|
||||||
if (event.status === "failed") return event.error ?? "נכשל";
|
if (event.status === "failed") return event.error ?? "נכשל";
|
||||||
return event.status;
|
return event.status;
|
||||||
}
|
}
|
||||||
@@ -52,15 +53,16 @@ function progressPercent(event: ProgressEvent | null): number {
|
|||||||
if (event.status === "queued") return 10;
|
if (event.status === "queued") return 10;
|
||||||
if (event.status === "processing") return 55;
|
if (event.status === "processing") return 55;
|
||||||
if (event.status === "completed") return 100;
|
if (event.status === "completed") return 100;
|
||||||
|
if (event.status === "unknown") return 100;
|
||||||
if (event.status === "failed") return 100;
|
if (event.status === "failed") return 100;
|
||||||
return 25;
|
return 25;
|
||||||
}
|
}
|
||||||
|
|
||||||
function UploadRowView({ row }: { row: UploadRow }) {
|
function UploadRowView({ row, caseNumber }: { row: UploadRow; caseNumber: string }) {
|
||||||
const progress = useProgress(row.taskId);
|
const progress = useProgress(row.taskId, caseNumber);
|
||||||
const pct = row.error ? 100 : progressPercent(progress);
|
const pct = row.error ? 100 : progressPercent(progress);
|
||||||
const failed = row.error || progress?.status === "failed";
|
const failed = row.error || progress?.status === "failed";
|
||||||
const done = progress?.status === "completed";
|
const done = progress?.status === "completed" || progress?.status === "unknown";
|
||||||
|
|
||||||
return (
|
return (
|
||||||
<li className="rounded-lg border border-rule bg-parchment/40 px-4 py-3 space-y-2">
|
<li className="rounded-lg border border-rule bg-parchment/40 px-4 py-3 space-y-2">
|
||||||
@@ -197,7 +199,7 @@ export function UploadSheet({ caseNumber }: { caseNumber: string }) {
|
|||||||
{rows.length > 0 && (
|
{rows.length > 0 && (
|
||||||
<ul className="space-y-2">
|
<ul className="space-y-2">
|
||||||
{rows.map((row) => (
|
{rows.map((row) => (
|
||||||
<UploadRowView key={row.id} row={row} />
|
<UploadRowView key={row.id} row={row} caseNumber={caseNumber} />
|
||||||
))}
|
))}
|
||||||
</ul>
|
</ul>
|
||||||
)}
|
)}
|
||||||
|
|||||||
340
web-ui/src/components/global-search.tsx
Normal file
@@ -0,0 +1,340 @@
|
|||||||
|
"use client";
|
||||||
|
|
||||||
|
import { useEffect, useId, useMemo, useRef, useState } from "react";
|
||||||
|
import Link from "next/link";
|
||||||
|
import { Search, FileText, BookOpen, FolderOpen, Loader2 } from "lucide-react";
|
||||||
|
|
||||||
|
import {
|
||||||
|
useGlobalSearch,
|
||||||
|
type CaseHit,
|
||||||
|
type DocumentHit,
|
||||||
|
} from "@/lib/api/global-search";
|
||||||
|
import type { SearchHit as PrecedentHit } from "@/lib/api/precedent-library";
|
||||||
|
|
||||||
|
const DEBOUNCE_MS = 250;
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Header global search — debounced input + per-source result panel.
|
||||||
|
*
|
||||||
|
* Three independent fan-out queries (cases / precedent / documents) each
|
||||||
|
* render their own section with its own loading/empty state, so the fast
|
||||||
|
* SQL source paints immediately while the slower vector sources stream in.
|
||||||
|
*/
|
||||||
|
export function GlobalSearch() {
|
||||||
|
const [raw, setRaw] = useState("");
|
||||||
|
const [debounced, setDebounced] = useState("");
|
||||||
|
const [open, setOpen] = useState(false);
|
||||||
|
const inputRef = useRef<HTMLInputElement>(null);
|
||||||
|
const wrapperRef = useRef<HTMLDivElement>(null);
|
||||||
|
const listboxId = useId();
|
||||||
|
|
||||||
|
// Debounce.
|
||||||
|
useEffect(() => {
|
||||||
|
const t = setTimeout(() => setDebounced(raw), DEBOUNCE_MS);
|
||||||
|
return () => clearTimeout(t);
|
||||||
|
}, [raw]);
|
||||||
|
|
||||||
|
// Cmd/Ctrl+K — focus + select.
|
||||||
|
useEffect(() => {
|
||||||
|
const onKey = (e: KeyboardEvent) => {
|
||||||
|
const isModK = (e.metaKey || e.ctrlKey) && e.key.toLowerCase() === "k";
|
||||||
|
if (isModK) {
|
||||||
|
e.preventDefault();
|
||||||
|
inputRef.current?.focus();
|
||||||
|
inputRef.current?.select();
|
||||||
|
}
|
||||||
|
};
|
||||||
|
window.addEventListener("keydown", onKey);
|
||||||
|
return () => window.removeEventListener("keydown", onKey);
|
||||||
|
}, []);
|
||||||
|
|
||||||
|
// Click-outside.
|
||||||
|
useEffect(() => {
|
||||||
|
if (!open) return;
|
||||||
|
const onClick = (e: MouseEvent) => {
|
||||||
|
if (!wrapperRef.current?.contains(e.target as Node)) setOpen(false);
|
||||||
|
};
|
||||||
|
window.addEventListener("mousedown", onClick);
|
||||||
|
return () => window.removeEventListener("mousedown", onClick);
|
||||||
|
}, [open]);
|
||||||
|
|
||||||
|
// Route changes don't need a separate listener: clicking any link
|
||||||
|
// (nav, search result, in-page) triggers mousedown outside wrapperRef,
|
||||||
|
// and the click-outside effect closes the panel.
|
||||||
|
|
||||||
|
const { cases, precedent, documents, isQueryReady, anyLoading } =
|
||||||
|
useGlobalSearch(debounced);
|
||||||
|
|
||||||
|
const showPanel = open && isQueryReady;
|
||||||
|
|
||||||
|
const hasAnyResults = useMemo(() => {
|
||||||
|
return (
|
||||||
|
(cases.data?.items.length ?? 0) > 0 ||
|
||||||
|
(precedent.data?.items.length ?? 0) > 0 ||
|
||||||
|
(documents.data?.length ?? 0) > 0
|
||||||
|
);
|
||||||
|
}, [cases.data, precedent.data, documents.data]);
|
||||||
|
|
||||||
|
return (
|
||||||
|
<div ref={wrapperRef} className="relative w-full max-w-[460px]">
|
||||||
|
<div className="relative">
|
||||||
|
<Search
|
||||||
|
className="absolute end-3 top-1/2 -translate-y-1/2 size-4 text-parchment/50 pointer-events-none"
|
||||||
|
aria-hidden="true"
|
||||||
|
/>
|
||||||
|
<input
|
||||||
|
ref={inputRef}
|
||||||
|
type="search"
|
||||||
|
value={raw}
|
||||||
|
onChange={(e) => {
|
||||||
|
setRaw(e.target.value);
|
||||||
|
setOpen(true);
|
||||||
|
}}
|
||||||
|
onFocus={() => setOpen(true)}
|
||||||
|
onKeyDown={(e) => {
|
||||||
|
if (e.key === "Escape") {
|
||||||
|
setOpen(false);
|
||||||
|
inputRef.current?.blur();
|
||||||
|
}
|
||||||
|
}}
|
||||||
|
placeholder="חפש תיק, פסיקה, או מסמך…"
|
||||||
|
aria-label="חיפוש גלובלי"
|
||||||
|
autoComplete="off"
|
||||||
|
spellCheck={false}
|
||||||
|
className="
|
||||||
|
w-full ps-3 pe-10 py-2 rounded-md
|
||||||
|
bg-navy-soft/60 border border-parchment/15
|
||||||
|
text-sm text-parchment placeholder:text-parchment/40
|
||||||
|
focus:outline-none focus:ring-2 focus:ring-gold/60 focus:border-transparent
|
||||||
|
transition-colors
|
||||||
|
"
|
||||||
|
/>
|
||||||
|
<kbd
|
||||||
|
className="
|
||||||
|
absolute end-9 top-1/2 -translate-y-1/2
|
||||||
|
text-[10px] font-mono text-parchment/40
|
||||||
|
border border-parchment/20 rounded px-1 py-0.5
|
||||||
|
pointer-events-none select-none
|
||||||
|
hidden md:inline-block
|
||||||
|
"
|
||||||
|
aria-hidden="true"
|
||||||
|
>
|
||||||
|
⌘K
|
||||||
|
</kbd>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
{showPanel && (
|
||||||
|
<div
|
||||||
|
id={listboxId}
|
||||||
|
role="listbox"
|
||||||
|
aria-label="תוצאות חיפוש"
|
||||||
|
className="
|
||||||
|
absolute top-full inset-x-0 mt-2 z-50
|
||||||
|
rounded-lg bg-popover text-popover-foreground
|
||||||
|
shadow-xl ring-1 ring-foreground/10
|
||||||
|
max-h-[70vh] overflow-y-auto
|
||||||
|
"
|
||||||
|
dir="rtl"
|
||||||
|
>
|
||||||
|
{anyLoading && !hasAnyResults && (
|
||||||
|
<div className="px-4 py-6 text-sm text-muted-foreground flex items-center gap-2">
|
||||||
|
<Loader2 className="size-4 animate-spin" aria-hidden="true" />
|
||||||
|
מחפש…
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
<ResultGroup
|
||||||
|
title="תיקים"
|
||||||
|
icon={<FolderOpen className="size-4" aria-hidden="true" />}
|
||||||
|
isLoading={cases.isLoading}
|
||||||
|
isError={cases.isError}
|
||||||
|
count={cases.data?.count}
|
||||||
|
>
|
||||||
|
{cases.data?.items.map((hit) => (
|
||||||
|
<CaseRow key={hit.case_number} hit={hit} onSelect={() => setOpen(false)} />
|
||||||
|
))}
|
||||||
|
</ResultGroup>
|
||||||
|
|
||||||
|
<ResultGroup
|
||||||
|
title="ספריית פסיקה"
|
||||||
|
icon={<BookOpen className="size-4" aria-hidden="true" />}
|
||||||
|
isLoading={precedent.isLoading}
|
||||||
|
isError={precedent.isError}
|
||||||
|
count={precedent.data?.count}
|
||||||
|
seeMoreHref={`/precedents?q=${encodeURIComponent(debounced)}`}
|
||||||
|
>
|
||||||
|
{precedent.data?.items.map((hit, i) => (
|
||||||
|
<PrecedentRow key={`${hit.case_law_id}-${i}`} hit={hit} onSelect={() => setOpen(false)} />
|
||||||
|
))}
|
||||||
|
</ResultGroup>
|
||||||
|
|
||||||
|
<ResultGroup
|
||||||
|
title="מסמכי תיקים והחלטות"
|
||||||
|
icon={<FileText className="size-4" aria-hidden="true" />}
|
||||||
|
isLoading={documents.isLoading}
|
||||||
|
isError={documents.isError}
|
||||||
|
count={documents.data?.length}
|
||||||
|
>
|
||||||
|
{documents.data?.map((hit, i) => (
|
||||||
|
<DocumentRow key={`${hit.case_number}-${i}`} hit={hit} onSelect={() => setOpen(false)} />
|
||||||
|
))}
|
||||||
|
</ResultGroup>
|
||||||
|
|
||||||
|
{!anyLoading && !hasAnyResults && (
|
||||||
|
<div className="px-4 py-6 text-sm text-muted-foreground text-center">
|
||||||
|
לא נמצא כלום עבור <span className="font-mono">{debounced}</span>.
|
||||||
|
<br />
|
||||||
|
נסה מילים נרדפות או חפש לפי מספר תיק.
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
// ─── Result groups + rows ────────────────────────────────────────────
|
||||||
|
|
||||||
|
function ResultGroup({
|
||||||
|
title,
|
||||||
|
icon,
|
||||||
|
isLoading,
|
||||||
|
isError,
|
||||||
|
count,
|
||||||
|
seeMoreHref,
|
||||||
|
children,
|
||||||
|
}: {
|
||||||
|
title: string;
|
||||||
|
icon: React.ReactNode;
|
||||||
|
isLoading: boolean;
|
||||||
|
isError: boolean;
|
||||||
|
count: number | undefined;
|
||||||
|
seeMoreHref?: string;
|
||||||
|
children: React.ReactNode;
|
||||||
|
}) {
|
||||||
|
const hasItems = (count ?? 0) > 0;
|
||||||
|
|
||||||
|
return (
|
||||||
|
<div className="border-b border-foreground/5 last:border-b-0">
|
||||||
|
<div className="flex items-center gap-2 px-3 pt-3 pb-1.5 text-xs font-semibold text-muted-foreground uppercase tracking-wider">
|
||||||
|
{icon}
|
||||||
|
<span>{title}</span>
|
||||||
|
{count !== undefined && (
|
||||||
|
<span className="text-foreground/50 normal-case font-normal">({count})</span>
|
||||||
|
)}
|
||||||
|
{isLoading && <Loader2 className="size-3 animate-spin ms-auto" aria-hidden="true" />}
|
||||||
|
</div>
|
||||||
|
|
||||||
|
{isError && (
|
||||||
|
<div className="px-3 pb-2 text-xs text-destructive">שגיאה בטעינת תוצאות</div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
{!isLoading && !isError && !hasItems && (
|
||||||
|
<div className="px-3 pb-2 text-xs text-muted-foreground/70">אין תוצאות בקטגוריה זו</div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
<div className="flex flex-col">{children}</div>
|
||||||
|
|
||||||
|
{seeMoreHref && hasItems && (
|
||||||
|
<Link
|
||||||
|
href={seeMoreHref}
|
||||||
|
className="block px-3 py-1.5 text-xs text-gold-deep hover:bg-accent/40 transition-colors"
|
||||||
|
>
|
||||||
|
הצג עוד →
|
||||||
|
</Link>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function CaseRow({ hit, onSelect }: { hit: CaseHit; onSelect: () => void }) {
|
||||||
|
return (
|
||||||
|
<Link
|
||||||
|
href={`/cases/${encodeURIComponent(hit.case_number)}`}
|
||||||
|
onClick={onSelect}
|
||||||
|
role="option"
|
||||||
|
className="block px-3 py-2 hover:bg-accent/40 transition-colors"
|
||||||
|
>
|
||||||
|
<div className="flex items-baseline justify-between gap-3">
|
||||||
|
<span className="font-medium text-sm">ערר {hit.case_number}</span>
|
||||||
|
{hit.status && (
|
||||||
|
<span className="text-[10px] uppercase tracking-wider text-muted-foreground">
|
||||||
|
{hit.status}
|
||||||
|
</span>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
<div className="text-xs text-muted-foreground truncate">
|
||||||
|
{hit.title}
|
||||||
|
{hit.property_address && <span> · {hit.property_address}</span>}
|
||||||
|
</div>
|
||||||
|
</Link>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function PrecedentRow({ hit, onSelect }: { hit: PrecedentHit; onSelect: () => void }) {
|
||||||
|
const href = `/precedents/${hit.case_law_id}`;
|
||||||
|
const pct = Math.round(hit.score * 100);
|
||||||
|
|
||||||
|
if (hit.type === "halacha") {
|
||||||
|
return (
|
||||||
|
<Link
|
||||||
|
href={href}
|
||||||
|
onClick={onSelect}
|
||||||
|
role="option"
|
||||||
|
className="block px-3 py-2 hover:bg-accent/40 transition-colors"
|
||||||
|
>
|
||||||
|
<div className="flex items-baseline justify-between gap-3">
|
||||||
|
<span className="font-medium text-sm line-clamp-1">הלכה — {hit.rule_statement}</span>
|
||||||
|
<span className="text-[10px] text-gold-deep tabular-nums">{pct}%</span>
|
||||||
|
</div>
|
||||||
|
<div className="text-xs text-muted-foreground truncate">
|
||||||
|
{hit.case_name} · {hit.case_number}
|
||||||
|
</div>
|
||||||
|
</Link>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
return (
|
||||||
|
<Link
|
||||||
|
href={href}
|
||||||
|
onClick={onSelect}
|
||||||
|
role="option"
|
||||||
|
className="block px-3 py-2 hover:bg-accent/40 transition-colors"
|
||||||
|
>
|
||||||
|
<div className="flex items-baseline justify-between gap-3">
|
||||||
|
<span className="font-medium text-sm">פסקה</span>
|
||||||
|
<span className="text-[10px] text-gold-deep tabular-nums">{pct}%</span>
|
||||||
|
</div>
|
||||||
|
<div className="text-xs text-muted-foreground line-clamp-2">
|
||||||
|
“{hit.content.slice(0, 160)}…”
|
||||||
|
</div>
|
||||||
|
<div className="text-[11px] text-muted-foreground/70">
|
||||||
|
{hit.case_name} · {hit.case_number}
|
||||||
|
</div>
|
||||||
|
</Link>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function DocumentRow({ hit, onSelect }: { hit: DocumentHit; onSelect: () => void }) {
|
||||||
|
const pct = Math.round(hit.score * 100);
|
||||||
|
return (
|
||||||
|
<Link
|
||||||
|
href={`/cases/${encodeURIComponent(hit.case_number)}`}
|
||||||
|
onClick={onSelect}
|
||||||
|
role="option"
|
||||||
|
className="block px-3 py-2 hover:bg-accent/40 transition-colors"
|
||||||
|
>
|
||||||
|
<div className="flex items-baseline justify-between gap-3">
|
||||||
|
<span className="font-medium text-sm truncate">{hit.document}</span>
|
||||||
|
<span className="text-[10px] text-gold-deep tabular-nums">{pct}%</span>
|
||||||
|
</div>
|
||||||
|
<div className="text-xs text-muted-foreground line-clamp-2">
|
||||||
|
“{hit.content.slice(0, 160)}…”
|
||||||
|
</div>
|
||||||
|
<div className="text-[11px] text-muted-foreground/70">
|
||||||
|
ערר {hit.case_number}
|
||||||
|
{hit.page != null && <span> · עמ׳ {hit.page}</span>}
|
||||||
|
</div>
|
||||||
|
</Link>
|
||||||
|
);
|
||||||
|
}
|
||||||
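The `PrecedentRow` and `DocumentRow` components above render `hit.content.slice(0, 160)` followed by an unconditional `…`, so a snippet shorter than 160 characters still ends with an ellipsis. A minimal sketch of one way to avoid that is below; `excerpt` is a hypothetical helper name and is not part of this change.

```typescript
// Hypothetical helper (not in the diff): cut text to at most `max`
// UTF-16 code units and append an ellipsis only when text was removed.
function excerpt(text: string, max: number = 160): string {
  if (text.length <= max) return text;
  // trimEnd() avoids a dangling space right before the ellipsis.
  return text.slice(0, max).trimEnd() + "…";
}

// Usage in a row would then look like: “{excerpt(hit.content)}”
console.log(excerpt("short snippet"));
console.log(excerpt("a".repeat(200)).length);
```

Like the original `slice`, this counts UTF-16 code units rather than visible characters, which is adequate for Hebrew text but can split an emoji or other astral character at the cut point.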
31
web-ui/src/components/header-context.ts
Normal file
@@ -0,0 +1,31 @@
/**
 * Resolves the dynamic subtitle shown next to the brand in the AppShell
 * header. Reflects the current section so the user always sees where they
 * are without scanning the nav row.
 *
 * Special-cases case routes (`/cases/{caseNumber}` and `/compose`) so the
 * subtitle includes the case number, the most useful piece of context
 * during decision drafting.
 */
export function headerSubtitle(pathname: string): string {
  if (pathname === "/") return "בית";

  if (pathname.startsWith("/cases/")) {
    const [, , slug] = pathname.split("/");
    if (!slug || slug === "new") return "תיק חדש";
    const isCompose = pathname.includes("/compose");
    const decoded = decodeURIComponent(slug);
    return isCompose ? `ערר ${decoded} · ניסוח` : `ערר ${decoded}`;
  }

  if (pathname.startsWith("/archive")) return "ארכיון";
  if (pathname.startsWith("/training")) return "אימון סגנון";
  if (pathname.startsWith("/precedents")) return "ספריית פסיקה";
  if (pathname.startsWith("/methodology")) return "מתודולוגיה";
  if (pathname.startsWith("/skills")) return "מיומנויות";
  if (pathname.startsWith("/diagnostics")) return "אבחון";
  if (pathname.startsWith("/settings")) return "הגדרות";
  if (pathname.startsWith("/feedback")) return "הערות יו״ר";

  return "ניהול תיקים";
}
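One detail of `headerSubtitle` worth noting: the slug goes straight into `decodeURIComponent`, which throws a `URIError` when the percent-encoding is malformed (for example a lone `%`). The sketch below shows a defensive variant of the slug handling; `safeDecode` and `caseSlug` are hypothetical names used only for illustration and do not appear in this change.

```typescript
// Hypothetical helper: decode one path segment, falling back to the raw
// text when the percent-encoding is malformed.
function safeDecode(segment: string): string {
  try {
    return decodeURIComponent(segment);
  } catch {
    return segment;
  }
}

// Mirrors the slug extraction in the `/cases/...` branch: the third element
// of the split path is the case number. Returns null for `/cases/` and
// `/cases/new`, which the original maps to the "new case" label.
function caseSlug(pathname: string): string | null {
  const [, , slug] = pathname.split("/");
  if (!slug || slug === "new") return null;
  return safeDecode(slug);
}

console.log(caseSlug("/cases/1234%2F24/compose"));
console.log(caseSlug("/cases/100%"));
```

Falling back to the raw segment keeps the header rendering even for a hand-edited URL, at the cost of showing the undecoded text.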
199
web-ui/src/components/precedents/extracted-halachot.tsx
Normal file
@@ -0,0 +1,199 @@
|
|||||||
|
"use client";
|
||||||
|
|
||||||
|
import { useMemo, useState } from "react";
|
||||||
|
import { Badge } from "@/components/ui/badge";
|
||||||
|
import type { Halacha } from "@/lib/api/precedent-library";
|
||||||
|
|
||||||
|
const RULE_TYPE_LABELS: Record<string, string> = {
|
||||||
|
binding: "הלכה מחייבת",
|
||||||
|
interpretive: "פרשני",
|
||||||
|
procedural: "פרוצדורלי",
|
||||||
|
obiter: "אמרת אגב",
|
||||||
|
application: "יישום הלכה",
|
||||||
|
persuasive: "משכנע",
|
||||||
|
};
|
||||||
|
|
||||||
|
type StatusFilter = "all" | "approved" | "pending" | "rejected";
|
||||||
|
|
||||||
|
export function ReviewStatusPill({ status }: { status: Halacha["review_status"] }) {
|
||||||
|
if (status === "approved" || status === "published") {
|
||||||
|
return (
|
||||||
|
<Badge
|
||||||
|
variant="outline"
|
||||||
|
className="text-[0.65rem] bg-gold-wash text-gold-deep border-gold/40"
|
||||||
|
>
|
||||||
|
מאושרת
|
||||||
|
</Badge>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
if (status === "pending_review") {
|
||||||
|
return (
|
||||||
|
<Badge
|
||||||
|
variant="outline"
|
||||||
|
className="text-[0.65rem] bg-rule-soft text-ink-muted"
|
||||||
|
>
|
||||||
|
ממתינה
|
||||||
|
</Badge>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
return (
|
||||||
|
<Badge
|
||||||
|
variant="outline"
|
||||||
|
className="text-[0.65rem] bg-danger-bg text-danger border-danger/40"
|
||||||
|
>
|
||||||
|
נדחתה
|
||||||
|
</Badge>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Read-only roll-up of every halacha extracted from a precedent —
|
||||||
|
* approved + pending + rejected. The "ממתין לאישור" tab only surfaces
|
||||||
|
* pending items globally; this section is the per-case view. To act on
|
||||||
|
* an item (approve / edit / reject), go to the review tab — keeping the
|
||||||
|
* surfaces separated avoids duplicate review UX in two places. */
|
||||||
|
export function ExtractedHalachotSection({ halachot }: { halachot: Halacha[] }) {
|
||||||
|
const [filter, setFilter] = useState<StatusFilter>("all");
|
||||||
|
|
||||||
|
const counts = useMemo(() => {
|
||||||
|
const c = { all: halachot.length, approved: 0, pending: 0, rejected: 0 };
|
||||||
|
for (const h of halachot) {
|
||||||
|
if (h.review_status === "approved" || h.review_status === "published") {
|
||||||
|
c.approved++;
|
||||||
|
} else if (h.review_status === "pending_review") {
|
||||||
|
c.pending++;
|
||||||
|
} else if (h.review_status === "rejected") {
|
||||||
|
c.rejected++;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return c;
|
||||||
|
}, [halachot]);
|
||||||
|
|
||||||
|
const sorted = useMemo(() => {
|
||||||
|
const matches = (h: Halacha) => {
|
||||||
|
if (filter === "all") return true;
|
||||||
|
if (filter === "approved") {
|
||||||
|
return h.review_status === "approved" || h.review_status === "published";
|
||||||
|
}
|
||||||
|
if (filter === "pending") return h.review_status === "pending_review";
|
||||||
|
return h.review_status === "rejected";
|
||||||
|
};
|
||||||
|
return halachot
|
||||||
|
.filter(matches)
|
||||||
|
.sort((a, b) => a.halacha_index - b.halacha_index);
|
||||||
|
}, [halachot, filter]);
|
||||||
|
|
||||||
|
if (!halachot.length) {
|
||||||
|
return (
|
||||||
|
<div className="rounded-lg border border-rule bg-rule-soft/40 p-4 text-center text-ink-muted text-sm">
|
||||||
|
עדיין לא חולצו הלכות מהפסיקה הזו.
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
const tabs: { key: StatusFilter; label: string; count: number }[] = [
|
||||||
|
{ key: "all", label: "הכל", count: counts.all },
|
||||||
|
{ key: "approved", label: "מאושרות", count: counts.approved },
|
||||||
|
{ key: "pending", label: "ממתינות", count: counts.pending },
|
||||||
|
{ key: "rejected", label: "נדחו", count: counts.rejected },
|
||||||
|
];
|
||||||
|
|
||||||
|
return (
|
||||||
|
<div className="space-y-3">
|
||||||
|
<div className="flex items-center justify-between gap-2 flex-wrap">
|
||||||
|
<h3 className="text-navy text-base font-semibold m-0">
|
||||||
|
הלכות שחולצו ({counts.all})
|
||||||
|
</h3>
|
||||||
|
<div className="flex items-center gap-1 flex-wrap" role="tablist">
|
||||||
|
{tabs.map((t) => {
|
||||||
|
const active = filter === t.key;
|
||||||
|
return (
|
||||||
|
<button
|
||||||
|
key={t.key}
|
||||||
|
type="button"
|
||||||
|
role="tab"
|
||||||
|
aria-selected={active}
|
||||||
|
onClick={() => setFilter(t.key)}
|
||||||
|
className={`text-[0.78rem] px-2.5 py-1 rounded border transition-colors tabular-nums ${
|
||||||
|
active
|
||||||
|
? "bg-navy text-parchment border-navy"
|
||||||
|
: "bg-surface text-ink-muted border-rule hover:bg-rule-soft"
|
||||||
|
}`}
|
||||||
|
>
|
||||||
|
{t.label} ({t.count})
|
||||||
|
</button>
|
||||||
|
);
|
||||||
|
})}
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
{!sorted.length ? (
|
||||||
|
<div className="text-center text-ink-muted text-sm py-6">
|
||||||
|
אין הלכות בקטגוריה זו.
|
||||||
|
</div>
|
||||||
|
) : (
|
||||||
|
<ol className="space-y-3 list-none ps-0 m-0">
|
||||||
|
{sorted.map((h) => (
|
||||||
|
<li
|
||||||
|
key={h.id}
|
||||||
|
className="rounded-lg border border-rule bg-surface p-3 space-y-2"
|
||||||
|
>
|
||||||
|
<div className="flex items-start gap-2 flex-wrap">
|
||||||
|
<span className="text-[0.7rem] text-ink-muted tabular-nums font-mono">
|
||||||
|
#{h.halacha_index}
|
||||||
|
</span>
|
||||||
|
<ReviewStatusPill status={h.review_status} />
|
||||||
|
<Badge variant="outline" className="text-[0.65rem]">
|
||||||
|
{RULE_TYPE_LABELS[h.rule_type] ?? h.rule_type}
|
||||||
|
</Badge>
|
||||||
|
<Badge variant="outline" className="text-[0.65rem] tabular-nums">
|
||||||
|
ביטחון {h.confidence.toFixed(2)}
|
||||||
|
</Badge>
|
||||||
|
{h.page_reference ? (
|
||||||
|
<span className="text-[0.7rem] text-ink-muted ms-auto">
|
||||||
|
{h.page_reference}
|
||||||
|
</span>
|
||||||
|
) : null}
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<p
|
||||||
|
className="text-navy font-medium text-sm leading-relaxed m-0"
|
||||||
|
dir="rtl"
|
||||||
|
>
|
||||||
|
{h.rule_statement}
|
||||||
|
</p>
|
||||||
|
|
||||||
|
{h.reasoning_summary ? (
|
||||||
|
<p
|
||||||
|
className="text-ink-soft text-[0.82rem] leading-relaxed m-0"
|
||||||
|
dir="rtl"
|
||||||
|
>
|
||||||
|
<span className="text-ink-muted text-[0.7rem]">היגיון: </span>
|
||||||
|
{h.reasoning_summary}
|
||||||
|
</p>
|
||||||
|
) : null}
|
||||||
|
|
||||||
|
{h.supporting_quote ? (
|
||||||
|
<blockquote
|
||||||
|
className="text-ink-soft text-[0.82rem] leading-relaxed border-r-2 border-gold pr-3 m-0"
|
||||||
|
dir="rtl"
|
||||||
|
>
|
||||||
|
“{h.supporting_quote}”
|
||||||
|
</blockquote>
|
||||||
|
) : null}
|
||||||
|
|
||||||
|
{h.subject_tags?.length ? (
|
||||||
|
<div className="flex items-center gap-1 flex-wrap pt-1">
|
||||||
|
{h.subject_tags.map((t) => (
|
||||||
|
<Badge key={t} variant="outline" className="text-[0.65rem]">
|
||||||
|
{t}
|
||||||
|
</Badge>
|
||||||
|
))}
|
||||||
|
</div>
|
||||||
|
) : null}
|
||||||
|
</li>
|
||||||
|
))}
|
||||||
|
</ol>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
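In `extracted-halachot.tsx` the mapping from `review_status` to a display bucket is written out three times (in `ReviewStatusPill`, in `counts`, and in the `matches` filter). A sketch of how a single pure function could back all three is below. The `ReviewStatus` union is an assumption inferred from the values compared in the diff; the real `Halacha` type lives in `@/lib/api/precedent-library` and is not shown here. `bucketOf` and `countByBucket` are hypothetical names.

```typescript
// Assumed status values, inferred from the comparisons in the component.
type ReviewStatus = "approved" | "published" | "pending_review" | "rejected";
type Bucket = "approved" | "pending" | "rejected";

// Single source of truth for status -> bucket. Like ReviewStatusPill,
// anything that is neither approved/published nor pending falls through
// to "rejected".
function bucketOf(status: ReviewStatus): Bucket {
  if (status === "approved" || status === "published") return "approved";
  if (status === "pending_review") return "pending";
  return "rejected";
}

// Counts per tab, in the same shape as the `counts` memo in the component.
function countByBucket(statuses: ReviewStatus[]): Record<Bucket | "all", number> {
  const counts = { all: statuses.length, approved: 0, pending: 0, rejected: 0 };
  for (const s of statuses) counts[bucketOf(s)]++;
  return counts;
}

console.log(countByBucket(["approved", "published", "pending_review", "rejected"]));
```

With such a helper, the tab filter reduces to `filter === "all" || bucketOf(h.review_status) === filter`, so the pill, the counts, and the filter cannot drift apart.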
474
web-ui/src/components/precedents/halacha-review-panel.tsx
Normal file
@@ -0,0 +1,474 @@
|
|||||||
|
"use client";
|
||||||
|
|
||||||
|
import { useEffect, useMemo, useState } from "react";
|
||||||
|
import { Check, X, Edit2, ChevronDown, ChevronLeft, AlertTriangle } from "lucide-react";
|
||||||
|
import { toast } from "sonner";
|
||||||
|
import { Button } from "@/components/ui/button";
|
||||||
|
import { Badge } from "@/components/ui/badge";
|
||||||
|
import { Skeleton } from "@/components/ui/skeleton";
|
||||||
|
import { Textarea } from "@/components/ui/textarea";
|
||||||
|
import { practiceAreaLabel } from "./practice-area";
|
||||||
|
import {
|
||||||
|
useHalachotPending, useUpdateHalacha, type Halacha,
|
||||||
|
} from "@/lib/api/precedent-library";
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Halacha review queue — the chair-only path between automatic
|
||||||
|
* extraction and agent visibility. Per the project's review policy,
|
||||||
|
* NO halacha is auto-published; every row sits in pending_review until
|
||||||
|
* approved.
|
||||||
|
*
|
||||||
|
* UX: items are grouped by precedent (case_law_id). Groups start
|
||||||
|
* collapsed so the chair picks one ruling at a time. Within an open
|
||||||
|
* group, J/K navigates, A approves, R rejects, E edits. Items inside
|
||||||
|
* each group are sorted by confidence ascending so the doubtful ones
|
||||||
|
* surface first.
|
||||||
|
*/
|
||||||
|
|
||||||
|
function formatDate(iso: string | null | undefined) {
|
||||||
|
if (!iso) return "—";
|
||||||
|
try {
|
||||||
|
return new Date(iso).toLocaleDateString("he-IL");
|
||||||
|
} catch {
|
||||||
|
return iso;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* The upload form (and Nevo PDFs) embed Unicode bidi marks (RTL/LTR/embedding/
 * isolate) inside the citation. They render as zero-width but visually push
 * the text away from where it should sit. Strip for display only. */
function cleanCitation(s: string | null | undefined): string {
  if (!s) return "—";
  // Written as escapes so the invisible characters survive copy/paste:
  // LRM/RLM (U+200E, U+200F), embeddings and overrides (U+202A to U+202E),
  // isolates (U+2066 to U+2069).
  return s.replace(/[\u200E\u200F\u202A-\u202E\u2066-\u2069]/g, "").trim();
}
|
||||||
|
|
||||||
|
const RULE_TYPE_LABELS: Record<string, string> = {
|
||||||
|
binding: "הלכה מחייבת",
|
||||||
|
interpretive: "פרשני",
|
||||||
|
procedural: "פרוצדורלי",
|
||||||
|
obiter: "אמרת אגב",
|
||||||
|
application: "יישום הלכה",
|
||||||
|
persuasive: "משכנע",
|
||||||
|
};
|
||||||
|
|
||||||
|
function ruleTypeLabel(t: string): string {
|
||||||
|
return RULE_TYPE_LABELS[t] ?? t;
|
||||||
|
}
|
||||||
|
|
||||||
|
type EditState = { rule_statement: string; reasoning_summary: string };
|
||||||
|
|
||||||
|
function HalachaCard({
|
||||||
|
h, focused, onApprove, onReject, onSave,
|
||||||
|
}: {
|
||||||
|
h: Halacha;
|
||||||
|
focused: boolean;
|
||||||
|
onApprove: () => void;
|
||||||
|
onReject: () => void;
|
||||||
|
onSave: (patch: Partial<EditState>) => Promise<void>;
|
||||||
|
}) {
|
||||||
|
const [editing, setEditing] = useState(false);
|
||||||
|
const [draft, setDraft] = useState<EditState>({
|
||||||
|
rule_statement: h.rule_statement,
|
||||||
|
reasoning_summary: h.reasoning_summary,
|
||||||
|
});
|
||||||
|
|
||||||
|
// Reset draft when underlying row changes (focus moves to a new card).
|
||||||
|
useEffect(() => {
|
||||||
|
// eslint-disable-next-line react-hooks/set-state-in-effect
|
||||||
|
setDraft({
|
||||||
|
rule_statement: h.rule_statement,
|
||||||
|
reasoning_summary: h.reasoning_summary,
|
||||||
|
});
|
||||||
|
}, [h.id, h.rule_statement, h.reasoning_summary]);
|
||||||
|
|
||||||
|
const onSubmitEdit = async () => {
|
||||||
|
await onSave(draft);
|
||||||
|
setEditing(false);
|
||||||
|
};
|
||||||
|
|
||||||
|
return (
|
||||||
|
<div
|
||||||
|
data-halacha-id={h.id}
|
||||||
|
className={`
|
||||||
|
rounded-lg border bg-surface p-4 space-y-3 transition-colors
|
||||||
|
${focused ? "border-gold ring-2 ring-gold/40 shadow-md" : "border-rule"}
|
||||||
|
`}
|
||||||
|
>
|
||||||
|
{/* Header — status pills only (citation is in the group header) */}
|
||||||
|
<div className="flex items-start gap-2 text-[0.78rem] text-ink-muted flex-wrap">
|
||||||
|
{h.page_reference && (
|
||||||
|
<span className="text-[0.7rem]">{h.page_reference}</span>
|
||||||
|
)}
|
||||||
|
<span className="ms-auto flex items-center gap-2">
|
||||||
|
{h.quote_verified ? (
|
||||||
|
<Badge variant="outline" className="text-[0.65rem] bg-gold-wash text-gold-deep border-gold/40">
|
||||||
|
<Check className="w-3 h-3 me-1" /> ציטוט מאומת
|
||||||
|
</Badge>
|
||||||
|
) : (
|
||||||
|
<Badge variant="outline" className="text-[0.65rem] bg-danger-bg text-danger border-danger/40">
|
||||||
|
<AlertTriangle className="w-3 h-3 me-1" /> ציטוט לא מאומת
|
||||||
|
</Badge>
|
||||||
|
)}
|
||||||
|
<Badge variant="outline" className="text-[0.65rem] tabular-nums">
|
||||||
|
ביטחון {h.confidence.toFixed(2)}
|
||||||
|
</Badge>
|
||||||
|
<Badge variant="outline" className="text-[0.65rem]">
|
||||||
|
{ruleTypeLabel(h.rule_type)}
|
||||||
|
</Badge>
|
||||||
|
</span>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
{/* Side-by-side rule vs quote */}
|
||||||
|
<div className="grid md:grid-cols-2 gap-3">
|
||||||
|
<div>
|
||||||
|
<div className="text-[0.7rem] text-ink-muted mb-1">ניסוח הכלל</div>
|
||||||
|
{editing ? (
|
||||||
|
<Textarea
|
||||||
|
value={draft.rule_statement} rows={4} dir="rtl"
|
||||||
|
onChange={(e) => setDraft({ ...draft, rule_statement: e.target.value })}
|
||||||
|
className="bg-gold-wash/50 border-gold/30"
|
||||||
|
/>
|
||||||
|
) : (
|
||||||
|
<p className="text-navy font-medium leading-relaxed" dir="rtl">
|
||||||
|
{h.rule_statement}
|
||||||
|
</p>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
<div>
|
||||||
|
<div className="text-[0.7rem] text-ink-muted mb-1">ציטוט תומך</div>
|
||||||
|
<blockquote className="text-ink-soft text-sm leading-relaxed border-r-2 border-gold pr-3" dir="rtl">
|
||||||
|
“{h.supporting_quote}”
|
||||||
|
</blockquote>
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
{/* Reasoning */}
|
||||||
|
{(editing || h.reasoning_summary) && (
|
||||||
|
<div>
|
||||||
|
<div className="text-[0.7rem] text-ink-muted mb-1">תמצית ההיגיון</div>
|
||||||
|
{editing ? (
|
||||||
|
<Textarea
|
||||||
|
value={draft.reasoning_summary} rows={2} dir="rtl"
|
||||||
|
onChange={(e) => setDraft({ ...draft, reasoning_summary: e.target.value })}
|
||||||
|
className="bg-gold-wash/50 border-gold/30"
|
||||||
|
/>
|
||||||
|
) : (
|
||||||
|
<p className="text-ink-soft text-sm leading-relaxed" dir="rtl">
|
||||||
|
{h.reasoning_summary}
|
||||||
|
</p>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
{/* Tags */}
|
||||||
|
<div className="flex items-center gap-2 flex-wrap">
|
||||||
|
{h.practice_areas?.map((p) => (
|
||||||
|
<Badge key={p} variant="outline" className="text-[0.65rem] bg-navy-soft/30 text-navy">
|
||||||
|
{practiceAreaLabel(p)}
|
||||||
|
</Badge>
|
||||||
|
))}
|
||||||
|
{h.subject_tags?.map((t) => (
|
||||||
|
<Badge key={t} variant="outline" className="text-[0.65rem]">
|
||||||
|
{t}
|
||||||
|
</Badge>
|
||||||
|
))}
|
||||||
|
</div>
|
||||||
|
|
||||||
|
{/* Actions */}
|
||||||
|
<div className="flex items-center gap-2 justify-end pt-1 border-t border-rule-soft">
|
||||||
|
{editing ? (
|
||||||
|
<>
|
||||||
|
<Button size="sm" variant="ghost" onClick={() => setEditing(false)}>
|
||||||
|
ביטול
|
||||||
|
</Button>
|
||||||
|
<Button size="sm" onClick={onSubmitEdit}
|
||||||
|
className="bg-navy text-parchment hover:bg-navy-soft">
|
||||||
|
שמור שינויים
|
||||||
|
</Button>
|
||||||
|
</>
|
||||||
|
) : (
|
||||||
|
<>
|
||||||
|
<Button size="sm" variant="ghost" onClick={() => setEditing(true)}>
|
||||||
|
<Edit2 className="w-3.5 h-3.5 me-1" />
|
||||||
|
ערוך (E)
|
||||||
|
</Button>
|
||||||
|
<Button size="sm" variant="ghost"
|
||||||
|
onClick={onReject}
|
||||||
|
className="text-danger hover:text-danger hover:bg-danger-bg">
|
||||||
|
<X className="w-3.5 h-3.5 me-1" />
|
||||||
|
דחה (R)
|
||||||
|
</Button>
|
||||||
|
<Button size="sm" onClick={onApprove}
|
||||||
|
className="bg-gold text-navy hover:bg-gold-deep">
|
||||||
|
<Check className="w-3.5 h-3.5 me-1" />
|
||||||
|
אשר (A)
|
||||||
|
</Button>
|
||||||
|
</>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
type Group = {
|
||||||
|
caseLawId: string;
|
||||||
|
caseNumber: string;
|
||||||
|
court: string;
|
||||||
|
decisionDate: string | null;
|
||||||
|
precedentLevel: string;
|
||||||
|
items: Halacha[];
|
||||||
|
};
|
||||||
|
|
||||||
|
export function HalachaReviewPanel() {
|
||||||
|
const { data, isPending, error } = useHalachotPending(500);
|
||||||
|
const update = useUpdateHalacha();
|
||||||
|
const [expandedIds, setExpandedIds] = useState<Set<string>>(new Set());
|
||||||
|
const [focusedId, setFocusedId] = useState<string | null>(null);
|
||||||
|
|
||||||
|
// Group by precedent. Within each group, items sorted by confidence ascending
|
||||||
|
// so the chair sees doubtful entries first. Groups themselves sort by item
|
||||||
|
// count descending — biggest piles surface first.
|
||||||
|
const groups = useMemo<Group[]>(() => {
|
||||||
|
const map = new Map<string, Group>();
|
||||||
|
for (const h of data?.items ?? []) {
|
||||||
|
const k = h.case_law_id;
|
||||||
|
let g = map.get(k);
|
||||||
|
if (!g) {
|
||||||
|
g = {
|
||||||
|
caseLawId: k,
|
||||||
|
caseNumber: h.case_number ?? "",
|
||||||
|
court: h.court ?? "",
|
||||||
|
decisionDate: h.decision_date ?? null,
|
||||||
|
precedentLevel: h.precedent_level ?? "",
|
||||||
|
items: [],
|
||||||
|
};
|
||||||
|
map.set(k, g);
|
||||||
|
}
|
||||||
|
g.items.push(h);
|
||||||
|
}
|
||||||
|
for (const g of map.values()) {
|
||||||
|
g.items.sort((a, b) => a.confidence - b.confidence);
|
||||||
|
}
|
||||||
|
return Array.from(map.values()).sort((a, b) => b.items.length - a.items.length);
|
||||||
|
}, [data]);
|
||||||
|
|
||||||
|
const totalCount = data?.items.length ?? 0;
|
||||||
|
|
||||||
|
// Items the keyboard handler can navigate. Only items inside expanded
|
||||||
|
// groups are "visible" — collapsed groups hide their items from J/K.
|
||||||
|
const visibleItems = useMemo<Halacha[]>(() => {
|
||||||
|
const out: Halacha[] = [];
|
||||||
|
for (const g of groups) {
|
||||||
|
if (expandedIds.has(g.caseLawId)) out.push(...g.items);
|
||||||
|
}
|
||||||
|
return out;
|
||||||
|
}, [groups, expandedIds]);
|
||||||
|
|
||||||
|
// If the focused item disappears (approved/rejected/group-collapsed), pick
|
||||||
|
// a sensible neighbour so the highlight doesn't vanish silently.
|
||||||
|
useEffect(() => {
|
||||||
|
if (focusedId === null) return;
|
||||||
|
if (visibleItems.some((h) => h.id === focusedId)) return;
|
||||||
|
// eslint-disable-next-line react-hooks/set-state-in-effect
|
||||||
|
setFocusedId(visibleItems[0]?.id ?? null);
|
||||||
|
}, [focusedId, visibleItems]);
|
||||||
|
|
||||||
|
// Scroll the focused row into view
|
||||||
|
useEffect(() => {
|
||||||
|
if (!focusedId) return;
|
||||||
|
const el = document.querySelector(`[data-halacha-id="${focusedId}"]`);
|
||||||
|
el?.scrollIntoView({ block: "nearest", behavior: "smooth" });
|
||||||
|
}, [focusedId]);
|
||||||
|
|
||||||
|
const focused = focusedId
|
||||||
|
? visibleItems.find((h) => h.id === focusedId) ?? null
|
||||||
|
: null;
|
||||||
|
|
||||||
|
const moveFocus = (delta: 1 | -1) => {
|
||||||
|
if (visibleItems.length === 0) return;
|
||||||
|
const idx = focusedId
|
||||||
|
? visibleItems.findIndex((h) => h.id === focusedId)
|
||||||
|
: -1;
|
||||||
|
const next = idx < 0
|
||||||
|
? (delta > 0 ? 0 : visibleItems.length - 1)
|
||||||
|
: Math.max(0, Math.min(visibleItems.length - 1, idx + delta));
|
||||||
|
setFocusedId(visibleItems[next].id);
|
||||||
|
};
|
||||||
|
|
||||||
|
const review = async (
|
||||||
|
h: Halacha,
|
||||||
|
status: "approved" | "rejected",
|
||||||
|
extra?: Partial<EditState>,
|
||||||
|
) => {
|
||||||
|
try {
|
||||||
|
await update.mutateAsync({
|
||||||
|
id: h.id,
|
||||||
|
patch: { review_status: status, ...extra },
|
||||||
|
});
|
||||||
|
toast.success(status === "approved" ? "אושר" : "נדחה");
|
||||||
|
} catch (e) {
|
||||||
|
toast.error(e instanceof Error ? e.message : "שגיאה");
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
const toggleGroup = (caseLawId: string) => {
|
||||||
|
setExpandedIds((prev) => {
|
||||||
|
const next = new Set(prev);
|
||||||
|
if (next.has(caseLawId)) {
|
||||||
|
next.delete(caseLawId);
|
||||||
|
} else {
|
||||||
|
next.add(caseLawId);
|
||||||
|
// Auto-focus first item of the just-opened group
|
||||||
|
const g = groups.find((x) => x.caseLawId === caseLawId);
|
||||||
|
if (g && g.items.length) {
|
||||||
|
setTimeout(() => setFocusedId(g.items[0].id), 0);
|
||||||
|
}
|
||||||
|
}
|
||||||
|
return next;
|
||||||
|
});
|
||||||
|
};
|
||||||
|
|
||||||
|
// Keyboard navigation — only acts when something is focused (i.e. at
|
||||||
|
// least one group is open).
|
||||||
|
useEffect(() => {
|
||||||
|
const onKey = (e: KeyboardEvent) => {
|
||||||
|
const tag = (e.target as HTMLElement)?.tagName?.toLowerCase();
|
||||||
|
if (tag === "input" || tag === "textarea") return;
|
||||||
|
if (e.key === "j") {
|
||||||
|
e.preventDefault();
|
||||||
|
moveFocus(1);
|
||||||
|
} else if (e.key === "k") {
|
||||||
|
e.preventDefault();
|
||||||
|
moveFocus(-1);
|
||||||
|
} else if ((e.key === "a" || e.key === "A") && focused) {
|
||||||
|
e.preventDefault();
|
||||||
|
review(focused, "approved");
|
||||||
|
} else if ((e.key === "r" || e.key === "R") && focused) {
|
||||||
|
e.preventDefault();
|
||||||
|
if (window.confirm("לדחות הלכה זו?")) review(focused, "rejected");
|
||||||
|
}
|
||||||
|
};
|
||||||
|
window.addEventListener("keydown", onKey);
|
||||||
|
return () => window.removeEventListener("keydown", onKey);
|
||||||
|
// eslint-disable-next-line react-hooks/exhaustive-deps
|
||||||
|
}, [focused, visibleItems]);
|
||||||
|
|
||||||
|
if (error) {
|
||||||
|
return (
|
||||||
|
<div className="rounded bg-danger-bg border border-danger/40 px-6 py-5 text-danger text-center">
|
||||||
|
{error.message}
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
if (isPending) {
|
||||||
|
return (
|
||||||
|
<div className="space-y-3">
|
||||||
|
{[...Array(3)].map((_, i) => <Skeleton key={i} className="h-24 w-full" />)}
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
if (!groups.length) {
|
||||||
|
return (
|
||||||
|
<div className="text-center text-ink-muted py-16">
|
||||||
|
<p className="text-lg">אין הלכות הממתינות לאישור.</p>
|
||||||
|
<p className="text-sm mt-2">העלה פסיקה חדשה — ההלכות שיחולצו ממנה יופיעו כאן.</p>
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
return (
|
||||||
|
<div className="space-y-4">
|
||||||
|
<div className="flex items-center gap-3 text-sm text-ink-muted flex-wrap">
|
||||||
|
<span>
|
||||||
|
<span className="text-navy font-semibold">{totalCount}</span> ממתינות
|
||||||
|
ב-<span className="text-navy font-semibold">{groups.length}</span> פסיקות
|
||||||
|
</span>
|
||||||
|
<span className="me-auto text-[0.72rem]">
|
||||||
|
ניווט: <kbd className="bg-rule-soft px-1.5 rounded">J</kbd>/<kbd className="bg-rule-soft px-1.5 rounded">K</kbd>
|
||||||
|
{" "}· אישור: <kbd className="bg-rule-soft px-1.5 rounded">A</kbd>
|
||||||
|
{" "}· דחייה: <kbd className="bg-rule-soft px-1.5 rounded">R</kbd>
|
||||||
|
{" "}· עריכה: <kbd className="bg-rule-soft px-1.5 rounded">E</kbd>
|
||||||
|
</span>
|
||||||
|
{expandedIds.size > 0 && (
|
||||||
|
<Button
|
||||||
|
size="sm" variant="ghost"
|
||||||
|
onClick={() => setExpandedIds(new Set())}
|
||||||
|
>
|
||||||
|
סגור הכל
|
||||||
|
</Button>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div className="space-y-3">
|
||||||
|
{groups.map((g) => {
|
||||||
|
const isOpen = expandedIds.has(g.caseLawId);
|
||||||
|
return (
|
||||||
|
<div
|
||||||
|
key={g.caseLawId}
|
||||||
|
className="rounded-lg border border-rule bg-surface overflow-hidden"
|
||||||
|
>
|
||||||
|
<button
|
||||||
|
type="button"
|
||||||
|
onClick={() => toggleGroup(g.caseLawId)}
|
||||||
|
className={`
|
||||||
|
w-full flex items-center gap-3 px-4 py-3 text-right
|
||||||
|
hover:bg-gold-wash/30 transition-colors
|
||||||
|
${isOpen ? "bg-gold-wash/40 border-b border-rule" : ""}
|
||||||
|
`}
|
||||||
|
aria-expanded={isOpen}
|
||||||
|
>
|
||||||
|
{isOpen ? (
|
||||||
|
<ChevronDown className="w-4 h-4 text-ink-muted shrink-0" />
|
||||||
|
) : (
|
||||||
|
<ChevronLeft className="w-4 h-4 text-ink-muted shrink-0" />
|
||||||
|
)}
|
||||||
|
<div className="flex-1 min-w-0 text-right">
|
||||||
|
<div className="font-semibold text-navy truncate">
|
||||||
|
{cleanCitation(g.caseNumber)}
|
||||||
|
</div>
|
||||||
|
<div className="text-[0.72rem] text-ink-muted flex items-center gap-2 flex-wrap mt-0.5">
|
||||||
|
{g.court && <span>{g.court}</span>}
|
||||||
|
{g.decisionDate && <span>· {formatDate(g.decisionDate)}</span>}
|
||||||
|
{g.precedentLevel && <span>· {g.precedentLevel}</span>}
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
<Badge
|
||||||
|
variant="outline"
|
||||||
|
className="bg-gold-wash text-gold-deep border-gold/40 tabular-nums"
|
||||||
|
>
|
||||||
|
{g.items.length} ממתינות
|
||||||
|
</Badge>
|
||||||
|
</button>
|
||||||
|
|
||||||
|
{isOpen && (
|
||||||
|
<div className="p-4 space-y-3 bg-rule-soft/20">
|
||||||
|
{g.items.map((h) => (
|
||||||
|
<HalachaCard
|
||||||
|
key={h.id}
|
||||||
|
h={h}
|
||||||
|
focused={h.id === focusedId}
|
||||||
|
onApprove={() => review(h, "approved")}
|
||||||
|
onReject={() => {
|
||||||
|
if (window.confirm("לדחות הלכה זו?")) review(h, "rejected");
|
||||||
|
}}
|
||||||
|
onSave={async (patch) => {
|
||||||
|
try {
|
||||||
|
await update.mutateAsync({ id: h.id, patch });
|
||||||
|
toast.success("נשמר");
|
||||||
|
} catch (e) {
|
||||||
|
toast.error(e instanceof Error ? e.message : "שגיאה");
|
||||||
|
}
|
||||||
|
}}
|
||||||
|
/>
|
||||||
|
))}
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
})}
|
||||||
|
</div>
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
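Review note: the clamping in `moveFocus` is easier to check when pulled out as a pure function. The sketch below is an assumption about how that could look, not code from this diff; `nextFocusId` and the `Item` type are hypothetical names introduced for illustration.

```typescript
// Hypothetical pure helper mirroring the index arithmetic in moveFocus.
// Not part of the diff; names are illustrative only.
type Item = { id: string };

function nextFocusId(
  items: readonly Item[],
  focusedId: string | null,
  delta: 1 | -1,
): string | null {
  // Nothing visible: leave the current focus untouched.
  if (items.length === 0) return focusedId;
  const idx = focusedId ? items.findIndex((h) => h.id === focusedId) : -1;
  const next =
    idx < 0
      ? // No (or stale) focus: enter the list from the matching end.
        delta > 0 ? 0 : items.length - 1
      : // Otherwise step by delta and clamp to the list bounds.
        Math.max(0, Math.min(items.length - 1, idx + delta));
  return items[next].id;
}
```

With this shape, `J` at the last item and `K` at the first item are no-ops rather than wrapping, which matches the behaviour of the handler in the diff.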
323
web-ui/src/components/precedents/library-list-panel.tsx
Normal file
@@ -0,0 +1,323 @@
|
|||||||
|
"use client";
|
||||||
|
|
||||||
|
import { useState } from "react";
|
||||||
|
import { Trash2, Plus, Pencil } from "lucide-react";
|
||||||
|
import { toast } from "sonner";
|
||||||
|
import {
|
||||||
|
Table, TableBody, TableCell, TableHead, TableHeader, TableRow,
|
||||||
|
} from "@/components/ui/table";
|
||||||
|
import { Button } from "@/components/ui/button";
|
||||||
|
import { Badge } from "@/components/ui/badge";
|
||||||
|
import { Input } from "@/components/ui/input";
|
||||||
|
import { Skeleton } from "@/components/ui/skeleton";
|
||||||
|
import {
|
||||||
|
Select, SelectContent, SelectItem, SelectTrigger, SelectValue,
|
||||||
|
} from "@/components/ui/select";
|
||||||
|
import {
|
||||||
|
usePrecedents,
|
||||||
|
useDeletePrecedent,
|
||||||
|
isPrecedentActive,
|
||||||
|
type Precedent,
|
||||||
|
type PracticeArea,
|
||||||
|
} from "@/lib/api/precedent-library";
|
||||||
|
import { PRACTICE_AREAS, PRECEDENT_LEVELS, practiceAreaShort } from "./practice-area";
|
||||||
|
import { PrecedentUploadSheet } from "./precedent-upload-sheet";
|
||||||
|
import { PrecedentEditSheet } from "./precedent-edit-sheet";
|
||||||
|
|
||||||
|
function formatDate(iso: string | null) {
|
||||||
|
if (!iso) return "—";
|
||||||
|
try {
|
||||||
|
    return Number.isNaN(Date.parse(iso)) ? iso : new Date(iso).toLocaleDateString("he-IL");
|
||||||
|
} catch {
|
||||||
|
return iso;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
/* The upload form (and Nevo PDFs) embed Unicode bidi marks (RTL/LTR/embedding/
|
||||||
|
* isolate) inside the citation. They render as zero-width but visually push
|
||||||
|
* the text away from the cell edge. Strip them for display only — DB still
|
||||||
|
* has the original. */
|
||||||
|
function cleanCitation(s: string | null | undefined): string {
|
||||||
|
if (!s) return "—";
|
||||||
|
  return s.replace(/[\u200E\u200F\u202A-\u202E\u2066-\u2069]/g, "").trim();
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Shimmering pill — used while extraction is actively running.
|
||||||
|
* Visually distinct from the static "queued" / "completed" pills. */
|
||||||
|
function ActivePill({ label }: { label: string }) {
|
||||||
|
return (
|
||||||
|
<Badge
|
||||||
|
variant="outline"
|
||||||
|
className="bg-gold-wash text-gold-deep border-gold/40 shimmer-active"
|
||||||
|
>
|
||||||
|
{label}
|
||||||
|
</Badge>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
/**
|
||||||
|
* Five distinct states. The "queued" state is what the user actually
|
||||||
|
* sees most of the time (after upload, both extractions are auto-queued
|
||||||
|
* but the local MCP worker hasn't drained them yet); "מחלץ" / "מעבד"
|
||||||
|
* shimmers and only appears while the extractor is actively running.
|
||||||
|
*/
|
||||||
|
function StatusPill({ p }: { p: Precedent }) {
|
||||||
|
if (p.extraction_status === "failed") {
|
||||||
|
return (
|
||||||
|
<Badge variant="outline" className="bg-danger-bg text-danger border-danger/40">
|
||||||
|
נכשל
|
||||||
|
</Badge>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
if (p.extraction_status === "processing") {
|
||||||
|
return <ActivePill label="מעבד טקסט" />;
|
||||||
|
}
|
||||||
|
if (p.extraction_status !== "completed") {
|
||||||
|
return (
|
||||||
|
<Badge variant="outline" className="bg-rule-soft text-ink-muted">
|
||||||
|
בתור
|
||||||
|
</Badge>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
if (p.halacha_extraction_status === "processing") {
|
||||||
|
return <ActivePill label="מחלץ הלכות" />;
|
||||||
|
}
|
||||||
|
if (p.halacha_extraction_status === "failed") {
|
||||||
|
return (
|
||||||
|
<Badge variant="outline" className="bg-danger-bg text-danger border-danger/40">
|
||||||
|
חילוץ נכשל
|
||||||
|
</Badge>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
if (p.halacha_extraction_status === "pending") {
|
||||||
|
if (p.halacha_extraction_requested_at) {
|
||||||
|
return (
|
||||||
|
<Badge variant="outline" className="bg-rule-soft text-ink-muted">
|
||||||
|
ממתין לחילוץ
|
||||||
|
</Badge>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
return (
|
||||||
|
<Badge variant="outline" className="bg-rule-soft text-ink-muted">
|
||||||
|
לא חולץ
|
||||||
|
</Badge>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
// halacha_extraction_status === "completed"
|
||||||
|
if (p.halachot_count === 0) {
|
||||||
|
return <Badge variant="outline">ללא הלכות</Badge>;
|
||||||
|
}
|
||||||
|
return (
|
||||||
|
<Badge
|
||||||
|
variant="outline"
|
||||||
|
className="bg-gold-wash text-gold-deep border-gold/40"
|
||||||
|
>
|
||||||
|
{p.approved_count}/{p.halachot_count} מאושרות
|
||||||
|
</Badge>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function PrecedentRow({
|
||||||
|
p, onEdit,
|
||||||
|
}: {
|
||||||
|
p: Precedent;
|
||||||
|
onEdit: (id: string) => void;
|
||||||
|
}) {
|
||||||
|
const del = useDeletePrecedent();
|
||||||
|
const active = isPrecedentActive(p);
|
||||||
|
|
||||||
|
const onDelete = async () => {
|
||||||
|
if (active) {
|
||||||
|
toast.error(
|
||||||
|
"מתבצע עיבוד — לא ניתן למחוק עכשיו. המתיני לסיום או רעני את הדף.",
|
||||||
|
);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
if (!window.confirm(`למחוק את ${p.case_number}? cascade ימחק את ה-chunks וההלכות.`)) return;
|
||||||
|
try {
|
||||||
|
await del.mutateAsync(p.id);
|
||||||
|
toast.success("נמחק");
|
||||||
|
} catch (e) {
|
||||||
|
toast.error(e instanceof Error ? e.message : "שגיאה");
|
||||||
|
}
|
||||||
|
};
|
||||||
|
|
||||||
|
return (
|
||||||
|
<TableRow className="border-rule hover:bg-gold-wash/30 align-top">
|
||||||
|
<TableCell
|
||||||
|
/* shadcn TableCell defaults to whitespace-nowrap which forces the
|
||||||
|
* row wider than the container; for this column we want the long
|
||||||
|
* citation to wrap onto a second line instead of triggering the
|
||||||
|
* horizontal scroll on the table wrapper. min-w/max-w keeps the
|
||||||
|
* column wide enough to avoid awkward 2-word lines while leaving
|
||||||
|
* room for the other columns. */
|
||||||
|
className="font-semibold text-navy text-right whitespace-normal break-words min-w-[280px] max-w-[420px] py-3"
|
||||||
|
dir="rtl"
|
||||||
|
>
|
||||||
|
<span dir="auto">{cleanCitation(p.case_number)}</span>
|
||||||
|
</TableCell>
|
||||||
|
<TableCell className="text-ink whitespace-normal break-words max-w-[260px] py-3">
|
||||||
|
<div className="font-medium">{cleanCitation(p.case_name)}</div>
|
||||||
|
{p.court ? (
|
||||||
|
<div className="text-[0.72rem] text-ink-muted">{p.court}</div>
|
||||||
|
) : null}
|
||||||
|
</TableCell>
|
||||||
|
<TableCell className="text-ink-muted">
|
||||||
|
{p.date ? formatDate(p.date) : <span className="text-ink-light">—</span>}
|
||||||
|
</TableCell>
|
||||||
|
<TableCell>
|
||||||
|
{p.practice_area ? (
|
||||||
|
<Badge variant="outline" className="bg-navy-soft/40 text-navy border-navy/30">
|
||||||
|
{practiceAreaShort(p.practice_area)}
|
||||||
|
</Badge>
|
||||||
|
) : (
|
||||||
|
<span className="text-ink-light">—</span>
|
||||||
|
)}
|
||||||
|
</TableCell>
|
||||||
|
<TableCell className="text-ink-muted text-[0.78rem]">
|
||||||
|
{p.precedent_level ? (
|
||||||
|
p.precedent_level
|
||||||
|
) : (
|
||||||
|
<span className="text-ink-light">—</span>
|
||||||
|
)}
|
||||||
|
</TableCell>
|
||||||
|
<TableCell>
|
||||||
|
<StatusPill p={p} />
|
||||||
|
</TableCell>
|
||||||
|
<TableCell className="text-end">
|
||||||
|
<div className="flex items-center gap-1 justify-end">
|
||||||
|
<Button
|
||||||
|
variant="ghost" size="sm" onClick={() => onEdit(p.id)}
|
||||||
|
aria-label={`ערוך את ${p.case_number}`}
|
||||||
|
title="ערוך פרטים"
|
||||||
|
className="text-ink-muted hover:text-navy"
|
||||||
|
>
|
||||||
|
<Pencil className="w-4 h-4" />
|
||||||
|
</Button>
|
||||||
|
<Button
|
||||||
|
variant="ghost" size="sm" onClick={onDelete}
|
||||||
|
disabled={del.isPending || active}
|
||||||
|
aria-label={`מחק את ${p.case_number}`}
|
||||||
|
title={active ? "מתבצע עיבוד — לא ניתן למחוק" : "מחק"}
|
||||||
|
className="text-danger hover:text-danger hover:bg-danger-bg disabled:opacity-30"
|
||||||
|
>
|
||||||
|
<Trash2 className="w-4 h-4" />
|
||||||
|
</Button>
|
||||||
|
</div>
|
||||||
|
</TableCell>
|
||||||
|
</TableRow>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
export function LibraryListPanel() {
|
||||||
|
const [practiceArea, setPracticeArea] = useState<PracticeArea>("");
|
||||||
|
const [precedentLevel, setPrecedentLevel] = useState("");
|
||||||
|
const [search, setSearch] = useState("");
|
||||||
|
const [uploadOpen, setUploadOpen] = useState(false);
|
||||||
|
const [editingId, setEditingId] = useState<string | null>(null);
|
||||||
|
|
||||||
|
const { data, isPending, error } = usePrecedents({
|
||||||
|
practiceArea: practiceArea || undefined,
|
||||||
|
precedentLevel: precedentLevel || undefined,
|
||||||
|
search: search.trim() || undefined,
|
||||||
|
limit: 200,
|
||||||
|
});
|
||||||
|
|
||||||
|
return (
|
||||||
|
<div className="space-y-4">
|
||||||
|
<div className="flex items-end gap-3 flex-wrap">
|
||||||
|
<div className="flex-1 min-w-[200px]">
|
||||||
|
<label className="text-[0.78rem] text-ink-muted">חיפוש (מספר תיק / שם / תקציר)</label>
|
||||||
|
<Input
|
||||||
|
value={search}
|
||||||
|
onChange={(e) => setSearch(e.target.value)}
|
||||||
|
            placeholder={'עע"מ 3975/22'}
|
||||||
|
dir="rtl"
|
||||||
|
/>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div className="min-w-[180px]">
|
||||||
|
<label className="text-[0.78rem] text-ink-muted">תחום</label>
|
||||||
|
<Select value={practiceArea || "_all"} onValueChange={(v) => setPracticeArea(v === "_all" ? "" : v as PracticeArea)}>
|
||||||
|
<SelectTrigger><SelectValue placeholder="הכל" /></SelectTrigger>
|
||||||
|
<SelectContent>
|
||||||
|
<SelectItem value="_all">הכל</SelectItem>
|
||||||
|
{PRACTICE_AREAS.map((a) => (
|
||||||
|
<SelectItem key={a.value} value={a.value}>{a.label}</SelectItem>
|
||||||
|
))}
|
||||||
|
</SelectContent>
|
||||||
|
</Select>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<div className="min-w-[170px]">
|
||||||
|
<label className="text-[0.78rem] text-ink-muted">רמת תקדים</label>
|
||||||
|
<Select value={precedentLevel || "_all"} onValueChange={(v) => setPrecedentLevel(v === "_all" ? "" : v)}>
|
||||||
|
<SelectTrigger><SelectValue placeholder="הכל" /></SelectTrigger>
|
||||||
|
<SelectContent>
|
||||||
|
<SelectItem value="_all">הכל</SelectItem>
|
||||||
|
{PRECEDENT_LEVELS.map((l) => (
|
||||||
|
<SelectItem key={l.value} value={l.value}>{l.label}</SelectItem>
|
||||||
|
))}
|
||||||
|
</SelectContent>
|
||||||
|
</Select>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
<Button onClick={() => setUploadOpen(true)} className="bg-navy text-parchment hover:bg-navy-soft">
|
||||||
|
<Plus className="w-4 h-4 me-1" />
|
||||||
|
העלאת פסיקה
|
||||||
|
</Button>
|
||||||
|
</div>
|
||||||
|
|
||||||
|
{error ? (
|
||||||
|
<div className="rounded bg-danger-bg border border-danger/40 px-6 py-5 text-danger text-center">
|
||||||
|
{error.message}
|
||||||
|
</div>
|
||||||
|
) : (
|
||||||
|
<div className="rounded-lg border border-rule bg-surface shadow-sm overflow-hidden">
|
||||||
|
<Table>
|
||||||
|
<TableHeader className="bg-rule-soft/60">
|
||||||
|
<TableRow className="border-rule">
|
||||||
|
<TableHead className="text-navy text-right">מס׳ / מראה מקום</TableHead>
|
||||||
|
<TableHead className="text-navy text-right">שם / ערכאה</TableHead>
|
||||||
|
<TableHead className="text-navy text-right">תאריך</TableHead>
|
||||||
|
<TableHead className="text-navy text-right">תחום</TableHead>
|
||||||
|
<TableHead className="text-navy text-right">רמה</TableHead>
|
||||||
|
<TableHead className="text-navy text-right">הלכות</TableHead>
|
||||||
|
<TableHead className="text-navy" />
|
||||||
|
</TableRow>
|
||||||
|
</TableHeader>
|
||||||
|
<TableBody>
|
||||||
|
{isPending ? (
|
||||||
|
[...Array(5)].map((_, i) => (
|
||||||
|
<TableRow key={i} className="border-rule">
|
||||||
|
{[...Array(7)].map((_, j) => (
|
||||||
|
<TableCell key={j}>
|
||||||
|
<Skeleton className="h-5 w-full" />
|
||||||
|
</TableCell>
|
||||||
|
))}
|
||||||
|
</TableRow>
|
||||||
|
))
|
||||||
|
) : !data?.items.length ? (
|
||||||
|
<TableRow className="border-rule">
|
||||||
|
<TableCell colSpan={7} className="text-center text-ink-muted py-10">
|
||||||
|
אין פסיקה בקורפוס. העלה את פסק הדין הראשון.
|
||||||
|
</TableCell>
|
||||||
|
</TableRow>
|
||||||
|
) : (
|
||||||
|
data.items.map((p) => (
|
||||||
|
<PrecedentRow key={p.id} p={p} onEdit={setEditingId} />
|
||||||
|
))
|
||||||
|
)}
|
||||||
|
</TableBody>
|
||||||
|
</Table>
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
|
||||||
|
<PrecedentUploadSheet open={uploadOpen} onOpenChange={setUploadOpen} />
|
||||||
|
<PrecedentEditSheet
|
||||||
|
caseLawId={editingId}
|
||||||
|
onOpenChange={(open) => { if (!open) setEditingId(null); }}
|
||||||
|
/>
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
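Review note: the branch order in `StatusPill` can be expressed as a pure mapping, which makes the precedence between text extraction and halacha extraction testable without rendering. This is a sketch under assumptions: the real `Precedent` type is not shown in this diff, so the `Stage` values and the `PillKind` names below are hypothetical.

```typescript
// Assumed shape: only the fields StatusPill reads. The Stage values are a guess
// based on the comparisons in the component, not the real API type.
type Stage = "pending" | "processing" | "completed" | "failed";

interface PrecedentStatusFields {
  extraction_status: Stage;
  halacha_extraction_status: Stage;
  halacha_extraction_requested_at: string | null;
  halachot_count: number;
  approved_count: number;
}

// Hypothetical names for each badge the component can render.
type PillKind =
  | "text_failed"
  | "text_processing"
  | "queued"
  | "halacha_processing"
  | "halacha_failed"
  | "halacha_requested"
  | "not_extracted"
  | "no_halachot"
  | "reviewed";

function pillKind(p: PrecedentStatusFields): PillKind {
  // Text extraction is checked first; halacha state only matters once it completed.
  if (p.extraction_status === "failed") return "text_failed";
  if (p.extraction_status === "processing") return "text_processing";
  if (p.extraction_status !== "completed") return "queued";
  if (p.halacha_extraction_status === "processing") return "halacha_processing";
  if (p.halacha_extraction_status === "failed") return "halacha_failed";
  if (p.halacha_extraction_status === "pending") {
    return p.halacha_extraction_requested_at ? "halacha_requested" : "not_extracted";
  }
  // halacha_extraction_status === "completed"
  return p.halachot_count === 0 ? "no_halachot" : "reviewed";
}
```

A component could then switch on the returned kind to pick the badge label and classes, keeping the ordering logic in one place.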
173
web-ui/src/components/precedents/library-search-panel.tsx
Normal file
@@ -0,0 +1,173 @@
|
|||||||
|
"use client";
|
||||||
|
|
||||||
|
import { useState } from "react";
|
||||||
|
import { Search } from "lucide-react";
|
||||||
|
import { Input } from "@/components/ui/input";
|
||||||
|
import { Badge } from "@/components/ui/badge";
|
||||||
|
import { Button } from "@/components/ui/button";
|
||||||
|
import { Skeleton } from "@/components/ui/skeleton";
|
||||||
|
import {
|
||||||
|
Select, SelectContent, SelectItem, SelectTrigger, SelectValue,
|
||||||
|
} from "@/components/ui/select";
|
||||||
|
import {
|
||||||
|
useLibrarySearch, type PracticeArea, type SearchHit,
|
||||||
|
} from "@/lib/api/precedent-library";
|
||||||
|
import { PRACTICE_AREAS, PRECEDENT_LEVELS } from "./practice-area";
|
||||||
|
|
||||||
|
function formatDate(iso: string | null) {
|
||||||
|
if (!iso) return "—";
|
||||||
|
try {
|
||||||
|
    return Number.isNaN(Date.parse(iso)) ? iso : new Date(iso).toLocaleDateString("he-IL");
|
||||||
|
} catch {
|
||||||
|
return iso;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
function HalachaCard({ hit }: { hit: Extract<SearchHit, { type: "halacha" }> }) {
|
||||||
|
return (
|
||||||
|
<div className="rounded-lg border border-gold/40 bg-gold-wash/40 p-4 space-y-2">
|
||||||
|
<div className="flex items-center gap-2 text-[0.78rem] text-ink-muted flex-wrap">
|
||||||
|
<Badge className="bg-gold text-navy border-0">הלכה</Badge>
|
||||||
|
<span className="font-mono" dir="ltr">{hit.case_number}</span>
|
||||||
|
{hit.court && <span>· {hit.court}</span>}
|
||||||
|
{hit.decision_date && <span>· {formatDate(hit.decision_date)}</span>}
|
||||||
|
{hit.precedent_level && <span>· {hit.precedent_level}</span>}
|
||||||
|
<span className="ms-auto tabular-nums">דירוג {hit.score.toFixed(2)}</span>
|
||||||
|
</div>
|
||||||
|
<p className="text-navy font-medium text-[0.95rem]" dir="rtl">
|
||||||
|
{hit.rule_statement}
|
||||||
|
</p>
|
||||||
|
<blockquote className="text-ink-soft text-sm border-r-2 border-gold pr-3" dir="rtl">
|
||||||
|
“{hit.supporting_quote}”
|
||||||
|
{hit.page_reference && <span className="text-ink-muted text-[0.72rem] ms-2">({hit.page_reference})</span>}
|
||||||
|
</blockquote>
|
||||||
|
{hit.subject_tags?.length > 0 && (
|
||||||
|
<div className="flex flex-wrap gap-1">
|
||||||
|
{hit.subject_tags.map((t) => (
|
||||||
|
<Badge key={t} variant="outline" className="text-[0.65rem] bg-surface">
|
||||||
|
{t}
|
||||||
|
</Badge>
|
||||||
|
))}
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
function PassageCard({ hit }: { hit: Extract<SearchHit, { type: "passage" }> }) {
|
||||||
|
return (
|
||||||
|
<div className="rounded-lg border border-rule bg-surface p-4 space-y-2">
|
||||||
|
<div className="flex items-center gap-2 text-[0.78rem] text-ink-muted flex-wrap">
|
||||||
|
<Badge variant="outline" className="bg-rule-soft text-ink-muted">קטע</Badge>
|
||||||
|
<span className="font-mono" dir="ltr">{hit.case_number}</span>
|
||||||
|
{hit.court && <span>· {hit.court}</span>}
|
||||||
|
{hit.decision_date && <span>· {formatDate(hit.decision_date)}</span>}
|
||||||
|
<span className="text-[0.7rem]">· {hit.section_type}</span>
|
||||||
|
<span className="ms-auto tabular-nums">דירוג {hit.score.toFixed(2)}</span>
|
||||||
|
</div>
|
||||||
|
<p className="text-ink text-sm leading-relaxed" dir="rtl">
|
||||||
|
{hit.content.slice(0, 600)}
|
||||||
|
{hit.content.length > 600 && <span>…</span>}
|
||||||
|
</p>
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
|
|
||||||
|
export function LibrarySearchPanel() {
|
||||||
|
const [draft, setDraft] = useState("");
|
||||||
|
const [query, setQuery] = useState("");
|
||||||
|
const [practiceArea, setPracticeArea] = useState<PracticeArea>("");
|
||||||
|
const [precedentLevel, setPrecedentLevel] = useState("");
|
||||||
|
const [includeHalachot, setIncludeHalachot] = useState(true);
|
||||||
|
|
||||||
|
const { data, isFetching, error } = useLibrarySearch(query, {
|
||||||
|
practiceArea: practiceArea || undefined,
|
||||||
|
precedentLevel: precedentLevel || undefined,
|
||||||
|
includeHalachot,
|
||||||
|
limit: 20,
|
||||||
|
});
|
||||||
|
|
||||||
|
const onSubmit = (e: React.FormEvent) => {
|
||||||
|
e.preventDefault();
|
||||||
|
setQuery(draft.trim());
|
||||||
|
};
|
||||||
|
|
||||||
|
return (
|
||||||
|
<div className="space-y-4">
|
||||||
|
<form onSubmit={onSubmit} className="flex items-end gap-3 flex-wrap">
|
||||||
|
<div className="flex-1 min-w-[300px]">
|
||||||
|
<label className="text-[0.78rem] text-ink-muted">שאילתת חיפוש</label>
|
||||||
|
<Input value={draft} onChange={(e) => setDraft(e.target.value)}
|
||||||
|
placeholder="השבחה אובייקטיבית" dir="rtl" />
|
||||||
|
</div>
|
||||||
|
<div className="min-w-[180px]">
|
||||||
|
<label className="text-[0.78rem] text-ink-muted">תחום</label>
|
||||||
|
<Select value={practiceArea || "_all"}
|
||||||
|
onValueChange={(v) => setPracticeArea(v === "_all" ? "" : v as PracticeArea)}>
|
||||||
|
<SelectTrigger><SelectValue /></SelectTrigger>
|
||||||
|
<SelectContent>
|
||||||
|
<SelectItem value="_all">הכל</SelectItem>
|
||||||
|
{PRACTICE_AREAS.map((a) => (
|
||||||
|
<SelectItem key={a.value} value={a.value}>{a.label}</SelectItem>
|
||||||
|
))}
|
||||||
|
</SelectContent>
|
||||||
|
</Select>
|
||||||
|
</div>
|
||||||
|
<div className="min-w-[170px]">
|
||||||
|
<label className="text-[0.78rem] text-ink-muted">רמת תקדים</label>
|
||||||
|
<Select value={precedentLevel || "_all"}
|
||||||
|
onValueChange={(v) => setPrecedentLevel(v === "_all" ? "" : v)}>
|
||||||
|
<SelectTrigger><SelectValue /></SelectTrigger>
|
||||||
|
<SelectContent>
|
||||||
|
<SelectItem value="_all">הכל</SelectItem>
|
||||||
|
{PRECEDENT_LEVELS.map((l) => (
|
||||||
|
<SelectItem key={l.value} value={l.value}>{l.label}</SelectItem>
|
||||||
|
))}
|
||||||
|
</SelectContent>
|
||||||
|
</Select>
|
||||||
|
</div>
|
||||||
|
<Button type="submit" className="bg-navy text-parchment hover:bg-navy-soft">
|
||||||
|
<Search className="w-4 h-4 me-1" />
|
||||||
|
חפש
|
||||||
|
</Button>
|
||||||
|
</form>
|
||||||
|
|
||||||
|
<label className="flex items-center gap-2 cursor-pointer text-sm text-ink-muted">
|
||||||
|
<input type="checkbox" checked={includeHalachot}
|
||||||
|
onChange={(e) => setIncludeHalachot(e.target.checked)} />
|
||||||
|
כלול הלכות (rule-level matches)
|
||||||
|
</label>
|
||||||
|
|
||||||
|
{!query.trim() ? (
|
||||||
|
<div className="text-center text-ink-muted py-12">
|
||||||
|
הקלד שאילתא כדי לחפש בקורפוס. החיפוש סמנטי — לא טקסטואלי.
|
||||||
|
</div>
|
||||||
|
) : error ? (
|
||||||
|
<div className="rounded bg-danger-bg border border-danger/40 px-6 py-5 text-danger text-center">
|
||||||
|
{error.message}
|
||||||
|
</div>
|
||||||
|
) : isFetching ? (
|
||||||
|
<div className="space-y-3">
|
||||||
|
{[...Array(3)].map((_, i) => <Skeleton key={i} className="h-24 w-full" />)}
|
||||||
|
</div>
|
||||||
|
) : !data?.items.length ? (
|
||||||
|
<div className="text-center text-ink-muted py-12">
|
||||||
|
לא נמצאו תוצאות. נסה ניסוח אחר או הסר פילטרים.
|
||||||
|
</div>
|
||||||
|
) : (
|
||||||
|
<div className="space-y-3">
|
||||||
|
<p className="text-[0.78rem] text-ink-muted">
|
||||||
|
{data.count} תוצאות (הלכות מאושרות בלבד)
|
||||||
|
</p>
|
||||||
|
{data.items.map((hit, i) =>
|
||||||
|
hit.type === "halacha" ? (
|
||||||
|
<HalachaCard key={`h-${hit.halacha_id ?? i}`} hit={hit} />
|
||||||
|
) : (
|
||||||
|
<PassageCard key={`p-${hit.chunk_id ?? i}`} hit={hit} />
|
||||||
|
),
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
)}
|
||||||
|
</div>
|
||||||
|
);
|
||||||
|
}
|
||||||
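Review note: `HalachaCard` and `PassageCard` rely on `Extract<SearchHit, { type: ... }>` to narrow a discriminated union, and the list keys and the 600-character preview follow from that. The sketch below illustrates the pattern in isolation. The `SearchHit` shape here is a reduced assumption (the real type lives in `@/lib/api/precedent-library` and is not shown), and `hitKey` and `preview` are hypothetical helpers.

```typescript
// Reduced, assumed version of the SearchHit union; the real one has more fields.
type SearchHit =
  | { type: "halacha"; halacha_id?: string; rule_statement: string; score: number }
  | { type: "passage"; chunk_id?: string; content: string; score: number };

// Extract picks the union member(s) assignable to the given shape.
type HalachaHit = Extract<SearchHit, { type: "halacha" }>;
type PassageHit = Extract<SearchHit, { type: "passage" }>;

// Stable list key per hit, falling back to the index when the id is missing.
function hitKey(hit: SearchHit, index: number): string {
  return hit.type === "halacha"
    ? `h-${hit.halacha_id ?? index}`
    : `p-${hit.chunk_id ?? index}`;
}

// Truncate passage text for display, appending an ellipsis only when cut.
function preview(content: string, max = 600): string {
  return content.length > max ? content.slice(0, max) + "…" : content;
}

// Narrowing in action: each branch sees only its own member's fields.
function headline(hit: SearchHit): string {
  if (hit.type === "halacha") {
    const h: HalachaHit = hit;
    return h.rule_statement;
  }
  const p: PassageHit = hit;
  return preview(p.content);
}
```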
89
web-ui/src/components/precedents/library-stats-panel.tsx
Normal file
@@ -0,0 +1,89 @@
"use client";

import { Skeleton } from "@/components/ui/skeleton";
import { useLibraryStats } from "@/lib/api/precedent-library";
import { practiceAreaLabel } from "./practice-area";

function StatCard({ label, value, accent }: { label: string; value: number | string; accent?: boolean }) {
  return (
    <div
      className={`
        rounded-lg border p-5 bg-surface shadow-sm
        ${accent ? "border-gold bg-gold-wash/40" : "border-rule"}
      `}
    >
      <div className="text-[0.78rem] text-ink-muted mb-1">{label}</div>
      <div className={`text-3xl font-bold tabular-nums ${accent ? "text-gold-deep" : "text-navy"}`}>
        {value}
      </div>
    </div>
  );
}

export function LibraryStatsPanel() {
  const { data, isPending, error } = useLibraryStats();

  if (error) {
    return (
      <div className="rounded bg-danger-bg border border-danger/40 px-6 py-5 text-danger text-center">
        {error.message}
      </div>
    );
  }

  if (isPending || !data) {
    return (
      <div className="grid grid-cols-2 md:grid-cols-4 gap-3">
        {[...Array(4)].map((_, i) => <Skeleton key={i} className="h-28 w-full" />)}
      </div>
    );
  }

  return (
    <div className="space-y-6">
      <div className="grid grid-cols-2 md:grid-cols-4 gap-3">
        <StatCard label="פסיקה בקורפוס" value={data.precedents_total} />
        <StatCard label="הלכות בסך הכל" value={data.halachot_total} />
        <StatCard
          label="ממתינות לאישור" value={data.halachot_pending}
          accent={data.halachot_pending > 0}
        />
        <StatCard label="מאושרות (זמינות לסוכנים)" value={data.halachot_approved} />
      </div>

      <div className="grid md:grid-cols-2 gap-4">
        <div className="rounded-lg border border-rule bg-surface p-5">
          <h3 className="text-navy font-semibold mb-3">פילוח לפי תחום</h3>
          {data.by_practice_area.length === 0 ? (
            <p className="text-ink-muted text-sm">אין נתונים</p>
          ) : (
            <ul className="space-y-2">
              {data.by_practice_area.map((row) => (
                <li key={row.practice_area || "—"} className="flex items-center gap-2 text-sm">
                  <span className="text-ink">{practiceAreaLabel(row.practice_area || null)}</span>
                  <span className="ms-auto tabular-nums text-navy font-semibold">{row.count}</span>
                </li>
              ))}
            </ul>
          )}
        </div>

        <div className="rounded-lg border border-rule bg-surface p-5">
          <h3 className="text-navy font-semibold mb-3">פילוח לפי רמת תקדים</h3>
          {data.by_precedent_level.length === 0 ? (
            <p className="text-ink-muted text-sm">אין נתונים</p>
          ) : (
            <ul className="space-y-2">
              {data.by_precedent_level.map((row) => (
                <li key={row.precedent_level || "—"} className="flex items-center gap-2 text-sm">
                  <span className="text-ink">{row.precedent_level || "—"}</span>
                  <span className="ms-auto tabular-nums text-navy font-semibold">{row.count}</span>
                </li>
              ))}
            </ul>
          )}
        </div>
      </div>
    </div>
  );
}
37
web-ui/src/components/precedents/practice-area.ts
Normal file
@@ -0,0 +1,37 @@
/**
 * Practice-area constants for the precedent library.
 *
 * The chair confined the library to the three appeals committee
 * domains — no national-insurance corpus. The DB enforces this
 * via a CHECK constraint on case_law.practice_area.
 */

export const PRACTICE_AREAS = [
  { value: "rishuy_uvniya", label: "רישוי ובניה", short: "רישוי" },
  { value: "betterment_levy", label: "היטל השבחה", short: "השבחה" },
  { value: "compensation_197", label: "פיצויים לפי ס' 197", short: "פיצויים" },
] as const;

export const PRECEDENT_LEVELS = [
  { value: "עליון", label: "עליון" },
  { value: "מנהלי", label: "מנהלי" },
  { value: "ועדת_ערר_ארצית", label: "ועדת ערר ארצית" },
  { value: "ועדת_ערר_מחוזית", label: "ועדת ערר מחוזית" },
] as const;

export const SOURCE_TYPES = [
  { value: "court_ruling", label: "פסק דין" },
  { value: "appeals_committee", label: "החלטת ועדת ערר" },
] as const;

export function practiceAreaLabel(value: string | null | undefined): string {
  if (!value) return "—";
  const match = PRACTICE_AREAS.find((p) => p.value === value);
  return match ? match.label : value;
}

export function practiceAreaShort(value: string | null | undefined): string {
  if (!value) return "—";
  const match = PRACTICE_AREAS.find((p) => p.value === value);
  return match ? match.short : value;
}
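Review note: because `PRACTICE_AREAS` is declared `as const`, the allowed values can be derived as a union type and checked at runtime with a type guard, which would mirror the DB CHECK constraint on the client. The sketch below is an assumption about how that might be done; `PracticeAreaValue` and `isPracticeArea` are hypothetical names and do not appear in this diff.

```typescript
// Copy of the constant from practice-area.ts so the sketch is self-contained.
const PRACTICE_AREAS = [
  { value: "rishuy_uvniya", label: "רישוי ובניה", short: "רישוי" },
  { value: "betterment_levy", label: "היטל השבחה", short: "השבחה" },
  { value: "compensation_197", label: "פיצויים לפי ס' 197", short: "פיצויים" },
] as const;

// Hypothetical: union of the three literal values, derived from the array.
type PracticeAreaValue = (typeof PRACTICE_AREAS)[number]["value"];

// Hypothetical runtime guard, e.g. for validating a query-string filter.
function isPracticeArea(v: string): v is PracticeAreaValue {
  return PRACTICE_AREAS.some((p) => p.value === v);
}

// Same lookup-with-fallback behaviour as practiceAreaLabel in the diff:
// unknown values are shown as-is, empty values as a dash.
function practiceAreaLabel(value: string | null | undefined): string {
  if (!value) return "—";
  const match = PRACTICE_AREAS.find((p) => p.value === value);
  return match ? match.label : value;
}
```

Deriving the type from the array keeps a single source of truth: adding a fourth domain to the constant would widen the union automatically.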
291
web-ui/src/components/precedents/precedent-edit-sheet.tsx
Normal file
@@ -0,0 +1,291 @@
"use client";

import { useEffect, useState } from "react";
import { Save, Sparkles } from "lucide-react";
import { toast } from "sonner";
import {
  Sheet, SheetContent, SheetHeader, SheetTitle, SheetDescription,
} from "@/components/ui/sheet";
import { Button } from "@/components/ui/button";
import { Input } from "@/components/ui/input";
import { Label } from "@/components/ui/label";
import { Textarea } from "@/components/ui/textarea";
import { Skeleton } from "@/components/ui/skeleton";
import {
  Select, SelectContent, SelectItem, SelectTrigger, SelectValue,
} from "@/components/ui/select";
import {
  usePrecedent,
  useUpdatePrecedent,
  useRequestMetadataExtraction,
  type PracticeArea,
  type SourceType,
} from "@/lib/api/precedent-library";
import {
  PRACTICE_AREAS, PRECEDENT_LEVELS, SOURCE_TYPES,
} from "./practice-area";
import { ExtractedHalachotSection } from "./extracted-halachot";

type Props = {
  caseLawId: string | null;
  onOpenChange: (open: boolean) => void;
};

/* All editable fields. Pulled fresh from /api/precedent-library/{id}
 * each time the sheet opens so the form reflects any auto-fill that
 * happened in the background. */
type FormState = {
  citation: string;
  case_name: string;
  court: string;
  decision_date: string;
  practice_area: PracticeArea;
  appeal_subtype: string;
  source_type: SourceType;
  precedent_level: string;
  is_binding: boolean;
  subject_tags: string;
  summary: string;
  headnote: string;
  key_quote: string;
};

const EMPTY: FormState = {
  citation: "", case_name: "", court: "", decision_date: "",
  practice_area: "", appeal_subtype: "", source_type: "",
  precedent_level: "", is_binding: true, subject_tags: "",
  summary: "", headnote: "", key_quote: "",
};

export function PrecedentEditSheet({ caseLawId, onOpenChange }: Props) {
  const open = caseLawId !== null;
  const { data: record, isPending } = usePrecedent(caseLawId);
  const update = useUpdatePrecedent();
  const requestMetadata = useRequestMetadataExtraction();

  const [form, setForm] = useState<FormState>(EMPTY);

  // Hydrate form when the record loads.
  useEffect(() => {
    if (!record) return;
    // eslint-disable-next-line react-hooks/set-state-in-effect
    setForm({
      citation: record.case_number || "",
      case_name: record.case_name || "",
      court: record.court || "",
      decision_date: record.date ? record.date.slice(0, 10) : "",
      practice_area: (record.practice_area || "") as PracticeArea,
      appeal_subtype: record.appeal_subtype || "",
      source_type: (record.source_type || "") as SourceType,
      precedent_level: record.precedent_level || "",
      is_binding: record.is_binding ?? true,
      subject_tags: (record.subject_tags || []).join(", "),
      summary: record.summary || "",
      headnote: record.headnote || "",
      key_quote: (record as { key_quote?: string }).key_quote || "",
    });
  }, [record]);

  const onSubmit = async (e: React.FormEvent) => {
    e.preventDefault();
    if (!caseLawId) return;
    try {
      const patch: Record<string, unknown> = {
        case_name: form.case_name.trim(),
        court: form.court.trim(),
        practice_area: form.practice_area || undefined,
        appeal_subtype: form.appeal_subtype.trim(),
        source_type: form.source_type || undefined,
        precedent_level: form.precedent_level || undefined,
        is_binding: form.is_binding,
        subject_tags: form.subject_tags
          .split(",").map((t) => t.trim()).filter(Boolean),
        summary: form.summary.trim(),
        headnote: form.headnote.trim(),
        key_quote: form.key_quote.trim(),
      };
      if (form.decision_date) patch.decision_date = form.decision_date;
      // citation (case_number) is the unique key; we don't allow editing it
      // here to avoid orphaning halachot. To rename, delete + re-upload.
      await update.mutateAsync({ id: caseLawId, patch });
      toast.success("נשמר");
      onOpenChange(false);
    } catch (err) {
      toast.error(err instanceof Error ? err.message : "שגיאה");
    }
  };

  const onRequestMetadata = async () => {
    if (!caseLawId) return;
    try {
      await requestMetadata.mutateAsync(caseLawId);
      toast.success(
        "סומן לחילוץ מטא-דאטה. הריצי מ-Claude Code: precedent_process_pending",
      );
    } catch (err) {
      toast.error(err instanceof Error ? err.message : "שגיאה");
    }
  };

  return (
    <Sheet open={open} onOpenChange={(o) => { if (!o) onOpenChange(false); }}>
      <SheetContent side="left" className="w-full sm:max-w-2xl overflow-y-auto" dir="rtl">
        <SheetHeader>
          <SheetTitle className="text-navy">עריכת פרטי פסיקה</SheetTitle>
          <SheetDescription className="text-ink-muted">
            כל השדות ניתנים לעריכה חוץ ממראה המקום (מזהה ייחודי).
            כפתור "חלץ מטא-דאטה" שולח בקשה לתור מקומי שאני מרוקן
            מ-Claude Code (ה-LLM רץ מקומית עם <code>claude session</code>,
            לא ב-API).
          </SheetDescription>
        </SheetHeader>

        {isPending || !record ? (
          <div className="px-6 pb-6 mt-4 space-y-3">
            {[...Array(6)].map((_, i) => <Skeleton key={i} className="h-10 w-full" />)}
          </div>
        ) : (
          <>
            <form onSubmit={onSubmit} className="px-6 pb-6 space-y-4 mt-4">
              <div className="rounded-lg border border-rule bg-rule-soft/40 p-3 flex items-start gap-3">
                <div className="flex-1 min-w-0">
                  <div className="text-[0.78rem] text-ink-muted">מראה מקום (לא ניתן לעריכה)</div>
                  <div className="text-navy font-mono text-sm break-all" dir="ltr">
                    {record.case_number}
                  </div>
                </div>
                <Button
                  type="button" size="sm" variant="outline"
                  onClick={onRequestMetadata}
                  disabled={requestMetadata.isPending}
                  className="shrink-0"
                  title="שולח בקשה לחילוץ מטא-דאטה לתור המקומי"
                >
                  <Sparkles className="w-3.5 h-3.5 me-1" />
                  חלץ מטא-דאטה
                </Button>
              </div>

              <div className="grid grid-cols-2 gap-3">
                <div className="space-y-1">
                  <Label htmlFor="case-name">שם קצר</Label>
                  <Input id="case-name" value={form.case_name}
                    onChange={(e) => setForm({ ...form, case_name: e.target.value })}
                    placeholder="ערר 403/17 / אהרון ברק" />
                </div>
                <div className="space-y-1">
                  <Label htmlFor="court">ערכאה</Label>
                  <Input id="court" value={form.court}
                    onChange={(e) => setForm({ ...form, court: e.target.value })} />
                </div>
                <div className="space-y-1">
                  <Label htmlFor="date">תאריך</Label>
                  <Input id="date" type="date" value={form.decision_date}
                    onChange={(e) => setForm({ ...form, decision_date: e.target.value })} />
                </div>
                <div className="space-y-1">
                  <Label htmlFor="appeal-subtype">תת-סוג</Label>
                  <Input id="appeal-subtype" value={form.appeal_subtype}
                    onChange={(e) => setForm({ ...form, appeal_subtype: e.target.value })}
                    placeholder="תכנית רחביה / סופיות ההחלטה" />
                </div>
              </div>

              <div className="space-y-1">
                <Label>תחום</Label>
                <div className="flex gap-4 flex-wrap">
                  {PRACTICE_AREAS.map((a) => (
                    <label key={a.value} className="flex items-center gap-2 cursor-pointer">
                      <input type="radio" name="practice_area" value={a.value}
                        checked={form.practice_area === a.value}
                        onChange={() => setForm({ ...form, practice_area: a.value as PracticeArea })} />
                      <span className="text-sm">{a.label}</span>
                    </label>
                  ))}
                </div>
              </div>

              <div className="grid grid-cols-2 gap-3">
                <div className="space-y-1">
                  <Label htmlFor="source-type">סוג מקור</Label>
                  <Select value={form.source_type || "_none"}
                    onValueChange={(v) => setForm({ ...form, source_type: v === "_none" ? "" : v as SourceType })}>
                    <SelectTrigger><SelectValue /></SelectTrigger>
                    <SelectContent>
                      <SelectItem value="_none">—</SelectItem>
                      {SOURCE_TYPES.map((s) => (
                        <SelectItem key={s.value} value={s.value}>{s.label}</SelectItem>
                      ))}
                    </SelectContent>
                  </Select>
                </div>
                <div className="space-y-1">
                  <Label htmlFor="precedent-level">רמת תקדים</Label>
                  <Select value={form.precedent_level || "_none"}
                    onValueChange={(v) => setForm({ ...form, precedent_level: v === "_none" ? "" : v })}>
                    <SelectTrigger><SelectValue /></SelectTrigger>
                    <SelectContent>
                      <SelectItem value="_none">—</SelectItem>
                      {PRECEDENT_LEVELS.map((l) => (
                        <SelectItem key={l.value} value={l.value}>{l.label}</SelectItem>
                      ))}
                    </SelectContent>
                  </Select>
                </div>
              </div>

              <div className="space-y-1">
                <Label htmlFor="tags">תגיות נושא (מופרדות בפסיקים)</Label>
                <Input id="tags" value={form.subject_tags}
                  onChange={(e) => setForm({ ...form, subject_tags: e.target.value })}
                  placeholder="חניה, קווי בניין, שיקול דעת" />
              </div>

              <div className="space-y-1">
                <Label htmlFor="summary">תקציר (2-3 משפטים)</Label>
                <Textarea id="summary" value={form.summary} rows={3} dir="rtl"
                  onChange={(e) => setForm({ ...form, summary: e.target.value })} />
              </div>

              <div className="space-y-1">
                <Label htmlFor="headnote">Headnote (משפט-שניים)</Label>
                <Textarea id="headnote" value={form.headnote} rows={2} dir="rtl"
                  onChange={(e) => setForm({ ...form, headnote: e.target.value })} />
              </div>

              <div className="space-y-1">
                <Label htmlFor="key-quote">ציטוט מרכזי</Label>
                <Textarea id="key-quote" value={form.key_quote} rows={3} dir="rtl"
                  onChange={(e) => setForm({ ...form, key_quote: e.target.value })} />
              </div>

              <label className="flex items-center gap-2 cursor-pointer">
                <input type="checkbox" checked={form.is_binding}
                  onChange={(e) => setForm({ ...form, is_binding: e.target.checked })} />
                <span className="text-sm">הלכה מחייבת (binding)</span>
                <span className="text-[0.7rem] text-ink-muted">
                  — בדרך כלל רק עליון/מנהלי. ועדות ערר אחרות = לא מחייב.
                </span>
              </label>

              <div className="flex gap-2 justify-end pt-2 border-t border-rule-soft">
                <Button type="button" variant="ghost"
                  onClick={() => onOpenChange(false)} disabled={update.isPending}>
                  ביטול
                </Button>
                <Button type="submit" disabled={update.isPending}
                  className="bg-navy text-parchment hover:bg-navy-soft">
                  <Save className="w-4 h-4 me-1" />
                  שמור
                </Button>
              </div>
            </form>
            <div className="px-6 pb-8 pt-2 border-t border-rule">
              <ExtractedHalachotSection halachot={record.halachot ?? []} />
            </div>
          </>
        )}
      </SheetContent>
    </Sheet>
  );
}
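For review purposes, the `onSubmit` handler above can be read as two small steps: split the comma-separated tag input into clean tags, and build a patch in which empty select values become `undefined`. The sketch below isolates those steps. `parseTags`, `buildPatch`, and the reduced `MiniForm` type are hypothetical names, and it assumes the request body is eventually serialized with `JSON.stringify`, which this diff does not show.

```typescript
// Split a comma-separated input into trimmed, non-empty tags.
function parseTags(raw: string): string[] {
  return raw.split(",").map((t) => t.trim()).filter(Boolean);
}

// Hypothetical form shape, reduced to three fields for illustration.
type MiniForm = { court: string; source_type: string; subject_tags: string };

// Build a partial update: empty selects become `undefined`, so a
// JSON.stringify-based client would drop them from the request body.
function buildPatch(form: MiniForm): Record<string, unknown> {
  return {
    court: form.court.trim(),
    source_type: form.source_type || undefined,
    subject_tags: parseTags(form.subject_tags),
  };
}

const examplePatch = buildPatch({
  court: "  District  ",
  source_type: "",
  subject_tags: "parking, , setbacks,",
});
console.log(JSON.stringify(examplePatch));
```

Under that assumption, keys set to `undefined` are omitted from the serialized body, so an untouched select would not send an empty string to the backend.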
301
web-ui/src/components/precedents/precedent-upload-sheet.tsx
Normal file
@@ -0,0 +1,301 @@
"use client";

import { useEffect, useState } from "react";
import { Upload, Loader2, CheckCircle2, AlertCircle } from "lucide-react";
import { toast } from "sonner";
import { useQueryClient } from "@tanstack/react-query";
import {
  Sheet, SheetContent, SheetHeader, SheetTitle, SheetDescription,
} from "@/components/ui/sheet";
import { Button } from "@/components/ui/button";
import { Input } from "@/components/ui/input";
import { Label } from "@/components/ui/label";
import { Textarea } from "@/components/ui/textarea";
import {
  Select, SelectContent, SelectItem, SelectTrigger, SelectValue,
} from "@/components/ui/select";
import { Progress } from "@/components/ui/progress";
import {
  useUploadPrecedent, libraryKeys,
  type PracticeArea, type SourceType,
} from "@/lib/api/precedent-library";
import { useProgress } from "@/lib/api/documents";
import {
  PRACTICE_AREAS, PRECEDENT_LEVELS, SOURCE_TYPES,
} from "./practice-area";

const ACCEPT = ".pdf,.docx,.doc,.rtf,.txt,.md";

type Props = {
  open: boolean;
  onOpenChange: (open: boolean) => void;
};

export function PrecedentUploadSheet({ open, onOpenChange }: Props) {
  const [file, setFile] = useState<File | null>(null);
  const [citation, setCitation] = useState("");
  const [caseName, setCaseName] = useState("");
  const [court, setCourt] = useState("");
  const [decisionDate, setDecisionDate] = useState("");
  const [sourceType, setSourceType] = useState<SourceType>("");
  const [precedentLevel, setPrecedentLevel] = useState("");
  const [practiceArea, setPracticeArea] = useState<PracticeArea>("");
  const [appealSubtype, setAppealSubtype] = useState("");
  const [subjectTags, setSubjectTags] = useState("");
  const [headnote, setHeadnote] = useState("");
  const [isBinding, setIsBinding] = useState(true);

  const [taskId, setTaskId] = useState<string | null>(null);
  const upload = useUploadPrecedent();
  const progress = useProgress(taskId);
  const qc = useQueryClient();

  // Reset form when the sheet closes — fields, file input, and any in-flight
  // task subscription. We accept the cascade-render warning because resetting
  // form state on close is exactly the intended side effect.
  useEffect(() => {
    if (open) return;
    // eslint-disable-next-line react-hooks/set-state-in-effect
    setFile(null); setCitation(""); setCaseName(""); setCourt("");
    // eslint-disable-next-line react-hooks/set-state-in-effect
    setDecisionDate(""); setSourceType(""); setPrecedentLevel("");
    // eslint-disable-next-line react-hooks/set-state-in-effect
    setPracticeArea(""); setAppealSubtype(""); setSubjectTags("");
    // eslint-disable-next-line react-hooks/set-state-in-effect
    setHeadnote(""); setIsBinding(true); setTaskId(null);
  }, [open]);

  // Auto-close on completion + refresh library list/stats so the new
  // row appears with up-to-date counts (halachot, approved). The mutation's
  // onSuccess fires when POST returns the task_id; we need a second
  // invalidation when SSE reports terminal status, otherwise the table
  // shows stale data.
  useEffect(() => {
    if (progress?.status === "completed") {
      qc.invalidateQueries({ queryKey: libraryKeys.all });
      toast.success("הפסיקה הוכנסה לקורפוס. ההלכות ממתינות לאישור.");
      const t = window.setTimeout(() => onOpenChange(false), 1200);
      return () => window.clearTimeout(t);
    }
    if (progress?.status === "failed") {
      qc.invalidateQueries({ queryKey: libraryKeys.all });
      toast.error(`כשל בעיבוד: ${progress.error || "שגיאה לא ידועה"}`);
    }
  }, [progress, onOpenChange, qc]);

  const onSubmit = async (e: React.FormEvent) => {
    e.preventDefault();
    if (!file) {
      toast.error("בחר קובץ");
      return;
    }
    if (!citation.trim()) {
      toast.error("מראה המקום (citation) חובה");
      return;
    }
    if (!practiceArea) {
      toast.error("בחר תחום משפט");
      return;
    }
    try {
      const tags = subjectTags
        .split(",")
        .map((t) => t.trim())
        .filter(Boolean);
      const res = await upload.mutateAsync({
        file,
        citation: citation.trim(),
        case_name: caseName.trim(),
        court: court.trim(),
        decision_date: decisionDate || undefined,
        source_type: sourceType || undefined,
        precedent_level: precedentLevel || undefined,
        practice_area: practiceArea,
        appeal_subtype: appealSubtype.trim(),
        subject_tags: tags,
        is_binding: isBinding,
        headnote: headnote.trim(),
      });
      setTaskId(res.task_id);
    } catch (err) {
      toast.error(err instanceof Error ? err.message : "כשל בהעלאה");
    }
  };

  const isProcessing = taskId !== null && progress?.status !== "completed" && progress?.status !== "failed";
  const stage = (progress as { stage?: string; percent?: number; step?: string } | null)?.stage;
  const percent = (progress as { percent?: number } | null)?.percent ?? 0;

  return (
    <Sheet open={open} onOpenChange={onOpenChange}>
      <SheetContent side="left" className="w-full sm:max-w-2xl overflow-y-auto" dir="rtl">
        <SheetHeader>
          <SheetTitle className="text-navy">העלאת פסיקה לקורפוס הסמכותי</SheetTitle>
          <SheetDescription className="text-ink-muted">
            הקובץ יעבור חילוץ טקסט, יצירת embeddings, וחילוץ הלכות אוטומטי.
            ההלכות יחכו לאישורך לפני שהן זמינות לסוכני הכתיבה.
          </SheetDescription>
        </SheetHeader>

        <form onSubmit={onSubmit} className="px-6 pb-6 space-y-4 mt-4">
          {/* File */}
          <div className="space-y-1">
            <Label htmlFor="file">קובץ (PDF / DOCX / DOC / RTF / TXT / MD)</Label>
            <Input
              id="file" type="file" accept={ACCEPT}
              onChange={(e) => setFile(e.target.files?.[0] ?? null)}
              disabled={isProcessing}
            />
          </div>

          {/* Citation */}
          <div className="space-y-1">
            <Label htmlFor="citation">מראה המקום (חובה)</Label>
            <Input
              id="citation" value={citation}
              onChange={(e) => setCitation(e.target.value)}
              placeholder={`עע"מ 3975/22 ב. קרן-נכסים נ' ועדה מקומית`}
              disabled={isProcessing} dir="rtl"
            />
          </div>

          {/* Two-col grid */}
          <div className="grid grid-cols-2 gap-3">
            <div className="space-y-1">
              <Label htmlFor="case-name">שם קצר</Label>
              <Input id="case-name" value={caseName}
                onChange={(e) => setCaseName(e.target.value)}
                placeholder="ב. קרן-נכסים" disabled={isProcessing} />
            </div>
            <div className="space-y-1">
              <Label htmlFor="court">ערכאה</Label>
              <Input id="court" value={court}
                onChange={(e) => setCourt(e.target.value)}
                placeholder='בית משפט עליון / בג"ץ / מנהלי / ועדת ערר'
                disabled={isProcessing} />
            </div>
            <div className="space-y-1">
              <Label htmlFor="date">תאריך החלטה</Label>
              <Input id="date" type="date" value={decisionDate}
                onChange={(e) => setDecisionDate(e.target.value)}
                disabled={isProcessing} />
            </div>
            <div className="space-y-1">
              <Label htmlFor="appeal-subtype">תת-סוג (חופשי)</Label>
              <Input id="appeal-subtype" value={appealSubtype}
                onChange={(e) => setAppealSubtype(e.target.value)}
                placeholder="שימוש חורג / סופיות ההחלטה" disabled={isProcessing} />
            </div>
          </div>

          {/* Practice area (required radio) */}
          <div className="space-y-1">
            <Label>תחום משפט (חובה)</Label>
            <div className="flex gap-4 flex-wrap">
              {PRACTICE_AREAS.map((a) => (
                <label key={a.value} className="flex items-center gap-2 cursor-pointer">
                  <input
                    type="radio" name="practice_area" value={a.value}
                    checked={practiceArea === a.value}
                    onChange={() => setPracticeArea(a.value as PracticeArea)}
                    disabled={isProcessing}
                  />
                  <span className="text-sm text-ink">{a.label}</span>
                </label>
              ))}
            </div>
          </div>

          <div className="grid grid-cols-2 gap-3">
            <div className="space-y-1">
              <Label htmlFor="source-type">סוג מקור</Label>
              <Select value={sourceType || "_none"}
                onValueChange={(v) => setSourceType(v === "_none" ? "" : v as SourceType)}
                disabled={isProcessing}>
                <SelectTrigger><SelectValue placeholder="—" /></SelectTrigger>
                <SelectContent>
                  <SelectItem value="_none">—</SelectItem>
                  {SOURCE_TYPES.map((s) => (
                    <SelectItem key={s.value} value={s.value}>{s.label}</SelectItem>
                  ))}
                </SelectContent>
              </Select>
            </div>
            <div className="space-y-1">
              <Label htmlFor="precedent-level">רמת תקדים</Label>
              <Select value={precedentLevel || "_none"}
                onValueChange={(v) => setPrecedentLevel(v === "_none" ? "" : v)}
                disabled={isProcessing}>
                <SelectTrigger><SelectValue placeholder="—" /></SelectTrigger>
                <SelectContent>
                  <SelectItem value="_none">—</SelectItem>
                  {PRECEDENT_LEVELS.map((l) => (
                    <SelectItem key={l.value} value={l.value}>{l.label}</SelectItem>
                  ))}
                </SelectContent>
              </Select>
            </div>
          </div>

          <div className="space-y-1">
            <Label htmlFor="tags">תגיות נושא (מופרדות בפסיקים)</Label>
            <Input id="tags" value={subjectTags}
              onChange={(e) => setSubjectTags(e.target.value)}
              placeholder="חניה, קווי בניין, שיקול דעת" disabled={isProcessing} />
          </div>

          <div className="space-y-1">
            <Label htmlFor="headnote">תקציר / headnote (אופציונלי)</Label>
            <Textarea id="headnote" value={headnote} rows={2}
              onChange={(e) => setHeadnote(e.target.value)}
              placeholder="תקציר חופשי שיוצג ברשימה" disabled={isProcessing} dir="rtl" />
          </div>

          <label className="flex items-center gap-2 cursor-pointer">
            <input type="checkbox" checked={isBinding}
              onChange={(e) => setIsBinding(e.target.checked)}
              disabled={isProcessing} />
            <span className="text-sm">הלכה מחייבת</span>
          </label>

          {isProcessing && (
            <div className="rounded-lg border border-rule bg-rule-soft/40 p-4 space-y-2">
              <div className="flex items-center gap-2 text-sm text-navy">
                <Loader2 className="w-4 h-4 animate-spin" />
                <span>{(progress as { step?: string } | null)?.step || stage || "מעבד"}</span>
              </div>
              <Progress value={percent} className="h-1.5" />
            </div>
          )}

          {progress?.status === "completed" && (
            <div className="rounded-lg border border-gold/40 bg-gold-wash p-4 flex items-center gap-2 text-gold-deep text-sm">
              <CheckCircle2 className="w-4 h-4" />
              נכנס לקורפוס. ההלכות ממתינות בתור האישור.
            </div>
          )}

          {progress?.status === "failed" && (
            <div className="rounded-lg border border-danger/40 bg-danger-bg p-4 flex items-center gap-2 text-danger text-sm">
              <AlertCircle className="w-4 h-4" />
              {progress.error || "שגיאה"}
            </div>
          )}

          <div className="flex gap-2 justify-end pt-2">
            <Button type="button" variant="ghost"
              onClick={() => onOpenChange(false)} disabled={upload.isPending}>
              ביטול
            </Button>
            <Button type="submit"
              disabled={upload.isPending || isProcessing}
              className="bg-navy text-parchment hover:bg-navy-soft">
              <Upload className="w-4 h-4 me-1" />
              העלה
            </Button>
          </div>
        </form>
      </SheetContent>
    </Sheet>
  );
}
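The guard clauses at the top of `onSubmit` above amount to a first-error validation: check the required inputs in order and stop at the first problem. A minimal sketch of that idea as a pure function follows. `UploadDraft`, `firstError`, and the English messages are hypothetical and only illustrate the ordering (file, then citation, then practice area).

```typescript
// Hypothetical, reduced view of the upload form state.
type UploadDraft = {
  fileName: string | null;
  citation: string;
  practiceArea: string;
};

// Return the first validation problem, or null when the draft can be submitted.
function firstError(draft: UploadDraft): string | null {
  if (!draft.fileName) return "file is required";
  if (!draft.citation.trim()) return "citation is required";
  if (!draft.practiceArea) return "practice area is required";
  return null;
}

const draft: UploadDraft = { fileName: "ruling.pdf", citation: "   ", practiceArea: "" };
console.log(firstError(draft)); // whitespace-only citation is reported before the missing practice area
```

Keeping the checks in a pure function like this would make the ordering testable without rendering the sheet; the commit inlines them in the handler instead.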
33
web-ui/src/components/ui/switch.tsx
Normal file
@@ -0,0 +1,33 @@
"use client"

import * as React from "react"
import { Switch as SwitchPrimitive } from "radix-ui"

import { cn } from "@/lib/utils"

function Switch({
  className,
  size = "default",
  ...props
}: React.ComponentProps<typeof SwitchPrimitive.Root> & {
  size?: "sm" | "default"
}) {
  return (
    <SwitchPrimitive.Root
      data-slot="switch"
      data-size={size}
      className={cn(
        "peer group/switch relative inline-flex shrink-0 items-center rounded-full border border-transparent transition-all outline-none after:absolute after:-inset-x-3 after:-inset-y-2 focus-visible:border-ring focus-visible:ring-3 focus-visible:ring-ring/50 aria-invalid:border-destructive aria-invalid:ring-3 aria-invalid:ring-destructive/20 data-[size=default]:h-[18.4px] data-[size=default]:w-[32px] data-[size=sm]:h-[14px] data-[size=sm]:w-[24px] dark:aria-invalid:border-destructive/50 dark:aria-invalid:ring-destructive/40 data-checked:bg-primary data-unchecked:bg-input dark:data-unchecked:bg-input/80 data-disabled:cursor-not-allowed data-disabled:opacity-50",
        className
      )}
      {...props}
    >
      <SwitchPrimitive.Thumb
        data-slot="switch-thumb"
        className="pointer-events-none block rounded-full bg-background ring-0 transition-transform group-data-[size=default]/switch:size-4 group-data-[size=sm]/switch:size-3 group-data-[size=default]/switch:data-checked:translate-x-[calc(100%-2px)] rtl:group-data-[size=default]/switch:data-checked:-translate-x-[calc(100%-2px)] group-data-[size=sm]/switch:data-checked:translate-x-[calc(100%-2px)] rtl:group-data-[size=sm]/switch:data-checked:-translate-x-[calc(100%-2px)] dark:data-checked:bg-primary-foreground group-data-[size=default]/switch:data-unchecked:translate-x-0 rtl:group-data-[size=default]/switch:data-unchecked:-translate-x-0 group-data-[size=sm]/switch:data-unchecked:translate-x-0 rtl:group-data-[size=sm]/switch:data-unchecked:-translate-x-0 dark:data-unchecked:bg-foreground"
      />
    </SwitchPrimitive.Root>
  )
}

export { Switch }
@@ -217,6 +217,30 @@ export function useGitStatus(caseNumber: string | undefined) {
   });
 }
 
+export type CreateGiteaRepoResult = {
+  repo_url: string;
+  clone_url: string;
+  pushed: boolean;
+};
+
+export function useCreateGiteaRepo(caseNumber: string | undefined) {
+  const qc = useQueryClient();
+  return useMutation({
+    mutationFn: (input: { title: string; description?: string }) =>
+      apiRequest<CreateGiteaRepoResult>(`/api/integrations/gitea/create-repo`, {
+        method: "POST",
+        body: {
+          case_number: caseNumber,
+          title: input.title,
+          description: input.description ?? "",
+        },
+      }),
+    onSuccess: () => {
+      qc.invalidateQueries({ queryKey: [...casesKeys.all, "git-status", caseNumber ?? ""] });
+    },
+  });
+}
+
 export type StartWorkflowResult = {
   case_number: string;
   status: string;
@@ -22,7 +22,10 @@ export type UploadTaggedResponse = {
 };
 
 export type ProgressEvent = {
-  status: "queued" | "processing" | "completed" | "failed" | string;
+  /* "unknown" is sent by the backend when the task TTL expired or the
+   * caller subscribed before any state was published. Treat it as a
+   * terminal hint to refetch case state from the source of truth. */
+  status: "queued" | "processing" | "completed" | "failed" | "unknown" | string;
   filename?: string;
   step?: string;
   error?: string;
@@ -191,28 +194,54 @@ export function useExtractAppraiserFacts(caseNumber: string) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
export function useProgress(taskId: string | null) {
|
export function useProgress(taskId: string | null, caseNumber?: string) {
|
||||||
   const [event, setEvent] = useState<ProgressEvent | null>(null);
+  const qc = useQueryClient();
 
   useEffect(() => {
     if (!taskId) return;
     setEvent(null);
+
+    /* Self-heal fallback: if no SSE message arrives within 10s — usually
+     * because the proxy chain held the chunks or the EventSource is
+     * silently retrying — synthesize a refresh by invalidating the case
+     * detail. The actual document state is in the case detail anyway, so
+     * the UI heals from the source of truth without depending on SSE. */
+    let firstMessageReceived = false;
+    const fallback = window.setTimeout(() => {
+      if (firstMessageReceived) return;
+      if (caseNumber) qc.invalidateQueries({ queryKey: casesKeys.detail(caseNumber) });
+      setEvent({ status: "completed" });
+    }, 10_000);
+
     const close = openSSE<ProgressEvent>(
       `/api/progress/${encodeURIComponent(taskId)}`,
       {
         onMessage: (data) => {
+          firstMessageReceived = true;
           setEvent(data);
-          if (data.status === "completed" || data.status === "failed") {
-            /* Close from within the callback — the backend ends the stream
-             * naturally, but closing eagerly avoids the auto-reconnect loop
-             * EventSource does after EOF. */
+          if (
+            data.status === "completed" ||
+            data.status === "failed" ||
+            data.status === "unknown"
+          ) {
+            /* Close from within the callback so EventSource does not
+             * auto-reconnect after the server's EOF. For "unknown" we
+             * also nudge a case-detail refetch — the task state is gone
+             * but the document row will tell us the truth. */
+            if (data.status === "unknown" && caseNumber) {
+              qc.invalidateQueries({ queryKey: casesKeys.detail(caseNumber) });
+            }
             close();
           }
         },
       },
     );
-    return () => close();
-  }, [taskId]);
+    return () => {
+      window.clearTimeout(fallback);
+      close();
+    };
+  }, [taskId, caseNumber, qc]);
 
   return event;
 }
|||||||
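Review note: the new `onMessage` branch in the progress hook treats three statuses as terminal and refetches the case detail only for `"unknown"`. A minimal sketch of that decision as a pure function follows. The helper name `decideOnStatus`, the `StreamDecision` type, and the `"processing"` status and case number used in the usage lines are hypothetical and do not appear in the diff.

```typescript
// Hypothetical helper (not part of the diff): mirrors the branch inside
// onMessage so the close/refetch decision can be checked without React.
type StreamDecision = { closeStream: boolean; refetchCaseDetail: boolean };

// The three statuses the hook closes the stream on.
const TERMINAL_STATUSES = new Set(["completed", "failed", "unknown"]);

function decideOnStatus(
  status: string,
  caseNumber: string | undefined,
): StreamDecision {
  return {
    // Close eagerly so EventSource does not auto-reconnect after EOF.
    closeStream: TERMINAL_STATUSES.has(status),
    // Only "unknown" triggers a case-detail refetch, and only when a
    // case number is available to build the query key.
    refetchCaseDetail: status === "unknown" && Boolean(caseNumber),
  };
}

// Usage: "processing" is an assumed non-terminal status name.
console.log(decideOnStatus("processing", "1234-25"));
console.log(decideOnStatus("unknown", "1234-25"));
```

Keeping the decision separate from the effect makes it possible to unit-test the terminal-status list without mounting the hook.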
113
web-ui/src/lib/api/global-search.ts
Normal file
@@ -0,0 +1,113 @@
+/**
+ * Global header search — fans out to three independent sources in parallel.
+ *
+ * Each source is its own `useQuery`, so the fastest one (cases, plain SQL)
+ * shows up immediately while the slower vector searches stream in. A failure
+ * in one source does not block the others — the result panel renders per-
+ * source skeletons and per-source error states.
+ *
+ * Sources:
+ *   - cases      → GET /api/search/cases (SQL ILIKE on case_number/address/parties)
+ *   - precedent  → GET /api/precedent-library/search (semantic, halachot+chunks)
+ *   - documents  → GET /api/search (semantic, all case docs + past decisions)
+ */
+
+import { useQuery } from "@tanstack/react-query";
+import { apiRequest } from "./client";
+import type { SearchHit as PrecedentHit } from "./precedent-library";
+
+export type CaseHit = {
+  case_number: string;
+  title: string;
+  property_address: string | null;
+  status: string | null;
+  practice_area: string | null;
+  appeal_subtype: string | null;
+};
+
+export type DocumentHit = {
+  score: number;
+  case_number: string;
+  document: string;
+  section: string;
+  page: number | null;
+  content: string;
+};
+
+const MIN_QUERY_LEN = 2;
+const STALE_MS = 10_000;
+const PER_SOURCE_LIMIT = 5;
+
+const enabled = (q: string) => q.trim().length >= MIN_QUERY_LEN;
+
+export function useCasesSearch(query: string) {
+  return useQuery({
+    queryKey: ["global-search", "cases", query],
+    queryFn: ({ signal }) =>
+      apiRequest<{ items: CaseHit[]; count: number }>(
+        `/api/search/cases?q=${encodeURIComponent(query)}&limit=${PER_SOURCE_LIMIT}`,
+        { signal },
+      ),
+    enabled: enabled(query),
+    staleTime: STALE_MS,
+    placeholderData: (prev) => prev,
+  });
+}
+
+export function usePrecedentSearch(query: string) {
+  return useQuery({
+    queryKey: ["global-search", "precedent", query],
+    queryFn: ({ signal }) => {
+      const p = new URLSearchParams({
+        q: query,
+        limit: String(PER_SOURCE_LIMIT),
+        include_halachot: "true",
+      });
+      return apiRequest<{ items: PrecedentHit[]; count: number }>(
+        `/api/precedent-library/search?${p.toString()}`,
+        { signal },
+      );
+    },
+    enabled: enabled(query),
+    staleTime: STALE_MS,
+    placeholderData: (prev) => prev,
+  });
+}
+
+/**
+ * The /api/search endpoint returns either an array of DocumentHit, or
+ * `{message: "לא נמצאו תוצאות."}` when empty. Normalize to a plain array.
+ */
+export function useDocumentsSearch(query: string) {
+  return useQuery({
+    queryKey: ["global-search", "documents", query],
+    queryFn: async ({ signal }) => {
+      const raw = await apiRequest<DocumentHit[] | { message: string }>(
+        `/api/search?query=${encodeURIComponent(query)}&limit=${PER_SOURCE_LIMIT}`,
+        { signal },
+      );
+      return Array.isArray(raw) ? raw : [];
+    },
+    enabled: enabled(query),
+    staleTime: STALE_MS,
+    placeholderData: (prev) => prev,
+  });
+}
+
+export function useGlobalSearch(query: string) {
+  const cases = useCasesSearch(query);
+  const precedent = usePrecedentSearch(query);
+  const documents = useDocumentsSearch(query);
+
+  const isQueryReady = enabled(query);
+  const anyLoading =
+    isQueryReady && (cases.isLoading || precedent.isLoading || documents.isLoading);
+
+  return {
+    cases,
+    precedent,
+    documents,
+    isQueryReady,
+    anyLoading,
+  };
+}
||||||
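Review note: two small pieces of `global-search.ts` can be exercised without React or a network call: the minimum-length gate and the normalization of the `/api/search` response, which is either an array of hits or a `{ message }` object when nothing matches. The sketch below isolates both. The function names `isQueryReady` and `normalizeDocumentHits` and the sample hit are illustrative only; in the diff the logic lives inline in `enabled` and in the `queryFn` of `useDocumentsSearch`.

```typescript
// Shape copied from the DocumentHit type in the diff.
type DocumentHit = {
  score: number;
  case_number: string;
  document: string;
  section: string;
  page: number | null;
  content: string;
};

const MIN_QUERY_LEN = 2;

// Same gate as `enabled` in the diff: whitespace does not count.
function isQueryReady(q: string): boolean {
  return q.trim().length >= MIN_QUERY_LEN;
}

// The endpoint returns an array of hits, or a { message } object when
// there are no results. Collapse both shapes to a plain array.
function normalizeDocumentHits(
  raw: DocumentHit[] | { message: string },
): DocumentHit[] {
  return Array.isArray(raw) ? raw : [];
}

// Illustrative data, not taken from the real API.
const sampleHit: DocumentHit = {
  score: 0.91,
  case_number: "1234-25",
  document: "appeal.pdf",
  section: "facts",
  page: 3,
  content: "sample passage",
};

console.log(isQueryReady(" a "));
console.log(normalizeDocumentHits([sampleHit]).length);
console.log(normalizeDocumentHits({ message: "no results" }).length);
```

Normalizing inside the query function means every consumer sees `DocumentHit[]` and never has to branch on the response shape.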
455
web-ui/src/lib/api/precedent-library.ts
Normal file
@@ -0,0 +1,455 @@
|
|||||||
|
/**
|
||||||
|
* External Precedent Library hooks.
|
||||||
|
*
|
||||||
|
* The library is the authoritative case-law corpus — chair-uploaded
|
||||||
|
* court rulings + other appeals committee decisions, with halachot
|
||||||
|
* extracted automatically and queued for chair approval. Distinct from:
|
||||||
|
* - /api/training (Daphna's style corpus — sample decisions for tone)
|
||||||
|
* - /api/precedents (chair-attached quotes per case section)
|
||||||
|
*
|
||||||
|
* Endpoints touched (all under /api/precedent-library and /api/halachot):
|
||||||
|
* - POST /upload (multipart) → task_id (consumed by useProgress)
|
||||||
|
* - GET / (filters) → list
|
||||||
|
* - GET /{id} → detail with halachot
|
||||||
|
* - PATCH /{id} → metadata edit
|
||||||
|
* - DELETE /{id} → remove
|
||||||
|
* - POST /{id}/extract-halachot → re-run halacha extractor
|
||||||
|
* - GET /search → semantic search (halachot + chunks)
|
||||||
|
* - GET /stats
|
||||||
|
* - GET /api/halachot?status=... → review queue
|
||||||
|
* - PATCH /api/halachot/{id} → approve/reject/edit
|
||||||
|
*/
|
||||||
|
|
||||||
|
import { useMutation, useQuery, useQueryClient } from "@tanstack/react-query";
|
||||||
|
import { ApiError, apiRequest } from "./client";
|
||||||
|
|
||||||
|
export type PracticeArea =
|
||||||
|
| ""
|
||||||
|
| "rishuy_uvniya"
|
||||||
|
| "betterment_levy"
|
||||||
|
| "compensation_197";
|
||||||
|
|
||||||
|
export type SourceType = "" | "court_ruling" | "appeals_committee";
|
||||||
|
|
||||||
|
export type Precedent = {
|
||||||
|
id: string;
|
||||||
|
case_number: string;
|
||||||
|
case_name: string;
|
||||||
|
court: string;
|
||||||
|
date: string | null;
|
||||||
|
practice_area: PracticeArea | "";
|
||||||
|
appeal_subtype: string;
|
||||||
|
source_type: SourceType | "";
|
||||||
|
precedent_level: string;
|
||||||
|
is_binding: boolean;
|
||||||
|
summary: string;
|
||||||
|
headnote: string;
|
||||||
|
subject_tags: string[];
|
||||||
|
source_kind: string;
|
||||||
|
extraction_status: string;
|
||||||
|
halacha_extraction_status: string;
|
||||||
|
metadata_extraction_requested_at: string | null;
|
||||||
|
halacha_extraction_requested_at: string | null;
|
||||||
|
created_at: string;
|
||||||
|
halachot_count: number;
|
||||||
|
approved_count: number;
|
||||||
|
};
|
||||||
|
|
||||||
|
export type Halacha = {
|
||||||
|
id: string;
|
||||||
|
case_law_id: string;
|
||||||
|
halacha_index: number;
|
||||||
|
rule_statement: string;
|
||||||
|
rule_type: string;
|
||||||
|
reasoning_summary: string;
|
||||||
|
supporting_quote: string;
|
||||||
|
page_reference: string;
|
||||||
|
practice_areas: string[];
|
||||||
|
subject_tags: string[];
|
||||||
|
cites: string[];
|
||||||
|
confidence: number;
|
||||||
|
quote_verified: boolean;
|
||||||
|
review_status: "pending_review" | "approved" | "rejected" | "published";
|
||||||
|
reviewer: string;
|
||||||
|
reviewed_at: string | null;
|
||||||
|
created_at: string;
|
||||||
|
updated_at: string;
|
||||||
|
/* Joined from case_law for review/list views */
|
||||||
|
case_number?: string;
|
||||||
|
case_name?: string;
|
||||||
|
court?: string;
|
||||||
|
decision_date?: string | null;
|
||||||
|
precedent_level?: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
export type PrecedentDetail = Precedent & {
|
||||||
|
full_text: string;
|
||||||
|
halachot: Halacha[];
|
||||||
|
};
|
||||||
|
|
||||||
|
export type SearchHit =
|
||||||
|
| {
|
||||||
|
type: "halacha";
|
||||||
|
score: number;
|
||||||
|
halacha_id: string;
|
||||||
|
case_law_id: string;
|
||||||
|
rule_statement: string;
|
||||||
|
reasoning_summary: string;
|
||||||
|
supporting_quote: string;
|
||||||
|
page_reference: string;
|
||||||
|
practice_areas: string[];
|
||||||
|
subject_tags: string[];
|
||||||
|
confidence: number;
|
||||||
|
rule_type: string;
|
||||||
|
case_number: string;
|
||||||
|
case_name: string;
|
||||||
|
court: string;
|
||||||
|
decision_date: string | null;
|
||||||
|
precedent_level: string;
|
||||||
|
}
|
||||||
|
| {
|
||||||
|
type: "passage";
|
||||||
|
score: number;
|
||||||
|
chunk_id: string;
|
||||||
|
case_law_id: string;
|
||||||
|
content: string;
|
||||||
|
section_type: string;
|
||||||
|
page_number: number | null;
|
||||||
|
case_number: string;
|
||||||
|
case_name: string;
|
||||||
|
court: string;
|
||||||
|
decision_date: string | null;
|
||||||
|
precedent_level: string;
|
||||||
|
practice_area: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
export type LibraryStats = {
|
||||||
|
precedents_total: number;
|
||||||
|
by_practice_area: { practice_area: string; count: number }[];
|
||||||
|
by_precedent_level: { precedent_level: string; count: number }[];
|
||||||
|
halachot_total: number;
|
||||||
|
halachot_pending: number;
|
||||||
|
halachot_approved: number;
|
||||||
|
};
|
||||||
|
|
||||||
|
export type ListFilters = {
|
||||||
|
practiceArea?: PracticeArea;
|
||||||
|
court?: string;
|
||||||
|
precedentLevel?: string;
|
||||||
|
sourceType?: SourceType;
|
||||||
|
search?: string;
|
||||||
|
limit?: number;
|
||||||
|
offset?: number;
|
||||||
|
};
|
||||||
|
|
||||||
|
export const libraryKeys = {
|
||||||
|
all: ["precedent-library"] as const,
|
||||||
|
list: (filters: ListFilters) =>
|
||||||
|
[...libraryKeys.all, "list", filters] as const,
|
||||||
|
detail: (id: string) => [...libraryKeys.all, "detail", id] as const,
|
||||||
|
search: (q: string, filters: Record<string, string | boolean>) =>
|
||||||
|
[...libraryKeys.all, "search", q, filters] as const,
|
||||||
|
stats: () => [...libraryKeys.all, "stats"] as const,
|
||||||
|
halachotPending: () => [...libraryKeys.all, "halachot", "pending"] as const,
|
||||||
|
halachot: (filters: Record<string, string>) =>
|
||||||
|
[...libraryKeys.all, "halachot", filters] as const,
|
||||||
|
};
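Review note: every key produced by `libraryKeys` starts with the `["precedent-library"]` prefix, which is why the mutations in this file can invalidate `libraryKeys.all` and reach list, detail, search, stats and halachot queries at once. The sketch below illustrates the prefix relationship with a hand-written `startsWithKey` helper. That helper is hypothetical and deliberately simpler than TanStack Query's real matcher, which also does partial deep matching on object parts.

```typescript
// Reduced copy of the key factory from the diff.
const libraryKeys = {
  all: ["precedent-library"] as const,
  detail: (id: string) => [...libraryKeys.all, "detail", id] as const,
  stats: () => [...libraryKeys.all, "stats"] as const,
};

// Hypothetical, simplified prefix check: true when every element of
// `prefix` equals the element at the same position in `key`.
function startsWithKey(
  key: readonly unknown[],
  prefix: readonly unknown[],
): boolean {
  if (prefix.length > key.length) return false;
  return prefix.every(
    (part, i) => JSON.stringify(part) === JSON.stringify(key[i]),
  );
}

// A detail key sits under the shared prefix; an unrelated key does not.
console.log(startsWithKey(libraryKeys.detail("42"), libraryKeys.all));
console.log(startsWithKey(["settings", "mcp-env"], libraryKeys.all));
```

One consequence worth noting for review: where a mutation invalidates both `libraryKeys.detail(id)` and `libraryKeys.all`, the second call already covers the first.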
|
||||||
|
|
||||||
|
export function usePrecedents(filters: ListFilters = {}) {
|
||||||
|
return useQuery({
|
||||||
|
queryKey: libraryKeys.list(filters),
|
||||||
|
queryFn: ({ signal }) => {
|
||||||
|
const p = new URLSearchParams();
|
||||||
|
if (filters.practiceArea) p.set("practice_area", filters.practiceArea);
|
||||||
|
if (filters.court) p.set("court", filters.court);
|
||||||
|
if (filters.precedentLevel) p.set("precedent_level", filters.precedentLevel);
|
||||||
|
if (filters.sourceType) p.set("source_type", filters.sourceType);
|
||||||
|
if (filters.search) p.set("search", filters.search);
|
||||||
|
if (filters.limit) p.set("limit", String(filters.limit));
|
||||||
|
if (filters.offset) p.set("offset", String(filters.offset));
|
||||||
|
const qs = p.toString();
|
||||||
|
return apiRequest<{ items: Precedent[]; count: number }>(
|
||||||
|
`/api/precedent-library${qs ? `?${qs}` : ""}`,
|
||||||
|
{ signal },
|
||||||
|
);
|
||||||
|
},
|
||||||
|
staleTime: 30_000,
|
||||||
|
/* Poll while any row is mid-processing or queued for the local MCP
|
||||||
|
* worker. Once everything settles to completed/failed the polling
|
||||||
|
* stops on its own — no fixed background timer. */
|
||||||
|
refetchInterval: (query) => {
|
||||||
|
const data = query.state.data;
|
||||||
|
if (!data) return false;
|
||||||
|
const active = data.items.some((p) => isPrecedentActive(p));
|
||||||
|
return active ? 5000 : false;
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
/** A precedent is "active" while text/halacha extraction is in flight or
|
||||||
|
* legitimately queued for the local MCP worker. Used by the auto-refresh
|
||||||
|
* poller and by the row UI to disable destructive actions.
|
||||||
|
*
|
||||||
|
* Once a status is "completed" or "failed", the row is NEVER active —
|
||||||
|
* even if the corresponding `*_requested_at` timestamp still has a value.
|
||||||
|
* The worker is supposed to NULL it on success but in practice doesn't
|
||||||
|
* always, and treating those rows as active leaves them permanently
|
||||||
|
* undeletable. */
|
||||||
|
export function isPrecedentActive(p: Precedent): boolean {
|
||||||
|
// Text extraction
|
||||||
|
if (p.extraction_status === "processing") return true;
|
||||||
|
|
||||||
|
// Halacha extraction
|
||||||
|
if (p.halacha_extraction_status === "processing") return true;
|
||||||
|
if (
|
||||||
|
p.halacha_extraction_status === "pending" &&
|
||||||
|
p.halacha_extraction_requested_at !== null
|
||||||
|
) {
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
// Metadata extraction has no status column — only the timestamp.
|
||||||
|
// Treat as active only when extraction hasn't yet fully completed
|
||||||
|
// (otherwise stale timestamps linger after success).
|
||||||
|
if (
|
||||||
|
p.metadata_extraction_requested_at !== null &&
|
||||||
|
p.extraction_status !== "completed"
|
||||||
|
) {
|
||||||
|
return true;
|
||||||
|
}
|
||||||
|
|
||||||
|
return false;
|
||||||
|
}
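Review note: the rules in `isPrecedentActive` are easiest to check against concrete rows. The sketch below repeats the predicate over a reduced row type and feeds it three cases, including the stale-timestamp case the doc comment calls out. The `ExtractionState` type, the `pollIntervalMs` helper and the sample rows are illustrative. The 5000 ms interval matches the `refetchInterval` in `usePrecedents`.

```typescript
// Only the four fields the predicate reads; the real Precedent has more.
type ExtractionState = {
  extraction_status: string;
  halacha_extraction_status: string;
  metadata_extraction_requested_at: string | null;
  halacha_extraction_requested_at: string | null;
};

// Same rules as isPrecedentActive in the diff.
function isActive(p: ExtractionState): boolean {
  if (p.extraction_status === "processing") return true;
  if (p.halacha_extraction_status === "processing") return true;
  if (
    p.halacha_extraction_status === "pending" &&
    p.halacha_extraction_requested_at !== null
  ) {
    return true;
  }
  // Metadata extraction has no status column, so a leftover timestamp
  // only counts while text extraction has not completed.
  return (
    p.metadata_extraction_requested_at !== null &&
    p.extraction_status !== "completed"
  );
}

// Poll every 5 s while any row is active, otherwise stop polling.
function pollIntervalMs(items: ExtractionState[]): number | false {
  return items.some(isActive) ? 5000 : false;
}

// Finished row whose timestamps were never cleared: must not be active.
const settledWithStaleTimestamps: ExtractionState = {
  extraction_status: "completed",
  halacha_extraction_status: "completed",
  metadata_extraction_requested_at: "2025-01-01T00:00:00Z",
  halacha_extraction_requested_at: "2025-01-01T00:00:00Z",
};

// Halacha extraction requested and still waiting for the worker.
const queuedForWorker: ExtractionState = {
  extraction_status: "completed",
  halacha_extraction_status: "pending",
  metadata_extraction_requested_at: null,
  halacha_extraction_requested_at: "2025-01-02T00:00:00Z",
};

// "pending" with no request timestamp: nobody asked for extraction.
const neverRequested: ExtractionState = {
  extraction_status: "completed",
  halacha_extraction_status: "pending",
  metadata_extraction_requested_at: null,
  halacha_extraction_requested_at: null,
};

console.log(isActive(settledWithStaleTimestamps));
console.log(isActive(queuedForWorker));
console.log(pollIntervalMs([settledWithStaleTimestamps, neverRequested]));
```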
|
||||||
|
|
||||||
|
export function usePrecedent(id: string | null) {
|
||||||
|
return useQuery({
|
||||||
|
queryKey: libraryKeys.detail(id ?? ""),
|
||||||
|
queryFn: ({ signal }) =>
|
||||||
|
apiRequest<PrecedentDetail>(
|
||||||
|
`/api/precedent-library/${encodeURIComponent(id!)}`,
|
||||||
|
{ signal },
|
||||||
|
),
|
||||||
|
enabled: Boolean(id),
|
||||||
|
staleTime: 30_000,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useLibraryStats() {
|
||||||
|
return useQuery({
|
||||||
|
queryKey: libraryKeys.stats(),
|
||||||
|
queryFn: ({ signal }) =>
|
||||||
|
apiRequest<LibraryStats>("/api/precedent-library/stats", { signal }),
|
||||||
|
staleTime: 60_000,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export type SearchFilters = {
|
||||||
|
practiceArea?: PracticeArea;
|
||||||
|
court?: string;
|
||||||
|
precedentLevel?: string;
|
||||||
|
appealSubtype?: string;
|
||||||
|
subjectTag?: string;
|
||||||
|
includeHalachot?: boolean;
|
||||||
|
limit?: number;
|
||||||
|
};
|
||||||
|
|
||||||
|
export function useLibrarySearch(query: string, filters: SearchFilters = {}) {
|
||||||
|
const params: Record<string, string | boolean> = {};
|
||||||
|
if (filters.practiceArea) params.practice_area = filters.practiceArea;
|
||||||
|
if (filters.court) params.court = filters.court;
|
||||||
|
if (filters.precedentLevel) params.precedent_level = filters.precedentLevel;
|
||||||
|
if (filters.appealSubtype) params.appeal_subtype = filters.appealSubtype;
|
||||||
|
if (filters.subjectTag) params.subject_tag = filters.subjectTag;
|
||||||
|
if (filters.includeHalachot !== undefined)
|
||||||
|
params.include_halachot = filters.includeHalachot;
|
||||||
|
|
||||||
|
return useQuery({
|
||||||
|
queryKey: libraryKeys.search(query, params),
|
||||||
|
queryFn: ({ signal }) => {
|
||||||
|
const p = new URLSearchParams({ q: query });
|
||||||
|
for (const [k, v] of Object.entries(params)) p.set(k, String(v));
|
||||||
|
if (filters.limit) p.set("limit", String(filters.limit));
|
||||||
|
return apiRequest<{ items: SearchHit[]; count: number }>(
|
||||||
|
`/api/precedent-library/search?${p.toString()}`,
|
||||||
|
{ signal },
|
||||||
|
);
|
||||||
|
},
|
||||||
|
enabled: query.trim().length >= 2,
|
||||||
|
staleTime: 10_000,
|
||||||
|
placeholderData: (prev) => prev,
|
||||||
|
});
|
||||||
|
}
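Review note: `useLibrarySearch` maps camelCase filter fields to snake_case query parameters and omits anything unset. The sketch below pulls that mapping into a standalone function so the resulting query string can be inspected. `buildSearchQuery` is a hypothetical name, `practiceArea` is typed as a plain `string` here for brevity, and the sample values are made up.

```typescript
// Reduced copy of SearchFilters from the diff.
type SearchFilters = {
  practiceArea?: string;
  court?: string;
  precedentLevel?: string;
  appealSubtype?: string;
  subjectTag?: string;
  includeHalachot?: boolean;
  limit?: number;
};

// Hypothetical helper: same parameter mapping as the hook's queryFn.
function buildSearchQuery(query: string, filters: SearchFilters = {}): string {
  const p = new URLSearchParams({ q: query });
  if (filters.practiceArea) p.set("practice_area", filters.practiceArea);
  if (filters.court) p.set("court", filters.court);
  if (filters.precedentLevel) p.set("precedent_level", filters.precedentLevel);
  if (filters.appealSubtype) p.set("appeal_subtype", filters.appealSubtype);
  if (filters.subjectTag) p.set("subject_tag", filters.subjectTag);
  // `false` is a meaningful value, so test against undefined, not truthiness.
  if (filters.includeHalachot !== undefined) {
    p.set("include_halachot", String(filters.includeHalachot));
  }
  // As in the diff, a limit of 0 is treated as "not set".
  if (filters.limit) p.set("limit", String(filters.limit));
  return p.toString();
}

console.log(
  buildSearchQuery("levy", { court: "supreme", includeHalachot: false, limit: 5 }),
);
```

The `!== undefined` check on `includeHalachot` is the one place where the mapping differs from the truthiness tests, because callers need to be able to send `include_halachot=false` explicitly.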
|
||||||
|
|
||||||
|
export type PrecedentUploadInput = {
|
||||||
|
file: File;
|
||||||
|
citation: string;
|
||||||
|
case_name?: string;
|
||||||
|
court?: string;
|
||||||
|
decision_date?: string;
|
||||||
|
source_type?: SourceType;
|
||||||
|
precedent_level?: string;
|
||||||
|
practice_area?: PracticeArea;
|
||||||
|
appeal_subtype?: string;
|
||||||
|
subject_tags?: string[];
|
||||||
|
is_binding?: boolean;
|
||||||
|
headnote?: string;
|
||||||
|
summary?: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
export function useUploadPrecedent() {
|
||||||
|
const qc = useQueryClient();
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: async (input: PrecedentUploadInput) => {
|
||||||
|
const fd = new FormData();
|
||||||
|
fd.append("file", input.file);
|
||||||
|
fd.append("citation", input.citation);
|
||||||
|
if (input.case_name) fd.append("case_name", input.case_name);
|
||||||
|
if (input.court) fd.append("court", input.court);
|
||||||
|
if (input.decision_date) fd.append("decision_date", input.decision_date);
|
||||||
|
if (input.source_type) fd.append("source_type", input.source_type);
|
||||||
|
if (input.precedent_level)
|
||||||
|
fd.append("precedent_level", input.precedent_level);
|
||||||
|
if (input.practice_area)
|
||||||
|
fd.append("practice_area", input.practice_area);
|
||||||
|
if (input.appeal_subtype)
|
||||||
|
fd.append("appeal_subtype", input.appeal_subtype);
|
||||||
|
if (input.subject_tags && input.subject_tags.length)
|
||||||
|
fd.append("subject_tags", JSON.stringify(input.subject_tags));
|
||||||
|
fd.append("is_binding", String(input.is_binding ?? true));
|
||||||
|
if (input.headnote) fd.append("headnote", input.headnote);
|
||||||
|
if (input.summary) fd.append("summary", input.summary);
|
||||||
|
|
||||||
|
const res = await fetch("/api/precedent-library/upload", {
|
||||||
|
method: "POST",
|
||||||
|
body: fd,
|
||||||
|
});
|
||||||
|
const parsed = await res.json().catch(() => null);
|
||||||
|
if (!res.ok) {
|
||||||
|
throw new ApiError(
|
||||||
|
`Upload failed with ${res.status}`,
|
||||||
|
res.status,
|
||||||
|
parsed,
|
||||||
|
);
|
||||||
|
}
|
||||||
|
return parsed as { task_id: string };
|
||||||
|
},
|
||||||
|
onSuccess: () => {
|
||||||
|
qc.invalidateQueries({ queryKey: libraryKeys.all });
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useDeletePrecedent() {
|
||||||
|
const qc = useQueryClient();
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: (id: string) =>
|
||||||
|
apiRequest<{ deleted: boolean }>(
|
||||||
|
`/api/precedent-library/${encodeURIComponent(id)}`,
|
||||||
|
{ method: "DELETE" },
|
||||||
|
),
|
||||||
|
onSuccess: () => {
|
||||||
|
qc.invalidateQueries({ queryKey: libraryKeys.all });
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export type PrecedentPatch = Partial<{
|
||||||
|
case_name: string;
|
||||||
|
court: string;
|
||||||
|
decision_date: string;
|
||||||
|
practice_area: PracticeArea;
|
||||||
|
appeal_subtype: string;
|
||||||
|
subject_tags: string[];
|
||||||
|
summary: string;
|
||||||
|
headnote: string;
|
||||||
|
source_type: SourceType;
|
||||||
|
precedent_level: string;
|
||||||
|
is_binding: boolean;
|
||||||
|
}>;
|
||||||
|
|
||||||
|
export function useUpdatePrecedent() {
|
||||||
|
const qc = useQueryClient();
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: ({ id, patch }: { id: string; patch: PrecedentPatch }) =>
|
||||||
|
apiRequest<Precedent>(
|
||||||
|
`/api/precedent-library/${encodeURIComponent(id)}`,
|
||||||
|
{ method: "PATCH", body: patch },
|
||||||
|
),
|
||||||
|
onSuccess: (_, { id }) => {
|
||||||
|
qc.invalidateQueries({ queryKey: libraryKeys.detail(id) });
|
||||||
|
qc.invalidateQueries({ queryKey: libraryKeys.all });
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
/* Extraction can't run inside the container (no `claude` CLI). The
|
||||||
|
* "request" endpoints below stamp a queue marker in case_law; the chair
|
||||||
|
* (or me) drains the queue from Claude Code by invoking the MCP tool
|
||||||
|
* `precedent_process_pending`, which runs the actual extractor locally.
|
||||||
|
* See the rule in mcp-server/src/legal_mcp/services/claude_session.py. */
|
||||||
|
|
||||||
|
export function useRequestMetadataExtraction() {
|
||||||
|
const qc = useQueryClient();
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: (id: string) =>
|
||||||
|
apiRequest<{ queued: boolean }>(
|
||||||
|
`/api/precedent-library/${encodeURIComponent(id)}/request-metadata`,
|
||||||
|
{ method: "POST" },
|
||||||
|
),
|
||||||
|
onSuccess: (_, id) => {
|
||||||
|
qc.invalidateQueries({ queryKey: libraryKeys.detail(id) });
|
||||||
|
qc.invalidateQueries({ queryKey: libraryKeys.all });
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useRequestHalachotExtraction() {
|
||||||
|
const qc = useQueryClient();
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: (id: string) =>
|
||||||
|
apiRequest<{ queued: boolean }>(
|
||||||
|
`/api/precedent-library/${encodeURIComponent(id)}/request-halachot`,
|
||||||
|
{ method: "POST" },
|
||||||
|
),
|
||||||
|
onSuccess: (_, id) => {
|
||||||
|
qc.invalidateQueries({ queryKey: libraryKeys.detail(id) });
|
||||||
|
qc.invalidateQueries({ queryKey: libraryKeys.all });
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useHalachotPending(limit = 200) {
|
||||||
|
return useQuery({
|
||||||
|
queryKey: libraryKeys.halachotPending(),
|
||||||
|
queryFn: ({ signal }) =>
|
||||||
|
apiRequest<{ items: Halacha[]; count: number }>(
|
||||||
|
`/api/halachot?review_status=pending_review&limit=${limit}`,
|
||||||
|
{ signal },
|
||||||
|
),
|
||||||
|
staleTime: 5_000,
|
||||||
|
refetchOnMount: "always",
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export type HalachaPatch = Partial<{
|
||||||
|
review_status: "pending_review" | "approved" | "rejected" | "published";
|
||||||
|
reviewer: string;
|
||||||
|
rule_statement: string;
|
||||||
|
reasoning_summary: string;
|
||||||
|
subject_tags: string[];
|
||||||
|
practice_areas: string[];
|
||||||
|
}>;
|
||||||
|
|
||||||
|
export function useUpdateHalacha() {
|
||||||
|
const qc = useQueryClient();
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: ({ id, patch }: { id: string; patch: HalachaPatch }) =>
|
||||||
|
apiRequest<Halacha>(
|
||||||
|
`/api/halachot/${encodeURIComponent(id)}`,
|
||||||
|
{ method: "PATCH", body: patch },
|
||||||
|
),
|
||||||
|
onSuccess: () => {
|
||||||
|
qc.invalidateQueries({ queryKey: libraryKeys.all });
|
||||||
|
},
|
||||||
|
});
|
||||||
|
}
|
||||||
@@ -55,3 +55,141 @@ export function useDeleteTagMapping() {
|
|||||||
onSuccess: () => qc.invalidateQueries({ queryKey: ["settings", "tag-mappings"] }),
|
onSuccess: () => qc.invalidateQueries({ queryKey: ["settings", "tag-mappings"] }),
|
||||||
});
|
});
|
||||||
}
|
}
|
||||||
|
|
||||||
|
// ── MCP Settings ────────────────────────────────────────────────
|
||||||
|
|
||||||
|
export type EnvCategory =
|
||||||
|
| "multimodal"
|
||||||
|
| "rerank"
|
||||||
|
| "halacha"
|
||||||
|
| "credentials"
|
||||||
|
| "connection"
|
||||||
|
| "general";
|
||||||
|
|
||||||
|
export type EnvType = "bool" | "int" | "float" | "string";
|
||||||
|
|
||||||
|
export type McpEnvVar = {
|
||||||
|
key: string;
|
||||||
|
category: EnvCategory;
|
||||||
|
type: EnvType;
|
||||||
|
description: string;
|
||||||
|
is_secret: boolean;
|
||||||
|
is_editable: boolean;
|
||||||
|
default: unknown;
|
||||||
|
min: number | null;
|
||||||
|
max: number | null;
|
||||||
|
enum_values: string[] | null;
|
||||||
|
coolify_value: string | null;
|
||||||
|
container_value: string | null;
|
||||||
|
drift: boolean;
|
||||||
|
has_duplicates: boolean;
|
||||||
|
};
|
||||||
|
|
||||||
|
export type McpEnvResponse = {
|
||||||
|
vars: McpEnvVar[];
|
||||||
|
coolify_app_uuid: string;
|
||||||
|
errors: string[];
|
||||||
|
};
|
||||||
|
|
||||||
|
export type McpTool = {
|
||||||
|
name: string;
|
||||||
|
description: string;
|
||||||
|
params_schema: unknown;
|
||||||
|
module: string;
|
||||||
|
source_location: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
export type McpRegistration = {
|
||||||
|
client: string;
|
||||||
|
server_name: string;
|
||||||
|
command: string;
|
||||||
|
args: string[];
|
||||||
|
cwd: string;
|
||||||
|
env_keys: string[];
|
||||||
|
transport: string;
|
||||||
|
};
|
||||||
|
|
||||||
|
export function useMcpEnv() {
|
||||||
|
return useQuery({
|
||||||
|
queryKey: ["settings", "mcp-env"] as const,
|
||||||
|
queryFn: ({ signal }) =>
|
||||||
|
apiRequest<McpEnvResponse>("/api/settings/mcp/env", { signal }),
|
||||||
|
staleTime: 5_000,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useUpdateMcpEnv() {
|
||||||
|
const qc = useQueryClient();
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: ({ key, value }: { key: string; value: unknown }) =>
|
||||||
|
apiRequest<{
|
||||||
|
ok: boolean;
|
||||||
|
key: string;
|
||||||
|
saved_value: string;
|
||||||
|
requires_redeploy: boolean;
|
||||||
|
message: string;
|
||||||
|
}>(`/api/settings/mcp/env/${encodeURIComponent(key)}`, {
|
||||||
|
method: "PATCH",
|
||||||
|
body: { value },
|
||||||
|
}),
|
||||||
|
onSuccess: () => qc.invalidateQueries({ queryKey: ["settings", "mcp-env"] }),
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useMcpRedeploy() {
|
||||||
|
return useMutation({
|
||||||
|
mutationFn: () =>
|
||||||
|
apiRequest<{ ok: boolean; deployment_uuid: string | null; message: string }>(
|
||||||
|
"/api/settings/mcp/env/redeploy",
|
||||||
|
{ method: "POST" },
|
||||||
|
),
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useMcpTools() {
|
||||||
|
return useQuery({
|
||||||
|
queryKey: ["settings", "mcp-tools"] as const,
|
||||||
|
queryFn: ({ signal }) =>
|
||||||
|
apiRequest<{ tools: McpTool[]; count: number }>("/api/settings/mcp/tools", {
|
||||||
|
signal,
|
||||||
|
}),
|
||||||
|
staleTime: 60_000,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export function useMcpRegistrations() {
|
||||||
|
return useQuery({
|
||||||
|
queryKey: ["settings", "mcp-registrations"] as const,
|
||||||
|
queryFn: ({ signal }) =>
|
||||||
|
apiRequest<{
|
||||||
|
registrations: McpRegistration[];
|
||||||
|
error: string | null;
|
||||||
|
message?: string;
|
||||||
|
}>("/api/settings/mcp/registrations", { signal }),
|
||||||
|
staleTime: 60_000,
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
export type McpBlock = {
|
||||||
|
id: string;
|
||||||
|
index: number;
|
||||||
|
title: string;
|
||||||
|
gen_type: string;
|
||||||
|
model: string;
|
||||||
|
temperature: number | null;
|
||||||
|
max_tokens: number | null;
|
||||||
|
creac_role: string | null;
|
||||||
|
jwm_purpose: string | null;
|
||||||
|
};
|
||||||
|
|
||||||
|
export function useMcpBlocks() {
|
||||||
|
return useQuery({
|
||||||
|
queryKey: ["settings", "mcp-blocks"] as const,
|
||||||
|
queryFn: ({ signal }) =>
|
||||||
|
apiRequest<{ blocks: McpBlock[]; count: number }>(
|
||||||
|
"/api/settings/mcp/blocks",
|
||||||
|
{ signal },
|
||||||
|
),
|
||||||
|
staleTime: 5 * 60_000, // 5 minutes — static reference data
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|||||||
File diff suppressed because it is too large
Some files were not shown because too many files have changed in this diff