fix(retrieval): make decisions findable by name + unhide committee uploads
All checks were successful
Build & Deploy / build-and-deploy (push) Successful in 3m57s
Root cause of "agent can't find the Agasi decision in the corpus" (CMPA-55): the decision was fully ingested, but the retrieval layer failed on the realistic agent query, namely searching by case name.

- RC-A (#52): lexical tsvector covered only chunk content + halacha text, so a bare-name query ("אגסי") matched decisions that *cite* the case, not the case itself. Add meta_tsv on case_law(case_name, case_number) (SCHEMA V20) and OR it into the lexical halacha/chunk SQL with a match boost, so a name/number hit surfaces the case's own rows. Agasi: rank 4 → rank 1.
- RC-B (#53): precedent_library_list hard-defaulted source_kind=external_upload and never exposed the param, hiding uploaded ערר/בל"מ (internal_committee) decisions. Thread source_kind through service → tool → MCP tool (supports 'internal_committee' / 'all_committees').
- #54: agent instructions (researcher/analyst/writer): search-by-name protocol: add content/case-number, search both corpora, use all_committees before declaring "not in corpus".
- #55: chunker produced tiny fragment chunks ("דיון", "החלטה") from header keywords matched mid-sentence. Anchor SECTION_PATTERNS to line start + merge sub-min sections; exclude <50-char fragments at query time (484 existing fragments hidden; full re-chunk tracked as #57).

Tests: scripts/test_retrieval_by_name.py (name ranks case above citer + substantive regressions); chunker unit checks (0 tiny chunks).

New findings filed as tasks #56 (halacha source_kind leak) and #57 (re-chunk migration).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
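The chunker fix described under #55 can be sketched roughly as follows. This is a minimal illustration, not the project's chunker: the real `SECTION_PATTERNS` are not shown in this commit, so `SECTION_KEYWORDS`, `MIN_SECTION_CHARS`, `HEADER_RE`, and `split_sections` are hypothetical names, and the sketch additionally assumes a header keyword stands alone on its line.

```python
import re

# Hypothetical stand-ins: the real SECTION_PATTERNS and the minimum section
# size are not shown in the commit, so these values are assumptions.
SECTION_KEYWORDS = ["דיון", "החלטה"]
MIN_SECTION_CHARS = 50

# re.MULTILINE makes ^ and $ match at line boundaries, so a keyword counts as
# a header only when it starts (and, in this sketch, fills) its own line.
HEADER_RE = re.compile(
    rf"^(?:{'|'.join(map(re.escape, SECTION_KEYWORDS))})[ \t]*:?[ \t]*$",
    re.MULTILINE,
)


def split_sections(text: str) -> list[str]:
    """Split text at header lines, then merge sections below the minimum size."""
    starts = [m.start() for m in HEADER_RE.finditer(text)]
    bounds = sorted({0, *starts, len(text)})
    raw = [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    raw = [s for s in raw if s]

    merged: list[str] = []
    for section in raw:
        if merged and len(merged[-1]) < MIN_SECTION_CHARS:
            # The previous section is too small: fold this one into it.
            merged[-1] = merged[-1] + "\n" + section
        else:
            merged.append(section)
    # A trailing undersized section is folded into its predecessor.
    if len(merged) > 1 and len(merged[-1]) < MIN_SECTION_CHARS:
        tail = merged.pop()
        merged[-1] = merged[-1] + "\n" + tail
    return merged
```

With this shape, a keyword that appears mid-sentence no longer opens a section, and a bare header with no body is merged into its neighbour instead of becoming a chunk of its own. The query-time `<50-char` filter mentioned in the commit would remain as a separate safety net for fragments already stored.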
@@ -12,6 +12,7 @@
| `sync_missing_agent_skills.py` | python | "Fail-safe" script that adds `paperclipSkillSync` to the `הגהת מסמכים` (document proofreading) and `מנתח משפטי` (legal analyst) agents, which missed the historical sync (Gap #28). Supports `--verify`/`--dry-run`/`--apply`. Automatic backup to `agents-pre-skill-sync-*.sql`. Requires `PAPERCLIP_BOARD_API_KEY` (Infisical /paperclip in the nautilus env). Idempotent. | One-off (run 2026-05-04). Kept for reference |
| `sync_agents_across_companies.py` | python | **Syncs agents from CMP (1xxx, master) to CMPA (8xxx, mirror)** (Gap #25). Compares adapter_config (model/timeout/instructions/skills/etc.), runtime_config (heartbeat), and top-level fields (budget/metadata/icon/title/role). Automatically filters out local skills that do not exist in the mirror. Uses subset logic (the mirror may hold more skills, because the API adds required runtime skills). Supports `--verify`/`--dry-run`/`--apply [--only NAME]`. Automatic backup. Requires `PAPERCLIP_BOARD_API_KEY`. **Run after every settings change in CMP.** **⚠ If `adapter_type` differs between CMP and CMPA, the script skips the agent with a warning. When switching adapters (e.g. to `deepseek_local`), both companies must be updated manually before syncing.** | Manual, after every change |
| `fix_paperclipai_skills_drift.py` | python | One-off script (run 2026-05-04) that cleaned up drift in `paperclipai/*` skills between CMP and CMPA. Removed `paperclip-dev` from all 14 agents and ensured `paperclip-converting-plans-to-tasks` exists only on CEO and analyst. Supports `--apply` (default: dry-run). Requires `PAPERCLIP_BOARD_API_KEY`. Kept for reference in case the drift recurs. | One-off (done) |
| `test_retrieval_by_name.py` | python | Retrieval-by-name test (#52/RC-A): verifies that `search_precedent_library`/`search_internal_decisions` rank the decision itself (אגסי, Agasi) above decisions that cite it, plus regressions for substantive queries. Run: `DOTENV_PATH=/home/chaim/.env DATA_DIR=.../data mcp-server/.venv/bin/python scripts/test_retrieval_by_name.py` (exit 0 = passed). | Manual, after changing the search layer |
| `auto-sync-cases.sh` | bash | Syncs appeal case files to Gitea; runs every minute | `* * * * *` (cron) |
| `backup-db.sh` | bash | Daily PostgreSQL backup to `data/backups/` (gzip) | To be scheduled: `0 2 * * *` |
| `restore-db.sh` | bash | Restores the DB from a backup (companion to backup-db.sh) | Manual |
89
scripts/test_retrieval_by_name.py
Normal file
@@ -0,0 +1,89 @@
#!/usr/bin/env python
"""Repro + regression test for retrieval-by-name (RC-A, task #52).

Bug: searching the precedent corpus by a bare case NAME ("אגסי") fails to
surface the decision itself, because the lexical tsvector covers only chunk
content + halacha text, not case_name / case_number. A name query therefore
matches decisions that *cite* the case, not the case.

Run with the MCP venv:
    DOTENV_PATH=/home/chaim/.env DATA_DIR=/home/chaim/legal-ai/data \
    mcp-server/.venv/bin/python scripts/test_retrieval_by_name.py

Exit 0 = all assertions pass. Non-zero = failure (prints what was found).
"""
import asyncio
import sys

sys.path.insert(0, "/home/chaim/legal-ai/mcp-server/src")

from legal_mcp.services import embeddings, hybrid_search  # noqa: E402

AGASI_ID = "1a87efe5-6e13-4ed4-a9ec-3f2f7d61e4ec"
# Vinfeld CITES Agasi (its halacha quote names אגסי) but is NOT Agasi.
# An exact name match must rank the case itself above any case citing it.
VINFELD_ID = "bd5d849c-c15f-43c3-96ab-d44337af9cb5"
NAME_QUERY = "אגסי"
SUBSTANTIVE_QUERY = 'פטור היטל השבחה לפי סעיף 19(ג)(1) שתי דירות 140 מ"ר אחת מושכרת'


def _ids(rows):
    return [str(r.get("case_law_id")) for r in rows]


def _rank_of(rows, cid):
    for i, r in enumerate(rows, 1):
        if str(r.get("case_law_id")) == cid:
            return i
    return None


async def _search(query, source_kind, limit=10):
    query_emb = await embeddings.embed_query(query)
    return await hybrid_search.search_precedent_library_hybrid(
        query,
        query_emb,
        source_kind=source_kind,
        limit=limit,
        include_halachot=True,
    )


async def main():
    results = {"pass": [], "fail": []}

    # 1) THE BUG: bare-name query must rank the case ITSELF (Agasi) above any
    #    case that merely CITES it (Vinfeld), and within the top 3.
    rows = await _search(NAME_QUERY, "internal_committee", limit=10)
    a_rank = _rank_of(rows, AGASI_ID)
    v_rank = _rank_of(rows, VINFELD_ID)
    ok = bool(a_rank) and a_rank <= 3 and (v_rank is None or a_rank < v_rank)
    msg = (f"[name/internal] query='{NAME_QUERY}' -> Agasi rank={a_rank}, "
           f"Vinfeld(citer) rank={v_rank} (top ids: {_ids(rows)[:5]})")
    (results["pass"] if ok else results["fail"]).append(msg)

    # 2) REGRESSION: substantive query must still find Agasi (within the top 8);
    #    the top score is reported for inspection only.
    rows = await _search(SUBSTANTIVE_QUERY, "internal_committee", limit=10)
    rank = _rank_of(rows, AGASI_ID)
    top_score = float(rows[0]["score"]) if rows else 0.0
    msg = f"[substantive/internal] Agasi rank={rank}, top_score={top_score:.3f}"
    (results["pass"] if rank and rank <= 8 else results["fail"]).append(msg)

    # 3) REGRESSION: substantive query in the full precedent library still works
    #    (Vinfeld/נווה שלום etc. should surface; just assert at least 3 rows).
    rows = await _search(SUBSTANTIVE_QUERY, "external_upload", limit=10)
    msg = f"[substantive/external] returned {len(rows)} rows (top ids: {_ids(rows)[:3]})"
    (results["pass"] if len(rows) >= 3 else results["fail"]).append(msg)

    print("\n=== PASS ===")
    for m in results["pass"]:
        print("  ✓", m)
    print("=== FAIL ===")
    for m in results["fail"]:
        print("  ✗", m)

    return 1 if results["fail"] else 0


if __name__ == "__main__":
    sys.exit(asyncio.run(main()))