Add CMPA (betterment levy) training support and update methodology

Support ingestion of betterment levy (היטל השבחה) decisions into a separate training corpus (CMPA).

Key changes:
- Add .doc file extraction via LibreOffice conversion in extractor
- Add practice_area/appeal_subtype columns to style_corpus table
- Route training files to cmp/ or cmpa/ subdirs based on appeal subtype
- Fix derive_subtype to handle ARAR-YY-NNNN format (was matching year digit)
- Expose practice_area/appeal_subtype params in MCP upload_training tool
- Add appeal_subtype filter to analyze_style for per-type style analysis
- Update betterment levy methodology in lessons.py: checklist (from generic to corpus-based), opening/closing strategies, and discussion rules

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 14:00:35 +00:00
parent 684a4cfd3b
commit ba39707c70
8 changed files with 145 additions and 51 deletions
--- a/mcp-server/src/legal_mcp/services/style_analyzer.py
+++ b/mcp-server/src/legal_mcp/services/style_analyzer.py
@@ -109,16 +109,27 @@ SYNTHESIS_PROMPT = """\
 """


-async def analyze_corpus() -> dict:
+async def analyze_corpus(appeal_subtype: str = "") -> dict:
    """Analyze the style corpus and extract/update patterns.

+    Args:
+        appeal_subtype: filter by appeal subtype (e.g. 'betterment_levy', 'building_permit').
+                        Empty string = all decisions.
+
    Returns summary of patterns found.
    """
    pool = await db.get_pool()
    async with pool.acquire() as conn:
-        rows = await conn.fetch(
-            "SELECT full_text, decision_number FROM style_corpus ORDER BY decision_date DESC LIMIT 20"
-        )
+        if appeal_subtype:
+            rows = await conn.fetch(
+                "SELECT full_text, decision_number FROM style_corpus "
+                "WHERE appeal_subtype = $1 ORDER BY decision_date DESC LIMIT 20",
+                appeal_subtype,
+            )
+        else:
+            rows = await conn.fetch(
+                "SELECT full_text, decision_number FROM style_corpus ORDER BY decision_date DESC LIMIT 20"
+            )

    if not rows:
        return {"error": "אין החלטות בקורפוס. העלה החלטות קודמות תחילה."}
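
Note on the hunk above: the two `conn.fetch` branches differ only in the WHERE clause. A minimal sketch of one way to assemble the statement once, assuming a hypothetical helper `build_corpus_query` that is not part of this commit, and assuming the connection accepts positional `$n` parameters as the diff suggests:

```python
from typing import Any


def build_corpus_query(appeal_subtype: str = "") -> tuple[str, list[Any]]:
    """Hypothetical helper (not in the commit): return (sql, args) for the corpus query.

    An empty appeal_subtype means "all decisions", mirroring the docstring
    of analyze_corpus in the diff.
    """
    sql = "SELECT full_text, decision_number FROM style_corpus"
    args: list[Any] = []
    if appeal_subtype:
        # Bind the value as a positional parameter rather than
        # interpolating it into the SQL string.
        args.append(appeal_subtype)
        sql += f" WHERE appeal_subtype = ${len(args)}"
    # Same ordering and limit as both branches in the diff.
    sql += " ORDER BY decision_date DESC LIMIT 20"
    return sql, args
```

Under these assumptions the call site would reduce to `sql, args = build_corpus_query(appeal_subtype)` followed by `rows = await conn.fetch(sql, *args)`, which keeps the ordering and limit defined in one place.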