Maximize context and output per Anthropic best practices

Per official Anthropic documentation (April 2026):

Output tokens increased to match model capabilities:
- block-yod (discussion): 8K → 32K (Opus supports 128K)
- block-zayin (claims): 4K → 16K
- block-vav (background): 4K → 16K
- claims_extractor: 4K → 8K (fixes truncated JSON)
- qa_validator: 4K → 8K

Source documents sent in full (not truncated):
- Was: 3000 chars per doc, 15K total
- Now: full document text, no truncation
- Reduces hallucinations: "extract word-for-word quotes first"

Prompt structure follows long-context tips:
- Source documents placed FIRST (top of prompt)
- Instructions and query placed LAST
- "Queries at the end improve quality by up to 30%"

Extended thinking uses adaptive mode for Opus 4.6.
Streaming enabled for all requests > 21K tokens.

Unified JSON parsing via parse_llm_json() helper in config.py.
Applied to: classifier, claims_extractor, brainstorm, qa_validator,
learning_loop (5 files).

Also: extractor.py now supports .md files.

Sources:
- https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking
- https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips
- https://docs.anthropic.com/en/docs/minimizing-hallucinations
- https://docs.anthropic.com/en/docs/about-claude/models/overview

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
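
The prompt ordering described in the message (full source documents first, instructions and query last) might be assembled roughly as follows. This is a minimal sketch, not the code from this commit: `SourceDoc` and `build_prompt` are hypothetical names, and the `<documents>` tag layout is an assumption modeled on the long-context tips page listed under Sources.

```python
from dataclasses import dataclass


@dataclass
class SourceDoc:
    """Hypothetical container for one source document."""
    name: str
    text: str


def build_prompt(docs: list[SourceDoc], instructions: str, query: str) -> str:
    """Place full document text at the top, instructions and query at the end."""
    parts = ["<documents>"]
    for index, doc in enumerate(docs, start=1):
        # The full text is included; no per-document character cap is applied.
        parts.append(
            f'<document index="{index}">\n'
            f"<source>{doc.name}</source>\n"
            f"<document_content>\n{doc.text}\n</document_content>\n"
            f"</document>"
        )
    parts.append("</documents>")
    # Instructions and the query come last, after all source material.
    parts.append(instructions)
    parts.append(query)
    return "\n\n".join(parts)
```

Keeping the query as the final element means the ordering holds however many documents are passed in.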
2026-04-03 14:17:43 +00:00
parent bed9d5c7e9
commit e24e24dac5
8 changed files with 86 additions and 81 deletions
--- a/mcp-server/src/legal_mcp/services/claims_extractor.py
+++ b/mcp-server/src/legal_mcp/services/claims_extractor.py
@@ -7,7 +7,6 @@

 from __future__ import annotations

-import json
 import logging
 import re
 from uuid import UUID
@@ -15,6 +14,7 @@ from uuid import UUID
 import anthropic

 from legal_mcp import config
+from legal_mcp.config import parse_llm_json
 from legal_mcp.services import db

 logger = logging.getLogger(__name__)
@@ -91,7 +91,7 @@ async def extract_claims_with_ai(
    client = _get_anthropic()
    message = client.messages.create(
        model="claude-sonnet-4-20250514",
-        max_tokens=4096,
+        max_tokens=8192,
        messages=[
            {
                "role": "user",
@@ -105,17 +105,8 @@ async def extract_claims_with_ai(
    )

    raw = message.content[0].text.strip()
-    # Strip markdown code blocks if present
-    raw = re.sub(r"^```(?:json)?\s*", "", raw)
-    raw = re.sub(r"\s*```$", "", raw)
-    try:
-        # Extract JSON array from response
-        json_match = re.search(r"\[.*\]", raw, re.DOTALL)
-        if json_match:
-            claims = json.loads(json_match.group())
-        else:
-            claims = json.loads(raw)
-    except json.JSONDecodeError:
+    claims = parse_llm_json(raw)
+    if claims is None:
        logger.warning("Failed to parse claims response: %s", raw[:200])
        return []
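
The body of `parse_llm_json()` in config.py is not part of this hunk. Judging only from the call site above (it takes the raw string and returns `None` on failure) and from the inline logic it replaces, a helper of this kind could look roughly like the sketch below. The name `parse_llm_json_sketch` and the fallback to a JSON object are assumptions, not the committed implementation.

```python
import json
import re
from typing import Any, Optional


def parse_llm_json_sketch(raw: str) -> Optional[Any]:
    """Hypothetical stand-in for parse_llm_json(): return parsed JSON or None."""
    text = raw.strip()
    # Strip a surrounding markdown code fence if the model added one.
    text = re.sub(r"^```(?:json)?\s*", "", text)
    text = re.sub(r"\s*```$", "", text)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Fall back to the first array or object embedded in surrounding prose.
    match = re.search(r"\[.*\]|\{.*\}", text, re.DOTALL)
    if match:
        try:
            return json.loads(match.group())
        except json.JSONDecodeError:
            return None
    return None
```

One caveat with a `None`-on-failure contract: a response that is literally `null` is indistinguishable from a parse failure, which is acceptable for callers such as `extract_claims_with_ai` that expect a list.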