Maximize context and output per Anthropic best practices

Per official Anthropic documentation (April 2026):

Output token limits (max_tokens) raised to match model capabilities:
- block-yod (discussion): 8K → 32K (Opus supports 128K)
- block-zayin (claims): 4K → 16K
- block-vav (background): 4K → 16K
- claims_extractor: 4K → 8K (fixes truncated JSON)
- qa_validator: 4K → 8K
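
A minimal sketch of how the limits above could be kept in one place. It assumes "K" means 1024 tokens (consistent with the 4096 to 8192 change in the diff); the names MAX_OUTPUT_TOKENS, DEFAULT_MAX_TOKENS and max_tokens_for are hypothetical and are not taken from the project's config.py.

```python
# Hypothetical lookup table for per-component output limits.
# Assumption: "K" in the commit message means 1024 tokens.
MAX_OUTPUT_TOKENS = {
    "block-yod": 32 * 1024,        # discussion
    "block-zayin": 16 * 1024,      # claims
    "block-vav": 16 * 1024,        # background
    "claims_extractor": 8 * 1024,
    "qa_validator": 8 * 1024,
}

# Assumed fallback for components not listed above.
DEFAULT_MAX_TOKENS = 4 * 1024


def max_tokens_for(component: str) -> int:
    """Return the output-token limit for a component, or the fallback."""
    return MAX_OUTPUT_TOKENS.get(component, DEFAULT_MAX_TOKENS)
```

A caller would then pass max_tokens=max_tokens_for("qa_validator") instead of a literal.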

Source documents sent in full (not truncated):
- Was: 3000 chars per doc, 15K total
- Now: full document text, no truncation
- Reduces hallucinations, per the docs' advice to "extract word-for-word quotes first"

Prompt structure follows long-context tips:
- Source documents placed FIRST (top of prompt)
- Instructions and query placed LAST
- "Queries at the end improve quality by up to 30%"
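
The ordering described above can be sketched as a small prompt builder. This is an illustration under assumptions, not the project's actual code: the function name build_long_context_prompt and its parameters are hypothetical, and the XML-style document wrapper follows the layout suggested in the long-context tips.

```python
def build_long_context_prompt(
    documents: list[tuple[str, str]],
    instructions: str,
    query: str,
) -> str:
    """Place full source documents first, then instructions, then the query.

    `documents` is a list of (source_name, full_text) pairs; the text is
    included as-is, with no truncation.
    """
    parts = ["<documents>"]
    for index, (source, text) in enumerate(documents, start=1):
        parts.append(f'<document index="{index}">')
        parts.append(f"<source>{source}</source>")
        parts.append(f"<document_content>\n{text}\n</document_content>")
        parts.append("</document>")
    parts.append("</documents>")
    parts.append("")
    parts.append(instructions)  # instructions come after the documents
    parts.append("")
    parts.append(query)         # the query is the last thing in the prompt
    return "\n".join(parts)
```

Keeping the builder in one function makes it harder for an individual caller to put the query back at the top by accident.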

Extended thinking uses adaptive mode for Opus 4.6.
Streaming enabled for all requests > 21K tokens.
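
A minimal sketch of the streaming rule above. It assumes that "21K" means 21,000 tokens and that the quantity compared is the request's max_tokens; the commit message specifies neither, and the names STREAMING_THRESHOLD_TOKENS and should_stream are hypothetical.

```python
# Assumption: "21K" is read as 21,000 tokens and is compared against the
# request's max_tokens value.
STREAMING_THRESHOLD_TOKENS = 21_000


def should_stream(max_tokens: int) -> bool:
    """Return True when a request is large enough to be streamed."""
    return max_tokens > STREAMING_THRESHOLD_TOKENS
```

With the Anthropic Python SDK, a request for which this returns True would typically go through client.messages.stream(...) rather than client.messages.create(...).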

Unified JSON parsing via parse_llm_json() helper in config.py.
Applied to: classifier, claims_extractor, brainstorm, qa_validator,
learning_loop (5 files).
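
The real parse_llm_json() lives in config.py and is not shown in this commit view. The following is only a sketch of what such a helper might look like, reconstructed from the inline parsing code that the diff removes and from the caller's `if parsed is None` check; the actual implementation may differ.

```python
import json
import re
from typing import Any, Optional

# Built at runtime so the three-backtick marker never appears literally here.
_FENCE = "`" * 3


def parse_llm_json(raw: str) -> Optional[Any]:
    """Best-effort JSON parsing of an LLM reply; returns None on failure."""
    text = raw.strip()
    # Strip a leading/trailing markdown code fence if present.
    text = re.sub(r"^" + _FENCE + r"(?:json)?\s*", "", text)
    text = re.sub(r"\s*" + _FENCE + r"$", "", text)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        pass
    # Fall back to the outermost {...} span embedded in surrounding prose.
    match = re.search(r"\{.*\}", text, re.DOTALL)
    if match is None:
        return None
    try:
        return json.loads(match.group())
    except json.JSONDecodeError:
        return None
```

Returning None instead of raising lets each of the five callers decide on its own fallback, as qa_validator does when it logs a warning and assumes coverage.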

Also: extractor.py now supports .md files.

Sources:
- https://docs.anthropic.com/en/docs/build-with-claude/extended-thinking
- https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/long-context-tips
- https://docs.anthropic.com/en/docs/minimizing-hallucinations
- https://docs.anthropic.com/en/docs/about-claude/models/overview

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-03 14:17:43 +00:00
parent bed9d5c7e9
commit e24e24dac5
8 changed files with 86 additions and 81 deletions


@@ -21,6 +21,7 @@ from uuid import UUID
 import anthropic
 from legal_mcp import config
+from legal_mcp.config import parse_llm_json
 from legal_mcp.services import db
 logger = logging.getLogger(__name__)
@@ -139,7 +140,7 @@ async def check_claims_coverage(blocks: list[dict], claims: list[dict]) -> dict:
     client = _get_anthropic()
     message = client.messages.create(
         model="claude-haiku-4-5-20251001",
-        max_tokens=4096,
+        max_tokens=8192,
         messages=[{
             "role": "user",
             "content": f"""{CLAIMS_CHECK_PROMPT}
@@ -153,13 +154,8 @@ async def check_claims_coverage(blocks: list[dict], claims: list[dict]) -> dict:
     )
     raw = message.content[0].text.strip()
-    # Strip markdown code blocks if present
-    raw = re.sub(r"^```(?:json)?\s*", "", raw)
-    raw = re.sub(r"\s*```$", "", raw)
-    try:
-        json_match = re.search(r"\{.*\}", raw, re.DOTALL)
-        parsed = json.loads(json_match.group()) if json_match else json.loads(raw)
-    except (json.JSONDecodeError, AttributeError):
+    parsed = parse_llm_json(raw)
+    if parsed is None:
         logger.warning("Failed to parse claims check: %s", raw[:300])
         # Fallback: assume all covered (don't block export on parse failure)
         return {"name": "claims_coverage", "passed": True,