feat(storage): X14 Phase 2c — route remaining sync write-sites through storage.py

Completes the write-side rewiring (INV-STG1) for the call-sites that run in
synchronous contexts, via a new blocking facade in storage.py: put_bytes_sync
and put_file_sync, which use asyncio.run, or a worker thread when an event
loop is already running. Sites rewired:
- services/extractor.py: multimodal thumbnail JPEGs → DERIVED (rendered in a
  to_thread worker)
- services/docx_reviser.py: track-changes save (_save_docx_xml) + empty-diff
  copy (copy_with_revisions) → DOCUMENTS
- services/docx_retrofit.py: in-place retrofit backup → DOCUMENTS

Each site falls back to a direct disk write when the caller-provided target
path is outside DATA_DIR. Under the default STORAGE_BACKEND=filesystem the
bytes land exactly where they did before, so behaviour is unchanged.
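
The fallback decision can be sketched as below. storage_key is a hypothetical
helper, not a function in this change; it relies only on
Path.relative_to raising ValueError for paths outside the base directory.

```python
from pathlib import Path
from typing import Optional


def storage_key(target: Path, data_dir: Path) -> Optional[str]:
    """Return the DATA_DIR-relative key for target, or None if it is outside."""
    try:
        # relative_to raises ValueError when target is not under data_dir.
        return target.resolve().relative_to(data_dir.resolve()).as_posix()
    except ValueError:
        return None
```

A caller would route the write through the storage layer when a key comes
back, and write straight to disk when the result is None.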

Also: mcp_env_catalog MINIO_ENDPOINT default updated to the durable
container-name endpoint (http://minio-bx2ykvw94xbutsex41hz4vv8:9000), matching
the Coolify "Connect to Predefined Network" change made for network durability.

All binary write-sites now flow through storage.py. Git-tracked text
(case.json/notes/research-md/draft-md) stays on disk by design (INV-STG7);
court-fetch temp files are ephemeral.

tests: +2 (thumbnail renderer routes through storage; put_bytes_sync
round-trip); 55 storage/docx/track-changes green; 244 collected, no import
breakage.

Keeps G2; completes INV-STG1 write coverage. Spec: docs/spec/X14-storage-minio.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-08 08:26:09 +00:00
parent bc5dd9ac48
commit b2ea0c28dd
6 changed files with 127 additions and 8 deletions


@@ -14,6 +14,9 @@ from __future__ import annotations
import logging
import re
import shutil
+from legal_mcp import config
+from legal_mcp.services import storage
import zipfile
from io import BytesIO
from pathlib import Path
@@ -304,10 +307,17 @@ def retrofit_bookmarks(
end_idx = len(paragraphs) - 1
ranges.append((name, start_idx, max(start_idx, end_idx)))
-    # Backup if overwriting in place
+    # Backup if overwriting in place — through the storage layer (INV-STG1).
     if backup and output_path.resolve() == docx_path.resolve():
         backup_path = docx_path.with_suffix(".pre-retrofit.docx")
-        shutil.copy2(str(docx_path), str(backup_path))
+        try:
+            _bkey = backup_path.resolve().relative_to(
+                Path(config.DATA_DIR).resolve()).as_posix()
+            storage.put_file_sync(
+                docx_path, _bkey, bucket=storage.Bucket.DOCUMENTS,
+                content_type="application/vnd.openxmlformats-officedocument.wordprocessingml.document")
+        except ValueError:
+            shutil.copy2(str(docx_path), str(backup_path))
# Inject bookmarks, skipping any that already exist
next_id = _next_bookmark_id(doc_tree)