feat(storage): X14 Phase 2c — route remaining sync write-sites through storage.py

Completes the write-side rewiring (INV-STG1) for the call-sites that run in
synchronous contexts, via a new blocking facade in storage.py
(put_bytes_sync / put_file_sync: asyncio.run, or a worker thread when a loop
is already running):

- services/extractor.py: multimodal thumbnail JPEGs → DERIVED (rendered in a
  to_thread worker)
- services/docx_reviser.py: track-changes save (_save_docx_xml) + empty-diff
  copy (copy_with_revisions) → DOCUMENTS
- services/docx_retrofit.py: in-place retrofit backup → DOCUMENTS

Each site keeps a fallback to a direct disk write when the target path is
outside DATA_DIR (caller-provided). Under the default
STORAGE_BACKEND=filesystem the bytes land exactly where they did before, so
there is zero behaviour change.

Also: mcp_env_catalog MINIO_ENDPOINT default updated to the durable
container-name endpoint (http://minio-bx2ykvw94xbutsex41hz4vv8:9000),
matching the Coolify "Connect to Predefined Network" change made for network
durability.

All binary write-sites now flow through storage.py. git-tracked text
(case.json/notes/research-md/draft-md) stays on disk by design (INV-STG7);
court-fetch temp files are ephemeral.

tests: +2 (thumbnail renderer routes through storage; put_bytes_sync
round-trip); 55 storage/docx/track-changes green; 244 collected, no import
breakage.

Keeps G2; completes INV-STG1 write coverage.
Spec: docs/spec/X14-storage-minio.md.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
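
The per-site fallback described in the message (storage layer when the target is under DATA_DIR, direct disk write otherwise) could look roughly like the sketch below. This is an illustration under assumptions, not the code in the three services: `DATA_DIR` here is a throwaway temp directory, `put_bytes_sync_stub` is a hypothetical stand-in for `storage.put_bytes_sync`, and deriving the key from the path relative to DATA_DIR is a guess at the convention.

```python
from pathlib import Path
import tempfile

# Hypothetical stand-in for the app's data root; the real value is not shown here.
DATA_DIR = Path(tempfile.mkdtemp(prefix="data_dir_"))

# Records which keys went through the (stubbed) storage layer.
stored_keys: list[str] = []


def put_bytes_sync_stub(key: str, data: bytes) -> str:
    """Stand-in for storage.put_bytes_sync under a filesystem-like backend."""
    dest = DATA_DIR / key
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(data)
    stored_keys.append(key)
    return key


def save_bytes(target: Path, data: bytes) -> str:
    """Write via the storage layer when target is under DATA_DIR, else directly."""
    resolved = target.resolve()
    try:
        # Assumed key convention: POSIX path relative to DATA_DIR.
        key = resolved.relative_to(DATA_DIR.resolve()).as_posix()
    except ValueError:
        # Caller-provided path outside DATA_DIR: fall back to a plain disk write.
        resolved.parent.mkdir(parents=True, exist_ok=True)
        resolved.write_bytes(data)
        return "direct"
    put_bytes_sync_stub(key, data)
    return "storage"
```

With the stub writing back under DATA_DIR, a path inside the data root ends up at the same location it would have with a direct write, which mirrors the "zero behaviour change" claim for the filesystem backend.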
@@ -37,6 +37,7 @@ is absent (the default filesystem backend needs nothing extra).
 """
 from __future__ import annotations
 
+import asyncio
 import logging
 import shutil
 import tempfile
@@ -470,3 +471,43 @@ def local_path(key, *, bucket=Bucket.DOCUMENTS) -> Path | None:
 
 async def ensure_local(key, *, bucket=Bucket.DOCUMENTS) -> Path:
     return await get_storage().ensure_local(key, bucket=bucket)
+
+
+# ── synchronous facade ─────────────────────────────────────────────
+# A few legacy writers are plain sync functions (track-changes save, retrofit
+# backup, the multimodal thumbnail renderer which runs in a worker thread via
+# asyncio.to_thread). They go through the same layer via this blocking shim so
+# INV-STG1 holds everywhere.
+
+def _run_coro_blocking(coro):
+    """Run a storage coroutine to completion from synchronous code.
+
+    No running loop in this thread (the common case — sync helpers, or a
+    to_thread worker) → asyncio.run. If a loop *is* already running here, the
+    coroutine is offloaded to a fresh thread so we never deadlock the loop."""
+    try:
+        asyncio.get_running_loop()
+    except RuntimeError:
+        return asyncio.run(coro)
+    box: dict = {}
+
+    def _worker():
+        box["value"] = asyncio.run(coro)
+
+    import threading
+    t = threading.Thread(target=_worker)
+    t.start()
+    t.join()
+    return box["value"]
+
+
+def put_bytes_sync(key, data, *, bucket=Bucket.DOCUMENTS, content_type=None,
+                   metadata=None) -> str:
+    return _run_coro_blocking(
+        put_bytes(key, data, bucket=bucket, content_type=content_type, metadata=metadata))
+
+
+def put_file_sync(src, key, *, bucket=Bucket.DOCUMENTS, content_type=None,
+                  metadata=None) -> str:
+    return _run_coro_blocking(
+        put_file(src, key, bucket=bucket, content_type=content_type, metadata=metadata))