security(chat): bind chat service to docker bridge + require Bearer auth

Address security-review finding: the host-side legal-chat-service was binding 0.0.0.0:8770 with no authentication. The service spawns the claude CLI, whose tool set includes Bash + Edit — so an unauthenticated /chat/start is effectively RCE. Oracle Cloud's security list closes the port externally, but defense-in-depth requires two independent layers: 1. Bind defaults to 10.0.1.1 (docker0 bridge gateway). Reachable from containers on docker bridges (the legal-ai container has a route via the coolify network), invisible to anything outside the host. The --host flag is still configurable for local-dev (127.0.0.1) or special-case deployments, but 0.0.0.0 is explicitly discouraged in the docstring. 2. /chat/start requires Authorization: Bearer <LEGAL_CHAT_SHARED_SECRET>. The secret is loaded from /home/chaim/.legal-chat-service.env (chmod 600, off-repo) by the pm2 ecosystem and mirrored as a Coolify env var so the FastAPI chat_proxy sends a matching header. hmac.compare_digest prevents timing oracles. /health stays unauthenticated (static OK, no subprocess) so the FastAPI proxy can probe liveness without the secret. The service refuses to start if LEGAL_CHAT_SHARED_SECRET is empty or shorter than 24 chars — no silent fallback to an open mode. When the Infisical MCP comes back, migrate the secret into the vault at /_GUIDELINES per the project secrets policy. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-27 10:22:14 +00:00
parent 5ad541e54c
commit d3c6baf9e2
3 changed files with 130 additions and 27 deletions
--- a/mcp-server/src/legal_mcp/chat_service/server.py
+++ b/mcp-server/src/legal_mcp/chat_service/server.py
@@ -4,18 +4,33 @@ Endpoints:
    POST /chat/start    — body: {prompt, system?, resume_session_id?}
                          returns SSE stream of events from
                          ``claude_session.query_streaming``.
-    GET  /health        — liveness probe.
+                          REQUIRES Authorization: Bearer <secret>.
+    GET  /health        — liveness probe (no auth — used by FastAPI for status).

 Run with pm2:
-    pm2 start ecosystem.config.cjs --only legal-chat-service
+    pm2 start scripts/legal-chat-service.config.cjs

 Standalone for dev:
    cd ~/legal-ai/mcp-server
-    .venv/bin/python -m legal_mcp.chat_service.server --port 8770
+    LEGAL_CHAT_SHARED_SECRET=... .venv/bin/python -m legal_mcp.chat_service.server \
+        --port 8770 --host 10.0.1.1

-We intentionally bind to 127.0.0.1 only — the FastAPI container reaches
-us via ``host.docker.internal``, and exposing the bridge publicly would
-let anyone run claude CLI commands against Daphna's session.
+Security posture
+----------------
+1. Bind defaults to ``10.0.1.1`` — the host's docker0 bridge gateway.
+   Containers on docker bridges (including the legal-ai container, which
+   sits on the ``coolify`` network but routes to docker0 at the host)
+   can reach this address; processes outside the host cannot. Binding to
+   ``0.0.0.0`` is permitted but discouraged (relies on the cloud-level
+   firewall as the sole perimeter).
+2. ``/chat/start`` requires a ``Authorization: Bearer <LEGAL_CHAT_SHARED_SECRET>``
+   header. The secret is loaded from the environment; without it set,
+   the server refuses to start (no fallback to "open" mode, by design —
+   the claude CLI it spawns can run arbitrary tool calls, so an
+   unauthenticated /chat/start is RCE-equivalent).
+3. ``/health`` is intentionally unauthenticated so the FastAPI proxy
+   can probe liveness with no token. It returns only a static OK and
+   never spawns subprocesses, so it can't be abused.
 """

 from __future__ import annotations
@@ -42,10 +57,31 @@ from legal_mcp.services import claude_session  # noqa: E402
 logger = logging.getLogger("legal_chat_service")


+# Loaded once at startup. Validated to be non-empty in main(); the handler
+# uses a constant-time compare to avoid timing oracles on a short input.
+_SHARED_SECRET: str = ""
+
+
 async def health(request: web.Request) -> web.Response:
    return web.json_response({"ok": True, "service": "legal-chat-service"})


+def _check_bearer(request: web.Request) -> web.Response | None:
+    """Validate ``Authorization: Bearer <secret>``. Returns 401 response on failure."""
+    auth = request.headers.get("Authorization", "")
+    expected = "Bearer " + _SHARED_SECRET
+    # ``compare_digest`` defends against timing attacks. Strings of different
+    # length still leak length, but for a 43-char urlsafe token that's
+    # uninteresting and the auth scheme prefix anchors it anyway.
+    import hmac
+    if not auth or not hmac.compare_digest(auth, expected):
+        return web.json_response(
+            {"error": "unauthorized: missing or invalid Bearer token"},
+            status=401,
+        )
+    return None
+
+
 async def chat_start(request: web.Request) -> web.StreamResponse:
    """Drive ``claude_session.query_streaming`` and forward events as SSE.

@@ -55,6 +91,10 @@ async def chat_start(request: web.Request) -> web.StreamResponse:
        resume_session_id: str | None  — continue a prior CLI session
        timeout: int = 3600            — hard timeout for the subprocess
    """
+    unauth = _check_bearer(request)
+    if unauth is not None:
+        return unauth
+
    try:
        body = await request.json()
    except json.JSONDecodeError:
@@ -124,9 +164,15 @@ def build_app() -> web.Application:
 def main() -> int:
    parser = argparse.ArgumentParser(description="legal-chat-service")
    parser.add_argument("--port", type=int, default=8770)
-    parser.add_argument("--host", default="127.0.0.1",
-                        help="bind address; 127.0.0.1 keeps the service "
-                             "loopback-only — leave it alone in production")
+    parser.add_argument(
+        "--host", default="10.0.1.1",
+        help=(
+            "bind address. Default 10.0.1.1 = docker0 bridge gateway — "
+            "reachable from containers, invisible to non-host networks. "
+            "Use 127.0.0.1 for host-local dev; do not bind 0.0.0.0 "
+            "without a separate perimeter firewall."
+        ),
+    )
    parser.add_argument("--log-level", default="INFO")
    args = parser.parse_args()

@@ -135,7 +181,27 @@ def main() -> int:
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
    )

+    secret = os.environ.get("LEGAL_CHAT_SHARED_SECRET", "").strip()
+    if not secret:
+        logger.error(
+            "LEGAL_CHAT_SHARED_SECRET is empty; refusing to start. "
+            "Set it in /home/chaim/.legal-chat-service.env (loaded by "
+            "pm2) and mirror it as a Coolify env var on the legal-ai app."
+        )
+        return 2
+    if len(secret) < 24:
+        logger.error(
+            "LEGAL_CHAT_SHARED_SECRET is too short (got %d chars); "
+            "refusing to start. Use >=32 chars (e.g. python3 -c "
+            "'import secrets; print(secrets.token_urlsafe(32))').",
+            len(secret),
+        )
+        return 2
+    global _SHARED_SECRET
+    _SHARED_SECRET = secret
+
    app = build_app()
+    logger.info("legal-chat-service listening on %s:%d", args.host, args.port)
    web.run_app(app, host=args.host, port=args.port, print=lambda _msg: None)
    return 0