legal-ai/Dockerfile
Chaim 476c2fc5d1 feat(upload): accept legacy .doc, convert via LibreOffice in container
Legacy Hebrew .doc precedents (e.g. nevo.co.il CP1255 OLE2) can now be
uploaded directly through the precedent-library, missing-precedent, and
training upload paths — the frontend already advertised .doc but the
backend gate rejected it before reaching the extractor.

- web/app.py: add .doc to ALLOWED_EXTENSIONS (covers all paths that share
  the set: precedent library, missing-precedent, training).
- Dockerfile: install libreoffice-writer-nogui (no X11/Java) so the
  extractor's existing _extract_doc LibreOffice conversion works in the
  Coolify container (the package was missing, so conversion would have
  failed at runtime).
- extractor.py: isolate the LibreOffice user profile per call to avoid a
  profile-lock failure on concurrent .doc conversions.
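
A minimal sketch of the per-call profile isolation described above, not the
actual extractor.py code. It assumes the conversion shells out to `soffice`;
the helper names `build_convert_command` and `convert_doc_to_docx` are
hypothetical. The `-env:UserInstallation=` argument points LibreOffice at a
private profile directory, so concurrent conversions do not contend for one
shared profile lock.

```python
import subprocess
import tempfile
from pathlib import Path


def build_convert_command(src: Path, out_dir: Path, profile_dir: Path) -> list[str]:
    """Build a headless soffice command that uses its own user profile.

    Hypothetical helper: the real _extract_doc may assemble its command differently.
    """
    return [
        "soffice",
        # A private profile directory per call, so parallel runs never share a lock.
        f"-env:UserInstallation={profile_dir.resolve().as_uri()}",
        "--headless",
        "--convert-to", "docx",
        "--outdir", str(out_dir),
        str(src),
    ]


def convert_doc_to_docx(src: Path, timeout: float = 120.0) -> bytes:
    """Convert a legacy .doc to .docx and return the .docx bytes."""
    # The temporary directory holds both the throwaway profile and the output,
    # and is deleted when the block exits.
    with tempfile.TemporaryDirectory(prefix="lo-convert-") as tmp:
        work = Path(tmp)
        profile_dir = work / "profile"
        out_dir = work / "out"
        out_dir.mkdir()
        cmd = build_convert_command(src, out_dir, profile_dir)
        subprocess.run(cmd, check=True, capture_output=True, timeout=timeout)
        return (out_dir / f"{src.stem}.docx").read_bytes()
```

Because each call builds its profile under a fresh temporary directory, two
simultaneous uploads get distinct profile paths. The cost is a slower start
for each conversion, since LibreOffice initialises a new profile every time.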

Verified in python:3.12-slim (prod base): .doc→.docx→text yields text
byte-identical to a native Word .docx save (103 paragraphs, 24,341 chars).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-03 13:47:47 +00:00


# ══════════════════════════════════════════════════════════════
# Dockerfile — Next.js frontend + FastAPI backend (single container)
#
# The container runs both:
# - FastAPI (uvicorn) on :8000 — the API backend
# - Next.js (node) on :3000 — the frontend (proxies /api/* to :8000)
#
# start.sh launches both processes.
# ══════════════════════════════════════════════════════════════
# ── Stage 1: Node deps ────────────────────────────────────────
FROM node:20-alpine AS deps
WORKDIR /app
COPY web-ui/package.json web-ui/package-lock.json ./
RUN npm ci --no-audit --no-fund
# ── Stage 2: Build Next.js ────────────────────────────────────
FROM node:20-alpine AS builder
WORKDIR /app
COPY --from=deps /app/node_modules ./node_modules
COPY web-ui/ ./
ENV NEXT_TELEMETRY_DISABLED=1
RUN npm run build
# ── Stage 3: Install Python deps (use slim for pre-built wheels) ──
FROM python:3.12-slim AS pydeps
WORKDIR /opt/api
COPY mcp-server/ ./mcp-server/
RUN pip install --no-cache-dir ./mcp-server
# ── Stage 4: Runner ───────────────────────────────────────────
FROM python:3.12-slim AS runner
WORKDIR /app
# Install Node.js 20.x + LibreOffice Writer (headless .doc→.docx conversion
# in extractor.py:_extract_doc — needed for legacy Hebrew .doc precedents).
RUN apt-get update && apt-get install -y --no-install-recommends \
        curl ca-certificates git libreoffice-writer-nogui \
    && curl -fsSL https://deb.nodesource.com/setup_20.x | bash - \
    && apt-get install -y --no-install-recommends nodejs \
    && rm -rf /var/lib/apt/lists/*
ENV NODE_ENV=production
ENV NEXT_TELEMETRY_DISABLED=1
ENV PORT=3000
ENV HOSTNAME=0.0.0.0
# Copy Python packages from pydeps stage
COPY --from=pydeps /usr/local/lib/python3.12/site-packages /usr/local/lib/python3.12/site-packages
COPY --from=pydeps /usr/local/bin/uvicorn /usr/local/bin/uvicorn
# Copy Next.js standalone build
COPY --from=builder /app/public ./public
COPY --from=builder /app/.next/standalone ./
COPY --from=builder /app/.next/static ./.next/static
# Copy FastAPI backend code
COPY web/ ./web/
COPY mcp-server/src/ ./mcp-server/src/
# DOCX template used by analysis_docx_exporter — loaded at runtime by path
# (Path(__file__).resolve().parents[4] / "skills/docx/decision_template.docx")
COPY skills/docx/decision_template.docx ./skills/docx/decision_template.docx
# Reference content the /training tab reads at runtime:
# - .claude/agents/hermes-curator.md → GET /api/training/curator/prompt
# - skills/decision/SKILL.md → system prompt for the chat
# - docs/legal-decision-lessons.md → system prompt for the chat
# - docs/corpus-analysis.md → system prompt for the chat
#
# These are read-only at runtime; chair edits go through git, not the container.
COPY .claude/agents/hermes-curator.md ./.claude/agents/hermes-curator.md
COPY skills/decision/SKILL.md ./skills/decision/SKILL.md
COPY docs/legal-decision-lessons.md ./docs/legal-decision-lessons.md
COPY docs/corpus-analysis.md ./docs/corpus-analysis.md
# Make mcp-server source available to web/app.py (it does sys.path.insert for legal_mcp)
ENV PYTHONPATH=/app/mcp-server/src
# Copy startup script
COPY start.sh ./start.sh
RUN chmod +x ./start.sh
EXPOSE 3000
CMD ["./start.sh"]