feat(graph): in-app corpus citation graph (/graph) — Phase 1

Native network view of the precedent corpus, in the style of Obsidian's graph
view, rendered in web-ui from a read-only projection of the live DB. Replaces
the idea of exporting to an external Obsidian vault, which would be a parallel,
drifting copy of the corpus: the exact root cause G2 forbids.

The graph edges already existed in the data model; this only surfaces them:
nodes = precedents (case_law) + synthesized topic/practice-area hubs;
edges = cites (precedent_internal_citations) + same_chain (case_law_relations)
+ tagged/in_area (subject_tags / practice_area membership). Node size =
incoming-citation count (index-backed GROUP BY on idx_pic_target). Click a
node → local-graph neighborhood focus; panel deep-links to /precedents/[id].
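
A minimal sketch of the sizing and capping rule described above, assuming the
citation rows reduce to (source, target) pairs. The sample data, the function
names, and the limit of 500 are illustrative only; the real work is done in
SQL inside graph_api, and the value of NODE_CAP_DEFAULT is not shown here.

```python
from collections import Counter

# Hypothetical (source_id, target_id) pairs standing in for rows of
# precedent_internal_citations; the real column names are not shown.
cites = [("a", "b"), ("c", "b"), ("a", "c"), ("d", "b")]


def incoming_counts(edges):
    """Count citations per cited node (in-memory analogue of GROUP BY target)."""
    return Counter(target for _source, target in edges)


def top_nodes(node_ids, counts, min_citations=0, limit=500):
    """Keep nodes meeting min_citations, most-cited first, up to limit."""
    kept = [n for n in node_ids if counts.get(n, 0) >= min_citations]
    # Sort by descending count, then by id so ties are deterministic.
    kept.sort(key=lambda n: (-counts.get(n, 0), n))
    return kept[:limit]


counts = incoming_counts(cites)
print(top_nodes(["a", "b", "c", "d"], counts, limit=2))
```

Sorting by count before truncating is what lets the most-cited nodes survive
the cap; the secondary sort on id is an added assumption to keep the output
stable.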

Backend (read-only, SELECT only — G2):
- web/graph_api.py — Pydantic models (CorpusGraph/GraphNode/GraphEdge, so
  OpenAPI emits real types — UI2) + SQL assembly over the shared db.get_pool().
- web/app.py — GET /api/graph/corpus, GET /api/graph/node/{id}/neighborhood,
  both with explicit response_model. practice_area validated against the
  closed enum (G5); both endpoints write nothing.

Frontend:
- react-force-graph-2d (canvas/d3-force), loaded via next/dynamic ssr:false.
- /graph page + nav entry; graph.ts TanStack hooks; filter panel (practice_area
  / source / min-citations / search / node-type toggles), node detail panel,
  hover+selection neighborhood highlight. Explicit error handling (UI4).

Not a retrieval path (03-retrieval): returns graph topology, never ranked
search results. Halacha nodes + corroboration/equivalence edges are Phase 2,
already gated behind the node_types param (no contract change needed).
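
One way the node_types gate could work, sketched under assumptions: the
parameter is taken to be a comma-separated string, and the type names and the
parse_node_types helper are hypothetical. The actual parsing in graph_api is
not part of this hunk.

```python
# Assumed type names. Only the precedent/hub split and the Phase 2 halacha
# type come from the commit message; the exact strings are illustrative.
PHASE_1_TYPES = frozenset({"precedent", "topic", "practice_area"})
PHASE_2_TYPES = frozenset({"halacha"})


def parse_node_types(raw, enabled=PHASE_1_TYPES):
    """Parse a comma-separated node_types value.

    An empty value means "all enabled types". Requested types that are not
    enabled yet are dropped silently, so a client may already send Phase 2
    names without the contract changing.
    """
    requested = {part.strip() for part in raw.split(",") if part.strip()}
    if not requested:
        return frozenset(enabled)
    return frozenset(requested & enabled)


print(sorted(parse_node_types("precedent, halacha")))
```

Under this sketch, enabling Phase 2 would amount to passing
PHASE_1_TYPES | PHASE_2_TYPES as the enabled set, with the query string left
unchanged.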

SQL validated read-only against the live DB (142 precedents, 85 resolved
citations, JSONB tag expansion, ANY(uuid[]) edge + BFS queries). web-ui lint
+ build pass; /graph in the route table.
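
For orientation, the neighborhood lookup can be pictured as a bounded
breadth-first search. The commit runs it as SQL BFS queries; the in-memory
version below is only an illustration, and both the neighborhood helper and
the clamping of depth to 1-2 are assumptions (the endpoint docstring documents
that range but the hunk does not show where it is enforced).

```python
from collections import defaultdict, deque


def neighborhood(edges, start, depth=1):
    """Return the set of node ids within `depth` hops of `start`.

    Edges are treated as undirected for the purpose of the local view.
    """
    depth = max(1, min(depth, 2))  # assumed clamp to the documented 1-2 range
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)

    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if dist == depth:
            continue  # do not expand past the requested depth
        for nxt in adjacency[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return seen


edges = [("a", "b"), ("b", "c"), ("c", "d")]
print(sorted(neighborhood(edges, "a", depth=2)))
```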

Invariants: keeps G2 (single source of truth — live projection, no parallel
store), G5 (corpus separation filtered server-side), UI2 (response models),
UI4 (no swallowed UI errors).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-07 18:50:56 +00:00
parent acb8e2c206
commit c80e4ce8ff
11 changed files with 1651 additions and 0 deletions

@@ -5757,6 +5757,48 @@ async def precedent_remove_relation(case_law_id: str, related_id: str):
    return {"unlinked": True, "case_law_id": case_law_id, "related_id": related_id}


# ── Corpus graph (the /graph page) ────────────────────────────────────
# Read-only topology projection of the precedent corpus — nodes + edges
# assembled live from the canonical tables (G2: no parallel store, no drift).
# NOT a retrieval path (03-retrieval): returns graph structure, not ranked
# search results. Explicit Pydantic response_model (graph_api.CorpusGraph) so
# the OpenAPI schema emits real types for the UI (UI2).
from web import graph_api  # noqa: E402 (FastAPI-only, web-ui-facing read projection)


@app.get("/api/graph/corpus", response_model=graph_api.CorpusGraph)
async def graph_corpus(
    practice_area: str = "",
    source: str = "",
    node_types: str = "",
    min_citations: int = 0,
    limit: int = graph_api.NODE_CAP_DEFAULT,
    q: str = "",
):
    """Full corpus graph under the given filters (most-cited nodes survive the cap)."""
    if practice_area and practice_area not in _PRACTICE_AREAS:
        # Hebrew detail message: "practice_area is invalid"
        raise HTTPException(400, "practice_area לא תקין")
    pool = await db.get_pool()
    return await graph_api.build_corpus_graph(
        pool,
        practice_area=practice_area,
        source=source,
        node_types=node_types,
        min_citations=min_citations,
        limit=limit,
        q=q,
    )


@app.get("/api/graph/node/{node_id}/neighborhood", response_model=graph_api.CorpusGraph)
async def graph_node_neighborhood(node_id: str, depth: int = 1, node_types: str = ""):
    """Local-graph focus: the node + its neighbors out to ``depth`` (1-2)."""
    pool = await db.get_pool()
    return await graph_api.build_node_neighborhood(
        pool, node_id, depth=depth, node_types=node_types
    )


# Halacha and metadata extraction are LLM-driven and rely on the local
# `claude` CLI via mcp-server/services/claude_session.py — they CANNOT run
# from this container (no CLI, no claude.ai session). The endpoints below