feat(bulletins): catalog monthly "עו"ד על נדל"ן" bulletins into the radar (X12)
A multi-topic monthly bulletin (a publication separate from the daily yomon) is split into N digest rows in the same table (publication='עו"ד על נדל"ן', not a parallel corpus; G2):

- bulletin_splitter (LLM local-only, tools=""): splits a bulletin into cases[] + articles[]; legislative updates are skipped (chair's decision).
- bulletin_library.ingest_bulletin: each case-law pointer → digest_kind='decision' + embedding + autolink (including X13 court-fetch); each article → digest_kind='article' (full text + embedding, background only; INV-DIG1 applies).
- The per-item content_hash is the dedup key (yomon_number is empty) → idempotent.
- db.create_digest: new digest_kind parameter (flows into the INSERT + upsert).
- scripts/ingest_bulletins.py (host, venv) for processing the archive.
- spec X12 §2.1.

Verified (dry-run, no DB): bulletin 180 → 4 cases + 1 article · bulletin 201 → 4 cases (including appeal (ערר) 197) + 1 article. Legislative updates were skipped. claude_session remains local-only.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
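The idempotency claim for bulletin items can be illustrated with a minimal in-memory sketch. Everything here is hypothetical: `content_hash` stands in for the real `_content_hash` (whose algorithm is not shown in this commit; SHA-256 is assumed), and `DigestStore` is a dictionary-backed substitute for the digests table, not the project's `db` module.

```python
import hashlib


def content_hash(text: str) -> str:
    # Assumption: the real _content_hash may use a different algorithm.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class DigestStore:
    """Hypothetical stand-in for the digests table, keyed by content_hash."""

    def __init__(self) -> None:
        self.rows: dict[str, dict] = {}

    def upsert(self, analysis_text: str, digest_kind: str) -> dict:
        key = content_hash(analysis_text)
        row = self.rows.get(key)
        if row is None:
            # First time this text is seen: insert a new row.
            row = {"content_hash": key, "analysis_text": analysis_text,
                   "digest_kind": digest_kind}
            self.rows[key] = row
        elif digest_kind:
            # Same text again: update in place, never create a duplicate.
            row["digest_kind"] = digest_kind
        return row


def ingest(store: DigestStore, items: list[tuple[str, str]]) -> int:
    """Upsert (digest_kind, text) pairs and return the total row count."""
    for kind, text in items:
        store.upsert(text, kind)
    return len(store.rows)
```

Running `ingest` twice over the same bulletin items leaves the row count unchanged, which is the property the per-item hash is meant to provide when yomon_number is empty.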
@@ -3667,10 +3667,12 @@ async def create_digest(
     subject_tags: list[str] | None = None,
     source_document_path: str = "",
     extraction_status: str = "processing",
+    digest_kind: str = "",
 ) -> dict:
     """Upsert a digest (X12). Idempotent on yomon_number (INV-G3): a repeat
     upload of the same yomon updates in place. content_hash is the secondary
-    dedup key for digests whose number couldn't be parsed."""
+    dedup key for digests whose number couldn't be parsed (and the primary key
+    for bulletin items, which carry no yomon_number — see uq_digests_content_hash)."""
     pool = await get_pool()
     content_hash = _content_hash(analysis_text)
     async with pool.acquire() as conn:
@@ -3684,10 +3686,10 @@ async def create_digest(
                 headline_holding, analysis_text, summary, underlying_citation,
                 underlying_court, underlying_date, underlying_judge, practice_area,
                 appeal_subtype, subject_tags, source_document_path,
-                content_hash, extraction_status
+                content_hash, extraction_status, digest_kind
             ) VALUES (
                 $1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13,
-                $14, $15, $16, $17, $18
+                $14, $15, $16, $17, $18, $19
             )
             ON CONFLICT (yomon_number) WHERE yomon_number <> ''
             DO UPDATE SET
@@ -3708,6 +3710,7 @@ async def create_digest(
                 source_document_path = COALESCE(NULLIF(EXCLUDED.source_document_path, ''), digests.source_document_path),
                 content_hash = EXCLUDED.content_hash,
                 extraction_status = EXCLUDED.extraction_status,
+                digest_kind = COALESCE(NULLIF(EXCLUDED.digest_kind, ''), digests.digest_kind),
                 updated_at = now()
             RETURNING {_DIGEST_COLS}
             """,
@@ -3715,7 +3718,7 @@ async def create_digest(
             headline_holding, analysis_text, summary, underlying_citation,
             underlying_court, underlying_date, underlying_judge, practice_area,
             appeal_subtype, list(subject_tags or []), source_document_path,
-            content_hash, extraction_status,
+            content_hash, extraction_status, digest_kind,
         )
     return _row_to_digest(row)
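The conflict branch's `COALESCE(NULLIF(EXCLUDED.digest_kind, ''), digests.digest_kind)` means "take the incoming value unless it is the empty string, otherwise keep what is stored". The same rule can be sketched in plain Python; `merge_keep_nonempty` is a hypothetical helper written only to show the semantics and does not exist in the codebase.

```python
from typing import Optional


def merge_keep_nonempty(incoming: str, stored: Optional[str]) -> Optional[str]:
    """Mirror COALESCE(NULLIF(incoming, ''), stored) from the upsert."""
    # NULLIF(incoming, ''): an empty string is treated as "no value".
    candidate = None if incoming == "" else incoming
    # COALESCE(candidate, stored): fall back to the stored value.
    return candidate if candidate is not None else stored
```

Under this rule a caller that omits `digest_kind` (the default is `""`) cannot blank out a kind that an earlier ingest already recorded, while an explicit `'decision'` or `'article'` still overwrites it.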