feat(digests): self-heal in drain_digests — auto-resume after quota/interruption
The drain_digests cron is the resume mechanism (pending-based, idempotent, host-side, independent of any session). Hardening: if enrich failed midway (the claude quota ran out), the row was left 'completed' with empty fields and was never processed again. drain now begins by resetting every 'completed' digest whose concept_tag is empty *and* whose underlying_citation is empty to 'pending' for a rerun. Both fields being empty means the extraction never landed, since a valid row always contains at least a citation. This way any interruption or quota exhaustion recovers automatically on the next cron run.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@@ -36,6 +36,20 @@ CONCURRENCY = int(os.environ.get("DIGEST_DRAIN_CONCURRENCY", "3"))
 
 async def main() -> int:
     pool = await db.get_pool()
+    # Self-heal: an enrich that failed mid-LLM (e.g. the local claude
+    # subscription window was exhausted) can leave a row 'completed' with no
+    # concept_tag AND no underlying_citation; a real digest always extracts at
+    # least a citation, so "both empty" means the extraction never landed. Reset
+    # those to 'pending' so the next run retries (idempotent auto-resume). Safe:
+    # successfully-enriched rows always have a concept_tag or citation.
+    healed = await pool.execute(
+        "UPDATE digests SET extraction_status = 'pending' "
+        "WHERE extraction_status = 'completed' "
+        "AND coalesce(concept_tag,'') = '' AND coalesce(underlying_citation,'') = '' "
+        "AND coalesce(analysis_text,'') <> ''"
+    )
+    if healed and healed != "UPDATE 0":
+        print(f"self-heal: reset failed-empty digests → pending ({healed})", flush=True)
     rows = await pool.fetch(
         "SELECT id FROM digests WHERE extraction_status = 'pending' ORDER BY created_at"
     )
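A minimal sketch of the self-heal predicate, for trying the reset rule in isolation. It reuses the WHERE clause from the UPDATE in the hunk above, but everything else is an assumption made for illustration: the stdlib `sqlite3` module stands in for the project's Postgres pool, the `digests` table is reduced to the five columns the query touches, and `heal` and `demo` are hypothetical helper names that do not exist in the repository.

```python
import sqlite3

# Same predicate as the UPDATE in the diff. sqlite3 is only a stand-in for
# the real Postgres pool (assumption made so the sketch runs anywhere).
HEAL_SQL = """
UPDATE digests SET extraction_status = 'pending'
WHERE extraction_status = 'completed'
  AND coalesce(concept_tag, '') = ''
  AND coalesce(underlying_citation, '') = ''
  AND coalesce(analysis_text, '') <> ''
"""


def heal(conn: sqlite3.Connection) -> int:
    """Reset failed-empty rows to 'pending' and return how many were reset."""
    cur = conn.execute(HEAL_SQL)
    conn.commit()
    return cur.rowcount


def demo():
    """Build a tiny hypothetical table and run the heal step twice."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE digests (id INTEGER PRIMARY KEY, extraction_status TEXT, "
        "concept_tag TEXT, underlying_citation TEXT, analysis_text TEXT)"
    )
    conn.executemany(
        "INSERT INTO digests VALUES (?, ?, ?, ?, ?)",
        [
            (1, "completed", None, "", "some analysis"),      # failed-empty: reset
            (2, "completed", "tag", None, "some analysis"),   # has a tag: untouched
            (3, "completed", None, "cite", "some analysis"),  # has a citation: untouched
            (4, "completed", None, None, None),               # no analysis_text: untouched
            (5, "pending", None, None, "some analysis"),      # already pending
        ],
    )
    first = heal(conn)
    second = heal(conn)  # nothing left to reset on a second pass
    statuses = dict(conn.execute("SELECT id, extraction_status FROM digests"))
    return first, second, statuses
```

Row 4 illustrates the effect of the `analysis_text` condition, which the code comment does not mention: a 'completed' row with no analysis text is left alone even when both extracted fields are empty.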