feat(goldset): AI second-opinion per item (QA aid) #107

chaim · 2026-06-07T14:24:57Z


What and why

Chaim's request: an independent recommendation beside each tag, so that he can reconsider his own markings. Adds a second, AI-generated opinion (not ground truth):

schema: halacha_goldset.ai_is_holding / ai_correct_type / ai_rationale / ai_generated_at (additive).
db: goldset_set_ai_recommendation + goldset_list now returns the ai_* fields.
scripts/goldset_ai_recommend.py: a local claude judges is_holding + type + rationale for each item, independently (using its own rubric). It is independent of the validators measured in #81.8, so there is no circularity. Never applied automatically.
web-ui: each card shows "🤖 המלצת AI: הלכה/לא · סוג" ("AI recommendation: holding/not · type") + the rationale + an agreement/disagreement chip against your own tag (amber on disagreement); plus a filter "⚠ אי-הסכמות AI (N)" ("AI disagreements (N)") for reviewing only the disputed items.

Methodology: the human remains the ground truth; the AI is a prompt to reconsider, not something to copy.
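The agreement/disagreement chip and the disagreements filter described above could be derived roughly as follows. This is only a sketch, not the code in this PR: the `human_is_holding` and `human_type` field names are hypothetical (the PR does not name the human-tag columns), and comparing types only when both sides say "holding" is an assumption about the rule.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class GoldsetItem:
    # Human tag. Hypothetical field names; the PR does not name them.
    human_is_holding: Optional[bool]
    human_type: Optional[str]
    # AI second opinion. Names follow the ai_* columns listed in this PR.
    ai_is_holding: Optional[bool] = None
    ai_correct_type: Optional[str] = None
    ai_rationale: Optional[str] = None


def agreement(item: GoldsetItem) -> Optional[str]:
    """Return 'agree', 'disagree', or None when there is nothing to compare."""
    if item.human_is_holding is None or item.ai_is_holding is None:
        return None  # untagged item or no AI recommendation yet: no chip
    if item.human_is_holding != item.ai_is_holding:
        return "disagree"
    # Assumption: the type is compared only when both sides say "holding".
    if item.human_is_holding and item.human_type != item.ai_correct_type:
        return "disagree"
    return "agree"


def disagreements(items: List[GoldsetItem]) -> List[GoldsetItem]:
    """Items the disagreements filter would show."""
    return [item for item in items if agreement(item) == "disagree"]
```

Note that nothing here writes to the human fields: the comparison only reads them, which matches the rule that the recommendation is never applied automatically.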

Verification

tsc --noEmit exit 0; the generator stores recommendations and flags disagreements against existing tags.
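For readers who want to picture the "additive" storage path, here is a minimal sketch using Python's built-in `sqlite3`. It is an illustration under assumptions, not the project's db layer: the real database engine, the column types, the `id` key, and the helper names `add_ai_columns` and `set_ai_recommendation` are all hypothetical. Only the table name and the four `ai_*` column names come from this PR.

```python
import sqlite3
from datetime import datetime, timezone

# Column names come from the PR; the SQL types are assumptions.
AI_COLUMNS = {
    "ai_is_holding": "INTEGER",
    "ai_correct_type": "TEXT",
    "ai_rationale": "TEXT",
    "ai_generated_at": "TEXT",
}


def add_ai_columns(conn: sqlite3.Connection) -> None:
    """Additive migration: add missing ai_* columns, touch nothing else."""
    existing = {row[1] for row in conn.execute("PRAGMA table_info(halacha_goldset)")}
    for name, sql_type in AI_COLUMNS.items():
        if name not in existing:
            conn.execute(f"ALTER TABLE halacha_goldset ADD COLUMN {name} {sql_type}")


def set_ai_recommendation(conn, item_id, is_holding, correct_type, rationale):
    """Store the AI opinion in the ai_* columns only; the human tag is not written."""
    conn.execute(
        "UPDATE halacha_goldset "
        "SET ai_is_holding = ?, ai_correct_type = ?, ai_rationale = ?, ai_generated_at = ? "
        "WHERE id = ?",
        (
            int(is_holding),
            correct_type,
            rationale,
            datetime.now(timezone.utc).isoformat(),
            item_id,
        ),
    )
```

Because the update touches only `ai_*` columns, an existing human tag survives any number of regenerated recommendations.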

🤖 Generated with Claude Code


chaim added 1 commit 2026-06-07 14:24:57 +00:00

feat(goldset): AI second-opinion per item (QA aid) — compare vs human tag 0e35060d3d

The chair wanted an independent recommendation beside each tag, to reconsider
his own judgments. Adds a NON-ground-truth AI second-opinion:

- schema: halacha_goldset.ai_is_holding / ai_correct_type / ai_rationale /
  ai_generated_at (additive).
- db.goldset_set_ai_recommendation + goldset_list now returns the ai_* fields.
- scripts/goldset_ai_recommend.py — local claude_session judges is_holding +
  type + a one-line rationale per item, INDEPENDENTLY (own legal rubric).
  Independent of the rule-based validators #81.8 measures → no circularity.
  Never auto-applied; QA aid only.
- web-ui: each card shows "🤖 המלצת AI: הלכה/לא · type" + rationale and an
  agreement/disagreement chip vs the human tag (amber on disagree); a
  "⚠ אי-הסכמות AI (N)" filter to review only the conflicts.

Methodology note kept explicit: the human stays the ground truth; the AI is a
prompt to reconsider, not to copy.

Verified: tsc --noEmit 0; generator stores recs and flags disagreements with
existing human tags.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

chaim merged commit 638a542cf4 into main

2026-06-07 14:25:07 +00:00

chaim deleted branch worktree-goldset-ai-recommendation

2026-06-07 14:25:07 +00:00

chaim referenced this issue from a commit

2026-06-07 14:25:08 +00:00

Merge pull request 'feat(goldset): AI second-opinion per item (QA aid)' (#107) from worktree-goldset-ai-recommendation into main

chaim referenced this pull request

2026-06-08 09:15:33 +00:00

docs(claude-md): slim CLAUDE.md from 11.3k to ~7k tokens (TaskMaster #107.1) #160
