legal-ai/.taskmaster/tasks/tasks.json at 5a00a0ef47de59662a95ca0176058044e60ffcca

Files

Chaim 4debe9995b chore(#15 ): adopt MULTIMODAL_TEXT_WEIGHT=0.65 + close #15 , open #80

A/B eval (eval_retrieval.py, 86-query gold-set) showed the 0.5 default was
mis-tuned: the image side was too heavy and dragged precedent_library recall
0.971 -> 0.885. Sweep 0.5..0.75 — at 0.65 multimodal beats text-only on every
overall metric AND every corpus (R@5 0.994 vs 0.989, nDCG@5 0.960 vs 0.944,
MRR 0.954 vs 0.936). Dafna approved.

- MULTIMODAL_TEXT_WEIGHT=0.65 set in Coolify (legal-ai, runtime) + redeploy.
- baseline.json updated to the 0.65 config (future regression reference).
- #15 done (premise was stale — multimodal already default on 110 docs; the
  win was tuning the weight, not the backfill).
- #80 opened: the costly 140-doc legacy backfill is deferred until a targeted
  image-answer gold-set proves the table/image value prop (untested here).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2026-06-03 08:45:06 +00:00

210 KiB

Raw Blame History

View Raw

210 KiB Raw Blame History

210 KiB

Raw Blame History