armada/serve.py · tidal-to-tracks-tooling · PLN / Tidal

fix(vibe): warm-up runs a real text forward (absorb torch lazy-init) · cec9dec3

Loading weights wasn't enough: the first forward pass still cost ~30s on torch's
one-time graph/thread init. Warm-up now runs a throwaway _embed_texts() call so the
first USER query takes ~1.5s, not 30s.
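
The warm-up pattern described in this commit can be sketched roughly as follows. This is a minimal illustration and not the code in serve.py: `Embedder` and `warm_up` are hypothetical names, the real signature of `_embed_texts()` is not shown on this page, and a short `time.sleep` stands in for torch's one-time initialization cost.

```python
import time


class Embedder:
    """Hypothetical stand-in for the model wrapper in serve.py."""

    def __init__(self):
        # Weights are considered "loaded" here, but the lazy init has not run yet.
        self._initialized = False

    def _embed_texts(self, texts):
        # Simulate a one-time lazy-init cost paid on the first forward pass.
        # (Assumption: a sleep stands in for torch's graph/thread setup.)
        if not self._initialized:
            time.sleep(0.05)
            self._initialized = True
        # Dummy "embedding": one float per text, derived from its length.
        return [[float(len(t))] for t in texts]


def warm_up(embedder):
    """Run one throwaway forward pass at startup; return elapsed seconds."""
    start = time.perf_counter()
    embedder._embed_texts(["warm-up"])  # result is intentionally discarded
    return time.perf_counter() - start


def timed_query(embedder, text):
    """Embed a single user query and report how long it took."""
    start = time.perf_counter()
    vectors = embedder._embed_texts([text])
    return vectors, time.perf_counter() - start
```

Calling `warm_up(embedder)` once during server startup moves the one-time cost out of the request path, so a later `timed_query(embedder, "some text")` only pays for the forward pass itself.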

authored Jun 07, 2026


serve.py 6.65 KB
