| Name | Last commit | Last update |
|---|---|---|
| .. | ||
| escales | ||
| manifeste | ||
| semaphore | ||
| tasks | ||
| tide-table | ||
| ui | ||
| .gitignore | ||
| DESIGN.md | ||
| PRODUCT.md | ||
| README.md | ||
| serve.py |
Previously, the ~88 fine descriptors were re-encoded through RoBERTa on every folder's forward pass, a fixed cost that made batch size irrelevant. The normalized text embeddings and the logit scale are now cached at load; per-folder forwards run only the audio tower (get_audio_features) and a matmul against the cached text embeddings. Result: ~1.8x faster (14s -> 7.7s per folder; text encoding was ~45% of per-folder cost).

API note (transformers 5.10.2): get_text_features and get_audio_features return a model-output object whose .pooler_output is the projected 512-d joint embedding, verified identical to a full ClapModel forward to within 1.6e-7. No classification change.
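The caching pattern described in this commit can be sketched roughly as follows. This is a minimal illustration and not the code in serve.py: the class name `CachedZeroShotScorer`, the `encode_text` / `encode_audio` callables, and the stand-in encoders are all hypothetical, and the logit scale value is arbitrary. In the real service the two callables would presumably wrap the CLAP text and audio towers.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale each row to unit L2 norm so dot products are cosine similarities."""
    norms = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norms, eps)


class CachedZeroShotScorer:
    """Hypothetical helper: encode the descriptors once, reuse them per batch."""

    def __init__(self, encode_text, encode_audio, descriptors, logit_scale):
        self.encode_audio = encode_audio
        self.descriptors = list(descriptors)
        # One-time cost at load: run the text tower and keep normalized embeddings.
        text = np.asarray(encode_text(self.descriptors), dtype=np.float32)
        self.text_embeds = l2_normalize(text)
        # Assumption: logit_scale is the multiplier actually applied to the
        # similarities (i.e. already exponentiated if the model stores a log).
        self.logit_scale = float(logit_scale)

    def score(self, audio_batch):
        # Per-folder cost: audio tower plus a single matmul against the cache.
        audio = np.asarray(self.encode_audio(audio_batch), dtype=np.float32)
        audio_embeds = l2_normalize(audio)
        return self.logit_scale * audio_embeds @ self.text_embeds.T


# Stand-in encoders (NOT the real model) so the sketch runs without weights.
rng = np.random.default_rng(0)
text_calls = {"n": 0}


def fake_encode_text(texts):
    text_calls["n"] += 1
    return rng.standard_normal((len(texts), 512))


def fake_encode_audio(clips):
    return rng.standard_normal((len(clips), 512))


scorer = CachedZeroShotScorer(
    fake_encode_text,
    fake_encode_audio,
    [f"descriptor {i}" for i in range(88)],
    logit_scale=10.0,  # arbitrary illustrative value
)

# Three "folders" of four clips each: the text encoder is never called again.
for _ in range(3):
    logits = scorer.score([None] * 4)
```

The point of the design is that the text side depends only on the fixed descriptor list, so its cost can be paid once at load instead of once per folder.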