feat(sample-classify): CLAP zero-shot sample-family analyzer (katana) · 3e14a623
Ground sample families by LISTENING, not by name.

sample_classify.py runs laion/clap-htsat-unfused (transformers, torch on CPU) over the Dirt-Samples one-shots and scores each one against text prompts for the 12 fleet families. Results are aggregated per folder (dominant family + homogeneity), so kits show up as mixed. Audio I/O goes through ffmpeg, no librosa. Commands: validate, run, one. validate measures top-1 agreement against the name-confident folders.

Finding (validate): 58% top-1 agreement with the name heuristic at the fine 12-way level. KEY: in many of the disagreements the name-based 'ground truth' is itself wrong. For example, CLAP correctly calls 808hc/808mc congas (perc), which the name classifier mislabeled as bass because of the '808'. CLAP is near-perfect on vox/break/clear-bass/kick/keys; the genuinely fuzzy zone is the melodic cluster (synth/lead/keys/pad). Prompt tuning is whack-a-mole against noisy truth.

Conclusion: trust CLAP at a coarse level; do not silently rely on its fine 12-way labels.
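The per-folder aggregation and the validate metric described above could be sketched roughly as follows. This is an illustration, not the code in sample_classify.py: the function names, the `MIXED_THRESHOLD` value, and the definition of homogeneity as the dominant family's share of a folder's samples are all assumptions. It takes per-sample top-1 family labels as given and does not run CLAP.

```python
from collections import Counter

# Hypothetical cutoff: folders whose dominant family covers less than this
# share of their samples are reported as "mixed" (e.g. drum kits).
MIXED_THRESHOLD = 0.6


def aggregate_folder(top1_labels, mixed_threshold=MIXED_THRESHOLD):
    """Summarize the per-sample top-1 families of one folder.

    Homogeneity is assumed here to be the fraction of samples that agree
    with the most common family.
    """
    if not top1_labels:
        raise ValueError("folder has no classified samples")
    counts = Counter(top1_labels)
    dominant, hits = counts.most_common(1)[0]
    homogeneity = hits / len(top1_labels)
    return {
        "dominant": dominant,
        "homogeneity": homogeneity,
        "mixed": homogeneity < mixed_threshold,
    }


def top1_agreement(predicted, expected):
    """Fraction of folders whose dominant family matches the name heuristic.

    Both arguments map folder name -> family; only folders present in both
    mappings are compared.
    """
    shared = predicted.keys() & expected.keys()
    if not shared:
        return 0.0
    return sum(predicted[f] == expected[f] for f in shared) / len(shared)


if __name__ == "__main__":
    # A mostly-percussion folder versus a kit-like folder.
    print(aggregate_folder(["perc", "perc", "perc", "kick"]))
    print(aggregate_folder(["kick", "perc", "vox", "break"]))
    # "example_kicks" is a made-up folder name used only for illustration.
    audio_based = {"808hc": "perc", "example_kicks": "kick"}
    name_based = {"808hc": "bass", "example_kicks": "kick"}
    print(top1_agreement(audio_based, name_based))
```

Note that a disagreement such as 808hc (perc by audio, bass by name) lowers the agreement score even when the audio label is the correct one, which is why the 58% figure understates CLAP's accuracy.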
PLN (Algolia) authored 3e14a623