feat(sample-classify): fine ontology + PANNs + ensemble methods

Per PLN: richer ontology + PANNs/AudioSet + ensembles for sample grounding.
- sample_ontology.py: 99 fine descriptors across the 12 families ('this is the
sound of {a reese bass}'); scored per-descriptor then marginalized to family.
CLAP top-1: 58% (coarse prompts) -> 68% (fine prompts); 76% at the coarse
super-family level. All measured vs the noisy name-heuristic truth.
- sample_panns.py: PANNs Cnn14 (AudioSet 527) -> conservative label->family map ->
per-family prob vector. ffmpeg @32k, zero-pad short one-shots (Cnn14 needs >=1s
of mel frames or conv5 collapses). Weak on electronic one-shots (AudioSet
'Clapping'=applause, not a drum-machine clap).
- sample_classify.py: --method clap|panns|ensemble, --fine|--coarse. clap_vector()
exposes the family-prob vector; ensemble = mean of CLAP+PANNs vectors -> argmax.
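
Sketch of the marginalize-then-ensemble flow described above, stdlib only. Assumptions, not taken from the code in this commit: descriptor scores are softmaxed and summed per family (the commit only says "marginalized"), and the descriptor names, family list, and PANNs vector below are made up for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def marginalize_to_family(descriptor_scores, descriptor_family, families):
    """Turn per-descriptor scores into one probability per family.

    Assumption: probability mass is summed over each family's descriptors.
    """
    names = list(descriptor_scores)
    probs = softmax([descriptor_scores[n] for n in names])
    family_prob = {f: 0.0 for f in families}
    for name, p in zip(names, probs):
        family_prob[descriptor_family[name]] += p
    return [family_prob[f] for f in families]

def ensemble(vec_a, vec_b):
    """Mean of two family-prob vectors, then argmax."""
    mean = [(a + b) / 2 for a, b in zip(vec_a, vec_b)]
    best = max(range(len(mean)), key=mean.__getitem__)
    return best, mean

# Hypothetical inputs: 3 of the 12 families, 4 of the 99 descriptors.
families = ["bass", "kick", "clap"]
descriptor_family = {
    "a reese bass": "bass",
    "a sub bass": "bass",
    "an 808 kick": "kick",
    "a drum-machine clap": "clap",
}
descriptor_scores = {
    "a reese bass": 2.0,
    "a sub bass": 1.0,
    "an 808 kick": 1.5,
    "a drum-machine clap": 0.0,
}

clap_vec = marginalize_to_family(descriptor_scores, descriptor_family, families)
panns_vec = [0.2, 0.7, 0.1]  # made-up PANNs family-prob vector
idx, mean_vec = ensemble(clap_vec, panns_vec)
print(families[idx], [round(p, 3) for p in mean_vec])
```

With these made-up numbers CLAP alone prefers bass, while the mean with the PANNs vector flips the decision to kick, which is the kind of disagreement the head-to-head run is meant to quantify.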
Scoreboard (vs name-heuristic, itself noisy): clap-coarse 58% | clap-fine 68% |
panns - | ensemble - (head-to-head primed, not yet run). Stubborn residual =
bass<->kick one-shot (spectral decay tiebreaker is the next lever).
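
Sketch of the decode-and-pad step noted for sample_panns.py, stdlib only. Assumptions, not taken from the code in this commit: mono downmix, raw float32 output piped from ffmpeg, and a minimum length of exactly 1 s (32000 samples); the helper names are hypothetical. Real code would likely hold the samples in a numpy array or torch tensor.

```python
import struct
import subprocess

SAMPLE_RATE = 32_000        # ffmpeg @32k, per the commit note
MIN_SAMPLES = SAMPLE_RATE   # assumption: pad up to exactly 1 second

def ffmpeg_cmd(path):
    """Argv that decodes `path` to mono 32 kHz float32 PCM on stdout."""
    return [
        "ffmpeg", "-v", "error", "-i", path,
        "-ac", "1",                # assumption: downmix to mono
        "-ar", str(SAMPLE_RATE),   # resample to 32 kHz
        "-f", "f32le", "-",        # raw little-endian float32 to stdout
    ]

def decode(path):
    """Run ffmpeg and unpack the raw bytes into a list of floats."""
    raw = subprocess.run(ffmpeg_cmd(path), check=True, capture_output=True).stdout
    return [sample for (sample,) in struct.iter_unpack("<f", raw)]

def pad_to_min_length(samples, min_samples=MIN_SAMPLES):
    """Append trailing zeros so short one-shots reach the minimum length."""
    missing = min_samples - len(samples)
    if missing > 0:
        return list(samples) + [0.0] * missing
    return list(samples)

# A 100-sample one-shot is padded to 1 s; longer clips pass through unchanged.
short = pad_to_min_length([0.5] * 100)
long_clip = pad_to_min_length([0.1] * 40_000)
print(len(short), len(long_clip))
```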
Showing
armada/tide-table/sample_ontology.py
0 → 100644
armada/tide-table/sample_panns.py
0 → 100644