    perf(clap): encode the text tower once, audio-only forward per folder · 96854679
    The ~88 fine descriptors were re-encoded through RoBERTa on every folder's
    forward pass, a fixed cost that made batch size irrelevant. Now the normalized
    text embeddings and the logit scale are cached at load; per-folder forwards run
    only the audio tower (get_audio_features) and a matmul against the cached text
    embeddings. ~1.8x faster (14s -> 7.7s per folder; text was ~45% of per-folder cost).
    
    API note (transformers 5.10.2): get_text_features and get_audio_features return
    a model-output object whose .pooler_output is the projected 512-d joint embedding,
    verified identical to a full ClapModel forward to within 1.6e-7. No classification change.
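
The caching pattern described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code in sample_classify.py: the class name `CachedTextScorer`, the toy encoders, and the two-descriptor label set are all hypothetical stand-ins for the real text and audio towers, and `logit_scale` is treated as a plain multiplier. The point is only that the text side is encoded once at construction, while each folder pays for audio encoding plus a dot product against the cached embeddings.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else list(vec)


class CachedTextScorer:
    """Hypothetical helper: encode descriptors once, reuse them for every folder."""

    def __init__(self, encode_text, encode_audio, descriptors, logit_scale=1.0):
        self.encode_audio = encode_audio
        self.descriptors = list(descriptors)
        self.logit_scale = logit_scale  # assumption: a plain scalar multiplier
        # Fixed cost paid once at load time instead of once per folder.
        self.text_embeds = [l2_normalize(encode_text(d)) for d in self.descriptors]

    def score_folder(self, clips):
        """Return (best_descriptor, logits) for each clip; audio tower only."""
        results = []
        for clip in clips:
            audio = l2_normalize(self.encode_audio(clip))
            # Equivalent to one row of audio_embeds @ text_embeds.T, scaled.
            logits = [
                self.logit_scale * sum(a * t for a, t in zip(audio, text))
                for text in self.text_embeds
            ]
            best = max(range(len(logits)), key=logits.__getitem__)
            results.append((self.descriptors[best], logits))
        return results


# Toy stand-ins for the two towers (assumptions, not the real model).
TOY_TEXT_VECTORS = {"kick": [1.0, 0.0], "snare": [0.0, 1.0]}
text_calls = []  # records every text-encoder invocation


def toy_encode_text(descriptor):
    text_calls.append(descriptor)
    return TOY_TEXT_VECTORS[descriptor]


def toy_encode_audio(clip):
    return clip  # in this sketch a "clip" is already a feature vector


scorer = CachedTextScorer(toy_encode_text, toy_encode_audio, ["kick", "snare"])
results_a = scorer.score_folder([[0.9, 0.1]])              # first folder
results_b = scorer.score_folder([[0.2, 0.8], [0.7, 0.3]])  # second folder
```

After two folders, `text_calls` still holds only the two descriptors encoded at construction, which is the property the commit relies on: adding folders adds audio work but no text work.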
    PLN (Algolia) authored Jun 07, 2026
sample_classify.py 14.1 KB