Commits · ac4643d025a311735d7178e367ce99962183a4bc · PLN / Tidal

07 Jun, 2026 20 commits

feat(semantics): validated CLAP vibe-search + live /vibe endpoint (#82,#86) · ac4643d0

Katana-first finding: per-one-shot CLAP genre/mood tags are unreliable (every
hit → boom-bap/euphoric — a 0.3s sound has no genre), but the audio EMBEDDINGS
are gold for RELATIVE similarity. 'warm dusty rhodes' → suns_keys gold-keys +
west-coast electric; 'jazzy upright bass' → no_sunshine/come_bass loops; a kick's
nearest neighbours are other kicks (0.96 cross-folder). So we ship similarity,
not fake absolute labels (Principle 1: trust the instrument).

- sample_semantics.py validated on real audio; semantics_embeds.npz = 1490×512-d.
- serve.py: lazy CLAP /vibe?q= (embed any phrase → rank) + /similar?name= (by
  audio-embed cosine). 503 if unbuilt, 400/404 on bad input; static serving
  untouched. Single-user LAN, torch loads once on first hit.

authored Jun 07, 2026

ac4643d0

docs(tasks): log 014 — ParVagues Unwrapped explorable viz · 7981dd07
PLN (Algolia) authored Jun 07, 2026

7981dd07

feat(unwrapped): explorable feature-space map of the sample corpus (#81) · b7adc89e

ParVagues Unwrapped — a Ship's Bridge dark explorable over unwrapped.json:
- The map: canvas scatter of all 1,485 samples; pick any 2 of the 5 named
  superfeature axes (or raw features); colour by family / audio-cluster /
  folder-kind / folder-agreement; hover to identify, CLICK TO AUDITION the .wav.
- The five axes: PCA loading bars (what composes each superfeature) + RF
  family-discriminator importances (centroid + temporal-centroid on top).
- Family fingerprints: per-family beeswarm on a switchable feature — the
  kick-punchy / bass-sustained split made visible.
- Folders are loose: family x audio-cluster contingency grid (ARI 0.25) — the
  timbre-not-label story, the data behind 'a folder is a loose grouping'.
Zero npm deps; fleet family colours; magenta reserved for the audition pulse.

authored Jun 07, 2026

b7adc89e

feat(unwrapped): bake explorable per-sample feature dataset (#81) · c2d6eedb

Joins per-file features × resolver family/kind × standardized PCA projection
(PC1-5) × KMeans timbral cluster × audition .wav path into one unwrapped.json
the browser explores without recomputing. Reuses feature_eda.load_matrix so the
matrix matches the MDA exactly. Names the 5 leading PCs as superfeature axes
(Brightness / Timbre / Loudness / Envelope kick-bass / Tonal-noisy), orienting
coords by a stable anchor so poles read right. Contract tests green.

authored Jun 07, 2026

c2d6eedb

fix(resolve): a folder is loose — kits read as kits, not a 'dominant' verdict · 9134398c

PLN: each SAMPLE is classified individually; a folder is a loose grouping, often a
heterogeneous KIT. Added kind=single|dominant|kit; for kits folder_agrees=None and
the run shows KIT [fam+fam] not a misleading dominant+flag. Folder-name flag now fires
only for folders claiming ~one family (cpluck->synth still flagged). Regenerated palette.

authored Jun 07, 2026

9134398c

feat(feature_eda): MDA over 1485 samples → the superfeature axes · 01b88ed5

sample_features.py overfetched 36 L0/L1 features × 1485 corpus samples; feature_eda
mines them three ways:
- correlation: only 2 redundant pairs ≥0.9 (duration~temporal_centroid 0.97,
  bandwidth~rolloff 0.91) → the overfetch was lean, 34/36 independent.
- PCA: intrinsic dim 19 (90%) / 24 (95%) — genuinely high-D. The 5 leading PCs are
  interpretable SUPERFEATURE AXES: PC1 brightness (rolloff/centroid), PC2 timbre
  (mfcc5-8), PC3 loudness (rms/peak/flux), PC4 envelope/time (temporal_centroid,
  decay_slope, attack — the kick↔bass axis), PC5 tonal-vs-noisy (kurtosis/chroma_entropy).
- clustering: KMeans(12) vs resolver families ARI=0.25 NMI=0.40 (timbral clusters
  partly orthogonal to semantic family — consistent with 'folders are loose').
  RF importance: spectral_centroid + temporal_centroid are the #1/#2 family
  discriminators → validates productizing the kick↔bass tiebreaker (#80).
TDD: 3 synthetic invariants (redundancy/dim/separation) + real-data load guard.

authored Jun 07, 2026

01b88ed5

data(step1): grounded sample palette — 149 corpus folders (clap:fine) · a93c47af

Layered resolver (L1 filename -> L3 batched CLAP audio -> L2 folder cross-check).
116 homogeneous, 33 kit-like, 20 folder-name disagreements flagged (cpluck->synth,
amencutup->snare, sunny_brass->pad, nujazz_keys125->lead...) — audio overrides the
name, flags surface for review not silent override. max_files=6, ~7min.

authored Jun 07, 2026

a93c47af

perf(clap): encode the text tower once, audio-only forward per folder · 96854679

The ~88 fine descriptors were re-encoded through RoBERTa on every folder's
forward — a fixed cost that made batch size irrelevant. Now cache the normalized
text embeddings + logit scale at load; per-folder forwards run only the audio
tower (get_audio_features) and a matmul against the cached text embeds. ~1.8x
(14s->7.7s/folder; text was ~45% of per-folder cost).

API note (transformers 5.10.2): get_text/audio_features return a model-output
object whose .pooler_output IS the projected 512-d joint embedding — verified
identical to a full ClapModel forward to 1.6e-7. No classification change.

authored Jun 07, 2026

96854679

feat(resolve): --max-files knob to trade CLAP cost vs kit per-index detail · 3250dcbb

Opaque folders send every file through a batched HTSAT forward; --max-files N
caps that. Defaults to CLF.MAX_FILES (12). Use 6 for fast dominant-family
grounding when full per-index kit detail isn't needed.

authored Jun 07, 2026

3250dcbb

data(eda): ground all 61 local takes (Stage-1 corner-B coverage) · 04cbf307

Per-take per-orbit level/register/activity from local Ardour interchange stems
via the optimized build_eda. 61/61 grounded (~8min, warm cache 12-19s on the
big SET takes). 2 empties correctly flagged (Take28 0.6s, Take63 8.6s sketches);
kick orbit (d1) sub-register on all 57 active (centroid never >300Hz).

authored Jun 07, 2026

04cbf307

perf(resolve): batch the resolver's audio path into one CLAP forward/folder · c32db821

resolve_folder looped classify_file per file -> one CLAP forward per opaque
file. Now: run L1 (filename) for all files first, then classify ONLY the opaque
ones in a single batched forward. Extracted classify_files() as the one place
per-file audio prediction lives (CLAP batched, PANNs per-file); classify_file
and classify_folder both route through it so batching can't be lost by accident.
Validated on the jazz kit (per-index BD/HH/SN via filename, CB/P1/P2 via audio).

authored Jun 07, 2026

c32db821

perf(build_eda): L-only single-decode + thread-parallel orbits · 16937a78

The cost is the ffmpeg read (LUKS-decrypt + WAV parse), not the FFT or pipe
(astats-only emit was no faster). Decode the L channel only — stereo orbit
stems share the activity envelope, and that single full-SR decode now serves
both the 2s-bin envelope and the loudest-window spectral profile. Orbits decode
concurrently in a bounded thread pool (ffmpeg releases the GIL). 78-min SET take:
96s->57s; further gains hit the disk I/O wall (~190MB/s), not CPU. Low-SR was
rejected (anti-alias lowpass kills hat brilliance + mislocates the loud window).

authored Jun 07, 2026

16937a78

fix(catalog_view): fail loud when site content (corner C) is missing · f8b7b41e

build_catalog_view silently returned 0 tracks when www is on a branch without
next/content/lives (the redesign branch) — which would let tide.py build clobber the
good catalog_view.json with an empty one. Now raises a clear error; IT fixtures skip
cleanly when the external site dep is unavailable. Suite: 49 passed, 17 skipped.

authored Jun 07, 2026

f8b7b41e

feat(sample_features): librosa overfetch extractor + TDD on real samples · 0f0331ac

~35-feature vector (Peeters/CUIDADO timbre + MIR), convol-inspired L0→L1 tier tags.
Validated kick↔bass discriminators on real samples (temporal_centroid + decay_slope):
bd 0.064s/-165dB/s vs fbass 2.30s/-11dB/s. 5 invariant tests green.

authored Jun 07, 2026

0f0331ac

feat(build_eda): per-take EDA emitter + TakeEDA/OrbitEDA/EDAStatus models · 9731a386

Measures level/register/time-activity per orbit from local Ardour stems → eda_{take}.json
(silent/empty flags); ledger scans interchange → eda_status.json. Finding: 61 takes
locally reachable (not the 9 assumed) → most of the catalog groundable offline.

authored Jun 07, 2026

9731a386

feat(sample-resolve): layered family resolver → grounded sample_families.json · 3ce8bba8

L1 filename (sample_meta) → L3 audio fallback (sample_classify ensemble-fine, the
73% scoreboard winner) → L2 folder cross-check (flags folder_agrees=False when the
folder name lies). Audio runs ONLY on opaque files. Per folder: dominant, agreement
conf, homogeneous, kit_like, per-:index labels with source, provenance. Validates
the domain knowledge mechanically: jazz=kit (000_BD→kick :0, HH/OH→hat, SN→snare),
cpluck=multisample (plucks+body, folder='keys' flagged), 808hc=perc via audio.

authored Jun 07, 2026

3ce8bba8

feat(presence): typed online-presence SSOT (profiles + per-release links) · bdfdf5fc

The site hardcodes platform URLs across 7+ components; this consolidates them into
one typed source. models.py: Platform/LinkKind/ReleaseKind enums + PresenceLink,
ReleaseTrack, Release, Presence. presence.authored.yaml = verified inventory (6
profiles, 4 releases, 18 links) with provenance; build_presence.py validates →
presence.json and stamps default file-provenance. Wired into tide.py build +
generated TS (Presence interface) so the site can later consume it — one-directional,
opt-in (site keeps owning editorial gig content). Per-track links modelled as a
capability, populated only where a real direct URL exists. 61 tests green.

authored Jun 07, 2026

bdfdf5fc

fix(classifier): drop bare startswith — 2-char cues must not prefix-match · 9f15bee8

The unguarded s.startswith(m) let 2-char cues overreach: cp→cpluck→snare,
808→808hc→bass. It was redundant (the ≥3-char token-prefix clause already covers
single-token plurals like pads→pad). Removed it: short cues now fire only as exact
name/token. cpluck→keys (its string identity, not snare); 808hc/808mc→None and
defer to audio (CLAP hears conga). Regression tests added. Full suite 61 green.

authored Jun 07, 2026

9f15bee8

feat(sample-meta): L1 filename-metadata harvester (free per-file signal) · 521127ff

RIFF chunk dump proved samples carry NO semantic embedded metadata (only encoder
tags) — the Pulsar browser shows FILENAMES. So harvest the filename: leading pad
index + instrument-token lexicon → fleet family + source hint. Conservative: opaque
names (JUPI, doing_it_right, 808hc's 'HC') stay family=None → fall back to audio.
Detects kit-like folders (≥2 families by name), the 'jazz is a kit' case.
Corpus coverage: 49% folders / 31% files named, 36 kit-like folders.

authored Jun 07, 2026

521127ff

perf(sample-classify): batch CLAP per folder + parallel decode + progress · acf6a7c1

One forward pass per folder (text tower computed once) instead of per-file —
~Nx fewer forwards; ffmpeg decode on a thread pool. validate prints a live
[i/N] running-accuracy line. source field now reflects method:mode.

authored Jun 07, 2026

acf6a7c1

06 Jun, 2026 15 commits

feat(sample-classify): fine ontology + PANNs + ensemble methods · bb70f394

Per PLN: richer ontology + PANNs/AudioSet + ensembles for sample grounding.
- sample_ontology.py: 99 fine descriptors across the 12 families ('this is the
  sound of {a reese bass}'); scored per-descriptor then marginalized to family.
  CLAP fine: 58% -> 68% top-1 (coarse super-family 76%) vs the noisy name truth.
- sample_panns.py: PANNs Cnn14 (AudioSet 527) -> conservative label->family map ->
  per-family prob vector. ffmpeg @32k, zero-pad short one-shots (Cnn14 needs >=1s
  of mel frames or conv5 collapses). Weak on electronic one-shots (AudioSet
  'Clapping'=applause, not a drum-machine clap).
- sample_classify.py: --method clap|panns|ensemble, --fine|--coarse. clap_vector()
  exposes the family-prob vector; ensemble = mean of CLAP+PANNs vectors -> argmax.

Scoreboard (vs name-heuristic, itself noisy): clap-coarse 58% | clap-fine 68% |
panns - | ensemble - (head-to-head primed, not yet run). Stubborn residual =
bass<->kick one-shot (spectral decay tiebreaker is the next lever).

authored Jun 06, 2026

bb70f394

feat(sample-classify): CLAP zero-shot sample-family analyzer (katana) · 3e14a623

Ground sample families by LISTENING, not by name. sample_classify.py runs
laion/clap-htsat-unfused (transformers, torch CPU) over Dirt-Samples one-shots,
scoring each against text prompts for the 12 fleet families; aggregates per folder
(dominant + homogeneity → kits show as mixed). ffmpeg audio I/O, no librosa.
validate/run/one commands; validate measures top-1 vs the name-confident folders.

Finding (validate): 58% top-1 agreement with the name-heuristic at fine 12-way.
KEY: the name 'ground truth' itself is wrong in many disagreements — CLAP correctly
calls 808hc/808mc congas (perc), which the name-classifier mislabeled bass via '808'.
CLAP is near-perfect on vox/break/clear-bass/kick/keys; the genuinely fuzzy zone is
the melodic cluster (synth/lead/keys/pad). Prompt-tuning is whack-a-mole on noisy
truth. Conclusion: trust CLAP coarsely, not at fine 12-way silently.

authored Jun 06, 2026

3e14a623

fix(classifier): refuse to guess kits + source-named folders; fix sept1 morpher · e9bba22c

PLN-flagged chain of labeling errors, traced to the SSOT classifier:
- 'jazz' was matched to BREAK, but jazz is a multisample KIT (jazz:0=kick,
  :1=snare/hat…). A folder name is not a reliable family signal: it may be one
  family, a heterogeneous kit, or a demucs grab named after a SOURCE song
  (wap, take5, the_revolution, xplosive, rample*). classify_sample_family now
  fires ONLY on names that lexically encode an instrument; everything else is
  None (= needs per-sample analysis). No 'kit registry' (that's name-guessing too).
- removed over-reaching genre/source tokens: jazz, dnb, jungle, loop from break;
  drum from perc. This also FIXES jungle_pads (→pad, was break) and
  jungle_vocals (→vox). amen kept (amencutup genuinely is the Amen break).
- tempo: strip Tidal '--' line comments before parsing cps (ton_numero's
  commented-out morpher no longer counts); a track with a live 'cps (range …)'
  is now flagged morph even when it also declares a fixed setcps. morphing=1
  (Septembre 1er, 60→180), was 0.
- report: + stage_tempo_by_year, sources/roadmap, recurrence gig_slugs,
  classified/unclassified coverage (21% of palette uses need analysis, honest).
- tests: classifier refuses kit/source names; jungle_pads→pad guard. 60 green.

authored Jun 06, 2026

e9bba22c

feat(corpus-viz): hover tooltips + all-dots collab view + copy de-slop · 3f2863d4

PLN review pass:
- tooltip layer: hover any mark for the tracks/examples behind it. cloud dots
  (track + bpm + date), tempo line dots (median + what it means), histogram
  bands, gig/sample columns, family segments (count + % + example members),
  idiom bars (plain-language gloss of each phrase), staple bars, collab dots
- collab story now plots EVERY track as a dot; tracks at the same BPM pile into
  one larger dot (radius ~ count); median drawn as a tick; sample-tell chips get
  lift tooltips. needs tide_eda collab_fingerprint to emit per-track {name,bpm}
- copy: removed cheap 'not X, Y' contrasts + em dashes per impeccable clarify;
  added 'hover anything' affordance hints

authored Jun 06, 2026

3f2863d4

feat(corpus-viz): 'by the numbers' scrollytelling dataviz (task #64) · 33d49591

Phase-2 infodesign of the tide_eda corpus EDA: a standalone, scroll-driven
data essay (corpus.html) on the Ship's Bridge dark instrument language.

Six curated stories, vanilla SVG (zero npm deps), measured-width responsive:
1. the slow climb — studio vs club tempo, dual median lines + track cloud +
   histogram + felt-vs-written 2x inset; studio<->club lens is first-class
2. 2024 breakout — vocab burst + gig cadence on a shared time axis, magenta
   reserved for the one earned 2024 accent
3. palette — 12 sample families as a proportion bar, fleet colors + glyphs
4. the accent — signature gMask/gMute idioms (f*16 in 62/73)
5. collab fingerprint — per-collab bpm range + distinctive samples (raph fast)
6. set-staples — recurrence, Cafe trilogy highlighted

- tide_eda: add stage_tempo_by_year (gig-date x track-tempo) for the club lens
- build_corpus.py + 'tide.py corpus': inline-bundle to one self-contained file
  for deploy to me.nech.pl/parvagues/viz (dist/ gitignored)
- IntersectionObserver reveal/lazy-draw, reduced-motion + mobile reflow,
  load-fail banner; lens auto-disables on single-lens sections (honest)
- 59 tests green

authored Jun 06, 2026

33d49591

docs(corpus-viz): confirmed design brief + implementation spec (task #64) · 830d2dd1

Phase-2 infodesign brief from /impeccable shape: standalone scrollytelling
'ParVagues, by the numbers', 6 curated stories on the fleet color language,
studio<->club toggle as first-class lens, vanilla SVG no-deps, deploy static
to me.nech.pl/parvagues/viz. Exact eda_report.json field->chart map for a
clean post-compact build.

authored Jun 06, 2026

830d2dd1

feat(backlog_setlists v2): explicit ends, label-strip, leet fallback + tests · 9a428b6e

- ANCHORS support (slug, end_line) to bound bleeders (mephisteuf/opal-2025/
  38c3-toilet) into clean blocks — no more swallowing neighbour setlists.
- clean_track_line: strip leading set-position labels (Intro:/Finale:/Sunset),
  keep 'Label: Name' / drop 'Name: gloss'; recovers the LIVE-Noctambule pattern.
- leet_fold fallback (B00K→book, T01l3ts→toilets, G00D→good) — resolves only
  after a straight miss so genuine digits (sept1) still match first.
- 9 gap gigs recovered (opal-2024 14/14, algolia-fdlm 26/26, 39c3 13/14…).
- 6 mechanical tests (leet, label, gloss, bpm/transition, ascii/comment skip,
  real-backlog coverage guard). 59 pass.

authored Jun 06, 2026

9a428b6e

fix(eda holes): canonical sample classifier + style normalization · 21d003dd

- models.py: add classify_sample_family() SSOT (token-aware, DM-suffix gated,
  strong-contains fallback) + extend SAMPLE_FAMILIES cues → unclassified 89→34
  uses (~93% classified); honest unknowns (armora, 90s_matrix) stay None.
- models.py: STYLE_ALIASES + norm_style() merges nu-jazz/nujazz, breaks/breakbeat,
  chip/chiptune, hip-hop/hip so the style chart is honest.
- tide_eda + tests use the canonical classifier (DRY); +3 mechanical tests
  (token cases, no-overreach incl. the shil-'oh' FP, style norm). 53 pass.
- regen tokens (cues feed match[] arrays).

authored Jun 06, 2026

21d003dd

feat(tide_eda): phase-1 EDA over the corpus (tempo, collab, pairings, time) · f7132e2d

Reusable data-scientist pass that finds the stories before any viz:
- TRUE tempo parsed from setcps (corner A), vs metadata bpm (C), with A↔C delta
  (most conflicts are exact 2× half-time/double-time notation, not errors)
- studio tempo (git-creation date) vs stage tempo (gig date) — the creep 110→126
- collab fingerprint: bpm profile + lift-distinctive samples per collaborator
  (raph outed as the fast/club one ~138bpm; nova-solo ~117)
- sample pairings: lift-ranked co-occurrence, ad-hoc 'pair <prefix>' query
- cadence (all 37 canonical gigs), sample-family split, signature idioms,
  set-staples, vocabulary growth (Dirt-Samples symlink mtime, 334-pack Jul'24 burst)
Emits eda_report.json (tidy cuts for the viz phase).

authored Jun 06, 2026

f7132e2d

feat(backlog_setlists): recover gig setlists from backlog.md → resolved .tidal · 5acf72f7

The site tracks.json covers only 23/37 gigs; the rest live in backlog.md as
codename-keyed setlist blocks. New mechanical parser cleans the noisy lines
(ASCII-art, emoji, emphasis, inline [bpm]/{transition} notes, commented-out
tracks) and resolves names to canonical .tidal paths via the catalog's own
alias index (DRY). Gig-codename→slug mapping is authored (confirm:-flagged
when uncertain); suspect_bleed flags blocks that swallowed a neighbour list.
6 gigs cleanly recovered (52 track-slots); seam is larger (dozens of setlists).

authored Jun 06, 2026

5acf72f7

fleet color language + triangle UX redesign (impeccable pass) · 6815967f

ontology in models (SSOT) → generated tokens:
- models.py: OntologyTerm + ColorFamily; ROLE/LIFECYCLE/AGREE terms (color+glyph+
  label) and an authored 12-family SAMPLE-WORLD model (hue-anchored). Variation =
  OKLCH lightness ladder + ±8° hue-fan → 8 shades/family = 96 sample slots. Sample
  axis (conceptual) kept separate from the measured-register axis (ROLE_FAMILIES).
- tools/gen_tokens.py: OKLCH→sRGB (no deps) → tokens.css (armada/ui) + tokens.json
  (triangle fetches at runtime, overrides :root). Validated vs DesignTokens. New
  tide.py 'tokens' step. Never hue alone (glyph+label per term).

triangle UX (low cognitive load, clear hierarchy):
- header calm at rest: one title line, one quiet coverage line, and the agreement
  bar is now the interactive sea-state instrument (overview + legend + level filter
  in one) — replaces the duplicated chips+stack+pills.
- faceted full-text search: free terms + scopes (sample:/gig:/phrase:/take:/style:/
  name:), covers the pattern registry, with 'matched-in' hints so results never feel
  mysterious; '/' focuses, clear button, teaching empty state.
- sound chips colored by sample family (shared lexical classifier from tokens),
  glyph-paired; agreement painted from the ontology (unparsed no longer red).
- tests: 51 green (OKLCH math, shade ladder greyscale-distinctness, ontology
  integrity, unparsed≠conflict, DRY tokens contract, classifier examples).

authored Jun 06, 2026

6815967f

triangle #60: surface pattern registry — kin + phrases in drawer, Clusters view · aa0de028

- drawer 'patterns & kin' section: a track's nearest tracks (shared n-gram
  similarity, clickable to jump) + notable phrase chips (↔ shared idiom /
  ⟳ repeated section), click a phrase to filter the table to who else plays it
- Clusters view toggle: connected components of the neighbor graph rendered as
  family cards (members + the shared phrases that bind them) + a distinctive/solo
  roster. 8 families + 54 solos on the real corpus
- pattern_registry.json loaded best-effort alongside catalog_view.json

authored Jun 06, 2026

aa0de028

tide-table: load-once viz (embed source/audio), usability, #56 take-filter, #59 pattern n-grams · dc8af085

- triangle viz loads once: .tidal source embedded per-row in catalog_view.json
  (no per-file fetch / relative-path fragility); take audio resolved at build
  time so a player only renders when a proxy exists; data-load failure banner
- usability: full-width search, fixed colgroup widths, ellipsized sound-lists,
  tooltip'd empty states (corner-C/B gap, not a disagreement)
- tide.py serve: serves the tide-table dir load-once, prints the triangle URL
- #56: drop empty/sketch take-types from candidate links (Take63/64 noise)
- #59 pattern_ngrams.py + PatternRegistry: mini-notation phrases as 'things'
  (unique/repeated/shared), track clustering by sound-token n-gram Jaccard
  (df-filtered + min-shingle guard). 73 tracks -> 1325 phrases, 193 shared,
  33 repeated; surfaces the gMask/gMute boolean idioms + euclid figures
- tests: 43 green (new guards: no empty/sketch candidates, source embedded,
  audio paths resolve; full UT+IT for the pattern registry)

authored Jun 06, 2026

dc8af085

pipeline driver + generated catalog + ground-truth drawer · 975f9150

tide.py: one command regenerates every downstream artifact in dependency order
(track_recording_map → catalog_view → catalog → ts_types) + `tide.py test`.

Demote catalog.yaml (hand-state, no valid-as-of) → generated artifact:
- catalog.authored.yaml: the ONLY hand facts (license/collab/inspiration/status/
  notes), each with provenance; keyed by .tidal path
- build_catalog.py: catalog_view ⊕ overlay → catalog.generated.json (73 tracks,
  7 authored), validated against models.Catalog on emit
- catalog.yaml marked DEPRECATED

Triangle drawer = ground-truth validator (per corner):
- A: scrollable, syntax-highlighted .tidal source (fetched live) + parsed orbits
- B: audio player when a take proxy exists
- C: per-gig bpm/style/dur + external gig link + raw ingredient list (site's claim)
- search now spans ingredients/samples too
- enrich catalog_view rows with raw ingredients (+ Ingredient/Catalog models, TS)

Serve from the REPO ROOT so source/audio resolve:
  python3 armada/serve.py --dir . --port 8731
  → /armada/tide-table/triangle.html

30 pytest green.

authored Jun 06, 2026

975f9150

triangle: catalog confirmation view (A=score ⋈ C=metadata ⋈ B=recording) · 2704f92c

Reconcile every track across three corners and light the cells where they
agree or fight, so "is our analysis good?" becomes a measured number.

- build_catalog_view.py: reconciler over 73 tracks; pure build() + thin main(),
  validates against the pydantic CatalogView on emit
- tidal_score.py: parse indented do-block dN (recovered burn_this_book + 13 more)
- agree() taxonomy: unparsed / no-claim / agree / partial / conflict / divergent
  — a parser miss is NEVER reported as a disagreement (the cardinal fix)
- models.py CatalogView → gen_ts_types.py → types.gen.ts (DRY single source)
- tests/: 28 pytest — parser regressions, agree taxonomy, IT-on-real-data with
  coverage regression guards, DRY contract (mechanically tests the metadata prior)
- triangle.html: Ship's Bridge lit-cell dashboard (served by serve.py)
- tasks/013: captain's log

Result: 47 agree / 5 partial / 2 conflict / 3 divergent / 16 no-claim;
recorded 40/73; EDA coverage 1/63 (the gap, now visible). The first conflicts
flagged were our own parser bugs — metadata was righter than the machine.

authored Jun 06, 2026

2704f92c

05 Jun, 2026 5 commits

edl_render: proxy→master alignment + fragment-test loop (#36) · 5bb32aea

Reframe (per PLN): we split the MASTERED set into shippable units (tracks / EP /
gapless BC album); we don't touch the gig recording. Source is swappable (online
streaming_final today; a re-rendered master can drop in).

- align: measure proxy→master offset by xcorr of energy envelopes (proxy from
  stemmap, master decoded). Coarse 2s + fine 0.25s + two-half drift check.
  Result: master_t = proxy_t + 2.75s, stable across halves (peak 0.69-0.80) —
  confirms the manifest's calibrated +3. (Caught + fixed a sign bug on first pass:
  measure, then verify your own number.) Per-source: the +3 was the gig-GT, this
  is the stem-render; same ballpark, but measured not assumed.
- frag: audition cut variants for a boundary from the real master — context,
  standalone clean edges (trim bleed + fade), and xf_direct (WAP tail acrossfades
  into 'Plosive head, the bordel dropped = PLN's 'crossfade direct de 0326 a 0345').
  Skip window auto-set from the v2 bleed detector. Each render self-verified non-silent.
- WAP->'Plosive (cut4) rendered: 5 frags @ /frags/ for phone audition.

Next: split (hybrid gapless-exact + standalone-fade outputs) once PLN picks edges.

authored Jun 05, 2026

5bb32aea

bleed v2: per-sound spectral novelty on shared orbits (#39) · d19c1220

v1 only saw a foreign ORBIT lighting up on the wrong side of a cut; it was blind
to a foreign SOUND on an orbit active on BOTH sides (e.g. breaks playing through
a transition). v2 (--spectral) closes that:
- per shared orbit, profile tail/head probe windows + each track's interior ref,
  composite timbral distance (band-shape cosine + centroid shift); flag when the
  tail sounds like the NEXT track (incoming) or the head like the PREVIOUS (outgoing).
- gated by MIN_REF_DIST (versions must differ) and SPEC_MIN_RMS=-50 (probe must
  have real sound — the silence gate caught 4 of my own false positives).
- independently flagged the WAP->'Plosive breaks bleed (orbit-08, Delta=0.475) that
  v1 missed = PLN's 'bordel jusqu'a 0342' zone; + Premier Septembre->Techno Orage.
- EDL now 9 ear + 7 orbit + 6 spectral derived rows; round-trips through MasterEDL.

Delta = severity rank; live crossfades blend by design, so PLN's ear gates feel.

authored Jun 05, 2026

d19c1220

judge: derive orbit labels from score, graded activation, DRY types, waveform zoom/loop · 91a9ac89

#40 Judge annotations were stale+wrong: labels came from a hardcoded ROLE_MAP
(orbit-03 'hats' though punkachien.tidal plays 'dr'; d6/meth_bass missing) and a
hard -38dB on/off gate hid audible orbits that dip within a 2s bin (PLN saw only
d3 OR d8 while hearing both).
- tidal_score.py: parse per-orbit sound map from any .tidal (reuses tfidf vocab;
  also accepts the bare $ "sample" source form). dr/meth_bass/jungle_breaks now correct.
- audio_lens: classify_family() (breaks->tops, drums->percs by identity, register
  by measured centroid), profile peak_db. Fix orbit_files to try padded+unpadded
  channel names (Take89 uses 'Tidal 01-1' -> centroids were silently None).
- build_player_data: labels from score, family validated by centroid, two
  thresholds (litFloorDb -52 visible / activeDb -38 driving), emits validated PlayerData.
- OrbitRail: graded activation (dim->vivid), no orbit vanishes when quiet.

#42 DRY: pydantic models in models.py are the single source of truth; gen_ts_types.py
generates ui/src/types.gen.ts (types.ts is now a re-export). No more hand-synced shapes.

#41 WaveformPlayer: Audacity-style zoom (+/-/ctrl-wheel) + scroll, drag-select a span
and loop it (Regions plugin), keyboard L=loop. tsc + vite build green.

authored Jun 05, 2026

91a9ac89

feat(boundary_bleed): auto-detect cross-boundary sound leaks → EDL · 9089a988

Embodies the division-of-labor principle: the machine flags OBJECTIVE boundary
leaks so PLN's ears judge only feel. Detector = exclusive-orbit ignition: at each
cut, an orbit core to one neighbour but not the other, lit on the wrong side (down
to a sensitive -55dB floor for quiet foreign onsets).

Validated on Take89 (Montreuil): independently caught BOTH of PLN's hand-flagged
bad_cuts — cut1 (Take5 bass pre-echo, "changement de basse avant le cut") and cut5
(Plosive→Jeudi Drill, the Bogdan stabs on d9/orbit-09, ~-48dB) — plus 5 new
candidates. Auto-populates master_edl_take89.json: 16 edits = 9 ear + 7 derived.

Mapping nailed down (was wrong first pass): stemmap orbit-NN == dN; Tidal's
`# orbit X` is SuperDirt-0-indexed → d(X+1) (so `# orbit 8` == d9). d8 = always
breaks. Honest coverage cap logged: a foreign SOUND on a shared+active orbit is
invisible at orbit level → v2 = spectral/per-sound novelty.

authored Jun 05, 2026

9089a988

feat(tfidf): index synths as sound sources; identify Take35 = Septième Armée · ce494fda

PLN: "samples and synths are the same class for fingerprinting." sample_tfidf now
indexes synthdef names (SC synthdefs/ + SCLOrkSynths quark + SuperDirt builtins)
alongside Dirt-Samples, tags each sound sample|synth, and persists kinds. vocab
871 sounds (293 samples + 63 synths used). septieme_armee signature now surfaces
moogBass/FMRhodes1/bassWarsaw — but all common (no rare tell): its identity is the
orbit-arrangement (the SNA riff), not a signature sound. L3 needs both signals.

Take35 identified by blind ear-test as Septième Armée (Seven Nation Army cover,
septieme_armee.tidal, 4:35), NOT the 38C3 "Pitbul Punk" the ±3d date-join guessed.
Corroborated by orbit→sound map (d4 bassWarsaw = the riff bass, etc.). take_gig_map
corrected; performance_notes logs the find + cover-license caveat.

authored Jun 05, 2026

ce494fda