Commit 2704f92c by PLN (Algolia)

triangle: catalog confirmation view (A=score ⋈ C=metadata ⋈ B=recording)

Reconcile every track across three corners and light the cells where they
agree or fight, so "is our analysis good?" becomes a measured number.

- build_catalog_view.py: reconciler over 73 tracks; pure build() + thin main(),
  validates against the pydantic CatalogView on emit
- tidal_score.py: parse indented do-block dN (recovered burn_this_book + 13 more)
- agree() taxonomy: unparsed / no-claim / agree / partial / conflict / divergent
  — a parser miss is NEVER reported as a disagreement (the cardinal fix)
- models.py CatalogView → gen_ts_types.py → types.gen.ts (DRY single source)
- tests/: 28 pytest — parser regressions, agree taxonomy, IT-on-real-data with
  coverage regression guards, DRY contract (mechanically tests the metadata prior)
- triangle.html: Ship's Bridge lit-cell dashboard (served by serve.py)
- tasks/013: captain's log

Result: 47 agree / 5 partial / 2 conflict / 3 divergent / 16 no-claim;
recorded 40/73; EDA coverage 1/63 (the gap, now visible). The first conflicts
flagged were our own parser bugs — metadata was righter than the machine.
parent 5bb32aea
---
log: 013
title: "The triangle that checks itself"
date: 2026-06-06
task: "#46–#49 triangle reconciler + tests; #47 model; #48 viz"
tags: [tooling, salvage, epistemics, testing, catalog]
shareable: true
---
## Cap (what & why)
"Is our analysis of ParVagues tracks actually any good?" To answer it instead of
hoping, we built the **triangle**: reconcile every track across three corners —
**A** the `.tidal` score (ground truth from code), **C** the site's gig metadata
(`tracks.json` ingredients), **B** the Ardour recordings — and *light the cells*
where they agree or fight. One view, a measured coverage number.
## Manœuvre (how)
`build_catalog_view.py` joins the corners (73 distinct tracks) and computes an
A↔C agreement per track, reusing `tidal_score`'s own extractor so both sides parse
identically (DRY). Modeled the output as a pydantic `CatalogView` (single source of
truth → generated TS), validated on emit. Standalone `triangle.html` (Ship's Bridge
palette) renders it, served by `serve.py`. Then we did what PLN asked — **tested it
mechanically**: 28 pytest cases (parser regressions, the agree taxonomy, real-corpus
invariants, coverage regression guards, the DRY contract).
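
As a worked illustration of the per-track A↔C agreement, here is a minimal sketch of the two overlap measures the builder reports: precision |C∩A|/|C| and Jaccard |C∩A|/|C∪A|. The helper name `overlap` is an assumption made for this example (it is not part of the pipeline), and the sample sound lists are borrowed from the test cases.

```python
def overlap(score_sounds, claimed_sounds):
    """Return (precision, jaccard) for claimed sounds C against score sounds A."""
    a, c = set(score_sounds), set(claimed_sounds)
    inter = a & c
    # precision: share of the CLAIMED sounds that the score really contains
    precision = len(inter) / len(c) if c else None
    # jaccard: symmetric overlap, penalises extras on either side
    jaccard = len(inter) / len(a | c) if (a or c) else None
    return precision, jaccard


# every claim is in the score; the score has one extra, unclaimed sound
full = overlap(["kick", "snare", "hat"], ["kick", "snare"])
# only one of four claims is in the score
thin = overlap(["kick", "x"], ["kick", "a", "b", "c"])
print(full, thin)
```

Precision is the measure that drives the agreement level, because the metadata usually omits basic drums: extra sounds in the score lower Jaccard but say nothing about whether the claims are true.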
## Prise (findings / artifacts)
- 73 tracks · score parsed 73/73 · **A↔C: 47 agree · 5 partial · 2 conflict · 3 divergent · 16 no-claim**.
- **Recording: 40 have a candidate take, 33 unrecorded. EDA coverage = 1/63 takes** — the real gap, now a red number on screen.
- Two flagged "conflicts" turned out to be **our tooling's fault, not the metadata's**:
  - *Burn this Book*: the score parsed EMPTY because the whole track is an indented
    `do`-block and the parser only read column-0 `dN`. `drums_nes`/`cpluck` were there
    all along. Fixed the parser; it now agrees.
  - *Blue Gold* → `ccc0.tidal` (a 30-min workshop sketch built from a given kit) genuinely
    has none of the claimed `suns_keys`/`suns_guitar`. The site's `tracks.json` had
    **fallen back to the first `.tidal` in the `ccc/` folder** because the real file
    (`collab/mousquetaires/blue_gold.tidal`) isn't there. A wrong *link*, not wrong
    metadata. The new `divergent` level and the alias-fragmentation flag catch exactly this.
- Files: `build_catalog_view.py`, `tidal_score.py` (indented-dN), `models.py`
  (`CatalogView`), `tools/gen_ts_types.py`, `tests/` (28 green), `triangle.html`.
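
The Burn this Book miss can be reproduced in a few lines. This is a sketch, assuming a cut-down three-line track; the two patterns are the old and new `DN_HEADER` expressions from `tidal_score.py`, while the names `COL0_HEADER`, `INDENTED_HEADER`, `SAMPLE` and `headers` are hypothetical and exist only for this example.

```python
import re

# Old pattern: a dN/pN block header is only recognised at column 0.
COL0_HEADER = re.compile(r"^(d|p)(\d{1,2})\b")
# Fixed pattern: leading whitespace is allowed, so do-block bodies are seen.
INDENTED_HEADER = re.compile(r"^\s*(d|p)(\d{1,2})\b")

# Hypothetical miniature of a track wrapped in an indented do-block.
SAMPLE = '''main = do
  d3 $ s "drums_nes:3"
  d4 $ s "cpluck:1"
  -- d9 $ s "ignored"
'''


def headers(pattern, text):
    """Orbit numbers of every line the pattern recognises as a block header."""
    return [int(m.group(2)) for line in text.splitlines()
            if (m := pattern.match(line))]


old = headers(COL0_HEADER, SAMPLE)      # sees nothing: the score looks empty
new = headers(INDENTED_HEADER, SAMPLE)  # sees d3 and d4, skips the comment
print(old, new)
```

The commented-out `d9` line stays invisible to both patterns because `--` precedes the `d`, and the `\b` after the digits still rejects identifiers such as `d4bass`.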
## Sel (the shareable learning)
The cardinal sin our first pass committed: **`agree()` reported "I couldn't read the
score" as "the score disagrees."** A blind parser that blames the data is worse than
no check at all — we'd have "corrected" correct metadata. The fix is epistemic, not
just code: separate *unparsed* / *divergent* / *conflict*, and never let a parser miss
masquerade as a finding. Katana-first: sharpen the blade before you cut.
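
To make the lesson concrete, the sketch below contrasts a naive classifier with one that first asks whether each side could speak at all. `naive_level` is a hypothetical reconstruction of the first-pass behaviour described above, not the original code; `careful_level` follows the thresholds of the `agree()` taxonomy but is a simplified stand-in for it.

```python
def naive_level(score_sounds, claimed_sounds):
    """Hypothetical first pass: an unread score just looks like zero overlap."""
    a, c = set(score_sounds), set(claimed_sounds)
    if not c:
        return "no-claim"
    precision = len(a & c) / len(c)
    return "agree" if precision >= 0.6 else "conflict"


def careful_level(score_sounds, claimed_sounds):
    """Check what each side could say before comparing what it said."""
    a, c = set(score_sounds), set(claimed_sounds)
    if not a and not c:
        return "empty"
    if not a:
        return "unparsed"   # our parser is blind here; this is not a finding
    if not c:
        return "no-claim"
    precision = len(a & c) / len(c)
    if precision >= 0.6:
        return "agree"
    if precision >= 0.3:
        return "partial"
    if precision == 0 and len(a) >= 2 and len(c) >= 2:
        return "divergent"  # both rich, nothing shared: suspect the file link
    return "conflict"


claimed = ["drums_nes", "cpluck"]
print(naive_level([], claimed))    # conflict
print(careful_level([], claimed))  # unparsed
```

The order of the checks is the whole point: emptiness is tested before any ratio is computed, so a blind parser can never produce a number that looks like evidence.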
## Hameçon (hook)
We built a tool to fact-check the catalog. The first three things it flagged were its
own bugs. The metadata was righter than the machine — and 67 of 73 ParVagues tracks
hide their sounds behind `gFunc` abstractions, so "the score is ground truth" is true,
but *reading* it is the hard part.
#!/usr/bin/env python3
"""build_catalog_view — the TRIANGLE reconciler.
Joins the three corners of the catalog into one confirmation view, so we can SEE
(not hope) that our analysis of ParVagues tracks holds up:

  A  score      the track's `.tidal` (ground truth from code; tidal_score)
  C  metadata   the site's gig tracklists (bpm/style/dur + claimed ingredients;
                www/next/content/lives/**/tracks.json), the canonical metadata
  B  recording  the Ardour takes a track plausibly lives in (date-join via
                take_gig_map) + whether that take has been EDA'd (audio-grounded)

For each distinct track (identity = its `.tidal` path) it computes per-EDGE
agreement so a dashboard can light cells green/amber/red/grey:

  • A present?           does the cited `.tidal` exist + parse to a score
  • A↔C sound agreement  how many of C's claimed ingredient-sounds the score
                         actually contains (precision = |C∩A|/|C|), plus the
                         symmetric jaccard and the score-only / metadata-only
                         diffs (metadata-only = suspicious / stale claim)
  • B take-link          candidate take(s) (date-join), and EDA coverage there

Everything here is corner A (local, free) ⋈ corner C (local, free); corner B is
the existing L0 metadata prior (date-join). NOTHING in B is ear-verified yet.
EDA coverage is surfaced as a first-class RED number to drive the (freebox-gated)
catalog-wide EDA pass. Emits a validated `CatalogView` (see models.py).

    python3 build_catalog_view.py     # writes catalog_view.json + prints
"""
from __future__ import annotations
import json
import glob
import re
import sys
from datetime import date
from pathlib import Path
import tidal_score as ts
sys.path.insert(0, str(Path(__file__).resolve().parent.parent.parent / "tools"))
from sample_tfidf import SPLIT, sound_vocab # noqa: E402
HERE = Path(__file__).parent
REPO = HERE.parent.parent # …/Sound/Tidal
LIVES = Path("/home/pln/Work/Web/www/next/content/lives")
TAKE_MAP = HERE / "take_gig_map.md"
OUT = HERE / "catalog_view.json"
AS_OF = "2026-06-06"
FUZZY_DAYS = 3
# ingredient `type`s that are not a loaded sound source
NON_SOUND_TYPES = {"moment", "effect"}
# parse "d4 — bass" / "d10" style orbit hints from an ingredient description/code
DN_HINT = re.compile(r"\bd(\d{1,2})\b")
# ── corner C: site gig tracklists ────────────────────────────────────────────
def load_gigs():
    gigs = {}
    for f in glob.glob(str(LIVES / "**/tracks.json"), recursive=True):
        slug = str(Path(f).parent.relative_to(LIVES))
        try:
            with open(f) as fh:
                d = json.load(fh)
        except Exception:
            continue  # unreadable or malformed tracklist: skip this gig
        tracks = d.get("tracks") if isinstance(d, dict) else d
        dt = (d.get("date") if isinstance(d, dict) else None) or ""
        gigs[slug] = {"date": dt[:10], "tracks": tracks or []}
    return gigs
def ingredient_sounds(track, vocab):
    """Sounds the site CLAIMS this track loads, extracted from ingredient codes
    with the SAME per-block logic as the score (tidal_score `_sounds_in_block`,
    raw fallback) so A↔C compares like-for-like. Vocab-only here would drop
    non-vocab sounds the score keeps (e.g. `kick`, custom synths) and fabricate
    disagreement: the asymmetry a unit test caught."""
    claimed = []
    for ing in track.get("ingredients", []):
        if ing.get("type") in NON_SOUND_TYPES:
            continue
        code = ing.get("code", "")
        claimed += ts._sounds_in_block(code, vocab) or ts._raw_sounds(code)
    # de-dup, keep order
    seen, out = set(), []
    for t in claimed:
        if t not in seen:
            seen.add(t)
            out.append(t)
    return out
# ── corner B: takes ──────────────────────────────────────────────────────────
def load_takes():
    takes = []
    for line in TAKE_MAP.read_text().splitlines():
        m = re.match(r"\|\s*(\d{4}-\d{2}-\d{2})\s*\|\s*(Take\d+)\s*\|\s*([\d:]+)\s*\|"
                     r"\s*(\d+)\s*\|\s*(\w+)\s*\|\s*(.*?)\s*\|", line)
        if m:
            d, tk, dur, orb, typ, label = m.groups()
            takes.append({"date": d, "take": tk, "dur": dur, "orbits": int(orb),
                          "type": typ, "label": label})
    return takes


def parse_date(s):
    m = re.match(r"(\d{4})-(\d{2})-(\d{2})", s or "")
    return date(*map(int, m.groups())) if m else None


def takes_for_gig(gig_date, takes):
    """Takes whose date matches the gig (exact, else ±FUZZY_DAYS)."""
    gd = parse_date(gig_date)
    if not gd:
        return []
    exact = [t for t in takes if t["date"] == gig_date]
    if exact:
        return [(t, "date-exact") for t in exact]
    # no exact hit: fall back to takes within the fuzzy window
    near = [(abs((parse_date(t["date"]) - gd).days), t) for t in takes
            if parse_date(t["date"])]
    near = [(t, f"date±{dd}d") for dd, t in near if dd <= FUZZY_DAYS]
    return near
def eda_index():
    """Which takes have audio EDA (are spectrally grounded). Detect from the
    per-take EDA artifacts on disk; today only Take89 (Montreuil) is deep-EDA'd."""
    have = {}
    for f in glob.glob(str(HERE / "**/eda_*.json"), recursive=True):
        try:
            with open(f) as fh:
                d = json.load(fh)
        except Exception:
            continue
        tk = d.get("take") if isinstance(d, dict) else None
        if tk:
            have[tk] = Path(f).relative_to(HERE).as_posix()
    # stemmap_take89.json is also EDA-grade grounding for Take89
    for f in glob.glob(str(HERE / "**/stemmap_take*.json"), recursive=True):
        m = re.search(r"stemmap_(take\d+)", Path(f).name.lower())
        if m:
            have.setdefault("Take" + m.group(1)[4:], Path(f).relative_to(HERE).as_posix())
    return have
# ── agreement ─────────────────────────────────────────────────────────────────
def agree(score_sounds, claimed_sounds):
    """A↔C sound agreement, with a taxonomy that never blames the metadata for a
    parser miss (the Burn-this-Book / Blue-Gold lesson):

      empty      neither side has parseable sounds
      unparsed   corner A (score) is empty: WE couldn't read it; not a conflict
      no-claim   corner C lists no parseable sound (ingredients absent/effects-only)
      agree      precision ≥ .6  (most claimed sounds are in the score)
      partial    .3 ≤ precision < .6
      conflict   0 < precision < .3  (real partial disagreement), or
                 precision == 0 with a thin side (fewer than 2 sounds)
      divergent  precision == 0 AND both sides rich (≥2 each) → the linked `.tidal`
                 and the metadata describe DIFFERENT tracks (likely a wrong file
                 link / identity collision, e.g. Blue Gold → ccc0 sketch), NOT a
                 claim that either source is wrong

    precision = |C∩A| / |C|  (how many claimed sounds the score actually contains).
    """
    a, c = set(score_sounds), set(claimed_sounds)
    inter = a & c
    precision = len(inter) / len(c) if c else None
    jaccard = len(inter) / len(a | c) if (a or c) else None
    if not a and not c:
        level = "empty"
    elif not a:
        level = "unparsed"  # never "conflict": corner A is blind here
    elif not c:
        level = "no-claim"
    elif precision >= 0.6:
        level = "agree"
    elif precision >= 0.3:
        level = "partial"
    elif precision > 0:
        level = "conflict"
    elif len(a) >= 2 and len(c) >= 2:
        level = "divergent"  # wrong file link / identity collision
    else:
        level = "conflict"  # disjoint, but one side too thin to call divergent
    return {
        "level": level,
        "precision": round(precision, 3) if precision is not None else None,
        "jaccard": round(jaccard, 3) if jaccard is not None else None,
        "score_only": sorted(a - c),     # usually basic drums C omits (benign)
        "metadata_only": sorted(c - a),  # claimed but not in score: SUSPICIOUS
        "shared": sorted(inter),
    }
def build():
    """The pipeline: A=score ⋈ C=metadata ⋈ B=recording → CatalogView dict.
    Pure (no IO beyond reading the corpus); returns the dict so tide.py and the
    test suite can drive it without a subprocess."""
    vocab, kind = sound_vocab()
    gigs, takes = load_gigs(), load_takes()
    eda = eda_index()

    # invert: track .tidal path → its appearances across gigs
    tracks: dict[str, dict] = {}
    for slug, g in gigs.items():
        for tr in g["tracks"]:
            fp = tr.get("file")
            if not fp:
                continue
            rec = tracks.setdefault(fp, {"appearances": [], "names": []})
            rec["appearances"].append((slug, g["date"], tr))
            if tr.get("name") and tr["name"] not in rec["names"]:
                rec["names"].append(tr["name"])

    rows = []
    for fp, rec in sorted(tracks.items()):
        path = REPO / fp
        a_present = path.exists()
        score = ts.orbit_sounds(path, vocab, kind) if a_present else {}
        score_sounds = [d["sound"] for d in score.values()]

        # corner C: union claimed sounds + representative metadata
        claimed, metas, gig_slugs = [], [], []
        for slug, gdate, tr in rec["appearances"]:
            gig_slugs.append(slug)
            claimed += ingredient_sounds(tr, vocab)
            metas.append({"gig": slug, "date": gdate, "bpm": tr.get("bpm"),
                          "style": tr.get("style"), "dur": tr.get("duration")})
        claimed = list(dict.fromkeys(claimed))  # de-dup, keep order

        # corner B: candidate takes via date-join across this track's gigs
        cand_takes, edad = {}, 0
        for slug, gdate, _tr in rec["appearances"]:
            for tk, method in takes_for_gig(gdate, takes):
                key = tk["take"]
                if key not in cand_takes:
                    has_eda = key in eda
                    cand_takes[key] = {
                        "take": key, "type": tk["type"], "via": slug,
                        "method": method, "is_set": tk["type"].upper() == "SET",
                        "eda": eda.get(key),
                    }
                    if has_eda:
                        edad += 1

        ac = agree(score_sounds, claimed)
        rows.append({
            "track": fp,
            "names": rec["names"],
            "name": rec["names"][0] if rec["names"] else Path(fp).stem,
            "gigs": sorted(set(gig_slugs)),
            "metas": metas,
            # corner A
            "score_present": a_present,
            "n_orbits": len(score),
            "score": {f"d{o}": score[o]["sound"] for o in sorted(score)},
            "score_sounds": sorted(set(score_sounds)),
            # corner C
            "claimed_sounds": claimed,
            # A↔C
            "ac": ac,
            # corner B
            "takes": sorted(cand_takes.values(), key=lambda x: x["take"]),
            "n_takes": len(cand_takes),
            "n_takes_eda": edad,
            "recorded": bool(cand_takes),
        })

    # ── alias fragmentation: one human NAME → multiple .tidal files ───────────
    # (legit versions/arrangements OR a wrong link; divergent A↔C tells them apart)
    name_files: dict[str, set] = {}
    for r in rows:
        for nm in r["names"]:
            name_files.setdefault(nm.lower(), set()).add(r["track"])
    for r in rows:
        sib = set()
        for nm in r["names"]:
            sib |= name_files.get(nm.lower(), set())
        sib.discard(r["track"])
        r["alias_siblings"] = sorted(sib)  # other files sharing this name

    # ── coverage stats ───────────────────────────────────────────────────────
    n = len(rows)

    def cnt(pred):
        return sum(1 for r in rows if pred(r))

    def lvl(x):
        return cnt(lambda r: r["ac"]["level"] == x)

    stats = {
        "tracks_total": n,
        "score_present": cnt(lambda r: r["score_present"]),
        "score_parsed": cnt(lambda r: r["n_orbits"] > 0),
        "with_metadata": cnt(lambda r: r["claimed_sounds"]),
        "ac_agree": lvl("agree"),
        "ac_partial": lvl("partial"),
        "ac_conflict": lvl("conflict"),
        "ac_divergent": lvl("divergent"),
        "ac_unparsed": lvl("unparsed"),
        "ac_no_claim": lvl("no-claim"),
        "ac_empty": lvl("empty"),
        "alias_fragmented": cnt(lambda r: r["alias_siblings"]),
        "recorded": cnt(lambda r: r["recorded"]),
        "unrecorded": cnt(lambda r: not r["recorded"]),
        "with_eda": cnt(lambda r: r["n_takes_eda"] > 0),
        "takes_total": len(takes),
        "takes_with_eda": len(eda),
        "gigs_total": len(gigs),
    }
    return {
        "schema": "catalog-view v1 (triangle: A=score ⋈ C=metadata ⋈ B=recording; "
                  "B + EDA unverified)",
        "as_of": AS_OF,
        "stats": stats,
        "tracks": rows,
    }
def main():
    out = build()
    from models import CatalogView  # DRY contract: validate before writing
    CatalogView.model_validate(out)
    OUT.write_text(json.dumps(out, ensure_ascii=False, indent=1))
    rows = out["tracks"]
    print(f"✓ {OUT}")
    s = out["stats"]
    print(f" tracks: {s['tracks_total']} | score parsed {s['score_parsed']} "
          f"| metadata {s['with_metadata']}")
    print(f" A↔C: {s['ac_agree']} agree · {s['ac_partial']} partial · "
          f"{s['ac_conflict']} conflict · {s['ac_divergent']} DIVERGENT · "
          f"{s['ac_no_claim']} no-claim · {s['ac_unparsed']} unparsed")
    print(f" alias-fragmented (name→multiple files): {s['alias_fragmented']}")
    print(f" recording: {s['recorded']} have a candidate take · "
          f"{s['unrecorded']} unrecorded · {s['with_eda']} EDA-grounded")
    print(f" EDA coverage: {s['takes_with_eda']}/{s['takes_total']} takes")

    div = [r for r in rows if r["ac"]["level"] == "divergent"]
    if div:
        print(f"\n ⚠ DIVERGENT ({len(div)}) \u2014 linked .tidal ≠ metadata (wrong link?):")
        for r in div[:15]:
            sib = (f" ↔ {len(r['alias_siblings'])} other file(s) share the name"
                   if r["alias_siblings"] else "")
            print(f" {r['name']:<26} {r['track']}{sib}")

    conflicts = [r for r in rows if r["ac"]["level"] == "conflict"]
    if conflicts:
        print(f"\n ⚠ conflicts ({len(conflicts)}) \u2014 partial disagreement:")
        for r in conflicts[:15]:
            # "\u2014" is the placeholder shown when nothing is metadata-only
            missing = ", ".join(r["ac"]["metadata_only"]) or "\u2014"
            print(f" {r['name']:<26} claimed∉score: {missing}")


if __name__ == "__main__":
    main()
......@@ -19,7 +19,7 @@ from datetime import date
from enum import Enum
from typing import Optional
from pydantic import BaseModel, Field
from pydantic import BaseModel, ConfigDict, Field
# ── provenance ────────────────────────────────────────────────────────────────
......@@ -220,3 +220,99 @@ class PlayerData(BaseModel):
    roleGroups: list[RoleGroup]
    note: str = ""
    takes: list[Take]


# ── catalog view: the TRIANGLE, A=score ⋈ C=metadata ⋈ B=recording (#46) ─────
# Generated downstream artifact (build_catalog_view.build), validated on emit.
# A confirmation map: do the .tidal score, the site metadata, and the recordings
# agree? Lit-cell dashboard consumes this. (parsers-over-copy; tested mechanically)
class AgreeLevel(str, Enum):
    empty = "empty"          # neither side has parseable sounds
    unparsed = "unparsed"    # corner A blind: WE couldn't read it (NOT a conflict)
    no_claim = "no-claim"    # corner C lists no parseable sound
    agree = "agree"          # ≥60% of claimed sounds are in the score
    partial = "partial"      # 30–60%
    conflict = "conflict"    # 0–30%: real partial disagreement
    divergent = "divergent"  # zero overlap, both rich → wrong file link / collision


class AgreeResult(BaseModel):
    """A↔C: how well the site's claimed ingredients match the actual score."""
    level: AgreeLevel
    precision: Optional[float] = None  # |C∩A|/|C|: how true the claims are
    jaccard: Optional[float] = None
    score_only: list[str] = Field(default_factory=list)     # in score, not claimed
    metadata_only: list[str] = Field(default_factory=list)  # claimed, not in score
    shared: list[str] = Field(default_factory=list)


class TrackMeta(BaseModel):
    """Corner C facts for one gig appearance (site tracklist, a pointer)."""
    gig: str
    date: str = ""
    bpm: Optional[float] = None
    style: Optional[str] = None
    dur: Optional[float] = None


class TakeRef(BaseModel):
    """Corner B candidate: a take this track plausibly lives in (date-join, L0)."""
    take: str
    type: str    # track | SET | sketch | empty
    via: str     # the gig slug that joined them
    method: str  # date-exact | date±Nd
    is_set: bool
    eda: Optional[str] = None  # path to the take's EDA artifact, if grounded


class TrackRow(BaseModel):
    """One track (identity = its .tidal path), reconciled across the three corners."""
    track: str  # canonical id = .tidal path
    names: list[str] = Field(default_factory=list)
    name: str
    gigs: list[str] = Field(default_factory=list)
    metas: list[TrackMeta] = Field(default_factory=list)
    # corner A: score
    score_present: bool
    n_orbits: int
    score: dict[str, str] = Field(default_factory=dict)  # {"d1":"kick",…}
    score_sounds: list[str] = Field(default_factory=list)
    # corner C: metadata
    claimed_sounds: list[str] = Field(default_factory=list)
    # A↔C
    ac: AgreeResult
    # corner B: recording
    takes: list[TakeRef] = Field(default_factory=list)
    n_takes: int
    n_takes_eda: int
    recorded: bool
    alias_siblings: list[str] = Field(default_factory=list)  # other files sharing the name


class CatalogStats(BaseModel):
    tracks_total: int
    score_present: int
    score_parsed: int
    with_metadata: int
    ac_agree: int
    ac_partial: int
    ac_conflict: int
    ac_divergent: int
    ac_unparsed: int
    ac_no_claim: int
    ac_empty: int
    alias_fragmented: int
    recorded: int
    unrecorded: int
    with_eda: int
    takes_total: int
    takes_with_eda: int
    gigs_total: int


class CatalogView(BaseModel):
    model_config = ConfigDict(populate_by_name=True)
    schema_: str = Field(alias="schema")  # 'schema' key; alias dodges BaseModel.schema
    as_of: str
    stats: CatalogStats
    tracks: list[TrackRow] = Field(default_factory=list)
"""pytest setup for the tide-table pipeline tests.
Puts the tide-table dir on sys.path so the pipeline modules import as top-level
(`import tidal_score`, `import build_catalog_view`) exactly as they do in prod.
"""
import sys
from pathlib import Path
import pytest
HERE = Path(__file__).resolve().parent
TIDE_TABLE = HERE.parent
REPO = TIDE_TABLE.parent.parent # …/Sound/Tidal
sys.path.insert(0, str(TIDE_TABLE))
@pytest.fixture(scope="session")
def repo():
    return REPO


@pytest.fixture(scope="session")
def fixtures():
    return HERE / "fixtures"


@pytest.fixture(scope="session")
def view():
    """The real pipeline output, built once (IT-on-real-data)."""
    import build_catalog_view as bcv
    return bcv.build()
-- fixture: whole track wrapped in a `do` block with INDENTED dN (the
-- burn_this_book.tidal idiom that a column-0-only parser read as empty).
main = do
  let g = id
  d1 $ g $ s "jazz:2"
  d3 $ g $ s "drums_nes:3"   -- claimed-but-"missing" until the parser fix
  d4 $ g $ s "cpluck:1"
  d5
    $ s "meth_bass"
    # orbit 7                -- `# orbit 7` ⇒ d8, not d5
  -- d9 $ s "should_be_ignored"   (commented-out block must not parse)
"""A↔C agreement taxonomy. Encodes the cardinal lesson: never blame the metadata
for a parser miss — `unparsed` and `divergent` are NOT `conflict`."""
import build_catalog_view as bcv
def test_unparsed_is_not_conflict():
    # corner A empty (parser couldn't read it) ⇒ unparsed, never conflict
    r = bcv.agree([], ["drums_nes", "cpluck"])
    assert r["level"] == "unparsed"


def test_no_claim_when_metadata_silent():
    assert bcv.agree(["kick", "snare"], [])["level"] == "no-claim"


def test_empty_both_sides():
    assert bcv.agree([], [])["level"] == "empty"


def test_agree_when_claims_present_in_score():
    r = bcv.agree(["kick", "snare", "hat"], ["kick", "snare"])
    assert r["level"] == "agree"
    assert r["precision"] == 1.0
    assert r["metadata_only"] == []
    assert r["score_only"] == ["hat"]  # drums C omitted (benign)


def test_partial_band():
    # 1 of 3 claimed present → precision .333 → partial
    r = bcv.agree(["kick", "x", "y"], ["kick", "a", "b"])
    assert r["level"] == "partial"


def test_conflict_thin_overlap():
    # 1 of 4 claimed present → .25 → conflict
    assert bcv.agree(["kick", "x"], ["kick", "a", "b", "c"])["level"] == "conflict"


def test_divergent_rich_disjoint():
    # zero overlap, both sides rich → wrong-link/identity collision (Blue Gold→ccc0)
    r = bcv.agree(["moog", "ccc", "909"], ["suns_keys", "suns_guitar"])
    assert r["level"] == "divergent"
    assert r["precision"] == 0.0


def test_conflict_when_disjoint_but_thin():
    # zero overlap but only 1 claimed → conflict, not divergent
    assert bcv.agree(["moog", "ccc"], ["zzz"])["level"] == "conflict"
"""Integration test on the REAL corpus: structural invariants + coverage
regression guards. This is the mechanical answer to 'metadata prior; nothing
verified' — the agreement numbers are now asserted, so they can't silently rot.
Ranges (not exact equality) absorb honest content drift while still catching a
regression (e.g. the parser breaking and conflicts spiking)."""
import re
# ── structural invariants (must hold regardless of content) ──────────────────
def test_every_cited_score_parses(view):
    """Corner A: all 73 cited .tidal exist AND parse to a non-empty score.
    Guards the do-block / col-0 parser regression catalog-wide."""
    s = view["stats"]
    assert s["score_present"] == s["tracks_total"]
    assert s["score_parsed"] == s["tracks_total"]
    assert s["ac_unparsed"] == 0


def test_conflict_and_divergent_require_a_real_score(view):
    """The cardinal rule: a parser miss is NEVER reported as a disagreement."""
    for r in view["tracks"]:
        if r["ac"]["level"] in ("conflict", "divergent", "partial", "agree"):
            assert r["n_orbits"] > 0, f"{r['name']} flagged {r['ac']['level']} on empty score"


def test_metadata_only_is_truly_disjoint_from_score(view):
    """metadata_only must never contain a sound that's also in the score."""
    for r in view["tracks"]:
        a = set(r["score_sounds"])
        assert not (set(r["ac"]["metadata_only"]) & a), r["name"]


def test_candidate_takes_are_real(view):
    for r in view["tracks"]:
        for t in r["takes"]:
            assert re.fullmatch(r"Take\d+", t["take"]), t["take"]


def test_eda_coverage_is_honest(view):
    """The whole point of surfacing the gap: EDA is sparse and we don't hide it."""
    s = view["stats"]
    assert 0 < s["takes_with_eda"] <= s["takes_total"]
    assert s["with_eda"] <= s["recorded"]


# ── coverage regression guards (catch silent rot in the agreement numbers) ───
def test_agreement_distribution_within_bounds(view):
    s = view["stats"]
    assert s["tracks_total"] == 73   # update consciously if catalog grows
    assert s["ac_agree"] >= 45       # was 47; alarm if it craters
    assert s["ac_conflict"] <= 4     # was 2; alarm if conflicts spike
    assert s["ac_divergent"] <= 6    # was 3; wrong-link suspects


# ── specific cases that taught us the lessons ────────────────────────────────
def _row(view, track):
    return next(r for r in view["tracks"] if r["track"] == track)


def test_burn_this_book_now_agrees(view):
    r = _row(view, "live/techno/noir/burn_this_book.tidal")
    assert r["ac"]["level"] == "agree"  # was a false conflict


def test_blue_gold_ccc0_is_divergent_not_conflict(view):
    r = _row(view, "live/collab/ccc/ccc0.tidal")
    assert r["ac"]["level"] == "divergent"  # wrong file link, properly labelled
    assert r["alias_siblings"], "should share the name with the real blue_gold.tidal"


# ── DRY contract: the emitted view validates against the pydantic model ───────
def test_view_validates_against_model(view):
    """models.py is the single source of truth; the build must conform to it
    (and so must the generated TS, which is derived from the same schema)."""
    from models import CatalogView
    cv = CatalogView.model_validate(view)
    assert cv.stats.tracks_total == len(cv.tracks)
"""Corner C ingredient parsing, the date-join, and the audio_lens family
classifier (the cpluck-as-perc regression)."""
import build_catalog_view as bcv
import audio_lens as al
from sample_tfidf import sound_vocab
def test_ingredient_sounds_parses_codes():
    vocab, _ = sound_vocab()
    track = {"ingredients": [
        {"type": "sample", "code": 's "[kick:4]"'},
        {"type": "sample", "code": 's "jungle_breaks:84"'},
        {"type": "moment", "code": "highlights"},  # non-sound type → skipped
        {"type": "effect", "code": "d10"},         # effect type → skipped
    ]}
    s = bcv.ingredient_sounds(track, vocab)
    assert "kick" in s and "jungle_breaks" in s
    assert "highlights" not in s


def test_takes_for_gig_exact_fuzzy_and_miss():
    takes = [{"date": "2024-10-01", "take": "Take20", "dur": "", "orbits": 13,
              "type": "SET", "label": ""}]
    exact = bcv.takes_for_gig("2024-10-01", takes)
    assert exact and exact[0][1] == "date-exact"
    fuzzy = bcv.takes_for_gig("2024-10-03", takes)  # within ±3d
    assert fuzzy and "date±2d" in fuzzy[0][1]
    assert bcv.takes_for_gig("2024-11-01", takes) == []  # outside window


def test_classify_family_cpluck_not_perc():
    # the bug that mislabeled cpluck as a drum
    assert al.classify_family("cpluck")[0] != "percs"


def test_classify_family_perc_exact():
    assert al.classify_family("cp")[0] == "percs"
    assert al.classify_family("dr")[0] == "percs"


def test_classify_family_breaks_are_tops():
    assert al.classify_family("jungle_breaks")[0] == "tops"


def test_classify_family_bass_register():
    assert al.classify_family("meth_bass", {"centroid": 100})[0] == "bass"
"""Corner A — the .tidal score parser. Regression-guards the idiom-blindness
bugs that manufactured false A↔C conflicts."""
import tidal_score as ts
def test_indented_doblock_parses(fixtures):
    """do-block / indented dN must NOT parse to an empty score."""
    m = ts.orbit_sounds(fixtures / "doblock.tidal")
    assert m, "indented do-block parsed to empty (the burn_this_book bug)"
    sounds = {d["sound"] for d in m.values()}
    assert "drums_nes" in sounds
    assert "cpluck" in sounds


def test_orbit_override_maps_to_dn_plus_1(fixtures):
    """`# orbit 7` ⇒ d8 (SuperDirt 0-indexed); d5 header must not survive."""
    m = ts.orbit_sounds(fixtures / "doblock.tidal")
    assert 8 in m and m[8]["sound"] == "meth_bass"
    assert 5 not in m
    assert m[8]["orbit_override"] is True


def test_commented_block_ignored(fixtures):
    m = ts.orbit_sounds(fixtures / "doblock.tidal")
    assert 9 not in m, "a -- commented dN block must not parse"


def test_dn_header_rejects_identifiers():
    """`\\b` after the digits keeps degradeBy / d4bass from matching as headers."""
    assert ts.DN_HEADER.match('degradeBy 0.5 $ s "x"') is None
    assert ts.DN_HEADER.match(' d4bass = s "x"') is None
    assert ts.DN_HEADER.match(' d4 $ s "x"') is not None  # indentation OK
    assert ts.DN_HEADER.match('d1 $ s "x"') is not None


def test_real_burn_this_book_regression(repo):
    """The exact file that exposed the col-0-only bug: drums_nes + cpluck ARE
    in the score (metadata was right all along)."""
    m = ts.orbit_sounds(repo / "live/techno/noir/burn_this_book.tidal")
    sounds = {d["sound"] for d in m.values()}
    assert {"drums_nes", "cpluck"} <= sounds
......@@ -33,8 +33,12 @@ from sample_tfidf import SPLIT, sound_vocab # noqa: E402
# "last token wins" make `$` safe and necessary (kick/snare/meth_bass live there).
SOURCE_CTX = re.compile(r'(?:\bsound\b|\bs\b|#|\$)\s*"([^"]*)"')
# a `dN` block header at column 0 (real, not commented). Tidal comments are `--`.
DN_HEADER = re.compile(r"^(d|p)(\d{1,2})\b")
# a `dN` block header (real, not commented). Tidal comments are `--`. Leading
# whitespace IS allowed: ParVagues often wraps the whole track in a `do` block
# with INDENTED `d1`/`d2`… (e.g. burn_this_book.tidal) — a column-0-only match
# silently parsed those to an empty score. The `(\d{1,2})\b` guard still rejects
# identifiers like `degradeBy`/`d4bass` (no word-boundary after the digits).
DN_HEADER = re.compile(r"^\s*(d|p)(\d{1,2})\b")
ORBIT_OVERRIDE = re.compile(r"#\s*orbit\s+(\d+)")
......
<!doctype html><html lang="en"><head><meta charset="utf-8">
<meta name="viewport" content="width=device-width,initial-scale=1">
<title>L'Armada · Triangle · Catalog Confirmation</title>
<style>
/* Ship's Bridge tokens (armada/DESIGN.md) — magenta reserved for "the now" */
:root{
--surface:#0a0a0a;--raised:#111;--overlay:#171717;--hairline:#ffffff1f;
--ink:#e8e8ea;--mute:#9a9aa0;--faint:#6a6a70;--magenta:#d900ff;
--percs:#ff8c00;--bass:#7c5cff;--melodic:#36c5f0;--tops:#2dd4bf;--atmos:#8a93a6;--vox:#ff3d7b;
--agree:#5bc091;--partial:#e0a82e;--conflict:#ff5252;--divergent:#b06cff;--idle:#4a4a4a;
}
*{box-sizing:border-box}
body{margin:0;background:var(--surface);color:var(--ink);height:100vh;overflow:hidden;
font:14px/1.45 Geist,Inter,system-ui,sans-serif}
.mono{font-family:"Geist Mono",ui-monospace,SFMono-Regular,monospace}
.app{display:grid;grid-template-rows:auto auto 1fr;height:100vh}
/* header */
header{padding:14px 20px 10px;border-bottom:1px solid var(--hairline)}
h1{margin:0;font-size:15px;letter-spacing:.14em;font-weight:600}
h1 b{color:var(--magenta)}
.sub{color:var(--mute);font-size:12px;margin-top:2px}
.chips{display:flex;gap:18px;flex-wrap:wrap;margin-top:10px;align-items:flex-end}
.chip{display:flex;flex-direction:column;gap:1px}
.chip .n{font-size:22px;font-weight:600;line-height:1}
.chip .l{font-size:10px;letter-spacing:.08em;text-transform:uppercase;color:var(--faint)}
.stack{display:flex;height:9px;border-radius:5px;overflow:hidden;width:340px;margin-top:5px;border:1px solid var(--hairline)}
.stack i{display:block;height:100%}
/* toolbar */
.bar{padding:8px 20px;display:flex;gap:8px;align-items:center;border-bottom:1px solid var(--hairline);flex-wrap:wrap}
input#q{background:var(--overlay);border:1px solid var(--hairline);color:var(--ink);border-radius:8px;
padding:6px 10px;font:inherit;width:230px}
.f{background:transparent;border:1px solid var(--hairline);color:var(--mute);border-radius:99px;
padding:4px 11px;cursor:pointer;font:inherit;font-size:12px;display:flex;gap:6px;align-items:center}
.f:hover{color:var(--ink)}.f.on{color:#000;font-weight:600}
.f .d{width:8px;height:8px;border-radius:50%}
.grow{flex:1}.count{color:var(--faint);font-size:12px}
/* table */
.wrap{overflow:auto}
table{width:100%;border-collapse:collapse;font-size:13px}
thead th{position:sticky;top:0;background:var(--surface);text-align:left;padding:8px 12px;
font-size:10px;letter-spacing:.07em;text-transform:uppercase;color:var(--faint);
border-bottom:1px solid var(--hairline);z-index:2;white-space:nowrap}
tbody td{padding:7px 12px;border-bottom:1px solid #ffffff10;vertical-align:middle}
tbody tr{cursor:pointer}tbody tr:hover{background:#ffffff08}
tbody tr.sel{background:#ffffff12;outline:1px solid var(--hairline)}
.tname{font-weight:500}
.path{color:var(--faint);font-size:11px}
.tag{display:inline-block;padding:1px 7px;border-radius:99px;font-size:11px;font-weight:600;color:#000}
.lvl{text-transform:capitalize}
.snd{color:var(--mute);font-size:11px}
.pill{display:inline-block;border:1px solid var(--hairline);border-radius:5px;padding:0 5px;margin:1px 2px 1px 0;font-size:11px}
.dot{width:8px;height:8px;border-radius:50%;display:inline-block;margin-right:5px}
.warn{color:var(--divergent)}.muted{color:var(--faint)}
.eda-no{color:var(--conflict)}.eda-yes{color:var(--agree)}
.num{font-variant-numeric:tabular-nums;text-align:right}
/* detail drawer */
.drawer{position:fixed;right:0;top:0;bottom:0;width:420px;background:var(--raised);
border-left:1px solid var(--hairline);transform:translateX(100%);transition:.18s;overflow:auto;
padding:18px 20px;z-index:10}
.drawer.open{transform:none;box-shadow:-20px 0 60px #000a}
.drawer h2{margin:0 0 2px;font-size:18px}
.drawer .x{position:absolute;top:14px;right:16px;cursor:pointer;color:var(--mute);font-size:18px}
.sec{margin-top:16px}
.sec h3{font-size:10px;letter-spacing:.08em;text-transform:uppercase;color:var(--faint);margin:0 0 7px}
.kv{display:grid;grid-template-columns:34px 1fr;gap:4px 8px;align-items:center}
.kv .o{color:var(--magenta);font-size:12px}
.diff span{display:inline-block;border-radius:5px;padding:1px 7px;margin:2px 3px 2px 0;font-size:12px}
.d-shared{background:#5bc09122;color:var(--agree);border:1px solid #5bc09155}
.d-score{background:#8a93a622;color:var(--atmos);border:1px solid #8a93a655}
.d-meta{background:#ff525222;color:var(--conflict);border:1px solid #ff525255}
.tk{display:flex;justify-content:space-between;border:1px solid var(--hairline);border-radius:7px;padding:6px 9px;margin-bottom:5px;font-size:12px}
.empty{color:var(--faint);font-style:italic;font-size:12px}
a.src{color:var(--melodic);text-decoration:none;font-size:11px}a.src:hover{text-decoration:underline}
</style></head><body>
<div class="app">
<header>
<h1>🔺 <b>TRIANGLE</b> · CATALOG CONFIRMATION</h1>
<div class="sub" id="sub">loading…</div>
<div class="chips" id="chips"></div>
</header>
<div class="bar">
<input id="q" placeholder="search track / sound / gig…" oninput="render()">
<span id="filters"></span>
<span class="grow"></span>
<span class="count" id="count"></span>
</div>
<div class="wrap">
<table>
<thead><tr>
<th>Track</th><th>A · Score</th><th>C · Metadata</th>
<th>A↔C</th><th>B · Recording</th><th class="num">Gigs</th>
</tr></thead>
<tbody id="rows"></tbody>
</table>
</div>
</div>
<div class="drawer" id="drawer"></div>
<script>
const C={agree:'--agree',partial:'--partial',conflict:'--conflict',divergent:'--divergent',
'no-claim':'--idle',unparsed:'--conflict',empty:'--idle'};
const css=v=>getComputedStyle(document.documentElement).getPropertyValue(v).trim();
const ROLE={percs:'--percs',bass:'--bass',melodic:'--melodic',tops:'--tops',atmos:'--atmos',vox:'--vox'};
let DATA=null, FILTER='all', SEL=null;
const FILTERS=[['all','All'],['agree','Agree'],['partial','Partial'],['conflict','Conflict'],
['divergent','Divergent'],['no-claim','No-claim'],['unrecorded','Unrecorded'],['eda','EDA-grounded']];
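// Load the reconciled view; report a failed fetch instead of leaving "loading…" on screen.
fetch('catalog_view.json').then(r=>{if(!r.ok)throw new Error(`HTTP ${r.status}`);return r.json()})
.then(d=>{DATA=d;boot()})
.catch(e=>{document.getElementById('sub').textContent=`failed to load catalog_view.json: ${e.message}`});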
function boot(){
const s=DATA.stats;
document.getElementById('sub').textContent=
`${DATA.schema} · as of ${DATA.as_of}`;
// coverage chips
const chips=[
['Tracks',s.tracks_total],['Score parsed',`${s.score_parsed}/${s.tracks_total}`],
['With metadata',s.with_metadata],['Recorded',`${s.recorded}/${s.tracks_total}`],
['EDA coverage',`${s.takes_with_eda}/${s.takes_total} takes`],
];
let h=chips.map(([l,n])=>`<div class="chip"><div class="n mono">${n}</div><div class="l">${l}</div></div>`).join('');
// agreement stacked bar
const order=['agree','partial','conflict','divergent','no-claim'];
const seg=order.map(lv=>{const n=s['ac_'+lv.replace('-','_')]||0; // stats keys use underscores (ac_no_claim)
return n?`<i style="flex:${n};background:${css(C[lv])}" title="${lv}: ${n}"></i>`:''}).join('');
h+=`<div class="chip"><div class="stack">${seg}</div>
<div class="l">A↔C agreement · ${s.ac_agree} agree · ${s.ac_conflict} conflict · ${s.ac_divergent} divergent</div></div>`;
document.getElementById('chips').innerHTML=h;
// filters
document.getElementById('filters').innerHTML=FILTERS.map(([k,l])=>{
const c=C[k]?css(C[k]):'';
const dot=c?`<span class="d" style="background:${c}"></span>`:'';
return `<button class="f${k===FILTER?' on':''}" data-k="${k}"
style="${k===FILTER&&c?`background:${c};border-color:${c}`:''}">${dot}${l}</button>`}).join('');
document.querySelectorAll('.f').forEach(b=>b.onclick=()=>{FILTER=b.dataset.k;boot()});
render();
}
function match(r){
const q=document.getElementById('q').value.toLowerCase().trim();
if(q){const hay=(r.name+' '+r.track+' '+r.score_sounds.join(' ')+' '+r.claimed_sounds.join(' ')+' '+r.gigs.join(' ')).toLowerCase();
if(!hay.includes(q))return false}
if(FILTER==='all')return true;
if(FILTER==='unrecorded')return !r.recorded;
if(FILTER==='eda')return r.n_takes_eda>0;
return r.ac.level===FILTER;
}
function render(){
if(!DATA)return; // the search input or Escape can fire before the fetch resolves
const rows=DATA.tracks.filter(match);
document.getElementById('count').textContent=`${rows.length} / ${DATA.tracks.length} tracks`;
document.getElementById('rows').innerHTML=rows.map((r,i)=>{
const lv=r.ac.level, c=css(C[lv]);
const idx=DATA.tracks.indexOf(r);
const prec=r.ac.precision!=null?` ${Math.round(r.ac.precision*100)}%`:'';
const sib=r.alias_siblings.length?` <span class="warn" title="${esc(r.alias_siblings.join('\n'))}">⚠${r.alias_siblings.length}</span>`:'';
const meta=r.metas[0]||{};
const metaCell=r.claimed_sounds.length
? `<span class="snd">${esc(r.claimed_sounds.slice(0,4).join(' · '))}${r.claimed_sounds.length>4?'…':''}</span>`
: `<span class="muted">— no ingredients —</span>`;
const bpm=meta.bpm?`<span class="mono">${meta.bpm}</span> `:'';
const recCell=r.recorded
? `${r.n_takes} take${r.n_takes>1?'s':''} <span class="${r.n_takes_eda?'eda-yes':'eda-no'}" title="EDA-grounded takes">${r.n_takes_eda?'◉ EDA':'○ no EDA'}</span>`
: `<span class="muted">— unrecorded —</span>`;
return `<tr data-i="${idx}" class="${idx===SEL?'sel':''}">
<td><div class="tname">${esc(r.name)}${sib}</div><div class="path mono">${esc(r.track)}</div></td>
<td><span class="mono">${r.n_orbits}d</span> <span class="snd">${esc(r.score_sounds.slice(0,4).join(' · '))}${r.score_sounds.length>4?'…':''}</span></td>
<td>${bpm}${metaCell}</td>
<td><span class="tag lvl" style="background:${c}">${lv}${prec}</span></td>
<td>${recCell}</td>
<td class="num mono">${r.gigs.length}</td>
</tr>`}).join('');
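document.querySelectorAll('#rows tr').forEach(tr=>tr.onclick=()=>openRow(+tr.dataset.i));
}
// Named openRow so the global function does not replace window.open (same reason close_ has its underscore).
function openRow(i){SEL=i;render();const r=DATA.tracks[i];const lv=r.ac.level,c=css(C[lv]);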
const orb=Object.entries(r.score).map(([d,snd])=>`<div class="o mono">${d}</div><div class="mono">${esc(snd)}</div>`).join('');
const diff=g=>(r.ac[g]||[]).map(s=>`<span class="d-${g==='metadata_only'?'meta':g==='score_only'?'score':'shared'}">${esc(s)}</span>`).join('')||'<span class="empty">none</span>';
const takes=r.takes.length?r.takes.map(t=>`<div class="tk"><span class="mono">${esc(t.take)}</span>
<span class="muted">${esc(t.type)} · via ${esc(t.via)} · ${esc(t.method)}</span>
<span class="${t.eda?'eda-yes':'eda-no'}">${t.eda?'◉':'○'}</span></div>`).join(''):'<div class="empty">no candidate take (date-join found nothing)</div>';
const gigs=r.gigs.map(g=>`<span class="pill">${esc(g)}</span>`).join('');
const alias=r.alias_siblings.length?`<div class="sec"><h3>⚠ name shared with other files</h3>
${r.alias_siblings.map(s=>`<div class="path mono">${esc(s)}</div>`).join('')}
<div class="empty">versions of one track, or a wrong link — ear-verify</div></div>`:'';
document.getElementById('drawer').innerHTML=`<span class="x" onclick="close_()">✕</span>
<h2>${esc(r.name)}</h2><div class="path mono">${esc(r.track)}</div>
<div style="margin-top:9px"><span class="tag lvl" style="background:${c}">${lv}${r.ac.precision!=null?' · precision '+Math.round(r.ac.precision*100)+'%':''}</span></div>
<div class="sec"><h3>A · score (${r.n_orbits} orbits)</h3><div class="kv">${orb||'<span class="empty">unparsed</span>'}</div></div>
<div class="sec"><h3>A↔C diff</h3>
<div class="diff"><div><b class="muted" style="font-size:11px">shared</b><br>${diff('shared')}</div>
<div style="margin-top:6px"><b class="muted" style="font-size:11px">score-only (often drums C omits)</b><br>${diff('score_only')}</div>
<div style="margin-top:6px"><b class="muted" style="font-size:11px">metadata-only (claimed ∉ score)</b><br>${diff('metadata_only')}</div></div></div>
${alias}
<div class="sec"><h3>B · candidate recordings</h3>${takes}</div>
<div class="sec"><h3>gigs (${r.gigs.length})</h3>${gigs}</div>`;
document.getElementById('drawer').classList.add('open');
}
function close_(){document.getElementById('drawer').classList.remove('open');SEL=null;render()}
function esc(s){return String(s).replace(/[&<>"]/g,m=>({'&':'&amp;','<':'&lt;','>':'&gt;','"':'&quot;'}[m]))}
document.addEventListener('keydown',e=>{if(e.key==='Escape')close_()});
</script></body></html>
// AUTO-GENERATED from armada/tide-table/models.py — DO NOT EDIT.
// Regenerate: python3 tools/gen_ts_types.py (DRY data layer, #42)
export type AgreeLevel = 'empty' | 'unparsed' | 'no-claim' | 'agree' | 'partial' | 'conflict' | 'divergent'
export type Variant = 'stream' | 'club'
/** A↔C: how well the site's claimed ingredients match the actual score. */
export interface AgreeResult {
level: AgreeLevel
precision?: number | null
jaccard?: number | null
score_only?: string[]
metadata_only?: string[]
shared?: string[]
}
export interface CatalogStats {
tracks_total: number
score_present: number
score_parsed: number
with_metadata: number
ac_agree: number
ac_partial: number
ac_conflict: number
ac_divergent: number
ac_unparsed: number
ac_no_claim: number
ac_empty: number
alias_fragmented: number
recorded: number
unrecorded: number
with_eda: number
takes_total: number
takes_with_eda: number
gigs_total: number
}
export interface LoudTrace {
trace: number[]
stepS: number
@@ -45,6 +78,45 @@ export interface Take {
orbits: OrbitActivity[]
}
/** Corner B candidate: a take this track plausibly lives in (date-join, L0). */
export interface TakeRef {
take: string
type: string
via: string
method: string
is_set: boolean
eda?: string | null
}
/** Corner C facts for one gig appearance (site tracklist, a pointer). */
export interface TrackMeta {
gig: string
date?: string
bpm?: number | null
style?: string | null
dur?: number | null
}
/** One track (identity = its .tidal path), reconciled across the three corners. */
export interface TrackRow {
track: string
names?: string[]
name: string
gigs?: string[]
metas?: TrackMeta[]
score_present: boolean
n_orbits: number
score?: Record<string, string>
score_sounds?: string[]
claimed_sounds?: string[]
ac: AgreeResult
takes?: TakeRef[]
n_takes: number
n_takes_eda: number
recorded: boolean
alias_siblings?: string[]
}
export interface PlayerData {
track: string
calibration: string
@@ -61,3 +133,10 @@ export interface Note {
t: number
text: string
}
export interface CatalogView {
schema: string
as_of: string
stats: CatalogStats
tracks?: TrackRow[]
}
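The `AgreeResult` fields above (`shared`, `score_only`, `metadata_only`, `precision`, `jaccard`) read like plain set arithmetic over the two sound lists. The following is a minimal Python sketch under two assumptions: precision is taken to be the share of claimed sounds that appear in the score, and jaccard to be intersection over union. The function name `agree_metrics` is hypothetical; the real `agree()` in `build_catalog_view.py` may normalize names differently, and its thresholds for mapping these numbers to a level are not shown in this diff.

```python
def agree_metrics(score_sounds, claimed_sounds):
    """Set overlap between corner A (score) and corner C (metadata claims).

    Hypothetical helper: the definitions of precision and jaccard are assumed.
    """
    score, claimed = set(score_sounds), set(claimed_sounds)
    shared = score & claimed
    union = score | claimed
    return {
        "shared": sorted(shared),
        "score_only": sorted(score - claimed),      # e.g. drums the site omits
        "metadata_only": sorted(claimed - score),   # claimed but not in the score
        # Assumed: fraction of claimed sounds actually present in the score.
        "precision": len(shared) / len(claimed) if claimed else None,
        # Assumed: intersection over union of both sound sets.
        "jaccard": len(shared) / len(union) if union else None,
    }


m = agree_metrics(["bd", "sn", "superpiano"], ["superpiano", "bass"])
print(m["precision"], m["jaccard"])
```

Returning `None` when there are no claimed sounds keeps "no claim" distinct from "0% precision", which matches the nullable `precision` and `jaccard` fields in the generated type.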
@@ -14,9 +14,9 @@ from pathlib import Path
ROOT = Path(__file__).resolve().parent.parent
sys.path.insert(0, str(ROOT / "armada" / "tide-table"))
-from models import Note, PlayerData  # noqa: E402
+from models import CatalogView, Note, PlayerData  # noqa: E402
-ROOTS = [PlayerData, Note]  # top-level UI-facing models
+ROOTS = [PlayerData, Note, CatalogView]  # top-level UI-facing models
OUT = ROOT / "armada" / "ui" / "src" / "types.gen.ts"
SCALARS = {"string": "string", "number": "number", "integer": "number",
"boolean": "boolean", "null": "null"}
......
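The keys of `SCALARS` are JSON Schema type names, which suggests the generator walks each model's JSON schema and maps properties to TypeScript types. The rest of `gen_ts_types.py` is elided here, so the following is only a sketch of that idea: `ts_type` is a hypothetical helper, and the handling of `$ref`, `anyOf`, `enum`, arrays and objects is an assumption about the schema shapes involved, not the script's actual code.

```python
SCALARS = {"string": "string", "number": "number", "integer": "number",
           "boolean": "boolean", "null": "null"}


def ts_type(prop: dict) -> str:
    """Map one JSON Schema property to a TypeScript type string (hypothetical)."""
    if "$ref" in prop:                        # reference to another model
        return prop["$ref"].rsplit("/", 1)[-1]
    if "anyOf" in prop:                       # e.g. an optional float
        return " | ".join(ts_type(p) for p in prop["anyOf"])
    if "enum" in prop:                        # string literal union
        return " | ".join(f"'{v}'" for v in prop["enum"])
    kind = prop.get("type")
    if kind == "array":
        item = ts_type(prop.get("items", {}))
        # Parenthesize unions so `(a | b)[]` is not misread as `a | b[]`.
        return f"({item})[]" if " | " in item else f"{item}[]"
    if kind == "object":
        value = ts_type(prop.get("additionalProperties", {}))
        return f"Record<string, {value}>"
    return SCALARS.get(kind, "unknown")


print(ts_type({"anyOf": [{"type": "number"}, {"type": "null"}]}))
print(ts_type({"type": "object", "additionalProperties": {"type": "string"}}))
```

Mapping both `number` and `integer` to TypeScript `number` follows the `SCALARS` table in the diff; anything the sketch does not recognize falls back to `unknown` rather than guessing.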