[DRAFT] Evo 2 SAE feature explorer — visualization mockup #1582
polinabinder1 wants to merge 13 commits into
PyTorch 2.6 changed the default of `torch.load`'s `weights_only` argument to True. The Savanna checkpoint pickle includes numpy globals (`numpy.core.multiarray._reconstruct`), which the safer loader rejects. The converter then exits 0 with no output written and the error gets buried in stderr, so the failure is silent. The Savanna repos under arcinstitute/* are trusted sources, so load with `weights_only=False`.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
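A minimal sketch of the fix described in this commit, plus a guard against the silent-exit failure mode. `torch.load(..., weights_only=False)` is the real PyTorch call; `load_trusted_checkpoint` and `require_output` are hypothetical helper names used only for illustration and are not taken from the converter.

```python
from pathlib import Path


def load_trusted_checkpoint(path):
    """Load a pickle-based checkpoint from a source we trust.

    weights_only=False re-enables full unpickling, which can execute
    arbitrary code, so this is only appropriate for trusted files.
    """
    import torch  # imported lazily so the guard below works without torch

    return torch.load(path, map_location="cpu", weights_only=False)


def require_output(path):
    """Raise if a conversion step left no output file, or an empty one."""
    out = Path(path)
    if not out.is_file() or out.stat().st_size == 0:
        raise RuntimeError(f"conversion produced no output at {out}")
    return out
```

Calling something like `require_output` right after the conversion step would turn a missing output into an uncaught exception and therefore a non-zero exit status, instead of an error that is only visible in stderr.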
Mirrors the existing esm2 / codonfm SAE recipes. Pipeline:
chunk -> convert (Savanna->MBridge) -> predict_evo2 -> pt_to_parquet -> train
Differences from esm2/codonfm are forced by Evo2 specifics:
- Hyena/Megatron-Core model, no HF AutoModel path => reuses the
existing `predict_evo2` CLI for inference instead of writing
a custom extract.py
- `pt_to_parquet.py` shim bridges predict_evo2's .pt output to
the universal `sae.activation_store` parquet contract
- `chunk_fasta.py` preprocessor keeps inputs within the model's
trained context length (8192 bp for 1B); Hyena fftconv OOMs
on long sequences even at micro-batch=1
- `train.py` is the same as codonfm's, copied verbatim per
bionemo-recipes' KISS-over-DRY convention
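The chunking step can be sketched as follows. This is an illustration under stated assumptions, not the contents of `chunk_fasta.py`: it uses non-overlapping windows, and the `name:start-end` chunk ID scheme and the function names are hypothetical. Only the 8192 bp limit comes from the commit message.

```python
def chunk_sequence(seq, max_len=8192):
    """Split seq into consecutive non-overlapping windows of at most max_len bases."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [seq[i:i + max_len] for i in range(0, len(seq), max_len)]


def chunk_records(records, max_len=8192):
    """Yield (chunk_id, chunk) pairs; the ID records the source name and offsets."""
    for name, seq in records:
        for k, chunk in enumerate(chunk_sequence(seq, max_len)):
            start = k * max_len
            yield f"{name}:{start}-{start + len(chunk)}", chunk


# A 20,000 bp synthetic sequence splits into two full windows and a remainder.
records = [("chrM_demo", "ACGT" * 5000)]
chunks = list(chunk_records(records))
print([(chunk_id, len(chunk)) for chunk_id, chunk in chunks])
```

Keeping the offsets in the chunk ID makes it possible to map per-token activations back to coordinates in the original sequence later.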
Validated end-to-end on 100 organelle sequences (Evo2 1B layer 12):
loss 0.67 -> 0.045, FVU 0.90 -> 0.10, var_exp 0.10 -> 0.90, 2m14s wall.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
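The reported FVU and var_exp values are consistent with var_exp = 1 - FVU. As a sketch of the standard definition (the recipe's `train.py` may compute it differently, for example per batch or per feature), FVU is the squared reconstruction error divided by the total variance of the activations around their mean:

```python
def fvu(x, x_hat):
    """Fraction of variance unexplained over a batch of activation vectors.

    x, x_hat: lists of equal-length float lists (originals and reconstructions).
    """
    n, d = len(x), len(x[0])
    mean = [sum(row[j] for row in x) / n for j in range(d)]
    resid = sum((a - b) ** 2 for r, rh in zip(x, x_hat) for a, b in zip(r, rh))
    total = sum((a - m) ** 2 for r in x for a, m in zip(r, mean))
    return resid / total


# Toy data: only the last vector is reconstructed imperfectly.
x = [[1.0, 0.0], [3.0, 2.0], [5.0, 4.0]]
x_hat = [[1.0, 0.0], [3.0, 2.0], [4.0, 3.0]]
score = fvu(x, x_hat)
print(f"FVU={score:.3f}  var_exp={1 - score:.3f}")
```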
The recipe currently has no model-specific Python module: the extractor is upstream (`predict_evo2`) and the two scripts are simple CLIs in scripts/. Drop the empty package and adjust pyproject.toml so setuptools doesn't try to discover anything. Will reintroduce when there's actual library code to put there (eval, dashboard, dataloaders).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Fork of recipes/codonfm/codon_dashboard adapted for DNA + Evo 2,
populated with synthetic data. Demo-able artifact, not a real result.
What's here:
- scripts/make_mockup_features.py: deterministic synthetic data generator
(seed 42). Writes features_atlas.parquet, feature_metadata.parquet,
feature_examples.parquet to evo2_dashboard_mockup/public/. Fixtures
are committed for one-step npm-only setup.
- evo2_dashboard_mockup/: Vite/React SPA forked from codon_dashboard
with these swaps:
* Removed molstar dep + MolstarThumbnail.jsx
* Renamed ProteinSequence.jsx -> SequenceView.jsx; per-base
rendering (no codon framing, no AA translation)
* Renamed ProteinDetailModal.jsx -> RegionDetailModal.jsx;
UniProt content swapped for genomic-region content
* utils.js: getRegionLabel + parseBases (replacing
getAccession/uniprotUrl/parseCodons/codonToAA)
* MOCKUP banner at top of App
* "Evo 2 SAE Feature Explorer (Mockup)" title
- v2 roadmap placeholders (greyed em-dashes with hover tooltips):
* FeatureCard: Annotation, Sensitivity, Recon Δ stats
* FeatureDetailPage: Annotations, Conservation sections
Quick start: cd evo2_dashboard_mockup && npm install && npm run dev
The synthetic data schema is the contract the future real eval pipeline
will need to target.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ed features
Three changes on top of the initial mockup commit:
1. Drop codonfm-specific scaffolding from forked components.
- .gitignore the auto-generated package-lock.json (regenerates on `npm install`)
- FeatureCard.jsx: 793 -> 508 lines. Removed dead stat tiles (Hi-Score,
Variant/Site/Local deltas, ClinVar, PhyloP, GC, Trinuc/Gene entropy),
codonfm vocab-logits chart, codonfm GSEA tags, codonfm CSV export
sections — all conditional on fields our synthetic data doesn't provide.
- FeatureDetailPage.jsx: 522 -> 187 lines. Replaced codonfm-specific
VocabLogitChart / CodonAnnotations / FeatureMetrics components with a
simpler DNA-friendly detail view.
2. Refine the synthetic feature set.
- 11 labeled DNA-native features in 3 thematic UMAP clusters:
* eukaryotic regulatory (TATA box, polyA signal, CpG island,
splice donor, splice acceptor)
* bacterial regulatory (-10 box, -35 box, Shine-Dalgarno)
* codon context (start ATG, stop TAA, stop TAG)
- 9 unlabeled features in a 4th diffuse cluster (label=NULL,
db_source=NULL) — mimics the realistic case where most SAE
features are uninterpreted.
- New `db_source` column on each feature (RefSeq / JASPAR-ENCODE /
bacterial annotation / RefSeq UTR / ENCODE-RefSeq / NULL).
3. Bug fixes for cross-pod port-forward demo:
- App.jsx defaults: `selectedCategory` and `histMetric3` were
hardcoded to codonfm's `mean_variant_1bcdwt` column, which doesn't
exist in our atlas and threw Binder errors. Switched to `cluster_id`.
- Atlas column rename: `cluster` -> `cluster_id` to match what
App.jsx queries.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
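A rough sketch of how a seeded generator could lay out this feature set. The labels, cluster themes, and counts come from the commit message; the cluster centers, spreads, the `feature_id`/`x`/`y` column names, and the function names are assumptions for illustration and may not match `make_mockup_features.py`. The `db_source` column is omitted because the per-label mapping is not given here.

```python
import random

# Labels and themes are from the commit message; the numbers are illustrative.
CLUSTERS = {
    0: ["TATA box", "polyA signal", "CpG island", "splice donor", "splice acceptor"],
    1: ["-10 box", "-35 box", "Shine-Dalgarno"],
    2: ["start ATG", "stop TAA", "stop TAG"],
}
CENTERS = {0: (-4.0, 3.0), 1: (4.0, 3.0), 2: (0.0, -4.0), 3: (0.0, 0.0)}
N_UNLABELED = 9


def _row(rng, feature_id, label, cluster_id, spread):
    """One atlas row: a point scattered around its cluster center."""
    cx, cy = CENTERS[cluster_id]
    return {
        "feature_id": feature_id,
        "label": label,
        "cluster_id": cluster_id,
        "x": rng.gauss(cx, spread),
        "y": rng.gauss(cy, spread),
    }


def make_features(seed=42):
    rng = random.Random(seed)  # private RNG: same seed gives the same table
    rows = []
    for cluster_id, labels in CLUSTERS.items():
        for label in labels:
            rows.append(_row(rng, len(rows), label, cluster_id, spread=0.4))
    for _ in range(N_UNLABELED):
        # Unlabeled features: label is None (NULL in parquet), diffuse cluster.
        rows.append(_row(rng, len(rows), None, 3, spread=2.0))
    return rows


features = make_features()
print(len(features), sum(f["label"] is None for f in features))
```

Using a private `random.Random(seed)` instance rather than the module-level RNG keeps the output reproducible even if other code draws random numbers first.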
Each of the 11 labeled features now ships with a PWM-driven sequence
logo rendered into public/logos/feature_{id}.png by logomaker. Central
signatures are spec'd per label (Kozak ATG, TATA, polyA, CpG, Shine-
Dalgarno, bacterial -10/-35, splice donor/acceptor, stop TAA/TAG);
flanks are uniform 0-bit so the logos read as clean motif summaries
rather than noisy speckle. Unlabeled features get no logo — their
cards skip the section entirely.
make_mockup_features.py grows _build_pwm() and _render_logo(); the
metadata/atlas parquets carry a logo_path column; App.jsx detects it
optionally and excludes it from category detection; FeatureCard's
expanded view and FeatureDetailPage display the logo above the top-
activating-sequences list.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
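To illustrate why uniform flanks read as 0 bits in a sequence logo, here is a small sketch of a PWM with a strong central motif and uniform flanks. It is not the `_build_pwm()` from the script: the function names, the `TATAAA` core, the flank width, and the 0.97 core probability are all assumptions chosen for the example.

```python
import math

BASES = "ACGT"


def build_pwm(core, flank=4, core_prob=0.97):
    """Return one {base: probability} dict per position.

    Flank columns are uniform (0.25 each). Each core column puts core_prob
    on the consensus base and splits the rest evenly over the other three.
    """
    uniform = {b: 0.25 for b in BASES}
    rest = (1.0 - core_prob) / 3.0
    core_cols = [{b: (core_prob if b == c else rest) for b in BASES} for c in core]
    left = [dict(uniform) for _ in range(flank)]
    right = [dict(uniform) for _ in range(flank)]
    return left + core_cols + right


def information_bits(column):
    """Information content of one DNA column: 2 bits minus Shannon entropy."""
    entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
    return 2.0 - entropy


pwm = build_pwm("TATAAA")
bits = [information_bits(col) for col in pwm]
print([round(b, 2) for b in bits])
```

A logo renderer scales letter heights by this per-column information, so the uniform flanks collapse to zero height and only the central signature is visible.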
Two new visualizations for the SAE interpretability dashboard, plus the offline pipeline that produces the gene-UMAP precompute bundle.
scripts/generate_fake_genes.py
500-row genes.tsv stand-in (gene_symbol, species, sequence) until a real curated catalog lands. Realistic-ish distributions across 7 species.
scripts/gene_umap_precompute.py
End-to-end offline pipeline: genes.tsv -> Evo2 1B layer-20 -> TopK SAE encode -> mean per gene -> UMAP (cosine) -> HDBSCAN clusters -> per-feature firing stats. Writes G.npz, genes_umap.parquet, feature_stats.parquet, manifest.json. Reuses predict_evo2 via torchrun subprocess; aggregates .pt files by seq_idx + pad_mask. Idempotent (skips predict if .pt files exist).
src/ColoredSequence.jsx
React component: paste a DNA sequence -> each base background-colored by its top-firing SAE feature, opacity scaled by activation strength. Two modes: top-feature (default), single-feature lookup. Builds mock activations internally when no `analysis` prop is supplied so the component works standalone before the /analyze backend is wired. Tableau-10 colorblind palette, hover tooltip with top-5 features, legend sorted by per-color position count.
src/GeneUMAPView.jsx
Renders the 500-gene UMAP via canvas. Loads G.bin (raw float32), genes_meta.json, feature_stats.json from public/gene_umap/. Click a feature in the sidebar -> instant recolor by activation strength (no recompute). Click Reorganize -> re-runs UMAP client-side with feature-weighted vectors (umap-js, ~2-5s at N=500), animates the transition with ease-in-out cubic. Hover shows gene metadata + top 5 firing features.
src/Preview.jsx + src/index.jsx
Tabbed entry at /#preview: "Main" (the existing dashboard, untouched), "ColoredSequence", "Gene UMAP". Hash-gated so / still goes to the unchanged production layout. The ColoredSequence tab includes a paste textarea so users can drop their own sequences in.
public/gene_umap/
Precomputed bundle for the GeneUMAPView (G.bin 30 MB, plus small JSON metadata + per-feature stats filtered to n_firing >= 10).
Dep change: umap-js for client-side reorganize.
Generated genes are synthetic; replace fake_genes.tsv with a real curated 500-gene list and re-run the precompute when one is available.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
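The "mean per gene" step, aggregating by seq_idx and pad_mask, can be sketched as a masked mean. This is an illustration only: the nested-list layout, the convention that a mask value of 1 marks a real token, and the function name are assumptions, not the layout of the `predict_evo2` `.pt` files.

```python
def mean_per_gene(acts, pad_mask, seq_idx):
    """Average per-token SAE activations into one vector per gene.

    acts:     [n_seqs][n_tokens][n_features] nested lists of floats
    pad_mask: [n_seqs][n_tokens], 1 for real tokens and 0 for padding
    seq_idx:  [n_seqs], the gene index each row belongs to
    """
    sums, counts = {}, {}
    for row, mask, gene in zip(acts, pad_mask, seq_idx):
        for token, keep in zip(row, mask):
            if not keep:
                continue  # padded position: excluded from the mean
            acc = sums.setdefault(gene, [0.0] * len(token))
            for j, value in enumerate(token):
                acc[j] += value
            counts[gene] = counts.get(gene, 0) + 1
    return {g: [s / counts[g] for s in sums[g]] for g in sums}


acts = [
    [[1.0, 0.0], [3.0, 0.0], [9.0, 9.0]],  # last token is padding
    [[0.0, 2.0], [0.0, 4.0], [0.0, 6.0]],
]
pad_mask = [[1, 1, 0], [1, 1, 1]]
seq_idx = [0, 1]
gene_vectors = mean_per_gene(acts, pad_mask, seq_idx)
print(gene_vectors)
```

Accumulating by gene index rather than by row also handles the case where one gene is split across several rows, since every row with the same index contributes to the same running sum.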
Force-added past the *.bin gitignore so coworkers can pull and run the dashboard end-to-end without re-running the GPU precompute. Without this file GeneUMAPView fails to load.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
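Because a raw float32 file carries no header, a consumer has to take the matrix shape from separate metadata and can only sanity-check the byte count. The sketch below shows that round trip in Python; the little-endian row-major layout, the function names, and the idea that the shape lives in the JSON metadata are assumptions, not details confirmed by this PR.

```python
import struct
from pathlib import Path


def write_matrix(path, rows):
    """Write a row-major float32 matrix as raw little-endian bytes."""
    flat = [v for row in rows for v in row]
    Path(path).write_bytes(struct.pack(f"<{len(flat)}f", *flat))


def read_matrix(path, n_rows, n_cols):
    """Read the matrix back; the shape must come from separate metadata."""
    data = Path(path).read_bytes()
    expected = n_rows * n_cols * 4  # 4 bytes per float32
    if len(data) != expected:
        raise ValueError(f"expected {expected} bytes, found {len(data)}")
    flat = struct.unpack(f"<{n_rows * n_cols}f", data)
    return [list(flat[i * n_cols:(i + 1) * n_cols]) for i in range(n_rows)]
```

The explicit size check means a truncated or mismatched file fails with a clear message rather than producing a silently misaligned matrix.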
Three-column comparison view for steering a chosen SAE feature at a masked position. All synthetic: 14 hand-rolled (seed, feature) pairs in public/steering_examples.json, including 6 deliberately marked as null results so the demo shows honestly that not every steering attempt works.
- Instant-apply controls (no cosmetic Run button)
- A/C/G/T probability bars (DNA tokenization, matches Evo2)
- Sticky diff summary above columns with effect-size badge
- 16S × kanamycin_resistance pair illustrates the A1408G mutation
- Disabled feature options for pairs without data; graceful fallback message when an unsupported combination is selected
- 4th tab in Preview.jsx, reuses existing tab pattern (no router added)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
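One way a diff summary for a single masked position could be computed is sketched below. The commit does not say which statistic backs the effect-size badge, so total variation distance, the 0.05 null threshold, the field names, and the example probabilities are all assumptions made for illustration.

```python
BASES = "ACGT"


def compare(baseline, steered, null_threshold=0.05):
    """Summarize how steering changed one position's A/C/G/T distribution."""
    base_call = max(BASES, key=lambda b: baseline[b])
    steer_call = max(BASES, key=lambda b: steered[b])
    # Total variation distance: half the L1 distance, always in [0, 1].
    tvd = 0.5 * sum(abs(steered[b] - baseline[b]) for b in BASES)
    return {
        "baseline_call": base_call,
        "steered_call": steer_call,
        "flipped": base_call != steer_call,
        "effect_size": tvd,
        "null_result": tvd < null_threshold,
    }


# Made-up probabilities in which steering moves the call from A to G.
baseline = {"A": 0.80, "C": 0.05, "G": 0.10, "T": 0.05}
steered = {"A": 0.20, "C": 0.05, "G": 0.70, "T": 0.05}
summary = compare(baseline, steered)
print(summary)
```

Reporting the distance separately from the argmax flip lets a null result (small shift, no flip) be shown explicitly instead of being hidden.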
Pares the preview tabs down to the two that matter for now:
/#preview tab 1: Main (existing feature catalog + atlas + WebLogos)
/#preview tab 2: Steering explorer (slider + per-position P(ACGT) heatmap)
Removes the ColoredSequence, Gene UMAP, and SAE Summary tabs along with their data, scripts, and components. The full 5-tab version is preserved on the evo2-sae-dashboard-full-mockup branch if we want to revive any of those views later.
Removed:
- src/ColoredSequence.jsx, src/GeneUMAPView.jsx, src/SteeringComparison.jsx, src/SAESummary.jsx
- public/gene_umap/ (G.bin, genes_meta.json, feature_stats.json)
- public/steering_examples.json (replaced by steering_data.json)
- public/sae_qc_summary.json
- scripts/gene_umap_precompute.py, scripts/generate_fake_genes.py
Kept / added:
- src/SteeringExplorer.jsx (slider + per-position heatmap)
- public/steering_data.json (14 pairs × 200 positions × 4 clamps mock)
- scripts/generate_steering_data.py (regenerates the JSON)
- src/Preview.jsx trimmed to 2 tabs, no more state-heavy local logic
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Final UX pass on the SteeringDemo:
- Feature catalog trimmed to the two AMR features (kanamycin_resistance, streptomycin_resistance); non-AMR features removed since this demo specifically reproduces the Hutchinson 2025 A1408G headline
- "Feature to steer" is a dropdown picking the primary feature
- "Also clamp" checkboxes let users co-clamp the other AMR feature alongside the primary; clamp slider applies to all selected
- Neighbors-clamped buttons extended from 0/1/2 to 0/1/2/3/4
- Selectivity table + narrative callout removed earlier in the same iteration; just the dropdown + co-clamp + bar comparison stays
JSON updated: comparisons now cover all 4 seeds × 2 AMR features (8 pairs total). Non-AMR seeds (promoter / brca1_exon / random) show null-result distributions, demonstrating that AMR features don't shift predictions where they have no biological purchase.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- add Steering mode toggle (Position-restricted / Global all positions)
- global mode smears the per-position bar chart toward a low-confidence distribution scaled by |clamp|, swaps the FLIPPED badge for "no clean flip - degraded"
- add SequenceStrip showing baseline vs steered argmax across the whole seed sequence with flipped positions highlighted; only renders in global mode
- remove author names / paper titles / external model labels from UI copy, banner, tab label, and code comments
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
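The global-mode behaviour can be approximated as a blend toward the uniform distribution with a weight that grows with |clamp|. The sketch below is one plausible reading, not the component's code: the linear blend, the `max_clamp` of 10, the `^`/`.` strip markers, and the function names are assumptions.

```python
BASES = "ACGT"


def smear(dist, clamp, max_clamp=10.0):
    """Blend dist toward uniform with weight |clamp| / max_clamp, capped at 1."""
    w = min(abs(clamp) / max_clamp, 1.0)
    return {b: (1.0 - w) * dist[b] + w * 0.25 for b in BASES}


def argmax_strip(baseline_seq, steered_seq):
    """Mark positions where the steered argmax differs from the baseline."""
    return "".join("^" if a != b else "." for a, b in zip(baseline_seq, steered_seq))


dist = {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05}
for clamp in (0.0, 5.0, 10.0):
    print(clamp, smear(dist, clamp))
print(argmax_strip("ACGTACGT", "ACTTACGA"))
```

Because the blend is a convex combination of two probability distributions, the result still sums to 1 at every clamp value, and the sign of the clamp does not matter.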
How to use
Prerequisites — install Node.js + npm
Check if you already have it:
node --version && npm --version
If either prints "command not found", install via one of:
Ubuntu / Debian / dev containers (most Lepton-style pods):
macOS (Homebrew):
Any OS, version-manager route (nvm, recommended if you juggle Node versions):
You need Node 18+; this project is tested on Node 20.
Run the dashboard
Then open http://localhost:5176/#preview. The `#preview` hash route surfaces the tabbed mockup; the bare `/` URL still renders the unchanged main dashboard.
This PR ships a demo-only visualization shell at `bionemo-recipes/interpretability/sparse_autoencoders/recipes/evo2/evo2_dashboard_mockup/`. There is no real SAE inference involved. Everything you see in the dashboard is generated by `scripts/make_mockup_features.py` from a fixed seed.
A yellow `MOCKUP — synthetic data, not from a real SAE run` banner is rendered at the top of every page so nobody mistakes it for real model output.
Summary
This branch contains a multi-tab visualization mockup for the evo2 SAE feature explorer. The latest commit adds:
Earlier commits ship the SAE summary table, feature catalog, UMAP atlas, gene-feature G.bin matrix, and the first version of the steering tab.
Test plan
`npm run dev` starts cleanly, no console errors at `#preview`
🤖 Generated with Claude Code