Proposed upgrade
Target: Punkst, a C++ rewrite of the FICTURE algorithm by Yichen Si (FICTURE author).
Distribution: Docker image philo1984/punkst:latest (project notes "the Docker image is not always up to date"), or CMake from source.
Project state: 99.4% C++, 164 commits at audit time. No formal release tags yet.
Inputs: project mentions "http(s) / s3:// input support". Direct parquet support TBD — needs investigation.
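The open input question can be framed as a small check, sketched here in Python. Only the http(s) and s3:// location kinds come from the project note; the function names, the `punkst_accepts_parquet` flag, and the extension-based format guess are hypothetical placeholders for whatever the upstream investigation finds.

```python
from pathlib import PurePosixPath
from typing import Tuple
from urllib.parse import urlparse


def classify_input(uri: str) -> Tuple[str, str]:
    """Return (location, file_format) for a transcripts path or URL."""
    parsed = urlparse(uri)
    scheme = parsed.scheme.lower()
    if scheme in ("http", "https"):
        location = "http(s)"
    elif scheme == "s3":
        location = "s3"
    else:
        location = "local"

    # For remote inputs use the URL path; for local inputs use the string as given.
    path = parsed.path if location != "local" else uri
    suffixes = [s.lower() for s in PurePosixPath(path).suffixes]
    # Ignore a trailing compression suffix such as .gz (assumed heuristic).
    if suffixes and suffixes[-1] == ".gz":
        suffixes = suffixes[:-1]

    known = {".parquet": "parquet", ".csv": "csv", ".tsv": "tsv"}
    file_format = known.get(suffixes[-1], "unknown") if suffixes else "unknown"
    return location, file_format


def needs_parquet_to_csv(uri: str, punkst_accepts_parquet: bool) -> bool:
    """True if a conversion step is still required before preprocessing.

    punkst_accepts_parquet is a hypothetical flag; its real value is the
    outcome of the investigation, not something known today.
    """
    _, file_format = classify_input(uri)
    return file_format == "parquet" and not punkst_accepts_parquet
```

If direct parquet support turns out to be missing, `needs_parquet_to_csv(path, False)` stays true for parquet inputs and the existing conversion step remains in place.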
Why upgrade
C++ rewrite — "substantially more efficient and (hopefully) easier to use" per project description. Matches our Atera scale concerns.
Potentially eliminates PARQUET_TO_CSV — if Punkst accepts parquet directly, we can drop the conversion step in FICTURE_PREPROCESS_MODEL. (TBD per investigation.)
Smaller container surface — Python ficture pulls a heavy dependency tree; C++ static binary is much leaner.
Active maintenance — Punkst is where FICTURE development is moving.
Migration plan
Confirm whether Punkst accepts parquet, or which input formats it does support (open question for upstream)
ficture_preprocess.py CLI to Punkst CLI (negative_control_regex, transcripts path, features, min_phred_score)
ficture/preprocess module container + invocation
PARQUET_TO_CSV in FICTURE_PREPROCESS_MODEL subworkflow (if Punkst takes parquet)
Risks
philo1984/punkst:latest "is not always up to date"
FICTURE process consumes preprocess outputs
Cross-links
PARQUET_TO_CSV entirely.
Current state
ficture package via Wave container
community.wave.seqera.io/library/pip_ficture:ad8a1265a51b53cf
pip install, containerised
ficture/preprocess, ficture/model
segfree (when --method ficture)
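For the CLI mapping item in the migration plan, a small tracking table may help keep the work honest. The sketch below is Python and assumes nothing about Punkst's actual options: the `ParamMapping` class, the `punkst_option` field, and the `unmapped` helper are hypothetical, and every Punkst option is deliberately left empty until it is confirmed against the upstream documentation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ParamMapping:
    current_name: str                    # parameter passed to ficture_preprocess.py today
    punkst_option: Optional[str] = None  # stays None until confirmed upstream
    note: str = ""


# The four parameters named in the migration plan. No Punkst option is filled
# in here because none has been confirmed yet.
MAPPINGS = [
    ParamMapping("negative_control_regex"),
    ParamMapping("transcripts path"),
    ParamMapping("features"),
    ParamMapping("min_phred_score"),
]


def unmapped(mappings: List[ParamMapping]) -> List[str]:
    """Names of current parameters that still lack a confirmed Punkst option."""
    return [m.current_name for m in mappings if m.punkst_option is None]


if __name__ == "__main__":
    for name in unmapped(MAPPINGS):
        print(f"still to map: {name}")
```

The migration item can be considered done once `unmapped(MAPPINGS)` returns an empty list.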