* Add skeleton untargeted pipeline directory with plan
* fix flake8 errors in generate chemicals script
* Add lint/test guidelines and minimal untargeted test
* Implement dataset setup script
* Add mzML generation step
* Tune chromatogram width
* Add ground truth table generation
* Add MGF export for simulated chemicals
* docs: mark simulation step complete
* Add metric reporting helper to pipeline
* Add linear column RT noise option
* Add RT drift models and use in demo pipeline
* Refactor drift models into main package
* Clean up drift and demo scripts
* Refactor output handling and add get_peak_data
* Refactor pipeline into classes
This PR adds a demo that uses vimms to generate synthetic data, so we can test the performance of a typical untargeted data processing pipeline in LC/MS metabolomics.
We generate chemicals under various experimental setups, then write their mzML files together with the corresponding ground truth. The mzML files are then passed into a pipeline that tries to infer the chemical identities back from the spectral data. Because the data is synthetic, we can easily evaluate pipeline performance for all these steps:
Still a work in progress, do not merge yet!
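To make the evaluation idea above concrete, here is a rough sketch of how detected features could be scored against the ground truth. This is not the code in this PR: `Feature`, `match_features`, `precision_recall`, and the default tolerances (10 ppm, 5 s) are hypothetical names and values picked for illustration, and the greedy matching is just one simple option.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Hypothetical minimal feature record: m/z and retention time (seconds)."""
    mz: float
    rt: float


def ppm_error(observed_mz: float, true_mz: float) -> float:
    """Absolute mass error in parts per million relative to the true m/z."""
    return abs(observed_mz - true_mz) / true_mz * 1e6


def match_features(truth, detected, ppm_tol=10.0, rt_tol=5.0):
    """Greedy one-to-one matching of ground-truth features to detected ones.

    A detected feature is a candidate if it lies within ppm_tol of the true
    m/z and within rt_tol seconds of the true RT. Among candidates, the one
    closest in RT is taken. Returns (truth_index, detected_index) pairs.
    """
    unused = set(range(len(detected)))
    matches = []
    for t_idx, t in enumerate(truth):
        candidates = [
            d for d in unused
            if ppm_error(detected[d].mz, t.mz) <= ppm_tol
            and abs(detected[d].rt - t.rt) <= rt_tol
        ]
        if candidates:
            best = min(candidates, key=lambda d: abs(detected[d].rt - t.rt))
            unused.remove(best)
            matches.append((t_idx, best))
    return matches


def precision_recall(truth, detected, ppm_tol=10.0, rt_tol=5.0):
    """Precision and recall of the detected features against the ground truth."""
    tp = len(match_features(truth, detected, ppm_tol, rt_tol))
    precision = tp / len(detected) if detected else 0.0
    recall = tp / len(truth) if truth else 0.0
    return precision, recall
```

Because the simulator knows exactly which chemicals went into each mzML file, the `truth` list can be built directly from the ground truth table, and the same scoring can be repeated after each pipeline step.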