@rostro36

Dear EvoDiff authors,
Thank you for making the EvoDiff code available and for showing the many facets of single-sequence and MSA generation.
I had a closer look at generating MSAs from a query sequence and made some adjustments that I would like to contribute.

Changes

  • README.md
    • Fixed a wrong filename, expanded the description, and described the new input file and output folder options for generate_msa.py
  • evodiff/data.py
    • Added a flag to choose between the special OpenFold treatment and simpler handling for custom a3m files
    • Added support for a single file instead of a whole folder
  • evodiff/generate_msa.py
    • Added an output_fpath argument for the output folder
    • Added support for a custom input path
    • Added the same choice between the special OpenFold treatment and custom a3m handling
  • data/openfold/test.a3m
    • A small test a3m file

I tried to keep the code as close to the original as possible, apart from making more use of pathlib.Path. The defaults are unchanged and should still give the same output.
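
The idea of accepting either a single file or a whole folder can be sketched roughly as follows. This is a simplified illustration, not the code in this PR: the helper name `collect_a3m_files` is hypothetical, and the recursive search for `.a3m` files is an assumption about how a folder might be scanned.

```python
from pathlib import Path


def collect_a3m_files(input_path):
    """Return a sorted list of .a3m files for a file path or a folder path.

    Hypothetical helper for illustration only; it is not part of EvoDiff.
    """
    path = Path(input_path)
    if path.is_file():
        # A single custom MSA file was given.
        return [path]
    if path.is_dir():
        # A folder was given: search it recursively for .a3m files.
        return sorted(path.rglob("*.a3m"))
    raise FileNotFoundError(f"No such file or directory: {path}")
```

With a helper like this, the rest of the script can loop over the returned list without caring whether the user passed one file or a directory.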

Tests

I ran it on my system with --start-query and otherwise default parameters, changing only the input, output, and batch size because I have not downloaded the whole OpenFold database. Every run produced output, and I could follow the individual diffusion steps through to a final MSA. As expected, these MSAs fell short of the results in the pre-print, whose experiments the models were built for and which had much more effort put into them.

Please let me know if you have any feedback or if further changes are needed :)

@rostro36

@microsoft-github-policy-service agree
