This GitHub repository contains scripts and data for the assembly of the Streptanthus diversifolius genome.
A manuscript describing this assembly is available on bioRxiv.
The raw sequencing and final assembly files can be found at NCBI under accession number PRJNA283414.
The final assembly files can also be found in this repository in the HiFi_assemblies directory. Specifically:
S.div_Chr_Ordered.fasta.gz is a fasta file containing the fourteen telomere-to-telomere (T2T) chromosomes. The full assembly (with unplaced contigs) is too large for GitHub but is available on NCBI.
S.div_Chr_Ordered_liftoffv1.gff.gz is the annotation gff for the fourteen T2T chromosomes.
S.div_liftoffv1.gff.gz is the annotation gff for the chromosomes and additional contigs.
S.div_Chr_Ordered.proteins.fasta.gz and S.div_Chr_Ordered.transcripts.fasta.gz are fasta files of the predicted proteins and transcripts, respectively.
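As a quick sanity check after downloading, the gzipped fasta files listed above can be read without decompressing them first. The following is a minimal sketch, not part of the repository's scripts, using only the Python standard library; the function name fasta_lengths is hypothetical and chosen for illustration.

```python
import gzip


def fasta_lengths(path):
    """Return {record_name: sequence_length} for a gzip-compressed fasta file.

    Hypothetical helper for illustration; it is not one of the repository's scripts.
    """
    lengths = {}
    name = None
    # "rt" opens the gzip stream in text mode so lines arrive as str
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # Use the first whitespace-delimited token of the header as the name
                parts = line[1:].split()
                name = parts[0] if parts else ""
                lengths[name] = 0
            elif name is not None:
                # Sequence lines may be wrapped, so accumulate their lengths
                lengths[name] += len(line)
    return lengths
```

Calling fasta_lengths on S.div_Chr_Ordered.fasta.gz (adjust the path to wherever the file was saved) should return one entry per chromosome, so fourteen entries if the file matches the description above.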
Cafe_analysis is for the gene family expansion and contraction analysis.
WGD_identification is for the Ks analysis.
Within the Hifi_assemblies folder:
C_amplexicaulis_S_pinnata_comparisons contains the homoeolog retention/loss analysis and alluvial plots.
COGE_FracBias is for the fractionation bias analysis.