Merged
52 changes: 52 additions & 0 deletions microBioRust/new_output_embl.gbk
@@ -0,0 +1,52 @@
LOCUS source_1 928 bp DNA linear CON 24-NOV-2025
DEFINITION Escherichia coli K-12 substr. MG1655.
ACCESSION source_1
KEYWORDS .
SOURCE Escherichia coli K-12 substr. MG1655
ORGANISM Escherichia coli K-12 substr. MG1655
FEATURES Location/Qualifiers
source 1..910
/organism="K-12 substr. MG1655"
/mol_type="DNA"
/strain="K-12 substr. MG1655"
/db_xref="PRJNA57779"
gene complement(1..354)
/locus_tag="b3304"
CDS complement(1..354)
/locus_tag="b3304"
/codon_start="1"
/gene="rplR"
/translation="MDKKSARIRRATRARRKLQELGATRLVVHRTPRHIYAQVIAPNGS
LVAASTVEKAIAEQLKYTGNKDAAAAVGKAVAERALEKGIKDVSFDRSGFQYHGRVQAL
DAAREAGLQ"
/product="50S ribosomal subunit protein L18"
gene complement(364..897)
/locus_tag="b3305"
CDS complement(364..897)
/locus_tag="b3305"
/codon_start="1"
/gene="rplF"
/translation="MSRVAKAPVVVPAGVDVKINGQVITIKGKNGELTRTLNDAVEVKH
NTLTFGPRDGYADGWAQAGTARALLNSMVIGVTEGFTKKLQLVGVGYRAAVKGNVINLS
GFSHPVDHQLPAGITAECPTQTEIVLKGADKQVIGQVAADLRAYRRPEPYKGKGVRYAD
VVRTKEAKK"
/product="50S ribosomal subunit protein L6"
ORIGIN
1 acctctacct tagaactgaa ggccagcttc acgggcagca tctgccagtg cctggacacg
61 accatgatat tggaacccgg aacggtcaaa ggatacatct ttgatgcctt tttccagagc
121 gcgttcagcg acagctttac ccacagctgc agccgcgtct ttgttaccgg tgtacttcag
181 ttgttcagcg atagcttttt ctacagtaga agcagctacc agaacttcag aaccgttcgg
241 tgcaattacc tgtgcgtaaa tgtgacgcgg ggtacgatgt accaccaggc gagttgcgcc
301 cagctcctgg agcttgcggc gtgcgcgggt cgcacgacgg atacgagcag atttcttatc
361 catagtgtta ccttacttct tcttagcctc tttggtacgc acgacttcgt cggcgtaacg
421 aacacccttg cctttataag gctcaggacg acggtaggcg cgcagatccg ctgcaacctg
481 gccgatcacc tgcttatcag cgcctttcag cacgatttca gtctgagtcg gacattcagc
541 agtgataccc gcaggcagct gatggtcaac aggatgagag aaacccagag acaggttaat
601 cacattgcct ttaaccgctg cacggtaacc tacaccaacc agctgcagct tcttagtgaa
661 gccttcggta acaccgataa ccattgagtt cagcagggca cgcgcggtac cagcctgtgc
721 ccaaccgtct gcgtaaccat cacgcggacc gaaggtcagg gtattatctg catgtttaac
781 ttcaacagca tcgttgagag tacgagtcag ctcgccgttt ttacctttga tcgtaataac
841 ctgaccgttg atttttacgt caacgccggc aggaacaacg accggtgctt tagcaacacg
901 agacattttt tcc

//
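The feature locations in the record above (`complement(1..354)`, `complement(364..897)`) map onto the start, end and strand columns of the GFF output. A minimal sketch of that mapping follows, assuming only plain and `complement(...)` spans need handling. The function name `parse_simple_location` is hypothetical and is not part of microBioRust; joins, fuzzy bounds and remote references are deliberately left out.

```rust
/// Parse a simple GenBank location such as `364..897` or
/// `complement(1..354)` into (start, end, is_reverse_strand).
/// Hypothetical helper for illustration only: joins, fuzzy bounds
/// (`<`, `>`) and remote references are not handled and yield `None`.
fn parse_simple_location(loc: &str) -> Option<(u32, u32, bool)> {
    let loc = loc.trim();
    // Peel off the complement(...) wrapper if present.
    let (span, reverse) = match loc.strip_prefix("complement(") {
        Some(rest) => (rest.strip_suffix(')')?, true),
        None => (loc, false),
    };
    // A simple span is two 1-based, inclusive coordinates separated by "..".
    let (start, end) = span.split_once("..")?;
    let start: u32 = start.parse().ok()?;
    let end: u32 = end.parse().ok()?;
    if start == 0 || start > end {
        return None;
    }
    Some((start, end, reverse))
}
```

Returning `Option` rather than panicking keeps unsupported location syntax visible to the caller instead of silently producing wrong coordinates.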
21 changes: 21 additions & 0 deletions microBioRust/new_output_embl.gff
@@ -0,0 +1,21 @@
##gff-version 3
##sequence-region source_1 1 910
source_1 . CDS 1 354 0 - 0 id=b3304;name=source_1;gene=rplR;locus_tag=b3304;product=50S ribosomal subunit protein L18
source_1 . CDS 364 897 0 - 0 id=b3305;name=source_1;gene=rplF;locus_tag=b3305;product=50S ribosomal subunit protein L6
##FASTA
acctctaccttagaactgaaggccagcttcacgggcagcatctgccagtgcctggacacg
accatgatattggaacccggaacggtcaaaggatacatctttgatgcctttttccagagc
gcgttcagcgacagctttacccacagctgcagccgcgtctttgttaccggtgtacttcag
ttgttcagcgatagctttttctacagtagaagcagctaccagaacttcagaaccgttcgg
tgcaattacctgtgcgtaaatgtgacgcggggtacgatgtaccaccaggcgagttgcgcc
cagctcctggagcttgcggcgtgcgcgggtcgcacgacggatacgagcagatttcttatc
catagtgttaccttacttcttcttagcctctttggtacgcacgacttcgtcggcgtaacg
aacacccttgcctttataaggctcaggacgacggtaggcgcgcagatccgctgcaacctg
gccgatcacctgcttatcagcgcctttcagcacgatttcagtctgagtcggacattcagc
agtgatacccgcaggcagctgatggtcaacaggatgagagaaacccagagacaggttaat
cacattgcctttaaccgctgcacggtaacctacaccaaccagctgcagcttcttagtgaa
gccttcggtaacaccgataaccattgagttcagcagggcacgcgcggtaccagcctgtgc
ccaaccgtctgcgtaaccatcacgcggaccgaaggtcagggtattatctgcatgtttaac
ttcaacagcatcgttgagagtacgagtcagctcgccgtttttacctttgatcgtaataac
ctgaccgttgatttttacgtcaacgccggcaggaacaacgaccggtgctttagcaacacg
agacattttttcc
1 change: 1 addition & 0 deletions microBioRust/src/gbk.rs
@@ -1,6 +1,7 @@
//! # A Genbank to GFF parser
//!
//!
//!
//! You can parse GenBank files and save them in GFF (gff3) format, and extract DNA sequences, gene DNA sequences (ffn) and protein FASTA sequences (faa)
//!
//! You can also create new records and save them in GenBank (gbk) format
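To illustrate the GenBank-to-GFF step this module describes, here is a minimal sketch that formats one CDS feature as a nine-column GFF3 line. It mirrors the attribute order seen in the test output files of this change (`id`, `name`, optional `gene`, `locus_tag`, `product`, with score and phase written as `0`). The `CdsFeature` struct and `gff_line` function are hypothetical names for illustration, not the microBioRust API, and tab separation is assumed because GFF3 requires it.

```rust
/// Hypothetical, minimal description of one CDS feature. This is not a
/// microBioRust type; it holds just enough fields to reproduce a line
/// like those in the test output files.
struct CdsFeature<'a> {
    seqid: &'a str,
    start: u32,
    end: u32,
    reverse: bool,
    locus_tag: &'a str,
    gene: Option<&'a str>,
    product: &'a str,
}

/// Format one feature as a tab-separated, nine-column GFF3 line.
/// Assumption: score and phase are written as 0 and attribute keys are
/// lowercase, mirroring the files in this diff.
fn gff_line(f: &CdsFeature) -> String {
    let strand = if f.reverse { '-' } else { '+' };
    let mut attrs = vec![
        format!("id={}", f.locus_tag),
        format!("name={}", f.seqid),
    ];
    // The gene qualifier is optional; skip it when absent.
    if let Some(gene) = f.gene {
        attrs.push(format!("gene={}", gene));
    }
    attrs.push(format!("locus_tag={}", f.locus_tag));
    attrs.push(format!("product={}", f.product));
    format!(
        "{}\t.\tCDS\t{}\t{}\t0\t{}\t0\t{}",
        f.seqid,
        f.start,
        f.end,
        strand,
        attrs.join(";")
    )
}
```

Modelling `gene` as an `Option` covers features such as `pRL80004` in the EMBL test output, which carry a locus tag and product but no gene name.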
6 changes: 6 additions & 0 deletions microBioRust/test_output.gff
@@ -0,0 +1,6 @@
##gff-version 3
##sequence-region source_NC_000913_1 1 913
source_NC_000913_1 . CDS 10 363 0 - 0 id=b3304;name=source_NC_000913_1;gene=rplR;locus_tag=b3304;product=50S ribosomal subunit protein L18
source_NC_000913_1 . CDS 373 906 0 - 0 id=b3305;name=source_NC_000913_1;gene=rplF;locus_tag=b3305;product=50S ribosomal subunit protein L6
##FASTA
acctctaccttagaactgaaggccagcttcacgggcagcatctgccagtgcctggacacgaccatgatattggaacccggaacggtcaaaggatacatctttgatgcctttttccagagcgcgttcagcgacagctttacccacagctgcagccgcgtctttgttaccggtgtacttcagttgttcagcgatagctttttctacagtagaagcagctaccagaacttcagaaccgttcggtgcaattacctgtgcgtaaatgtgacgcggggtacgatgtaccaccaggcgagttgcgcccagctcctggagcttgcggcgtgcgcgggtcgcacgacggatacgagcagatttcttatccatagtgttaccttacttcttcttagcctctttggtacgcacgacttcgtcggcgtaacgaacacccttgcctttataaggctcaggacgacggtaggcgcgcagatccgctgcaacctggccgatcacctgcttatcagcgcctttcagcacgatttcagtctgagtcggacattcagcagtgatacccgcaggcagctgatggtcaacaggatgagagaaacccagagacaggttaatcacattgcctttaaccgctgcacggtaacctacaccaaccagctgcagcttcttagtgaagccttcggtaacaccgataaccattgagttcagcagggcacgcgcggtaccagcctgtgcccaaccgtctgcgtaaccatcacgcggaccgaaggtcagggtattatctgcatgtttaacttcaacagcatcgttgagagtacgagtcagctcgccgtttttacctttgatcgtaataacctgaccgttgatttttacgtcaacgccggcaggaacaacgaccggtgctttagcaacacgagacattttttcc
8 changes: 8 additions & 0 deletions microBioRust/test_output_embl.gff
@@ -0,0 +1,8 @@
##gff-version 3
##sequence-region source_AM236082_1 1 6666
source_AM236082_1 . CDS 1 1197 0 + 0 id=pRL80001;name=source_AM236082_1;gene=repAp8;locus_tag=pRL80001;product=replication protein RepA
source_AM236082_1 . CDS 1321 2280 0 + 0 id=pRL80002;name=source_AM236082_1;gene=repBp8;locus_tag=pRL80002;product=replication protein RepB
source_AM236082_1 . CDS 2455 3672 0 + 0 id=pRL80003;name=source_AM236082_1;gene=repCp8;locus_tag=pRL80003;product=replication RepC protein
source_AM236082_1 . CDS 3811 6666 0 + 0 id=pRL80004;name=source_AM236082_1;locus_tag=pRL80004;product=hypothetical protein
##FASTA
gtggagaatcccgctcagcttcagaaggctattcataaactgatagcggcccacgcgcgagatctctcgggcgcgcttcacgagcatcgtgtgaagctttatccgcctgaagctcgaaagacgcttcggtcattttcgtcgatagaggctgcgaagctcattggcgtcaacgatggctatctccgccatctttcgctcgagggtaaggggccgcagcctgagatcggaaataacaatcgccgttcgtattcggtcgagactattcaggcgctccgcgagtatctcgacgagaacggcaagggtgaccgtcggtactcaccacgccggagcggtcgtgagcatttgcaggttataaccgcagtgaacttcaagggaggcagcggtaagaccacgacggctgctcatcttgctcagtatcttgcgcttaatggataccgggttcttgcgattgatcttgatccgcaggccagcatgtccgctttgcacggattccagcctgagtttgacgttggcgacaacgaaacgctctacggcgccgttcgttatgatgaagagcggcgcccgctgaaggatataatcaagaaaacctactttgcgaaccttgatctcgttccgggcaacctcgagcttatggaattcgagcacgacaccgctaaagtgctcggctctaacgaccgcaagaacatcttcttcacgcgaatggatgacgcaatcgcgtcagtggcggacgactatgacgttgtcgtcgtcgactgccctccccagctcggctttctgacgatctcggctctatgcgcggcaaccgccgttcttgttactgtacatcctcagatgctcgatgtgatgtcgatgtgccagtttctgctgatgacctcagaacttctgagcgtcgttgcggatgctggcgggagcatgaactacgattggatgcgttatctcgttacgcgctacgagccgggagacggaccgcaaaaccagatggtgtcgttcatgcgcacgatgtttggcgaccatgtcctgaaccacccgatgctcaagagcacagccatttcagacgcggggattactaagcagactctctatgaggtgagccgcgaccagttcacgcgagcaacatacgaccgagccatggaatcgctcgacaacgtgaacagcgaaatcgaacaactcattcaatcatcttggggtcgcaaatgatggctctagagatctcagaaaacgcgacattgatggagaagttgccagccggaaacttttcggaatttgcactctctatgtcgaggaatccggcttgtcacgagtacctcaggggaaagcaagatggctagaaaacacctcctttcagatttgaaagctcctgcttcatcatctacggagttcgatgaagctagggctgcagacgtccctactccgcagtatgcgcctcgaggtgcaatcggtgccgtctcgcgatcgattgaagctttgaagtcgcagggactgagtgaactcgatcccgaactgatagatgcgccgtccgttactgatcgccttgatgaggatggggctcagtttgaggagttcgctcgcaacatccgtgagaatgggcagcaggttccgattcttgtccggcctcacccgaccgtggaaggacggtatcagattgcctacggccggagacggttgagagcggtcaaggcggccggcctcaaggtcaaagccgcaatcagaaatctgacagatgacgagcttgtactggcgcaaggtcaggaaaacagcgcgcgtcaggatctgtcgtttatcgagcgggcgctctatgcagcccagctcgaagcgagtggctaccagcgtcccgtcatcatggcagcgctggctgtcgacaaaagtaacctttcgcggttgattcaggctgcgacccaattgccggacgacgtcatccgactaattggtgctgcgcctaagaccggccgtgatcgctggtacgagctatcatcgcggttggctgcagaaggtgctgcgga
gaaggcgcgcgctcttctttcgactagcgaggttggctccctgggttctgatgagcgatttgttcgcgttttcgacgcggttgcgccgaagaaatctaagaaggaaaaagttcaggcggatgtctggcaagctgacgatggggtcaaggctgcgagtttccgccaggacaaacgaacactgacattgatgatcgacaagaaggcagcgccggaattcggtgagtacctgatgtcggctctccccgagatctacgcttcgttcaagaagtcgaagcaatagatgagtcgtaacgaagaaaggtgccgatagcgcaaagaaaaagccctccgaaacggtgttccagaaggcctctctcagtttggtcgcttagagaatcgcatttcccggaatcacagtcaagagtcaacgccacaccggcgtagccttttctttgccttgcgaaaggtgaaggacatggaaacgggttatatcacgacgccctttgggcggcggccgatgacgcttgctctggtgaagcgtcaggttaagaccgagcaggcaatagcggatggctcggtcgacaagtggcgcgtgtttcgcgacataagcgacgcccgctcacgccttggccttcaagatcgagccttggcggtcttgaatgcacttttaacattcttcccagttgctgaactcagcaatgagaggaacctggtcgtctttccatcaaatgctcagctatcagcccgcacaaacggtatcgctgggacaactctgcgcaagtgcctcggttcgctggtggaggccggtgtaatcatccgcaaggatagccctaacggtaagcgatatgctcgaaaaggcaaagaaggaaacatagaggacgcctacggcttcagtctggcaccgcttcttgcgcgcgccggcgagtttgctagcctcgcccaagacgtggctgctgaacagcgccgcttccgcatcacgaaagaccgcctcacgatcgttcggcgagatgtccgcaagctgatcaccgtcgggatggaagagaaccttgccggcgattggattgccgcggaaacgtgctttgtcgagattgtgggaaggttcgttcggcacccgacgctccaggacctgatttcgagcctcgacgagatgagccttcttcacgaagaagtctccaggatgctggaaattaaagaagaaaccgcaaaaagtgatggcaatgccatcccggacggatgccacatacagaattcaaataccgaatcctgccatgaacttgaaccccgctccgaaaagaagcagggcgaaaagtccgagccaaacaagaaaacggagcggaaagacgaaccggaagcgtttccgttgtccatggtgttgcgtgcctgcccggagatcaacgcatttggccctggtggatcgattggaagctggcgcgaaatgatgtcagcggcggtaacggttcggtccatgcttggcgtcagcccctctgcctatcaggaggcatgcgaggtgatggggcaggccggagcggcgatagcaatagcttgcatttaccagcgtggcgggcacatcaactcggcggggggatatcttcgggatctaacggggaaggcgcggcgaggggagttttcacttgggccaatgctgtttacgcaattgcgggcgaactcgggcaccgtcaaggcgtcagcgtaggtcaaagtatcatgattgtttagcctaaccggttgaactaattaacctattttgactagtttccggctggcaactttatctcgatctaaagcgtcgagtgaatggcagaagataatcttcctgatgggcgtccgtataatgaccgaaattgtgcttccgaccgaaaacacgatcatcgcggcagccaaaaaacttgacgcggccgcatcgcagctggtggcagagacgttctttgccattcggcatgggatgtcaatcaatccaattggtcgcaacccggatgggcagaccatcaagggataccctgacattactgggcgggtgccgg
gtgagaagaagtacctgatcgaagtcacgaaggacgactggcgcacacatcttcagagcgatctatcaaaactgtcccgcctgcagaaaggagcctacgcgggtttcctacttctctgcttccgaaagtccgagtccgaactcactcaaagcaacaggaagaaggcacgggaaaccgtccagcaggccgagagccggattgaaaagcttttgggtgtccaggcaggacaggtagaattcgtctttcttggcgagttcgcgcgtgaggtcagatcggcgaaataccaccgcgtattgctggctctgggtctcgagcttgtgccagcgccattctacacggatttgcgcttcgtgcagggcttagccgatttcgtaccgaccgctgaggaatatgaggctgagagtgttgttcctcgcgatgaggtaagccggacctatgagcgggtcttcaaaaacagactaacgttgatcgaaggcgagggcggtagcggcaaaacaagcctggccctagccgttgcgacggagcatcggaagcaaggcgagatctttctgttcttagacgcctctgtcgctgactggaagagcggttcggagcgagctcgcctcgttgacgtagcggcgatgttcgcggaatcgaatgtcctgattatattggacaacgtacatctgggcgatgcgtccggcatttctgaactgattacaaatgtccaggcgtccggttatgatttccgctttttgatgacgacgcgcagcagcgacgaagttgaacaatggaagcgcctgggaaatatcgagcttctccgcagagttccgtctggagccgatgtcaactctgcctatcaccgcctgctcactcaaaagtttcccggaagcagtttcaacgatattcccccagcggtgaccacacgatggtcaaatcaaattcccaatctggttattctcacgcttgctcttgaaggtctcacaaagagaggcggctatgatcgcgattgggcgatcaaggttgaggacgcaggcacataccttcaagctaagttcatctcgaagctgtcgtccgacgacgtcaaacaggtgggcaagatcgctgcgctctcacttctggaaattcccacctcgctcaggtcgctcgaccaccgggttccaaagtctgctgtggatctgggcttcgttcgtctgaactcgagttcaacaactcagcgatatgagctcgttcaccacgaactgggcaagctgatcacgtccttcaaagatccggatatcaaggcgcggctgggagaggtgatgtccgctgatcccttccaggcaacatatatcgggctgaagcttatcggaaacggagaagccagcctggcaaaggaattgttgtcgtcagtcctttctcaatcactcacactctcgccagatttctcgatgggaaactccggcggagtcttcggtatcctggtccagtccaacgtgactacctatcccgaaattgagcgtatccttcttcctgatatcggcgcctttttcgatacaaagccggatattgtaaccggccttagctccttcctcggggctgcctccgaaaacatggagcgcgtatacaatgccattgtggaaaaacttgccgaacaggaaacgattcgacggatcgaagagcttctcccatccgtcggcccgacgactttcgcgacactttaccgatgcgcgaactcacggaacctcccgtttctttcaacgcttcgaaaatatctcaacagagggaagcgtatagattcctttgcctatcgatgcaggtctgaaagtccgagtaaggtcgagatctgctggggcctgattgatgagttctttccacaccacaaggcccggtttgaagttgtgcttcgctctgccctcgccgagggatacatcgagcgccttatcccggaagagcttattgagtctcgctcttcaagggctgttcagacggcgatccgatgcgcaaatagcgaagttttcaaacggtacatcacgttccgt
gactgcagcgacgcgacgctgttgcttctggcccacacgatgcacgacatgggcaggaatgatctctcggaggtcgcagctgaccgagttgcaggcaggacgacctcttcaatctggtatcatcgtcgcaccggtggcagggcgttgctgactattttgcggagagcatcgatatctgcagaaggagatgttcagaaaattctgatgcggcttgaggctgaaggaaaaatgagggccattgtgaatggaatgcggccttatcgcctagcgaattttattttcgtgatctgggatcggcacgagcaatttacttcattcatctcgaagacagatcttcaggaaattacaaaccgccggttcaaagcgcgagcggcagagttctctgaagagcgacaagcgtccatctacattgcaggaatctatgcgctggtaggcctcgacataccgcgggacgagtggagcgcggtcgacgtcactgaagacgatttcattggaaaccagaacaacccggtcttctggatcggtctcaaggctctggaagaaaatggcatgatacgccttgcccatcgaagcagatttccgacatctgtcgcggcgctagatactcattcggaaaacaccagccggatcatgaacgatttgaaaaactgggctgcgaccaggtaa