Download script - NCBI fasta - delete rubbish files from download  #3

@julie-sullivan

Here are the files we do not want:

rm hs_ref_GRCh38.p12_unlocalized.fa hs_ref_GRCh38.p12_unlocalized.mfa hs_ref_GRCh38.p12_unplaced.fa hs_ref_GRCh38.p12_unplaced.mfa

Here's the error the FASTA loader throws when these files are present:

Caused by: java.lang.RuntimeException: Couldn't find chromosome identifier ref|NT_187396.1| Homo sapiens unplaced genomic scaffold, GRCh38.p12 Primary Assembly HSCHRUN_RANDOM_100
        at org.intermine.bio.dataconversion.NCBIFastaLoaderTask.getIdentifier(NCBIFastaLoaderTask.java:53)
        at org.intermine.bio.dataconversion.FastaLoaderTask.processSequence(FastaLoaderTask.java:320)
        at org.intermine.bio.dataconversion.FastaLoaderTask.processFile(FastaLoaderTask.java:240)
        at org.intermine.task.FileDirectDataLoaderTask.process(FileDirectDataLoaderTask.java:50)
        at org.intermine.bio.dataconversion.FastaLoaderTask.process(FastaLoaderTask.java:170)
        at org.intermine.task.DirectDataLoaderTask.execute(DirectDataLoaderTask.java:168)
        at org.intermine.bio.dataconversion.FastaLoaderTask.execute(FastaLoaderTask.java:215)
        at org.apache.tools.ant.UnknownElement.execute(UnknownElement.java:293)
        at org.apache.tools.ant.dispatch.DispatchUtils.execute(DispatchUtils.java:106)
        ... 38 more

We don't want unplaced or unlocalized scaffolds, so the download script should delete these files after download.
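A minimal sketch of what that cleanup step could look like, assuming the download script can call a shell helper once the FASTA files are in place. The function name `remove_unplaced_fasta` and its directory argument are hypothetical; only the four file names come from this issue.

```shell
#!/bin/sh
# Hypothetical helper (not part of the existing download script):
# removes the unlocalized/unplaced FASTA files listed in this issue
# from the given download directory (defaults to the current directory).
remove_unplaced_fasta() {
    dir="${1:-.}"
    for f in \
        hs_ref_GRCh38.p12_unlocalized.fa \
        hs_ref_GRCh38.p12_unlocalized.mfa \
        hs_ref_GRCh38.p12_unplaced.fa \
        hs_ref_GRCh38.p12_unplaced.mfa
    do
        # -f: do not fail if a file is already absent
        rm -f -- "$dir/$f"
    done
}

# Example call (the path is an assumption):
# remove_unplaced_fasta /path/to/ncbi/fasta
```

The file names are tied to GRCh38.p12, so the list would need updating for a later patch release.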
