minor code refactor in genome_sequence function #1172

Open
Kaustubh2k5 wants to merge 2 commits into malariagen:master from Kaustubh2k5:fix1171


@Kaustubh2k5

Refactor genome_sequence region normalisation

I found this part of the code unnecessarily complex and could not see a reason for it, so I simplified some of the checks. The intent is to leave the existing behaviour unchanged.

Changes

  • Replace type() check with isinstance(region, (list, tuple))
  • Replace tuple("*") wildcard comparison with a direct region == "*" check
  • Unify single and multi-region paths into a single sequences list
  • Rename arrays to sequences for clarity
  • Make None behaviour explicit (returns all contigs)
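
As a small illustration of the first bullet, the sketch below shows one case where an exact type() comparison and isinstance() disagree: a tuple subclass. The `Pair` type and the region names are hypothetical example values, not taken from the library.

```python
from collections import namedtuple

# Hypothetical tuple subclass holding two example region names.
Pair = namedtuple("Pair", ["first", "second"])
regions = Pair("2L", "3R")

# An exact type comparison does not recognise subclasses of tuple...
print(type(regions) in [tuple, list])        # False

# ...whereas isinstance() accepts any list or tuple subclass.
print(isinstance(regions, (list, tuple)))    # True
```

With the isinstance() check, such a value is iterated as several regions rather than being passed on as a single region.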

Before

        genome = self.open_genome()
        if type(region) not in [tuple, list] and region != "*" and region is not None:
            d = self._subset_genome_sequence_region(
                genome=genome,
                region=region,
                inline_array=inline_array,
                chunks=chunks,
            )
        else:
            region = tuple(region)
            if region == tuple("*"):
                region = self.contigs
            d = da.concatenate(
                [
                    self._subset_genome_sequence_region(
                        genome=genome,
                        region=r,
                        inline_array=inline_array,
                        chunks=chunks,
                    )
                    for r in region
                ]
            )
        return d
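
The else branch above leans on two plain Python behaviours that are easy to miss when reading it. The standalone sketch below shows them in isolation; it does not call the library, and the variable names are only for illustration.

```python
# tuple() over a one-character string yields a one-element tuple,
# so tuple("*") is the same value as ("*",).
wildcard = tuple("*")
print(wildcard)                      # ('*',)

# A list or tuple holding only "*" therefore also matched the wildcard check.
print(tuple(["*"]) == wildcard)      # True

# None skips the first branch (because of `region is not None`) and reaches
# tuple(region), which raises because None is not iterable.
try:
    tuple(None)
except TypeError as exc:
    none_error = exc
print(type(none_error).__name__)     # TypeError
```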

After

        genome = self.open_genome()

        if region == "*" or region is None:
            regions = self.contigs
        elif isinstance(region, (list, tuple)):
            regions = list(region)
        else:
            regions = [region]

        sequences = [
            self._subset_genome_sequence_region(
                genome=genome,
                region=r,
                inline_array=inline_array,
                chunks=chunks,
            )
            for r in regions
        ]

        if len(sequences) == 1:
            return sequences[0]
        return da.concatenate(sequences)
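
One rough way to check the refactor is to run both versions of the branching side by side on a few inputs. The sketch below is an assumption-laden stand-in, not the project's test suite: `load` replaces `_subset_genome_sequence_region`, `concat` replaces `da.concatenate`, strings replace dask arrays, and the contig names are example values.

```python
def load(region):
    # Hypothetical stand-in for self._subset_genome_sequence_region(...).
    return f"<{region}>"


def concat(parts):
    # Hypothetical stand-in for da.concatenate(...); joins strings instead.
    return "".join(parts)


def before(region, contigs):
    # Branching copied from the "Before" snippet.
    if type(region) not in [tuple, list] and region != "*" and region is not None:
        return load(region)
    region = tuple(region)
    if region == tuple("*"):
        region = contigs
    return concat([load(r) for r in region])


def after(region, contigs):
    # Branching copied from the "After" snippet.
    if region == "*" or region is None:
        regions = contigs
    elif isinstance(region, (list, tuple)):
        regions = list(region)
    else:
        regions = [region]
    sequences = [load(r) for r in regions]
    if len(sequences) == 1:
        return sequences[0]
    return concat(sequences)


CONTIGS = ("2R", "2L", "3R")  # example contig names, assumed
cases = ["2L", ["2L", "3R"], ("2R",), "*"]
results = [(before(c, CONTIGS), after(c, CONTIGS)) for c in cases]
for case, (old, new) in zip(cases, results):
    print(repr(case), old == new)  # each line ends with True
```

With these stand-ins, a single string, a list, a one-element tuple and "*" give the same result in both versions. A real check would compare the dask arrays returned by genome_sequence itself.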

Issues targeted:

#1171
