6 changes: 6 additions & 0 deletions .codespellrc
@@ -0,0 +1,6 @@
[codespell]
# Ref: https://github.com/codespell-project/codespell#using-a-config-file
skip = .git*,.codespellrc,./examples/split_process/input.txt
check-hidden = true
# ignore-regex =
ignore-words-list = checkin
25 changes: 25 additions & 0 deletions .github/workflows/codespell.yml
@@ -0,0 +1,25 @@
# Codespell configuration is within .codespellrc
---
name: Codespell

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

permissions:
  contents: read

jobs:
  codespell:
    name: Check for spelling errors
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Annotate locations with typos
        uses: codespell-project/codespell-problem-matcher@v1
      - name: Codespell
        uses: codespell-project/actions-codespell@v2
4 changes: 2 additions & 2 deletions README.md
@@ -45,7 +45,7 @@ To deactivate the virtual environment in your shell, run the command:
deactivate

Alternatively, a set of convenience scripts are provided that activate the
virutalenv before calling `dsub`, `dstat`, and `ddel`. They are in the
virtualenv before calling `dsub`, `dstat`, and `ddel`. They are in the
[bin](https://github.com/DataBiosphere/dsub/tree/main/bin) directory. You can
use these scripts if you don't want to activate the virtualenv explicitly in
your shell.
@@ -472,7 +472,7 @@ using the environment variable. Please read
and [Semantics](https://github.com/GoogleCloudPlatform/gcsfuse/blob/master/docs/semantics.md)
before using Cloud Storage FUSE.

##### Mounting an existing peristent disk
##### Mounting an existing persistent disk

To have the `google-cls-v2` or `google-batch` provider mount a persistent disk that
you have pre-created and populated, use the `--mount` command line flag and the
4 changes: 2 additions & 2 deletions docs/compute_quotas.md
@@ -16,7 +16,7 @@ jobs.

When you submit a `dsub` job using one of the Google providers, the single
implicit task (for jobs that do not use a `--tasks` file) or the set of tasks
submited (for jobs that do use a `--tasks` file) are submitted to the
submitted (for jobs that do use a `--tasks` file) are submitted to the
[Cloud Life Sciences pipelines.run() API](https://cloud.google.com/life-sciences/docs/reference/rest/v2beta/projects.locations.pipelines/run).
The API maintains a queue of
[operations](https://cloud.google.com/life-sciences/docs/reference/rest/v2beta/projects.locations.operations)
@@ -39,7 +39,7 @@ If the lack of sufficient quota is not transient (the VM requires more resources
than your quota maximum), then the Life Sciences API will mark the operation
as failed and provide an informative message.

## Handling insufficent quota
## Handling insufficient quota

When you have insufficient quota to run your job tasks, you have a few options:

4 changes: 2 additions & 2 deletions docs/providers/README.md
@@ -31,7 +31,7 @@ documentation.

### Environment variables point to where to write `--output` files

When you write your commands that run in your Docker container, you shoud
When you write your commands that run in your Docker container, you should
always write output files to the locations specified by the
environment variables that are set for them.
You may observe that providers consistently expect output files to be
@@ -198,7 +198,7 @@ The `local` provider does not support resource-related flags such as

The `google-cls-v2` and `google-batch` providers share a significant amount of
their implementation. The `google-cls-v2` provider utilizes the Google Cloud Life Sciences
Piplines API [v2beta](https://cloud.google.com/life-sciences/docs/apis)
Pipelines API [v2beta](https://cloud.google.com/life-sciences/docs/apis)
while the `google-batch` provider utilizes the Google Cloud
[Batch API](https://cloud.google.com/batch/docs/reference/rest)
to queue a request for the following sequence of events:
2 changes: 1 addition & 1 deletion dsub/commands/ddel.py
@@ -113,7 +113,7 @@ def _emit_search_criteria(user_ids, job_ids, task_ids, labels):
if task_ids:
print(' task-id:')
print(' %s\n' % task_ids)
# Labels are in a LabelParam namedtuple and must be reformated for printing.
# Labels are in a LabelParam namedtuple and must be reformatted for printing.
if labels:
print(' labels:')
print(' %s\n' % repr(labels))
2 changes: 1 addition & 1 deletion dsub/commands/dsub.py
@@ -693,7 +693,7 @@ def _generate_unique_job_id() -> str:
"""Generates a unique job identifier.

Uses uuid4() to generate a Universally Unique IDentifier and performs a
small transformation to accomodate the Google Batch API.
small transformation to accommodate the Google Batch API.

Google Batch requires a client-provided job identifier and
requires that the first character be a non-digit.
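The docstring in this hunk describes a uuid4-based identifier adjusted so that its first character is never a digit. The function body is not part of the diff, so the following is only a minimal sketch of one way to meet that constraint. The name `generate_unique_job_id` and the digit-to-letter mapping are assumptions for illustration, not necessarily what dsub does.

```python
import uuid


def generate_unique_job_id() -> str:
    """Hypothetical sketch: return a uuid4-based ID that starts with a letter."""
    # uuid4().hex is 32 lowercase hexadecimal characters with no dashes.
    job_id = uuid.uuid4().hex
    # A hex string may start with a digit. Map a leading digit 0-9 onto the
    # letters a-j (an assumed transformation) so the first character is
    # always a non-digit.
    if job_id[0].isdigit():
        job_id = chr(ord('a') + int(job_id[0])) + job_id[1:]
    return job_id
```

The mapping costs a little entropy in the first character but keeps the identifier length unchanged.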
2 changes: 1 addition & 1 deletion dsub/lib/param_util.py
@@ -32,7 +32,7 @@
class ListParamAction(argparse.Action):
"""Append each value as a separate element to the parser destination.

This class satisifes the action interface of argparse.ArgumentParser and
This class satisfies the action interface of argparse.ArgumentParser and
refines the 'append' action for arguments with `nargs='*'`.

For the parameters:
2 changes: 1 addition & 1 deletion dsub/lib/providers_util.py
@@ -29,7 +29,7 @@
# Requirements can be found in the docs/providers/README.md.
#
# This module defines some utility names and functions such that new providers
# can follow the patterns of exising providers.
# can follow the patterns of existing providers.
#
# Unless providers have a compelling reason not to, they should just provide
# a single disk for everything that needs to be written by the dsub
2 changes: 1 addition & 1 deletion dsub/providers/DEVELOPERS.md
@@ -35,7 +35,7 @@ including:

- The folder for inputs is expected to be writeable. A historical pattern for
some scripts has been to use the directory where inputs are as a scratch
working diretory. If your provider must make the input directories read-only
working directory. If your provider must make the input directories read-only
it may limit portability of existing scripts.

- The environment variable `TMPDIR` should be set explicitly to a directory
2 changes: 1 addition & 1 deletion dsub/providers/google_batch.py
@@ -462,7 +462,7 @@ def _format_batch_job_id(self, task_metadata, job_metadata) -> str:
# append the dsub task-id and task-attempt to the job-id for the
# batch job ID.
# For single-task dsub jobs, there is no task-id, so use 0.
# Use a '-' character as the delimeter because Batch API job ID
# Use a '-' character as the delimiter because Batch API job ID
# must match regex ^[a-z]([a-z0-9-]{0,61}[a-z0-9])?$
task_id = task_metadata.get('task-id') or 0
task_attempt = task_metadata.get('task-attempt') or 0
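The comments in this hunk say that the task-id and task-attempt are appended to the job-id with a `-` delimiter, and that the result must match the Batch job ID regex. The rest of the method is not shown in the diff, so this is only a sketch of that rule. The standalone function `format_batch_job_id`, its signature, and the `ValueError` check are assumptions made for illustration.

```python
import re

# Regex quoted in the source comment for Batch API job IDs.
BATCH_JOB_ID_RE = re.compile(r'^[a-z]([a-z0-9-]{0,61}[a-z0-9])?$')


def format_batch_job_id(job_id: str, task_metadata: dict) -> str:
    """Hypothetical sketch: build '<job-id>-<task-id>-<task-attempt>'."""
    # Single-task jobs have no task-id, so fall back to 0, as in the hunk.
    task_id = task_metadata.get('task-id') or 0
    task_attempt = task_metadata.get('task-attempt') or 0
    batch_job_id = f'{job_id}-{task_id}-{task_attempt}'
    # Assumed safety check: reject IDs the quoted regex would not accept.
    if not BATCH_JOB_ID_RE.match(batch_job_id):
        raise ValueError(f'Invalid Batch job ID: {batch_job_id}')
    return batch_job_id
```

Called with an empty metadata dict, the sketch falls back to `0` for both fields, which mirrors the single-task comment in the hunk.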
8 changes: 4 additions & 4 deletions dsub/providers/google_v2_base.py
@@ -145,7 +145,7 @@


class GoogleV2EventMap(object):
"""Helper for extracing a set of normalized, filtered operation events."""
"""Helper for extracting a set of normalized, filtered operation events."""

def __init__(self, op):
self._op = op
@@ -271,7 +271,7 @@ def _pipelines_run_api(self, request):
raise NotImplementedError('Derived class must implement this function')

def _operations_list_api(self, ops_filter, page_token, page_size):
"""Executes the provider-specific operaitons.list() API."""
"""Executes the provider-specific operations.list() API."""
raise NotImplementedError('Derived class must implement this function')

def _operations_cancel_api_def(self):
@@ -797,7 +797,7 @@ def lookup_job_tasks(self,
create_time_max: a timezone-aware datetime value for the most recent
create time of a task, inclusive.
max_tasks: the maximum number of job tasks to return or 0 for no limit.
page_size: the page size to use for each query to the pipelins API.
page_size: the page size to use for each query to the pipelines API.

Raises:
ValueError: if both a job id list and a job name list are provided
@@ -1027,7 +1027,7 @@ def error_message(self):
"""Returns an error message if the operation failed for any reason.

Failure as defined here means ended for any reason other than 'success'.
This means that a successful cancelation will also return an error message.
This means that a successful cancellation will also return an error message.

Returns:
string, string will be empty if job did not error.
8 changes: 4 additions & 4 deletions dsub/providers/google_v2_operations.py
@@ -138,7 +138,7 @@ def get_last_event(op):


def external_network_blocked(op):
"""Retun True if the blockExternalNetwork flag is set for the user action."""
"""Return True if the blockExternalNetwork flag is set for the user action."""
user_action = get_action_by_name(op, 'user-command')
if user_action:
if _API_VERSION == google_v2_versions.V2BETA:
@@ -149,7 +149,7 @@ def external_network_blocked(op):


def is_unexpected_exit_status_event(e):
"""Retun True if the event is for an unexpected exit status."""
"""Return True if the event is for an unexpected exit status."""
if _API_VERSION == google_v2_versions.V2BETA:

return 'unexpectedExitStatus' in e
@@ -159,7 +159,7 @@ def is_unexpected_exit_status_event(e):


def is_failed_event(e):
"""Retun True if the event is an operation failed event."""
"""Return True if the event is an operation failed event."""
if _API_VERSION == google_v2_versions.V2BETA:

return 'failed' in e
@@ -169,7 +169,7 @@ def is_failed_event(e):


def is_container_stopped_event(e):
"""Retun True if the event is a container stopped event."""
"""Return True if the event is a container stopped event."""
if _API_VERSION == google_v2_versions.V2BETA:

return 'containerStopped' in e
2 changes: 1 addition & 1 deletion examples/custom_scripts/README.md
@@ -32,7 +32,7 @@ All of the source VCF files are stored in a public bucket at

## Setup

* Follow the [dsub geting started](../../README.md#getting-started)
* Follow the [dsub getting started](../../README.md#getting-started)
instructions.

## Process one file with a Bash shell script
4 changes: 2 additions & 2 deletions examples/decompress/README.md
@@ -18,7 +18,7 @@ All of the source VCF files are stored in a public bucket at

## Setup

* Follow the [dsub geting started](../../README.md#getting-started)
* Follow the [dsub getting started](../../README.md#getting-started)
instructions.

## Decompress one file
@@ -85,7 +85,7 @@ Output should look like:
```
##fileformat=VCFv4.1
##FILTER=<ID=LowQual,Description="Low quality">
##FILTER=<ID=PASS,Description="Passing basic quality fiters">
##FILTER=<ID=PASS,Description="Passing basic quality filters">
##FORMAT=<ID=AD,Number=.,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
```
2 changes: 1 addition & 1 deletion examples/fastqc/README.md
@@ -21,7 +21,7 @@ All of the source BAM files are stored in a public bucket at

## Setup

* Follow the [dsub geting started](../../README.md#getting-started)
* Follow the [dsub getting started](../../README.md#getting-started)
instructions.

* (Optional) [Enable](https://console.cloud.google.com/flows/enableapi?apiid=cloudbuild.googleapis.com)
2 changes: 1 addition & 1 deletion examples/samtools/README.md
@@ -19,7 +19,7 @@ All of the source BAM files are stored in a public bucket at

## Setup

* Follow the [dsub geting started](../../README.md#getting-started)
* Follow the [dsub getting started](../../README.md#getting-started)
instructions.

## Index one BAM file
2 changes: 1 addition & 1 deletion examples/split_process/README.md
@@ -17,7 +17,7 @@ run on Google Cloud with minimal change (delete the --provider line).

## Setup

* Follow the [dsub geting started](../../README.md#getting-started)
* Follow the [dsub getting started](../../README.md#getting-started)
instructions.

Since this script uses the `local` backend provider, you will need
6 changes: 3 additions & 3 deletions examples/split_process/input.txt
@@ -9,9 +9,9 @@ l'homme comme l'idéal commun à atteindre par tous les peuples et toutes les
nations afin que tous les individus et tous les organes de la société, ayant
cette Déclaration constamment à l'esprit, s'efforcent, par l'enseignement et
l'éducation, de développer le respect de ces droits et libertés et d'en assurer,
par des mesures progressives d'ordre national et international, la
par des measures progressives d'ordre national et international, la
reconnaissance et l'application universelles et effectives, tant parmi les
populations des Etats Membres eux-mêmes que parmi celles des territoires placés
populations des Etats Membres eux-mêmes que parmi cells des territories placés
sous leur juridiction.

Article premier
@@ -35,4 +35,4 @@ autonome ou soumis à une limitation quelconque de souveraineté.

Article 3

Tout individu a droit à la vie, à la liberté et à la sûreté de sa personne.
Tout individu a droit à la via, à la liberté et à la sûreté de sa personne.
2 changes: 1 addition & 1 deletion test/integration/e2e_after.py
@@ -22,7 +22,7 @@
import sys

# Because this may be invoked from another directory (treated as a library) or
# invoked localy (treated as a binary) both import styles need to be supported.
# invoked locally (treated as a binary) both import styles need to be supported.
# pylint: disable=g-import-not-at-top
try:
from . import test_setup_e2e as test
2 changes: 1 addition & 1 deletion test/integration/e2e_after_fail.py
@@ -24,7 +24,7 @@
from dsub.lib import dsub_errors

# Because this may be invoked from another directory (treated as a library) or
# invoked localy (treated as a binary) both import styles need to be supported.
# invoked locally (treated as a binary) both import styles need to be supported.
# pylint: disable=g-import-not-at-top
try:
from . import test_setup_e2e as test
2 changes: 1 addition & 1 deletion test/integration/e2e_env_list.py
@@ -22,7 +22,7 @@
import sys

# Because this may be invoked from another directory (treated as a library) or
# invoked localy (treated as a binary) both import styles need to be supported.
# invoked locally (treated as a binary) both import styles need to be supported.
# pylint: disable=g-import-not-at-top
try:
from . import test_setup_e2e as test
2 changes: 1 addition & 1 deletion test/integration/e2e_io_tasks.py
@@ -23,7 +23,7 @@
import sys

# Because this may be invoked from another directory (treated as a library) or
# invoked localy (treated as a binary) both import styles need to be supported.
# invoked locally (treated as a binary) both import styles need to be supported.
# pylint: disable=g-import-not-at-top
try:
from . import test_setup_e2e as test
2 changes: 1 addition & 1 deletion test/integration/e2e_python_api.py
@@ -28,7 +28,7 @@
from dsub.providers import local

# Because this may be invoked from another directory (treated as a library) or
# invoked localy (treated as a binary) both import styles need to be supported.
# invoked locally (treated as a binary) both import styles need to be supported.
# pylint: disable=g-import-not-at-top
try:
from . import test_setup
2 changes: 1 addition & 1 deletion test/integration/e2e_requester_pays_buckets.sh
@@ -18,7 +18,7 @@ set -o errexit
set -o nounset

# This test is designed to verify that accessing a Requester Pays bucket
# by specifiying a user-project to bill works. All input files used in this test
# by specifying a user-project to bill works. All input files used in this test
# are inside a requester-pays bucket.
#
# Note that we do not include a test for writing and logging to the requester
2 changes: 1 addition & 1 deletion test/integration/get_data_value.py
@@ -14,7 +14,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.

"""Utililty for helping shell scripts extract values from JSON or YAML.
"""Utility for helping shell scripts extract values from JSON or YAML.

Usage:

4 changes: 2 additions & 2 deletions test/integration/test_setup_e2e.py
@@ -42,7 +42,7 @@
from dsub.commands import dsub as dsub_command

# Because this may be invoked from another directory (treated as a library) or
# invoked localy (treated as a binary) both import styles need to be supported.
# invoked locally (treated as a binary) both import styles need to be supported.
# pylint: disable=g-import-not-at-top
try:
from . import test_setup
@@ -58,7 +58,7 @@


def _environ():
"""Merge the current enviornment and test variables into a dictionary."""
"""Merge the current environment and test variables into a dictionary."""
e = dict(os.environ)
for var in TEST_VARS + TEST_E2E_VARS:
e[var] = globals()[var]
2 changes: 1 addition & 1 deletion test/run_tests.sh
@@ -207,7 +207,7 @@ function get_test_providers() {
local providers="$(echo -n "${test_file}" | awk -F . '{ print $(NF-1) }')"

# Special case the google-batch tests - don't run them when this flag is set
# To be renabled once batch client library is available in G3
# To be re-enabled once batch client library is available in G3
if [[ "${providers}" == "google-batch" ]] && [[ "${NO_GOOGLE_BATCH_TESTS:-0}" -eq 1 ]]; then
echo -n ""
else
2 changes: 1 addition & 1 deletion test/unit/job_model_test.py
@@ -186,7 +186,7 @@ def testScriptCreation(self):
# TASK_3 gs://bucket/path/NA06986.chrom18...bam gs://bucket/path/3/*.md5

# pylint: disable=common_typos_disable
# pilot3_exon_targetted_GRCh37_bams raises a "common typos" warning: "targetted"
# pilot3_exon_targetted_GRCh37_bams raises a "common typos" warning: "targeted"

_IO_TASKS_META = textwrap.dedent("""
create-time: {}