
[Bug] Unable to start a batch job if counting the workflows times out #838

@dcohen8128

What are you really trying to do?

During performance testing we ran into an issue and had to stop the test. That left us with about 30 million running workflows, which would have taken a very long time to drain naturally, so we wanted to batch terminate them.

Describe the bug

We attempted to start a batch terminate job through the CLI with a command like temporal workflow terminate --query 'ExecutionStatus="Running" AND WorkflowType="TestWorkflow"'. It failed with the message "failed counting workflows from query: context deadline exceeded", and no batch job was submitted.
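
To make the failure mode concrete, here is a minimal Python sketch of the behavior as observed from the outside: a count step runs first under a deadline, and when it times out the submission step is never reached. This is a model inferred from the error message and the outcome, not the Temporal CLI's actual code. The names count_workflows and start_batch_terminate, the latency and timeout values, and the in-memory submitted list are all hypothetical.

```python
import asyncio

# Hypothetical model, not the Temporal CLI's code: every name here is illustrative.
COUNT_ERROR = "failed counting workflows from query: context deadline exceeded"


async def count_workflows(query: str, latency_s: float) -> int:
    """Stand-in for a visibility count query that takes `latency_s` seconds."""
    await asyncio.sleep(latency_s)
    return 30_000_000  # approximate figure from the report


async def start_batch_terminate(query, count_timeout_s, count_latency_s, submitted):
    """Count first, then submit; a count timeout aborts before submission."""
    try:
        count = await asyncio.wait_for(
            count_workflows(query, count_latency_s), timeout=count_timeout_s
        )
    except asyncio.TimeoutError as exc:
        # The batch job is never recorded because we bail out here.
        raise RuntimeError(COUNT_ERROR) from exc
    submitted.append((query, count))
    return count


if __name__ == "__main__":
    query = 'ExecutionStatus="Running" AND WorkflowType="TestWorkflow"'
    submitted = []
    try:
        # The simulated count takes longer than the deadline allows.
        asyncio.run(start_batch_terminate(query, 0.05, 0.5, submitted))
    except RuntimeError as err:
        print(err)
    print("batch jobs submitted:", len(submitted))
```

In this model a slow count blocks the whole operation even though the count itself does not change which workflows the query matches, which is the situation we hit.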

Minimal Reproduction

  1. On a cluster using a SQL advanced visibility store, start a few million workflows and leave them running
  2. Attempt to batch terminate all of them through the CLI

Environment/Versions

  • OS and processor: N/A
  • Temporal Version: Server 1.27.1, CLI 1.3.0
  • Are you using Docker or Kubernetes or building Temporal from source? N/A?

Additional context
