"Worker terminated" from canceling a task

We got a "Worker terminated" error today when a user started an HTTP request that depended on a workerpool worker, interrupted the request, then immediately retried.

```
Error: Worker terminated
    at WorkerHandler.terminate (/home/app/deploy/node_modules/workerpool/src/WorkerHandler.js:516:45)
    at WorkerHandler.terminateAndNotify (/home/app/deploy/node_modules/workerpool/src/WorkerHandler.js:617:8)
    at ? (/home/app/deploy/node_modules/workerpool/src/WorkerHandler.js:456:26)
    at ? (/home/app/deploy/node_modules/workerpool/src/Promise.js:179:17)
    at ? (/home/app/deploy/node_modules/workerpool/src/Promise.js:109:7)
    at Array.forEach (<anonymous>)
    at _reject (/home/app/deploy/node_modules/workerpool/src/Promise.js:108:13)
    at Object.reject (/app/app/deploy/node_modules/workerpool/src/Promise.js:164:5)
    at Timeout._onTimeout (/home/app/deploy/node_modules/workerpool/src/WorkerHandler.js:484:36)
    at listOnTimeout (node:internal/timers:605:17)
```

The second task was started within the first task's `workerTerminateTimeout` window and was apparently killed when that timeout fired.

The following analysis is from Claude Code. I've done my best to review it and take responsibility for any mistakes.

## Summary

When a task is cancelled (via `Promise.cancel()`) or times out, `WorkerHandler` schedules a forced termination of the worker after `workerTerminateTimeout` (default 1000ms) to give the worker a chance to clean up gracefully. **During that grace window, the pool can dispatch a new, unrelated task onto the same worker.** When the timer fires (or the worker is otherwise force-terminated as part of the cancellation cleanup), every entry in `processing` is rejected with `Error: Worker terminated` — including the unrelated task that was assigned during the window.

The result: cancelling task A can cause task B, submitted milliseconds later, to fail with `Worker terminated` even though B has nothing to do with A's cancellation.

## Reproduction

```js
const workerpool = require('workerpool');

const pool = workerpool.pool({ maxWorkers: 1, workerTerminateTimeout: 100 });

const longTask = pool.exec(() => { while (true) {} }); // CPU-bound, can't cancel cooperatively
longTask.catch(() => {}); // ignore the cancellation rejection

setTimeout(() => {
  longTask.cancel();
  // Submit an unrelated task immediately. With maxWorkers: 1, the only worker
  // is still in its post-cancel cleanup window, but `busy()` reports it as
  // idle, so the pool dispatches this addition task onto it.
  pool.exec((a, b) => a + b, [3, 4])
    .then(r => console.log('result:', r))                  // never logs
    .catch(err => console.log('FAIL:', err.message));      // FAIL: Worker terminated
}, 50);
```

Output:
```
FAIL: Worker terminated
```

## Root cause

A worker can be torn down by two paths, but `WorkerHandler.busy()` only knows about one of them:

1. **`WorkerHandler.terminate()`** sets `this.cleaning = true` (`WorkerHandler.js:583`). `busy()` correctly reports the worker as busy and `Pool._getWorker()` skips it.
2. **`Promise.cancel()` / timeout in `exec`** is caught in the `resolver.promise.catch` block at `WorkerHandler.js:433`, which adds the task to `tracking` and schedules `terminateAndNotify(true)` to run after `workerTerminateTimeout`. `busy()` does **not** consider `tracking`:

   ```js
   WorkerHandler.prototype.busy = function () {
     return this.cleaning || Object.keys(this.processing).length > 0;
   };
   ```

   So `Pool._getWorker()` (`Pool.js:273`) sees the worker as available and dispatches the next queued task. When the cleanup timer fires, `terminate(true)` rejects every entry in `processing` with `Error: Worker terminated` — sweeping up the newly-assigned task as collateral damage.
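
The sequence above can be sketched with a small synchronous model. This is not workerpool's code: `ToyHandler` and its methods are hypothetical stand-ins, and the model assumes a cancelled task leaves `processing` when it enters `tracking` (which is what makes `busy()` report idle). It only illustrates how a task assigned during the cleanup window ends up rejected.

```javascript
// Toy model only. ToyHandler is a hypothetical stand-in, not a workerpool class.
class ToyHandler {
  constructor() {
    this.cleaning = false;
    this.processing = {}; // tasks currently assigned to the worker
    this.tracking = {};   // cancelled tasks whose cleanup is still pending
  }

  // Same shape as the busy() check quoted above: tracking is not consulted.
  busy() {
    return this.cleaning || Object.keys(this.processing).length > 0;
  }

  exec(id) {
    this.processing[id] = { id, status: 'running' };
    return this.processing[id];
  }

  // Assumption of this model: cancelling moves the task from processing to tracking.
  cancel(id) {
    this.tracking[id] = this.processing[id];
    this.tracking[id].status = 'cancelled';
    delete this.processing[id];
  }

  // Stand-in for the forced termination when the cleanup timer fires:
  // every task still in processing is rejected.
  forceTerminate() {
    for (const task of Object.values(this.processing)) {
      task.status = 'rejected: Worker terminated';
    }
    this.processing = {};
    this.tracking = {};
  }
}

const handler = new ToyHandler();
handler.exec('A');
handler.cancel('A');

const idleDuringCleanup = !handler.busy(); // the pool would pick this worker
const taskB = handler.exec('B');           // unrelated task lands on the doomed worker
handler.forceTerminate();                  // task A's cleanup timer fires

console.log(idleDuringCleanup, taskB.status);
// prints: true rejected: Worker terminated
```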

## Proposed fix

`busy()` should also report the worker as busy while it has entries in `tracking`:

```js
WorkerHandler.prototype.busy = function () {
  return this.cleaning
    || Object.keys(this.processing).length > 0
    || Object.keys(this.tracking).length > 0;
};
```

Once the worker responds to the cleanup message, the tracking entry is removed at `WorkerHandler.js:324` and the worker becomes idle again — so workers that successfully run their abort listener stay in the pool, just as they do today. Workers whose cleanup times out (i.e. the `workerTerminateTimeout` setTimeout fires) get force-terminated and removed from the pool — also as today. The only behavior change is that **newly arriving tasks are queued through the cleanup window instead of being dispatched onto a worker that's about to be killed**.
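
As a rough illustration of the intended behavior, here is a toy model of a pool whose handler counts `tracking` entries as busy. `TrackingAwareHandler` and `ToyPool` are hypothetical names, not workerpool classes, and the scenario assumes the worker acknowledges the cleanup message before the timer fires.

```javascript
// Toy model only. TrackingAwareHandler and ToyPool are hypothetical,
// not workerpool classes.
class TrackingAwareHandler {
  constructor() {
    this.cleaning = false;
    this.processing = {};
    this.tracking = {};
  }

  // busy() with the proposed extra check on tracking.
  busy() {
    return this.cleaning
      || Object.keys(this.processing).length > 0
      || Object.keys(this.tracking).length > 0;
  }
}

class ToyPool {
  constructor(handler) {
    this.handler = handler;
    this.queue = [];
  }

  exec(id) {
    this.queue.push(id);
    this.next();
  }

  // Dispatch queued tasks only while the single worker reports itself idle.
  next() {
    while (this.queue.length > 0 && !this.handler.busy()) {
      const id = this.queue.shift();
      this.handler.processing[id] = { id };
    }
  }
}

const fixedHandler = new TrackingAwareHandler();
const toyPool = new ToyPool(fixedHandler);

fixedHandler.tracking['A'] = { id: 'A' };          // task A was cancelled, cleanup pending
toyPool.exec('B');                                 // arrives during the cleanup window
const queuedDuringCleanup = toyPool.queue.slice(); // ['B']: held back, not dispatched

delete fixedHandler.tracking['A'];                 // worker acknowledged the cleanup message
toyPool.next();
const runningAfterCleanup = Object.keys(fixedHandler.processing); // ['B']

console.log(queuedDuringCleanup, runningAfterCleanup);
```

In this model the caller invokes `next()` by hand once the tracking entry is gone. Whether the real pool re-checks its queue at that moment is not something the sketch demonstrates, so that would need verifying against the library.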
