Commit 3367f37

fix: remove runs from env-level currentConcurrency set during recovery
The recovery script was only removing runs from the queue-level currentConcurrency set but not from the environment-level set. This could leave runs stuck consuming environment-level concurrency capacity and blocking future dequeues.

Co-authored-by: Eric Allam <ericallam@users.noreply.github.com>
1 parent ba07f23 commit 3367f37
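
The failure mode described in the commit message can be illustrated with a minimal sketch that uses in-memory sets instead of Redis. The `canDequeue` rule, the limits, and the run ID are hypothetical and are not taken from the run queue implementation; the sketch only shows how a stale member in the env-level set keeps consuming capacity after the queue-level set has been cleaned.

```typescript
// Hypothetical in-memory model; the real system keeps these sets in Redis.
const queueConcurrency = new Set<string>(["run_1"]);
const envConcurrency = new Set<string>(["run_1"]);

// Assumed limits, chosen only for illustration.
const QUEUE_LIMIT = 1;
const ENV_LIMIT = 1;

// Assumed dequeue rule: both levels must have spare capacity.
function canDequeue(): boolean {
  return queueConcurrency.size < QUEUE_LIMIT && envConcurrency.size < ENV_LIMIT;
}

// Old recovery behaviour: only the queue-level set is cleaned.
queueConcurrency.delete("run_1");
// The env-level set still holds run_1, so dequeues stay blocked.
const blockedAfterOldRecovery = !canDequeue();

// Fixed recovery behaviour: the env-level set is cleaned as well.
envConcurrency.delete("run_1");
const unblockedAfterFix = canDequeue();
```

Under these assumptions, dequeues remain blocked until the run is also removed from the env-level set, which is the gap this commit closes.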

1 file changed: +14 -2 lines changed

scripts/recover-stuck-runs.ts

Lines changed: 14 additions & 2 deletions
@@ -15,6 +15,7 @@
  * 2. Checking which runs have QUEUED execution status (inconsistent state)
  * 3. Re-adding them to their specific queue sorted sets
  * 4. Removing them from the queue-specific currentConcurrency sets
+ * 5. Removing them from the environment-level currentConcurrency set
  *
  * SAFETY:
  * - Dry-run mode when no write Redis URL is provided (read-only, no writes)
@@ -262,6 +263,7 @@ async function main() {
   console.log(`This will:`);
   console.log(` 1. Add each run back to its specific queue sorted set`);
   console.log(` 2. Remove each run from the queue-specific currentConcurrency set`);
+  console.log(` 3. Remove each run from the env-level currentConcurrency set`);
   console.log();
 
   let successCount = 0;
@@ -292,17 +294,27 @@ async function main() {
         args: [run.runId],
         description: `Remove run from queue currentConcurrency set`,
       },
+      {
+        type: "SREM",
+        key: envConcurrencyKey,
+        args: [run.runId],
+        description: `Remove run from env currentConcurrency set`,
+      },
     ];
 
     if (executeMode && redisWrite) {
       // Execute operations using the write client
       await redisWrite.zadd(queueKey, currentTimestamp, run.runId);
-      const removed = await redisWrite.srem(queueConcurrencyKey, run.runId);
+      const removedFromQueue = await redisWrite.srem(queueConcurrencyKey, run.runId);
+      const removedFromEnv = await redisWrite.srem(envConcurrencyKey, run.runId);
 
       console.log(` ✓ Recovered run ${run.runId} (${run.taskIdentifier})`);
-      if (removed === 0) {
+      if (removedFromQueue === 0) {
         console.log(` ⚠ Run was not in queue currentConcurrency set`);
       }
+      if (removedFromEnv === 0) {
+        console.log(` ⚠ Run was not in env currentConcurrency set`);
+      }
       successCount++;
     } else {
       // Dry run - just show what would be done
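
The execute path in the last hunk can be sketched in isolation as follows. This is an illustrative sketch under stated assumptions, not the script's actual code: `FakeRedis`, `recoverRun`, and the key strings are hypothetical stand-ins, and only the parts of ZADD and SREM that the diff relies on are modeled (SREM returns the number of members removed, so `0` means the run was not in that set).

```typescript
// Hypothetical stand-in for the Redis write client (in-memory, synchronous).
class FakeRedis {
  private sets = new Map<string, Set<string>>();
  private zsets = new Map<string, Map<string, number>>();

  sadd(key: string, member: string): void {
    if (!this.sets.has(key)) this.sets.set(key, new Set());
    this.sets.get(key)!.add(member);
  }

  // Like Redis SREM: returns how many members were actually removed.
  srem(key: string, member: string): number {
    return this.sets.get(key)?.delete(member) ? 1 : 0;
  }

  zadd(key: string, score: number, member: string): void {
    if (!this.zsets.has(key)) this.zsets.set(key, new Map());
    this.zsets.get(key)!.set(member, score);
  }

  zscore(key: string, member: string): number | undefined {
    return this.zsets.get(key)?.get(member);
  }
}

type RecoveryKeys = {
  queueKey: string;
  queueConcurrencyKey: string;
  envConcurrencyKey: string;
};

type RecoveryResult = { removedFromQueue: number; removedFromEnv: number };

// Returns null in dry-run mode (no write client), mirroring the script's split
// between executing operations and only describing them.
function recoverRun(
  redis: FakeRedis | undefined,
  keys: RecoveryKeys,
  runId: string,
  now: number
): RecoveryResult | null {
  if (!redis) return null;
  redis.zadd(keys.queueKey, now, runId);
  return {
    removedFromQueue: redis.srem(keys.queueConcurrencyKey, runId),
    removedFromEnv: redis.srem(keys.envConcurrencyKey, runId),
  };
}

// Hypothetical key names, for illustration only.
const keys: RecoveryKeys = {
  queueKey: "queue:q1",
  queueConcurrencyKey: "queue:q1:currentConcurrency",
  envConcurrencyKey: "env:e1:currentConcurrency",
};

const redis = new FakeRedis();
redis.sadd(keys.queueConcurrencyKey, "run_1"); // run is tracked at queue level only

const result = recoverRun(redis, keys, "run_1", 1000);
const dryRun = recoverRun(undefined, keys, "run_1", 1000);
```

Tracking the two SREM results separately, as the commit does, lets the script warn independently when a run was missing from either set.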
