Adds 'redis' WORKER_TYPE#7296

Open

dkliban wants to merge 2 commits intopulp:mainfrom

Member

dkliban commented Feb 8, 2026

This adds WORKER_TYPE setting. The default value is 'pulpcore'. When 'redis' is selected, the tasking system uses Redis to lock resources. Redis workers produce less load on the PostgreSQL database.

closes: #7210

Generated By: Claude Code.

📜 Checklist

Commits are cleanly separated with meaningful messages (simple features and bug fixes should be squashed to one commit)
A changelog entry or entries has been added for any significant changes
Follows the Pulp policy on AI Usage
(For new features) - User documentation and test coverage has been added

See: Pull Request Walkthrough

dkliban force-pushed the 7210 branch 14 times, most recently from 19a5443 to ec316bf Compare

February 12, 2026 01:15

github-actions bot added multi-commit no-changelog no-issue labels


          Adds 'redis' WORKER_TYPE

289cd2d

This adds WORKER_TYPE setting. The default value is 'pulpcore'. When 'redis' is selected,
the tasking system uses Redis to lock resources. Redis workers produce less load on the
PostgreSQL database.

closes: pulp#7210

Generated By: Claude Code.

dkliban force-pushed the 7210 branch 2 times, most recently from da96f66 to dd46ad1 Compare

February 13, 2026 16:11


          Added WORKER_TYPE setting validation.

0bd61e0

Added redis connection checks to the worker so it shuts down if the connection is broken.

dkliban force-pushed the 7210 branch from dd46ad1 to 0bd61e0 Compare

February 13, 2026 16:13

gerrod3 requested changes

View reviewed changes

Contributor

gerrod3 left a comment

There's still a lot I haven't deeply reviewed yet, but this was getting long and I had a big idea around dispatch that I want to discuss

pulpcore/tasking/redis_locks.py Outdated Show resolved Hide resolved

pulpcore/tasking/redis_tasks.py

Comment on lines +192 to +203

+                      current_app = AppStatus.objects.current()
+                      if current_app:
+                          _logger.info(
+                              "TASK EXECUTION: Task %s being executed by %s (app_type=%s)",
+                              task.pk,
+                              current_app.name,
+                              current_app.app_type,
+                          )
+                      else:
+                          _logger.info(
+                              "TASK EXECUTION: Task %s being executed with no AppStatus.current()", task.pk
+                          )

Contributor

gerrod3 Feb 13, 2026

Is this needed? Can this be moved to log_task_start? The value should be set on the task object after set_running is called.

pulpcore/tasking/redis_tasks.py

Comment on lines +237 to +240

+                  finally:
+                      # Safety net: if we crashed before reaching the lock release above,
+                      # still try to release locks here (e.g., if crash during task execution)
+                      if safe_release_task_locks(task):

Contributor

gerrod3 Feb 13, 2026

I'm pretty sure this is wrong. finally always runs regardless of exception or returning early.

pulpcore/tasking/redis_tasks.py

Comment on lines +183 to +186

+              def execute_task(task):
+                  """Redis-aware task execution that releases Redis locks for immediate tasks."""
+                  # This extra stack is needed to isolate the current_task ContextVar
+                  contextvars.copy_context().run(_execute_task, task)

Contributor

gerrod3 Feb 13, 2026

Reading through this version and the base version there is nothing much different between the two besides that this one calls safe_release_task_locks. Could this be a wrapper of the original with a try/finally to release the redis locks?

pulpcore/tasking/redis_tasks.py

Comment on lines +488 to +489

		current_app = AppStatus.objects.current()
		lock_owner = current_app.name if current_app else f"immediate-{task.pk}"

Contributor

gerrod3 Feb 13, 2026

Probably same question as before in _execute_task, but when is this ever None?

pulpcore/tasking/redis_tasks.py

+                          except Exception:
+                              # Exception during execute_task()
+                              # Atomically release all locks as safety net
+                              safe_release_task_locks(task, lock_owner)

Contributor

gerrod3 Feb 13, 2026

Not needed, execute_task should already handle letting go of the locks.

pulpcore/tasking/redis_tasks.py

Comment on lines +466 to +470

+                  task = Task.objects.create(**task_payload)
+                  if execute_now:
+                      # Try to atomically acquire task lock and resource locks
+                      # are_resources_available() now acquires ALL locks atomically
+                      if are_resources_available(task):

Contributor

gerrod3 Feb 13, 2026 •

edited

Loading

Crazy idea, not sure if you want to try it: but how about trying to acquire the redis locks before the task hits the DB?

In the default worker scenario, dispatch acquires the lock on the task on creation because app_lock is set to the current task dispatcher (usually the API worker) and the task worker's fetch_task can only select from tasks that have this field be null.

The new redis worker selects from any task that is waiting and thus there is this time window between the the task object hitting the DB and line 470's are_resources_available that you have to account for inside dispatch. Instead we can acquire the task lock first, then create the task, then try to acquire the task's needed resources locks, if successful execute, else defer and finally do a safe release of the task lock. This way dispatch shouldn't be fighting against task workers to get the task lock.

Suggested change

      
                task = Task.objects.create(**task_payload)
          
                if execute_now:
          
                    # Try to atomically acquire task lock and resource locks
          
                    # are_resources_available() now acquires ALL locks atomically
          
                    if are_resources_available(task):
          
               # note that the pulp_id is set once the object is instantiated even if not saved to the DB yet!
          
                task = Task(**task_payload)
          
                # new function to just acquire the lock on the task
          
                aquire_lock(task.pulp_id)
          
                task.save()
          
                if execute_now:
          
                    # Change this function to only get task's resources since we already hold the task lock
          
                    if are_resources_available(resources):
          
                        # now guarenteed to have task + resource locks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

multi-commit no-changelog no-issue