Hi, the current implementation, in which each worker is identified as (worker) out of (total workers), picks jobs by partitioning the id space. If for whatever reason one of the workers gets stuck processing a job, all current and future jobs assigned to that worker will stall. Is there any plan (or are there any ideas) to make this more fault tolerant, i.e. to let a worker pick any job that is scheduled to run now, irrespective of the job's id? This would also make adding and removing workers much easier.
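
To make the question concrete, here is a rough sketch of the two approaches. It is not the project's actual code: the `Job` class, `partitioned_pick`, `ClaimQueue`, and the `id % total_workers` partitioning rule are all hypothetical names and assumptions made for illustration, and an in-process lock stands in for whatever atomic claim the real job store would need.

```python
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    id: int
    run_at: float                     # time at which the job becomes due
    claimed_by: Optional[int] = None  # worker that claimed the job, if any


def partitioned_pick(jobs, worker, total_workers, now):
    """Static partitioning (assumed rule): a worker only sees jobs whose
    id maps to it, so a stuck worker strands its whole share."""
    return [j for j in jobs
            if j.id % total_workers == worker
            and j.run_at <= now
            and j.claimed_by is None]


class ClaimQueue:
    """Any worker may claim any due job, irrespective of the job's id."""

    def __init__(self, jobs):
        self._jobs = list(jobs)
        self._lock = threading.Lock()  # makes check-and-claim atomic

    def claim_next(self, worker, now):
        with self._lock:
            for job in self._jobs:
                if job.run_at <= now and job.claimed_by is None:
                    job.claimed_by = worker
                    return job
        return None


if __name__ == "__main__":
    jobs = [Job(id=i, run_at=0.0) for i in range(4)]

    # With two workers and worker 1 stuck, worker 0 only ever sees its share.
    print([j.id for j in partitioned_pick(jobs, 0, 2, now=1.0)])

    # With claiming, worker 0 can drain every due job on its own.
    queue = ClaimQueue(jobs)
    while (job := queue.claim_next(0, now=1.0)) is not None:
        print("worker 0 claimed job", job.id)
```

With claiming, the number of workers no longer appears in the selection logic, which is why adding or removing a worker would need no repartitioning. A real version would presumably also need some timeout or lease so that a job claimed by a stuck worker can be picked up again by another one.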