Summary
The task_processor_task and task_processor_recurringtask tables each carry an auto-generated unique B-tree index on their uuid column (task_processor_task_uuid_key, task_processor_recurringtask_uuid_key). Neither index is read by any query in our codebases.
In production, task_processor_task_uuid_key alone is ~7 GB.
Origin
The uuid field was introduced in the very first migration of the task processor — commit c5110873a ("Async processor (#1334)", 2022-08-03), defined as:
uuid = models.UUIDField(unique=True, default=uuid.uuid4)
unique=True makes Django declare the column UNIQUE, and Postgres enforces that constraint with a unique B-tree index. The field has been carried through every relocation since (extraction to flagsmith-task-processor, then port to flagsmith-common) without ever being queried.
Why this matters
task_processor_task is a high-churn table (insert per enqueued task, update on lock/run, delete on cleanup). A unique index on a randomly generated UUID is one of the more expensive index shapes to maintain: every insert pays for a uniqueness check and a B-tree write at a random position, every delete leaves a dead index entry for vacuum to clean up, and the index never returns the favour with a read. At ~7 GB it's also a non-trivial chunk of buffer cache, backup volume, and replication traffic.
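To make the "random position" point concrete, here is a rough, stdlib-only sketch. It uses a sorted Python list as a stand-in for the ordered keys of a B-tree (an assumption for illustration, not how Postgres stores pages) and compares where sequential ids and random 128-bit UUID values land on insert. The helper name insert_positions and the sample size are made up for this example.

```python
import bisect
import random
import uuid


def insert_positions(keys):
    """Insert keys one by one into a sorted list.

    Returns each key's relative landing position at insert time
    (0.0 = front of the list, 1.0 = appended at the end).
    """
    ordered = []
    positions = []
    for key in keys:
        index = bisect.bisect_left(ordered, key)
        positions.append(index / len(ordered) if ordered else 1.0)
        ordered.insert(index, key)
    return positions


rng = random.Random(0)  # seeded only so repeated runs behave the same
n = 10_000

# Stand-in for an auto-increment id: always larger than every existing key.
sequential_keys = list(range(n))
# Stand-in for uuid4 values: uniformly random 128-bit keys.
random_keys = [uuid.UUID(int=rng.getrandbits(128)) for _ in range(n)]

sequential_positions = insert_positions(sequential_keys)
random_positions = insert_positions(random_keys)

# Fraction of inserts that landed at the very end of the ordered keys.
appended_seq = sum(p == 1.0 for p in sequential_positions) / n
appended_rand = sum(p == 1.0 for p in random_positions) / n
mean_rand_position = sum(random_positions) / n

print(f"sequential ids appended at the end: {appended_seq:.1%}")
print(f"random UUIDs appended at the end:   {appended_rand:.1%}")
print(f"mean landing position of UUIDs:     {mean_rand_position:.2f}")
```

Sequential keys always append to the right-hand edge, so the same few pages stay hot. Random keys are spread across the whole key range, which is why an index like this tends to touch far more pages per insert.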
Primary key is unaffected
Both tables keep Django's default auto-increment id PK. Every existing query already uses it:
- Task.objects.filter(pk__in=…) (tasks.py:51)
- TaskRun / RecurringTaskRun FKs target task_id
- The get_tasks_to_process() SQL function selects by id ordering
Dropping uuid (or just unique=True) leaves all of that intact.
Proposed change
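Drop unique=True from AbstractBaseTask.uuid (or remove the field outright, pending a check on external consumers, e.g. log/metric pipelines that may emit task.uuid). Either change is a one-migration cleanup. One caveat for prod: the _key names indicate these indexes back UNIQUE constraints, and Postgres does not allow DROP INDEX CONCURRENTLY on a constraint-backing index, so the ~7 GB is reclaimed by the ALTER TABLE ... DROP CONSTRAINT that the migration itself runs. That statement takes a brief ACCESS EXCLUSIVE lock on the table, so run it with a lock_timeout set or in a quiet window.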
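A minimal sketch of what the "drop unique=True" variant of the migration might look like. It assumes the app label is task_processor (inferred from the table names), that AbstractBaseTask is an abstract base for the Task and RecurringTask models, and that the field keeps default=uuid.uuid4. The dependency entry is a hypothetical placeholder, not a real migration name.

```python
import uuid

from django.db import migrations, models


class Migration(migrations.Migration):
    dependencies = [
        # Hypothetical placeholder: point this at the app's latest migration.
        ("task_processor", "XXXX_previous_migration"),
    ]

    operations = [
        # Assumes AbstractBaseTask is abstract, so each concrete model
        # needs its own AlterField to lose the unique constraint.
        migrations.AlterField(
            model_name="task",
            name="uuid",
            field=models.UUIDField(default=uuid.uuid4),
        ),
        migrations.AlterField(
            model_name="recurringtask",
            name="uuid",
            field=models.UUIDField(default=uuid.uuid4),
        ),
    ]
```

Running python manage.py sqlmigrate against the generated migration should show the exact SQL Django will send, which is worth checking before applying it in production.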
Verification done
- No .filter(uuid=…) / .get(uuid=…) / task__uuid / raw-SQL reference anywhere.
- The only uuid lookup in task_processor is on the unrelated HealthCheckModel, which has its own index.
- RecurringTaskAdmin.list_display renders uuid but does not filter by it.
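The code search above could be cross-checked against Postgres's own usage counters. The sketch below reads idx_scan and the on-disk size for the two indexes from the pg_stat_user_indexes view. The helper name unused_indexes is made up for this example, and it assumes a DB-API cursor whose driver adapts a Python list to a Postgres array (psycopg2 does).

```python
INDEX_STATS_SQL = """
SELECT s.indexrelname,
       s.idx_scan,
       pg_relation_size(s.indexrelid) AS size_bytes
FROM pg_stat_user_indexes AS s
WHERE s.indexrelname = ANY(%s)
ORDER BY s.indexrelname
"""

UUID_INDEXES = [
    "task_processor_task_uuid_key",
    "task_processor_recurringtask_uuid_key",
]


def unused_indexes(cursor, index_names=UUID_INDEXES):
    """Return (name, size_bytes) for each listed index with zero recorded scans.

    `cursor` is any DB-API cursor connected to the Postgres instance to inspect.
    """
    cursor.execute(INDEX_STATS_SQL, [list(index_names)])
    return [
        (name, size_bytes)
        for name, scans, size_bytes in cursor.fetchall()
        if scans == 0
    ]
```

Inside a Django shell this could be called with the cursor from django.db.connection.cursor(). The counters are kept per instance and only cover the period since the last statistics reset, so it is worth running on the primary and on any read replicas.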