Commit 4efe65f

feat: enable site clusters to run Nautobot Celery workers locally

Nautobot currently runs entirely on the global cluster, including its Celery workers. Sites that generate heavy background task load have no way to offload that processing closer to where the work originates, and a single global worker pool becomes a bottleneck as sites scale.

This adds a site-scoped ArgoCD Application that deploys only the Celery worker portion of the Nautobot Helm chart. The web server, Redis, and PostgreSQL are all disabled because they remain on the global cluster; site workers connect back to those shared services.

This lets operators scale worker capacity per site independently, run queue-specific workers closer to the hardware they manage, and reduce cross-cluster task latency for site-driven automation.
1 parent b53561a commit 4efe65f

10 files changed

Lines changed: 297 additions & 0 deletions

Lines changed: 49 additions & 0 deletions
@@ -0,0 +1,49 @@
+{{- if eq (include "understack.isEnabled" (list $.Values.site "nautobot_worker")) "true" }}
+---
+apiVersion: argoproj.io/v1alpha1
+kind: Application
+metadata:
+  name: {{ printf "%s-%s" $.Release.Name "nautobot-worker" }}
+  finalizers:
+    - resources-finalizer.argocd.argoproj.io
+  annotations:
+    argocd.argoproj.io/compare-options: ServerSideDiff=true,IncludeMutationWebhook=true
+spec:
+  destination:
+    namespace: nautobot
+    server: {{ $.Values.cluster_server }}
+  project: understack
+  sources:
+    - chart: nautobot
+      helm:
+        fileParameters:
+          - name: nautobot.config
+            path: $understack/components/nautobot/nautobot_config.py
+        ignoreMissingValueFiles: true
+        releaseName: nautobot-worker
+        valueFiles:
+          - $understack/components/nautobot-worker/values.yaml
+          - $deploy/{{ include "understack.deploy_path" $ }}/nautobot-worker/values.yaml
+      repoURL: https://nautobot.github.io/helm-charts/
+      targetRevision: 2.5.6
+    - path: components/nautobot-worker
+      ref: understack
+      repoURL: {{ include "understack.understack_url" $ }}
+      targetRevision: {{ include "understack.understack_ref" $ }}
+    - path: {{ include "understack.deploy_path" $ }}/nautobot-worker
+      ref: deploy
+      repoURL: {{ include "understack.deploy_url" $ }}
+      targetRevision: {{ include "understack.deploy_ref" $ }}
+  syncPolicy:
+    automated:
+      prune: true
+      selfHeal: true
+    managedNamespaceMetadata:
+      annotations:
+        argocd.argoproj.io/sync-options: Delete=false
+    syncOptions:
+      - CreateNamespace=true
+      - ServerSideApply=true
+      - RespectIgnoreDifferences=true
+      - ApplyOutOfSyncOnly=true
+{{- end }}

charts/argocd-understack/values.yaml

Lines changed: 6 additions & 0 deletions
@@ -514,6 +514,12 @@ site:
     # @default -- false
     enabled: false
 
+  # -- Nautobot Celery workers (site-level, connects to global Nautobot)
+  nautobot_worker:
+    # -- Enable/disable deploying Nautobot workers at the site level
+    # @default -- false
+    enabled: false
+
   # -- Site-specific workflows and event handlers
   site_workflows:
     # -- Enable/disable deploying site workflows

components/envoy-configs/templates/gw-external.yaml.tpl

Lines changed: 14 additions & 0 deletions
@@ -52,6 +52,20 @@ spec:
           from: {{ .from | default "All" }}
           {{- end }}
     {{- end }}
+    {{- range .Values.routes.tcp }}
+    - name: {{ .listenerName }}
+      port: {{ .gatewayPort }}
+      protocol: TCP
+      allowedRoutes:
+        namespaces:
+          {{- if .selector }}
+          from: Selector
+          selector:
+            {{- .selector | toYaml | nindent 12 }}
+          {{- else }}
+          from: {{ .from | default "All" }}
+          {{- end }}
+    {{- end }}
   {{- if .Values.gateways.external.serviceAnnotations }}
   infrastructure:
     parametersRef:
Lines changed: 26 additions & 0 deletions
@@ -0,0 +1,26 @@
+{{- range .Values.routes.tcp }}
+---
+apiVersion: gateway.networking.k8s.io/v1alpha2
+kind: TCPRoute
+metadata:
+  {{- if .name }}
+  name: {{ .name }}
+  {{- else }}
+  name: {{ .service.name }}
+  {{- end }}
+  namespace: {{ .namespace | default "envoy-gateway" }}
+  labels:
+    {{- include "envoy-configs.labels" $ | nindent 4 }}
+spec:
+  parentRefs:
+    - name: {{ $.Values.gateways.external.name }}
+      namespace: {{ $.Values.gateways.external.namespace }}
+      sectionName: {{ .listenerName }}
+  rules:
+    - backendRefs:
+        - name: {{ .service.name }}
+          {{- with .namespace }}
+          namespace: {{ . }}
+          {{- end }}
+          port: {{ .service.port }}
+{{- end }}
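
As a rough illustration of what this template produces, the sketch below shows the manifest it would likely render for a single `routes.tcp` entry. All input values (the route name `nautobot-db`, the service `nautobot-cluster-rw`, and a gateway named `external` in namespace `envoy-gateway`) are hypothetical and not taken from this commit. The `labels` block is left out because it comes from the `envoy-configs.labels` helper, which is not shown here.

```yaml
# Hypothetical input values (assumptions, not from this commit):
#   gateways:
#     external: {name: external, namespace: envoy-gateway}
#   routes:
#     tcp:
#       - name: nautobot-db
#         listenerName: nautobot-db
#         gatewayPort: 5432
#         namespace: nautobot
#         service: {name: nautobot-cluster-rw, port: 5432}
---
apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TCPRoute
metadata:
  name: nautobot-db            # from .name
  namespace: nautobot          # from .namespace
  # labels: rendered by the "envoy-configs.labels" helper (omitted in this sketch)
spec:
  parentRefs:
    - name: external           # from gateways.external.name
      namespace: envoy-gateway # from gateways.external.namespace
      sectionName: nautobot-db # must match the listener name on the Gateway
  rules:
    - backendRefs:
        - name: nautobot-cluster-rw
          namespace: nautobot
          port: 5432
```

Per the template, omitting `name` makes the route take its name from `service.name`, and omitting `namespace` places the route in `envoy-gateway` and drops the `namespace` field from the backend reference.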

components/envoy-configs/values.schema.json

Lines changed: 70 additions & 0 deletions
@@ -224,6 +224,76 @@
             ],
             "additionalProperties": false
           }
+        },
+        "tcp": {
+          "type": "array",
+          "description": "TCP routes for non-HTTP services (e.g., PostgreSQL, Redis)",
+          "items": {
+            "type": "object",
+            "properties": {
+              "name": {
+                "type": "string",
+                "description": "Name identifier for the TCPRoute resource"
+              },
+              "listenerName": {
+                "type": "string",
+                "description": "Name of the TCP listener on the gateway (must match)"
+              },
+              "gatewayPort": {
+                "type": "integer",
+                "minimum": 1,
+                "maximum": 65535,
+                "description": "Port exposed on the gateway for this TCP route"
+              },
+              "namespace": {
+                "type": "string",
+                "description": "Namespace of the backend service"
+              },
+              "service": {
+                "type": "object",
+                "description": "Kubernetes service backend configuration",
+                "properties": {
+                  "name": {
+                    "type": "string",
+                    "description": "Name of the Kubernetes service"
+                  },
+                  "port": {
+                    "type": "integer",
+                    "minimum": 1,
+                    "maximum": 65535,
+                    "description": "Port of the backend service"
+                  }
+                },
+                "required": [
+                  "name",
+                  "port"
+                ],
+                "additionalProperties": false
+              },
+              "selector": {
+                "type": "object",
+                "description": "Kubernetes-style label selector (key-value pairs)",
+                "additionalProperties": {
+                  "type": "string"
+                }
+              },
+              "from": {
+                "type": "string",
+                "enum": [
+                  "Same",
+                  "All",
+                  "Selector"
+                ],
+                "description": "Specifies where traffic can originate from"
+              }
+            },
+            "required": [
+              "listenerName",
+              "gatewayPort",
+              "service"
+            ],
+            "additionalProperties": false
+          }
         }
       }
     }

components/envoy-configs/values.yaml

Lines changed: 1 addition & 0 deletions
@@ -2,3 +2,4 @@ gateways: {}
 routes:
   http: []
   tls: []
+  tcp: []
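
For orientation, here is a minimal sketch of how a deployment might populate the new `routes.tcp` list so that site workers can reach PostgreSQL and Redis through the external gateway. The field names follow the schema added in this commit; the service names, namespace, and port choices are hypothetical and would need to match the actual services on the global cluster.

```yaml
routes:
  tcp:
    # PostgreSQL: explicit route name, backend in the "nautobot" namespace
    - name: nautobot-db                 # optional; defaults to service.name
      listenerName: nautobot-db         # required; becomes the Gateway listener name
      gatewayPort: 5432                 # required; port exposed on the gateway
      namespace: nautobot               # optional; defaults to "envoy-gateway"
      service:
        name: nautobot-cluster-rw       # hypothetical service name
        port: 5432
      from: All                         # one of Same, All, Selector

    # Redis: no "name", so the TCPRoute is named after service.name
    - listenerName: nautobot-redis
      gatewayPort: 6379
      namespace: nautobot
      service:
        name: nautobot-redis-master     # hypothetical service name
        port: 6379
```

Each entry yields one TCP listener on the gateway and one `TCPRoute` bound to it through `sectionName`, so `listenerName` values should be unique across the list.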
Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+---
+apiVersion: kustomize.config.k8s.io/v1beta1
+kind: Kustomization
+
+resources: []
Lines changed: 67 additions & 0 deletions
@@ -0,0 +1,67 @@
+# Nautobot Worker (site-level)
+#
+# Deploys only Celery workers that connect back to the global Nautobot
+# database and Redis. The web server is disabled because it lives on
+# the global cluster. Redis and PostgreSQL are disabled because the
+# workers reach the global instances over the network.
+#
+# The deploy repo for each site MUST provide:
+# - ExternalSecrets for nautobot-django, nautobot-redis, nautobot-db,
+#   nautobot-custom-env, and dockerconfigjson-github-com
+# - values.yaml overrides for nautobot.db.host and nautobot.redis.host
+#   pointing to the global cluster endpoints
+---
+
+# Disable the Nautobot web server; workers only
+nautobot:
+  enabled: false
+
+  db:
+    engine: "django.db.backends.postgresql"
+    # Override in deploy repo values to point at the global CNPG service
+    host: ""
+    port: "5432"
+    name: "app"
+    user: "app"
+    existingSecret: "nautobot-db"
+    existingSecretPasswordKey: "password"
+
+  django:
+    existingSecret: nautobot-django
+
+  redis:
+    # Override in deploy repo values to point at the global Redis service
+    host: ""
+    port: "6379"
+    ssl: false
+    username: ""
+
+celery:
+  enabled: true
+  concurrency: 2
+  replicaCount: 1
+  extraEnvVarsSecret:
+    - nautobot-django
+    - nautobot-custom-env
+  livenessProbe:
+    initialDelaySeconds: 60
+    periodSeconds: 120
+    timeoutSeconds: 60
+  readinessProbe:
+    initialDelaySeconds: 60
+    periodSeconds: 120
+    timeoutSeconds: 60
+
+# Do not deploy local Redis; use the global instance
+redis:
+  enabled: false
+
+# Do not deploy local PostgreSQL; use the global CNPG instance
+postgresql:
+  enabled: false
+
+ingress:
+  enabled: false
+
+metrics:
+  enabled: false
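
The header comment above requires each site's deploy repo to override the empty `nautobot.db.host` and `nautobot.redis.host` values. A minimal sketch of such an override file follows, placed at the `$deploy/<deploy_path>/nautobot-worker/values.yaml` path that the Application template references. The hostnames are placeholders, not real endpoints; the actual values depend on how the global cluster exposes PostgreSQL and Redis.

```yaml
# <deploy_path>/nautobot-worker/values.yaml in the deploy repo
# Hostnames below are hypothetical placeholders.
nautobot:
  db:
    # Endpoint of the global CloudNativePG service as reachable from the site
    host: "nautobot-db.global.example.com"
    port: "5432"
  redis:
    # Endpoint of the global Redis service as reachable from the site
    host: "nautobot-redis.global.example.com"
    port: "6379"
```

Because this file is listed last under `valueFiles`, its settings take precedence over the component defaults, so only the keys that differ per site need to appear here.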
Lines changed: 58 additions & 0 deletions
@@ -0,0 +1,58 @@
+---
+charts:
+- nautobot
+kustomize_paths:
+- components/nautobot-worker
+deploy_overrides:
+  helm:
+    mode: values
+  kustomize:
+    mode: second_source
+---
+
+# nautobot-worker
+
+Site-level Nautobot Celery workers that connect to the global Nautobot
+database and Redis. This allows sites to run their own worker pods for
+processing background tasks without deploying the full Nautobot web
+application.
+
+## Deployment Scope
+
+- Cluster scope: site
+- Values key: `site.nautobot_worker`
+- ArgoCD Application template: `charts/argocd-understack/templates/application-nautobot-worker.yaml`
+
+## How ArgoCD Builds It
+
+{{ component_argocd_builds() }}
+
+## How to Enable
+
+Enable this component in your site deployment values file:
+
+```yaml title="$CLUSTER_NAME/deploy.yaml"
+site:
+  nautobot_worker:
+    enabled: true
+```
+
+## Deployment Repo Content
+
+{{ secrets_disclaimer }}
+
+Required or commonly required items:
+
+- `values.yaml`: Override celery worker settings such as image, replica
+  count, concurrency, environment variables, and task queue assignments.
+- `nautobot-django` Secret: Provide a `NAUTOBOT_SECRET_KEY` value
+  (must match the global Nautobot instance).
+- `nautobot-cluster-app` Secret: Database credentials for the global
+  CloudNativePG cluster.
+
+Optional additions:
+
+- `nautobot-custom-env` Secret: Extra environment variables to inject
+  into the worker pods.
+- Additional `workers` entries in `values.yaml` to run dedicated
+  workers for specific Celery queues.
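
The last optional item can be sketched as follows. This is an illustration only: the worker name `provisioning` and its queue are hypothetical, and the per-worker keys (`enabled`, `replicaCount`, `concurrency`, `taskQueues`) are assumed to match the Nautobot Helm chart's `workers` map. Check them against the chart version pinned by the Application (2.5.6) before relying on them.

```yaml
# <deploy_path>/nautobot-worker/values.yaml in the deploy repo
# Key names under "workers" are assumptions about the Nautobot chart schema.
workers:
  # Hypothetical dedicated worker for a site-specific queue
  provisioning:
    enabled: true
    replicaCount: 2
    concurrency: 4
    # Celery queue(s) this worker consumes; assumed key name
    taskQueues: "provisioning"
```

Jobs only reach such a worker if they are submitted to the matching queue, so the queue name has to agree with what the global Nautobot instance uses when it enqueues tasks.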

mkdocs.yml

Lines changed: 1 addition & 0 deletions
@@ -190,6 +190,7 @@ nav:
       - deploy-guide/components/nautobot-site.md
       - deploy-guide/components/nautobot.md
       - deploy-guide/components/nautobotop.md
+      - deploy-guide/components/nautobot-worker.md
       - deploy-guide/components/neutron.md
       - deploy-guide/components/nova.md
       - deploy-guide/components/octavia.md
