Commit cd513c2

justonedev1 authored and Michal Tichak committed
[OCTRL-1088] rename author and properly handling TaskController as a daemon set
1 parent 85f1931 commit cd513c2

11 files changed

Lines changed: 70 additions & 148 deletions

control-operator/Makefile

Lines changed: 10 additions & 3 deletions
@@ -97,7 +97,7 @@ help: ## Display this help.
 ##@ Development

 .PHONY: manifests
-manifests: controller-gen ## Generate WebhookConfiguration, ClusterRole and CustomResourceDefinition objects.
+manifests: controller-gen generate-proto ## Generate WebhookConfiguration, ClusterRole and CustomResourceDefinition objects.
 # Note that the option maxDescLen=0 was added in the default scaffold in order to sort out the issue
 # Too long: must have at most 262144 bytes. By using kubectl apply to create / update resources an annotation
 # is created by K8s API to store the latest version of the resource ( kubectl.kubernetes.io/last-applied-configuration).
@@ -220,8 +220,15 @@ deploy-environment: manifests kustomize ## Deploy environment controller to the
 	$(KUSTOMIZE) build config/environment | $(KUBECTL) apply -f - --server-side

 .PHONY: undeploy
-undeploy: ## Undeploy both controllers from the K8s cluster specified in ~/.kube/config. Call with ignore-not-found=true to ignore resource not found errors during deletion.
-	$(KUSTOMIZE) build config/default | $(KUBECTL) delete --ignore-not-found=$(ignore-not-found) -f -
+undeploy: undeploy-task undeploy-environment ## Undeploy both controllers from the K8s cluster specified in ~/.kube/config.
+
+.PHONY: undeploy-task
+undeploy-task: kustomize ## Undeploy task controller from the K8s cluster specified in ~/.kube/config. Call with ignore-not-found=true to ignore resource not found errors during deletion.
+	$(KUSTOMIZE) build config/task | $(KUBECTL) delete --ignore-not-found=$(ignore-not-found) -f -
+
+.PHONY: undeploy-environment
+undeploy-environment: kustomize ## Undeploy environment controller from the K8s cluster specified in ~/.kube/config. Call with ignore-not-found=true to ignore resource not found errors during deletion.
+	$(KUSTOMIZE) build config/environment | $(KUBECTL) delete --ignore-not-found=$(ignore-not-found) -f -

 ##@ Build Dependencies

control-operator/README.md

Lines changed: 21 additions & 7 deletions
@@ -8,6 +8,14 @@ In order to deploy Task and Environment workflows to the k8s cluster you need co
 controlling custom CRDs defining ALICE custom workload. This Folder defines and implements all moving parts together with Makefile
 to build, deploy, install CRDs and operators.

+## Architecture
+
+The operator is split into two separate binaries with different deployment strategies:
+
+**task-manager** runs as a DaemonSet — one pod per node. Each pod is responsible only for `Task` resources assigned to its node (matched via `spec.nodeName`). This is necessary because the task-manager communicates with OCC gRPC processes running locally on the same node via `hostNetwork`.
+
+**environment-manager** runs as a Deployment with a single replica per cluster. It is responsible for `Environment` resources which are cluster-scoped and not tied to a specific node.
+
 ## Getting Started

 You’ll need a Kubernetes cluster to run against. You can use [KIND](https://sigs.k8s.io/kind) to get a local cluster for testing, or run against a remote cluster. Author had the most success with K3s [see](/docs/kubernetes_ecs.md).
@@ -23,16 +31,16 @@ Following commands show basic use of Makefile. However this isn't exhaustive lis
 kubectl apply -f config/samples/
 ```

-1. Build and push your image to the location specified by `IMG`:
+1. Build and push your images. Default image tags are defined in the Makefile via `TASK_IMG` and `ENVIRONMENT_IMG`. Override them only if you want to use a different registry or tag:

 ```sh
-make docker-build docker-push IMG=<some-registry>/operator:tag
+make docker-build docker-push TASK_IMG=<some-registry>/task-manager:tag ENVIRONMENT_IMG=<some-registry>/environment-manager:tag
 ```

-1. Deploy the controller to the cluster with the image specified by `IMG`:
+1. Deploy the controllers to the cluster. Uses the same `TASK_IMG` and `ENVIRONMENT_IMG` defaults, override them if needed:

 ```sh
-make deploy IMG=<some-registry>/operator:tag
+make deploy TASK_IMG=<some-registry>/task-manager:tag ENVIRONMENT_IMG=<some-registry>/environment-manager:tag
 ```

 ### Uninstall CRDs
@@ -70,13 +78,19 @@ which provide a reconcile function responsible for synchronizing resources until
 make install
 ```

-1. Run your controller (this will run in the foreground, so switch to a new terminal if you want to leave it running):
+1. Run a controller (this will run in the foreground, so switch to a new terminal if you want to leave it running):
+
+```sh
+make run-environment
+```
+
+The task-manager requires a `NODE_NAME` environment variable to know which node it is responsible for. In-cluster this is injected automatically via the downward API. When running locally you must set it manually:

 ```sh
-make run
+NODE_NAME=<your-node-name> make run-task
 ```

-**NOTE:** You can also run this in one step by running: `make install run`
+**NOTE:** You can also install CRDs and run in one step: `make install run-environment` or `make install run-task`

 ### Modifying the API definitions

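The node-scoping rule described in the new Architecture section (each task-manager pod handles only `Task` resources whose `spec.nodeName` matches its own node) can be illustrated with a minimal sketch. This is not the reconciler code from this commit: the `Task` struct is a simplified stand-in for the real CRD type, and `tasksForNode` and the node names are hypothetical names introduced only for illustration.

```go
package main

import "fmt"

// Task is a simplified, hypothetical stand-in for the operator's Task resource.
// Only the field needed for node matching is modelled.
type Task struct {
	Name     string
	NodeName string // mirrors spec.nodeName of the real resource
}

// tasksForNode returns the tasks assigned to nodeName, preserving input order.
// A per-node task-manager pod would ignore every task not in this result.
func tasksForNode(tasks []Task, nodeName string) []Task {
	var owned []Task
	for _, t := range tasks {
		if t.NodeName == nodeName {
			owned = append(owned, t)
		}
	}
	return owned
}

func main() {
	// Illustrative data only; names are assumptions, not taken from the commit.
	tasks := []Task{
		{Name: "task-a", NodeName: "node-1"},
		{Name: "task-b", NodeName: "node-2"},
		{Name: "task-c", NodeName: "node-1"},
	}
	for _, t := range tasksForNode(tasks, "node-1") {
		fmt.Println(t.Name)
	}
}
```

In a real controller the same comparison would more likely live in an event filter or at the top of the reconcile function, so that tasks for other nodes are skipped before any work is done.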

control-operator/api/v1alpha1/environment_types.go

Lines changed: 4 additions & 2 deletions
@@ -1,8 +1,8 @@
 /*
  * === This file is part of ALICE O² ===
  *
- * Copyright 2024 CERN and copyright holders of ALICE O².
- * Author: Teo Mrnjavac <teo.mrnjavac@cern.ch>
+ * Copyright 2026 CERN and copyright holders of ALICE O².
+ * Author: Michal Tichak <michal.tichak@cern.ch>
  *
  * This program is free software: you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -66,6 +66,8 @@ type EnvironmentStatus struct {
 	// - "Progressing": the resource is being created or updated
 	// - "Degraded": the resource failed to reach or maintain its desired state
 	//
+	// TODO: use conditions properly during deployment
+	//
 	// The status of each condition is one of True, False, or Unknown.
 	// +listType=map
 	// +listMapKey=type

control-operator/api/v1alpha1/tasktemplate_types.go

Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
 /*
  * === This file is part of ALICE O² ===
  *
- * Copyright 2024 CERN and copyright holders of ALICE O².
- * Author: Teo Mrnjavac <teo.mrnjavac@cern.ch>
+ * Copyright 2026 CERN and copyright holders of ALICE O².
+ * Author: Michal Tichak <michal.tichak@cern.ch>
  *
  * This program is free software: you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by

control-operator/cmd/environment-manager/main.go

Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
 /*
  * === This file is part of ALICE O² ===
  *
- * Copyright 2024 CERN and copyright holders of ALICE O².
- * Author: Teo Mrnjavac <teo.mrnjavac@cern.ch>
+ * Copyright 2026 CERN and copyright holders of ALICE O².
+ * Author: Michal Tichak <michal.tichak@cern.ch>
  *
  * This program is free software: you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by

control-operator/cmd/main.go

Lines changed: 0 additions & 124 deletions
This file was deleted.

control-operator/cmd/task-manager/main.go

Lines changed: 10 additions & 2 deletions
@@ -1,8 +1,8 @@
 /*
  * === This file is part of ALICE O² ===
  *
- * Copyright 2024 CERN and copyright holders of ALICE O².
- * Author: Teo Mrnjavac <teo.mrnjavac@cern.ch>
+ * Copyright 2026 CERN and copyright holders of ALICE O².
+ * Author: Michal Tichak <michal.tichak@cern.ch>
  *
  * This program is free software: you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
@@ -26,6 +26,7 @@ package main

 import (
 	"flag"
+	"fmt"
 	"os"

 	"k8s.io/apimachinery/pkg/runtime"
@@ -82,10 +83,17 @@ func main() {
 		os.Exit(1)
 	}

+	nodeName := os.Getenv("NODE_NAME")
+	if nodeName == "" {
+		setupLog.Error(fmt.Errorf("NODE_NAME environment variable not set"), "NODE_NAME is required")
+		os.Exit(1)
+	}
+
 	if err = (&controller.TaskReconciler{
 		Client:   mgr.GetClient(),
 		Scheme:   mgr.GetScheme(),
 		Recorder: mgr.GetEventRecorderFor("task-controller"),
+		NodeName: nodeName,
 	}).SetupWithManager(mgr); err != nil {
 		setupLog.Error(err, "unable to create controller", "controller", "Task")
 		os.Exit(1)
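
The hunk above makes `NODE_NAME` mandatory at startup. A possible way to keep that check unit-testable is to pass the environment lookup in as a function; the sketch below assumes this refactoring and is not part of the commit. The helper name `nodeNameFrom` and the error variable are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errNodeNameUnset mirrors the error message used in the commit.
var errNodeNameUnset = errors.New("NODE_NAME environment variable not set")

// nodeNameFrom is a hypothetical helper. lookup is normally os.Getenv;
// tests can pass a stub instead of mutating the process environment.
func nodeNameFrom(lookup func(string) string) (string, error) {
	name := lookup("NODE_NAME")
	if name == "" {
		return "", errNodeNameUnset
	}
	return name, nil
}

func main() {
	name, err := nodeNameFrom(os.Getenv)
	if err != nil {
		// The real task-manager logs the error and exits with status 1 here.
		fmt.Println("startup would abort:", err)
		return
	}
	fmt.Println("managing tasks on node", name)
}
```

Note that `os.Getenv` cannot distinguish an unset variable from one set to the empty string; both cases are rejected here, which matches the behaviour of the check in the diff.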

control-operator/config/manager/task-manager/task-manager.yaml

Lines changed: 5 additions & 1 deletion
@@ -31,9 +31,13 @@ spec:
       - command:
         - /manager
         args:
-        - --leader-elect
         - --health-probe-bind-address=:9082
         - --metrics-bind-address=:9083
+        env:
+        - name: NODE_NAME
+          valueFrom:
+            fieldRef:
+              fieldPath: spec.nodeName
         image: task-manager:latest
         name: manager
         securityContext:

control-operator/internal/controller/environment_controller.go

Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
 /*
  * === This file is part of ALICE O² ===
  *
- * Copyright 2024 CERN and copyright holders of ALICE O².
- * Author: Teo Mrnjavac <teo.mrnjavac@cern.ch>
+ * Copyright 2026 CERN and copyright holders of ALICE O².
+ * Author: Michal Tichak <michal.tichak@cern.ch>
  *
  * This program is free software: you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by

control-operator/internal/controller/environment_controller_test.go

Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
 /*
  * === This file is part of ALICE O² ===
  *
- * Copyright 2024 CERN and copyright holders of ALICE O².
- * Author: Teo Mrnjavac <teo.mrnjavac@cern.ch>
+ * Copyright 2026 CERN and copyright holders of ALICE O².
+ * Author: Michal Tichak <michal.tichak@cern.ch>
  *
  * This program is free software: you can redistribute it and/or modify
  * it under the terms of the GNU General Public License as published by
