"Tools are only as good as the processes that govern them." — Saravanan Gnanagur, Founder, CloudEngineLabs
Process-First is the Systems Thinking of DevOps.
"The First Way emphasises the performance of the entire system, as opposed to the performance of a specific silo of work or department."
— Gene Kim, Kevin Behr, and George Spafford (The Phoenix Project / The DevOps Handbook)
Gene Kim, Kevin Behr, and George Spafford define the "Three Ways" of DevOps in The Phoenix Project and The DevOps Handbook. The First Way focuses on Systems Thinking — optimising the flow from development through operations to the customer, rather than maximising the throughput of any single stage.
"DevOps is about fixing the broken, inefficient processes between developers and operations before introducing new tools."
— Patrick Debois (Founder of DevOpsDays)
Patrick Debois, as the founder of DevOpsDays, emphasised that cultural and process problems — not technology gaps — are the root cause of slow, unreliable software delivery. He showed that bringing developers and operations together through shared processes is the prerequisite for any tooling to succeed.
cloudenginelabs.io is a Process-First SDLC automation company. Process-First is an engineering philosophy that places well-defined, repeatable Software Development Life Cycle (SDLC) processes at the centre of every engineering decision — before selecting tools, platforms, or frameworks.
The guiding question is always: "Do we have the right process before we pick the tool?"
| # | Principle | What it means in practice |
|---|---|---|
| 1 | Define before you Build | Document the "why" and "what" of every workflow before writing a single line of pipeline code. |
| 2 | Standardise before you Scale | Create golden-path templates (CI/CD, GitOps, SRE) that every team can adopt without reinventing the wheel. |
| 3 | Automate what is Repeatable | If a process is done more than twice, automate it. Automation should encode the process, not bypass it. |
| 4 | Observe and Iterate | Every automated process must produce measurable outcomes (SLOs, SLAs, error budgets) so teams can improve continuously. |
| 5 | Culture over Tooling | Processes create shared understanding and accountability. The right culture makes any toolchain succeed; the wrong culture makes the best toolchain fail. |
Each devopsos scaffold command encodes one or more Process-First principles into an
immediately usable configuration artefact:
| Process-First Principle | DevOps-OS Command | What it generates |
|---|---|---|
| Define before you Build | devopsos scaffold cicd / gha / gitlab |
Interactive wizards capture intent before generating any config file |
| Standardise before Scale | devopsos scaffold gha, gitlab, jenkins |
Golden-path templates for GitHub Actions, GitLab CI, and Jenkins — reviewed baselines every team can adopt |
| Standardise the container runtime & Kubernetes env | devopsos scaffold devcontainer |
Encodes Docker/Podman choice and all Kubernetes CLI tools (kubectl, helm, kustomize, argocd_cli, flux, k9s, kind, minikube, kubeseal …) into a reproducible devcontainer.json |
| Automate the Repeatable | devopsos scaffold argocd |
GitOps sync process encoded as an ArgoCD Application + AppProject CR |
| Automate the Repeatable | devopsos scaffold devcontainer |
Developer environment setup encoded as devcontainer.json — reproducible for every team member |
| Observe and Iterate | devopsos scaffold sre |
Prometheus alert rules, Grafana dashboards, and SLO manifests — close the measure-improve feedback loop |
| Culture over Tooling | devopsos MCP server skills |
AI assistants (Claude, ChatGPT) coach engineers on DevOps best practices, not just generate config files |
DevOps-OS ships a built-in process-first command to surface this philosophy
interactively in your terminal.
# Full overview (default)
python -m cli.devopsos process-first
# Section: what Process-First is, 5 core principles + thought leaders
python -m cli.devopsos process-first --section what
# Section: how each principle maps to a DevOps-OS scaffold command
python -m cli.devopsos process-first --section mapping
# Section: AI prompts and reading recommendations for beginners
python -m cli.devopsos process-first --section tips
# Section: best practices for each stage (build/test/iac/deploy/sre/monitoring/security)
python -m cli.devopsos process-first --section best_practicesYou can also invoke the module directly:
python -m cli.process_first
python -m cli.process_first --section best_practices| Option | Default | Description |
|---|---|---|
--section |
all |
what · mapping · tips · best_practices · all |
| Situation | Recommended section |
|---|---|
| New team member onboarding | all — read the full overview first |
| Pre-sprint process alignment meeting | mapping — show how each scaffold encodes a principle |
| Self-study / learning DevOps from scratch | tips — use the AI prompts to go deep |
| Quick philosophy refresh | what — re-read the 5 core principles and thought leaders |
| Setting up a new DevOps pipeline | best_practices — best practices for each stage |
Applying Systems Thinking means defining the right process at each stage of the value stream before selecting or configuring any tool.
| Best Practice | Why it matters |
|---|---|
| Define build standards before choosing tools (Gradle, Maven, Make) | Prevents tool sprawl and ensures consistent outputs |
| Standardise dependency management and enforce version-pinned builds | Reduces "works on my machine" issues |
| Enforce reproducible builds across all teams and environments | Makes builds auditable and roll-backable |
| Use an artifact repository (Nexus) to cache, version, and audit build outputs | Provides a single source of truth for artifacts |
| Best Practice | Why it matters |
|---|---|
| Choose and document your container runtime (Docker or Podman) before writing the first Dockerfile | Consistency prevents environment drift across teams |
| Standardise base images: use pinned, minimal images (distroless or Alpine) updated on a schedule | Reduces attack surface and image size |
Define a container image naming and tagging convention (<registry>/<org>/<service>:<gitsha>) |
Makes every image traceable back to its source commit |
| Scan every image for vulnerabilities in the CI build stage before pushing to a registry | Prevents known CVEs from reaching any environment |
| Use multi-stage Dockerfiles to keep production images free of build-time dependencies | Smaller images, faster pulls, smaller attack surface |
Use devopsos scaffold devcontainer to standardise the container runtime for every team member |
Every engineer starts from the same reproducible baseline |
| Best Practice | Why it matters |
|---|---|
| Define cluster topology and namespace strategy before deploying any workload | Avoids namespace sprawl and permission confusion |
| Use a local cluster (kind or minikube) for development so every engineer can reproduce production-like conditions locally | Reduces "it works on staging but not prod" surprises |
| Manage all manifests through version-controlled Kustomize overlays or Helm charts | No kubectl apply from developer laptops in production |
| Use GitOps (ArgoCD or Flux) as the single deployment path — direct production kubectl changes are prohibited | Every change is auditable, reviewable, and reversible |
| Manage Kubernetes secrets with Sealed Secrets (Kubeseal) | Encrypted secrets can be safely stored in Git |
| Use k9s or Lens for day-two operations and cluster observability | Consistent tooling reduces operator error |
| Best Practice | Why it matters |
|---|---|
| Define quality gates and acceptance criteria before writing tests | Ensures tests validate business intent, not just code paths |
| Automate unit, integration, and end-to-end tests in every pipeline run | Catches regressions before they reach production |
| Enforce code standards with static analysis (SonarQube, Checkstyle, ESLint, Pylint) | Maintains code quality at scale without manual review |
| Fail fast: surface failures early to prevent bad code advancing | Reduces the cost of fixing defects |
| Best Practice | Why it matters |
|---|---|
| Define infrastructure requirements before writing Terraform or Helm code | Prevents over-engineering and unnecessary complexity |
| Version-control every infrastructure definition — no manual changes to production | Makes infrastructure changes auditable and reproducible |
| Use Kustomize overlays for environment-specific config (dev/staging/prod) | Reduces duplication and drift between environments |
| Detect and alert on infrastructure drift regularly | Ensures the running state matches the declared state |
| Best Practice | Why it matters |
|---|---|
| Define deployment runbooks and rollback procedures before the first release | Prevents panicked, undocumented changes during incidents |
| Use GitOps (ArgoCD, Flux) to make deployment intent explicit in Git | Creates a full audit trail for every change |
| Implement blue/green or canary deployments for zero-downtime releases | Reduces blast radius of bad deployments |
| Gate production deployments with automated approvals and smoke tests | Prevents untested code from reaching users |
| Best Practice | Why it matters |
|---|---|
| Define SLOs and SLAs before deploying to production | You cannot measure reliability you haven't defined |
| Use error budgets to balance reliability with feature velocity | Gives teams a data-driven way to decide when to slow down |
| Alert on symptoms (SLO burn rate), not causes (CPU spikes) | Reduces alert fatigue and improves signal quality |
| Establish on-call rotations and incident runbooks before going live | Prevents chaotic, undocumented incident response |
| Best Practice | Why it matters |
|---|---|
| Define golden signals (Rate, Errors, Duration, Saturation) before building dashboards | Ensures dashboards answer real operational questions |
| Instrument applications with standard metrics from day one | Eliminates blind spots in your observability stack |
| Centralise logs, metrics, and traces in one platform (ELK, Prometheus, Grafana) | Enables faster root-cause analysis during incidents |
| Set alerting thresholds based on SLO objectives, not arbitrary values | Prevents noisy, meaningless alerts that erode trust |
| Best Practice | Why it matters |
|---|---|
| Shift security left: enforce scanning in the CI pipeline | Catches vulnerabilities before they reach production |
| Manage Kubernetes secrets with Sealed Secrets (Kubeseal) | Prevents plaintext secrets from being stored in Git |
| Scan container images for vulnerabilities on every build | Ensures base images are kept up to date and safe |
| Enforce least-privilege access controls for all service accounts | Limits the blast radius of a compromised component |
# View best practices in your terminal
python -m cli.devopsos process-first --section best_practicesUse AI tools like Claude, ChatGPT, or Gemini to deepen your understanding of Process-First DevOps. Here are prompts you can try right now:
"Explain the process-first approach to SDLC automation and why defining processes before choosing tools leads to better outcomes."
"What are the key stages of a production-grade CI/CD pipeline? Give me a process checklist before I start writing any pipeline code."
"Walk me through the GitOps process: from code commit to production deployment using ArgoCD. What processes must exist before ArgoCD adds value?"
"What is the SRE process for setting SLOs and error budgets? How do I translate business requirements into Prometheus alert rules?"
"What developer-environment standardisation process should a team follow before adopting Dev Containers?"
Install the DevOps-OS MCP server, connect it to Claude or ChatGPT, and ask:
"Using the DevOps-OS tools, scaffold a process-first CI/CD setup for a Python microservice with GitOps delivery and SRE observability."
See mcp_server/README.md for setup instructions.
| Resource | Where |
|---|---|
| The DevOps Handbook — Gene Kim et al. | Your preferred book retailer |
| Site Reliability Engineering — Google SRE book | sre.google/books |
| Accelerate — Nicole Forsgren et al. | Your preferred book retailer |
| Saravanan Gnanagur on LinkedIn | Search "Saravanan Gnanagur process first" on LinkedIn |
| Topic | Document |
|---|---|
| Full CLI reference | CLI-COMMANDS-REFERENCE.md |
| Getting started (first pipeline in 5 min) | GETTING-STARTED.md |
| GitHub Actions deep dive | GITHUB-ACTIONS-README.md |
| GitLab CI deep dive | GITLAB-CI-README.md |
| Jenkins deep dive | JENKINS-PIPELINE-README.md |
| ArgoCD / Flux deep dive | ARGOCD-README.md |
| SRE configuration deep dive | SRE-CONFIGURATION-README.md |
| MCP server (AI integration) | ../mcp_server/README.md |