
Commit 8add796

Author: usrbinkat
Commit message: working deploy and destroy
1 parent d24aab4 · commit 8add796

File tree: 7 files changed, +665 −8 lines changed

Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
# Resolving Kubernetes Cluster State Issues with Pulumi

## Introduction

This document provides a step-by-step guide to resolving issues when Pulumi fails to destroy unreachable Kubernetes resources. It follows the Konductor project’s documentation standards and aligns with Pulumi Python development best practices.

This guide is designed to assist developers in handling errors related to `PULUMI_K8S_DELETE_UNREACHABLE` and Kubernetes provider misconfigurations. By following these steps, you can ensure smooth and complete infrastructure teardown, avoiding unnecessary cloud costs.

## Troubleshooting Steps

### 1. Identify the Problematic Resource

To identify Kubernetes resources causing destroy failures:

1. Export the Pulumi stack state:

```bash
pulumi stack export > stack.json
```

2. Use `jq` to filter Kubernetes resources:

```bash
jq '.deployment.resources[] | select(.type == "kubernetes:core/v1:Pod") | .urn' stack.json
```

**Example Output:**

```
"urn:pulumi:navteca-aws-credentials-config-smce::konductor::kubernetes:core/v1:Pod::nginx-test-test-eks-cluster"
```
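
For larger stacks, the same inspection can be scripted. Below is a minimal Python sketch, assuming the `stack.json` layout produced by `pulumi stack export` above; it prints the URN of every resource managed by the Kubernetes provider, not just Pods:

```python
import json

# Load the state exported with `pulumi stack export > stack.json`.
with open("stack.json") as f:
    state = json.load(f)

# Print the URN of every resource whose type comes from the Kubernetes provider.
for resource in state["deployment"]["resources"]:
    if resource.get("type", "").startswith("kubernetes:"):
        print(resource["urn"])
```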

### 2. Use `PULUMI_K8S_DELETE_UNREACHABLE`

Set the environment variable to allow deletion of unreachable resources:

1. Export the variable:

```bash
export PULUMI_K8S_DELETE_UNREACHABLE=true
```

2. Attempt to destroy the resource:

```bash
pulumi destroy --target="<resource-urn>" --refresh --skip-preview
```

If the operation fails, proceed to manually remove the resource from the state.

### 3. Manually Remove the Resource

1. Delete the resource directly from the Pulumi state:

```bash
pulumi state delete "<resource-urn>"
```

**Example:**

```bash
pulumi state delete "urn:pulumi:navteca-aws-credentials-config-smce::konductor::kubernetes:core/v1:Pod::nginx-test-test-eks-cluster"
```

2. Retry the destroy operation for the entire stack:

```bash
pulumi destroy --skip-preview --refresh --continue-on-error
```

### 4. Validate and Clean Up

1. Ensure all resources have been deleted by reviewing the destroy output.
2. Remove the Pulumi stack if no longer needed:

```bash
pulumi stack rm <stack-name>
```

## Best Practices

### Handling Kubernetes Resources

- **Use `PULUMI_K8S_DELETE_UNREACHABLE` Proactively:** Set the environment variable when working with ephemeral or potentially inaccessible clusters.
- **Monitor Resource Dependencies:** Use `depends_on` in Pulumi to manage resource creation and deletion order (see the sketch after this list).
- **Backup Pulumi State:** Export the stack state regularly:

```bash
pulumi stack export > backup-<date>.json
```
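
As an illustration of the `depends_on` pattern, here is a minimal, hypothetical sketch (the resource names are examples, not part of this project): the ConfigMap is created only after the Namespace exists, and is destroyed before it.

```python
import pulumi
import pulumi_kubernetes as k8s

ns = k8s.core.v1.Namespace("app-ns", metadata={"name": "app"})

# depends_on makes the ordering explicit: the ConfigMap is created after the
# Namespace and destroyed before it during `pulumi destroy`.
cm = k8s.core.v1.ConfigMap(
    "app-config",
    metadata={"name": "app-config", "namespace": "app"},
    data={"key": "value"},
    opts=pulumi.ResourceOptions(depends_on=[ns]),
)
```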

### Configurations

- **Namespace Management:** Explicitly set namespaces for Kubernetes resources to avoid conflicts (see the sketch after this list).
- **Pulumi Provider Configuration:** Ensure the Kubernetes provider is properly configured with an accessible kubeconfig file:

```python
import pulumi_kubernetes

# Point the provider at an explicit, reachable kubeconfig.
k8s_provider = pulumi_kubernetes.Provider(
    "k8s-provider",
    kubeconfig="~/.kube/config"
)
```
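
Continuing from the provider definition above, resources can then be pinned to that provider and given an explicit namespace. This is a hypothetical example, not part of the project code:

```python
import pulumi

# Hypothetical resource: pinned to the configured provider and an explicit namespace.
app_secret = pulumi_kubernetes.core.v1.Secret(
    "app-secret",
    metadata={"name": "app-secret", "namespace": "my-namespace"},
    string_data={"token": "example"},
    opts=pulumi.ResourceOptions(provider=k8s_provider),
)
```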

### Debugging

- **Use Pulumi’s debug logging to analyze failures:**

```bash
pulumi destroy --debug --skip-preview
```

## Common Errors and Solutions

| Error Message | Cause | Solution |
|---------------|-------|----------|
| configured Kubernetes cluster is unreachable | Cluster is deleted or unreachable. | Use `PULUMI_K8S_DELETE_UNREACHABLE` or manually remove the resource. |
| failed to read resource state due to unreachable API | Pulumi cannot reconcile the resource state. | Manually delete the resource using `pulumi state delete`. |
| preview failed: unable to load schema information | Invalid or missing Kubernetes provider configuration. | Check and update the kubeconfig path in the Pulumi provider configuration. |
| error: update failed | Failed to delete a dependent resource. | Identify and remove dependencies manually from the Pulumi state. |

## Advanced Techniques

### Recreate a Dummy Cluster

If the original cluster has been deleted but its resources remain in the Pulumi state:

1. Recreate a temporary cluster with the same name and configuration.
2. Retry the destroy operation to clean up the resources.

modules/aws/eks.py

Lines changed: 11 additions & 8 deletions
```diff
@@ -297,13 +297,16 @@ def deploy_test_nginx(self, k8s_provider: k8s.Provider, name: str) -> k8s.core.v
                 f"nginx-test-{name}",
                 metadata={"name": f"nginx-test-{name}", "namespace": "default"},
                 spec={"containers": [{"name": "nginx", "image": "nginx:latest", "ports": [{"containerPort": 80}]}]},
-                opts=ResourceOptions(provider=k8s_provider),
+                opts=ResourceOptions(
+                    provider=k8s_provider,
+                    depends_on=[k8s_provider],
+                    custom_timeouts={"create": "5m", "delete": "5m"}),
             )
             return nginx_pod
 
         except Exception as e:
             log.error(f"Failed to deploy test nginx pod: {str(e)}")
-            raise
+            return None
 
     def deploy_cluster(
         self,
@@ -329,6 +332,7 @@ def deploy_cluster(
 
         # Create EKS cluster
         cluster = self.create_cluster(name=name, subnet_ids=subnet_ids, cluster_role=cluster_role, version=version)
+        cluster_name = cluster.name.apply(lambda n: n)
 
         # Create node group
         node_group = self.create_node_group(
@@ -386,7 +390,10 @@ def deploy_cluster(
         # Get cluster auth token using Pulumi's built-in AWS provider
         try:
             cluster_token = aws.eks.get_cluster_auth(
-                name=cluster.name, opts=pulumi.InvokeOptions(provider=self.provider.provider)
+                name=cluster_name, opts=pulumi.InvokeOptions(
+                    provider=self.provider.provider,
+                    depends_on=[cluster]
+                )
             )
         except Exception as e:
             log.error(f"Failed to get cluster auth token: {str(e)}")
@@ -416,11 +423,7 @@ def deploy_cluster(
         )
 
         # Export both kubeconfigs with descriptive names
-        pulumi.export("eks_kubeconfig_external", external_kubeconfig)
-        pulumi.export("eks_kubeconfig_internal", internal_kubeconfig)
-
-        # Export the k8s provider as a pulumi stack output secret for use by other Pulumi stacks
-        pulumi.export("k8s_provider", k8s_provider)
+        pulumi.export("eks_kubeconfig_external", pulumi.Output.secret(external_kubeconfig))
 
         # Deploy test nginx pod
         self.deploy_test_nginx(k8s_provider, name)
```

modules/kubernetes/__init__.py

Whitespace-only changes.
Lines changed: 198 additions & 0 deletions
@@ -0,0 +1,198 @@
# Cert Manager Module Guide

# TODO: Convert from Kargo to generalized Konductor Framework Template repo docs content

Welcome to the **Cert Manager Module** for the Konductor IaC Framework! This guide is tailored for both newcomers to DevOps and experienced developers, providing a comprehensive overview of how to deploy and configure the Cert Manager module within the Kargo platform.

---

## Table of Contents

- [Introduction](#introduction)
- [Why Use Cert Manager?](#why-use-cert-manager)
- [Getting Started](#getting-started)
- [Enabling the Module](#enabling-the-module)
- [Configuration Options](#configuration-options)
  - [Default Settings](#default-settings)
  - [Customizing Your Deployment](#customizing-your-deployment)
- [Module Components Explained](#module-components-explained)
  - [Namespace Creation](#namespace-creation)
  - [Helm Chart Deployment](#helm-chart-deployment)
  - [Self-Signed Cluster Issuer Setup](#self-signed-cluster-issuer-setup)
- [Using the Module](#using-the-module)
  - [Example Usage](#example-usage)
- [Troubleshooting and FAQs](#troubleshooting-and-faqs)
- [Additional Resources](#additional-resources)

---

## Introduction

The Cert Manager module automates the management of SSL/TLS certificates in your Kubernetes cluster using [cert-manager](https://cert-manager.io/). It simplifies the process of obtaining, renewing, and managing certificates, enhancing the security of your applications without manual intervention.

---

## Why Use Cert Manager?

- **Automation**: Automatically provisions and renews certificates.
- **Integration**: Works seamlessly with Kubernetes Ingress resources and other services.
- **Security**: Enhances security by ensuring certificates are always up to date.
- **Compliance**: Helps meet compliance requirements by managing PKI effectively.

---

## Getting Started

### Prerequisites

- **Kubernetes Cluster**: Ensure you have access to a Kubernetes cluster.
- **Pulumi CLI**: Install and configure the Pulumi CLI.
- **Kubeconfig**: Your kubeconfig file should be properly set up.

### Setup Steps

1. **Navigate to the Kargo Pulumi Directory**:

```bash
cd Kargo/pulumi
```

2. **Install Dependencies**:

```bash
pip install -r requirements.txt
```

3. **Initialize Pulumi Stack**:

```bash
pulumi stack init dev
```

---

## Enabling the Module

The Cert Manager module is enabled by default. To verify or modify its enabled status, adjust your Pulumi configuration.

### Verifying Module Enablement

```yaml
# Pulumi.<stack-name>.yaml

config:
  cert_manager:
    enabled: true # Set to false to disable
```

Alternatively, use the Pulumi CLI:

```bash
pulumi config set --path cert_manager.enabled true
```
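
For reference, a module can read this setting back in Python. The following is a minimal sketch assuming the `cert_manager` config key shown above; the actual Konductor config-loading code may differ:

```python
import pulumi

# Read the `cert_manager` config block from the active stack; default to enabled.
config = pulumi.Config()
cert_manager_cfg = config.get_object("cert_manager") or {}
enabled = cert_manager_cfg.get("enabled", True)
```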

---

## Configuration Options

### Default Settings

The module is designed to work out of the box with default settings:

- **Namespace**: `cert-manager`
- **Version**: Defined in `default_versions.json`
- **Cluster Issuer Name**: `cluster-selfsigned-issuer`
- **Install CRDs**: `true`

### Customizing Your Deployment

You can tailor the module to fit your specific needs by customizing its configuration.

#### Available Configuration Parameters

- **enabled** *(bool)*: Enable or disable the module.
- **namespace** *(string)*: Kubernetes namespace for cert-manager.
- **version** *(string)*: Helm chart version to deploy. Use `'latest'` to fetch the most recent stable version.
- **cluster_issuer** *(string)*: Name of the ClusterIssuer resource.
- **install_crds** *(bool)*: Whether to install Custom Resource Definitions.

#### Example Custom Configuration

```yaml
config:
  cert_manager:
    enabled: true
    namespace: "my-cert-manager"
    version: "1.15.3"
    cluster_issuer: "my-cluster-issuer"
    install_crds: true
```

---

## Module Components Explained

### Namespace Creation

A dedicated namespace is created to isolate cert-manager resources.

- **Why?**: Ensures better organization and avoids conflicts.
- **Customizable**: Change the namespace using the `namespace` parameter.

### Helm Chart Deployment

Deploys cert-manager using Helm.

- **Chart Repository**: `https://charts.jetstack.io`
- **Version Management**: Specify a version or use `'latest'`.
- **Custom Values**: Resource requests and limits are set for optimal performance.
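
A deployment like this can be expressed with Pulumi's Helm support. The following is a minimal sketch using the chart repository and defaults listed above, not the module's exact internals:

```python
import pulumi_kubernetes as k8s

# Deploy cert-manager from the Jetstack chart repository into its own namespace.
cert_manager = k8s.helm.v3.Release(
    "cert-manager",
    chart="cert-manager",
    namespace="cert-manager",
    create_namespace=True,
    version="1.15.3",  # or resolve 'latest' to a concrete chart version
    repository_opts=k8s.helm.v3.RepositoryOptsArgs(repo="https://charts.jetstack.io"),
    values={"installCRDs": True},
)
```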

### Self-Signed Cluster Issuer Setup

Sets up a self-signed ClusterIssuer for certificate provisioning.

- **Root ClusterIssuer**: Creates a root issuer.
- **CA Certificate**: Generates a CA certificate stored in a Kubernetes Secret.
- **Primary ClusterIssuer**: Issues certificates for your applications using the CA certificate.
- **Exported Values**: CA certificate data is exported for use in other modules.
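
The bootstrap chain is plain cert-manager configuration. As a hedged sketch, the root self-signed issuer could be created like this (the name matches the default above; the module's actual resources may differ):

```python
import pulumi_kubernetes as k8s

# Root self-signed ClusterIssuer; requires the cert-manager CRDs to be installed.
selfsigned_issuer = k8s.apiextensions.CustomResource(
    "cluster-selfsigned-issuer",
    api_version="cert-manager.io/v1",
    kind="ClusterIssuer",
    metadata={"name": "cluster-selfsigned-issuer"},
    spec={"selfSigned": {}},
)
```

The CA Certificate and the primary CA-backed ClusterIssuer follow the same `CustomResource` pattern.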

---

## Using the Module

### Example Usage

After enabling and configuring the module, deploy it using Pulumi:

```bash
pulumi up
```

---

## Troubleshooting and FAQs

**Q1: Cert-manager pods are not running.**

- **A**: Check the namespace and ensure that CRDs are installed. Verify Kubernetes version compatibility.

**Q2: Certificates are not being issued.**

- **A**: Ensure that the ClusterIssuer is correctly configured and that your Ingress resources reference it.

**Q3: How do I update cert-manager to a newer version?**

- **A**: Update the `version` parameter in your configuration and run `pulumi up`.

---

## Additional Resources

- **cert-manager Documentation**: [cert-manager.io/docs](https://cert-manager.io/docs/)
- **Kargo Project**: [Kargo GitHub Repository](https://github.com/ContainerCraft/Kargo)
- **Pulumi Kubernetes Provider**: [Pulumi Kubernetes Docs](https://www.pulumi.com/docs/reference/pkg/kubernetes/)
- **Helm Charts Repository**: [Artifact Hub - cert-manager](https://artifacthub.io/packages/helm/cert-manager/cert-manager)

modules/kubernetes/cert_manager/__init__.py

Whitespace-only changes.
