This guide covers security hardening for Countly Helm deployments in regulated environments (healthcare, financial services, government).
All five charts ship with `networkPolicy.enabled: false` by default. Enable them in production:

```yaml
# In each chart's environment file
networkPolicy:
  enabled: true
  allowedNamespaces:
    - countly
    - kafka
    - clickhouse
    - mongodb
    - observability
  monitoring:
    namespace: observability
```

Network policies restrict pod-to-pod communication to only the namespaces that need it. Without them, any pod in the cluster can reach your databases.
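After the next deploy, confirm the policies were rendered and applied in each namespace:

```bash
kubectl get networkpolicy --all-namespaces
```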
- Use TLS in production (`global.tls: letsencrypt` or `global.tls: provided`)
- The `none` (HTTP) profile should only be used for local development
- For internal-only deployments, consider `selfSigned` with your own CA
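As a minimal sketch, the profile is selected through a single values key; the file path here is illustrative, so place it wherever your environment sets globals:

```yaml
# environments/my-deployment/globals.yaml (illustrative path)
global:
  tls: letsencrypt   # or: provided, selfSigned, none
```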
Encryption-in-transit coverage by path:

| Path | Default | Hardened |
|---|---|---|
| Client to Ingress | Depends on `global.tls` | `letsencrypt` or `provided` |
| Ingress to Countly pods | HTTP (in-cluster) | Enable NGINX backend TLS if required |
| Countly to MongoDB | Plaintext | Set `mongodb.tls.enabled: true` in mongodb.yaml |
| Countly to ClickHouse | Plaintext | Configure ClickHouse TLS via operator settings |
| Countly to Kafka | Plaintext | Configure Strimzi TLS listeners |
| Observability collectors | HTTP | Configure mTLS on Alloy endpoints |
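For example, the MongoDB hop can be encrypted with the key named in the table above; the surrounding structure is a sketch and may differ in your chart version:

```yaml
# environments/my-deployment/mongodb.yaml
mongodb:
  tls:
    enabled: true
```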
Storage encryption depends on your Kubernetes cluster's StorageClass:

- AWS EKS: Use a `gp3` StorageClass with EBS encryption enabled (the default in most configurations); see the sketch after this list
- GKE: Uses Google-managed encryption by default; enable CMEK for customer-managed keys
- Azure AKS: Uses Azure Disk Encryption by default; enable SSE with customer-managed keys
- Self-managed: Configure your CSI driver to use LUKS or dm-crypt
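As referenced in the EKS bullet above, a sketch of an encrypted gp3 StorageClass (the name and the optional KMS key are assumptions):

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: encrypted-gp3
provisioner: ebs.csi.aws.com
parameters:
  type: gp3
  encrypted: "true"
  # kmsKeyId: arn:aws:kms:...   # optional: customer-managed key (assumption)
volumeBindingMode: WaitForFirstConsumer
```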
Set `global.storageClass` to an encryption-enabled StorageClass:

```yaml
global:
  storageClass: encrypted-gp3
```

For regulated environments, avoid storing secrets as plain values:
| Method | Compliance Level | Setup |
|---|---|---|
| `values` (default) | Development only | Secrets in gitignored YAML files |
| `existingSecret` | Acceptable | Pre-create K8s Secrets via your secrets pipeline |
| `externalSecret` | Recommended | External Secrets Operator + AWS Secrets Manager / Vault / GCP Secret Manager |
See SECRET-MANAGEMENT.md for setup instructions.
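As a sketch of the recommended path, an ExternalSecret that syncs a MongoDB password from AWS Secrets Manager; the store name, secret names, and key paths are all assumptions, and SECRET-MANAGEMENT.md has the authoritative setup:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: countly-mongodb-credentials    # assumed name
  namespace: countly
spec:
  refreshInterval: 1h
  secretStoreRef:
    name: aws-secrets-manager          # assumed ClusterSecretStore
    kind: ClusterSecretStore
  target:
    name: countly-mongodb-credentials  # Kubernetes Secret to create
  data:
    - secretKey: password
      remoteRef:
        key: prod/countly/mongodb      # assumed path in AWS Secrets Manager
        property: password
```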
Change `secrets.rotationId` to a new value to trigger secret rotation on the next deploy. This recreates all secrets without changing passwords (the lookup-or-create pattern preserves existing values).
To rotate actual passwords:
- Update passwords in your secret source (Vault, AWS SM, etc.)
- Bump `secrets.rotationId`
- Run `helmfile apply`
- Restart affected pods
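A sketch of that flow with Vault as the source (the secret path and environment name are assumptions):

```bash
# 1. Rotate the password at the source
vault kv put secret/countly/mongodb password="$(openssl rand -base64 32)"
# 2. Bump secrets.rotationId in your environment values, then redeploy
helmfile -e my-deployment apply
# 3. Restart consumers so they pick up the new credentials
kubectl -n countly rollout restart deployment
```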
The observability chart's `alloy-otlp` deployment runs with restricted security contexts:
```yaml
securityContext:
  runAsNonRoot: true
  allowPrivilegeEscalation: false
  readOnlyRootFilesystem: true
  capabilities:
    drop: [ALL]
  seccompProfile:
    type: RuntimeDefault
```

The alloy DaemonSet (log collector) requires elevated privileges (SYS_PTRACE, root) to read container logs from host paths. This is expected. If log collection is not needed, disable it by setting `global.observability: disabled` or `global.observability: external`.
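To confirm the restricted context is in effect (the deployment and namespace names are assumptions; adjust to your release):

```bash
kubectl -n observability get deploy alloy-otlp \
  -o jsonpath='{.spec.template.spec.containers[0].securityContext}{"\n"}'
```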
The production sizing profile enables PDBs for:
- All Countly components (api, frontend, ingestor, aggregator)
- ClickHouse server and keeper
- MongoDB replica set
Verify PDBs are active: `kubectl get pdb --all-namespaces`
The production profile uses preferred anti-affinity by default. For stricter guarantees (e.g., pods MUST be on separate nodes), override in your environment:
```yaml
# environments/my-deployment/countly.yaml
api:
  scheduling:
    antiAffinity:
      type: required
```

Countly maintains internal audit logs. Ensure the aggregator and API components have sufficient resources to avoid dropped events.
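A sketch of such an override, assuming the chart exposes a per-component resources block in the usual Helm convention; the numbers are illustrative, not sizing guidance:

```yaml
# environments/my-deployment/countly.yaml
aggregator:
  resources:
    requests:
      cpu: "1"       # illustrative values only
      memory: 2Gi
    limits:
      memory: 4Gi
```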
Use `global.observability: full` to deploy the complete monitoring stack. Key dashboards:
- Overview: Cluster health, pod status, resource utilization
- Platform: Node metrics, network I/O, disk pressure
- Data: ClickHouse query performance, Kafka consumer lag
- Countly: Application-level metrics, request rates, error rates
Configure retention periods based on your compliance requirements:
```yaml
# In observability.yaml
prometheus:
  retention:
    time: "90d"       # Metrics retention
loki:
  retention: "90d"    # Log retention
tempo:
  retention: "336h"   # Trace retention (Go duration format, no 'd')
```

The table below summarizes what to back up and how:

| Component | Data | Method |
|---|---|---|
| MongoDB | Application data, user accounts | `mongodump` or volume snapshots |
| ClickHouse | Analytics/drill data | ClickHouse backup tool or volume snapshots |
| Kafka | Event stream (transient) | Usually not backed up; replay from source |
| Helm releases | Release state | `helm get all` or GitOps (helmfile in git) |
| Secrets | Credentials | External secret store (Vault, AWS SM) |
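As an example of the MongoDB row, a logical dump run inside the replica set pod (the pod name, connection URI, and output path are assumptions):

```bash
kubectl -n mongodb exec countly-mongodb-0 -- \
  mongodump --uri="mongodb://localhost:27017" \
  --archive=/tmp/countly-$(date +%Y%m%d).dump
```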
If your StorageClass supports VolumeSnapshot:
```bash
# Create snapshot of MongoDB PVC
kubectl apply -f - <<EOF
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: mongodb-backup-$(date +%Y%m%d)
  namespace: mongodb
spec:
  source:
    persistentVolumeClaimName: data-volume-countly-mongodb-0
EOF
```

To recover:

- Restore PVCs from snapshots or backups
- Deploy with `helmfile apply` (same environment config)
- Verify data integrity with `./scripts/smoke-test.sh`
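For the first step, a PVC can be created from the snapshot before redeploying; this sketch assumes the snapshot name from the example above and the `encrypted-gp3` StorageClass:

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-volume-countly-mongodb-0   # must match the name the StatefulSet expects
  namespace: mongodb
spec:
  storageClassName: encrypted-gp3       # assumption: your encrypted StorageClass
  dataSource:
    name: mongodb-backup-20240101       # the VolumeSnapshot created above
    kind: VolumeSnapshot
    apiGroup: snapshot.storage.k8s.io
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 50Gi                     # assumption: match the original PVC size
```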
Before upgrading:

- Back up all databases (MongoDB, ClickHouse)
- Review CHANGELOG.md for breaking changes
- Test upgrade in a staging environment first
- Ensure PDBs are healthy: `kubectl get pdb --all-namespaces`
- Verify sufficient cluster capacity for rolling updates

Then apply:

```bash
helmfile -e my-deployment apply
```

Helmfile handles dependency ordering. Each chart waits for health checks before proceeding to the next.
To roll back a single chart:

```bash
helm rollback <release-name> <revision> -n <namespace>
```

Or roll back all charts:

```bash
helmfile -e my-deployment apply   # with the previous git commit checked out
```

All Helm charts published to ghcr.io/countly are signed and attested at build time:
| Control | Implementation |
|---|---|
| Artifact signing | Cosign keyless (Sigstore OIDC) — identity bound to GitHub Actions workflow |
| SBOM | CycloneDX JSON generated by Syft, attached to each OCI artifact |
| Provenance | SLSA provenance via GitHub Artifact Attestation API |
| Transparency | All signatures logged in the Sigstore Rekor transparency log |
Consumers can verify chart authenticity before deployment. See VERIFICATION.md for step-by-step instructions including:
- `cosign verify` for signature verification
- `cosign download sbom` for SBOM inspection and vulnerability scanning
- `gh attestation verify` for SLSA provenance auditing
- Kyverno/Gatekeeper policy examples for admission-time enforcement
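For instance, a keyless signature check might look like this; the chart reference and identity pattern are assumptions, and VERIFICATION.md has the exact values:

```bash
cosign verify ghcr.io/countly/countly:1.2.3 \
  --certificate-oidc-issuer=https://token.actions.githubusercontent.com \
  --certificate-identity-regexp='^https://github\.com/Countly/'
```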
Common regulatory requirements map to the controls above as follows:

| Requirement | How Addressed |
|---|---|
| Encryption in transit | TLS profiles (`letsencrypt`, `provided`) |
| Encryption at rest | StorageClass with encryption |
| Access control | NetworkPolicy, RBAC (operator-managed) |
| Secret management | External Secrets Operator integration |
| Audit logging | Application audit trail, observability stack |
| High availability | Production sizing profile (PDBs, anti-affinity, multi-replica) |
| Backup/recovery | Volume snapshots, database dump tools |
| Monitoring | Full observability stack (metrics, logs, traces, profiling) |
| Vulnerability scanning | CI/CD integration (add Trivy/Snyk to your pipeline) |
| Supply chain integrity | Cosign keyless signing, SLSA provenance, CycloneDX SBOM |
| Change management | GitOps via helmfile, CHANGELOG.md, release-gated OCI publishing |