[ARCHITECTURE] Storage persistence prevents multi-replica (HA) deployments without RWX volumes #57

@bernardgut

Problem

The current chart uses Deployment + PVC for the OpenCloud pod, creating two PersistentVolumeClaims (config and data) that default to ReadWriteOnce access mode. This architecture makes multi-replica (HA) deployments impossible without ReadWriteMany (RWX) volumes.

While the values.yaml comments document this limitation:

# Number of replicas (Note: When using multiple replicas, persistence should be disabled
# or use a storage class that supports ReadWriteMany access mode)
replicas: 1

...there is no mechanism in the chart to properly decouple these concerns. The only options today are:

  1. Single replica (no HA) — works with RWO ✅
  2. Multiple replicas + RWX volumes — requires NFS or similar (antipattern for production K8s) ⚠️
  3. Multiple replicas + persistence disabled — loses config and metadata across restarts ❌

None of these options provide a production-grade HA deployment.

Root Cause Analysis

The two PVCs serve fundamentally different purposes that should be handled differently:

1. Config PVC (/etc/opencloud) — Can be externalized

The config volume stores:

  • Auto-generated secrets: JWT secret, WOPI secret, machine auth API key, transfer secret, URL signing secret, system user API key
  • Service configuration: Generated by opencloud init on startup
  • Web UI themes and customizations

These secrets are generated randomly on first startup. When config persistence is disabled and multiple replicas exist, each pod generates different secrets — causing authentication failures between services (e.g., the Collaboration pod cannot validate JWT tokens from the OpenCloud pod).

Current workaround: Inject secrets via environment variables (OC_JWT_SECRET, OC_MACHINE_AUTH_API_KEY, etc.) and disable config persistence. This works but is undocumented and requires creating all internal secrets manually.
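
A minimal sketch of this workaround, showing only the two variables named above; the remaining internal secrets would follow the same pattern. The Secret name `opencloud-internal-secrets` and its keys are hypothetical, and the values are placeholders.

```yaml
# Hypothetical Secret holding the shared internal secrets
# (name and key names are illustrative, not defined by the chart).
apiVersion: v1
kind: Secret
metadata:
  name: opencloud-internal-secrets
type: Opaque
stringData:
  jwt-secret: "replace-with-a-random-string"
  machine-auth-api-key: "replace-with-a-random-string"
---
# Fragment of the OpenCloud container spec (not a complete Deployment).
# Every replica reads the same values, so tokens issued by one pod
# can be validated by another.
env:
  - name: OC_JWT_SECRET
    valueFrom:
      secretKeyRef:
        name: opencloud-internal-secrets
        key: jwt-secret
  - name: OC_MACHINE_AUTH_API_KEY
    valueFrom:
      secretKeyRef:
        name: opencloud-internal-secrets
        key: machine-auth-api-key
```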

Proposed fix:

  • Support injecting all init secrets via Kubernetes Secrets / env vars in the chart
  • Make opencloud init --force-overwrite use ENV-provided secrets instead of generating new ones
  • This fully decouples config from the filesystem — no PVC needed

2. Data PVC (/var/lib/opencloud) — Application-level limitation

The data volume stores S3ng metadata (file metadata, sharing data, user settings) even when blobs are stored on S3. This is the same issue documented in owncloud/ocis-charts#489, where @wkloucek confirmed:

"storageusers, storagesystem need RWX volumes because this is where the metadata is stored for the S3ng storage driver"

"Maybe there's gonna be another storage driver in the future that can live without RWX for storing metadata and leveraging one of those two key-value-stores..."

This is tracked upstream as owncloud/ocis#4594. Until the OpenCloud application supports storing metadata in a key-value store (Redis/NATS), this remains an application-level requirement.

However, the chart can still improve the situation by clearly separating the concerns:

  • Config → externalized via ENV/Secrets (no PVC needed)
  • Data (metadata) → clearly documented as requiring RWX if using multiple replicas with S3ng
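
For the data volume, the RWX requirement could look roughly like the claim below. This is an illustrative sketch: the claim name, storage class, and size are assumptions, and the storage class must be backed by a provisioner that actually supports ReadWriteMany.

```yaml
# Illustrative PVC for the data volume when running several replicas with S3ng.
# Name, storage class, and size are hypothetical.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: opencloud-data
spec:
  accessModes:
    - ReadWriteMany          # required so all replicas can mount the metadata volume
  storageClassName: my-rwx-class
  resources:
    requests:
      storage: 10Gi
```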

Impact on Services

The chart's monolithic OpenCloud deployment bundles multiple internal services that have different storage requirements:

| Data Category | Current Storage | Can Be Externalized? | Proposed Solution |
|---|---|---|---|
| Init secrets (JWT, WOPI, etc.) | Config PVC | ✅ Yes | ENV vars from K8s Secrets |
| Service registry | Config PVC | ✅ Yes | NATS/Redis (already supported) |
| Cache (thumbnails, etc.) | Config PVC / memory | ✅ Yes | Redis Sentinel |
| Persistent store (notifications, activity) | Config PVC / NATS | ✅ Yes | Redis Sentinel / NATS JetStream |
| S3ng file metadata | Data PVC | ❌ Not yet | Needs upstream app changes |
| Sharing data / user settings | Data PVC | ❌ Not yet | Needs upstream app changes |

Prior Art

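This issue was first raised in the original (now archived) ownCloud chart as owncloud/ocis-charts#489, and the underlying application limitation is tracked as owncloud/ocis#4594.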

Proposed Changes (Chart-Level)

I propose the following changes, which can be made at the chart level to improve the HA story without requiring upstream application changes:

PR 1: Externalize init secrets (Config PVC decoupling)

  • Add support for injecting all internal secrets via existingSecret references
  • Add ENV var injection for JWT, WOPI, machine-auth, transfer, URL-signing secrets
  • Allow config persistence to be disabled without breaking multi-replica setups
  • Document the required secrets and their purpose
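
One possible shape for the values this PR would introduce is sketched below. None of these keys exist in the chart today; the key names, their nesting, and the Secret name are all hypothetical and only illustrate the `existingSecret` idea.

```yaml
# Hypothetical values.yaml shape for PR 1 (illustrative only).
opencloud:
  replicas: 2
  persistence:
    config:
      enabled: false   # no config PVC once secrets come from the environment
  internalSecrets:
    # Pre-created Kubernetes Secret that holds all init secrets
    existingSecret: opencloud-internal-secrets
    # Mapping from each internal secret to a key inside that Secret
    keys:
      jwtSecret: jwt-secret
      machineAuthApiKey: machine-auth-api-key
```

Referencing a pre-created Secret rather than putting values inline keeps the secrets out of the Helm release and lets every replica, as well as the Collaboration pod, read the same values.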

PR 2: Redis Sentinel integration for cache and persistent stores

  • Add values for configuring external Redis Sentinel as the cache and persistent store backend
  • Inject OC_CACHE_STORE, OC_PERSISTENT_STORE, OC_CACHE_STORE_NODES, etc. via ENV
  • This moves thumbnails, notifications, activity logs, and other ephemeral/persistent data off the filesystem
  • Reference: OpenCloud cache/store documentation
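
As a rough sketch, the injected environment might look like the fragment below. The variable names come from the list above; the `redis-sentinel` store type, the `host:port/master-name` node format, and the host name are assumptions that should be checked against the OpenCloud cache/store documentation.

```yaml
# Container env fragment (not a complete manifest).
# Store type and node format are assumptions, not confirmed chart behavior.
env:
  - name: OC_CACHE_STORE
    value: "redis-sentinel"
  - name: OC_CACHE_STORE_NODES
    # hypothetical Sentinel service address and master name
    value: "redis-sentinel.redis.svc:26379/mymaster"
  - name: OC_PERSISTENT_STORE
    value: "redis-sentinel"
```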

Remaining limitation (requires upstream app changes)

  • S3ng metadata still requires a shared filesystem (RWX PVC) for multi-replica deployments
  • This should be tracked upstream as a request for a key-value metadata storage driver

Environment

  • Chart version: 2.0.1 (Tim-herbie/opencloud-helm)
  • OpenCloud version: 5.2.0
  • Storage driver: decomposeds3 (S3ng)
  • Tested with: external NATS, external PostgreSQL (CNPG), external S3

Labels: enhancement (New feature or request)