feat(clone): this allows you to use the dotcms image as an init container and clone another dotCMS env #34443
Conversation
…iner and clone another dotCMS env ref: #34442
Remove the unconditional `exit 0` so the entrypoint continues to source
startup scripts, clarify the import script’s exit-13 success path, and
install `libarchive-tools` to support asset unpacking during imports.
ref: #34442
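The exit-13 convention described above can be sketched as follows. This is a hedged illustration of the pattern only, not the actual entrypoint code; `run_import` is a stand-in for `10-import-env.sh`:

```bash
# Illustrative sketch: an entrypoint treating the import script's exit code 13
# as "import finished successfully" so an init container completes cleanly.
# run_import is a stand-in, not the real script.
run_import() {
  return 13   # the real 10-import-env.sh exits 13 after a successful import
}

rc=0
run_import || rc=$?   # capture the script's exit status
if [ "$rc" -eq 13 ]; then
  echo "import completed successfully; init container exits cleanly"
  rc=0
fi
echo "final rc=$rc"
```

Any other nonzero status would propagate as a failure, so the init container still fails fast on real errors.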
I am fine with this. The lack of a Helm chart to help with the templating of this init container makes integrating it a pain, but it is fine for starting with a few instances. There are a couple of concerns that may need addressing and clarifying, taking into account how it would effectively work when there is more than one replica in the StatefulSet; it is more of an issue when running an upgrade where pods are being replaced and we still have active connections.

PR #34443 Review - dotCMS Environment Cloning Feature

Summary

This PR adds environment cloning functionality using an init container pattern. The implementation works for initial deployment scenarios with EFS shared storage and OrderedReady pod management, but has critical limitations that should be addressed before production use.

Overall Assessment

✅ What Works (With EFS + OrderedReady)
🚨 Critical Issues (Must Fix)

1. Lock File Race Condition - Fundamental Flaw

Problem: The lock file mechanism has a race condition. It only "works" because OrderedReady prevents concurrent execution, not because the lock is correct.

Current code:

```bash
# Check if lock exists
if [ -f "$IMPORT_IN_PROCESS" ]; then
    # ... check lock age ...
fi

# Create lock (NOT ATOMIC with check above)
mkdir -p "$IMPORT_DATA_DIR" && touch "$IMPORT_IN_PROCESS"
```
Why It's Flawed: The existence check and the lock creation are two separate, non-atomic steps, so two pods can both pass the check before either has created the lock file.

Impact: If pods start concurrently (e.g. without OrderedReady), multiple imports can run against the same database at once.

Recommendation: Use atomic directory creation:

```bash
LOCK_DIR="$IMPORT_DATA_DIR/.lock"
if mkdir "$LOCK_DIR" 2>/dev/null; then
    # We got the lock
    trap "rmdir '$LOCK_DIR' 2>/dev/null" EXIT
else
    # Lock exists, check if stale
    # ... existing stale lock logic ...
fi
```

Alternative: Use PostgreSQL advisory locks (automatically released on connection close).

2. Doesn't Work on Existing StatefulSets with Multiple Replicas

Problem: Cannot safely run on existing StatefulSets during rolling updates. Old pods remain active while new pods run the import, causing database conflicts.
Recommendation: Add an active connection check before the import:

```bash
check_active_connections() {
  ACTIVE=$(psql -h "$DB_HOST" -d "$DB_NAME" -U "$DB_USERNAME" -qtAX -c \
    "SELECT count(*) FROM pg_stat_activity
     WHERE datname = '$DB_NAME'
       AND pid != pg_backend_pid()
       AND state != 'idle'" 2>/dev/null || echo "0")

  if [ "$ACTIVE" -gt 0 ]; then
    echo "ERROR: Database has $ACTIVE active connections"
    echo "Cannot import while database is in use"
    echo "This script is designed for initial deployment only"
    echo "Stop all pods before performing a refresh"
    exit 1
  fi
}

# Call before import
check_active_connections || exit 1
```

Also: Document this limitation clearly in the PR description and script comments.
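As a sanity check on the atomic-`mkdir` recommendation in issue 1, here is a small self-contained sketch (throwaway temp paths only, not the real lock location) showing that when two processes race for the same lock directory, exactly one `mkdir` succeeds:

```bash
# Demo of mkdir-as-mutex: mkdir fails with EEXIST if the directory already
# exists, so "check" and "acquire" are a single atomic operation.
LOCK_DIR="$(mktemp -d)/import.lock"   # throwaway location for the demo

take_lock() {
  if mkdir "$LOCK_DIR" 2>/dev/null; then
    echo "$1: acquired"
  else
    echo "$1: busy"
  fi
}

# Two "pods" race for the same lock; exactly one wins.
RESULT=$( { take_lock pod-0 & take_lock pod-1 & wait; } )
WINNERS=$(printf '%s\n' "$RESULT" | grep -c ': acquired')
echo "$RESULT"
echo "winners=$WINNERS"
rmdir "$LOCK_DIR"
```

The file-based `touch` variant has no such guarantee: both pods can pass the `-f` test before either creates the file.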
| Scenario | Risk Level | Works? | Notes |
|---|---|---|---|
| Initial deployment (EFS + OrderedReady) | ✅ Low | Yes | Works correctly |
| Existing StatefulSet (rolling update) | 🚨 High | No | Conflicts with active connections |
| Without OrderedReady | 🚨 High | No | Lock race condition causes failures |
| Per-pod volumes | 🚨 High | No | Each pod imports independently |
💡 Additional Suggestions
- Consider database advisory locks instead of file-based locks (more reliable, auto-cleanup)
- Add pod ordinal check for extra safety (only pod-0 imports on initial deployment)
- Add structured logging for better observability
- Add metrics/telemetry for import operations
- Consider resume capability for failed imports (checkpoint progress)
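The pod-ordinal suggestion above could be sketched like this. It is a hypothetical helper, assuming the StatefulSet pod name (e.g. `dotcms-0`) is available via the Downward API or `hostname`:

```bash
# Sketch of an ordinal guard: StatefulSet pods are named <set>-<ordinal>,
# so stripping everything up to the last "-" yields the ordinal.
should_import() {
  ordinal="${1##*-}"
  [ "$ordinal" = "0" ]
}

# Quick check of the helper against two hypothetical pod names:
R0=$(should_import "dotcms-0" && echo import || echo skip)
R2=$(should_import "dotcms-2" && echo import || echo skip)
echo "dotcms-0: $R0"
echo "dotcms-2: $R2"
```

In the real script, a non-zero ordinal would simply `exit 0` before the import begins, so only pod 0 ever touches the database.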
Final Recommendation
Request Changes - Address critical issues before merge:
- Fix lock file race condition (atomic operations)
- Add active connection check (prevent conflicts on existing StatefulSets)
- Document dependencies (OrderedReady, EFS)
The feature is useful and the init container pattern is correct, but the implementation needs these fixes for production robustness.
Context Notes
- Storage: Assumes `/data/shared` is EFS (shared across pods)
- Pod Management: Requires `OrderedReady` (default StatefulSet policy)
- Use Cases: Initial deployment or total refresh (intentional database/filesystem replacement)
- Pattern: Init container (as documented in PR description)
Remove the unconditional `exit 0` so the entrypoint continues to source
startup scripts, clarify the import script’s exit-13 success path, and
install `libarchive-tools` to support asset unpacking during imports.
ref: #34442
… into issue-34442-clone-dotcms-env
New changes:
Now my little AI friend says:

📋 Overall Assessment

Status: ✅ Logic is sound and production-ready with minor recommendations

Priority Fixes:
Excellent Design Decisions:
@nollymar Where would that readme live? There is no doc under our
…iner and clone another dotCMS env (#34443) ref: #34442

Below is documentation:

---

# dotCMS Environment Cloning

This feature allows you to use the dotCMS Docker image as an init container to clone data and assets from another running dotCMS environment. This is useful for:

- Setting up development environments from production/staging
- Creating test environments with real data
- Initializing new dotCMS instances with existing content

## How It Works

The `10-import-env.sh` script runs at container startup (before dotCMS itself starts) and:

1. Downloads the database backup from the source environment via the Maintenance API
2. Downloads the assets archive from the source environment
3. Imports the database into PostgreSQL
4. Extracts assets to the shared data directory
5. Exits the container so it can be restarted to run dotCMS

When the script exits cleanly and is restarted, it will skip re-importing and run dotCMS. This enables the "init container" pattern in Kubernetes where the import runs once, then the main dotCMS container starts with the imported data.

## Environment Variables

### Required

| Variable | Description |
|----------|-------------|
| `DOT_IMPORT_ENVIRONMENT` | URL of the source dotCMS environment (e.g., `https://demo.dotcms.com`) |
| `DOT_IMPORT_API_TOKEN` | API token for authentication (Bearer token) |
| `DOT_IMPORT_USERNAME_PASSWORD` | Alternative: Basic auth credentials in `user:password` format |

**Note:** Either `DOT_IMPORT_API_TOKEN` or `DOT_IMPORT_USERNAME_PASSWORD` is required.

### Database Connection

| Variable | Description |
|----------|-------------|
| `DB_BASE_URL` | JDBC URL for target PostgreSQL (e.g., `jdbc:postgresql://host/dbname`) |
| `DB_USERNAME` | PostgreSQL username |
| `DB_PASSWORD` | PostgreSQL password |

### Optional

| Variable | Default | Description |
|----------|---------|-------------|
| `DOT_IMPORT_DROP_DB` | `false` | Drop existing database schema before import |
| `DOT_IMPORT_MAX_ASSET_SIZE` | `100mb` | Maximum asset file size to download |
| `DOT_IMPORT_ALL_ASSETS` | `false` | Include non-live (working/archived) assets |
| `DOT_IMPORT_IGNORE_ASSET_ERRORS` | `true` | Continue if asset extraction has errors |

## Usage Examples

### Docker Standalone

```bash
# Create environment file
cat > app.env << 'EOF'
DOT_IMPORT_ENVIRONMENT=https://demo.dotcms.com
DOT_IMPORT_USERNAME_PASSWORD=admin@dotcms.com:admin
DOT_IMPORT_DROP_DB=true
DB_BASE_URL=jdbc:postgresql://localhost:5432/dotcms
DB_USERNAME=dotcmsdbuser
DB_PASSWORD=password
EOF

# Run dotCMS with environment cloning
docker run --env-file app.env \
  -v ./data:/data \
  -p 8080:8082 \
  dotcms/dotcms:latest
```

### Kubernetes Init Container

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: dotcms
spec:
  initContainers:
    - name: clone-environment
      image: dotcms/dotcms:latest
      env:
        - name: DOT_IMPORT_ENVIRONMENT
          value: "https://source.dotcms.com"
        - name: DOT_IMPORT_API_TOKEN
          valueFrom:
            secretKeyRef:
              name: dotcms-secrets
              key: import-api-token
        - name: DB_BASE_URL
          value: "jdbc:postgresql://postgres:5432/dotcms"
        - name: DB_USERNAME
          valueFrom:
            secretKeyRef:
              name: dotcms-secrets
              key: db-username
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: dotcms-secrets
              key: db-password
      volumeMounts:
        - name: shared-data
          mountPath: /data/shared
  containers:
    - name: dotcms
      image: dotcms/dotcms:latest
      # ... main dotCMS container config
      volumeMounts:
        - name: shared-data
          mountPath: /data/shared
  volumes:
    - name: shared-data
      persistentVolumeClaim:
        claimName: dotcms-data
```

> **Note:** It is possible not to use an init container and just have dotCMS fire up the first time to clone the target environment. In this case, you will need to adjust your probes and add an appropriate startup delay before the pod gets cycled (not terrible in dev, but probably not recommended for production values).

## Behavior Details

### Idempotency

- The script creates an `import_complete.txt` marker file after a successful import
- Subsequent container starts skip the import if this marker exists
- Delete the marker file to force a re-import

### Locking

- A `lock.txt` file prevents concurrent imports (important for Kubernetes)
- Lock files older than 30 minutes are considered stale and removed
- Pods wait 3 minutes and exit if a lock is held by another process

### Database Safety

- The script checks if the target database already contains data (inode count)
- Import is skipped if data exists (unless `DOT_IMPORT_DROP_DB=true`)
- Use `DOT_IMPORT_DROP_DB=true` to wipe and reimport

### Downloaded Files

- Database and asset backups are cached in `$IMPORT_DATA_DIR`
- File names include an MD5 hash of the source URL
- Delete cached files to force re-download

## Exit Codes

| Code | Meaning |
|------|---------|
| `0` | No import needed (already complete or not configured) |
| `1` | Error during import |
| `13` | Import completed successfully (signals init container completion) |

## Troubleshooting

### Import stuck or failed

1. Check for a stale lock file: `ls -la /data/shared/import/lock.txt`
2. Remove the lock if stale: `rm /data/shared/import/lock.txt`

### Force reimport

1. Remove the completion marker: `rm /data/shared/import/import_complete.txt`
2. Optionally remove cached downloads to re-download:

```bash
rm /data/shared/import/*_assets.zip
rm /data/shared/import/*_dotcms_db.sql.gz
```

### Authentication failures

- Verify `DOT_IMPORT_API_TOKEN` or `DOT_IMPORT_USERNAME_PASSWORD` is correct
- Ensure the user has access to the Maintenance API endpoints:
  - `/api/v1/maintenance/_downloadAssets`
  - `/api/v1/maintenance/_downloadDb`

## Source Environment Requirements

The source dotCMS environment must:

1. Be accessible over HTTPS/HTTP from the target environment
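The idempotency behavior described above boils down to a marker-file check. A minimal sketch, using a throwaway directory rather than the real `/data/shared/import`:

```bash
# Marker-file idempotency: the first run imports and drops a marker,
# every later run sees the marker and skips.
import_once() {
  marker="$1/import_complete.txt"
  if [ -f "$marker" ]; then
    echo "skip"
  else
    # ... the real script would download and import here ...
    touch "$marker"
    echo "imported"
  fi
}

DIR=$(mktemp -d)              # stand-in for the shared import directory
FIRST=$(import_once "$DIR")
SECOND=$(import_once "$DIR")
echo "first=$FIRST second=$SECOND"
```

This is also why deleting `import_complete.txt` is the documented way to force a re-import.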
Thanks for the likes, guys!