CSPL-4006: Fix SSL replication port deleted on first indexer cluster join #905
Open
gabrielm-splunk wants to merge 1 commit into
…join

Splunk's `edit cluster-config` REST API requires `-replication_port` even when SSL replication is configured. The previous fix (PR splunk#903) omitted the flag for SSL deployments, causing the command to fail with "parameter=replication_port not present" and loop indefinitely.

The correct approach:
- Always pass `-replication_port` so the CLI command succeeds
- Immediately re-apply the full server.conf stanza block after cluster join to restore the `[replication_port-ssl://PORT]` entry that the CLI overwrites

SSL detection uses both explicit `splunk.idxc.replication_ssl: true` flag and auto-detection of `replication_port-ssl://` keys in splunk.conf.server.content, so customers don't need to add an extra flag if they only configure SSL via conf.

Affects both single-site (indexer_clustering.yml) and multi-site (setup_multisite.yml) cluster join paths.

Verified on Splunk Enterprise 10.4.0 on EKS: SSL stanza persists on first startup without manual pod restart.

Jira: https://splunk.atlassian.net/browse/CSPL-4006
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
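The join-then-restore sequence in the commit message can be illustrated with a small Python model. This is only a sketch of the behavior as this PR describes it, not the Ansible tasks themselves: `simulate_cluster_join` and `reapply_stanzas` are hypothetical helpers, `server.conf` is modeled as a plain dictionary of stanzas, and the setting inside the SSL stanza is a placeholder.

```python
import copy


def simulate_cluster_join(server_conf, port):
    """Toy model of the cluster join as described in this PR: the non-SSL
    stanza is written and the SSL stanza for the same port is lost."""
    result = copy.deepcopy(server_conf)
    result.pop(f"replication_port-ssl://{port}", None)
    result[f"replication_port://{port}"] = {}
    return result


def reapply_stanzas(server_conf, desired_stanzas):
    """Merge the configured stanzas back in, restoring anything the join removed."""
    result = copy.deepcopy(server_conf)
    for stanza, settings in desired_stanzas.items():
        result.setdefault(stanza, {}).update(settings)
    return result


# Hypothetical customer stanza; the setting name and value are placeholders.
desired = {"replication_port-ssl://9887": {"example_setting": "example_value"}}

after_join = simulate_cluster_join(desired, 9887)   # SSL stanza is gone here
restored = reapply_stanzas(after_join, desired)     # SSL stanza is back

print(sorted(after_join))
print(sorted(restored))
```

In this model the restored configuration holds both the non-SSL stanza written by the join and the customer's SSL stanza, which matches the "both stanzas present" result reported under Verification.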
Problem
Customers configuring SSL replication port via `splunk.conf.server.content` (e.g. `replication_port-ssl://9887`) find the stanza deleted after the indexer joins the cluster on first startup. The SSL config only persists after a second pod restart, requiring manual intervention.

Customer Configuration
Root Cause
`edit cluster-config -replication_port` always writes a `[replication_port://PORT]` (non-SSL) stanza to `server.conf`, overwriting the customer's `[replication_port-ssl://PORT]` stanza.

PR #903 attempted to fix this by omitting `-replication_port` for SSL deployments, but Splunk's REST API requires the `-replication_port` parameter regardless of SSL mode. This caused the command to fail with `parameter=replication_port not present` and loop indefinitely, so the indexer never joined the cluster.

Fix

- Always pass `-replication_port` so `edit cluster-config` succeeds (REST API requirement)
- Re-apply the full `server.conf` stanza block immediately after cluster join to restore the `[replication_port-ssl://PORT]` entry

SSL mode is detected via either:

- Explicit `splunk.idxc.replication_ssl: true` flag, or
- Auto-detection of `replication_port-ssl://` keys in `splunk.conf.server.content`

This means customers with only the conf-based SSL config (no extra flag) are handled automatically.
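The two detection paths can be sketched in Python as follows. This is an illustration under stated assumptions, not the Jinja/Ansible logic in the changed files: `detect_ssl_replication` is a hypothetical name, and the `splunk` variables are assumed to be a nested mapping in which `conf.server.content` is keyed by stanza name.

```python
def detect_ssl_replication(splunk_vars):
    """Return True if SSL replication is requested by the explicit flag or
    implied by a replication_port-ssl:// stanza in the server conf content."""
    # Path 1: explicit flag, assumed to live at splunk.idxc.replication_ssl.
    idxc = splunk_vars.get("idxc") or {}
    if idxc.get("replication_ssl") is True:
        return True

    # Path 2: auto-detection from stanza names under splunk.conf.server.content
    # (assumed here to be a mapping of stanza name to settings).
    conf = splunk_vars.get("conf") or {}
    server = conf.get("server") or {}
    content = server.get("content") or {}
    return any(str(key).startswith("replication_port-ssl://") for key in content)


# Conf-only configuration, with no explicit flag set.
conf_only = {"conf": {"server": {"content": {"replication_port-ssl://9887": {}}}}}
print(detect_ssl_replication(conf_only))
```

Checking the flag first keeps the explicit setting authoritative, while the stanza scan covers customers who only configure SSL through conf.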
Files Changed
- `roles/splunk_indexer/tasks/indexer_clustering.yml`: single-site cluster join
- `roles/splunk_indexer/tasks/setup_multisite.yml`: multi-site cluster join

Verification
Tested on Splunk Enterprise 10.4.0 on EKS (k8s 1.34):
Before fix: `parameter=replication_port not present` error on every retry, so the indexer never joins the cluster.

After fix: Both stanzas present in `server.conf` on first startup, no manual restart needed.

Backward Compatibility

`use_ssl_replication` is false)

Jira: https://splunk.atlassian.net/browse/CSPL-4006