System debug #2413

Open
pwright wants to merge 10 commits into skupperproject:main from pwright:system-debug

pwright (Member) commented Mar 26, 2026

Skupper Debug Dump for local system sites

Overview

The skupper debug dump command collects diagnostic information from a non-Kubernetes Skupper site and packages it into a compressed tarball (.tar.gz file). This dump is useful for troubleshooting issues and sharing diagnostic data with support teams.

Scope

Single Namespace

The debug dump command operates on one namespace at a time. It does not collect data from multiple namespaces in a single run.

skupper debug dump              # Collects "default" namespace
skupper debug dump -n west      # Collects "west" namespace only
skupper debug dump -n east      # Collects "east" namespace only

To collect diagnostics from multiple namespaces, run the command multiple times with different namespace flags.

Current User Only

The command can only access namespaces owned by the current user:

  • Non-root users: Access namespaces in ~/.local/share/skupper/namespaces/ (or $XDG_DATA_HOME/skupper/namespaces/)
  • Root user: Access namespaces in /var/lib/skupper/namespaces/

The command cannot cross user boundaries. Root cannot collect dumps from non-root user namespaces, and vice versa.
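
The lookup described above can be sketched as a small Go function. This is only an illustration, not the code from this PR: the function name namespacesRoot is hypothetical, and giving $XDG_DATA_HOME precedence over ~/.local/share when it is set is an assumption based on the usual XDG convention.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// namespacesRoot returns the directory that holds the namespaces visible
// to a user. The uid, XDG_DATA_HOME value and home directory are passed in
// so the logic can be exercised without touching the environment.
// Assumption: XDG_DATA_HOME wins over ~/.local/share when it is non-empty.
func namespacesRoot(uid int, xdgDataHome, home string) string {
	if uid == 0 {
		return "/var/lib/skupper/namespaces"
	}
	if xdgDataHome != "" {
		return filepath.Join(xdgDataHome, "skupper", "namespaces")
	}
	return filepath.Join(home, ".local", "share", "skupper", "namespaces")
}

func main() {
	home, _ := os.UserHomeDir()
	root := namespacesRoot(os.Getuid(), os.Getenv("XDG_DATA_HOME"), home)
	// Print the directory that would hold the "default" namespace.
	fmt.Println(filepath.Join(root, "default"))
}
```

Because the root is derived only from the calling user's identity, a root invocation never looks under a non-root user's home directory, which matches the boundary described above.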

Usage

skupper debug dump [filename] [flags]

Arguments:
  filename    Optional base name for the output file (default: "skupper-dump")

Flags:
  -n, --namespace string   Namespace to collect diagnostics from (default: "default")

Examples:
  skupper debug dump
  skupper debug dump my-diagnostics
  skupper debug dump -n production
  skupper debug dump production-debug -n production

Output File

The command creates a compressed tarball with the following naming pattern:

<filename>-<namespace>-<timestamp>.tar.gz
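
As a rough sketch of how such a name could be assembled: the helper dumpFileName is hypothetical, and the timestamp layout (UTC, 20060102T150405Z) is an assumption, since the description does not state the format the command actually uses.

```go
package main

import (
	"fmt"
	"time"
)

// dumpFileName builds <filename>-<namespace>-<timestamp>.tar.gz.
// Empty arguments fall back to the documented defaults.
// Assumption: the timestamp is rendered in UTC as YYYYMMDDTHHMMSSZ.
func dumpFileName(base, namespace string, unixSeconds int64) string {
	if base == "" {
		base = "skupper-dump"
	}
	if namespace == "" {
		namespace = "default"
	}
	stamp := time.Unix(unixSeconds, 0).UTC().Format("20060102T150405Z")
	return fmt.Sprintf("%s-%s-%s.tar.gz", base, namespace, stamp)
}

func main() {
	fmt.Println(dumpFileName("production-debug", "production", time.Now().Unix()))
}
```

Including the namespace in the name keeps dumps from repeated runs against different namespaces from overwriting each other.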

Tarball Contents

All collected files are organized under the following directory structure:

/versions/

Version information for Skupper and platform components:

  • skupper.yaml / skupper.yaml.txt - Skupper component versions and image details
  • Platform-specific version file:
    • Linux platform: systemd.txt - systemd version
    • Podman platform: podman.txt - podman version
    • Docker platform: docker.txt - docker version
  • skrouterd.txt - Router version (only if skrouterd binary exists on host)

/site-namespace/resources/

Skupper resource definitions and router configuration:

  • Site-*.yaml / Site-*.yaml.txt - Site resource definitions
  • Connector-*.yaml / Connector-*.yaml.txt - Connector resources
  • Listener-*.yaml / Listener-*.yaml.txt - Listener resources
  • Link-*.yaml / Link-*.yaml.txt - Link resources
  • Certificate-*.yaml / Certificate-*.yaml.txt - Certificate resources
  • Secret-*.yaml / Secret-*.yaml.txt - Secret resources
  • Configmap-*.yaml / Configmap-*.yaml.txt - ConfigMap resources
  • router-config.json - Router configuration file

Platform-specific files:

  • Container platforms (podman/docker):
    • Container-*.yaml / Container-*.yaml.txt - Container inspection details
  • Linux platform:
    • Systemd-*.txt - Systemd service status and service file content

/site-namespace/resources/skstat/

Router statistics from skstat, with one output file per diagnostic flag (-g, -c, -l, -n, -e, -a, -m, -p):

  • Container platforms: <namespace>-skupper-router-skstat-<flag>.txt
  • Linux platform: router-skstat-<flag>.txt
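
The two naming patterns can be illustrated with a short Go sketch. The function skstatFileName and the platform strings "podman", "docker" and "linux" are hypothetical, and it is an assumption that <flag> is the flag letter without its leading dash.

```go
package main

import "fmt"

// skstatFlags lists the diagnostic flags named in the description.
var skstatFlags = []string{"g", "c", "l", "n", "e", "a", "m", "p"}

// skstatFileName returns the output file name for one skstat flag.
// Assumption: <flag> is the bare letter, for example "g" for -g.
func skstatFileName(platform, namespace, flag string) string {
	switch platform {
	case "podman", "docker":
		return fmt.Sprintf("%s-skupper-router-skstat-%s.txt", namespace, flag)
	default: // treated as the Linux (systemd) platform
		return fmt.Sprintf("router-skstat-%s.txt", flag)
	}
}

func main() {
	for _, f := range skstatFlags {
		fmt.Println(skstatFileName("podman", "west", f))
	}
}
```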

/site-namespace/resources/certs/

Certificate files organized by type and name:

  • runtime/<cert-name>/ca.crt - Certificate Authority
  • runtime/<cert-name>/tls.crt - TLS certificate
  • runtime/<cert-name>/tls.key - TLS private key
  • input/<cert-name>/... - User-provided certificates (if any)
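
Since private keys are part of the archive, it can be useful to check which ones a dump contains before sharing it. The following is a minimal sketch using only the Go standard library; the function names and the sample entry names (including the certificate name "example") are illustrative and not taken from the PR.

```go
package main

import (
	"archive/tar"
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"path"
)

// privateKeys reads a .tar.gz stream and returns the names of all
// entries whose base name is tls.key.
func privateKeys(r io.Reader) ([]string, error) {
	gz, err := gzip.NewReader(r)
	if err != nil {
		return nil, err
	}
	defer gz.Close()
	tr := tar.NewReader(gz)
	var keys []string
	for {
		hdr, err := tr.Next()
		if err == io.EOF {
			return keys, nil
		}
		if err != nil {
			return nil, err
		}
		if path.Base(hdr.Name) == "tls.key" {
			keys = append(keys, hdr.Name)
		}
	}
}

// sampleDump builds a tiny in-memory tarball so the example runs
// without a real dump. Entry names are illustrative only.
func sampleDump() *bytes.Buffer {
	var buf bytes.Buffer
	gz := gzip.NewWriter(&buf)
	tw := tar.NewWriter(gz)
	for _, name := range []string{
		"site-namespace/resources/certs/runtime/example/ca.crt",
		"site-namespace/resources/certs/runtime/example/tls.key",
	} {
		body := []byte("placeholder")
		tw.WriteHeader(&tar.Header{Name: name, Mode: 0600, Size: int64(len(body))})
		tw.Write(body)
	}
	tw.Close()
	gz.Close()
	return &buf
}

func main() {
	keys, err := privateKeys(sampleDump())
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, k := range keys {
		fmt.Println(k)
	}
}
```

To inspect a real dump, the same function could be given a file opened with os.Open instead of the in-memory sample.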

/site-namespace/logs/

Log files from running components:

  • Container platforms:
    • <namespace>-skupper-router.txt - Router container logs
    • system-controller.txt - Controller container logs
  • Linux platform:
    • systemd-journal.txt - Complete systemd journal for the skupper service

Platform-Specific Differences

The debug dump automatically detects the platform for the target namespace and collects platform-appropriate diagnostic data.

Linux Platform (systemd)

The router runs as a host process managed by systemd.

Additional collection:

  • Systemd service status (systemctl status)
  • Systemd service file content
  • Complete systemd journal logs (journalctl)
  • Router statistics via direct skstat command execution
  • Router version from host skrouterd binary

Systemd commands executed:

  • systemctl status (with --user flag for non-root users)
  • journalctl (with --user flag for non-root users)
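
A sketch of how the argument lists might be built, assuming the only difference between root and non-root invocations is the --user flag. The function name and the unit name example.service are hypothetical, and selecting the unit with --unit and disabling the pager with --no-pager are assumptions; the description does not list the exact arguments.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// systemdCommands returns the systemctl and journalctl argument lists
// for one unit. Non-root users get --user so that the user service
// manager and user journal are queried.
// Assumption: --no-pager and --unit are used; the PR may differ.
func systemdCommands(uid int, unit string) [][]string {
	var scope []string
	if uid != 0 {
		scope = []string{"--user"}
	}
	status := append([]string{"systemctl"}, scope...)
	status = append(status, "status", "--no-pager", unit)
	journal := append([]string{"journalctl"}, scope...)
	journal = append(journal, "--no-pager", "--unit", unit)
	return [][]string{status, journal}
}

func main() {
	// example.service is a placeholder unit name.
	for _, argv := range systemdCommands(os.Getuid(), "example.service") {
		fmt.Println(strings.Join(argv, " "))
	}
}
```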

Container Platforms (podman/docker)

The router runs in a container managed by the container runtime (podman or docker).

Additional collection:

  • Container inspection details (full container metadata)
  • Container logs via runtime API
  • Router statistics via container exec (podman exec or docker exec)
  • Platform version information

Container operations:

  • Container inspection of skupper-router container
  • Log retrieval from router and controller containers
  • Command execution inside router container for statistics
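
For illustration, the exec invocation for one skstat flag could be assembled as below. The function skstatExec is hypothetical, and the container name west-skupper-router in the example is an assumption inferred from the log file naming, not a value confirmed by the PR.

```go
package main

import (
	"fmt"
	"strings"
)

// skstatExec returns the argument list that runs skstat with one flag
// inside the router container. Only podman and docker are accepted.
func skstatExec(runtime, container, flag string) ([]string, error) {
	if runtime != "podman" && runtime != "docker" {
		return nil, fmt.Errorf("unsupported container runtime %q", runtime)
	}
	return []string{runtime, "exec", container, "skstat", "-" + flag}, nil
}

func main() {
	// The container name is a placeholder.
	argv, err := skstatExec("podman", "west-skupper-router", "g")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(strings.Join(argv, " "))
}
```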

Notes

  • Duplicate file extensions: Most resource files are saved twice (.yaml and .yaml.txt) to allow easy viewing in different contexts
  • Certificate sensitivity: The tarball contains TLS certificates and keys. Treat the debug dump as sensitive data
  • Router version: For container platforms, if skrouterd is not installed on the host, the router version file may be omitted (version info is still available in skstat output)
  • Error handling: If an individual collection step fails, the command continues and collects the remaining data. A file missing from the tarball indicates that collection failed for that item
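
The continue-on-failure behaviour in the last note can be sketched as a loop that records failures instead of returning early. All names here (step, runAll, the step labels) are hypothetical and only illustrate the pattern; they are not the PR's code.

```go
package main

import (
	"errors"
	"fmt"
)

// step is one named collection task.
type step struct {
	name string
	run  func() error
}

// okStep and failStep stand in for real collection tasks.
func okStep() error   { return nil }
func failStep() error { return errors.New("collection failed") }

// runAll executes every step, even after an earlier one fails, and
// returns the names of the steps that failed.
func runAll(steps []step) []string {
	var failed []string
	for _, s := range steps {
		if err := s.run(); err != nil {
			failed = append(failed, s.name)
		}
	}
	return failed
}

func main() {
	failed := runAll([]step{
		{"versions", okStep},
		{"skstat", failStep},
		{"logs", okStep},
	})
	fmt.Println("failed steps:", failed)
}
```

Returning the list of failed steps rather than the first error is what lets a partial dump still be produced; a caller could also log each failure as it happens.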

@pwright pwright marked this pull request as ready for review April 1, 2026 11:07
Comment thread internal/cmd/skupper/common/utils/debug.go Outdated
Comment thread internal/cmd/skupper/debug/nonkube/debug.go Outdated
Comment thread internal/cmd/skupper/debug/nonkube/debug.go Outdated
Comment thread internal/cmd/skupper/debug/nonkube/debug.go Outdated
@pwright pwright requested a review from fgiorgetti April 8, 2026 12:52
Comment thread internal/cmd/skupper/debug/nonkube/debug.go Outdated
cliutils.WriteObject(site, path+"runtime/Site-"+site.Name, tb)
}
}
sites, err = cmd.siteHandler.List(optsInput)
Member

I believe that the optsInput and optsRuntime should be used for all resources, not just the site, as I noticed that we have Site resources under input and runtime directories, but the Listener is not placed likewise.

There are some resources I am not seeing here, like RouterAccess for a site with link-access enabled, or the Secret for a site with a link to another.

Maybe instead of using the FileSystemHandler's List method, you could evaluate using FileSystemSiteStateLoader, providing the appropriate paths for input and runtime resources?
I'd recommend giving it a try.

Member Author

@fgiorgetti do we want the input/ dir? Currently the command attempts to collect it, but fails if there are two Site resources there. For example, I've somehow managed to end up with two site YAMLs in input/, but obviously only one is valid.

If we do want to collect it, is it a bug that there might be invalid entries in input/?

Comment thread internal/cmd/skupper/debug/nonkube/debug.go Outdated

2 participants