Skip to content

No --verbose/--debug flag and no doctor/diagnose subcommand — operator is stuck with whatever logs were already emitted #41

@dolph

Description

@dolph

Operability problem

When the alert fires and the operator wants to dig in, the tool offers no way to ask for more detail. There is:

  • No --verbose / -v / --debug flag (verified: connectivity.go parses subcommands by hand and recognizes no flags).
  • No doctor / diagnose / explain subcommand that, given a destination, walks through each layer and prints every piece of state (resolved IPs, chosen route, TLS certificate chain, HTTP response headers and body snippet) regardless of pass/fail.
  • No way to ask for a single-destination check on the command line with extra detail.

The operator's available knobs are:

  1. Re-run connectivity check <url> and hope for more output (they get the same logs).
  2. Read the source.
  3. Reach for dig, traceroute, openssl s_client, curl -v separately — at which point the tool isn't helping.

Concrete operator scenarios

"The check is failing for https://api.example.com/health. I want to see: did DNS return one IP or many? Which one did the dial use? What was the TLS cert SAN list and expiry? What were the HTTP response headers? connectivity check shows me a single 'Failed HTTP GET' line and nothing else. I have to drop into openssl s_client and curl -v to see what the tool already had in its hands during the check."

"I am writing a runbook. Step 1 says 'run connectivity check and read the logs.' Step 2 should say 'if that doesn't explain it, run connectivity diagnose <url>.' But there is no step 2 — there's only 'go read the source.'"

Suggested fix

  • Add a -v / --verbose flag that:
    • Logs every resolved IP from DNS (currently only failures are logged per IP)
    • Logs the chosen route for every successful check
    • Logs TLS peer-certificate Subject/Issuer/NotAfter
    • Logs the HTTP status code and a body snippet
  • Add a connectivity doctor <url> subcommand that runs every layer (DNS, route, dial, TLS, HTTP) end-to-end with verbose output, even if a step succeeds. This is the runbook target: "when the alert fires, run connectivity doctor <the-failing-url> and attach the output to the incident."

Operability-related and complementary to #19 (self-observability metrics) and #21 (structured logs) — this is the human-debugging entrypoint after an alert.

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions