SREP-347 & SREP-4896: factorizing code, fixing some potential race conditions in StreamLogs & making sure 'osdctl rhobs metrics' can work at a given time or with a time range#904
Conversation
…nditions in StreamLogs & making sure 'osdctl rhobs metrics' can work at a given time or with a time range
|
@Nikokolas3270: This pull request references SREP-4896 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughThis PR refactors the RHOBS CLI command implementations to support both instant and range metrics queries via new flags ( ChangesRHOBS CLI Metrics and Logs Refactoring
🎯 4 (Complex) | ⏱️ ~60 minutes Important Pre-merge checks failedPlease resolve all errors before merging. Addressing warnings is optional. ❌ Failed checks (1 error, 1 warning)
✅ Passed checks (13 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: Nikokolas3270 The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@Nikokolas3270: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
There was a problem hiding this comment.
Actionable comments posted: 10
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
cmd/rhobs/requests.go (1)
1037-1098:⚠️ Potential issue | 🟠 Major | ⚡ Quick win
startTimeis still shared unsafely across goroutines.
createWebsocketreads the capturedstartTime, butcloseWebsocketmutates it and can run from the signal goroutine throughcloseWebSocketInContext. That leaves a real data race on the reconnect cursor and can duplicate or skip logs after reconnect. Moving the cursor intostreamLogsContextand guarding it with the same mutex would close that gap.Also applies to: 1112-1144, 1156-1167
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cmd/rhobs/requests.go` around lines 1037 - 1098, The shared variable startTime is captured by createWebsocket and mutated by closeWebsocket (invoked via closeWebSocketInContext), causing a data race; move startTime out of the outer scope into the streamLogsContext struct (or the same object that already holds the reconnect state and mutex), update createWebsocket and closeWebsocket to read/write that field instead of the captured variable, and protect all accesses with the existing mutex used by streamLogsContext so reads/writes are serialized; also update the other occurrences mentioned (blocks around lines 1112-1144 and 1156-1167) to use the guarded field rather than the outer startTime.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cmd/rhobs/mcp_tools.go`:
- Around line 163-181: The MCP output schema still advertises a singular
"result" and marks "results"/"count" as instant-only, but the code paths in
QueryRangeMetrics (fetcher.QueryRangeMetrics) and QueryInstantMetrics
(fetcher.QueryInstantMetrics) populate cellInfo with "results" and "count" for
range queries too; update the MCP tool schema to replace the deprecated "result"
object with "results" (array/time-series) and mark "results" and "count" as
valid for both range and instant payloads so the advertised contract matches the
actual output emitted by the code (ensure any schema/README/struct that
describes the MCP response mirrors the keys set on cellInfo).
In `@cmd/rhobs/metricsCmd.go`:
- Around line 24-26: The help text claims the PromQL expression is optional when
--web, but the command still enforces a required positional arg; update the code
in the metrics command (metricsCmd / NewMetricsCmd) to either (A) relax arg
validation: in the command's Args or RunE check, read the "web" flag
(cmd.Flags().GetBool("web")) and allow zero positional args when web==true,
otherwise require one, or (B) adjust the help string lines that mention
"optional if the --web option is set" to remove that claim so documentation
matches the current required-positional-arg behavior; pick one approach and make
the corresponding change.
- Around line 39-55: After computing startTime/endTime (using nowTime, duration
and cmd.Flags().Changed), validate the time range and fail fast: ensure that
when --since is used the computed startTime (nowTime.Add(-duration)) is not in
the future (startTime <= nowTime) and when --end-time is set ensure startTime <=
endTime; if either check fails return a clear error (e.g. "--since results in
start time in the future" or "--start-time must be <= --end-time"). Add these
checks immediately after the existing startTime/endTime logic (the block using
cmd.Flags().Changed, startTime, endTime, duration, nowTime) so invalid ranges
are rejected before calling RHOBS.
In `@cmd/rhobs/requests.go`:
- Around line 1046-1047: After calling createLogsPrinter(...) and immediately
invoking logsPrinter.PrintHeader(), add a deferred call to
logsPrinter.PrintTrailer() (defer logsPrinter.PrintTrailer()) so the streaming
printer is always finalized even if websocket setup fails; then remove any
explicit/manual calls to logsPrinter.PrintTrailer() in the read-error /
error-return paths (to avoid double-closing). Apply this change for each place
where logsPrinter.PrintHeader() is used (the blocks that create logsPrinter,
call PrintHeader, and currently call PrintTrailer on error).
- Around line 595-602: The print helper functions (e.g., PrintInstantMetrics)
create context.TODO() and thus drop caller cancellation; update these APIs to
accept a context.Context parameter and thread it into calls like
QueryInstantMetrics and the printer constructors (e.g.,
createInstantMetricsPrinter) so that remote requests use the caller-provided
ctx; change signatures of the other similar wrappers (the ones at the other
occurrences) to accept ctx and pass it through to their underlying RHOBS queries
and printers to allow Cobra Ctrl+C/timeouts to propagate.
- Around line 567-568: The code currently sets evalTimeStr from
evalTime.UnixNano() and assigns it to evalTimePStr, which sends nanoseconds to
the Prometheus instant-query `time` parameter; change this to use seconds or
RFC3339 instead. In the function handling the instant query (look for
evalTimeStr / evalTimePStr and where evalTime is computed), replace
evalTime.UnixNano() with either evalTime.Unix() (or a fractional seconds string
like fmt.Sprintf("%d.%09d", evalTime.Unix(), evalTime.Nanosecond()) if fractions
are needed) or use evalTime.Format(time.RFC3339Nano) to produce an RFC3339
timestamp, then set evalTimeStr to that value and keep assigning it to
evalTimePStr.
In `@docs/osdctl_rhobs_logs.md`:
- Line 7: Synopsis text incorrectly references the --contain flag for regexp
filtering; update the documentation sentence in docs/osdctl_rhobs_logs.md so it
names the actual flag --contain-regex (i.e., change "regexp (--contain option)"
to "regexp (--contain-regex option)") so the synopsis matches the defined CLI
options.
In `@docs/osdctl_rhobs_metrics.md`:
- Line 7: The synopsis text in docs/osdctl_rhobs_metrics.md refers to a --web
option but that flag is not defined in the command options; update the docs so
they match: either remove the reference to "--web" from the synopsis sentence
about the expression being optional, or add a documented "--web" flag entry
(with its short/long form, description and behavior) to the options list for the
osdctl rhobs metrics command so the synopsis and options are consistent; ensure
the edited text around the sentence "The prometheus expression ... it is
optional if the --web option is set" is updated accordingly.
In `@docs/README.md`:
- Line 4450: The synopsis mentions a --web option ("the prometheus expression
... is optional if the --web option is set") but that flag is not present in the
command's options; either add documentation for the --web flag to the command's
options/inherited options or remove/replace the --web mention in the synopsis so
they match; update the sentence beginning "By default, the command will try to
evaluate..." and the command's options list in the same README section so the
synopsis and exposed flags are consistent.
- Line 4342: Update the README synopsis to correctly document the regex filter:
replace or supplement the mention of --contain with the correct flag
--contain-regex (or explicitly state that --contain is literal substring match
while --contain-regex accepts regular expressions) so the synopsis matches the
detailed flag descriptions; ensure both flags (--contain and --contain-regex)
are referenced unambiguously in the synopsis text to avoid user confusion.
---
Outside diff comments:
In `@cmd/rhobs/requests.go`:
- Around line 1037-1098: The shared variable startTime is captured by
createWebsocket and mutated by closeWebsocket (invoked via
closeWebSocketInContext), causing a data race; move startTime out of the outer
scope into the streamLogsContext struct (or the same object that already holds
the reconnect state and mutex), update createWebsocket and closeWebsocket to
read/write that field instead of the captured variable, and protect all accesses
with the existing mutex used by streamLogsContext so reads/writes are
serialized; also update the other occurrences mentioned (blocks around lines
1112-1144 and 1156-1167) to use the guarded field rather than the outer
startTime.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 24a4b96a-11b9-4ad9-8bec-8bc7adfd4dc5
📒 Files selected for processing (9)
cmd/rhobs/cellCmd.gocmd/rhobs/logsCmd.gocmd/rhobs/mcp_query.gocmd/rhobs/mcp_tools.gocmd/rhobs/metricsCmd.gocmd/rhobs/requests.godocs/README.mddocs/osdctl_rhobs_logs.mddocs/osdctl_rhobs_metrics.md
💤 Files with no reviewable changes (1)
- cmd/rhobs/mcp_query.go
| if start != "" && end != "" { | ||
| rawResult, err := fetcher.QueryRangeMetrics(ctx, query, start, end, step) | ||
| results, err := fetcher.QueryRangeMetrics(ctx, query, NewRawMetricsTimeRange(start, end, step)) | ||
| if err != nil { | ||
| return mcpError("Range metrics query failed: %v", err) | ||
| } | ||
| var parsed interface{} | ||
| if err := json.Unmarshal(rawResult, &parsed); err != nil { | ||
| cellInfo["result_raw"] = string(rawResult) | ||
| return mcpResultJSON(cellInfo) | ||
| } | ||
| cellInfo["result"] = parsed | ||
| results = filterMetricsResults(fetcher, results, filterCluster) | ||
| cellInfo["results"] = results | ||
| cellInfo["count"] = len(*results) | ||
| return mcpResultJSON(cellInfo) | ||
| } | ||
|
|
||
| results, err := fetcher.QueryInstantMetrics(ctx, query, filterCluster) | ||
| results, err := fetcher.QueryInstantMetrics(ctx, query, time.Time{}) | ||
| if err != nil { | ||
| return mcpError("Metrics query failed: %v", err) | ||
| } | ||
| results = filterMetricsResults(fetcher, results, filterCluster) | ||
|
|
||
| cellInfo["results"] = results | ||
| cellInfo["count"] = len(results) | ||
| cellInfo["count"] = len(*results) |
There was a problem hiding this comment.
Update the MCP output schema to match the new range-query payload.
This path now returns results and count for range queries, but the tool schema still advertises a result object and marks results/count as instant-only. That leaves MCP clients with a stale contract.
Suggested schema update
- "results": {"type": "array", "description": "Metric results with labels and values (instant query)"},
- "count": {"type": "integer", "description": "Number of results (instant query)"},
- "result": {"type": "object", "description": "Range query result (when start/end provided)"}
+ "results": {"type": "array", "description": "Metric results with labels and values"},
+ "count": {"type": "integer", "description": "Number of results returned"}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cmd/rhobs/mcp_tools.go` around lines 163 - 181, The MCP output schema still
advertises a singular "result" and marks "results"/"count" as instant-only, but
the code paths in QueryRangeMetrics (fetcher.QueryRangeMetrics) and
QueryInstantMetrics (fetcher.QueryInstantMetrics) populate cellInfo with
"results" and "count" for range queries too; update the MCP tool schema to
replace the deprecated "result" object with "results" (array/time-series) and
mark "results" and "count" as valid for both range and instant payloads so the
advertised contract matches the actual output emitted by the code (ensure any
schema/README/struct that describes the MCP response mirrors the keys set on
cellInfo).
| "The prometheus expression provided as an argument can be either an instant query or a range query; it is optional if the --web option is set. " + | ||
| "By default, the command will try to evaluate the expression as an instant query at the current time, " + | ||
| "but it is possible to specify a different evaluation time using the --time option or a time range using the --start-time, --end-time and --since options." + |
There was a problem hiding this comment.
The documented no-expression path is unreachable.
These help strings say the PromQL expression is optional with --web, but this command still requires one positional arg. Either relax the arg validation for --web or remove that claim from the help text.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cmd/rhobs/metricsCmd.go` around lines 24 - 26, The help text claims the
PromQL expression is optional when --web, but the command still enforces a
required positional arg; update the code in the metrics command (metricsCmd /
NewMetricsCmd) to either (A) relax arg validation: in the command's Args or RunE
check, read the "web" flag (cmd.Flags().GetBool("web")) and allow zero
positional args when web==true, otherwise require one, or (B) adjust the help
string lines that mention "optional if the --web option is set" to remove that
claim so documentation matches the current required-positional-arg behavior;
pick one approach and make the corresponding change.
| if cmd.Flags().Changed("since") { | ||
| startTime = nowTime.Add(-duration) | ||
| } else if !cmd.Flags().Changed("start-time") { | ||
| startTime = defaultStartTime | ||
| } | ||
| if cmd.Flags().Changed("end-time") { | ||
| if !cmd.Flags().Changed("start-time") { | ||
| return fmt.Errorf("--end-time can only be set if --start-time is set") | ||
| } | ||
| } else { | ||
| endTime = nowTime | ||
| } | ||
| if cmd.Flags().Changed("step") { | ||
| if !cmd.Flags().Changed("start-time") && !cmd.Flags().Changed("since") { | ||
| return fmt.Errorf("--step can only be set if --start-time or --since is set") | ||
| } | ||
| } |
There was a problem hiding this comment.
Reject invalid time ranges before calling RHOBS.
--since=-5m pushes startTime into the future, and --start-time after --end-time currently falls through to the backend. Failing fast here will avoid confusing empty/error responses from the new range mode.
💡 Suggested guardrails
if cmd.Flags().Changed("since") {
+ if duration <= 0 {
+ return fmt.Errorf("--since must be greater than 0")
+ }
startTime = nowTime.Add(-duration)
} else if !cmd.Flags().Changed("start-time") {
startTime = defaultStartTime
}
@@
if cmd.Flags().Changed("step") {
if !cmd.Flags().Changed("start-time") && !cmd.Flags().Changed("since") {
return fmt.Errorf("--step can only be set if --start-time or --since is set")
}
}
+ if (cmd.Flags().Changed("since") || cmd.Flags().Changed("start-time")) && !endTime.After(startTime) {
+ return fmt.Errorf("--end-time must be after --start-time")
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if cmd.Flags().Changed("since") { | |
| startTime = nowTime.Add(-duration) | |
| } else if !cmd.Flags().Changed("start-time") { | |
| startTime = defaultStartTime | |
| } | |
| if cmd.Flags().Changed("end-time") { | |
| if !cmd.Flags().Changed("start-time") { | |
| return fmt.Errorf("--end-time can only be set if --start-time is set") | |
| } | |
| } else { | |
| endTime = nowTime | |
| } | |
| if cmd.Flags().Changed("step") { | |
| if !cmd.Flags().Changed("start-time") && !cmd.Flags().Changed("since") { | |
| return fmt.Errorf("--step can only be set if --start-time or --since is set") | |
| } | |
| } | |
| if cmd.Flags().Changed("since") { | |
| if duration <= 0 { | |
| return fmt.Errorf("--since must be greater than 0") | |
| } | |
| startTime = nowTime.Add(-duration) | |
| } else if !cmd.Flags().Changed("start-time") { | |
| startTime = defaultStartTime | |
| } | |
| if cmd.Flags().Changed("end-time") { | |
| if !cmd.Flags().Changed("start-time") { | |
| return fmt.Errorf("--end-time can only be set if --start-time is set") | |
| } | |
| } else { | |
| endTime = nowTime | |
| } | |
| if cmd.Flags().Changed("step") { | |
| if !cmd.Flags().Changed("start-time") && !cmd.Flags().Changed("since") { | |
| return fmt.Errorf("--step can only be set if --start-time or --since is set") | |
| } | |
| } | |
| if (cmd.Flags().Changed("since") || cmd.Flags().Changed("start-time")) && !endTime.After(startTime) { | |
| return fmt.Errorf("--end-time must be after --start-time") | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cmd/rhobs/metricsCmd.go` around lines 39 - 55, After computing
startTime/endTime (using nowTime, duration and cmd.Flags().Changed), validate
the time range and fail fast: ensure that when --since is used the computed
startTime (nowTime.Add(-duration)) is not in the future (startTime <= nowTime)
and when --end-time is set ensure startTime <= endTime; if either check fails
return a clear error (e.g. "--since results in start time in the future" or
"--start-time must be <= --end-time"). Add these checks immediately after the
existing startTime/endTime logic (the block using cmd.Flags().Changed,
startTime, endTime, duration, nowTime) so invalid ranges are rejected before
calling RHOBS.
| evalTimeStr := strconv.FormatInt(evalTime.UnixNano(), 10) | ||
| evalTimePStr = &evalTimeStr |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Prometheus HTTP API /api/v1/query timeparameter accepted format and units. If a client sendstime=, is that valid, or should it send Unix seconds / RFC3339 instead? Please also confirm whether Observatorium's instant query endpoint follows the same contract.
💡 Result:
Prometheus /api/v1/query time accepted format and units 1) Accepted formats (input) - The time query parameter is an “Evaluation timestamp” and is optional; if omitted, Prometheus uses the current server time [1][2]. - Prometheus accepts time as either: - RFC3339/RFC3339Nano timestamp string, or - Unix timestamp in seconds (allowing optional decimal places for sub-second precision) [3][4]. 2) Are “Unix nanoseconds” accepted (e.g., time=<unix nanoseconds>)? - No—Prometheus documents time as Unix timestamp in seconds (not nanoseconds) [3][4]. - Internally (in promtool) Prometheus parses time either as a float number of seconds (converted to time.Unix seconds + fractional part as nanoseconds) or as RFC3339Nano [5]. This float-in-seconds behavior is consistent with the “Unix timestamp in seconds” contract, not integer nanoseconds [5][3][4]. Recommendation - Send time as Unix seconds (optionally with decimals) or an RFC3339 timestamp string (e.g., 2026-05-29T12:34:56Z) rather than unix nanoseconds. Observatorium instant query endpoint contract - Observatorium’s API specification for instant queries at /api/metrics/v1/{tenant}/api/v1/query documents the time parameter as format: rfc3339 | unix_timestamp [6]. - This indicates Observatorium follows the same general contract as Prometheus for instant query timestamps: RFC3339 or Unix timestamp (seconds-based unix_timestamp per the Prometheus API contract) [6][3][4].
Citations:
- 1: https://prometheus.io/docs/prometheus/latest/querying/api/
- 2: https://github.com/prometheus/prometheus/blob/main/docs/querying/api.md
- 3: https://github.com/openai/prometheus/blob/f345889f/docs/querying/api.md
- 4: https://github.com/openshift/prometheus/blob/7935baed/docs/querying/api.md
- 5: https://github.com/prometheus/prometheus/blob/0279e14d/cmd/promtool/query.go
- 6: https://raw.githubusercontent.com/observatorium/api/main/client/spec.yaml
Fix instant-query time format (use Unix seconds or RFC3339, not nanoseconds)
cmd/rhobs/requests.go lines 567-568 passes evalTime.UnixNano() as the Prometheus/Observatorium time parameter. Prometheus /api/v1/query accepts RFC3339/RFC3339Nano or Unix timestamps in seconds (optionally fractional), so nanoseconds will shift the evaluation time far into the future and miss data. Update the instant query path to send evalTime.Unix() (optionally with fractional seconds) or an RFC3339 timestamp instead.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cmd/rhobs/requests.go` around lines 567 - 568, The code currently sets
evalTimeStr from evalTime.UnixNano() and assigns it to evalTimePStr, which sends
nanoseconds to the Prometheus instant-query `time` parameter; change this to use
seconds or RFC3339 instead. In the function handling the instant query (look for
evalTimeStr / evalTimePStr and where evalTime is computed), replace
evalTime.UnixNano() with either evalTime.Unix() (or a fractional seconds string
like fmt.Sprintf("%d.%09d", evalTime.Unix(), evalTime.Nanosecond()) if fractions
are needed) or use evalTime.Format(time.RFC3339Nano) to produce an RFC3339
timestamp, then set evalTimeStr to that value and keep assigning it to
evalTimePStr.
| func (f *RhobsFetcher) PrintInstantMetrics(promExpr string, evalTime time.Time, format MetricsFormat, isPrintingClusterResultsOnly bool) error { | ||
| results, err := f.QueryInstantMetrics(context.TODO(), promExpr, evalTime) | ||
| if err != nil { | ||
| return err | ||
| } | ||
|
|
||
| results = filterMetricsResults(f, results, isPrintingClusterResultsOnly) | ||
| createInstantMetricsPrinter(format)(results) |
There was a problem hiding this comment.
Thread the caller context through the print helpers.
These wrappers issue RHOBS requests, but they always create context.TODO(). That drops Cobra cancellation/timeout propagation and makes Ctrl+C wait on the remote call instead of aborting promptly. As per coding guidelines, **/*.go: context.Context for cancellation and timeouts.
Also applies to: 696-703, 893-899
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cmd/rhobs/requests.go` around lines 595 - 602, The print helper functions
(e.g., PrintInstantMetrics) create context.TODO() and thus drop caller
cancellation; update these APIs to accept a context.Context parameter and thread
it into calls like QueryInstantMetrics and the printer constructors (e.g.,
createInstantMetricsPrinter) so that remote requests use the caller-provided
ctx; change signatures of the other similar wrappers (the ones at the other
occurrences) to accept ctx and pass it through to their underlying RHOBS queries
and printers to allow Cobra Ctrl+C/timeouts to propagate.
| logsPrinter := createLogsPrinter(format, isPrintingTimeValue, fieldNames) | ||
| logsPrinter.PrintHeader() |
There was a problem hiding this comment.
Always finalize the streaming printer.
PrintHeader() is not paired with a deferred PrintTrailer(), so JSON/CSV output stays unterminated when websocket setup fails before the normal footer runs. Defer the trailer right after the header and remove the manual trailer in the read-error path to avoid double-closing.
💡 Suggested fix
logsPrinter := createLogsPrinter(format, isPrintingTimeValue, fieldNames)
logsPrinter.PrintHeader()
+ defer logsPrinter.PrintTrailer()
@@
- closeWebSocketInContext(false)
- logsPrinter.PrintTrailer()
-
+ closeWebSocketInContext(false)
return fmt.Errorf("failed to read JSON from websocket: %v", err)
}
@@
- logsPrinter.PrintTrailer()
-
return nil
}Also applies to: 1174-1176, 1198-1201, 1209-1209
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cmd/rhobs/requests.go` around lines 1046 - 1047, After calling
createLogsPrinter(...) and immediately invoking logsPrinter.PrintHeader(), add a
deferred call to logsPrinter.PrintTrailer() (defer logsPrinter.PrintTrailer())
so the streaming printer is always finalized even if websocket setup fails; then
remove any explicit/manual calls to logsPrinter.PrintTrailer() in the read-error
/ error-return paths (to avoid double-closing). Apply this change for each place
where logsPrinter.PrintHeader() is used (the blocks that create logsPrinter,
call PrintHeader, and currently call PrintTrailer on error).
|
|
||
| ### Synopsis | ||
|
|
||
| Fetch logs from RHOBS for a given cluster. The cluster can be a management cluster (MC) or whatever cluster sending logs to RHOBS; the command works as if the management cluster ID was passed if given a hosted cluster (HCP) ID. By default, logs from all the pods in the given namespace are returned but it is possible to specify a single pod as an argument or filter pods using their labels. Logs themselves can be also filtered to only keep the ones containing a given regexp (--contain option) or a given log level (--level option). |
There was a problem hiding this comment.
Synopsis maps regexp filtering to the wrong flag.
This line says regexp filtering is done with --contain, but options define regexp filtering with --contain-regex. Please correct the synopsis to match behavior.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/osdctl_rhobs_logs.md` at line 7, Synopsis text incorrectly references
the --contain flag for regexp filtering; update the documentation sentence in
docs/osdctl_rhobs_logs.md so it names the actual flag --contain-regex (i.e.,
change "regexp (--contain option)" to "regexp (--contain-regex option)") so the
synopsis matches the defined CLI options.
|
|
||
| ### Synopsis | ||
|
|
||
| Fetch metrics from RHOBS for a given cluster. The cluster can be a hosted cluster (HCP), a management cluster (MC) or whatever cluster sending metrics to RHOBS. The prometheus expression provided as an argument can be either an instant query or a range query; it is optional if the --web option is set. By default, the command will try to evaluate the expression as an instant query at the current time, but it is possible to specify a different evaluation time using the --time option or a time range using the --start-time, --end-time and --since options.Results can be filtered to only keep the ones matching the given cluster (--cluster-id option) with the --filter option even if it is more efficient to do that filtering at the prometheus expression level. |
There was a problem hiding this comment.
--web mention appears out of contract with listed flags.
The synopsis says the expression is optional when --web is set, but --web is not present in the options for this command. Please sync the synopsis with the actual CLI surface.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/osdctl_rhobs_metrics.md` at line 7, The synopsis text in
docs/osdctl_rhobs_metrics.md refers to a --web option but that flag is not
defined in the command options; update the docs so they match: either remove the
reference to "--web" from the synopsis sentence about the expression being
optional, or add a documented "--web" flag entry (with its short/long form,
description and behavior) to the options list for the osdctl rhobs metrics
command so the synopsis and options are consistent; ensure the edited text
around the sentence "The prometheus expression ... it is optional if the --web
option is set" is updated accordingly.
| ### osdctl rhobs logs | ||
|
|
||
| Fetch logs from RHOBS for a given cluster | ||
| Fetch logs from RHOBS for a given cluster. The cluster can be a management cluster (MC) or whatever cluster sending logs to RHOBS; the command works as if the management cluster ID was passed if given a hosted cluster (HCP) ID. By default, logs from all the pods in the given namespace are returned but it is possible to specify a single pod as an argument or filter pods using their labels. Logs themselves can be also filtered to only keep the ones containing a given regexp (--contain option) or a given log level (--level option). |
There was a problem hiding this comment.
Regex flag is misdocumented in synopsis.
The synopsis says regex filtering uses --contain, but regex support is documented under --contain-regex. This can mislead users about which filter actually accepts regexp input.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/README.md` at line 4342, Update the README synopsis to correctly
document the regex filter: replace or supplement the mention of --contain with
the correct flag --contain-regex (or explicitly state that --contain is literal
substring match while --contain-regex accepts regular expressions) so the
synopsis matches the detailed flag descriptions; ensure both flags (--contain
and --contain-regex) are referenced unambiguously in the synopsis text to avoid
user confusion.
| ### osdctl rhobs metrics | ||
|
|
||
| Fetch metrics from RHOBS for a given cluster | ||
| Fetch metrics from RHOBS for a given cluster. The cluster can be a hosted cluster (HCP), a management cluster (MC) or whatever cluster sending metrics to RHOBS. The prometheus expression provided as an argument can be either an instant query or a range query; it is optional if the --web option is set. By default, the command will try to evaluate the expression as an instant query at the current time, but it is possible to specify a different evaluation time using the --time option or a time range using the --start-time, --end-time and --since options.Results can be filtered to only keep the ones matching the given cluster (--cluster-id option) with the --filter option even if it is more efficient to do that filtering at the prometheus expression level. |
There was a problem hiding this comment.
--web is referenced but not documented in this command help.
The synopsis states the PromQL expression is optional when --web is set, but --web does not appear in this command’s options/inherited options. Please align the synopsis with the actual exposed flags.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/README.md` at line 4450, The synopsis mentions a --web option ("the
prometheus expression ... is optional if the --web option is set") but that flag
is not present in the command's options; either add documentation for the --web
flag to the command's options/inherited options or remove/replace the --web
mention in the synopsis so they match; update the sentence beginning "By
default, the command will try to evaluate..." and the command's options list in
the same README section so the synopsis and exposed flags are consistent.
Follow up of this PR:
#876
osdctl rhobs logs --follow--timeoption) or with a time range (new--start-timeand--sinceoptions)Summary by CodeRabbit
New Features
--start-time,--end-time,--since,--step, and--timeflagsBug Fixes
Documentation