Skip to content

Conversation

@sosiouxme
Copy link
Member

@sosiouxme sosiouxme commented Dec 23, 2025

This prepares sippy for the work being done in the cloud function to begin applying labels automatically. Some is to make sippy's code reusable in that function, some is just to adjust to the updated design for labels.

Individual commits represent bite-sized concepts for review.

/hold
This requires an accompanying schema update which should be performed once approved for merge.

Summary by CodeRabbit

  • New Features

    • Detect truncated artifact content in matches.
    • Add Sippy API client and job-symptoms endpoints (full CRUD).
    • Introduce JobRunLabel model with timestamps, source tool, symptom association, and richer job-run labeling.
  • Improvements

    • Artifact reading transparently handles gzip-compressed content.
    • Artifact content lookups now honor cancellation/timeouts for more responsive processing.

✏️ Tip: You can customize this high-level summary in your review settings.

@openshift-ci-robot
Copy link

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Dec 23, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Dec 23, 2025
@openshift-ci-robot
Copy link

openshift-ci-robot commented Dec 23, 2025

@sosiouxme: This pull request references TRT-2381 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

This prepares sippy for the work being done in the cloud function to begin applying labels automatically. Some is to make sippy's code reusable in that function, some is just to adjust to the updated design for labels.

Individual commits represent bite-sized concepts for review.

/hold
This requires an accompanying schema update which should be performed once approved for merge.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 23, 2025

Walkthrough

Adds artifact reader abstraction with gzip handling and context propagation, exposes truncation detection on content matches, migrates job-run annotations to a public JobRunLabel model with JSON-wrapped comments, adds a DB model for job labels, and implements a Sippy HTTP client plus a Symptoms CRUD wrapper.

Changes

Cohort / File(s) Summary
Job Artifacts API & Reader
pkg/api/jobartifacts/apitypes.go, pkg/api/jobartifacts/query.go, pkg/api/jobartifacts/query_test.go, pkg/api/jobartifacts/manager.go
Adds MatchedContent.HasMatches() and ContentLineMatches.Truncated. Introduces OpenArtifactReader(ctx, file, contentType) (returns *bufio.Reader and closer) with gzip-aware handling. Refactors getFileContentMatches to accept context.Context and use the new reader; tests and worker call sites updated to pass context.
Job Run Annotation Refactoring
pkg/componentreadiness/jobrunannotator/jobrunannotator.go
Replaces internal jobRunAnnotation with models.JobRunLabel across fetch/assembly/insert flows. Adds LabelComment type and changes generateComment to emit JSON-wrapped comments. Updates function signatures and in-memory handling accordingly.
Database Models
pkg/db/models/job_labels.go
Adds exported models.JobRunLabel struct mapping to BigQuery columns (ID, StartTime, Label, Comment, User, CreatedAt, UpdatedAt, SourceTool, SymptomID) and an unexported URL field; includes schema migration guidance in comments.
Sippy API Client & Symptoms
pkg/sippyclient/client.go, pkg/sippyclient/jobrunscan/symptoms.go
Adds sippyclient.Client with functional options (WithServerURL, WithToken, WithHTTPClient) and methods for GET/POST/PUT/DELETE JSON requests with optional Bearer token. Adds jobrunscan.SymptomsClient offering List/Get/Create/Update/Delete symptom operations with ID validation.

Sequence Diagram(s)

mermaid
sequenceDiagram
autonumber
participant Annotator as JobRunAnnotator
participant BQRead as BigQuery (read)
participant Builder as In-memory builder
participant BQWrite as BigQuery (insert)
Note over Annotator,BQRead: Annotation flow now uses models.JobRunLabel and JSON-wrapped LabelComment
Annotator->>BQRead: getJobRunAnnotationsFromBigQuery(ctx)
BQRead-->>Annotator: rows -> decoded into models.JobRunLabel
Annotator->>Builder: generate labels (models.JobRunLabel), call generateComment → LabelComment JSON
Builder-->>Annotator: prepared inserts
Annotator->>BQWrite: bulkInsertJobRunAnnotations(ctx, inserts)
BQWrite-->>Annotator: insert result / errors

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Pre-merge checks and finishing touches

Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (2 warnings, 1 inconclusive)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 77.78% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
Go Error Handling ⚠️ Warning Pull request contains error handling issues violating Go best practices including ignored errors without justification and inappropriate panic usage. Add explanatory comments for ignored errors, convert panic in CacheKeyForJobRun to proper error handling, and use %w instead of %v for error wrapping.
Sql Injection Prevention ❓ Inconclusive Cannot access internal source code files. The specified file path appears to be from a private codebase and requires direct repository access to inspect the BigQuery query construction in getJobRunAnnotationsFromBigQuery. Provide the source code file contents or repository access to verify parameter placeholder usage instead of string concatenation.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly references a ticket (TRT-2381) and describes the main objective: preparing prerequisites for writing job labels via symptoms. This accurately reflects the primary changes across the codebase.
Excessive Css In React Should Use Styles ✅ Passed This pull request contains only Go backend code with no React components or inline CSS styling, making this check not applicable.
Single Responsibility And Clear Naming ✅ Passed The pull request demonstrates strong adherence to single responsibility and clear naming principles with specific, action-oriented names.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci bot requested review from deepsm007 and neisw December 23, 2025 01:14
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 23, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sosiouxme

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 23, 2025
@openshift-ci-robot
Copy link

openshift-ci-robot commented Dec 23, 2025

@sosiouxme: This pull request references TRT-2381 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

This prepares sippy for the work being done in the cloud function to begin applying labels automatically. Some is to make sippy's code reusable in that function, some is just to adjust to the updated design for labels.

Individual commits represent bite-sized concepts for review.

/hold
This requires an accompanying schema update which should be performed once approved for merge.

Summary by CodeRabbit

  • New Features

  • Added support for detecting truncated content in artifact matches.

  • Introduced new Sippy API integration for managing job symptoms with full CRUD operations.

  • Enhanced job run labels with metadata fields including creation timestamps, source tool tracking, and symptom association.

  • Improvements

  • Improved artifact file reading with transparent gzip decompression support.

✏️ Tip: You can customize this high-level summary in your review settings.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (4)
pkg/api/jobartifacts/query.go (1)

182-183: Consider propagating the caller's context instead of context.Background().

Using context.Background() bypasses any timeout or cancellation that might be set by the caller. If the caller's context is cancelled, this operation will continue running.

🔎 Proposed fix

The getFileContentMatches method should accept a context parameter and pass it through:

-func (q *JobArtifactQuery) getFileContentMatches(jobRunID int64, attrs *storage.ObjectAttrs) (artifact JobRunArtifact) {
+func (q *JobArtifactQuery) getFileContentMatches(ctx context.Context, jobRunID int64, attrs *storage.ObjectAttrs) (artifact JobRunArtifact) {
 	artifact.JobRunID = strconv.FormatInt(jobRunID, 10)
 	artifact.ArtifactPath = relativeArtifactPath(attrs.Name, artifact.JobRunID)
 	artifact.ArtifactContentType = attrs.ContentType
 	artifact.ArtifactURL = fmt.Sprintf(artifactURLFmt, util.GcsBucketRoot, attrs.Name)
 	if q.ContentMatcher == nil { // no matching requested
 		return
 	}

-	reader, closer, err := OpenArtifactReader(context.Background(), q.GcsBucket.Object(attrs.Name), attrs.ContentType)
+	reader, closer, err := OpenArtifactReader(ctx, q.GcsBucket.Object(attrs.Name), attrs.ContentType)
pkg/db/models/job_labels.go (1)

19-19: Consider using URL instead of Url for Go naming conventions.

Go conventions prefer URL as it's an acronym. However, this is a minor style preference.

-	Url        string              `bigquery:"-"`
+	URL        string              `bigquery:"-"`

Note: This would require updating the usage in jobrunannotator.go line 515 as well.

pkg/sippyclient/jobrunscan/symptoms.go (1)

32-40: Consider validating the id parameter.

Empty or malformed id values could result in unexpected API paths (e.g., /api/jobs/symptoms/ instead of /api/jobs/symptoms/{id}). While the server should reject these, client-side validation provides earlier, clearer feedback.

🔎 Proposed fix
 // Get retrieves a single symptom by ID
 func (sc *SymptomsClient) Get(ctx context.Context, id string) (*jobrunscan.Symptom, error) {
+	if id == "" {
+		return nil, fmt.Errorf("symptom id cannot be empty")
+	}
 	var symptom jobrunscan.Symptom
 	path := fmt.Sprintf("/api/jobs/symptoms/%s", id)

Apply similar validation to Update and Delete methods.

pkg/sippyclient/client.go (1)

144-146: Avoid unnecessary byte-to-string conversion.

strings.NewReader(string(bodyBytes)) creates an extra string allocation. Use bytes.NewReader directly.

🔎 Proposed fix

Add "bytes" to the imports and update the code:

+import (
+	"bytes"
 	"context"
 	...
 )
-		bodyReader = strings.NewReader(string(bodyBytes))
+		bodyReader = bytes.NewReader(bodyBytes)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between ce79f9a and c0e3183.

📒 Files selected for processing (6)
  • pkg/api/jobartifacts/apitypes.go
  • pkg/api/jobartifacts/query.go
  • pkg/componentreadiness/jobrunannotator/jobrunannotator.go
  • pkg/db/models/job_labels.go
  • pkg/sippyclient/client.go
  • pkg/sippyclient/jobrunscan/symptoms.go
🧰 Additional context used
🧬 Code graph analysis (1)
pkg/sippyclient/jobrunscan/symptoms.go (1)
pkg/sippyclient/client.go (1)
  • Client (20-24)
🔇 Additional comments (8)
pkg/api/jobartifacts/apitypes.go (1)

44-48: LGTM!

The HasMatches() helper method is well-implemented with proper nil-safety checks. The comment about future matcher types provides good context for maintainability.

pkg/api/jobartifacts/query.go (1)

197-234: LGTM!

The OpenArtifactReader function is well-designed with a robust closer pattern. The nil-safety in the closer function ensures it can always be safely deferred, even when errors occur during initialization. Good use of the transparent gzip handling.

pkg/db/models/job_labels.go (1)

8-20: LGTM on the model definition.

The JobRunLabel struct is well-documented with appropriate BigQuery field mappings. Good use of bigquery.NullString for the optional User field and civil.DateTime for timestamp fields.

pkg/sippyclient/jobrunscan/symptoms.go (1)

11-21: LGTM on the client structure and constructor.

Clean implementation following the functional options pattern from the parent sippyclient.Client. The composition pattern is appropriate for extending the API surface.

pkg/sippyclient/client.go (2)

86-89: Note: Status code handling differs between Get and doJSONRequest.

Get only accepts 200, while doJSONRequest (used by Post/Put) accepts the full 2xx range. This is likely intentional since GET requests typically return 200, but worth documenting if this is the expected behavior.


50-64: LGTM on the constructor and options pattern.

Clean implementation of the functional options pattern with sensible defaults (30s timeout, localhost for development). The design allows for easy customization via WithServerURL, WithToken, and WithHTTPClient.

pkg/componentreadiness/jobrunannotator/jobrunannotator.go (2)

416-422: LGTM on LabelComment type.

The versioned approach (job_run_annotator_v1) is a good practice for schema evolution. Exporting the type enables external tools to parse label comments consistently.


505-516: LGTM on the JobRunLabel construction.

The new fields (CreatedAt, UpdatedAt, SourceTool, SymptomID) are properly populated. The comment on SymptomID clearly indicates future work.

Comment on lines 424 to 436
func (j JobRunAnnotator) generateComment() string {
comment := j.comment
annotatorComment, err := json.MarshalIndent(j, "", " ")
if err == nil {
comment += fmt.Sprintf("\nAnnotator\n%s", annotatorComment)
comment := LabelComment{
Comment: j.comment, // apparently separated out in order to stand out from the object
V1: j,
}
str, err := json.MarshalIndent(comment, "", " ")
if err != nil {
log.WithError(err).Error("error generating JobAnnotator comment")
}
return comment
return string(str)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Error in generateComment results in empty string, not propagated.

If json.MarshalIndent fails, the error is logged but an empty string is returned. This could lead to labels being inserted with empty comments, masking the underlying issue.

🔎 Proposed fix

Consider either returning a fallback comment or propagating the error:

 func (j JobRunAnnotator) generateComment() string {
 	comment := LabelComment{
 		Comment: j.comment,
 		V1:      j,
 	}
 	str, err := json.MarshalIndent(comment, "", "    ")
 	if err != nil {
 		log.WithError(err).Error("error generating JobAnnotator comment")
+		// Return a fallback comment with the original prefix
+		return j.comment
 	}
 	return string(str)
 }

Alternatively, if this should never fail, consider using panic similar to the pattern in CacheKeyForJobRun (line 278 in query.go).

🤖 Prompt for AI Agents
In pkg/componentreadiness/jobrunannotator/jobrunannotator.go around lines 424 to
434, generateComment currently logs json.MarshalIndent errors and returns an
empty string which masks failures; change generateComment to return (string,
error) instead of string, call json.MarshalIndent and on error return "", err
(or if you intentionally want a hard failure, replace the log+return with a
panic consistent with CacheKeyForJobRun), then update all call sites to handle
the error (propagate or handle fallback) and add/update tests to cover marshal
failure behavior.

@sosiouxme sosiouxme force-pushed the 20251216-label-symptoms-CF branch from c0e3183 to dc2bf48 Compare December 24, 2025 03:34
@openshift-ci-robot
Copy link

openshift-ci-robot commented Dec 24, 2025

@sosiouxme: This pull request references TRT-2381 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

This prepares sippy for the work being done in the cloud function to begin applying labels automatically. Some is to make sippy's code reusable in that function, some is just to adjust to the updated design for labels.

Individual commits represent bite-sized concepts for review.

/hold
This requires an accompanying schema update which should be performed once approved for merge.

Summary by CodeRabbit

  • New Features

  • Detect truncated artifact content in matches.

  • Add Sippy API client and job-symptoms endpoints (full CRUD) for interacting with symptoms.

  • Introduce a JobRunLabel model and richer job-run labeling with creation/updated timestamps, source tool, and symptom association.

  • Improvements

  • Artifact reading now transparently handles gzip-compressed content.

✏️ Tip: You can customize this high-level summary in your review settings.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
pkg/sippyclient/client.go (1)

68-97: Consider nil-check for consistency with doJSONRequest.

Line 92 decodes the response without checking if result is nil, while doJSONRequest (line 171) does perform this check. For consistency and defensive programming, consider adding a nil check here as well.

🔎 Optional defensive nil check
 	}
 
+	if result != nil {
 		if err := json.NewDecoder(resp.Body).Decode(result); err != nil {
 			return fmt.Errorf("failed to decode response: %w", err)
 		}
+	}
 
 	return nil
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between c0e3183 and dc2bf48.

📒 Files selected for processing (8)
  • pkg/api/jobartifacts/apitypes.go
  • pkg/api/jobartifacts/manager.go
  • pkg/api/jobartifacts/query.go
  • pkg/api/jobartifacts/query_test.go
  • pkg/componentreadiness/jobrunannotator/jobrunannotator.go
  • pkg/db/models/job_labels.go
  • pkg/sippyclient/client.go
  • pkg/sippyclient/jobrunscan/symptoms.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/api/jobartifacts/apitypes.go
  • pkg/db/models/job_labels.go
🧰 Additional context used
🧬 Code graph analysis (4)
pkg/api/jobartifacts/query.go (1)
pkg/api/jobartifacts/apitypes.go (2)
  • JobRunArtifact (29-37)
  • MatchedContent (39-42)
pkg/sippyclient/jobrunscan/symptoms.go (1)
pkg/sippyclient/client.go (1)
  • Client (21-25)
pkg/componentreadiness/jobrunannotator/jobrunannotator.go (2)
pkg/db/models/job_labels.go (1)
  • JobRunLabel (8-19)
vendor/github.com/invopop/jsonschema/id.go (1)
  • ID (12-12)
pkg/api/jobartifacts/query_test.go (3)
pkg/api/jobartifacts/content_line_matcher.go (1)
  • NewStringMatcher (74-81)
pkg/api/jobartifacts/apitypes.go (2)
  • MatchedContent (39-42)
  • ContentLineMatches (50-54)
pkg/api/jobartifacts/query.go (1)
  • ContentMatcher (237-246)
🔇 Additional comments (20)
pkg/api/jobartifacts/query_test.go (1)

50-69: LGTM! Context propagation properly implemented in tests.

The test updates correctly pass context.Context as the first parameter to getFileContentMatches, aligning with the new signature. The use of context.Background() is appropriate for test scenarios.

pkg/api/jobartifacts/manager.go (1)

269-269: LGTM! Proper context propagation for artifact queries.

The updated call correctly passes request.ctx to getFileContentMatches, enabling proper cancellation and timeout handling during artifact content matching. This aligns with the artifactQueryTimeout constraint defined at line 22.

pkg/api/jobartifacts/query.go (2)

173-195: LGTM! Well-designed refactoring with proper context propagation.

The refactored getFileContentMatches function correctly:

  • Accepts context for cancellation/timeout support
  • Uses the new OpenArtifactReader abstraction to handle gzip content transparently
  • Properly manages resources via the closer function with defer
  • Preserves incomplete matches even when scanning encounters errors (line 193), which is valuable for debugging

The explicit field assignments from attrs (lines 175-177) improve clarity.


197-234: LGTM! Excellent resource management pattern.

The OpenArtifactReader function is well-designed:

  • Closer pattern: Always returns a non-nil closer function that's safe to defer, even on error. The closer checks for nil before closing each reader (lines 206-211), preventing panics.
  • Transparent gzip handling: Automatically detects and decompresses gzipped content based on contentType (line 221).
  • Context propagation: Passes context to NewReader (line 215) for proper cancellation support.
  • Defensive nil assignments (lines 217, 225): Explicit nil assignments before returning on error make the intent clear and ensure the closer behaves correctly, even though these assignments are technically redundant given Go's error conventions.
pkg/componentreadiness/jobrunannotator/jobrunannotator.go (5)

416-422: LGTM! Well-designed extensible comment structure.

The exported LabelComment type provides a clean, extensible way to serialize tool-specific context as JSON. The design allows different tools to add their own keys while preserving a common comment field.


424-436: LGTM! Error handling improved with fallback comment.

The function now handles json.MarshalIndent errors by returning a fallback comment that includes the error message, rather than an empty string. This ensures failed serialization is visible in the DB and provides debugging information.

This pragmatically addresses the past review concern about masking marshal failures.


438-491: LGTM! Consistent migration to public model.

The function signature and implementation correctly migrated to use models.JobRunLabel throughout. The BigQuery query and row iteration are properly updated.

Note: Consistent with the model definition, the URL field won't be populated from BigQuery due to its bigquery:"-" tag.


493-523: LGTM! Properly populates all JobRunLabel fields.

The function correctly constructs models.JobRunLabel instances with:

  • Required fields mapped from input (ID, StartTime, Label, User, URL)
  • Metadata fields properly initialized (CreatedAt, UpdatedAt, SourceTool)
  • JSON-wrapped comment via generateComment()
  • Intentionally empty SymptomID with clear documentation

The hardcoded SourceTool value is appropriate for this specific annotation tool.


387-393: URL field with bigquery:"-" tag is intentional; no action needed.

The models.JobRunLabel.URL field correctly has the bigquery:"-" tag and is not stored in BigQuery. This is intentional design—the field is populated for dry-run display (line 393) and in-memory use, but persistent storage is not required since the job can be looked up by its ID (prowjob_build_id). The table schema confirms URL is not among the persisted fields.

pkg/sippyclient/client.go (6)

14-25: LGTM! Clean client structure.

The Client struct and default constants are well-designed. The comment explaining the intended usage (cloud functions, CLI tools) provides helpful context for understanding why this code may not be used within Sippy itself.


27-49: LGTM! Idiomatic functional options pattern.

The functional options are well-implemented. Line 33's strings.TrimSuffix is particularly good practice to ensure clean URL construction.


52-65: LGTM! Well-structured constructor.

The constructor properly initializes defaults and applies options. The 30-second timeout is a sensible default for API calls.


100-107: LGTM! Clean delegation pattern.

Post and Put methods correctly delegate to the shared doJSONRequest helper.


110-134: LGTM! Proper DELETE semantics.

The method correctly handles both StatusOK (200) and StatusNoContent (204) responses, which are both valid for DELETE operations.


137-178: LGTM! Robust JSON request handler.

The method properly handles all aspects of JSON requests:

  • Marshals request body only when present
  • Sets appropriate headers
  • Checks for all 2xx success codes (appropriate for POST/PUT)
  • Defensively checks if result is nil before decoding
pkg/sippyclient/jobrunscan/symptoms.go (5)

12-22: LGTM! Clean wrapper design.

The SymptomsClient properly wraps the generic HTTP client, providing good separation between HTTP mechanics and domain-specific API logic.


24-31: LGTM! Straightforward List implementation.

The method correctly retrieves all symptoms and wraps errors appropriately.


49-56: LGTM! Clean Create implementation.

The method correctly creates a new symptom and returns the server's response.


58-70: LGTM! Correct Update implementation.

The method properly validates the ID and updates the symptom.


72-83: LGTM! Proper Delete implementation.

The method correctly validates the ID and performs the deletion.

Comment on lines +33 to +47
var idSanityRegex = regexp.MustCompile(`^\w+$`) // slightly permissive but close enough to be safe

// Get retrieves a single symptom by ID
func (sc *SymptomsClient) Get(ctx context.Context, id string) (*jobrunscan.Symptom, error) {
if !idSanityRegex.MatchString(id) {
return nil, fmt.Errorf("invalid symptom ID: '%s'", id)
}

var symptom jobrunscan.Symptom
path := fmt.Sprintf("/api/jobs/symptoms/%s", id)
if err := sc.client.Get(ctx, path, &symptom); err != nil {
return nil, fmt.Errorf("failed to get symptom %s: %w", id, err)
}
return &symptom, nil
}
Copy link
Contributor

@coderabbitai coderabbitai bot Dec 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find examples of symptom IDs and the Symptom struct definition

echo "=== Searching for symptom ID patterns ==="
rg -n -C3 'symptom.*[Ii][Dd]|Symptom.*ID' --type=go 2>/dev/null | head -50

echo ""
echo "=== Searching for Symptom struct definition ==="
ast-grep --pattern 'type Symptom struct {
  $$$
}'

echo ""
echo "=== Checking for UUID patterns in symptoms context ==="
rg -n 'Symptom|symptom' --type=go | rg -i 'uuid|guid|[0-9a-f]{8}-' | head -20

echo ""
echo "=== Looking for test files related to symptoms ==="
fd -e go --path '*symptom*' --path '*test*'

Repository: openshift/sippy

Length of output: 6326


🏁 Script executed:

#!/bin/bash
echo "=== Looking for test files and example symptom IDs ==="
fd -type f -name '*symptom*.go' | head -20

echo ""
echo "=== Searching for symptom creation/test examples ==="
rg -n -A5 'Symptom\{' --type=go | grep -E '(ID|Test|func)' | head -40

echo ""
echo "=== Checking server-side symptom validation ==="
rg -n 'idSanityRegex|validateSymptom|symptom.*ID' --type=go -B2 -A2 | grep -v Binary

echo ""
echo "=== Looking for actual ID constraint enforcement ==="
rg -n 'not.*start|digit|[Ii]dentifier.*valid' --type=go | head -30

Repository: openshift/sippy

Length of output: 10968


🏁 Script executed:

#!/bin/bash
echo "=== Finding validIdentifierRegex definition ==="
rg -n 'validIdentifierRegex\s*=' --type=go -A1

echo ""
echo "=== Full context of validateSymptom function ==="
rg -n 'func validateSymptom' --type=go -A15

echo ""
echo "=== Check the full pkg/api/jobrunscan/symptoms.go file ==="
head -50 pkg/api/jobrunscan/symptoms.go

Repository: openshift/sippy

Length of output: 3003


Fix client-side ID validation to match server-side constraints.

The idSanityRegex pattern (^\w+$) is more permissive than the server-side validation in pkg/api/jobrunscan/helpers.go, which requires identifiers to start with a letter or underscore (^[a-zA-Z_][a-zA-Z0-9_]*$). This creates a validation mismatch where the client accepts IDs that the server will reject. For example, an ID starting with a digit like "123test" would pass client validation but fail server validation. Update the regex to ^\[a-zA-Z_\][a-zA-Z0-9_]*$ to ensure consistency.

🤖 Prompt for AI Agents
In pkg/sippyclient/jobrunscan/symptoms.go around lines 33 to 47, the client-side
ID validation uses idSanityRegex = `^\w+$` which is more permissive than the
server's rule and allows IDs starting with digits; update the regex to enforce
the same constraint as the server by changing it to `^[a-zA-Z_][a-zA-Z0-9_]*$`
so IDs must start with a letter or underscore and contain only letters, digits,
or underscores, ensuring client and server validation match.

Copy link
Member Author

@sosiouxme sosiouxme Dec 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a sanity check, not a validity check. your suggestion was just to check that it not be empty. this also checks that it's not something broken or even maliciously inserted.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sosiouxme Thank you for clarifying the intent! You're absolutely right—a client-side sanity check that prevents obviously malformed or malicious input (like injection attempts or path traversal) while letting the server enforce the precise business rules is a sound architectural pattern. The looser ^\w+$ pattern still blocks special characters and suspicious patterns while allowing the server to be the authoritative validator.

I understand now that this is an intentional design decision, not an oversight. The server-side validation will catch any IDs that don't meet the stricter format requirements.

Label comments should be json with a tool-specific schema.
This was a suggestion from CodeRabbit that seemed to make sense. If the
request context gets canceled, then the results are going to be
abandoned anyway, so it makes sense to let that cancellation pass to the
GCS reader and give it a chance to abort.

It seems right... It should be a minor optimization... But check if
anything breaks.
@sosiouxme sosiouxme force-pushed the 20251216-label-symptoms-CF branch from dc2bf48 to c1d7597 Compare December 24, 2025 03:41
@openshift-ci-robot
Copy link

openshift-ci-robot commented Dec 24, 2025

@sosiouxme: This pull request references TRT-2381 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

This prepares sippy for the work being done in the cloud function to begin applying labels automatically. Some is to make sippy's code reusable in that function, some is just to adjust to the updated design for labels.

Individual commits represent bite-sized concepts for review.

/hold
This requires an accompanying schema update which should be performed once approved for merge.

Summary by CodeRabbit

  • New Features

  • Detect truncated artifact content in matches.

  • Add Sippy API client and job-symptoms endpoints (full CRUD).

  • Introduce JobRunLabel model with timestamps, source tool, symptom association, and richer job-run labeling.

  • Improvements

  • Artifact reading transparently handles gzip-compressed content.

  • Artifact content lookups now honor cancellation/timeouts for more responsive processing.

✏️ Tip: You can customize this high-level summary in your review settings.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
pkg/sippyclient/jobrunscan/symptoms.go (1)

33-47: Client-side ID validation is more permissive than server-side (previously flagged).

The regex ^\w+$ allows IDs starting with digits, but the server requires IDs to match ^[a-zA-Z_][a-zA-Z0-9_]*$. This was already identified in a previous review.

🧹 Nitpick comments (3)
pkg/sippyclient/client.go (2)

87-90: Consider accepting 2xx range for consistency with doJSONRequest.

The Get method only accepts 200 OK, while doJSONRequest (used by Post/Put) accepts any 2xx status. This inconsistency could cause issues if the API returns 204 No Content or other valid 2xx responses for GET requests.

🔎 Proposed fix
-	if resp.StatusCode != http.StatusOK {
+	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
		body, _ := io.ReadAll(resp.Body)
		return fmt.Errorf("unexpected status code %d: %s", resp.StatusCode, string(body))
	}

166-168: Variable shadowing: body shadows the function parameter.

The body variable declared here shadows the body parameter on line 137. While this works, it can be confusing. Consider using a different name like respBody.

🔎 Proposed fix
	if resp.StatusCode < 200 || resp.StatusCode >= 300 {
-		body, _ := io.ReadAll(resp.Body)
-		return fmt.Errorf("unexpected status code %d: %s", resp.StatusCode, string(body))
+		respBody, _ := io.ReadAll(resp.Body)
+		return fmt.Errorf("unexpected status code %d: %s", resp.StatusCode, string(respBody))
	}
pkg/sippyclient/jobrunscan/symptoms.go (1)

49-56: Consider validating symptom.ID in Create method.

Unlike Get, Update, and Delete, the Create method doesn't validate the symptom ID before sending the request. If the symptom struct contains an invalid ID, the error will only be caught server-side.

🔎 Proposed fix
 func (sc *SymptomsClient) Create(ctx context.Context, symptom jobrunscan.Symptom) (*jobrunscan.Symptom, error) {
+	if symptom.ID != "" && !idSanityRegex.MatchString(symptom.ID) {
+		return nil, fmt.Errorf("invalid symptom ID: '%s'", symptom.ID)
+	}
 	var result jobrunscan.Symptom
 	if err := sc.client.Post(ctx, "/api/jobs/symptoms", symptom, &result); err != nil {
 		return nil, fmt.Errorf("failed to create symptom: %w", err)
 	}
 	return &result, nil
 }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Cache: Disabled due to data retention organization setting

Knowledge base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between dc2bf48 and c1d7597.

📒 Files selected for processing (8)
  • pkg/api/jobartifacts/apitypes.go
  • pkg/api/jobartifacts/manager.go
  • pkg/api/jobartifacts/query.go
  • pkg/api/jobartifacts/query_test.go
  • pkg/componentreadiness/jobrunannotator/jobrunannotator.go
  • pkg/db/models/job_labels.go
  • pkg/sippyclient/client.go
  • pkg/sippyclient/jobrunscan/symptoms.go
🧰 Additional context used
🧬 Code graph analysis (3)
pkg/api/jobartifacts/query_test.go (3)
pkg/api/jobartifacts/content_line_matcher.go (1)
  • NewStringMatcher (74-81)
pkg/api/jobartifacts/apitypes.go (2)
  • MatchedContent (39-42)
  • ContentLineMatches (50-54)
pkg/api/jobartifacts/query.go (1)
  • ContentMatcher (237-246)
pkg/componentreadiness/jobrunannotator/jobrunannotator.go (1)
pkg/db/models/job_labels.go (1)
  • JobRunLabel (8-19)
pkg/sippyclient/jobrunscan/symptoms.go (1)
pkg/sippyclient/client.go (1)
  • Client (21-25)
🔇 Additional comments (14)
pkg/db/models/job_labels.go (2)

7-19: LGTM! Well-structured BigQuery model.

The JobRunLabel struct correctly maps to BigQuery with appropriate field types. Good use of civil.DateTime for timestamp fields and the bigquery:"-" tag for the URL field that's only needed in-memory.


21-32: Schema update documentation is helpful.

Including the bq update and SQL ALTER TABLE commands directly in the source code is a practical approach for coordinating schema changes with code deployments. Consider also documenting that these columns should be nullable since existing rows won't have values.

pkg/sippyclient/client.go (1)

51-65: LGTM! Clean functional options pattern.

The constructor with functional options is idiomatic and provides sensible defaults. The 30-second timeout is reasonable for API calls.

pkg/sippyclient/jobrunscan/symptoms.go (1)

12-22: LGTM! Clean client wrapper.

The SymptomsClient properly encapsulates the base client and provides a focused API for symptoms operations.

pkg/componentreadiness/jobrunannotator/jobrunannotator.go (4)

416-436: Good improvement: generateComment now has a sensible fallback.

The previous review flagged that generateComment returned an empty string on error. This has been addressed with a fallback that includes the original comment and error message, making debugging easier while ensuring labels are never inserted with empty comments.


387-414: LGTM! Clean migration to JobRunLabel model.

The bulkInsertJobRunAnnotations correctly uses the new models.JobRunLabel type and properly accesses the capitalized URL field.


438-491: LGTM! BigQuery fetch correctly updated for new model.

The getJobRunAnnotationsFromBigQuery properly returns map[int64]models.JobRunLabel and decodes rows into the new struct type.


493-523: LGTM! annotateJobRuns properly populates all new fields.

Good use of civil.DateTimeOf(time.Now()) for timestamps, and the SourceTool is set to a meaningful identifier. The empty SymptomID for manual annotations is clearly documented.

pkg/api/jobartifacts/query_test.go (2)

48-63: LGTM! Tests properly updated for context-aware API.

The test correctly creates a context and passes it to getFileContentMatches, aligning with the new function signature.


65-76: LGTM! Gzip test updated with context.

The gzip content test properly passes context.Background() to the updated API.

pkg/api/jobartifacts/apitypes.go (1)

44-48: LGTM! Clean helper method for match detection.

The HasMatches() method provides a nil-safe way to check for matches. The comment acknowledging future complexity is helpful for maintainability.

pkg/api/jobartifacts/manager.go (1)

265-279: LGTM! Context properly propagated to content matching.

The worker now passes the request context to getFileContentMatches, enabling proper cancellation and timeout handling for artifact content scanning.

pkg/api/jobartifacts/query.go (2)

173-195: LGTM! Clean refactor with proper error handling.

The getFileContentMatches function now properly uses OpenArtifactReader for transparent gzip handling. Good decision to still return partial matches even when scanning encounters an error - this aids debugging.


197-234: Well-designed resource cleanup pattern in OpenArtifactReader.

The closure-based cleanup approach is elegant and handles all resource cleanup scenarios correctly:

  • Returns closer before error so callers can always defer it
  • Nil-checks in closer prevent issues when readers aren't initialized
  • Setting readers to nil on error prevents unnecessary close attempts

The gzip content type check correctly handles the standard application/gzip MIME type as provided by the GCS API.

@openshift-ci-robot
Copy link

Scheduling required tests:
/test e2e

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 24, 2025

@sosiouxme: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants