
feat: add github/openssf scrapers #1875

Merged
moshloop merged 2 commits into main from security-scrapers
Feb 26, 2026

Conversation

@moshloop
Member

@moshloop moshloop commented Jan 23, 2026

Summary by CodeRabbit

Release Notes

  • New Features

    • Added GitHub repository scraper with security alerts detection (Dependabot, Code Scanning, Secret Scanning).
    • Integrated OpenSSF Scorecard analysis for repository security assessment.
  • Improvements

    • Added rate-limit handling to gracefully manage API throttling with reset timing (see the sketch after these notes).
    • Implemented incremental scraping to optimize repeated repository scans.
  • Tests

    • Added comprehensive test coverage for GitHub scraper functionality.
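
The reset-timing behavior mentioned above is only summarized here. As a rough, hedged illustration (the repository names, the go-github module version in the import path, and the surrounding flow are assumptions, not the PR's actual code), waiting for the rate-limit reset with go-github can look like this:

package main

import (
	"context"
	"errors"
	"fmt"
	"time"

	"github.com/google/go-github/v55/github"
)

func main() {
	ctx := context.Background()
	client := github.NewClient(nil) // unauthenticated client, for illustration only

	alerts, _, err := client.Dependabot.ListRepoAlerts(ctx, "example-org", "example-repo", nil)
	var rateErr *github.RateLimitError
	if errors.As(err, &rateErr) {
		// Wait until the limit resets, then the caller can retry the request.
		wait := time.Until(rateErr.Rate.Reset.Time)
		fmt.Printf("rate limited, retrying in %s\n", wait)
		time.Sleep(wait)
		return
	}
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("fetched %d dependabot alerts\n", len(alerts))
}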

@coderabbitai
Contributor

coderabbitai Bot commented Jan 23, 2026

Warning

Rate limit exceeded

@moshloop has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 14 minutes and 38 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 8f8ab18 and 73cb168.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (18)
  • Makefile
  • api/v1/github.go
  • api/v1/interface.go
  • api/v1/interface_rate_limit_test.go
  • api/v1/scrapeconfig_types.go
  • api/v1/types.go
  • cmd/run.go
  • fixtures/github.yaml
  • go.mod
  • scrapers/common.go
  • scrapers/cron.go
  • scrapers/github/client.go
  • scrapers/github/openssf.go
  • scrapers/github/scraper.go
  • scrapers/github/scraper_e2e_test.go
  • scrapers/github/scraper_test.go
  • scrapers/github/security.go
  • scrapers/run.go

Walkthrough

Adds two new security scrapers and supporting types: a GitHub Security scraper (alerts: Dependabot, Code Scanning, Secret Scanning) and an OpenSSF Scorecard scraper, plus API clients, fixtures, deepcopy generation, registry registration, and small context/AWS tweaks.

Changes

  • API Types & Config (api/v1/github_security.go, api/v1/openssf.go, api/v1/types.go, api/v1/scrapeconfig_types.go)
    Introduce GitHubSecurity and OpenSSFScorecard types (with repository and filter subtypes), add fields to ScraperSpec, and update ScrapeConfig.Type() to detect the new types.
  • DeepCopy Generation (api/v1/zz_generated.deepcopy.go)
    Generated DeepCopyInto/DeepCopy methods for the new GitHubSecurity/OpenSSF types and extended ScraperSpec deep-copy handling.
  • GitHub Security Scraper & Client (scrapers/github/security.go, scrapers/github/security_client.go, scrapers/github/testdata/security/*.json)
    Add GithubSecurityScraper with incremental per-repo alert scraping, rate-limit handling, and health calculation; add GitHubSecurityClient with methods to fetch Dependabot, Code Scanning, and Secret Scanning alerts and check rate limits; include test fixtures.
  • OpenSSF Scorecard Scraper & Client (scrapers/openssf/scorecard.go, scrapers/openssf/client.go, scrapers/openssf/compliance.go, scrapers/openssf/testdata/*.json)
    Add OpenSSFScorecardScraper, a ScorecardClient with retry/backoff and JSON parsing, compliance mappings for checks, caching/TTL logic, and scorecard test fixtures.
  • Scraper Registry & Common (scrapers/common.go)
    Register the new GithubSecurityScraper and OpenSSFScorecardScraper and add the openssf import (see the sketch after this list).
  • Fixtures (fixtures/github-security.yaml, fixtures/openssf.yaml)
    Add example ScrapeConfig fixtures for GitHub security and OpenSSF scorecard scans.
  • Context & Minor Changes (api/context.go, scrapers/aws/aws.go)
    Add a ScrapeContext.IsDebug() method; adjust ECS log verbosity and Route53 AliasTarget handling.
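
As a loose sketch of the registry change described in the Scraper Registry & Common entry above (the registry variable, the api.Scraper interface name, and the exact scraper type names are assumptions; scrapers/common.go is the authoritative source):

package scrapers

import (
	"github.com/flanksource/config-db/api"
	"github.com/flanksource/config-db/scrapers/github"
	"github.com/flanksource/config-db/scrapers/openssf"
)

// All is assumed here to be the existing scraper registry; the PR appends the two
// new scrapers alongside the ones already registered.
var All = []api.Scraper{
	// ...existing scrapers...
	github.GithubSecurityScraper{},
	openssf.OpenSSFScorecardScraper{},
}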

Sequence Diagram(s)

sequenceDiagram
    participant ScrapeCtx as Scrape Context
    participant GHS as GitHubSecurityScraper
    participant GSC as GitHubSecurityClient
    participant GHApi as GitHub API
    participant Results as ScrapeResults

    ScrapeCtx->>GHS: Scrape(ctx)
    GHS->>GHS: CanScrape(spec)
    GHS->>GSC: NewGitHubSecurityClient(config)
    loop For each repository
        GHS->>GSC: fetchAllAlerts(ctx)
        GSC->>GHApi: GetDependabotAlerts()
        GHApi-->>GSC: alerts
        GSC->>GHApi: GetCodeScanningAlerts()
        GHApi-->>GSC: alerts
        GSC->>GHApi: GetSecretScanningAlerts()
        GHApi-->>GSC: alerts
        GSC->>GSC: compute max updated time
        GHS->>GHS: calculateHealthStatus(counts)
        GHS->>Results: append ScrapeResult
    end
    GHS-->>ScrapeCtx: results
sequenceDiagram
    participant ScrapeCtx as Scrape Context
    participant OSS as OpenSSFScorecardScraper
    participant SC as ScorecardClient
    participant OpenSSFApi as OpenSSF API
    participant Results as ScrapeResults

    ScrapeCtx->>OSS: Scrape(ctx)
    OSS->>OSS: CanScrape(spec)
    OSS->>SC: NewScorecardClient(ctx)
    loop For each repository
        OSS->>OSS: check cache & TTL
        OSS->>SC: GetRepositoryScorecard(owner, repo)
        loop Retry up to 3 times
            SC->>OpenSSFApi: HTTP GET scorecard
            OpenSSFApi-->>SC: response or error
            SC->>SC: isRetryable(err)?
        end
        SC-->>OSS: ScorecardResponse
        OSS->>OSS: filter by MinScore & build analyses
        OSS->>Results: append ScrapeResult
    end
    OSS-->>ScrapeCtx: results

Possibly related PRs

Suggested reviewers

  • adityathebe
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: docstring coverage is 19.23%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check ✅ Passed: check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed: the title clearly summarizes the main change, adding GitHub and OpenSSF security scrapers. The changeset introduces new scraper types for GitHubSecurity and OpenSSFScorecard with supporting infrastructure across multiple files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch security-scrapers

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.


Comment @coderabbitai help to get the list of available commands and usage tips.

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 7

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
api/v1/types.go (1)

130-136: ApplyPlugin skips GitHubSecurity/OpenSSFScorecard.

New scraper specs won’t receive BaseScraper plugin settings, so plugin labels/transforms won’t apply.

🛠️ Proposed fix
 	for i := range spec.GithubActions {
 		spec.GithubActions[i].BaseScraper = spec.GithubActions[i].BaseScraper.ApplyPlugins(plugins...)
 	}
+
+	for i := range spec.GitHubSecurity {
+		spec.GitHubSecurity[i].BaseScraper = spec.GitHubSecurity[i].BaseScraper.ApplyPlugins(plugins...)
+	}
+
+	for i := range spec.OpenSSFScorecard {
+		spec.OpenSSFScorecard[i].BaseScraper = spec.OpenSSFScorecard[i].BaseScraper.ApplyPlugins(plugins...)
+	}
 
 	for i := range spec.Azure {
 		spec.Azure[i].BaseScraper = spec.Azure[i].BaseScraper.ApplyPlugins(plugins...)
 	}
🤖 Fix all issues with AI agents
In `@api/context.go`:
- Around line 161-167: IsDebug() and IsTrace() may dereference a nil
scrapeConfig; update both methods to guard against nil (and nil Spec if Spec is
a pointer) similarly to ScraperID(): check if ctx.scrapeConfig == nil (and
ctx.scrapeConfig.Spec == nil when applicable) and return false if nil, otherwise
call and return ctx.scrapeConfig.Spec.IsDebug() / IsTrace(); this prevents
panics when WithScrapeConfig() hasn't been called.

In `@api/v1/github_security.go`:
- Around line 43-52: ParseMaxAge currently calls time.ParseDuration which
doesn't accept the "d" day suffix; update GitHubSecurityFilters.ParseMaxAge to
detect a trailing "d" (case-insensitive), parse the numeric portion as an
integer or float representing days, convert to a time.Duration via days * 24 *
time.Hour, and return that; for all other inputs keep using time.ParseDuration
and propagate its error. Ensure the implementation handles empty MaxAge (already
returns 0,nil), returns a clear error on invalid numeric parsing, and references
the MaxAge field and ParseMaxAge method when making the change.

In `@api/v1/openssf.go`:
- Around line 10-11: The MinScore field currently uses *float64 which breaks
controller-gen; change MinScore to a string pointer (e.g., *string) and add
validation/tagging as needed in the struct, then update the scraper logic to
parse the string into a float64 (handle nil/parse errors) wherever MinScore is
consumed; alternatively, if you must keep a float, add the kubebuilder CRD
marker // +kubebuilder:validation:Type=number and ensure controller-gen is
invoked with crd:allowDangerousTypes=true—locate and update the MinScore field
in the api/v1/openssf.go struct for these changes (see the sketch after this list).

In `@scrapers/github/security_client.go`:
- Around line 121-133: GetSecurityAdvisories currently omits pagination so
results are limited to the API default; update the request options in
GetSecurityAdvisories to copy AlertListOptions.Page and AlertListOptions.PerPage
into the github.ListRepositorySecurityAdvisoriesOptions (alongside State) before
calling c.Client.SecurityAdvisories.ListRepositorySecurityAdvisories so callers
can control Page and PerPage like the other methods (GetDependabotAlerts,
GetCodeScanningAlerts, GetSecretScanningAlerts).

In `@scrapers/github/security.go`:
- Around line 94-106: ScrapeResult.CreatedAt is typed as *time.Time but the code
passes a time.Time from repo.CreatedAt.GetTime(); fix by creating a local
variable to hold the returned time (e.g., tm := repo.CreatedAt.GetTime()) and
assign CreatedAt to its address (CreatedAt: &tm) when constructing the
v1.ScrapeResult; update the construction in the block that builds the
ScrapeResult (v1.ScrapeResult / CreatedAt) so it uses the pointer to the local
time variable.

In `@scrapers/openssf/compliance.go`:
- Around line 10-11: The file-level comment above ComplianceMappings incorrectly
cites "plugins/vite-plugin-security.js" as the data source; update the comment
to reference the correct source (e.g., the OpenSSF Scorecard documentation or
the actual mapping source used in this repo) so it accurately documents where
the compliance mapping originates; locate the top comment near the
ComplianceMappings declaration in scrapers/openssf/compliance.go and replace the
incorrect reference with the proper source name/URL (or remove the irrelevant
Vite/Vue reference) while keeping the explanatory text about mapping OpenSSF
Scorecard checks to compliance framework controls.

In `@scrapers/openssf/scorecard.go`:
- Around line 274-276: The function scorecardCheckNameToKebab is a no-op (it
replaces "-" with "-" and is unused); either remove the function completely or
implement proper kebab-case conversion and use it where needed—e.g., transform
spaces/underscores to dashes and lowercase (apply to scorecardCheckNameToKebab),
or delete the scorecardCheckNameToKebab declaration and any references to it if
no conversion is required.
🧹 Nitpick comments (4)
scrapers/github/security.go (1)

291-296: createConfigInsights is still a FIXME.

Please implement or track via a follow-up issue so the alert details aren’t silently dropped.

scrapers/github/security_client.go (1)

59-79: Empty state string passed to API unconditionally.

When opts.State is empty, you're passing &opts.State (pointer to empty string) to the API. This may cause unexpected filtering behavior depending on how the GitHub API handles empty state values.

Suggested fix to conditionally set State
 func (c *GitHubSecurityClient) GetDependabotAlerts(ctx context.Context, opts AlertListOptions) ([]*github.DependabotAlert, *github.Response, error) {
 	reqOpts := &github.ListAlertsOptions{
-		State: &opts.State,
 		ListOptions: github.ListOptions{
 			Page:    opts.Page,
 			PerPage: opts.PerPage,
 		},
 	}
 
+	if opts.State != "" {
+		reqOpts.State = &opts.State
+	}
+
 	if opts.Severity != "" {
 		reqOpts.Severity = &opts.Severity
 	}
scrapers/openssf/scorecard.go (2)

74-84: Type assertion without safety check could panic.

Line 76 uses a direct type assertion lastScrape.(time.Time) without the comma-ok pattern. If a non-time.Time value is ever stored in the map (due to a bug or future change), this will panic.

Suggested fix for safer type assertion
 func shouldSkipScorecardFetch(ctx api.ScrapeContext, repoFullName string) (bool, string) {
 	if lastScrape, ok := LastScorecardScrapeTime.Load(repoFullName); ok {
-		lastTime := lastScrape.(time.Time)
+		lastTime, ok := lastScrape.(time.Time)
+		if !ok {
+			ctx.Logger.V(3).Infof("invalid cache entry for %s, will refetch", repoFullName)
+			return false, ""
+		}
 		timeSinceLastScrape := time.Since(lastTime)

139-143: Consider if score < 10 is too strict for "failing" status.

A score of 9/10 is marked as "failing", which may be overly strict. Most scoring systems consider 7+ or 8+ as passing. This creates inconsistency with createRepositoryProperties where score >= 7 counts as passing.

Consider aligning with the passing threshold used elsewhere
-		if check.Score < 10 {
+		if check.Score < 7 {
 			analysis.Status = "failing"
 		} else {
 			analysis.Status = "passing"
 		}

Comment thread api/context.go Outdated
Comment on lines +161 to +167
func (ctx ScrapeContext) IsTrace() bool {
	return ctx.scrapeConfig.Spec.IsTrace()
}

func (ctx ScrapeContext) IsDebug() bool {
	return ctx.scrapeConfig.Spec.IsDebug()
}
Contributor

⚠️ Potential issue | 🟠 Major

Potential nil pointer dereference in IsDebug() and IsTrace().

Both methods access ctx.scrapeConfig.Spec without checking if scrapeConfig is nil. Other methods like ScraperID() (lines 144-149) perform this nil check. If IsDebug() or IsTrace() is called before WithScrapeConfig(), this will panic.

Suggested fix for IsDebug (same pattern should be applied to IsTrace)
 func (ctx ScrapeContext) IsDebug() bool {
+	if ctx.scrapeConfig == nil {
+		return false
+	}
 	return ctx.scrapeConfig.Spec.IsDebug()
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 func (ctx ScrapeContext) IsTrace() bool {
 	return ctx.scrapeConfig.Spec.IsTrace()
 }
 
-func (ctx ScrapeContext) IsDebug() bool {
-	return ctx.scrapeConfig.Spec.IsDebug()
-}
+func (ctx ScrapeContext) IsDebug() bool {
+	if ctx.scrapeConfig == nil {
+		return false
+	}
+	return ctx.scrapeConfig.Spec.IsDebug()
+}
🤖 Prompt for AI Agents
In `@api/context.go` around lines 161 - 167, IsDebug() and IsTrace() may
dereference a nil scrapeConfig; update both methods to guard against nil (and
nil Spec if Spec is a pointer) similarly to ScraperID(): check if
ctx.scrapeConfig == nil (and ctx.scrapeConfig.Spec == nil when applicable) and
return false if nil, otherwise call and return ctx.scrapeConfig.Spec.IsDebug() /
IsTrace(); this prevents panics when WithScrapeConfig() hasn't been called.

Comment thread api/v1/github_security.go Outdated
Comment on lines +43 to +52
	// MaxAge filters alerts by age (e.g., "90d", "30d")
	MaxAge string `yaml:"maxAge,omitempty" json:"maxAge,omitempty"`
}

// ParseMaxAge converts the MaxAge string to a time.Duration
func (f GitHubSecurityFilters) ParseMaxAge() (time.Duration, error) {
	if f.MaxAge == "" {
		return 0, nil
	}
	return time.ParseDuration(f.MaxAge)
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

Go time.ParseDuration supported time units documentation

💡 Result:

In Go’s standard library, time.ParseDuration accepts these unit suffixes (as documented):

  • ns
  • us (or µs)
  • ms
  • s
  • m
  • h (hours)

Docs: time.ParseDuration on pkg.go.dev/time (see “Valid time units…”). [1]

Examples: 300ms, -1.5h, 2h45m, 1µs / 1us. [1]

Sources

[1] https://pkg.go.dev/time#ParseDuration


🏁 Script executed:

# Search for MaxAge documentation examples in the codebase
rg -A 5 -B 5 "MaxAge" --type go | head -100

Repository: flanksource/config-db

Length of output: 186


🏁 Script executed:

# Specifically look for "90d" or "30d" examples in comments or strings
rg "(90d|30d)"

Repository: flanksource/config-db

Length of output: 186


🏁 Script executed:

fd github_security.go --exec cat {}

Repository: flanksource/config-db

Length of output: 186


ParseMaxAge cannot handle day units—time.ParseDuration only accepts ns, us, ms, s, m, h.

If the documentation or code examples suggest using "90d" or "30d", they will fail. Either update examples to hours (e.g., 2160h for 90 days) or implement custom day parsing.

🛠️ Proposed fix (support "d" suffix)
 import (
+	"strconv"
+	"strings"
 	"time"
 
 	"github.com/flanksource/duty/types"
 )
@@
 func (f GitHubSecurityFilters) ParseMaxAge() (time.Duration, error) {
 	if f.MaxAge == "" {
 		return 0, nil
 	}
+	if strings.HasSuffix(f.MaxAge, "d") {
+		daysStr := strings.TrimSuffix(f.MaxAge, "d")
+		days, err := strconv.Atoi(daysStr)
+		if err != nil {
+			return 0, err
+		}
+		return time.Hour * 24 * time.Duration(days), nil
+	}
 	return time.ParseDuration(f.MaxAge)
 }
🤖 Prompt for AI Agents
In `@api/v1/github_security.go` around lines 43 - 52, ParseMaxAge currently calls
time.ParseDuration which doesn't accept the "d" day suffix; update
GitHubSecurityFilters.ParseMaxAge to detect a trailing "d" (case-insensitive),
parse the numeric portion as an integer or float representing days, convert to a
time.Duration via days * 24 * time.Hour, and return that; for all other inputs
keep using time.ParseDuration and propagate its error. Ensure the implementation
handles empty MaxAge (already returns 0,nil), returns a clear error on invalid
numeric parsing, and references the MaxAge field and ParseMaxAge method when
making the change.

Comment thread api/v1/openssf.go Outdated
Comment thread scrapers/github/security_client.go Outdated
Comment on lines +121 to +133
// GetSecurityAdvisories fetches repository security advisories
func (c *GitHubSecurityClient) GetSecurityAdvisories(ctx context.Context, opts AlertListOptions) ([]*github.SecurityAdvisory, *github.Response, error) {
	reqOpts := &github.ListRepositorySecurityAdvisoriesOptions{
		State: opts.State,
	}

	advisories, resp, err := c.Client.SecurityAdvisories.ListRepositorySecurityAdvisories(ctx, c.owner, c.repo, reqOpts)
	if err != nil {
		return nil, resp, fmt.Errorf("failed to fetch security advisories: %w", err)
	}

	return advisories, resp, nil
}
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

go-github ListRepositorySecurityAdvisoriesOptions pagination support

💡 Result:

go-github’s ListRepositorySecurityAdvisoriesOptions supports offset-based pagination the same way most “List…” calls do: via embedded github.ListOptions (page / per_page). ListOptions is the standard pagination struct in go-github and defines Page and PerPage. [1]

So you paginate like:

  • set opt.ListOptions.PerPage (and opt.ListOptions.Page starting at 1)
  • call ListRepositorySecurityAdvisories(...)
  • read resp.NextPage
  • loop until resp.NextPage == 0

This is the documented go-github pagination pattern in general. [2]
And ListRepositorySecurityAdvisoriesOptions is present as an options type alongside the advisory-listing methods. [3]

Sources
[1] (pkg.go.dev)
[2] (sources.debian.org)
[3] (pkg.go.dev)



🏁 Script executed:

# First, let's examine the file to see the full context and compare with other alert methods
find . -name "security_client.go" -path "*/github/*" | head -5

Repository: flanksource/config-db

Length of output: 101


🏁 Script executed:

# Let's also check if there's a go.mod to see which version of go-github is being used
find . -name "go.mod" | head -1 | xargs cat | grep github

Repository: flanksource/config-db

Length of output: 22311


🏁 Script executed:

cat -n ./scrapers/github/security_client.go

Repository: flanksource/config-db

Length of output: 6316


🏁 Script executed:

# Also search for other alert-related methods to compare pagination patterns
grep -n "func.*Alert\|func.*Advisory" ./scrapers/github/security_client.go

Repository: flanksource/config-db

Length of output: 681


Add pagination support to GetSecurityAdvisories.

GetSecurityAdvisories is the only alert method missing ListOptions pagination. All other methods (GetDependabotAlerts, GetCodeScanningAlerts, GetSecretScanningAlerts) include Page and PerPage from AlertListOptions. Without pagination, results will be limited to the API's default page size.

Suggested fix
 func (c *GitHubSecurityClient) GetSecurityAdvisories(ctx context.Context, opts AlertListOptions) ([]*github.SecurityAdvisory, *github.Response, error) {
 	reqOpts := &github.ListRepositorySecurityAdvisoriesOptions{
 		State: opts.State,
+		ListOptions: github.ListOptions{
+			Page:    opts.Page,
+			PerPage: opts.PerPage,
+		},
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 // GetSecurityAdvisories fetches repository security advisories
 func (c *GitHubSecurityClient) GetSecurityAdvisories(ctx context.Context, opts AlertListOptions) ([]*github.SecurityAdvisory, *github.Response, error) {
 	reqOpts := &github.ListRepositorySecurityAdvisoriesOptions{
 		State: opts.State,
+		ListOptions: github.ListOptions{
+			Page:    opts.Page,
+			PerPage: opts.PerPage,
+		},
 	}
 
 	advisories, resp, err := c.Client.SecurityAdvisories.ListRepositorySecurityAdvisories(ctx, c.owner, c.repo, reqOpts)
 	if err != nil {
 		return nil, resp, fmt.Errorf("failed to fetch security advisories: %w", err)
 	}
 
 	return advisories, resp, nil
 }
🤖 Prompt for AI Agents
In `@scrapers/github/security_client.go` around lines 121 - 133,
GetSecurityAdvisories currently omits pagination so results are limited to the
API default; update the request options in GetSecurityAdvisories to copy
AlertListOptions.Page and AlertListOptions.PerPage into the
github.ListRepositorySecurityAdvisoriesOptions (alongside State) before calling
c.Client.SecurityAdvisories.ListRepositorySecurityAdvisories so callers can
control Page and PerPage like the other methods (GetDependabotAlerts,
GetCodeScanningAlerts, GetSecretScanningAlerts).

Comment thread scrapers/github/security.go Outdated
Comment on lines +94 to +106
result := v1.ScrapeResult{
	Type:        ConfigTypeGitHubSecurityRepo,
	ID:          fmt.Sprintf("github-security/%s", repoFullName),
	Name:        repoFullName,
	ConfigClass: "Security",
	Config:      repo,
	Tags: v1.Tags{
		{Name: "owner", Value: repoConfig.Owner},
		{Name: "repo", Value: repoConfig.Repo},
		{Name: "security", Value: "true"},
	},
	CreatedAt:  repo.CreatedAt.GetTime(),
	Properties: properties,
Contributor

⚠️ Potential issue | 🔴 Critical

Fix CreatedAt type mismatch.

ScrapeResult.CreatedAt expects *time.Time, but repo.CreatedAt.GetTime() returns time.Time.

🐛 Proposed fix
-	result := v1.ScrapeResult{
+	createdAt := repo.GetCreatedAt()
+	result := v1.ScrapeResult{
 		Type:        ConfigTypeGitHubSecurityRepo,
 		ID:          fmt.Sprintf("github-security/%s", repoFullName),
 		Name:        repoFullName,
 		ConfigClass: "Security",
 		Config:      repo,
 		Tags: v1.Tags{
 			{Name: "owner", Value: repoConfig.Owner},
 			{Name: "repo", Value: repoConfig.Repo},
 			{Name: "security", Value: "true"},
 		},
-		CreatedAt:  repo.CreatedAt.GetTime(),
+		CreatedAt:  &createdAt,
 		Properties: properties,
 	}
🤖 Prompt for AI Agents
In `@scrapers/github/security.go` around lines 94 - 106, ScrapeResult.CreatedAt is
typed as *time.Time but the code passes a time.Time from
repo.CreatedAt.GetTime(); fix by creating a local variable to hold the returned
time (e.g., tm := repo.CreatedAt.GetTime()) and assign CreatedAt to its address
(CreatedAt: &tm) when constructing the v1.ScrapeResult; update the construction
in the block that builds the ScrapeResult (v1.ScrapeResult / CreatedAt) so it
uses the pointer to the local time variable.

Comment thread scrapers/openssf/compliance.go Outdated
Comment on lines +10 to +11
// ComplianceMappings maps each OpenSSF Scorecard check to compliance framework controls
// Based on DataSheet reference: plugins/vite-plugin-security.js
Contributor

⚠️ Potential issue | 🟡 Minor

Incorrect source reference in documentation comment.

The comment references "plugins/vite-plugin-security.js" which appears to be from a different project (JavaScript/Vite). This should reference the actual data source for the compliance mappings.

Suggested fix
 // ComplianceMappings maps each OpenSSF Scorecard check to compliance framework controls
-// Based on DataSheet reference: plugins/vite-plugin-security.js
+// Based on OpenSSF Scorecard checks documentation
 var ComplianceMappings = map[string]ComplianceMapping{
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 // ComplianceMappings maps each OpenSSF Scorecard check to compliance framework controls
-// Based on DataSheet reference: plugins/vite-plugin-security.js
+// Based on OpenSSF Scorecard checks documentation
 var ComplianceMappings = map[string]ComplianceMapping{
🤖 Prompt for AI Agents
In `@scrapers/openssf/compliance.go` around lines 10 - 11, The file-level comment
above ComplianceMappings incorrectly cites "plugins/vite-plugin-security.js" as
the data source; update the comment to reference the correct source (e.g., the
OpenSSF Scorecard documentation or the actual mapping source used in this repo)
so it accurately documents where the compliance mapping originates; locate the
top comment near the ComplianceMappings declaration in
scrapers/openssf/compliance.go and replace the incorrect reference with the
proper source name/URL (or remove the irrelevant Vite/Vue reference) while
keeping the explanatory text about mapping OpenSSF Scorecard checks to
compliance framework controls.

Comment thread scrapers/openssf/scorecard.go Outdated
Comment on lines +274 to +276
func scorecardCheckNameToKebab(name string) string {
	return strings.ToLower(strings.ReplaceAll(name, "-", "-"))
}
Contributor

⚠️ Potential issue | 🟡 Minor

Dead code: scorecardCheckNameToKebab is a no-op.

This function replaces "-" with "-", which has no effect. It's also unused in the file. Remove it or fix the implementation if kebab-case conversion is needed.

Remove unused no-op function
-func scorecardCheckNameToKebab(name string) string {
-	return strings.ToLower(strings.ReplaceAll(name, "-", "-"))
-}

Or if actual conversion is needed (e.g., spaces/underscores to dashes):

func scorecardCheckNameToKebab(name string) string {
	return strings.ToLower(strings.ReplaceAll(name, " ", "-"))
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
-func scorecardCheckNameToKebab(name string) string {
-	return strings.ToLower(strings.ReplaceAll(name, "-", "-"))
-}
🤖 Prompt for AI Agents
In `@scrapers/openssf/scorecard.go` around lines 274 - 276, The function
scorecardCheckNameToKebab is a no-op (it replaces "-" with "-" and is unused);
either remove the function completely or implement proper kebab-case conversion
and use it where needed—e.g., transform spaces/underscores to dashes and
lowercase (apply to scorecardCheckNameToKebab), or delete the
scorecardCheckNameToKebab declaration and any references to it if no conversion
is required.

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 11

🤖 Fix all issues with AI agents
In `@api/context.go`:
- Around line 165-167: The IsDebug method on ScrapeContext can panic if
scrapeConfig is nil; update ScrapeContext.IsDebug() to guard against a nil
ctx.scrapeConfig and return false when unset, otherwise return
ctx.scrapeConfig.Spec.IsDebug(); reference the ScrapeContext type and IsDebug()
method and ensure compatibility with NewScrapeContext which may produce a
context without scrapeConfig.

In `@api/v1/github_security.go`:
- Around line 47-53: The ParseMaxAge method in GitHubSecurityFilters currently
calls time.ParseDuration which rejects "90d"/"30d" as documented; change
ParseMaxAge to accept a trailing 'd' by detecting a suffix of 'd'
(case-insensitive), parsing the numeric portion as an integer/float, converting
days to hours (days * 24h) and returning time.Duration (or fall back to
time.ParseDuration for other valid suffixes), and ensure errors are returned for
invalid inputs; update logic inside ParseMaxAge to handle both "Nd" inputs and
standard time.ParseDuration-compatible strings and reference the
GitHubSecurityFilters.ParseMaxAge function when locating the change.

In `@api/v1/interface.go`:
- Around line 815-831: The PrettyRow method (ScrapeResult.PrettyRow) currently
performs unconditional printing via clicky.MustPrint and accepts an unused
options parameter; remove the two clicky.MustPrint calls so the method only
constructs and returns the row map, and drop the unused options parameter from
the function signature (or if you prefer to keep it, ensure it's used) so
callers control printing/formatting. Update any call sites of
ScrapeResult.PrettyRow to handle printing/formatting themselves (e.g., call
clicky.MustPrint with desired FormatOptions where needed).

In `@api/v1/openssf.go`:
- Around line 10-11: The MinScore field currently uses *float64 which breaks CRD
generation; change MinScore to a safer type (e.g., *string or *int) and update
any callers/serialization accordingly: replace the declaration of MinScore
(*float64) with either *string (to store "7.5") or *int representing a scaled
value (e.g., 75 for 0.0–10.0 scaled to 0–100), keep the existing yaml/json tags,
and adjust validation/usage logic that reads MinScore (parsing string to float
or dividing the int by the scale) so consumers like validation, defaults, and
API docs remain correct; if you must keep a numeric type, add the kubebuilder
marker // +kubebuilder:validation:Type=number and rerun controller-gen with
crd:allowDangerousTypes=true (not recommended).

In `@api/v1/types.go`:
- Around line 71-74: The ScraperSpec additions (fields GitHubSecurity and
OpenSSFScorecard) are not wired into AllScraperConfigs and ApplyPlugin; update
the AllScraperConfigs map to include entries "githubsecurity": GitHubSecurity{}
and "openssfscorecard": OpenSSFScorecard{}, and modify the ApplyPlugin method to
iterate over s.GitHubSecurity and s.OpenSSFScorecard and call ApplyPlugin on
each item (mirroring the existing loops for GithubActions and Azure) so those
plugin configs are applied; reference the AllScraperConfigs map and the
ApplyPlugin method and the types GitHubSecurity and OpenSSFScorecard when
locating code to change.

In `@cmd/run.go`:
- Around line 108-125: The current code always exports/prints regardless of the
--export flag; update the branch to honor the boolean export flag (e.g.,
exportFlag or export) so the export path only runs when exportFlag is true and
the print path runs when exportFlag is false; specifically, wrap the export loop
that calls exportResource(results, outputDir) and the
logger.Infof("Exported...") inside if exportFlag { ... } and move the
clicky.MustPrint(all.*) calls into the corresponding else branch, and also
validate that when exportFlag is true an outputDir is provided (return a clear
error if empty) to avoid silent failures.

In `@scrapers/github/security_client.go`:
- Around line 23-48: In NewGitHubSecurityClient, after resolving token (the
block that sets token from ctx.HydrateConnectionByURL or
ctx.GetEnvValueFromCache), add a check that token is not empty and return a
descriptive error if it is; for example, if token == "" return
fmt.Errorf("missing GitHub token: no connection %q and no personalAccessToken
provided", config.ConnectionName or config.PersonalAccessToken) so the function
fails fast instead of creating a client that will 401 on API calls.
- Around line 121-133: GetSecurityAdvisories ignores pagination and always
fetches the first page; set the ListRepositorySecurityAdvisoriesOptions to
include the requested page and per-page values from AlertListOptions (map
opts.Page and opts.PerPage into the options used for the API call). Update the
reqOpts used in GetSecurityAdvisories to populate the ListOptions/Page and
PerPage fields (same approach as
GetDependabotAlerts/GetCodeScanningAlerts/GetSecretScanningAlerts) so the call
to c.Client.SecurityAdvisories.ListRepositorySecurityAdvisories honors
pagination.

In `@scrapers/github/security.go`:
- Around line 59-62: The log call is formatting with "%d" but is concatenating
the alert-count strings, so compute an integer total first by parsing the
individual counts from result.Properties.Find("Critical Alerts").Text, "High
Alerts", "Medium Alerts", and "Low Alerts" (use strconv.Atoi, handle errors
defaulting to 0), sum them into a total int, then call
ctx.Logger.V(2).Infof("successfully scraped %s: %d total alerts", repoFullName,
total) instead of concatenating the Text values (a sketch follows this list).

In `@scrapers/processors/json.go`:
- Around line 599-607: The validation currently checks only ExternalChangeID for
standalone inputs but not ExternalID; update the validator in the same block
that inspects input.Changes (near the lo.Filter usage) to also check for any
change where ExternalID == "" when input.ID is empty (standalone change). Use
the same pattern (e.g., lo.Filter or an additional check) referencing
input.Changes, ExternalID, ExternalChangeID and input.ID, and return the same
formatted error (including e.Config and input.Debug()) when either required
field is missing (a sketch follows this list).

In `@scrapers/processors/script.go`:
- Around line 33-36: The current logger.Infof call in the json.Unmarshal error
block logs the entire script output variable s (risking secrets/PII and large
dumps); update the error handling around json.Unmarshal to avoid printing the
full s — instead log the error and either a safely truncated sample (e.g., first
N characters with "…(truncated)") and the total length, or a redacted/hashed
summary of s, and consider simple redaction for common secrets (tokens,
passwords) before including any sample; modify the logger.Infof invocation that
references err and s to use the new truncated/redacted sample and length-only
information (a sketch follows this list).
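
A small, self-contained sketch of the count summation suggested for scrapers/github/security.go above; the property names come from the prompt, while the map and helper are stand-ins for result.Properties.Find(...).Text:

package main

import (
	"fmt"
	"strconv"
)

// parseCount mirrors the suggested strconv.Atoi parsing, defaulting to 0 on error.
func parseCount(text string) int {
	n, err := strconv.Atoi(text)
	if err != nil {
		return 0
	}
	return n
}

func main() {
	// Stand-ins for result.Properties.Find("Critical Alerts").Text and friends.
	counts := map[string]string{
		"Critical Alerts": "2",
		"High Alerts":     "5",
		"Medium Alerts":   "n/a", // parse errors default to 0
		"Low Alerts":      "1",
	}

	total := 0
	for _, name := range []string{"Critical Alerts", "High Alerts", "Medium Alerts", "Low Alerts"} {
		total += parseCount(counts[name])
	}

	// The real call would be ctx.Logger.V(2).Infof("successfully scraped %s: %d total alerts", repoFullName, total).
	fmt.Printf("successfully scraped %s: %d total alerts\n", "example-org/example-repo", total)
}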
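
An illustrative, standalone version of the ExternalID check suggested for scrapers/processors/json.go above; the struct fields mirror the names in the prompt, but the types and error text are stand-ins for the real ones in that file:

package main

import (
	"fmt"

	"github.com/samber/lo"
)

// Stand-ins for the real change/result types referenced in the prompt.
type change struct {
	ExternalID       string
	ExternalChangeID string
}

type scrapeResult struct {
	ID      string
	Changes []change
}

// validateChanges mirrors the suggested rule: when the result itself has no ID,
// every standalone change must carry an ExternalID (alongside the existing
// ExternalChangeID requirement).
func validateChanges(r scrapeResult) error {
	missing := lo.Filter(r.Changes, func(c change, _ int) bool {
		return c.ExternalID == ""
	})
	if r.ID == "" && len(missing) > 0 {
		return fmt.Errorf("%d standalone change(s) are missing externalID", len(missing))
	}
	return nil
}

func main() {
	r := scrapeResult{Changes: []change{{ExternalChangeID: "chg-1"}}}
	fmt.Println(validateChanges(r)) // prints the validation error
}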
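
A sketch of the bounded logging suggested for scrapers/processors/script.go above; the helper name and the 24-character limit are arbitrary choices, not the PR's code:

package main

import (
	"encoding/json"
	"fmt"
)

// truncateForLog returns at most max bytes of s, annotating how much was cut.
func truncateForLog(s string, max int) string {
	if len(s) <= max {
		return s
	}
	return fmt.Sprintf("%s…(truncated, %d bytes total)", s[:max], len(s))
}

func main() {
	// Stand-in for the script output that failed to unmarshal; deliberately malformed JSON.
	s := `{"status": "ok", "items": [1, 2, 3`

	var out map[string]any
	if err := json.Unmarshal([]byte(s), &out); err != nil {
		// Log the error plus a bounded sample instead of the whole output.
		fmt.Printf("script output is not valid JSON: %v; sample: %s\n", err, truncateForLog(s, 24))
		return
	}
	fmt.Println("parsed:", out)
}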
🧹 Nitpick comments (4)
api/v1/scrapeconfig_types.go (1)

70-76: Potential ambiguity with ScraperSpec.IsDebug() and ScraperSpec.IsTrace() methods.

ScrapeConfig.IsDebug() checks annotations while ScraperSpec.IsDebug() (in types.go lines 178-184) checks the LogLevel field. Both methods exist on related types with the same name but different behavior.

Consider documenting or clarifying the distinction, or using different method names to avoid confusion when both are accessible through the context chain.

scrapers/openssf/compliance.go (1)

110-115: Return defensive copies from GetComplianceMappings.

The returned slices share backing arrays with the global map; external mutation can alter shared mappings. Consider returning copies to keep mappings immutable (and optionally keep the map unexported if it’s not required externally).

♻️ Suggested adjustment
 func GetComplianceMappings(checkName string) ComplianceMapping {
 	if mapping, ok := ComplianceMappings[checkName]; ok {
-		return mapping
+		return ComplianceMapping{
+			SOC2:     append([]string(nil), mapping.SOC2...),
+			NISTSSDF: append([]string(nil), mapping.NISTSSDF...),
+			CIS:      append([]string(nil), mapping.CIS...),
+		}
 	}
 	return ComplianceMapping{}
 }
scrapers/openssf/client.go (2)

72-104: Respect context cancellation during retry backoff.

Line 82–86 sleeps unconditionally; if ctx is canceled, callers still block through the backoff. Consider a select on ctx.Done() to exit promptly.

♻️ Proposed fix (ctx-aware backoff)
 	for attempt := 0; attempt < maxRetries; attempt++ {
 		if attempt > 0 {
 			backoff := time.Duration(attempt*attempt) * time.Second
 			c.ctx.Logger.V(3).Infof("retrying scorecard request for %s (attempt %d/%d) after %v",
 				repoFullName, attempt+1, maxRetries, backoff)
-			time.Sleep(backoff)
+			timer := time.NewTimer(backoff)
+			select {
+			case <-ctx.Done():
+				timer.Stop()
+				return nil, ctx.Err()
+			case <-timer.C:
+			}
 		}

144-168: Retry classification is brittle; prefer typed errors and strings.Contains.

Line 149–152 relies on substring matching, and contains/findSubstring reimplements stdlib logic. Consider using errors.Is/As with net.Error for timeouts and strings.Contains for server errors, and drop custom helpers.

♻️ Proposed fix (typed errors + stdlib Contains)
 import (
 	"context"
 	"encoding/json"
+	"errors"
 	"fmt"
 	"io"
+	"net"
 	"net/http"
+	"strings"
 	"time"

 	"github.com/flanksource/config-db/api"
 )
@@
 func (c *ScorecardClient) isRetryable(err error) bool {
 	if err == nil {
 		return false
 	}
-
-	errStr := err.Error()
-	return contains(errStr, "server error") ||
-	       contains(errStr, "timeout") ||
-	       contains(errStr, "connection")
+	if errors.Is(err, context.DeadlineExceeded) {
+		return true
+	}
+	var netErr net.Error
+	if errors.As(err, &netErr) && (netErr.Timeout() || netErr.Temporary()) {
+		return true
+	}
+	return strings.Contains(err.Error(), "server error")
 }
-
-func contains(s, substr string) bool {
-	return len(s) >= len(substr) && (s == substr || len(s) > len(substr) &&
-		(s[:len(substr)] == substr || s[len(s)-len(substr):] == substr ||
-		 findSubstring(s, substr)))
-}
-
-func findSubstring(s, substr string) bool {
-	for i := 0; i <= len(s)-len(substr); i++ {
-		if s[i:i+len(substr)] == substr {
-			return true
-		}
-	}
-	return false
-}

Comment thread api/context.go Outdated
Comment on lines +165 to +167
func (ctx ScrapeContext) IsDebug() bool {
	return ctx.scrapeConfig.Spec.IsDebug()
}
Contributor

⚠️ Potential issue | 🟡 Minor

Guard against nil ScrapeConfig.
NewScrapeContext can produce a context without scrapeConfig; calling IsDebug() would panic. Return false when unset.

🐛 Proposed fix
 func (ctx ScrapeContext) IsDebug() bool {
-	return ctx.scrapeConfig.Spec.IsDebug()
+	if ctx.scrapeConfig == nil {
+		return false
+	}
+	return ctx.scrapeConfig.Spec.IsDebug()
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 func (ctx ScrapeContext) IsDebug() bool {
+	if ctx.scrapeConfig == nil {
+		return false
+	}
 	return ctx.scrapeConfig.Spec.IsDebug()
 }
🤖 Prompt for AI Agents
In `@api/context.go` around lines 165 - 167, The IsDebug method on ScrapeContext
can panic if scrapeConfig is nil; update ScrapeContext.IsDebug() to guard
against a nil ctx.scrapeConfig and return false when unset, otherwise return
ctx.scrapeConfig.Spec.IsDebug(); reference the ScrapeContext type and IsDebug()
method and ensure compatibility with NewScrapeContext which may produce a
context without scrapeConfig.

Comment thread api/v1/github_security.go Outdated
Comment on lines +47 to +53
// ParseMaxAge converts the MaxAge string to a time.Duration
func (f GitHubSecurityFilters) ParseMaxAge() (time.Duration, error) {
	if f.MaxAge == "" {
		return 0, nil
	}
	return time.ParseDuration(f.MaxAge)
}
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, locate and examine the github_security.go file
find . -name "github_security.go" -type f

Repository: flanksource/config-db

Length of output: 186


🏁 Script executed:

# Check if there are any documentation files or tests related to ParseMaxAge
fd -e md -e txt | head -20
git ls-files | grep -i security

Repository: flanksource/config-db

Length of output: 567


🏁 Script executed:

# Search for references to "90d" and "30d" in the codebase
rg "90d|30d" -i

Repository: flanksource/config-db

Length of output: 15027


🏁 Script executed:

# Read the github_security.go file to see the actual implementation
cat -n api/v1/github_security.go | head -60

Repository: flanksource/config-db

Length of output: 2261


🏁 Script executed:

# Search for tests related to ParseMaxAge
fd -e go -type f | xargs grep -l "ParseMaxAge" 2>/dev/null

Repository: flanksource/config-db

Length of output: 237


🌐 Web query:

Go time.ParseDuration supported format "d" day suffix

💡 Result:

Go’s standard library time.ParseDuration does not support a d (day) suffix—the only valid unit suffixes are ns, us (or µs), ms, s, m, h. [1]

The time package also intentionally provides no Day constant (“to avoid confusion across daylight savings time zone transitions”). [1]

What to do instead

  • If you mean exact 24-hour blocks, use hours: 24h, 48h, 720h (for 30 days), etc. [1]
  • If you mean calendar days (variable-length due to DST), use date math like t.AddDate(0, 0, days) rather than time.Duration.

🏁 Script executed:

# Search for tests of ParseMaxAge or any usages of it
rg "ParseMaxAge" -i -A 3 -B 3

Repository: flanksource/config-db

Length of output: 511


🏁 Script executed:

# Check if there are any existing tests for github_security
fd -name "*github*security*test*" -o -name "*test*github*security*"

Repository: flanksource/config-db

Length of output: 295


🏁 Script executed:

# Also check fixtures that might use MaxAge
grep -r "maxAge\|MaxAge" --include="*.yaml" --include="*.yml" --include="*.json"

Repository: flanksource/config-db

Length of output: 458


ParseMaxAge rejects the documented 90d/30d inputs.

Line 43 documents examples "90d" and "30d", but line 52 uses time.ParseDuration, which doesn't accept a d suffix. The only supported unit suffixes are ns, us, µs, ms, s, m, h. Either update the documentation to reflect supported formats (e.g., "2160h" for 90 days, "720h" for 30 days) or implement custom parsing to support the d suffix.

🤖 Prompt for AI Agents
In `@api/v1/github_security.go` around lines 47 - 53, The ParseMaxAge method in
GitHubSecurityFilters currently calls time.ParseDuration which rejects
"90d"/"30d" as documented; change ParseMaxAge to accept a trailing 'd' by
detecting a suffix of 'd' (case-insensitive), parsing the numeric portion as an
integer/float, converting days to hours (days * 24h) and returning time.Duration
(or fall back to time.ParseDuration for other valid suffixes), and ensure errors
are returned for invalid inputs; update logic inside ParseMaxAge to handle both
"Nd" inputs and standard time.ParseDuration-compatible strings and reference the
GitHubSecurityFilters.ParseMaxAge function when locating the change.

Comment thread api/v1/interface.go Outdated
Comment thread api/v1/openssf.go Outdated
Comment thread api/v1/types.go Outdated
Comment on lines +71 to +74
	GithubActions    []GitHubActions    `json:"githubActions,omitempty" yaml:"githubActions,omitempty"`
	GitHubSecurity   []GitHubSecurity   `json:"githubSecurity,omitempty" yaml:"githubSecurity,omitempty"`
	OpenSSFScorecard []OpenSSFScorecard `json:"openssfScorecard,omitempty" yaml:"openssfScorecard,omitempty"`
	Azure            []Azure            `json:"azure,omitempty" yaml:"azure,omitempty"`
Contributor

⚠️ Potential issue | 🟠 Major

Missing entries in AllScraperConfigs map and ApplyPlugin method.

The new GitHubSecurity and OpenSSFScorecard fields are added to ScraperSpec, but the corresponding entries are missing from:

  1. AllScraperConfigs map (lines 15-31) - should include "githubsecurity": GitHubSecurity{} and "openssfscorecard": OpenSSFScorecard{}
  2. ApplyPlugin method (lines 97-167) - should include plugin application loops for both new types, similar to other scrapers
Proposed additions

Add to AllScraperConfigs map:

 var AllScraperConfigs = map[string]any{
 	"aws":            AWS{},
 	"azure":          Azure{},
 	"azuredevops":    AzureDevops{},
 	"exec":           Exec{},
 	"file":           File{},
 	"gcp":            GCP{},
 	"githubactions":  GitHubActions{},
+	"githubsecurity": GitHubSecurity{},
 	"http":           HTTP{},
 	"kubernetes":     Kubernetes{},
 	"kubernetesfile": KubernetesFile{},
 	"logs":           Logs{},
+	"openssfscorecard": OpenSSFScorecard{},
 	"slack":          Slack{},
 	"sql":            SQL{},
 	"terraform":      Terraform{},
 	"trivy":          Trivy{},
 }

Add to ApplyPlugin method (after line 132):

 	for i := range spec.GithubActions {
 		spec.GithubActions[i].BaseScraper = spec.GithubActions[i].BaseScraper.ApplyPlugins(plugins...)
 	}
+
+	for i := range spec.GitHubSecurity {
+		spec.GitHubSecurity[i].BaseScraper = spec.GitHubSecurity[i].BaseScraper.ApplyPlugins(plugins...)
+	}
+
+	for i := range spec.OpenSSFScorecard {
+		spec.OpenSSFScorecard[i].BaseScraper = spec.OpenSSFScorecard[i].BaseScraper.ApplyPlugins(plugins...)
+	}
🤖 Prompt for AI Agents
In `@api/v1/types.go` around lines 71 - 74, The ScraperSpec additions (fields
GitHubSecurity and OpenSSFScorecard) are not wired into AllScraperConfigs and
ApplyPlugin; update the AllScraperConfigs map to include entries
"githubsecurity": GitHubSecurity{} and "openssfscorecard": OpenSSFScorecard{},
and modify the ApplyPlugin method to iterate over s.GitHubSecurity and
s.OpenSSFScorecard and call ApplyPlugin on each item (mirroring the existing
loops for GithubActions and Azure) so those plugin configs are applied;
reference the AllScraperConfigs map and the ApplyPlugin method and the types
GitHubSecurity and OpenSSFScorecard when locating code to change.

Comment thread scrapers/github/security_client.go Outdated
Comment on lines +23 to +48
func NewGitHubSecurityClient(ctx api.ScrapeContext, config v1.GitHubSecurity, owner, repo string) (*GitHubSecurityClient, error) {
	var token string
	if connection, err := ctx.HydrateConnectionByURL(config.ConnectionName); err != nil {
		return nil, fmt.Errorf("failed to hydrate connection: %w", err)
	} else if connection != nil {
		token = connection.Password
	} else {
		token, err = ctx.GetEnvValueFromCache(config.PersonalAccessToken, ctx.Namespace())
		if err != nil {
			return nil, fmt.Errorf("failed to get token: %w", err)
		}
	}

	ts := oauth2.StaticTokenSource(
		&oauth2.Token{AccessToken: token},
	)
	tc := oauth2.NewClient(ctx, ts)
	client := github.NewClient(tc)

	return &GitHubSecurityClient{
		ScrapeContext: ctx,
		Client:        client,
		owner:         owner,
		repo:          repo,
	}, nil
}
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

GitHub REST API security alerts code scanning secret scanning authentication requirements

💡 Result:

GitHub REST API: security alerts (code scanning, secret scanning, Dependabot) — auth & access requirements

1) Token types you can use

  • OAuth App token or PAT (classic): supported broadly across these alert APIs (see per-endpoint scope requirements below).
  • Fine-grained tokens / GitHub App tokens: supported for most repo/org alert endpoints, but not for some enterprise-level Dependabot alert endpoints. For example, “List Dependabot alerts for an enterprise” does not work with fine-grained PATs or GitHub App tokens. [3]

2) Scopes / permissions (classic PATs & OAuth tokens)

  • Code scanning alerts: requires security_events (private or public), or public_repo if only accessing public repos. [1]
  • Secret scanning alerts: requires repo or security_events (and public_repo can be sufficient if only accessing public repos). [2]
  • Dependabot alerts: commonly requires security_events (and public_repo can be sufficient if only accessing public repos). Some endpoints (e.g., enterprise-level) accept repo or security_events. [3]

3) Fine-grained PAT / GitHub App permission names (repo-scoped)

  • Code scanning: "Code scanning alerts" repo permission (read for GET/list, write for update). [1]
  • Secret scanning: "Secret scanning alerts" repo permission (read/write depending on endpoint). [2]
  • Dependabot: "Dependabot alerts" repo permission (read/write depending on endpoint). [3]

4) Who is allowed to call the endpoints (role / access gating)

Even with the right token scope/permissions, the caller must have sufficient access:

  • Repository alert visibility/access is generally available to people with write/maintain/admin on the repo (and org owners for org-owned repos). GitHub also lets repo admins/org owners grant additional alert access (with constraints). [4]
  • Some alert endpoints explicitly require elevated roles, e.g.:
    • Secret scanning (org-level listing): caller must be an org admin or security manager. [2]
    • Secret scanning (repo-level listing): caller must be a repo admin (or admin of the owning org). [2]
    • Dependabot (org-level listing): caller must be an org owner or security manager. [3]
    • Dependabot (enterprise-level listing): caller must be an enterprise member, and results are limited to orgs where the caller is an org owner or security manager. [3]

Sources: GitHub REST API docs for Code Scanning [1], Secret Scanning [2], Dependabot Alerts [3], and “Managing security and analysis settings… / Granting access to security alerts” [4].


🏁 Script executed:

find . -name "security_client.go" -type f | head -5

Repository: flanksource/config-db

Length of output: 101


🏁 Script executed:

# Look for the full security_client.go file to understand context
fd security_client.go --type f

Repository: flanksource/config-db

Length of output: 99


🏁 Script executed:

cat -n scrapers/github/security_client.go

Repository: flanksource/config-db

Length of output: 6316


🏁 Script executed:

# Check if there are other places in the codebase that use this client
# to understand the broader context
rg "NewGitHubSecurityClient|GitHubSecurityClient" --type go -A 2 -B 2

Repository: flanksource/config-db

Length of output: 5100


🏁 Script executed:

# Check if token validation patterns exist elsewhere in this project
rg "token == \"\"" --type go

Repository: flanksource/config-db

Length of output: 47


Add token validation to fail fast on missing credentials.

Empty token values proceed silently to client creation, causing all subsequent security alert API calls (Dependabot, code scanning, secret scanning, advisories) to fail with 401 Unauthorized. Validate that a token exists and return an explicit error if neither connection nor personalAccessToken config provides one.

🐛 Proposed fix
 	} else {
 		token, err = ctx.GetEnvValueFromCache(config.PersonalAccessToken, ctx.Namespace())
 		if err != nil {
 			return nil, fmt.Errorf("failed to get token: %w", err)
 		}
 	}
+
+	if token == "" {
+		return nil, fmt.Errorf("github personal access token is required (connection or personalAccessToken)")
+	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 func NewGitHubSecurityClient(ctx api.ScrapeContext, config v1.GitHubSecurity, owner, repo string) (*GitHubSecurityClient, error) {
 	var token string
 	if connection, err := ctx.HydrateConnectionByURL(config.ConnectionName); err != nil {
 		return nil, fmt.Errorf("failed to hydrate connection: %w", err)
 	} else if connection != nil {
 		token = connection.Password
 	} else {
 		token, err = ctx.GetEnvValueFromCache(config.PersonalAccessToken, ctx.Namespace())
 		if err != nil {
 			return nil, fmt.Errorf("failed to get token: %w", err)
 		}
 	}
 
+	if token == "" {
+		return nil, fmt.Errorf("github personal access token is required (connection or personalAccessToken)")
+	}
+
 	ts := oauth2.StaticTokenSource(
 		&oauth2.Token{AccessToken: token},
 	)
 	tc := oauth2.NewClient(ctx, ts)
 	client := github.NewClient(tc)
 
 	return &GitHubSecurityClient{
 		ScrapeContext: ctx,
 		Client:        client,
 		owner:         owner,
 		repo:          repo,
 	}, nil
 }
🤖 Prompt for AI Agents
In `@scrapers/github/security_client.go` around lines 23 - 48, In
NewGitHubSecurityClient, after resolving token (the block that sets token from
ctx.HydrateConnectionByURL or ctx.GetEnvValueFromCache), add a check that token
is not empty and return a descriptive error if it is; for example, if token ==
"" return fmt.Errorf("missing GitHub token: no connection %q and no
personalAccessToken provided", config.ConnectionName or
config.PersonalAccessToken) so the function fails fast instead of creating a
client that will 401 on API calls.

Comment thread scrapers/github/security_client.go Outdated
Comment on lines +121 to +133
// GetSecurityAdvisories fetches repository security advisories
func (c *GitHubSecurityClient) GetSecurityAdvisories(ctx context.Context, opts AlertListOptions) ([]*github.SecurityAdvisory, *github.Response, error) {
	reqOpts := &github.ListRepositorySecurityAdvisoriesOptions{
		State: opts.State,
	}

	advisories, resp, err := c.Client.SecurityAdvisories.ListRepositorySecurityAdvisories(ctx, c.owner, c.repo, reqOpts)
	if err != nil {
		return nil, resp, fmt.Errorf("failed to fetch security advisories: %w", err)
	}

	return advisories, resp, nil
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

find . -name "*.go" -type f | head -20

Repository: flanksource/config-db

Length of output: 465


🏁 Script executed:

# Search for AlertListOptions definition
rg "type AlertListOptions" -A 10

Repository: flanksource/config-db

Length of output: 857


🏁 Script executed:

# Search for usage of ListRepositorySecurityAdvisoriesOptions
rg "ListRepositorySecurityAdvisoriesOptions" -B 2 -A 5

Repository: flanksource/config-db

Length of output: 782


🏁 Script executed:

# Check the security_client.go file structure
cat -n scrapers/github/security_client.go | head -150

Repository: flanksource/config-db

Length of output: 5681


🌐 Web query:

go-github ListRepositorySecurityAdvisoriesOptions ListOptions pagination

💡 Result:

In go-github, pagination for “list” endpoints is done with github.ListOptions (Page, PerPage). You pass it either directly or via embedding in a more specific *Options struct, then loop using resp.NextPage. [1], [2]

So for ListRepositorySecurityAdvisoriesOptions, you paginate the same way (GitHub’s REST endpoint also supports page / per_page). [1], [3]

opt := &github.ListRepositorySecurityAdvisoriesOptions{
    ListOptions: github.ListOptions{PerPage: 100},
}

for {
    advisories, resp, err := client.SecurityAdvisories.
        ListRepositorySecurityAdvisories(ctx, owner, repo, opt)
    if err != nil { return err }

    // ... use advisories ...

    if resp.NextPage == 0 {
        break
    }
    opt.Page = resp.NextPage // (or opt.ListOptions.Page, depending on embedding)
}

Sources: [1] https://sources.debian.org/src/golang-github-google-go-github/48.1.0-2/README.md (Pagination section) • [2] https://pkg.go.dev/github.com/google/go-github/v30/github (ListOptions) • [3] https://docs.github.com/en/rest/security-advisories/repository-advisories (List repository security advisories)


Pagination options ignored for security advisories.

opts.Page and opts.PerPage are not applied, so only the first page is fetched. This creates a data loss risk for repositories with many advisories. All other alert methods in this file (GetDependabotAlerts, GetCodeScanningAlerts, GetSecretScanningAlerts) correctly include ListOptions; this method should follow the same pattern.

Proposed fix
 	reqOpts := &github.ListRepositorySecurityAdvisoriesOptions{
 		State: opts.State,
+		ListOptions: github.ListOptions{
+			Page:    opts.Page,
+			PerPage: opts.PerPage,
+		},
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// GetSecurityAdvisories fetches repository security advisories
func (c *GitHubSecurityClient) GetSecurityAdvisories(ctx context.Context, opts AlertListOptions) ([]*github.SecurityAdvisory, *github.Response, error) {
	reqOpts := &github.ListRepositorySecurityAdvisoriesOptions{
		State: opts.State,
		ListOptions: github.ListOptions{
			Page:    opts.Page,
			PerPage: opts.PerPage,
		},
	}
	advisories, resp, err := c.Client.SecurityAdvisories.ListRepositorySecurityAdvisories(ctx, c.owner, c.repo, reqOpts)
	if err != nil {
		return nil, resp, fmt.Errorf("failed to fetch security advisories: %w", err)
	}
	return advisories, resp, nil
}
🤖 Prompt for AI Agents
In `@scrapers/github/security_client.go` around lines 121 - 133,
GetSecurityAdvisories ignores pagination and always fetches the first page; set
the ListRepositorySecurityAdvisoriesOptions to include the requested page and
per-page values from AlertListOptions (map opts.Page and opts.PerPage into the
options used for the API call). Update the reqOpts used in GetSecurityAdvisories
to populate the ListOptions/Page and PerPage fields (same approach as
GetDependabotAlerts/GetCodeScanningAlerts/GetSecretScanningAlerts) so the call
to c.Client.SecurityAdvisories.ListRepositorySecurityAdvisories honors
pagination.

Comment thread scrapers/github/security.go Outdated
Comment on lines +59 to +62
			results = append(results, result)
			ctx.Logger.V(2).Infof("successfully scraped %s: %d total alerts", repoFullName,
				result.Properties.Find("Critical Alerts").Text+result.Properties.Find("High Alerts").Text+
				result.Properties.Find("Medium Alerts").Text+result.Properties.Find("Low Alerts").Text)

⚠️ Potential issue | 🟡 Minor

Fix total-alerts logging (string concat + %d).

The log concatenates strings and formats with %d, so the output is incorrect. Compute an integer total before logging.

🐛 Suggested fix
 import (
 	"fmt"
+	"strconv"
 	"sync"
 	"time"
-			results = append(results, result)
-			ctx.Logger.V(2).Infof("successfully scraped %s: %d total alerts", repoFullName,
-				result.Properties.Find("Critical Alerts").Text+result.Properties.Find("High Alerts").Text+
-				result.Properties.Find("Medium Alerts").Text+result.Properties.Find("Low Alerts").Text)
+			results = append(results, result)
+			totalAlerts := 0
+			for _, name := range []string{"Critical Alerts", "High Alerts", "Medium Alerts", "Low Alerts"} {
+				if p := result.Properties.Find(name); p != nil {
+					if n, err := strconv.Atoi(p.Text); err == nil {
+						totalAlerts += n
+					}
+				}
+			}
+			ctx.Logger.V(2).Infof("successfully scraped %s: %d total alerts", repoFullName, totalAlerts)
🤖 Prompt for AI Agents
In `@scrapers/github/security.go` around lines 59 - 62, The log call is formatting
with "%d" but is concatenating the alert-count strings, so compute an integer
total first by parsing the individual counts from
result.Properties.Find("Critical Alerts").Text, "High Alerts", "Medium Alerts",
and "Low Alerts" (use strconv.Atoi, handle errors defaulting to 0), sum them
into a total int, then call ctx.Logger.V(2).Infof("successfully scraped %s: %d
total alerts", repoFullName, total) instead of concatenating the Text values.

Comment thread scrapers/processors/json.go
Comment on lines 33 to +36
	if err := json.Unmarshal([]byte(s), &results); err != nil {
		if logger.V(5).Enabled() {
			logger.Infof("Failed to unmarshal script output: %v\n%s\n", err, s)
		}

⚠️ Potential issue | 🟠 Major

Avoid logging full script output (potential secrets/PII leak).

Even at verbose level, logging raw s can expose credentials/configs and dump large payloads. Consider truncating/redacting or logging only a sample/length.

💡 Suggested fix (truncate output)
-		if logger.V(5).Enabled() {
-			logger.Infof("Failed to unmarshal script output: %v\n%s\n", err, s)
-		}
+		if logger.V(5).Enabled() {
+			const maxLogBytes = 4096
+			out := s
+			if len(out) > maxLogBytes {
+				out = out[:maxLogBytes] + "...(truncated)"
+			}
+			logger.Infof("Failed to unmarshal script output: %v\n%s\n", err, out)
+		}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
	if err := json.Unmarshal([]byte(s), &results); err != nil {
		if logger.V(5).Enabled() {
			const maxLogBytes = 4096
			out := s
			if len(out) > maxLogBytes {
				out = out[:maxLogBytes] + "...(truncated)"
			}
			logger.Infof("Failed to unmarshal script output: %v\n%s\n", err, out)
		}
🤖 Prompt for AI Agents
In `@scrapers/processors/script.go` around lines 33 - 36, The current logger.Infof
call in the json.Unmarshal error block logs the entire script output variable s
(risking secrets/PII and large dumps); update the error handling around
json.Unmarshal to avoid printing the full s — instead log the error and either a
safely truncated sample (e.g., first N characters with "…(truncated)") and the
total length, or a redacted/hashed summary of s, and consider simple redaction
for common secrets (tokens, passwords) before including any sample; modify the
logger.Infof invocation that references err and s to use the new
truncated/redacted sample and length-only information.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 18

Note

Due to the large number of review comments, Critical and Major severity comments were prioritized as inline comments.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
scrapers/kubernetes/ketall.go (1)

53-56: ⚠️ Potential issue | 🔴 Critical

Do not swallow errors returned by retry.

Line 55 always returns nil error even when retry.Do fails, which masks scrape failures.

Suggested fix
 	err = retry.Do(ctx, backoff, func(goctx gocontext.Context) error {
 		...
 	})
-
-	return objs, nil
+	if err != nil {
+		return nil, err
+	}
+	return objs, nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/kubernetes/ketall.go` around lines 53 - 56, The code currently
ignores errors from retry.Do and always returns nil; modify the function to
capture the error returned by retry.Do (e.g., err := retry.Do(...)), check if
err != nil and return the collected objs (if applicable) along with that err (or
just return nil, err depending on intended semantics), instead of always
returning objs, nil; reference the retry.Do call and the final return of objs to
locate and fix the issue in ketall.go.
♻️ Duplicate comments (9)
pkg/api/context.go (1)

165-167: ⚠️ Potential issue | 🟠 Major

Guard scrapeConfig in IsDebug() to prevent panic.

Line 166 dereferences ctx.scrapeConfig without a nil check. NewScrapeContext() can return with scrapeConfig == nil, so this can panic; the same guard should be applied to IsTrace() (Line 162) for consistency.

🐛 Proposed fix
 func (ctx ScrapeContext) IsTrace() bool {
+	if ctx.scrapeConfig == nil {
+		return false
+	}
 	return ctx.scrapeConfig.Spec.IsTrace()
 }

 func (ctx ScrapeContext) IsDebug() bool {
+	if ctx.scrapeConfig == nil {
+		return false
+	}
 	return ctx.scrapeConfig.Spec.IsDebug()
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/api/context.go` around lines 165 - 167, The IsDebug() method dereferences
ctx.scrapeConfig and can panic when scrapeConfig is nil (NewScrapeContext may
return nil scrapeConfig); update IsDebug() to guard against nil by checking if
ctx.scrapeConfig == nil and returning false if so, mirroring the same nil-check
you should add to IsTrace() to keep behavior consistent; locate and modify the
IsDebug and IsTrace methods in ScrapeContext to perform the nil guard before
calling scrapeConfig.Spec.IsDebug()/IsTrace().
scrapers/processors/script.go (1)

33-36: ⚠️ Potential issue | 🟠 Major

Avoid logging full script output on unmarshal errors.

Line 35 logs raw s, which can expose secrets/PII and flood logs. Log only a bounded, sanitized preview plus length.

Suggested fix
 	if err := json.Unmarshal([]byte(s), &results); err != nil {
 		if logger.V(5).Enabled() {
-			logger.Infof("Failed to unmarshal script output: %v\n%s\n", err, s)
+			const maxLogBytes = 4096
+			out := s
+			if len(out) > maxLogBytes {
+				out = out[:maxLogBytes] + "...(truncated)"
+			}
+			logger.Infof("Failed to unmarshal script output: %v (len=%d)\n%s\n", err, len(s), out)
 		}
 		return nil, err
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/processors/script.go` around lines 33 - 36, The unmarshal error
handler currently logs the full script output variable s which may expose
secrets; in the json.Unmarshal error branch update the logger.Infof call (inside
the block that checks logger.V(5).Enabled()) to log only a sanitized, bounded
preview of s (e.g., first N characters) plus the total length and the error from
json.Unmarshal rather than the full payload; use the existing variables
json.Unmarshal, s, results and logger.Infof so the change is localized to that
error-handling block and ensure any non-printable or sensitive patterns (like
emails/URLs) are redacted in the preview.
scrapers/openssf/compliance.go (1)

10-11: Incorrect source reference in documentation comment.

The comment references "plugins/vite-plugin-security.js" which appears to be from a different project. This should reference the OpenSSF Scorecard documentation or the actual data source for the compliance mappings.

Suggested fix
 // ComplianceMappings maps each OpenSSF Scorecard check to compliance framework controls
-// Based on DataSheet reference: plugins/vite-plugin-security.js
+// Based on OpenSSF Scorecard checks documentation
 var ComplianceMappings = map[string]ComplianceMapping{
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/openssf/compliance.go` around lines 10 - 11, The documentation
comment above the ComplianceMappings variable incorrectly references
"plugins/vite-plugin-security.js"; update the comment to reference the correct
source (e.g., OpenSSF Scorecard documentation or the actual mapping data source)
and include a brief link or citation to that source so the comment accurately
reflects where the compliance mappings come from (edit the comment that precedes
ComplianceMappings in compliance.go to replace the incorrect reference with the
OpenSSF Scorecard docs or the internal data source identifier).
api/v1/types.go (2)

128-134: ApplyPlugin still missing loops for GitHubSecurity and OpenSSFScorecard.

Plugin specs (transforms, retention overrides, etc.) won't be applied to the two new scraper types.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/v1/types.go` around lines 128 - 134, The loops that call ApplyPlugins on
BaseScraper are only present for spec.GithubActions and spec.Azure; add
equivalent loops for spec.GitHubSecurity and spec.OpenSSFScorecard so their
BaseScraper fields also call ApplyPlugins(plugins...), e.g., iterate over
spec.GitHubSecurity and spec.OpenSSFScorecard with the same pattern used for
spec.GithubActions/ spec.Azure to assign BaseScraper =
BaseScraper.ApplyPlugins(plugins...).
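
For reference, a minimal sketch of the missing loops, mirroring the pattern the review describes for spec.GithubActions/spec.Azure (field and method names are taken from the comment above, not verified against the file):

for i := range spec.GitHubSecurity {
	spec.GitHubSecurity[i].BaseScraper = spec.GitHubSecurity[i].BaseScraper.ApplyPlugins(plugins...)
}
for i := range spec.OpenSSFScorecard {
	spec.OpenSSFScorecard[i].BaseScraper = spec.OpenSSFScorecard[i].BaseScraper.ApplyPlugins(plugins...)
}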

13-29: AllScraperConfigs still missing githubsecurity and openssfscorecard entries.

Both new scraper types added to ScraperSpec are absent from this registry map.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/v1/types.go` around lines 13 - 29, AllScraperConfigs is missing entries
for the two new scraper types declared in ScraperSpec; add the corresponding map
entries so the registry includes "githubsecurity" and "openssfscorecard". Update
the map by adding "githubsecurity": GitHubSecurity{} and "openssfscorecard":
OpenSSFScorecard{} to the AllScraperConfigs map (matching the existing lowercase
key -> PascalCase type pattern) so the new scraper types are discoverable.
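
A minimal sketch of the two registry entries, using the key spelling suggested above (assuming it matches the existing lowercase-key convention in AllScraperConfigs):

	"githubsecurity":   GitHubSecurity{},
	"openssfscorecard": OpenSSFScorecard{},
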
scrapers/github/security.go (1)

89-101: ⚠️ Potential issue | 🔴 Critical

Fix CreatedAt pointer assignment in ScrapeResult.

CreatedAt is being assigned a time.Time value directly; this path expects a pointer and will fail type-checking.

🐛 Proposed fix
-	result := v1.ScrapeResult{
+	createdAt := repo.CreatedAt.GetTime()
+	result := v1.ScrapeResult{
 		Type:        ConfigTypeGitHubSecurityRepo,
 		ID:          fmt.Sprintf("github-security/%s", repoFullName),
 		Name:        repoFullName,
 		ConfigClass: "Security",
 		Config:      repo,
@@
-		CreatedAt:  repo.CreatedAt.GetTime(),
+		CreatedAt:  &createdAt,
 		Properties: properties,
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/security.go` around lines 89 - 101, The ScrapeResult struct
field CreatedAt is a pointer but the code assigns a time.Time directly; fix in
the ScrapeResult construction (symbol: result := v1.ScrapeResult{ ... CreatedAt:
... }) by creating a local variable like t := repo.CreatedAt.GetTime() and then
assigning the pointer (CreatedAt: &t) so the field receives *time.Time instead
of time.Time.
scrapers/github/security_client.go (2)

23-34: ⚠️ Potential issue | 🟠 Major

Fail fast when token resolution returns empty.

Client creation currently proceeds with an empty token, which guarantees downstream 401s.

🔐 Proposed fix
 	} else {
 		token, err = ctx.GetEnvValueFromCache(config.PersonalAccessToken, ctx.Namespace())
 		if err != nil {
 			return nil, fmt.Errorf("failed to get token: %w", err)
 		}
 	}
+	if token == "" {
+		return nil, fmt.Errorf("github personal access token is required (connection or personalAccessToken)")
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/security_client.go` around lines 23 - 34, The
NewGitHubSecurityClient constructor allows creating a client with an empty token
which causes downstream 401s; after resolving token in NewGitHubSecurityClient
(the token variable set from connection.Password or ctx.GetEnvValueFromCache)
validate that token is non-empty and return a descriptive error (e.g., "empty
GitHub token" or "failed to resolve GitHub token") instead of proceeding when
token == ""; ensure you check both branches (connection != nil and the env
lookup) and fail fast from the function if the token is empty.

121-125: ⚠️ Potential issue | 🟠 Major

Apply pagination options in GetSecurityAdvisories.

opts.Page and opts.PerPage are ignored here, so advisory retrieval is limited to the default first page.

📄 Proposed fix
 	reqOpts := &github.ListRepositorySecurityAdvisoriesOptions{
 		State: opts.State,
+		ListOptions: github.ListOptions{
+			Page:    opts.Page,
+			PerPage: opts.PerPage,
+		},
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/security_client.go` around lines 121 - 125,
GetSecurityAdvisories currently ignores pagination options; update the creation
of reqOpts (ListRepositorySecurityAdvisoriesOptions) inside
GetSecurityAdvisories to include the ListOptions with Page and PerPage from opts
(e.g., set reqOpts.ListOptions = github.ListOptions{Page: opts.Page, PerPage:
opts.PerPage} or equivalent) so the GitHub API call respects opts.Page and
opts.PerPage when listing repository security advisories.
api/v1/github_security.go (1)

43-53: ⚠️ Potential issue | 🟠 Major

ParseMaxAge rejects the documented "90d" / "30d" inputs.

time.ParseDuration does not support day (d) units, so the current implementation conflicts with the comment examples.

🛠️ Proposed fix (support `d` suffix)
 import (
+	"strconv"
+	"strings"
 	"time"

 	"github.com/flanksource/duty/types"
 )
@@
 func (f GitHubSecurityFilters) ParseMaxAge() (time.Duration, error) {
 	if f.MaxAge == "" {
 		return 0, nil
 	}
+	raw := strings.TrimSpace(strings.ToLower(f.MaxAge))
+	if strings.HasSuffix(raw, "d") {
+		days, err := strconv.Atoi(strings.TrimSuffix(raw, "d"))
+		if err != nil {
+			return 0, err
+		}
+		return time.Duration(days) * 24 * time.Hour, nil
+	}
 	return time.ParseDuration(f.MaxAge)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/v1/github_security.go` around lines 43 - 53, ParseMaxAge currently calls
time.ParseDuration which rejects values like "90d"; update ParseMaxAge on the
GitHubSecurityFilters type to accept a trailing "d" (days) by detecting if
f.MaxAge ends with "d" (or "D"), converting the numeric part to hours (days *
24h) and returning the corresponding time.Duration, while still delegating to
time.ParseDuration for values without a day suffix and returning errors from
parsing as before; reference the MaxAge field and the ParseMaxAge function when
making this change.
🟡 Minor comments (5)
api/v1/utils.go-15-23 (1)

15-23: ⚠️ Potential issue | 🟡 Minor

Avoid exporting MD5 behind a generic Hash API.

While current usages are for non-security ID generation (naming), the exported generic name could encourage future security-sensitive misuse. MD5 has known collision weaknesses and should not be used for authentication, signatures, or integrity checks.

Proposed safer implementation
 import (
-	"crypto/md5"
+	"crypto/sha256"
 	"encoding/hex"
 	"encoding/json"
 	"strings"
 )
@@
-// Hash returns the MD5 hash of the JSON representation of the given value.
+// Hash returns the SHA-256 hash of the JSON representation of the given value.
 func Hash(v any) (string, error) {
 	data, err := json.Marshal(v)
 	if err != nil {
 		return "", err
 	}
 
-	hash := md5.Sum(data)
+	hash := sha256.Sum256(data)
 	return hex.EncodeToString(hash[:]), nil
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/v1/utils.go` around lines 15 - 23, The exported function Hash currently
hides MD5 usage and risks future security misuse; change it to a non-exported,
purpose-specific name (e.g., hashName or hashID) or replace the algorithm with a
cryptographically safe one (e.g., SHA-256) and keep the name if you intend a
secure hash; update all callers of Hash accordingly. Specifically, modify the
function symbol Hash in api/v1/utils.go to either (a) rename to a lowercase,
purpose-scoped name like hashName/hashID if it is only used for non-security
IDs, or (b) switch the internal md5.Sum call to a SHA-256 implementation and
keep the exported API if you need a secure hash, then adjust return encoding
(hex) as needed; ensure all references to Hash are updated to the new name or
behavior.
scrapers/aws/aws.go-493-493 (1)

493-493: ⚠️ Potential issue | 🟡 Minor

Log level bump to V(3) is reasonable.

Reduces noise from per-chunk ECS task description logging. Note that the log message still references len(allTaskArns) (total count) rather than len(taskArns) (current chunk size), which could be confusing when reading logs.

Suggested fix
-		ctx.Logger.V(3).Infof("describing %d ECS tasks for cluster %s", len(allTaskArns), clusterArn)
+		ctx.Logger.V(3).Infof("describing %d ECS tasks for cluster %s", len(taskArns), clusterArn)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/aws/aws.go` at line 493, The log call ctx.Logger.V(3).Infof
currently prints len(allTaskArns) which is the total but it's inside a per-chunk
loop — update the message at the ctx.Logger.V(3).Infof call to reference
len(taskArns) for the current chunk (or include both counts, e.g. "describing
%d/%d ECS tasks" with len(taskArns) and len(allTaskArns)) so logs reflect the
chunk size being processed.
api/plugin/go.mod-405-407 (1)

405-407: ⚠️ Potential issue | 🟡 Minor

Add glebarez/sqlite replace directive to match root go.mod.

The root go.mod replaces github.com/glebarez/sqlite with github.com/clarkmcc/gorm-sqlite, but this submodule's go.mod lacks the matching replace directive. This module lists glebarez/sqlite v1.11.0 as an indirect dependency but will pull the original package instead of the fork when built standalone.

Proposed fix
 replace github.com/flanksource/config-db/api => ../v1
+
+replace github.com/glebarez/sqlite => github.com/clarkmcc/gorm-sqlite v0.0.0-20240426202654-00ed082c0311
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/plugin/go.mod` around lines 405 - 407, The submodule go.mod is missing
the replace that maps github.com/glebarez/sqlite to the fork used at the repo
root; add a replace directive in api/plugin/go.mod (alongside the existing
replace block that contains "github.com/flanksource/config-db/api => ../v1") to
point github.com/glebarez/sqlite to github.com/clarkmcc/gorm-sqlite (use the
same version used in root go.mod), so builds of this submodule use the forked
sqlite implementation.
api/plugin/proto/scraper.proto-116-133 (1)

116-133: ⚠️ Potential issue | 🟡 Minor

Inconsistent type for JSON-encoded fields: patches is string, details is bytes.

In ChangeResultProto, patches (line 123) is declared as string while details (line 129) is bytes with a comment indicating it's JSON-encoded. Usage throughout the codebase confirms patches contains JSON (e.g., json.Unmarshal([]byte(change.Patches), ...) in scrapers/run.go:143 and test fixtures with JSON objects), yet it remains string. Other JSON-encoded fields like config, properties, base_scraper, and analysis in ScrapeResultProto consistently use bytes. This inconsistency should be resolved: either change patches to bytes for consistency with other JSON fields, or remove the "JSON-encoded map" designation from details.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/plugin/proto/scraper.proto` around lines 116 - 133, ChangeResultProto
declares patches as string while details is bytes (commented as JSON); make
patches a bytes field to match other JSON-encoded fields and keep a consistent
pattern (update the ChangeResultProto field "patches" to type bytes and add the
same "JSON-encoded map" comment as on "details"). Then update all call sites
that treat ChangeResultProto.patches as a string (places that convert it to
[]byte before json unmarshalling) to accept the bytes field directly, regenerate
the protobuf artifacts, and adjust test fixtures to provide patches as raw
bytes/JSON bytes.
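
If patches is switched to bytes, the Go call sites simplify accordingly; a sketch only, assuming the generated field is named Patches and the target is a generic map:

// before: json.Unmarshal([]byte(change.Patches), &patches)
var patches map[string]any
if len(change.Patches) > 0 {
	if err := json.Unmarshal(change.Patches, &patches); err != nil {
		return nil, fmt.Errorf("invalid patches payload: %w", err)
	}
}
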
api/plugin/convert_test.go-164-178 (1)

164-178: ⚠️ Potential issue | 🟡 Minor

Test name is misleading: the test uses non-empty details.

TestProtoToChangeResultWithEmptyDetails actually exercises a non-empty JSON payload ({"key": "value"}). Rename to TestProtoToChangeResultWithDetails or similar.

🔧 Proposed rename
-func TestProtoToChangeResultWithEmptyDetails(t *testing.T) {
+func TestProtoToChangeResultWithDetails(t *testing.T) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/plugin/convert_test.go` around lines 164 - 178, The test name
TestProtoToChangeResultWithEmptyDetails is misleading because the test
constructs a proto with non-empty JSON Details; rename the test function to
something like TestProtoToChangeResultWithDetails to reflect behavior, updating
the function declaration and any references to
TestProtoToChangeResultWithEmptyDetails so the test runner and readers see a
consistent, accurate name; ensure the function name in the file
(TestProtoToChangeResultWithEmptyDetails) is changed and any test-suite or CI
references (if present) are updated accordingly.
🧹 Nitpick comments (10)
scrapers/aws/cloudtrail.go (1)

15-15: Nit: the v1 alias is misleading for a non-versioned path.

Aliasing a package name to something unrelated to the actual package can confuse readers. Now that the import points to the root api package rather than a v1 sub-path, the alias no longer conveys accurate information. Consider renaming it to api (or a similarly descriptive name) in a follow-up pass.

♻️ Suggested rename (illustrative — apply across the file)
-	v1 "github.com/flanksource/config-db/api"
+	configapi "github.com/flanksource/config-db/api"

Then replace all v1. usages with configapi. throughout this file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/aws/cloudtrail.go` at line 15, The import alias `v1` is misleading
for the root `api` package; change the import alias from `v1` to a clearer name
like `api` (or `configapi`) in scrapers/aws/cloudtrail.go and update all
references (e.g., any occurrences of `v1.` used in functions or types in this
file) to the new alias so the package name reflects the actual package; locate
references such as `v1.` in methods and structs in this file and replace them
with `api.` (or `configapi.`) consistently.
Makefile (1)

87-88: gotest target bypasses envtest asset validation.

Unlike gotest-prod and gotest-load which use $(validate-envtest-assets) to verify that etcd and other required binaries exist before running tests, this target uses a bare $(shell ...) expansion. If setup-envtest fails or returns an empty path, the test run will fail with a cryptic error instead of the clear diagnostics provided by the validation macro.

Suggested fix to use the same validation pattern
 .PHONY: gotest
 gotest: ginkgo
-	KUBEBUILDER_ASSETS="$(shell $(ENVTEST) use $(ENVTEST_K8S_VERSION) --bin-dir $(LOCALBIN) -p path)" \
-		ginkgo -r -v --skip-package=tests/e2e,api/plugin -coverprofile cover.out ./...
+	$(validate-envtest-assets) \
+	KUBEBUILDER_ASSETS="$$ASSETS" \
+		ginkgo -r -v --skip-package=tests/e2e,api/plugin -coverprofile cover.out ./...
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 87 - 88, The gotest Makefile target uses a direct
shell expansion to set KUBEBUILDER_ASSETS which bypasses the
validate-envtest-assets check; update the gotest target to invoke the existing
validate-envtest-assets validation (the same macro used by gotest-prod and
gotest-load) before using ENVTEST/ENVTEST_K8S_VERSION/LOCALBIN so that etcd and
required binaries are verified and a clear error is produced if validation
fails—locate the gotest target and replace the bare $(shell $(ENVTEST) use ...)
usage with the validated pattern (or add an explicit dependency on
validate-envtest-assets) so KUBEBUILDER_ASSETS is only determined after
validation succeeds.
go.mod (1)

406-406: Pre-release pin for structured-merge-diff/v6.

Pinning to a specific commit (v6.3.2-0.20260122202528-d9cc6641c482) rather than a stable release is fragile. Consider upgrading to a stable tag once one is available, to avoid potential breakage from upstream force-pushes or tag changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@go.mod` at line 406, The go.mod entry currently pins
sigs.k8s.io/structured-merge-diff/v6 to a pseudo-version commit
(v6.3.2-0.20260122202528-d9cc6641c482); update this to a stable released tag
(e.g., v6.x.y) or otherwise remove the commit-specific pseudo-version so the
module uses a proper semantic version; locate the line containing the module
name "sigs.k8s.io/structured-merge-diff/v6" and replace the pseudo-version with
the intended stable tag (or run `go get
sigs.k8s.io/structured-merge-diff/v6@latest`/appropriate version and `go mod
tidy`) to ensure we don't rely on a fragile pre-release commit.
api/plugin/proto/scraper.proto (1)

1-19: Buf lint warning about package directory mismatch is a non-issue for the current build.

The Buf static analysis flags that configdb.plugin should be in a configdb/plugin directory. Since the Makefile uses raw protoc (not Buf), this doesn't affect the build. If Buf is adopted later, either restructure the directory or add a buf.yaml with a custom root.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/plugin/proto/scraper.proto` around lines 1 - 19, The proto package
declaration "package configdb.plugin" in scraper.proto conflicts with Buf's
expected directory layout (configdb/plugin); to fix, either move scraper.proto
into a directory matching the package (create configdb/plugin and place
scraper.proto there) or add a buf.yaml at the repo root configuring a custom
root to avoid the lint error; look for the package line and the services
ScraperPlugin and HostServices in scraper.proto when applying the change so
imports and go_package option remain correct.
.github/workflows/api-lint-test.yml (1)

19-21: Consider using go-version-file to avoid version drift.

Hardcoding go-version: 1.25.x could drift from the go 1.25.1 directive in the submodule's go.mod. Using go-version-file keeps them in sync automatically.

Example
       - uses: buildjet/setup-go@v5
         with:
-          go-version: 1.25.x
+          go-version-file: api/v1/go.mod
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/api-lint-test.yml around lines 19 - 21, The workflow
currently hardcodes the Go version using buildjet/setup-go@v5 with go-version:
1.25.x which can drift from the submodule's go.mod; change the action inputs to
use go-version-file instead and point it at the file that contains the canonical
Go version (e.g., the submodule's go.mod), so replace the go-version parameter
with go-version-file and ensure the path to the go.mod (or a dedicated
.go-version file) is correct to keep versions in sync.
api/v1/Makefile (1)

39-42: golangci-lint install doesn't verify version, unlike controller-gen.

The controller-gen target (lines 47-48) checks that the installed version matches CONTROLLER_TOOLS_VERSION before deciding to reinstall. The golangci-lint target only checks for existence (test -s), so a stale or mismatched version will silently be used.

Proposed fix to add version checking
 .PHONY: golangci-lint
 golangci-lint: $(GOLANGCI_LINT)
 $(GOLANGCI_LINT): $(LOCALBIN)
-	test -s $(LOCALBIN)/golangci-lint || GOBIN=$(LOCALBIN) go install github.com/golangci/golangci-lint/v2/cmd/golangci-lint@$(GOLANGCI_LINT_VERSION)
+	test -s $(LOCALBIN)/golangci-lint && $(LOCALBIN)/golangci-lint version --short 2>&1 | grep -q $(GOLANGCI_LINT_VERSION) || \
+	GOBIN=$(LOCALBIN) go install github.com/golangci/golangci-lint/v2/cmd/golangci-lint@$(GOLANGCI_LINT_VERSION)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/v1/Makefile` around lines 39 - 42, The golangci-lint target currently
only checks for the binary's existence and can leave a mismatched version
installed; update the golangci-lint target to verify the installed version
against GOLANGCI_LINT_VERSION (similar to the controller-gen logic): invoke
$(LOCALBIN)/golangci-lint --version (or equivalent) and compare the output to
GOLANGCI_LINT_VERSION, and only run GOBIN=$(LOCALBIN) go install
github.com/golangci/golangci-lint/v2/cmd/golangci-lint@$(GOLANGCI_LINT_VERSION)
when the version differs or the binary is missing; use the existing variables
golangci-lint, GOLANGCI_LINT, GOLANGCI_LINT_VERSION, and LOCALBIN to implement
this check.
api/plugin/Makefile (1)

33-37: proto target assumes protoc and plugins are pre-installed.

Unlike the golangci-lint target which auto-installs the tool, the proto target relies on protoc, protoc-gen-go, and protoc-gen-go-grpc being already available. Consider documenting the required tool versions or adding install logic to avoid "command not found" failures for new contributors.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/plugin/Makefile` around lines 33 - 37, The proto Makefile target named
"proto" currently assumes protoc and its Go plugins are preinstalled; update the
Makefile to either (a) add install steps similar to the existing golangci-lint
flow that will check for and install protoc, protoc-gen-go and
protoc-gen-go-grpc (with pinned versions) before running the protoc command, or
(b) add a clear comment block above the "proto" target listing required tools
and their versions (protoc, protoc-gen-go, protoc-gen-go-grpc) and a one-line
hint on how to install them; reference the "proto" target and mirror the
auto-install pattern used for the golangci-lint target to avoid "command not
found" errors.
scrapers/openssf/scorecard.go (2)

139-143: Hardcoded "failing" / "passing" strings for analysis.Status.

These bare string literals should use constants (e.g., from the models package or a local const block) to prevent silent drift if the expected status values change.

♻️ Proposed refactor
+const (
+	analysisStatusPassing = "passing"
+	analysisStatusFailing = "failing"
+)
 ...
 		if check.Score < 10 {
-			analysis.Status = "failing"
+			analysis.Status = analysisStatusFailing
 		} else {
-			analysis.Status = "passing"
+			analysis.Status = analysisStatusPassing
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/openssf/scorecard.go` around lines 139 - 143, Replace the hardcoded
"failing"/"passing" literals assigned to analysis.Status with well-named
constants (e.g., models.StatusFailing and models.StatusPassing or local consts
StatusFailing/StatusPassing) and update the import if using the models package;
change the branch in the check.Score handling inside the same function (where
analysis.Status and check.Score are used) to assign the constants instead of raw
strings so the status values are centralized and won't drift.

265-272: containsString duplicates slices.Contains from the standard library.

Go 1.21+ includes slices.Contains. Replace the helper to reduce code surface.

♻️ Proposed refactor
-func containsString(slice []string, item string) bool {
-	for _, s := range slice {
-		if s == item {
-			return true
-		}
-	}
-	return false
-}

Then update the call site:

+import "slices"
 ...
-		if check.Score == 0 && containsString(criticalChecks, check.Name) {
+		if check.Score == 0 && slices.Contains(criticalChecks, check.Name) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/openssf/scorecard.go` around lines 265 - 272, The local helper
containsString duplicates the standard library; remove the containsString
function and replace its call sites with slices.Contains(slice, item). Add the
standard library import "slices" to the file (or group it in the existing import
block) and update any references to use slices.Contains where containsString was
used.
api/plugin/convert_test.go (1)

58-90: Round-trip test doesn't verify pointer fields CreatedAt and CreatedBy.

Both fields are set in original and go through a protobuf timestamp conversion. Adding assertions ensures the timestamp serialization round-trip is correct and won't silently regress.

➕ Suggested additional assertions
 	if roundTrip.ChangeType != original.ChangeType {
 		t.Errorf("RoundTrip ChangeType mismatch: got %s, want %s", roundTrip.ChangeType, original.ChangeType)
 	}
+	if roundTrip.CreatedBy == nil || *roundTrip.CreatedBy != *original.CreatedBy {
+		t.Errorf("RoundTrip CreatedBy mismatch: got %v, want %v", roundTrip.CreatedBy, original.CreatedBy)
+	}
+	if roundTrip.CreatedAt == nil || !roundTrip.CreatedAt.Equal(*original.CreatedAt) {
+		t.Errorf("RoundTrip CreatedAt mismatch: got %v, want %v", roundTrip.CreatedAt, original.CreatedAt)
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/plugin/convert_test.go` around lines 58 - 90, Add assertions to
TestChangeResultRoundTrip to verify the pointer fields CreatedAt and CreatedBy
survive the protobuf round-trip: after converting original -> proto -> roundTrip
(using changeResultToProto and protoToChangeResult), compare that
roundTrip.CreatedBy equals original.CreatedBy (dereference pointers as needed)
and that roundTrip.CreatedAt is non-nil and its time value equals
original.CreatedAt (use time.Equal or equivalent to compare times after proto
timestamp conversion). Ensure you handle nil pointers safely and include clear
failure messages referencing CreatedAt/CreatedBy.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/plugin/convert.go`:
- Around line 396-409: The conversion functions (e.g., protoToExternalRole,
protoToExternalUser, protoToExternalGroup, protoToExternalUserGroup,
protoToExternalConfigAccess, protoToExternalConfigAccessLog) currently call
uuid.MustParse which will panic on malformed plugin input; replace each
uuid.MustParse call with uuid.Parse, check the returned error, and only set the
model UUID fields when parsing succeeds (for pointer fields set to &id on
success, for value fields set id on success); on parse error either leave the
field unset/nil or log the invalid UUID (using existing logger) before
continuing so malformed plugin-provided IDs do not crash the process.
- Around line 386-393: externalRoleToProto (and similarly proto conversions for
configs) currently calls lo.FromPtr on r.ScraperID which turns nil into the
zero-UUID string; change the conversion so ScraperId is only set when the
original pointer is non-nil (e.g. if r.ScraperID != nil then
r.ScraperID.String() else ""), so the proto field remains empty for a nil
pointer and the existing protoToExternalRole/protoToExternalConfigAccess logic
(which checks p.ScraperId != "") will correctly preserve nil on round-trip;
update externalRoleToProto to avoid lo.FromPtr for ScraperID and apply the same
pattern where other ScraperID conversions occur.
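
A minimal sketch covering both convert.go items above; the pointer field ScraperID, the proto field ScraperId, and the logger are assumptions based on the prompt text:

// parse defensively instead of uuid.MustParse, skipping malformed IDs
if id, err := uuid.Parse(p.ScraperId); err == nil {
	role.ScraperID = &id
} else if p.ScraperId != "" {
	logger.Warnf("ignoring invalid scraper id %q: %v", p.ScraperId, err)
}

// when converting back, only emit a value for a non-nil pointer so nil round-trips as ""
scraperID := ""
if r.ScraperID != nil {
	scraperID = r.ScraperID.String()
}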

In `@api/plugin/grpc_plugin.go`:
- Around line 29-37: The GRPCClient implementation dials a host connection and
sets grpcClient.hostClient (via broker.Dial(1) and hostConn) but never uses or
closes it, leaking resources; remove the unused hostClient and the broker.Dial
call: update GRPCPlugin.GRPCClient to stop calling broker.Dial(1) and construct
&grpcClient{client: pb.NewScraperPluginClient(c)} only, and also remove the
hostClient field from the grpcClient struct (and any related imports/fields) so
there is no unused client or open connection; alternatively, if host services
are required, ensure the hostConn is stored, hostClient methods are invoked from
grpcClient methods, and hostConn is closed when no longer needed.

In `@api/plugin/types.go`:
- Around line 35-70: The conversion functions EnvVarFromDuty and (EnvVar) ToDuty
are dropping the EnvVar.Name field; update EnvVarFromDuty to copy e.Name into
the returned EnvVar.Name and update (EnvVar) ToDuty to set types.EnvVar.Name
from e.Name so the name round-trips correctly; locate the conversions in
EnvVarFromDuty and ToDuty and assign the Name field on both sides (EnvVar and
types.EnvVar) while leaving the existing ValueStatic and ValueFrom handling
intact.
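
A sketch of the Name fix for both directions; only the added field is new, and the literals below stand in for the existing conversion code:

// in EnvVarFromDuty (duty -> plugin)
out := EnvVar{
	Name:        e.Name, // previously dropped
	ValueStatic: e.ValueStatic,
	// ...existing ValueFrom handling unchanged...
}

// in (EnvVar) ToDuty (plugin -> duty)
return types.EnvVar{
	Name:        e.Name, // previously dropped
	ValueStatic: e.ValueStatic,
	// ...existing ValueFrom handling unchanged...
}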

In `@scrapers/aws/cloudtrail.go`:
- Line 15: The import alias v1 is referencing a non-existent package root;
update the import to point to the actual package that defines the types (change
the import from v1 "github.com/flanksource/config-db/api" to v1
"github.com/flanksource/config-db/api/v1") so symbols like AWS, ScrapeResults,
ChangeResult and NewJSON resolve; alternatively add re-exports in the api
package root, but the quick fix is to correct the import path where v1 is used.

In `@scrapers/event.go`:
- Around line 10-12: The import alias v1 is pointing to the wrong package;
update the import so v1 refers to the package that actually defines the symbols
used (e.g., v1.ScrapeConfigFromModel, v1.InvolvedObject,
v1.InvolvedObjectFromObj). Replace the current import target for the v1 alias
with the v1 package path (change the import to reference
github.com/flanksource/config-db/api/v1) so all v1.* references resolve
correctly.

In `@scrapers/gcp/cloudsql_backup.go`:
- Line 13: The import change broke compilation because the file uses symbols
declared in api/v1; revert the import to reference the v1 package (e.g., use v1
"github.com/flanksource/config-db/api/v1") so GCP, ScrapeResults, ChangeResult,
CloudSQLInstance, GCPBackupRun and NewScrapeResult resolve correctly, or
alternately move/re-export those symbols into the root api package and update
references accordingly; the simplest fix is to restore the v1 import in
cloudsql_backup.go.

In `@scrapers/gcp/dns.go`:
- Line 12: The import path was changed incorrectly; restore the import to v1
"github.com/flanksource/config-db/api/v1" so the symbols ScrapeResults,
ScrapeResult, GCPManagedZone, and GCPResourceRecordSet referenced in
scrapers/gcp/dns.go resolve correctly; update the import statement in that file
to the api/v1 subpackage (v1) to avoid the compilation error.

In `@scrapers/gcp/security_center.go`:
- Line 9: The import statement using v1 "github.com/flanksource/config-db/api"
is incorrect because the types AnalysisResult, ExternalID, GCP, ScrapeResults
and ScrapeResult are declared in package v1; change the import to v1
"github.com/flanksource/config-db/api/v1" so the code referencing those symbols
(e.g., AnalysisResult, ScrapeResult, GCP) compiles, or alternatively re-export
those types from the root package if you prefer that design.

In `@scrapers/github/security_client.go`:
- Around line 51-57: AlertListOptions.CreatedAt is declared but never used when
building the alert list request, so time-constrained/incremental scrapes are
ineffective; update the code that constructs the GitHub security alerts query
(the logic that reads AlertListOptions in the alert listing path—e.g., functions
like ListAlerts, buildAlertQuery or the request parameter assembly) to include
CreatedAt as a filter/query parameter (convert the string to the appropriate
GitHub API time format if needed) and ensure it is added to the request URL or
body alongside State, Severity, Page, and PerPage so the API call actually
restricts results by creation time.

In `@scrapers/github/security.go`:
- Around line 127-137: The AlertListOptions construction currently only uses
stateFilter and ignores configured filters for severity and maxAge; update the
opts creation in scrapers/github/security.go (where filters := config.Filters
and stateFilter are defined) to propagate filters.Severity and filters.MaxAge
(or map to the expected field names on AlertListOptions, e.g., Severity and
MaxAge or AgeDays) so user-configured severity and maxAge are passed through to
AlertListOptions before passing opts to the list call; ensure you handle
empty/zero values the same way stateFilter is handled.
- Around line 148-170: The fetch loops for Dependabot, code scanning and secret
scanning alerts (calls to GetDependabotAlerts, GetCodeScanningAlerts,
GetSecretScanningAlerts) only request a single page (PerPage:100) and will
truncate results; change each fetch to iterate pagination: start with
opts.Page=1, call the client method, append results into alerts.dependabot /
alerts.codeScanning / alerts.secretScanning, then advance to the next page (use
the returned response.NextPage or increment opts.Page) and repeat until no more
pages or an error; preserve existing error handling and logging (ctx.Debugf) and
ensure maxAlertTime logic remains correct while aggregating across pages.
- Around line 45-51: The code blocks on time.Sleep(duration) after
client.ShouldPauseForRateLimit which prevents prompt shutdown; replace the
blocking sleep with a cancellation-aware wait using select: wait on
time.After(duration) and on ctx.Done(), so if ctx is canceled you stop waiting
and exit/return (or continue) promptly. Update the branch that currently calls
time.Sleep(duration) to use select { case <-time.After(duration): /* resume */
case <-ctx.Done(): ctx.Warnf("rate-limit wait canceled"); return/continue } and
keep existing logging (ctx.Warnf) and error handling around
client.ShouldPauseForRateLimit.
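
A sketch of the pagination loop and the cancellation-aware rate-limit wait for the security.go items above; the GetDependabotAlerts signature and the ShouldPauseForRateLimit return shape are assumed from the surrounding prompts:

// fetch every page instead of only the first 100 alerts
opts := AlertListOptions{State: stateFilter, PerPage: 100, Page: 1}
for {
	page, resp, err := client.GetDependabotAlerts(ctx, opts)
	if err != nil {
		return nil, err
	}
	alerts.dependabot = append(alerts.dependabot, page...)
	if resp == nil || resp.NextPage == 0 {
		break
	}
	opts.Page = resp.NextPage
}

// wait out a rate limit without blocking shutdown
if pause, duration := client.ShouldPauseForRateLimit(); pause {
	ctx.Warnf("rate limited, pausing for %s", duration)
	select {
	case <-time.After(duration):
		// resume scraping
	case <-ctx.Done():
		ctx.Warnf("rate-limit wait canceled")
		return nil, ctx.Err()
	}
}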

In `@scrapers/openssf/client.go`:
- Around line 122-129: The fetchScorecard HTTP response handling treats 429 as a
generic non-OK error so isRetryable won't catch rate-limits; update the response
checks in fetchScorecard (and the similar block used around the other status
handling) to explicitly treat http.StatusTooManyRequests (429) as a retryable
error — e.g., return an error value or type that is recognized by isRetryable
when resp.StatusCode == http.StatusTooManyRequests, and preserve the existing
body/error text for logging; ensure the same change is applied to the similar
status-check block referenced near the second occurrence so throttling triggers
retries.
- Around line 81-86: The retry backoff currently calls time.Sleep(backoff) which
ignores cancellation; replace that sleep with a context-aware wait (select on
time.After(backoff) and the request/context Done channel) so retries abort
promptly on cancel/timeout. Locate the retry block where repoFullName, attempt,
maxRetries and backoff are computed (the c.ctx.Tracef log and time.Sleep call)
and change the sleep to a select that returns/propagates the context error if
the context is cancelled instead of blocking the goroutine.
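
A sketch for the client.go items above, treating 429 as retryable and waiting in a context-aware way; the retry-loop variables (attempt, maxRetries, backoff) and the return shape are assumed from the prompt:

if resp.StatusCode == http.StatusTooManyRequests {
	c.ctx.Tracef("scorecard API throttled for %s (attempt %d/%d), backing off %s",
		repoFullName, attempt, maxRetries, backoff)
	select {
	case <-time.After(backoff):
		continue // retry the request
	case <-c.ctx.Done():
		return nil, c.ctx.Err()
	}
}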

In `@scrapers/openssf/scorecard.go`:
- Line 22: The package-level sync.Map LastScorecardScrapeTime currently holds
global state; change it to be scoped per-scraper instance or injectable so cache
doesn't leak across tests or config changes: add a field (e.g., lastScrape
sync.Map or map[string]time.Time with mutex) to the scraper struct used to run
scorecard scrapes (or accept a Cache interface via the scraper's constructor)
and update all uses of LastScorecardScrapeTime to reference that instance field;
ensure the cache honors ScorecardCacheTTL from the scraper's config and write
tests that inject a fresh cache to avoid cross-test contamination.
- Around line 53-57: The cache isn't updated for repos filtered out by MinScore
leading to repeated API calls; in the loop handling scorecard results
(referencing scorecard.Score and config.MinScore) call the cache update
(LastScorecardScrapeTime.Store or the existing scrape-time storage used before)
immediately after receiving the scorecard for repoFullName and before the
MinScore check so that shouldSkipScorecardFetch will hit on subsequent runs;
move the Store(...) invocation to precede the if that checks config.MinScore and
retains the existing timestamp/value you store now.
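
A sketch for the scorecard.go items above: per-instance cache state instead of the package-level sync.Map, with the timestamp stored before the MinScore filter; the Scraper type name is illustrative:

type Scraper struct {
	lastScrape sync.Map // repoFullName -> time.Time, scoped to this scraper instance
}

func (s *Scraper) shouldSkip(repoFullName string, ttl time.Duration) bool {
	if v, ok := s.lastScrape.Load(repoFullName); ok {
		return time.Since(v.(time.Time)) < ttl
	}
	return false
}

// record the scrape time before applying MinScore so filtered-out repos
// are not re-fetched on every run
s.lastScrape.Store(repoFullName, time.Now())
if scorecard.Score < config.MinScore {
	continue
}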

In `@scrapers/stale.go`:
- Line 8: The import was changed to the root package which is empty; revert the
import to the versioned package so the symbols like DeletedReasonStale are
available: change the import alias back to v1
"github.com/flanksource/config-db/api/v1" in stale.go so uses of
v1.DeletedReasonStale (and any other v1 types) compile.

---

Outside diff comments:
In `@scrapers/kubernetes/ketall.go`:
- Around line 53-56: The code currently ignores errors from retry.Do and always
returns nil; modify the function to capture the error returned by retry.Do
(e.g., err := retry.Do(...)), check if err != nil and return the collected objs
(if applicable) along with that err (or just return nil, err depending on
intended semantics), instead of always returning objs, nil; reference the
retry.Do call and the final return of objs to locate and fix the issue in
ketall.go.

---

Minor comments:
In `@api/plugin/convert_test.go`:
- Around line 164-178: The test name TestProtoToChangeResultWithEmptyDetails is
misleading because the test constructs a proto with non-empty JSON Details;
rename the test function to something like TestProtoToChangeResultWithDetails to
reflect behavior, updating the function declaration and any references to
TestProtoToChangeResultWithEmptyDetails so the test runner and readers see a
consistent, accurate name; ensure the function name in the file
(TestProtoToChangeResultWithEmptyDetails) is changed and any test-suite or CI
references (if present) are updated accordingly.

In `@api/plugin/go.mod`:
- Around line 405-407: The submodule go.mod is missing the replace that maps
github.com/glebarez/sqlite to the fork used at the repo root; add a replace
directive in api/plugin/go.mod (alongside the existing replace block that
contains "github.com/flanksource/config-db/api => ../v1") to point
github.com/glebarez/sqlite to github.com/clarkmcc/gorm-sqlite (use the same
version used in root go.mod), so builds of this submodule use the forked sqlite
implementation.

In `@api/plugin/proto/scraper.proto`:
- Around line 116-133: ChangeResultProto declares patches as string while
details is bytes (commented as JSON); make patches a bytes field to match other
JSON-encoded fields and keep a consistent pattern (update the ChangeResultProto
field "patches" to type bytes and add the same "JSON-encoded map" comment as on
"details"). Then update all call sites that treat ChangeResultProto.patches as a
string (places that convert it to []byte before json unmarshalling) to accept
the bytes field directly, regenerate the protobuf artifacts, and adjust test
fixtures to provide patches as raw bytes/JSON bytes.

In `@api/v1/utils.go`:
- Around line 15-23: The exported function Hash currently hides MD5 usage and
risks future security misuse; change it to a non-exported, purpose-specific name
(e.g., hashName or hashID) or replace the algorithm with a cryptographically
safe one (e.g., SHA-256) and keep the name if you intend a secure hash; update
all callers of Hash accordingly. Specifically, modify the function symbol Hash
in api/v1/utils.go to either (a) rename to a lowercase, purpose-scoped name like
hashName/hashID if it is only used for non-security IDs, or (b) switch the
internal md5.Sum call to a SHA-256 implementation and keep the exported API if
you need a secure hash, then adjust return encoding (hex) as needed; ensure all
references to Hash are updated to the new name or behavior.

In `@scrapers/aws/aws.go`:
- Line 493: The log call ctx.Logger.V(3).Infof currently prints len(allTaskArns)
which is the total but it's inside a per-chunk loop — update the message at the
ctx.Logger.V(3).Infof call to reference len(taskArns) for the current chunk (or
include both counts, e.g. "describing %d/%d ECS tasks" with len(taskArns) and
len(allTaskArns)) so logs reflect the chunk size being processed.

---

Duplicate comments:
In `@api/v1/github_security.go`:
- Around line 43-53: ParseMaxAge currently calls time.ParseDuration which
rejects values like "90d"; update ParseMaxAge on the GitHubSecurityFilters type
to accept a trailing "d" (days) by detecting if f.MaxAge ends with "d" (or "D"),
converting the numeric part to hours (days * 24h) and returning the
corresponding time.Duration, while still delegating to time.ParseDuration for
values without a day suffix and returning errors from parsing as before;
reference the MaxAge field and the ParseMaxAge function when making this change.

In `@api/v1/types.go`:
- Around line 128-134: The loops that call ApplyPlugins on BaseScraper are only
present for spec.GithubActions and spec.Azure; add equivalent loops for
spec.GitHubSecurity and spec.OpenSSFScorecard so their BaseScraper fields also
call ApplyPlugins(plugins...), e.g., iterate over spec.GitHubSecurity and
spec.OpenSSFScorecard with the same pattern used for spec.GithubActions/
spec.Azure to assign BaseScraper = BaseScraper.ApplyPlugins(plugins...).
- Around line 13-29: AllScraperConfigs is missing entries for the two new
scraper types declared in ScraperSpec; add the corresponding map entries so the
registry includes "githubsecurity" and "openssfscorecard". Update the map by
adding "githubsecurity": GitHubSecurity{} and "openssfscorecard":
OpenSSFScorecard{} to the AllScraperConfigs map (matching the existing lowercase
key -> PascalCase type pattern) so the new scraper types are discoverable.

In `@pkg/api/context.go`:
- Around line 165-167: The IsDebug() method dereferences ctx.scrapeConfig and
can panic when scrapeConfig is nil (NewScrapeContext may return nil
scrapeConfig); update IsDebug() to guard against nil by checking if
ctx.scrapeConfig == nil and returning false if so, mirroring the same nil-check
you should add to IsTrace() to keep behavior consistent; locate and modify the
IsDebug and IsTrace methods in ScrapeContext to perform the nil guard before
calling scrapeConfig.Spec.IsDebug()/IsTrace().
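A sketch of the nil guard, assuming value receivers as implied by the comment (the actual receiver form on ScrapeContext should be preserved):

func (ctx ScrapeContext) IsDebug() bool {
	if ctx.scrapeConfig == nil {
		return false
	}
	return ctx.scrapeConfig.Spec.IsDebug()
}

func (ctx ScrapeContext) IsTrace() bool {
	if ctx.scrapeConfig == nil {
		return false
	}
	return ctx.scrapeConfig.Spec.IsTrace()
}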

In `@scrapers/github/security_client.go`:
- Around line 23-34: The NewGitHubSecurityClient constructor allows creating a
client with an empty token which causes downstream 401s; after resolving token
in NewGitHubSecurityClient (the token variable set from connection.Password or
ctx.GetEnvValueFromCache) validate that token is non-empty and return a
descriptive error (e.g., "empty GitHub token" or "failed to resolve GitHub
token") instead of proceeding when token == ""; ensure you check both branches
(connection != nil and the env lookup) and fail fast from the function if the
token is empty.
- Around line 121-125: GetSecurityAdvisories currently ignores pagination
options; update the creation of reqOpts
(ListRepositorySecurityAdvisoriesOptions) inside GetSecurityAdvisories to
include the ListOptions with Page and PerPage from opts (e.g., set
reqOpts.ListOptions = github.ListOptions{Page: opts.Page, PerPage: opts.PerPage}
or equivalent) so the GitHub API call respects opts.Page and opts.PerPage when
listing repository security advisories.
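Sketches of both fixes; the variable and option names are taken from the comments above and from go-github conventions, and should be checked against the version actually vendored:

// in NewGitHubSecurityClient, after resolving the token from the
// connection or the env cache:
if token == "" {
	return nil, fmt.Errorf("failed to resolve GitHub token: token is empty")
}

// in GetSecurityAdvisories, forward the caller's pagination:
reqOpts.ListOptions = github.ListOptions{Page: opts.Page, PerPage: opts.PerPage}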

In `@scrapers/github/security.go`:
- Around line 89-101: The ScrapeResult struct field CreatedAt is a pointer but
the code assigns a time.Time directly; fix in the ScrapeResult construction
(symbol: result := v1.ScrapeResult{ ... CreatedAt: ... }) by creating a local
variable like t := repo.CreatedAt.GetTime() and then assigning the pointer
(CreatedAt: &t) so the field receives *time.Time instead of time.Time.
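A sketch of the pointer assignment, assuming the repository value comes from go-github, where GetCreatedAt() yields a Timestamp wrapping a time.Time (the surrounding ScrapeResult fields are abbreviated):

createdAt := repo.GetCreatedAt().Time // plain time.Time value
result := v1.ScrapeResult{
	// ... other fields ...
	CreatedAt: &createdAt, // *time.Time, as the field expects
}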

In `@scrapers/openssf/compliance.go`:
- Around line 10-11: The documentation comment above the ComplianceMappings
variable incorrectly references "plugins/vite-plugin-security.js"; update the
comment to reference the correct source (e.g., OpenSSF Scorecard documentation
or the actual mapping data source) and include a brief link or citation to that
source so the comment accurately reflects where the compliance mappings come
from (edit the comment that precedes ComplianceMappings in compliance.go to
replace the incorrect reference with the OpenSSF Scorecard docs or the internal
data source identifier).

In `@scrapers/processors/script.go`:
- Around line 33-36: The unmarshal error handler currently logs the full script
output variable s which may expose secrets; in the json.Unmarshal error branch
update the logger.Infof call (inside the block that checks
logger.V(5).Enabled()) to log only a sanitized, bounded preview of s (e.g.,
first N characters) plus the total length and the error from json.Unmarshal
rather than the full payload; use the existing variables json.Unmarshal, s,
results and logger.Infof so the change is localized to that error-handling block
and ensure any non-printable or sensitive patterns (like emails/URLs) are
redacted in the preview.
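A sketch of the bounded preview in the unmarshal error branch (constant and variable names follow the description above, not the actual file; redaction of emails/URLs is left as a comment):

if err := json.Unmarshal([]byte(s), &results); err != nil {
	if logger.V(5).Enabled() {
		const maxPreview = 256
		preview := s
		if len(preview) > maxPreview {
			preview = preview[:maxPreview]
		}
		// redact emails, URLs, or other sensitive patterns from preview here if needed
		logger.Infof("failed to unmarshal script output (len=%d): %v; preview: %q", len(s), err, preview)
	}
	// existing error handling continues unchanged
}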

---

Nitpick comments:
In @.github/workflows/api-lint-test.yml:
- Around line 19-21: The workflow currently hardcodes the Go version using
buildjet/setup-go@v5 with go-version: 1.25.x which can drift from the
submodule's go.mod; change the action inputs to use go-version-file instead and
point it at the file that contains the canonical Go version (e.g., the
submodule's go.mod), so replace the go-version parameter with go-version-file
and ensure the path to the go.mod (or a dedicated .go-version file) is correct
to keep versions in sync.

In `@api/plugin/convert_test.go`:
- Around line 58-90: Add assertions to TestChangeResultRoundTrip to verify the
pointer fields CreatedAt and CreatedBy survive the protobuf round-trip: after
converting original -> proto -> roundTrip (using changeResultToProto and
protoToChangeResult), compare that roundTrip.CreatedBy equals original.CreatedBy
(dereference pointers as needed) and that roundTrip.CreatedAt is non-nil and its
time value equals original.CreatedAt (use time.Equal or equivalent to compare
times after proto timestamp conversion). Ensure you handle nil pointers safely
and include clear failure messages referencing CreatedAt/CreatedBy.

In `@api/plugin/Makefile`:
- Around line 33-37: The proto Makefile target named "proto" currently assumes
protoc and its Go plugins are preinstalled; update the Makefile to either (a)
add install steps similar to the existing golangci-lint flow that will check for
and install protoc, protoc-gen-go and protoc-gen-go-grpc (with pinned versions)
before running the protoc command, or (b) add a clear comment block above the
"proto" target listing required tools and their versions (protoc, protoc-gen-go,
protoc-gen-go-grpc) and a one-line hint on how to install them; reference the
"proto" target and mirror the auto-install pattern used for the golangci-lint
target to avoid "command not found" errors.

In `@api/plugin/proto/scraper.proto`:
- Around line 1-19: The proto package declaration "package configdb.plugin" in
scraper.proto conflicts with Buf's expected directory layout (configdb/plugin);
to fix, either move scraper.proto into a directory matching the package (create
configdb/plugin and place scraper.proto there) or add a buf.yaml at the repo
root configuring a custom root to avoid the lint error; look for the package
line and the services ScraperPlugin and HostServices in scraper.proto when
applying the change so imports and go_package option remain correct.

In `@api/v1/Makefile`:
- Around line 39-42: The golangci-lint target currently only checks for the
binary's existence and can leave a mismatched version installed; update the
golangci-lint target to verify the installed version against
GOLANGCI_LINT_VERSION (similar to the controller-gen logic): invoke
$(LOCALBIN)/golangci-lint --version (or equivalent) and compare the output to
GOLANGCI_LINT_VERSION, and only run GOBIN=$(LOCALBIN) go install
github.com/golangci/golangci-lint/v2/cmd/golangci-lint@$(GOLANGCI_LINT_VERSION)
when the version differs or the binary is missing; use the existing variables
golangci-lint, GOLANGCI_LINT, GOLANGCI_LINT_VERSION, and LOCALBIN to implement
this check.

In `@go.mod`:
- Line 406: The go.mod entry currently pins sigs.k8s.io/structured-merge-diff/v6
to a pseudo-version commit (v6.3.2-0.20260122202528-d9cc6641c482); update this
to a stable released tag (e.g., v6.x.y) or otherwise remove the commit-specific
pseudo-version so the module uses a proper semantic version; locate the line
containing the module name "sigs.k8s.io/structured-merge-diff/v6" and replace
the pseudo-version with the intended stable tag (or run `go get
sigs.k8s.io/structured-merge-diff/v6@latest`/appropriate version and `go mod
tidy`) to ensure we don't rely on a fragile pre-release commit.

In `@Makefile`:
- Around line 87-88: The gotest Makefile target uses a direct shell expansion to
set KUBEBUILDER_ASSETS which bypasses the validate-envtest-assets check; update
the gotest target to invoke the existing validate-envtest-assets validation (the
same macro used by gotest-prod and gotest-load) before using
ENVTEST/ENVTEST_K8S_VERSION/LOCALBIN so that etcd and required binaries are
verified and a clear error is produced if validation fails—locate the gotest
target and replace the bare $(shell $(ENVTEST) use ...) usage with the validated
pattern (or add an explicit dependency on validate-envtest-assets) so
KUBEBUILDER_ASSETS is only determined after validation succeeds.

In `@scrapers/aws/cloudtrail.go`:
- Line 15: The import alias `v1` is misleading for the root `api` package;
change the import alias from `v1` to a clearer name like `api` (or `configapi`)
in scrapers/aws/cloudtrail.go and update all references (e.g., any occurrences
of `v1.` used in functions or types in this file) to the new alias so the
package name reflects the actual package; locate references such as `v1.` in
methods and structs in this file and replace them with `api.` (or `configapi.`)
consistently.

In `@scrapers/openssf/scorecard.go`:
- Around line 139-143: Replace the hardcoded "failing"/"passing" literals
assigned to analysis.Status with well-named constants (e.g.,
models.StatusFailing and models.StatusPassing or local consts
StatusFailing/StatusPassing) and update the import if using the models package;
change the branch in the check.Score handling inside the same function (where
analysis.Status and check.Score are used) to assign the constants instead of raw
strings so the status values are centralized and won't drift.
- Around line 265-272: The local helper containsString duplicates the standard
library; remove the containsString function and replace its call sites with
slices.Contains(slice, item). Add the standard library import "slices" to the
file (or group it in the existing import block) and update any references to use
slices.Contains where containsString was used.
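Sketches of both nitpicks (constant names are illustrative; the repo may prefer constants from the models package instead):

const (
	statusPassing = "passing"
	statusFailing = "failing"
)

// in the check.Score branch:
analysis.Status = statusFailing // or statusPassing on the other branch

// replacing the local helper with the standard library (import "slices"):
if slices.Contains(items, item) {
	// ...
}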

Comment thread api/plugin/convert.go Outdated
Comment on lines +386 to +393
func externalRoleToProto(r models.ExternalRole) *pb.ExternalRoleProto {
	return &pb.ExternalRoleProto{
		Id:          r.ID.String(),
		AccountId:   r.AccountID,
		ScraperId:   lo.FromPtr(r.ScraperID).String(),
		Name:        r.Name,
		Description: r.Description,
	}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, explore the codebase structure
find . -type f -name "convert.go" | head -20

Repository: flanksource/config-db

Length of output: 88


🏁 Script executed:

# Examine the specific file
head -30 api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 820


🏁 Script executed:

# Check the lines mentioned (386-393 and 506-511)
sed -n '380,415p' api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 912


🏁 Script executed:

# Search for all lo.FromPtr patterns in the file
rg -n 'lo\.FromPtr' api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 180


🏁 Script executed:

# Check for the models.ExternalRole definition to understand ScraperID type
rg -n 'type ExternalRole struct' --type go

Repository: flanksource/config-db

Length of output: 47


🏁 Script executed:

# Look for UUID handling and imports in convert.go
head -50 api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 1296


🏁 Script executed:

# Let me check what lo.FromPtr does - search for documentation or usage patterns
# First, let's understand uuid package behavior with nil and zero UUID
cat > /tmp/test_uuid.go << 'EOF'
package main
import (
	"fmt"
	"github.com/google/uuid"
	"github.com/samber/lo"
)

func main() {
	// Test 1: nil pointer
	var nilPtr *uuid.UUID = nil
	fmt.Println("nil pointer:", lo.FromPtr(nilPtr).String())
	
	// Test 2: zero UUID
	zeroUUID := uuid.UUID{}
	fmt.Println("zero UUID:", zeroUUID.String())
	
	// Test 3: non-nil pointer to zero UUID
	ptrToZero := &zeroUUID
	fmt.Println("ptr to zero UUID:", lo.FromPtr(ptrToZero).String())
}
EOF

cd /tmp && go run test_uuid.go 2>&1 || echo "Build failed, checking go.mod"

Repository: flanksource/config-db

Length of output: 428


🏁 Script executed:

# Let's trace through the actual proto conversion flow
# Check the lines around 506-511 to see the second occurrence
sed -n '500,520p' api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 847


🏁 Script executed:

# Let's understand the types more - check ExternalRole and ExternalConfigAccess structs
rg -A 10 'type ExternalRole struct' --type go

Repository: flanksource/config-db

Length of output: 47


🏁 Script executed:

# Check how lo.FromPtr behaves with UUID - search samber/lo documentation approach
# Let's look for test cases that might use this pattern
rg -n 'ScraperID' api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 1012


🏁 Script executed:

# Check if there are existing tests for these conversion functions
find . -type f -name "*convert*test*" -o -name "*test*convert*" | grep -E '\.go$'

Repository: flanksource/config-db

Length of output: 93


🏁 Script executed:

# Search for how the code handles round-trip conversions
rg -B 5 -A 5 'protoToExternalRole' api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 659


🏁 Script executed:

# Let's check the proto definition to understand the field type
find . -type f -name "*.proto" | xargs grep -l "ExternalRoleProto" 2>/dev/null

Repository: flanksource/config-db

Length of output: 97


🏁 Script executed:

# Examine the proto file for ExternalRoleProto and ExternalConfigAccessProto
find . -name "*.proto" -exec grep -l "ExternalRoleProto\|ExternalConfigAccessProto" {} \;

Repository: flanksource/config-db

Length of output: 97


🏁 Script executed:

# Check the convert_test.go file to see if there are tests for nil handling
sed -n '1,100p' api/plugin/convert_test.go

Repository: flanksource/config-db

Length of output: 3134


🏁 Script executed:

# Search for test patterns related to nil ScraperID
rg -n 'ScraperID.*nil\|nil.*ScraperID' api/plugin/convert_test.go

Repository: flanksource/config-db

Length of output: 47


🏁 Script executed:

# Let me check what go.mod says about samber/lo version
cat go.mod | grep -E 'samber|uuid'

Repository: flanksource/config-db

Length of output: 294


🏁 Script executed:

# Let's look for lo.FromPtr documentation or other similar usages in codebase
rg -n 'lo\.FromPtr' --type go

Repository: flanksource/config-db

Length of output: 23910


🏁 Script executed:

# Check the proto definition for ExternalRoleProto and ExternalConfigAccessProto
cat api/plugin/proto/scraper.proto | grep -A 20 "message ExternalRoleProto"

Repository: flanksource/config-db

Length of output: 462


🏁 Script executed:

# Also check for ExternalConfigAccessProto
cat api/plugin/proto/scraper.proto | grep -A 30 "message ExternalConfigAccessProto"

Repository: flanksource/config-db

Length of output: 820


🏁 Script executed:

# Let's verify the exact behavior by checking how the reverse conversion handles the zero UUID
sed -n '395,415p' api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 561


🏁 Script executed:

# Check the reverse conversion for ExternalConfigAccess
sed -n '535,560p' api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 798


🏁 Script executed:

# Let's count how many places have this pattern - search for similar nil ScraperID handling
rg -B 2 -A 2 'if.*ScraperID.*nil' api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 47


🏁 Script executed:

# Check if there are other similar UUID pointer serializations
rg -n '\.FromPtr.*UUID.*\.String\(\)' --type go

Repository: flanksource/config-db

Length of output: 47


🏁 Script executed:

# Check if there are any tests for the round-trip conversion of ExternalRole with nil ScraperID
rg -A 30 'ExternalRole' api/plugin/convert_test.go | grep -i scraper

Repository: flanksource/config-db

Length of output: 47


🏁 Script executed:

# Let's examine if there are other pointer-to-string conversions that might have similar issues
sed -n '415,430p' api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 431


🏁 Script executed:

# Check externalUserToProto for similar pattern
sed -n '415,445p' api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 732


🏁 Script executed:

# And externalGroupToProto
sed -n '450,485p' api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 890


🏁 Script executed:

# Let's check the full context around all lo.FromPtr usages in the convert.go file to see if there are other problem areas
rg -B 3 -A 3 'lo\.FromPtr.*\.String\(\)' api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 584


🏁 Script executed:

# Let's verify the samber/lo library behavior with FromPtr on nil
# by checking the library documentation or source
python3 << 'EOF'
# Simulate what lo.FromPtr does with a nil UUID pointer
# FromPtr returns the dereferenced value or zero value if pointer is nil

# For UUID type (google/uuid package):
# zero UUID is 00000000-0000-0000-0000-000000000000

# When lo.FromPtr(nil_uuid_pointer) is called, it returns UUID{}
# When .String() is called on UUID{}, it returns "00000000-0000-0000-0000-000000000000"

print("nil UUID pointer -> UUID{} -> String() = '00000000-0000-0000-0000-000000000000'")
print("This is the root cause of the issue described in the review")
EOF

Repository: flanksource/config-db

Length of output: 204


🏁 Script executed:

# Verify that ExternalRole.ScraperID is indeed a pointer
rg -B 5 -A 10 'type ExternalRole struct' --type go

Repository: flanksource/config-db

Length of output: 47


🏁 Script executed:

# Check ExternalConfigAccess definition
rg -B 5 -A 15 'type ExternalConfigAccess struct' --type go

Repository: flanksource/config-db

Length of output: 1111


Preserve nil ScraperID pointers during proto conversion.

lo.FromPtr(...) on a nil UUID pointer returns the zero UUID 00000000-0000-0000-0000-000000000000, which round-trips as a non-nil pointer. The reverse conversion (protoToExternalRole and protoToExternalConfigAccess) only checks if p.ScraperId != "" before parsing, so the zero UUID gets deserialized back as a concrete pointer, losing the original nil state.

🔧 Proposed fix
 func externalRoleToProto(r models.ExternalRole) *pb.ExternalRoleProto {
-	return &pb.ExternalRoleProto{
+	result := &pb.ExternalRoleProto{
 		Id:          r.ID.String(),
 		AccountId:   r.AccountID,
-		ScraperId:   lo.FromPtr(r.ScraperID).String(),
 		Name:        r.Name,
 		Description: r.Description,
 	}
+	if r.ScraperID != nil {
+		result.ScraperId = r.ScraperID.String()
+	}
+	return result
 }

 func externalConfigAccessToProto(ca v1.ExternalConfigAccess) *pb.ExternalConfigAccessProto {
 	result := &pb.ExternalConfigAccessProto{
 		Id:                   ca.ID,
 		ConfigId:             ca.ConfigID.String(),
-		ScraperId:            lo.FromPtr(ca.ScraperID).String(),
 		ExternalUserAliases:  ca.ExternalUserAliases,
 		ExternalRoleAliases:  ca.ExternalRoleAliases,
 		ExternalGroupAliases: ca.ExternalGroupAliases,
@@ -517,6 +520,9 @@ func externalConfigAccessToProto(ca v1.ExternalConfigAccess) *pb.ExternalConfi
 		},
 	}
+	if ca.ScraperID != nil {
+		result.ScraperId = ca.ScraperID.String()
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/plugin/convert.go` around lines 386 - 393, externalRoleToProto (and
similarly proto conversions for configs) currently calls lo.FromPtr on
r.ScraperID which turns nil into the zero-UUID string; change the conversion so
ScraperId is only set when the original pointer is non-nil (e.g. if r.ScraperID
!= nil then r.ScraperID.String() else ""), so the proto field remains empty for
a nil pointer and the existing protoToExternalRole/protoToExternalConfigAccess
logic (which checks p.ScraperId != "") will correctly preserve nil on
round-trip; update externalRoleToProto to avoid lo.FromPtr for ScraperID and
apply the same pattern where other ScraperID conversions occur.

Comment thread api/plugin/convert.go Outdated
Comment on lines +396 to +409
func protoToExternalRole(p *pb.ExternalRoleProto) models.ExternalRole {
	role := models.ExternalRole{
		Name:        p.Name,
		AccountID:   p.AccountId,
		Description: p.Description,
	}
	if p.Id != "" {
		role.ID = uuid.MustParse(p.Id)
	}
	if p.ScraperId != "" {
		id := uuid.MustParse(p.ScraperId)
		role.ScraperID = &id
	}
	return role

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, verify the file exists and check the code at the mentioned lines
cd api/plugin && wc -l convert.go

Repository: flanksource/config-db

Length of output: 79


🏁 Script executed:

# Check if the file exists and read specific line ranges
cat -n api/plugin/convert.go | sed -n '396,409p'

Repository: flanksource/config-db

Length of output: 500


🏁 Script executed:

# Find all uuid.MustParse occurrences in the conversion file
rg -n 'uuid\.MustParse' api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 765


🏁 Script executed:

# Find all protoTo conversion functions to understand the pattern
rg -n '^func protoTo' api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 915


🏁 Script executed:

# Check the context of the proto parameters - are they plugin-provided untrusted values?
cat -n api/plugin/convert.go | sed -n '1,50p'

Repository: flanksource/config-db

Length of output: 1646


🏁 Script executed:

# Check import statements to understand the uuid library being used
cat -n api/plugin/convert.go | head -30

Repository: flanksource/config-db

Length of output: 1030


🏁 Script executed:

# View the full protoToExternalUser function to verify the pattern
cat -n api/plugin/convert.go | sed -n '430,453p'

Repository: flanksource/config-db

Length of output: 752


🏁 Script executed:

# View protoToExternalConfigAccess to check all mentioned locations
cat -n api/plugin/convert.go | sed -n '536,577p'

Repository: flanksource/config-db

Length of output: 1472


🏁 Script executed:

# View protoToExternalGroup function
cat -n api/plugin/convert.go | sed -n '470,490p'

Repository: flanksource/config-db

Length of output: 695


🏁 Script executed:

# View protoToExternalConfigAccessLog function
cat -n api/plugin/convert.go | sed -n '593,600p'

Repository: flanksource/config-db

Length of output: 416


🏁 Script executed:

# Verify uuid.Parse behavior - search for any existing error handling patterns
rg -B2 -A2 'uuid\.Parse\(' api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 47


🏁 Script executed:

# View protoToExternalUserGroup function (lines 499-503)
cat -n api/plugin/convert.go | sed -n '499,510p'

Repository: flanksource/config-db

Length of output: 653


🏁 Script executed:

# Search for any existing uuid.Parse usage with error handling in the repo
rg -r 'uuid\.Parse' --type go | head -20

Repository: flanksource/config-db

Length of output: 461


🏁 Script executed:

# Check if there's a logger.Warnf pattern used elsewhere for validation errors
rg 'logger\.Warnf' api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 177


🏁 Script executed:

# Verify that uuid.MustParse actually panics - check google/uuid documentation or source
# Also check if there's a handler for panics in the plugin layer
rg -r 'recover\|defer' api/plugin/ --type go | head -10

Repository: flanksource/config-db

Length of output: 593


🏁 Script executed:

# Check if these functions are called from plugin handlers and what error handling exists
rg 'protoToExternal' api/plugin/convert.go

Repository: flanksource/config-db

Length of output: 1086


🏁 Script executed:

# Let me verify the context where these functions are called and check for error handling
cat -n api/plugin/convert.go | sed -n '155,175p'

Repository: flanksource/config-db

Length of output: 926


🏁 Script executed:

# Check the calling context to understand the flow
rg -B10 'protoToExternalRole\|protoToExternalUser\|protoToExternalGroup' api/plugin/convert.go | head -50

Repository: flanksource/config-db

Length of output: 47


Replace all uuid.MustParse calls with error-safe parsing in plugin conversion functions.

Plugin-provided UUIDs in ExternalRole, ExternalUser, ExternalGroup, ExternalUserGroup, ExternalConfigAccess, and ExternalConfigAccessLog conversion functions can crash the process if malformed. Use uuid.Parse with error handling instead.

Affected lines: 403, 406, 437, 440, 477, 480, 501, 502, 544, 547, 551, 555, 559, 596, 599

Example fix for line 403:

-	if p.Id != "" {
-		role.ID = uuid.MustParse(p.Id)
-	}
+	if p.Id != "" {
+		if id, err := uuid.Parse(p.Id); err == nil {
+			role.ID = id
+		} else {
+			logger.Warnf("invalid external role id %q: %v", p.Id, err)
+		}
+	}

Apply the same pattern to all 15 uuid.MustParse occurrences in the file.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/plugin/convert.go` around lines 396 - 409, The conversion functions
(e.g., protoToExternalRole, protoToExternalUser, protoToExternalGroup,
protoToExternalUserGroup, protoToExternalConfigAccess,
protoToExternalConfigAccessLog) currently call uuid.MustParse which will panic
on malformed plugin input; replace each uuid.MustParse call with uuid.Parse,
check the returned error, and only set the model UUID fields when parsing
succeeds (for pointer fields set to &id on success, for value fields set id on
success); on parse error either leave the field unset/nil or log the invalid
UUID (using existing logger) before continuing so malformed plugin-provided IDs
do not crash the process.

Comment thread api/plugin/grpc_plugin.go Outdated
Comment on lines +29 to +37
func (p *GRPCPlugin) GRPCClient(ctx context.Context, broker *plugin.GRPCBroker, c *grpc.ClientConn) (interface{}, error) {
	hostConn, err := broker.Dial(1)
	if err != nil {
		return nil, err
	}
	return &grpcClient{
		client:     pb.NewScraperPluginClient(c),
		hostClient: pb.NewHostServicesClient(hostConn),
	}, nil

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether host-service client is actually used and where it's registered.
rg -nP '\bhostClient\b' --type go api/plugin
rg -nP 'RegisterHostServices|HostServicesServer|NewHostServicesClient|broker\.Dial\(' --type go api/plugin

Repository: flanksource/config-db

Length of output: 3149


🏁 Script executed:

cat -n api/plugin/grpc_plugin.go

Repository: flanksource/config-db

Length of output: 3268


🏁 Script executed:

rg -nP '\bhostClient\b' --type go --context 3 api/plugin/grpc_plugin.go

Repository: flanksource/config-db

Length of output: 438


🏁 Script executed:

rg -nP 'grpcClient|ScraperPlugin' --type go api/plugin | head -40

Repository: flanksource/config-db

Length of output: 4141


Remove unused hostClient field or implement missing host service calls.

The hostClient is initialized via broker.Dial(1) at line 30 but never consumed by any grpcClient method. This creates an unused gRPC client and an unclosed connection that leaks resources. Either remove the field and the dial call, or implement the intended host-service integration if this was left incomplete.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/plugin/grpc_plugin.go` around lines 29 - 37, The GRPCClient
implementation dials a host connection and sets grpcClient.hostClient (via
broker.Dial(1) and hostConn) but never uses or closes it, leaking resources;
remove the unused hostClient and the broker.Dial call: update
GRPCPlugin.GRPCClient to stop calling broker.Dial(1) and construct
&grpcClient{client: pb.NewScraperPluginClient(c)} only, and also remove the
hostClient field from the grpcClient struct (and any related imports/fields) so
there is no unused client or open connection; alternatively, if host services
are required, ensure the hostConn is stored, hostClient methods are invoked from
grpcClient methods, and hostConn is closed when no longer needed.

Comment thread api/plugin/types.go Outdated
Comment on lines +35 to +70
func EnvVarFromDuty(e types.EnvVar) EnvVar {
	ev := EnvVar{
		ValueStatic: e.ValueStatic,
	}
	if e.ValueFrom != nil && e.ValueFrom.ConfigMapKeyRef != nil {
		ev.ValueFromConfigMapKeyRefName = e.ValueFrom.ConfigMapKeyRef.Name
		ev.ValueFromConfigMapKeyRefKey = e.ValueFrom.ConfigMapKeyRef.Key
	}
	if e.ValueFrom != nil && e.ValueFrom.SecretKeyRef != nil {
		ev.ValueFromSecretKeyRefName = e.ValueFrom.SecretKeyRef.Name
		ev.ValueFromSecretKeyRefKey = e.ValueFrom.SecretKeyRef.Key
	}
	return ev
}

func (e EnvVar) ToDuty() types.EnvVar {
	ev := types.EnvVar{
		ValueStatic: e.ValueStatic,
	}
	if e.ValueFromConfigMapKeyRefName != "" || e.ValueFromSecretKeyRefName != "" {
		ev.ValueFrom = &types.EnvVarSource{}
	}
	if e.ValueFromConfigMapKeyRefName != "" {
		ev.ValueFrom.ConfigMapKeyRef = &types.ConfigMapKeySelector{
			LocalObjectReference: types.LocalObjectReference{Name: e.ValueFromConfigMapKeyRefName},
			Key:                  e.ValueFromConfigMapKeyRefKey,
		}
	}
	if e.ValueFromSecretKeyRefName != "" {
		ev.ValueFrom.SecretKeyRef = &types.SecretKeySelector{
			LocalObjectReference: types.LocalObjectReference{Name: e.ValueFromSecretKeyRefName},
			Key:                  e.ValueFromSecretKeyRefKey,
		}
	}
	return ev
}

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# First, find where types.EnvVar is defined
echo "=== Searching for types.EnvVar definition ==="
rg -n "type EnvVar struct" --type go -A 10

# Also search for EnvVar imports to understand the source
echo -e "\n=== Searching for EnvVar imports ==="
rg -n "import.*EnvVar|types\.EnvVar" --type go api/plugin/types.go

Repository: flanksource/config-db

Length of output: 864


🏁 Script executed:

#!/bin/bash
# Let's check the current api/plugin/types.go file to see the actual code
echo "=== Checking api/plugin/types.go ==="
cat -n api/plugin/types.go | head -80

Repository: flanksource/config-db

Length of output: 2814


🏁 Script executed:

#!/bin/bash
# Search for the duty types package location
echo "=== Locating duty types package ==="
fd -e go -x rg -l "package.*types|type EnvVar" {} \; | head -20
rg -n "type EnvVar" --type go -B 2 -A 8

Repository: flanksource/config-db

Length of output: 3177


EnvVarFromDuty and ToDuty silently drop the Name field.

The local EnvVar struct (line 26) includes a Name field, but EnvVarFromDuty (line 35) never copies e.Name to the returned struct, and ToDuty (line 50) never restores it. This causes data loss on any round-trip conversion through the plugin boundary, breaking any caller that depends on preserving environment variable names.

Fix
 func EnvVarFromDuty(e types.EnvVar) EnvVar {
 	ev := EnvVar{
+		Name:        e.Name,
 		ValueStatic: e.ValueStatic,
 	}
 func (e EnvVar) ToDuty() types.EnvVar {
 	ev := types.EnvVar{
+		Name:        e.Name,
 		ValueStatic: e.ValueStatic,
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func EnvVarFromDuty(e types.EnvVar) EnvVar {
	ev := EnvVar{
		ValueStatic: e.ValueStatic,
	}
	if e.ValueFrom != nil && e.ValueFrom.ConfigMapKeyRef != nil {
		ev.ValueFromConfigMapKeyRefName = e.ValueFrom.ConfigMapKeyRef.Name
		ev.ValueFromConfigMapKeyRefKey = e.ValueFrom.ConfigMapKeyRef.Key
	}
	if e.ValueFrom != nil && e.ValueFrom.SecretKeyRef != nil {
		ev.ValueFromSecretKeyRefName = e.ValueFrom.SecretKeyRef.Name
		ev.ValueFromSecretKeyRefKey = e.ValueFrom.SecretKeyRef.Key
	}
	return ev
}

func (e EnvVar) ToDuty() types.EnvVar {
	ev := types.EnvVar{
		ValueStatic: e.ValueStatic,
	}
	if e.ValueFromConfigMapKeyRefName != "" || e.ValueFromSecretKeyRefName != "" {
		ev.ValueFrom = &types.EnvVarSource{}
	}
	if e.ValueFromConfigMapKeyRefName != "" {
		ev.ValueFrom.ConfigMapKeyRef = &types.ConfigMapKeySelector{
			LocalObjectReference: types.LocalObjectReference{Name: e.ValueFromConfigMapKeyRefName},
			Key:                  e.ValueFromConfigMapKeyRefKey,
		}
	}
	if e.ValueFromSecretKeyRefName != "" {
		ev.ValueFrom.SecretKeyRef = &types.SecretKeySelector{
			LocalObjectReference: types.LocalObjectReference{Name: e.ValueFromSecretKeyRefName},
			Key:                  e.ValueFromSecretKeyRefKey,
		}
	}
	return ev
}

func EnvVarFromDuty(e types.EnvVar) EnvVar {
	ev := EnvVar{
		Name:        e.Name,
		ValueStatic: e.ValueStatic,
	}
	if e.ValueFrom != nil && e.ValueFrom.ConfigMapKeyRef != nil {
		ev.ValueFromConfigMapKeyRefName = e.ValueFrom.ConfigMapKeyRef.Name
		ev.ValueFromConfigMapKeyRefKey = e.ValueFrom.ConfigMapKeyRef.Key
	}
	if e.ValueFrom != nil && e.ValueFrom.SecretKeyRef != nil {
		ev.ValueFromSecretKeyRefName = e.ValueFrom.SecretKeyRef.Name
		ev.ValueFromSecretKeyRefKey = e.ValueFrom.SecretKeyRef.Key
	}
	return ev
}

func (e EnvVar) ToDuty() types.EnvVar {
	ev := types.EnvVar{
		Name:        e.Name,
		ValueStatic: e.ValueStatic,
	}
	if e.ValueFromConfigMapKeyRefName != "" || e.ValueFromSecretKeyRefName != "" {
		ev.ValueFrom = &types.EnvVarSource{}
	}
	if e.ValueFromConfigMapKeyRefName != "" {
		ev.ValueFrom.ConfigMapKeyRef = &types.ConfigMapKeySelector{
			LocalObjectReference: types.LocalObjectReference{Name: e.ValueFromConfigMapKeyRefName},
			Key:                  e.ValueFromConfigMapKeyRefKey,
		}
	}
	if e.ValueFromSecretKeyRefName != "" {
		ev.ValueFrom.SecretKeyRef = &types.SecretKeySelector{
			LocalObjectReference: types.LocalObjectReference{Name: e.ValueFromSecretKeyRefName},
			Key:                  e.ValueFromSecretKeyRefKey,
		}
	}
	return ev
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/plugin/types.go` around lines 35 - 70, The conversion functions
EnvVarFromDuty and (EnvVar) ToDuty are dropping the EnvVar.Name field; update
EnvVarFromDuty to copy e.Name into the returned EnvVar.Name and update (EnvVar)
ToDuty to set types.EnvVar.Name from e.Name so the name round-trips correctly;
locate the conversions in EnvVarFromDuty and ToDuty and assign the Name field on
both sides (EnvVar and types.EnvVar) while leaving the existing ValueStatic and
ValueFrom handling intact.

Comment thread scrapers/aws/cloudtrail.go Outdated
"github.com/samber/lo"

v1 "github.com/flanksource/config-db/api/v1"
v1 "github.com/flanksource/config-db/api"

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Search for the types that must be exported from the new api package

echo "=== Checking api package structure ==="
find api -name "*.go" -type f | head -20

echo -e "\n=== Searching for type/func definitions in api package (excluding api/v1) ==="
rg -n "^(type|func)\s+(AWS|ScrapeResults|ChangeResult|NewJSON)" api --type-list | head -50 || echo "Using alternative search..."

echo -e "\n=== Alternative search for AWS type ==="
rg -n "type\s+AWS\s" api --type go

echo -e "\n=== Alternative search for ScrapeResults type ==="
rg -n "type\s+ScrapeResults\s" api --type go

echo -e "\n=== Alternative search for ChangeResult type ==="
rg -n "type\s+ChangeResult\s" api --type go

echo -e "\n=== Alternative search for NewJSON function ==="
rg -n "func\s+NewJSON" api --type go

echo -e "\n=== Check if api/v1 still exists ==="
ls -la api/v1 2>/dev/null | head -5 || echo "api/v1 directory not found (or no permissions)"

echo -e "\n=== List files in api package root ==="
ls -la api/*.go 2>/dev/null | head -20 || echo "No .go files in api root"

Repository: flanksource/config-db

Length of output: 1698


🏁 Script executed:

#!/bin/bash
# Check if api package root has any re-export files

echo "=== Files directly in api/ root ==="
find api -maxdepth 1 -name "*.go" -type f

echo -e "\n=== Check api/go.mod or package declaration ==="
head -20 api/*.go 2>/dev/null || echo "No .go files in api/"

echo -e "\n=== Search for re-exports of AWS, ScrapeResults, ChangeResult, NewJSON ==="
rg -n "^\s*(AWS|ScrapeResults|ChangeResult|NewJSON)" api --type go -B 2 | grep -v "^api/v1" | head -30 || echo "No re-exports found outside api/v1"

echo -e "\n=== Check if scrapers/aws/cloudtrail.go is the only file with new import ==="
rg -l "github.com/flanksource/config-db/api\"$" . --type go | grep -v "api/v1" | head -20

Repository: flanksource/config-db

Length of output: 949


🏁 Script executed:

#!/bin/bash
# Look for initialization files or re-export patterns

echo "=== Search for api.go, init.go, types.go in api root ==="
find api -maxdepth 1 \( -name "*.go" -o -name "go.mod" \) -type f

echo -e "\n=== Check git status to see what changed ==="
cd scrapers/aws && git diff cloudtrail.go 2>/dev/null | head -40

echo -e "\n=== Look at the import block of cloudtrail.go ==="
head -30 scrapers/aws/cloudtrail.go

echo -e "\n=== Check if api/v1 package is re-exported via api package somehow ==="
cd api && cat go.mod 2>/dev/null | head -20 || echo "No go.mod in api"

echo -e "\n=== Verify that types in api/v1 are still there ==="
head -20 api/v1/aws.go

Repository: flanksource/config-db

Length of output: 591


🏁 Script executed:

#!/bin/bash
# Find the actual file paths

echo "=== Search for cloudtrail.go ==="
find . -name "cloudtrail.go" -type f

echo -e "\n=== Search for api directory ==="
find . -maxdepth 2 -name "api" -type d

echo -e "\n=== List repository structure ==="
ls -la

Repository: flanksource/config-db

Length of output: 2704


🏁 Script executed:

#!/bin/bash
# Now examine the actual file and api structure

echo "=== View cloudtrail.go imports (first 30 lines) ==="
head -30 ./scrapers/aws/cloudtrail.go

echo -e "\n=== List api/ directory contents ==="
find ./api -type f -name "*.go" | sort

echo -e "\n=== List pkg/api/ directory contents ==="
find ./pkg/api -type f -name "*.go" 2>/dev/null | sort || echo "pkg/api has no .go files"

echo -e "\n=== Search for AWS type in api/ ==="
grep -rn "^type AWS" ./api --include="*.go"

echo -e "\n=== Search for ScrapeResults type in api/ ==="
grep -rn "^type ScrapeResults" ./api --include="*.go"

echo -e "\n=== Search for ChangeResult type in api/ ==="
grep -rn "^type ChangeResult" ./api --include="*.go"

echo -e "\n=== Search for NewJSON function in api/ ==="
grep -rn "^func NewJSON" ./api --include="*.go"

Repository: flanksource/config-db

Length of output: 2414


🏁 Script executed:

#!/bin/bash
# Check if the api/v1 package is actually part of the api package namespace

echo "=== Check package declaration in api/v1/aws.go ==="
head -15 ./api/v1/aws.go

echo -e "\n=== Check if there's any api.go or init.go in api/ root ==="
ls -la ./api/ | grep -E "\.(go|mod)$"

echo -e "\n=== Check api/v1 groupversion_info.go (likely contains package meta) ==="
cat ./api/v1/groupversion_info.go

echo -e "\n=== Verify: does api package exist as a module/package? ==="
grep -r "^package api[^v]" ./api --include="*.go" | head -5

echo -e "\n=== Check for any import remapping in api/v1 ==="
grep -n "import" ./api/v1/aws.go | head -10

Repository: flanksource/config-db

Length of output: 1823


🏁 Script executed:

#!/bin/bash
# Critical check: verify if api package even exists or if this is a module structure issue

echo "=== Check if api is a Go module itself ==="
cat ./go.mod | grep -A 5 "^module"

echo -e "\n=== Search for 'package api' (not v1) anywhere ==="
grep -rn "^package api$" . --include="*.go" | grep -v "/v1/" | head -20

echo -e "\n=== Check if api/ has a go.mod file ==="
ls -la ./api/go.mod 2>/dev/null || echo "No go.mod in api/"

echo -e "\n=== Try to understand the api directory structure ==="
tree -L 2 ./api/ 2>/dev/null || find ./api -maxdepth 2 -type d

echo -e "\n=== Final verification: what package does cloudtrail.go actually get with 'v1'? ==="
echo "The import 'v1 \"github.com/flanksource/config-db/api\"' would resolve to:"
echo "- Files in api/ root with 'package api'"
echo "- OR api/api.go with 'package api'"
echo "Let's check what exists:"
ls -la ./api/*.go 2>/dev/null || echo "NO .go FILES IN api/ ROOT"

Repository: flanksource/config-db

Length of output: 1688


🏁 Script executed:

#!/bin/bash
# Final verification: this import should fail at compile time

echo "=== Confirm: there is NO 'package api' in api/ directory ==="
find ./api -name "*.go" -type f -exec grep -l "^package api$" {} \;

echo -e "\n=== Verify package declarations in api/ ==="
find ./api -maxdepth 1 -name "*.go" -type f -exec head -1 {} \;
echo "(None found above = no .go files in api/ root)"

echo -e "\n=== The types ARE in api/v1 with 'package v1' ==="
head -1 ./api/v1/aws.go
head -1 ./api/v1/interface.go
head -1 ./api/v1/json_types.go

echo -e "\n=== Conclusion: Import path mismatch ==="
echo "Current import in cloudtrail.go: v1 \"github.com/flanksource/config-db/api\""
echo "Actual package location: ./api/v1/ (package v1)"
echo "This import will FAIL because api/ has no Go package (no .go files with 'package api')"
echo ""
echo "Fix should be either:"
echo "1. Import should be: v1 \"github.com/flanksource/config-db/api/v1\""
echo "2. OR api/ root needs a package with re-exports from api/v1"

Repository: flanksource/config-db

Length of output: 702


The import path is incorrect and will cause a compilation error.

The import was changed to v1 "github.com/flanksource/config-db/api", but the api package doesn't exist—there are no .go files in the api/ root directory. The types (AWS, ScrapeResults, ChangeResult, NewJSON) are all defined in api/v1/ with package v1 declaration. Go will fail to resolve the import.

The import must either be reverted to v1 "github.com/flanksource/config-db/api/v1" or the api/ package root must be populated with re-exports from api/v1/.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/aws/cloudtrail.go` at line 15, The import alias v1 is referencing a
non-existent package root; update the import to point to the actual package that
defines the types (change the import from v1
"github.com/flanksource/config-db/api" to v1
"github.com/flanksource/config-db/api/v1") so symbols like AWS, ScrapeResults,
ChangeResult and NewJSON resolve; alternatively add re-exports in the api
package root, but the quick fix is to correct the import path where v1 is used.

Comment thread scrapers/openssf/client.go Outdated
Comment on lines +81 to +86
if attempt > 0 {
	backoff := time.Duration(attempt*attempt) * time.Second
	c.ctx.Tracef("retrying scorecard request for %s (attempt %d/%d) after %v",
		repoFullName, attempt+1, maxRetries, backoff)
	time.Sleep(backoff)
}

⚠️ Potential issue | 🟠 Major

Backoff sleep ignores context cancellation.

Use a context-aware wait to avoid blocking shutdown/timeout paths during retries.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/openssf/client.go` around lines 81 - 86, The retry backoff currently
calls time.Sleep(backoff) which ignores cancellation; replace that sleep with a
context-aware wait (select on time.After(backoff) and the request/context Done
channel) so retries abort promptly on cancel/timeout. Locate the retry block
where repoFullName, attempt, maxRetries and backoff are computed (the
c.ctx.Tracef log and time.Sleep call) and change the sleep to a select that
returns/propagates the context error if the context is cancelled instead of
blocking the goroutine.
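A sketch of the context-aware wait, assuming c.ctx satisfies context.Context (Done/Err) and the enclosing function returns an error alongside its result:

if attempt > 0 {
	backoff := time.Duration(attempt*attempt) * time.Second
	c.ctx.Tracef("retrying scorecard request for %s (attempt %d/%d) after %v",
		repoFullName, attempt+1, maxRetries, backoff)
	select {
	case <-time.After(backoff):
		// backoff elapsed, continue with the next attempt
	case <-c.ctx.Done():
		return nil, c.ctx.Err()
	}
}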

Comment thread scrapers/openssf/client.go Outdated
Comment on lines +122 to +129
if resp.StatusCode >= 500 {
	return nil, fmt.Errorf("server error: %d", resp.StatusCode)
}

if resp.StatusCode != http.StatusOK {
	body, _ := io.ReadAll(resp.Body)
	return nil, fmt.Errorf("unexpected status %d: %s", resp.StatusCode, string(body))
}

⚠️ Potential issue | 🟠 Major

HTTP 429 is treated as non-retryable.

fetchScorecard returns a generic non-OK error for rate limiting, and isRetryable won’t match it, so transient throttling fails fast instead of retrying.

🔁 Proposed fix
 	if resp.StatusCode >= 500 {
 		return nil, fmt.Errorf("server error: %d", resp.StatusCode)
 	}
+	if resp.StatusCode == http.StatusTooManyRequests {
+		return nil, fmt.Errorf("rate limited: retry-after=%s", resp.Header.Get("Retry-After"))
+	}
@@
-	return contains(errStr, "server error") ||
-	       contains(errStr, "timeout") ||
-	       contains(errStr, "connection")
+	return contains(errStr, "server error") ||
+		contains(errStr, "rate limited") ||
+		contains(errStr, "timeout") ||
+		contains(errStr, "connection")

Also applies to: 149-153

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/openssf/client.go` around lines 122 - 129, The fetchScorecard HTTP
response handling treats 429 as a generic non-OK error so isRetryable won't
catch rate-limits; update the response checks in fetchScorecard (and the similar
block used around the other status handling) to explicitly treat
http.StatusTooManyRequests (429) as a retryable error — e.g., return an error
value or type that is recognized by isRetryable when resp.StatusCode ==
http.StatusTooManyRequests, and preserve the existing body/error text for
logging; ensure the same change is applied to the similar status-check block
referenced near the second occurrence so throttling triggers retries.

Comment thread scrapers/openssf/scorecard.go Outdated
)

// LastScorecardScrapeTime tracks the last assessment date for each repository to avoid redundant API calls
var LastScorecardScrapeTime = sync.Map{}

⚠️ Potential issue | 🟠 Major

Package-level LastScorecardScrapeTime bleeds state across scraper instances and test runs.

A process-global sync.Map means:

  1. Tests cannot run in parallel without cross-contaminating cache state.
  2. Changing ScorecardCacheTTL in config has no effect at runtime since the map persists.

Consider scoping the cache per-scraper config ID or making it injectable for testing.

Would you like me to open a new issue to track the per-scraper-config scoping of this cache?

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/openssf/scorecard.go` at line 22, The package-level sync.Map
LastScorecardScrapeTime currently holds global state; change it to be scoped
per-scraper instance or injectable so cache doesn't leak across tests or config
changes: add a field (e.g., lastScrape sync.Map or map[string]time.Time with
mutex) to the scraper struct used to run scorecard scrapes (or accept a Cache
interface via the scraper's constructor) and update all uses of
LastScorecardScrapeTime to reference that instance field; ensure the cache
honors ScorecardCacheTTL from the scraper's config and write tests that inject a
fresh cache to avoid cross-test contamination.
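A sketch of instance-scoped caching; the struct and method names below are illustrative, not the scraper's actual symbols:

type scorecardScraper struct {
	lastScrape sync.Map // repo full name -> time.Time of last fetch
}

func (s *scorecardScraper) shouldSkipFetch(repo string, ttl time.Duration) bool {
	v, ok := s.lastScrape.Load(repo)
	if !ok {
		return false
	}
	last, _ := v.(time.Time)
	return time.Since(last) < ttl
}

func (s *scorecardScraper) markFetched(repo string) {
	s.lastScrape.Store(repo, time.Now())
}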

Comment thread scrapers/openssf/scorecard.go Outdated
Comment on lines +53 to +57
if config.MinScore != nil && scorecard.Score < float64(*config.MinScore) {
	ctx.Debugf("skipping %s: score %.1f below minimum %d",
		repoFullName, scorecard.Score, *config.MinScore)
	continue
}

⚠️ Potential issue | 🟠 Major

Repos filtered by MinScore don't update the cache, causing repeated unnecessary API calls.

When scorecard.Score < float64(*config.MinScore), the function skips the LastScorecardScrapeTime.Store(...) call on line 59. On every subsequent scrape run, shouldSkipScorecardFetch will return false (cache miss), triggering a fresh API call that will again be thrown away. Move the store before the filter check.

🐛 Proposed fix
 		scorecard, err := client.GetRepositoryScorecard(ctx, repoConfig.Owner, repoConfig.Repo)
 		if err != nil {
 			results.Errorf(err, "failed to scrape repository %s", repoFullName)
 			continue
 		}

+		LastScorecardScrapeTime.Store(repoFullName, time.Now())
+
 		if config.MinScore != nil && scorecard.Score < float64(*config.MinScore) {
 			ctx.Debugf("skipping %s: score %.1f below minimum %d",
 				repoFullName, scorecard.Score, *config.MinScore)
 			continue
 		}

-		LastScorecardScrapeTime.Store(repoFullName, time.Now())
 		ctx.Tracef("stored last scorecard scrape time for %s: %v", repoFullName, scorecard.Date)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/openssf/scorecard.go` around lines 53 - 57, The cache isn't updated
for repos filtered out by MinScore leading to repeated API calls; in the loop
handling scorecard results (referencing scorecard.Score and config.MinScore)
call the cache update (LastScorecardScrapeTime.Store or the existing scrape-time
storage used before) immediately after receiving the scorecard for repoFullName
and before the MinScore check so that shouldSkipScorecardFetch will hit on
subsequent runs; move the Store(...) invocation to precede the if that checks
config.MinScore and retains the existing timestamp/value you store now.

Comment thread scrapers/stale.go Outdated

"github.com/flanksource/commons/duration"
v1 "github.com/flanksource/config-db/api/v1"
v1 "github.com/flanksource/config-db/api"

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find where DeletedReasonStale is defined and used
echo "=== Searching for DeletedReasonStale definitions and usages ==="
rg -n "DeletedReasonStale" --type go -B 2 -A 2

echo ""
echo "=== Checking directory structure of api package ==="
find . -type d -name api | head -20

Repository: flanksource/config-db

Length of output: 1228


🏁 Script executed:

#!/bin/bash
# Check if root api package re-exports v1 symbols
echo "=== Contents of root api directory ==="
ls -la ./api/

echo ""
echo "=== Checking for imports/exports in api/*.go ==="
rg -n "v1" ./api --type go -m 5

Repository: flanksource/config-db

Length of output: 4149


🏁 Script executed:

#!/bin/bash
# Check for any root-level api package files
echo "=== Files directly in api directory (not subdirs) ==="
find ./api -maxdepth 1 -type f -name "*.go"

echo ""
echo "=== Check api package initialization or re-exports ==="
rg -n "package api" --type go -B 2 -A 5 | head -40

Repository: flanksource/config-db

Length of output: 772


🏁 Script executed:

#!/bin/bash
# Look at what api/plugin/convert.go imports and uses
echo "=== Content of api/plugin/convert.go (first 50 lines) ==="
head -50 ./api/plugin/convert.go

echo ""
echo "=== Check if there's any Go file in api/ root with hidden files ==="
find ./api -maxdepth 1 -name "*.go" -o -name "doc.go"

echo ""
echo "=== Search for where ScrapeResult is defined ==="
rg -n "type ScrapeResult" --type go

Repository: flanksource/config-db

Length of output: 1649


🏁 Script executed:

#!/bin/bash
# Check if api package has any init or re-export mechanism
echo "=== Check all files in api directory recursively ==="
find ./api -type f -name "*.go" | sort

echo ""
echo "=== Verify: is api/ an empty package? ==="
ls -la ./api/*.go 2>/dev/null || echo "No .go files in root api/"

echo ""
echo "=== Confirm: verify stale.go import path in actual file ==="
head -15 ./scrapers/stale.go

Repository: flanksource/config-db

Length of output: 1428


This import change will cause a build failure.

The root api directory contains no Go files—all code resides in api/v1/. Changing the import from v1 "github.com/flanksource/config-db/api/v1" to v1 "github.com/flanksource/config-db/api" means the alias v1 now points to an empty package. The usages of v1.DeletedReasonStale at lines 58 and 68 will fail to compile because DeletedReasonStale is defined only in api/v1/types.go and is not accessible from the root api package.

The import must remain v1 "github.com/flanksource/config-db/api/v1".

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/stale.go` at line 8, The import was changed to the root package
which is empty; revert the import to the versioned package so the symbols like
DeletedReasonStale are available: change the import alias back to v1
"github.com/flanksource/config-db/api/v1" in stale.go so uses of
v1.DeletedReasonStale (and any other v1 types) compile.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scrapers/config_access_test.go`:
- Around line 7-8: Remove the duplicate import of
"github.com/flanksource/config-db/api" aliased as v1 (keep the single import as
api) and update all usages of v1.ScrapeConfig to api.ScrapeConfig (replace the
four occurrences of v1.ScrapeConfig in this file). This eliminates the
redeclared import error by using the existing api import and consistent type
references.

In `@scrapers/external_entities_test.go`:
- Around line 4-5: Remove the duplicate import of
"github.com/flanksource/config-db/api" that uses the alias v1 and keep the
single import as api; then update all usages of v1.ScrapeConfig to
api.ScrapeConfig (occurrences include the references at the top-level imports
and the tests that currently use v1.ScrapeConfig) so there are no duplicate
imports or redeclared aliases and the code compiles cleanly.

In `@scrapers/helpers_test.go`:
- Around line 10-11: Remove the duplicate unaliased import of
"github.com/flanksource/config-db/api" from the import block and keep the
aliased import v1; then update the single occurrence of the bare api identifier
to use the v1 alias (replace api.XXX with v1.XXX where used, e.g. the test
reference that currently uses api) so the import is only declared once and
references the v1 alias consistently.

In `@scrapers/kubernetes_scraper_test.go`:
- Line 11: The file currently imports the same package twice as both the default
name (api) and alias (v1); remove the duplicate import (either the `v1
"github.com/flanksource/config-db/api"` line or the default
`github.com/flanksource/config-db/api` import) and then update all references
consistently (replace all `v1.` occurrences with `api.` if you remove the alias,
or replace all `api.` occurrences with `v1.` if you keep the alias) so there are
no duplicate import paths and all symbols compile.

In `@scrapers/plugins_test.go`:
- Around line 8-10: Remove the duplicated import of
github.com/flanksource/config-db/api (the unaliased `api`) and consolidate to
the existing `v1` alias; then replace all uses of the `api.` identifier in this
file with `v1.` (the occurrences currently using `api.` around the test
helper/initialization) so the file only imports the package once as `v1`.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7eed272 and 4037510.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (11)
  • db/update.go
  • go.mod
  • scrapers/aws/cloudtrail.go
  • scrapers/config_access_test.go
  • scrapers/external_entities_test.go
  • scrapers/helpers_test.go
  • scrapers/kubernetes/kubernetes.go
  • scrapers/kubernetes_scraper_test.go
  • scrapers/plugins_test.go
  • scrapers/processors/json.go
  • scrapers/system/system.go
🚧 Files skipped from review as they are similar to previous changes (5)
  • scrapers/system/system.go
  • scrapers/aws/cloudtrail.go
  • go.mod
  • scrapers/processors/json.go
  • db/update.go

Comment thread scrapers/config_access_test.go Outdated
Comment on lines +7 to +8
"github.com/flanksource/config-db/api"
v1 "github.com/flanksource/config-db/api/v1"
v1 "github.com/flanksource/config-db/api"

⚠️ Potential issue | 🔴 Critical

Duplicate import of the same package path breaks compilation.

Line 7 already imports github.com/flanksource/config-db/api. Adding the same path again under the alias v1 on line 8 clashes with that declaration and is the direct cause of the v1 redeclared in this block build failure.

The fix is to drop the redundant v1 alias import and replace the four v1.ScrapeConfig references in this file with api.ScrapeConfig.

🐛 Proposed fix
 	"github.com/flanksource/config-db/api"
-	v1 "github.com/flanksource/config-db/api"

Then update every usage of v1.ScrapeConfig → api.ScrapeConfig (lines 21, 140, 295, 550):

-	var scrapeConfig v1.ScrapeConfig
+	var scrapeConfig api.ScrapeConfig
🧰 Tools
🪛 GitHub Actions: Lint

[error] 8-8: v1 redeclared in this block

🪛 GitHub Actions: Test

[error] 8-8: vet: scrapers/config_access_test.go:8:2: v1 redeclared in this block

🪛 GitHub Check: lint

[failure] 7-7:
other declaration of v1


[failure] 8-8:
v1 redeclared in this block

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/config_access_test.go` around lines 7 - 8, Remove the duplicate
import of "github.com/flanksource/config-db/api" aliased as v1 (keep the single
import as api) and update all usages of v1.ScrapeConfig to api.ScrapeConfig
(replace the four occurrences of v1.ScrapeConfig in this file). This eliminates
the redeclared import error by using the existing api import and consistent type
references.

Comment thread scrapers/external_entities_test.go Outdated
Comment on lines +4 to +5
 	"github.com/flanksource/config-db/api"
-	v1 "github.com/flanksource/config-db/api/v1"
+	v1 "github.com/flanksource/config-db/api"

⚠️ Potential issue | 🔴 Critical

Duplicate import of the same package — compile error.

Lines 4 and 5 both import github.com/flanksource/config-db/api; Go rejects duplicate import paths in the same block (confirmed by the linter: "v1 redeclared in this block"). The v1 alias is only needed for v1.ScrapeConfig; since api resolves to the same package, simply drop the duplicate and use api.ScrapeConfig everywhere.

🐛 Proposed fix
 import (
 	"github.com/flanksource/config-db/api"
-	v1 "github.com/flanksource/config-db/api"
 	"github.com/flanksource/config-db/db"
 	"github.com/flanksource/config-db/db/models"
 	dutymodels "github.com/flanksource/duty/models"
 	"github.com/google/uuid"
 	. "github.com/onsi/ginkgo/v2"
 	. "github.com/onsi/gomega"
 	"github.com/samber/lo"
 	k8sTypes "k8s.io/apimachinery/pkg/types"
 )

Then replace every v1.ScrapeConfig occurrence (lines 17, 142, 229, 334) with api.ScrapeConfig:

-	var scrapeConfig v1.ScrapeConfig
+	var scrapeConfig api.ScrapeConfig
🧰 Tools
🪛 GitHub Check: lint

[failure] 4-4:
other declaration of v1


[failure] 5-5:
v1 redeclared in this block

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/external_entities_test.go` around lines 4 - 5, Remove the duplicate
import of "github.com/flanksource/config-db/api" that uses the alias v1 and keep
the single import as api; then update all usages of v1.ScrapeConfig to
api.ScrapeConfig (occurrences include the references at the top-level imports
and the tests that currently use v1.ScrapeConfig) so there are no duplicate
imports or redeclared aliases and the code compiles cleanly.

Comment thread scrapers/helpers_test.go Outdated
Comment on lines +10 to +11
 	"github.com/flanksource/config-db/api"
-	v1 "github.com/flanksource/config-db/api/v1"
+	v1 "github.com/flanksource/config-db/api"

⚠️ Potential issue | 🔴 Critical

Duplicate import of the same package path causes a compile error.

Lines 10 and 11 both import "github.com/flanksource/config-db/api" — Go forbids importing the same path twice in one block. The lint check confirms: v1 redeclared in this block. Remove the unaliased import and update the one call site that used the bare api identifier.

🐛 Proposed fix
 import (
 	gocontext "context"
 	"encoding/json"
 	"fmt"
 	"math/rand/v2"
 	"os"

-	"github.com/flanksource/config-db/api"
 	v1 "github.com/flanksource/config-db/api"
 	"github.com/flanksource/config-db/db/models"
 	. "github.com/onsi/ginkgo/v2"
 	apiv1 "k8s.io/api/core/v1"
 	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
 )

Update the only usage of the bare api identifier (line 58) to use the v1 alias:

-func GetConfigItemFromID(ctx api.ScrapeContext, id string) (*models.ConfigItem, error) {
+func GetConfigItemFromID(ctx v1.ScrapeContext, id string) (*models.ConfigItem, error) {
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
import (
	gocontext "context"
	"encoding/json"
	"fmt"
	"math/rand/v2"
	"os"

	v1 "github.com/flanksource/config-db/api"
	"github.com/flanksource/config-db/db/models"
	. "github.com/onsi/ginkgo/v2"
	apiv1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

Suggested change
func GetConfigItemFromID(ctx v1.ScrapeContext, id string) (*models.ConfigItem, error) {
🧰 Tools
🪛 GitHub Check: lint

[failure] 10-10:
other declaration of v1


[failure] 11-11:
v1 redeclared in this block

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/helpers_test.go` around lines 10 - 11, Remove the duplicate
unaliased import of "github.com/flanksource/config-db/api" from the import block
and keep the aliased import v1; then update the single occurrence of the bare
api identifier to use the v1 alias (replace api.XXX with v1.XXX where used, e.g.
the test reference that currently uses api) so the import is only declared once
and references the v1 alias consistently.

Comment thread scrapers/kubernetes_scraper_test.go Outdated
"github.com/flanksource/commons/logger"
"github.com/flanksource/config-db/api"
v1 "github.com/flanksource/config-db/api/v1"
v1 "github.com/flanksource/config-db/api"

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

cat -n scrapers/kubernetes_scraper_test.go | head -20

Repository: flanksource/config-db

Length of output: 733


🏁 Script executed:

rg "api\.|v1\." scrapers/kubernetes_scraper_test.go

Repository: flanksource/config-db

Length of output: 1711


🏁 Script executed:

rg "apiv1\." scrapers/kubernetes_scraper_test.go -A 2 -B 2 | head -20

Repository: flanksource/config-db

Length of output: 592


🏁 Script executed:

rg "apiv1" scrapers/kubernetes_scraper_test.go | head -5

Repository: flanksource/config-db

Length of output: 220


🏁 Script executed:

# Check if there are other files in scrapers/ that might import apiv1
rg "apiv1\s*=" scrapers/

Repository: flanksource/config-db

Length of output: 47


🏁 Script executed:

sed -n '3,30p' scrapers/kubernetes_scraper_test.go

Repository: flanksource/config-db

Length of output: 865


Duplicate import path will fail compilation.

Line 11 re-imports github.com/flanksource/config-db/api, which is already imported on Line 10. Go does not allow duplicate import paths in a single file. Keep one import alias and update all references (v1.* or api.*) consistently throughout the file.

Fix guidance

Either:

  1. Remove line 11 and replace all v1. usages with api.
  2. Remove the default import on line 10 and update all api. usages to v1.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/kubernetes_scraper_test.go` at line 11, The file currently imports
the same package twice as both the default name (api) and alias (v1); remove the
duplicate import (either the `v1 "github.com/flanksource/config-db/api"` line or
the default `github.com/flanksource/config-db/api` import) and then update all
references consistently (replace all `v1.` occurrences with `api.` if you remove
the alias, or replace all `api.` occurrences with `v1.` if you keep the alias)
so there are no duplicate import paths and all symbols compile.

Comment thread scrapers/plugins_test.go
Comment on lines 8 to 10
 	"github.com/flanksource/config-db/api"
-	v1 "github.com/flanksource/config-db/api/v1"
+	v1 "github.com/flanksource/config-db/api"
 	"github.com/flanksource/config-db/db"

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, read the import block and surrounding context
head -50 scrapers/plugins_test.go | cat -n

Repository: flanksource/config-db

Length of output: 1637


🏁 Script executed:

# Search for all occurrences of api. and v1. prefixes in the file
echo "=== api. usage ===" && rg 'api\.' scrapers/plugins_test.go | head -20
echo ""
echo "=== v1. usage ===" && rg 'v1\.' scrapers/plugins_test.go | head -20

Repository: flanksource/config-db

Length of output: 382


🏁 Script executed:

# Run the verification script to check for duplicate imports across all Go files
python - <<'PY'
import re, pathlib, sys
dups = []

for f in pathlib.Path(".").rglob("*.go"):
    txt = f.read_text(encoding="utf-8", errors="ignore")
    # import (...) blocks
    for m in re.finditer(r'import\s*\((.*?)\)', txt, re.S):
        paths = re.findall(r'"([^"]+)"', m.group(1))
        counts = {}
        for p in paths:
            counts[p] = counts.get(p, 0) + 1
        for p, c in counts.items():
            if c > 1:
                dups.append((str(f), p, c))

if dups:
    for f, p, c in dups:
        print(f"{f}: {p} imported {c} times in same import block")
    sys.exit(1)

print("No duplicate import paths in import blocks.")
PY

Repository: flanksource/config-db

Length of output: 596


Consolidate duplicated package imports to a single local name.

Lines 8-9 import the same package path github.com/flanksource/config-db/api under two different local names. The file uses api. on lines 24-25 and v1. elsewhere, creating unnecessary confusion and maintenance overhead.

Recommendation: Keep the v1 alias, remove line 8, and update the two api. references to use v1.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/plugins_test.go` around lines 8 - 10, Remove the duplicated import
of github.com/flanksource/config-db/api (the unaliased `api`) and consolidate to
the existing `v1` alias; then replace all uses of the `api.` identifier in this
file with `v1.` (the occurrences currently using `api.` around the test
helper/initialization) so the file only imports the package once as `v1`.

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (16)
api/v1/types.go (2)

97-167: ApplyPlugin is missing loops for GitHubSecurity and OpenSSFScorecard.

All other scraper types have a corresponding BaseScraper.ApplyPlugins loop in ApplyPlugin, but the two new types are unhandled, so plugins (e.g., retention, relationships) will silently have no effect on them.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/v1/types.go` around lines 97 - 167, The ApplyPlugin method on ScraperSpec
fails to apply plugins to GitHubSecurity and OpenSSFScorecard scrapers; update
ScraperSpec.ApplyPlugin to include loops for spec.GitHubSecurity and
spec.OpenSSFScorecard (same pattern as other types) and call each item’s
BaseScraper = spec.<Type>[i].BaseScraper.ApplyPlugins(plugins...) so Retention
and other plugin effects are applied to those scrapers as well. Ensure you
follow the existing loop/assignment pattern used for types like
spec.GithubActions and spec.Azure.
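A minimal sketch of the missing loops; the field names spec.GitHubSecurity and spec.OpenSSFScorecard are assumptions based on the types this PR adds, and the pattern mirrors the existing ApplyPlugin loops:

	// Assumed additions inside ScraperSpec.ApplyPlugin, following the existing per-type loops.
	for i := range spec.GitHubSecurity {
		spec.GitHubSecurity[i].BaseScraper = spec.GitHubSecurity[i].BaseScraper.ApplyPlugins(plugins...)
	}
	for i := range spec.OpenSSFScorecard {
		spec.OpenSSFScorecard[i].BaseScraper = spec.OpenSSFScorecard[i].BaseScraper.ApplyPlugins(plugins...)
	}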

15-31: AllScraperConfigs missing "githubsecurity" and "openssfscorecard" entries.

New scraper types are absent from the discovery map, which may break config-type-based lookups that rely on AllScraperConfigs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/v1/types.go` around lines 15 - 31, AllScraperConfigs is missing entries
for the new scraper types; update the map named AllScraperConfigs to include the
"githubsecurity" and "openssfscorecard" keys mapped to their corresponding
config struct zero-values (e.g., GitHubSecurity{} and OpenSSFScorecard{} or
whatever the actual type names are) so discovery and type-based lookups can find
those scrapers; locate the AllScraperConfigs map and add those two key→struct
entries consistent with the existing naming style.
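As a rough sketch, assuming AllScraperConfigs maps lowercase keys to zero-value config structs like the existing entries (the exact type names are taken from this PR):

	// Assumed additions to the existing AllScraperConfigs map in api/v1/types.go.
	"githubsecurity":   GitHubSecurity{},
	"openssfscorecard": OpenSSFScorecard{},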
scrapers/openssf/client.go (2)

118-153: HTTP 429 is not treated as retryable.

fetchScorecard returns a generic error for 429 (falls through to the unexpected status branch), and isRetryable only matches "server error", "timeout", and "connection", so rate-limit responses are never retried.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/openssf/client.go` around lines 118 - 153, The fetchScorecard path
currently treats 429 as an unexpected status; update the HTTP handling in the
fetchScorecard function to explicitly detect http.StatusTooManyRequests and
return a distinct error (e.g., containing "rate limited" or "too many requests")
and optionally include the Retry-After header value if present, and then update
ScorecardClient.isRetryable to consider that error by checking for "rate
limit"/"too many requests"/"429" in err.Error() in addition to the existing
"server error", "timeout", and "connection" checks so 429 responses will be
retried.
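A hedged sketch of both halves of that change; the resp variable, the return shape of fetchScorecard, and switching isRetryable to strings.Contains are assumptions:

	// In fetchScorecard: surface 429 as a distinct, retryable error (Retry-After is optional).
	if resp.StatusCode == http.StatusTooManyRequests {
		return nil, fmt.Errorf("rate limited (429), retry-after=%q", resp.Header.Get("Retry-After"))
	}

	// In isRetryable: also treat rate-limit errors as retryable.
	errStr := err.Error()
	return strings.Contains(errStr, "rate limited") ||
		strings.Contains(errStr, "server error") ||
		strings.Contains(errStr, "timeout") ||
		strings.Contains(errStr, "connection")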

80-86: Backoff sleep ignores context cancellation.

time.Sleep(backoff) blocks unconditionally during retries, preventing timely shutdown or timeout propagation.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/openssf/client.go` around lines 80 - 86, The retry backoff currently
calls time.Sleep(backoff) which blocks and ignores cancellation; update the
retry loop in scrapers/openssf/client.go so the sleep uses a cancellable wait
(select on time.After(backoff) and the request/context cancellation) instead of
time.Sleep. Specifically, inside the for-attempt loop where c.ctx.Logger is
used, replace the unconditional time.Sleep(backoff) with a select that waits on
time.After(backoff) and on c.ctx.Done() (or the request context field you use),
and when the context is done return or propagate the context error so retries
stop promptly.
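A cancellation-aware variant of the wait, sketched under the assumption that the client holds its scrape context in c.ctx and that the surrounding function returns (value, error):

	// Wait out the backoff, but abort promptly if the scrape context is cancelled.
	select {
	case <-time.After(backoff):
		// fall through and retry
	case <-c.ctx.Done():
		return nil, c.ctx.Err()
	}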
api/context.go (1)

161-167: Nil-dereference on scrapeConfig in IsTrace() and IsDebug().

Both methods dereference ctx.scrapeConfig.Spec without guarding against a nil scrapeConfig, unlike ScraperID() which already performs this check. Calling either before WithScrapeConfig() will panic.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/context.go` around lines 161 - 167, The IsTrace and IsDebug methods
dereference ctx.scrapeConfig.Spec without a nil check; update both IsTrace() and
IsDebug() to guard for ctx.scrapeConfig == nil (like ScraperID() does) and
return a safe default (false) when nil, otherwise call
ctx.scrapeConfig.Spec.IsTrace()/IsDebug(); ensure behavior aligns with
WithScrapeConfig usage.
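A minimal sketch of the guard for IsTrace(), assuming a ScrapeContext receiver with a scrapeConfig field as described above; IsDebug() would get the same treatment:

	func (ctx ScrapeContext) IsTrace() bool {
		// Guard against a context that has not been through WithScrapeConfig yet.
		if ctx.scrapeConfig == nil {
			return false
		}
		return ctx.scrapeConfig.Spec.IsTrace()
	}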
api/v1/openssf.go (1)

10-11: MinScore *float64 is still breaking CRD generation.

The pipeline (lint, test, test-prod) continues to fail with "found float, the usage of which is highly discouraged". Despite being marked as addressed in earlier commits, the *float64 type remains and zz_generated.deepcopy.go reflects it unchanged.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/v1/openssf.go` around lines 10 - 11, The MinScore field is still declared
as *float64 which breaks CRD generation; change the MinScore field declaration
from *float64 to an integer type (e.g. *int32 or *int) in the struct that
declares MinScore, update the yaml/json tags if needed, then regenerate the
deepcopy/CRD artifacts (re-run your generation script or make generate /
controller-gen) so zz_generated.deepcopy.go and CRDs no longer contain floats;
ensure any callers or tests using MinScore are updated to use the integer type
as well.
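The later commits in this branch appear to parse MinScore with strconv.ParseFloat (see the scorecard.go snippet further down), so one alternative to an integer field is keeping the threshold as a string in the CRD and parsing it at scrape time. A sketch, with the json tag assumed:

	// MinScore filters out repositories scoring below this value, e.g. "7.5".
	// A string avoids the float restriction in generated CRDs; parse it with strconv.ParseFloat where used.
	MinScore string `json:"minScore,omitempty"`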
scrapers/github/security_client.go (3)

23-48: ⚠️ Potential issue | 🟠 Major

Missing empty-token validation — will silently produce 401s on all API calls.

If ConnectionName does not resolve a connection and PersonalAccessToken does not yield a value, token remains "". The client is then created with an empty OAuth token, and every subsequent API call fails with 401, which surfaces as confusing "failed to fetch" errors downstream.

🐛 Proposed fix
 	} else {
 		token, err = ctx.GetEnvValueFromCache(config.PersonalAccessToken, ctx.Namespace())
 		if err != nil {
 			return nil, fmt.Errorf("failed to get token: %w", err)
 		}
 	}
+
+	if token == "" {
+		return nil, fmt.Errorf("github personal access token is required: configure either 'connection' or 'personalAccessToken'")
+	}
 
 	ts := oauth2.StaticTokenSource(
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/security_client.go` around lines 23 - 48, In
NewGitHubSecurityClient ensure the resolved token is validated and not empty
before creating the oauth2 client: after trying ctx.HydrateConnectionByURL and
ctx.GetEnvValueFromCache, check that token != "" (and return a clear error
referencing ConnectionName/PersonalAccessToken) so you don't construct an
oauth2.StaticTokenSource / github.NewClient with an empty token; update the
function to return an error when token resolution yields an empty string.

50-57: ⚠️ Potential issue | 🟠 Major

CreatedAt field is declared but never consumed by any Get* method.

fetchAllAlerts in security.go (Line 147) sets opts.CreatedAt for incremental scrapes, but none of the four client methods (GetDependabotAlerts, GetCodeScanningAlerts, GetSecretScanningAlerts, GetSecurityAdvisories) pass it through to the underlying GitHub API request options. Incremental scraping is therefore non-functional.


121-133: ⚠️ Potential issue | 🟠 Major

GetSecurityAdvisories is missing pagination — Page/PerPage from AlertListOptions are not applied.

All other Get* methods in this file include ListOptions but this one does not, so only the API's default page size is returned.

🐛 Proposed fix
 	reqOpts := &github.ListRepositorySecurityAdvisoriesOptions{
 		State: opts.State,
+		ListOptions: github.ListOptions{
+			Page:    opts.Page,
+			PerPage: opts.PerPage,
+		},
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/security_client.go` around lines 121 - 133,
GetSecurityAdvisories is not applying pagination from AlertListOptions; update
the reqOpts passed to github.SecurityAdvisories.ListRepositorySecurityAdvisories
to include Page and PerPage from opts (e.g., set reqOpts.Page = opts.Page and
reqOpts.PerPage = opts.PerPage) so it honors paging like the other Get* methods;
ensure you reference the ListRepositorySecurityAdvisoriesOptions struct and the
AlertListOptions fields when making this change in GetSecurityAdvisories.
scrapers/github/security.go (5)

59-62: ⚠️ Potential issue | 🔴 Critical

Logging produces garbage: string concatenation passed to %d format verb.

result.Properties.Find(...).Text returns strings. Concatenating them with + yields a single string, which is then passed to %d. This will print %!d(string=...) instead of a count.

🐛 Proposed fix
-		ctx.Logger.V(2).Infof("successfully scraped %s: %d total alerts", repoFullName,
-			result.Properties.Find("Critical Alerts").Text+result.Properties.Find("High Alerts").Text+
-			result.Properties.Find("Medium Alerts").Text+result.Properties.Find("Low Alerts").Text)
+		ctx.Logger.V(2).Infof("successfully scraped %s: critical=%s, high=%s, medium=%s, low=%s",
+			repoFullName,
+			result.Properties.Find("Critical Alerts").Text,
+			result.Properties.Find("High Alerts").Text,
+			result.Properties.Find("Medium Alerts").Text,
+			result.Properties.Find("Low Alerts").Text)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/security.go` around lines 59 - 62, The log call in
ctx.Logger.V(2).Infof is passing a concatenated string of
result.Properties.Find(...).Text values to a %d format, producing garbage;
change this to compute numeric counts (parse each of
result.Properties.Find("Critical Alerts").Text, "High Alerts", "Medium Alerts",
"Low Alerts" to ints using strconv.Atoi with safe fallbacks to 0), sum them into
a totalCount int, and then pass totalCount to %d (and/or include the individual
counts as %d/%d/%d/%d) so the format verbs match numeric values; update the call
site where repoFullName and result are used (ctx.Logger.V(2).Infof) accordingly
and handle parse errors by treating them as zero.

94-107: ⚠️ Potential issue | 🔴 Critical

CreatedAt type mismatch: time.Time assigned to *time.Time.

repo.CreatedAt.GetTime() returns time.Time, but ScrapeResult.CreatedAt expects *time.Time.

🐛 Proposed fix
+	createdAt := repo.CreatedAt.GetTime()
 	result := v1.ScrapeResult{
 		...
-		CreatedAt:  repo.CreatedAt.GetTime(),
+		CreatedAt:  &createdAt,
 		...
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/security.go` around lines 94 - 107, The CreatedAt field on
v1.ScrapeResult expects a *time.Time but the code assigns
repo.CreatedAt.GetTime() (a time.Time); fix by creating a local variable (e.g.,
t := repo.CreatedAt.GetTime()) and assign its address to CreatedAt (CreatedAt:
&t) in the ScrapeResult construction so the types match; update the ScrapeResult
construction in scrapers/github/security.go where result is built.

153-175: ⚠️ Potential issue | 🟠 Major

All alert fetches are single-page (PerPage: 100) with no pagination loop.

Repositories with more than 100 alerts of any type will have truncated results, leading to undercounting and incorrect health assessments.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/security.go` around lines 153 - 175, The current fetches
using client.GetDependabotAlerts, client.GetCodeScanningAlerts, and
client.GetSecretScanningAlerts only request a single page (opts.PerPage=100) and
will truncate results for repos with >100 alerts; change each call to paginate:
loop over opts.Page starting at 1, call the respective method
(GetDependabotAlerts/GetCodeScanningAlerts/GetSecretScanningAlerts) per page,
append results to alerts.dependabot/alerts.codeScanning/alerts.secretScanning,
and break when the returned page is empty or less than opts.PerPage (or when the
API response indicates no next page). Ensure opts.Page is incremented each
iteration and preserve error handling and logging inside the loop.
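A hedged sketch of the pagination loop for one alert type, assuming the client methods return the page of alerts plus a *github.Response whose NextPage is zero on the last page, and that fetchAllAlerts returns (alerts, time.Time, error); the exact argument list is likewise an assumption:

	// Page through Dependabot alerts until the API reports no next page.
	opts.Page = 1
	for {
		pageAlerts, resp, err := client.GetDependabotAlerts(ctx, owner, repo, opts)
		if err != nil {
			return nil, time.Time{}, fmt.Errorf("failed to list dependabot alerts: %w", err)
		}
		alerts.dependabot = append(alerts.dependabot, pageAlerts...)
		if resp == nil || resp.NextPage == 0 {
			break
		}
		opts.Page = resp.NextPage
	}

The code-scanning and secret-scanning fetches would follow the same loop.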

128-142: ⚠️ Potential issue | 🟠 Major

filters.Severity and filters.MaxAge are declared but never propagated into AlertListOptions.

Only State is applied. The Severity field on AlertListOptions and the MaxAge/ParseMaxAge() method are unused here, meaning user-configured severity and age filters are silently ignored.

🔧 Proposed fix
 	opts := AlertListOptions{
 		State:   stateFilter,
 		Page:    1,
 		PerPage: 100,
 	}
+
+	if len(filters.Severity) > 0 {
+		opts.Severity = filters.Severity[0]
+	}
+
+	if maxAge, err := filters.ParseMaxAge(); err != nil {
+		return nil, time.Time{}, fmt.Errorf("invalid maxAge filter: %w", err)
+	} else if maxAge > 0 {
+		since := time.Now().Add(-maxAge)
+		opts.CreatedAt = since.Format(time.RFC3339)
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/security.go` around lines 128 - 142, In fetchAllAlerts,
propagate the configured severity and max-age into AlertListOptions: read
filters.Severity (use filters.Severity[0] or default when present) and set the
AlertListOptions.Severity field, and call filters.ParseMaxAge() (or equivalent)
to compute the time/duration and assign it to the appropriate
AlertListOptions.MaxAge/MaxAgeTime field so the client request honors
user-configured filters; update the opts variable (used to list alerts) before
making the API calls (references: fetchAllAlerts, AlertListOptions,
filters.Severity, filters.MaxAge, ParseMaxAge).

45-51: ⚠️ Potential issue | 🟠 Major

time.Sleep blocks indefinitely and is not cancellation-aware.

If the scrape context is cancelled while waiting for a rate-limit reset, the goroutine will remain blocked. Use a select with ctx.Done().

⏱️ Proposed fix
 		} else if shouldPause {
 			ctx.Warnf("pausing for %v due to rate limit", duration)
-			time.Sleep(duration)
+			timer := time.NewTimer(duration)
+			select {
+			case <-timer.C:
+			case <-ctx.Done():
+				timer.Stop()
+				results.Errorf(ctx.Err(), "scrape cancelled while waiting for rate-limit reset")
+				return results
+			}
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/security.go` around lines 45 - 51, The current code in
security.go calls time.Sleep(duration) after client.ShouldPauseForRateLimit
which blocks even if ctx is cancelled; replace the blocking sleep with a
cancellation-aware wait: create a timer (e.g., timer := time.NewTimer(duration))
and use select { case <-timer.C: /* continue */; case <-ctx.Done():
timer.Stop(); return (or break to stop processing) } so ctx.Warnf and the pause
logic remain but the goroutine will stop waiting if ctx is cancelled; update the
branch handling in the block that calls ShouldPauseForRateLimit to use this
select instead of time.Sleep(duration).
scrapers/openssf/scorecard.go (1)

53-60: ⚠️ Potential issue | 🟠 Major

Repos filtered by MinScore don't update the cache, causing repeated unnecessary API calls.

When scorecard.Score < *config.MinScore, the continue on Line 56 skips the Store on Line 59. Every subsequent scrape run will re-fetch the same repo from the API, only to discard it again. Move the store before the filter.

🐛 Proposed fix
+		LastScorecardScrapeTime.Store(repoFullName, time.Now())
+
 		if config.MinScore != nil && scorecard.Score < *config.MinScore {
 			ctx.Debugf("skipping %s: score %.1f below minimum %.1f",
 				repoFullName, scorecard.Score, *config.MinScore)
 			continue
 		}
 
-		LastScorecardScrapeTime.Store(repoFullName, time.Now())
 		ctx.Logger.V(3).Infof("stored last scorecard scrape time for %s: %v", repoFullName, scorecard.Date)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/openssf/scorecard.go` around lines 53 - 60, The cache update is
skipped when a repo fails the MinScore filter, so move the
LastScorecardScrapeTime.Store call to happen before the MinScore check: call
LastScorecardScrapeTime.Store(repoFullName, time.Now()) (and the related log
ctx.Logger.V(3).Infof...) immediately after obtaining scorecard (before checking
if config.MinScore != nil && scorecard.Score < *config.MinScore) so filtered
repos still update the scrape timestamp; keep existing variable names
(repoFullName, scorecard.Score, config.MinScore) and preserve the log message
content.
api/v1/github_security.go (1)

43-53: ⚠️ Potential issue | 🟠 Major

ParseMaxAge cannot handle "90d" / "30d" as documented on Line 43.

time.ParseDuration only supports ns, us, ms, s, m, h. The documented examples will fail at runtime.

🐛 Proposed fix (support "d" suffix)
+import (
+	"fmt"
+	"strconv"
+	"strings"
+	"time"
+
+	"github.com/flanksource/duty/types"
+)

 func (f GitHubSecurityFilters) ParseMaxAge() (time.Duration, error) {
 	if f.MaxAge == "" {
 		return 0, nil
 	}
+	if strings.HasSuffix(f.MaxAge, "d") {
+		days, err := strconv.Atoi(strings.TrimSuffix(f.MaxAge, "d"))
+		if err != nil {
+			return 0, fmt.Errorf("invalid day duration %q: %w", f.MaxAge, err)
+		}
+		return time.Duration(days) * 24 * time.Hour, nil
+	}
 	return time.ParseDuration(f.MaxAge)
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@api/v1/github_security.go` around lines 43 - 53, The ParseMaxAge method on
GitHubSecurityFilters currently calls time.ParseDuration which doesn't accept
"d" (days); update ParseMaxAge to first detect a trailing 'd' or 'D' suffix
(e.g., "90d"/"30d"), trim it, parse the remaining numeric portion (using
strconv.ParseInt or strconv.ParseFloat to allow fractional days), multiply by
24*time.Hour to produce the time.Duration, and return that; if there is no 'd'
suffix, fall back to time.ParseDuration for supported units and return any
parsing error—make sure to update ParseMaxAge to handle invalid formats with a
clear error.
🧹 Nitpick comments (3)
scrapers/openssf/client.go (1)

155-168: Replace custom contains/findSubstring with strings.Contains.

Both helpers are a manual reimplementation of strings.Contains from the standard library. The custom version adds unnecessary complexity and a non-obvious short-circuit path, with no performance or correctness advantage.

♻️ Proposed refactor
+import "strings"

 func (c *ScorecardClient) isRetryable(err error) bool {
 	if err == nil {
 		return false
 	}

 	errStr := err.Error()
-	return contains(errStr, "server error") ||
-	       contains(errStr, "timeout") ||
-	       contains(errStr, "connection")
+	return strings.Contains(errStr, "server error") ||
+		strings.Contains(errStr, "timeout") ||
+		strings.Contains(errStr, "connection")
 }
-
-func contains(s, substr string) bool {
-	return len(s) >= len(substr) && (s == substr || len(s) > len(substr) &&
-		(s[:len(substr)] == substr || s[len(s)-len(substr):] == substr ||
-		 findSubstring(s, substr)))
-}
-
-func findSubstring(s, substr string) bool {
-	for i := 0; i <= len(s)-len(substr); i++ {
-		if s[i:i+len(substr)] == substr {
-			return true
-		}
-	}
-	return false
-}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/openssf/client.go` around lines 155 - 168, The custom contains and
findSubstring functions are duplicating standard library behavior; remove
contains and findSubstring and replace their callers with strings.Contains,
ensuring the package imports "strings"; update any call sites that reference
contains(...) to strings.Contains(...), delete the two helper functions
(contains and findSubstring), and run `go vet`/`go test` to confirm no missing
references.
scrapers/openssf/scorecard.go (1)

265-272: Replace containsString with slices.Contains from the standard library.

Go 1.21+ provides slices.Contains, and this project targets Go 1.25.1, so the helper function can be removed.

Changes required

Remove the function definition at lines 265-272:

-func containsString(slice []string, item string) bool {
-	for _, s := range slice {
-		if s == item {
-			return true
-		}
-	}
-	return false
-}

Add import and update line 182:

+import "slices"
 ...
-if check.Score == 0 && containsString(criticalChecks, check.Name) {
+if check.Score == 0 && slices.Contains(criticalChecks, check.Name) {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/openssf/scorecard.go` around lines 265 - 272, Remove the custom
containsString helper and use the Go stdlib slices.Contains instead: delete the
containsString(...) function, add the standard library import "slices", and
replace any calls to containsString(slice, item) (e.g., the usage near where it
was referenced) with slices.Contains(slice, item).
scrapers/github/security.go (1)

193-198: Add a comment explaining why secret scanning alerts don't count severity.

Secret scanning alerts skip countAlertSeverity because the GitHub REST API does not return severity information for these alerts. Unlike Dependabot and code scanning alerts, SecretScanningAlert in go-github v73 has no severity field. Adding a comment to clarify this API limitation would help future maintainers understand why the handling differs.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/security.go` around lines 193 - 198, Add a short inline
comment above the loop that processes secretAlerts (the for _, alert := range
secretAlerts loop) explaining that secret scanning alerts do not include
severity in the GitHub REST API (SecretScanningAlert in go-github v73 has no
severity field), and therefore we intentionally skip calling countAlertSeverity
for these alerts; reference the go-github v73 limitation and the
countAlertSeverity function name in the comment so future maintainers understand
this special-case handling.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@api/context.go`:
- Around line 161-167: The IsTrace and IsDebug methods dereference
ctx.scrapeConfig.Spec without a nil check; update both IsTrace() and IsDebug()
to guard for ctx.scrapeConfig == nil (like ScraperID() does) and return a safe
default (false) when nil, otherwise call
ctx.scrapeConfig.Spec.IsTrace()/IsDebug(); ensure behavior aligns with
WithScrapeConfig usage.

In `@api/v1/github_security.go`:
- Around line 43-53: The ParseMaxAge method on GitHubSecurityFilters currently
calls time.ParseDuration which doesn't accept "d" (days); update ParseMaxAge to
first detect a trailing 'd' or 'D' suffix (e.g., "90d"/"30d"), trim it, parse
the remaining numeric portion (using strconv.ParseInt or strconv.ParseFloat to
allow fractional days), multiply by 24*time.Hour to produce the time.Duration,
and return that; if there is no 'd' suffix, fall back to time.ParseDuration for
supported units and return any parsing error—make sure to update ParseMaxAge to
handle invalid formats with a clear error.

In `@api/v1/openssf.go`:
- Around line 10-11: The MinScore field is still declared as *float64 which
breaks CRD generation; change the MinScore field declaration from *float64 to an
integer type (e.g. *int32 or *int) in the struct that declares MinScore, update
the yaml/json tags if needed, then regenerate the deepcopy/CRD artifacts (re-run
your generation script or make generate / controller-gen) so
zz_generated.deepcopy.go and CRDs no longer contain floats; ensure any callers
or tests using MinScore are updated to use the integer type as well.

In `@api/v1/types.go`:
- Around line 97-167: The ApplyPlugin method on ScraperSpec fails to apply
plugins to GitHubSecurity and OpenSSFScorecard scrapers; update
ScraperSpec.ApplyPlugin to include loops for spec.GitHubSecurity and
spec.OpenSSFScorecard (same pattern as other types) and call each item’s
BaseScraper = spec.<Type>[i].BaseScraper.ApplyPlugins(plugins...) so Retention
and other plugin effects are applied to those scrapers as well. Ensure you
follow the existing loop/assignment pattern used for types like
spec.GithubActions and spec.Azure.
- Around line 15-31: AllScraperConfigs is missing entries for the new scraper
types; update the map named AllScraperConfigs to include the "githubsecurity"
and "openssfscorecard" keys mapped to their corresponding config struct
zero-values (e.g., GitHubSecurity{} and OpenSSFScorecard{} or whatever the
actual type names are) so discovery and type-based lookups can find those
scrapers; locate the AllScraperConfigs map and add those two key→struct entries
consistent with the existing naming style.

In `@scrapers/github/security_client.go`:
- Around line 23-48: In NewGitHubSecurityClient ensure the resolved token is
validated and not empty before creating the oauth2 client: after trying
ctx.HydrateConnectionByURL and ctx.GetEnvValueFromCache, check that token != ""
(and return a clear error referencing ConnectionName/PersonalAccessToken) so you
don't construct an oauth2.StaticTokenSource / github.NewClient with an empty
token; update the function to return an error when token resolution yields an
empty string.
- Around line 121-133: GetSecurityAdvisories is not applying pagination from
AlertListOptions; update the reqOpts passed to
github.SecurityAdvisories.ListRepositorySecurityAdvisories to include Page and
PerPage from opts (e.g., set reqOpts.Page = opts.Page and reqOpts.PerPage =
opts.PerPage) so it honors paging like the other Get* methods; ensure you
reference the ListRepositorySecurityAdvisoriesOptions struct and the
AlertListOptions fields when making this change in GetSecurityAdvisories.

In `@scrapers/github/security.go`:
- Around line 59-62: The log call in ctx.Logger.V(2).Infof is passing a
concatenated string of result.Properties.Find(...).Text values to a %d format,
producing garbage; change this to compute numeric counts (parse each of
result.Properties.Find("Critical Alerts").Text, "High Alerts", "Medium Alerts",
"Low Alerts" to ints using strconv.Atoi with safe fallbacks to 0), sum them into
a totalCount int, and then pass totalCount to %d (and/or include the individual
counts as %d/%d/%d/%d) so the format verbs match numeric values; update the call
site where repoFullName and result are used (ctx.Logger.V(2).Infof) accordingly
and handle parse errors by treating them as zero.
- Around line 94-107: The CreatedAt field on v1.ScrapeResult expects a
*time.Time but the code assigns repo.CreatedAt.GetTime() (a time.Time); fix by
creating a local variable (e.g., t := repo.CreatedAt.GetTime()) and assign its
address to CreatedAt (CreatedAt: &t) in the ScrapeResult construction so the
types match; update the ScrapeResult construction in scrapers/github/security.go
where result is built.
- Around line 153-175: The current fetches using client.GetDependabotAlerts,
client.GetCodeScanningAlerts, and client.GetSecretScanningAlerts only request a
single page (opts.PerPage=100) and will truncate results for repos with >100
alerts; change each call to paginate: loop over opts.Page starting at 1, call
the respective method
(GetDependabotAlerts/GetCodeScanningAlerts/GetSecretScanningAlerts) per page,
append results to alerts.dependabot/alerts.codeScanning/alerts.secretScanning,
and break when the returned page is empty or less than opts.PerPage (or when the
API response indicates no next page). Ensure opts.Page is incremented each
iteration and preserve error handling and logging inside the loop.
- Around line 128-142: In fetchAllAlerts, propagate the configured severity and
max-age into AlertListOptions: read filters.Severity (use filters.Severity[0] or
default when present) and set the AlertListOptions.Severity field, and call
filters.ParseMaxAge() (or equivalent) to compute the time/duration and assign it
to the appropriate AlertListOptions.MaxAge/MaxAgeTime field so the client
request honors user-configured filters; update the opts variable (used to list
alerts) before making the API calls (references: fetchAllAlerts,
AlertListOptions, filters.Severity, filters.MaxAge, ParseMaxAge).
- Around line 45-51: The current code in security.go calls time.Sleep(duration)
after client.ShouldPauseForRateLimit which blocks even if ctx is cancelled;
replace the blocking sleep with a cancellation-aware wait: create a timer (e.g.,
timer := time.NewTimer(duration)) and use select { case <-timer.C: /* continue
*/; case <-ctx.Done(): timer.Stop(); return (or break to stop processing) } so
ctx.Warnf and the pause logic remain but the goroutine will stop waiting if ctx
is cancelled; update the branch handling in the block that calls
ShouldPauseForRateLimit to use this select instead of time.Sleep(duration).

In `@scrapers/openssf/client.go`:
- Around line 118-153: The fetchScorecard path currently treats 429 as an
unexpected status; update the HTTP handling in the fetchScorecard function to
explicitly detect http.StatusTooManyRequests and return a distinct error (e.g.,
containing "rate limited" or "too many requests") and optionally include the
Retry-After header value if present, and then update ScorecardClient.isRetryable
to consider that error by checking for "rate limit"/"too many requests"/"429" in
err.Error() in addition to the existing "server error", "timeout", and
"connection" checks so 429 responses will be retried.
- Around line 80-86: The retry backoff currently calls time.Sleep(backoff) which
blocks and ignores cancellation; update the retry loop in
scrapers/openssf/client.go so the sleep uses a cancellable wait (select on
time.After(backoff) and the request/context cancellation) instead of time.Sleep.
Specifically, inside the for-attempt loop where c.ctx.Logger is used, replace
the unconditional time.Sleep(backoff) with a select that waits on
time.After(backoff) and on c.ctx.Done() (or the request context field you use),
and when the context is done return or propagate the context error so retries
stop promptly.

In `@scrapers/openssf/scorecard.go`:
- Around line 53-60: The cache update is skipped when a repo fails the MinScore
filter, so move the LastScorecardScrapeTime.Store call to happen before the
MinScore check: call LastScorecardScrapeTime.Store(repoFullName, time.Now())
(and the related log ctx.Logger.V(3).Infof...) immediately after obtaining
scorecard (before checking if config.MinScore != nil && scorecard.Score <
*config.MinScore) so filtered repos still update the scrape timestamp; keep
existing variable names (repoFullName, scorecard.Score, config.MinScore) and
preserve the log message content.

---

Nitpick comments:
In `@scrapers/github/security.go`:
- Around line 193-198: Add a short inline comment above the loop that processes
secretAlerts (the for _, alert := range secretAlerts loop) explaining that
secret scanning alerts do not include severity in the GitHub REST API
(SecretScanningAlert in go-github v73 has no severity field), and therefore we
intentionally skip calling countAlertSeverity for these alerts; reference the
go-github v73 limitation and the countAlertSeverity function name in the comment
so future maintainers understand this special-case handling.

In `@scrapers/openssf/client.go`:
- Around line 155-168: The custom contains and findSubstring functions are
duplicating standard library behavior; remove contains and findSubstring and
replace their callers with strings.Contains, ensuring the package imports
"strings"; update any call sites that reference contains(...) to
strings.Contains(...), delete the two helper functions (contains and
findSubstring), and run `go vet`/`go test` to confirm no missing references.

In `@scrapers/openssf/scorecard.go`:
- Around line 265-272: Remove the custom containsString helper and use the Go
stdlib slices.Contains instead: delete the containsString(...) function, add the
standard library import "slices", and replace any calls to containsString(slice,
item) (e.g., the usage near where it was referenced) with slices.Contains(slice,
item).

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4037510 and 24de88c.

📒 Files selected for processing (22)
  • api/context.go
  • api/v1/github_security.go
  • api/v1/openssf.go
  • api/v1/scrapeconfig_types.go
  • api/v1/types.go
  • api/v1/zz_generated.deepcopy.go
  • fixtures/github-security.yaml
  • fixtures/openssf.yaml
  • scrapers/aws/aws.go
  • scrapers/common.go
  • scrapers/github/security.go
  • scrapers/github/security_client.go
  • scrapers/github/testdata/security/code-scanning-alerts.json
  • scrapers/github/testdata/security/dependabot-alerts.json
  • scrapers/github/testdata/security/secret-scanning-alerts.json
  • scrapers/github/testdata/security/security-advisories.json
  • scrapers/openssf/client.go
  • scrapers/openssf/compliance.go
  • scrapers/openssf/scorecard.go
  • scrapers/openssf/testdata/scorecard-high-score.json
  • scrapers/openssf/testdata/scorecard-low-score.json
  • scrapers/openssf/testdata/scorecard-medium-score.json
✅ Files skipped from review due to trivial changes (1)
  • scrapers/openssf/testdata/scorecard-medium-score.json
🚧 Files skipped from review as they are similar to previous changes (9)
  • api/v1/scrapeconfig_types.go
  • scrapers/aws/aws.go
  • scrapers/openssf/testdata/scorecard-high-score.json
  • scrapers/github/testdata/security/security-advisories.json
  • scrapers/github/testdata/security/code-scanning-alerts.json
  • scrapers/openssf/compliance.go
  • fixtures/github-security.yaml
  • fixtures/openssf.yaml
  • scrapers/github/testdata/security/dependabot-alerts.json

@coderabbitai coderabbitai Bot left a comment

♻️ Duplicate comments (2)
scrapers/openssf/scorecard.go (2)

54-61: ⚠️ Potential issue | 🟠 Major

Min-score handling is brittle and cache update happens too late.

Two issues here:

  1. malformed MinScore is silently ignored (filter effectively disabled), and
  2. repos filtered by min-score never update the cache, causing repeated unnecessary API calls on every run.
🐛 Proposed fix (parse once per config, fail fast on invalid value, store cache before filter)
 func (o OpenSSFScorecardScraper) Scrape(ctx api.ScrapeContext) v1.ScrapeResults {
 	results := v1.ScrapeResults{}

 	for _, config := range ctx.ScrapeConfig().Spec.OpenSSFScorecard {
 		ctx.Logger.V(2).Infof("scraping OpenSSF Scorecard for %d repositories", len(config.Repositories))
 		client := NewScorecardClient(ctx)
+		var minScore *float64
+		if config.MinScore != "" {
+			parsed, err := strconv.ParseFloat(config.MinScore, 64)
+			if err != nil {
+				results.Errorf(err, "invalid minScore %q", config.MinScore)
+				continue
+			}
+			minScore = &parsed
+		}

 		for _, repoConfig := range config.Repositories {
 			repoFullName := fmt.Sprintf("%s/%s", repoConfig.Owner, repoConfig.Repo)
@@
 			scorecard, err := client.GetRepositoryScorecard(ctx, repoConfig.Owner, repoConfig.Repo)
 			if err != nil {
 				results.Errorf(err, "failed to scrape repository %s", repoFullName)
 				continue
 			}
+
+			LastScorecardScrapeTime.Store(repoFullName, time.Now())

-			if minScore, err := strconv.ParseFloat(config.MinScore, 64); err == nil && scorecard.Score < minScore {
+			if minScore != nil && scorecard.Score < *minScore {
 				ctx.Logger.V(3).Infof("skipping %s: score %.1f below minimum %.1f",
-					repoFullName, scorecard.Score, minScore)
+					repoFullName, scorecard.Score, *minScore)
 				continue
 			}
-
-			LastScorecardScrapeTime.Store(repoFullName, time.Now())
 			ctx.Logger.V(3).Infof("stored last scorecard scrape time for %s: %v", repoFullName, scorecard.Date)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/openssf/scorecard.go` around lines 54 - 61, Parse config.MinScore
once (not per-repo) at startup and fail fast if it is malformed; use that parsed
value (e.g., minScoreParsed) when evaluating scorecard.Score so malformed values
don't silently disable the filter, and move the
LastScorecardScrapeTime.Store(repoFullName, time.Now()) call to occur before the
min-score check so repositories skipped by the min-score filter still update the
cache; update references to config.MinScore, repoFullName, scorecard.Score and
LastScorecardScrapeTime accordingly.

22-23: ⚠️ Potential issue | 🟠 Major

Package-level scrape cache leaks state across scraper instances/tests.

LastScorecardScrapeTime as process-global mutable state can cause cross-run contamination and nondeterministic behavior when multiple configs/tests run in the same process.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/openssf/scorecard.go` around lines 22 - 23, LastScorecardScrapeTime
is a process-global sync.Map causing cross-test/instance leakage; instead, make
the scrape cache an instance field (e.g., add a field like scorecardCache
sync.Map on the ScorecardScraper/OpenssfScraper struct) and replace all uses of
the package-level LastScorecardScrapeTime with that instance field so each
scraper has its own cache; update constructors to initialize the field and
update tests to create new scraper instances (or provide a Reset method) to
avoid shared mutable state.
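One way to scope the cache, sketched assuming the scraper type is OpenSSFScorecardScraper (as in the Scrape method quoted above); the constructor name is hypothetical:

	// Per-instance cache replacing the package-level LastScorecardScrapeTime.
	type OpenSSFScorecardScraper struct {
		lastScrape *sync.Map // repo full name -> time.Time of the last successful scrape
	}

	func NewOpenSSFScorecardScraper() OpenSSFScorecardScraper {
		return OpenSSFScorecardScraper{lastScrape: &sync.Map{}}
	}

Inside Scrape, the LastScorecardScrapeTime.Store(...) calls would become o.lastScrape.Store(repoFullName, time.Now()), and tests can construct a fresh scraper per case instead of sharing process-global state.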
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@scrapers/openssf/scorecard.go`:
- Around line 54-61: Parse config.MinScore once (not per-repo) at startup and
fail fast if it is malformed; use that parsed value (e.g., minScoreParsed) when
evaluating scorecard.Score so malformed values don't silently disable the
filter, and move the LastScorecardScrapeTime.Store(repoFullName, time.Now())
call to occur before the min-score check so repositories skipped by the
min-score filter still update the cache; update references to config.MinScore,
repoFullName, scorecard.Score and LastScorecardScrapeTime accordingly.
- Around line 22-23: LastScorecardScrapeTime is a process-global sync.Map
causing cross-test/instance leakage; instead, make the scrape cache an instance
field (e.g., add a field like scorecardCache sync.Map on the
ScorecardScraper/OpenssfScraper struct) and replace all uses of the
package-level LastScorecardScrapeTime with that instance field so each scraper
has its own cache; update constructors to initialize the field and update tests
to create new scraper instances (or provide a Reset method) to avoid shared
mutable state.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 24de88c and 8f8ab18.

📒 Files selected for processing (3)
  • api/v1/openssf.go
  • api/v1/zz_generated.deepcopy.go
  • scrapers/openssf/scorecard.go

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 6

♻️ Duplicate comments (2)
scrapers/github/security.go (2)

50-66: ⚠️ Potential issue | 🟠 Major

Alert retrieval is single-page and can truncate counts.

Each alert category is fetched once at PerPage=100; repositories with more alerts will be undercounted and health status can be skewed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/security.go` around lines 50 - 66, The current single-call
fetch for dependabot, code scanning, and secret scanning alerts
(client.GetDependabotAlerts, GetCodeScanningAlerts, GetSecretScanningAlerts)
only requests PerPage=100 and will miss additional pages; change each call to a
paginated loop that sets opts.Page (starting at 1) and calls the respective
method repeatedly, capture the response (not "_"), append results to
alerts.dependabot / alerts.codeScanning / alerts.secretScanning across
iterations, and stop when resp.NextPage == 0 (or when the returned next page is
zero). Keep PerPage=100 (or configurable) and propagate errors as before inside
the loop.

30-40: ⚠️ Potential issue | 🟠 Major

Configured security filters are only partially applied.

securityFilters.severity and securityFilters.maxAge are ignored when building AlertListOptions, so configured filtering is silently dropped.

Suggested fix
 	opts := AlertListOptions{
 		State:   stateFilter,
 		Page:    1,
 		PerPage: 100,
 	}
+	if len(filters.Severity) > 0 {
+		opts.Severity = filters.Severity[0]
+	}
+	if d, err := filters.ParseMaxAge(); err == nil && d > 0 {
+		opts.CreatedAt = time.Now().Add(-d).Format(time.RFC3339)
+	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/security.go` around lines 30 - 40, The current
AlertListOptions only sets State/Page/PerPage and ignores
config.SecurityFilters.severity and .maxAge; update the code that builds
AlertListOptions (where filters := config.SecurityFilters and opts :=
AlertListOptions{...}) to include the severity filter (set the appropriate
Severity field from filters.Severity or filters.Severity[0]) and apply maxAge by
computing a cutoff time (e.g., time.Now().Add(-filters.MaxAge) or converting
filters.MaxAge to the API's date/age field) and setting the corresponding
AlertListOptions field (e.g., CreatedAfter, Since or MaxAge) so the API returns
only alerts matching severity and within maxAge alongside State/Page/PerPage.
🧹 Nitpick comments (1)
scrapers/github/openssf.go (1)

82-85: Consider cancellation-aware retry backoff.

Using direct time.Sleep delays cancellation responsiveness during retries; a select on timer vs ctx.Done() is safer for shutdown behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/openssf.go` around lines 82 - 85, The retry backoff currently
uses time.Sleep which ignores cancellation; replace the direct sleep with a
context-aware timer: create a timer like
time.NewTimer(time.Duration(attempt*attempt)*time.Second) and use select { case
<-timer.C: /* continue */ case <-ctx.Done(): timer.Stop(); return ctx.Err() }
(or break/propagate error) so retries abort promptly on ctx cancellation; apply
this change around the for attempt := 0; attempt < maxRetries; attempt++ loop
(use the existing attempt, maxRetries and ctx identifiers) and ensure timers are
stopped to avoid leaks.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@scrapers/github/client.go`:
- Around line 60-67: AlertListOptions.CreatedAt is ignored so incremental
scraping silently does full fetch; update the three fetch paths
(GetDependabotAlerts, GetCodeScanningAlerts, GetSecretScanningAlerts) to apply
CreatedAt to the corresponding go-github list option struct before making
requests (e.g., map AlertListOptions.CreatedAt into the API option field such as
CreatedAt/CreatedSince/Created parameter used by each GitHub alerts client),
ensure you only set this field when AlertListOptions.CreatedAt is non-empty, and
propagate it through any helper that builds the request options so incremental
scraping is honored.

In `@scrapers/github/openssf.go`:
- Around line 92-101: The response body is deferred closed which keeps it open
across retry iterations; after the httpClient.Do(req) call in the retry loop
(variables resp, err, lastErr, and function isRetryable), read and close
resp.Body immediately instead of deferring: ensure you consume or explicitly
close resp.Body right after receiving resp (before any continue or status
checks) so connections are returned to the pool; then proceed to check status
codes and handle retryable errors or return values.

In `@scrapers/github/scraper_e2e_test.go`:
- Around line 36-37: The failure formatting calls run.ProcessState.ExitCode()
unguarded and can panic if the command never started; change the require.NoError
invocation so that when err != nil you compute an exit code safely by checking
if run.ProcessState != nil and using run.ProcessState.ExitCode() only then,
otherwise use a sentinel (e.g., -1 or "N/A") and include that safe value in the
error message passed to require.NoError (keep references to run, runOut, err,
run.ProcessState and require.NoError to locate the change).

In `@scrapers/github/scraper.go`:
- Around line 62-67: The current logic in the block that calls
scrapeSecurityAlerts (guarded by config.Security) calls continue on error which
discards the repository's other collected data; instead, remove the continue and
ensure you record the error via results.Errorf(err, "failed to scrape security
alerts for %s", repoFullName) but allow processing to continue so the repository
metadata remains in results; if scrapeSecurityAlerts returns no alerts on error,
set alerts to nil or an empty slice and proceed to attach whatever alerts
(possibly empty) to the repo result so other enrichment and storage (use the
same variables repoFullName, alerts, results) still runs.
- Around line 245-257: Update the mapGitHubSeverity function to handle GitHub
code-scanning values "error", "warning", and "note" so they don't fall through
to models.SeverityInfo; add cases mapping "error" -> models.SeverityHigh,
"warning" -> models.SeverityMedium, and "note" -> models.SeverityLow (and
consider case-insensitive matching) while keeping existing mappings for "low",
"medium", "high", and "critical".

In `@scrapers/github/security.go`:
- Around line 75-77: The loop over codeScanAlerts currently uses
alert.Rule.GetSeverity() and passes its value to countAlertSeverity, which only
handles "critical"/"high"/"medium"/"low" and thus silently drops non-CVSS values
like "error"/"warning"/"note"; change the logic in the loop to first check
alert.Rule.GetSecuritySeverityLevel() (or alert.GetSecuritySeverityLevel()) and
if that returns a non-empty CVSS level use that to call countAlertSeverity,
otherwise map fallback values from alert.Rule.GetSeverity() to a normalized
bucket (or increment an "other"/"note" count) before calling countAlertSeverity;
update any related handling in the same block and in the functions referenced
(countAlertSeverity and any aggregation using alerts.counts) so non-CVSS
severities are accounted for instead of being ignored.
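
For the client.go comment above (AlertListOptions.CreatedAt being ignored), the alert list endpoints for Dependabot, code scanning, and secret scanning do not obviously expose a created-since query parameter, so one option is to honor CreatedAt client-side after fetching. The sketch below uses hypothetical stand-in types (AlertListOptions, alert, filterByCreatedAt); it illustrates the cutoff pattern only, not the repository's actual implementation.

```go
package main

import (
	"fmt"
	"time"
)

// AlertListOptions and alert are illustrative stand-ins, not the repository's
// actual types.
type AlertListOptions struct {
	State     string
	Page      int
	PerPage   int
	CreatedAt time.Time // zero value means "no cutoff", i.e. a full scrape
}

type alert struct {
	ID        int
	CreatedAt time.Time
}

// filterByCreatedAt keeps only alerts created at or after the cutoff so a
// repeat scrape is incremental; when no cutoff is set, everything passes.
func filterByCreatedAt(alerts []alert, opts AlertListOptions) []alert {
	if opts.CreatedAt.IsZero() {
		return alerts
	}
	kept := alerts[:0]
	for _, a := range alerts {
		if !a.CreatedAt.Before(opts.CreatedAt) {
			kept = append(kept, a)
		}
	}
	return kept
}

func main() {
	cutoff := time.Now().Add(-24 * time.Hour)
	fetched := []alert{
		{ID: 1, CreatedAt: time.Now().Add(-48 * time.Hour)}, // older than cutoff
		{ID: 2, CreatedAt: time.Now().Add(-time.Hour)},      // within the window
	}
	recent := filterByCreatedAt(fetched, AlertListOptions{CreatedAt: cutoff})
	fmt.Println(len(recent)) // 1
}
```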

---

Duplicate comments:
In `@scrapers/github/security.go`:
- Around line 50-66: The current single-call fetch for dependabot, code
scanning, and secret scanning alerts (client.GetDependabotAlerts,
GetCodeScanningAlerts, GetSecretScanningAlerts) only requests PerPage=100 and
will miss additional pages; change each call to a paginated loop that sets
opts.Page (starting at 1) and calls the respective method repeatedly, capture
the response (not "_"), append results to alerts.dependabot /
alerts.codeScanning / alerts.secretScanning across iterations, and stop when
resp.NextPage == 0 (or when the returned next page is zero). Keep PerPage=100
(or configurable) and propagate errors as before inside the loop; a hedged
sketch of this paging loop (with the severity/maxAge filters from the next
item) follows these comments.
- Around line 30-40: The current AlertListOptions only sets State/Page/PerPage
and ignores config.SecurityFilters.severity and .maxAge; update the code that
builds AlertListOptions (where filters := config.SecurityFilters and opts :=
AlertListOptions{...}) to include the severity filter (set the appropriate
Severity field from filters.Severity or filters.Severity[0]) and apply maxAge by
computing a cutoff time (e.g., time.Now().Add(-filters.MaxAge) or converting
filters.MaxAge to the API's date/age field) and setting the corresponding
AlertListOptions field (e.g., CreatedAfter, Since or MaxAge) so the API returns
only alerts matching severity and within maxAge alongside State/Page/PerPage.
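
Both duplicate comments above concern the same fetch path in security.go: paging through all result pages and honoring the severity/maxAge filters. The sketch below is a minimal, self-contained illustration; fetchPage, Alert, and Response are hypothetical stand-ins for the go-github client and its response types, and the filter arguments are assumptions rather than the scraper's real config fields.

```go
package main

import (
	"fmt"
	"time"
)

// Alert and Response are stand-ins for the go-github alert and response types.
type Alert struct {
	Severity  string
	CreatedAt time.Time
}

type Response struct {
	NextPage int // 0 means there are no further pages
}

// fetchPage stands in for a list call such as client.GetDependabotAlerts(owner, repo, opts).
func fetchPage(page, perPage int) ([]Alert, *Response, error) {
	if page >= 3 {
		return nil, &Response{NextPage: 0}, nil
	}
	return []Alert{{Severity: "high", CreatedAt: time.Now()}}, &Response{NextPage: page + 1}, nil
}

// fetchAllAlerts pages until NextPage == 0 and applies severity/maxAge filters.
func fetchAllAlerts(severities []string, maxAge time.Duration) ([]Alert, error) {
	cutoff := time.Now().Add(-maxAge)
	allowed := map[string]bool{}
	for _, s := range severities {
		allowed[s] = true
	}

	var out []Alert
	page := 1
	for {
		alerts, resp, err := fetchPage(page, 100)
		if err != nil {
			return nil, err
		}
		for _, a := range alerts {
			if len(allowed) > 0 && !allowed[a.Severity] {
				continue // severity filter
			}
			if maxAge > 0 && a.CreatedAt.Before(cutoff) {
				continue // maxAge filter
			}
			out = append(out, a)
		}
		if resp.NextPage == 0 {
			break
		}
		page = resp.NextPage
	}
	return out, nil
}

func main() {
	alerts, err := fetchAllAlerts([]string{"high", "critical"}, 30*24*time.Hour)
	fmt.Println(len(alerts), err) // 2 <nil>
}
```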

---

Nitpick comments:
In `@scrapers/github/openssf.go`:
- Around line 82-85: The retry backoff currently uses time.Sleep which ignores
cancellation; replace the direct sleep with a context-aware timer: create a
timer like time.NewTimer(time.Duration(attempt*attempt)*time.Second) and use
select { case <-timer.C: /* continue */ case <-ctx.Done(): timer.Stop(); return
ctx.Err() } (or break/propagate error) so retries abort promptly on ctx
cancellation; apply this change around the for attempt := 0; attempt <
maxRetries; attempt++ loop (use the existing attempt, maxRetries and ctx
identifiers) and ensure timers are stopped to avoid leaks.
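
A minimal sketch of the context-aware backoff described above, assuming the same quadratic attempt*attempt delay; doWithRetry and call are illustrative names, not identifiers from the repository.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// doWithRetry retries call with a quadratic backoff that aborts promptly when
// ctx is cancelled; the timer is stopped on the cancellation path to avoid leaks.
func doWithRetry(ctx context.Context, maxRetries int, call func() error) error {
	var lastErr error
	for attempt := 0; attempt < maxRetries; attempt++ {
		if attempt > 0 {
			timer := time.NewTimer(time.Duration(attempt*attempt) * time.Second)
			select {
			case <-timer.C:
				// backoff elapsed, retry
			case <-ctx.Done():
				timer.Stop()
				return ctx.Err()
			}
		}
		if err := call(); err != nil {
			lastErr = err
			continue
		}
		return nil
	}
	return lastErr
}

func main() {
	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()
	err := doWithRetry(ctx, 3, func() error { return errors.New("transient") })
	fmt.Println(err) // context deadline exceeded (cancelled during the backoff)
}
```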

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8f8ab18 and 583741d.

⛔ Files ignored due to path filters (1)
  • go.sum is excluded by !**/*.sum
📒 Files selected for processing (17)
  • api/v1/github.go
  • api/v1/interface.go
  • api/v1/interface_rate_limit_test.go
  • api/v1/types.go
  • cmd/run.go
  • fixtures/github.yaml
  • go.mod
  • scrapers/common.go
  • scrapers/cron.go
  • scrapers/github/client.go
  • scrapers/github/compliance.go
  • scrapers/github/openssf.go
  • scrapers/github/scraper.go
  • scrapers/github/scraper_e2e_test.go
  • scrapers/github/scraper_test.go
  • scrapers/github/security.go
  • scrapers/run.go
✅ Files skipped from review due to trivial changes (1)
  • api/v1/interface_rate_limit_test.go

Comment thread scrapers/github/openssf.go
Comment on lines +92 to +101
resp, err := httpClient.Do(req)
if err != nil {
	lastErr = err
	if isRetryable(err) {
		continue
	}
	return nil, err
}
defer func() { _ = resp.Body.Close() }()

Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat -n scrapers/github/openssf.go | head -150

Repository: flanksource/config-db

Length of output: 5294


Close response bodies immediately in the retry loop, not via defer.

When continue is executed after a retryable error (line 96) or server error (line 107), the deferred close is not invoked until the function exits, leaving the response body open across retry iterations. This prevents connection reuse and risks resource exhaustion.

Read and close the body immediately after the response is received, before status checking:

Suggested fix
		resp, err := httpClient.Do(req)
		if err != nil {
			lastErr = err
			if isRetryable(err) {
				continue
			}
			return nil, err
		}
-		defer func() { _ = resp.Body.Close() }()
+		body, readErr := io.ReadAll(resp.Body)
+		_ = resp.Body.Close()
+		if readErr != nil {
+			return nil, fmt.Errorf("failed to read response body: %w", readErr)
+		}

		if resp.StatusCode == http.StatusNotFound {
			return nil, fmt.Errorf("repository not found or not assessed by OpenSSF Scorecard (404)")
		}
		if resp.StatusCode >= 500 {
			lastErr = fmt.Errorf("server error: %d", resp.StatusCode)
			continue
		}
		if resp.StatusCode != http.StatusOK {
-			body, _ := io.ReadAll(resp.Body)
			return nil, fmt.Errorf("unexpected status %d: %s", resp.StatusCode, string(body))
		}
-
-		body, err := io.ReadAll(resp.Body)
-		if err != nil {
-			return nil, fmt.Errorf("failed to read response body: %w", err)
-		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/openssf.go` around lines 92 - 101, The response body is
deferred closed which keeps it open across retry iterations; after the
httpClient.Do(req) call in the retry loop (variables resp, err, lastErr, and
function isRetryable), read and close resp.Body immediately instead of
deferring: ensure you consume or explicitly close resp.Body right after
receiving resp (before any continue or status checks) so connections are
returned to the pool; then proceed to check status codes and handle retryable
errors or return values.

Comment on lines +36 to +37
runOut, err := run.CombinedOutput()
require.NoError(t, err, "config-db run failed (exit %v):\n%s", run.ProcessState.ExitCode(), string(runOut))
Contributor

⚠️ Potential issue | 🟡 Minor

Guard ProcessState in failure formatting.

run.ProcessState.ExitCode() can panic when the command fails before process state is populated (e.g., spawn/setup failures).

💡 Suggested fix
 runOut, err := run.CombinedOutput()
-require.NoError(t, err, "config-db run failed (exit %v):\n%s", run.ProcessState.ExitCode(), string(runOut))
+exitCode := -1
+if run.ProcessState != nil {
+	exitCode = run.ProcessState.ExitCode()
+}
+require.NoError(t, err, "config-db run failed (exit %v):\n%s", exitCode, string(runOut))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/scraper_e2e_test.go` around lines 36 - 37, The failure
formatting calls run.ProcessState.ExitCode() unguarded and can panic if the
command never started; change the require.NoError invocation so that when err !=
nil you compute an exit code safely by checking if run.ProcessState != nil and
using run.ProcessState.ExitCode() only then, otherwise use a sentinel (e.g., -1
or "N/A") and include that safe value in the error message passed to
require.NoError (keep references to run, runOut, err, run.ProcessState and
require.NoError to locate the change).

Comment on lines +62 to +67
if config.Security {
	alerts, err = scrapeSecurityAlerts(ctx, client, config, repoFullName)
	if err != nil {
		results.Errorf(err, "failed to scrape security alerts for %s", repoFullName)
		continue
	}
Contributor

⚠️ Potential issue | 🟠 Major

Don’t drop repository results when security enrichment fails.

A security-alert fetch error currently skips the entire repository (continue), so core repository metadata is lost even though it was successfully fetched.

💡 Suggested fix
 if config.Security {
 	alerts, err = scrapeSecurityAlerts(ctx, client, config, repoFullName)
 	if err != nil {
 		results.Errorf(err, "failed to scrape security alerts for %s", repoFullName)
-		continue
+		alerts = nil
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/scraper.go` around lines 62 - 67, The current logic in the
block that calls scrapeSecurityAlerts (guarded by config.Security) calls
continue on error which discards the repository's other collected data; instead,
remove the continue and ensure you record the error via results.Errorf(err,
"failed to scrape security alerts for %s", repoFullName) but allow processing to
continue so the repository metadata remains in results; if scrapeSecurityAlerts
returns no alerts on error, set alerts to nil or an empty slice and proceed to
attach whatever alerts (possibly empty) to the repo result so other enrichment
and storage (use the same variables repoFullName, alerts, results) still runs.

Comment on lines +245 to +257
func mapGitHubSeverity(severity string) models.Severity {
	switch severity {
	case "critical":
		return models.SeverityCritical
	case "high":
		return models.SeverityHigh
	case "medium":
		return models.SeverityMedium
	case "low":
		return models.SeverityLow
	default:
		return models.SeverityInfo
	}
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

GitHub REST API Dependabot alerts security_advisory severity values 2024 2025

💡 Result:

For the GitHub REST API Dependabot alerts (/repos/{owner}/{repo}/dependabot/alerts and related endpoints), the security_advisory.severity value is a string with four possible values:

  • low
  • medium
  • high
  • critical [1][2]

This matches the severity filter parameter for listing Dependabot alerts (same four values) [1] and the example alert payloads where security_advisory.severity appears as e.g. "medium" [2].

Note: If you’ve seen moderate, that’s commonly used in other contexts (e.g., some UI/other APIs), but for the REST Dependabot alerts API the value is medium, not moderate [1][2].

Sources
[1] GitHub Docs – REST API endpoints for Dependabot alerts (severity filter values) (docs.github.com)
[2] GitHub Docs – Dependabot alerts REST example payload showing security_advisory.severity (docs.github.com)



🌐 Web query:

GitHub REST API code scanning alert rule severity values version 2022-11-28

💡 Result:

For GitHub REST API Code Scanning alerts with X-GitHub-Api-Version: 2022-11-28, the alert rule object uses these severity-related fields:

  • rule.severity: error, warning, note (shown in the code scanning alerts REST responses) [1]
  • rule.security_severity_level (for security queries): critical, high, medium, low [1], [2]

Also, the severity filter parameter for listing alerts accepts the combined set: critical, high, medium, low, warning, note, error [1].

Sources: [1] [2] [3]


🏁 Script executed:

cat -n scrapers/github/scraper.go | head -300

Repository: flanksource/config-db

Length of output: 9904


🏁 Script executed:

rg "mapGitHubSeverity" scrapers/github/scraper.go -B 5 -A 2

Repository: flanksource/config-db

Length of output: 852


🏁 Script executed:

rg "dependabot|code.?scanning|CodeQL" scrapers/github/scraper.go -i

Repository: flanksource/config-db

Length of output: 455


Map GitHub code scanning severity values to prevent silent downgrades to Info.

Code scanning alerts return error, warning, and note in rule.severity, which are not handled and fall through to the default case. Dependabot already uses the correct set of values (low, medium, high, critical).

💡 Suggested fix
 import (
 	"fmt"
+	"strings"
 	"sync"
 	"time"
@@
 func mapGitHubSeverity(severity string) models.Severity {
-	switch severity {
+	switch strings.ToLower(strings.TrimSpace(severity)) {
 	case "critical":
 		return models.SeverityCritical
-	case "high":
+	case "high", "error":
 		return models.SeverityHigh
-	case "medium":
+	case "medium", "warning":
 		return models.SeverityMedium
-	case "low":
+	case "low", "note":
 		return models.SeverityLow
 	default:
 		return models.SeverityInfo
 	}
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/scraper.go` around lines 245 - 257, Update the
mapGitHubSeverity function to handle GitHub code-scanning values "error",
"warning", and "note" so they don't fall through to models.SeverityInfo; add
cases mapping "error" -> models.SeverityHigh, "warning" ->
models.SeverityMedium, and "note" -> models.SeverityLow (and consider
case-insensitive matching) while keeping existing mappings for "low", "medium",
"high", and "critical".

Comment on lines +75 to +77
for _, alert := range codeScanAlerts {
	countAlertSeverity(&alerts.counts, alert.Rule.GetSeverity())
	if alert.UpdatedAt != nil && alert.UpdatedAt.After(maxAlertTime) {
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the security.go file to understand the context
cat -n scrapers/github/security.go | head -120

Repository: flanksource/config-db

Length of output: 4141


🏁 Script executed:

# Let's search for documentation or comments about severity levels in the codebase
rg -i "severity" scrapers/github/ -A 2 -B 2

Repository: flanksource/config-db

Length of output: 5498


🏁 Script executed:

# Let's see how mapGitHubSeverity is used in scraper.go
rg -A 10 "mapGitHubSeverity" scrapers/github/scraper.go | head -40

Repository: flanksource/config-db

Length of output: 1061


🌐 Web query:

GitHub code scanning alert severity levels documentation

💡 Result:

GitHub documents two related “severity” concepts for Code Scanning alerts:

  • Severity (non-security level): Error, Warning, or Note. [1]
  • Security severity (security alerts, e.g., CodeQL security queries): Critical, High, Medium, or Low—and when present, GitHub uses this in preference to severity. [1]

Related docs (GitHub Docs):

  • “About code scanning alerts” (includes “About alert severity and security severity levels”). [1]
  • “Triaging code scanning alerts in pull requests” (notes default PR check failures for error, critical, high). [2]
  • “Filtering alerts in security overview” (shows severity: filter values include both critical/high/medium/low and error/warning/note). [3]

Sources: [1] [2] [3]


🏁 Script executed:

# Search for any use of GetSecuritySeverityLevel in the codebase
rg "GetSecuritySeverityLevel" .

Repository: flanksource/config-db

Length of output: 47


🏁 Script executed:

# Check go.mod for go-github version
rg "github.com/google/go-github" go.mod

Repository: flanksource/config-db

Length of output: 158


🌐 Web query:

go-github v73 Alert Rule GetSecuritySeverityLevel method

💡 Result:

In go-github/v73, GetSecuritySeverityLevel() is a getter on the github.Rule type used by code scanning alerts (not an AlertRule type).

  • Struct field: SecuritySeverityLevel *string `json:"security_severity_level,omitempty"` on type Rule struct (code scanning rule metadata) [1]
  • Getter implementation:

```go
func (r *Rule) GetSecuritySeverityLevel() string {
	if r == nil || r.SecuritySeverityLevel == nil {
		return ""
	}
	return *r.SecuritySeverityLevel
}
```
[2]

Typical usage from an alert:

```go
sev := alert.GetRule().GetSecuritySeverityLevel()
```
[1][2]

---

🏁 Script executed:

# Check all usages of countAlertSeverity
rg "countAlertSeverity" scrapers/github/

Repository: flanksource/config-db

Length of output: 350


Code scanning severity mapping silently drops non-CVSS alerts.

alert.Rule.GetSeverity() can return "error", "warning", or "note" (non-security levels), but countAlertSeverity only handles "critical", "high", "medium", "low" with no default case, so these alerts are never counted. GitHub's Code Scanning API provides GetSecuritySeverityLevel() for CVSS levels and should be checked first.

Suggested normalization approach
-	for _, alert := range codeScanAlerts {
-		countAlertSeverity(&alerts.counts, alert.Rule.GetSeverity())
+	for _, alert := range codeScanAlerts {
+		sev := alert.Rule.GetSecuritySeverityLevel()
+		if sev == "" {
+			sev = alert.Rule.GetSeverity()
+		}
+		countAlertSeverity(&alerts.counts, strings.ToLower(sev))
 		if alert.UpdatedAt != nil && alert.UpdatedAt.After(maxAlertTime) {
 			maxAlertTime = alert.UpdatedAt.Time
 		}
 	}
...
 func countAlertSeverity(counts *alertCounts, severity string) {
 	switch severity {
 	case "critical":
 		counts.critical++
-	case "high":
+	case "high", "error":
 		counts.high++
-	case "medium":
+	case "medium", "warning":
 		counts.medium++
-	case "low":
+	case "low", "note":
 		counts.low++
 	}
 }

Also applies to: 99-109

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@scrapers/github/security.go` around lines 75 - 77, The loop over
codeScanAlerts currently uses alert.Rule.GetSeverity() and passes its value to
countAlertSeverity, which only handles "critical"/"high"/"medium"/"low" and thus
silently drops non-CVSS values like "error"/"warning"/"note"; change the logic
in the loop to first check alert.Rule.GetSecuritySeverityLevel() (or
alert.GetSecuritySeverityLevel()) and if that returns a non-empty CVSS level use
that to call countAlertSeverity, otherwise map fallback values from
alert.Rule.GetSeverity() to a normalized bucket (or increment an "other"/"note"
count) before calling countAlertSeverity; update any related handling in the
same block and in the functions referenced (countAlertSeverity and any
aggregation using alerts.counts) so non-CVSS severities are accounted for
instead of being ignored.

Keep OpenSSF Scorecard results (richer data: scores, compliance
mappings, details) and drop matching Code Scanning alerts. Previously
only Vulnerabilities/SAST were filtered from OpenSSF side.
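
The commit above describes keeping the richer OpenSSF Scorecard results and dropping Code Scanning alerts that cover the same ground. The sketch below is a hedged illustration of that de-duplication; the matching key (a lower-cased check/rule name) and the finding type are assumptions for the example, not the PR's actual merge logic.

```go
package main

import (
	"fmt"
	"strings"
)

type finding struct {
	Source string // "openssf" or "code-scanning"
	Check  string
}

// dedupe keeps every OpenSSF Scorecard finding and drops Code Scanning alerts
// whose check/rule name is already represented by a Scorecard result.
func dedupe(openssf, codeScanning []finding) []finding {
	covered := map[string]bool{}
	for _, f := range openssf {
		covered[strings.ToLower(f.Check)] = true
	}
	out := append([]finding{}, openssf...)
	for _, f := range codeScanning {
		if covered[strings.ToLower(f.Check)] {
			continue // already represented by the richer Scorecard result
		}
		out = append(out, f)
	}
	return out
}

func main() {
	openssf := []finding{{Source: "openssf", Check: "Vulnerabilities"}, {Source: "openssf", Check: "SAST"}}
	codeScan := []finding{{Source: "code-scanning", Check: "vulnerabilities"}, {Source: "code-scanning", Check: "Token-Permissions"}}
	fmt.Println(len(dedupe(openssf, codeScan))) // 3
}
```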
@moshloop moshloop merged commit 7400158 into main Feb 26, 2026
13 of 15 checks passed
@moshloop moshloop deleted the security-scrapers branch February 26, 2026 21:19