Skip to content

Conversation

@joejstuart
Copy link
Contributor

@joejstuart joejstuart commented Jan 8, 2026

User description

This adds parallel fetching of image manifests and caching for each function call that fetches from a registry.

https://issues.redhat.com/browse/EC-1600

Assisted-by: Claude Opus 4.5


PR Type

Enhancement


Description

  • Add parallel manifest fetching with ec.oci.image_manifests function

  • Implement per-function caching using sync.Map and singleflight.Group

  • Add concurrency control limiting parallel fetches to GOMAXPROCS * 4

  • Refactor existing OCI functions to use caching and singleflight deduplication


Diagram Walkthrough

flowchart LR
  A["OCI Functions<br/>blob, descriptor, manifest, etc."] -->|"Add caching<br/>& singleflight"| B["sync.Map<br/>+ singleflight.Group"]
  C["New Function<br/>ociImageManifestsBatch"] -->|"Parallel fetch<br/>with errgroup"| D["Multiple Manifests<br/>Concurrent"]
  B -->|"Prevent<br/>thundering herd"| E["Optimized<br/>Registry Access"]
  D -->|"Bounded by<br/>GOMAXPROCS*4"| E
Loading

File Walkthrough

Relevant files
Enhancement
oci.go
Implement caching and parallel manifest fetching                 

internal/rego/oci/oci.go

  • Add imports for concurrency primitives (sync, runtime, errgroup,
    singleflight)
  • Introduce package-level caches (sync.Map) and singleflight groups for
    blob, descriptor, manifest, image files, and image index operations
  • Implement new ociImageManifestsBatch function that fetches multiple
    manifests in parallel with bounded concurrency
  • Refactor all existing OCI functions (ociBlob, ociDescriptor,
    ociImageManifest, ociImageFiles, ociImageIndex) to use caching and
    singleflight deduplication
  • Add maxParallelManifestFetches variable set to GOMAXPROCS * 4 to
    control concurrency
  • Register new batch function in init() function
+563/-202
Tests
oci_test.go
Add tests for batch manifest fetching                                       

internal/rego/oci/oci_test.go

  • Add comprehensive test cases for ociImageManifestsBatch covering
    single ref, multiple refs, empty array, invalid input, and error
    scenarios
  • Add TestOCIImageManifestsBatchConcurrency to verify bounded
    concurrency behavior with configurable limits
  • Update TestFunctionsRegistered to include the new
    ociImageManifestsBatchName function
  • Add fmt import for test string formatting
+165/-1 
Documentation
ec_oci_image_manifests.adoc
Document new parallel manifest function                                   

docs/modules/ROOT/pages/ec_oci_image_manifests.adoc

  • Create new documentation file for ec.oci.image_manifests function
  • Document function signature, parameters, and return type
  • Describe the function as fetching Image Manifests from OCI registry in
    parallel
+20/-0   
rego_builtins.adoc
Update builtins reference table                                                   

docs/modules/ROOT/pages/rego_builtins.adoc

  • Add entry for ec.oci.image_manifests in the builtins reference table
  • Include cross-reference to the new function documentation
+2/-0     
rego_nav.adoc
Update navigation sidebar                                                               

docs/modules/ROOT/partials/rego_nav.adoc

  • Add navigation entry for ec.oci.image_manifests in the Rego reference
    sidebar
+1/-0     

@qodo-code-review
Copy link
Contributor

qodo-code-review bot commented Jan 8, 2026

PR Compliance Guide 🔍

Below is a summary of compliance checks for this PR:

Security Compliance
Memory exhaustion

Description: Unbounded process-lifetime caching via package-level sync.Map (e.g., blobCache,
manifestCache, imageFilesCache, etc.) combined with reading full blob/layer contents into
memory (ociBlob copies entire layer into a bytes.Buffer and caches it) can enable
memory-exhaustion DoS if an attacker can trigger many unique refs or very large
blobs/manifests, since there is no size limit, TTL, or eviction policy.
oci.go [632-875]

Referred Code
// maxParallelManifestFetches limits concurrent manifest fetches to avoid overwhelming registries.
// Defaults to GOMAXPROCS * 4, which provides good parallelism while being respectful of resources.
var maxParallelManifestFetches = runtime.GOMAXPROCS(0) * 4

// Package-level caches for OCI operations.
// OPA's Memoize only works within a single Eval() call, but we validate multiple
// images in separate Eval() calls. These caches persist for the lifetime of the process.
// All caches are keyed by the ref string (or ref+paths for image files).
//
// We use singleflight.Group alongside sync.Map to prevent thundering herd:
// - sync.Map stores the cached results
// - singleflight.Group ensures only one goroutine fetches a given key at a time
var (
	blobCache        sync.Map           // map[string]*ast.Term - for ociBlob
	blobFlight       singleflight.Group // deduplicates concurrent blob fetches
	descriptorCache  sync.Map           // map[string]*ast.Term - for ociDescriptor
	descriptorFlight singleflight.Group
	manifestCache    sync.Map // map[string]*ast.Term - for ociImageManifest
	manifestFlight   singleflight.Group
	imageFilesCache  sync.Map // map[string]*ast.Term - for ociImageFiles (key: ref+pathsHash)
	imageFilesFlight singleflight.Group


 ... (clipped 223 lines)
Ticket Compliance
🎫 No ticket provided
  • Create ticket/issue
Codebase Duplication Compliance
Codebase context is not defined

Follow the guide to enable codebase context checks.

Custom Compliance
🟢
Generic: Comprehensive Audit Trails

Objective: To create a detailed and reliable record of critical system actions for security analysis
and compliance.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Meaningful Naming and Self-Documenting Code

Objective: Ensure all identifiers clearly express their purpose and intent, making code
self-documenting

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Error Handling

Objective: To prevent the leakage of sensitive system information through error messages while
providing sufficient detail for internal debugging.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Secure Logging Practices

Objective: To ensure logs are useful for debugging and auditing without exposing sensitive
information like PII, PHI, or cardholder data.

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

Generic: Security-First Input Validation and Data Handling

Objective: Ensure all data inputs are validated, sanitized, and handled securely to prevent
vulnerabilities

Status: Passed

Learn more about managing compliance generic rules or creating your own custom rules

🔴
Generic: Robust Error Handling and Edge Case Management

Objective: Ensure comprehensive error handling that provides meaningful context and graceful
degradation

Status:
Ignored errors: The new parallel batch manifest fetch ignores errors from ociImageManifest and from
errgroup.Wait(), causing silent failures without actionable context for debugging.

Referred Code
for _, refTerm := range uncachedTerms {
	term := refTerm
	g.Go(func() error {
		select {
		case <-ctx.Done():
			return nil
		default:
		}

		ref := string(term.Value.(ast.String))
		manifest, _ := ociImageManifest(bctxWithCancel, term)

		// Store in cache (even nil results to avoid re-fetching failures)
		if manifest != nil {
			manifestCache.Store(ref, manifest)
		}

		results <- manifestResult{ref: ref, manifest: manifest}
		return nil
	})
}


 ... (clipped 5 lines)

Learn more about managing compliance generic rules or creating your own custom rules

  • Update
Compliance status legend 🟢 - Fully Compliant
🟡 - Partial Compliant
🔴 - Not Compliant
⚪ - Requires Further Human Verification
🏷️ - Compliance label

@qodo-code-review
Copy link
Contributor

qodo-code-review bot commented Jan 8, 2026

PR Code Suggestions ✨

Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
High-level
Global caches introduce data staleness

The global, package-level caches for OCI artifacts lack an invalidation
mechanism. This causes data staleness when image tags are updated, leading to
incorrect policy validations in long-running processes.

Examples:

internal/rego/oci/oci.go [649-650]
	manifestCache    sync.Map // map[string]*ast.Term - for ociImageManifest
	manifestFlight   singleflight.Group
internal/rego/oci/oci.go [546-550]
	// Check cache first (fast path)
	if cached, found := manifestCache.Load(refStr); found {
		logger.Debug("Image manifest served from cache")
		return cached.(*ast.Term), nil
	}

Solution Walkthrough:

Before:

func ociImageManifest(bctx, refTerm) {
  // refTerm could be "myimage:latest"
  client := oci.NewClient(bctx.Context)
  
  // Always resolves the tag to the latest digest from the registry
  uri, ref, err := resolveIfNeeded(client, refTerm)
  
  // Fetches the manifest for the resolved digest
  image, err := client.Image(ref)
  manifest, err := image.Manifest()
  
  return manifest
}

After:

// Global cache, persists for process lifetime
var manifestCache sync.Map

func ociImageManifest(bctx, refTerm) {
  refStr := string(refTerm.Value) // e.g., "myimage:latest"

  // Check cache with the mutable tag as key
  if cached, found := manifestCache.Load(refStr); found {
    // Returns stale data if "myimage:latest" was updated in the registry
    return cached
  }

  // Fetch, resolve, and get manifest...
  // ...
  
  // Store in cache using the mutable tag as the key
  manifestCache.Store(refStr, term)
  return term
}
Suggestion importance[1-10]: 9

__

Why: This suggestion correctly identifies a critical design flaw where using mutable image tags as cache keys without an invalidation mechanism will lead to stale data, compromising the correctness of policy evaluations.

High
Cache does not respect client configuration

The caching implementation uses only the image reference for its key, ignoring
OCI client configurations like authentication passed via context. This can lead
to security vulnerabilities, such as reusing cached data across different
credential contexts.

Examples:

internal/rego/oci/oci.go [645-655]
	blobCache        sync.Map           // map[string]*ast.Term - for ociBlob
	blobFlight       singleflight.Group // deduplicates concurrent blob fetches
	descriptorCache  sync.Map           // map[string]*ast.Term - for ociDescriptor
	descriptorFlight singleflight.Group
	manifestCache    sync.Map // map[string]*ast.Term - for ociImageManifest
	manifestFlight   singleflight.Group
	imageFilesCache  sync.Map // map[string]*ast.Term - for ociImageFiles (key: ref+pathsHash)
	imageFilesFlight singleflight.Group
	imageIndexCache  sync.Map // map[string]*ast.Term - for ociImageIndex
	imageIndexFlight singleflight.Group

 ... (clipped 1 lines)
internal/rego/oci/oci.go [561]
		client := oci.NewClient(bctx.Context)

Solution Walkthrough:

Before:

func ociImageManifest(bctx, refTerm) {
  // Client is created with context-specific configuration (e.g., auth)
  client := oci.NewClient(bctx.Context)
  
  // Fetch uses the specific client, respecting its configuration
  image, err := client.Image(ref)
  // ...
}

// Call 1:
ociImageManifest(context_with_creds, "private/image:tag") // Succeeds
// Call 2:
ociImageManifest(context_without_creds, "private/image:tag") // Fails

After:

// Global cache shared by all clients
var manifestCache sync.Map

func ociImageManifest(bctx, refTerm) {
  refStr := string(refTerm.Value)

  // Cache is checked using only the image ref
  if cached, found := manifestCache.Load(refStr); found {
    return cached
  }

  // Client is created with context-specific config
  client := oci.NewClient(bctx.Context)
  // ... fetch logic ...
  
  // Result is stored in global cache with a generic key
  manifestCache.Store(refStr, term)
}

// Call 1 (with auth): Caches result for "private/image:tag"
// Call 2 (without auth): Hits cache and incorrectly returns the result
Suggestion importance[1-10]: 9

__

Why: This suggestion correctly identifies a severe design and security flaw where the global cache key ignores client configuration from the context, potentially causing credential bypass and incorrect data access.

High
General
Log batch fetch errors

Capture and log the error returned by g.Wait() to ensure failures in parallel
fetches are not silently ignored.

internal/rego/oci/oci.go [764-767]

 go func() {
-    _ = g.Wait()
+    err := g.Wait()
     close(results)
+    if err != nil {
+        logger.WithError(err).Error("error fetching manifests in parallel")
+    }
 }()
  • Apply / Chat
Suggestion importance[1-10]: 7

__

Why: The suggestion correctly points out that errors from g.Wait() are ignored. Logging these errors is a good practice for improving the observability and debuggability of failures within the goroutine group.

Medium
Remove redundant manifest caching logic

Remove the redundant call to manifestCache.Store in ociImageManifestsBatch, as
the called function ociImageManifest already performs the caching.

internal/rego/oci/oci.go [754-757]

-			// Store in cache (even nil results to avoid re-fetching failures)
-			if manifest != nil {
-				manifestCache.Store(ref, manifest)
-			}
+			// The ociImageManifest function handles its own caching, so
+			// no need to store the result here.
  • Apply / Chat
Suggestion importance[1-10]: 6

__

Why: The suggestion correctly identifies that ociImageManifest already handles caching, making the explicit manifestCache.Store call in ociImageManifestsBatch redundant and its removal a good code simplification.

Low
Remove unnecessary mutex around slice append

Remove the unnecessary sync.Mutex from the loop that collects results, as the
loop runs serially in a single goroutine and the slice append is already safe.

internal/rego/oci/oci.go [769-777]

 	// Collect newly fetched results
-	var mu sync.Mutex
 	for result := range results {
 		if result.manifest != nil {
-			mu.Lock()
 			resultTerms = append(resultTerms, ast.Item(ast.StringTerm(result.ref), result.manifest))
-			mu.Unlock()
 		}
 	}
  • Apply / Chat
Suggestion importance[1-10]: 5

__

Why: The suggestion correctly points out that the for...range loop over the channel runs in a single goroutine, making the mutex protection for the slice append unnecessary.

Low
  • Update

@joejstuart joejstuart force-pushed the EC-1600 branch 2 times, most recently from e348ad6 to 075dbf8 Compare January 8, 2026 02:14
@codecov
Copy link

codecov bot commented Jan 8, 2026

Codecov Report

❌ Patch coverage is 85.89109% with 57 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
internal/rego/oci/oci.go 85.89% 57 Missing ⚠️
Flag Coverage Δ
acceptance 55.59% <38.36%> (-0.38%) ⬇️
generative 18.57% <0.00%> (-0.40%) ⬇️
integration 27.87% <0.00%> (-0.58%) ⬇️
unit 68.29% <85.89%> (+0.32%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
internal/rego/oci/oci.go 91.91% <85.89%> (-3.87%) ⬇️
🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

This adds parallel fetching of image manifests
and caching for each function call that fetches
from a registry.

https://issues.redhat.com/browse/EC-1600

Assisted-by: Claude-4.5-opus
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant