Skip to content

Conversation

@Gkrumbach07
Copy link
Collaborator

Overview

This PR introduces a comprehensive overhaul of the E2E testing infrastructure and documentation organization for the Ambient Code Platform.

Key Changes

E2E Testing Infrastructure

  • New Kind overlay for local E2E testing with proper secrets and configuration
  • 🔧 Improved E2E scripts with better setup, cleanup, and token extraction
  • 🧪 Enhanced Cypress tests including new session management tests
  • 🐳 Image loading utilities for local development testing
  • 🔐 Test user creation and authentication handling

Documentation Reorganization

  • 📚 Restructured docs/ directory with clear sections:
    • agents/ - Active and archived agent definitions
    • architecture/ - System diagrams and architecture docs
    • deployment/ - Deployment guides including OpenShift, OAuth, S3, MinIO
    • developer/ - Developer guides including security, build metadata, and local development options (kind, CRC, hybrid)
    • integrations/ - GitHub, GitLab, Google Workspace integrations
    • observability/ - Langfuse and metrics documentation
    • testing/ - Testing strategy and guides
    • tools/ - Amber and other tooling docs
  • 📖 New documentation map (DOCUMENTATION_MAP.md) for easy navigation
  • Comprehensive README files for each section

Code Changes

  • 🔧 Operator updates for improved session handling
  • 🐳 Runner improvements including MCP configuration and hydration script updates
  • 🌐 Frontend API client enhancements for better error handling

Cleanup

  • 🗑️ Removed obsolete documentation (7,000+ lines of outdated/duplicate content)
  • 🗑️ Cleaned up old migration plans and implementation documents

Testing

The changes have been validated with:

  • ✅ Local kind cluster testing
  • ✅ E2E test suite execution
  • ✅ Documentation structure review

Impact

  • Documentation: Significantly improved organization and discoverability
  • Developer Experience: Easier local testing with kind overlay
  • CI/CD: Better E2E testing infrastructure for GitHub Actions
  • Maintenance: Reduced documentation duplication and confusion

Files Changed

  • 128 files changed
  • 4,477 insertions(+), 11,746 deletions(-)
  • Net reduction of ~7,000 lines while improving clarity

Related Issues

Addresses ongoing E2E testing improvements and documentation reorganization efforts.

@github-actions

This comment has been minimized.

Reverts the automatic copying of ambient-runner-secrets from operator
namespace to session namespaces. This simplifies the operator logic and
removes potential security concerns with cross-namespace secret copying.

For E2E testing, secrets should be configured directly in the project
namespace via the backend API or kubectl.
@codecov
Copy link

codecov bot commented Jan 23, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@github-actions
Copy link
Contributor

github-actions bot commented Jan 23, 2026

Claude Code Review

Summary

This PR introduces significant improvements to E2E testing infrastructure and documentation organization. The changes include 128 files with a net reduction of ~7,000 lines while improving clarity. The PR primarily focuses on infrastructure, documentation, and testing improvements with minimal production code changes.

Overall Assessment: APPROVE with Minor Recommendations

The PR demonstrates strong adherence to project standards with excellent documentation reorganization and testing infrastructure improvements. A few minor security and code quality considerations should be addressed.


Issues by Severity

🚫 Blocker Issues

None - No blocking issues found.

🔴 Critical Issues

None - No critical issues found.

🟡 Major Issues

1. Frontend E2E Token Handling (Security Pattern Concern)

Location: components/frontend/src/services/api/client.ts:96-100

Issue: While the NEXT_PUBLIC_ prefix is appropriate for client-side environment variables, the comment "without localStorage hacks" suggests this may be replacing a problematic pattern. However, exposing tokens via client-side environment variables (even for E2E) could set a problematic precedent.

Recommendation: Add explicit comment that this is ONLY for E2E testing environments, consider adding a runtime check to prevent this code from running in production, and document this pattern in the E2E testing guide with clear warnings.

2. Operator Secret Management Simplification

Location: components/operator/internal/handlers/sessions.go:644-660

Change: Removed automatic secret copying from operator namespace to session namespaces (60+ lines removed).

Positive: This simplification aligns with KISS principles and reduces complexity.

Concern: Migration path unclear - existing deployments may rely on automatic secret propagation. Ensure migration guide exists for users upgrading.

🔵 Minor Issues

  1. Missing error context preservation in frontend client.ts (lines 211-212)
  2. Comment placement in adapter.py could be improved
  3. Documentation map could include links to ADRs and repomix guide

Positive Highlights

✨ Excellent Work

  1. Documentation Reorganization ⭐⭐⭐⭐⭐

    • New structure (agents/, architecture/, deployment/, developer/) is significantly better than previous flat structure
    • DOCUMENTATION_MAP.md provides excellent navigation
    • Net reduction of 7,000+ lines by removing duplicates/obsolete content
  2. E2E Testing Infrastructure ⭐⭐⭐⭐⭐

    • New overlays/kind/ overlay for local E2E testing is well-structured
    • Proper secret management with test-user.yaml, runner-secrets.yaml
    • Kustomization follows established patterns from overlays/e2e/
  3. Code Simplification ⭐⭐⭐⭐

    • Removing operator secret-copying logic (60+ lines) aligns with KISS principles
    • Clearer error messages in operator sessions.go
  4. Adherence to Standards ⭐⭐⭐⭐⭐

    • Frontend changes use existing patterns (apiClient, proper error handling)
    • Operator changes follow error handling patterns
    • No violations of critical rules (user token auth, no panics, etc.)

Recommendations

High Priority

  1. Security Documentation: Add explicit warnings about E2E token pattern to prevent misuse
  2. Migration Guide: Document impact of removed secret-copying logic for existing users

Medium Priority

  1. Error Handling: Improve error context preservation in frontend client.ts
  2. Documentation Links: Add ADR and repomix guide links to DOCUMENTATION_MAP.md

Final Verdict

APPROVE

This PR demonstrates:

  • Strong architectural improvements (documentation reorganization)
  • Thoughtful code simplification (removed 60+ lines of complex secret logic)
  • Excellent testing infrastructure additions (kind overlay)
  • Adherence to project standards and patterns

The minor issues identified are non-blocking and can be addressed in follow-up PRs or immediately before merge.

Recommended Actions Before Merge:

  1. Add production guard to E2E token code (NODE_ENV !== production check)
  2. Verify CI passes (linting, E2E tests)
  3. Add migration notes to release documentation

Review conducted by: Claude Code (AI Code Review Agent)
Standards Applied: CLAUDE.md, backend-development.md, frontend-development.md, security-standards.md, k8s-client-usage.md, error-handling.md, react-query-usage.md
Date: 2026-01-23


🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

When ANTHROPIC_API_KEY is set in e2e/.env, automatically create a
'test-project' namespace with the key pre-configured. This enables
immediate manual testing without needing to configure via UI/API.

Changes:
- deploy.sh: Create test-project namespace with ambient-runner-secrets
- refresh-env.sh: Update test-project secret when API key changes
- env.example: Document the automatic namespace configuration

For other project namespaces, users should configure API keys via:
- UI: Project Settings → API Keys
- API: PUT /api/projects/{project}/runner-secrets
- kubectl: Create ambient-runner-secrets in project namespace
@github-actions
Copy link
Contributor

github-actions bot commented Jan 23, 2026

Claude Code Review

This PR introduces a comprehensive overhaul of E2E testing infrastructure and documentation organization. Changes span ~128 files with 4,444 additions and 11,733 deletions.

Critical Issues

  1. Security: Test User with cluster-admin - components/manifests/overlays/kind/test-user.yaml:32 grants cluster-admin which is too permissive. Should use namespace-scoped role instead.

  2. Frontend: Token Leak Risk - components/frontend/src/services/api/client.ts:97-99 uses NEXT_PUBLIC_E2E_TOKEN without production guard. Add NODE_ENV check.

  3. Hardcoded Test Secrets - components/manifests/overlays/kind/secrets.yaml:16 has hardcoded secrets in git. Add warning comments.

Major Issues

  1. Resource Limits - Increased to 2CPU/2Gi without justification. Document rationale.
  2. File Permissions - Changed to chmod 777. Should use 775 since fsGroup is set.
  3. Missing Migration Guide - Doc reorganization needs migration guide for old paths.

Positive Highlights

✅ Excellent E2E infrastructure modernization with Kind
✅ Strong documentation organization (7,000 lines removed)
✅ Security context improvements with fsGroup
✅ Developer experience improvements (fast local dev)
✅ Code quality follows standards

Recommendation

Approve with required changes: Fix test RBAC, add production guard to E2E token, document resource limits.

This is high-quality work that significantly improves testing and documentation. Minor issues can be addressed in follow-up PRs.


Review based on CLAUDE.md and memory system standards


🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

Adds a configurable KIND_HTTP_PORT variable to control which port the
kind cluster listens on. This allows developers to choose between:

- KIND_HTTP_PORT=auto (default): Uses 8080 for podman, 80 for docker
- KIND_HTTP_PORT=80: Forces port 80 (good for docker on Linux)
- KIND_HTTP_PORT=8080: Forces port 8080 (good for podman)
- KIND_HTTP_PORT=<custom>: Any custom port (e.g., 9000)

Usage:
  make kind-up KIND_HTTP_PORT=80     # Force port 80
  make kind-up KIND_HTTP_PORT=9000   # Use port 9000
  make kind-up                       # Auto-detect (default)

Changes:
- Makefile: Add KIND_HTTP_PORT variable, pass to scripts
- setup-kind.sh: Accept KIND_HTTP_PORT override
- extract-token.sh: Use KIND_HTTP_PORT for .env.test generation
- Help text: Document new variable and examples
Enables running E2E tests without .env.test by accepting TEST_TOKEN
and CYPRESS_BASE_URL from environment variables. This allows:

1. Testing directly against frontend on any port:
   CYPRESS_BASE_URL=http://localhost:3000 make test-e2e

2. Testing against external clusters without kind-up:
   TEST_TOKEN=$(oc whoami -t) CYPRESS_BASE_URL=https://... make test-e2e

3. Port forwarding scenarios:
   kubectl port-forward -n ambient-code svc/frontend-service 3000:3000
   CYPRESS_BASE_URL=http://localhost:3000 make test-e2e

Changes:
- run-tests.sh: Load .env.test only if env vars not set
- run-tests.sh: Allow running without .env.test file
- Makefile: Pass CYPRESS_BASE_URL and TEST_TOKEN to script
- Makefile: Update help text with environment variable examples

Environment variables take precedence over .env.test values.
@github-actions
Copy link
Contributor

github-actions bot commented Jan 23, 2026

Claude Code Review

Summary

This PR introduces a comprehensive overhaul of E2E testing infrastructure and documentation reorganization. The changes span ~128 files with significant improvements to developer experience and testing capabilities. Overall, this is a well-structured refactoring with good practices, but there are some security and code quality concerns that should be addressed.

Issues by Severity

Critical Issues

1. Security: E2E test token exposed client-side (client.ts:96-100)

Issue: The NEXT_PUBLIC_E2E_TOKEN is exposed to client-side JavaScript, making it visible in browser dev tools. This violates the security standard of never exposing tokens client-side and could be exploited if accidentally set in production.

Recommendation: Use server-side only approach, add NODE_ENV checks, document in security-standards.md

2. Type Safety: Missing JSON validation (adapter.py:1303)

Issue: Code doesn't validate JSON structure after parsing. If config is not a dict, config.get() will fail.

Recommendation: Add isinstance check before accessing dict methods

Major Issues

1. Operator: Race condition in session restart (sessions.go:97-148)

The operator sets phase=Pending and continues without waiting for status propagation, potentially causing duplicate job creation.

2. Frontend: Token injection ignores existing auth

E2E token injection doesn't check if user is already authenticated, potentially overriding legitimate sessions.

3. Adapter: Empty string validation missing (adapter.py:913-930)

Sanitization could result in empty strings causing downstream issues.

Minor Issues

  • TODO comments should be tracked in GitHub issues
  • Inconsistent log levels in adapter.py
  • Missing security documentation for E2E token

Positive Highlights

✅ Excellent documentation reorganization with DOCUMENTATION_MAP.md
✅ Good error handling patterns with async generators
✅ Security-conscious Google OAuth credential handling
✅ Comprehensive test coverage
✅ Clean separation of concerns in E2E infrastructure
✅ Type safety improvements in frontend client
✅ Following established error handling patterns

Recommendations

Priority 1 (Before Merge):

  1. Fix E2E token security - add environment checks and server-side approach
  2. Add JSON validation in adapter.py
  3. Document E2E token in security standards

Priority 2 (Follow-up):

  1. Address operator race condition
  2. Add empty string validation
  3. Create GitHub issue for operator refactoring

Overall Assessment

High-quality refactoring with excellent E2E testing and documentation improvements. Main concern is E2E token security pattern. With Priority 1 changes, ready to merge.

Recommendation: Request changes for security hardening, then approve.


Review conducted with loaded context from memory system files


🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 23, 2026

Claude Code Review

Summary

This PR introduces a comprehensive E2E testing infrastructure overhaul with kind overlay support and extensive documentation reorganization (~7,000 lines cleaned up). The changes significantly improve developer experience and testing capabilities while maintaining security standards.

Issues by Severity

Blocker Issues

None found - The PR follows established patterns and security guidelines.

Critical Issues

None found - Code quality, security, and architecture patterns are followed correctly.

Major Issues

  1. Frontend E2E Token Injection Pattern (components/frontend/src/services/api/client.ts:96-100)

    • Issue: Direct token injection via NEXT_PUBLIC_E2E_TOKEN environment variable could be misused if deployed to production.
    • Risk: Bypasses normal authentication flow when environment variable is set.
    • Recommendation: Add explicit check that this only works in development/test environments.
  2. Operator Session Handler - Unbounded Goroutines (components/operator/internal/handlers/sessions.go)

    • Observation: The session handler still uses the legacy 2,300-line monolithic pattern with manual goroutine management.
    • Risk: While the TODO comment acknowledges this, the monitoredPods tracking helps prevent duplicate goroutines, but error cases could still leak.
    • Recommendation: Prioritize the controller-runtime migration mentioned in the TODO. Consider adding goroutine leak detection tests.

Minor Issues

  1. Documentation File Organization (docs/ restructure)

    • Positive: Excellent reorganization into logical directories
    • Suggestion: Consider adding a .markdownlint.yml configuration
  2. E2E Test Cleanup Logic (e2e/cypress/e2e/sessions.cy.ts:67-81)

    • Observation: Cleanup only happens in after() hook, which won't run if tests fail early.
    • Recommendation: Consider adding a Cypress task that runs cleanup even on test failures.
  3. Kind Overlay Port Configuration (e2e/scripts/setup-kind.sh:52-64)

    • Positive: Smart auto-detection of Podman vs Docker
    • Suggestion: Document the KIND_HTTP_PORT environment variable in e2e/README.md
  4. Makefile Help Text Accuracy (Makefile:73)

    • The quick start still references make local-up but CLAUDE.md now recommends make kind-up
    • Recommendation: Update Makefile help text to prioritize kind-based commands.

Positive Highlights

Security Standards - Excellent

  • Token Handling: All authentication patterns follow established security guidelines
  • RBAC Enforcement: Proper test user creation with ServiceAccount
  • Secret Management: Proper use of K8s Secrets

E2E Testing Infrastructure - Outstanding

  • Portable Test Suite: New sessions.cy.ts provides comprehensive session management tests (~267 lines)
  • Kind Overlay: Well-structured kustomization with proper namespacing, PVC patches, and image policies
  • Helper Scripts: extract-token.sh, load-images.sh, init-minio.sh are well-documented
  • Real Agent Testing: Tests include actual Claude response verification

Documentation Reorganization - Exceptional

  • DOCUMENTATION_MAP.md: Excellent navigation guide (227 lines)
  • Logical Structure: Clear separation into agents/, architecture/, deployment/, developer/, etc.
  • Cleanup: Removed 7,000+ lines of obsolete content
  • README files: Each section has comprehensive README with navigation

Code Quality - Follows Standards

  • Backend: No violations of authentication/authorization patterns
  • Frontend: Proper use of apiClient, no manual fetch() calls, type-safe
  • Operator: Follows established reconciliation patterns
  • Python Runner: Clean adapter pattern, proper error handling

Recommendations

High Priority

  1. Secure E2E Token Injection: Add environment check to client.ts to prevent accidental production use
  2. Document KIND_HTTP_PORT: Add to kind.md or e2e/README.md
  3. Update Makefile Help: Align quick start commands with CLAUDE.md recommendations

Medium Priority

  1. Add Goroutine Leak Tests: For operator session handler monitoring
  2. E2E Cleanup Robustness: Consider Cypress task for cleanup even on failures
  3. Markdownlint Config: Add .markdownlint.yml for consistent doc formatting

Low Priority

  1. Consider Controller-Runtime Migration: The operator TODO is valid - prioritize this refactor

Final Verdict

APPROVED with minor recommendations

This is an excellent PR that significantly improves the project's testing infrastructure and documentation organization. The code follows all established patterns, maintains security standards, and provides a solid foundation for E2E testing.

Merge Recommendation: Approve and merge after addressing the E2E token injection security check (high priority item #1). Other recommendations can be addressed in follow-up PRs.

Impact Summary:

  • Developer Experience: Significantly improved with kind overlay and comprehensive docs
  • Testing: New portable E2E suite with real agent testing
  • Documentation: Excellent reorganization with 7,000+ lines of cleanup
  • Security: No violations, follows all established patterns
  • Architecture: Maintains project structure and design guidelines

Reviewed by: Claude Code (Sonnet 4.5)
Review Date: 2026-01-23
Standards Applied: CLAUDE.md, backend-development.md, frontend-development.md, security-standards.md, k8s-client-usage.md, error-handling.md, react-query-usage.md


🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

This commit simplifies the build process by removing the embedding of Git metadata into Docker images and the Makefile. The following changes were made:

- Removed Git-related build arguments and environment variables from Dockerfiles for backend, frontend, operator, and runner components.
- Updated the Makefile to eliminate the capture of Git metadata during the build process.
- Adjusted the build commands in the Makefile to focus solely on building the images without embedding versioning information.

This streamlining enhances the build process and reduces complexity, while still allowing for effective version tracking through other means.
This commit deletes the SECURITY_DEV_MODE.md file, which analyzed the risks of accidentally deploying development mode to production and documented existing safeguards and recommendations. The removal is part of a broader effort to streamline documentation and focus on more relevant security practices.
@github-actions
Copy link
Contributor

github-actions bot commented Jan 23, 2026

Claude Code Review

Summary

This PR represents a significant infrastructure improvement introducing comprehensive E2E testing capabilities, documentation reorganization, and essential configuration updates. The changes span 128 files with a net reduction of ~7,000 lines while improving clarity and maintainability.

Overall Assessment: APPROVED with Minor Recommendations

The code quality is generally strong, following established patterns. However, there are several security considerations and architectural improvements to address.


Issues by Severity

Critical Issues

  1. Cluster-Admin Binding for Test User (Security Risk)
    Location: components/manifests/overlays/kind/test-user.yaml:24-32

The test user has cluster-admin privileges, which violates least-privilege principles and could mask RBAC issues in testing. Recommendation: Create a dedicated role with only required permissions. E2E tests should validate actual RBAC behavior.

  1. Environment Variable Token Injection Bypasses Security Pattern
    Location: components/frontend/src/services/api/client.ts:96-99

Issues: NEXT_PUBLIC_* vars are exposed to the browser bundle, token visible in build artifacts and DevTools, bypasses established auth pattern. Recommendation: Add NODE_ENV check to only enable in test environment.


Major Issues

  1. Missing Error Handling in Operator SecurityContext - Hardcoded FSGroup:0 may not work in all environments
  2. Resource Limits May Be Too High - 2GB/2CPU per session may exceed cluster capacity
  3. MCP Config Override Lacks Validation - Arbitrary file path injection possible

Minor Issues

  1. Inconsistent Permissions in Hydrate Script - Using 777 permissions
  2. Build Metadata Removed - Update documentation
  3. Documentation Reorganization May Break Links - Add redirects

Positive Highlights

  • Excellent Documentation Structure with DOCUMENTATION_MAP.md
  • Security-First Testing with proper namespace isolation
  • Resource Management improvements
  • Cleanup of 7,000+ lines of obsolete documentation
  • MCP Configuration Flexibility

Recommendations

Immediate (Before Merge):

  1. Downgrade test user to namespace-scoped role
  2. Add NODE_ENV check to NEXT_PUBLIC_E2E_TOKEN usage
  3. Validate MCP_CONFIG_FILE path

Short Term (Next PR):
4. Make resource limits configurable
5. Document build metadata removal
6. Add redirect stubs for moved documentation


Final Verdict

Approval Status: APPROVED with Recommendations

Risk Level: Medium

  • Security issues limited to test environment
  • Resource limit changes may affect production
  • Documentation reorganization may cause short-term confusion

Merge Recommendation: Merge after addressing Critical Issues 1 and 2

Great work on this comprehensive overhaul!

Review completed by: Claude Code (Sonnet 4.5)
Standards checked: CLAUDE.md, security-standards.md, backend-development.md, frontend-development.md, k8s-client-usage.md, error-handling.md, react-query-usage.md


🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

@github-actions
Copy link
Contributor

github-actions bot commented Jan 23, 2026

Claude Code Review

Summary

This PR introduces significant improvements to the E2E testing infrastructure and reorganizes documentation. The changes align well with established patterns, but there are security concerns and architectural inconsistencies that need attention before merging.

Overall Assessment: 🟡 Major issues need addressing, particularly around security context and resource management.


Critical Issues

1. FSGroup Security Context May Violate OpenShift SCC

Location: components/operator/internal/handlers/sessions.go:757-760

Setting FSGroup: 0 (root group) may be rejected by OpenShift's Security Context Constraints (SCC). Most SCCs forbid root group usage.

Impact: Pods will fail to start on OpenShift clusters (production environment).

Fix: Use a non-root group ID like FSGroup: int64Ptr(1001)

Reference: CLAUDE.md line 672-680 (Container Security pattern)


2. Missing Resource Requests/Limits Validation

Location: components/operator/internal/handlers/sessions.go:1148-1156

Hardcoded resource limits (2Gi+512Mi per pod) without configurability impacts Kind/Minikube users and production workloads.

Impact: Pods may fail to schedule in resource-constrained environments.

Fix: Make resources configurable via ConfigMap or CR spec.


3. E2E Token Injection Security Risk

Location: components/frontend/src/services/api/client.ts:96-99

Token injection via process.env.NEXT_PUBLIC_E2E_TOKEN creates security risk if leaked into production builds.

Impact: Production frontend could have hardcoded token, bypassing auth.

Fix: Add NODE_ENV check: process.env.NODE_ENV !== 'production' && process.env.NEXT_PUBLIC_E2E_TOKEN


Major Issues

  • MCP Config Override Without Validation (adapter.py:1296): Allows arbitrary file path, validate against allowlist
  • Increased State-Sync Resources Without Justification: Document reason or revert
  • Documentation Reorganization Breaks External Links: Add redirects, update cross-references

Positive Highlights

✅ Excellent Documentation Reorganization
✅ Kind Integration (faster, matches CI)
✅ Resource Limits Added (production readiness)
✅ E2E Test Portability
✅ Cleanup of 7,000+ lines of obsolete docs


Recommendations Before Merge

  1. Fix FSGroup to non-root value (Critical)
  2. Add NODE_ENV check to E2E token injection (Critical)
  3. Validate MCP_CONFIG_FILE path (Major)
  4. Make resource limits configurable or document rationale (Critical)
  5. Test on OpenShift cluster for SCC compatibility

Final Verdict

⚠️ REQUEST CHANGES

Excellent work overall, but FSGroup and E2E token issues need fixing to avoid production security risks and OpenShift compatibility issues.

Estimated Fix Time: 15-30 minutes for code changes + 1 hour for OpenShift validation.


🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

# Check if you already have a podman machine
podman machine list
```
brew install kind kubectl docker
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the call today, we discussed this change from Podman to Docker. The conclusion was that this file is no longer needed and should be deleted, I think.

…lopment documentation

This commit modifies the GitHub Actions workflow for building and deploying components by removing support for multiple platforms in Docker builds, now focusing solely on `linux/amd64`. Additionally, it deletes the `instrumentation.ts` file from the frontend, which was responsible for logging build information. The local development documentation has been significantly revamped to clarify scenarios for running components locally, including detailed setups and benefits for different development tasks. This streamlining aims to improve developer experience and efficiency.
@github-actions
Copy link
Contributor

github-actions bot commented Jan 23, 2026

Claude Code Review

Summary

This PR represents a major infrastructure and documentation overhaul for the Ambient Code Platform. The changes include E2E testing improvements, comprehensive documentation reorganization, and strategic cleanups. Overall, the PR is well-structured with 4,635 additions and 13,254 deletions (net -7,000 lines), demonstrating effective consolidation.

Key Strengths:

  • Excellent documentation reorganization with clear hierarchy
  • Improved E2E testing infrastructure with Kind overlay
  • Proper cleanup of 7,000+ lines of obsolete/duplicate content
  • Good adherence to established patterns in most code changes

Areas Requiring Attention:

  • Critical security issue in frontend API client
  • Minor improvements needed in operator code
  • Documentation references need updates

Issues by Severity

Blocker Issues

None - No blocking issues that prevent merge.


Critical Issues

1. Hardcoded E2E Token in Frontend API Client

File: components/frontend/src/services/api/client.ts:96-100

Issue:

  • Security Risk: Hardcoding token injection in production API client bypasses proper authentication
  • Violates Frontend Security Pattern: Authentication should be managed by auth context/provider, not in API client
  • E2E Testing Smell: E2E tests should use proper auth mechanisms, not environment variable hacks

Per security-standards.md, tokens must never be hardcoded or injected via environment variables in production code paths. This creates a backdoor that could be exploited if NEXT_PUBLIC_E2E_TOKEN is accidentally set in production.

Recommended Fix: Use proper auth context/provider for token management. For E2E tests, use Cypress commands to inject token via cy.setCookie() or localStorage, not environment variables.


Major Issues

2. Operator Status Updates Missing Error Context

File: components/operator/internal/handlers/sessions.go:76-81

Failed status initialization is logged but doesn't update CR status with error message. User/operator won't know why phase initialization failed.

Recommended Fix: Update status with error message and return error instead of just logging warning.

Reference: .claude/patterns/error-handling.md (Operator Errors section)


3. Inconsistent Documentation References in CLAUDE.md

Multiple documentation references in CLAUDE.md weren't updated after docs reorganization. Users following CLAUDE.md links will encounter 404s.

Recommended Fix: Run comprehensive search and replace to update all doc references to new paths based on DOCUMENTATION_MAP.md


Minor Issues

4. Operator Code Comment Clarity

The TODO comment in sessions.go:37-46 is excellent. Consider adding a GitHub issue link to track technical debt more formally.

5. Python Adapter Import Order

File: components/runners/claude-code-runner/adapter.py:10-21

Minor style issue - imports should be grouped per PEP 8. Consider using isort to auto-format.

6. Missing Test Coverage for New E2E Session Tests

Consider adding edge cases: session deletion while running, invalid workflow creation, session stuck in Starting state.


Positive Highlights

Excellent Documentation Structure

The new docs/ organization is outstanding:

  • Clear hierarchical structure (agents/, architecture/, deployment/, developer/, etc.)
  • Comprehensive README.md files for each section
  • New DOCUMENTATION_MAP.md provides excellent navigation
  • Proper separation of concerns

Kind Overlay Implementation

The new Kind overlay is well-designed:

  • Complete resource patches (PVC, secrets, operator config)
  • Test user creation with proper RBAC
  • Proper namespace isolation
  • Matches CI environment setup

Effective Cleanup

Removing 7,000+ lines of obsolete content while improving clarity demonstrates excellent maintenance.

Operator Status Patch Pattern

The new StatusPatch pattern is a great improvement - reduces API calls and improves performance.

Frontend API Client Error Handling

The error handling in client.ts is well-structured with type-safe error parsing and graceful fallbacks.


Recommendations

Priority 1 (Before Merge)

  1. Fix E2E token injection in client.ts - use proper auth context instead of environment variable
  2. Update documentation references in CLAUDE.md to reflect new paths
  3. Add error context to operator status initialization failure

Priority 2 (Follow-up PR)

  1. Add GitHub issue link to operator TODO comment
  2. Run isort on Python files to fix import order
  3. Add edge case tests to E2E session suite

Testing Verification

Required Before Merge

  • Run E2E tests against Kind cluster: make test-e2e-local
  • Verify frontend build passes: cd components/frontend && npm run build
  • Run operator linting: cd components/operator && golangci-lint run
  • Verify all documentation links in CLAUDE.md are valid

Conclusion

This is a high-quality PR that significantly improves the platform's testing infrastructure and documentation. The 7,000-line reduction while improving clarity is commendable.

Recommendation: APPROVE with Critical Issue Fixed

The E2E token injection in the frontend API client must be addressed before merge (Priority 1, item 1). All other issues can be addressed in follow-up PRs.

Great work on the comprehensive refactoring!


Review completed by Claude Code (Sonnet 4.5)
Reviewed against project standards: CLAUDE.md, backend-development.md, frontend-development.md, security-standards.md, k8s-client-usage.md, error-handling.md, react-query-usage.md


🔍 View AI decision process (logs available for 90 days)

📋 View memory system files loaded (click to expand)

What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

  1. CLAUDE.md - Master project instructions, development standards
  2. backend-development.md - Go backend, K8s integration patterns
  3. frontend-development.md - NextJS, Shadcn UI, React Query patterns
  4. security-standards.md - Auth, RBAC, token handling
  5. k8s-client-usage.md - User token vs service account patterns
  6. error-handling.md - Consistent error patterns
  7. react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants