Skip to content

Service API layer with OAuth2 client credentials for service APIs and dummy tools#39

Open
josep-reyero wants to merge 6 commits intomainfrom
service-api-auth-layer
Open

Service API layer with OAuth2 client credentials for service APIs and dummy tools#39
josep-reyero wants to merge 6 commits intomainfrom
service-api-auth-layer

Conversation

@josep-reyero
Copy link

@josep-reyero josep-reyero commented Mar 24, 2026

Summary

Replace shared API keys with OAuth2 client credentials (M2M JWTs) for service API authentication (LFX Lens, Member Onboarding). Each service gets its own Auth0 resource server — no more shared secrets.

See docs/service-api-architecture.md for the full design: token flows, diagrams, tool inventory, and API schemas.

What changed

  • Auth layer: ClientCredentialsClient for OAuth2 client credentials with token caching; TokenSource interface replaces APIKey in the service API client
  • Tools: lfx_lens_query and onboarding_list_memberships registered with correct args/types but return dummy responses (actual API calls commented out inline)
  • Config: LFXMCP_LF_AI_API_KEY replaced with per-service LFXMCP_LENS_API_AUDIENCE and LFXMCP_ONBOARDING_API_AUDIENCE

Why dummy tool responses

The tools return placeholder data instead of calling the real service APIs. This lets us merge the full auth infrastructure now while the rest of the stack catches up. Three things need to happen before the tools can make real calls:

  1. The Auth0 resource servers need to exist — without them, the MCP server can't obtain client credentials tokens (there's no audience to request tokens for)
  2. LFX Lens and Member Onboarding need to be updated to validate Auth0 JWTs instead of their current OS_SECURITY_KEY / API key pattern
  3. The MCP server needs the new env vars (LFXMCP_LENS_API_URL, LFXMCP_LENS_API_AUDIENCE, etc.) set in its deployment config

The actual API calls are commented out inline in lens.go and onboarding.go with the full request/response format documented, so switching from dummy to real is a small, targeted change once the dependencies are ready.

Dependencies

This PR depends on two auth0-terraform PRs:

  1. auth0-terraform#249 — Creates the Auth0 resource servers for LFX Lens and Member Onboarding. Required before the MCP server can obtain client credentials tokens, and before the services can validate them.

  2. auth0-terraform#248 — Adds the lf_staff custom claim to MCP API access tokens. Required for lfx_lens_query specifically — this tool checks the lf_staff JWT claim to enforce staff-only access. Until #248 is deployed, the Lens tool will reject all requests with "this tool is available to Linux Foundation staff only".

onboarding_list_memberships does not depend on #248 (no staff check).

No new env vars needed to deploy this PR

The service tools only activate when both URL and audience are configured. With current deployment config unchanged, everything works as before.

Test plan

  • go build ./... passes
  • go test ./... passes
  • No remaining references to ApiKey / LF_AI_API_KEY
  • Existing V2 tools unaffected

🤖 Generated with Claude Code

Copilot AI review requested due to automatic review settings March 24, 2026 15:10
Copy link

@claude claude bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review.

Tip: disable this comment in your organization's Code Review settings.

josep-reyero and others added 3 commits March 24, 2026 16:14
…ens tools

Implement Option 4 (V2 access-check) authorization for internal service APIs
that use shared API keys with no per-user auth. The MCP server acts as the
authorization gateway by calling the V2 access-check endpoint (OpenFGA-backed)
before proxying requests.

- Add AccessCheckClient for V2 access-check with # separator format
- Add SlugResolver with in-memory cache for slug→UUID translation
- Add serviceapi.Client for API-key-authenticated service calls
- Add ServiceAuth shared struct and AuthorizeProject flow
- Add onboarding_list_memberships tool (writer relation)
- Add lfx_lens_query tool (lf_staff claim + auditor relation)
- Add lf_staff custom claim extraction in JWT verifier
- Document new env vars in AGENTS.md

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
The Lens workflow endpoint no longer expects a project name in the
payload. Remove project_name from the tool args, request struct,
and handler.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
Switch service API authentication (LFX Lens, Member Onboarding) from shared
API keys to Auth0 client credentials (M2M JWTs). Each service gets its own
Auth0 resource server, so tokens are scoped per-service and validated
cryptographically via JWKS — no more shared secrets.

Changes:

Auth layer:
- Add ClientCredentialsClient (internal/lfxv2/client_credentials.go) for
  OAuth2 client credentials grants with automatic token caching
- Extract generateClientAssertion to shared package-level function for reuse
  by both token exchange and client credentials flows
- Replace APIKey field with TokenSource interface in serviceapi.Client —
  do() now sets Authorization: Bearer instead of ApiKey

Wiring:
- Replace LFXMCP_LF_AI_API_KEY with per-service audience env vars:
  LFXMCP_LENS_API_AUDIENCE and LFXMCP_ONBOARDING_API_AUDIENCE
- Each service gets its own ClientCredentialsClient using the same M2M
  credentials but a different Auth0 audience

Tools (stubbed):
- lfx_lens_query and onboarding_list_memberships return dummy responses
  until the Auth0 resource servers are deployed and the service APIs are
  updated to validate JWTs

Docs:
- Unify architecture-service-api-auth.md and service-api-auth-flow.md into
  service-api-architecture.md with full flow diagrams, tool inventory,
  naming conventions, and API schemas
- Reflect feedback: onboarding_run_agent supports a preview flag instead of
  a separate onboarding_preview_agent tool

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
@josep-reyero josep-reyero force-pushed the service-api-auth-layer branch from 55f9d13 to 01a5340 Compare March 24, 2026 15:14
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR migrates internal service API authentication from shared API keys to Auth0-issued OAuth2 client-credentials tokens, and introduces shared service-tool authorization plumbing (slug→UUID resolution + V2 access-check) to enforce project-level authorization before proxying to Lens/Onboarding.

Changes:

  • Added OAuth2 client-credentials token client (cached) and a generic service API HTTP client (TokenSource-based).
  • Added V2 authorization helpers: access-check client, slug resolver (cached), and a ServiceAuth flow used by new service tools.
  • Wired new tools/config into main.go, plus added architecture + env var documentation.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
internal/tools/service.go Shared service-tool infra (ServiceAuth, access relations, staff-claim helpers).
internal/tools/onboarding.go New onboarding tool (currently returns stubbed/dummy response).
internal/tools/lens.go New Lens tool (staff-gated; currently returns stubbed/dummy response).
internal/serviceapi/client.go Generic HTTP client for service APIs using bearer tokens + optional debug wire logging.
internal/serviceapi/client_test.go Tests for GET/POST JSON/multipart behavior and auth header handling.
internal/lfxv2/token_exchange.go Extracted shared generateClientAssertion() helper for reuse.
internal/lfxv2/client_credentials.go New client-credentials grant client with token caching.
internal/lfxv2/access_check.go New access-check client for project relation checks.
internal/lfxv2/access_check_test.go Tests for access-check request/response handling and edge cases.
internal/lfxv2/slug_resolver.go New in-memory slug→UUID resolver for V2 query service.
internal/lfxv2/client.go Added helper to retrieve exchanged V2 token as a string (GetExchangedToken).
cmd/lfx-mcp-server/main.go Config + wiring for per-service audiences/URLs and new tool registration; copies lf_staff claim into TokenInfo.Extra.
docs/service-api-architecture.md Architecture doc describing the 3-token model and authorization flow.
AGENTS.md Documented new environment variables for Lens/Onboarding service configuration.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +104 to +107
c.cachedToken = token
// Cache with 5-minute buffer before actual expiry.
c.expiry = time.Now().Add(time.Duration(expiresIn)*time.Second - 5*time.Minute)

Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Token cache expiry subtracts a fixed 5-minute buffer from expires_in. If expires_in is less than 300 seconds (or is 0/negative due to an upstream issue), this sets c.expiry in the past and can cause repeated token fetches in a tight loop. Consider clamping the buffer (e.g., min(5m, expiresIn/2) or max(expiresIn-buffer, 0)) and validating expires_in > 0 before caching.

Suggested change
c.cachedToken = token
// Cache with 5-minute buffer before actual expiry.
c.expiry = time.Now().Add(time.Duration(expiresIn)*time.Second - 5*time.Minute)
if expiresIn <= 0 {
return "", fmt.Errorf("token endpoint returned non-positive expires_in: %d", expiresIn)
}
c.cachedToken = token
lifetime := time.Duration(expiresIn) * time.Second
// Use a buffer before actual expiry to proactively refresh the token.
// Clamp the buffer to avoid setting expiry in the past for short-lived tokens.
maxBuffer := 5 * time.Minute
buffer := maxBuffer
if lifetime < maxBuffer {
buffer = lifetime / 2
}
c.expiry = time.Now().Add(lifetime - buffer)

Copilot uses AI. Check for mistakes.
Comment on lines +42 to +47
// This separation exists because:
// - Service APIs use shared API keys with no per-user authorization
// - The MCP server IS the authorization gateway for these tools
// - The V2 access-check (OpenFGA-backed) is the authoritative decision
// - MCP scopes (read:all / manage:all) are for V2 API tools where
// Heimdal handles per-resource authorization
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc comment here still says service APIs use shared API keys, but this PR replaces them with OAuth2 client credentials (M2M bearer tokens). Updating this bullet would avoid confusion for future readers and keep the in-code documentation aligned with the new auth model.

Copilot uses AI. Check for mistakes.
// RequireLFStaff checks that the user has the lf_staff custom claim. Returns
// an MCP error result if the user is not staff, or nil if they are.
func RequireLFStaff(req *mcp.CallToolRequest) *mcp.CallToolResult {
if req.Extra != nil && req.Extra.TokenInfo != nil && !IsLFStaff(req.Extra.TokenInfo) {
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RequireLFStaff currently only denies access when TokenInfo is present AND IsLFStaff returns false. If TokenInfo/Extra is missing (e.g., auth misconfiguration or unauthenticated calls), this returns nil and allows access to a staff-only tool. This should deny by default when the staff claim cannot be verified (i.e., require TokenInfo+claim=true).

Suggested change
if req.Extra != nil && req.Extra.TokenInfo != nil && !IsLFStaff(req.Extra.TokenInfo) {
// Deny by default: require a token and a true lf_staff claim.
if req.Extra == nil || req.Extra.TokenInfo == nil || !IsLFStaff(req.Extra.TokenInfo) {

Copilot uses AI. Check for mistakes.
Comment on lines +164 to +205
// wrapWithDebugTransport wraps an HTTP client with wire-level request/response logging.
func wrapWithDebugTransport(client *http.Client, logger *slog.Logger) *http.Client {
base := client.Transport
if base == nil {
base = http.DefaultTransport
}
return &http.Client{
Transport: &serviceDebugTransport{
transport: base,
logger: logger,
},
Timeout: client.Timeout,
}
}

// serviceDebugTransport logs the full HTTP wire dump of every request and response.
type serviceDebugTransport struct {
transport http.RoundTripper
logger *slog.Logger
}

// RoundTrip implements http.RoundTripper.
func (dt *serviceDebugTransport) RoundTrip(req *http.Request) (*http.Response, error) {
reqDump, err := httputil.DumpRequestOut(req, true)
if err != nil {
dt.logger.Error("failed to dump outbound request", "error", err)
} else {
dt.logger.Debug("serviceapi outbound request", "dump", string(reqDump))
}

resp, err := dt.transport.RoundTrip(req)
if err != nil {
dt.logger.Error("serviceapi outbound request failed", "error", err, "url", req.URL.String())
return nil, err
}

respDump, err := httputil.DumpResponse(resp, true)
if err != nil {
dt.logger.Error("failed to dump inbound response", "error", err)
} else {
dt.logger.Debug("serviceapi inbound response", "dump", string(respDump))
}
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The debug transport logs full request/response dumps via httputil.DumpRequestOut/DumpResponse. This will include the Authorization header (M2M bearer token) and potentially sensitive payloads, which is risky even behind a debug flag because logs often end up in centralized systems. Consider redacting Authorization (and other secrets) before dumping, or logging only safe metadata (method, URL, status, duration) with an opt-in body dump that explicitly strips credentials.

Copilot uses AI. Check for mistakes.
@josep-reyero josep-reyero changed the title Replace shared API keys with OAuth2 client credentials for service APIs Service API layer with OAuth2 client credentials for service APIs and dummy tools Mar 24, 2026
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
@josep-reyero josep-reyero requested a review from emsearcy March 24, 2026 15:54
Copy link
Contributor

@emsearcy emsearcy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

largest changes would be the access_check response payload, and of course implementing the upstream calls that are currently commented out.

Also, I didn't flag it, but I assume we'll need to change the custom-claim prefix (per my comment in the Auth0 Action to adopt our newer, more-terse domain prefix).

josep-reyero and others added 2 commits March 25, 2026 13:45
…pes on service tools

- Fix V2 access-check response parsing to handle the real tab-delimited
  format (request@user\ttrue|false) instead of the incorrect allow/deny
- Handle unordered results via map-based matching instead of positional indexing
- Enforce MCP JWT scopes (read:all/manage:all) on service tools via
  AddToolWithScopes, so reduced-scope tokens are respected
- Fix RequireLFStaff to deny by default when token info is missing
- Replace manual toolError helper with SDK error propagation for
  forward-compatibility with MCP spec changes
- Update architecture doc to reflect dual-layer auth model and correct
  access-check response format

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
…laims/)

Aligns with auth0-terraform PR #248 which moves the lf_staff custom claim
from the legacy lfPrefix (https://sso.linuxfoundation.org/claims/) to the
preferred lfxPrefix (http://lfx.dev/claims/) per Eric's review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
@josep-reyero
Copy link
Author

Also changed the custom-claim prefix lfxPrefix namespace as per the Auth0 Action comment.

@josep-reyero josep-reyero requested a review from emsearcy March 25, 2026 13:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants