Service API layer with OAuth2 client credentials for service APIs and dummy tools#39
Service API layer with OAuth2 client credentials for service APIs and dummy tools#39josep-reyero wants to merge 6 commits intomainfrom
Conversation
There was a problem hiding this comment.
Claude Code Review
This repository is configured for manual code reviews. Comment @claude review to trigger a review.
Tip: disable this comment in your organization's Code Review settings.
…ens tools Implement Option 4 (V2 access-check) authorization for internal service APIs that use shared API keys with no per-user auth. The MCP server acts as the authorization gateway by calling the V2 access-check endpoint (OpenFGA-backed) before proxying requests. - Add AccessCheckClient for V2 access-check with # separator format - Add SlugResolver with in-memory cache for slug→UUID translation - Add serviceapi.Client for API-key-authenticated service calls - Add ServiceAuth shared struct and AuthorizeProject flow - Add onboarding_list_memberships tool (writer relation) - Add lfx_lens_query tool (lf_staff claim + auditor relation) - Add lf_staff custom claim extraction in JWT verifier - Document new env vars in AGENTS.md Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
The Lens workflow endpoint no longer expects a project name in the payload. Remove project_name from the tool args, request struct, and handler. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
Switch service API authentication (LFX Lens, Member Onboarding) from shared API keys to Auth0 client credentials (M2M JWTs). Each service gets its own Auth0 resource server, so tokens are scoped per-service and validated cryptographically via JWKS — no more shared secrets. Changes: Auth layer: - Add ClientCredentialsClient (internal/lfxv2/client_credentials.go) for OAuth2 client credentials grants with automatic token caching - Extract generateClientAssertion to shared package-level function for reuse by both token exchange and client credentials flows - Replace APIKey field with TokenSource interface in serviceapi.Client — do() now sets Authorization: Bearer instead of ApiKey Wiring: - Replace LFXMCP_LF_AI_API_KEY with per-service audience env vars: LFXMCP_LENS_API_AUDIENCE and LFXMCP_ONBOARDING_API_AUDIENCE - Each service gets its own ClientCredentialsClient using the same M2M credentials but a different Auth0 audience Tools (stubbed): - lfx_lens_query and onboarding_list_memberships return dummy responses until the Auth0 resource servers are deployed and the service APIs are updated to validate JWTs Docs: - Unify architecture-service-api-auth.md and service-api-auth-flow.md into service-api-architecture.md with full flow diagrams, tool inventory, naming conventions, and API schemas - Reflect feedback: onboarding_run_agent supports a preview flag instead of a separate onboarding_preview_agent tool Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
55f9d13 to
01a5340
Compare
There was a problem hiding this comment.
Pull request overview
This PR migrates internal service API authentication from shared API keys to Auth0-issued OAuth2 client-credentials tokens, and introduces shared service-tool authorization plumbing (slug→UUID resolution + V2 access-check) to enforce project-level authorization before proxying to Lens/Onboarding.
Changes:
- Added OAuth2 client-credentials token client (cached) and a generic service API HTTP client (
TokenSource-based). - Added V2 authorization helpers: access-check client, slug resolver (cached), and a
ServiceAuthflow used by new service tools. - Wired new tools/config into
main.go, plus added architecture + env var documentation.
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
internal/tools/service.go |
Shared service-tool infra (ServiceAuth, access relations, staff-claim helpers). |
internal/tools/onboarding.go |
New onboarding tool (currently returns stubbed/dummy response). |
internal/tools/lens.go |
New Lens tool (staff-gated; currently returns stubbed/dummy response). |
internal/serviceapi/client.go |
Generic HTTP client for service APIs using bearer tokens + optional debug wire logging. |
internal/serviceapi/client_test.go |
Tests for GET/POST JSON/multipart behavior and auth header handling. |
internal/lfxv2/token_exchange.go |
Extracted shared generateClientAssertion() helper for reuse. |
internal/lfxv2/client_credentials.go |
New client-credentials grant client with token caching. |
internal/lfxv2/access_check.go |
New access-check client for project relation checks. |
internal/lfxv2/access_check_test.go |
Tests for access-check request/response handling and edge cases. |
internal/lfxv2/slug_resolver.go |
New in-memory slug→UUID resolver for V2 query service. |
internal/lfxv2/client.go |
Added helper to retrieve exchanged V2 token as a string (GetExchangedToken). |
cmd/lfx-mcp-server/main.go |
Config + wiring for per-service audiences/URLs and new tool registration; copies lf_staff claim into TokenInfo.Extra. |
docs/service-api-architecture.md |
Architecture doc describing the 3-token model and authorization flow. |
AGENTS.md |
Documented new environment variables for Lens/Onboarding service configuration. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| c.cachedToken = token | ||
| // Cache with 5-minute buffer before actual expiry. | ||
| c.expiry = time.Now().Add(time.Duration(expiresIn)*time.Second - 5*time.Minute) | ||
|
|
There was a problem hiding this comment.
Token cache expiry subtracts a fixed 5-minute buffer from expires_in. If expires_in is less than 300 seconds (or is 0/negative due to an upstream issue), this sets c.expiry in the past and can cause repeated token fetches in a tight loop. Consider clamping the buffer (e.g., min(5m, expiresIn/2) or max(expiresIn-buffer, 0)) and validating expires_in > 0 before caching.
| c.cachedToken = token | |
| // Cache with 5-minute buffer before actual expiry. | |
| c.expiry = time.Now().Add(time.Duration(expiresIn)*time.Second - 5*time.Minute) | |
| if expiresIn <= 0 { | |
| return "", fmt.Errorf("token endpoint returned non-positive expires_in: %d", expiresIn) | |
| } | |
| c.cachedToken = token | |
| lifetime := time.Duration(expiresIn) * time.Second | |
| // Use a buffer before actual expiry to proactively refresh the token. | |
| // Clamp the buffer to avoid setting expiry in the past for short-lived tokens. | |
| maxBuffer := 5 * time.Minute | |
| buffer := maxBuffer | |
| if lifetime < maxBuffer { | |
| buffer = lifetime / 2 | |
| } | |
| c.expiry = time.Now().Add(lifetime - buffer) |
internal/tools/service.go
Outdated
| // This separation exists because: | ||
| // - Service APIs use shared API keys with no per-user authorization | ||
| // - The MCP server IS the authorization gateway for these tools | ||
| // - The V2 access-check (OpenFGA-backed) is the authoritative decision | ||
| // - MCP scopes (read:all / manage:all) are for V2 API tools where | ||
| // Heimdal handles per-resource authorization |
There was a problem hiding this comment.
The doc comment here still says service APIs use shared API keys, but this PR replaces them with OAuth2 client credentials (M2M bearer tokens). Updating this bullet would avoid confusion for future readers and keep the in-code documentation aligned with the new auth model.
internal/tools/service.go
Outdated
| // RequireLFStaff checks that the user has the lf_staff custom claim. Returns | ||
| // an MCP error result if the user is not staff, or nil if they are. | ||
| func RequireLFStaff(req *mcp.CallToolRequest) *mcp.CallToolResult { | ||
| if req.Extra != nil && req.Extra.TokenInfo != nil && !IsLFStaff(req.Extra.TokenInfo) { |
There was a problem hiding this comment.
RequireLFStaff currently only denies access when TokenInfo is present AND IsLFStaff returns false. If TokenInfo/Extra is missing (e.g., auth misconfiguration or unauthenticated calls), this returns nil and allows access to a staff-only tool. This should deny by default when the staff claim cannot be verified (i.e., require TokenInfo+claim=true).
| if req.Extra != nil && req.Extra.TokenInfo != nil && !IsLFStaff(req.Extra.TokenInfo) { | |
| // Deny by default: require a token and a true lf_staff claim. | |
| if req.Extra == nil || req.Extra.TokenInfo == nil || !IsLFStaff(req.Extra.TokenInfo) { |
| // wrapWithDebugTransport wraps an HTTP client with wire-level request/response logging. | ||
| func wrapWithDebugTransport(client *http.Client, logger *slog.Logger) *http.Client { | ||
| base := client.Transport | ||
| if base == nil { | ||
| base = http.DefaultTransport | ||
| } | ||
| return &http.Client{ | ||
| Transport: &serviceDebugTransport{ | ||
| transport: base, | ||
| logger: logger, | ||
| }, | ||
| Timeout: client.Timeout, | ||
| } | ||
| } | ||
|
|
||
| // serviceDebugTransport logs the full HTTP wire dump of every request and response. | ||
| type serviceDebugTransport struct { | ||
| transport http.RoundTripper | ||
| logger *slog.Logger | ||
| } | ||
|
|
||
| // RoundTrip implements http.RoundTripper. | ||
| func (dt *serviceDebugTransport) RoundTrip(req *http.Request) (*http.Response, error) { | ||
| reqDump, err := httputil.DumpRequestOut(req, true) | ||
| if err != nil { | ||
| dt.logger.Error("failed to dump outbound request", "error", err) | ||
| } else { | ||
| dt.logger.Debug("serviceapi outbound request", "dump", string(reqDump)) | ||
| } | ||
|
|
||
| resp, err := dt.transport.RoundTrip(req) | ||
| if err != nil { | ||
| dt.logger.Error("serviceapi outbound request failed", "error", err, "url", req.URL.String()) | ||
| return nil, err | ||
| } | ||
|
|
||
| respDump, err := httputil.DumpResponse(resp, true) | ||
| if err != nil { | ||
| dt.logger.Error("failed to dump inbound response", "error", err) | ||
| } else { | ||
| dt.logger.Debug("serviceapi inbound response", "dump", string(respDump)) | ||
| } |
There was a problem hiding this comment.
The debug transport logs full request/response dumps via httputil.DumpRequestOut/DumpResponse. This will include the Authorization header (M2M bearer token) and potentially sensitive payloads, which is risky even behind a debug flag because logs often end up in centralized systems. Consider redacting Authorization (and other secrets) before dumping, or logging only safe metadata (method, URL, status, duration) with an opt-in body dump that explicitly strips credentials.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
emsearcy
left a comment
There was a problem hiding this comment.
largest changes would be the access_check response payload, and of course implementing the upstream calls that are currently commented out.
Also, I didn't flag it, but I assume we'll need to change the custom-claim prefix (per my comment in the Auth0 Action to adopt our newer, more-terse domain prefix).
…pes on service tools - Fix V2 access-check response parsing to handle the real tab-delimited format (request@user\ttrue|false) instead of the incorrect allow/deny - Handle unordered results via map-based matching instead of positional indexing - Enforce MCP JWT scopes (read:all/manage:all) on service tools via AddToolWithScopes, so reduced-scope tokens are respected - Fix RequireLFStaff to deny by default when token info is missing - Replace manual toolError helper with SDK error propagation for forward-compatibility with MCP spec changes - Update architecture doc to reflect dual-layer auth model and correct access-check response format Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
…laims/) Aligns with auth0-terraform PR #248 which moves the lf_staff custom claim from the legacy lfPrefix (https://sso.linuxfoundation.org/claims/) to the preferred lfxPrefix (http://lfx.dev/claims/) per Eric's review. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> Signed-off-by: Josep Garcia-Reyero Sais <josepreyero@gmail.com>
|
Also changed the custom-claim prefix lfxPrefix namespace as per the Auth0 Action comment. |
Summary
Replace shared API keys with OAuth2 client credentials (M2M JWTs) for service API authentication (LFX Lens, Member Onboarding). Each service gets its own Auth0 resource server — no more shared secrets.
See
docs/service-api-architecture.mdfor the full design: token flows, diagrams, tool inventory, and API schemas.What changed
ClientCredentialsClientfor OAuth2 client credentials with token caching;TokenSourceinterface replacesAPIKeyin the service API clientlfx_lens_queryandonboarding_list_membershipsregistered with correct args/types but return dummy responses (actual API calls commented out inline)LFXMCP_LF_AI_API_KEYreplaced with per-serviceLFXMCP_LENS_API_AUDIENCEandLFXMCP_ONBOARDING_API_AUDIENCEWhy dummy tool responses
The tools return placeholder data instead of calling the real service APIs. This lets us merge the full auth infrastructure now while the rest of the stack catches up. Three things need to happen before the tools can make real calls:
OS_SECURITY_KEY/ API key patternLFXMCP_LENS_API_URL,LFXMCP_LENS_API_AUDIENCE, etc.) set in its deployment configThe actual API calls are commented out inline in
lens.goandonboarding.gowith the full request/response format documented, so switching from dummy to real is a small, targeted change once the dependencies are ready.Dependencies
This PR depends on two
auth0-terraformPRs:auth0-terraform#249 — Creates the Auth0 resource servers for LFX Lens and Member Onboarding. Required before the MCP server can obtain client credentials tokens, and before the services can validate them.
auth0-terraform#248 — Adds the
lf_staffcustom claim to MCP API access tokens. Required forlfx_lens_queryspecifically — this tool checks thelf_staffJWT claim to enforce staff-only access. Until #248 is deployed, the Lens tool will reject all requests with "this tool is available to Linux Foundation staff only".onboarding_list_membershipsdoes not depend on #248 (no staff check).No new env vars needed to deploy this PR
The service tools only activate when both URL and audience are configured. With current deployment config unchanged, everything works as before.
Test plan
go build ./...passesgo test ./...passesApiKey/LF_AI_API_KEY🤖 Generated with Claude Code