Migrate agent tools from runtime to individual Lambda functions behind AgentCore Gateway by msumari · Pull Request #139 · aws-samples/sample-agentic-cost-optimizer

msumari · 2026-03-01T18:19:05Z

Issue number: closes #

Summary

Moves all agent tools (storage, journal, and the generic use_aws) out of the agent runtime into standalone Lambda functions behind the Bedrock AgentCore Gateway. Each Lambda has its own scoped IAM role, replacing the previous "overlord permissions" pattern where the runtime held broad access to S3, DynamoDB, Lambda, CloudWatch, and Pricing APIs.

Changes

New Lambda tools (7 handlers): storage_tool, journal_tool, interceptor, lambda_discovery_tool, cloudwatch_metrics_tool, cloudwatch_logs_tool, pricing_tool — each with tool-name routing, structured JSON responses, and scoped IAM.

CDK construct:
gateway.ts — Cognito OAuth2 (client credentials), Lambda targets with tool schemas, interceptor escape hatch, cdk-nag suppressions.

MCP client: src/mcp_client/ — OAuth2 token caching, MCPClient factory for the agent runtime to call tools via the gateway.

Agent runtime (main.py): Replaced storage, journal, and use_aws imports with a single mcp_client.

Infrastructure (infra-stack.ts, agent.ts): Removed direct S3 bucket policy, S3_BUCKET_NAME env var, and monitoringPolicy from the runtime.

Prompt updates: analysis_prompt.md references new tool names. report_prompt.md adds ASCII-only output constraint to prevent UTF-8 mojibake.

Deleted:
journal.py, storage.py, old tests, agent-iam-policies.spec.ts.

Security
Path traversal validation on storage filenames (allowlist regex)
Interceptor blocks prompt injection and redacts PII

Tests
210 Python tests (99% coverage), 41 TypeScript tests (100% statement coverage).

Please provide a summary of what's being changed

User experience

Please share what the user experience looks like before and after this change

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.

github-actions · 2026-03-01T18:19:37Z

Coverage report

Click to see where and how coverage changed

File	Statements	Missing	Coverage	Coverage (new stmts)	Lines missing
src/agents
main.py
src/mcp_client
__init__.py
client.py
src/shared
config.py
src/tools
__init__.py
Project Total

_{This report was generated by python-coverage-comment-action}

matteofigus

Thanks for the PR. I've gone through the diff and have some concerns about the overall approach.

The core premise doesn't hold up

The PR is framed as moving tools from the AgentCore runtime to Lambda for better permission scoping, but AgentCore runtime is Lambda under the hood. So this effectively goes from 1 Lambda with in-process tools to 8 Lambdas + Cognito + AgentCore Gateway + OAuth2 — without a meaningful security improvement to justify that complexity.

The permission improvements are marginal and achievable without this change

Comparing old vs new IAM policies side by side:

Tool	Old (runtime role)	New (per-Lambda role)
Lambda discovery	`lambda:ListFunctions`, `GetFunction`, etc → `*`	Same actions → `*`
CloudWatch metrics	`cloudwatch:GetMetricStatistics`, `ListMetrics` → `*`	Same actions → `*`
CloudWatch logs	`logs:StartQuery` → `/aws/lambda/`, `StopQuery`/`GetQueryResults` → ``	Same
Pricing	`pricing:GetProducts` → `*`	Same
S3 storage	`GetObject`/`PutObject` on bucket	`grantReadWrite` (broader — adds `Abort`, `DeleteObject`, `GetBucket`, `List`)
DynamoDB journal	Full table access on runtime	`PutItem` only ✅

The only real win is scoping journal to PutItem. The S3 change is actually a regression. The monitoring permissions are identical because those APIs don't support resource-level scoping. Both improvements could be achieved by tightening the existing runtime role — no new infrastructure needed.

New costs and tradeoffs

Latency: Every tool call now goes runtime → HTTP → Gateway → OAuth2 check → Lambda invoke → response, vs a direct in-process function call
Reliability: More failure modes — gateway outages, OAuth token endpoint issues, cold starts on tool Lambdas. If the gateway is down, the agent has zero tools
Cost: 7 additional Lambdas, Cognito user pool, AgentCore Gateway for what was previously free (in-process calls)
Complexity: 752 lines of CDK for gateway.ts alone, plus MCP client, OAuth token caching, etc.
Debugging: Tool call failures now span multiple Lambda invocations and log groups

Other issues

gateway.clientSecret.unsafeUnwrap() passes the Cognito client secret as a plaintext environment variable. Should use Secrets Manager or SSM SecureString.
The interceptor's regex-based prompt injection blocking is trivially bypassable and gives a false sense of security.
use_aws (generic AWS API tool) was removed without replacement — potential functional regression if the agent used it for anything beyond the 5 specific APIs now covered.
Token cache (_token_cache module-level dict) has no thread safety.

Suggestion

If the goal is tighter permissions, I'd suggest just scoping the existing runtime role more tightly (e.g., DynamoDB PutItem only, remove any excess S3 actions). That achieves the same or better security posture without the architectural overhead.

This review was done with the help of Claude Opus 4.6

msumari added 7 commits February 27, 2026 14:35

feat: remove old tools

3bf66ba

feat: add new tools

f2824a8

fix: add and update unit tests

cf9872b

fix: filename validation and deduplication

e0ddc53

feat: add aws service specific tools & remove use_aws

8b6a676

feat: add tools test

b5bba00

fix: resolve UTF-8 mojibake in report output

0afb3ef

msumari requested a review from a team as a code owner March 1, 2026 18:19

matteofigus requested changes Mar 2, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Migrate agent tools from runtime to individual Lambda functions behind AgentCore Gateway#139

Migrate agent tools from runtime to individual Lambda functions behind AgentCore Gateway#139
msumari wants to merge 7 commits intomainfrom
msumari/tools-out

msumari commented Mar 1, 2026

Uh oh!

github-actions bot commented Mar 1, 2026

Uh oh!

matteofigus left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

msumari commented Mar 1, 2026

Summary

Changes

User experience

Uh oh!

github-actions bot commented Mar 1, 2026

Coverage report

Uh oh!

matteofigus left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants