Skip to content

fix: OWASP AI Agent Security hardening for chat and MCP endpoints#1073

Merged
BenjaminMichaelis merged 12 commits intomainfrom
agents/ai-agent-security-review
May 9, 2026
Merged

fix: OWASP AI Agent Security hardening for chat and MCP endpoints#1073
BenjaminMichaelis merged 12 commits intomainfrom
agents/ai-agent-security-review

Conversation

@BenjaminMichaelis
Copy link
Copy Markdown
Member

@BenjaminMichaelis BenjaminMichaelis commented May 7, 2026

Summary

Implements all actionable findings from an OWASP AI Agent Security Cheat Sheet review of the chat and MCP subsystems. The implementation was reviewed in two rounds by GPT-5.5 and Claude Opus 4.6, which found and resolved additional issues before this PR was created.

Fixes: #1064
Fixes: #1066
Fixes: #1067
Fixes: #1068
Fixes: #1069
Fixes: #1070
Fixes: #1071

- fix-1064: Wrap RAG vector search results in <retrieved_context> XML
  tags with injection-prevention header; truncate chunks to 2000 chars;
  update system prompt to treat retrieved content as read-only reference

- fix-1066: Add AllowedMcpTools static allowlist to AIOptions; filter
  tools at option-build time in CreateResponseOptionsAsync; add defense-
  in-depth check before CallToolAsync in both GetChatCompletionCore and
  ExecuteFunctionCallAsync

- fix-1067: Add endUserId parameter to GetChatCompletion/Stream; pass
  userId from ChatController claims; console app passes 'console-local';
  wired through CreateResponseOptionsAsync (SDK note: User property not
  yet available in OpenAI SDK v2.7.0)

- fix-1068: Add MaxTokensPerUser=10 cap to McpApiTokenService;
  add GetActiveTokenCountAsync; pre-check in McpTokenController
  before CreateTokenAsync

- fix-1069: Add ILogger<AIChatService> to AIChatService constructor;
  mark class partial; add [LoggerMessage] source-generated log methods
  for tool invocations, rejections, and contextual search; never log
  tool arguments or prompt/response content

- fix-1070: Add ResponseIdValidationService (IMemoryCache-backed,
  userId->HashSet<responseId>, 2h sliding expiry); validate
  previousResponseId in ChatController before AI calls; record new
  responseIds after completion; cache-miss allows (graceful degradation)

- fix-1071: Move lastResponseId from localStorage to sessionStorage
  (clears on tab close); add 2h TTL enforcement on history load;
  update clearChatHistory to clean both storage locations
…n 2)

Based on GPT-5.5 and Claude Opus 4.6 review of the initial implementation:

- fix(streaming): ExecuteFunctionCallAsync on rejected tool now streams a
  recovery response ('tool not available') back to the model instead of
  yield break, preventing silent stream truncation and OpenAI protocol
  violations (both agents flagged as blocker)

- fix(allowlist): AllowedMcpTools changes from fail-open (empty=allow all)
  to fail-secure (empty=deny all). Added AllowAllMcpTools bool for explicit
  dev override. Converted _AllowedMcpTools to FrozenSet<string> at ctor
  time for O(1) lookup and immutability. Added IsMcpToolAllowed() helper
  to centralize the check across all three call sites.

- fix(rag): Sanitize XML angle brackets in Heading and ChunkText using
  typographic alternatives (U+2039/U+203A) before inserting into
  <retrieved_context> block, preventing boundary-escape attacks

- fix(cache): ResponseIdValidationService.RecordResponseId sets sliding
  expiry inside the GetOrCreate factory (not in a separate Set call)
  eliminating the TOCTOU race between GetOrCreate and cache.Set. Added
  500-ID cap per user to prevent unbounded HashSet growth.

- fix(docs): Corrected endUserId param comment to say 'reserved for
  forwarding' rather than claiming it is already forwarded
- Remove dead MakeCacheOptions() method from ResponseIdValidationService
- Add comment explaining why user_question prompt is intentionally not
  XML-sanitized (C# generics like List<T> must not be corrupted)
@BenjaminMichaelis BenjaminMichaelis changed the title security: OWASP AI Agent Security hardening for chat and MCP endpoints fix: OWASP AI Agent Security hardening for chat and MCP endpoints May 7, 2026
@BenjaminMichaelis BenjaminMichaelis marked this pull request as ready for review May 8, 2026 00:38
Copilot AI review requested due to automatic review settings May 8, 2026 00:38
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR applies OWASP AI Agent Security hardening across the chat and MCP tool-call paths by adding prompt trust-boundary framing/sanitization, MCP tool allowlisting, per-user response-id validation, and MCP token issuance limits.

Changes:

  • Hardened prompt enrichment by sandboxing retrieved vector-search content inside explicit <retrieved_context> boundaries and sanitizing/truncating retrieved chunks.
  • Added a config-driven MCP tool allowlist and enforced it both when advertising tools to the model and before executing tool calls.
  • Reduced client-side persistence risk and added server-side validation to bind previousResponseId to the authenticated user; added a per-user active MCP token cap.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
EssentialCSharp.Web/wwwroot/js/chat-module.js Moves lastResponseId persistence to sessionStorage and adds a short TTL for local chat history.
EssentialCSharp.Web/Services/ResponseIdValidationService.cs Adds server-side tracking/validation of response IDs per user using IMemoryCache.
EssentialCSharp.Web/Services/McpApiTokenService.cs Adds active-token counting and introduces MaxTokensPerUser.
EssentialCSharp.Web/Program.cs Registers memory cache and the response-id validation service.
EssentialCSharp.Web/Controllers/McpTokenController.cs Enforces per-user active MCP token cap before issuing new tokens.
EssentialCSharp.Web/Controllers/ChatController.cs Validates client-supplied previousResponseId, records issued response IDs, and plumbs endUserId.
EssentialCSharp.Web/appsettings.json Updates system prompt to treat <retrieved_context> as untrusted data; adds MCP tool allowlist.
EssentialCSharp.Chat/Program.cs Passes a concrete endUserId into chat calls (console host).
EssentialCSharp.Chat.Shared/Services/AIChatService.cs Implements retrieved-context sandboxing, MCP allowlist enforcement, and AI security logging; plumbs endUserId.
EssentialCSharp.Chat.Shared/Models/AIOptions.cs Adds AllowedMcpTools and AllowAllMcpTools configuration options.

Comment thread EssentialCSharp.Web/Services/ResponseIdValidationService.cs Outdated
Comment thread EssentialCSharp.Web/Controllers/McpTokenController.cs Outdated
Comment thread EssentialCSharp.Web/wwwroot/js/chat-module.js
Comment thread EssentialCSharp.Web/Services/McpApiTokenService.cs
Comment thread EssentialCSharp.Web/Services/ResponseIdValidationService.cs
Comment thread EssentialCSharp.Web/Services/ResponseIdValidationService.cs Outdated
Comment thread EssentialCSharp.Chat.Shared/Services/AIChatService.cs Outdated
Comment thread EssentialCSharp.Chat.Shared/Services/AIChatService.cs Outdated
- ResponseIdValidationService: redesign from per-user HashSet to per-responseId
  cache entries, eliminating arbitrary eviction (ids.Remove(ids.First()))
  and reducing lock contention; each ID now has its own 2-hour TTL
- McpApiTokenService.CreateTokenAsync: enforce MaxTokensPerUser limit inside
  a serializable transaction, making the cap atomic under concurrency; move
  limit check out of controller into service
- McpTokenController: remove pre-check (now redundant); catch
  InvalidOperationException from service and return 400
- chat-module.js: clear sessionStorage lastResponseId when localStorage
  history is rejected due to TTL expiry, preventing stale responseId
- AIChatService: add LogMcpToolCallInvokedStream [LoggerMessage] for the
  streaming code path (previously shared LogMcpToolCallInvoked with
  iteration semantics, now split: iteration vs depth are distinct)
- AIChatService: reword endUserId comment to accurately reflect that the
  SDK does not support it yet (was misleadingly claiming it was forwarded)
- Tests: add GetActiveTokenCountAsync tests (active, revoked, expired,
  zero, at-max-limit) to McpApiTokenServiceTests
- Tests: add ResponseIdValidationServiceTests (new conversation, cache miss,
  owner validates, cross-user rejected, null inputs, multi-user isolation)
Comment thread EssentialCSharp.Web.Tests/ResponseIdValidationServiceTests.cs Fixed
…ings

- Add TokenLimitExceededException (subclass of InvalidOperationException)
  so McpTokenController can catch only the expected limit-exceeded case;
  previously catching InvalidOperationException would mask unrelated EF Core
  failures as 400 BadRequest instead of 500
- McpApiTokenService: throw TokenLimitExceededException instead of base
  InvalidOperationException; remove redundant explicit RollbackAsync before
  throw (using-block disposal handles rollback automatically)
- Program.cs: change ResponseIdValidationService registration from Scoped
  to Singleton — service is stateless (all state is in IMemoryCache singleton)
- Tests: rename AtMaxLimit test to reflect specific exception type;
  add AfterRevokingAtLimit test to cover the key invariant that a revoked
  token frees a slot for a new one
…tion docs

- ResponseIdValidationService: create a fresh MemoryCacheEntryOptions per
  RecordResponseId call instead of sharing a static instance. The shared
  instance was safe today (no mutable lists populated) but is a latent hazard
  if ExpirationTokens or PostEvictionCallbacks are ever added in future —
  IMemoryCache reads those lists at Set time without defensive copying
- TokenLimitExceededException: add XML doc warning that catch-by-base-class
  (InvalidOperationException) will silently capture this type
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 7 comments.

Comment thread EssentialCSharp.Web/Controllers/ChatController.cs Outdated
Comment thread EssentialCSharp.Web/Services/ResponseIdValidationService.cs Outdated
Comment thread EssentialCSharp.Web/Program.cs
Comment thread EssentialCSharp.Chat.Shared/Services/AIChatService.cs
Comment thread EssentialCSharp.Chat.Shared/Services/AIChatService.cs Outdated
Comment thread EssentialCSharp.Web/wwwroot/js/chat-module.js
Comment thread EssentialCSharp.Web.Tests/McpApiTokenServiceTests.cs Outdated
…tion 4)

- Thread endUserId through GetChatCompletionCore, ProcessStreamingUpdatesAsync, and ExecuteFunctionCallAsync; include in tool-call and tool-rejected log messages for security telemetry
- Emit responseId early from StreamingResponseCreatedUpdate so ChatController can record ownership before stream completes (handles mid-stream client disconnects)
- Record responseId in streaming loop on first observation instead of post-loop, eliminating the disconnect race
- Add SizeLimit=10_000 to AddMemoryCache to bound memory growth; add SetSize(1) per entry in ResponseIdValidationService
- Fix MemoryCache disposal in ResponseIdValidationServiceTests: each test owns its cache via 'using var cache'
- Fix IServiceScope disposal in McpApiTokenServiceTests: collect scopes in _scopes list, dispose all via [After(Test)] teardown
- AIChatService: remove responseId yield from StreamingResponseCompletedUpdate;
  StreamingResponseCreatedUpdate already emits the ID early at the start of
  each API call leg, making the CompletedUpdate emit redundant and causing
  duplicate SSE events (one per leg) when tool calls are involved

- AIChatService: drop now-unnecessary per-iteration string? responseId local
  variable and the dead 
esponseId = functionResult.responseId assignment
  inside the tool-call foreach loop

- ChatController: remove 
esponseIdRecorded one-shot flag; instead record
  ownership for every non-null responseId yielded — one per response leg
  (initial + each tool-call continuation) — so tool-call continuation IDs
  are properly bound to the authenticated user before being forwarded to
  the client

- ChatController: add null-guard for userId after FindFirstValue(); [Authorize]
  + ASP.NET Identity guarantees the claim is present but the explicit check
  returns 401 early rather than silently propagating a null into validation

- ResponseIdValidationServiceTests: extract CreateCache() helper with
  SizeLimit = 10_000 matching production config, so SetSize(1) per entry
  is actually exercised in tests rather than silently ignored; add new
  RecordResponseId_SizeLimitEnforced_EntryCountedInCache test
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

Comment thread EssentialCSharp.Web/Program.cs Outdated
Comment thread EssentialCSharp.Web/Services/ResponseIdValidationService.cs Outdated
Comment thread EssentialCSharp.Web/wwwroot/js/chat-module.js Outdated
… SizeLimit; remove dead cleanupOldSessions JS

- ResponseIdValidationService now creates and owns a dedicated MemoryCache
  instance (parameterless DI ctor) rather than injecting the shared
  IMemoryCache. This prevents the SizeLimit constraint from forcing every
  other IMemoryCache caller in the app to call SetSize, which would cause
  runtime InvalidOperationExceptions in ASP.NET Core's built-in caches.
- An IMemoryCache overload ctor is retained for unit test injection.
- Remove SizeLimit from AddMemoryCache() in Program.cs; the dedicated
  cache in ResponseIdValidationService already carries SizeLimit=10_000.
- Remove cleanupOldSessions() from chat-module.js: loadChatHistory()
  already enforces the 2-hour TTL on load, making the 7-day cleanup
  function unreachable dead code.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

EssentialCSharp.Web/wwwroot/js/chat-module.js:70

  • saveChatHistory() only writes aiChatLastResponseId to sessionStorage after localStorage.setItem(...) succeeds. If localStorage throws (quota exceeded, blocked, etc.), the catch/fallback path never updates sessionStorage, so lastResponseId persistence can become stale or be lost on reload even though sessionStorage is still available. Consider updating/removing the sessionStorage key in a finally (or duplicating the sessionStorage write in the catch) so lastResponseId is persisted independently of localStorage success.
        try {
            // Limit messages to prevent memory issues (keep last 100 messages)
            const maxMessages = 100;
            const messagesToSave = chatMessages.value.slice(-maxMessages);
            
            const data = {
                messages: messagesToSave,
                timestamp: Date.now()
            };
            localStorage.setItem('aiChatHistory', JSON.stringify(data));
            // Store lastResponseId in sessionStorage — it clears on tab close, reducing persistent exposure
            if (lastResponseId.value) {
                sessionStorage.setItem('aiChatLastResponseId', lastResponseId.value);
            } else {
                sessionStorage.removeItem('aiChatLastResponseId');
            }
        } catch (error) {
            console.warn('Failed to save chat history:', error);
            // If localStorage is full, try clearing and saving only recent messages
            try {
                const recentMessages = chatMessages.value.slice(-20);
                const fallbackData = {
                    messages: recentMessages,
                    timestamp: Date.now()
                };
                localStorage.setItem('aiChatHistory', JSON.stringify(fallbackData));
            } catch (fallbackError) {

Comment thread EssentialCSharp.Web/Program.cs Outdated
Comment thread EssentialCSharp.Web/Services/McpApiTokenService.cs Outdated
…; fix XML doc exception type

- Register ResponseIdValidationService via AddSingleton factory lambda
  (_ => new ResponseIdValidationService()) so DI always calls the
  parameterless ctor that creates the owned, size-limited MemoryCache.
  Without this, ASP.NET Core DI prefers the IMemoryCache ctor (matching
  the registered IMemoryCache from AddMemoryCache()), defeating the
  dedicated cache design.
- Update CreateTokenAsync XML doc: references TokenLimitExceededException
  instead of the base InvalidOperationException, so callers know the
  concrete type to catch.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

Comment thread EssentialCSharp.Web.Tests/McpApiTokenServiceTests.cs
@BenjaminMichaelis
Copy link
Copy Markdown
Member Author

@copilot resolve the merge conflicts in this pull request

…ity-review

# Conflicts:
#	EssentialCSharp.Chat.Shared/Services/AIChatService.cs

Co-authored-by: BenjaminMichaelis <22186029+BenjaminMichaelis@users.noreply.github.com>
@BenjaminMichaelis BenjaminMichaelis merged commit 0d2cb75 into main May 9, 2026
6 checks passed
@BenjaminMichaelis BenjaminMichaelis deleted the agents/ai-agent-security-review branch May 9, 2026 04:22
Copilot stopped work on behalf of BenjaminMichaelis due to an error May 9, 2026 04:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment