fix: OWASP AI Agent Security hardening for chat and MCP endpoints by BenjaminMichaelis · Pull Request #1073 · IntelliTect/EssentialCSharp.Web

BenjaminMichaelis · 2026-05-07T17:01:48Z

Summary

Implements all actionable findings from an OWASP AI Agent Security Cheat Sheet review of the chat and MCP subsystems. The implementation was reviewed in two rounds by GPT-5.5 and Claude Opus 4.6, which found and resolved additional issues before this PR was created.

Fixes: #1064
Fixes: #1066
Fixes: #1067
Fixes: #1068
Fixes: #1069
Fixes: #1070
Fixes: #1071

- fix-1064: Wrap RAG vector search results in <retrieved_context> XML tags with injection-prevention header; truncate chunks to 2000 chars; update system prompt to treat retrieved content as read-only reference - fix-1066: Add AllowedMcpTools static allowlist to AIOptions; filter tools at option-build time in CreateResponseOptionsAsync; add defense- in-depth check before CallToolAsync in both GetChatCompletionCore and ExecuteFunctionCallAsync - fix-1067: Add endUserId parameter to GetChatCompletion/Stream; pass userId from ChatController claims; console app passes 'console-local'; wired through CreateResponseOptionsAsync (SDK note: User property not yet available in OpenAI SDK v2.7.0) - fix-1068: Add MaxTokensPerUser=10 cap to McpApiTokenService; add GetActiveTokenCountAsync; pre-check in McpTokenController before CreateTokenAsync - fix-1069: Add ILogger<AIChatService> to AIChatService constructor; mark class partial; add [LoggerMessage] source-generated log methods for tool invocations, rejections, and contextual search; never log tool arguments or prompt/response content - fix-1070: Add ResponseIdValidationService (IMemoryCache-backed, userId->HashSet<responseId>, 2h sliding expiry); validate previousResponseId in ChatController before AI calls; record new responseIds after completion; cache-miss allows (graceful degradation) - fix-1071: Move lastResponseId from localStorage to sessionStorage (clears on tab close); add 2h TTL enforcement on history load; update clearChatHistory to clean both storage locations

…n 2) Based on GPT-5.5 and Claude Opus 4.6 review of the initial implementation: - fix(streaming): ExecuteFunctionCallAsync on rejected tool now streams a recovery response ('tool not available') back to the model instead of yield break, preventing silent stream truncation and OpenAI protocol violations (both agents flagged as blocker) - fix(allowlist): AllowedMcpTools changes from fail-open (empty=allow all) to fail-secure (empty=deny all). Added AllowAllMcpTools bool for explicit dev override. Converted _AllowedMcpTools to FrozenSet<string> at ctor time for O(1) lookup and immutability. Added IsMcpToolAllowed() helper to centralize the check across all three call sites. - fix(rag): Sanitize XML angle brackets in Heading and ChunkText using typographic alternatives (U+2039/U+203A) before inserting into <retrieved_context> block, preventing boundary-escape attacks - fix(cache): ResponseIdValidationService.RecordResponseId sets sliding expiry inside the GetOrCreate factory (not in a separate Set call) eliminating the TOCTOU race between GetOrCreate and cache.Set. Added 500-ID cap per user to prevent unbounded HashSet growth. - fix(docs): Corrected endUserId param comment to say 'reserved for forwarding' rather than claiming it is already forwarded

- Remove dead MakeCacheOptions() method from ResponseIdValidationService - Add comment explaining why user_question prompt is intentionally not XML-sanitized (C# generics like List<T> must not be corrupted)

Copilot

Pull request overview

This PR applies OWASP AI Agent Security hardening across the chat and MCP tool-call paths by adding prompt trust-boundary framing/sanitization, MCP tool allowlisting, per-user response-id validation, and MCP token issuance limits.

Changes:

Hardened prompt enrichment by sandboxing retrieved vector-search content inside explicit <retrieved_context> boundaries and sanitizing/truncating retrieved chunks.
Added a config-driven MCP tool allowlist and enforced it both when advertising tools to the model and before executing tool calls.
Reduced client-side persistence risk and added server-side validation to bind previousResponseId to the authenticated user; added a per-user active MCP token cap.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
EssentialCSharp.Web/wwwroot/js/chat-module.js	Moves `lastResponseId` persistence to `sessionStorage` and adds a short TTL for local chat history.
EssentialCSharp.Web/Services/ResponseIdValidationService.cs	Adds server-side tracking/validation of response IDs per user using IMemoryCache.
EssentialCSharp.Web/Services/McpApiTokenService.cs	Adds active-token counting and introduces `MaxTokensPerUser`.
EssentialCSharp.Web/Program.cs	Registers memory cache and the response-id validation service.
EssentialCSharp.Web/Controllers/McpTokenController.cs	Enforces per-user active MCP token cap before issuing new tokens.
EssentialCSharp.Web/Controllers/ChatController.cs	Validates client-supplied `previousResponseId`, records issued response IDs, and plumbs `endUserId`.
EssentialCSharp.Web/appsettings.json	Updates system prompt to treat `<retrieved_context>` as untrusted data; adds MCP tool allowlist.
EssentialCSharp.Chat/Program.cs	Passes a concrete `endUserId` into chat calls (console host).
EssentialCSharp.Chat.Shared/Services/AIChatService.cs	Implements retrieved-context sandboxing, MCP allowlist enforcement, and AI security logging; plumbs `endUserId`.
EssentialCSharp.Chat.Shared/Models/AIOptions.cs	Adds `AllowedMcpTools` and `AllowAllMcpTools` configuration options.

- ResponseIdValidationService: redesign from per-user HashSet to per-responseId cache entries, eliminating arbitrary eviction (ids.Remove(ids.First())) and reducing lock contention; each ID now has its own 2-hour TTL - McpApiTokenService.CreateTokenAsync: enforce MaxTokensPerUser limit inside a serializable transaction, making the cap atomic under concurrency; move limit check out of controller into service - McpTokenController: remove pre-check (now redundant); catch InvalidOperationException from service and return 400 - chat-module.js: clear sessionStorage lastResponseId when localStorage history is rejected due to TTL expiry, preventing stale responseId - AIChatService: add LogMcpToolCallInvokedStream [LoggerMessage] for the streaming code path (previously shared LogMcpToolCallInvoked with iteration semantics, now split: iteration vs depth are distinct) - AIChatService: reword endUserId comment to accurately reflect that the SDK does not support it yet (was misleadingly claiming it was forwarded) - Tests: add GetActiveTokenCountAsync tests (active, revoked, expired, zero, at-max-limit) to McpApiTokenServiceTests - Tests: add ResponseIdValidationServiceTests (new conversation, cache miss, owner validates, cross-user rejected, null inputs, multi-user isolation)

…ings - Add TokenLimitExceededException (subclass of InvalidOperationException) so McpTokenController can catch only the expected limit-exceeded case; previously catching InvalidOperationException would mask unrelated EF Core failures as 400 BadRequest instead of 500 - McpApiTokenService: throw TokenLimitExceededException instead of base InvalidOperationException; remove redundant explicit RollbackAsync before throw (using-block disposal handles rollback automatically) - Program.cs: change ResponseIdValidationService registration from Scoped to Singleton — service is stateless (all state is in IMemoryCache singleton) - Tests: rename AtMaxLimit test to reflect specific exception type; add AfterRevokingAtLimit test to cover the key invariant that a revoked token frees a slot for a new one

…tion docs - ResponseIdValidationService: create a fresh MemoryCacheEntryOptions per RecordResponseId call instead of sharing a static instance. The shared instance was safe today (no mutable lists populated) but is a latent hazard if ExpirationTokens or PostEvictionCallbacks are ever added in future — IMemoryCache reads those lists at Set time without defensive copying - TokenLimitExceededException: add XML doc warning that catch-by-base-class (InvalidOperationException) will silently capture this type

…ests

Copilot

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 7 comments.

…tion 4) - Thread endUserId through GetChatCompletionCore, ProcessStreamingUpdatesAsync, and ExecuteFunctionCallAsync; include in tool-call and tool-rejected log messages for security telemetry - Emit responseId early from StreamingResponseCreatedUpdate so ChatController can record ownership before stream completes (handles mid-stream client disconnects) - Record responseId in streaming loop on first observation instead of post-loop, eliminating the disconnect race - Add SizeLimit=10_000 to AddMemoryCache to bound memory growth; add SetSize(1) per entry in ResponseIdValidationService - Fix MemoryCache disposal in ResponseIdValidationServiceTests: each test owns its cache via 'using var cache' - Fix IServiceScope disposal in McpApiTokenServiceTests: collect scopes in _scopes list, dispose all via [After(Test)] teardown

- AIChatService: remove responseId yield from StreamingResponseCompletedUpdate; StreamingResponseCreatedUpdate already emits the ID early at the start of each API call leg, making the CompletedUpdate emit redundant and causing duplicate SSE events (one per leg) when tool calls are involved - AIChatService: drop now-unnecessary per-iteration string? responseId local variable and the dead esponseId = functionResult.responseId assignment inside the tool-call foreach loop - ChatController: remove esponseIdRecorded one-shot flag; instead record ownership for every non-null responseId yielded — one per response leg (initial + each tool-call continuation) — so tool-call continuation IDs are properly bound to the authenticated user before being forwarded to the client - ChatController: add null-guard for userId after FindFirstValue(); [Authorize] + ASP.NET Identity guarantees the claim is present but the explicit check returns 401 early rather than silently propagating a null into validation - ResponseIdValidationServiceTests: extract CreateCache() helper with SizeLimit = 10_000 matching production config, so SetSize(1) per entry is actually exercised in tests rather than silently ignored; add new RecordResponseId_SizeLimitEnforced_EntryCountedInCache test

Copilot

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

… SizeLimit; remove dead cleanupOldSessions JS - ResponseIdValidationService now creates and owns a dedicated MemoryCache instance (parameterless DI ctor) rather than injecting the shared IMemoryCache. This prevents the SizeLimit constraint from forcing every other IMemoryCache caller in the app to call SetSize, which would cause runtime InvalidOperationExceptions in ASP.NET Core's built-in caches. - An IMemoryCache overload ctor is retained for unit test injection. - Remove SizeLimit from AddMemoryCache() in Program.cs; the dedicated cache in ResponseIdValidationService already carries SizeLimit=10_000. - Remove cleanupOldSessions() from chat-module.js: loadChatHistory() already enforces the 2-hour TTL on load, making the 7-day cleanup function unreachable dead code.

Copilot

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

EssentialCSharp.Web/wwwroot/js/chat-module.js:70

saveChatHistory() only writes aiChatLastResponseId to sessionStorage after localStorage.setItem(...) succeeds. If localStorage throws (quota exceeded, blocked, etc.), the catch/fallback path never updates sessionStorage, so lastResponseId persistence can become stale or be lost on reload even though sessionStorage is still available. Consider updating/removing the sessionStorage key in a finally (or duplicating the sessionStorage write in the catch) so lastResponseId is persisted independently of localStorage success.

        try {
            // Limit messages to prevent memory issues (keep last 100 messages)
            const maxMessages = 100;
            const messagesToSave = chatMessages.value.slice(-maxMessages);
            
            const data = {
                messages: messagesToSave,
                timestamp: Date.now()
            };
            localStorage.setItem('aiChatHistory', JSON.stringify(data));
            // Store lastResponseId in sessionStorage — it clears on tab close, reducing persistent exposure
            if (lastResponseId.value) {
                sessionStorage.setItem('aiChatLastResponseId', lastResponseId.value);
            } else {
                sessionStorage.removeItem('aiChatLastResponseId');
            }
        } catch (error) {
            console.warn('Failed to save chat history:', error);
            // If localStorage is full, try clearing and saving only recent messages
            try {
                const recentMessages = chatMessages.value.slice(-20);
                const fallbackData = {
                    messages: recentMessages,
                    timestamp: Date.now()
                };
                localStorage.setItem('aiChatHistory', JSON.stringify(fallbackData));
            } catch (fallbackError) {

…; fix XML doc exception type - Register ResponseIdValidationService via AddSingleton factory lambda (_ => new ResponseIdValidationService()) so DI always calls the parameterless ctor that creates the owned, size-limited MemoryCache. Without this, ASP.NET Core DI prefers the IMemoryCache ctor (matching the registered IMemoryCache from AddMemoryCache()), defeating the dedicated cache design. - Update CreateTokenAsync XML doc: references TokenLimitExceededException instead of the base InvalidOperationException, so callers know the concrete type to catch.

Copilot

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

BenjaminMichaelis · 2026-05-09T04:16:03Z

@copilot resolve the merge conflicts in this pull request

…ity-review # Conflicts: # EssentialCSharp.Chat.Shared/Services/AIChatService.cs Co-authored-by: BenjaminMichaelis <22186029+BenjaminMichaelis@users.noreply.github.com>

BenjaminMichaelis added 3 commits May 6, 2026 22:14

fix: cleanup from second agent review pass

b1119cc

- Remove dead MakeCacheOptions() method from ResponseIdValidationService - Add comment explaining why user_question prompt is intentionally not XML-sanitized (C# generics like List<T> must not be corrupted)

BenjaminMichaelis changed the title ~~security: OWASP AI Agent Security hardening for chat and MCP endpoints~~ fix: OWASP AI Agent Security hardening for chat and MCP endpoints May 7, 2026

BenjaminMichaelis marked this pull request as ready for review May 8, 2026 00:38

Copilot AI review requested due to automatic review settings May 8, 2026 00:38

Copilot started reviewing on behalf of BenjaminMichaelis May 8, 2026 00:38 View session

Copilot AI reviewed May 8, 2026

View reviewed changes

github-code-quality Bot found potential problems May 8, 2026

View reviewed changes

Comment thread EssentialCSharp.Web.Tests/ResponseIdValidationServiceTests.cs Fixed

BenjaminMichaelis added 3 commits May 7, 2026 18:05

test: DRY up McpApiTokenServiceTests and ResponseIdValidationServiceT…

068ff8f

…ests

BenjaminMichaelis requested a review from Copilot May 8, 2026 04:55

Copilot started reviewing on behalf of BenjaminMichaelis May 8, 2026 04:56 View session

Copilot AI reviewed May 8, 2026

View reviewed changes

BenjaminMichaelis added 2 commits May 7, 2026 22:12

BenjaminMichaelis requested a review from Copilot May 8, 2026 21:03

Copilot started reviewing on behalf of BenjaminMichaelis May 8, 2026 21:04 View session

Copilot AI reviewed May 8, 2026

View reviewed changes

Comment thread EssentialCSharp.Web/Program.cs Outdated

Comment thread EssentialCSharp.Web/Services/ResponseIdValidationService.cs Outdated

Comment thread EssentialCSharp.Web/wwwroot/js/chat-module.js Outdated

BenjaminMichaelis requested a review from Copilot May 9, 2026 01:41

Copilot started reviewing on behalf of BenjaminMichaelis May 9, 2026 01:41 View session

Copilot AI reviewed May 9, 2026

View reviewed changes

Comment thread EssentialCSharp.Web/Program.cs Outdated

Comment thread EssentialCSharp.Web/Services/McpApiTokenService.cs Outdated

BenjaminMichaelis requested a review from Copilot May 9, 2026 02:06

Copilot started reviewing on behalf of BenjaminMichaelis May 9, 2026 02:06 View session

Copilot AI reviewed May 9, 2026

View reviewed changes

Comment thread EssentialCSharp.Web.Tests/McpApiTokenServiceTests.cs

Copilot started work on behalf of BenjaminMichaelis May 9, 2026 04:16 View session

Merge remote-tracking branch 'origin/main' into agents/ai-agent-secur…

518f50d

…ity-review # Conflicts: # EssentialCSharp.Chat.Shared/Services/AIChatService.cs Co-authored-by: BenjaminMichaelis <22186029+BenjaminMichaelis@users.noreply.github.com>

BenjaminMichaelis merged commit 0d2cb75 into main May 9, 2026
6 checks passed

BenjaminMichaelis deleted the agents/ai-agent-security-review branch May 9, 2026 04:22

Copilot stopped work on behalf of BenjaminMichaelis due to an error May 9, 2026 04:23
The session was cancelled by the user.

Conversation

BenjaminMichaelis commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

BenjaminMichaelis commented May 9, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

BenjaminMichaelis commented May 7, 2026 •

edited

Loading