AI Gateway DLP docs improved (5 files) #28326
ask-bonk[bot] wants to merge 3 commits into production
Co-authored-by: irvinebroque <irvinebroque@users.noreply.github.com>
This pull request requires reviews from CODEOWNERS as it changes files that match the following patterns:
/bonk please review the DLP docs as well (https://developers.cloudflare.com/cloudflare-one/data-loss-prevention/) and investigate whether there are other changes we should make to the DLP docs, in addition to these changes to the AI Gateway docs. Then make these changes as part of this pull request.
- **Streaming (SSE) responses**: DLP buffers the full streamed response before scanning. This means DLP-scanned streaming responses are not delivered incrementally to the client. Expect increased time-to-first-token latency when DLP response scanning is enabled on streaming requests, because the entire response must be received from the provider before DLP can evaluate it and release it to the client.
- **Tool call arguments and results**: DLP scans the text content present in the message body, which includes tool call arguments and results if they appear in the JSON request or response payload.
- **Base64-encoded images and file attachments**: DLP does not decode base64-encoded content or follow external URLs. Only the raw text of the request and response body is inspected.
- **Multipart form data**: DLP scans the text portions of the request body. Binary data within multipart payloads is not inspected.
This part needs auditing against the underlying implementation. I don't trust my own knowledge of DLP mechanics enough to be the approver.
By my inspection, this all seems accurate. For the multipart form data bullet, we could consider adding a second sentence on responses (https://gitlab.cfdata.org/cloudflare/aig/ai-gateway-worker/-/blob/main/src/dlp.ts?ref_type=heads#L84):
Responses to multipart requests are not scanned.
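To make the streaming bullet above concrete, here is a minimal sketch of buffer-then-scan delivery. It is not the AI Gateway implementation: `SseChunk`, `Verdict`, `toyScan`, and `deliverBuffered` are hypothetical names, and the SSN-like regex is only a stand-in for a real DLP profile. It shows why the client's first byte cannot arrive before the provider's last chunk.

```typescript
// Hypothetical types and names, for illustration only.
interface SseChunk {
  arrivalMs: number; // when the chunk arrives from the provider
  data: string; // text payload of the chunk
}

type Verdict = "ALLOW" | "BLOCK";

// Toy detector standing in for a DLP profile: matches SSN-like patterns.
function toyScan(text: string): Verdict {
  return /\b\d{3}-\d{2}-\d{4}\b/.test(text) ? "BLOCK" : "ALLOW";
}

interface Delivery {
  verdict: Verdict;
  firstByteMs: number; // earliest time the client can receive anything
  body: string | null; // null when the response is blocked
}

// Buffer every chunk, scan the complete body once, then release or block.
function deliverBuffered(
  chunks: SseChunk[],
  scan: (text: string) => Verdict,
): Delivery {
  let body = "";
  let lastArrivalMs = 0;
  for (const chunk of chunks) {
    body += chunk.data;
    lastArrivalMs = Math.max(lastArrivalMs, chunk.arrivalMs);
  }
  const verdict = scan(body);
  return {
    verdict,
    firstByteMs: lastArrivalMs, // nothing is released before the last chunk
    body: verdict === "ALLOW" ? body : null,
  };
}
```

In this sketch, a response whose chunks arrive at 100 ms and 900 ms is released at 900 ms, which is the time-to-first-token cost the bullet describes. Scanning the joined body also means a pattern split across two chunks is still seen as a whole.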
…docs
- Add 'AI traffic' section to Cloudflare One DLP index page linking to AI Gateway DLP
- Add AI Gateway as a third consumer of DLP profiles on the profiles page
- Add note on DLP policies page about AI Gateway as an alternative to Gateway HTTP
- Fix AI Gateway DLP concept page: replace inaccurate 'unified management' claims with accurate description of shared profiles vs separate policy/log management
All changes are committed. Here is a summary of the review findings and changes made:
Review Findings
After thoroughly reviewing both the AI Gateway DLP docs (this PR) and the Cloudflare One DLP docs (live site), I identified two main gaps:
Gap 1: Missing cross-references from Cloudflare One DLP docs to AI Gateway
The Cloudflare One DLP docs had zero mentions of AI Gateway across 11 DLP pages. DLP profiles can now be used in three contexts (Gateway HTTP policies, CASB integrations, and AI Gateway), but the docs only mentioned the first two. The only Cloudflare One page mentioning AI Gateway at all was a tutorial.
Gap 2: Inaccurate integration description on AI Gateway DLP concept page
The existing PR's "Integration with Cloudflare DLP" section claimed "Unified management", "Centralized reporting", and "All DLP events appear in the same dashboard and logs." This is incorrect: AI Gateway and Zero Trust Gateway are architecturally separate products with separate policy management and separate log systems. Only the DLP detection profiles are shared.
Changes Made (4 files)
All validation checks pass (0 errors from |
/bonk is any of the content in this pull request duplicated across other parts of the docs? Are there opportunities to consolidate any of it into partials? Do not change anything yet, just provide your analysis. Research deeply.
Duplication & Consolidation Analysis
Here is a thorough analysis of content duplication in this PR and opportunities to consolidate into partials.
1. DLP Limits Table: duplicated across two files
The problem: The DLP limits table introduced at
Recommendation: Create a partial (e.g.,
Severity: High. These are numeric limits that will go stale independently if not consolidated.
2. DLP Log Fields: described twice in different formats
The problem: DLP log field information is documented in two places:
These describe the same underlying data in two different formats on two different pages. The logging page cross-references the DLP setup page (line 62), and the DLP setup page cross-references the logging page (line 86), creating a circular reference.
Recommendation: Create a partial (e.g.,
Severity: Medium. The content is not identical (table vs prose), but describes the same thing and risks diverging.
3. DLP Action FLAG/BLOCK filter: described in two places
The problem: How to filter logs by DLP Action is described in:
This is a minor overlap since the logging page is terse, but the filter steps in set-up-dlp.mdx could drift from the actual filter table on the logging page.
Recommendation: Keep the detailed filter steps on set-up-dlp.mdx and the terse mention on the logging page. No partial needed here; the cross-reference at
Severity: Low. Reasonable cross-product discoverability duplication.
4. "Integration with Cloudflare DLP" description: partially overlapping with Cloudflare One DLP index
The problem:
Both state the same key fact: AI Gateway DLP uses shared DLP profiles but does not require Gateway HTTP filtering or TLS decryption.
Recommendation: This overlap is acceptable and even desirable, since each product's docs should explain the integration from its own perspective. The content is not identical; it is tailored to each audience. No partial needed.
Severity: Low. Intentional cross-product cross-referencing, not problematic duplication.
5. DLP profile categories list: potentially consolidatable
The problem:
Recommendation: Either link to the predefined profiles page instead of enumerating them, or create a small partial listing the profile categories. Hardcoding profile names risks going stale.
Severity: Medium. Profile categories could change, and maintaining them in multiple places is a drift risk.
6.
| Issue | Files Affected | Severity | Recommended Action |
|---|---|---|---|
| DLP limits table duplication | ai-gateway/reference/limits.mdx, cloudflare-one/account-limits.mdx | High | Create shared partial |
| DLP log fields described twice | ai-gateway/observability/logging/index.mdx, ai-gateway/features/dlp/set-up-dlp.mdx | Medium | Create partial for field table, reference from both |
| Hardcoded DLP profile categories | ai-gateway/features/dlp/set-up-dlp.mdx | Medium | Link to predefined profiles page instead of enumerating |
| DLP Action filter duplication | logging/index.mdx, set-up-dlp.mdx | Low | Acceptable: cross-references exist |
| Cross-product integration description | ai-gateway/features/dlp/index.mdx, cloudflare-one/data-loss-prevention/index.mdx | Low | Acceptable: intentionally different perspectives |
The highest-value consolidation would be extracting the DLP limits table into a shared partial. The values are numeric limits that are likely to change, and having them in two places with slightly different formatting is a clear drift risk.
/bonk please implement "The highest-value consolidation would be extracting the DLP limits table into a shared partial." as called out in the recommendations in the previous comment on this pull request. Implement (1) and (2) from that list of recommendations and findings. Do not implement the rest.
Consolidate duplicated content into reusable partials to prevent drift:
- DLP limits table partial used by both AI Gateway limits and Zero Trust account limits pages
- DLP log fields partial used by both the logging reference and DLP setup pages
Done. Here is a summary of the changes:
Changes Made
Recommendation 1: DLP limits table → shared partial
Created: Contains the canonical DLP limits table (custom entries, EDM cells, wordlist keywords, dataset cells) in a single location.
Updated consumers:
This eliminates the drift risk where the AI Gateway page had slightly different wording ("Custom detection entries — 25 per account" vs "Custom entries — 25") for the same limits.
Recommendation 2: DLP log fields → shared partial
Created: Contains the DLP log fields table (DLP Action, Policies Matched, Profiles Matched, Entries Matched, DLP Check).
Updated consumers:
All validation passes: 0 errors from |
All changes are complete and validated. Here is a summary of the research findings and documentation updates made:
Research Findings & Documentation Updates
Files Modified (5 files)
src/content/docs/ai-gateway/features/dlp/index.mdx - DLP concept page
src/content/docs/ai-gateway/features/dlp/set-up-dlp.mdx - DLP setup page
src/content/docs/ai-gateway/observability/logging/index.mdx - Logging reference
src/content/docs/ai-gateway/reference/limits.mdx - Limits page
src/content/glossary/ai-gateway.yaml - Header glossary
Issue Questions Addressed
limits-increase partial linking to a Google Form for requesting increases.
Additional Improvements
cf-aig-dlp response header to the glossary (ai-gateway.yaml) - it was missing from the header glossary
Closes #28325