-
Notifications
You must be signed in to change notification settings - Fork 0
image validation #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughAdds a new Changes
Sequence Diagram(s)(omitted — changes are internal validation logic and tests, not multi-component control-flow requiring a sequence diagram) Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @src/Validation/IsImage.php:
- Around line 127-143: The SVG detection is too permissive in validateImageData;
remove the "<?xml" signature from the $signatures array and instead perform a
more specific check for SVG by searching for a "<svg" tag (case-insensitive)
within the image data (e.g., within the first ~1KB) inside the validateImageData
function; optionally make SVG acceptance configurable or document the security
implications and treat SVG as excluded by default to avoid accepting XML files
or XSS-bearing SVGs.
🧹 Nitpick comments (3)
src/Validation/IsImage.php (1)
64-85: Consider validating MIME type consistency between data URI header and actual content.The data URI's declared MIME type (e.g.,
image/jpeg) is validated againstallowedMimeTypes, but thenvalidateBase64Image()→validateImageData()may detect a different actual type. An attacker could declareimage/jpegin the URI while embedding malicious SVG content.Consider either:
- Pass the declared MIME type to
validateImageData()and verify it matches the detected type- Or skip the MIME check in
validateDataUri()and rely solely onvalidateImageData()detection♻️ Proposed fix to enforce consistency
private function validateDataUri( string $dataUri ) : bool { // Parse data URI: data:[<mediatype>][;base64],<data> $pattern = '/^data:([a-zA-Z0-9][a-zA-Z0-9\/+\-]*);base64,(.+)$/'; if( !preg_match( $pattern, $dataUri, $matches ) ) { return false; } $mimeType = $matches[1]; $base64Data = $matches[2]; // Check MIME type if( !empty( $this->allowedMimeTypes ) && !in_array( $mimeType, $this->allowedMimeTypes, true ) ) { return false; } - // Validate base64 data - return $this->validateBase64Image( $base64Data ); + // Validate base64 data with expected MIME type for consistency check + return $this->validateBase64Image( $base64Data, $mimeType ); }Then modify
validateBase64Image()andvalidateImageData()to accept and verify the expected MIME type.tests/Validation/IsImageTest.php (2)
56-66: Missing test coverage for WebP images.The default
allowedMimeTypesincludesimage/webp, but there's no test verifying WebP validation works correctly. Consider adding a WebP test similar to the existing format tests.💚 Proposed WebP test
/** * Test valid base64 encoded WebP image. */ public function testValidBase64WebpImage() { // Minimal 1x1 WebP image in base64 $webpBase64 = 'UklGRiYAAABXRUJQVlA4IBoAAAAwAQCdASoBAAEAAQAcJYgCdAEO/hOMAAD++/8A'; $validator = new IsImage(); $this->assertTrue( $validator->isValid( $webpBase64 ) ); }
180-215: LGTM with a suggestion.Good coverage of SVG and malformed data URI handling. Consider adding a test for SVG content that starts directly with
<svg(without XML declaration) to ensure both detection paths work:💚 Optional: SVG without XML declaration test
/** * Test SVG without XML declaration. */ public function testSvgWithoutXmlDeclaration() { $svgContent = '<svg xmlns="http://www.w3.org/2000/svg" width="1" height="1"></svg>'; $svgBase64 = base64_encode( $svgContent ); $validator = new IsImage(); $this->assertTrue( $validator->isValid( $svgBase64 ) ); }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/Validation/IsImage.phptests/Validation/IsImageTest.php
🧰 Additional context used
🧬 Code graph analysis (2)
src/Validation/IsImage.php (1)
src/Validation/Base.php (1)
Base(9-32)
tests/Validation/IsImageTest.php (1)
src/Validation/IsImage.php (1)
IsImage(8-178)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Cursor Bugbot
🔇 Additional comments (8)
src/Validation/IsImage.php (4)
8-29: LGTM!The class structure and constructor are well-designed with sensible defaults for MIME types, and properly calls the parent constructor.
35-56: LGTM!The validation entry point correctly handles type checking, empty string rejection, and proper delegation to specialized validators.
93-119: LGTM!The base64 validation is robust: strict decoding mode catches invalid input, whitespace normalization handles common formatting variations, and size checking uses
strlen()correctly for binary byte count.
145-177: LGTM!The signature matching logic correctly implements magic number detection with appropriate special handling for WebP's RIFF container format.
tests/Validation/IsImageTest.php (4)
68-110: LGTM!Comprehensive negative test coverage including invalid base64, non-image data, empty strings, and various non-string input types.
112-144: LGTM!Good coverage of MIME type restriction functionality for both raw base64 and data URI formats.
146-178: LGTM!Good coverage of size constraints and the
checkImageDatabypass mode, testing both positive and negative scenarios.
1-6: No action needed. The test file does not require a namespace for autoloading, as the project's PSR-4 configuration (composer.json) only includes"Neuron\\" => "src/"with no test autoloading configuration. Tests are not automatically loaded and work correctly without a namespace declaration.Likely an incorrect or invalid review comment.
Welcome to Codecov 🎉Once you merge this PR into your default branch, you're all set! Codecov will compare coverage reports and display results in all future pull requests. ℹ️ You can also turn on project coverage checks and project coverage reporting on Pull Request comment Thanks for integrating Codecov - We've got you covered ☂️ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @src/Validation/IsImage.php:
- Around line 209-221: The svg xmlns conditional is dead: inside the preg_match
block you check for 'xmlns' via stripos($dataToCheck, ...) and return
'image/svg+xml' if present but then unconditionally return the same value
anyway; either remove the redundant xmlns check (delete the inner if/stripos
block) to keep current behavior, or enforce the stricter intention by changing
the unconditional return 'image/svg+xml' to return null when xmlns is not found
(i.e., keep the stripos check and return null in its else branch). Update the
code in IsImage.php around the preg_match on $dataToCheck accordingly so only
one of these two behaviors remains.
🧹 Nitpick comments (1)
src/Validation/IsImage.php (1)
137-190: Minor edge case in WebP validation with very short payloads.When the decoded image data is between 8-11 bytes,
substr($imageData, 8, 4)will return fewer than 4 characters, and the comparison!== 'WEBP'will correctly fail. However, this could theoretically allow a malformed 4-byte file starting with "RIFF" to pass the initial check before being caught by the WebP check.In practice this is a non-issue since such tiny files aren't valid images anyway, but for robustness you could add a minimum length check:
♻️ Optional defensive check
// Special handling for WebP - if( $signature === 'RIFF' && substr( $imageData, 8, 4 ) !== 'WEBP' ) + if( $signature === 'RIFF' && ( strlen( $imageData ) < 12 || substr( $imageData, 8, 4 ) !== 'WEBP' ) ) { continue; }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/Validation/IsImage.phptests/Validation/IsImageTest.php
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/Validation/IsImageTest.php
🧰 Additional context used
🧬 Code graph analysis (1)
src/Validation/IsImage.php (1)
src/Validation/Base.php (1)
Base(9-32)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Cursor Bugbot
🔇 Additional comments (4)
src/Validation/IsImage.php (4)
10-39: LGTM!The constructor properly initializes validation parameters with sensible defaults. Good security practice to disable SVG by default with a clear docblock explaining the risk. The auto-addition of
image/svg+xmlto allowed MIME types when$allowSvgis true is a nice convenience that prevents configuration mistakes.
45-66: LGTM!Clean routing logic with proper type checking and early returns for invalid inputs. The distinction between data URI and raw base64 is handled correctly.
74-95: LGTM!The data URI parsing correctly handles the RFC 2397 format for base64-encoded data. The MIME type validation against the allowlist is properly implemented.
103-129: LGTM!Proper handling of base64 decoding with strict mode enabled, which correctly rejects invalid base64 input. Using
strlen()for size checking is appropriate for binary data in PHP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @tests/Validation/IsImageTest.php:
- Around line 1-6: Add the missing namespace declaration for the test by
declaring "namespace Tests\Validation;" at the top of the file before the use
statements; ensure it sits above the existing use lines and that the class
IsImageTest remains unchanged so the file follows the project's directory-based
namespace convention.
🧹 Nitpick comments (2)
VERSIONLOG.md (1)
1-2: Missing date for version 0.7.11.For consistency with other entries in this file (e.g.,
## 0.7.10 2026-01-13), consider adding a release date:-## 0.7.11 +## 0.7.11 2026-01-13tests/Validation/IsImageTest.php (1)
6-7: Consider adding WebP image tests.The validator supports WebP images (with special offset-based signature detection), but there are no tests covering this format. Consider adding tests similar to the JPEG/PNG/GIF tests to ensure WebP validation works correctly.
Example test method for WebP
/** * Test valid base64 encoded WebP image. */ public function testValidBase64WebpImage() { // Small 1x1 pixel WebP image in base64 // You'll need to generate a minimal valid WebP for this $webpBase64 = 'UklGRlYAAABXRUJQVlA4IEoAAADQAQCdASoBAAEAAUAmJYgCdAEO/hOMAAD++O9P/u3vDP/3kPH/s/1lx0Xf/P+l8xrRPz/9AAA='; $validator = new IsImage(); $this->assertTrue($validator->isValid($webpBase64)); }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
VERSIONLOG.mdsrc/Validation/IsImage.phptests/Validation/IsImageTest.php
🚧 Files skipped from review as they are similar to previous changes (1)
- src/Validation/IsImage.php
🧰 Additional context used
🧬 Code graph analysis (1)
tests/Validation/IsImageTest.php (1)
src/Validation/IsImage.php (1)
IsImage(8-225)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Cursor Bugbot
🔇 Additional comments (9)
tests/Validation/IsImageTest.php (9)
8-30: LGTM!These tests correctly validate both raw base64 and data URI formatted JPEG images using minimal 1x1 pixel test data.
32-66: LGTM!PNG and GIF validation tests use correct image signatures.
68-110: LGTM!Comprehensive edge case coverage for invalid base64, non-image data, empty strings, and non-string input types.
112-144: LGTM!MIME type restriction tests effectively validate filtering for both raw base64 images (where type is inferred from magic bytes) and data URIs (where type is declared explicitly).
163-178: LGTM!This test correctly validates the fast-path mode where
checkImageDatais disabled—base64 decoding is verified but actual image signature checking is bypassed.
180-201: LGTM!SVG opt-in tests properly validate that SVG support is disabled by default and must be explicitly enabled. Tests cover both raw base64 and data URI variants.
203-233: LGTM!These tests ensure robust SVG detection by verifying that generic XML and XML with SVG namespaces but incorrect root tags are properly rejected. Good security-conscious coverage.
235-252: LGTM!Enforcing the xmlns namespace requirement is a sound security measure that helps prevent malicious content disguised as SVG. Well-designed test case.
254-272: LGTM!Comprehensive coverage of malformed data URI edge cases ensures the validator fails gracefully on invalid inputs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @src/Validation/IsImage.php:
- Around line 206-230: validateImageData contains an unreachable MIME filter:
since callers only invoke validateImageData when $this->allowedMimeTypes is
empty, the if(!empty($this->allowedMimeTypes) &&
!$this->isMimeTypeAllowed($detectedType)) branch is dead; remove that
conditional and its isMimeTypeAllowed call from validateImageData (leaving
detectImageType/detectSvg checks and boolean return), or alternatively move MIME
validation responsibility to the caller that handles allowedMimeTypes and keep
validateImageData as a pure "is recognizable image" check; update/rename the
method if needed to reflect the narrower responsibility.
🧹 Nitpick comments (3)
src/Validation/IsImage.php (2)
75-97: Data URI MIME type is not validated against actual image content.When a data URI is provided (e.g.,
data:image/png;base64,...), the MIME type from the URI is checked againstallowedMimeTypes, but thenvalidateBase64Image()is called which also detects the actual image type from magic bytes. If the declared MIME differs from actual content (e.g.,data:image/jpeg;base64,<png-data>), the validation could pass if both types are allowed, potentially enabling MIME type spoofing.Consider verifying that the declared MIME type matches the detected type from the actual image bytes.
♻️ Suggested approach
private function validateDataUri( string $dataUri ) : bool { // Parse data URI: data:[<mediatype>][;base64],<data> // Use 's' modifier to allow . to match newlines in base64 data $pattern = '/^data:([a-zA-Z0-9][a-zA-Z0-9\/+\-]*);base64,(.+)$/s'; if( !preg_match( $pattern, $dataUri, $matches ) ) { return false; } $mimeType = strtolower( $matches[1] ); // Normalize to lowercase per RFC 2045 $base64Data = $matches[2]; // Check MIME type (case-insensitive per RFC 2045) if( !empty( $this->allowedMimeTypes ) && !$this->isMimeTypeAllowed( $mimeType ) ) { return false; } - // Validate base64 data - return $this->validateBase64Image( $base64Data ); + // Validate base64 data and optionally verify declared MIME matches detected type + return $this->validateBase64Image( $base64Data, $mimeType ); }Then modify
validateBase64Imageto accept an optional$declaredMimeTypeparameter and verify consistency when provided.
238-249: Optimize by normalizing MIME types once in constructor.The
strtolower()call on line 243 is executed for every element inallowedMimeTypeson every invocation ofisMimeTypeAllowed(). Consider normalizing the MIME types once in the constructor instead.The
normalizeMimeTypes()method does not exist in the current implementation—the constructor directly assigns$allowedMimeTypeswithout pre-processing. Adding a normalization step would avoid redundant lowercasing calls.tests/Validation/IsImageTest.php (1)
58-68: Missing test coverage for WebP format.The default
allowedMimeTypesincludesimage/webp, anddetectImageType()has specific WebP signature handling (RIFF + WEBP at offset 8), but there's no test validating WebP images. Consider adding a test with a valid WebP image to ensure the WebP detection works correctly.🧪 Suggested test addition
/** * Test valid base64 encoded WebP image. */ public function testValidBase64WebpImage() { // Minimal 1x1 WebP image in base64 // WebP signature: RIFF....WEBP $webpBase64 = 'UklGRhoAAABXRUJQVlA4TA0AAAAvAAAAEAcQERGIiP4HAA=='; $validator = new IsImage(); $this->assertTrue( $validator->isValid( $webpBase64 ) ); }
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/Validation/IsImage.phptests/Validation/IsImageTest.php
🧰 Additional context used
🧬 Code graph analysis (2)
src/Validation/IsImage.php (1)
src/Validation/Base.php (1)
Base(9-32)
tests/Validation/IsImageTest.php (1)
src/Validation/IsImage.php (1)
IsImage(8-286)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Cursor Bugbot
🔇 Additional comments (9)
src/Validation/IsImage.php (5)
21-40: LGTM - Constructor with sensible defaults and SVG opt-in logic.The constructor properly initializes the validator with secure defaults (SVG disabled by default) and correctly auto-adds
image/svg+xmlto allowed types when SVG is explicitly enabled.
46-67: LGTM - Input validation and routing logic is correct.The
validate()method properly rejects non-strings and empty values, then correctly routes data URIs vs raw base64 inputs to appropriate handlers.
105-157: LGTM - Base64 validation with proper decoding and size checks.The method correctly:
- Strips whitespace for MIME-encoded base64
- Uses strict mode for
base64_decode()- Enforces size constraints on decoded data
- Properly branches based on configuration
The fast path when both
allowedMimeTypesis empty andcheckImageDatais false is intentional per the PR description.
165-198: LGTM - Magic number detection is correctly implemented.The signature detection properly handles:
- JPEG (FFD8FF prefix)
- PNG (8-byte signature)
- GIF87a/GIF89a
- WebP with proper RIFF+WEBP validation at correct offsets
258-285: SVG detection with namespace requirement is a good security measure.The implementation correctly:
- Limits check to first 1KB for performance
- Removes UTF-8 BOM before parsing
- Requires proper SVG namespace (
http://www.w3.org/2000/svg) as an attributeOne edge case: the regex
[^>]*won't match if there's a>character inside an attribute value beforexmlns(e.g.,<svg data-foo="a>b" xmlns="...">). This is a rare edge case that would result in a false negative (rejection), which is the safer failure mode.tests/Validation/IsImageTest.php (4)
165-180: LGTM - Tests correctly verify the fast-path behavior.The test properly documents that when
checkImageData=falseand no MIME restrictions are set, any valid base64 string passes validation. This aligns with the implementation's documented "fast path" behavior.
182-273: Excellent SVG security test coverage.The tests comprehensively verify:
- SVG requires explicit opt-in
- Generic XML is rejected even with SVG enabled
<svg>tag with proper namespace is required- Bypass attempts via xmlns in text content or comments are correctly rejected
This provides good assurance against SVG-based XSS vectors.
359-384: Important test verifying MIME restrictions apply regardless ofcheckImageDataflag.This test correctly verifies that MIME type restrictions based on magic byte detection are enforced even when
checkImageData=false. This is an important security property ensuring that the fast path doesn't bypass type restrictions.
419-437: LGTM - Malformed data URI test coverage.Good coverage of edge cases:
- Missing
;base64declaration- Missing MIME type
- Completely invalid format
These tests ensure the regex pattern correctly rejects malformed inputs.
Note
Introduces robust image validation with base64 and data-URI support.
IsImagevalidator with configurableallowedMimeTypes,maxSize,checkImageData, andallowSvgtests/Validation/IsImageTest.phpcovering valid/invalid inputs, MIME/size rules, SVG edge cases, and line-wrapped base64composer.jsonwithautoload-devforTests\and notes feature inVERSIONLOG.mdWritten by Cursor Bugbot for commit 44198ed. This will update automatically on new commits. Configure here.
Summary by CodeRabbit
New Features
Tests
Documentation
Chores
✏️ Tip: You can customize this high-level summary in your review settings.