Rethink image parsing logic#14496
Conversation
There was a problem hiding this comment.
Pull request overview
This PR improves WSLA image reference handling by centralizing image-reference parsing and fixing PullImage() behavior when no tag is specified (defaulting to latest instead of pulling all tags).
Changes:
- Moved image reference parsing into
wsl::windows::common::wslutil::ParseImage()with support for tags, digests, and registry ports. - Updated
WSLASession::PullImage()to default missing tags tolatest, and switched other call sites to the shared parser. - Added tests for image parsing and added (currently skipped) advanced pull scenarios.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| test/windows/WSLATests.cpp | Adds ImageParsing coverage and an advanced pull test (currently unconditionally skipped). |
| src/windows/wslasession/WSLASession.cpp | Uses the shared parser; defaults missing pull tags to latest. |
| src/windows/common/wslutil.h | Declares wslutil::ParseImage(). |
| src/windows/common/wslutil.cpp | Implements wslutil::ParseImage() to correctly split repo vs tag/digest (handling registry ports and digests). |
| // and lookup corresponding digest from the map | ||
| auto repoName = ParseImage(tag).first; | ||
| auto repoName = wslutil::ParseImage(tag).first; | ||
| size_t colonPos = tag.find(':'); |
There was a problem hiding this comment.
In ListImages(), colonPos is now unused after switching to wslutil::ParseImage(tag).first. This will trigger an unused-local warning (and may break builds if warnings are treated as errors). Remove colonPos or use it for the intended logic.
| size_t colonPos = tag.find(':'); |
…oneblue/parse-image
|
Does this supersede #14484 ? |
It does indeed ! |
| RETURN_HR_IF_NULL(E_POINTER, ImageUri); | ||
|
|
||
| auto [repo, tag] = ParseImage(ImageUri); | ||
| auto [repo, tag] = wslutil::ParseImage(ImageUri); |
There was a problem hiding this comment.
Suggest renaming tag to tagOrDigest, and do the same in the Docker client PullImage signature.
|
As an aside, I noticed we are implicitly prepending I don't think this is correct to do other than for Docker Hub images. Has image pull been tested with other registries? |
We don't support custom registries yet, but added a TODO to track that. |
| } | ||
|
|
||
| auto lock = m_lock.lock_shared(); | ||
|
|
There was a problem hiding this comment.
PullImage() dereferences m_dockerClient without checking it has a value. Since Terminate() resets m_dockerClient, callers that invoke PullImage() after termination (or on a partially-initialized/failed session) can hit a crash/UB here. Add the same invalid-state guard used by the other image APIs (e.g., THROW_HR_IF(HRESULT_FROM_WIN32(ERROR_INVALID_STATE), !m_dockerClient.has_value());) after taking the shared lock and before calling through it.
| THROW_HR_IF(HRESULT_FROM_WIN32(ERROR_INVALID_STATE), !m_dockerClient.has_value()); |
test/windows/WSLATests.cpp
Outdated
| VERIFY_ARE_EQUAL(expectedError, comError->Message.get()); | ||
| } | ||
|
|
||
| // Validat that PullImage() returns the appropriate error if the session is terminated. |
There was a problem hiding this comment.
Typo in comment: "Validat" should be "Validate".
| // Validat that PullImage() returns the appropriate error if the session is terminated. | |
| // Validate that PullImage() returns the appropriate error if the session is terminated. |
src/windows/common/wslutil.cpp
Outdated
| std::string identifier = "([a-f0-9]{64})"; | ||
| std::string shortIdentifier = "([a-f0-9]{6,64})"; |
There was a problem hiding this comment.
BuildImageReferenceRegex() declares identifier and shortIdentifier but never uses them. With /WX enabled in the build, this will fail compilation due to unreferenced local variable warnings. Remove these variables or use them in the constructed regex.
| std::string identifier = "([a-f0-9]{64})"; | |
| std::string shortIdentifier = "([a-f0-9]{6,64})"; |
| RETURN_HR_IF_NULL(E_POINTER, Image); | ||
|
|
||
| auto lock = m_lock.lock_shared(); | ||
| THROW_HR_IF(HRESULT_FROM_WIN32(ERROR_INVALID_STATE), !m_dockerClient.has_value()); | ||
|
|
||
| auto [repo, tagOrDigest] = wslutil::ParseImage(Image); | ||
|
|
||
| if (!tagOrDigest.has_value()) | ||
| { | ||
| tagOrDigest = "latest"; | ||
| } |
There was a problem hiding this comment.
WSLASession::PullImage() validates Image != nullptr but does not enforce WSLA_MAX_IMAGE_NAME_LENGTH like other image APIs (e.g., ImportImage, InspectImage). Since this now runs a fairly complex std::regex parse, a very long input could cause unnecessary CPU/memory usage. Add a length check (returning E_INVALIDARG) before calling wslutil::ParseImage.
src/windows/common/wslutil.cpp
Outdated
| } | ||
| } | ||
|
|
||
| std::regex BuildImageReferenceRegex() |
There was a problem hiding this comment.
BuildImageReferenceRegex() is defined in the global namespace even though it appears to be a file-local helper. Consider moving it into an anonymous namespace or making it static to avoid exporting an unintended global symbol.
| std::regex BuildImageReferenceRegex() | |
| static std::regex BuildImageReferenceRegex() |
src/windows/common/wslutil.cpp
Outdated
|
|
||
| THROW_HR_IF_MSG(E_UNEXPECTED, !repo.matched, "Unexpected regex match. Input: %hs", Input.c_str()); | ||
|
|
||
| if (digest.matched) // <repo>:[tag]@<digest] (If both digest and tag are specified, digest takes precedence). |
There was a problem hiding this comment.
Minor typo in comment: the reference format example has mismatched brackets (<repo>:[tag]@<digest]). Update it to a correctly bracketed form to avoid confusion.
| if (digest.matched) // <repo>:[tag]@<digest] (If both digest and tag are specified, digest takes precedence). | |
| if (digest.matched) // <repo>:[tag]@<digest> (If both digest and tag are specified, digest takes precedence). |
Summary of the Pull Request
This change fixes various issues with PullImage() not correctly handling the reference format. Most notably:
wslc pull debianwould pull every single image in thedebianrepo. Now the tag is defaulted tolatestif not specified.PR Checklist
Detailed Description of the Pull Request / Additional comments
Validation Steps Performed