fix: add license SPDX ID resolver #1569
Merged
+104
−21
Summary:
This was a complete miss in the previous implementation. While the license resolver was able to catch multiple formats of the license URL, it totally missed the SPDX URL format (it's kind of obvious... sorry).
From our AI friend
This pull request enhances the license resolution logic by adding support for SPDX catalog URLs (e.g., spdx.org/licenses/<ID>.html) and refines the matching strategies used to resolve license URLs. The main changes include a new SPDX resolver branch, improved ordering and documentation of resolution strategies, and corresponding tests for the new SPDX logic.
License Resolution Logic Improvements:
- Added a new SPDX resolver branch to resolve_license, which parses SPDX IDs from URLs like spdx.org/licenses/ODbL-1.0.html and attempts to match them against the database. If a match is found, it returns a high-confidence match; if not, it logs a warning and returns no result.
- Updated the documentation of resolve_license to include the new SPDX branch and clarify existing heuristics and fuzzy matching.
Supporting Functions and Tests:
- Added extract_spdx_id_from_url to safely extract SPDX IDs from SPDX catalog URLs, and refactored find_exact_match_license_url for clarity and reuse.
Other Changes:
- [...] MatchingLicense class.
- [...] find_exact_match_license_url after refactoring.
Expected behavior:
The license URL is properly resolved when it uses the SPDX catalog URL format.
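To make the behavior described above concrete, here is a minimal sketch of the SPDX branch. It is not the code from this PR: the signature of extract_spdx_id_from_url is assumed, resolve_spdx_url and SpdxMatch are hypothetical names, the confidence value of 1.0 is a placeholder for "high confidence", and a plain dict stands in for the license database.

```python
import logging
from dataclasses import dataclass
from typing import Dict, Optional
from urllib.parse import urlparse

logger = logging.getLogger(__name__)


@dataclass
class SpdxMatch:
    """Hypothetical result type; the real MatchingLicense fields are not shown in this PR text."""
    license_id: str
    confidence: float
    strategy: str


def extract_spdx_id_from_url(url: str) -> Optional[str]:
    """Return <ID> from spdx.org/licenses/<ID>[.html], or None if the URL has another shape."""
    if not url:
        return None
    candidate = url.strip()
    # Tolerate scheme-less input such as "spdx.org/licenses/MIT.html".
    if "://" not in candidate:
        candidate = "https://" + candidate
    try:
        parsed = urlparse(candidate)
        host = (parsed.hostname or "").lower()
    except ValueError:
        # Malformed URL (for example a broken IPv6 literal): treat as "not SPDX".
        return None
    if host not in ("spdx.org", "www.spdx.org"):
        return None
    parts = [p for p in parsed.path.split("/") if p]
    if len(parts) != 2 or parts[0].lower() != "licenses":
        return None
    spdx_id = parts[1]
    if spdx_id.lower().endswith(".html"):
        spdx_id = spdx_id[: -len(".html")]
    return spdx_id or None


def resolve_spdx_url(url: str, catalog: Dict[str, str]) -> Optional[SpdxMatch]:
    """Resolve an SPDX catalog URL against `catalog` (lowercased SPDX ID -> canonical ID).

    The dict is an assumption standing in for the database lookup.
    """
    spdx_id = extract_spdx_id_from_url(url)
    if spdx_id is None:
        return None  # Not an SPDX URL; other strategies would take over here.
    canonical = catalog.get(spdx_id.lower())
    if canonical is None:
        logger.warning("SPDX ID %r from %r not found in license catalog", spdx_id, url)
        return None
    return SpdxMatch(license_id=canonical, confidence=1.0, strategy="spdx_url")
```

Calling resolve_spdx_url("https://spdx.org/licenses/ODbL-1.0.html", {"odbl-1.0": "ODbL-1.0"}) yields a match for ODbL-1.0, while an ID missing from the catalog logs a warning and returns None, mirroring the "high-confidence match or no result" behavior in the description.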
Testing tips:
Provide tips, procedures and sample files on how to test the feature.
Testers are invited to follow the tips AND to try anything they deem relevant outside the bounds of the testing tips.
Please make sure these boxes are checked before submitting your pull request - thanks!
Run ./scripts/api-tests.sh to make sure you didn't break anything