feat: deduplicate multi-artifact package results (PyPI, etc.) #156
Merged
Alexandros Kapravelos (kapravel) merged 1 commit into main on Mar 16, 2026
PyPI packages like numpy return one NDJSON line per artifact (sdist, wheels for each platform). This floods the agent with duplicate results for the same package.

- Add `lib/artifacts.ts` with `deduplicateArtifacts()`, which groups results by (type, namespace, name, version) and selects one representative per group (source dist > universal wheel > first artifact)
- Add optional `platform` parameter to the depscore tool for agents that can detect the user's OS/arch (e.g. `darwin-arm64`, `linux-x64`)
- Filter out `purlError`/`summary` NDJSON lines before processing
- Add 18 unit tests for deduplication and platform matching
- Add 2 integration tests for numpy deduplication with/without platform

Made-with: Cursor
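The grouping and default selection described above can be sketched roughly as follows. This is a minimal illustration, not the code in `lib/artifacts.ts`: the `ArtifactResult` shape, its `filename` field, the helper names, and the filename-based detection of source distributions and universal wheels are all assumptions made for the example.

```typescript
// Hypothetical result shape; the real NDJSON schema may use different fields.
interface ArtifactResult {
  type: string;
  namespace: string | null;
  name: string;
  version: string;
  filename: string; // e.g. "numpy-1.26.0.tar.gz"
}

// Priority from the PR description: source dist > universal wheel > first artifact.
// Detecting each kind from the filename is an assumption of this sketch.
function pickRepresentative(group: ArtifactResult[]): ArtifactResult {
  const sdist = group.find((a) => /\.(tar\.gz|zip)$/.test(a.filename));
  if (sdist) return sdist;
  const universal = group.find((a) => a.filename.endsWith("-none-any.whl"));
  if (universal) return universal;
  return group[0];
}

// Group by (type, namespace, name, version) and keep one artifact per group.
function deduplicateSketch(results: ArtifactResult[]): ArtifactResult[] {
  const groups = new Map<string, ArtifactResult[]>();
  for (const r of results) {
    const key = JSON.stringify([r.type, r.namespace, r.name, r.version]);
    const group = groups.get(key);
    if (group) group.push(r);
    else groups.set(key, [r]);
  }
  const picked: ArtifactResult[] = [];
  // Map preserves insertion order, so output follows first appearance.
  groups.forEach((group) => picked.push(pickRepresentative(group)));
  return picked;
}
```

Serializing the key with `JSON.stringify` avoids collisions that a plain string join could produce when a field contains the separator.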
Summary
- A PyPI package such as `pkg:pypi/numpy@1.26.0` returns one NDJSON line per artifact (sdist, wheels for manylinux x86_64, macosx arm64, win amd64, etc.). The MCP was outputting every line, flooding the agent with duplicate results for the same package.
- Results are now grouped by `(type, namespace, name, version)`, and one representative artifact is selected per group using a priority system: source distribution > universal wheel > first artifact.
- An optional `platform` parameter (e.g. `darwin-arm64`, `linux-x64`) lets agents that can detect the user's OS/arch get the most relevant artifact. It falls back gracefully for agents without tool execution (like Claude Web).
- `purlError` and `summary` NDJSON lines, which were previously processed as if they were artifacts, are now filtered out.

Changes
- `lib/artifacts.ts`: `deduplicateArtifacts()` function with grouping, platform matching (maps Node.js-style os-arch to ecosystem-specific patterns), and default selection logic.
- `index.ts`: Added optional `platform` param to depscore schema, filter non-artifact NDJSON lines, wire in deduplication.
- `artifacts.test.ts`: 18 unit tests covering deduplication, platform matching, edge cases.
- `test.ts`: 2 integration tests verifying numpy deduplication with and without platform hint.

Test plan
Made with Cursor
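
For reviewers, the NDJSON filtering and the platform hint matching might look something like the sketch below. Everything here is illustrative: it assumes that non-artifact lines carry a top-level `purlError` or `summary` key, and the `PLATFORM_PATTERNS` table, the `win32-x64` entry, and both function names are hypothetical rather than taken from the PR's code.

```typescript
type NdjsonLine = Record<string, unknown>;

// Parse NDJSON and drop lines that are not artifacts.
// Assumption: such lines are objects with a top-level `purlError` or `summary` key.
function parseArtifactLines(ndjson: string): NdjsonLine[] {
  return ndjson
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line) as NdjsonLine)
    .filter((obj) => !("purlError" in obj) && !("summary" in obj));
}

// Hypothetical mapping from Node.js-style os-arch hints to PyPI wheel
// filename fragments; other ecosystems would need their own patterns.
const PLATFORM_PATTERNS: Record<string, RegExp> = {
  "darwin-arm64": /macosx_.*_arm64/,
  "linux-x64": /manylinux.*_x86_64/,
  "win32-x64": /win_amd64/,
};

// Returns false when no hint is given or the hint is unknown, so the caller
// can fall back to the default selection (sdist > universal wheel > first).
function matchesPlatform(filename: string, platform?: string): boolean {
  if (!platform) return false;
  const pattern = PLATFORM_PATTERNS[platform];
  return pattern !== undefined && pattern.test(filename);
}
```

Returning `false` for a missing or unknown hint keeps the platform parameter strictly optional: an agent that cannot detect OS/arch gets the same result as before.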