Skip to content

[WIP] feat: filter files on mount#27

Open
aagumin wants to merge 3 commits intomodelpack:mainfrom
aagumin:feature/super-exclude
Open

[WIP] feat: filter files on mount#27
aagumin wants to merge 3 commits intomodelpack:mainfrom
aagumin:feature/super-exclude

Conversation

@aagumin
Copy link
Contributor

@aagumin aagumin commented Feb 11, 2026

implement #17

Signed-off-by: Arsen Gumin <gumin@live.ru>
@gemini-code-assist
Copy link

Summary of Changes

Hello @aagumin, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the model CSI driver by introducing a file exclusion mechanism. Users can now specify patterns to prevent certain files or directories from being pulled and mounted with a model, providing greater control over storage and potentially optimizing resource usage. This feature is integrated into the CSI volume creation and node publishing workflows, ensuring that only relevant model assets are deployed.

Highlights

  • File Exclusion Feature: Introduced a new capability to exclude specific files or patterns during model mounting, similar to .gitignore functionality.
  • Configuration and API Updates: Added a new configuration parameter exclude-files and updated the MountRequest API to accept an array of exclusion patterns.
  • Pattern Matching Logic: Implemented a FilePatternMatcher utility using go-git/go-git/v5/plumbing/format/gitignore for robust pattern matching and a filterFilesByPatterns function to remove excluded files and empty directories.
  • Pull Process Integration: Modified the model pulling logic to incorporate file exclusion, both by pre-filtering files to fetch and by post-processing to remove any remaining excluded files.
  • Security Validations: Added validation to prevent absolute paths and parent directory references in exclusion patterns, enhancing security.
  • Testing: Included new unit and integration tests to ensure the correctness and robustness of the file exclusion feature.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • pkg/config/config.go
    • Added ParameterKeyExcludeFiles function to retrieve the configuration key for file exclusion patterns.
  • pkg/server/server_test.go
    • Updated the mockPuller.Pull method signature to include excludeFilePatterns.
    • Modified the service.NewPuller mock to pass excludeFilePatterns.
  • pkg/service/controller_local.go
    • Imported encoding/json for JSON parsing.
    • Implemented logic to parse and validate the exclude-files parameter from CSI requests.
    • Passed the parsed excludeFilePatterns to the worker.PullModel function.
  • pkg/service/dynamic_server_handler.go
    • Imported encoding/json for JSON marshaling.
    • Added validation for ExcludeFilePatterns in the CreateVolume handler to prevent absolute paths and parent directory references.
    • Marshaled ExcludeFilePatterns to a JSON string and included it in the CSI CreateVolumeRequest parameters.
  • pkg/service/node.go
    • Imported encoding/json for JSON parsing.
    • Added logic to parse and validate exclude-files parameter from volume attributes.
    • Passed the parsed excludeFilePatterns to nodePublishVolumeStaticInlineVolume.
  • pkg/service/node_static_inline.go
    • Updated the nodePublishVolumeStaticInlineVolume function signature to accept excludeFilePatterns.
    • Passed excludeFilePatterns to the worker.PullModel function.
  • pkg/service/patterns.go
    • Added a new file defining FilePatternMatcher for gitignore-style pattern matching.
    • Implemented NewFilePatternMatcher with security validations for patterns.
    • Provided Match method to check if a path matches exclusion patterns.
    • Included filterFilesByPatterns to walk a directory, remove matched files, and clean up empty directories.
    • Added helper functions removeEmptyDirectories and isDirEmpty.
  • pkg/service/patterns_integration_test.go
    • Added a new integration test TestPullModel_WithExcludeFilePatterns to verify the end-to-end file exclusion functionality during model pulling.
  • pkg/service/patterns_test.go
    • Added a new file with comprehensive unit tests for FilePatternMatcher and filterFilesByPatterns functions, covering various pattern types and edge cases.
  • pkg/service/puller.go
    • Removed unused io import.
    • Updated the Puller interface and puller struct to include excludeFilePatterns.
    • Modified the Pull method to integrate file exclusion logic: it now pre-filters files to fetch based on patterns and performs a post-fetch cleanup to remove any remaining excluded files.
  • pkg/service/request.go
    • Added ExcludeFilePatterns field (as a slice of strings) to the MountRequest struct.
  • pkg/service/testdata/config.yaml
    • Added a new test data configuration file.
  • pkg/service/worker.go
    • Updated the newPuller field in the Worker struct to accept excludeFilePatterns.
    • Modified PullModel and pullModel function signatures to include excludeFilePatterns.
    • Passed excludeFilePatterns to the newPuller factory and the puller.Pull method.
Activity
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@aagumin aagumin changed the title feat: filter files on mount [WIP] feat: filter files on mount Feb 11, 2026
Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a feature to filter files during model mounting using gitignore-style patterns provided via an exclude-files parameter. The implementation adds logic to parse, validate, and apply these patterns. The core pattern matching is handled in a new patterns.go file. While the feature is a good addition, my review identifies several areas for improvement, including code duplication in validation logic, potential bugs in directory pattern matching, and some inefficiencies. Addressing these points will enhance the robustness and maintainability of the new feature.

path = filepath.ToSlash(path)
pathParts := strings.Split(path, "/")

return m.matcher.Match(pathParts, false)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The isDir argument to m.matcher.Match is hardcoded to false. This will cause incorrect behavior for gitignore patterns that are meant to match directories (e.g., logs/). The FilePatternMatcher.Match function should accept a boolean isDir argument and pass it to the underlying go-git matcher.

This will also require changes in filterFilesByPatterns to pass info.IsDir() and to correctly handle directory matches (e.g., by using os.RemoveAll and returning filepath.SkipDir). A similar issue exists in puller.go where Match is called without directory information.

Suggested change
return m.matcher.Match(pathParts, false)
return m.matcher.Match(pathParts, isDir)

Comment on lines +128 to +131
matcher, err := NewFilePatternMatcher(excludeFilePatterns)
if err != nil {
return errors.Wrap(err, "create file pattern matcher")
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

NewFilePatternMatcher is called here and also earlier in this function (line 78) with the same excludeFilePatterns. This is inefficient. The FilePatternMatcher instance should be created once at the beginning of the function (if excludeFilePatterns is not empty) and then reused in both places.

Signed-off-by: Arsen Gumin <gumin@live.ru>

logger.WithContext(ctx).Infof("Applying file exclusion patterns: %v", excludeFilePatterns)

_, err = filterFilesByPatterns(targetDir, matcher)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The file system operation should not be introduced, after we set up fetchPatterns, the files pulled by b.Fetch should already be what the end user needs.

Signed-off-by: Arsen Gumin <gumin@live.ru>
@imeoer
Copy link
Collaborator

imeoer commented Mar 3, 2026

Hi @aagumin, any follow-up updates?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants