Conversation

@georgeguimaraes

This PR adds support for the MPNet model family, enabling models like sentence-transformers/all-mpnet-base-v2 to be used in Bumblebee.

I noticed there was a previous attempt at this in #405 by @snewcomer, but it seems to have gone stale. I used the feedback from that PR review (thanks @jonatanklosko!) to create this implementation.

Changes

  • Added Bumblebee.Text.Mpnet module with support for all standard architectures:
    • :base
    • :for_masked_language_modeling
    • :for_sequence_classification
    • :for_token_classification
    • :for_question_answering
    • :for_multiple_choice
  • Registered MPNet models in Bumblebee.load_model/2 (see the usage sketch after this list)
  • Added :mpnet tokenizer type with the correct special tokens
  • Added tests for all architectures
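
With these pieces in place, an MPNet checkpoint can be loaded and run like any other encoder model in Bumblebee. The snippet below is a minimal usage sketch assuming the PR as described; it exercises only the :base architecture and inspects the final hidden state.

  {:ok, model_info} = Bumblebee.load_model({:hf, "sentence-transformers/all-mpnet-base-v2"})
  {:ok, tokenizer} = Bumblebee.load_tokenizer({:hf, "sentence-transformers/all-mpnet-base-v2"})

  inputs = Bumblebee.apply_tokenizer(tokenizer, ["Bumblebee now supports MPNet."])
  outputs = Axon.predict(model_info.model, model_info.params, inputs)

  # For the :base architecture the output map includes the final hidden state,
  # shaped {batch_size, sequence_length, hidden_size}.
  outputs.hidden_state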

Notes

  • MPNet does not use token_type_ids (unlike BERT)
  • MPNet uses a relative position bias that is shared across all encoder layers
  • The HuggingFace layer names use shortened forms (q, k, v, o instead of query, key, value, output); see the mapping sketch after this list
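
To make the last point concrete, the parameter mapping has to translate Bumblebee's layer names into those shortened HuggingFace names. The fragment below is an illustrative sketch of what the attention mapping for one encoder block might look like; the actual key names on both sides in this PR may differ.

  # Illustrative excerpt of a params_mapping/1 entry set for the attention
  # projections of one encoder block. "{n}" stands for the block index; the
  # names here are placeholders meant only to show the q/k/v/o shortening.
  %{
    "encoder.blocks.{n}.self_attention.query" => "encoder.layer.{n}.attention.attn.q",
    "encoder.blocks.{n}.self_attention.key" => "encoder.layer.{n}.attention.attn.k",
    "encoder.blocks.{n}.self_attention.value" => "encoder.layer.{n}.attention.attn.v",
    "encoder.blocks.{n}.self_attention.output" => "encoder.layer.{n}.attention.attn.o"
  }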

Closes #219

Copilot AI review requested due to automatic review settings December 27, 2025 15:27

Copilot AI left a comment

Pull request overview

This PR adds comprehensive support for the MPNet model family to Bumblebee, enabling the use of models like sentence-transformers/all-mpnet-base-v2. The implementation follows the patterns established by similar encoder models (BERT, RoBERTa) in the codebase.

  • Implements all standard MPNet architectures (base, masked LM, sequence/token classification, question answering, multiple choice)
  • Adds proper tokenizer configuration with MPNet-specific special tokens
  • Registers all model types in the main Bumblebee module
  • Includes comprehensive test coverage for all architectures using tiny random models from HuggingFace

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated no comments.

  • lib/bumblebee/text/mpnet.ex: core MPNet implementation with all architectures, configuration options, and HuggingFace parameter mappings
  • lib/bumblebee/text/pre_trained_tokenizer.ex: adds the MPNet tokenizer configuration with the appropriate special tokens
  • lib/bumblebee.ex: registers the MPNet model classes and the tokenizer type mapping
  • test/bumblebee/text/mpnet_test.exs: comprehensive tests for all six MPNet architectures


Comment on lines +23 to +28
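# Compares a small slice of the base model's hidden state against hard-coded
# reference values.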
assert_all_close(
  outputs.hidden_state[[.., 1..3, 1..3]],
  Nx.tensor([
    [[-0.7203, -1.2364, -0.1180], [-1.1624, -1.0586, 0.1338], [0.7040, 1.3575, 0.4602]]
  ])
)

Member left a comment:

In all the tests we should assert against values from Python transformers, see #405 (comment). If it doesn't match, which seems to be the case here, it means there is still some implementation difference.
