add support for w4a16_mixed #1326

Merged
n1ck-guo merged 8 commits into main from hengugo/add_mixed_w4
Jan 30, 2026
@n1ck-guo n1ck-guo commented Jan 23, 2026

Description

Add support for the w4a16_mixed quantization scheme.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Performance improvement
  • Code refactoring
  • Other (please describe):

Related Issues

Fixes #
Relates to #

Changes Made

  • Add a new quantization scheme, w4a16_mixed

Testing

  • Tested locally
  • [x] Added/updated unit tests
  • All existing tests pass
  • Tested on specific hardware/environment (please specify):

Checklist

  • My code follows the project's coding style
  • I have performed a self-review of my code
  • I have commented my code, particularly in hard-to-understand areas
  • I have updated the documentation accordingly
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

Additional Context

Signed-off-by: n1ck-guo <heng.guo@intel.com>
Signed-off-by: n1ck-guo <heng.guo@intel.com>
@n1ck-guo n1ck-guo requested review from Copilot and wenhuach21 and removed request for Copilot January 23, 2026 02:52
Comment thread auto_round/schemes.py
Comment thread auto_round/schemes.py Outdated
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Copilot AI review requested due to automatic review settings January 23, 2026 05:35
Copilot AI left a comment

Pull request overview

This PR adds support for a new quantization scheme, W4A16_MIXED, which applies mixed-precision quantization to different model components. The scheme quantizes expert layers to 4-bit weights and regular layers to 8-bit weights, and keeps 16-bit precision for attention layers in standard models or for all non-expert layers in multimodal language models (MLLMs).

Changes:

  • Added W4A16_MIXED scheme registration in the PRESET_SCHEMES dictionary
  • Implemented special handling logic for W4A16_MIXED in _handle_special_schemes function with layer-specific bit precision assignment
  • Updated output format support lists to include W4A16_MIXED scheme
  • Added comprehensive test coverage for both MoE models and MLLM models
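
The layer-specific bit assignment described in the overview can be sketched roughly as follows. This is a minimal illustration based only on the summary above, not the code in auto_round/schemes.py: the function name `assign_mixed_bits`, the name markers `experts.` and `self_attn`, and the example layer names are all hypothetical.

```python
from typing import Dict, Iterable

# Hypothetical name fragments used to classify layers (assumption):
# real models name their modules differently.
EXPERT_MARKERS = ("experts.",)
ATTENTION_MARKERS = ("self_attn", "attention")


def assign_mixed_bits(layer_names: Iterable[str], is_mllm: bool = False) -> Dict[str, int]:
    """Map each layer name to a weight bit width, following the overview's description."""
    config: Dict[str, int] = {}
    for name in layer_names:
        if any(marker in name for marker in EXPERT_MARKERS):
            bits = 4   # expert layers: 4-bit weights
        elif is_mllm:
            bits = 16  # MLLM: every non-expert layer stays at 16-bit
        elif any(marker in name for marker in ATTENTION_MARKERS):
            bits = 16  # standard model: attention layers stay at 16-bit
        else:
            bits = 8   # remaining regular layers: 8-bit weights
        config[name] = bits
    return config


# Example layer names are illustrative only.
layers = [
    "model.layers.0.self_attn.q_proj",
    "model.layers.0.mlp.experts.0.up_proj",
    "model.layers.0.mlp.down_proj",
]
standard_config = assign_mixed_bits(layers)
mllm_config = assign_mixed_bits(layers, is_mllm=True)
```

Matching on substrings of module names is only one possible way to tell expert, attention, and regular layers apart; the merged implementation may classify layers differently.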

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| auto_round/schemes.py | Added W4A16_MIXED to PRESET_SCHEMES and implemented special scheme handling with layer-specific quantization logic |
| auto_round/compressors/base.py | Refactored initialization order and updated _handle_special_schemes call to pass required parameters |
| auto_round/formats.py | Added W4A16_MIXED to supported schemes lists for AutoGPTQ and AutoRound output formats |
| test/test_cpu/schemes/test_scheme.py | Added two new test cases for W4A16_MIXED scheme covering both MoE and MLLM models |

Comment thread auto_round/schemes.py Outdated
Comment thread test/test_cpu/schemes/test_scheme.py
n1ck-guo and others added 2 commits January 23, 2026 13:52
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: n1ck-guo <heng.guo@intel.com>
@n1ck-guo n1ck-guo added the enhancement and ready labels Jan 26, 2026
@n1ck-guo n1ck-guo requested review from wenhuach21 and xin3he January 26, 2026 02:28
Comment thread auto_round/schemes.py
@n1ck-guo n1ck-guo merged commit 7cff6af into main Jan 30, 2026
28 checks passed
@n1ck-guo n1ck-guo deleted the hengugo/add_mixed_w4 branch January 30, 2026 02:32
lvliang-intel pushed a commit that referenced this pull request Feb 2, 2026
Signed-off-by: n1ck-guo <heng.guo@intel.com>
Labels

enhancement (New feature or request), ready (only add when the PR is ready to merge)

3 participants