Conversation

@lkk12014402 (Contributor)
Description

Add MXFP4 quantization-aware training (QAT).

Copilot AI left a comment
Pull request overview

This PR adds support for MXFP4 (4-bit microscaling floating point) quantization-aware training (QAT), extending the existing MXFP8 support. The implementation includes packing utilities to convert FP4 values into packed uint8 format and export functionality for serialization.

Key changes:

  • Added MXFP4 to QAT module mappings alongside MXFP8
  • Implemented FP4 packing/unpacking utilities with bit manipulation
  • Extended export logic to handle MXFP4 format with packed weight buffers
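As a rough sketch of how the packing utilities in this PR likely work, two 4-bit codes can be combined into a single uint8 byte. The function names and the nibble order (even-indexed value in the low nibble) are assumptions for illustration; the PR's actual layout may differ:

```python
import numpy as np

def pack_uint4(x: np.ndarray) -> np.ndarray:
    """Pack pairs of 4-bit values (stored one per byte) into single uint8 bytes.

    x: uint8 array with values in [0, 15]; the last dimension must be even.
    Assumed layout: even-indexed value -> low nibble, odd-indexed -> high nibble.
    """
    assert x.shape[-1] % 2 == 0, "last dim must be even to pair nibbles"
    low = x[..., 0::2]
    high = x[..., 1::2]
    return ((high << 4) | low).astype(np.uint8)

def unpack_uint4(packed: np.ndarray) -> np.ndarray:
    """Inverse of pack_uint4: split each byte back into two 4-bit values."""
    low = packed & 0x0F
    high = (packed >> 4) & 0x0F
    out = np.empty(packed.shape[:-1] + (packed.shape[-1] * 2,), dtype=np.uint8)
    out[..., 0::2] = low
    out[..., 1::2] = high
    return out
```

This halves the storage of an FP4 weight tensor, which is why the export path registers the packed result as a buffer rather than keeping one code per byte.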

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.

Summary per file:

  • neural_compressor/torch/quantization/config.py: Adds MXFP4 to QAT module mappings for torch.nn.Linear
  • neural_compressor/torch/export/export_hf.py: Adds an MXFP4 export path that packs weights into buffers
  • neural_compressor/torch/algorithms/qat/tensor_quantizer.py: Implements MXFP4 weight packing using the new packing utilities
  • neural_compressor/torch/algorithms/qat/quant_utils.py: Adds MXFP4 detection and sets the float-quantized format
  • neural_compressor/torch/algorithms/qat/mxfp4_packing.py: New file with FP4 casting and uint4-to-uint8 packing functions
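The FP4 casting in mxfp4_packing.py presumably maps each float to the nearest value representable in E2M1, whose 16 values are {0, ±0.5, ±1, ±1.5, ±2, ±3, ±4, ±6}. A minimal sketch of that idea follows; the function names are hypothetical, and the tie-breaking here (argmin picks the smaller magnitude) is a simplification, since real MX implementations typically round to nearest even:

```python
import numpy as np

# The 8 non-negative E2M1 magnitudes from the OCP MX FP4 value set;
# code i encodes E2M1_VALUES[i], and setting bit 3 flips the sign.
E2M1_VALUES = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0], dtype=np.float32)

def cast_to_fp4_codes(x: np.ndarray) -> np.ndarray:
    """Round each float to the nearest E2M1 value; return its 4-bit code."""
    sign = (x < 0).astype(np.uint8)
    mag = np.abs(x.astype(np.float32))
    # nearest-value lookup over the 8 magnitudes (broadcast, then argmin)
    idx = np.abs(mag[..., None] - E2M1_VALUES).argmin(axis=-1).astype(np.uint8)
    return (sign << 3) | idx

def fp4_codes_to_float(codes: np.ndarray) -> np.ndarray:
    """Decode 4-bit codes back to their E2M1 float values."""
    sign = np.where(codes & 0x8, -1.0, 1.0)
    return sign * E2M1_VALUES[codes & 0x7]
```

In the full MXFP4 format these 4-bit codes are paired with a shared per-block scale, so the value set above is applied per block after scaling; the resulting codes are then packed two per byte for serialization.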


@yiliu30 (Contributor) left a comment

Please add some UTs.

@lkk12014402 lkk12014402 added this to the 3.7 milestone Dec 3, 2025
@lkk12014402 (Contributor, Author)

> Please add some UTs.

Added a UT.

@yiliu30 yiliu30 self-requested a review December 5, 2025 10:29
@chensuyue (Contributor)

The failing code check is not related to this PR.

@chensuyue chensuyue merged commit 134dd92 into intel:master Dec 5, 2025
19 of 22 checks passed
