Commit 3622443

Update the naming
1 parent 8d415a6 commit 3622443

2 files changed, +4 -4 lines changed

docs/source/en/quantization/modelopt.md

Lines changed: 3 additions & 3 deletions
@@ -11,7 +11,7 @@ specific language governing permissions and limitations under the License. -->
 
 # NVIDIA ModelOpt
 
-[NVIDIA-ModelOpt](https://github.com/NVIDIA/TensorRT-Model-Optimizer) is a unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed.
+[NVIDIA-ModelOpt](https://github.com/NVIDIA/Model-Optimizer) is a unified library of state-of-the-art model optimization techniques like quantization, pruning, distillation, speculative decoding, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed.
 
 Before you begin, make sure you have nvidia_modelopt installed.
 
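The context line above tells readers to have `nvidia_modelopt` installed without showing how. A minimal sketch, assuming the library is published on PyPI as `nvidia-modelopt` and imported as `modelopt` (both names are assumptions, not confirmed by this diff):

```python
# Assumed install command (hypothetical PyPI name): pip install nvidia-modelopt
import modelopt  # assumed import name for the nvidia_modelopt package

# Confirm the library is importable before attempting quantization.
print(modelopt.__version__)
```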

@@ -57,7 +57,7 @@ image.save("output.png")
 >
 > The quantization methods in NVIDIA-ModelOpt are designed to reduce the memory footprint of model weights using various QAT (Quantization-Aware Training) and PTQ (Post-Training Quantization) techniques while maintaining model performance. However, the actual performance gain during inference depends on the deployment framework (e.g., TRT-LLM, TensorRT) and the specific hardware configuration.
 >
-> More details can be found [here](https://github.com/NVIDIA/TensorRT-Model-Optimizer/tree/main/examples).
+> More details can be found [here](https://github.com/NVIDIA/Model-Optimizer/tree/main/examples).
 
 ## NVIDIAModelOptConfig
 
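The note above explains that ModelOpt's QAT/PTQ methods shrink weight memory while deployment-time speedups depend on the framework. A minimal sketch of how the `NVIDIAModelOptConfig` documented on this page might drive PTQ through diffusers; the checkpoint ID and the `"FP8"` quant type are illustrative assumptions, not taken from this commit:

```python
import torch
from diffusers import AutoModel, NVIDIAModelOptConfig

# "FP8" is an assumed quant_type value; see the supported-methods table in the doc.
quant_config = NVIDIAModelOptConfig(quant_type="FP8")

# Hypothetical checkpoint; any diffusers transformer would follow the same pattern.
transformer = AutoModel.from_pretrained(
    "black-forest-labs/FLUX.1-dev",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)
```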

@@ -86,7 +86,7 @@ The quantization methods supported are as follows:
 | **NVFP4** | `nvfp4 weight only`, `nvfp4 block quantization` | `quant_type`, `quant_type + channel_quantize + block_quantize` | `channel_quantize = -1 is only supported for now`|
 
 
-Refer to the [official modelopt documentation](https://nvidia.github.io/TensorRT-Model-Optimizer/) for a better understanding of the available quantization methods and the exhaustive list of configuration options available.
+Refer to the [official modelopt documentation](https://nvidia.github.io/Model-Optimizer/) for a better understanding of the available quantization methods and the exhaustive list of configuration options available.
 
 ## Serializing and Deserializing quantized models
 
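Per the table row above, NVFP4 block quantization combines `quant_type` with `channel_quantize` and `block_quantize`, and only `channel_quantize = -1` is supported for now. A sketch under those assumptions (the exact value strings and block size are illustrative, not confirmed by this diff):

```python
from diffusers import NVIDIAModelOptConfig

# Parameter names follow the table above; concrete values are assumptions.
nvfp4_config = NVIDIAModelOptConfig(
    quant_type="NVFP4",
    channel_quantize=-1,  # the table notes -1 is the only supported value for now
    block_quantize=16,    # assumed block size for block quantization
)
```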

src/diffusers/quantizers/modelopt/modelopt_quantizer.py

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@
 
 class NVIDIAModelOptQuantizer(DiffusersQuantizer):
     r"""
-    Diffusers Quantizer for TensorRT Model Optimizer
+    Diffusers Quantizer for Nvidia-Model Optimizer
     """
 
     use_keep_in_fp32_modules = True
