
Others: Doubt regarding the scale and zero_point values calculated for each layer in PTQ and QAT #1

@kaushikepi


📝 Description

While going through your YouTube video explanation on quantisation, I came across this doubt: when validating the formulas for scale and zero_point in asymmetric and symmetric quantization, I found a mismatch with the values in the notebook code examples.

[Screenshot: Post-Training Quantisation notebook output showing the MinMaxObserver statistics and QuantizedLinear parameters]

🤯 Observation

In the above screenshot of the Post-Training Quantisation notebook:
the MinMaxObserver for Linear layer1 has calculated the min value (beta) and max value (alpha) as min_val=-53.58397674560547, max_val=34.898128509521484.

Using the above min and max values, the scale and zero_point for QuantizedLinear layer1 are scale=0.6967094540596008, zero_point=77.
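To make the comparison concrete, here is a minimal sketch (not the notebook's exact code) of how the observer's parameters can be inspected; the input tensor is just a stand-in covering the observed range:

```python
import torch
from torch.ao.quantization.observer import MinMaxObserver

# Stand-in tensor covering the observed range; in the notebook these
# statistics come from running calibration data through layer1.
x = torch.tensor([-53.58397674560547, 34.898128509521484])

obs = MinMaxObserver(dtype=torch.quint8, qscheme=torch.per_tensor_affine)
obs(x)  # the observer records min_val / max_val on the forward pass
scale, zero_point = obs.calculate_qparams()

print(obs.min_val.item(), obs.max_val.item())
# With these constructor defaults (reduce_range=False), I would expect
# scale ≈ 0.3470 and zero_point = 154 -- not the notebook's 0.6967 / 77.
print(scale.item(), zero_point.item())
```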

❔ Question/Doubt

Formulae for calculating s and z for Asymmetric Quantization

Scale = (Xmax - Xmin) / (2^n - 1)
Zero_point = -1 * (Xmin / Scale)
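Plugging the observed min/max into these formulae (a quick sketch, assuming n = 8, so 2^n - 1 = 255):

```python
# Asymmetric quantization parameters from the formulae above, n = 8 bits.
x_min, x_max = -53.58397674560547, 34.898128509521484
n = 8

scale = (x_max - x_min) / (2**n - 1)   # (Xmax - Xmin) / 255
zero_point = -x_min / scale            # -1 * (Xmin / Scale)

print(scale)       # ~0.34698864
print(zero_point)  # ~154.43, i.e. 154 after rounding
```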

Considering the default qscheme=torch.per_tensor_affine and dtype=torch.quint8 for MinMaxObserver, the quantisation used by the torch quantization library is asymmetric.

Shouldn't the scale and zero_point values for the QuantizedLinear layer, according to asymmetric quantization to 8-bit INT, be:
scale=0.34698863, zero_point≈154.43 (154 after rounding)?

Why is the scale value in the notebook screenshot roughly 2x the scale value calculated by the formula, i.e. 2 * 0.34698863 ≈ 0.6967094540596008?
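One possibility I considered (please confirm): if the observer restricts quint8 to a reduced 7-bit range (quant_min=0, quant_max=127, i.e. reduce_range=True, which some PyTorch qconfigs enable by default), the denominator halves and the scale roughly doubles, and this would also reproduce the notebook's zero_point. A quick check:

```python
x_min, x_max = -53.58397674560547, 34.898128509521484

scale_8bit = (x_max - x_min) / 255   # full quint8 range -> ~0.34698864
scale_7bit = (x_max - x_min) / 127   # reduced range (reduce_range=True) -> ~0.69670945
zero_point_7bit = round(-x_min / scale_7bit)  # -> 77, matching the notebook

print(scale_8bit, scale_7bit, zero_point_7bit)
```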


@hkproj Can you please shed some light on the calculation?

Thank you
