📝 Description
While going through your YouTube video explanation on quantization, I came across a doubt when validating the formulas for scale and zero_point for asymmetric and symmetric quantization: the values I compute don't match those in the notebook code examples.
🤯 Observation
In the screenshot above from the Post-Training Quantisation notebook:
the MinMaxObserver for Linear layer 1 has calculated the min (beta) and max (alpha) values as min_val=-53.58397674560547, max_val=34.898128509521484.
Using these min and max values, the scale and zero_point for the QuantizedLinear layer 1 are scale=0.6967094540596008, zero_point=77.
❔ Question/Doubt
Formulae for calculating s and z for asymmetric quantization:
Scale = (Xmax - Xmin) / (2^n - 1)
Zero_point = -Xmin / Scale
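The two formulas above can be checked numerically with a short sketch (plain floats, no torch needed), plugging in the min/max values reported by the MinMaxObserver:

```python
# Asymmetric (affine) quantization parameters for unsigned 8-bit integers,
# using the observer's min/max values from layer 1 of the notebook.
x_min = -53.58397674560547  # beta
x_max = 34.898128509521484  # alpha
n_bits = 8

scale = (x_max - x_min) / (2 ** n_bits - 1)  # (Xmax - Xmin) / 255
zero_point = round(-x_min / scale)           # -Xmin / Scale, rounded to an int

print(scale)       # ≈ 0.34699
print(zero_point)  # 154
```

This yields scale ≈ 0.34699 and zero_point = 154, i.e. roughly half the scale and double the zero_point that the notebook reports.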
Considering that the default qscheme for MinMaxObserver is torch.per_tensor_affine with dtype=torch.quint8, the quantization used by the PyTorch quantization library is asymmetric.
Shouldn't the scale and zero_point for the QuantizedLinear layer, according to asymmetric quantization to 8-bit integers, be:
scale = 0.34698863, zero_point = -(-53.58397674560547) / 0.34698863 ≈ 154.43?
Why is the scale value in the notebook screenshot roughly twice the value calculated from the formula (2 × 0.34698863 ≈ 0.694, vs. 0.6967094540596008 in the notebook)?
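One possible explanation for the factor of ~2 (an assumption on my part, not confirmed from the notebook) is PyTorch's `reduce_range` option: with `reduce_range=True`, which the default fbgemm qconfig uses for activations to avoid intermediate overflow, the quint8 quantized range is restricted to 0..127 (7 usable bits) instead of 0..255, which roughly doubles the scale. A minimal sketch under that assumption:

```python
# Compare the scale for the full 8-bit range vs. the reduced 7-bit range.
x_min = -53.58397674560547
x_max = 34.898128509521484

scale_8bit = (x_max - x_min) / 255  # full quint8 range 0..255
scale_7bit = (x_max - x_min) / 127  # reduce_range=True: quint8 restricted to 0..127

print(scale_8bit)  # ≈ 0.34699
print(scale_7bit)  # ≈ 0.69671 — close to the notebook's 0.6967094540596008

zero_point_7bit = round(-x_min / scale_7bit)
print(zero_point_7bit)  # 77 — matches the notebook's zero_point
```

With the 127-step range, both the scale (≈0.6967) and the zero_point (77) line up with the notebook values, so reduce_range may be what's happening here.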
@hkproj Can you please shed some light on the calculation?
Thank you