How are qparams (scale and zero_point) determined after fusing Conv and BN layers? #4369

gef1998 · 2025-02-27T03:23:24Z

During quantization (using pytorch_quantization), the qparams (scale and zero_point) of the original Conv are computed by a calibrator. However, when the Conv and Batch Normalization (BN) layers are fused, the weights and biases of the fused Conv change, so the original qparams may no longer apply. Could you please explain how to correctly determine the new qparams (scale and zero_point) after this fusion?

lix19937 · 2025-03-26T09:14:16Z

In the QAT phase there is no fuse_bn, and the exported ONNX model also keeps the BN layers unfused. Consider what happens when trtexec builds the ONNX model into a plan:

            weight
              |
              |
              v
input --> conv + bn -->

                    weight + w_Q + w_DQ
                             |
                             |
                             v
[input + i_Q] + i_DQ --> conv + bn -->

In the conv + bn case: 𝑊' = 𝛾 / sqrt(𝜎² + 𝜖) * 𝑊, where 𝛾 is the BN scale, 𝜎² is the BN running variance, and 𝜖 is the BN epsilon.

So the weight range changes after fusion. The TRT compiler will recalculate the dynamic range of the fused weights 𝑊' through statistical analysis (for example, with a max or histogram calibrator method); that is, w_Q and w_DQ will change.
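To make the effect on the weight scale concrete, here is a minimal sketch in plain Python. It assumes symmetric per-channel INT8 weight quantization with a max-based scale (amax / 127) and zero_point fixed at 0. The helper names `fold_bn_into_conv` and `per_channel_scale` and all numeric values are hypothetical and made up for illustration; they are not part of pytorch_quantization or TensorRT, and this is not a description of what the TensorRT builder does internally.

```python
import math
import random


def fold_bn_into_conv(weight, gamma, var, eps=1e-5):
    """Scale output channel k of the conv weight by gamma[k] / sqrt(var[k] + eps)."""
    return [
        [w * g / math.sqrt(v + eps) for w in channel]
        for channel, g, v in zip(weight, gamma, var)
    ]


def per_channel_scale(weight, num_bits=8):
    """Symmetric max-based scale per output channel: amax / (2^(bits-1) - 1)."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8
    return [max(abs(w) for w in channel) / qmax for channel in weight]


random.seed(0)
# Hypothetical conv weight: 4 output channels, each flattened to 3*3*3 = 27 values.
weight = [[random.gauss(0.0, 0.1) for _ in range(27)] for _ in range(4)]
gamma = [1.0, 0.5, 2.0, 1.5]  # BN scale (made-up values)
var = [0.25, 1.0, 0.04, 4.0]  # BN running variance (made-up values)

fused = fold_bn_into_conv(weight, gamma, var)
scale_before = per_channel_scale(weight)
scale_after = per_channel_scale(fused)

for k, (s0, s1) in enumerate(zip(scale_before, scale_after)):
    factor = gamma[k] / math.sqrt(var[k] + 1e-5)
    print(f"channel {k}: scale {s0:.6f} -> {s1:.6f} (BN factor {factor:.4f})")
```

Under these assumptions, the scale of each channel is multiplied by |𝛾 / sqrt(𝜎² + 𝜖)| for that channel, so a scale calibrated on the unfused weights does not match the fused weights unless that factor happens to be 1.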

kevinch-nv added the triaged and Module:Quantization labels on Mar 7, 2025
