Research output: Contribution to journal › Conference article › peer-review
Quantization noise in low bit quantization and iterative adaptation to quantization noise in quantizable neural networks. / Chudakov, D.; Goncharenko, A.; Alyamkin, S. et al.
In: Journal of Physics: Conference Series, Vol. 2134, No. 1, 012004, 20.12.2021.
TY - JOUR
T1 - Quantization noise in low bit quantization and iterative adaptation to quantization noise in quantizable neural networks
AU - Chudakov, D.
AU - Goncharenko, A.
AU - Alyamkin, S.
AU - Densidov, A.
N1 - Publisher Copyright: © 2021 Institute of Physics Publishing. All rights reserved.
PY - 2021/12/20
Y1 - 2021/12/20
N2 - Quantization is one of the most popular and widely used methods of speeding up a neural network. At the moment, the standard is 8-bit uniform quantization. Nevertheless, the use of uniform low-bit quantization (4- and 6-bit quantization) has significant advantages in speed and resource requirements for inference. We present our quantization algorithm that offers advantages when using uniform low-bit quantization. It is faster than quantization-aware training from scratch and more accurate than methods aimed only at selecting thresholds and reducing noise from quantization. We also investigated quantization noise in neural networks for low-bit quantization and concluded that quantization noise is not always a good metric for quantization quality.
UR - http://www.scopus.com/inward/record.url?scp=85123640836&partnerID=8YFLogxK
U2 - 10.1088/1742-6596/2134/1/012004
DO - 10.1088/1742-6596/2134/1/012004
M3 - Conference article
AN - SCOPUS:85123640836
VL - 2134
JO - Journal of Physics: Conference Series
JF - Journal of Physics: Conference Series
SN - 1742-6588
IS - 1
M1 - 012004
T2 - 8th International Young Scientists Conference on Information Technologies, Telecommunications and Control Systems, ITTCS 2021
Y2 - 16 December 2021 through 17 December 2021
ER -
ID: 35377964
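
The abstract refers to uniform low-bit quantization and to quantization noise as a quality metric, but the record does not give formulas. As a rough illustration only, the sketch below applies a generic symmetric uniform quantizer at 4, 6, and 8 bits and reports the mean squared error as a stand-in for "quantization noise". It is not the authors' algorithm: the function names `quantize_uniform` and `quantization_noise`, the max-absolute-value threshold, and the Gaussian test data are all assumptions made for this example.

```python
import random


def quantize_uniform(values, bits, threshold):
    """Symmetric uniform quantization (hypothetical helper, not from the paper).

    Values are clipped to [-threshold, threshold] and rounded to the nearest
    point on a signed grid with 2**(bits - 1) - 1 positive levels.
    """
    levels = 2 ** (bits - 1) - 1  # 7 for 4-bit, 31 for 6-bit, 127 for 8-bit
    scale = threshold / levels    # distance between neighbouring grid points
    quantized = []
    for v in values:
        clipped = max(-threshold, min(threshold, v))
        quantized.append(round(clipped / scale) * scale)
    return quantized


def quantization_noise(values, bits, threshold):
    """Mean squared error between the original and the quantized values."""
    quantized = quantize_uniform(values, bits, threshold)
    return sum((a - b) ** 2 for a, b in zip(values, quantized)) / len(values)


# Assumed test data: Gaussian samples standing in for a layer's weights.
random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]

# Assumed threshold choice: the largest absolute value, so nothing is clipped.
threshold = max(abs(w) for w in weights)

noise = {bits: quantization_noise(weights, bits, threshold) for bits in (4, 6, 8)}
for bits, mse in noise.items():
    print(f"{bits}-bit MSE: {mse:.6f}")
```

With this setup the measured noise grows as the bit width shrinks, since the grid step widens. The abstract's point is that a lower value of such a noise measure does not by itself guarantee better accuracy of the quantized network, which this toy example does not attempt to show.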