Standard
Trainable Thresholds for Neural Network Quantization. / Goncharenko, Alexander; Denisov, Andrey; Alyamkin, Sergey et al. Advances in Computational Intelligence - 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, Proceedings. ed. / Ignacio Rojas; Gonzalo Joya; Andreu Catala. Springer-Verlag GmbH and Co. KG, 2019. p. 302-312 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11507 LNCS).
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review
Harvard
Goncharenko, A, Denisov, A, Alyamkin, S & Terentev, E 2019, Trainable Thresholds for Neural Network Quantization. in I Rojas, G Joya & A Catala (eds), Advances in Computational Intelligence - 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11507 LNCS, Springer-Verlag GmbH and Co. KG, pp. 302-312, 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, Gran Canaria, Spain, 12.06.2019. https://doi.org/10.1007/978-3-030-20518-8_26
APA
Goncharenko, A., Denisov, A., Alyamkin, S., & Terentev, E. (2019). Trainable Thresholds for Neural Network Quantization. In I. Rojas, G. Joya, & A. Catala (Eds.), Advances in Computational Intelligence - 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, Proceedings (pp. 302-312). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11507 LNCS). Springer-Verlag GmbH and Co. KG. https://doi.org/10.1007/978-3-030-20518-8_26
Vancouver
Goncharenko A, Denisov A, Alyamkin S, Terentev E. Trainable Thresholds for Neural Network Quantization. In Rojas I, Joya G, Catala A, editors, Advances in Computational Intelligence - 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, Proceedings. Springer-Verlag GmbH and Co. KG. 2019. p. 302-312. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). doi: 10.1007/978-3-030-20518-8_26
Author
Goncharenko, Alexander ; Denisov, Andrey ; Alyamkin, Sergey et al. / Trainable Thresholds for Neural Network Quantization. Advances in Computational Intelligence - 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, Proceedings. editor / Ignacio Rojas ; Gonzalo Joya ; Andreu Catala. Springer-Verlag GmbH and Co. KG, 2019. pp. 302-312 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
BibTeX
@inproceedings{dfc084c13b7549f786557c637e24b599,
title = "Trainable Thresholds for Neural Network Quantization",
abstract = "Embedded computer vision applications for robotics, security cameras, and mobile phone apps require the usage of mobile neural network architectures like MobileNet-v2 or MNAS-Net in order to reduce RAM consumption and accelerate processing. An additional option for further resource consumption reduction is 8-bit neural network quantization. Unfortunately, the known methods for neural network quantization lead to significant accuracy reduction (more than 1.2%) for mobile architectures and require long training with quantization procedure. To overcome this limitation, we propose a method that allows to quantize mobile neural network without significant accuracy loss. Our approach is based on trainable quantization thresholds for each neural network filter, that allows to accelerate training with quantization procedure up to 10 times in comparison with the standard techniques. Using the proposed technique, we quantize the modern mobile architectures of neural networks with the accuracy loss not exceeding 0.1%. Ready-for-use models and code are available at: https://github.com/agoncharenko1992/FAT-fast-adjustable-threshold.",
keywords = "Distillation, Machine learning, Neural networks, Quantization",
author = "Alexander Goncharenko and Andrey Denisov and Sergey Alyamkin and Evgeny Terentev",
year = "2019",
month = jan,
day = "1",
doi = "10.1007/978-3-030-20518-8_26",
language = "English",
isbn = "9783030205171",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer-Verlag GmbH and Co. KG",
pages = "302--312",
editor = "Ignacio Rojas and Gonzalo Joya and Andreu Catala",
booktitle = "Advances in Computational Intelligence - 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, Proceedings",
address = "Germany",
note = "15th International Work-Conference on Artificial Neural Networks, IWANN 2019 ; Conference date: 12-06-2019 Through 14-06-2019",
}
RIS
TY - GEN
T1 - Trainable Thresholds for Neural Network Quantization
AU - Goncharenko, Alexander
AU - Denisov, Andrey
AU - Alyamkin, Sergey
AU - Terentev, Evgeny
PY - 2019/1/1
Y1 - 2019/1/1
N2 - Embedded computer vision applications for robotics, security cameras, and mobile phone apps require the usage of mobile neural network architectures like MobileNet-v2 or MNAS-Net in order to reduce RAM consumption and accelerate processing. An additional option for further resource consumption reduction is 8-bit neural network quantization. Unfortunately, the known methods for neural network quantization lead to significant accuracy reduction (more than 1.2%) for mobile architectures and require long training with quantization procedure. To overcome this limitation, we propose a method that allows to quantize mobile neural network without significant accuracy loss. Our approach is based on trainable quantization thresholds for each neural network filter, that allows to accelerate training with quantization procedure up to 10 times in comparison with the standard techniques. Using the proposed technique, we quantize the modern mobile architectures of neural networks with the accuracy loss not exceeding 0.1%. Ready-for-use models and code are available at: https://github.com/agoncharenko1992/FAT-fast-adjustable-threshold.
AB - Embedded computer vision applications for robotics, security cameras, and mobile phone apps require the usage of mobile neural network architectures like MobileNet-v2 or MNAS-Net in order to reduce RAM consumption and accelerate processing. An additional option for further resource consumption reduction is 8-bit neural network quantization. Unfortunately, the known methods for neural network quantization lead to significant accuracy reduction (more than 1.2%) for mobile architectures and require long training with quantization procedure. To overcome this limitation, we propose a method that allows to quantize mobile neural network without significant accuracy loss. Our approach is based on trainable quantization thresholds for each neural network filter, that allows to accelerate training with quantization procedure up to 10 times in comparison with the standard techniques. Using the proposed technique, we quantize the modern mobile architectures of neural networks with the accuracy loss not exceeding 0.1%. Ready-for-use models and code are available at: https://github.com/agoncharenko1992/FAT-fast-adjustable-threshold.
KW - Distillation
KW - Machine learning
KW - Neural networks
KW - Quantization
UR - http://www.scopus.com/inward/record.url?scp=85067573043&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-20518-8_26
DO - 10.1007/978-3-030-20518-8_26
M3 - Conference contribution
AN - SCOPUS:85067573043
SN - 9783030205171
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 302
EP - 312
BT - Advances in Computational Intelligence - 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, Proceedings
A2 - Rojas, Ignacio
A2 - Joya, Gonzalo
A2 - Catala, Andreu
PB - Springer-Verlag GmbH and Co. KG
T2 - 15th International Work-Conference on Artificial Neural Networks, IWANN 2019
Y2 - 12 June 2019 through 14 June 2019
ER -
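
For readers skimming this record, the abstract's core idea, learning a clipping threshold per filter so that quantization-aware training converges quickly, can be sketched in a few lines. Below is a minimal illustration in PyTorch under assumed names and initialization values; it is not the authors' released code (that lives at the GitHub link in the abstract) but a generic fake-quantization layer with trainable thresholds and a straight-through estimator for the rounding step.

import torch
import torch.nn as nn

class FakeQuantize(nn.Module):
    # Illustrative sketch only: per-filter fake quantization with a
    # trainable clipping threshold. The class name, threshold shape, and
    # init_threshold default are assumptions for this example.
    def __init__(self, num_channels: int, num_bits: int = 8,
                 init_threshold: float = 6.0):
        super().__init__()
        # One trainable clipping threshold per filter (channel).
        self.threshold = nn.Parameter(
            torch.full((num_channels, 1, 1), init_threshold))
        self.levels = 2 ** num_bits - 1  # 255 quantization levels for 8 bits

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        t = self.threshold.abs()      # keep thresholds positive
        scale = t / self.levels       # step size of the integer grid
        # Clip unsigned (post-ReLU) activations to [0, t].
        x_clipped = torch.minimum(torch.relu(x), t)
        q_float = x_clipped / scale
        q_int = torch.round(q_float)
        # Straight-through estimator: the forward pass uses rounded values,
        # the backward pass treats rounding as identity, so gradients reach
        # both the activations and the thresholds.
        q = q_float + (q_int - q_float).detach()
        return q * scale

# Example: simulate 8-bit activation quantization after a conv layer.
conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)
fq = FakeQuantize(num_channels=16)
y = fq(torch.relu(conv(torch.randn(1, 3, 32, 32))))

Because the thresholds are ordinary parameters, they can be trained jointly with the network weights by any optimizer, which is the mechanism the abstract credits for the reported speed-up of quantization-aware training.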