Research output: Publications in books, reports, collections, conference proceedings › conference contribution › scientific › peer-reviewed
Trainable Thresholds for Neural Network Quantization. / Goncharenko, Alexander; Denisov, Andrey; Alyamkin, Sergey et al.
Advances in Computational Intelligence - 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, Proceedings. Ed. / Ignacio Rojas; Gonzalo Joya; Andreu Catala. Springer, 2019. pp. 302-312 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11507 LNCS).
TY - GEN
T1 - Trainable Thresholds for Neural Network Quantization
AU - Goncharenko, Alexander
AU - Denisov, Andrey
AU - Alyamkin, Sergey
AU - Terentev, Evgeny
PY - 2019/1/1
Y1 - 2019/1/1
N2 - Embedded computer vision applications for robotics, security cameras, and mobile phone apps require mobile neural network architectures such as MobileNet-v2 or MNAS-Net to reduce RAM consumption and accelerate processing. A further option for reducing resource consumption is 8-bit neural network quantization. Unfortunately, known methods for neural network quantization lead to a significant accuracy reduction (more than 1.2%) for mobile architectures and require long training with the quantization procedure. To overcome this limitation, we propose a method that allows quantizing mobile neural networks without significant accuracy loss. Our approach is based on trainable quantization thresholds for each neural network filter, which accelerates training with the quantization procedure by up to 10 times compared with standard techniques. Using the proposed technique, we quantize modern mobile neural network architectures with an accuracy loss not exceeding 0.1%. Ready-for-use models and code are available at: https://github.com/agoncharenko1992/FAT-fast-adjustable-threshold.
AB - Embedded computer vision applications for robotics, security cameras, and mobile phone apps require mobile neural network architectures such as MobileNet-v2 or MNAS-Net to reduce RAM consumption and accelerate processing. A further option for reducing resource consumption is 8-bit neural network quantization. Unfortunately, known methods for neural network quantization lead to a significant accuracy reduction (more than 1.2%) for mobile architectures and require long training with the quantization procedure. To overcome this limitation, we propose a method that allows quantizing mobile neural networks without significant accuracy loss. Our approach is based on trainable quantization thresholds for each neural network filter, which accelerates training with the quantization procedure by up to 10 times compared with standard techniques. Using the proposed technique, we quantize modern mobile neural network architectures with an accuracy loss not exceeding 0.1%. Ready-for-use models and code are available at: https://github.com/agoncharenko1992/FAT-fast-adjustable-threshold.
KW - Distillation
KW - Machine learning
KW - Neural networks
KW - Quantization
UR - http://www.scopus.com/inward/record.url?scp=85067573043&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-20518-8_26
DO - 10.1007/978-3-030-20518-8_26
M3 - Conference contribution
AN - SCOPUS:85067573043
SN - 9783030205171
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 302
EP - 312
BT - Advances in Computational Intelligence - 15th International Work-Conference on Artificial Neural Networks, IWANN 2019, Proceedings
A2 - Rojas, Ignacio
A2 - Joya, Gonzalo
A2 - Catala, Andreu
PB - Springer
T2 - 15th International Work-Conference on Artificial Neural Networks, IWANN 2019
Y2 - 12 June 2019 through 14 June 2019
ER -
ID: 20643808
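
Illustrative sketch: the abstract describes training-time ("fake") quantization with trainable clipping thresholds. Below is a minimal PyTorch sketch of symmetric 8-bit fake quantization with a single learnable threshold and a straight-through estimator (a PACT-style formulation). It is a hedged reconstruction of the general idea for orientation only, not the authors' FAT method; their actual per-filter implementation is in the linked repository, and the class name FakeQuant and all parameters here are hypothetical.

    # Minimal sketch of 8-bit fake quantization with a trainable
    # clipping threshold. Assumption-laden illustration only; not the
    # authors' FAT implementation (see the GitHub repository above).
    import torch
    import torch.nn as nn

    class FakeQuant(nn.Module):
        """Symmetric 8-bit fake quantization with a learnable threshold."""
        def __init__(self, init_threshold: float = 1.0, bits: int = 8):
            super().__init__()
            # Trainable threshold T; inputs are clipped to [-T, T].
            # The paper trains one threshold per filter; a scalar is
            # used here for brevity (a vector parameter generalizes it).
            self.threshold = nn.Parameter(torch.tensor(init_threshold))
            self.levels = 2 ** (bits - 1) - 1  # 127 for signed 8-bit

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            t = self.threshold.abs() + 1e-8        # keep threshold positive
            scale = t / self.levels                # quantization step size
            x_clipped = torch.clamp(x, -t, t)
            x_quant = torch.round(x_clipped / scale) * scale
            # Straight-through estimator: quantized values in the forward
            # pass; gradients flow through the clipping (and thus reach
            # the threshold), treating rounding as the identity.
            return x_clipped + (x_quant - x_clipped).detach()

A short usage check under the same assumptions: gradients reach both the input and the threshold, which is what makes the threshold trainable by ordinary backpropagation.

    fq = FakeQuant(init_threshold=3.0)
    x = torch.randn(4, 16, requires_grad=True)
    loss = fq(x).pow(2).sum()
    loss.backward()  # populates x.grad and fq.threshold.grad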