What type of study is this?

This is an experimental study.

September 30, 2025 | Open Access

Analyzing the effectiveness of post-learning quantization for optimizing neural networks

Key Points

Post-learning quantization (post-training quantization, PTQ) reduces model size by a factor of three while losing no more than 1.5% accuracy.
Inference speeds up by as much as 40% without significant accuracy loss.
The experiments cover MobileNetV2, BERT-base, YOLOv5s, EfficientNet-B0, and DistilBERT, comparing accuracy, speed, and model size.
Post-learning quantization is especially beneficial for mobile and IoT devices, where energy consumption and memory requirements are critical.

Abstract

Neural networks, like other technologies, are developing rapidly. Since the advent of deep learning, models have grown deeper and more complex every year, and hardware capacity has not kept pace. This article reviews modern methods for optimizing neural networks, with an emphasis on post-learning quantization (post-training quantization, PTQ) as the most practical approach for deploying models where computing resources are limited. It gives an overview of the key methods (pruning, quantization, and knowledge distillation) and compares their effectiveness and applicability. Special attention is paid to the advantages and limitations of PTQ, including model size reduction, faster inference, and compatibility with industrial frameworks. The experimental part presents the results of quantizing the MobileNetV2, BERT-base, YOLOv5s, EfficientNet-B0, and DistilBERT models and analyzes the effect of quantization on their accuracy, speed, and compactness. In these experiments, post-learning quantization reduced model size by a factor of three and accelerated inference by up to 40% while losing no more than 1.5% accuracy. These results can serve as a basis for further research on neural network optimization, including combining quantization with other methods and creating hybrid methods that keep the advantages of post-learning quantization while offsetting its disadvantages. Post-learning quantization is especially effective for mobile and IoT devices, where energy consumption and memory requirements are critical, and its use in computer vision and natural language processing tasks is already showing practical applicability and promise.
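
To make the size and accuracy trade-off concrete, the following is a minimal sketch of symmetric per-tensor int8 weight quantization, one common form of PTQ. It is not the procedure used in the study: the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, chosen only for illustration, and real frameworks also handle activations, calibration data, and per-channel scales.

```python
# Illustrative sketch only: symmetric per-tensor int8 quantization in pure Python.
# Function names and sample weights are hypothetical, not taken from the study.
from array import array


def quantize_int8(weights):
    """Map float weights to int8 codes plus a single float scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer code and clamp to the symmetric int8 range.
    codes = array("b", (max(-127, min(127, round(w / scale))) for w in weights))
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float weights from the int8 codes."""
    return [c * scale for c in codes]


weights = [0.42, -1.27, 0.05, 0.88, -0.31, 1.10]
codes, scale = quantize_int8(weights)
restored = dequantize(codes, scale)

fp32_bytes = array("f", weights).itemsize * len(weights)  # float32 storage
int8_bytes = codes.itemsize * len(codes)                  # int8 storage
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print(f"scale={scale:.5f} codes={list(codes)}")
print(f"storage: {fp32_bytes} B -> {int8_bytes} B, max error {max_error:.5f}")
```

In this sketch the weight storage shrinks fourfold, and the rounding error per weight is bounded by half the scale. The threefold reduction reported above is for whole models; a plausible reason for the gap, stated here as an assumption, is that some parts of a model file are typically not stored as int8.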
