TY - GEN
T1 - Quantization and Deployment Study of Classification Models for Embedded Platforms
AU - Huang, Zihan
AU - Jin, Jin
AU - Zhang, Chaolong
AU - Xu, Zhijie
AU - Xu, Yuanping
AU - Kong, Chao
AU - Wen, Qin
AU - Tang, Dan
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Deep learning models are widely applied across many domains. However, their large parameter counts, high storage requirements, and computational overhead make them difficult to deploy on resource-constrained embedded devices. This study addresses the issue by exploring techniques to optimize and deploy lightweight models on embedded devices. The approach first optimizes and adjusts the model, then applies model conversion, quantization, and quantization calibration to reduce model size and improve inference speed. Notably, the quantization calibration algorithm is improved to mitigate the accuracy loss caused by model quantization. Experimental results demonstrate that lightweight quantization significantly reduces model size, facilitating storage on embedded devices. Although accuracy drops slightly, inference speed improves substantially, enabling real-time human face recognition in video scenarios.
AB - Deep learning models are widely applied across many domains. However, their large parameter counts, high storage requirements, and computational overhead make them difficult to deploy on resource-constrained embedded devices. This study addresses the issue by exploring techniques to optimize and deploy lightweight models on embedded devices. The approach first optimizes and adjusts the model, then applies model conversion, quantization, and quantization calibration to reduce model size and improve inference speed. Notably, the quantization calibration algorithm is improved to mitigate the accuracy loss caused by model quantization. Experimental results demonstrate that lightweight quantization significantly reduces model size, facilitating storage on embedded devices. Although accuracy drops slightly, inference speed improves substantially, enabling real-time human face recognition in video scenarios.
KW - Deep learning
KW - Embedded devices
KW - Lightweight models
KW - Quantization
UR - http://www.scopus.com/inward/record.url?scp=85175575340&partnerID=8YFLogxK
U2 - 10.1109/ICAC57885.2023.10275155
DO - 10.1109/ICAC57885.2023.10275155
M3 - Conference Proceeding
AN - SCOPUS:85175575340
T3 - ICAC 2023 - 28th International Conference on Automation and Computing
BT - ICAC 2023 - 28th International Conference on Automation and Computing
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 28th International Conference on Automation and Computing, ICAC 2023
Y2 - 30 August 2023 through 1 September 2023
ER -