TY - GEN
T1 - Study on the Correlation of Trainable Parameters and Hyperparameters with the Performance of Deep Learning Models
AU - Ong, Song Quan
AU - Isawasan, Pradeep
AU - Nair, Gomesh
AU - Salleh, Khairulliza Ahmad
AU - Yusof, Umi Kalsom
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - Trainable parameters and hyperparameters are critical to the development of a deep learning model. However, these components have typically been studied individually, and most studies have found it difficult to investigate the effect of their combination on model performance. We examine the correlation between the number of trainable parameters in a deep learning model and its performance metrics under different hyperparameters. Specifically, we study the effect of using either the Adam or the SGD optimizer at three different learning rates. We use six pre-trained models whose numbers of trainable parameters are defined by two strategies: (1) freezing the convolutional base so that only part of the weights are trainable, and (2) training the whole model so that most of the weights are trainable. Our experimental results show a positive correlation between the number of trainable parameters and test accuracy regardless of the learning rate. For model generalization, however, a higher number of trainable parameters did not guarantee higher accuracy and F1 scores. We show that the correlation between trainable parameters and model generalization becomes positive when Adam is used with the smallest learning rate.
AB - Trainable parameters and hyperparameters are critical to the development of a deep learning model. However, these components have typically been studied individually, and most studies have found it difficult to investigate the effect of their combination on model performance. We examine the correlation between the number of trainable parameters in a deep learning model and its performance metrics under different hyperparameters. Specifically, we study the effect of using either the Adam or the SGD optimizer at three different learning rates. We use six pre-trained models whose numbers of trainable parameters are defined by two strategies: (1) freezing the convolutional base so that only part of the weights are trainable, and (2) training the whole model so that most of the weights are trainable. Our experimental results show a positive correlation between the number of trainable parameters and test accuracy regardless of the learning rate. For model generalization, however, a higher number of trainable parameters did not guarantee higher accuracy and F1 scores. We show that the correlation between trainable parameters and model generalization becomes positive when Adam is used with the smallest learning rate.
KW - Deep Convolutional Neural Network
KW - Fine-tuning
KW - Parameters
KW - Regularization
UR - http://www.scopus.com/inward/record.url?scp=85176568821&partnerID=8YFLogxK
U2 - 10.1109/AiDAS60501.2023.10284682
DO - 10.1109/AiDAS60501.2023.10284682
M3 - Conference Proceeding
AN - SCOPUS:85176568821
T3 - 2023 4th International Conference on Artificial Intelligence and Data Sciences: Discovering Technological Advancement in Artificial Intelligence and Data Science, AiDAS 2023 - Proceedings
SP - 235
EP - 238
BT - 2023 4th International Conference on Artificial Intelligence and Data Sciences
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th International Conference on Artificial Intelligence and Data Sciences, AiDAS 2023
Y2 - 6 September 2023 through 7 September 2023
ER -
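
The abstract describes two transfer-learning strategies (freezing the convolutional base vs. training the whole model) evaluated with Adam or SGD at three learning rates. The following is a minimal sketch of that experimental grid, not code from the paper: it assumes TensorFlow/Keras, uses MobileNetV2 as a stand-in pre-trained model, and uses hypothetical class counts and learning-rate values, since the record does not specify them.

    # Hypothetical sketch (not from the paper): the two fine-tuning strategies the
    # abstract describes, assuming TensorFlow/Keras and MobileNetV2 as the backbone.
    import tensorflow as tf

    NUM_CLASSES = 10                       # assumed; the record does not state the dataset
    LEARNING_RATES = [1e-2, 1e-3, 1e-4]    # "three different learning rates" (values assumed)

    def build_model(freeze_base: bool) -> tf.keras.Model:
        # Strategy 1 (freeze_base=True): freeze the convolutional base, train only the head.
        # Strategy 2 (freeze_base=False): train the whole model, so most weights are trainable.
        base = tf.keras.applications.MobileNetV2(
            include_top=False, weights="imagenet", input_shape=(224, 224, 3))
        base.trainable = not freeze_base
        return tf.keras.Sequential([
            base,
            tf.keras.layers.GlobalAveragePooling2D(),
            tf.keras.layers.Dense(NUM_CLASSES, activation="softmax"),
        ])

    for freeze in (True, False):
        for lr in LEARNING_RATES:
            for opt_cls in (tf.keras.optimizers.Adam, tf.keras.optimizers.SGD):
                model = build_model(freeze_base=freeze)
                model.compile(optimizer=opt_cls(learning_rate=lr),
                              loss="sparse_categorical_crossentropy",
                              metrics=["accuracy"])
                # The count of trainable parameters below is what would be correlated
                # with test accuracy / F1 after model.fit(...) on the chosen dataset.
                n_trainable = sum(int(tf.size(w)) for w in model.trainable_weights)
                print(freeze, opt_cls.__name__, lr, n_trainable)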