TY - GEN
T1 - PLSR: Unstructured Pruning with Layer-wise Sparsity Ratio
AU - Zhao, Haocheng
AU - Yu, Limin
AU - Guan, Runwei
AU - Jia, Liye
AU - Zhang, Junqing
AU - Yue, Yutao
N1 - Publisher Copyright:
© 2023 IEEE.
PY - 2023/12
Y1 - 2023/12
N2 - In the current era, as multi-modal and large models gradually reveal their potential, neural network pruning has emerged as a crucial means of model compression. It is widely recognized that models tend to be over-parameterized, and pruning removes unimportant weights, improving inference speed while preserving accuracy. From early gradient-based and magnitude-based pruning to modern algorithms such as iterative magnitude pruning, the lottery ticket hypothesis, and pruning at initialization (PaI), researchers have strived to increase the compression ratio of model parameters while maintaining high accuracy. Mainstream algorithms currently prune neural networks globally using various scoring functions, followed by different pruning strategies to enhance the accuracy of the sparse model. Recent studies have shown that random pruning with varying layer-wise sparsity ratios achieves robust results for large models and out-of-distribution data. Based on this discovery, we propose a new score called FeatIO, which is based on module input and output feature map sizes. As a PaI score function, FeatIO surpasses the performance of other PaI score functions. Additionally, we propose a novel pruning strategy called Pruning with Layer-wise Sparsity Ratio (PLSR), which combines layer-wise sparsity ratios with a magnitude-based score function, yielding the best evaluation performance. Almost all algorithms exhibit improved performance when using our novel pruning strategy. The combination of PLSR and FeatIO consistently outperforms the other algorithms in testing, demonstrating the significant potential of our proposed approach. Our code will be available here.
AB - In the current era, as multi-modal and large models gradually reveal their potential, neural network pruning has emerged as a crucial means of model compression. It is widely recognized that models tend to be over-parameterized, and pruning removes unimportant weights, improving inference speed while preserving accuracy. From early gradient-based and magnitude-based pruning to modern algorithms such as iterative magnitude pruning, the lottery ticket hypothesis, and pruning at initialization (PaI), researchers have strived to increase the compression ratio of model parameters while maintaining high accuracy. Mainstream algorithms currently prune neural networks globally using various scoring functions, followed by different pruning strategies to enhance the accuracy of the sparse model. Recent studies have shown that random pruning with varying layer-wise sparsity ratios achieves robust results for large models and out-of-distribution data. Based on this discovery, we propose a new score called FeatIO, which is based on module input and output feature map sizes. As a PaI score function, FeatIO surpasses the performance of other PaI score functions. Additionally, we propose a novel pruning strategy called Pruning with Layer-wise Sparsity Ratio (PLSR), which combines layer-wise sparsity ratios with a magnitude-based score function, yielding the best evaluation performance. Almost all algorithms exhibit improved performance when using our novel pruning strategy. The combination of PLSR and FeatIO consistently outperforms the other algorithms in testing, demonstrating the significant potential of our proposed approach. Our code will be available here.
KW - Layer-wise Sparsity
KW - Model Compression
KW - Pruning
KW - Unstructured Pruning
UR - http://www.scopus.com/inward/record.url?scp=85190113835&partnerID=8YFLogxK
U2 - 10.1109/ICMLA58977.2023.00009
DO - 10.1109/ICMLA58977.2023.00009
M3 - Conference Proceeding
AN - SCOPUS:85190113835
T3 - International Conference on Machine Learning and Applications (ICMLA)
SP - 1
EP - 8
BT - Proceedings - 22nd IEEE International Conference on Machine Learning and Applications, ICMLA 2023
A2 - Arif Wani, M.
A2 - Boicu, Mihai
A2 - Sayed-Mouchaweh, Moamar
A2 - Abreu, Pedro Henriques
A2 - Gama, Joao
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 22nd IEEE International Conference on Machine Learning and Applications, ICMLA 2023
Y2 - 15 December 2023 through 17 December 2023
ER -