Deep learning models have achieved remarkable success across various domains, yet their large size and computational demands present significant challenges for deployment, particularly in resource-constrained environments. This presentation provides a comprehensive overview of model compression techniques, with a specific focus on training-free methods such as pruning, quantization, and low-rank factorization. We will explore the theoretical foundations of these techniques and their practical implications, emphasizing the benefits of reducing resource requirements without the need for additional training. The analysis will include case studies that demonstrate the effectiveness of training-free approaches in real-world applications, along with a discussion of emerging trends and future directions in model compression. By equipping researchers and practitioners with insights into these techniques, this presentation aims to support efficient model deployment while preserving accuracy across diverse scenarios.
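As a minimal sketch of one of the training-free methods named above, the snippet below applies low-rank factorization to a weight matrix via truncated SVD; the matrix shape, the rank k, and the variable names are illustrative assumptions, not details from the presentation:

```python
import numpy as np

# Hypothetical weight matrix standing in for a trained dense layer.
rng = np.random.default_rng(0)
W = rng.standard_normal((256, 512))

# Training-free low-rank factorization: truncated SVD keeps the top-k
# singular values, replacing W (256*512 parameters) with two factors
# A (256, k) and B (k, 512), for 256*k + k*512 parameters total.
k = 32
U, s, Vt = np.linalg.svd(W, full_matrices=False)
A = U[:, :k] * s[:k]   # absorb singular values into the left factor
B = Vt[:k, :]
W_approx = A @ B       # low-rank approximation of W, no retraining

original_params = W.size
compressed_params = A.size + B.size
print(compressed_params / original_params)  # compression ratio 0.1875
```

At inference time the dense layer `x @ W` would be replaced by `(x @ A) @ B`, trading a small accuracy loss (the discarded singular values) for the parameter and FLOP reduction, with no additional training required.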
Period: 26 Oct 2024
Event title: 2024 13th International Conference on Computing and Pattern Recognition