SyncMalloc: A Synchronized Host-Device Co-Management System for GPU Dynamic Memory Allocation across All Scales

Jiajian Zhang, Fangyu Wu, Hai Jiang, Guangliang Cheng, Genlang Chen, Qiufeng Wang

Research output: Chapter in Book or Report/Conference proceeding, Conference Proceeding, peer-review

Abstract

Dynamic memory allocation on GPUs, increasingly crucial for applications with dynamic computational patterns, encounters significant challenges due to the complex calculations with intricate branches and substantial memory resources consumed by metadata from massive thread allocations. Despite the current research, there is a lack of a scalable and flexible solution that effectively manages dynamic memory allocation while minimizing memory usage on GPUs. This paper introduces SyncMalloc, a synchronized Host-Device Co-Management system that is specifically designed to adeptly handle dynamic memory allocations of diverse magnitudes. Through the integration of pipelining and producer-consumer mechanisms, SyncMalloc effectively reduces communication overhead and resolves architectural mismatches, further enhancing its capability through synergistic integration with CUDA's unified memory to facilitate oversubscription. Moreover, SyncMalloc advances slab-based memory management to enhance the efficiency of small allocations, reducing conflict probabilities and overhead in high-activity scenarios. Finally, we present a comprehensive performance evaluation, expanding benchmarks and measurement dimensions to reflect the performance of real-world applications more accurately. The experimental results demonstrate the effectiveness of SyncMalloc in supporting dynamic GPU allocations scaled from 4B to 200GB from multiple perspectives. Our source code is available at https://github.com/jjZhang94/SyncMalloc.
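To make the slab idea mentioned in the abstract concrete, the following is a minimal host-side C++ sketch of a generic slab: a region cut into fixed-size blocks whose occupancy is tracked by one atomic bitmap word. It is an illustration under stated assumptions, not SyncMalloc's implementation; the abstract does not describe the actual design, and the `Slab` class, its 64-block capacity, and all method names here are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical slab: 64 fixed-size blocks tracked by a single atomic bitmap.
// Bit i set means block i is in use.
class Slab {
public:
    static constexpr int kBlocks = 64;

    explicit Slab(std::size_t block_size)
        : block_size_(block_size), storage_(block_size * kBlocks), used_(0) {}

    // Claim the lowest free block, or return nullptr when the slab is full.
    void* allocate() {
        std::uint64_t old = used_.load(std::memory_order_relaxed);
        while (old != ~std::uint64_t{0}) {
            int idx = first_zero(old);
            std::uint64_t desired = old | (std::uint64_t{1} << idx);
            // On failure, compare_exchange_weak reloads `old`, so the loop
            // retries against the latest bitmap value.
            if (used_.compare_exchange_weak(old, desired,
                                            std::memory_order_acq_rel,
                                            std::memory_order_relaxed)) {
                return storage_.data() + static_cast<std::size_t>(idx) * block_size_;
            }
        }
        return nullptr;
    }

    // Return a block obtained from allocate() to the slab.
    void release(void* p) {
        unsigned char* q = static_cast<unsigned char*>(p);
        assert(q >= storage_.data() && q < storage_.data() + storage_.size());
        std::size_t idx = static_cast<std::size_t>(q - storage_.data()) / block_size_;
        used_.fetch_and(~(std::uint64_t{1} << idx), std::memory_order_release);
    }

    // Number of blocks currently marked as in use.
    int in_use() const {
        std::uint64_t bits = used_.load(std::memory_order_acquire);
        int count = 0;
        for (; bits != 0; bits >>= 1) count += static_cast<int>(bits & 1);
        return count;
    }

private:
    // Index of the lowest clear bit; the caller guarantees one exists.
    static int first_zero(std::uint64_t bits) {
        int idx = 0;
        while (bits & 1) { bits >>= 1; ++idx; }
        return idx;
    }

    std::size_t block_size_;
    std::vector<unsigned char> storage_;
    std::atomic<std::uint64_t> used_;
};
```

In this sketch, constructing `Slab slab(16)` and calling `slab.allocate()` twice yields two distinct 16-byte blocks; releasing the first makes it the next block handed out. Because every thread contends on the same bitmap word, conflicts grow with concurrency, which is the kind of overhead the abstract says SyncMalloc's slab design aims to reduce.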

Original language: English
Title of host publication: 53rd International Conference on Parallel Processing, ICPP 2024 - Main Conference Proceedings
Publisher: Association for Computing Machinery
Pages: 179-188
Number of pages: 10
ISBN (Electronic): 9798400708428
Publication status: Published - 12 Aug 2024
Event: 53rd International Conference on Parallel Processing, ICPP 2024 - Gotland, Sweden
Duration: 12 Aug 2024 to 15 Aug 2024

Publication series

Name: ACM International Conference Proceeding Series

Conference

Conference: 53rd International Conference on Parallel Processing, ICPP 2024
Country/Territory: Sweden
City: Gotland
Period: 12/08/24 to 15/08/24

Keywords

  • Dynamic Allocation
  • GPU
  • Memory Management
