A novel parallel prefix sum algorithm and its implementation on multi-core platforms

Nan Zhang*

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceedingConference Proceedingpeer-review

4 Citations (Scopus)

Abstract

This paper presents a novel parallel prefix sum algorithm on n numbers by p processors. A parallel time analysis shows that this proposed algorithm is more efficient and easy-scalable than Blelloch's classic parallel solution for the same problem. Unlike Blelloch's method, the proposed algorithm does not require the number p of processors to be a power of 2, and it needs no extra buffers apart from the array that holds the numbers. For practical considerations, a variant of the proposed algorithm is further developed which incurs less overhead due to thread management and synchronisation. The Blelloch's algorithm and the variant of the proposed method are implemented via POSIX Threads, and are tested on two multi-core platforms. The results showed that the variant outperformed the Blelloch's method provided that the size of the array not exceeded the total capacity of the L2 cache of the host machine. In some of the tested cases, the increase in performance was as high as 50%.

Original languageEnglish
Title of host publicationICCET 2010 - 2010 International Conference on Computer Engineering and Technology, Proceedings
PagesV266-V270
DOIs
Publication statusPublished - 2010
Event2010 2nd International Conference on Computer Engineering and Technology, ICCET 2010 - Chengdu, China
Duration: 16 Apr 201018 Apr 2010

Publication series

NameICCET 2010 - 2010 International Conference on Computer Engineering and Technology, Proceedings
Volume2

Conference

Conference2010 2nd International Conference on Computer Engineering and Technology, ICCET 2010
Country/TerritoryChina
CityChengdu
Period16/04/1018/04/10

Keywords

  • Multi-core computing
  • Parallel algorithm
  • Parallel computing
  • Parallel prefix sum

Fingerprint

Dive into the research topics of 'A novel parallel prefix sum algorithm and its implementation on multi-core platforms'. Together they form a unique fingerprint.

Cite this