Event-based analysis of a L2 prefetch related parallel nonscaling on intel dual core processor

Nan Zhang*

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding | Conference Proceeding | peer-review

1 Citation (Scopus)

Abstract

Performance degradation is a common problem in parallel computing: when a workload is parallelised, the speedup gained is often notably lower than the factor predicted by Amdahl's law. This work examines such a case, in which image pixels are summed along the rows in parallel by two threads, yet no speedup over the sequential summation is gained on images whose sizes exceed the capacity of the processor's L2 cache. Counts collected by Intel VTune™ Performance Analyser on relevant performance events show that this nonscaling problem is not caused by well-understood pitfalls such as cache contention, bus overloading and unbalanced workload. Instead, during the parallel summation the L2 prefetchers of the processor are less effective at bringing in data before they are needed. Consequently, considerably more cache lines are brought into the L2 cache by demand requests originating from the L1 data caches, and the parallel computation pays the penalty for such accesses.
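The workload described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the author's code: the 8-bit row-major pixel layout, the even split of rows between the two threads, and the function names `sum_rows`, `sum_rows_sequential` and `sum_rows_two_threads` are all assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Sum the pixels of each row in [row_begin, row_end) of a row-major image.
// Assumption: 8-bit pixels stored contiguously, one row after another.
void sum_rows(const std::vector<std::uint8_t>& pixels, std::size_t width,
              std::size_t row_begin, std::size_t row_end,
              std::vector<std::uint64_t>& row_sums) {
    for (std::size_t r = row_begin; r < row_end; ++r) {
        const std::uint8_t* row = pixels.data() + r * width;
        std::uint64_t acc = 0;
        for (std::size_t c = 0; c < width; ++c) {
            acc += row[c];  // sequential walk along the row
        }
        row_sums[r] = acc;
    }
}

// Baseline: one thread sums every row.
std::vector<std::uint64_t> sum_rows_sequential(
        const std::vector<std::uint8_t>& pixels,
        std::size_t width, std::size_t height) {
    assert(pixels.size() == width * height);
    std::vector<std::uint64_t> row_sums(height, 0);
    sum_rows(pixels, width, 0, height, row_sums);
    return row_sums;
}

// Two-thread version: a worker thread takes the upper half of the rows
// while the calling thread takes the lower half (hypothetical split).
std::vector<std::uint64_t> sum_rows_two_threads(
        const std::vector<std::uint8_t>& pixels,
        std::size_t width, std::size_t height) {
    assert(pixels.size() == width * height);
    std::vector<std::uint64_t> row_sums(height, 0);
    const std::size_t mid = height / 2;
    std::thread worker([&] { sum_rows(pixels, width, 0, mid, row_sums); });
    sum_rows(pixels, width, mid, height, row_sums);
    worker.join();  // the threads write disjoint entries of row_sums
    return row_sums;
}
```

Timing `sum_rows_two_threads` against `sum_rows_sequential` on images both smaller and larger than the L2 cache would be the natural way to probe the behaviour the abstract reports. Whether the same effect appears on a given machine depends on its cache hierarchy and prefetchers.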

Original language: English
Title of host publication: ICCET 2010 - 2010 International Conference on Computer Engineering and Technology, Proceedings
Pages: V225-V229
Publication status: Published - 2010
Event: 2010 2nd International Conference on Computer Engineering and Technology, ICCET 2010 - Chengdu, China
Duration: 16 Apr 2010 to 18 Apr 2010

Publication series

Name: ICCET 2010 - 2010 International Conference on Computer Engineering and Technology, Proceedings
Volume: 2

Conference

Conference: 2010 2nd International Conference on Computer Engineering and Technology, ICCET 2010
Country/Territory: China
City: Chengdu
Period: 16/04/10 to 18/04/10

Keywords

  • Hardware prefetching
  • Parallel nonscaling analysis
  • Parallel performance degradation
  • Parallel scalability
