A comparison of software and hardware synchronization mechanisms for distributed shared memory multiprocessors

Publication Type technical report
School or College College of Engineering
Department School of Computing
Creator Carter, John B.
Other Author Kuo, Chen-Chi; Kuramkote, Ravindra
Title A comparison of software and hardware synchronization mechanisms for distributed shared memory multiprocessors
Date 1996
Description Efficient synchronization is an essential component of parallel computing. The designers of traditional multiprocessors have included hardware support only for simple operations such as compare-and-swap and load-linked/store-conditional, while high-level synchronization primitives such as locks, barriers, and condition variables have been implemented in software [9,14,15]. With the advent of directory-based distributed shared memory (DSM) multiprocessors with significant flexibility in their cache controllers [7,12,17], it is worth considering whether this flexibility should be used to support higher-level synchronization primitives in hardware. In particular, as part of maintaining data consistency, these architectures maintain lists of processors with a copy of a given cache line, which is most of the hardware needed to implement distributed locks. We studied two software and four hardware implementations of locks and found that hardware implementations can reduce lock acquire and release times by 25-94% compared to well-tuned software locks. In terms of macrobenchmark performance, hardware locks reduce application running times by up to 75% on a synthetic benchmark with heavy lock contention and by 3-6% on a suite of SPLASH-2 benchmarks. In addition, emerging cache coherence protocols promise to increase the time spent synchronizing relative to the time spent accessing shared data, and our study shows that hardware locks can reduce SPLASH-2 execution times by up to 10-13% if the time spent accessing shared data is small. Although the overall performance impact of hardware lock mechanisms varies tremendously depending on the application, the added hardware complexity on a flexible architecture like FLASH [12] or Avalanche [7] is negligible, and thus hardware support for high-level synchronization operations should be provided.
Type Text
Publisher University of Utah
Subject Hardware locks
Subject LCSH Parallel programming (Computer science); Synchronization; Synchronous circuits
Language eng
Bibliographic Citation Carter, J. B., Kuo, C.-C., & Kuramkote, R. (1996). A comparison of software and hardware synchronization mechanisms for distributed shared memory multiprocessors. 1-24. UUCS-96-011.
Series University of Utah Computer Science Technical Report
Relation is Part of ARPANET
Rights Management ©University of Utah
Format Medium application/pdf
Format Extent 9,113,743 bytes
Identifier ir-main,16231
ARK ark:/87278/s6223c1z
Setname ir_uspace
ID 703945
Reference URL https://collections.lib.utah.edu/ark:/87278/s6223c1z