The recent emergence of compute-intensive stream processors such as the Cell Broadband Engine, Stanford's Merrimac, and ClearSpeed's CSX600 has made them attractive platforms for scientific high-performance computing. Unstructured mesh and graph applications are an important class of numerical algorithms in scientific computing, and they are particularly challenging for stream architectures. These codes have irregular structures in which nodes have a variable number of neighbors, resulting in irregular memory access patterns and irregular control. We study four representative sub-classes of irregular algorithms: finite-element and finite-volume methods for modeling physical systems, direct methods for n-body problems, and computations involving sparse algebra. We propose a framework for representing the diverse characteristics of these algorithms in the context of the unique properties of stream architectures, and demonstrate it using one representative application from each sub-class. We then develop techniques for mapping the applications onto a stream processor, placing emphasis on data localization and parallelization. Our simulations show that efficient stream hardware with restricted control abilities can effectively run challenging irregular applications: for example, a finite-element method and a molecular dynamics code sustain 69 GFLOP/s and 46 GFLOP/s (64-bit), respectively, on a single chip that measures 12 mm on a side and consumes less than 70 W in 90 nm technology.
Erez et al. studied this question.