Key points are not available for this paper at this time.
In this paper, we consider the efficient computation of all eigenvalues and eigenvectors of Symmetric Hierarchically Semiseparable (HSS) matrices, which have an inherent structure: the off-diagonal blocks have hierarchical bases and have low ranks. State-of-the-art is a divide-conquer algorithm, SuperDC, to compute eigenvectors and eigenvalues in an order of magnitude faster than popular and commercial solvers. We improve on the state-of-the-art and present novel shared- and distributed-memory parallel algorithms for computing eigenvalues of HSS matrices. We take advantage of the recursive divide-conquer approach employed in SuperDC to parallelize the eigenvalue computation, present a span and available parallelism analysis, and optimize the original SuperDC algorithm to reduce the storage requirement from O(N2) to O(N) in the case of banded matrices. We do a systematic evaluation with different parallel programming paradigms, scheduling policies, and scalability configurations. We observe that in the shared-memory parallel implementations, OpenMP implementations perform better than Cilk versions, work stealing offers no significant performance advantage, and in the distributed-memory implementations, asynchronous communication yields better performance than implementation with barrier-based communication. We find the optimal input decomposition at which the parallel implementations provide the best speedup. For input symmetric matrices of different sparsity structures and sizes ranging from 4096 to 256k rows, on up to 512 cores, the implementations scale well and show a significant speedup of up to 147 × compared to the available SuperDC implementation.
Josyula et al. (Thu,) studied this question.