Sparse matrix computations are notorious for poor performance. Much of this stems from the fact that there is little or no data reuse, which prevents the use of BLAS kernels; see, for example, the tests on vector architectures in [4].
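To make the lack of data reuse concrete, consider a minimal sketch of a sparse matrix-vector product. It assumes compressed sparse row (CSR) storage, which is chosen here only for illustration; the names `csr_matvec`, `values`, `col_idx`, and `row_ptr` are hypothetical and do not come from the text above.

```python
def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A x for a matrix A stored in CSR form (assumed format)."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # Each stored nonzero is loaded once and used in one multiply-add,
        # so there is no reuse of matrix data to amortize the memory traffic.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # x is addressed indirectly through col_idx, so its access
            # pattern is irregular and depends on the sparsity structure.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Small example: A = [[4, 0, 1],
#                     [0, 3, 0],
#                     [2, 0, 5]]
values = [4.0, 1.0, 3.0, 2.0, 5.0]   # nonzeros, row by row
col_idx = [0, 2, 1, 0, 2]            # column index of each nonzero
row_ptr = [0, 2, 3, 5]               # start of each row in values
x = [1.0, 2.0, 3.0]

y = csr_matvec(values, col_idx, row_ptr, x)
print(y)
```

In this loop every matrix entry contributes only two floating-point operations per load, and the indirect indexing of `x` gives the irregular access that dense kernels do not have to handle.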
Other performance problems stem from parallel aspects of the solution methods.
Here are a few of the approaches taken to alleviate these performance problems.