Large sparse linear systems occur in many scientific and engineering applications encountered in military and civilian domains. Such systems are typically solved using either iterative or direct methods. Although highly parallel formulations of dense matrix factorization are well known, it had been a challenge to implement efficient sparse linear system solvers using direct methods, even on moderately parallel computers. This talk will present our recent work on a highly parallel sparse Cholesky factorization algorithm that substantially improves the state of the art in parallel direct solution of sparse linear systems---both in terms of scalability and overall performance. Experiments have shown that this algorithm can easily speedup Cholesky factorization by a factor of at least a few hundred up to 1024 processors, and achieve levels of performance that were unheard of and unimaginable for this problem until very recently.
Fast and accurate graph partitioning algorithms are needed for direct as well iterative parallel sparse solvers. This talk will also present our work on new a multilevel graph partitioning scheme that consistently outperforms the spectral partitioning schemes (which is considered the state-of-the-art) in terms of both cut size and run time. We also used our graph partitioning scheme to compute fill reducing orderings for sparse matrices. Surprisingly, our scheme substantially outperforms the multiple minimum degree algorithm (MMD), which is the most commonly used method for computing fill reducing orderings of a sparse matrix.