This section demonstrates how sequential LAPACK-based programs are parallelized and converted to ScaLAPACK.
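As with the BLAS conversion, it is relatively simple to translate the serial version of an LAPACK call into its parallel equivalent. Translating LAPACK calls to ScaLAPACK calls primarily consists of the following steps: a P is inserted in front of the routine name; each reference to a matrix operand A( I, J ) is split into the three arguments A, I, J; and each leading dimension argument LDA is replaced by the corresponding array descriptor DESCA.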
As an example of this translation process, let us consider the parallelization of the LAPACK driver routine DGESV, which solves a general system of linear equations. The calling sequences of DGESV and its ScaLAPACK equivalent, PDGESV, are compared below.
CALL DGESV( N, NRHS, A( I, J ), LDA, IPIV, B( I, 1 ), LDB, INFO )
CALL PDGESV( N, NRHS, A, I, J, DESCA, IPIV, B, I, 1, DESCB, INFO )
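The two calls above differ in three mechanical ways: the routine name gains a P prefix, each operand of the form A( I, J ) becomes the three arguments A, I, J, and each leading dimension LDx becomes a descriptor DESCx. The following Python sketch applies exactly those three textual rewrites to a single CALL statement. The function name lapack_to_scalapack is hypothetical and the regular expressions only handle simple calls like the one shown. This is an illustration of the renaming pattern, not a conversion tool: the descriptors DESCA and DESCB still have to be created and the matrices distributed before the parallel call can run.

```python
import re


def lapack_to_scalapack(call):
    """Apply the three textual rewriting steps to one LAPACK CALL statement.

    Illustrative only: handles operands written as NAME( row, col ) and
    leading dimensions written as LD<letter>.
    """
    # Step 1: insert a P in front of the routine name.
    call = re.sub(r"\bCALL\s+(\w+)", r"CALL P\1", call, count=1)

    # Step 2: split each two-index operand A( I, J ) into A, I, J.
    # The pattern requires exactly two comma-separated indices, so the
    # routine's own argument list (which has more) is not matched.
    call = re.sub(
        r"\b([A-Z]\w*)\(\s*([^(),]+?)\s*,\s*([^(),]+?)\s*\)",
        r"\1, \2, \3",
        call,
    )

    # Step 3: replace each leading dimension LDx by the descriptor DESCx.
    call = re.sub(r"\bLD([A-Z])\b", r"DESC\1", call)
    return call


serial = "CALL DGESV( N, NRHS, A( I, J ), LDA, IPIV, B( I, 1 ), LDB, INFO )"
print(lapack_to_scalapack(serial))
```

Applied to the DGESV line, the sketch reproduces the PDGESV line shown above.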
For a more complete example, let us consider parallelizing a serial LU factorization code, as demonstrated in sections B.2.1 and B.2.2. Note that the parallel routine assumes the existence of the auxiliary routines PDGETF2 (unblocked LU) and PDLASWP (parallel swap routine) in addition to the PBLAS. With this in mind, the serial and parallel versions are very similar, since most of the details of the parallel implementation, such as communication and synchronization, are hidden at lower levels of the software.
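To make the shared structure concrete, here is a minimal serial sketch of a right-looking blocked LU factorization with partial pivoting, written in Python on plain lists. It is not the code from sections B.2.1 or B.2.2. The function name block_lu and the 0-based pivot convention are assumptions made for illustration. The comments mark which step plays the role of the unblocked panel factorization (DGETF2 / PDGETF2), the row swaps (DLASWP / PDLASWP), and the Level 3 BLAS or PBLAS operations (triangular solve and matrix multiply).

```python
def block_lu(a, nb):
    """Factor the square matrix a (list of row lists) in place as P*A = L*U.

    nb is the block (panel) width. Returns ipiv, where row k was swapped
    with row ipiv[k] (0-based). On exit a holds U on and above the
    diagonal and the multipliers of unit lower triangular L below it.
    """
    n = len(a)
    ipiv = list(range(n))
    for j in range(0, n, nb):
        jb = min(nb, n - j)  # width of the current panel

        # Panel factorization: unblocked LU on columns j .. j+jb-1
        # (the role of DGETF2 / PDGETF2).
        for k in range(j, j + jb):
            p = max(range(k, n), key=lambda r: abs(a[r][k]))
            ipiv[k] = p
            if a[p][k] == 0.0:
                raise ZeroDivisionError("matrix is singular")
            if p != k:
                # Swap whole rows, so columns to the left and right of
                # the panel are permuted too (the role of DLASWP / PDLASWP).
                a[k], a[p] = a[p], a[k]
            for i in range(k + 1, n):
                a[i][k] /= a[k][k]
                for c in range(k + 1, j + jb):
                    a[i][c] -= a[i][k] * a[k][c]

        # Triangular solve: U12 = inv(L11) * A12 (the role of DTRSM / PDTRSM).
        for k in range(j, j + jb):
            for i in range(k + 1, j + jb):
                for c in range(j + jb, n):
                    a[i][c] -= a[i][k] * a[k][c]

        # Trailing update: A22 = A22 - L21 * U12 (the role of DGEMM / PDGEMM).
        for i in range(j + jb, n):
            for c in range(j + jb, n):
                s = 0.0
                for k in range(j, j + jb):
                    s += a[i][k] * a[k][c]
                a[i][c] -= s
    return ipiv


lu = [[2.0, 1.0, 1.0], [4.0, 3.0, 3.0], [8.0, 7.0, 9.0]]
ipiv = block_lu(lu, 2)
print(ipiv)
```

In a parallel version with this structure, each commented step would become a single call to the corresponding distributed routine, which is why the two codes look so alike at this level.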