
Local Storage Scheme for Narrow Band Matrices

 

Let us first discuss how to distribute a narrow band matrix A over a one-dimensional process grid using a block-column distribution. We assume that the coefficient band matrix A is of size tex2html_wrap_inline15332 (tex2html_wrap_inline15334), with bandwidth BW=2 if A is symmetric positive definite, and with BWL=2 and BWU=2 if A is nonsymmetric. The matrix A is represented by the following.


displaymath15326

If the matrix A is a nonsymmetric band matrix, the user may choose to perform the factorization with partial pivoting or with no pivoting (PxGBTRF or PxDBTRF, respectively). Both strategies assume a block-column distribution of the coefficient matrix, but additional storage is required for fill-in if partial pivoting is selected. First, let us assume that we have selected no pivoting, and we distribute this matrix onto a tex2html_wrap_inline14534 process grid with a block size of tex2html_wrap_inline15352. The processes would contain the local arrays found in figure 4.9. Figure 4.9 also illustrates that the leading dimension of the local arrays containing the coefficient matrix must be at least BWL+1+BWU for the non-pivoting narrow band linear solver.

Figure 4.9: Mapping of local arrays for nonsymmetric band matrix A (no pivoting)
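As a rough illustration of the no-pivoting layout, the following Python sketch packs a dense band matrix into an array with BWL+1+BWU rows and then cuts it into consecutive blocks of columns, one per process. The row mapping used here (entry A(i,j) stored in row BWU+i-j of column j, 0-based) is the usual LAPACK band convention and is an assumption, as are the function names and the 6-column example; unused positions are shown as None.

```python
def pack_band_no_pivot(a, bwl, bwu):
    """Pack a dense n x n band matrix into (bwl+1+bwu) x n band storage.

    Assumed convention (LAPACK-style, 0-based): A(i, j) is stored in
    row bwu + i - j of column j. Positions with no matrix entry stay None.
    """
    n = len(a)
    lld = bwl + 1 + bwu  # minimum leading dimension without pivoting
    ab = [[None] * n for _ in range(lld)]
    for j in range(n):
        for i in range(max(0, j - bwu), min(n, j + bwl + 1)):
            ab[bwu + i - j][j] = a[i][j]
    return ab


def split_block_columns(ab, nb):
    """Cut band storage into consecutive blocks of nb columns (one per process)."""
    n = len(ab[0])
    return [[row[start:start + nb] for row in ab] for start in range(0, n, nb)]


# Hypothetical example: 6 x 6 matrix, BWL = BWU = 2, block size 2.
n, bwl, bwu = 6, 2, 2
a = [[10 * (i + 1) + (j + 1) if -bwu <= i - j <= bwl else 0 for j in range(n)]
     for i in range(n)]
local_arrays = split_block_columns(pack_band_no_pivot(a, bwl, bwu), 2)
for p, local in enumerate(local_arrays):
    print("process", p)
    for row in local:
        print("   ", row)
```

Each local array has BWL+1+BWU rows, which is where the lower bound on the leading dimension comes from.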

If, however, we select partial pivoting and distribute this same matrix onto a tex2html_wrap_inline14534 process grid with a block size of tex2html_wrap_inline15430, the processes would contain the local arrays found in figure 4.10. The amount of additional storage required for fill-in is represented by F in the figure and is equal to the sum of the lower bandwidth (number of subdiagonals), BWL, and the upper bandwidth (number of superdiagonals), BWU. In this example, BWL=2 and BWU=2. Refer to the leading comments of the routine PxGBTRF for further details. Figure 4.10 also illustrates that the leading dimension of the local arrays containing the coefficient matrix must be at least 2*(BWL+BWU)+1 for the partial pivoting narrow band linear solver.

Figure 4.10: Mapping of local arrays for nonsymmetric band matrix A (partial pivoting)
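The two leading-dimension bounds for the nonsymmetric case are related by simple arithmetic: the partial pivoting bound equals the fill-in storage F = BWL+BWU plus the BWL+1+BWU rows needed for the band itself. A minimal Python sketch of that bookkeeping follows; the function names are hypothetical and are not part of ScaLAPACK.

```python
def fill_in_rows(bwl, bwu):
    """Additional storage F required for fill-in with partial pivoting."""
    return bwl + bwu


def min_lld_no_pivoting(bwl, bwu):
    """Minimum leading dimension for the non-pivoting solver."""
    return bwl + 1 + bwu


def min_lld_partial_pivoting(bwl, bwu):
    """Minimum leading dimension for the partial pivoting solver."""
    return 2 * (bwl + bwu) + 1


# Values from the example in the text: BWL = 2, BWU = 2.
bwl, bwu = 2, 2
print(fill_in_rows(bwl, bwu))              # 4
print(min_lld_no_pivoting(bwl, bwu))       # 5
print(min_lld_partial_pivoting(bwl, bwu))  # 9
```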

Let us now assume that the matrix A is symmetric positive definite band with BW=2, and we distribute this matrix assuming lower triangular storage (UPLO='L') onto a tex2html_wrap_inline14534 process grid with a block size of tex2html_wrap_inline15430. The processes would contain the local arrays found in figure 4.11. For example, we would then call the routine PxPBTRF with BW=2 to perform the factorization.

Figure 4.11: Mapping of local arrays for symmetric positive definite band matrix A (UPLO='L')

If we instead distribute this same matrix assuming upper triangular storage (UPLO='U') onto a tex2html_wrap_inline14534 process grid with a block size of tex2html_wrap_inline15430, the processes would contain the local arrays found in figure 4.12.

Figure 4.12: Mapping of local arrays for symmetric positive definite band matrix A (UPLO='U')

Figures 4.11 and 4.12 also illustrate that the leading dimension of the local arrays containing the coefficient matrix must be at least BW+1 for the symmetric positive definite narrow band linear solver.
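To make the BW+1 bound concrete, here is a small Python sketch that packs one triangle of a symmetric band matrix into BW+1 rows. It assumes the usual LAPACK symmetric band convention (0-based: A(i,j) goes to row i-j for UPLO='L' and to row BW+i-j for UPLO='U'); the function name and the 5 x 5 example are hypothetical, and positions with no matrix entry are shown as None.

```python
def pack_spd_band(a, bw, uplo):
    """Pack one triangle of a symmetric band matrix into (bw+1) x n storage.

    Assumed LAPACK-style convention (0-based):
      uplo == "L": A(i, j), j <= i <= j+bw, is stored in row i - j
      uplo == "U": A(i, j), j-bw <= i <= j, is stored in row bw + i - j
    """
    n = len(a)
    ab = [[None] * n for _ in range(bw + 1)]
    for j in range(n):
        if uplo == "L":
            for i in range(j, min(n, j + bw + 1)):
                ab[i - j][j] = a[i][j]
        elif uplo == "U":
            for i in range(max(0, j - bw), j + 1):
                ab[bw + i - j][j] = a[i][j]
        else:
            raise ValueError("uplo must be 'L' or 'U'")
    return ab


# Hypothetical 5 x 5 symmetric band matrix with BW = 2.
n, bw = 5, 2
a = [[10 * (min(i, j) + 1) + (max(i, j) + 1) if abs(i - j) <= bw else 0
      for j in range(n)] for i in range(n)]
for uplo in ("L", "U"):
    print("UPLO =", uplo)
    for row in pack_spd_band(a, bw, uplo):
        print("   ", row)
```

With UPLO='L' the diagonal lands in the first row and the unused positions fall at the end of the lower rows; with UPLO='U' the diagonal lands in the last row and the unused positions fall at the start of the upper rows.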

The tex2html_wrap_inline15263 notation in figures 4.9, 4.10, 4.11, and 4.12 and the F notation in figure 4.10 mark positions of the local array in which the user need not store a value. These storage positions must nevertheless be allocated, and they are overwritten during the computation.

The tex2html_wrap_inline15189 matrix of right-hand-side vectors B (for example, used in PxGBTRS, PxDBTRS, and PxPBTRS) is assumed to be a dense matrix distributed in a block-row manner across the process grid. Thus, consecutive blocks of rows of the matrix B are assigned to successive processes in the process grid, as described in section 4.4.1.
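The block-row assignment of B can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, rows are numbered from 0, and the check that every process holds at most one block (nb * nprocs >= n) is an assumption about the block mapping rather than something stated in this section.

```python
def block_rows(n, nb, nprocs):
    """Return the half-open row range [first, last) of B owned by each process.

    Assumes a one-dimensional block mapping with one block of nb
    consecutive rows per process (assumption: nb * nprocs >= n).
    """
    if nb * nprocs < n:
        raise ValueError("one block per process requires nb * nprocs >= n")
    return [(min(p * nb, n), min((p + 1) * nb, n)) for p in range(nprocs)]


# Hypothetical example: 7 rows, block size 3, 3 processes.
for p, (first, last) in enumerate(block_rows(7, 3, 3)):
    print("process", p, "owns rows", first, "to", last - 1)
```

The last process may own fewer than nb rows, or none at all when n is small relative to nb * nprocs.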



Susan Blackford
Tue May 13 09:21:01 EDT 1997