ScaLAPACK provides two matrix redistribution/copy routines for each data type [107, 49, 106]. These routines provide a truly general copy from any block cyclically distributed (sub)matrix to any other block cyclically distributed (sub)matrix. They are the only routines in the entire ScaLAPACK library that provide inter-context operations. By this we mean that they can take a (sub)matrix in context A (distributed over process grid A) and copy it to a (sub)matrix in context B (distributed over process grid B).
There need be no relation between the two operand (sub)matrices other than their global size and the fact that they are both legal block cyclically distributed (sub)matrices. This means that they may be distributed across different process grids, have different block sizes and different matrix starting points, be contained in distributed matrices of different sizes, etc.
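To make the independence of the two distributions concrete, the following minimal sketch computes which process owns a given global element, and where it sits locally, under two unrelated block cyclic distributions. It uses 0-based indices and assumes the first block lives on process row and column 0; the helper names `owner_and_local` and `locate` are hypothetical and are not ScaLAPACK routines.

```python
def owner_and_local(g, nb, nprocs, src=0):
    """Map a 0-based global index to (process coordinate, 0-based local index)
    for a one-dimensional block cyclic distribution with block size nb."""
    block = g // nb                       # which global block holds index g
    proc = (src + block) % nprocs         # blocks are dealt out round-robin
    local = (block // nprocs) * nb + g % nb
    return proc, local


def locate(i, j, mb, nb, nprow, npcol):
    """Owner (prow, pcol) and local (row, col) of global element (i, j)."""
    prow, li = owner_and_local(i, mb, nprow)
    pcol, lj = owner_and_local(j, nb, npcol)
    return (prow, pcol), (li, lj)


# The same global element under two unrelated distributions:
# 2 x 2 blocks on a 2 x 2 grid, versus 3 x 4 blocks on a 3 x 1 grid.
src = locate(5, 7, mb=2, nb=2, nprow=2, npcol=2)
dst = locate(5, 7, mb=3, nb=4, nprow=3, npcol=1)
print(src)  # ((0, 1), (3, 3))
print(dst)  # ((1, 0), (2, 7))
```

A general copy has to move every element from its `src` location to its `dst` location; nothing requires the two grids or block sizes to be related.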
Because of their generality, these routines may be used for many operations not usually associated with copy routines. For instance, they may be used to take a matrix on one process and distribute it across a process grid, or the reverse. If a supercomputer is grouped into a virtual parallel machine with a workstation, these routines can be used to move the matrix from the workstation to the supercomputer and back. Within ScaLAPACK, they are called to copy matrices from a two-dimensional process grid to a one-dimensional process grid. They can also be used to redistribute matrices so that each component library can work with the distribution that gives it maximal performance. This list of uses is hardly exhaustive, but it gives an idea of the power of a general copy in parallel computing.
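The uses listed above can be imitated in a single-process simulation. The sketch below is only an illustration of the data movement, not of how ScaLAPACK implements it: the functions `distribute`, `gather`, and `redistribute` are hypothetical, process grids are modeled as dictionaries keyed by (row, column) coordinates, and the first block is assumed to live on process (0, 0).

```python
def distribute(a, mb, nb, nprow, npcol):
    """Split a global matrix (list of rows) into per-process local matrices."""
    m, n = len(a), len(a[0])
    local = {}
    for pr in range(nprow):
        rows = [i for i in range(m) if (i // mb) % nprow == pr]
        for pc in range(npcol):
            cols = [j for j in range(n) if (j // nb) % npcol == pc]
            local[(pr, pc)] = [[a[i][j] for j in cols] for i in rows]
    return local


def gather(local, m, n, mb, nb, nprow, npcol):
    """Inverse of distribute: rebuild the m x n global matrix."""
    a = [[None] * n for _ in range(m)]
    for (pr, pc), block in local.items():
        rows = [i for i in range(m) if (i // mb) % nprow == pr]
        cols = [j for j in range(n) if (j // nb) % npcol == pc]
        for li, i in enumerate(rows):
            for lj, j in enumerate(cols):
                a[i][j] = block[li][lj]
    return a


def redistribute(local, m, n, src, dst):
    """Copy between two layouts, each given as (mb, nb, nprow, npcol)."""
    return distribute(gather(local, m, n, *src), *dst)


m, n = 5, 6
a = [[10 * i + j for j in range(n)] for i in range(m)]

one_proc = distribute(a, m, n, 1, 1)           # whole matrix on one process
grid = redistribute(one_proc, m, n, (m, n, 1, 1), (2, 2, 2, 2))
strip = redistribute(grid, m, n, (2, 2, 2, 2), (m, 1, 1, 4))  # 2-D to 1-D grid
back = gather(strip, m, n, m, 1, 1, 4)         # collect onto one process again
```

Here `one_proc` to `grid` models scattering a matrix held by one process, `grid` to `strip` models the two-dimensional to one-dimensional copy, and `back` models the reverse collection. A real redistribution exchanges messages between processes rather than assembling the global matrix in one place.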
The two routine classifications are as follows:
All routines are available in integer, single precision real, double precision real, single precision complex, and double precision complex. In the following sections, we describe only the single precision routines for each data type. Double precision routines are the same as their single precision counterparts, but they have names beginning with PD- instead of PS-, or PZ- instead of PC-.
Note that these routines require an array descriptor of type DESC_(DTYPE_)=1.
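As a reminder of what such a descriptor holds, the sketch below builds the nine-entry array for a dense block cyclically distributed matrix, with the descriptor type in the first entry. The entry order (DTYPE_, CTXT_, M_, N_, MB_, NB_, RSRC_, CSRC_, LLD_) follows the type-1 descriptor layout; the Python function `make_descriptor` is hypothetical, `numroc` is a transcription of the local-count computation performed by the ScaLAPACK tool routine NUMROC, and the smallest legal local leading dimension is assumed.

```python
def numroc(n, nb, iproc, isrcproc, nprocs):
    """Number of rows or columns of a distributed matrix owned by process
    iproc, for n global rows/columns dealt out in blocks of size nb."""
    mydist = (nprocs + iproc - isrcproc) % nprocs  # distance from source process
    nblocks = n // nb                              # number of whole blocks
    count = (nblocks // nprocs) * nb               # blocks every process gets
    extra = nblocks % nprocs                       # leftover whole blocks
    if mydist < extra:
        count += nb                                # one more whole block
    elif mydist == extra:
        count += n % nb                            # the trailing partial block
    return count


def make_descriptor(ictxt, m, n, mb, nb, rsrc, csrc, myrow, nprow):
    """Build a type-1 array descriptor as a Python list (illustrative only)."""
    lld = max(1, numroc(m, mb, myrow, rsrc, nprow))  # assumed: minimal LLD_
    return [1, ictxt, m, n, mb, nb, rsrc, csrc, lld]


# A 5 x 6 matrix in 2 x 2 blocks, seen from process row 0 of a 2-row grid.
desc = make_descriptor(0, 5, 6, 2, 2, 0, 0, myrow=0, nprow=2)
print(desc)  # [1, 0, 5, 6, 2, 2, 0, 0, 3]
```

In an actual program the context handle comes from the BLACS and the descriptor is normally filled in by DESCINIT; the value 0 used for the context here is a placeholder.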