HPL_pdlaswp01N Broadcast a column panel L and swap the row panel U.


#include "hpl.h"

void HPL_pdlaswp01N( HPL_T_panel * PBCST, int * IFLAG, HPL_T_panel * PANEL, const int NN );


HPL_pdlaswp01N applies the NB row interchanges to NN columns of the trailing submatrix and broadcasts a column panel. A "spread then roll" algorithm performs the swap and the broadcast of the row panel U at once, resulting in a minimal communication volume and a very good use of the network connectivity, if available.

With P process rows and assuming bi-directional links, the running time of this function can be approximated by:

    (log_2(P) + (P-1)) * lat + K * NB * LocQ(N) / bdwth

where NB is the number of rows of the row panel U, N is the global number of columns being updated, and lat and bdwth are the latency and bandwidth of the network for double precision real words. K is a constant in (2,3] that depends on the bandwidth achieved during a simultaneous message exchange between two processes. An empirical optimistic value of K is typically 2.4.


PBCST   (local input/output)          HPL_T_panel *
        On entry,  PBCST  points to the data structure containing the
        information about the panel to be broadcast.
IFLAG   (local input/output)          int *
        On entry, IFLAG  indicates  whether or not  the broadcast has
        already been completed.  If not,  probing will occur, and the
        outcome will be contained in IFLAG on exit.
PANEL   (local input/output)          HPL_T_panel *
        On entry,  PANEL  points to the data structure containing the
        panel information.
NN      (local input)                 const int
        On entry, NN specifies  the  local  number  of columns of the
        trailing  submatrix  to  be swapped and broadcast starting at
        the current position. NN must be at least zero.

See Also

HPL_pdgesv, HPL_pdgesvK2, HPL_pdupdateNN, HPL_pdupdateTN, HPL_pipid, HPL_plindx1, HPL_plindx10, HPL_spreadN, HPL_equil, HPL_rollN, HPL_dlaswp00N, HPL_dlaswp01N, HPL_dlaswp06N.