GETTING FURTHER HELP:
=====================
Questions and comments about this prototype wrapper library should be
e-mailed to scalapack@cs.utk.edu.

GENERAL SOFTWARE OVERVIEW:
==========================
This is release 1.1 of SLHPF, the HPF interface to a subset of ScaLAPACK
routines.  The routines in this wrapper library are divided into three
layers:

   LAYER 1 :: Global HPF routines
   LAYER 2 :: HPF_LOCAL routines
   LAYER 3 :: Strict Fortran77, taking local assumed-size arrays as
              arguments.

Note that all libraries called by these wrappers (ScaLAPACK, BLACS, PBLAS,
BLAS) are classified as LAYER 3.

We provide HPF wrappers for all precisions of:

   GELS : Solving the full-rank linear least squares problem
   GESV : Solving a general system of linear equations using the LU
          factorization
   POSV : Solving an SPD system of linear equations using the Cholesky
          factorization
   SYEV : Solving a symmetric eigenvalue problem (double and single
          precision only)
   GEMM : Matrix-matrix multiply of the form
          C <- ALPHA*op(A)*op(B) + BETA*C
   TRSM : Triangular solve of the form
          B <- ALPHA*op(inv(A))*B  or  B <- ALPHA*B*op(inv(A))

CALLING THE ROUTINES
====================
Users wishing to call ScaLAPACK routines need to USE the module HPF_LAPACK.
The file SLHPF/TESTING/simple_gesv.f demonstrates a simple call to LA_GESV
(an illustrative sketch of such a call also appears in "EXAMPLE CALL
(SKETCH)" at the end of this file).

The calling sequences for the supplied routines are given below.  Default
values for optional parameters are enclosed in []; <type> stands for the
REAL or COMPLEX type matching each supported precision.

   SUBROUTINE LA_GESV(A, B, IPIV, INFO)
      <type>, INTENT(INOUT), DIMENSION(:,:) :: A, B
      INTEGER, OPTIONAL, INTENT(OUT) :: IPIV(:), INFO

   SUBROUTINE LA_POSV(A, B, UPLO, INFO)
      <type>, INTENT(INOUT), DIMENSION(:,:) :: A, B
      CHARACTER(LEN=1), OPTIONAL, INTENT(IN) :: UPLO[='Upper']
      INTEGER, OPTIONAL, INTENT(OUT) :: INFO

   SUBROUTINE LA_GELS(A, B, TRANS, INFO)
      <type>, INTENT(INOUT), DIMENSION(:,:) :: A, B
      CHARACTER(LEN=1), OPTIONAL, INTENT(IN) :: TRANS[='NoTranspose']
      INTEGER, OPTIONAL, INTENT(OUT) :: INFO

   SUBROUTINE LA_SYEV(A, W, Z, UPLO, INFO)
      <type>, INTENT(INOUT), DIMENSION(:,:) :: A
      <type>, INTENT(OUT), DIMENSION(:) :: W
      <type>, OPTIONAL, INTENT(OUT), DIMENSION(:,:) :: Z
      CHARACTER(LEN=1), OPTIONAL, INTENT(IN) :: UPLO[='Upper']
      INTEGER, OPTIONAL, INTENT(OUT) :: INFO

   SUBROUTINE LA_GEMM(A, B, C, TRANSA, TRANSB, ALPHA, BETA)
      <type>, INTENT(IN), DIMENSION(:,:) :: A, B
      <type>, INTENT(INOUT), DIMENSION(:,:) :: C
      CHARACTER(LEN=1), OPTIONAL, INTENT(IN) :: TRANSA[='NoTranspose'],
                                                TRANSB[='NoTranspose']
      <type>, OPTIONAL, INTENT(IN) :: ALPHA[=1.0], BETA[=0.0]

   SUBROUTINE LA_TRSM(A, B, SIDE, UPLO, TRANSA, DIAG, ALPHA)
      <type>, INTENT(IN), DIMENSION(:,:) :: A
      <type>, INTENT(INOUT), DIMENSION(:,:) :: B
      CHARACTER(LEN=1), OPTIONAL, INTENT(IN) :: SIDE[='Left'],
                                                UPLO[='Upper'],
                                                TRANSA[='NoTranspose'],
                                                DIAG[='NonUnit']
      <type>, OPTIONAL, INTENT(IN) :: ALPHA[=1.0]

For more details, see the module file, SLHPF/SRC/HPF_LAPACK_mod.f.

WHAT IS MISSING/WRONG IN THIS RELEASE:
======================================
(1) Many routines accept arguments which may be 2D or 1D arrays (e.g. the
    right-hand-side vector, or series of vectors, X).  In this release,
    double complex is not working for 1D vector arguments due to an unknown
    bug.
(2) Because of previous compiler bugs, we internally use INCLUDE in place
    of the more modern MODULEs.
(3) The code contains a few awkward sections required to work around
    present compiler bugs.
(4) Due to compiler bugs, the library does not run on all HPF compilers.

PLATFORM NOTES:
===============
(1) This release has been tested on Solaris using pghpf Rel 2.4 with pvm,
    on Linux using pghpf Rel 2.4 with pvm, and on an IBM SP2 using pghpf
    Rel 2.4 with mpi.
(2) We have not tested on IBM xlhpf because IBM explicitly states that it
    does not support the !HPF$ INHERIT directive required by our code.
(3) We are working with Compaq to get our wrappers to work with the DEC HPF
    compiler on DEC Alphas.

INSTALLATION:
=============
(1) Locate on your system, or obtain and install, all required libraries:
       -- ScaLAPACK (http://www.netlib.org/scalapack/index.html)
       -- BLACS     (http://www.netlib.org/blacs/index.html)
       -- BLAS      (http://www.netlib.org/blas/index.html)
    ** Note that precompiled ScaLAPACK and BLACS libraries are available
       for many platforms at the above sites, in the subdirectory archives/.
    ** Warning: If you are using pghpf, you may have to recompile the
       libraries with the pghpf compiler rather than use the prebuilt
       versions above.  This is the case for pghpf on Linux.
(2) Copy the SLhpf_make.inc file in the INSTALL directory that is closest
    to your machine, compiler, and message passing layer to the file
    SLHPF/SLhpf_make.inc.  For example, SLhpf_make.PGHPF-PVM-SUN4SOL2 would
    be for the pghpf compiler running PVM on a Solaris system.  Our testing
    was done using pghpf Rel 2.4.
(3) Edit SLHPF/SLhpf_make.inc to be correct for your system.  See "EDITING
    SLHPF_MAKE.INC" below for further details.
(4) If you are using a system which does not have the F77_LOCAL extrinsic,
    type "make convert" in your SLHPF directory after properly setting up
    SLhpf_make.inc.  See "CONVERTING LAYER 3 TO A NEW EXTRINSIC" below for
    further details.
(5) If you are using pghpf with MPI, you need to set the environment
    variable HPF_MPI to point to your MPI library, for example:
       setenv HPF_MPI /usr/local/MPI/mpich/lib/solaris/ch_p4/libmpi.a
    You also need to change EXITVAL=0 to EXITVAL=1 in the SLHPF/SRC/misc.h
    file if your compiler is using the same message passing layer as the
    BLACS you are using.
(6) If you want SLHPF to print a warning when a redistribution is
    performed, make sure that REDIST is set to .TRUE. in the
    SLHPF/SRC/misc.h file.
(7) Type "make all" (library and testers) or "make lib" (just the library)
    in your SLHPF directory.
(8) After you have successfully installed and tested, you can remove the
    object files by typing "make clean" in the SLHPF directory.

EDITING SLHPF_MAKE.INC:
=======================
SLhpf_make.inc is a make include file containing macros used by all
Makefiles under the SLHPF directory.  The provided example SLhpf_make.inc
was used to compile the library using pghpf Rel 2.4 on a Solaris system
using pvm.  Modify the macros to match your system.  The file is roughly
divided into three sections:

SECTION 1
=========
TOPdir :: The top-level directory for the HPF wrappers for ScaLAPACK,
          usually $(HOME)/SLHPF.
INCdir :: The directory where the include and module files required by the
          tester can be found.  Usually, this will be $(TOPdir)/SRC.
PLAT   :: The platform name.  Examples include SUN4SOL2 (Solaris), RS6000,
          HPPA, etc.
SLdir  :: The directory your ScaLAPACK libraries are in.  If you don't have
          ScaLAPACK installed, you must install it before proceeding.
SLlib  :: All of the ScaLAPACK libraries required by this package.
Bdir   :: The directory your BLACS libraries are installed in.  If you
          don't have the BLACS installed, you must install them before
          proceeding.
Blib   :: The BLACS libraries to link to.
MPdir  :: The directory your system message passing layer is installed in.
MPlib  :: Your system message passing layer library.  For workstations,
          this is usually PVM.  For an SP2, it might be your MPL or MPI
          library, for instance.  See "CHOOSING A MESSAGE PASSING LAYER"
          for further details.
setup  :: This macro should be set to mpi if you are not using PVM as your
          message passing layer.  Otherwise, leave it set to pvm.
BLAS   :: Your BLAS library.  If you don't have the BLAS installed on your
          machine, you must install them before proceeding.
SYSlib :: System dependent libraries.
LIBS   :: All libraries needed for linking.
HPFdir :: Directory to create the wrapper library in.
HPFlib :: Name of the wrapper library.

SECTION 2
=========
You only need to modify this section if you are using a compiler other than
pghpf.  pghpf provides the extrinsic F77_LOCAL, which is how all ScaLAPACK
routines are declared.  On DEC's f90, this should instead be HPF_LOCAL.
Other compilers might use F90_LOCAL, for instance.

CONVERT_FROM :: The present extrinsic declaration for all Layer 3 codes.
                The default setting is "extrinsic( f77_local ) ".
CONVERT_TO   :: The correct extrinsic declaration for a Fortran77 routine
                taking assumed-size arrays under your compiler.

SECTION 3
=========
FL1         :: The compiler to use for compiling Layer 1 routines
               (global HPF).
FL1FLAGS    :: The compiler flags for compiling Layer 1 routines.
FL2         :: The compiler to use for compiling Layer 2 routines
               (HPF_LOCAL).
FL2FLAGS    :: The compiler flags for compiling Layer 2 routines.
FL3         :: The compiler to use for compiling Layer 3 routines
               (Fortran77 accepting assumed-size arrays).
FL3FLAGS    :: The compiler flags for compiling Layer 3 routines.
L1LOADER    :: The correct linker/loader to invoke.
L1LOADFLAGS :: The flags for the linker/loader.

CONVERTING LAYER 3 TO A NEW EXTRINSIC:
======================================
!!! USERS OF PGHPF DO NOT NEED TO DO THIS STEP !!!

If you are not using pghpf, you may need to change the extrinsic
declaration of your Layer 3 routines.  Presently they are declared as
"extrinsic( f77_local ) ".  If your system wants a different declaration
for Fortran77 routines accepting assumed-size arrays, you must run the
conversion before compiling.  In your SLhpf_make.inc file, you need to
modify the CONVERT_TO macro to the correct extrinsic.  On DEC's f90, this
would be "extrinsic( hpf_local ) ".  Once CONVERT_TO is correctly set, the
conversion can be done by typing "make convert" in your SLHPF directory.

REDISTRIBUTING INHERITED MATRICES:
==================================
The name of this section is a bit of a misnomer.  By redistribution, we are
not talking about calling the HPF REDISTRIBUTE directive, but rather about
redistribution across extrinsic subroutine boundaries.  Our wrappers are
designed to take the user's input operands, pass them directly to ScaLAPACK
if they are in the proper form, and redistribute them only if they are not.
The logical parameter REDIST (set in SLHPF/SRC/misc.h) determines whether
or not the library will issue a warning to let the user know if
redistribution is occurring.  If this variable is set to .TRUE. (the
default), SLHPF will issue a warning to standard output indicating that a
redistribution is occurring because the arrays were improperly aligned or
not block cyclically distributed.  If REDIST is set to .FALSE., no such
message is issued.

CHOOSING A MESSAGE PASSING LAYER:
=================================
The message passing layer for ScaLAPACK is the BLACS.  The BLACS in turn
usually run on top of another message passing layer.  Choices for message
passing layers include MPI, PVM, NX, etc.  All our testing was done using
PVM or MPI.  In order to use MPI, you need a compiler which uses MPI as its
message passing layer.
This is because MPI does not provide dynamic methods of startup, so you
need to depend on the compiler to start the MPI `machine'.  Therefore, by
default our codes assume you are using PVM.  If you are running on a
dedicated parallel machine such as the SP2 or Paragon, you should be able
to use the native BLACS and message passing layer (MPL and NX,
respectively) or MPI, as desired.  If you use any message passing layer
besides PVM, be sure to set the SLhpf_make.inc macro setup to mpi (see
"EDITING SLHPF_MAKE.INC" for details).

If your compiler is using the same message passing layer as the BLACS
you're running (e.g., your compiler uses MPI as its message passing layer,
and you are using the MPIBLACS), the compiler will probably issue the
message passing layer shutdown call (e.g. MPI_Finalize or pvm_exit) itself,
so you need to tell the wrappers not to do so.  To do this, edit
SLHPF/SRC/misc.h, and change EXITVAL=0 to EXITVAL=1.

RUNNING THE TESTERS:
====================
By default, the executables are built into your SLHPF/TESTING directory.
Their names have the form xYZZZZ, where Y is replaced by the character
indicating data type and precision (s, d, c, or z), and ZZZZ is replaced by
the routine name (gesv, posv, gels, syev, gemm, or trsm).  The executable
xsimple is a simple example program to try before the other testers.  It
also shows in a simple manner how to call a ScaLAPACK routine using SLHPF.

How these executables are run will be system dependent.  If you are using
PVM as your message passing layer, the first step is to start PVM on all
machines.  On our Solaris machine using pghpf, the executables are then
run by:

   xYZZZZ -pghpf -np <# of procs>

For example, to run the single precision version of the GESV tester using
the default data file, you would type:

   xsgesv -pghpf -np 4

Running the same routine under MPI (mpich) can be accomplished by:

   mpirun -np 4 xsgesv

Each routine has its own data file, controlling what testing is done.
Running the default input files should give at least some confidence in the
installation.  If further testing is required, the user may modify the data
files as roughly described in "EDITING THE DATA FILES".  Note that all
default data files require 4 processes to run.

TROUBLESHOOTING:
================
* WARNING: We have an unknown bug with using vectors in double complex
  precision.  Use a matrix declaration for a vector instead, i.e., use

     integer, parameter :: DP=kind(0.0D0)
     complex(DP) :: A(n,1)

  rather than

     integer, parameter :: DP=kind(0.0D0)
     complex(DP) :: A(n)

* MISSING SYMBOLS ON LINK: Check that you are linking to all required
  libraries.  If the missing symbols are system or Fortran library
  routines, one of your libraries may have been compiled with a compiler
  incompatible with your HPF compiler.  If so, recompile that library using
  a compatible compiler.  If you are using the pghpf compiler on a Linux
  system, it is important to recompile all of the Fortran77 code with the
  pghpf compiler.

* MISSING MPI LIBRARY: If you are using pghpf with the -Mmpi option, it
  will need to be told where to find your MPICH installation.  Set the
  environment variable HPF_MPI to point to your MPI library (e.g.,
  setenv HPF_MPI /src/icl/MPI/mpich/lib/solaris/ch_p4/libmpi.a).

* TESTER DIES IMMEDIATELY AFTER STARTUP WITH MESSAGE
  "Must start PVM before calling ScaLAPACK routines":
  You must start PVM by typing "pvm" on one machine, and adding all
  machines that will have processes on them.
  For instance, if you are running on two machines named BOB1 and BOB2, on
  BOB1 you type:

     pvm
     pvm> add bob2
     pvm> quit

  See the PVM home page for more information:
  http://www.epm.ornl.gov/pvm/pvm_home.html.

* TESTER DIES IMMEDIATELY AFTER STARTUP:
  If you are running the default data files, check to make sure you have
  allocated at least 4 processes.  Otherwise, make sure you allocate
  however many processes are called for in your data file.

If you have unresolved problems, feel free to contact the developers at:
scalapack@cs.utk.edu.

EDITING THE DATA FILES:
=======================
This section gives a very rough description of the input data files for the
wrapper testers.  Users familiar with ScaLAPACK descriptors and calling
sequences should find these data files relatively easy to understand.  New
users may not be able to figure them out without some study of ScaLAPACK in
general.  Please note that since GEMM lacks an INFO parameter, the tester
will be killed by GEMM if the user enters an illegal test case.

The supplied data files all have the same general form.  First you indicate
the number of process grids you wish to test, and their dimensions.  The
tested routine will be called on submatrices (hereafter called operand
matrices).  The next thing to be described, however, is the dimensions
(M, N), distribution blocking factors (MB, NB), and process grid starting
point (RSRC, CSRC) of the original storage arrays from which the operand
matrices will be extracted.  You then indicate the number of operand
matrices to test, their sizes (M, N, possibly K), and their starting
indices in the base arrays (I, J).  Also described here are routine
dependent quantities such as TRANSA, TRANSB, UPLO, etc.

Here is the data file for gesv:

   2             # of grids
   2  1          Values of P
   2  4          Values of Q
   2             # of matrices
   50 70         Values of M_A
   50 70         Values of N_A
   5  2          Values of MB_A
   5  2          Values of NB_A
   0  0          Values of RSRC_A
   0  0          Values of CSRC_A
   50 70         Values of M_B
   40 60         Values of N_B
   5  2          Values of MB_B
   5  3          Values of NB_B
   0  0          Values of RSRC_B
   0  0          Values of CSRC_B
   4             # of ops
   10 10 35  9   Values of op(M)
   8  8  6  10   Values of op(N)
   1  3  6  6    Values of IA
   1  3  6  6    Values of JA
   1  3  6  6    Values of IB
   1  4  6  1    Values of JB
   1.0E0         THRESHOLD

Note that the data file is organized in a column-oriented fashion for the
# of grids, # of matrices, and # of ops, with each following column
denoting a setup you desire.  For example, this data file will run a 2x2
and a 1x4 process grid for all of the following options.  The values of
M_A, N_A, MB_A, NB_A, RSRC_A, CSRC_A, M_B, N_B, MB_B, NB_B, RSRC_B, CSRC_B
correspond to the ScaLAPACK descriptor entries.  If you are unfamiliar with
these, please refer to the ScaLAPACK Users' Guide.  The values of op(M),
op(N), IA, JA, IB, JB correspond to the sizes and starting subsections of
the operand matrices to be tested.
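
EXAMPLE CALL (SKETCH):
======================
The following program is a minimal, illustrative sketch of how a call to
LA_GESV might look, modeled on the interface listed under "CALLING THE
ROUTINES" and on SLHPF/TESTING/simple_gesv.f.  The program name, problem
size, matrix values, and distribution directives below are assumptions
chosen for this sketch, not requirements of the library; consult
simple_gesv.f and SLHPF/SRC/HPF_LAPACK_mod.f for the authoritative usage.
Free-form Fortran source is assumed.

   program sketch_gesv
!  Sketch only: solve A*x = b with the LA_GESV wrapper.
   use HPF_LAPACK
   implicit none
   integer, parameter :: n = 8, DP = kind(0.0D0)
   real(DP) :: A(n,n), B(n,1)
   integer  :: info, i, j
!  Assumed block-cyclic distribution; other distributions should also be
!  accepted, but may trigger a redistribution warning (see "REDISTRIBUTING
!  INHERITED MATRICES" above).
!HPF$ distribute A(cyclic(2), cyclic(2))
!HPF$ distribute B(cyclic(2), *)
!  Build a diagonally dominant test matrix and a simple right hand side.
!  Note that B is declared as an n x 1 matrix rather than a 1D vector
!  (see TROUBLESHOOTING above).
   do j = 1, n
      do i = 1, n
         A(i,j) = 1.0_DP / real(i+j-1, DP)
      end do
      A(j,j) = A(j,j) + real(n, DP)
   end do
   B(:,1) = 1.0_DP
!  IPIV is optional and omitted here; INFO is passed by keyword.
   call LA_GESV(A, B, INFO=info)
   if (info /= 0) then
      print *, 'LA_GESV failed with INFO = ', info
   else
      print *, 'Solution (first column of B):', B(:,1)
   end if
   end program sketch_gesv

On exit, B is overwritten with the solution, as in the underlying ScaLAPACK
routine.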