$(F77) $(F77NO_OPTFLAGS) -c $*.f
to:
$(F77) $(F77NO_OPTFLAGS) -fno-globals -fno-f90 -fugly-complex -w -c $*.f
Flags necessary to compile the BLACS tester with Intel's Fortran compiler
If you compile the BLACS tester with Intel's Fortran compiler, the tester will
hang while determining epsilon unless you add -fp_port to F77NO_OPTFLAGS
in your Bmake.inc file.
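For orientation, here is a minimal C sketch of the kind of loop commonly used to determine machine epsilon. It is not the BLACS tester's routine, and the function name machine_epsilon is hypothetical. The volatile store forces each intermediate sum to be rounded to double precision, which is assumed here (not confirmed by this entry) to be the behavior the flag restores.

```c
#include <float.h>   /* DBL_EPSILON, handy for checking the result */

/* Hypothetical sketch: halve eps until 1 + eps/2 is no longer greater than 1. */
double machine_epsilon(void)
{
    double eps = 1.0;
    /* volatile forces a store to memory, so the sum is rounded to double */
    volatile double sum = 1.0 + eps / 2.0;

    while (sum > 1.0) {
        eps /= 2.0;
        sum = 1.0 + eps / 2.0;
    }
    return eps;
}
```

On a system with IEEE 754 double precision, the returned value should equal DBL_EPSILON.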
MPIBLACS SECTION:
Error in most MPI implementations of MPI_Abort.
This error was last confirmed in MPICH 1.0.13 and MPICH 1.1. MPI_Abort
does not kill any other processes at all, but seems to behave much like
a local call to exit(). This causes the BLACS tester to
hang on the BLACS_ABORT test in the auxiliary test. Here is plain
MPI code demonstrating the error:
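#include "mpi.h"

int main(int narg, char **args)
{
   int i, Iam, Np;

   MPI_Init(&narg, &args);
   MPI_Comm_size(MPI_COMM_WORLD, &Np);
   MPI_Comm_rank(MPI_COMM_WORLD, &Iam);
   /* The last process aborts; all other processes should be killed as well */
   if (Iam == Np-1) MPI_Abort(MPI_COMM_WORLD, -2);
   while(1);   /* the remaining processes spin here forever if MPI_Abort fails */
   MPI_Finalize();
}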
Problems compiling dwalltime00
There is an undiagnosed problem that causes some users' dwalltime00
routine to return bad values. It appears likely that there is a problem with
macro name overruns, but errors in cpp or the code have not been ruled out.
If you get bad return values from dwalltime00, overwrite
BLACS/SRC/MPI/dwalltime00_.c with:
#include "Bdef.h"
#if (INTFACE == C_CALL)
double Cdwalltime00(void)
#else
F_DOUBLE_FUNC dwalltime00_(void)
#endif
{
   return(MPI_Wtime());
}
Sun f77 and gcc compiler mismatch.
Users of Sun's f77 compilers may need to throw the -f
flag to force 8-byte alignment of double precision scalars, which
the gcc-compiled BLACS expect. Therefore, add -f to the
NOPT macro in SLmake.inc and to
F77NO_OPTFLAGS in Bmake.inc.
NOTE: this is an old entry, and may no longer be needed.
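To make the alignment requirement concrete, the following minimal C sketch tests whether an address lies on an 8-byte boundary. The helper is_aligned8 is hypothetical and is not part of BLACS or ScaLAPACK.

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if p lies on an 8-byte boundary, 0 otherwise. */
int is_aligned8(const void *p)
{
    return ((uintptr_t)p % 8u) == 0;
}
```

A double precision scalar passed from Fortran at an address for which this check fails is the situation the -f flag is meant to prevent.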
T3E MPI error in handling zero-length segments
mpt.1.2.0.0.6beta couldn't handle 0-length segments used with
MPI_Type_indexed. To work around this problem, throw the
T3ETrError flag in the Bmake.inc of your patched MPIBLACS
(as shown in the example Bmake.T3E supplied with the patch).
NOTE: this is an old entry, and may no longer be needed.
T3E MPI error in handling mixed types
mpt.1.2.0.0.6beta couldn't handle certain reductions where you mix types
with an MPI data type. To work around this problem, apply the patch and throw
the T3EReductErr flag in your Bmake.inc
(as shown in the example Bmake.T3E supplied with the patch).
NOTE: this is an old entry, and may no longer be needed.
Include file scoping problem.
This appears to be a compiler problem with including files within the
braces of a routine: system files must be included before the scope of the
routine begins. Therefore, in BLACS/SRC/PVM/blacs_setup_.c, move the line
#include "string.h"
to the second line of the file (i.e., directly after #include "Bdef.h").
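The intended layout can be sketched as follows. This is a minimal illustration, not the contents of blacs_setup_.c: the function name_length is hypothetical, and the #include "Bdef.h" line that comes first in the real file is omitted so that the sketch compiles on its own.

```c
#include <assert.h>
#include <string.h>   /* system header at file scope, before any routine begins */

/* Hypothetical routine: string.h is already visible when its body starts. */
size_t name_length(const char *name)
{
    assert(name != NULL);
    return strlen(name);
}
```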
.SUFFIXES: .o .C
.c.C:
$(CC) -c $(CCFLAGS) -o C$*.o $(BLACSDEFS) -DCallFromC $<
mv C$*.o $*.C
SGI error workaround:
.SUFFIXES: .o .C
.c.C:
ln -s $*.c C$*.c
$(CC) -c $(CCFLAGS) $(BLACSDEFS) -DCallFromC C$*.c
mv C$*.o $*.C
rm -f C$*.c
program tst
integer k, iam, Np, ictxt, i, j
call mpc_environ(Np, Iam);
k = Iam + 100
print*,'start'
if (iam.eq.1) then
call mp_send(Iam, 4, 0, 2, i)
call mp_send(k, 4, 0, 3, j)
print*,mp_status(i)
print*,mp_status(j)
else if (iam .eq. 0) then
call mp_brecv(k, 4, 1, 3, j)
call mp_brecv(k, 4, 1, 2, j)
end if
print*,'done'
stop
end
When this is run, the output is:
xtst2 -procs 2
start
start
4
4
done
So both sends complete, but the receives still hang.
long iaddr;
iaddr = (long) A;
/*
* If address is on a 8 byte boundary, and lda and m are evenly divisible by 2,
* can use double sized pointers for faster packing
*/
if ( !(iaddr % 8) && !(lda % 2) && !(m % 2) )
mvcopy8(m/2, n, (double *) A, lda/2, (double *) buff);
/*
* Otherwise, must use 4 byte packing
*/
else
You also need to delete basically the same lines from BLACS/SRC/NX/INTERNAL/vmcopy4.c:
long iaddr;
iaddr = (long) A;
/*
* If address is on a 8 byte boundary, and lda and m are evenly divisible by 2,
* can use double sized pointers for faster packing
*/
if ( !(iaddr % 8) && !(lda % 2) && !(m % 2) )
vmcopy8(m/2, n, (double *) A, lda/2, (double *) buff);
/*
* Otherwise, must use 4 byte packing
*/
else
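For orientation, once the 8-byte fast path is deleted, only element-by-element packing remains. The following is a minimal sketch of such a loop, assuming column-major storage and a 4-byte int; the function name pack_ints and its signature are hypothetical and are not taken from the BLACS source.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical sketch: copy an m-by-n column-major integer matrix A with
 * leading dimension lda into the contiguous buffer buff, one int at a time. */
void pack_ints(int m, int n, const int *A, int lda, int *buff)
{
    assert(lda >= m);   /* each column must fit within its stride */

    for (int j = 0; j < n; j++) {
        const int *col = A + (size_t)j * (size_t)lda;   /* start of column j */
        for (int i = 0; i < m; i++)
            *buff++ = col[i];
    }
}
```

Because every element is accessed through an int pointer, this loop places no requirement on 8-byte alignment of A, or on lda or m being even.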