*******************************************************************************
*******************************************************************************
*                               STATUS SECTION                                *
*******************************************************************************
*******************************************************************************
This file last modified on 03/05/2001.
Version 1.1 release of the BLACS on 5/01/97.
The tester for this release of the BLACS is available in the gzipped tarfile
blacstester.tgz
This release of the BLACS has 4 versions:
   mpiblacs.tgz : BLACS using MPI
   mplblacs.tgz : BLACS using IBM SP's MPL library
   nxblacs.tgz  : BLACS using Intel's NX library
   pvmblacs.tgz : BLACS using PVM

It appears that the CMMDBLACS have become obsolete.  If you need this
BLACS version, send mail to blacs@cs.utk.edu.

BLACS errors found: 
   MPI BLACS: errors #8, #9, #13, and #16
BLACS tester errors found: none.

There have been 3 patches for this release.  Patches are cumulative, so you
need only get the newest one, mpiblacs-patch03.tgz.

All g77 users should examine MPI ERROR #14 for instructions on making the
tester compile correctly with g77.

LAM-MPI users should examine MPI ERROR #12.

The rest of this file is divided into 4 sections, one for each BLACS type.

*******************************************************************************
*******************************************************************************
*                              MPIBLACS SECTION                               *
*******************************************************************************
*******************************************************************************

    Please note the MPICH version number in suspected MPICH bugs.  These may
    have been fixed by subsequent releases.

    Be sure you have applied the MPI patch file mentioned above!

=============================== MPI ERROR #1 ==================================
  WHERE: MPICH1.0.13+patch  WHAT: Probable MPICH error  STATUS: FIXED by MPICH1.1
If you apply the big patch to MPICH1.0.13 and run the BLACS tester, it will
fail in the double precision BS/BR tests of the standard input files.  It is
not actually an error in the double precision BS/BR tests: the same failure
happens if you run the integer tests 3 times, for instance.  This appears to
be some sort of resource exhaustion or memory overwrite associated with the
patch.  The problem could possibly be in the BLACS, but they work with unpatched
MPICH1.0.13, all previous MPICH releases, and IBM's MPI.  For right now, the
best solution is probably not to apply the MPICH patch.  LINUX users will
probably need to use MPICH1.0.12 (MPICH1.0.13 will not compile on LINUX
without the patch).  The error message caused by this error is:
>2 - MPI_p2_11509:  p4_error: : 3
>rm_l_2_11511:  p4_error: interrupt SIGINT: 2
>TYPE_COMMIT : Invalid datatype argument
>[2]  Aborting program !
>[2] Aborting program!

=============================== MPI ERROR #2 ==================================
     WHERE: ALL MPICH      WHAT: MPICH ERROR              STATUS: NOT FIXED
In MPICH1.0.13 and MPICH1.1, there appears to be an error in MPICH's MPI_Abort.
It does not kill any other processes at all, but seems to behave pretty much
like calling a local exit().  This will cause the BLACS tester to hang on the
BLACS_ABORT test in the auxiliary test.  Here is a straight MPI code
demonstrating the error:

#include <stdio.h>
#include "mpi.h"

int main(int narg, char **args)
{
   int Iam, Np;

   MPI_Init(&narg, &args);
   MPI_Comm_size(MPI_COMM_WORLD, &Np);
   MPI_Comm_rank(MPI_COMM_WORLD, &Iam);
/*
 * The last process aborts the entire run.  A correct MPI_Abort must also
 * kill the remaining processes, which otherwise spin forever below.
 */
   if (Iam == Np-1) MPI_Abort(MPI_COMM_WORLD, -2);
   while(1);
   MPI_Finalize();
   return 0;
}

=============================== MPI ERROR #3 ==================================
     WHERE: SGI's MPI v2.0 WHAT: SGI MPI ERROR            STATUS: FIXED v3.0
SGI's MPI v2.0 cannot handle repeated usage and freeing of data types.
This error has been fixed in SGI MPI v3.0.  Included below is a small
straight-MPI program that demonstrates the error.  This program fails
in the 984th k-loop iteration, with the following message from MPI (you get
this message if you run the BLACS tester as well):

>Assertion failed: i < 1024, file type_util.c, line 69, pid 4965
>loop 983
>Assertion failed: i < 1024, file type_util.c, line 69, pid 4966

We have successfully used SGI MPI v3.0 and MPICH on this platform.

#include <stdio.h>
#include "mpi.h"
int main(int narg, char **args)
{
   int i, Iam, Np, k;
   MPI_Datatype Dtype;
   MPI_Status stat;

   MPI_Init(&narg, &args);
   MPI_Comm_size(MPI_COMM_WORLD, &Np);
   MPI_Comm_rank(MPI_COMM_WORLD, &Iam);
   fprintf(stdout, "%d: starting test\n", Iam);
/*
 * Repeatedly create, commit, use, and free a datatype.  Each type is
 * freed, yet SGI MPI v2.0 fails once roughly 1024 types have been
 * created (note the i < 1024 assertion in the error message above).
 */
   for (k=0; k != 10000; k++)
   {
      i = 1;
      MPI_Type_vector(1, 1, 1, MPI_INT, &Dtype);
      MPI_Type_commit(&Dtype);
      if (Iam == 0)
      {
         MPI_Send(&Iam, 1, Dtype, 1, 0, MPI_COMM_WORLD);
      }
      else
      {
         MPI_Recv(&i, 1, Dtype, 0, 0, MPI_COMM_WORLD, &stat);
      }
      MPI_Type_free(&Dtype);
      fprintf(stdout, "loop %d\n", k);
   }
   fprintf(stdout, "MPI sanity test passed\n");
   MPI_Finalize();
   return 0;
}

=============================== MPI ERROR #4 ==================================
     WHERE: RS6000                             WHAT: COMPILER PROBLEM 
You must use gcc, not xlc, to compile MPICH1.0.10 on the rs6000.  To configure
MPICH, you need to add -cc=gcc to the configure line (thus my configure line
was: 'configure -device=ch_p4 -arch=rs6000 -cc=gcc').


=============================== MPI ERROR #5 ==================================
     WHERE: SUN4     WHAT: COMPILER MISMATCH     STATUS: FIXED BY COMPILER FLAG
We use gcc to compile the BLACS on the SUN4, and it seems to require that
all double precision data be aligned on an 8-byte boundary.  SUN's f77 defaults
to aligning local double precision scalars on 4-byte boundaries,
potentially causing bus errors.  Use the f77 flag -f when compiling all Fortran
code to force 8-byte alignment.  Therefore, add -f to the NOPT macro in
SLmake.inc and to the F77NO_OPTFLAGS in Bmake.inc.
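
For reference, the relevant macros might then read as follows (illustrative
fragments only; keep whatever other flags your files already set):

In SLmake.inc:
   NOPT           = -f

In Bmake.inc:
   F77NO_OPTFLAGS = -f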

=============================== MPI ERROR #6 ==================================
     WHERE: T3E           WHAT: MPI ERROR       STATUS: FIXED in 1.2.0.0.6beta
CRAY MPI (MPT 1.1.0.2) has an error in handling 0-byte data types.  Here is
some legal MPI code that fails on the T3E:
#include <stdio.h>
#include <mpi.h>

int main(int nargs, char **args)
{
   MPI_Datatype Dt;
   int ierr;

   MPI_Init(&nargs, &args);
   printf("If this routine does not complete, you should set "
          "SYSERRORS = -DZeroByteTypeBug.\n");

/*
 * Creating and committing a 0-byte datatype is legal MPI, but failed
 * under CRAY's MPT 1.1.0.2.
 */
   ierr = MPI_Type_vector(0, 1, 1, MPI_INT, &Dt);
   if (ierr != MPI_SUCCESS)
      printf("MPI_Type_vector returned %d, set SYSERRORS = -DZeroByteTypeBug\n", ierr);
   else ierr = MPI_Type_commit(&Dt);
   if (ierr == MPI_SUCCESS) printf("Leave SYSERRORS blank for this system.\n");
   MPI_Finalize();
   return 0;
}

=============================== MPI ERROR #7 ==================================
     WHERE: T3E           WHAT: MPI ERROR       STATUS: FIXED in 1.2.0.0.6beta
The CRAY MPI (MPT 1.1.0.2) has a strange error where it can't correctly handle
some data types if the communicator used to do the communication is not
MPI_COMM_WORLD.  Here is a small routine showing the error:

#include <stdio.h>
#include <mpi.h>

int main(int nargs, char **args)
{
   MPI_Datatype Dt;
   MPI_Comm CMPI_COMM_WORLD;
   int Iam, Np, i, k, ierr;
   int ibuff[4];

   MPI_Init(&nargs, &args);

   MPI_Comm_rank(MPI_COMM_WORLD, &Iam);
   MPI_Comm_size(MPI_COMM_WORLD, &Np);
   MPI_Comm_dup(MPI_COMM_WORLD, &CMPI_COMM_WORLD);
/*
 * Process 0 holds the values 1..4; everyone else starts at -9999.
 */
   if (Iam) for (i=0; i != 4; i++) ibuff[i] = -9999;
   else for (i=0; i != 4; i++) ibuff[i] = i+1;
   ierr = MPI_Type_vector(2, 1, 2, MPI_INT, &Dt);
   if (ierr != MPI_SUCCESS)
      printf("MPI_Type_vector returned %d\n", ierr);
   else MPI_Type_commit(&Dt);

/*
 * Broadcast elements 0 and 2 over the duplicated communicator; this is
 * where MPT 1.1.0.2 loses track of the datatype.
 */
   MPI_Bcast(ibuff, 1, Dt, 0, CMPI_COMM_WORLD);

   MPI_Type_free(&Dt);
   for (k=0; k != Np; k++)
   {
      if (Iam == k)
      {
         fprintf(stdout, "%d: ibuff =", Iam);
         for (i=0; i != 4; i++) fprintf(stdout, " %d ", ibuff[i]);
         fprintf(stdout, "\n");
      }
      MPI_Barrier(CMPI_COMM_WORLD);
   }
   MPI_Finalize();
   return 0;
}

If CMPI_COMM_WORLD is set to MPI_COMM_WORLD, this routine produces the
correct answer:
>0: ibuff = 1  2  3  4 
>1: ibuff = 1  -9999  3  -9999 

Otherwise, you get:
>_T3EMPI_coll_send asked to deal with unknown datatype.
>0: ibuff = 1  2  3  4
>1: ibuff = 0  -9999  0  -9999

=============================== MPI ERROR #8 ==================================
     WHERE: SGI Origin 2000  WHAT: MPIBLACS ERROR     STATUS: FIXED by patch
The BLACS were not freeing groups created by calls to MPI_COMM_GROUP, causing
some systems to run out of groups.
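
The pattern involved is sketched below (illustrative code, not the actual
BLACS source): every group handle obtained from MPI_Comm_group must
eventually be released with MPI_Group_free.

#include "mpi.h"

int main(int narg, char **args)
{
   MPI_Group wgrp;
   int k;

   MPI_Init(&narg, &args);
   for (k=0; k != 100000; k++)
   {
      MPI_Comm_group(MPI_COMM_WORLD, &wgrp);
/*
 * Grid setup would use wgrp here.  Without the following free, systems
 * with a fixed-size group table eventually run out of groups.
 */
      MPI_Group_free(&wgrp);
   }
   MPI_Finalize();
   return 0;
}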

=============================== MPI ERROR #9 ==================================
     WHERE: T3E           WHAT: MPI BLACS ERROR   STATUS: FIXED by patch
There were a couple of problems in the BLACS handling of CRAY's non-standard
F77 data types.  Also, you can't call F77's mpi_init from C on this platform.
These problems are fixed by the patch.

=============================== MPI ERROR #10 =================================
     WHERE: T3E           WHAT: MPI ERROR    STATUS: workaround in patch 01, 02
mpt.1.2.0.0.6beta couldn't handle 0-length segments used with MPI_Type_indexed.
To work around this problem, apply the patch and throw the T3ETrError flag in
your Bmake.inc (as shown in the example Bmake.T3E supplied with the patch).
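
The kind of (legal) construction that triggered the problem looks roughly
like the sketch below; the zero in blens[1] is a 0-length segment:

#include "mpi.h"

int main(int narg, char **args)
{
   MPI_Datatype Dt;
   int blens[3] = {1, 0, 1};   /* middle segment has 0 length: legal MPI */
   int disps[3] = {0, 1, 2};

   MPI_Init(&narg, &args);
   MPI_Type_indexed(3, blens, disps, MPI_INT, &Dt);
   MPI_Type_commit(&Dt);
/*
 * Communication using Dt is where mpt.1.2.0.0.6beta got into trouble.
 */
   MPI_Type_free(&Dt);
   MPI_Finalize();
   return 0;
}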

=============================== MPI ERROR #11 =================================
     WHERE: T3E           WHAT: MPI ERROR    STATUS: workaround in patch 01, 02
mpt.1.2.0.0.6beta couldn't handle certain reductions where you mix types
with an MPI data type.  To work around this problem, apply the patch and throw
the T3EReductErr flag in your Bmake.inc (as shown in the example Bmake.T3E
supplied with the patch).

=============================== MPI ERROR #12 =================================
     WHERE: ALL platforms  WHAT: new functionality     STATUS: in patch 01, 02
MPI-2 provides a standard way to translate communicators between C and
Fortran77.  If your MPI implements these routines, set TRANSCOMM to -DUseMpi2.
We have reports that the newer versions of LAM-MPI use this setting.
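
The MPI-2 routines in question are MPI_Comm_c2f and MPI_Comm_f2c.  Here is
a minimal sketch of the round trip (standard MPI-2, not BLACS code):

#include "mpi.h"

int main(int narg, char **args)
{
   MPI_Fint fcomm;
   MPI_Comm ccomm;
   int Iam;

   MPI_Init(&narg, &args);
   fcomm = MPI_Comm_c2f(MPI_COMM_WORLD);  /* C handle -> Fortran INTEGER */
   ccomm = MPI_Comm_f2c(fcomm);           /* Fortran INTEGER -> C handle */
   MPI_Comm_rank(ccomm, &Iam);            /* translated handle is usable */
   MPI_Finalize();
   return 0;
}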

=============================== MPI ERROR #13 =================================
     WHERE: ALL platforms     WHAT: BLACS ERROR           STATUS: in patch 02
Even after the first patch, there were still errors in freeing groups.
In BLACS/SRC/MPI/INTERNAL/BI_TransUserComm.c, the group ugrp was not freed.
In BLACS/SRC/MPI/INTERNAL/BI_MPI_F77_to_c_trans_comm.c, two groups were freed
as communicators instead of groups.

=============================== MPI ERROR #14 =================================
     WHERE: LINUX/g77         WHAT: Compiler change      STATUS: fixed by flags
The BLACS tester uses a large array in order to simulate dynamic memory.  It
passes this array to routines that accept it as an array of differing data
types.  G77 has, in some cases, upgraded this from a warning to an error.  To
tell g77 to allow this behavior, change BLACS/TESTING/Makefile line 39 from:
        $(F77) $(F77NO_OPTFLAGS) -c $*.f
to:
        $(F77) $(F77NO_OPTFLAGS) -fno-globals -fno-f90 -fugly-complex -w -c $*.f

=============================== MPI ERROR #15 =================================
  WHERE: ????         WHAT: Compiler error/macro problem  STATUS: Not fixed
There is an undiagnosed problem that causes some users' dwalltime00 routine
to return bad values.  It appears likely that there is a problem with
macro name overruns, but errors in cpp or the code have not been ruled out.
If you get bad return values from dwalltime00, overwrite
   BLACS/SRC/MPI/dwalltime00_.c with:
#include "Bdef.h"

#if (INTFACE == C_CALL)
double Cdwalltime00(void)
#else
F_DOUBLE_FUNC dwalltime00_(void)
#endif
{
   return(MPI_Wtime());
}

=============================== MPI ERROR #16 =================================
  WHERE: mpich1.2.*             WHAT: BLACS error   STATUS: Fixed by patch 03
If you get missing f77 argc and argv symbols when using the BLACS C init
routines, you are seeing this error.

*******************************************************************************
*******************************************************************************
*                              MPLBLACS SECTION                               *
*******************************************************************************
*******************************************************************************

=============================== MPL ERROR #1 ==================================
     WHERE: SP2                WHAT: ERROR IN MPL          STATUS: NOT FIXED
It appears that MP_BRECV requires that messages be received in the order they
were sent, even if all messages have been successfully sent.  IBM has reported
that this is not an error, but rather perhaps an oversight in the documentation:
MPL does not support receiving messages in any order except that in which they
were sent.  Here is a small routine showing the problem:
      program tst
      integer k, iam, Np, i, j

      call mpc_environ(Np, Iam)
      k = Iam + 100
      print*,'start'
      if (iam.eq.1) then
         call mp_send(Iam, 4, 0, 2, i)
         call mp_send(k,   4, 0, 3, j)
         print*,mp_status(i)
         print*,mp_status(j)
      else if (iam .eq. 0) then
*
*        Receive the two messages in the reverse of the order in which
*        they were sent: this hangs, even after both sends complete
*
         call mp_brecv(k, 4, 1, 3, j)
         call mp_brecv(k, 4, 1, 2, j)
      end if
      print*,'done'

      stop
      end

When this is run, the output is:
xtst2 -procs 2
 start
 start
 4
 4
 done

So both sends complete, but the receives still hang.

*******************************************************************************
*******************************************************************************
*                              NXBLACS SECTION                                *
*******************************************************************************
*******************************************************************************

=============================== NX  ERROR #1 ==================================
     WHERE: Some NX machines   WHAT: ERROR IN NXBLACS      STATUS: NOT FIXED
The NXBLACS use a copy optimization which is, according to strict IEEE
arithmetic rules, illegal.  More precisely, doubles are sometimes used to
copy floats or integers.  At implementation time, the author tested all
available NX platforms and found no errors, so the optimization went in even
though it was known to be illegal.  Unfortunately, on more recent platforms
(e.g., ASCI red with the newest MPI) this causes problems.  So, if you get
mysterious errors in the tester, this may be what's happening.  To prevent the
BLACS from applying this illegal optimization, delete the following lines in
BLACS/SRC/NX/INTERNAL/mvcopy4.c:
   long iaddr;

   iaddr = (long) A;
/*
 * If address is on a 8 byte boundary, and lda and m are evenly divisible by 2,
 * can use double sized pointers for faster packing
 */
   if ( !(iaddr % 8) && !(lda % 2) && !(m % 2) )
      mvcopy8(m/2, n, (double *) A, lda/2, (double *) buff);
/*
 * Otherwise, must use 4 byte packing
 */
   else

You also need to delete basically the same lines from 
BLACS/SRC/NX/INTERNAL/vmcopy4.c:
   long iaddr;

   iaddr = (long) A;

/*
 * If address is on a 8 byte boundary, and lda and m are evenly divisible by 2,
 * can use double sized pointers for faster packing
 */
   if ( !(iaddr % 8) && !(lda % 2) && !(m % 2) )
      vmcopy8(m/2, n, (double *) A, lda/2, (double *) buff);
/*
 * Otherwise, must use 4 byte packing
 */
   else



*******************************************************************************
*******************************************************************************
*                              PVMBLACS SECTION                               *
*******************************************************************************
*******************************************************************************

=============================== PVM ERROR #1 ==================================
     WHERE: SUNMP PVM     WHAT: PVM3.3.11 ERROR          STATUS: NOT FIXED
SUNMP PVM is broken.  Your best bet is to rig your PVM_ARCH so that it thinks
it is a SUN4SOL2, and use that version of PVM.

=============================== PVM ERROR #2 ==================================
  WHERE: SGI5/new gcc        WHAT: COMPILER ERROR       STATUS: NOT FIXED
This appears to be a compiler problem with including files within the
braces of a routine: system include files must appear before the routine's
scope begins.  Therefore, in BLACS/SRC/PVM/blacs_setup_.c, move the line:
#include <string.h>
to the second line of the file (after #include "Bdef.h").
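
After the move, the top of the file should look like this (a sketch; the
file's remaining includes and code are unchanged):

#include "Bdef.h"
#include <string.h>    /* system header now precedes any routine body */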

=============================== PVM ERROR #3 ==================================
     WHERE: SGI5         WHAT: COMPILER ERROR        STATUS: NOT FIXED
The compiler does not accept the -o (output renaming) option if optimization
is turned on.  This breaks the compilation of the C interface.  Bmake.PVM-SGI5
defaults to using gcc.  If you can't use gcc, you may be able to use a
workaround like the following in BLACS/SRC/PVM/Makefile:

Line 166 of original Makefile:
.SUFFIXES: .o .C
.c.C:
	$(CC) -c $(CCFLAGS) -o C$*.o $(BLACSDEFS) -DCallFromC $<
	mv C$*.o $*.C
SGI error workaround:
.SUFFIXES: .o .C
.c.C:
	ln -s $*.c C$*.c
	$(CC) -c $(CCFLAGS) $(BLACSDEFS) -DCallFromC C$*.c
	mv C$*.o $*.C
	rm -f C$*.c

