GBIS Benchmark Header File: gr1

   ===                                                            ===
   ===           GENESIS Distributed Memory Benchmarks            ===
   ===                                                            ===
   ===                             GR1                            ===
   ===                                                            ===
   ===                       General Relativity                   ===
   ===                                                            ===
   ===     Original author:        Nigel T. Bishop                ===
   ===     Modified by    :        Vladimir Getov                 ===
   ===     PARMACS macros :        Vladimir Getov                 ===
   ===     Department of Electronics and Computer Science         ===
   ===               University of Southampton                    ===
   ===               Southampton SO9 5NH, U.K.                    ===
   ===     fax.:+44-703-593045                                    ===
   ===                                                            ===
   ===                                                            ===
   ===     Copyright: SNARC, University of Southampton            ===
   ===                                                            ===
   ===          Last update: June 1993; Release: 2.2              ===
   ===                                                            ===

1. Description

While the equations of general relativity are long and complicated,
from a conceptual point of view the situation is similar to that of
solving the wave equation.  Given data describing the initial state of
a gravitational field, we would like to calculate the future evolution
of the field; this is done by solving a system of hyperbolic partial
differential equations.

The specific problem tackled in the benchmark is the axially symmetric
characteristic initial value problem (c.i.v.p.).  This means that
initial data is given on a light cone and, because light cones are
natural constructs in relativity, the equations to be solved in this
formulation are comparatively simple.  The specific problem solved
has a known exact solution, and the output of the program is the
difference between the numerically calculated solution and the exact
one.  Axisymmetry is necessary to ensure that the problem is of manageable
size.  The full calculation will require of order 10^10 flops, while
for the general (no symmetry) problem about 10^13 flops would be required.
Worldwide, little work has been done on the general problem because of the
computing time needed.  However, as progress is made towards a teraflop
machine there is likely to be a substantial growth of interest in
relativity codes, which is why the code was chosen as part of the 
benchmark suite.

The coordinates used are (u, r, y); u is analogous to time, and r and
y are spatial coordinates.  At each step in the evolution, q0 = dq/du
(= rate of change of the gravitational field) is calculated.  The
algorithm is parallel in y.  The host process creates the node
processes, each at a different value of y, sends initial information
and collects the results.  Message passing is mainly between nearest
neighbours, in calculating y-derivatives.  However, messages are also
passed to and from the host at every step in the evolution to smooth
the data at the I1 r-points nearest to the origin.  The code uses
double precision throughout.
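
The nearest-neighbour dependence of the y-derivatives can be illustrated
with a small serial sketch.  It is not taken from the benchmark source:
the function name y_derivative, the central-difference stencil and the
one-sided treatment of the two end nodes are all assumptions made for
illustration.  Each list entry stands for the value held by one node, so
reading q[j-1] and q[j+1] plays the role of the messages received from
the two neighbouring nodes.

```python
# Illustrative only: a serial stand-in for the nearest-neighbour exchange.
# Each "node" j owns one y-value, so a y-derivative at node j needs the
# values held by nodes j-1 and j+1.


def y_derivative(q, dy):
    """Finite-difference d/dy of q, where q[j] is the value held by node j.

    Interior nodes use a central difference built from both neighbours.
    The two end nodes fall back to one-sided differences (an assumption
    made for this sketch, not taken from the benchmark source).
    """
    n = len(q)
    if n < 2:
        raise ValueError("need at least two nodes")
    dq = [0.0] * n
    for j in range(n):
        left = q[j - 1] if j > 0 else None       # "received" from node j-1
        right = q[j + 1] if j < n - 1 else None  # "received" from node j+1
        if left is None:
            dq[j] = (right - q[j]) / dy          # forward difference
        elif right is None:
            dq[j] = (q[j] - left) / dy           # backward difference
        else:
            dq[j] = (right - left) / (2.0 * dy)  # central difference
    return dq
```

In a distributed run the two neighbour values would arrive as messages
rather than list lookups, which is why the communication pattern is
dominated by nearest-neighbour traffic.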

2. Operating Instructions

Changing problem size and numbers of processes:

The (r, y) grid has IMAX x JMAX points. The grid is partitioned in the
y-direction across PN processors. After every NI time-steps, an estimate
of the error is output, and after NR such reports the program terminates.
The problem size is specified by parameter statements in the file

If this file is altered then the program must be recompiled.  The
parameters concerned are:

        PN    - Number of processing nodes; in the current version of
                the benchmark it is equal to JMAX, the number of nodes
                in the y-direction.
        IMAX  - Grid size in r-direction
        JMMAX - Maximum value permitted for JMAX (PN)

Suggested Problem Sizes :

IMAX should be > 50 and JMAX (PN) >= 4, and usually JMAX > IMAX/15.
The suggested values of IMAX are 51, 101, 151, 201 and 251.
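
The guidelines above can be checked before recompiling with a few lines
of Python.  This is a minimal sketch rather than part of the benchmark:
the function name check_problem_size is hypothetical, and the value
passed for JMMAX must be whatever is set in the parameter file.

```python
def check_problem_size(imax, jmax, jmmax):
    """Return a list of warnings for a proposed (IMAX, JMAX) pair.

    The rules are the ones stated in this file: IMAX > 50, JMAX >= 4,
    JMAX no larger than JMMAX, and usually JMAX > IMAX/15.
    """
    problems = []
    if imax <= 50:
        problems.append("IMAX should be > 50")
    if jmax < 4:
        problems.append("JMAX (PN) should be >= 4")
    if jmax > jmmax:
        problems.append("JMAX (PN) exceeds JMMAX")
    # Integer form of JMAX > IMAX/15, avoiding floating-point division.
    if jmax * 15 <= imax:
        problems.append("JMAX is usually > IMAX/15")
    return problems
```

An empty list means that none of the guidelines is violated; the last
rule is only a rule of thumb, so a warning from it is advisory.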

Compiling and running the benchmark:
1) To compile and link the benchmark type:   `make' for the distributed
   version or `make slave' for the single-node version.

2) If any of the parameters in the include files are changed,
   the code has to be recompiled. The makefile will automatically
   send only the affected files to the compiler. Type:   make

3) On some systems it may be necessary to allocate the appropriate
   resources before running the benchmark, e.g. on the iPSC/860
   to reserve a cube of 8 processors, type:    getcube -t8

4) To run either the sequential or the distributed version of the benchmark,
   type:    gr1

   The progress of the benchmark execution can be monitored via
   the standard output, whilst a permanent copy of the benchmark
   output is written to a file called 'result'.

5) If the run is successful and a permanent record is required, the
   file 'result' should be copied to another file before the next run
   overwrites it.

3. Accuracy check

The program numerically calculates a solution of Einstein's equations
and compares it with the known exact solution.  The norm of the
difference between the two solutions is output as "NORM ERROR" after every
NI time-steps; the value of the "time" U is also shown.  The norm of the
error should be of order U/(IMAX^2).

Some sample output, which can be used to confirm accuracy of codes, follows:

IMAX=101, JMAX=16, I1=10, NI=10, NR=5, DU=0.001
  NORM ERROR =   2.5849590738144D-16    U =  0.
  NORM ERROR =   8.2103269242124D-07    U =   1.1000000000000D-02
  NORM ERROR =   1.6792533243694D-06    U =   2.1000000000000D-02
  NORM ERROR =   2.5761049493549D-06    U =   3.1000000000000D-02
  NORM ERROR =   3.5132206539250D-06    U =   4.1000000000000D-02
  NORM ERROR =   4.4922444446385D-06    U =   5.1000000000000D-02

IMAX=101, JMAX= 8, I1=10, NI=10, NR=5, DU=0.001
  NORM ERROR =   2.5303469538960D-16    U =  0.
  NORM ERROR =   2.4258663189922D-06    U =   1.1000000000000D-02
  NORM ERROR =   4.9668617189738D-06    U =   2.1000000000000D-02
  NORM ERROR =   7.6268419410634D-06    U =   3.1000000000000D-02
  NORM ERROR =   1.0410209695866D-05    U =   4.1000000000000D-02
  NORM ERROR =   1.3321358015514D-05    U =   5.1000000000000D-02

IMAX= 51, JMAX= 8, I1=10, NI=10, NR=5, DU=0.001
  NORM ERROR =   2.0669846575593D-16    U =  0.
  NORM ERROR =   3.0629150475526D-06    U =   1.1000000000000D-02
  NORM ERROR =   6.2675421973340D-06    U =   2.1000000000000D-02
  NORM ERROR =   9.6187594957325D-06    U =   3.1000000000000D-02
  NORM ERROR =   1.3122148529626D-05    U =   4.1000000000000D-02
  NORM ERROR =   1.6783313508491D-05    U =   5.1000000000000D-02
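
The order-of-magnitude check described above can be automated.  The
following is a minimal sketch, assuming that the lines to be checked have
the same layout as the sample output in this section; the function names
are hypothetical and not part of the benchmark.  Fortran writes
double-precision exponents with a D, which is converted to E before
parsing.

```python
import re

# Matches lines such as:
#   NORM ERROR =   8.2103269242124D-07    U =   1.1000000000000D-02
REPORT = re.compile(r"NORM ERROR\s*=\s*(\S+)\s+U\s*=\s*(\S+)")


def fortran_float(text):
    """Convert a Fortran-style number such as 1.1D-02 or 0. to a float."""
    return float(text.replace("D", "E").replace("d", "e"))


def parse_reports(text):
    """Return a list of (norm_error, u) pairs found in the given text."""
    return [
        (fortran_float(m.group(1)), fortran_float(m.group(2)))
        for m in REPORT.finditer(text)
    ]


def error_ratios(reports, imax):
    """Ratio of each reported error to U/IMAX^2, skipping reports at U = 0."""
    return [err / (u / imax**2) for err, u in reports if u > 0.0]
```

If the code is behaving as expected, each ratio should be of order one.
The text to parse can be read from the 'result' file or pasted from the
standard output, provided it follows the layout assumed here.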

$Id: ReadMe,v 1.2 1994/04/20 17:27:40 igl Rel igl $


High Performance Computing Centre

Submitted by Mark Papiani,
last updated on 10 Jan 1995.