DEFINITION OF A COMPACT APPLICATION BENCHMARK
Compact applications are typical of those found in research environments
(as opposed to production or engineering environments), and usually consist of
up to a few thousand lines of source code. Compact applications are distinct
from kernel applications in that they are capable of producing scientifically
useful results. In many cases, compact applications are made up of several
kernels, interspersed with data movements and I/O operations between the
kernels.
The three main criteria for inclusion of a parallel code
in the compact applications suite are:
- The code must be a complete application and be capable of producing results
of research interest. These two points distinguish a compact application from
a kernel. For example, a code that only solves a randomly-generated, dense,
linear system by LU factorization should be considered a kernel. Even though
the code is complete, it does not produce results of research interest.
However, if the LU factorization is embedded in an application that uses
the boundary element method to solve, for example, a two-dimensional
elastodynamics problem, then such an application could legitimately be
considered a compact application.
Compact applications and full production codes are distinguished by their
software complexity, which is difficult to quantify. Software complexity gives
an indication of how hard it is to write, port and maintain an application,
and may be gauged very roughly by the length of the source code. However, there
is no hard upper limit on the length of a code in the compact applications
suite. It is expected that the source code (excluding comments and repeated
common blocks) for most compact applications will be between 2000 and 10000
lines, but some may be shorter or longer.
- The code must be of high quality. This means it must have been extensively
tested and validated, preferably on a wide selection of different parallel
architectures. The problem size and number of processors used must not be
hard-coded into the application, and should be specified at runtime as input
to the program. Ideally, the parallel code should not impose restrictions on
the problem size that are not applicable for the corresponding sequential code.
Thus, the parallel code should not require that the problem size be exactly
divisible by the number of processors, or that the number of processors be
a power of two. In some cases this latter requirement may have to be relaxed.
For example, most parallel fast Fourier transform routines require the number
of processors to be a power of two. It is preferable that the code be
written so that it works correctly for
an arbitrary one-to-one mapping between the logical process topology of the
application and the hardware topology of the parallel computer.
This is desirable so
that the assignment of a location in the logical process topology to a
physical processor can be easily adjusted when porting
the application between platforms. For example, a Gray code assignment may
be best for a hypercube, and a natural ordering for a mesh architecture.
- The application must be well documented. The source code itself should
contain an adequate number of comments, and each module should begin
with a comment section that describes what the routine does, and the
arguments passed to it. In addition, there should be a ``Users' Guide''
to the application that describes the input and output, the parameterization
of the problem size and processor layout, and details of what the application
does. The Users' Guide should also contain a bibliography of related
papers.
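The second criterion can be illustrated with a minimal sketch. It assumes a
one-dimensional block distribution and a one-dimensional logical process
ordering. The function names (block_range, gray_code, logical_to_physical,
parse_args) and the command-line flags are hypothetical and are not part of
any PARKBENCH code. The sketch shows a problem size and processor count read
at runtime, a block distribution that does not require the problem size to be
divisible by the number of processors, and a selectable logical-to-physical
mapping.

```python
import argparse


def block_range(n, nprocs, rank):
    """Half-open index range [start, stop) owned by `rank`.

    Balanced block distribution: the first (n % nprocs) ranks receive one
    extra element, so n need not be divisible by nprocs and nprocs need not
    be a power of two.
    """
    base, extra = divmod(n, nprocs)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


def gray_code(i):
    """Binary-reflected Gray code of i; consecutive values differ in one bit."""
    return i ^ (i >> 1)


def logical_to_physical(rank, mapping="natural"):
    """Map a logical rank to a physical node number (illustrative only).

    "gray" places neighbouring logical ranks on directly connected hypercube
    nodes. It is a permutation of 0..nprocs-1 only when nprocs is a power of
    two, which is the case on a full hypercube.
    """
    if mapping == "natural":
        return rank
    if mapping == "gray":
        return gray_code(rank)
    raise ValueError("unknown mapping: " + mapping)


def parse_args(argv=None):
    """Read problem size, processor count and mapping at runtime."""
    parser = argparse.ArgumentParser(description="decomposition sketch")
    parser.add_argument("--n", type=int, default=10, help="problem size")
    parser.add_argument("--nprocs", type=int, default=3,
                        help="number of processors")
    parser.add_argument("--mapping", choices=["natural", "gray"],
                        default="natural",
                        help="logical-to-physical assignment")
    return parser.parse_args(argv)
```

With n = 10 and nprocs = 3, the three ranks own the index ranges [0, 4),
[4, 7) and [7, 10). Keeping the mapping behind a single function means that
only that function needs to change when the application is ported to a
machine with a different hardware topology.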
In addition to the three criteria discussed above, there are
other desirable features that a PARKBENCH compact application should have.