THE SEIS COMPACT APPLICATION

The SEIS1 seismic processing performance evaluation suite was submitted by Charles C. Mosher of ARCO Exploration and Production Technology. It may be obtained from the future compact applications directory of the netlib repository. Further details, provided by Charles Mosher, are given below.
-------------------------------------------------------------------------------
Name of Program         : ARCO Parallel Seismic Processing Benchmarks
-------------------------------------------------------------------------------
Submitter's Name        : Charles C. Mosher
Submitter's Organization: ARCO Exploration and Production Technology
Submitter's Address     : 2300 West Plano Parkway
                          Plano, TX 75075-8499
Submitter's Telephone # : (214)754-6468
Submitter's Fax #       : (214)754-3016
-------------------------------------------------------------------------------
Major Application Field : Seismic Data Processing
Application Subfield(s) : Parallel I/O, signal processing, solution of PDEs
-------------------------------------------------------------------------------
Application "pedigree" (origin, history, authors, major mods) :

The application began as a prototype system for seismic data processing
on parallel computing architectures.  The prototype was used to design
and implement production seismic processing on ARCO's Intel iPSC/860, where
it is used today.

Like other companies, ARCO continues to upgrade our HPC facilities.  We found
that we were spending a large amount of time on benchmarking, as were other
companies in the oil industry.  We decided to place our system in the public
domain as a benchmark suite, in the hopes that the benchmarking effort could
be spread across many participants. In addition, we hope to use the system
as a mechanism for code development and sharing between academia, national
labs, and industry.

Many members of the PERFECT benchmark group provided valuable input that 
significantly improved the structure and content of the suite.  Special thanks 
to David Schneider for his work on organizing and managing the PERFECT effort.

A consulting organization (Resource 2000) has also picked up the code and
is providing newsletter subscriptions to participants in the oil industry
describing both benchmark numbers and commentary on usability of the systems
tested.  Thanks to Randy Premont, Gary Montry, and Clive Bailley of Resource
2000 for their continuing work to make the ARCO suite a viable benchmark.

-------------------------------------------------------------------------------
May this code be freely distributed (if not, specify restrictions) :

The code may be freely distributed.  We request that ARCO and the authors be
acknowledged in publications.

In order to ensure relevance of the codes in the suite, the authors plan
to retain control of the source and algorithms contained therein, and request
that suggestions for changes and updates be directed to the authors only.

-------------------------------------------------------------------------------
Give length in bytes of integers and floating-point numbers that should be
used in this application:

        Integers :   4 bytes
        Floats   :   4 bytes

-------------------------------------------------------------------------------
Documentation describing the implementation of the application (at module
level, or lower) :

High level: ARCO Seismic Benchmark Suite User's Guide
Low  level: source comments

-------------------------------------------------------------------------------
Research papers describing sequential code and/or algorithms :

Yilmaz, Ozdogan, 1990, Seismic Data Processing: Investigations in Geophysics
    vol. 2, Society of Exploration Geophysicists, P.O. Box 702740,
    Tulsa, Oklahoma, 74170

-------------------------------------------------------------------------------
Research papers describing parallel code and/or algorithms :

Mosher, C., Hassanzadeh, S., and Schneider, D., 1992, A Benchmark Suite 
    for Parallel Seismic Processing, Supercomputing 1992 proceedings.

-------------------------------------------------------------------------------
Other relevant research papers:


-------------------------------------------------------------------------------
Application available in the following languages (give message passing system
used, if applicable, and machines application runs on) :

Language: 
    Fortran 77

Message Passing: 
    Yet Another Message Passing Layer (YAMPL)
	Sample implementations for PVM, Intel NX, TCGMSG

Machines Supported:
	Workstation clusters and multiprocessors (e.g. Sun, DEC, HP, IBM, SGI)
	Cray YMP 
	Intel iPSC/860

-------------------------------------------------------------------------------
Total number of lines in source code: ~ 20000
Number of lines excluding comments  : ~ 15000
Size in bytes of source code        : ~ 1 MByte 
-------------------------------------------------------------------------------
List input files (filename, number of lines, size in bytes, and if formatted) :

ASCII parameter files, 10-100 lines

-------------------------------------------------------------------------------
List output files (filename, number of lines, size in bytes, and if formatted) :

Binary seismic data files, 1 MByte (small),  1 GByte (medium),
                          10 GByte (large), 100 GByte (huge)


-------------------------------------------------------------------------------
Brief, high-level description of what application does:

Synthetic seismic data for small, medium and large test cases are generated
in the native format of the target machine.  The test data are read and
processed in parallel, and the output is written to disk.  Simple checksum
and timing tables are printed to standard output.  A simple X Window System
image display tool is used to verify correctness of results.

-------------------------------------------------------------------------------
Main algorithms used:

Signal processing (FFTs, Toeplitz equation solvers, interpolation)
Seismic Imaging (Fourier domain, Kirchhoff integral, 
   finite difference algorithms)
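
As an illustration of the Toeplitz equation solvers listed above, the
following is a minimal Python sketch of the Levinson recursion for a symmetric
Toeplitz system.  It is not taken from the suite, which is written in
Fortran 77 and whose actual solver is not described here; the function name
solve_symmetric_toeplitz is hypothetical, and the sketch assumes the leading
principal submatrices are nonsingular.

```python
def solve_symmetric_toeplitz(r, y):
    """Solve T x = y where T[i][j] = r[abs(i - j)] (Levinson recursion).

    Illustrative sketch only; assumes every leading principal
    submatrix of T is nonsingular.
    """
    n = len(r)
    f = [1.0 / r[0]]          # forward vector: T_k f = first unit vector
    x = [y[0] / r[0]]         # solution of the leading 1x1 system
    for k in range(1, n):
        # Errors made by zero-extending f and x to length k + 1.
        eps_f = sum(r[k - i] * f[i] for i in range(k))
        eps_x = sum(r[k - i] * x[i] for i in range(k))
        denom = 1.0 - eps_f * eps_f
        f_ext = f + [0.0]
        b_ext = [0.0] + f[::-1]   # backward vector is the reversed forward one
        f = [(f_ext[i] - eps_f * b_ext[i]) / denom for i in range(k + 1)]
        b = f[::-1]               # T_{k+1} b = last unit vector
        x = [xi + (y[k] - eps_x) * bi for xi, bi in zip(x + [0.0], b)]
    return x


# Usage: build y from a known solution, then recover that solution.
r = [4.0, 1.0, 0.5]
x_true = [1.0, 2.0, 3.0]
y = [sum(r[abs(i - j)] * x_true[j] for j in range(3)) for i in range(3)]
x = solve_symmetric_toeplitz(r, y)
```

The recursion costs O(n^2) operations instead of the O(n^3) of a general
dense solver, which is why Toeplitz structure matters for long filters.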

-------------------------------------------------------------------------------
Skeleton sketch of application:

Processing modules are applied in a pipeline fashion to 2D arrays of seismic
data read from disk.  Processing flows are of the form READ-FLTR-MIGR-WRIT.
The same flow is executed on all processors.  Individual modules communicate
via message passing to implement parallel algorithms.  Nearly all message
passing is hidden via transpose operations that change the parallel data 
distribution as appropriate for each algorithm.
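
The transpose step described above can be sketched as follows.  This is a
single-process Python simulation written for this description, not code from
the suite: each "processor" is a list entry, the message passing is replaced
by plain copying, and the names block_bounds, scatter_rows and
distributed_transpose are hypothetical.

```python
def block_bounds(n, nproc, p):
    """Half-open range [start, stop) of block p when n items are split
    as evenly as possible over nproc processors."""
    base, extra = divmod(n, nproc)
    start = p * base + min(p, extra)
    return start, start + base + (1 if p < extra else 0)


def scatter_rows(matrix, nproc):
    """Give each simulated processor a contiguous slab of rows."""
    return [matrix[slice(*block_bounds(len(matrix), nproc, p))]
            for p in range(nproc)]


def distributed_transpose(slabs, n_cols):
    """Redistribute row slabs so each processor owns a slab of columns.

    Processor q collects, from every processor p, the block
    rows(p) x cols(q) and stores it transposed.
    """
    nproc = len(slabs)
    out = []
    for q in range(nproc):
        c0, c1 = block_bounds(n_cols, nproc, q)
        new_rows = [[] for _ in range(c1 - c0)]
        for p in range(nproc):            # one simulated "message" per p
            for row in slabs[p]:
                for j in range(c0, c1):
                    new_rows[j - c0].append(row[j])
        out.append(new_rows)
    return out


# Usage: a 4 x 6 frame spread over 3 simulated processors.
frame = [[10 * i + j for j in range(6)] for i in range(4)]
slabs = scatter_rows(frame, 3)
transposed = distributed_transpose(slabs, 6)
gathered = [row for slab in transposed for row in slab]
```

After the transpose each processor holds complete columns of the original
frame, so a module that works along the other axis needs no further
communication.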

-------------------------------------------------------------------------------
Brief description of I/O behavior:

2D arrays are read from and written to HDF-style files on disk.  Parallel I/O
is supported both for a single large file read by multiple processors and for
a separate file read by each processor.  A significant part of the seismic
processing flow requires data to be read in transposed fashion across all
processors.
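
The single-shared-file mode can be illustrated with a small Python sketch in
which each simulated processor seeks to its own frames.  The layout is an
assumption made for the example: a headerless file of little-endian 4-byte
floats stored frame after frame, which is simpler than the HDF-style files
the suite actually uses.  The file name shared.bin and the helper names are
hypothetical.

```python
import os
import struct
import tempfile

BYTES_PER_FLOAT = 4  # the suite specifies 4-byte floats


def frame_range(n_frames, nproc, rank):
    """Half-open range of frames read by processor `rank`."""
    base, extra = divmod(n_frames, nproc)
    first = rank * base + min(rank, extra)
    return first, first + base + (1 if rank < extra else 0)


def read_frames(path, n_samples, n_traces, first, last):
    """Read frames [first, last) from a raw, headerless float file."""
    frame_bytes = n_samples * n_traces * BYTES_PER_FLOAT
    with open(path, "rb") as fh:
        fh.seek(first * frame_bytes)      # skip frames owned by others
        raw = fh.read((last - first) * frame_bytes)
    count = len(raw) // BYTES_PER_FLOAT
    return list(struct.unpack("<%df" % count, raw))


# Usage: write 5 small frames, then let 3 simulated processors read them.
n_samples, n_traces, n_frames, nproc = 8, 4, 5, 3
values = [float(i) for i in range(n_samples * n_traces * n_frames)]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "shared.bin")
    with open(path, "wb") as fh:
        fh.write(struct.pack("<%df" % len(values), *values))
    pieces = [read_frames(path, n_samples, n_traces,
                          *frame_range(n_frames, nproc, rank))
              for rank in range(nproc)]
```

In the file-per-processor mode the seek disappears: each processor would open
its own file and read it from the start.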

-------------------------------------------------------------------------------
Brief description of load balance behavior :

Assumes a homogeneous array of processors with similar capabilities.
Load balance is rudimentary, with an attempt to distribute equal-sized
'workstation' chunks of work.

-------------------------------------------------------------------------------
Describe the data distribution (if appropriate) :

Seismic data is inherently parallel, with large data sets that offer multiple
opportunities for parallel operation.  Typically, the data is treated as a
collection of 2D arrays, with each processor owning a 'slab' of data.

-------------------------------------------------------------------------------
Give parameters of the data distribution (if appropriate) :

The data is defined as a 4-dimensional array with Fortran dimensions
(sample, trace, frame, volume).  The third dimension (frame) is typically
spread across the processors.
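
A short Python sketch may help make the layout concrete.  It assumes Fortran
(column-major) storage, so that sample is the fastest-varying index, and a
block distribution of frames over processors; the suite may distribute frames
differently, and the function names fortran_offset and frame_owner are
hypothetical.  Indices are zero-based here, unlike Fortran's default.

```python
def fortran_offset(index, dims):
    """Linear element offset of a zero-based index in column-major order.

    index and dims are (sample, trace, frame, volume) tuples.
    """
    offset, stride = 0, 1
    for i, n in zip(index, dims):
        offset += i * stride
        stride *= n
    return offset


def frame_owner(frame, n_frames, nproc):
    """Processor owning `frame` under an even block distribution.

    The first (n_frames % nproc) processors hold one extra frame.
    """
    base, extra = divmod(n_frames, nproc)
    cutoff = extra * (base + 1)       # frames held by the larger blocks
    if frame < cutoff:
        return frame // (base + 1)
    return extra + (frame - cutoff) // base


# Usage: 4 samples, 3 traces, 5 frames, 2 volumes on 2 processors.
dims = (4, 3, 5, 2)
offset = fortran_offset((1, 2, 3, 1), dims)
owners = [frame_owner(f, dims[2], 2) for f in range(dims[2])]
```

Because sample and trace vary fastest, every frame is a contiguous 2D array
in memory, which is what makes the frame a natural unit to hand to a
processor.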

-------------------------------------------------------------------------------
Give parameters that determine the problem size :

The ASCII parameter files define the data set size in terms of the number
of samples per seismic trace, the number of traces per shot, the number
of shooting lines, and the number of 3D volumes.

-------------------------------------------------------------------------------
Give memory as function of problem size :

Requires enough memory to hold 2 frames on each node, and a 3D volume
spread across the nodes.
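
One way to turn this statement into a number is sketched below in Python.
The formula is an interpretation, not a figure from the suite: it assumes
4-byte floats as specified earlier, two working frames per node, and one 3D
volume whose frames are divided evenly over the nodes.  The function name
memory_per_node_bytes is hypothetical.

```python
def memory_per_node_bytes(n_samples, n_traces, n_frames, n_nodes,
                          bytes_per_float=4):
    """Rough per-node memory estimate under the stated assumptions."""
    frame_bytes = n_samples * n_traces * bytes_per_float
    # Largest share of the volume held by any node (ceiling division).
    frames_per_node = -(-n_frames // n_nodes)
    return (2 + frames_per_node) * frame_bytes


# Usage: 1000 samples x 100 traces x 64 frames on 16 nodes.
estimate = memory_per_node_bytes(1000, 100, 64, 16)
```

Under these assumptions the estimate grows linearly with the frame size and
falls toward the fixed cost of two frames as nodes are added.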

-------------------------------------------------------------------------------
Give number of floating-point operations as function of problem size :

Reported by code as appropriate. On a Cray YMP, medium-sized problems with
750 MB of output run at 30-100 Mflops for about an hour.

-------------------------------------------------------------------------------
Give communication overhead as function of problem size and data distribution :

On an Intel iPSC/860, there are parts of the suite that have comp/comm
ratios ranging from near infinite to 1/10.

-------------------------------------------------------------------------------
Give three problem sizes, small, medium, and large for which the benchmark
should be run (give parameters for problem size, sizes of I/O files,
memory required, and number of floating point operations) :

small: 1 MB output, 10 sec on YMP
medium: 1 GB output, 1 hour on YMP
large: 10 GB output, 10 hours on YMP

-------------------------------------------------------------------------------
How did you determine the number of floating-point operations (hardware
monitor, count by hand, etc.) :

Hand count for simple operations, regression analysis of Cray HPM results
for more complex operations.

-------------------------------------------------------------------------------
Other relevant information:



-------------------------------------------------------------------------------
PARKBENCH future compact applications page
Last Modified May 14, 1996