CP's research showed that

PCW:Parallel Computers work in a large class of scientific and engineering computations.

In Chapter 2, we provide the national overview of parallel
computing activities during the last decade. Chapter 3
is somewhat speculative as it attempts to provide a framework to quantify
the previous **PCW** statement.

*We will show that, more precisely, parallel computing only works in a
``scaling'' fashion in a special class of problems which we call
synchronous and loosely synchronous.*

By scaling, we mean that the parallel implementation will efficiently extend to systems with large numbers of nodes without levelling off of the speedup obtained. These concepts are quantified in Chapter 3 with a simple performance model described in detail in [Fox:88a].

The book is organized with applications and software issues growing in complexity in later chapters. Chapter 4 describes the cleanest regular synchronous applications which included many of our initial successes. However, we already see the essential points:

DD:Domain decomposition (or data parallelism) is a universal source of scalable parallelismMP:software model was a simple explicit message passing with each node of a parallel processor running conventional sequential code communicating via subroutine call with other nodes.

Chapters 6 through 9 confirm these lessons
with an extension to more irregular problems. Loosely synchronous
problem classes are harder to parallelize, but still use the basic
principles **DD** and **MP**. Chapter 7 describes a special
class, *embarrassingly parallel*, of applications where scaling
parallelism is guaranteed by the independence of separate components of
the problem.

Chapters 10 and 11 describe parallel computing tools developed within CP. DIME supports parallel mesh generation and adaptation, and its use in general finite element codes. Initially, we thought load balancing would be a major stumbling block for parallel computing because formally it is an NP-complete (intractable) optimization problem. However, effective heuristic methods were developed which avoid the exponential time complexity of NP-complete problems by searching for good but not exact minima.

In Chapter 12, we describe the most complex irregular loosely synchronous problems which include some of the hardest problems tackled in CP.

As described earlier, we implemented essentially all the applications described in the book using explicit user-generated message passing. In Chapter 13, we describe our initial efforts to produce a higher level data-parallel Fortran environment, which should be able to provide a more attractive software environment for the user. High Performance Fortran has been adopted as an informal industry standard for this language.

In Chapter 14, we describe the very difficult asynchronous problem class for which scaling parallel algorithms and the correct software model are less clear. Chapters 15, 16, and 17 describe four software models, Zipcode, MOOSE, Time Warp, and MOVIE which tackle asynchronous and the mixture of asynchronous and loosely synchronous problems one finds in the complex system simulations and analysis typical of many real-world problems. Applications of this class are described in Chapter 18, with the application of Section 18.3 being an event-driven simulation-an important class of difficult-to-parallelize asynchronous applications.

In Chapter 19 we look to the future and describe some
possibilities for the use of parallel computers in industry. Here we
note that CP, and much of the national enterprise, concentrated on
scientific and engineering computations. The examples and ``proof''
that parallel computing works are focussed in this book on such
problems. However, this will *not* be the dominant industrial
use of parallel computers where information processing is most
important. This will be used for decision support in the military and
large corporations, and to supply video, information and simulation
``on demand'' for homes, schools, and other institutions. Such
applications have recently been termed *national challenges* to
distinguish them from the large scale *grand challenges*, which
underpinned the initial HPCC initiative [FCCSET:94a]. The lessons
CP and others have learnt from scientific computations will have
general applicability across the wide range of industrial problems.

Chapter 20 includes a discussion of education in computational science-an unexpected byproduct of CP-and other retrospective remarks. The appendices list the CP reports including those not cited directly in this book. Some information is available electronically by mailing citlib@caltech.edu.

**
**

Wed Mar 1 10:19:35 EST 1995