Now we switch topics and consider the spatial properties of complex systems.
The size N of the complex system is obviously an important
property. Note that we think of a complex system as a set of members with their spatial structure evolving with time. Sometimes,
the time domain has a definite ``size'' but often one can evolve the
system indefinitely in time. However, most complex systems have a
natural spatial size with the spatial domain consisting of N
members. In the examples of Section 3.3, the seismic example
had a definite spatial extent and unlimited time domain; on the other
hand, Gaussian elimination had
spatial members evolving for a fixed number of n ``time'' steps. As
usual, the value of spatial size N will depend on the granularity or
detail with which one looks at the complex system. One could consider
a parallel computer as a complex system constructed as a collection of
transistors with a correspondingly very large value of
but here we will view the processor node as the fundamental entity and
define the spatial size of a parallel computer viewed as a complex
system, by the number
of processing nodes.
Now is a natural time to define the von Neumann complex system
spatial structure which is relevant, of course, for computer architecture. We
will formally define this to be a system with no spatial extent, i.e.
size . Of course, a von Neumann node can have a
sophisticated structure if we look at fine enough resolution with multiple
functional units. More precisely, perhaps, we can generalize this complex
system to one where
is small and will not scale up to large
values.
Consider mapping a seismic simulation with grid
points onto a parallel machine with
processors. An important
parameter is the grain size n of the resultant decomposition. We
can introduce the problem grain size
and the computer grain size
as the memory contained in each node of the parallel computer. Clearly we
must have,
if we measure memory size in units of seismic grid points. More
interestingly, in Equation 3.10 we will relate the
performance of the parallel implementation of the seismic simulation to
and other problem and computer characteristics. We find that,
in many cases, the parallel performance only depends on
and
in the combination
and so
grain size is a critical parameter in determining the effectiveness
of parallel computers for a particular application.
The next set of parameters describe the topology or structure of the
spatial domain associated with the complex system. The simplest parameter
of this type is the geometric dimension of the space. As
reviewed in Chapter 2, the original hardware and, in fact, software
(see Chapter 5) exhibited a clear geometric structure for
or
. The binary hypercube of dimension d
had this as its geometric dimension. This was an effective architecture
because it was richer than the topologies of most problems. Thus, consider
mapping a problem of dimension
onto a computer of dimension
. Suppose the software system preserves the spatial structure
of the problem and that
. Then, one can show
that the parallel computing overhead f has a term due to internode
communication that has the form,
with parallel speedup S given by
The communication overhead depends on the problem grain size
and computer complex system
. It also involves
two parameters specifying the parallel hardware performance.
The definitions of and
are imprecise
above. In particular,
depends on the nature of node and
can take on very different values depending on the details of the
implementation; floating-point operations are much faster when the
operands are taken from registers than from slower parts of the memory
hierarchy. On systems built from processors like the Intel i860 chip,
these effects can be large;
could be
from registers (50 MFLOPS) and larger by a factor of ten when the
variables a,b are fetched from dynamic RAM. Again, communication
speed
depends on internode message size (a software
characteristic) and the latency (startup time) and
bandwidth of the computer communication subsystem.
Returning to Equation 3.10, we really only need to understand
here that the term indicates that communication
overhead depends on relative performance of the internode communication
system and node (floating-point) processing unit. A real study of
parallel computer performance would require a deeper discussion of the
exact values of
and
. More interesting here is
the dependence
on the number of
processors
and problem grain size
. As
described above, grain size
depends on both the problem and the computer. The values of
and
are given by
independent of computer parameters, while if
The results in Equation 3.13 quantify the penalty, in
terms of a value of that increases with
, for a
computer architecture that is less rich than the problem
architecture. An attractive feature of the
hypercube architecture is that
is large and one is
essentially always in the regime governed by the top line in
Equation 3.13. Recently, there has been a trend away
from rich topologies like the hypercube towards the view that the node
interconnect should be considered as a routing network or switch to be
implemented in the very best technology. The original MIMD machines
from Intel, nCUBE, and AMETEK all used
hypercube topologies, as did the SIMD Connection
Machines CM-1 and CM-2. The nCUBE-2,
introduced in 1990, still uses a hypercube topology but both it and the
second generation Intel iPSC/2 used a hardware routing that ``hides''
the hypercube connectivity. The latest Intel Paragon and Touchstone
Delta and Symult (ex-Ametek) 2010 use a two-dimensional mesh with
wormhole routing. It is not clear how to incorporate these new node
interconnects into the above picture and further research is needed.
Presumably, we would need to add new complex system properties and
perhaps generalize the definition of dimension
as we
will see below is in fact necessary for Equation 3.10 to
be valid for problems whose structure is not geometrically based.
Returning to Equations 3.10 through 3.12,
we note that we have not properly defined the correct dimension
or
to use. We have implicitly equated
this with the natural geometric dimension but this is not
always correct. This is illustrated by the complex system
consisting of a set of particles in three dimensions interacting
with a long range force such as gravity or electrostatic charge. The
geometric structure is local with
but
the complex system structure is quite different; all particles are
connected to all others. As described in Chapter 3 of [Fox:88a],
this implies that
whatever the
underlying geometric structure. We define the system
dimension
for a general complex
system to reflect the system connectivity. Consider Figure 3.9
which shows a general domain D in a complex system. We define the
volume
of this domain by the information in it. Mathematically,
is the computational complexity needed to simulate D in isolation.
In a geometric system
where L is a geometric length scale. The domain D is not in general
isolated and is connected to the rest of the complex system. Information
flows in D and again in a geometric system.
is a surface
effect with
Figure 3.9: The Information Density and Flow in a General Complex System
with Length Scale L
If we view the complex system as a graph, is related to the
number of links of the graph inside D and
is related to the
number of links cut by the surface of D. Equations 3.14
and 3.15 are altered in cases like the long-range
force problem where the complex system
connectivity is no longer geometric. We define the system dimension to
preserve the surface versus volume interpretation of
Equation 3.15 compared to Equation 3.14.
Thus, generally we define
With this definition of system dimension , we will
find that Equations 3.10 through 3.12
essentially hold in general. In particular for the long range force problem,
one finds
independent of
.
A very important special type of spatial structure is the case
when we find the embarrassingly
parallel or spatially
disconnected complex system. Here there is no connection between
the different members in the spatial domain. Applying this to parallel
computing, we see that if
or
is
spatially disconnected, then it can be parallelized very straightforwardly.
In particular, any MIMD machine can be used whatever the temporal
structure of the complex system. SIMD machines can only be used to
simulate embarrassingly parallel complex systems which have spatially
disconnected members with identical structure.
In Section 13.7, we extend the analysis of this section to cover the performance of hierarchical memory machines. We find that one needs to replace subdomains in space with those in space-time.
In Chapter 11, we describe other interesting spatial properties in terms of a particle analogy. We find system temperature and phase transitions as one heats and cools the complex system.