We will make the simplifying assumption that the same computation, producing and consuming the same number of data,
is done at every point of .
The memory required to execute a tile depends on the order in which the individual points of the tile are executed.
For this analysis, we assume that the points along hyperplanes normal to a given integer k-vector are executed
simultaneously. We need to store the values produced at earlier iterations that are required by the iterations on this
hyperplane. The number of such values is again given by the sum of volumes of ``shadows'' as
Here is the volume of the intersection of the hyperplane and the given tile,
i.e., of the set of iteration points computed at time t.
The maximum is taken over the relevant values of t.
This largest volume is a function of the tile dimensions and of the shape of the tile as well as of the time
coordinate . In general, it can be larger than the faces, but by no more than a constant factor
(at most ).
It may be much smaller, as in this case:
Let
and let and
so that the tiles are long and narrow and are nearly aligned with the axis.
The faces of these tiles have lengths of 10 and about 10.5. If we take , then the set of points
in the tile that are simultaneously executed is small; there are at most two.
On the other hand, if we choose , then there are 10 such points. So our earlier assertion that face
volume is a good measure of memory required is in doubt.
This is not, however, a real possibility. The example above depends on highly elongated tiles. This happens because the basis vectors (the columns of A) are close, and this in turn is due to a narrow cone . But in order for to be used in scheduling as described above, we must have . The difficulties described above are associated with a choice of nearly orthogonal to all of the basis vectors a, which are confined to lie in a narrow cone. Such a vector cannot also be in the cone.