for I<>i = 0 to dofor I<>j = 0 to do
a(i,j) = a(i-3,j) + a(i,j-2)
b(i,j) = a(i-2,j-3) + b(i-2,j-1)
enddo
enddo
This loop nest has depth 2. The dependences are uniform (intuitively, dependence vectors are translations), and they can be
encapsulated into the dependence matrix
The atomicity constraint can be expressed by the analytical condition , where H is the matrix of vectors normal to the edges of the tile. Irigoin and Triolet [13] were the first to derive the condition . In the context of vector machines with a two-level memory hierarchy, they have introduced tiling as a much more powerful technique than strip-mining or rectangular partitioning [15]. In their paper, tiling consists in aggregating elemental computation points in tiles (or supernode) defined as multi-dimensional parallelepipeds so that each tile executes atomically with no intervening synchronization or communication. This approach has very close objectives to those of Schreiber and Dongarra [17] who aim at automatically designing block versions of nested loop kernels. Schreiber and Dongarra [17] have discussed how to determine the size and shape of the tiles so as to optimize the communication-to-computation ratio. Their work has been extended by Ramanujam and Sadayappan [16], and by Boulet et al. [3]. Several other papers have discussed the same framework, including [18, 4, 1, 5].