next up previous
Next: Back to Example Up: Tiling as a loop Previous: Tiling as a loop

Example tex2html_wrap797

 

 for I<>i = 0 to tex2html_wrap_inline777 do

for I<>j = 0 to tex2html_wrap_inline781 do

a(i,j) = a(i-3,j) + a(i,j-2)

b(i,j) = a(i-2,j-3) + b(i-2,j-1)

enddo

enddo

This loop nest has depth 2. The dependences are uniform (intuitively, dependence vectors are translations), and they can be encapsulated into the dependence matrix
displaymath789

The atomicity constraint can be expressed by the analytical condition tex2html_wrap_inline791, where H is the matrix of vectors normal to the edges of the tile. Irigoin and Triolet [13] were the first to derive the condition tex2html_wrap_inline791. In the context of vector machines with a two-level memory hierarchy, they have introduced tiling as a much more powerful technique than strip-mining or rectangular partitioning [15]. In their paper, tiling consists in aggregating elemental computation points in tiles (or supernode) defined as multi-dimensional parallelepipeds so that each tile executes atomically with no intervening synchronization or communication. This approach has very close objectives to those of Schreiber and Dongarra [17] who aim at automatically designing block versions of nested loop kernels. Schreiber and Dongarra [17] have discussed how to determine the size and shape of the tiles so as to optimize the communication-to-computation ratio. Their work has been extended by Ramanujam and Sadayappan [16], and by Boulet et al. [3]. Several other papers have discussed the same framework, including [18, 4, 1, 5]gif.



Jack Dongarra
Sat Feb 8 08:17:58 EST 1997