next up previous
Next: Optimizing the number of Up: Optimizing the tile size Previous: Optimizing the tile size

Proof

We break down the problem into two subcases depending on the values taken by the function f, whose argument tex2html_wrap_inline826 ranges from 1 to tex2html_wrap_inline1004;


This result is disappointing in that we end up with degenerate tiles in most practical situations. For instance if tex2html_wrap_inline1080 (which is very likely to happen in practice), tex2html_wrap_inline1082, and the optimal tile size is tex2html_wrap_inline1084, not a very coarse-grain tiling indeed! The flaw is that the model is not accurate enough to take the impact of data locality and data reuse into account (which are the main objectives of tiling). A first solution is to model the computation cost of a tile by an affine expression like tex2html_wrap_inline1086, where u represents some access overhead. It is not difficult to plug this expression into the formula given for the total execution time T, and to derive the optimal tile size. Another solution is to assume a fixed tile size that would be imposed by some a priori considerations (such as the cache size). Again, we can let tex2html_wrap_inline1092 in Equation (1), and minimize T for tex2html_wrap_inline826, say.


next up previous
Next: Optimizing the number of Up: Optimizing the tile size Previous: Optimizing the tile size

Jack Dongarra
Sat Feb 8 08:17:58 EST 1997