Proof

Next: Conclusion Up: Optimal mapping and scheduling Previous: Proof

Proof

Let proc(I) be the processor that executes tile I. We have three cases to consider, depending upon whether proc(I) also executes both successors I' and I'', or exactly one of them, or none of them:

both successors:: proc(I)=proc(I')=proc(I'')
The same processor executes both successors. They are executed sequentially and the last one being executed cannot begin execution before time-step . As the result is proven.
one successor:: proc(I) = proc(I') and
(respectively proc(I) = proc(I'') and ). A communication is needed between I and I'' (respectively I and I'), hence (respectively )
no successor:: and
This case is similar to the previous one.

Back to the proof of the theorem, let the total execution time using P processors. Let Idle be the cumulated idle time of all processors during execution. Finally, let be the sequential execution time. Clearly,

Hence, to show that , we need to show that

The structure of the dependence graph does impose that some processors are idle at the beginning of the computation, which will lead to a lower bound for Idle. For instance, during the execution of tile (0,0), there are necessarily P-1 idle processors. To go on, we recursively define as follows (see Figure 5):

, and
for , is the one of the two successors of which is executed last: if , let I'=(i+1,j) and I''=(i,j+1) be the successors of tile I:
- If , then , and we define S(k) as the remaining tiles in column j: ),
- If , then , and we define S(k) as the remaining tiles in row i: ,

We know from Lemma 1 that for all

We prove by induction that for , at least P-k processors are kept idle between the beginning of the execution of and that of . This will lead to be result that

This will prove the desired result, because the same amount of idleness, so to speak, will be spent at the end of the computation (by symmetry of the dependence graph). Now, for the induction:

Let k=1: is either (0,1) or (1,0). See Figure 5 where and . Between the the beginning of the execution of and that of , the only successors of that can be executed are in S(1). But all tasks in S(1) must be executed sequentially, hence between the beginning of the execution of and that of , at least (P-1) processors are kept idle.
Assume that the hypothesis is true until step k. Between the beginning of the execution of and that of , at most one processor can be active in S(1), at most another one in S(2), , and at most one processor in S(k+1), so that at most k+1 processors can be active, or equivalently, at least P-(k+1) processors remain idle.

Figure 5: A schedule when and . Pivot tiles are labeled, and sets S(k) are framed.

Next: Conclusion Up: Optimal mapping and scheduling Previous: Proof

Jack Dongarra
Sat Feb 8 08:17:58 EST 1997