Free keywords:
-
Abstract:
In this paper we study the problem of
efficiently scheduling a wide class of multithreaded
computations, called {\em strict};
that is, computations in which all dependencies
from a thread go to the thread's
ancestors in the computation tree.
Strict multithreaded computations allow the
limited use of synchronization primitives.
We present the {\em first} fully distributed
scheduling algorithm that applies to {\em any}
strict multithreaded computation. The algorithm
is asynchronous and on-line, and it follows the
{\em work-stealing} paradigm. We prove that
our algorithm is efficient not only in terms of
its memory requirements and its execution time, but also in terms of its
communication complexity.
Our analysis applies to both shared and distributed memory
machines.
More specifically,
the expected execution time of our algorithm
is $O(T_1/P + hT_{\infty})$, where
$T_1$ is the minimum serial execution
time, $T_{\infty}$ is the minimum execution
time with an infinite number of processors,
$P$ is the number of processors
and $h$ is the maximum ``distance'' in the
{\em computation tree} between
any two threads that need to communicate.
Furthermore, the total space required during the execution
is $O(S_1 P)$, where $S_1$ is the space required
by a serial computer to execute the
computation, while the expected communication cost
incurred by our algorithm is $O(PhT_{\infty} (1+n_d) S_{max})$,
where $n_d$ is the maximum number of dependencies
entering any thread and $S_{max}$ is the largest
amount of storage needed for the execution of
any specific thread of the computation.
Our communication complexity bound is {\em the first}
nontrivial communication bound ever proved for the
model of strict parallel programming.
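The work-stealing paradigm the abstract refers to can be illustrated with a minimal sketch (not taken from the paper; all names and the round-robin simulation loop are illustrative assumptions): each processor owns a double-ended queue of threads, works at the bottom of its own deque, and, when idle, steals from the top of a randomly chosen victim's deque.

```python
import random
from collections import deque

def work_stealing_run(tasks_per_worker, num_workers=4, seed=0):
    """Toy sequential simulation of the work-stealing discipline:
    each worker pushes/pops work at the bottom of its own deque,
    and an idle worker steals one task from the top of a random
    victim's deque. Returns the per-worker execution order and
    the number of successful steals."""
    rng = random.Random(seed)
    deques = [deque(ts) for ts in tasks_per_worker]
    executed = [[] for _ in range(num_workers)]
    steals = 0
    while any(deques):
        for w in range(num_workers):
            if deques[w]:
                # Busy worker: pop from the bottom of its own deque.
                executed[w].append(deques[w].pop())
            else:
                # Idle worker: attempt to steal from the top of a
                # uniformly random victim's deque.
                victim = rng.randrange(num_workers)
                if victim != w and deques[victim]:
                    executed[w].append(deques[victim].popleft())
                    steals += 1
    return executed, steals
```

Stealing from the opposite end of the victim's deque is what keeps contention between the victim and the thief low and, in the paper's analysis, is central to bounding both the space ($O(S_1 P)$) and the communication incurred by steal attempts.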