Low-contention depth-first scheduling of parallel computations with synchronization variables

Fatourou, Panagiota


Fatourou, P. (2000). Low-contention depth-first scheduling of parallel computations with synchronization variables (MPI-I-2000-1-003). Saarbrücken: Max-Planck-Institut für Informatik.

Item is Released

Basic

Item Permalink: https://hdl.handle.net/11858/00-001M-0000-0014-6F2B-B

Version Permalink: https://hdl.handle.net/11858/00-001M-0000-0014-77A9-5

Genre: Report

Files

2000-1-003 (Any fulltext), 12KB

File Permalink: https://hdl.handle.net/11858/00-001M-0000-0014-6F2D-7

Name: 2000-1-003

Description: -

OA-Status:

Visibility: Public

MIME-Type / Checksum: text/html / [MD5]

Copyright Date: -

Copyright Info: -

License: -

Creators

Creators:
Fatourou, Panagiota¹, Author

Affiliations:
¹ Algorithms and Complexity, MPI for Informatics, Max Planck Society, ou_24019

Content

Free keywords: -

Abstract: In this paper, we present a randomized, online, space-efficient algorithm for the general class of programs with synchronization variables (such programs are produced by parallel programming languages such as Cool, ID, Sisal, Mul-T, OLDEN and Jade). The algorithm achieves good locality and low scheduling overheads for this general class of computations by combining work-stealing and depth-first scheduling. More specifically, given a computation with work $T_1$, depth $T_\infty$ and $\sigma$ synchronizations whose execution requires space $S_1$ on a single-processor computer, our algorithm achieves expected space complexity at most $S_1 + O(PT_\infty \log (PT_\infty))$ and runs in an expected number of $O(T_1/P + \sigma \log (PT_\infty)/P + T_\infty \log (PT_\infty))$ timesteps on a shared-memory parallel machine with $P$ processors. Moreover, for any $\varepsilon > 0$, the space complexity of our algorithm is at most $S_1 + O(P(T_\infty + \ln (1/\varepsilon)) \log (P(T_\infty + \ln(P(T_\infty + \ln (1/\varepsilon))/\varepsilon))))$ with probability at least $1-\varepsilon$. Thus, even for values of $\varepsilon$ as small as $e^{-T_\infty}$, the space complexity of our algorithm is at most $S_1 + O(PT_\infty \log(PT_\infty))$ with probability at least $1-e^{-T_\infty}$. The algorithm achieves good locality and low scheduling overheads by automatically increasing the granularity of the work scheduled on each processor. Our results combine and extend previous algorithms and analysis techniques (published by Blelloch et al. [6] and by Narlikar [26]). Our algorithm not only exhibits the same good space complexity for the general class of programs with synchronization variables as its deterministic analog presented in [6], but also achieves the good locality and low scheduling overhead of the algorithm presented in [26], which, however, performs well only for the more restricted class of nested parallel computations.
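The step from the high-probability space bound to its simplified form at $\varepsilon = e^{-T_\infty}$ can be checked by direct substitution. The following is only a sketch of that check, not the derivation given in the report; it assumes $P \ge 1$, $T_\infty \ge 1$ and $PT_\infty \ge 2$, uses the elementary inequality $\ln x \le x$, and its explicit constants are illustrative.

```latex
% Sketch: substitute \varepsilon = e^{-T_\infty}, so that \ln(1/\varepsilon) = T_\infty.
% Assumptions (not stated in the abstract): P >= 1, T_\infty >= 1, P T_\infty >= 2.
\begin{align*}
P\bigl(T_\infty + \ln(1/\varepsilon)\bigr)
  &= 2PT_\infty, \\
\ln\!\bigl(P(T_\infty + \ln(1/\varepsilon))/\varepsilon\bigr)
  &= \ln(2PT_\infty) + T_\infty
   \le 2PT_\infty + T_\infty
   \le 3PT_\infty, \\
\log\!\bigl(P(T_\infty + 3PT_\infty)\bigr)
  &\le \log\bigl(4P^2T_\infty\bigr)
   \le \log\bigl(4(PT_\infty)^2\bigr)
   = O\bigl(\log(PT_\infty)\bigr), \\
S_1 + O\bigl(2PT_\infty \cdot \log(PT_\infty)\bigr)
  &= S_1 + O\bigl(PT_\infty \log(PT_\infty)\bigr).
\end{align*}
```

Under these assumptions the extra $\ln(1/\varepsilon)$ terms contribute only constant factors, which is why the bound that holds with probability at least $1-e^{-T_\infty}$ has the same form as the expected-space bound.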

Details

Language(s): eng - English

Dates: Date issued: 2000

Publication Status: Issued

Pages: 56 p.

Publishing info: Saarbrücken : Max-Planck-Institut für Informatik

Table of Contents: -

Rev. Type: -

Identifiers: URI: http://domino.mpi-inf.mpg.de/internet/reports.nsf/NumberView/2000-1-003
Report Nr.: MPI-I-2000-1-003

Degree: -

Source 1

Title: Research Report / Max-Planck-Institut für Informatik

Source Genre: Series

Creator(s):

Affiliations:

Publ. Info: -

Pages: -

Volume / Issue: -

Sequence Number: -

Start / End Page: -

Identifier: -