Waste makes haste: tight bounds for loose parallel sorting

Hagerup, Torben; Raman, Rajeev

Item

ITEM ACTIONSEXPORT

Add to Basket

Local TagsRelease HistoryDetailsSummary

Released

Report

Waste makes haste: tight bounds for loose parallel sorting

MPS-Authors

/persons/resource/persons44564

Hagerup, Torben
Algorithms and Complexity, MPI for Informatics, Max Planck Society;

/persons/resource/persons45250

Raman, Rajeev
Algorithms and Complexity, MPI for Informatics, Max Planck Society;

External Resource

No external resources are shared

Fulltext (restricted access)

There are currently no full texts shared for your IP range.

Fulltext (public)

MPI-I-92-141.pdf
(Any fulltext), 19MB

Supplementary Material (public)

There is no public supplementary material available

Citation

Hagerup, T., & Raman, R.(1992). Waste makes haste: tight bounds for loose parallel sorting (MPI-I-92-141). Saarbrücken: Max-Planck-Institut für Informatik.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0014-B70C-1

Abstract

Conventional parallel sorting requires the $n$ input keys to be output in an array of size $n$, and is known to take $\Omega({{\log n}/{\log\log n}})$ time using any polynomial number of processors. The lower bound does not apply to the more ``wasteful'' convention of {\em padded sorting}, which requires the keys to be output in sorted order in an array of size $(1 + o(1)) n$. We give very fast randomized CRCW PRAM algorithms for several padded-sorting problems. Applying only pairwise comparisons to the input and using $kn$ processors, where $2\le k\le n$, we can padded-sort $n$ keys in $O({{\log n}/{\log k}})$ time with high probability (whp), which is the best possible (expected) run time for any comparison-based algorithm. We also show how to padded-sort $n$ independent random numbers in $O(\log^*\! n)$ time whp with $O(n)$ work, which matches a recent lower bound, and how to padded-sort $n$ integers in the range $ 1..n $ in constant time whp using $n$ processors. If the integer sorting is required to be stable, we can still solve the problem in $O({{\log\log n}/{\log k}})$ time whp using $kn$ processors, for any $k$ with $2\le k\le\log n$. The integer sorting results require the nonstandard OR PRAM; alternative implementations on standard PRAM variants run in $O(\log\log n)$ time whp. As an application of our padded-sorting algorithms, we can solve approximate prefix summation problems of size~$n$ with $O(n)$ work in constant time whp on the OR PRAM, and in $O(\log\log n)$ time whp on standard PRAM variants.