# Item

ITEM ACTIONSEXPORT

Released

Report

#### Waste makes haste: tight bounds for loose parallel sorting

##### MPS-Authors

##### Locator

There are no locators available

##### Fulltext (public)

MPI-I-92-141.pdf

(Any fulltext), 19MB

##### Supplementary Material (public)

There is no public supplementary material available

##### Citation

Hagerup, T., & Raman, R.(1992). *Waste makes haste: tight
bounds for loose parallel sorting* (MPI-I-92-141). Saarbrücken: Max-Planck-Institut für Informatik.

Cite as: http://hdl.handle.net/11858/00-001M-0000-0014-B70C-1

##### Abstract

Conventional
parallel sorting requires the $n$ input
keys to be output in an array of size $n$, and is known to
take $\Omega({{\log n}/{\log\log n}})$ time using
any polynomial number of processors.
The lower bound does not apply to the more ``wasteful''
convention of {\em padded sorting}, which
requires the keys to be output in sorted order in
an array of size $(1 + o(1)) n$.
We give very fast randomized CRCW PRAM algorithms for
several padded-sorting problems.
Applying only pairwise comparisons to the input
and using $kn$ processors, where $2\le k\le n$,
we can padded-sort $n$ keys in
$O({{\log n}/{\log k}})$ time with
high probability (whp), which
is the best possible (expected)
run time for any comparison-based algorithm.
We also show how to padded-sort
$n$ independent random numbers
in $O(\log^*\! n)$ time whp with $O(n)$ work,
which matches a recent lower bound,
and how to padded-sort
$n$ integers in the range $ 1..n $
in constant time whp using $n$ processors.
If the integer sorting is required to be stable,
we can still solve the problem in
$O({{\log\log n}/{\log k}})$ time whp using
$kn$ processors, for any $k$ with $2\le k\le\log n$.
The integer sorting results require the
nonstandard OR PRAM; alternative implementations
on standard PRAM variants run in $O(\log\log n)$ time whp.
As an application of our padded-sorting algorithms,
we can solve approximate prefix summation problems
of size~$n$ with $O(n)$ work
in constant time whp on the OR PRAM,
and in $O(\log\log n)$ time whp on
standard PRAM variants.