Abstract:
For the design and analysis of algorithms that process huge data sets,
a machine model is needed that handles parallel disks. There seems to
be a dilemma between simple and flexible use of such a model and
accurate modelling of details of the hardware. This paper explains
how many aspects of this problem can be resolved. The programming
model implements one large logical disk allowing concurrent access to
arbitrary sets of variable-size blocks. This model can be implemented
efficiently on multiple independent disks even if zones with
different speeds, communication bottlenecks, and failed disks are
allowed. These results not only provide useful algorithmic tools but
also imply a theoretical justification for studying external memory
algorithms using simple abstract models.
The algorithmic approach is random redundant placement of data and
optimal scheduling of accesses. The analysis generalizes a previous
analysis for simple abstract external memory models in several ways
(higher efficiency, variable block sizes, a more detailed disk model).
As a side effect, an apparently new Chernoff bound for sums of
weighted 0-1 random variables is derived.
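
The idea of random redundant placement combined with scheduling of accesses can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes unit-size blocks, two copies per block, and no zones, bottlenecks, or failed disks, and it finds a schedule with the smallest maximum disk load by a standard augmenting-path search. All names (place_blocks, feasible, schedule) are hypothetical.

```python
import random


def place_blocks(num_blocks, num_disks, copies=2, seed=0):
    """Store each block on `copies` distinct disks chosen uniformly at random."""
    rng = random.Random(seed)
    return [rng.sample(range(num_disks), copies) for _ in range(num_blocks)]


def feasible(requests, placement, num_disks, cap):
    """Assign each requested block to one disk holding a copy of it,
    with at most `cap` blocks per disk. Returns the assignment or None."""
    assigned = {d: [] for d in range(num_disks)}  # disk -> blocks read from it

    def try_assign(block, visited):
        for d in placement[block]:
            if d in visited:
                continue
            visited.add(d)
            if len(assigned[d]) < cap:
                assigned[d].append(block)
                return True
            # Disk d is full: try to move one of its blocks to another copy.
            for i, other in enumerate(assigned[d]):
                if try_assign(other, visited):
                    assigned[d][i] = block
                    return True
        return False

    for block in requests:
        if not try_assign(block, set()):
            return None
    return assigned


def schedule(requests, placement, num_disks):
    """Return (steps, assignment) with the smallest possible maximum disk load.
    Assumes `requests` contains distinct block indices."""
    for cap in range(1, len(requests) + 1):
        result = feasible(requests, placement, num_disks, cap)
        if result is not None:
            return cap, result
    return 0, {d: [] for d in range(num_disks)}


if __name__ == "__main__":
    placement = place_blocks(num_blocks=1000, num_disks=8)
    batch = random.Random(1).sample(range(1000), 32)
    steps, assignment = schedule(batch, placement, num_disks=8)
    print("parallel I/O steps needed:", steps)
```

With a single copy per block, one unlucky disk can receive many of the requested blocks. The second copy gives the scheduler a choice for every block, which is what lets the maximum load stay close to the average.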