Robust Parallel Computations through Randomization

Kontogiannis, Spyros; Pantziou, Grammati E.; Spirakis, Paul G.; Yung, Moti

Datensatz

DATENSATZ AKTIONENEXPORT

Zur Ablage hinzufügen

Lokale TagsFreigabegeschichteDetailsÜbersicht

Freigegeben

Zeitschriftenartikel

Robust Parallel Computations through Randomization

MPG-Autoren

/persons/resource/persons44823

Kontogiannis, Spyros
Algorithms and Complexity, MPI for Informatics, Max Planck Society;

/persons/resource/persons45532

Spirakis, Paul G.
Algorithms and Complexity, MPI for Informatics, Max Planck Society;

Externe Ressourcen

Es sind keine externen Ressourcen hinterlegt

Volltexte (beschränkter Zugriff)

Für Ihren IP-Bereich sind aktuell keine Volltexte freigegeben.

Volltexte (frei zugänglich)

Es sind keine frei zugänglichen Volltexte in PuRe verfügbar

Ergänzendes Material (frei zugänglich)

Es sind keine frei zugänglichen Ergänzenden Materialien verfügbar

Zitation

Kontogiannis, S., Pantziou, G. E., Spirakis, P. G., & Yung, M. (2000). Robust Parallel Computations through Randomization. Theory of Computing Systems, 33(5/6), 427-464. Retrieved from http://www.mpi-sb.mpg.de/~spyros/papers/KPSY-TOCS2000.pdf.gz.

Zitierlink: https://hdl.handle.net/11858/00-001M-0000-000F-33F9-E

Zusammenfassung

In this paper we present an efficient general simulation strategy for computations designed for fully operational BSP machines of n ideal processors, on n-processor dynamic-fault-prone BSP machines. The fault occurrences are fail-stop and fully dynamic, i.e., they are allowed to happen on-line at any point of the computation, subject to the constraint that the total number of faulty processors may never exceed a known fraction. The computational paradigm can be exploited for robust computations over virtual parallel settings with a volatile underlying infrastructure, such as a NETWORK OF WORKSTATIONS (where workstations may be taken out of the virtual parallel machine by their owner). Our simulation strategy is Las Vegas (i.e., it may never fail, due to backtracking operations to robustly stored instances of the computation, in case of locally unrecoverable situations). It adopts an adaptive balancing scheme of the workload among the currently live processors of the BSP machine. Our strategy is efficient in the sense that, compared with an optimal off-line adversarial computation under the same sequence of fault occurrences, it achieves an O(log n loglog n)^2 multiplicative factor times the optimal work (namely, this measure is in the sense of the "competitive ratio" of on-line analysis). In addition, our scheme is modular, integrated, and considers many implementation points. We comment that, to our knowledge, no previous work on robust parallel com-putations has considered fully dynamic faults in the BSP model, or in general distributed memory systems. Furthermore, this is the first time an efficient Las Vegas simulation in this area is achieved.