ausblenden:
Schlagwörter:
Computer Science, Distributed, Parallel, and Cluster Computing, cs.DC
Zusammenfassung:
Consider a fully-connected synchronous distributed system consisting of $n$
nodes, where up to $f$ nodes may be faulty and every node starts in an
arbitrary initial state. In the synchronous counting problem, all nodes need to
eventually agree on a counter that is increased by one modulo some $C$ in each
round. In the self-stabilising firing squad problem, the task is to eventually
guarantee that all non-faulty nodes have simultaneous responses to external
inputs: if a subset of the correct nodes receive an external "go" signal as
input, then all correct nodes should agree on a round (in the not-too-distant
future) in which to jointly output a "fire" signal. Moreover, no node should
generate a "fire" signal without some correct node having previously received a
"go" signal as input.
We present a framework reducing both tasks to binary consensus at very small
cost: we maintain the resilience of the underlying consensus routine, while the
stabilisation time and message size are, up to constant factors, bounded by the
sum of the cost of the consensus routine for $f$ faults and recursively
applying our scheme to $f'<f/2$ faults. For example, we obtain a deterministic
algorithm for self-stabilising Byzantine firing squads with optimal resilience
$f<n/3$, asymptotically optimal stabilisation and response time $O(f)$, and
message size $O(\log f)$.
As our framework does not restrict the type of consensus routines used, we
also obtain efficient randomised solutions, and it is straightforward to adapt
our framework to allow for $f<n/2$ omission or $f<n$ crash faults,
respectively. Our results resolve various open questions on the two problems,
most prominently whether (communication-efficient) self-stabilising Byzantine
firing squads or (randomised) sublinear-time solutions for either problem
exist.