Gossip protocol
Encyclopedia
A gossip protocol is a style of computer-to-computer communication protocol inspired by the form of gossip
seen in social networks. Modern distributed systems
often use gossip protocols to solve problems that might be difficult to solve in other ways, either because the underlying network has an inconvenient structure, is extremely large, or because gossip solutions are the most efficient ones available.
The term epidemic protocol is sometimes used as a synonym for a gossip protocol, because gossip spreads information in a manner similar to the spread of a virus in a biological community.
The power of gossip lies in the robust spread of information. Even if Jill had trouble understanding Sally, she will probably run into someone else soon and can learn the news that way.
Expressing these ideas in more technical terms, a gossip protocol is one that satisfies the following conditions:
Many protocols that predate the earliest use of the term "gossip" fall within this rather inclusive definition. For example, Internet routing protocol
s often use gossip-like information exchanges. A gossip substrate can be used to implement a standard routed network: nodes "gossip" about traditional point-to-point messages, effectively pushing traffic through the gossip layer. Bandwidth permitting, this implies that a gossip system can potentially support any classic protocol or implement any classical distributed service. However, such a broadly inclusive interpretation is rarely intended. More typically gossip protocols are those that specifically run in a regular, periodic, relatively lazy, symmetric and decentralized manner; the high degree of symmetry among nodes is particularly characteristic. Thus, while one could run a 2-phase commit protocol
over a gossip substrate, doing so would be at odds with the spirit, if not the wording, of the definition.
Frequently, the most useful gossip protocols turn out to be those with exponentially rapid convergence towards a state that "emerges" with probability 1.0. A classic distributed computing problem, for example, involves building a tree whose inner nodes are the nodes in a network and whose edges represent links between computers (for routing, as a dissemination overlay, etc). Not all tree-building protocols are gossip protocols (for example, spanning tree constructions in which a leader initiates a flood), but gossip offers a decentralized solution that is useful in many situations.
The term convergently consistent is sometimes used to describe protocols that achieve exponentially rapid spread of information. For this purpose, a protocol must propagate any new information to all nodes that will be affected by the information within time logarithmic in the size of the system (the "mixing time" must be logarithmic in system size).
If the messages might get large (for example, if many searches are active all at the same time), a size limit should be introduced. Also, searches should "age out" of the network.
It follows that within logarithmic time in the size of the network (the number of agents), any new search string will have reached all agents. Within an additional delay of the same approximate length, every agent will learn where the best match can be found. In particular, the agent that started the search will have found the best match.
For example, in a network with 25,000 machines, we can find the best match after about 30 rounds of gossip: 15 to spread the search string and 15 more to discover the best match. A gossip exchange could occur as often as once every tenth of a second without imposing undue load, hence this form of network search could search a big data center in about 3 seconds.
In this scenario, searches might automatically age out of the network after, say, 10 seconds. By then, the initiator knows the answer and there is no point in further gossip about that search.
Gossip protocols have also been proposed for such tasks as maintaining databases or other kinds of files in consistent states, counting the number of nodes in a network of unknown size, spreading news robustly, organizing nodes according to some structuring policy, building so-called overlay networks, computing aggregates, sorting the nodes in a network, electing leaders, etc.
Gossip
Gossip is idle talk or rumour, especially about the personal or private affairs of others, It is one of the oldest and most common means of sharing facts and views, but also has a reputation for the introduction of errors and variations into the information transmitted...
seen in social networks. Modern distributed systems
Distributed computing
Distributed computing is a field of computer science that studies distributed systems. A distributed system consists of multiple autonomous computers that communicate through a computer network. The computers interact with each other in order to achieve a common goal...
often use gossip protocols to solve problems that might be difficult to solve in other ways, either because the underlying network has an inconvenient structure, is extremely large, or because gossip solutions are the most efficient ones available.
The term epidemic protocol is sometimes used as a synonym for a gossip protocol, because gossip spreads information in a manner similar to the spread of a virus in a biological community.
Gossip communication
The concept of gossip communication can be illustrated by the analogy of office workers spreading rumors. Let's say each hour the office workers congregate around the water cooler. Each employee pairs off with another, chosen at random, and shares the latest gossip. At the start of the day, Ted starts a new rumor: he comments to Sally that he believes Fred dyes his mustache. At the next meeting, Sally tells Jill, while Ted repeats the idea to Sam. After each water cooler rendezvous, the number of individuals who have heard the rumor roughly doubles (though this doesn't account for gossiping twice to the same person; perhaps Ted tries to tell his story to Mark, only to find that Mark already heard it from Jill). Computer systems typically implement this type of protocol with a form of random "peer selection": with a given frequency, each machine picks another machine at random and shares any hot rumors.The power of gossip lies in the robust spread of information. Even if Jill had trouble understanding Sally, she will probably run into someone else soon and can learn the news that way.
Expressing these ideas in more technical terms, a gossip protocol is one that satisfies the following conditions:
- The core of the protocol involves periodic, pairwise, inter-process interactions.
- The information exchanged during these interactions is of bounded size.
- When agents interact, the state of at least one agent changes to reflect the state of the other.
- Reliable communication is not assumed.
- The frequency of the interactions is low compared to typical message latencies so that the protocol costs are negligible.
- There is some form of randomness in the peer selection. Peers might be selected from the full set of nodes or from a smaller set of neighborsNeighbourhood (graph theory)In graph theory, an adjacent vertex of a vertex v in a graph is a vertex that is connected to v by an edge. The neighbourhood of a vertex v in a graph G is the induced subgraph of G consisting of all vertices adjacent to v and all edges connecting two such vertices. For example, the image shows a...
.
Gossip protocol types
It is useful to distinguish three prevailing styles of gossip protocol:- Dissemination protocols (or rumor-mongering protocols). These use gossip to spread information; they basically work by flooding agents in the network, but in a manner that produces bounded worst-case loads:
- a) Event dissemination protocols use gossip to carry out multicasts. They report events, but the gossip occurs periodically and events don’t actually trigger the gossip. One concern here is the potentially high latency from when the event occurs until it is delivered.
- b) Background data dissemination protocols continuously gossip about information associated with the participating nodes. Typically, propagation latency isn’t a concern, perhaps because the information in question changes slowly or there is no significant penalty for acting upon slightly stale data.
- Anti-entropy protocols for repairing replicated data, which operate by comparing replicas and reconciling differences.
- Protocols that compute aggregates. These compute a network-wide aggregate by sampling information at the nodes in the network and combining the values to arrive at a system-wide value – the largest value for some measurement nodes are making, smallest, etc. The key requirement is that the aggregate must be computable by fixed-size pairwise information exchanges; these typically terminate after a number of rounds of information exchange logarithmic in the system size, by which time an all-to-all information flow pattern will have been established. As a side effect of aggregation, it is possible to solve other kinds of problems using gossip; for example, there are gossip protocols that can arrange the nodes in a gossip overlay into a list sorted by node-id (or some other attribute) in logarithmic time using aggregation-style exchanges of information. Similarly, there are gossip algorithms that arrange nodes into a tree and compute aggregates such as "sum" or "count" by gossiping in a pattern biased to match the tree structure.
Many protocols that predate the earliest use of the term "gossip" fall within this rather inclusive definition. For example, Internet routing protocol
Routing protocol
A routing protocol is a protocol that specifies how routers communicate with each other, disseminating information that enables them to select routes between any two nodes on a computer network, the choice of the route being done by routing algorithms. Each router has a priori knowledge only of...
s often use gossip-like information exchanges. A gossip substrate can be used to implement a standard routed network: nodes "gossip" about traditional point-to-point messages, effectively pushing traffic through the gossip layer. Bandwidth permitting, this implies that a gossip system can potentially support any classic protocol or implement any classical distributed service. However, such a broadly inclusive interpretation is rarely intended. More typically gossip protocols are those that specifically run in a regular, periodic, relatively lazy, symmetric and decentralized manner; the high degree of symmetry among nodes is particularly characteristic. Thus, while one could run a 2-phase commit protocol
Two-phase commit protocol
In transaction processing, databases, and computer networking, the two-phase commit protocol is a type of atomic commitment protocol . It is a distributed algorithm that coordinates all the processes that participate in a distributed atomic transaction on whether to commit or abort the transaction...
over a gossip substrate, doing so would be at odds with the spirit, if not the wording, of the definition.
Frequently, the most useful gossip protocols turn out to be those with exponentially rapid convergence towards a state that "emerges" with probability 1.0. A classic distributed computing problem, for example, involves building a tree whose inner nodes are the nodes in a network and whose edges represent links between computers (for routing, as a dissemination overlay, etc). Not all tree-building protocols are gossip protocols (for example, spanning tree constructions in which a leader initiates a flood), but gossip offers a decentralized solution that is useful in many situations.
The term convergently consistent is sometimes used to describe protocols that achieve exponentially rapid spread of information. For this purpose, a protocol must propagate any new information to all nodes that will be affected by the information within time logarithmic in the size of the system (the "mixing time" must be logarithmic in system size).
Examples
Suppose that we want to find the object that most closely matches some search pattern, within a network of unknown size, but where the computers are linked to one-another and where each machine is running a small agent program that implements a gossip protocol.- To start the search, a user would ask the local agent to begin to gossip about the search string. (We're assuming that agents either start with a known list of peers, or retrieve this information from some kind of a shared Web site.)
- Periodically, at some rate (let's say ten times per second, for simplicity), each agent picks some other agent at random, and gossips with it. Search strings known to A will now also be known to B, and vice versa. In the next "round" of gossip A and B will pick additional random peers, maybe C and D. This round-by-round doubling phenomenon makes the protocol very robust, even if some messages get lost, or some of the selected peers are the same or already know about the search string.
- On receipt of a search string for the first time, each agent checks its local machine for matching documents.
- Agents also gossip about the best match, to date. Thus, if A gossips with B, after the interaction, A will know of the best matches known to B, and vice versa. Best matches will "spread" through the network.
If the messages might get large (for example, if many searches are active all at the same time), a size limit should be introduced. Also, searches should "age out" of the network.
It follows that within logarithmic time in the size of the network (the number of agents), any new search string will have reached all agents. Within an additional delay of the same approximate length, every agent will learn where the best match can be found. In particular, the agent that started the search will have found the best match.
For example, in a network with 25,000 machines, we can find the best match after about 30 rounds of gossip: 15 to spread the search string and 15 more to discover the best match. A gossip exchange could occur as often as once every tenth of a second without imposing undue load, hence this form of network search could search a big data center in about 3 seconds.
In this scenario, searches might automatically age out of the network after, say, 10 seconds. By then, the initiator knows the answer and there is no point in further gossip about that search.
Gossip protocols have also been proposed for such tasks as maintaining databases or other kinds of files in consistent states, counting the number of nodes in a network of unknown size, spreading news robustly, organizing nodes according to some structuring policy, building so-called overlay networks, computing aggregates, sorting the nodes in a network, electing leaders, etc.
Epidemic Algorithms
Gossip protocols can be used to propagate information in a manner rather similar to the way that a viral infection spreads in a biological population. Indeed, the mathematics of epidemics are often used to model the mathematics of gossip communication. The term epidemic algorithm is sometimes employed when describing a software system in which this kind of gossip-based information propagation is employed.Biased Gossip
Above, we described a purely random peer-selection scheme for gossip: when agent A decides to run a gossip round, it picks some peer B uniformly and at random within the network as a whole (or launches a message on a random walk that will terminate at a random agent). More commonly, gossip algorithms are designed so that agents interact mostly with nearby agents, and only sometimes with agents that are far away (in terms of network delay). These biased gossip protocols need to ensure a sufficient degree of connectivity to avoid the risk of complete disconnection of one side of a network from the other, but if care is taken, can be faster and more efficient than protocols that are purely random. Moreover, as a purely practical question, it is much easier to maintain lists of peers in ways that might be somewhat biased.See also
- Gossip protocols are just one class among many classes of networking protocols. See also virtual synchronyVirtual synchronyVirtual synchrony is an interprocess messaging passing technology. Virtual synchrony systems allow programs running in a network to organize themselves into process groups, and to send messages to groups...
, distributed state machines, Paxos algorithmPaxos algorithmPaxos is a family of protocols for solving consensus in a network of unreliable processors.Consensus is the process of agreeing on one result among a group of participants...
, database transactionsDatabase transactionA transaction comprises a unit of work performed within a database management system against a database, and treated in a coherent and reliable way independent of other transactions...
. Each class contains tens or even hundreds of protocols, differing in their details and performance properties but similar at the level of the guarantees offered to users.
- Some Gossip protocols replace the random peer selection mechanism with a more deterministic scheme. For example, in the NeighbourCast Algorithm, instead of talking to random nodes, information is spread by talking only to neighbouring nodes. There are a number of algorithms that use similar ideas. A key requirement when designing such protocols is that the neighbor set trace out an expander graphExpander graphIn combinatorics, an expander graph is a sparse graph that has strong connectivity properties, quantified using vertex, edge or spectral expansion as described below...
.
- RoutingRoutingRouting is the process of selecting paths in a network along which to send network traffic. Routing is performed for many kinds of networks, including the telephone network , electronic data networks , and transportation networks...