Difference between revisions of "Distributed hash table"
(New page: == Overview == '''[[w:Distributed hash table|Distributed hash tables...) |
m |
||
Line 2: | Line 2: | ||
'''[[w:Distributed hash table|Distributed hash tables]]''' ('''DHTs''') are a class of decentralized [[w:Distributed computing|distributed systems]] that provide a lookup service similar to a [[w:hash table|hash table]]: (''name'', ''value'') pairs are stored in the DHT, and any participating [[w:node (networking)|node]] can efficiently retrieve the value associated with a given name. Responsibility for maintaining the mapping from names to values is distributed among the nodes, in such a way that a change in the set of participants causes a minimal amount of disruption. This allows DHTs to [[w:scale (computing)|scale]] to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures. | '''[[w:Distributed hash table|Distributed hash tables]]''' ('''DHTs''') are a class of decentralized [[w:Distributed computing|distributed systems]] that provide a lookup service similar to a [[w:hash table|hash table]]: (''name'', ''value'') pairs are stored in the DHT, and any participating [[w:node (networking)|node]] can efficiently retrieve the value associated with a given name. Responsibility for maintaining the mapping from names to values is distributed among the nodes, in such a way that a change in the set of participants causes a minimal amount of disruption. This allows DHTs to [[w:scale (computing)|scale]] to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures. | ||
− | DHTs form an infrastructure that can be used to build more complex services, such as [[distributed file system]]s, [[w:peer-to-peer|p2p]] [[w:file sharing|file sharing]] and [[w:content distribution|content distribution]] systems, cooperative [[w:web cache|web caching]], [[w:multicast|multicast]], [[w:anycast|anycast]], [[w:Domain name system|domain name services]], and [[w:instant messaging|instant messaging]]. Applications that use DHTs include [[w:BitTorrent|BitTorrent]], [[w:Overnet|Overnet]], [[w:YaCy|YaCy]], and the [[w:Coral Content Distribution Network|Coral Content Distribution Network]]. | + | DHTs form an infrastructure that can be used to build more complex services, such as [[w:distributed file system|distributed file system]]s, [[w:peer-to-peer|p2p]] [[w:file sharing|file sharing]] and [[w:content distribution|content distribution]] systems, cooperative [[w:web cache|web caching]], [[w:multicast|multicast]], [[w:anycast|anycast]], [[w:Domain name system|domain name services]], and [[w:instant messaging|instant messaging]]. Applications that use DHTs include [[w:BitTorrent|BitTorrent]], [[w:Overnet|Overnet]], [[w:YaCy|YaCy]], and the [[w:Coral Content Distribution Network|Coral Content Distribution Network]]. |
== The Kademlia DHT == | == The Kademlia DHT == |
Revision as of 08:40, 10 May 2007
Overview
Distributed hash tables (DHTs) are a class of decentralized distributed systems that provide a lookup service similar to a hash table: (name, value) pairs are stored in the DHT, and any participating node can efficiently retrieve the value associated with a given name. Responsibility for maintaining the mapping from names to values is distributed among the nodes, in such a way that a change in the set of participants causes a minimal amount of disruption. This allows DHTs to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.
DHTs form an infrastructure that can be used to build more complex services, such as distributed file systems, p2p file sharing and content distribution systems, cooperative web caching, multicast, anycast, domain name services, and instant messaging. Applications that use DHTs include BitTorrent, Overnet, YaCy, and the Coral Content Distribution Network.
The Kademlia DHT
Kademlia is a distributed hash table for decentralized peer to peer networks designed by Petar Maymounkov and David Mazières. It specifies the structure of the network and how the exchange of information has to take place through network node lookups. Kademlia nodes communicate among themselves using the UDP. Over an existing network, a new virtual or overlay network is created in which each node is identified by a number or node ID. The node ID serves not only as a nodes identification, but the Kademlia algorithm uses it to locate values (usually file hashes or keywords). In fact, the node ID provides a direct map to file hashes.
When searching for some value, the algorithm explores the network in several steps. Each step approaches the key until the contacted node returns the value or no more closer nodes are found. Like many other DHTs, Kademlia contacts only [math]O(\log n)[/math] (see Big O notation) nodes during the search out of a total of [math]n[/math] nodes in the system.
Further advantages are found particularly in the decentralized structure, which clearly increases the resistance against a denial of service attack. Even if a whole set of nodes are flooded, this will have limited effect on network availability, which will recover itself by knitting the network around these "holes".