Distributed hash table

== Overview ==

Distributed hash tables (DHTs) are a class of decentralized distributed systems that provide a lookup service similar to a hash table: (name, value) pairs are stored in the DHT, and any participating node can efficiently retrieve the value associated with a given name. Responsibility for maintaining the mapping from names to values is distributed among the nodes in such a way that a change in the set of participants causes a minimal amount of disruption. This allows DHTs to scale to extremely large numbers of nodes and to handle continual node arrivals, departures, and failures.

DHTs form an infrastructure that can be used to build more complex services, such as [[w:distributed file system|distributed file system]]s, [[w:peer-to-peer|p2p]] [[w:file sharing|file sharing]] and [[w:content distribution|content distribution]] systems, cooperative [[w:web cache|web caching]], [[w:multicast|multicast]], [[w:anycast|anycast]], [[w:Domain name system|domain name services]], and [[w:instant messaging|instant messaging]]. Applications that use DHTs include [[w:BitTorrent|BitTorrent]], [[w:Overnet|Overnet]], [[w:YaCy|YaCy]], and the [[w:Coral Content Distribution Network|Coral Content Distribution Network]].

DHTs are scalable network infrastructures that support Internet-scale network applications using a decentralized resource model. At their core, these overlays provide [[w:Key-Based Routing|key-based routing]] (KBR), where a message addressed to any key is incrementally routed towards the overlay node responsible for that key. On top of the KBR layer, these overlays can support distributed storage (a DHT layer) or data location (a DOLR layer).

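As a concrete illustration of the KBR primitive, here is a short Python sketch. It is illustrative only and follows no particular protocol: the 16-bit identifier space, the SHA-1 hash, and the ring-successor neighbor links are invented for the example. Each hop hands the message to the known node whose ID is closest to the key, and a node delivers once none of its neighbors is closer.

<pre>
import hashlib

ID_SPACE = 2 ** 16  # toy identifier space; deployed overlays use 128- or 160-bit IDs

def to_id(name):
    """Hash an arbitrary name (node address, file name, ...) into the ID space."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16) % ID_SPACE

def clockwise(a, b):
    """Distance from a to b travelling clockwise around the ID ring."""
    return (b - a) % ID_SPACE

class Node:
    def __init__(self, ident):
        self.id = ident
        self.neighbors = []  # structured overlays maintain these links carefully

    def route(self, key, hops=0):
        """Forward towards the known node closest to the key; deliver here
        when no neighbor is closer (this node's local view of responsibility)."""
        best = min(self.neighbors + [self], key=lambda n: clockwise(key, n.id))
        if best is self:
            return self, hops
        return best.route(key, hops + 1)

# Example: eight nodes, each knowing its successor plus one longer link.
nodes = sorted((Node(to_id("node-%d" % i)) for i in range(8)), key=lambda n: n.id)
for i, n in enumerate(nodes):
    n.neighbors = [nodes[(i + 1) % 8], nodes[(i + 3) % 8]]
owner, hops = nodes[0].route(to_id("some-key"))
print("key handled by node %04x after %d hops" % (owner.id, hops))
</pre>

Every hop strictly reduces the remaining ring distance, so the sketch always terminates; real overlays add the structured routing state that bounds the hop count logarithmically.
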
Along with [[w:Pastry (DHT)|Pastry]], [[w:Chord project|Chord]], and [[w:Content Addressable Network|CAN]], [[w:Tapestry (DHT)|Tapestry]] was one of the first structured overlay networks, proposed in 2001. Since then, at least 20 structured protocols with varying properties have been proposed, including SkipNet/SkipGraph, [[w:Kademlia|Kademlia]], Viceroy, and Z-Ring, among many others. On top of these overlays, researchers have proposed numerous distributed applications, including distributed storage and backup systems, multicast systems, resilient routing networks, distributed spam filters, mobility support, and anonymous routing networks.

== The Kademlia DHT ==

Kademlia is a distributed hash table for decentralized peer-to-peer networks designed by Petar Maymounkov and David Mazières. It specifies the structure of the network and how the exchange of information takes place through network node lookups. Kademlia nodes communicate among themselves using [[w:User Datagram Protocol|UDP]]. Over an existing network, a new virtual or overlay network is created in which each node is identified by a number, or node ID. The node ID serves not only as the node's identification; the Kademlia algorithm also uses it to locate values (usually file hashes or keywords). In fact, the node ID provides a direct map to file hashes.

When searching for some value, the algorithm explores the network in several steps. Each step moves closer to the key, until the contacted node returns the value or no closer nodes are found. Like many other DHTs, Kademlia contacts only <math>O(\log n)</math> (see [[w:Big O notation|Big O notation]]) nodes during the search, out of a total of <math>n</math> nodes in the system.

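The stepwise search can be made concrete with a minimal Python sketch of Kademlia's XOR metric and iterative lookup. This is a simplification, not the wire protocol: real Kademlia uses UDP RPCs, k-buckets, and parallel queries, all omitted here. Because every round moves to a node that is strictly closer under XOR, roughly one more high-order bit of the target is fixed per step, which is the intuition behind the <math>O(\log n)</math> bound.

<pre>
import hashlib

def make_id(name):
    """160-bit identifier; node IDs and keys share this space, as in Kademlia."""
    return int(hashlib.sha1(name.encode()).hexdigest(), 16)

def xor_distance(a, b):
    """Kademlia's notion of closeness: the bitwise XOR of two identifiers."""
    return a ^ b

class KadNode:
    def __init__(self, ident):
        self.id = ident
        self.contacts = []  # real nodes keep one k-bucket per shared-prefix length
        self.store = {}     # (key, value) pairs this node is responsible for

    def closest_known(self, key):
        return min(self.contacts + [self], key=lambda n: xor_distance(n.id, key))

def lookup(start, key):
    """Hop to strictly XOR-closer nodes until no contact is closer, then read."""
    node = start
    while True:
        best = node.closest_known(key)
        if best is node:               # no closer contact known: query this node
            return node.store.get(key)
        node = best
</pre>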
  
 
Further advantages are found particularly in the decentralized structure, which clearly increases the resistance against a [[w:denial of service attack|denial of service attack]]. Even if a whole set of nodes is flooded, this has only a limited effect on network availability: the network recovers itself by knitting around these "holes".

*[http://www.cspace.in CSpace] is a P2P application space written in Python which uses Kademlia
*[http://www.amule.org/wiki/index.php/Kademlia Kademlia] - they use a wiki too (a MediaWiki version 1.2!)
*[{{SERVER}}/www/images/f/f0/Kademlia.pdf Kademlia.pdf]
*[http://www.amule.org/wiki/index.php/AMule AMule] - a confluence of the current range of p2p protocols, e.g. eDonkey2000

== Tapestry ==

The first generation of peer-to-peer applications, including [[w:Napster|Napster]] and [[w:Gnutella|Gnutella]], had restrictive limitations: a central directory for Napster, and scoped broadcast queries that limited scalability for Gnutella. To address these problems, a second generation of p2p applications was developed, including Tapestry, [[w:Chord project|Chord]], [[w:Pastry (DHT)|Pastry]], and [[w:Content addressable network|CAN]]. These overlays implement a basic key-based routing mechanism, which allows for deterministic routing of messages and adaptation to node failures in the overlay network.

[[w:Tapestry (DHT)|Tapestry]] is an extensible infrastructure that provides decentralized object location and routing, focusing on efficiency and minimizing message latency. It achieves this by constructing locally optimal routing tables at initialization and maintaining them thereafter in order to reduce routing stretch. Furthermore, Tapestry lets applications determine how objects are distributed according to their needs, and similarly allows applications to implement multicasting in the overlay network.

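A single prefix-routing hop, the mechanism underlying both Tapestry and Pastry, can be sketched in Python as follows. The hex-digit IDs and the routing_table layout are hypothetical illustrations; Tapestry's surrogate routing for missing entries is only hinted at in a comment.

<pre>
def shared_prefix_len(a, b):
    """Number of leading digits two hex-string IDs have in common."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n

def next_hop(routing_table, my_id, key):
    """One prefix-routing hop: forward to a node that matches the key in at
    least one more leading digit than this node does.

    routing_table[level][digit] names a known node whose ID agrees with
    my_id in the first `level` digits and has `digit` at position `level`.
    """
    level = shared_prefix_len(my_id, key)
    if level == len(key):
        return None  # IDs match fully: this node is the root for the key
    # A missing entry would trigger surrogate routing in real Tapestry.
    return routing_table.get(level, {}).get(key[level])

# Example with 4-digit hex IDs (all values hypothetical):
table = {0: {"7": "7000"}, 1: {"3": "4300"}}
print(next_hop(table, "4f21", "43ab"))  # shares "4" with the key, so hop to "4300"
</pre>

Each hop extends the matched prefix by at least one digit, so a lookup needs at most one hop per ID digit, i.e. logarithmically many in the size of the ID space.
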
From experiments it is shown that Tapestry's efficiency increases with network size, so multiple applications sharing the same overlay network increases efficiency. To differentiate between applications, a unique application identifier is used. Tapestry uses best-effort to publish and route objects.

*[http://www.oceanstore.org/ OceanStore] uses Tapestry for [http://oceanstore.cs.berkeley.edu/info/whytapestry.html these reasons]
*'''NOTE:''' Tapestry is written in Java; its development has stopped and it has been replaced by [http://current.cs.ucsb.edu/projects/chimera/ Chimera] (see below), a "light-weight" version in C.

== Pastry ==

[[w:Pastry (DHT)|Pastry]] is a generic, scalable and efficient substrate for peer-to-peer applications. Pastry nodes form a decentralized, self-organizing and fault-tolerant overlay network within the Internet. Pastry provides efficient request routing, deterministic object location, and load balancing in an application-independent manner. Furthermore, Pastry provides mechanisms that support and facilitate application-specific object replication, caching, and fault recovery.

*Current implementations are Java and MS-based

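Pastry's routing decision can be sketched in Python in two cases: if the key falls within the span of the node's leaf set (its numerically nearest neighbors in the ID space), it is delivered to the numerically closest member; otherwise it is forwarded via the prefix routing table, as in the Tapestry sketch above. The integer IDs are hypothetical, and wrap-around of the circular ID space is ignored for brevity.

<pre>
def numerically_closest(key, candidates):
    """The candidate ID with the smallest absolute distance to the key."""
    return min(candidates, key=lambda ident: abs(ident - key))

def pastry_step(my_id, leaf_set, key):
    """One routing decision: deliver within the leaf set, otherwise fall
    back to prefix routing (ID-space wrap-around ignored for brevity)."""
    if leaf_set and min(leaf_set) <= key <= max(leaf_set):
        return ("deliver", numerically_closest(key, leaf_set + [my_id]))
    return ("forward-by-prefix", None)

# Example with hypothetical 16-bit IDs:
print(pastry_step(0x4F21, [0x4E00, 0x4F00, 0x5000, 0x5100], 0x4F80))
# -> ('deliver', 20257): this node itself (0x4F21) is numerically closest
</pre>
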
== Content Addressable Network ==

The [[w:Content Addressable Network|Content Addressable Network]] ('''CAN''') was one of the original four DHT proposals (Ratnasamy 2001), introduced concurrently with [[w:Chord project|Chord]], [[w:Pastry (DHT)|Pastry]], and [[w:Tapestry (DHT)|Tapestry]]. Although intended to be more general, the term ''content addressable network'' came to be associated with Ratnasamy et al.'s specific design.

Like other DHTs, CAN is a distributed, decentralized P2P infrastructure that provides hash table functionality on an Internet-like scale. CAN is designed to be scalable, fault-tolerant, and self-organizing. It is built around a virtual multi-dimensional [[w:Cartesian coordinate system|Cartesian]] [[w:coordinate space|coordinate space]] on a multi-torus; this d-dimensional coordinate space is purely logical. The entire space is dynamically partitioned among all N peers in the system, such that every peer possesses its own distinct zone within the overall space. A CAN peer maintains a [[w:routing table|routing table]] that holds the IP address and virtual coordinate zone of each of its neighbors, and routes a message towards its destination by simple [[w:Greedy algorithm|greedy forwarding]] to the neighbor peer closest to the destination coordinates. In this way, CAN is a distributed system that maps keys onto values.

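The greedy forwarding rule can be sketched in Python as below: a key is hashed to a point in the coordinate space, and each peer hands the message to whichever neighbor sits nearest that point under wrap-around (torus) distance. The zone centres and node names are hypothetical, and comparing zone centres rather than zone boundaries is a simplification of the real CAN.

<pre>
import math

def torus_distance(p, q, size=1.0):
    """Euclidean distance on a d-dimensional torus (per-axis wrap-around)."""
    return math.sqrt(sum(min(abs(a - b), size - abs(a - b)) ** 2
                         for a, b in zip(p, q)))

def greedy_next_hop(my_centre, neighbors, target):
    """Return the neighbor to forward to, or None if this peer's zone is
    already the closest. `neighbors` maps a node name to its zone centre."""
    name, centre = min(neighbors.items(),
                       key=lambda item: torus_distance(item[1], target))
    if torus_distance(my_centre, target) <= torus_distance(centre, target):
        return None  # the message has arrived at the owning zone
    return name

# Example on a 2-dimensional unit torus (all coordinates hypothetical):
neighbors = {"A": (0.1, 0.9), "B": (0.6, 0.4), "C": (0.9, 0.1)}
print(greedy_next_hop((0.3, 0.3), neighbors, (0.7, 0.5)))  # -> B
</pre>
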
== Chimera ==

[http://current.cs.ucsb.edu/projects/chimera/ Chimera] is a light-weight '''C implementation''' of a "next-generation" structured overlay that provides similar functionality to the prefix-routing protocols Tapestry and Pastry. Chimera gains simplicity and robustness from its use of Pastry's leaf sets, and efficient routing from Tapestry's locality algorithms. In addition to these properties, Chimera provides efficient detection of node and network failures, and re-routes messages around them to maintain connectivity and throughput.

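The failure handling described above can be pictured as a retry loop. The following Python fragment is a guess at the general idea, not Chimera's actual API: the function names, the acknowledgement model, and the distance function are all placeholders. If the chosen next hop fails to acknowledge, it is dropped from the local routing state and the next-best neighbor is tried, so traffic flows around the failed node.

<pre>
def forward_with_failover(neighbors, target, send, distance):
    """Try neighbors in order of closeness to the target, dropping any
    that fail to acknowledge. `send(name)` returns True on acknowledgement;
    `distance(position, target)` orders the candidates."""
    remaining = dict(neighbors)   # node name -> position or ID
    while remaining:
        nearest = min(remaining, key=lambda n: distance(remaining[n], target))
        if send(nearest):
            return nearest        # forwarded successfully
        del remaining[nearest]    # presumed failed: route around it
    return None                   # no live neighbor; report failure upward
</pre>
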
Along with academic research, Chimera is currently being used by projects at industry labs, as part of DoD research, and by local startups. It also provides a research infrastructure for both graduate and undergraduate networking class projects at UCSB; three MS graduates have used it for their MS projects. To get timely (but infrequent) notifications of updates, bug fixes and new releases to Chimera, join the Chimera-users GoogleGroup.

*Chimera is programmed in C and replaces Tapestry, which was written in Java
*[http://www.cs.ucsb.edu/%7Eravenben/chimera/download/chimera-1.10.zip chimera-1.10.zip]

== See also ==

*[http://www.p-grid.org/publications/p2pGeneral.html The P-Grid project]

[[Category:Networking]]
