Difference between revisions of "Cloud"

From Organic Design wiki
(see also Peer to peer)
m (See also)
 
(5 intermediate revisions by 2 users not shown)
Line 1: Line 1:
Migrating systems into the cloud means ensuring that the systems are able to transparently scale to an arbitrarily sized cluster of servers. This has traditionally been a problem where SQL is concerned because complex queries (for example those using [[w:Join (SQL)|join]]'s) are unable to execute across large numbers of servers in a scalable way if the data is partitioned across them, and if the data is fully mirrored, then changes to the data don't scale because they must be pushed to all the servers.
+
Migrating systems into the cloud means ensuring that the systems are able to transparently scale to an arbitrarily sized cluster of servers. This has traditionally been a problem where SQL is concerned because complex queries (for example those using [[w:Join (SQL)|join]]s) are unable to execute across large numbers of servers in a scalable way if the data is partitioned across them, and if the data is fully mirrored, then changes to the data don't scale because they must be pushed to all the servers.
  
 
However there are nowadays a number of different technologies available which can scale well, some of them are amongst the so called "NoSQL" paradigm which use a totally different querying language designed specifically for distributed systems, and others such as Apache's [http://wiki.apache.org/hadoop/Hive Hive] and Google's [http://code.google.com/appengine/docs/python/datastore/gqlreference.html GQL]. The storage layer in a cloud or distributed system is called a "datastore" rather than a "database". Switching from a relational database to the Datastore requires a paradigm shift for developers when modelling their data.
 
Line 8: Line 8:
 
*[[Peer to peer]]
 
 
*[http://groups.google.com/group/nosql-discussion/tree/browse_frm/thread/6945aa26152f1ccb/b9a4042503e92a4d?rnum=21&_done=%2Fgroup%2Fnosql-discussion%2Fbrowse_frm%2Fthread%2F6945aa26152f1ccb%2F447b4fa1222a8701%3Ftvc%3D1%26fwc%3D1%26 Google groups NoSQL discussion] ''- interesting snippet about SQL and the cloud''
 
 +
*[http://cloudfoundry.com/ Micro Cloud Foundry] is a free downloadable version of VMware's Cloud Foundry service that runs on a single laptop. [http://www.theregister.co.uk/2011/08/24/vmware_shrinks_cloud/]
 +
*[http://en.wikipedia.org/wiki/Apache_Cassandra Apache Cassandra] is an open source distributed database management system ([https://wiki.apache.org/cassandra/DebianPackaging Debian install notes here]). It is a NoSQL solution designed to handle very large amounts of data spread out across many commodity servers while providing a highly available service with no single point of failure. It is written in Java, though.
 +
*[[PeerPedia]] ''- see Feb 2013 update re MariaDB''

Latest revision as of 18:10, 13 February 2013

Migrating systems into the cloud means ensuring that they can scale transparently to an arbitrarily sized cluster of servers. This has traditionally been a problem where SQL is concerned. If the data is partitioned across the servers, complex queries (for example those using joins) cannot execute in a scalable way, because a single query may need rows that are held on many different servers. If the data is instead fully mirrored, changes to the data don't scale, because every change must be pushed to all the servers.
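The two failure modes described above can be made concrete with a toy model. The following is a minimal Python sketch, not a real database: the modulo placement rule, the users and orders tables and every function name are hypothetical and chosen only for illustration. It counts how many join lookups have to leave the local server when the tables are partitioned, and how many servers a single write touches when the data is fully mirrored.

```python
def shard_for(key, shard_count):
    """Hypothetical placement rule: a row lives on shard (key mod shard_count)."""
    return key % shard_count


def partition(rows, key_field, shard_count):
    """Split rows across shards according to the given key field."""
    shards = [[] for _ in range(shard_count)]
    for row in rows:
        shards[shard_for(row[key_field], shard_count)].append(row)
    return shards


def partitioned_join(user_shards, order_shards):
    """Join each order to its user and count lookups that leave the local shard."""
    shard_count = len(user_shards)
    joined, remote_lookups = [], 0
    for local_shard, orders in enumerate(order_shards):
        for order in orders:
            target = shard_for(order["user_id"], shard_count)
            if target != local_shard:
                # In a real cluster this would be a network round trip.
                remote_lookups += 1
            user = next(u for u in user_shards[target] if u["id"] == order["user_id"])
            joined.append((order["id"], user["name"]))
    return joined, remote_lookups


def mirrored_write_cost(server_count):
    """With full mirroring, one logical write must be applied on every server."""
    return server_count


# Illustrative data only: both tables are partitioned by their own primary key.
users = [{"id": 1, "name": "alice"}, {"id": 2, "name": "bob"}]
orders = [
    {"id": 1, "user_id": 2},
    {"id": 2, "user_id": 1},
    {"id": 3, "user_id": 1},
    {"id": 4, "user_id": 2},
]
joined, remote = partitioned_join(partition(users, "id", 2), partition(orders, "id", 2))
```

Because each table is partitioned by its own key, an order and the user it refers to often end up on different servers, so the join turns into cross-server traffic. The mirrored alternative avoids that, but the cost of each write then grows linearly with the number of servers.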

However, a number of technologies are now available that scale well. Some belong to the so-called "NoSQL" paradigm and use entirely different query languages designed specifically for distributed systems, while others, such as Apache's Hive and Google's GQL, offer SQL-like query languages on top of distributed storage. The storage layer in a cloud or distributed system is usually called a "datastore" rather than a "database". Switching from a relational database to a datastore requires a paradigm shift for developers when modelling their data.
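One common form this shift in data modelling takes is denormalisation: instead of joining tables at read time, related data is copied into a single entity at write time so that it can be fetched with one key lookup. The sketch below illustrates the idea with plain Python dictionaries. It does not use the Hive or GQL APIs, and the entity layout, key format and function names are assumptions made for the example.

```python
# Relational style: normalised tables, combined at read time with a join.
authors = {1: {"name": "Ada"}}
posts = {10: {"author_id": 1, "title": "Hello cloud"}}


def read_post_relational(post_id):
    """Two lookups (in effect a join) are needed to build one page view."""
    post = posts[post_id]
    author = authors[post["author_id"]]
    return {"title": post["title"], "author_name": author["name"]}


def build_datastore(posts, authors):
    """Datastore style: copy the author name into each post entity at write time."""
    return {
        f"post:{post_id}": {
            "title": post["title"],
            "author_name": authors[post["author_id"]]["name"],
        }
        for post_id, post in posts.items()
    }


def read_post_datastore(store, post_id):
    """A single key lookup that any server holding the key can answer alone."""
    return store[f"post:{post_id}"]
```

The trade-off is that the author's name is now stored in every post entity, so renaming an author means rewriting each copy rather than updating a single row.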

For us at Organic Design the concept of migrating to the cloud is very important, because we'd like our entire infrastructure to eventually work in P2P space, which has very similar technical requirements to the cloud. The problem is that most of the popular content management systems rely heavily on fully-featured SQL. One exception is Plone, which instead uses the ZODB (Zope Object Database) for its storage and querying layer. The ZODB can in turn plug in to a variety of different storage mechanisms, so hopefully adding a DHT (distributed hash table) as one of them wouldn't be too difficult.
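To give a rough idea of what a DHT-backed storage mechanism has to do, the following Python sketch maps keys to nodes using consistent hashing, which is one common DHT placement scheme. The text above does not say which scheme would be used, so this is an assumption, and the HashRing class, the ring_position function and the node names are hypothetical. No ZODB API is used here.

```python
import bisect
import hashlib


def ring_position(value):
    """Map a string to a point on the hash ring using SHA-1."""
    return int(hashlib.sha1(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Assign each key to the first node found clockwise from the key's position."""

    def __init__(self, nodes):
        self._ring = sorted((ring_position(node), node) for node in nodes)
        self._positions = [position for position, _ in self._ring]

    def node_for(self, key):
        if not self._ring:
            raise LookupError("the ring has no nodes")
        # Find the first node position after the key, wrapping round at the end.
        index = bisect.bisect_right(self._positions, ring_position(key))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.node_for("some-object-id")
```

The useful property for a P2P setting is that when a node leaves the ring, only the keys that node was holding move to another node; every other key stays where it was.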

See also