Difference between revisions of "Backup"

From Organic Design wiki
*[http://www.perihel.at/3/index.html Good discussion on Linux and backing up with rsync]
 
*[http://blogs.techrepublic.com.com/10things/?p=895 10 Linux backup tools]
 
*[[Sky]] ''- EMP-proof and water-proof 2010 backup''
 
[[Category:IT Support]][[Category:Nad]][[Category:Organic Design]]
 

Revision as of 01:58, 9 August 2010

The Organic Design backups are created daily, compressed with 7-Zip and distributed over SCP to various other domains. Eventually backups will become unnecessary because we plan to move all our information into distributed space, which will include both server and workstation data.

Changing to rsync

The amount of data we need to back up each day is getting larger, and currently we're only backing up the web structure and users' emails on a weekly basis due to the amount of traffic required. We are moving more of our users over to our IMAP folders for email and may be taking on some larger sites, which will require us to maintain hundreds of gigabytes, rendering our current system unworkable.

rsync is an application for Unix systems which synchronizes files and directories from one location to another while minimizing data transfer using delta encoding when appropriate. An important feature of rsync not found in most similar programs/protocols is that the mirroring takes place with only one transmission in each direction. rsync can copy or display directory contents and copy files, optionally using compression and recursion.
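
The delta-encoding idea can be illustrated with a much-simplified sketch in Python. This is not rsync's actual algorithm (rsync uses a rolling checksum so that every byte offset can be tested cheaply, backed by a stronger hash); the function names, the 4-byte block size and the tuple format below are assumptions made purely for illustration.

```python
import hashlib

BLOCK = 4  # unrealistically small block size, chosen so the example is readable


def block_index(old: bytes, block: int = BLOCK) -> dict:
    """Map the hash of each complete fixed-size block of `old` to its block number."""
    index = {}
    for n in range(len(old) // block):
        chunk = old[n * block:(n + 1) * block]
        index.setdefault(hashlib.sha256(chunk).digest(), n)
    return index


def make_delta(old: bytes, new: bytes, block: int = BLOCK) -> list:
    """Describe `new` as references to blocks of `old` plus literal bytes."""
    index = block_index(old, block)
    delta, literal, i = [], bytearray(), 0
    while i < len(new):
        chunk = new[i:i + block]
        n = index.get(hashlib.sha256(chunk).digest()) if len(chunk) == block else None
        if n is not None:
            if literal:  # flush any pending literal bytes first
                delta.append(("data", bytes(literal)))
                literal = bytearray()
            delta.append(("copy", n))  # the receiver already holds this block
            i += block
        else:
            literal.append(new[i])  # no match at this offset, send the byte itself
            i += 1
    if literal:
        delta.append(("data", bytes(literal)))
    return delta


def apply_delta(old: bytes, delta: list, block: int = BLOCK) -> bytes:
    """Rebuild the new file on the receiving side from `old` and the delta."""
    out = bytearray()
    for kind, value in delta:
        out += old[value * block:(value + 1) * block] if kind == "copy" else value
    return bytes(out)
```

For example, inserting one byte into the middle of a 12-byte file gives a delta of three block references and a single literal byte, so only the changed data has to cross the network.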

After changing over to rsync, our backup scripts can be greatly simplified because compression will no longer be required (compressing the files first would prevent rsync's delta encoding from being effective). All that will be necessary is for the SQL to be dumped to a common location within the backed-up structure, such as /var/www. We will probably back up the entire /home directory and move sizeable items such as old backups and downloaded application source into other locations such as /src and /backup.
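
As a rough sketch of how small such a script could become, the Python below only builds the two command lines involved. The dump file name, the destination peer.example.org and the choice of /var/www and /home as sources are hypothetical placeholders, and database credentials are left out; it is not the script we actually run.

```python
from pathlib import PurePosixPath


def rsync_backup_commands(dump_dir="/var/www",
                          sources=("/var/www", "/home"),
                          dest="backup@peer.example.org:/backup/"):
    """Return the commands for an uncompressed SQL dump followed by an rsync run."""
    # Hypothetical file name; the dump lives inside the tree that gets synced.
    dump_file = str(PurePosixPath(dump_dir) / "all-databases.sql")
    dump = ["mysqldump", "--all-databases", f"--result-file={dump_file}"]
    # --archive preserves permissions and times, --delete mirrors removals,
    # --rsh=ssh sends the transfer over SSH.
    sync = ["rsync", "--archive", "--delete", "--rsh=ssh", *sources, dest]
    return [dump, sync]
```

Each returned list can be passed to subprocess.run(cmd, check=True). The dump is left uncompressed on purpose so that rsync only has to transfer the parts of it that changed since the previous run.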

Our new procedure will be more generic, allowing a group of peers to share a file structure which is mapped directly into a filesystem mount-point. The procedure is incomplete, but is being developed and discussed in the "set up a distributed file system" article.

Simple workstation backup

The following backup-workstation.pl Perl script is a simple backup solution for a local workstation which compresses a list of directory trees into a 7-Zip archive.

<perl>Backup-workstation.pl</perl>

Simple server backup

The following backup.pl Perl script is a simple backup solution which backs up both a directory tree and all databases in a local MySQL server, and saves them to a 7-Zip compressed file named according to the date. It also logs the backup and its file size in the Server log using SimpleForms. It can be executed periodically from the crontab.
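
For readers who want the general shape of such a job without reading the Perl, here is a minimal sketch in Python. It is not a translation of backup.pl: the function names, the server-backup prefix and the paths are hypothetical, and the logging step is omitted. The 7z switches are the ones quoted in the LZMA notes on this page.

```python
import datetime


def archive_name(prefix: str, day: datetime.date) -> str:
    """Name the archive after the date, e.g. server-backup-2010-08-09.7z."""
    return f"{prefix}-{day.isoformat()}.7z"


def server_backup_commands(day: datetime.date,
                           tree="/var/www",
                           dump="/tmp/all-databases.sql",
                           prefix="server-backup"):
    """Return the commands that dump all databases and archive tree plus dump."""
    archive = archive_name(prefix, day)
    return [
        ["mysqldump", "--all-databases", f"--result-file={dump}"],
        # "a" adds files to the archive; the switches select LZMA at maximum level.
        ["7z", "a", "-t7z", "-m0=lzma", "-mx=9", archive, tree, dump],
    ]
```

After running the commands (for instance with subprocess.run), os.path.getsize on the archive gives the figure to record in the log.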

<perl>Backup.pl</perl>

MySQL Replication

MySQL 5.0 supports replication, which allows many slave databases to be kept synchronised with a single master. Even running the slaves locally is of benefit, because regular backups and distribution can be made from the slaves so that the master never needs to be locked.
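
One way a backup from a slave might be sequenced is sketched below in Python. The host name and dump path are hypothetical and credentials are omitted; the idea is simply to pause the slave's SQL thread so the data stands still, dump it, and then let the slave catch up with the master again.

```python
def slave_backup_commands(slave_host: str, dump_file: str):
    """Return the commands for taking a consistent dump from a replication slave."""
    mysql = ["mysql", f"--host={slave_host}", "-e"]
    return [
        # Stop applying replicated changes; the master is not touched at all.
        mysql + ["STOP SLAVE SQL_THREAD"],
        ["mysqldump", f"--host={slave_host}", "--all-databases",
         f"--result-file={dump_file}"],
        # Resume applying changes so the slave catches up with the master.
        mysql + ["START SLAVE SQL_THREAD"],
    ]
```

The three commands would need to run in order, and the last one should run even if the dump fails, otherwise the slave stays paused.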

Backup related news items

About LZMA

LZMA is an extremely good compression method which compresses our backups to about 40% of the size of the gzip version (and noticeably smaller than bzip2). I have tested it with the free 7-Zip file manager from www.7-zip.org: od-wiki-db-2006-11-20 is 268MB uncompressed, 54.9MB gzipped and only 21.7MB as a 7z! However, I'm unable to get the Debian port to work due to dependency issues with low-level C libraries that I don't want to mess with.

  • I've found a standalone version at http://sourceforge.net/projects/p7zip which compressed it to 24.8MB, not quite as small as the Windows one, but still very good.
  • Using the switches -t7z -m0=lzma -mx=9 has got it down to 21.1MB - slightly smaller than the Windows version :-)

Statistics

7-Zip is extremely good at compressing wiki data compared to other algorithms, perhaps because it compresses the page history more efficiently. Here is a size comparison between a wiki backup and a server image, which is a standard Linux file structure containing no database or web site content. The percentages show the reduction in size relative to the uncompressed data.

Compression   Server image   Wiki backup
none          517MB          269MB
7z            122MB (76%)    21.1MB (92%)
RAR           140MB (72%)    24.9MB (90%)
Bzip2         176MB (66%)    38.0MB (86%)
Gzip          197MB (62%)    54.5MB (80%)
Zip           197MB (62%)    54.5MB (80%)
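
The percentages can be recomputed from the sizes as 100 × (1 − compressed / uncompressed). The short Python check below does this for the figures in the table; the variable and function names are just for illustration. Recomputing from the rounded MB values lands within one percentage point of every listed figure.

```python
# (server image MB, wiki backup MB) for each method, copied from the table.
UNCOMPRESSED = (517, 269)
SIZES_MB = {
    "7z": (122, 21.1),
    "RAR": (140, 24.9),
    "Bzip2": (176, 38.0),
    "Gzip": (197, 54.5),
    "Zip": (197, 54.5),
}
# Percentages as listed in the table, in the same column order.
LISTED = {
    "7z": (76, 92),
    "RAR": (72, 90),
    "Bzip2": (66, 86),
    "Gzip": (62, 80),
    "Zip": (62, 80),
}


def reduction_percent(original_mb: float, compressed_mb: float) -> float:
    """Size reduction relative to the uncompressed data, as a percentage."""
    return 100.0 * (1.0 - compressed_mb / original_mb)


def recomputed() -> dict:
    """Recompute both columns of percentages from the MB figures."""
    return {
        name: tuple(round(reduction_percent(orig, size))
                    for orig, size in zip(UNCOMPRESSED, sizes))
        for name, sizes in SIZES_MB.items()
    }
```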

Simple snapshot backup

If adding a file to a compressed archive only increased the archive's size by the differences between the new file and the patterns already present in the existing content, then storing regular snapshots of entire folder structures in a single compressed archive would be very efficient in terms of space. To test this I've made some archives from two files, 1.sql and 2.sql, which contain a great deal of common content both within themselves and between each other. Here's a table of the sizes of the files in MB under various circumstances:

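Type    1      2      1+2
none    887    1100   -
gz      126    163    -
bz2     -      -      -
7z      34     45     79

In this test the combined 7z archive (79MB) is exactly the sum of the two separate archives (34MB + 45MB), so no space was saved by archiving the two files together.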
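
The same kind of experiment can be repeated on a small scale with Python's standard lzma module. This is only a sketch under assumptions: concatenating the two inputs stands in for a solid 7z archive, and the test data is synthetic (random bytes plus a near-identical copy) rather than real SQL dumps.

```python
import lzma
import random


def sample_files(size: int = 200_000, seed: int = 1):
    """Build two synthetic 'snapshots' that differ in only a few bytes."""
    rng = random.Random(seed)
    first = bytes(rng.getrandbits(8) for _ in range(size))
    second = bytearray(first)
    for _ in range(20):  # change a handful of bytes in the second snapshot
        second[rng.randrange(size)] = rng.getrandbits(8)
    return first, bytes(second)


def compressed_sizes(first: bytes, second: bytes):
    """Return (sum of separate archive sizes, size of one combined archive)."""
    separate = len(lzma.compress(first)) + len(lzma.compress(second))
    together = len(lzma.compress(first + second))
    return separate, together
```

With inputs this small the combined result should come out far smaller than the two separate ones, because the second snapshot can be encoded almost entirely as references back into the first. LZMA can only reference data that still lies within its dictionary window, so with files of several hundred megabytes the shared content may simply be out of reach; that is one possible factor to check when the combined archive shows no saving.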

See also