Difference between revisions of "Rsync"

From Organic Design wiki
(The transliterate patch: note, not needed if using encryption)
(Change source-code blocks to standard format)

Revision as of 18:11, 22 May 2015

rsync is an open source utility that provides fast incremental file transfer. rsync is freely available under the GNU General Public License and is currently being maintained by Wayne Davison.

rsync uses the "rsync algorithm" which provides a very fast method for bringing remote files into sync. It does this by sending just the differences in the files across the link, without requiring that both sets of files are present at one of the ends of the link beforehand.

Some features of rsync include:

  • Can update whole directory trees and filesystems
  • Optionally preserves symbolic links, hard links, file ownership, permissions, devices and times
  • Requires no special privileges to install
  • Internal pipelining reduces latency for multiple files
  • Can use rsh, ssh or direct sockets as the transport
  • Supports anonymous rsync which is ideal for mirroring

Following is a description of some of the ways we use rsync here at Organic Design.

Using rsync over SSH with key-based logins

Sometimes it's useful to do a one-off backup of a file structure from one host to another, and since all the hosts (in our system) are guaranteed to be able to connect to each other with SSH (after adding appropriate RSA keys), using rsync over SSH is a good way to do this.

The transfer syntax is very similar to SCP. For example, to pull new changes from a remote directory to a local one, use:

rsync -avz -e ssh remoteuser@remotehost:/remote/dir /this/dir/

After confirming that the systems can connect over SSH, you may want to lock them down so that the connection between them can only be used for rsync. The permitted IP address and command can be prepended to the key in the remote host's ~/.ssh/authorized_keys file.

from="1.2.3.4",no-agent-forwarding,no-port-forwarding,no-pty,no-user-rc,no-X11-forwarding ssh-rsa AAAAB...

For more security, the allowed command can be restricted to that specific rsync command. To find it, run the rsync command manually with the -e 'ssh -v' option; the verbose SSH output includes the exact command sent to the remote host, which can then be used in the remote host's authorized_keys file instead of just "rsync".
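As a minimal sketch of what such a locked-down entry could look like, the shell function below assembles an authorized_keys line that combines the from= restriction shown above with a command= option. The function name restricted_key_entry and the rsync server command string are illustrative assumptions: the real command string depends on your rsync version and options, so take it from the "Sending command" line of your own ssh -v output.

```shell
#!/bin/sh
# Build an authorized_keys entry that pins a key to one source IP and one command.
# Usage: restricted_key_entry IP COMMAND KEY
# (hypothetical helper, not part of OpenSSH or rsync)
restricted_key_entry() {
    ip=$1
    cmd=$2
    key=$3
    printf 'command="%s",from="%s",no-agent-forwarding,no-port-forwarding,no-pty,no-user-rc,no-X11-forwarding %s\n' \
        "$cmd" "$ip" "$key"
}

# The command string here is a placeholder; copy the real one from `ssh -v`.
restricted_key_entry "1.2.3.4" \
    "rsync --server --sender -vlogDtprze.iLsfxC . /remote/dir" \
    "ssh-rsa AAAAB..."
```

The helper does not escape double quotes, so it only suits command strings that contain none. Append its output to ~/.ssh/authorized_keys on the remote host in place of the unrestricted key line.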

Using rsync with encryption

EncFS, which can be installed via apt-get, is an excellent option for synchronising data with rsync when the target system is insecure. Using EncFS, an encrypted version of the source data can be maintained locally and then synchronised to a remote system with rsync.

In its standard form, EncFS works with two directories: a permanent encrypted one, and a decrypted view of that encrypted data which only exists while EncFS is running. We can therefore start EncFS, which presents the current state of the encrypted data at the temporary decrypted mount point, do a local rsync from the source directory structure to that mount point, and then stop EncFS. The permanent encrypted directory has now been brought up to date with the source structure, so we can rsync it to the remote server.

The encrypted directory contains a one-to-one correspondence of file-system objects with the decrypted data, which means that rsync can still work efficiently, updating only the added files and deleting the removed ones as usual. These objects have encrypted names and content, but retain their original ownership and time-stamp attributes, and their sizes remain very close to the originals.

Another good feature of EncFS is that file and directory names are also encrypted, using only friendly characters, which means that there's no need to use the transliterate patch with rsync when using EncFS.

Rather than keep EncFS running permanently, our backup script starts it every time before running the rsync command, then stops it again after rsync has finished. The only difficulty with this is that EncFS doesn't have an option for passing the password on the command line, so we use Perl's Expect module to enter the password. You can then use mount to ensure that EncFS has properly mounted the decrypted location before continuing. This is shown in the following example snippet.

use strict;
use warnings;
use Expect;

# Mount the decrypted view onto the encrypted local mirror,
# answering the password prompt when it appears (5 second timeout)
my $exp = Expect->spawn( "encfs /backup/encrypted /backup/decrypted" );
$exp->expect( 5, [ qr/EncFS Password:/ => sub { my $exp = shift; $exp->send( "********\n" ); exp_continue; } ] );
$exp->soft_close();

# Rsync then unmount if mounted successfully
if( qx( mount|grep 'encfs on /backup/decrypted' ) ) {

	# Bring the local encrypted mirror up to date via the decrypted mount-point, then un-mount it
	qx( rsync -a --delete /home/foo /backup/decrypted );
	qx( fusermount -u /backup/decrypted );

	# Synchronise the encrypted mirror with the remote service
	qx( rsync -a --delete /backup/encrypted foo\@domain.com\@rsync.adrive.com:. );

} else { print "Failed to mount!\n" }
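
The mount check in the snippet above can also be written in shell. The following is a sketch, not part of our backup script: the helper name encfs_mounted_at is hypothetical, and the sample line only assumes that mount reports EncFS mounts in the "encfs on PATH type ..." form that the Perl grep relies on. Matching the space after the path avoids accepting a longer path that merely starts with the same prefix.

```shell
#!/bin/sh
# Succeeds if the mount table read from stdin lists an EncFS mount at path $1.
# -F treats the pattern as a fixed string, so dots in the path are not wildcards.
encfs_mounted_at() {
    grep -qF "encfs on $1 "
}

# Demonstration with a sample line; in a real script use:
#   mount | encfs_mounted_at /backup/decrypted
sample='encfs on /backup/decrypted type fuse.encfs (rw,nosuid,nodev)'
if printf '%s\n' "$sample" | encfs_mounted_at /backup/decrypted; then
    echo "mounted"
else
    echo "not mounted"
fi
```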

The transliterate patch

Sometimes the source file structure contains characters that are not allowed on the target system. For example, Maildir uses colons in its file-naming convention, and colons are not allowed on many filesystems, including our ADrive backup service.

Note: this problem does not arise when using the above encryption method, because all the filenames are encrypted too and the encrypted names only use friendly characters.

We've overcome this problem using the transliterate patch which adds a --tr=BAD/GOOD option for mapping bad characters to good ones.
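
To make the mapping concrete, the sketch below applies the same substitution that --tr=':/;' asks for (colon as the bad character, semicolon as the good one) to a single file name. It uses the standard tr utility purely as an illustration, not the patched rsync, and the Maildir-style file name is a made-up example.

```shell
#!/bin/sh
# Illustrate the colon-to-semicolon mapping with the standard tr utility.
# The file name is a hypothetical Maildir-style name containing a colon.
name='1432288267.M1P2.mailhost:2,S'
printf '%s\n' "$name" | tr ':' ';'
# prints: 1432288267.M1P2.mailhost;2,S
```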

To install the patch you need to download and unpack the latest source and the patches, then change into the source directory and do the following:

patch -p1 <patches/transliterate.diff
./configure
make
make install

You can't use this option to back up a directory directly to the target server unless the target also has the transliterate patch installed. If it's not installed on the server, you'll need to do a two-stage backup: first to a local directory using the --tr option, then synchronising this local directory (which has all the colons replaced) with the remote server without the --tr option. Here's an example taken from our daily backup script.

rsync -a --delete --tr=':/;' /home /backup/home.rsync
rsync -az --delete /backup/home.rsync user\@domain\@rsync.adrive.com:.