Difference between revisions of "Rsync"

From Organic Design wiki

Revision as of 17:07, 17 July 2014

rsync is an open source utility that provides fast incremental file transfer. rsync is freely available under the GNU General Public License and is currently being maintained by Wayne Davison.

rsync uses the "rsync algorithm" which provides a very fast method for bringing remote files into sync. It does this by sending just the differences in the files across the link, without requiring that both sets of files are present at one of the ends of the link beforehand.

Some features of rsync include:

  • Can update whole directory trees and filesystems
  • Optionally preserves symbolic links, hard links, file ownership, permissions, devices and times
  • Requires no special privileges to install
  • Internal pipelining reduces latency for multiple files
  • Can use rsh, ssh or direct sockets as the transport
  • Supports anonymous rsync which is ideal for mirroring

Using rsync over SSH

Sometimes it's useful to do a one-off backup of a file structure from one host to another, and since all the hosts (in our system) are guaranteed to be able to connect to each other with SSH (after adding appropriate RSA keys), using rsync over SSH is a good way to do this.

The transfer syntax is very similar to SCP. For example, to pull new changes from a remote directory to a local one, use:

<bash>rsync -avz -e ssh remoteuser@remotehost:/remote/dir /this/dir/</bash>
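As a rough sketch of how the same syntax covers both directions, the hypothetical helper below previews a transfer with rsync's --dry-run option and then runs it for real. The function name and the RSYNC variable are illustrative assumptions, not part of the original setup; the host and paths are the placeholders from the example above.

```shell
# RSYNC can be overridden (an assumption made for this sketch, handy for testing).
RSYNC="${RSYNC:-rsync}"

# Preview a transfer with --dry-run, then perform it.
# A trailing slash on the source means "the contents of this directory"
# rather than the directory itself.
sync_over_ssh() {
    src=$1
    dest=$2
    "$RSYNC" -avz --dry-run -e ssh "$src" "$dest" || return 1
    "$RSYNC" -avz -e ssh "$src" "$dest"
}

# Pull: sync_over_ssh remoteuser@remotehost:/remote/dir/ /this/dir/
# Push: sync_over_ssh /this/dir/ remoteuser@remotehost:/remote/dir/
```

Swapping the two arguments is all it takes to push instead of pull, which is why the comparison with SCP holds.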

After the systems are confirmed as being able to connect over SSH, you may want to lock them down so that the connection between them can only be used for rsync. The permitted source IP, the permitted command and other restrictions can be prepended to the key in the remote host's ~/.ssh/authorized_keys file:

from="1.2.3.4",no-agent-forwarding,no-port-forwarding,no-pty,no-user-rc,no-X11-forwarding ssh-rsa AAAAB...

For more security, the allowed command can be restricted to one specific rsync command. Running the rsync command manually with the -e'ssh -v' option makes SSH print the exact command it sends to the remote host, and that exact command can then be used as the command="..." value in the remote host's authorized_keys file instead of just "rsync".
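An alternative to pinning one exact command is a small wrapper that sshd runs as the forced command and that inspects SSH_ORIGINAL_COMMAND, the environment variable in which sshd passes the command the client asked for. The following is only a sketch under assumptions: the function names and the wrapper path /usr/local/bin/rsync-only are hypothetical, and the check restricts the program being run, not the paths it may touch.

```shell
# Hypothetical wrapper, referenced from authorized_keys along the lines of:
#   command="/usr/local/bin/rsync-only",from="1.2.3.4",no-pty,... ssh-rsa AAAAB...

# Accept only requests that start the rsync server side.
is_allowed_command() {
    case "$1" in
        "rsync --server "*) return 0 ;;
        *) return 1 ;;
    esac
}

run_forced_command() {
    if is_allowed_command "${SSH_ORIGINAL_COMMAND:-}"; then
        set -f   # disable globbing so the arguments are passed through unchanged
        # Unquoted on purpose: the string is split into words but never
        # evaluated by a shell, so ";" or "|" are not interpreted.
        exec $SSH_ORIGINAL_COMMAND
    else
        echo "rejected: only rsync is permitted on this key" >&2
        return 1
    fi
}

# In the wrapper script itself, the last line would be: run_forced_command
```

The rsync distribution also ships a more complete script of this kind, rrsync, which can additionally confine transfers to one directory.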

Backing up Maildirs with Rsync

Backing up Maildirs can be a problem with many target systems (even non-Windows ones) because many filesystems don't allow colons in file names. This problem occurs for us using the ADrive backup service.

We've overcome this problem using the transliterate patch, which adds a --tr=BAD/GOOD option for mapping bad characters to good ones.

To install the patch you need to download and unpack the latest source and the patches, then change into the source directory and do the following:

<bash>
patch -p1 <patches/transliterate.diff
./configure
make
make install
</bash>

You can't use this option to back up a directory to the target server unless the target also has the transliterate patch installed. If it's not installed on the server, you'll need to do a two-stage backup: first to a local directory using the --tr option, and then synchronising this local directory (which has all the colons replaced) with the remote server without the --tr option. Here's an example taken from our daily backup script.

{{{1}}}
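The two-stage approach can be sketched roughly as follows. This is an illustration, not the original backup script: the function name, the example paths, the remote host, the use of --delete and the choice of an underscore as the replacement character are all assumptions, and --tr only exists in an rsync built with the transliterate patch.

```shell
# RSYNC can be overridden (an assumption made for this sketch, handy for testing).
RSYNC="${RSYNC:-rsync}"

# Stage 1: copy the Maildir to a local staging directory, mapping ":" to "_"
#          (needs the patched local rsync).
# Stage 2: sync the staging directory to the remote host with plain options,
#          so the remote rsync does not need the patch.
two_stage_backup() {
    src=$1
    staging=$2
    remote=$3
    "$RSYNC" -a --delete --tr=':/_' "$src/" "$staging/" || return 1
    "$RSYNC" -az --delete -e ssh "$staging/" "$remote/"
}

# Example with hypothetical paths:
# two_stage_backup /var/mail/Maildir /backup/staging backupuser@backuphost:/backup/mail
```

The staging copy costs extra local disk space, but it means only the local rsync has to be patched.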

Using encryption with EncFS

EncFS, which can be installed simply via apt-get, is an excellent option for synchronising Maildir structures (as well as other data). Using EncFS, an encrypted version of the source data can be maintained locally and then synchronised to a remote location with rsync. The only issue is that twice the space is required on the local host to hold both the plain view and the encrypted view. This can be avoided by running the daemon at boot time and having the mail server work directly in the temporary decrypted view of the encrypted directories, but doing so makes the mail server less reliable.

EncFS is used in its standard form with two directories: an encrypted one, which is permanent, and a decrypted view of that encrypted data, which only exists while EncFS is running. This means that we can start EncFS, which presents the current state of the encrypted data at the temporary decrypted mount point, then do a local rsync from the source Maildir structure to the decrypted mount point, and then stop EncFS. We can then rsync the permanent encrypted directory to the remote server, as it has now been brought up to date with the source structure. What remains is whether rsync works properly on data in this form.
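The sequence just described can be sketched as a single shell function. This is a minimal illustration under assumptions: the function name, the overridable command variables, the use of --delete and the example paths are hypothetical. encfs expects absolute paths and prompts for the password when mounting, and fusermount -u is the usual way to unmount a FUSE filesystem.

```shell
# Commands can be overridden (an assumption made for this sketch, handy for testing).
ENCFS="${ENCFS:-encfs}"
FUSERMOUNT="${FUSERMOUNT:-fusermount}"
RSYNC="${RSYNC:-rsync}"

# src    : plain source structure (e.g. the Maildir)
# enc    : permanent encrypted directory
# plain  : temporary mount point for the decrypted view
# remote : rsync destination for the encrypted directory
encrypted_backup() {
    src=$1
    enc=$2
    plain=$3
    remote=$4
    "$ENCFS" "$enc" "$plain" || return 1          # start EncFS (mount decrypted view)
    "$RSYNC" -a --delete "$src/" "$plain/"        # local sync into the decrypted view
    status=$?
    "$FUSERMOUNT" -u "$plain" || return 1         # stop EncFS even if the sync failed
    [ "$status" -eq 0 ] || return "$status"
    "$RSYNC" -az --delete -e ssh "$enc/" "$remote/"   # ship only encrypted data
}

# Example with hypothetical paths:
# encrypted_backup /var/mail/Maildir /backup/mail.enc /mnt/mail.plain backupuser@backuphost:/backup/mail
```

Only the encrypted directory ever leaves the local host, so the remote side never sees plain file contents or names.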

There is a one-to-one correspondence of filesystem objects between the decrypted and encrypted structures, which means that rsync can still work efficiently by transferring only the added files (and deleting the removed ones).

Furthermore, file and directory names are also encrypted, using only filesystem-friendly characters, which means that there's no need to use the transliterate patch with rsync when using EncFS.