Wednesday, 22 July 2009

rsyncable gzip

Rsync is an incredibly powerful tool for fast incremental file transfer. It's very smart, but sometimes it can do even better if you know how to help it.

One example of how to help rsync is with gzip compression. Gzip provides the --rsyncable flag, which makes it easier for rsync to spot local changes in the compressed output and send just the delta. According to the manual, the downside is that it makes the compressed file about 1% larger, but there should be a big rsync win.
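To get a feel for why this helps, here is a toy Python sketch of the general idea, not gzip's actual code. It assumes the trick is to restart the compressor at points chosen from the data itself, and it uses zlib's Z_FULL_FLUSH as a stand-in for that restart. The names boundaries, compress_with_resets and make_sample, and the WINDOW and MODULUS values, are made up for illustration.

```python
import random
import zlib

WINDOW = 64      # toy rolling-window size in bytes (assumption)
MODULUS = 256    # boundary wherever the window sum is a multiple of this (assumption)


def boundaries(data):
    """Yield chunk end offsets chosen purely from the nearby content."""
    rolling = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= WINDOW:
            rolling -= data[i - WINDOW]  # drop the byte that left the window
        if i + 1 >= WINDOW and rolling % MODULUS == 0:
            yield i + 1


def compress_with_resets(data):
    """Return a single zlib stream as a list of pieces, one per chunk."""
    comp = zlib.compressobj(1)  # level 1, in the spirit of gzip --fast
    pieces, start = [], 0
    for end in boundaries(data):
        # Z_FULL_FLUSH byte-aligns the output and forgets earlier history,
        # so the next chunk is compressed without reference to what came before.
        pieces.append(comp.compress(data[start:end]) + comp.flush(zlib.Z_FULL_FLUSH))
        start = end
    pieces.append(comp.compress(data[start:]) + comp.flush())
    return pieces


def make_sample(seed=0, words=40000):
    """Build some compressible pseudo-text to play with."""
    rng = random.Random(seed)
    letters = "abcdefghijklmnopqrstuvwxyz"
    vocab = ["".join(rng.choice(letters) for _ in range(rng.randint(3, 10)))
             for _ in range(2000)]
    return " ".join(rng.choice(vocab) for _ in range(words)).encode("ascii")


original = make_sample()
middle = len(original) // 2
edited = original[:middle] + original[middle + 100:]  # drop 100 bytes halfway

old_pieces = compress_with_resets(original)
new_pieces = compress_with_resets(edited)
known = set(old_pieces)
reused = sum(1 for piece in new_pieces if piece in known)
print(f"{reused} of {len(new_pieces)} compressed pieces unchanged")
```

Because the boundaries depend only on the last few bytes seen, they line up again shortly after the edit, so most compressed pieces come out byte-for-byte the same and a tool like rsync can skip them. The extra flushes are also where the small size penalty comes from.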

So how much better is it? To test this, I tar'd up some ARM chumby sources and objects that I've been playing around with; this came to 1.2GB, and compressed with gzip --fast it came to 510MB.

Copying it over wifi to my server took ~9 mins 50 secs. I then removed 1 object file which was about 50% down the archive, tar'd up the files again, gzip'd again and rsync'd this; this took ~9 mins 12 secs.

I then repeated the exercise. First I removed the original file on the server to start with a clean rsync target. This time I gzip'd with --fast --rsyncable, and the rsync took 9 mins 56 secs, which is ~1% longer than without --rsyncable because the gzip file was 1% larger. Next, I removed 1 object file, again about 50% down the archive, tar'd and gzip'd again with the --fast --rsyncable options, and rsync'd the result. This took 15 seconds, roughly 37 times faster than the 9 mins 12 secs needed to re-sync the plain gzip file!
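For anyone who wants to sanity-check the ratios, here is a small Python sketch that works them out from the timings quoted above. The helper name seconds and the variable names are just for illustration.

```python
def seconds(minutes, secs):
    """Convert a 'minutes, seconds' timing into seconds."""
    return minutes * 60 + secs


plain_first = seconds(9, 50)       # first copy, gzip --fast
plain_resync = seconds(9, 12)      # re-sync after removing one object file
rsyncable_first = seconds(9, 56)   # first copy, gzip --fast --rsyncable
rsyncable_resync = 15              # re-sync with --rsyncable

overhead = (rsyncable_first - plain_first) / plain_first
speedup = plain_resync / rsyncable_resync

print(f"first-copy overhead: {overhead:.1%}")   # first-copy overhead: 1.0%
print(f"re-sync speed-up: {speedup:.1f}x")      # re-sync speed-up: 36.8x
```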

  1. --rsyncable does add a little overhead to the gzip image size; in my test it was indeed ~1% extra.
  2. --rsyncable really does help rsync detect sync points and allows it to just copy over a small delta. Very efficient!