
SpreadMirror

by Izak Burger, posted on Dec 16, 2008 01:33 PM, last modified Apr 02, 2009 12:47 PM

This is a simple tool I implemented to synchronise files between the machines in a cluster.

The specific problem we had to solve involved a Zope product called FileSystemStorage (FSS). We have a small cluster of two machines running against a single ZEO server, which lives on one of the two machines, with heartbeat and drbd to guarantee availability.

When we put this architecture together we (the sysadmins) didn't know that FSS would be used for some of the content, so we found ourselves in the rather unfortunate position where not all the content was available from both nodes in our cluster.

There was also a possible further complication: a third node might be added to the cluster in future.

So we needed a simple tool for synchronising files between nodes, with no specific master node: whoever operated on the file at the time would be the master and everyone would have to follow. Normally you'd require a mutual exclusion mechanism to do this safely, but the way FSS handles this made that unnecessary.

I called the result spreadmirror. It uses spread as a messaging bus between the nodes, and a simple protocol in which file rename, creation and deletion events are replicated to all nodes. A Zope product called fssspread and a simple patch for FileSystemStorage are also provided. Unfortunately there have been at least three changes to the utils.py file in FSS recently, which makes it a little hard to provide a definitive version of the patch. You may have to patch it by hand.

When a file is modified, the patched FSS will generate an event. The fssspread product subscribes to these events and relays the event to the spreadmirror daemon running on the host. Spreadmirror then relays the change to all other nodes.
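As a rough illustration of that flow, here is a toy in-memory model in Python. It is not the actual fssspread or spreadmirror code: the real products use Zope's event machinery and spread, and the names EventBus and Node are hypothetical stand-ins invented for this sketch.

```python
class EventBus:
    """Stand-in for the event mechanism that the patched FSS fires into."""

    def __init__(self):
        self._subscribers = []

    def subscribe(self, handler):
        self._subscribers.append(handler)

    def notify(self, event):
        # Hand the event to every subscriber, in subscription order.
        for handler in self._subscribers:
            handler(event)


class Node:
    """Stand-in for a spreadmirror daemon on one host (hypothetical)."""

    def __init__(self, name, group):
        self.name = name
        self.group = group    # shared list playing the role of the message bus
        self.received = []    # events this node was told to replay
        group.append(self)

    def relay(self, event):
        # The originating node already has the change, so only the
        # other nodes in the group receive it.
        for node in self.group:
            if node is not self:
                node.receive(self.name, event)

    def receive(self, sender, event):
        self.received.append((sender, event))
```

In this model, subscribing a node's relay method to the bus plays the part of fssspread: any event raised on that host is passed to the local daemon, which forwards it to every other node. There is no fixed master; whichever node raises the event acts as the master for that change.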

The synchronisation protocol is extremely simple. File deletion and renaming are handled by performing the same operation on all the nodes. File creation is handled by sending the entire file over spread to all the nodes. The idea is that spread's multicast features can be used to optimise this. Because spread imposes a maximum message size, an additional internal "append" event is used between nodes, so that a file bigger than 16k ends up as one create event followed by one or more append events.
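The create/append splitting can be sketched in Python roughly as follows. This is a minimal sketch, not spreadmirror's actual wire format: the tuple layout, the function names file_to_events and apply_event, and the exact CHUNK_SIZE value are assumptions made for illustration, with the chunk size taken from the "16k" figure above.

```python
import os

# Assumed per-message payload limit, based on the "16k" figure in the text.
CHUNK_SIZE = 16 * 1024


def file_to_events(root, relpath, chunk_size=CHUNK_SIZE):
    """Yield one "create" event, then an "append" event per remaining chunk."""
    with open(os.path.join(root, relpath), "rb") as f:
        # The first chunk (possibly empty) always travels as "create".
        yield ("create", relpath, f.read(chunk_size))
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield ("append", relpath, chunk)


def apply_event(root, event):
    """Replay a single event under the directory `root` on a receiving node."""
    kind, relpath, payload = event
    target = os.path.join(root, relpath)
    if kind == "create":
        with open(target, "wb") as f:   # truncate or create the file
            f.write(payload)
    elif kind == "append":
        with open(target, "ab") as f:   # extend the file created earlier
            f.write(payload)
    elif kind == "delete":
        os.remove(target)
    elif kind == "rename":
        # For a rename, the payload carries the new relative path.
        os.rename(target, os.path.join(root, payload))
    else:
        raise ValueError("unknown event: %r" % (kind,))
```

Under these assumptions a 40000-byte file becomes one create event and two append events, and replaying them in order on another node reproduces the file byte for byte. The sketch relies on events for a given file arriving in the order they were sent.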

We've been using it for a few months now without problems.

Spreadmirror has been debianised, which makes it easy to install on any Debian or Ubuntu machine.

It is available from svn here.