WikiSync II

From Organic Design wiki

We need to get a peerd running again to synchronise articles across wikis that may be running on different servers. The original wikisync solution was unreliable and cumbersome, and suffered from the following flaws:

  • It was not fault tolerant: revision histories would fall out of sync if a connection was down while revisions were being made
  • It was slow and inefficient because it relied on polling to check whether articles requiring synchronisation had changed
  • It could not work properly in both directions because it had no way of dealing with edit conflicts

To handle fault tolerance it should work off a job-queue model and only remove items from the queue after the revisions have been verified to exist on each peer.
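The queue behaviour described above could be sketched roughly as follows. This is only an illustration under stated assumptions: `SyncJob`, `SyncQueue`, `send` and `has_revision` are hypothetical names rather than part of peerd, and the two callbacks stand in for whatever mechanism the daemon ends up using to push a revision to a peer and to check for it afterwards.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Set


@dataclass
class SyncJob:
    title: str       # article title
    revision: str    # hypothetical revision identifier
    peers: Set[str]  # every peer that must end up holding this revision
    confirmed: Set[str] = field(default_factory=set)


class SyncQueue:
    """Hypothetical job queue: a job leaves only once all peers are confirmed."""

    def __init__(self,
                 send: Callable[[str, SyncJob], None],
                 has_revision: Callable[[str, SyncJob], bool]):
        self.jobs = deque()
        self.send = send                  # assumed: pushes the revision to a peer
        self.has_revision = has_revision  # assumed: checks the peer's history

    def add(self, job: SyncJob) -> None:
        self.jobs.append(job)

    def run_once(self) -> int:
        """Try every queued job once and return how many jobs remain queued."""
        remaining = deque()
        for job in self.jobs:
            for peer in job.peers - job.confirmed:
                try:
                    if not self.has_revision(peer, job):
                        self.send(peer, job)
                    # Confirm only after checking that the revision exists.
                    if self.has_revision(peer, job):
                        job.confirmed.add(peer)
                except ConnectionError:
                    pass  # peer unreachable: keep the job for a later pass
            if job.confirmed != job.peers:
                remaining.append(job)
        self.jobs = remaining
        return len(self.jobs)
```

A job for a peer that is down stays queued across calls to `run_once()` and is retried once the connection returns, which is the property the original wikisync lacked.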

Instead of polling it will use a real-time messaging system based on the standard MediaWiki email notification system, but using a daemon (based on peerd) which integrates directly with Exim4 via a pipe. The peer will maintain permanent logins to all the wikis it deals with to ensure very rapid response, and will poll to ensure it is still logged in, reporting any downtime.
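As a rough sketch of the receiving end of that pipe, the following shows how a notification message might be read from standard input and reduced to the fields the daemon needs. It assumes the message arrives as a plain RFC 5322 message (no BSMTP wrapping) and makes no assumption about what MediaWiki puts in the subject or body; `parse_notification` and `main` are hypothetical names, not existing peerd functions.

```python
import email
import sys
from email import policy


def parse_notification(raw: bytes) -> dict:
    """Extract the basic fields from one piped notification message."""
    msg = email.message_from_bytes(raw, policy=policy.default)
    # Prefer the plain-text part; returns None if the message has none.
    body = msg.get_body(preferencelist=("plain",))
    return {
        "to": str(msg.get("To", "")),
        "from": str(msg.get("From", "")),
        "subject": str(msg.get("Subject", "")),
        "body": body.get_content() if body is not None else "",
    }


def main() -> None:
    # Assumed entry point: the mail server writes one message to stdin.
    info = parse_notification(sys.stdin.buffer.read())
    # A real daemon would queue a synchronisation job here; this sketch
    # only reports what was received.
    print(info["to"], info["subject"])
```

The script that the pipe runs would call `main()`; working out which article changed from the subject or body is left open here because it depends on the notification format.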

It will need to use ideas from the P2P extension to resolve edit conflicts (the no-loss revision method, and notification when manual attention may be required).

Each wiki running a peerd instance should be able to act as a central management interface for synchronising categories of articles across groups of wikis. This may be done with RecordAdmin, in a similar way to how computers and networks are being handled. These groups should be category based so that a CategoryWatch-like methodology can be employed to notify the local peer of changes requiring action.

Dev notes

The Exim4 integration will be based on the method we use to integrate SpamAssassin which uses the following configuration additions:

# 850: Spamcheck router
spamcheck_router:
	no_verify
	check_local_user
	condition = "${if and { {!def:h_X-Spam-Flag:} {!eq {$received_protocol}{spam-scanned}}} {1}{0}}"
	driver = accept
	transport = spamcheck_transport

Add the following section before the line containing end transport/30_exim4-config_remote_smtp_smarthost

Note: be sure to check that the spamc binary is in the location specified, and that the SpamAssassin daemon (spamd) it connects to is running.

# 30: Spamcheck transport
spamcheck_transport:
	debug_print = "T: spamassassin_pipe for $local_part@$domain"
	driver = pipe
	command = /usr/sbin/exim4 -oMr spam-scanned -bS
	use_bsmtp
	transport_filter = /usr/local/bin/spamc
	home_directory = "/tmp"
	current_directory = "/tmp"
	user = Debian-exim
	group = Debian-exim
	return_fail_output
	message_prefix =
	message_suffix =

This can be simplified though, since there's no need to use a program to determine whether routing should occur or not; it will be based on the email address (e.g. peerd@localhost or similar).

Here are some relevant entries from the Exim4 manual: