A Java and Haskell implementation of the rsync algorithm.
Note that this is not an implementation of rsync itself! The data it produces is not compatible with either rsync or librsync. It implements only the signature-generation, delta-analysis, and patch-application steps described in the paper linked above.
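In the rsync algorithm, signature generation and delta analysis rely on a weak checksum that can be updated in constant time as a window slides over the data. The following is a minimal Java sketch of that rolling checksum as defined in the rsync technical report (modulus 2^16). The class name RollingChecksum and its methods are hypothetical and are not part of this library; whether ssync uses exactly these constants is an assumption, not something this README states.

```java
/** Hypothetical sketch of the rsync weak rolling checksum; not the ssync API. */
final class RollingChecksum {
    private static final int MOD = 1 << 16; // M = 2^16, as in the rsync technical report

    private int a;                  // sum of the bytes in the window, mod M
    private int b;                  // position-weighted sum of the bytes, mod M
    private final int windowLength; // number of bytes in the window

    /** Computes the checksum of data[offset .. offset + length) from scratch. */
    RollingChecksum(byte[] data, int offset, int length) {
        this.windowLength = length;
        for (int i = 0; i < length; i++) {
            int x = data[offset + i] & 0xff;
            a = (a + x) % MOD;
            b = (b + (length - i) * x) % MOD; // first byte has weight `length`, last has weight 1
        }
    }

    /** Slides the window one byte to the right: drops `outgoing`, appends `incoming`. */
    void roll(byte outgoing, byte incoming) {
        int out = outgoing & 0xff;
        int in = incoming & 0xff;
        a = Math.floorMod(a - out + in, MOD);
        b = Math.floorMod(b - windowLength * out + a, MOD);
    }

    /** Packs the two 16-bit halves into one 32-bit value: a + 2^16 * b. */
    int value() {
        return a | (b << 16);
    }
}
```

Rolling the window forward by one byte with roll gives the same value as constructing a fresh RollingChecksum at the next offset, which is what lets a delta pass test every byte offset cheaply before falling back to a strong hash.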
In the Java version, compute the signature of a file with
SignatureComputer.compute or a SignatureComputer.SignatureFileInputStream.
To create a patch, read the generated signature data into a
SignatureTable and pass it, together with an input stream, to
PatchComputer.compute or a PatchComputer.PatchComputerInputStream.
Finally, to generate the new file, pass the patch together with a
BlockFinder to PatchApplier.apply or a PatchApplier.PatchInputStream.
The use of these classes is demonstrated in the class
com.socrata.ssync.SSync.
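To make the three stages concrete, here is a toy, in-memory Java sketch of signature generation, delta analysis, and patch application. It deliberately does not call the classes named above, since their exact signatures are not given here; RsyncSketch, Op, and every method name below are hypothetical. For brevity it hashes every candidate window with MD5 instead of first filtering with a rolling weak checksum, so it illustrates the data flow only, not the library's performance or its file format.

```java
import java.io.ByteArrayOutputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy illustration of the three rsync stages; not the ssync API or wire format. */
final class RsyncSketch {
    /** One patch operation: copy block N of the old data, or emit literal bytes. */
    static final class Op {
        final int blockIndex; // >= 0 for a block reference, -1 for a literal
        final byte[] literal; // null for a block reference
        Op(int blockIndex, byte[] literal) { this.blockIndex = blockIndex; this.literal = literal; }
    }

    /** Strong hash of data[offset .. offset + length), hex-encoded for use as a map key. */
    static String strongHash(byte[] data, int offset, int length) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            md5.update(data, offset, length);
            StringBuilder hex = new StringBuilder();
            for (byte b : md5.digest()) hex.append(String.format("%02x", b & 0xff));
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // MD5 is required on every Java platform
        }
    }

    /** Stage 1: map the hash of each full block of the old data to its block index. */
    static Map<String, Integer> signature(byte[] old, int blockSize) {
        Map<String, Integer> table = new HashMap<>();
        for (int i = 0; (i + 1) * blockSize <= old.length; i++) {
            table.putIfAbsent(strongHash(old, i * blockSize, blockSize), i);
        }
        return table;
    }

    /** Stage 2: scan the target, emitting block references where a window matches. */
    static List<Op> delta(Map<String, Integer> table, byte[] target, int blockSize) {
        List<Op> ops = new ArrayList<>();
        ByteArrayOutputStream pending = new ByteArrayOutputStream();
        int pos = 0;
        while (pos < target.length) {
            Integer match = pos + blockSize <= target.length
                    ? table.get(strongHash(target, pos, blockSize)) : null;
            if (match != null) {
                flush(pending, ops);           // literals seen so far come first
                ops.add(new Op(match, null));
                pos += blockSize;
            } else {
                pending.write(target[pos++]);  // no match here: keep the byte as a literal
            }
        }
        flush(pending, ops);
        return ops;
    }

    private static void flush(ByteArrayOutputStream pending, List<Op> ops) {
        if (pending.size() > 0) {
            ops.add(new Op(-1, pending.toByteArray()));
            pending.reset();
        }
    }

    /** Stage 3: rebuild the target from the patch plus the old data. */
    static byte[] apply(List<Op> ops, byte[] old, int blockSize) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Op op : ops) {
            if (op.blockIndex >= 0) out.write(old, op.blockIndex * blockSize, blockSize);
            else out.write(op.literal, 0, op.literal.length);
        }
        return out.toByteArray();
    }
}
```

In this sketch, only the signature table needs to travel to the side that holds the target data, and only the list of operations needs to travel back; apply then reconstructs the target using blocks it already has in the old data.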
The Haskell version of the SSync library uses
conduit for streaming data.
The
produceSignatureTable
conduit will digest a byte-stream into a signature file, which can
itself be read into a SignatureTable value via
consumeSignatureTable.
If the signature table is malformed, consumeSignatureTable will
throw a SignatureTableException. The
patchComputer conduit can
combine the signature table with a stream of bytes to produce a patch
file. Finally, the
patchApplier conduit can
combine the patch file with the data from the file being patched to
produce the target.
The use of these functions is demonstrated in the code for the
ssync executable.
The ssync library (but not the executable) is compatible with GHCJS
(note: GHCJS is currently a moving target; ssync has been built with
the version at commit
100fa6d67).
When using GHCJS, the only HashAlgorithm available is MD5.
The binary-equivalence-test.sh file contains tests that ensure the
Java and Haskell versions produce exactly the same output for the same
input.