Update data only by the difference between files (delta for Java)


UPDATE: I solved the problem with a great external library: https://code.google.com/p/xdeltaencoder/ (the way I did it is posted below as the accepted answer).

Imagine I have two separate PCs that both hold an identical byte[] A.

One of the PCs creates byte[] B, which is almost identical to byte[] A but is a 'newer' version.

For the second PC to update its copy of byte[] A to the latest version (byte[] B), I need to transmit the whole of byte[] B to the second PC. If byte[] B is many GB in size, this will take too long.

Is it possible to create a byte[] C that is the 'difference between' byte[] A and byte[] B? The requirement for byte[] C is that, knowing byte[] A, it is possible to recreate byte[] B.

That way, I will only need to transmit byte[] C to the second PC, which in theory would be only a fraction of the size of byte[] B.
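
To make the idea of byte[] C concrete, here is a minimal toy sketch. It is not the algorithm a real delta encoder uses; it only handles a single changed region by recording how many bytes A and B share at the start and at the end, plus the bytes of B in between. The class name `SimpleDelta` and its methods are made up for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Toy delta (hypothetical): shared prefix length, shared suffix length, and B's differing middle. */
final class SimpleDelta {
    final int prefixLen;  // number of leading bytes identical in A and B
    final int suffixLen;  // number of trailing bytes identical in A and B
    final byte[] middle;  // bytes of B between the shared prefix and suffix

    SimpleDelta(int prefixLen, int suffixLen, byte[] middle) {
        this.prefixLen = prefixLen;
        this.suffixLen = suffixLen;
        this.middle = middle;
    }

    /** Builds the delta "C" from the old data A and the new data B. */
    static SimpleDelta diff(byte[] a, byte[] b) {
        int max = Math.min(a.length, b.length);
        int p = 0;
        while (p < max && a[p] == b[p]) {
            p++;
        }
        int s = 0;
        // The suffix may not overlap the prefix, hence the (max - p) bound.
        while (s < max - p && a[a.length - 1 - s] == b[b.length - 1 - s]) {
            s++;
        }
        return new SimpleDelta(p, s, Arrays.copyOfRange(b, p, b.length - s));
    }

    /** Rebuilds B from A and this delta. */
    byte[] applyTo(byte[] a) {
        byte[] b = new byte[prefixLen + middle.length + suffixLen];
        System.arraycopy(a, 0, b, 0, prefixLen);
        System.arraycopy(middle, 0, b, prefixLen, middle.length);
        System.arraycopy(a, a.length - suffixLen, b, prefixLen + middle.length, suffixLen);
        return b;
    }

    public static void main(String[] args) {
        byte[] a = "hello world".getBytes(StandardCharsets.UTF_8);
        byte[] b = "hello brave world".getBytes(StandardCharsets.UTF_8);
        SimpleDelta c = diff(a, b);
        // Only c.middle (plus two ints) would need to be transmitted.
        System.out.println(Arrays.equals(c.applyTo(a), b));
    }
}
```

With changes scattered across many places this toy delta degrades to sending almost all of B, which is why a dedicated delta-encoding library is the more practical choice for real data.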

I am looking for a solution to this problem in Java.

Thank you very much for any help you can provide :)

EDIT: In most circumstances an update inserts extra bytes into parts of the array. Of course, it is also possible that some bytes are changed or deleted. The byte[] itself represents a tree of the names of all the files/folders on a target PC. It is originally created by building a tree of custom objects, marshalling them with JSON, and then compressing that data with a zip algorithm. I am struggling to create an algorithm that can intelligently create byte[] C.

EDIT 2: Thank you so much for all the help everyone here has given, and I am sorry for not being active for such a long time. I will most probably use an external library to do the delta encoding for me. A great part about this thread is that I now know what the thing I want to achieve is called! When I find an appropriate solution I will post it and accept it so others can see how I solved my problem. Once again, thank you very much for all your help.


Solution

  • In the end, I used this library:

    https://code.google.com/p/xdeltaencoder/

    In my tests it works really well. However, you will need to checksum the source yourself (in my case fileAJson), as the library does not do it for you!

    The code is below:

    //Create delta
    String[] deltaArgs = new String[]{fileAJson.getAbsolutePath(), fileBJson.getAbsolutePath(), fileDelta.getAbsolutePath()};
    XDeltaEncoder.main(deltaArgs);
    
    //Apply delta
    deltaArgs = new String[]{"-d", fileAJson.getAbsolutePath(), fileDelta.getAbsolutePath(), fileBTarget.getAbsolutePath()};
    XDeltaEncoder.main(deltaArgs);
    
    //Trivia: surprisingly, this also works (fileBJson is passed as the source instead of fileAJson)
    deltaArgs = new String[]{"-d", fileBJson.getAbsolutePath(), fileDelta.getAbsolutePath(), fileBTarget.getAbsolutePath()};
    XDeltaEncoder.main(deltaArgs);
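
Since the source is not checksummed automatically, one option is to send a hash of the source file along with the delta and compare it on the receiving side before applying the delta. The sketch below uses only the JDK's `MessageDigest`; the class name `SourceChecksum`, the method `sha256Hex`, and the choice of SHA-256 are assumptions for illustration and are not part of xdeltaencoder.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: hashes the delta source so both sides can confirm they hold the same file. */
final class SourceChecksum {

    /** Returns the SHA-256 digest of the file as a lowercase hex string, reading it in chunks. */
    static String sha256Hex(Path file) throws IOException {
        MessageDigest digest;
        try {
            digest = MessageDigest.getInstance("SHA-256");
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

The sender would compute `SourceChecksum.sha256Hex(fileAJson.toPath())` when creating the delta, and the receiver would compare that value against the hash of its own copy before running the "-d" step, refusing to apply the delta if they differ.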