Copied DocumentFile has different size and hash to original

I'm attempting to copy / duplicate a DocumentFile in an Android application, but upon inspecting the created duplicate, it does not appear to be exactly the same as the original (which is causing a problem, because I need to do an MD5 check on both files the next time a copy is called, so as to avoid overwriting the same files).

The process is as follows:

  1. User selects a file from an ACTION_OPEN_DOCUMENT_TREE
  2. Source file's type is obtained
  3. New DocumentFile in target location is initialised
  4. Contents of first file are duplicated into second file

The initial stages are done with the following code:

    // Get the source file's type
    String sourceFileType = MimeTypeMap.getSingleton().getExtensionFromMimeType(
            contextRef.getContentResolver().getType(file.getUri()));

    // Create the new (empty) file
    DocumentFile newFile = targetLocation.createFile(sourceFileType, file.getName());

    // Copy the file
    CopyBufferedFile(
            new BufferedInputStream(contextRef.getContentResolver().openInputStream(file.getUri())),
            new BufferedOutputStream(contextRef.getContentResolver().openOutputStream(newFile.getUri())));

The main copy process is done using the following snippet:

    void CopyBufferedFile(BufferedInputStream bufferedInputStream, BufferedOutputStream bufferedOutputStream) {
        // Duplicate the contents of the temporary local File to the DocumentFile
        try {
            byte[] buf = new byte[1024];
            bufferedInputStream.read(buf);
            do {
                bufferedOutputStream.write(buf);
            } while (bufferedInputStream.read(buf) != -1);
        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            try {
                if (bufferedInputStream != null) bufferedInputStream.close();
                if (bufferedOutputStream != null) bufferedOutputStream.close();
            } catch (IOException e) { e.printStackTrace(); }
        }
    }

The problem I'm facing is that although the file copies successfully and is usable (it's a picture of a cat, and it's still a picture of a cat in the destination), it is slightly different.

  1. The file size has changed from 2261840 to 2262016 (+176)
  2. The MD5 hash has changed completely

Is there something wrong with my copying code that is causing the file to change slightly?

Thanks in advance.


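  • Your copying code is incorrect. It assumes (incorrectly) that each call to read will either fill the whole buffer (buf.length bytes) or return -1. In fact, read may return any number of bytes from 1 up to buf.length, and the last read before end of stream is usually a short one.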
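
    The size change reported in the question is consistent with this. As a rough sketch (the class and method names are made up for illustration, and an in-memory array of zeros stands in for the picture), the following runs a loop that writes the whole buffer after every read, which is what the question's code appears to do, over 2261840 bytes:

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

class PaddedCopyDemo {
    // Hypothetical helper: copies sourceLength bytes while ignoring the
    // value returned by read(), and reports how many bytes were written.
    static int copyIgnoringReadCount(int sourceLength) throws IOException {
        BufferedInputStream in =
                new BufferedInputStream(new ByteArrayInputStream(new byte[sourceLength]));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        in.read(buf);
        do {
            out.write(buf);   // always writes all 1024 bytes, even after a short read
        } while (in.read(buf) != -1);
        return out.size();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(copyIgnoringReadCount(2261840)); // prints 2262016
    }
}
```

    The result is 2209 × 1024 = 2262016. The final read returns only 848 bytes, but the full 1024-byte buffer is written anyway, which accounts for the extra 176 bytes and for the changed hash.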

    What you should do is capture the number of bytes read in a variable each time, and then write exactly that number of bytes. Your code for closing the streams is verbose and (in theory [1]) buggy as well.

    Here is a rewrite that addresses both of those issues, and some others as well.

    void copyBufferedFile(BufferedInputStream bufferedInputStream,
                          BufferedOutputStream bufferedOutputStream)
             throws IOException {
        try (BufferedInputStream in = bufferedInputStream;
             BufferedOutputStream out = bufferedOutputStream) {
            byte[] buf = new byte[1024];
            int nosRead;
            while ((nosRead = in.read(buf)) != -1) {  // read this carefully ...
                out.write(buf, 0, nosRead);
            }
        }
    }

    As you can see, I have gotten rid of the bogus "catch and squash exception" handlers, and fixed the resource leak using Java 7+ try-with-resources.

    There are still a few issues:

    1. It is better for the copy function to take file name strings (or File or Path objects) as parameters and be responsible for opening the streams.

    2. Given that you are doing block reads and writes, there is little value in using buffered streams. (Indeed, it might conceivably be making the I/O slower.) It would be better to use plain streams and make the buffer the same size as the default buffer size used by the Buffered* classes (8192 bytes in the standard JDK), or larger.

    3. If you are really concerned about performance, try using FileChannel.transferFrom.
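
    As a rough sketch of these three points, assuming ordinary files reachable through java.nio.file.Path (with a DocumentFile on Android the streams would instead come from ContentResolver, as in the question), the class and method names below are hypothetical:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class PathCopy {
    // Points 1 and 2: the method opens and closes its own plain streams
    // and uses an 8192-byte buffer.
    static void copyWithStreams(Path source, Path target) throws IOException {
        try (InputStream in = Files.newInputStream(source);
             OutputStream out = Files.newOutputStream(target)) {
            byte[] buf = new byte[8192];
            int nosRead;
            while ((nosRead = in.read(buf)) != -1) {
                out.write(buf, 0, nosRead);   // write only what was read
            }
        }
    }

    // Point 3: let the channel move the bytes.
    static void copyWithChannels(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target, StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            while (position < size) {
                // transferFrom may move fewer bytes than requested, so loop.
                position += out.transferFrom(in, position, size - position);
            }
        }
    }
}
```

    Whether the channel version is actually faster depends on the platform and file system, so it is worth measuring before adopting it.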

    1 - In theory, if the bufferedInputStream.close() call throws an exception, the bufferedOutputStream.close() call will be skipped. In practice, it is unlikely that closing an input stream will throw an exception. But either way, the try-with-resources approach deals with this correctly, and far more concisely.
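
    For the MD5 comparison mentioned in the question, the same "use the returned count" rule applies when feeding the digest. A minimal sketch, with a hypothetical class and method name, using the standard java.security.MessageDigest API:

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

class StreamMd5 {
    // Reads the stream to the end, closes it, and returns the MD5 hash
    // as a lowercase hex string.
    static String md5Hex(InputStream stream) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("MD5");
        try (InputStream in = stream) {
            byte[] buf = new byte[8192];
            int nosRead;
            while ((nosRead = in.read(buf)) != -1) {
                digest.update(buf, 0, nosRead);   // hash only the bytes actually read
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : digest.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }
}
```

    Calling md5Hex on the input streams of the source and the copy should give equal strings once the copy loop writes only the bytes it read.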