java tcp

Why does my file, sent through TCP, contain more data than the original file itself?


I have been trying to send a file through TCP to one of my colleagues and check whether it arrives correctly. We've succeeded in sending simple .txt files with some text in them, but something is off.

Whenever a .txt file is sent, the received file contains more than the original message. For example, my colleague sends a .txt file with the content 123456789012345678901234567890123. It is sent using DataOutputStream and FileInputStream, with the file size determined dynamically via file.length().

The file size is then used to allocate the buffer: byte[] buffer = new byte[filesize]

We then send it using:

    while (fis.read(buffer) > 0) {
        dos.write(buffer);
    }

    fis.close();
    dos.close();

Using this approach yields the following result:

sent:     123456789012345678901234567890123
received: 12345678901234567890123456789012378

As you can see, 78 gets appended to the message for some reason. We haven't been able to figure out why.

What's even weirder is that after some more tries, the messages arrived exactly as sent, without any extra gibberish. The behavior is very inconsistent.

Any input is greatly appreciated, thank you!


Solution

    while (fis.read(buffer) > 0) {
        dos.write(buffer);
    }

    You assume that each read() fills the complete buffer. Most of the time it may do that, but sometimes only part of the buffer is filled, and dos.write(buffer) still writes the whole array.
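To make the effect of a short read visible, here is a minimal sketch. It does not use your sockets or files: `chunked` is a hypothetical helper I made up that hands out at most a few bytes per call, imitating a stream that delivers data in pieces, and the class and method names are illustrative only. It shows the general failure mode, not necessarily the exact bytes you saw.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

class PartialReadDemo {

    // Hypothetical helper: a stream that returns at most `chunk` bytes per read call,
    // imitating a source that delivers its data in pieces.
    static InputStream chunked(byte[] data, final int chunk) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, chunk));
            }
        };
    }

    // Same shape as the loop in the question: the return value of read() is ignored.
    static byte[] copyIgnoringCount(InputStream in, int bufferSize) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[bufferSize];
        while (in.read(buffer) > 0) {
            out.write(buffer); // writes the whole array, even if only part of it is new
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "ABCDEFGHIJ".getBytes(StandardCharsets.US_ASCII);
        byte[] result = copyIgnoringCount(chunked(data, 7), data.length);
        System.out.println(data.length + " bytes sent, " + result.length + " bytes written");
    }
}
```

With a 10-byte buffer and reads of 7 and then 3 bytes, the loop runs twice and writes the full buffer both times, so the second write carries leftover bytes from the first read.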

    From the Javadoc of InputStream.read(byte[]):

    returns the total number of bytes read into the buffer, or -1 if there is no more data because the end of the stream has been reached.

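    So read() tells you how many bytes were actually read. You have to use that return value: write only that many bytes from the buffer, and keep reading until you have received all the bytes you expect. Otherwise, you will write out a buffer that has only been partially filled with new data, together with whatever it still held from before.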
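A corrected loop could look roughly like the sketch below. The class name `StreamCopy`, the method name `copy`, and the 8192-byte buffer size are my own choices for illustration. The buffer does not need to match the file length, and in-memory streams stand in here for your FileInputStream and DataOutputStream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class StreamCopy {

    // Copies everything from in to out and returns the number of bytes copied.
    // Only the bytes that each read() actually delivered are written.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // assumed size; independent of the file length
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // write exactly n bytes, not the whole array
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = "123456789012345678901234567890123".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream received = new ByteArrayOutputStream();
        long copied = copy(new ByteArrayInputStream(data), received);
        System.out.println(copied + " bytes copied");
    }
}
```

The same rule applies on the receiving side. If the receiver knows the length in advance, DataInputStream.readFully(byte[]) is one way to block until exactly that many bytes have arrived, and on Java 9 or later InputStream.transferTo(OutputStream) performs this kind of copy loop for you.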