Tags: sockets, large-files, reliability, continuous

Sending a large file over network continuously


We need to write software that continuously sends very large files (several TB each) to several destinations simultaneously, i.e. new data is sent as soon as it becomes available. Some destinations have a dedicated fiber connection to the source; others do not.

Several questions arise:

  • We plan to use TCP sockets for this task. What failover procedure would you recommend in order to handle network outages and dropped connections?
  • What should happen upon upload completion: should the server close the socket? If so, then is it a good design decision to have another daemon provide file checksums on another port?
  • Could you recommend a method to handle corrupted files, aside from downloading them again in full? Perhaps I could break them into 10 MB chunks and calculate a checksum for each chunk separately?

Thanks.


Solution

  • Since no answers have been given, I'm sharing our own decisions here:

    • There is a separate daemon for providing checksums for chunks and whole files.
    • We have decided to abandon the idea of using multicast over VPN for now; we use a multi-process server to distribute the files. The socket is closed and the worker process exits as soon as the file download is complete; any corrupted chunks must be re-downloaded separately.
    • We use a filesystem monitor to capture new data as soon as it arrives at the tier 1 distribution server.
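
For anyone who wants to try the per-chunk checksum idea, here is a minimal Python sketch of how a receiver might find out which chunks need to be fetched again. It is not our production code. The function names, the choice of SHA-256, and the 10 MB chunk size (taken from the question) are all assumptions made for illustration; the checksum daemon would serve the list of expected digests, and how it does that is left out.

```python
import hashlib

# Assumption: 10 MB chunks, as floated in the question.
CHUNK_SIZE = 10 * 1024 * 1024


def chunk_checksums(path, chunk_size=CHUNK_SIZE):
    """Return one SHA-256 hex digest per chunk_size-byte chunk of the file."""
    digests = []
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            digests.append(hashlib.sha256(chunk).hexdigest())
    return digests


def corrupted_chunks(expected, actual):
    """Return indices of chunks that differ from, or are missing relative to, `expected`."""
    bad = [i for i, (e, a) in enumerate(zip(expected, actual)) if e != a]
    # Chunks the sender has but the receiver has not written yet.
    bad.extend(range(len(actual), len(expected)))
    return bad


def chunk_byte_range(index, chunk_size=CHUNK_SIZE):
    """Return (offset, length) to request when re-downloading chunk `index`."""
    return index * chunk_size, chunk_size
```

The receiver would call `chunk_checksums` on its local copy, compare the result with the digests published by the sender, and request only the byte ranges given by `chunk_byte_range` for each index that `corrupted_chunks` reports. Note that while the file is still growing, the last chunk is usually partial, so its digest changes until the chunk is full.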