
Writing files locally vs. remote file system?


My question is about remote file systems on Windows.

Suppose you have a workstation X which has access to file systems on the network - say - \\ServerY\MYDir\.

Imagine a scenario in which you have two simultaneous threads on X.

  1. Thread 1 is writing a file to a local hard drive directory on X - C:\MYDir\.
  2. Thread 2 is writing to a remote file in \\ServerY\MYDir\.

I want to know whether these two I/O operations are actually independent, i.e. is thread 1 only using the hard disk controller of X, and thread 2 only using the network and passing all data over the wire to ServerY, where it is actually written to the hard drive?

Or

Is thread 2 also caching some data locally on X (and hence using the hard disk controller on X)? In that case the I/O operations of thread 2 may interfere with those of thread 1, which could cause a performance loss.

Basically - will there be any gain in writing to a local file and to a remote file in parallel?

My question is specific to remote file systems supported by Windows, like Microsoft Networks or NFS.


Solution

  • Typically thread 2 will not cache the writes locally, and the two threads are independent. So generally, yes, you will see performance benefits from accessing both files simultaneously.

    You will also usually see performance benefits from accessing two local files that are on different disks - even if they are on the same controller.

    You can often see performance benefits even if the files are on the same disk, because this allows you to keep the disk busy - but to avoid thrashing the head around you need to do I/O in large blocks. If the local drive is flash, there is no seek time, so having multiple threads reading from or writing to it doesn't reduce performance the way it can on a hard drive.

    A couple of things can affect this:

    1. Windows supports "offline files", where the client is allowed to cache files from a remote file system, both reads and writes, on a local drive. It used to be that the client would only use the local cache when offline. I don't know if Win8 changes this.

    2. "\\ServerY" might actually refer to the local machine.
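
Since the real answer depends on the disks, the network and the caching settings involved, the simplest check is to measure it. The following is only a rough sketch of such a measurement, not a proper benchmark: the helper names (`write_file`, `time_sequential`, `time_parallel`, `compare`) and the 4 MiB block size are assumptions made for illustration. It writes the same amount of data to each path, first one after the other and then from two threads, using large blocks as suggested above.

```python
import os
import threading
import time

# Assumed block size: large writes limit head movement on spinning disks.
CHUNK = 4 * 1024 * 1024


def write_file(path, total_bytes, chunk=CHUNK):
    """Write total_bytes of zeros to path in large blocks, then flush to the device."""
    block = b"\0" * chunk
    remaining = total_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk, remaining)
            f.write(block[:n])
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push buffered data out


def time_sequential(paths, total_bytes):
    """Return the seconds taken to write each path one after the other."""
    start = time.perf_counter()
    for path in paths:
        write_file(path, total_bytes)
    return time.perf_counter() - start


def time_parallel(paths, total_bytes):
    """Return the seconds taken to write all paths from separate threads."""
    threads = [
        threading.Thread(target=write_file, args=(path, total_bytes))
        for path in paths
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


def compare(paths, total_bytes):
    """Return (sequential_seconds, parallel_seconds) for the given paths."""
    return time_sequential(paths, total_bytes), time_parallel(paths, total_bytes)
```

For the scenario in the question you could call something like `compare([r"C:\MYDir\test.bin", r"\\ServerY\MYDir\test.bin"], 256 * 1024 * 1024)`, where both file names are made up. If the parallel time is close to the slower of the two individual writes rather than their sum, the two operations are effectively independent. Run it several times, since caching on either machine can skew a single run.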