
Thread safety of read/pread system calls


I have a few questions about the read()/pread() system calls in a multi-threaded environment.

I am using Mac OS X, which is FreeBSD-based, if that helps in any way. I only open this file for reading, never for read/write, and the language is C/C++.

Suppose we have a file on disk:
AAAABBBBCCCCDDDEEEE....

and four characters fit on one page of the file.

So Page1:AAAA

Page2:BBBB ..... and so on

Now I initiate a read system call from two different threads with the same file descriptor.
My intention is to read first page from thread 1, second page from thread 2,..and so on.

read(fd,buff,sizeof(page));

From the man page I understand that read() also advances the file pointer, so I will definitely get garbled results like

ABCC ABBB .. etc (with no particular sequence )

To remedy this I can use pread().

From man pages:

pread() performs the same function, but reads from the specified position in the file without modifying the file pointer.

But I am not sure whether pread() will actually achieve my objective: even though it does not advance the internal file pointer, I see no guarantee that the results will not be jumbled.

All of my data is page-aligned, and I want to read one page from each thread, like:

Thread 1 reads:AAAA
Thread 2 reads:BBBB
Thread 3 reads:CCCC ... without garbling the content.

I also found the post "Is it safe to read() from a file as soon as write() returns?", but it wasn't very helpful.

I am also not sure whether read() will actually have the problem I am thinking of. The file I am reading is binary, so it is difficult to inspect it manually and verify quickly.

Any help will be appreciated


Solution

  • read and write change the position of the underlying open file, and that position is shared by every thread using the same file descriptor. They are "thread safe" in the sense that your program will not have undefined behavior (crash or worse) if multiple threads perform IO on the same open file at once using them, but the order and atomicity of the operations could vary depending on the type of file and the implementation.

    On the other hand, pread and pwrite do not change the position in the open file. They were added to POSIX for exactly the purpose you want: performing IO operations on the same open file from multiple threads or processes without the operations interfering with one another's position. You could still run into some trouble with ordering if you're mixing pread and pwrite (or multiple calls to pwrite) with overlapping parts of the file, but as long as you avoid that, they're perfectly safe for what you want to do.