
Why does reading a file by creating a new thread take more time than not using a new thread?


So I am reading a file which is 3.5 GB long (actually that's half the size of the file; I am reading half of a 7 GB file). What I originally had in mind was to split the 7 GB file into two halves and read the halves in two separate threads, to see if I could get a performance boost over reading the whole file in one go without any threads.

But just reading half the file in a newly created thread takes far more time than reading the whole file without any threads. Why is there such a difference?

Here's the code without any threads:

#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>

int main(int argc, char **argv) {
    if (argc < 2) {
        printf("Usage: %s <filepath>\n", argv[0]);
        exit(1);
    }
    struct stat info;
    if (stat(argv[1], &info) < 0) {
        perror("stat()");
        exit(1);
    }
    long size = info.st_size;

    long count = 0;
    FILE *fptr = fopen(argv[1], "r");
    if (fptr == NULL) {
        perror("fopen()");
        exit(1);
    }
    int ch = 0;
    while (count != size / 2) {
        ch = fgetc(fptr);
        count++;
    }
    printf("read bytes: %ld\n", count);
    fclose(fptr);
    return 0;
}

The above code takes 8-10 ms to complete on average.

Now, the same thing is done using pthreads:

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

FILE *fptr1;
long size = 0;
long by = 0;

void *read_file(void *param) {
    long count = 0;
    int ch = 0;
    while (count != size / 2) {
        ch = fgetc(fptr1);
        count++;
    }
    by = count;
    return NULL;
}

int main(int argc, char **argv) {
    if (argc < 2) {
        exit(1);
    }

    struct stat info;
    if (stat(argv[1], &info) < 0) {
        perror("stat()");
        exit(1);
    }

    size = info.st_size;

    pthread_t thread1;
    fptr1 = fopen(argv[1], "r");
    if (fptr1 == NULL) {
        perror("fopen()");
        exit(1);
    }

    if (pthread_create(&thread1, NULL, &read_file, NULL) != 0) {
        fprintf(stderr, "pthread_create() failed\n");
        exit(1);
    }

    pthread_join(thread1, NULL);

    printf("bytes: %ld\n", by);
    fclose(fptr1);
    return 0;
}

This code, doing the exact same thing, takes on average 65-70 s.

Why does the threaded version take so much longer than the non-threaded one? Is there any point in reading a file in halves with two threads?

Also, I know fread() would be a better choice, completely agreed. I deliberately used fgetc() because I didn't want to set up a buffer and whatnot. Since fgetc() is used in both versions, I guess the answer lies in threading and fgetc().

Thank you for your help.


Solution

  • There are multiple reasons for the threaded code to be slower:

    • Using standard streams in a multi-threaded application is not recommended and has definite drawbacks: fgetc() is optimized for non-threaded programs where it does not need to lock the FILE structure. To allow consistent concurrent access to FILE structures in a multi-threaded application, all standard stream functions must take a lock to serialize access to the stream structure, even if the stream is only ever used from a single thread, because the library has no way to know that. This locking is very slow compared to the minimal work involved in reading a single byte from the buffer.

      If the stream is used from a single thread only, you can use fgetc_unlocked() to bypass this overhead, but there is no portable equivalent for more elaborate stream functions such as fgets(). fread() should not be a problem if you read large blocks at a time.

    • The thread version appears to perform the exact same task, but it uses global variables for size and fptr1, which has an impact on the generated code: the function must reload their values from memory on every iteration because the compiler cannot assume that fgetc() does not change them. This may prevent further optimisations in this version.

      Note that it is bad design to pass values to a thread via global variables; you should allocate a structure and pass a pointer to it as the param argument.

    • Note that if you create multiple threads, each handling a separate part of the file, you will hit another problem: the threads will compete to read blocks at different locations on the storage device, which may be very inefficient on some devices such as hard disks, because each of these accesses incurs a head movement with a typical latency of 10 ms. Reading the file sequentially might be faster by an order of magnitude. File system layout and caching, specific device characteristics and other factors may impact the performance and make it non-reproducible.

    For a quick test, try changing the read_file function to this:

    void *read_file(void *param) {
        long local_size = size;       /* local copies avoid reloading the */
        FILE *local_fptr = fptr1;     /* globals on every loop iteration  */
        long count = 0;
        int ch;
        while (count != local_size / 2) {
            ch = fgetc_unlocked(local_fptr);   /* skips the stream lock */
            count++;
        }
        by = count;
        return NULL;
    }