How to properly use lseek() to extend file size?


I'm trying to truly understand the use of lseek() while creating a file of the needed size. So I wrote this code whose only goal is to create a file of the size given in the input.

Running for example:

$ ./lseek_test myFile 5

I would expect it to create a file named myFile of 5 bytes whose last byte is occupied by the number 5. What I get is a file I can't even access. What's wrong? Did I misunderstand how lseek() is meant to be used?

#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/types.h>

#define abort_on_error(cond, msg) do {\
    if(cond) {\
        int _e = errno;\
        fprintf(stderr, "%s (%d)\n", msg, _e);\
        exit(EXIT_FAILURE);\
    }\
} while(0)

/* Write an integer with error control on the file */
void write_int(int fd, int v) {
    ssize_t c = write(fd, &v, sizeof(v));
    if (c == sizeof(v))
        return;
    abort_on_error(c == -1 && errno != EINTR, "Error writing the output file");
    abort_on_error(1, "Write operation interrupted, aborting");
}

int main(int argc, char *argv[]) {
    // Usage control
    abort_on_error(argc != 3, "Usage: ./lseek_test <FileName> <FileSize>");

    // Parsing of the input
    int size = strtol(argv[2], NULL, 0);
    // Open file
    int fd = open(argv[1], O_RDWR|O_CREAT, 0644);
    abort_on_error(fd == -1, "Error opening or creating file");

    // Use lseek() and write() to create the file of the needed size
    abort_on_error(lseek(fd, size, SEEK_SET) == -1, "Error in lseek");
    write_int(fd, size); // To truly extend the file 

    //Close file
    abort_on_error(close(fd) == -1, "Error closing file");
    return EXIT_SUCCESS;
}

Solution

  • Your program works for me exactly as I would expect, based on its implementation:

    • supposing that the named file does not initially exist, it creates it
    • it writes the 4 bytes of an int (sizeof(int)) having value 5 into the file, starting at offset 5
    • it writes nothing at offsets 0 - 4; reads of that gap return null bytes.

    The result is a nine-byte file, with byte values (not printable digits):

    0 0 0 0 0 5 0 0 0
    

    (My system is little-endian.) Note in particular that the file is not a text file in any sense. If you expected a text file, as seems to be the case, then tools that expect text may show nothing useful for it, and you might well describe that as not being able to access the file.

    Some considerations, then:

    • The fifth byte of a file is at offset 4 from the beginning, not 5.
    • If you want to write the digit '5' then store it in a char and write that char; do not write its int representation. Alternatively, wrap your file descriptor in a stream with fdopen() and use stream I/O functions, such as fputc().
    • If you want to fill the other space with anything other than null bytes then you'll need to do that manually.
    • As far as I can determine, this is all as required by POSIX. In particular, it says this of lseek:

    The lseek() function shall allow the file offset to be set beyond the end of the existing data in the file. If data is later written at this point, subsequent reads of data in the gap shall return bytes with the value 0 until data is actually written into the gap.

    (POSIX 1003.1-2008, 2016 Edition)