file filesystems sparse-file

Why would a program want/prefer to use a sparse file?


I know what a sparse file is, but I can't figure out how or why a program such as lastlog would prefer such a file over a normal file.

I know sparse files can be used for loopback filesystems to save space, but that is obviously not efficient for a program since it's another layer.

The only thing I can think of is using a sparse file for memory-efficient random access to a giant multidimensional array (e.g. a matrix), but I'm not sure that is what people actually use sparse files for, or that it performs much better than using multiple files.


Solution

  • The /var/log/lastlog file contains information about the most recent login for each user, stored as fixed-size records indexed by uid, so the record for a given user sits at an offset proportional to that uid. If a uid is not used or that user has never logged in, then no data will be stored in the sparse file for that entry.

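    If there are large gaps in the uid numbering in /etc/passwd, then there will be correspondingly large gaps in the /var/log/lastlog file. Because the file is sparse, those gaps are holes that take up no disk space, even though the apparent file size may be very large.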

    This structure allows multiple login processes to access and update the file simultaneously without locking or risk of corruption, because each one writes only to its own user's record. With a more complex file structure, there would need to be locks to prevent corruption while updating the file, and taking a lock during the login sequence is not a good idea.
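
As a rough illustration of the layout described above, the following Python sketch stores one fixed-size record per uid at offset uid * record size. The record format (a 4-byte timestamp, a 32-byte terminal name and a 256-byte host name) is an assumption modelled loosely on the Linux lastlog record, and `write_record`, `read_record` and the demo file name are hypothetical names chosen for this example, not part of any real tool.

```python
import os
import struct
import tempfile

# Assumed record layout: 4-byte login time, 32-byte terminal, 256-byte host.
RECORD = struct.Struct("=I32s256s")


def write_record(path, uid, timestamp, line, host):
    """Write one user's record at offset uid * RECORD.size."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # A single positioned write: nothing before the offset is touched,
        # so untouched uids stay as holes on filesystems that support them.
        os.pwrite(fd, RECORD.pack(timestamp, line, host), uid * RECORD.size)
    finally:
        os.close(fd)


def read_record(path, uid):
    """Return (timestamp, line, host) for uid, or None if there is no entry."""
    fd = os.open(path, os.O_RDONLY)
    try:
        data = os.pread(fd, RECORD.size, uid * RECORD.size)
    finally:
        os.close(fd)
    if len(data) < RECORD.size:
        return None  # offset lies beyond the end of the file
    timestamp, line, host = RECORD.unpack(data)
    if timestamp == 0:
        return None  # a hole reads back as zeros: no login recorded
    return timestamp, line.rstrip(b"\0"), host.rstrip(b"\0")


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "lastlog.demo")  # hypothetical file name
        write_record(path, 3, 1700000000, b"pts/0", b"example.org")
        write_record(path, 50000, 1700000100, b"tty1", b"")
        st = os.stat(path)
        print("apparent size:", st.st_size, "bytes")
        # st_blocks is counted in 512-byte units where it is available.
        print("allocated:", getattr(st, "st_blocks", 0) * 512, "bytes")
        print("uid 3:", read_record(path, 3))
        print("uid 1000:", read_record(path, 1000))


if __name__ == "__main__":
    demo()
```

With these two entries the apparent size is 50001 records, roughly 14.6 MB, while on a filesystem that supports holes only the blocks holding the two records should be allocated. The same effect can be seen on a real system by comparing the output of `ls -l` and `du` for /var/log/lastlog.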