
How does readdir return a pointer to information for the NEXT file?


I'm learning C, so I'm a bit confused about the function readdir. In the book K&R, the function dirwalk includes the following:

while ((dp = readdir(dfd)) != NULL){
  if (strcmp(dp->name, ".") == 0
//...code...

Based on my understanding, each time the while loop runs, dp (the directory entry) is advanced one step, so the next directory entry (which is associated with a file) can be processed, for as long as dp != NULL.

My question is: how does readdir return a new directory entry each time it's called? Where in the code does that happen? Please don't use too much jargon, as I've just started learning about this. Here's the code for readdir. Thanks.

#include <sys/dir.h>

/* readdir: read directory entries in sequence */
Dirent *readdir(DIR *dp)
{
  struct direct dirbuf; /* local directory structure */
  static Dirent d;      /* reused on every call */
  while (read(dp->fd, (char *) &dirbuf, sizeof(dirbuf))
         == sizeof(dirbuf)) {
    if (dirbuf.d_ino == 0) /* slot not in use */
      continue;
    d.ino = dirbuf.d_ino;
    strncpy(d.name, dirbuf.d_name, DIRSIZ);
    d.name[DIRSIZ] = '\0'; /* ensure termination */
    return &d;
  }
  return NULL; /* no more entries */
}

Solution

  • First, this is not the code that the POSIX readdir would use on any relevant operating system...

    How the code works is really simple, assuming that you know how files work. On the UNIX systems this code was written for, a directory can be read like any other file: it looks like a binary file made up of fixed-size directory records, and in this implementation the dirbuf structure is one such record. Reading sizeof dirbuf bytes from the file descriptor therefore gives you the next entry in the directory, that is, the filename and its associated inode number. The "next" part comes for free: the kernel keeps a current position for every open file and advances it after each read, so every call to read continues where the previous one stopped.

    When a file is deleted, its entry may be marked unused by setting the inode number to 0; the code skips such entries.
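The two paragraphs above can be tried out on an ordinary file. What follows is only a minimal sketch, not the K&R code: `struct record`, `NAMELEN`, `make_fake_dir` and `next_record` are made-up names for illustration, and the "directory" is a temporary file filled with four fixed-size records, one of them marked unused. Notice that `next_record` is never told where to read from; the file offset kept by the kernel does that job.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <unistd.h>

#define NAMELEN 14  /* hypothetical name length, similar in spirit to DIRSIZ */

/* Hypothetical fixed-size record, standing in for struct direct. */
struct record {
    unsigned short ino;   /* 0 marks an unused slot */
    char name[NAMELEN];
};

/* Write a small fake "directory" to a temporary file and return a
   descriptor positioned at its start, or -1 on failure. */
int make_fake_dir(void)
{
    static const struct record recs[] = {
        { 1, "." }, { 2, ".." }, { 0, "deleted" }, { 7, "notes.txt" }
    };
    char path[] = "/tmp/fakedirXXXXXX";
    int fd = mkstemp(path);

    if (fd == -1)
        return -1;
    unlink(path);  /* the file goes away once fd is closed */
    if (write(fd, recs, sizeof recs) != (ssize_t) sizeof recs
        || lseek(fd, 0, SEEK_SET) == (off_t) -1) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Fetch the next used record into *out. Returns 1 on success and 0 once
   a full record can no longer be read. No position is passed in: the
   kernel advances the descriptor's offset after every read(). */
int next_record(int fd, struct record *out)
{
    assert(out != NULL);
    while (read(fd, out, sizeof *out) == (ssize_t) sizeof *out) {
        if (out->ino != 0)
            return 1;   /* used slot: hand it to the caller */
        /* ino == 0: unused slot, keep reading */
    }
    return 0;
}
```

Calling `next_record` repeatedly on the descriptor from `make_fake_dir` yields the records for ".", ".." and "notes.txt" in that order, skips the slot whose inode number is 0, and then returns 0.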

    When the next used entry is found, its filename and inode number are copied into the Dirent d, which has static storage duration. This means there is only one Dirent, allocated once and used by readdir for the entire duration of the program. readdir returns the same pointer over and over again, always pointing to that one structure; only the contents of the structure change.
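A practical consequence of the static Dirent is that a caller who wants to keep an entry has to copy it before asking for the next one. The sketch below shows this with a toy function; `struct entry` and `next_entry` are hypothetical names, and the hard-coded name table merely stands in for the directory being read.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for Dirent. */
struct entry {
    long ino;
    char name[15];
};

/* Returns a pointer to the same static struct on every call, refilled
   with new contents each time, or NULL when the names run out. */
struct entry *next_entry(void)
{
    static const char *names[] = { "a.txt", "b.txt", "c.txt" };
    static int pos = 0;      /* plays the role of the file offset */
    static struct entry e;   /* one object for the whole program */

    if (pos >= 3)
        return NULL;
    e.ino = 100 + pos;
    strncpy(e.name, names[pos], sizeof e.name - 1);
    e.name[sizeof e.name - 1] = '\0';  /* ensure termination */
    pos++;
    return &e;
}
```

If a caller stores the pointer from the first call and then calls `next_entry` again, the stored pointer now shows "b.txt", because both calls returned the same address. Copying the struct (`struct entry saved = *p;`) before the second call preserves "a.txt".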

    Finally, once all entries in the directory have been read, the last call to readdir executes a read that does not return sizeof (dirbuf) bytes, so the loop ends and a NULL pointer is returned.
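The same "loop until NULL" pattern works with the readdir that a current POSIX system provides in `<dirent.h>`, even though its internals differ from the K&R version. As a rough sketch, the hypothetical helper `count_entries` below counts the entries of a directory. One detail that is an addition here and not part of the K&R code: the POSIX readdir also returns NULL on error, so the sketch clears errno before the loop to tell the two cases apart.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <string.h>

/* Count the entries in path other than "." and "..".
   Returns -1 if the directory cannot be opened or read. */
long count_entries(const char *path)
{
    DIR *dir = opendir(path);
    struct dirent *dp;
    long n = 0;

    if (dir == NULL)
        return -1;
    errno = 0;  /* lets us tell end-of-directory from an error */
    while ((dp = readdir(dir)) != NULL) {
        if (strcmp(dp->d_name, ".") == 0 || strcmp(dp->d_name, "..") == 0)
            continue;  /* skip self and parent */
        n++;
    }
    if (errno != 0)
        n = -1;  /* readdir stopped because of an error */
    closedir(dir);
    return n;
}
```

As in dirwalk, the caller never says which entry it wants; each call simply continues from where the previous one stopped, and NULL signals that nothing is left.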