Search code examples
cmemory-managementlinked-list

Efficient linked list design for a C file manager


Context.

I'm making a file manager in C. To display a list of files in the CWD in ncurses, I chose to implement the file list as a singly linked list of the following structs:

typedef struct FileNode {
    char name[256]
    struct FileNode* next;
} FileNode;

I know for certain that the filenames will never exceed 255 characters plus the null terminator. This is standard for ext2, ext3, ext4, BTRFS, XFS, ZFS and exFAT. (I'm not familiar with DOS filesystems, but I'm just targeting UNIX.)

I have to account for the case in which a given file is renamed to a longer name, in which case I would only change the name for that file node and not rebuild the entire linked list. However, if the name is changed, then the struct will also potentially change places in the list because the list is in alphabetical order of FileNode names.

Note. I will likely add members to the FileNode struct later on to determine which files in the list are selected by the user in the ncurses interface, for example.

Problem.

Because the FileNode struct reserves 256 bytes for the filename, a lot of memory will be wasted for directories having an enormous number of files with relatively short names. My file manager is general-purpose, so I can well imagine a scenario in which this would occur.

What would an advanced programmer do in this case? Dynamic allocation of filenames would cause significantly more overhead when creating a large file list or when renaming many files simultaneously. (This will be a feature of my file manager.)

Things I've looked into.

After considering FAMs for a while, (see below code snippet) I noticed that things got complicated for file renaming unless I chose to free the node and create a new one upon renaming a file. Is this a good option? The costly part would be to check the length of all the filenames initially in order to generate a list os structs which are individually suited to these lengths. To repeat my earlier question, what would an advanced programmer do?

typedef FileNode {
    struct FileNode* next;
    char name[]; // this member should be the last
} FileNode;

Solution

  • Some ideas from an Algorithms perspective:

    For an always sorted list, you could make use of data structures like AVL or Red-Black Trees to maintain sorted order of FileList upon creating, deleting or renaming a file as per your choice.

    To improve performance with large directories:

    Use a hash table for filename lookups, inspired by the linux kernel's dcache

    Map filenames (or file IDs) to FileNode pointers in the Red Black Tree for O(1) lookups.

    To handle rename cases, I would:

    1. Retrieve the FileNode pointer using Hash table.
    2. Free the occupied memory by FileNode->name. Delete the node from Red-black tree and rebalance.
    3. Reallocate the necessary memory required.
    4. Copy the newName to FileNode->name. Insert the node in Red-black tree and rebalance again.

    Further scope of improvement: