macos filesystems symlink hardlink

Deduplicating identical files using hard links


I have a couple of identical files stored in more than one place on my hard disk. I figure I can save a lot of disk space by hard-linking them to point to the same file. I am a little worried about possibly disastrous side effects.

I guess it does not affect permissions, as those are stored in the respective directories, just like the file name, right? (Update: Apparently I guessed wrong. Permissions are shared, as Carl demonstrates in his answer.)

The biggest concern is that a change to one file inadvertently also changes the other files. Read-only files should be safe, then. Files that can change are also okay if the application writes a new file rather than updating the existing one in place. I believe most applications work that way, but probably not all.
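To make the difference between the two update strategies concrete, here is a minimal sketch that can be run in a scratch directory. The file names (`a.txt`, `b.txt`, `a.txt.tmp`) are made up for the demonstration, and it assumes a POSIX-style shell with `mktemp -d` available.

```shell
#!/bin/sh
# Sketch: compare an in-place update with a replace-by-rename update.
# All file names are hypothetical and only exist in a temp directory.
set -eu
dir=$(mktemp -d)
cd "$dir"

printf 'v1\n' > a.txt
ln a.txt b.txt                 # b.txt is a second name for the same file

# In-place update: append to the existing file through one of its names.
printf 'v2\n' >> a.txt
inplace_b=$(cat b.txt)         # b.txt shows the appended line as well

# Replace-by-rename: write a new file, then move it over the old name.
printf 'v3\n' > a.txt.tmp
mv a.txt.tmp a.txt             # a.txt now names a different file
renamed_a=$(cat a.txt)
renamed_b=$(cat b.txt)         # b.txt still names the old file

printf 'b.txt after in-place update:\n%s\n' "$inplace_b"
printf 'a.txt after rename: %s\n' "$renamed_a"
printf 'b.txt after rename:\n%s\n' "$renamed_b"
```

After the in-place update both names show the new content. After the rename, `a.txt` and `b.txt` no longer refer to the same file, so the link (and the space saving) is gone.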

Is there anything else to consider?

I am on OS X / HFS+.


Solution

  • Don't use hard links if you want changes to one file not to be reflected in the others. That's the whole point of hard links: multiple directory entries for the same file (same inode, same blocks on disk). Permissions belong to the file, not to the name, so changing them through one name changes them for every name:

    $ touch file
    $ ln file link
    $ ls -l
    total 0
    -rw-r--r--  2 owner group  0 Nov 11 16:44 file
    -rw-r--r--  2 owner group  0 Nov 11 16:44 link
    $ chmod 444 file
    $ ls -l
    total 0
    -r--r--r--  2 owner group  0 Nov 11 16:44 file
    -r--r--r--  2 owner group  0 Nov 11 16:44 link
    

    From the ln man page:

    A hard link to a file is indistinguishable from the original directory entry; any changes to a file are effectively independent of the name used to reference the file.
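
To check whether two names already refer to the same file, one option is to compare inode numbers. The following is only a sketch: the names `file`, `link` and `other` are made up, and the `-ef` test operator is assumed to be available (it is in bash, zsh and dash, but treat that as an assumption for other shells).

```shell
#!/bin/sh
# Sketch: detect whether two names are hard links to the same file.
# File names are hypothetical and only exist in a temp directory.
set -eu
dir=$(mktemp -d)
cd "$dir"

touch file other
ln file link

# -ef is true when both names refer to the same file (same device and inode).
if [ file -ef link ]; then same_link=yes; else same_link=no; fi
if [ file -ef other ]; then same_other=yes; else same_other=no; fi

# ls -i prints the inode number in front of each name; the link count
# column shows how many names each file has.
ls -li file link other
echo "file and link are the same file: $same_link"
echo "file and other are the same file: $same_other"
```

In the `ls -li` output, `file` and `link` should show the same inode number and a link count of 2, while `other` has its own inode and a link count of 1.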