URL shortening: using inode as short name?


The site I am working on wants to generate its own shortened URLs rather than rely on a third party like tinyurl or bit.ly.

Obviously I could keep a running count of new URLs as they are added to the site and use that to generate the short URLs. But I am trying to avoid that if possible, since it seems like a lot of work just to make this one thing work.

As the things that need short URLs are all real physical files on the webserver, my current solution is to use their inode numbers, as those are already generated for me, ready to use, and guaranteed to be unique.

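function short_name($file) {
   // fileinode() returns false on failure (e.g. the file does not exist)
   $ino = @fileinode($file);
   if ($ino === false) {
      return false;
   }
   // base 36 uses the digits 0-9 and the letters a-z
   $s = base_convert((string) $ino, 10, 36);
   return $s;
}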

This seems to work. Question is, what can I do to make the short URL even shorter?

On the system where this is being used, the inodes for newly added files are in a range that makes the function above return a string 7 characters long.

Can I safely throw away some (half?) of the bits of the inode? And if so, should it be the high bits or the low bits?

I thought of using the crc32 of the filename, but that actually makes my short names longer than using the inode.

Would something like this have any risk of collisions? I've been able to get down to single digits by picking the right value of "$referencefile".

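function short_name($file, $referencefile) {
   // $referencefile is passed in, since it is not visible inside the function otherwise
   $ino = @fileinode($file);
   // arbitrarily selected pre-existing file,
   // as all newer files will have higher inodes
   $base = @fileinode($referencefile);
   if ($ino === false || $base === false) {
      return false;
   }
   $s = base_convert((string) ($ino - $base), 10, 36);
   return $s;
}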

Solution

  • I'm not sure this is a good idea: if you have to change servers, or replace or reformat the disk, the inode numbers of your files will most probably change, and all your short URLs will be broken or lost!

    The same applies if, for any reason, you need to move your files to another partition of your disk.


    Another idea might be to calculate a hash (CRC, MD5, or similar) of the file's name, as you suggested, and use some algorithm to "shorten" it.
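
    As a rough sketch of that hash-based idea (not a definitive implementation): the function below hashes the file's path with md5(), keeps only the first few hex digits, and converts them to base 36. The name short_name_from_hash and the $hexDigits parameter are hypothetical, made up for this illustration. Truncating the hash shortens the result but raises the chance of two files getting the same short name, so this assumes you store each short name somewhere and check for collisions before using it.

```php
<?php
// Hypothetical helper: derive a short name from the file's path
// rather than from its inode, so it survives a move to another disk
// as long as the path stays the same.
function short_name_from_hash(string $file, int $hexDigits = 6): string {
    // md5() returns 32 hex characters; keep only the first $hexDigits.
    // 6 hex digits = 24 bits, which is at most 5 characters in base 36.
    $hex = substr(md5($file), 0, $hexDigits);

    // Convert the truncated hex string from base 16 to base 36 (0-9, a-z).
    // Keeping $hexDigits small also avoids the precision loss that
    // base_convert() can have with very large numbers.
    return base_convert($hex, 16, 36);
}

// Example call with an assumed path:
$short = short_name_from_hash('/var/www/files/report.pdf');
```

    The same path always gives the same short name, but renaming or moving the file within the site changes it, so the trade-off is different from the inode approach rather than strictly better.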

    Here are a couple of articles about that: