Search code examples
pythondjangomd5filenamessha1

Can I use MD5 or SHA1 hashes for filenames?


Let's consider a site where users can upload files. Can I use MD5 or SHA1 hashes of their contents as filenames? If not, what should I use? To avoid collisions.


Solution

  • You can use almost anything as a filename, minus reserved characters. Those particular choices tell you nothing about the file itself, aside from its hash value. Provided they aren't uploading identical files, that should prevent file naming collisions. If you don't care about that, have at it.

    Usually people upload files in order for someone to pull them back down. So you'd need to have a descriptor of some kind; otherwise users would need to open a mass of files to get the one they want. Perhaps a better option would be to let the user select a name (up to a character limit) and then append the datetime code. Then, in order to have a collision, you'd need to have 2 users select the exact same name at the exact same time. Include seconds in the datetime code, and the chances of collision approach (but never equal) zero.