I am writing Go code, but I don't believe this is unique to Go, so let's generalize. Imagine a user, via Go code, creates three files with three distinct Unicode names. Notice that the last letters of the filenames differ:
καθέδρα.txt
καθέδρᾳ.txt
καθέδραι.txt
In Go, these are three distinct strings. However, if you try to create three files with these names, only two end up on disk. The second and third filenames appear to be treated as the same file, so when the script writes three user-created files, one goes "missing".
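To make the "three distinct strings" claim concrete, here is a minimal sketch that compares the names the way Go compares any strings (byte-wise) and prints their code points so the differing endings are visible. The helper name `allDistinct` is made up for this illustration.

```go
package main

import "fmt"

// allDistinct (hypothetical helper) reports whether every string in names
// differs from the others under Go's ordinary byte-wise string comparison.
func allDistinct(names []string) bool {
	seen := make(map[string]bool, len(names))
	for _, n := range names {
		if seen[n] {
			return false
		}
		seen[n] = true
	}
	return true
}

func main() {
	names := []string{"καθέδρα.txt", "καθέδρᾳ.txt", "καθέδραι.txt"}
	fmt.Println("all distinct in Go:", allDistinct(names))

	// Print the code points of each name so the different endings show up.
	for _, n := range names {
		fmt.Printf("%s: %U\n", n, []rune(n))
	}
}
```

So the collision does not happen in the Go strings themselves; it only appears once the names reach the file system.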
If you write καθέδρᾳ.txt and then καθέδραι.txt, you end up with only the first filename.
If you write καθέδραι.txt and then καθέδρᾳ.txt, you end up with only the first filename.
How do you guard in Go against this strange macOS (OS X) filename behavior with Unicode? It appears to treat two different strings as one filename.
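When you choose a case-insensitive file system on macOS (OS X), case insensitivity is more complex than intuition suggests, and the rules differ by language. In this case the likely cause is that, under Unicode full case folding, ᾳ (alpha with iota subscript) folds to αι, so the second and third names compare as equal.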
There is no real way to guard against this other than detecting the file system type.
The cross-platform way to detect the problem is to have your software write a file and then try to read it back using a different "case". If the second name resolves to the file you just wrote, the file system folds those names together.