windows, git, character-encoding, filesystems, symbolic-references

What character encoding is used in Git symbolic refs (especially on Windows)?


This is a really quick question: what is the character encoding used in symbolic ref files like .git/HEAD, especially on Windows?

Is it the same as the filesystem's encoding? That seems improbable, though, since I've heard that Windows' filesystem encoding is UTF-16, and the ASCII control bytes 0x00..0x1F and 0x7F are prohibited in Git ref names (we can't have a 0x00 byte in a Git ref). Is it UTF-8 universally? However, that does not seem to be documented in git help check-ref-format. Maybe it is documented somewhere else? Or is the encoding of symbolic refs undefined? But then, how can we clone, push and fetch branches between each other?


Solution

  • There is no specific character encoding used by Git's refs. The format is specified in the git check-ref-format manual page, and it allows a variety of byte values, including bytes that are not valid UTF-8, such as 0xFE and 0xFF.

    That said, it is customary to use UTF-8 for ref names. When ref files are written to the file system on Windows, their names are converted to UTF-16, because that is what Windows uses for file names. The contents of the files, however, remain arbitrary bytes, which, again, are customarily (but need not be) UTF-8.
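
In practice this means a tool reading a symbolic ref should treat the contents as bytes and only decode them for display. The following is a minimal Python sketch of that idea, not Git's own implementation. The helper names `parse_symbolic_ref` and `display_ref` are hypothetical, and the sketch assumes the usual loose-file layout in which a symbolic ref starts with `ref: ` followed by the target name.

```python
from typing import Optional


def parse_symbolic_ref(raw: bytes) -> Optional[bytes]:
    """Return the target ref name as raw bytes, or None if this is not a symbolic ref.

    Assumption: the file content looks like b"ref: refs/heads/main\n".
    """
    prefix = b"ref: "
    if not raw.startswith(prefix):
        return None  # e.g. a detached HEAD that holds an object ID instead
    # Drop the prefix and surrounding ASCII whitespace; keep the name as bytes.
    return raw[len(prefix):].strip()


def display_ref(name: bytes) -> str:
    """Decode a ref name for display, assuming the customary UTF-8.

    Bytes that are not valid UTF-8 are shown as \\xNN escapes instead of
    raising, because Git itself does not guarantee any encoding.
    """
    return name.decode("utf-8", errors="backslashreplace")


if __name__ == "__main__":
    # Read in binary mode so no encoding is applied to the contents.
    with open(".git/HEAD", "rb") as f:
        target = parse_symbolic_ref(f.read())
    print(display_ref(target) if target is not None else "not a symbolic ref")
```

Keeping the name as `bytes` until the last moment avoids losing information when a ref name happens not to be valid UTF-8.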