
Clang: is interpreting escape sequences in .incbin a bug or feature?


Currently, Clang interprets escape sequences (e.g. \x) in the file name given to .incbin.

Is this a bug or a feature?

One consequence is that a Windows path containing \ has to be written with \\ instead, so that the backslashes are not interpreted as escape sequences. This means such Windows paths need extra processing.


Solution

  • It is correct for clang to recognize and interpret escape sequences in the file argument of the .incbin directive:

    • clang generally follows conventions established by gcc.

    • gcc typically assembles using GNU as.

    • The GNU as manual explains escape sequence interpretation in its Strings section.

    • Its .incbin section says the file name must be quoted. It does not explicitly say that the file name is treated like other string constants, but absent other indication, that seems like the most reasonable interpretation.

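    Consistent with the above, GNU as interprets such escape sequences. Here \x41 (the hex code for the character A) is decoded, so the assembler looks for a file named A: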

    $ cat incbin.s
            .incbin "\x41"
    
    $ gcc -c incbin.s
    incbin.s: Assembler messages:
    incbin.s:1: Error: file not found: A
    
    $ gcc --version
    gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
    [...]
    

    Therefore it is sensible that clang does too:

    $ clang -c incbin.s
    incbin.s:1:10: error: Could not find incbin file 'A'
            .incbin "\x41"
                    ^
    
    $ clang --version
    clang version 16.0.0
    [...]
    
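When backslashes have to stay in the path, the doubling described in the question can be automated by whatever generates the assembly source. The following is a minimal Python sketch, not something from the GNU as or clang documentation. The helper names escape_for_incbin and incbin_directive are hypothetical, and it assumes that only backslashes and double quotes need escaping inside the quoted file name.

```python
def escape_for_incbin(path: str) -> str:
    """Escape a path so the assembler's string parser yields it unchanged.

    Assumption: backslashes and double quotes are the only characters
    that need escaping inside the quoted .incbin file name.
    """
    # Double the backslashes first, so the backslash added for a quote
    # in the next step is not doubled again.
    return path.replace("\\", "\\\\").replace('"', '\\"')


def incbin_directive(path: str) -> str:
    """Build a complete .incbin line for the given path (hypothetical helper)."""
    return f'.incbin "{escape_for_incbin(path)}"'


# Without escaping, the \x41 in this path would be decoded to the letter A.
print(incbin_directive(r"C:\data\x41.bin"))
# prints: .incbin "C:\\data\\x41.bin"
```

The emitted line can be written into a generated .s file or passed through a build system template.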

    Dealing with Windows paths

    Because of annoyances like this, I recommend using forward slashes as directory separators whenever possible, even on Windows: the Windows API accepts them and, in my experience, almost all programs do too.

    If using Cygwin, the cygpath -m ("mixed") option will print Windows paths using Windows drive letters but forward slashes.
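Outside Cygwin, the same forward-slash conversion can be done in a build script. This is a minimal Python sketch using the standard library's pathlib.PureWindowsPath, which parses Windows-style paths on any host OS. The function name incbin_with_forward_slashes and the example path are hypothetical, chosen only for illustration.

```python
from pathlib import PureWindowsPath


def incbin_with_forward_slashes(windows_path: str) -> str:
    """Build a .incbin line whose path uses forward slashes only.

    With no backslashes left in the path, there is nothing for the
    assembler to interpret as an escape sequence.
    """
    # as_posix() keeps the drive letter and turns each \ into /.
    forward = PureWindowsPath(windows_path).as_posix()
    return f'.incbin "{forward}"'


print(incbin_with_forward_slashes(r"C:\build\assets\blob.bin"))
# prints: .incbin "C:/build/assets/blob.bin"
```

The result has the same general shape as the mixed form: a Windows drive letter followed by forward slashes.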