Tags: gcc, assembly, x86, directive, gnu-assembler

What's the difference between the .asciz and the .string assembler directives?


I know that the .ascii directive doesn't put a null character at the end of the string, as the .asciz directive is used for that purpose. However, I don't know whether the .string directive puts a null character at the end of the string.

If it does append the null character, then what's the difference between the .asciz and the .string directives? To me, having both .asciz and .string seems redundant.


Solution

  • According to the GNU Binutils docs on as:

    .ascii "string" (Here for completeness)

    .ascii expects zero or more string literals [...] separated by commas. It assembles each string (with no automatic trailing zero byte) into consecutive addresses.

    .asciz "string"

    .asciz is just like .ascii, but each string is followed by a zero byte. The "z" in '.asciz' stands for "zero".

    .string "str", .string8 "str", .string16 "str", .string32 "str", .string64 "str"

    Copy the characters in str to the object file. You may specify more than one string to copy, separated by commas. Unless otherwise specified for a particular machine, the assembler marks the end of each string with a 0 byte.

    [...]

    The variants string16, string32 and string64 differ from the string pseudo opcode in that each 8-bit character from str is copied and expanded to 16, 32 or 64 bits respectively. The expanded characters are stored in target endianness byte order.

    To summarize, the differences between .string and .asciz:

    • On certain architectures (listed below), .string does not add the null byte, whereas .asciz always does. To test your own system, you can do this:

        echo '.string ""' | gcc -c -o stdout.o -xassembler -; objdump -sj .text stdout.o
      

      If the first byte is 00, then the null character was inserted.

    • .string also has suffixed variants (.string16, .string32, .string64) that expand each character to that width; plain .string, like .string8, uses 8 bits per character.

    As stated in the comments to the question, in simple use cases there is no practical difference between them. All of these directives support escape sequences and accept multiple arguments. Technically, however, the two pseudo-ops are handled differently by the assembler on some targets and are not aliases. (Contrast with .zero and .skip, which are aliases.)


    Regarding .string, the docs mention two architectures that behave differently:

    • HPPA (HP Precision Architecture) - does not add 0, but has a special .stringz directive for that.
    • TI-C54X (A DSP chip from Texas Instruments) - zero-fills upper 8 bits of each word (2 bytes). Has a related .pstring directive that packs the characters and zero-fills unused space.

    Digging through the source code in the gas/config folder, we can confirm this and find one more:

    • IA64 (Intel Itanium architecture) - .string and .stringz behave as they do on HPPA.