I would like to be able to define a symbol in an assembler file with any name whatsoever that does not contain NUL characters. How do I get the GNU assembler to create such symbols? What about NASM? MASM?
Edit: I am using the following Python script for testing (requires Python 3.5.1+):
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import tempfile
import os.path
import subprocess
import ctypes

def main(symbolname, quoter):
    join = os.path.join
    with tempfile.TemporaryDirectory() as d:
        as_file_name = join(d, 'test.s')
        with open(as_file_name, 'w') as file_object:
            assembler = '''\
\t.globl "{0}"
"{0}":
\tmov $0x0, %rdi # exit status
\tmov $231, %rax # __NR_exit_group
\tsyscall
'''.format(quoter(symbolname))
            file_object.write(assembler)
        objectname, sharedlib = join(d, 'test.o'), join(d, 'test.so')
        subprocess.check_call(['as', '-o', objectname, as_file_name])
        subprocess.check_call(['ld', objectname, '-shared', '-o', sharedlib])
        mydll = ctypes.pydll.LoadLibrary(sharedlib)
        mydll[symbolname]

if __name__ == '__main__':
    main('a', lambda x: x)
I am trying to figure out what I can pass to main in place of the identity function, so that the code works for whatever string I put in place of 'a'.
Works for me in GAS: .comm "my weirdsym .$ 12 foo^M bar" 2 (where that ^M is a literal carriage return, which makes the output of objdump -t look funny).
Creating such symbols with the label: syntax probably isn't always possible. The GAS manual doesn't mention quoted label names in its description of the statement syntax, and it doesn't work for me. For an input of "foobar": the assembler reports:
test.S:52: Error: junk at end of line, first unrecognized character is '"'
If you really want this, you can probably use .set to get a context where a symbol name is expected, so you can use quotes. Then you can give a symbol whatever value you want, including the value of another symbol (e.g. a sensibly-named label).
For example (thanks @FUZxxl):
# symbol includes a literal doublequote and a literal carriage return (^M)
# symbol value(address) is . which means current position
.set "\"my weirdsym .$ 12 foo^M bar", .
nop
objdump -drwC -Mintel output:
bar>:00000000a7 <"my weirdsym .$ 12 foo
a7: 90 nop
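Applied to the test script in the question, the .set approach might look roughly like the sketch below. The names gas_quote, alias_source and plain_label are made up for illustration. It assumes that GAS accepts \" and \\ as escapes inside a quoted symbol name (the example above relies on \"; I have not checked \\ or other characters such as raw newlines across binutils versions), and that .globl accepts a quoted name the same way .set and .comm do.

```python
def gas_quote(name):
    # Hypothetical helper, not part of GAS or of the question's script.
    # Assumption: inside a quoted symbol name GAS accepts \" and \\ as
    # escapes; every other character is passed through literally.
    if '\0' in name:
        raise ValueError('symbol names cannot contain NUL')
    return name.replace('\\', '\\\\').replace('"', '\\"')


def alias_source(name, label='plain_label'):
    # Define an ordinary label, then alias the arbitrary name to it with
    # .set, a context where a quoted symbol name is expected.
    return (
        '\t.globl "{0}"\n'
        '{1}:\n'
        '\tnop\n'
        '\t.set "{0}", {1}\n'
    ).format(gas_quote(name), label)


if __name__ == '__main__':
    print(alias_source('"my weirdsym .$ 12 foo\r bar'))
```

The generated text would replace the template in the question's script; gas_quote on its own is not enough there, because the template uses the quoted name as a label, which is the case that fails.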
I highly recommend doing some sanity checks on symbol names in your code, because it's probably not very helpful (for anyone debugging your object files) to create symbol names with non-printable characters.
A custom name-mangling scheme to encode things into characters that are legal for C function/variable names would also work.
But if you really want to do this, this is how (with GAS). It's probably not possible with NASM.
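For the name-mangling alternative, here is a minimal sketch of one possible scheme. It is invented for illustration and is not any standard mangling: ASCII letters and digits pass through, every other byte of the UTF-8 encoding becomes _XX (two hex digits), and a fixed m_ prefix keeps the result from starting with a digit.

```python
import string

PREFIX = 'm_'  # hypothetical prefix so the result never starts with a digit
SAFE = frozenset((string.ascii_letters + string.digits).encode('ascii'))


def mangle(name):
    # Letters and digits are kept; any other byte (including '_' itself)
    # is written as '_' followed by two uppercase hex digits.
    out = [PREFIX]
    for byte in name.encode('utf-8'):
        out.append(chr(byte) if byte in SAFE else '_%02X' % byte)
    return ''.join(out)


def demangle(mangled):
    # Reverse of mangle(): strip the prefix, then decode each _XX escape.
    if not mangled.startswith(PREFIX):
        raise ValueError('not a mangled name')
    body = mangled[len(PREFIX):]
    raw = bytearray()
    i = 0
    while i < len(body):
        if body[i] == '_':
            raw.append(int(body[i + 1:i + 3], 16))
            i += 3
        else:
            raw.append(ord(body[i]))
            i += 1
    return raw.decode('utf-8')


if __name__ == '__main__':
    weird = '"my weirdsym .$ 12 foo\r bar'
    print(mangle(weird))
    assert demangle(mangle(weird)) == weird
```

Because the output only contains letters, digits and underscores, it can be used as an ordinary label in GAS, NASM or MASM, and tools like objdump print it without surprises.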