I'm learning some 8080 assembly, which uses the older suffix H to indicate hexadecimal constants (vs modern prefix 0x or $). I'm also noodling around with a toy assembler and thinking about how to tokenize source code.
It's possible to write a valid hex constant (say) BEEFH, which contains only alphabetical characters. It's also possible to define a label called BEEFH. So when I write:
ORG 0800H
START: ...
JMP BEEFH ; <--- how is this resolved?
....
BEEFH: ...
...
This should be syntactically valid based on the old Intel docs: BEEFH meets the naming rules for labels, and of course is also a valid 16-bit address. The ambiguity of whether the operand to JMP here is an address constant or an identifier seems like a problem.
I don't have access to the original 8080 assembler to see what it does with this example. Here's an online 8080 assembler that appears to parse the operand to JMP as a label reference in all cases, but obviously a proper assembler should be able to target an absolute address with a JMP instruction.
Can anyone shed light on what the conventions around this actually are/should be? Am I missing something obvious? Thanks.
Someone left a comment that they then deleted, but I looked again and it was right on. Apparently I missed the note in the old Intel manual that says this about hex constants:
Hex constants must begin with a decimal digit.
So that's certainly how you avoid the ambiguity when parsing: the address has to be written 0BEEFH, and BEEFH on its own can only be an identifier. It seems a bit inelegant to me as a solution, but I guess if you don't like it you should just use a modern prefix.
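For the toy assembler, here is a minimal sketch of how that rule might look in a tokenizer. It's in Python, and everything in it (the classify_operand name, the regexes, the simplified identifier rule) is my own assumption rather than anything taken from the Intel manual or a real assembler. It only handles H-suffixed hex and plain decimal; the other numeric suffixes are left out.

```python
import re

# Hypothetical operand classifier for a toy 8080 assembler.
# Assumption: a hex constant starts with a decimal digit and ends in H.
HEX_RE = re.compile(r"[0-9][0-9A-F]*H")
# Assumption: plain decimal only; other numeric suffixes are not handled here.
DEC_RE = re.compile(r"[0-9]+")
# Assumption: simplified identifier rule (starts with a letter, @ or ?).
IDENT_RE = re.compile(r"[A-Z@?][A-Z0-9]*")


def classify_operand(text):
    """Return ("number", value) or ("identifier", name) for one operand token."""
    token = text.strip().upper()
    if HEX_RE.fullmatch(token):
        # Drop the trailing H and parse the rest as base 16.
        return ("number", int(token[:-1], 16))
    if DEC_RE.fullmatch(token):
        return ("number", int(token, 10))
    if IDENT_RE.fullmatch(token):
        return ("identifier", token)
    raise ValueError(f"unrecognized operand: {text!r}")
```

With this, classify_operand("0BEEFH") comes back as a number and classify_operand("BEEFH") as an identifier, so the first character alone decides which one it is and the tokenizer never needs the symbol table to tell them apart.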
Thanks, anonymous commenter!