assembly, syntax, language-design

Is assembly language syntax the same for different architectures?


I know that I cannot write assembly language that will assemble and run on all machines, because they have different instruction sets, opcodes, registers, etc. My question is, even though the instruction set would be different, is the assembly syntax (or the language itself) the same for any architecture?


Solution

  • My question is, even though the instruction set would be different, is the assembly syntax (or the language itself) the same for any architecture?

    No!

    Just for x86, there are a dozen different assemblers, each with its own quirks and each accepting a slightly different language: GAS, MASM, NASM, TASM, FASM, ASM, and others. Few programs will assemble with all of these x86 assemblers.

    There's AT&T syntax vs. Intel syntax: destination operand last vs. destination operand first.

    There are varied requirements around directives: .proc, .endp, etc.

    There's Intel's beautiful byte ptr syntax for determining operation size/width, vs. most of the rest of the world's .b, .w, .l opcode suffixes (sometimes without the .).

    Some assemblers like the : after a label, others don't allow it (or require a , instead).

    Some require special characters to differentiate register names from other identifiers (e.g. % prefix for some, $ prefix for others), others don't.

    Syntax for addressing modes also varies significantly. For example, in ARM's [] notation, a constant placed after the brackets (rather than inside them) indicates that the pointer register is updated after the access.
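To make the operand-order, prefix, and size-suffix differences above concrete, here is a minimal Python sketch that rewrites a few simple x86 instructions from Intel syntax into AT&T syntax. The names `intel_to_att`, `att_operand`, and `REGS32` are hypothetical, and the sketch assumes 32-bit register or decimal-immediate operands only. It is an illustration, not a real translator.

```python
# Toy illustration only: handles two-operand instructions whose operands
# are 32-bit registers or decimal immediates. Not a real translator.
REGS32 = {"eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp"}


def att_operand(op: str) -> str:
    """Render one operand the AT&T way: %reg for registers, $imm for immediates."""
    op = op.strip().lower()
    if op in REGS32:
        return "%" + op
    if op.lstrip("-").isdigit():
        return "$" + op
    raise ValueError(f"unsupported operand: {op!r}")


def intel_to_att(line: str) -> str:
    """Convert e.g. 'mov eax, 5' (Intel) to 'movl $5, %eax' (AT&T)."""
    mnemonic, operands = line.strip().split(None, 1)
    dst, src = operands.split(",")  # Intel order: destination first
    # AT&T order: source first, destination last; the 'l' suffix marks 32 bits.
    return f"{mnemonic.lower()}l {att_operand(src)}, {att_operand(dst)}"


print(intel_to_att("mov eax, 5"))    # movl $5, %eax
print(intel_to_att("add ebx, ecx"))  # addl %ecx, %ebx
```

The same operation changes in three ways at once: the operands swap places, registers gain a % prefix, immediates gain a $ prefix, and the operand size moves into a suffix on the mnemonic.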

    And that's without getting into the names of the opcodes.

    On Intel we use call for the instruction that invokes a function (transfers the pc to the function while capturing the return address); it's jal on MIPS and RISC-V, and bsr, jsr, bl, or jms on others.

    The instruction for invoking system calls is variously named syscall, ecall, trap, sc, int, swi, svc, etc.
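The naming differences can be summarized as a small lookup table. The Python sketch below is only illustrative: the names `MNEMONICS` and `mnemonic_for` and the choice of architectures are assumptions, and it lists just one common mnemonic per architecture.

```python
# One common mnemonic per architecture for "call a function" and
# "enter the kernel". Illustrative, not exhaustive.
MNEMONICS = {
    "x86-64":  {"call": "call", "syscall": "syscall"},
    "MIPS":    {"call": "jal",  "syscall": "syscall"},
    "RISC-V":  {"call": "jal",  "syscall": "ecall"},
    "AArch64": {"call": "bl",   "syscall": "svc"},
    "PowerPC": {"call": "bl",   "syscall": "sc"},
}


def mnemonic_for(arch: str, operation: str) -> str:
    """Look up the mnemonic used for an operation on a given architecture."""
    return MNEMONICS[arch][operation]


for arch in MNEMONICS:
    print(f"{arch:8} call={mnemonic_for(arch, 'call'):5} "
          f"syscall={mnemonic_for(arch, 'syscall')}")
```

The two operations are conceptually the same everywhere, yet the table has almost no column in which every architecture agrees.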

    In short, there's no standardization of language, grammar, or syntax across assemblers.


    As for similarities, broadly speaking, there are the concepts of if-goto conditional branching (and unconditional branching) as the mechanism for control flow constructs, labels as branch targets and data targets, one instruction per line (as @Peter mentions), and a mnemonic opcode with separate operands. But these similarities are conceptual rather than syntactic.