Search code examples
cassemblymachine-instruction

Assembly and Execution of Programs - Two pass assembler


While going through a book on machine instructions and programs I came across a particular point which says that an assembler scans an entire source program twice. It builds a symbol table during the 1st pass/scan and associates the entire program with it during the second scan. The assembler needs to provide an address in a similar way for a function.
Now, since the assembler passes through the program twice, why is it necessary to declare a function before it can be used? Wouldn't the assembler provide an address for the function from the 1st pass and then correlate it to the program during the 2nd pass ? I am considering C programming in this case.


Solution

  • The simple answer is that C programs require that functions be declared before it can be used because the C language was designed to be processed by a compiler in a single pass. It has nothing to with assemblers and addresses of functions. The compiler needs to know the type of a symbol, whether its a function, variable or something else, before it can use it.

    Consider this simple example:

    int foo() { return bar(); }
    int (*bar)();
    

    In order to generate the correct code the compiler needs to know that bar isn't a function, but a pointer to a function. The code only works if you put extern int (*bar)(); before the definition of foo so the compiler knows what type bar is.

    While the language could have been in theory designed to require the compiler to use two passes, this would have required some significant changes in the design of the language. Requiring two passes would also increase the required complexity of the compiler, decreasing the number of platforms that could host a C compiler. This was very important consideration back in the day when C was first being developed, back when 64K (65,536) bytes of RAM was a lot of memory. Even today would have noticeable impact on the compile times of large programs.

    Note that the C language does sort of allows what you want anyways, by supporting implicit function declarations. (In my example above this it what happens in foo when bar isn't declared previously.) However this feature is obsolete, limited, and considered dangerous.