
Relationship between compilation stages and translation phases in C


What is the relationship between compilation stages and translation phases in C programming? Can the translation phases be classified by compilation stage? Could anyone explain this?


Solution

  • The C standard defines eight phases of translation:

    1. Physical source multibyte characters and trigraph sequences are mapped to characters of the source character set.

    2. Each backslash followed by a new-line is deleted (splicing together two lines).

    3. The source characters are grouped into preprocessing tokens and sequences of white-space characters, and each comment is replaced by one space. New-lines are kept; whether other white-space sequences are collapsed to one space is implementation-defined.

    4. Preprocessing directives and _Pragma operators are executed, and macro invocations are expanded.

    5. Source characters in strings and character constants are converted to the execution character set.

    6. Adjacent string literals are concatenated.

    7. Each preprocessing token is converted into a grammar token, and white-space characters separating tokens are discarded. The resulting tokens are analyzed and translated (compiled).

    8. All external references are resolved (the program is linked).

    Phases 1 to 6 are generally regarded as preliminary or “preprocessing.” The substantial work of compiling is in phase 7, “The resulting tokens are analyzed and translated (compiled).”

    These phases are largely conceptual, used for describing the semantics of C programs, not for specifying how the work is actually done. Phases 1 to 6 are not necessarily executed chronologically during compilation; they may be built into the structure of the compiler. This may even include phase 7.

    Phase 8 is most commonly performed as a separate step, using a separate linker instead of a compiler.