c++ linux gcc avr-gcc

Can GCC print out intermediate results?


Check the code below:

#include <avr/io.h>

const uint16_t baudrate = 9600;

void setupUART( void ) {
        uint16_t ubrr = ( ( F_CPU / ( 16 * (float) baudrate ) ) - 1 + .5 );
        UBRRH = ubrr >> 8;
        UBRRL = ubrr & 0xff;
}

int main( void ) {
        setupUART();
}

This is the command used to compile the code:

avr-gcc -g -DF_CPU=4000000 -Wall -Os -Werror -Wextra -mmcu=attiny2313 -Wa,-ahlmns=project.lst -c -o project.o project.cpp

ubrr is calculated by the compiler as 25; so far so good. However, to check what the compiler calculated, I have to peek into the disassembly listing.

000000ae <setupUART()>:
  ae:   12 b8           out     UBRRH, r1       ; 0x02
  b0:   89 e1           ldi     r24, 0x19       ; 25
  b2:   89 b9           out     UBRRL, r24      ; 0x09
  b4:   08 95           ret

Is it possible to make avr-gcc print out the intermediate result at compile time (or pull the info from the .o file), so that when I compile the code it prints a line like (uint16_t) ubrr = 25 or similar? That way I can do a quick sanity check on the calculation and settings.


Solution

  • GCC has command line options to request that it dump out its intermediate representation after any stage of compilation. The "tree" dumps are in pseudo-C syntax and contain the information you want. For what you're trying to do, the -fdump-tree-original and -fdump-tree-optimized dumps happen at useful points in the optimization pipeline. I don't have an AVR compiler to hand, so I modified your test case to be self-contained and compilable with the compiler I do have:

    typedef unsigned short uint16_t;
    const int F_CPU = 4000000;
    const uint16_t baudrate = 9600;
    extern uint16_t UBRRH, UBRRL;
    
    void 
    setupUART(void)
    {
        uint16_t ubrr = ((F_CPU / (16 * (float) baudrate)) - 1 + .5);
        UBRRH = ubrr >> 8;
        UBRRL = ubrr & 0xff;
    }
    

    and then

    $ gcc -O2 -S -fdump-tree-original -fdump-tree-optimized test.c
    $ cat test.c.003t.original
    ;; Function setupUART (null)
    ;; enabled by -tree-original
    
    
    {
      uint16_t ubrr = 25;
    
        uint16_t ubrr = 25;
      UBRRH = (uint16_t) ((short unsigned int) ubrr >> 8);
      UBRRL = ubrr & 255;
    }
    
    $ cat test.c.149t.optimized
    ;; Function setupUART (setupUART, funcdef_no=0, decl_uid=1728, cgraph_uid=0)
    
    setupUART ()
    {
    <bb 2>:
      UBRRH = 0;
      UBRRL = 25;
      return;
    }
    

    You can see that constant-expression folding is done so early that it's already happened in the "original" dump (which is the earliest comprehensible dump you can have), and that optimization has further folded the shift and mask operations into the statements writing to UBRRH and UBRRL.

    The numbers in the filenames (003t and 149t) will probably be different for you. If you want to see all the "tree" dumps, use -fdump-tree-all. There are also "RTL" dumps, which don't look anything like C and are probably not useful to you. If you're curious, though, -fdump-rtl-all will turn 'em on. In total there are about 100 tree and 60 RTL dumps, so it's a good idea to do this in a scratch directory.

    (Psssst: Every time you put spaces on the inside of your parentheses, God kills a kitten.)
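
As a complement to the dumps, the folded value can also be pinned down in the source itself. The following is only a sketch of a different technique (a C++11 constexpr function plus static_assert), not something the dump options do for you. The names computeUbrr, kFCpu, kBaudrate and kUbrr are made up for illustration, and the values are assumed from the question (-DF_CPU=4000000, 9600 baud). The compiler prints nothing when the value is as expected and stops with the assertion message when it is not.

```cpp
// Sketch: check the UBRR arithmetic at compile time instead of reading
// the disassembly. Needs C++11 or later (e.g. -std=c++11).
#include <stdint.h>

// Assumed values, taken from the question's command line and source.
constexpr long kFCpu = 4000000;      // stands in for -DF_CPU=4000000
constexpr uint16_t kBaudrate = 9600;

// Same expression as in setupUART(): the "+ .5" rounds to nearest,
// because the conversion to uint16_t truncates the fractional part.
constexpr uint16_t computeUbrr(long fCpu, uint16_t baud) {
    return static_cast<uint16_t>((fCpu / (16 * static_cast<float>(baud))) - 1 + .5);
}

constexpr uint16_t kUbrr = computeUbrr(kFCpu, kBaudrate);

// 4000000 / 153600 = 26.04..., minus 1 plus .5 = 25.54..., truncated to 25.
static_assert(kUbrr == 25, "UBRR is not the expected value 25");
// These mirror the two register writes in setupUART().
static_assert((kUbrr >> 8) == 0, "high byte (UBRRH) should be 0");
static_assert((kUbrr & 0xff) == 25, "low byte (UBRRL) should be 25");
```

Compiling this file with -fsyntax-only (or -c) is enough to run the checks. If F_CPU or the baud rate changes, the build fails until the expected value in the static_assert is updated, which gives the quick sanity check the question asks for, at the cost of writing the expected number down by hand.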