Why does avr-gcc add "push r1" instructions to the start of a function?


I was looking at the assembly that avr-gcc produced for some code I had written, compiled with the -Os option. Overall the output is what I expect, but I cannot understand why the instruction push r1 is emitted. Even stranger, the complementary instruction at the end of the function is pop r0. So the value of r1 is saved, but it appears to be restored into r0.

Per the documentation here:

https://gcc.gnu.org/wiki/avr-gcc#Register_Layout

The register r1 always contains zero, but a function may use the register if it restores it. From my understanding, restoring it would just mean clearing it again (e.g. clr r1), so there would be no need for a push/pop pair.

Here's an example of the C code and the disassembly of the compiled version, built with -Os. It's a bit long, but I had to make a large function to get the compiler to emit this.

C code:

void mqtt_create_connect(mqtt_parser *p, mqtt_connect_config *cfg){
  uint8_t idx = 0;
  p->buffer[idx++] = MQTT_CTRL_CONNECT << 4;
  idx++; // skip remaining length for now
  p->buffer[idx++] = 0x0;
  p->buffer[idx++] = 0x4;
  p->buffer[idx++] = 'M';
  p->buffer[idx++] = 'Q';
  p->buffer[idx++] = 'T';
  p->buffer[idx++] = 'T';
  p->buffer[idx++] = 0x04; // protocol level 3.1.1
  p->buffer[idx++] = 
    MQTT_CONNECT_FLAG_CLEAN_SESSION;
  push_uint16_t(p, cfg->keepAliveInterval, &idx);
  push_charptr(p, cfg->clientIdentifier, &idx);
  p->bufferIdx = idx;

  // fill in remaining length
  p->buffer[1] = p->bufferIdx - 2;
}

Disassembly:

0000006a <mqtt_create_connect>:
  6a:   0f 93           push    r16
  6c:   1f 93           push    r17
  6e:   cf 93           push    r28
  70:   df 93           push    r29
  72:   1f 92           push    r1
  74:   cd b7           in      r28, 0x3d       ; 61
  76:   de b7           in      r29, 0x3e       ; 62
  78:   8c 01           movw    r16, r24
  7a:   fb 01           movw    r30, r22
  7c:   80 e1           ldi     r24, 0x10       ; 16
  7e:   d8 01           movw    r26, r16
  80:   11 96           adiw    r26, 0x01       ; 1
  82:   8c 93           st      X, r24
  84:   11 97           sbiw    r26, 0x01       ; 1
  86:   13 96           adiw    r26, 0x03       ; 3
  88:   1c 92           st      X, r1
  8a:   13 97           sbiw    r26, 0x03       ; 3
  8c:   84 e0           ldi     r24, 0x04       ; 4
  8e:   14 96           adiw    r26, 0x04       ; 4
  90:   8c 93           st      X, r24
  92:   14 97           sbiw    r26, 0x04       ; 4
  94:   9d e4           ldi     r25, 0x4D       ; 77
  96:   15 96           adiw    r26, 0x05       ; 5
  98:   9c 93           st      X, r25
  9a:   15 97           sbiw    r26, 0x05       ; 5
  9c:   91 e5           ldi     r25, 0x51       ; 81
  9e:   16 96           adiw    r26, 0x06       ; 6
  a0:   9c 93           st      X, r25
  a2:   16 97           sbiw    r26, 0x06       ; 6
  a4:   94 e5           ldi     r25, 0x54       ; 84
  a6:   17 96           adiw    r26, 0x07       ; 7
  a8:   9c 93           st      X, r25
  aa:   17 97           sbiw    r26, 0x07       ; 7
  ac:   18 96           adiw    r26, 0x08       ; 8
  ae:   9c 93           st      X, r25
  b0:   18 97           sbiw    r26, 0x08       ; 8
  b2:   19 96           adiw    r26, 0x09       ; 9
  b4:   8c 93           st      X, r24
  b6:   19 97           sbiw    r26, 0x09       ; 9
  b8:   82 e0           ldi     r24, 0x02       ; 2
  ba:   1a 96           adiw    r26, 0x0a       ; 10
  bc:   8c 93           st      X, r24
  be:   1a 97           sbiw    r26, 0x0a       ; 10
  c0:   80 81           ld      r24, Z
  c2:   91 81           ldd     r25, Z+1        ; 0x01
  c4:   1b 96           adiw    r26, 0x0b       ; 11
  c6:   9c 93           st      X, r25
  c8:   1b 97           sbiw    r26, 0x0b       ; 11
  ca:   9c e0           ldi     r25, 0x0C       ; 12
  cc:   99 83           std     Y+1, r25        ; 0x01
  ce:   1c 96           adiw    r26, 0x0c       ; 12
  d0:   8c 93           st      X, r24
  d2:   62 81           ldd     r22, Z+2        ; 0x02
  d4:   73 81           ldd     r23, Z+3        ; 0x03
  d6:   ae 01           movw    r20, r28
  d8:   4f 5f           subi    r20, 0xFF       ; 255
  da:   5f 4f           sbci    r21, 0xFF       ; 255
  dc:   c8 01           movw    r24, r16
  de:   0e 94 00 00     call    0       ; 0x0 <push_charptr>
  e2:   89 81           ldd     r24, Y+1        ; 0x01
  e4:   f8 01           movw    r30, r16
  e6:   ef 5b           subi    r30, 0xBF       ; 191
  e8:   ff 4f           sbci    r31, 0xFF       ; 255
  ea:   80 83           st      Z, r24
  ec:   82 50           subi    r24, 0x02       ; 2
  ee:   f8 01           movw    r30, r16
  f0:   82 83           std     Z+2, r24        ; 0x02
  f2:   0f 90           pop     r0
  f4:   df 91           pop     r29
  f6:   cf 91           pop     r28
  f8:   1f 91           pop     r17
  fa:   0f 91           pop     r16
  fc:   08 95           ret

What is the purpose of the push r1 and the pop r0?


Solution

  • There are two tricks here.

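    First, gcc needs to reserve one byte on the stack for uint8_t idx (its address is passed to the helper functions, so it cannot live only in a register). The general way to do that is to decrement the stack pointer and write it back to SPH:SPL. But the stack pointer is 16 bits wide and is written with two separate out instructions, and an interrupt arriving between them would run on a half-updated stack pointer, with catastrophic results. So the update must be wrapped in a cli/sei pair (or an equivalent save and restore of SREG), which costs extra code and time. Pushing any register moves the stack pointer by the same amount, atomically and in a single short instruction.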

    Second, as you note, by avr-gcc/avr-libc convention r1 is __zero_reg__, which is assumed to hold zero in all C code. So push r1 not only reserves space for idx but also initializes it to 0.

    The pop r0 in the function epilogue releases that byte again and so restores the stack pointer. By the same convention, r0 is __tmp_reg__, a temporary register that any C code may clobber, so the compiler is free to destroy its contents at any time.

    P.S. The function never changes r1, so r1 does not need to be restored.
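
To see the effect in isolation, a much smaller function may be enough. The following is only a sketch: append_byte and fill are hypothetical names, not part of the code in the question. The one thing it shares with mqtt_create_connect is a single uint8_t local whose address is passed to another function, so the local needs a one-byte stack slot. Whether avr-gcc -Os then emits the same push r1 / pop r0 pair depends on the compiler version and flags and has not been checked here.

```c
#include <stdint.h>

/* Hypothetical helper, kept out of line so that &idx really escapes
 * and the compiler cannot keep idx purely in a register.
 * noinline is a GCC attribute. */
__attribute__((noinline))
static void append_byte(uint8_t *buf, uint8_t value, uint8_t *idx)
{
    buf[*idx] = value;
    *idx = (uint8_t)(*idx + 1);
}

/* idx has its address taken below, so it needs a one-byte stack frame.
 * That one-byte frame is what the push in the prologue would reserve. */
uint8_t fill(uint8_t *buf)
{
    uint8_t idx = 0;
    append_byte(buf, 'M', &idx);
    append_byte(buf, 'Q', &idx);
    return idx;
}
```

Compiling this for an AVR target with -Os and disassembling the object file (for example with avr-objdump -d) should show how the one-byte frame is set up and torn down on a given toolchain.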