I'm trying to develop a delay function for the ATtiny1614 (using Atmel Studio 7). There is an existing platform function, _delay_us(), that does something similar, but this is as much a learning experience as a way to be able to tweak my own code.
For the sake of delay resolution and a minimal, consistent delay time, I decided to go for inline assembly.
I wrote the following (snippet):
__attribute__((__always_inline__)) static inline void delay_loops(volatile uint32_t numLoops) {
    asm volatile(
        "loop_1%=: \n\t"
        " subi %A[numLoops], 1 \n\t"
        " sbci %B[numLoops], 0 \n\t"
        " sbci %C[numLoops], 0 \n\t"
        " sbci %D[numLoops], 0 \n\t"
        " brcc loop_1%= \n\t"
        :
        : [numLoops] "d" (numLoops)
        // d = select upper registers (r16-r31) only
    );
}
int main(void)
{
    ...
    delay_loops(10);
    delay_loops(12);
    ...
}
So far, so good. Everything works as expected and the following code is generated:
3ec: 8a e0 ldi r24, 0x0A ; 10
3ee: 90 e0 ldi r25, 0x00 ; 0
3f0: a0 e0 ldi r26, 0x00 ; 0
3f2: b0 e0 ldi r27, 0x00 ; 0
000003f4 <loop_1333>:
3f4: 81 50 subi r24, 0x01 ; 1
3f6: 90 40 sbci r25, 0x00 ; 0
3f8: a0 40 sbci r26, 0x00 ; 0
3fa: b0 40 sbci r27, 0x00 ; 0
3fc: d8 f7 brcc .-10 ; 0x3f4 <loop_1333>
3fe: 8c e0 ldi r24, 0x0C ; 12
400: 90 e0 ldi r25, 0x00 ; 0
402: a0 e0 ldi r26, 0x00 ; 0
404: b0 e0 ldi r27, 0x00 ; 0
00000406 <loop_1341>:
406: 81 50 subi r24, 0x01 ; 1
408: 90 40 sbci r25, 0x00 ; 0
40a: a0 40 sbci r26, 0x00 ; 0
40c: b0 40 sbci r27, 0x00 ; 0
40e: d8 f7 brcc .-10 ; 0x406 <loop_1341>
The registers are preloaded with the given loop value, and the loop then iterates until the counter underflows.
However, if I change the main code to
int main(void)
{
    ...
    delay_loops(12); // changed 10->12
    delay_loops(12);
    ...
}
then the second delay becomes seemingly endless (or at least beyond the capture window of my logic analyser).
The compiled assembly reveals the following:
3ec: 8c e0 ldi r24, 0x0C ; 12
3ee: 90 e0 ldi r25, 0x00 ; 0
3f0: a0 e0 ldi r26, 0x00 ; 0
3f2: b0 e0 ldi r27, 0x00 ; 0
000003f4 <loop_1332>:
3f4: 81 50 subi r24, 0x01 ; 1
3f6: 90 40 sbci r25, 0x00 ; 0
3f8: a0 40 sbci r26, 0x00 ; 0
3fa: b0 40 sbci r27, 0x00 ; 0
3fc: d8 f7 brcc .-10 ; 0x3f4 <loop_1332>
000003fe <loop_1339>:
3fe: 81 50 subi r24, 0x01 ; 1
400: 90 40 sbci r25, 0x00 ; 0
402: a0 40 sbci r26, 0x00 ; 0
404: b0 40 sbci r27, 0x00 ; 0
406: d8 f7 brcc .-10 ; 0x3fe <loop_1339>
The input value (12) is not loaded again for the second 'call' of delay_loops(). The second loop simply continues with the (altered) register values left behind by the first loop. I can only assume the compiler does not know that I changed r24..r27, assumes they still hold 12, and therefore optimises the second initialisation away.
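To see why the second loop looks endless rather than merely wrong, here is a small host-side C model of the subi/sbci/brcc sequence. It is only a sketch of the arithmetic, not cycle-accurate, and the name model_delay_loop is made up for illustration.

```c
#include <stdint.h>

/* Hypothetical host-side model of the loop above: subtract 1 from a
   32-bit counter and repeat while the subtraction did not borrow
   (brcc = branch while carry is clear). Returns the iteration count
   and leaves the final "register" value in *counter. */
static uint64_t model_delay_loop(uint32_t *counter)
{
    uint64_t iterations = 0;
    int carry;

    do {
        carry = (*counter == 0);  /* borrow out of the 32-bit subtract */
        *counter -= 1;            /* wraps to 0xFFFFFFFF on borrow */
        iterations++;
    } while (!carry);

    return iterations;
}
```

Starting from 12, the model runs 13 iterations and leaves 0xFFFFFFFF in the counter. If the registers are not reloaded, the second loop therefore starts from 0xFFFFFFFF and needs 2^32 iterations to underflow again. At roughly 6 cycles per iteration (by my count), that is minutes of delay even at the part's top clock speed.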
How do I force proper initialisation?
Thanks
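The inline assembler cookbook explains what you need to do if you have one operand being used for both input and output. Your asm modifies numLoops but declares it as input-only, so the compiler is entitled to assume the registers still hold 12 afterwards and to reuse them.
Following the cookbook's example, I think you should try replacing the two lines that start with colons with something like this: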
:[numLoops] "=d" (numLoops)
:"0" (numLoops)
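For completeness, here is how the whole function might look with that idea applied. This sketch uses the "+d" read-write constraint, which I believe is equivalent to the "=d" / "0" pair above. Dropping volatile from the parameter and adding the "cc" clobber are my own choices, and the #else branch is only a host-side stand-in so the code compiles off-target. Treat it as untested on real hardware.

```c
#include <stdint.h>

/* Busy-wait loop; on AVR the body runs numLoops + 1 times. */
__attribute__((__always_inline__)) static inline void delay_loops(uint32_t numLoops)
{
#if defined(__AVR__)
    __asm__ volatile(
        "loop_1%=:              \n\t"
        " subi %A[numLoops], 1  \n\t"
        " sbci %B[numLoops], 0  \n\t"
        " sbci %C[numLoops], 0  \n\t"
        " sbci %D[numLoops], 0  \n\t"
        " brcc loop_1%=         \n\t"
        /* "+d": operand is read AND written, upper registers (r16-r31) */
        : [numLoops] "+d" (numLoops)
        :
        : "cc"  /* assumption: flag the status register as changed */
    );
#else
    /* Host-only stand-in, not cycle-accurate. */
    volatile uint32_t n = numLoops;
    while (n-- != 0) { }
#endif
}
```

Because the operand is now marked as modified, the compiler can no longer assume the registers still hold 12 after the first delay, so it should emit the four ldi instructions again before the second loop.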