STM32. Timer keeps counting past auto-reload value?


I'm still learning assembly-only programming on STM32 and ARM. The board is an STM32F407.

I've been exploring interrupts. Took the TIM2 timer, set it up with a prescaler. My expectation is that at every overflow, the interrupt signal is generated and my code runs into the TIM2_handler. There a GPIO-pin is enabled and disabled, depending on whether it was disabled or enabled before. There's an LED and a resistor connected to that pin, so I can monitor it.

The first count goes peachy. After reset the LED is off. After a couple of seconds it turns on through the interrupt.

Then I expect it to turn off after roughly the same time. However the delay until it turns off is way longer than how long it took for it to turn on.

Had a check with GDB, and the counter seems to not reset to 0 after passing the auto-reload value. It keeps counting up. Once it overflowed and passed the auto-reload value again, the interrupt routine at TIM2_handler is called again, and the LED turns off as expected.

I would like the intervals between the toggles to be the same though. Essentially I want the documented feature, that the timer counts from 0 to auto-reload. Then generates the interrupt, actually auto-reloads and counts that piece again. Rinse. Repeat.

I've tried changing directions, center-align mode, one-pulse mode which is reset manually every interrupt. I've tried longer and shorter intervals. It's always the same. The first interval is fine, every subsequent interval takes a lap around the entire register.

My code contains three files, of which the first one is the standard vector table hard coded into ROM. The other two files are:

reset_handler.s

    .macro STRMASKSET
    LDR r1, [r2, +r3]                           //load
    ORR r1, r1, r4                              //mask
    STR r1, [r2, +r3]                           //store
    .endm

    .macro STRMASKRESET
    LDR r1, [r2, +r3]                           //load
    AND r1, r1, r4                              //mask
    STR r1, [r2, +r3]                           //store
    .endm

    .text
reset_handler:

    /*Enable APB1 for TIM2*/
    LDR r2, =0x40023800                         //RCC Register
    LDR r3, =0x40                               //APB1 offset
    LDR r4, =0x1                                //TIM2 Register Enable
    STRMASKSET                                  //Enable TIM2 Clock in RCC

    /*Set up TIM2 (LF)*/
    LDR r2, =0x40000000                         //TIM2 Register
    LDR r3, =0x0                                //TIM2 control register 1 offset
    /* Counter:      on   (1)
     * Update Event: on   (0)
     * URS:          any  (0)
     * One-pulse:    off  (0)
     * Direction:    up   (0)
     * CMS:          off  (00)
     * ARPE:         off  (0)
     * CKD:          x1   (00)
     **/
    LDR r4, =0x01                               //TIM2 control register 1 config
    STRMASKSET                                  //Set up TIM2 control register 1

    LDR r3, =0x0c                               //TIM2 Interrupt enable register
    LDR r4, =0x1                                //Update Interrupt enable
    STRMASKSET

    LDR r3, =0x2c                               //TIM2 Auto-reload register
    LDR r4, =0x0400000                          //Auto-reload value
    STRMASKSET

    /*Enable AHB1 for GPIOC*/
    LDR r2, =0x40023830
    LDR r1, [r2]
    LDR r4, =0x4
    ORR r1, r1, r4
    STR r1, [r2]
    
    /*GPIOC is at 0x4002 0800 - 0x4002 0BFF*/
    LDR r2, =0x40020800 /*Address register*/
    /*Set up GPIOC as output*/
    /*Port Mode Register (offset 0x00)*/
    LDR r3, =0x00
    /*Set to GPIO output mode*/
    LDR r4, =0x04
    STRMASKSET

    /*Set up GPIOC to High Speed*/
    /*Port output Register (offset 0x08)*/
    LDR r3, =0x08
    /*Set to high speed*/
    LDR r4, =0x08
    STRMASKSET

    LDR r2, =0x40000000                         //TIM2 Register
    LDR r3, =0x10                               //TIM2 Status Register Offset
    LDR r4, =0xfffe                             //Reset Update interrupt flag
    STRMASKRESET

    /*Set counter*/
    LDR r3, =0x24                               //TIM2 Counter offset
    LDR r4, =0xf8000000                         //Auto-reload value
    STR r4, [r2, +r3]
    
    /*Enable TIM2 Interrupt in NVIC*/
    LDR r2, =0xe000e100
    LDR r3, =0x0
    LDR r4, =0x10000000
    STRMASKSET
    
main_loop:
    LDR r1, =0xdeadbeef
    LDR r2, =0x0807b4be
    AND r1, r1, r2
    B main_loop                                 //Pure interrupt program.
    
    .size reset_handler, . - reset_handler

TIM2_handler.s

    .type TIM2_handler, %function

TIM2_handler:
    PUSH {r4-r11, lr}                           //push registers onto stack
    
    LDR r2, =0x40000000                         //TIM2 Register
    LDR r3, =0x10                               //TIM2 Status Register Offset
    LDR r4, =0xffe0                             //Reset Update interrupt flag
    STRMASKRESET

    LDR r2, =0xe000e280                         //NVIC_Register
    LDR r3, =0x00                               //ICPR0_Register
    LDR r4, =0x10000000                         //Reset Update Interrupt for 0x2c
    STRMASKSET

    LDR r2, =0x40020800                         //GPIOC Register
    LDR r3, =0x14                               //GPIOC_ODR offset
    LDR r1, [r2, +r3]                           //Load GPIOC_ODR Status
    AND r1, r1, #2                              //Mask GPIOC1 pin status
    LDR r3, =0x18                               //GPIOC_BSRR offset
    CMP r1, #2                                  //Compare if GPIOC1 is set
    BEQ reset_GPIOC1                            //Reset if set
set_GPIOC1:                                     //Else set if reset
    LDR r4, =0x2                                //Set up the set word
    B config_GPIOC1BSRR                         //Jump to set
reset_GPIOC1:   
    LDR r4, =0x20000                            //Set up the reset word
config_GPIOC1BSRR:
    STRMASKSET                                  //STR the config register
    
    LDR r2, =0xe000e100                         //NVIC_Register
    LDR r3, =0x00                               //ISER0_Register
    LDR r4, =0x10000000                         //Reenable Interrupt for 0x2c
    STRMASKSET
    
    POP {r4-r11, pc}                            //pop registers back from stack
    //B main_loop
    .size TIM2_handler, . - TIM2_handler

What am I doing wrong?


Solution

  • This is what a typical road to a timer peripheral interrupt handler should look like: several ad-hoc throwaway programs, and never enabling the interrupt until you completely understand the peripheral and can go no further without it.

    I am using a NUCLEO-F411RE or 446RE; both run the same code. I have not cross-referenced against the F407, but those boards are a PITA compared to the NUCLEOs (well, move a jumper and use dfu-util, or run a command line to load, so maybe not so painful, but it takes more tools). I may port this to asm for the F407 and run it on one of those parts later.

    It is best to use C + asm rather than pure asm, or at least prototype in C + asm; you can go pure asm later. YMMV. This is C + asm. I will post the whole program at the end: everything you need to recreate it with some arm-whatever-whatever GNU tools from the last decade or so (no libraries, all code visible, two or three files total).

    First, of course, blink an LED so you know you can run programs, talk to peripherals, and read the manual and schematics. Use a counting loop (count some variable or register to some tens of thousands or hundreds of thousands or whatever, no timers). Then use something like the systick timer if you can and confirm the clock speed referred to in the manual. In this case I have an HSI of 16 MHz. Then get the UART working (debuggers create as many problems as they solve, so they are useless to me; in this case the UART is very much your friend and makes this task much, much easier).

    hexstrings() prints the value with a space at the end; hexstring() ends it with a CR/LF.

    ra=GET32(RCC_APB1ENR);
    ra|=1<<0; //enable TIM2
    PUT32(RCC_APB1ENR,ra);
    
    PUT32(TIM2_CR1,0x00000001);
    while(1)
    {
        hexstring(GET32(TIM2_CNT));
    }
    
    4702927E 
    4702CE33 
    470309F7 
    470345AC 
    47038162 
    

    It counts up, takes a long time (as one would expect) to reach the rollover, and rolls over as one would expect for a 32-bit counter. And it keeps going, so we have learned a TON about this peripheral from that one experiment. For some timers we would be doing a bunch more work just to get the thing to auto-reload and restart. Not this one; it defaults to what we want.

    ra=GET32(RCC_APB1ENR);
    ra|=1<<0; //enable TIM2
    PUT32(RCC_APB1ENR,ra);
    
    PUT32(TIM2_ARR,0x08000000);
    PUT32(TIM2_CR1,0x00000001);
    while(1)
    {
        hexstring(GET32(TIM2_CNT));
    }
    

    Watching the output I can see that it counts to 0x08000000 and rolls over (whether we really want 0x07FFFFFF or 0x08000000 to meet some precise time is not part of this exercise; likely 0x07FFFFFF).
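
    As a rough check on the 0x07FFFFFF versus 0x08000000 question, here is a minimal sketch of the arithmetic, assuming an up-counter that runs from 0 through ARR inclusive and a prescaler that divides by PSC+1. The function names are made up for illustration, and the 16 MHz figure assumes the timer is clocked straight from the HSI with no APB division.

```c
#include <stdint.h>

/* Number of timer input clocks between two update events when counting up:
 * the counter runs 0..ARR inclusive, and each count lasts PSC+1 clocks. */
uint64_t update_period_ticks(uint32_t arr, uint16_t psc)
{
    return ((uint64_t)arr + 1u) * ((uint64_t)psc + 1u);
}

/* Same period in milliseconds for a given timer clock in Hz (rounded down). */
uint64_t update_period_ms(uint32_t arr, uint16_t psc, uint32_t timer_clk_hz)
{
    return update_period_ticks(arr, psc) * 1000u / timer_clk_hz;
}
```

    Under those assumptions `update_period_ticks(0x07FFFFFF, 0)` gives exactly 0x08000000 ticks, and an ARR of 0x04000000 at 16 MHz works out to a bit over 4 seconds.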

    If you had done this experiment you would also be seeing this and not wondering or thinking that it kept counting beyond.

    Then I tried the prescaler:

    ra=GET32(RCC_APB1ENR);
    ra|=1<<0; //enable TIM2
    PUT32(RCC_APB1ENR,ra);
    
    PUT32(TIM2_ARR,0x08000000);
    PUT32(TIM2_PSC,0xFFFF);
    GET32(TIM2_CR1);
    PUT32(TIM2_CR1,0x0001);
    

    The output was pretty messed up: the timer counts at full speed until the first rollover, then counts using the prescaler. Research project for another day; I do not need that for this.
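
    One plausible explanation, not verified here: the value written to PSC is only a preload, and the divider picks it up at the next update event. The toy model below (all names hypothetical, not a register map) shows how that would produce a first lap at full speed and prescaled laps afterwards.

```c
#include <stdint.h>

/* Toy model of an up-counter with a buffered prescaler. */
struct timer_model {
    uint32_t cnt;          /* counter value */
    uint32_t arr;          /* auto-reload value */
    uint16_t psc_written;  /* value last written to PSC (preload) */
    uint16_t psc_active;   /* value the divider is actually using */
    uint16_t div;          /* prescaler divider state */
};

/* Update event: counter restarts and the written PSC value becomes active. */
void timer_update_event(struct timer_model *t)
{
    t->cnt = 0;
    t->div = 0;
    t->psc_active = t->psc_written;
}

/* Advance the model by one input clock. */
void timer_clock(struct timer_model *t)
{
    if (t->div < t->psc_active) {   /* divider has not expired yet */
        t->div++;
        return;
    }
    t->div = 0;
    if (t->cnt >= t->arr)
        timer_update_event(t);      /* overflow: restart and latch PSC */
    else
        t->cnt++;
}
```

    If that is what is happening, forcing an update event before starting the counter (the UG bit in the timer's EGR register, assuming the layout matches the reference manual) should load the prescaler up front. Note that this may also raise the update flag in the status register.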

    ra=GET32(RCC_APB1ENR);
    ra|=1<<0; //enable TIM2
    PUT32(RCC_APB1ENR,ra);
    PUT32(TIM2_ARR,0x08000000);
    PUT32(TIM2_CR1,0x0001);
    

    (printed out the count and status register)

    At 0x08000000 the status register changes to 0x1F. It did not clear on read so that is important to know/confirm if we want half a chance at success with an interrupt.

    ra=GET32(RCC_APB1ENR);
    ra|=1<<0; //enable TIM2
    PUT32(RCC_APB1ENR,ra);
    PUT32(TIM2_ARR,0x08000000);
    PUT32(TIM2_CR1,0x0001);
    while(1)
    {
        hexstrings(GET32(TIM2_CNT));
        rx=GET32(TIM2_SR);
        hexstrings(rx);
        if(rx&1)
        { 
            PUT32(TIM2_SR,0);
            rb++;
        }
        hexstring(rb);
    }
    

    When it hits the ARR value we get the one UIF; we clear it and can see the count has rolled over, etc. Good info.
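
    The flags in this status register are, as far as the reference manual describes them, write-0-to-clear: writing a 0 to a bit clears it and writing a 1 leaves it alone. A small model of that rule (the function name is hypothetical) shows why `PUT32(TIM2_SR,0)` clears everything and how to clear only the update flag:

```c
#include <stdint.h>

#define TIM_SR_UIF 0x0001u  /* update interrupt flag, bit 0 of the status register */

/* Model of a write to a register whose flags are "write 0 to clear,
 * writing 1 has no effect": only the bits written as 0 get cleared. */
uint32_t sr_after_write(uint32_t sr, uint32_t written)
{
    return sr & written;
}
```

    With the 0x1F seen above, writing 0 leaves 0x00, and writing `~TIM_SR_UIF` leaves 0x1E. Either way there is no need to read the register before writing it.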

    ra=GET32(RCC_APB1ENR);
    ra|=1<<0; //enable TIM2
    PUT32(RCC_APB1ENR,ra);
    PUT32(TIM2_ARR,0x08000000);
    PUT32(TIM2_DIER,0x0001);
    PUT32(TIM2_CR1,0x0001);
    //tim2 position 28
    
    while(1)
    {
        hexstrings(GET32(TIM2_CNT));
    
        rx=GET32(TIM2_SR);
        hexstrings(rx);
        hexstrings(GET32(NVIC_ICPR0));
        hexstrings(GET32(NVIC_ICPR1));
        if(rx&1)
        { 
            PUT32(TIM2_SR,0);
            rb++;
            break;
        }
        hexstring(rb);
    }
    
    
    07FF8B0A 00000000 00000000 00000000 00000000 
    00008A36 0000001F 10000000 00000000 
    

    This shows what we expected to see. I did not capture output with no enable bits set in DIER; that was an important step that I did perform, I just skipped capturing the source/output. Without the DIER enable the interrupt does not make it out of the peripheral, as one would expect, and that is confirmed.

    I kinda messed up my logic there, but it doesn't matter; I saw what I needed to see: confirmation that it is external interrupt 28, as shown in the ST documentation.
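
    For reference, the register and bit for a given interrupt number follow from there being 32 interrupts per NVIC register. A minimal sketch (helper names are hypothetical; the base addresses are the ISER0 and ICPR0 addresses used in the code in this answer):

```c
#include <stdint.h>

#define NVIC_ISER_BASE 0xE000E100u  /* interrupt set-enable registers */
#define NVIC_ICPR_BASE 0xE000E280u  /* interrupt clear-pending registers */

/* Each NVIC register covers 32 interrupts, one bit per interrupt. */
uint32_t nvic_reg_addr(uint32_t base, uint32_t irq)
{
    return base + 4u * (irq / 32u);
}

/* Bit within that register for the given interrupt number. */
uint32_t nvic_bit_mask(uint32_t irq)
{
    return 1u << (irq % 32u);
}
```

    For interrupt 28 this gives the first register of each group and the mask 0x10000000, which is the value showing up in the ICPR0 column of the output.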

    ra=GET32(RCC_APB1ENR);
    ra|=1<<0; //enable TIM2
    PUT32(RCC_APB1ENR,ra);
    PUT32(TIM2_ARR,0x04000000);
    PUT32(TIM2_DIER,0x0001);
    PUT32(TIM2_CR1,0x0001);
    //tim2 position 28
    
    while(1)
    {
        hexstrings(GET32(TIM2_CNT));
    
        rx=GET32(TIM2_SR);
        hexstrings(rx);
        hexstrings(GET32(NVIC_ICPR0));
        PUT32(NVIC_ICPR0,0x10000000);
        hexstrings(GET32(NVIC_ICPR0));
        if(rx&1)
        { 
            PUT32(TIM2_SR,0);
            rb++;
        }
        hexstrings(GET32(NVIC_ICPR0));
        hexstrings(GET32(TIM2_SR));
        hexstring(rb);
    

        if(rb) break;
    }

    03FD90C6 00000000 00000000 00000000 00000000 00000000 00000000 
    03FEF1B5 00000000 00000000 00000000 00000000 00000000 00000000 
    00005296 0000001F 10000000 10000000 10000000 00000000 00000001 
    

    This demonstrates an important trap that folks tend to fall into: in general you need to clear the interrupt at the peripheral first and then work toward the processor, otherwise it may stay triggered or re-trigger.

    ra=GET32(RCC_APB1ENR);
    ra|=1<<0; //enable TIM2
    PUT32(RCC_APB1ENR,ra);
    PUT32(TIM2_ARR,0x04000000);
    PUT32(TIM2_DIER,0x0001);
    PUT32(TIM2_CR1,0x0001);
    //tim2 position 28
    
    
    while(1)
    {
        hexstrings(GET32(TIM2_CNT));
    
        rx=GET32(TIM2_SR);
        hexstrings(rx);
        hexstrings(GET32(NVIC_ICPR0));
        if(rx&1)
        { 
            PUT32(TIM2_SR,0);
            rb++;
        }
        PUT32(NVIC_ICPR0,0x10000000);
        hexstrings(GET32(NVIC_ICPR0));
        hexstrings(GET32(TIM2_SR));
        hexstring(rb);
    

        if(rb) break;
    }

    03FD759C 00000000 00000000 00000000 00000000 00000000 
    03FEA5AA 00000000 00000000 00000000 00000000 00000000 
    03FFD5BA 0000001F 10000000 00000000 00000000 00000001 
    

    Much better, this is more like it, we might have enough to finally enable interrupts all the way through.
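
    The two captures above can be summarized with a toy model, assuming the NVIC pending bit gets set again for as long as the peripheral still has its flag raised (with the DIER enable on). The struct and function names are invented for the sketch; only the bit mask comes from the experiments.

```c
#include <stdint.h>

#define TIM2_IRQ_BIT 0x10000000u   /* bit 28 in NVIC_ISER0/ICPR0 */

/* Toy model: hypothetical state, not real registers. */
struct irq_model {
    uint32_t tim_sr;        /* peripheral status flags */
    uint32_t nvic_pending;  /* NVIC pending bits */
};

/* The pending bit is set again while the peripheral still asserts. */
static void settle(struct irq_model *m)
{
    if (m->tim_sr & 1u)
        m->nvic_pending |= TIM2_IRQ_BIT;
}

/* Model of a write to the clear-pending register. */
void write_icpr(struct irq_model *m, uint32_t bits)
{
    m->nvic_pending &= ~bits;
    settle(m);
}

/* Model of a write to the timer status register (write 0 to clear). */
void write_tim_sr(struct irq_model *m, uint32_t value)
{
    m->tim_sr &= value;
}
```

    Clearing the pending bit first and the status register second leaves the pending bit set in this model, as in the first capture; the other order leaves it clear, as in the second.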

    void tim2_handler ( void )
    {
        uart2_send(0x55);
        PUT32(TIM2_SR,0);
        PUT32(NVIC_ICPR0,0x10000000);
    }
    

    ...

    ra=GET32(RCC_APB1ENR);
    ra|=1<<0; //enable TIM2
    PUT32(RCC_APB1ENR,ra);
    PUT32(TIM2_ARR,0x04000000);
    PUT32(TIM2_DIER,0x0001);
    PUT32(NVIC_ISER0,0x10000000);
    ienable();
    PUT32(TIM2_CR1,0x0001);
    
    while(1)
    {
        DOWFI();
        hexstring(GET32(TIM2_CNT));
    
    }
    
    
    12345678 
    U00000062 
    U00000061 
    U00000061 
    U00000061 
    U00000061 
    U00000061 
    U00000061 
    

    And it all works as desired.

    An important note here: some Cortex-Ms (and other ARM cores) treat WFI as a nop, kept in place simply to allow for more portable code. In this case it does not; it actually waits.

    flash.s

    .cpu cortex-m0
    .thumb
    
    .thumb_func
    .global _start
    _start:
    stacktop: .word 0x20001000
    .word reset
    .word hang
    .word hang
    .word hang
    .word hang,hang,hang,hang
    .word hang,hang,hang,hang
    .word hang,hang,hang,hang
    .word hang,hang,hang,hang
    .word hang,hang,hang,hang
    .word hang,hang,hang,hang
    .word hang,hang,hang,hang
    .word hang,hang,hang,hang
    .word hang,hang,hang,hang
    .word hang
    .word hang
    .word hang
    .word tim2_handler
    
    .thumb_func
    reset:
        cpsid i
        bl notmain
        b hang
    .thumb_func
    hang:   b .
    .thumb_func
    .globl PUT32
    PUT32:
        str r1,[r0]
        bx lr
    .thumb_func
    .globl GET32
    GET32:
        ldr r0,[r0]
        bx lr
    .thumb_func
    .globl ienable
    ienable:
        cpsie i
        bx lr
    
    .thumb_func
    .globl DOWFI
    DOWFI:
        wfi
        bx lr
    

    flash.ld

    MEMORY
    {
        ram : ORIGIN = 0x08000000, LENGTH = 0x1000
    }
    SECTIONS
    {
        .text : { *(.text*) } > ram
        .rodata : { *(.rodata*) } > ram
        .bss : { *(.bss*) } > ram
    }
    

    notmain.c

    void PUT32 ( unsigned int, unsigned int );
    unsigned int GET32 ( unsigned int );
    void ienable ( void);
    void DOWFI(void);
    
    #define RCCBASE 0x40023800
    #define RCC_APB1RSTR    (RCCBASE+0x20)
    #define RCC_AHB1ENR     (RCCBASE+0x30)
    #define RCC_APB1ENR     (RCCBASE+0x40)
    
    #define GPIOABASE 0x40020000
    #define GPIOA_MODER     (GPIOABASE+0x00)
    #define GPIOA_AFRL      (GPIOABASE+0x20)
    
    #define USART2BASE 0x40004400
    #define USART2_SR       (USART2BASE+0x00)
    #define USART2_DR       (USART2BASE+0x04)
    #define USART2_BRR      (USART2BASE+0x08)
    #define USART2_CR1      (USART2BASE+0x0C)
    
    #define TIM2_BASE 0x40000000
    #define TIM2_CR1 (TIM2_BASE+0x00)
    #define TIM2_SR  (TIM2_BASE+0x10)
    #define TIM2_DIER  (TIM2_BASE+0x0C)
    #define TIM2_CNT (TIM2_BASE+0x24)
    #define TIM2_PSC (TIM2_BASE+0x28)
    #define TIM2_ARR (TIM2_BASE+0x2C)
    
    #define ICTR            0xE000E004
    #define NVIC_ISER0      0xE000E100
    #define NVIC_ICPR0      0xE000E280
    
    int uart2_init ( void )
    {
        unsigned int ra;
    
        ra=GET32(RCC_AHB1ENR);
        ra|=1<<0; //enable port A
        PUT32(RCC_AHB1ENR,ra);
    
        ra=GET32(RCC_APB1ENR);
        ra|=1<<17; //enable USART2
        PUT32(RCC_APB1ENR,ra);
    
        ra=GET32(GPIOA_MODER);
        ra&=~(3<<4); //PA2
        ra&=~(3<<6); //PA3
        ra|=2<<4; //PA2
        ra|=2<<6; //PA3
        PUT32(GPIOA_MODER,ra);
    
        ra=GET32(GPIOA_AFRL);
        ra&=~(0xF<<8); //PA2
        ra&=~(0xF<<12); //PA3
        ra|=0x7<<8; //PA2
        ra|=0x7<<12; //PA3
        PUT32(GPIOA_AFRL,ra);
    
        ra=GET32(RCC_APB1RSTR);
        ra|=1<<17; //reset USART2
        PUT32(RCC_APB1RSTR,ra);
        ra&=~(1<<17);
        PUT32(RCC_APB1RSTR,ra);
        //16000000/(115200) = 138.8
        PUT32(USART2_BRR,139);
        PUT32(USART2_CR1,(1<<3)|(1<<2)|(1<<13));
        return(0);
    }
    void uart2_send ( unsigned int x )
    {
        while(1) if(GET32(USART2_SR)&(1<<7)) break;
        PUT32(USART2_DR,x);
    }
    unsigned int uart2_recv ( void )
    {
        while(1) if((GET32(USART2_SR))&(1<<5)) break;
        return(GET32(USART2_DR));
    }
    void hexstrings ( unsigned int d )
    {
        //unsigned int ra;
        unsigned int rb;
        unsigned int rc;
    
        rb=32;
        while(1)
        {
            rb-=4;
            rc=(d>>rb)&0xF;
            if(rc>9) rc+=0x37; else rc+=0x30;
            uart2_send(rc);
            if(rb==0) break;
        }
        uart2_send(0x20);
    }
    void hexstring ( unsigned int d )
    {
        hexstrings(d);
        uart2_send(0x0D);
        uart2_send(0x0A);
    }
    void tim2_handler ( void )
    {
        uart2_send(0x55);
        PUT32(TIM2_SR,0);
        PUT32(NVIC_ICPR0,0x10000000);
    }
    int notmain ( void )
    {
        unsigned int ra;
    
        uart2_init();
        hexstring(0x12345678);
    
        ra=GET32(RCC_APB1ENR);
        ra|=1<<0; //enable TIM2
        PUT32(RCC_APB1ENR,ra);
        PUT32(TIM2_ARR,0x04000000);
        PUT32(TIM2_DIER,0x0001);
        PUT32(NVIC_ISER0,0x10000000);
        ienable();
        PUT32(TIM2_CR1,0x0001);
        //tim2 position 28
            while(1)
        {
            DOWFI();
            hexstring(GET32(TIM2_CNT));
    
        }
        return(0);
    }
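
    The 139 written to USART2_BRR in uart2_init comes from the division noted in the comment there. A minimal sketch of that calculation with rounding to nearest, assuming 16x oversampling (so the register value is simply clock over baud) and a hypothetical helper name:

```c
#include <stdint.h>

/* Baud-rate register value for 16x oversampling, rounded to nearest.
 * The register holds the divider in 12.4 fixed point, which for 16x
 * oversampling is numerically the peripheral clock divided by the baud rate. */
uint32_t usart_brr(uint32_t fck_hz, uint32_t baud)
{
    return (fck_hz + baud / 2u) / baud;
}
```

    With a 16 MHz clock and 115200 baud this gives 139, matching the hard-coded value.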
    

    build it

    arm-linux-gnueabi-as --warn --fatal-warnings -mcpu=cortex-m0 flash.s -o flash.o
    arm-linux-gnueabi-gcc -Wall -O2 -ffreestanding -mcpu=cortex-m0 -mthumb -c notmain.c -o notmain.o
    arm-linux-gnueabi-ld -nostdlib -nostartfiles -T flash.ld flash.o notmain.o -o notmain.elf
    arm-linux-gnueabi-objdump -D notmain.elf > notmain.list
    arm-linux-gnueabi-objcopy -O binary notmain.elf notmain.bin
    

    arm-linux-gnueabi, arm-none-eabi, etc.: which one you use does not matter; the code is designed not to care.

    Note there is no .data nor real .bss support (as designed, who needs .data? I do not)

    Critical next step is to check the output

    Disassembly of section .text:
    
    08000000 <_start>:
     8000000:   20001000    andcs   r1, r0, r0
     8000004:   080000b5    stmdaeq r0, {r0, r2, r4, r5, r7}
     8000008:   080000bd    stmdaeq r0, {r0, r2, r3, r4, r5, r7}
     800000c:   080000bd    stmdaeq r0, {r0, r2, r3, r4, r5, r7}
    ...
     80000ac:   080000bd    stmdaeq r0, {r0, r2, r3, r4, r5, r7}
     80000b0:   08000209    stmdaeq r0, {r0, r3, r9}
    
    080000b4 <reset>:
     80000b4:   b672        cpsid   i
     80000b6:   f000 f8b9   bl  800022c <notmain>
     80000ba:   e7ff        b.n 80000bc <hang>
    
    080000bc <hang>:
     80000bc:   e7fe        b.n 80000bc <hang>
    
    ...
    
    08000208 <tim2_handler>:
     8000208:   b510        push    {r4, lr}
     800020a:   2055        movs    r0, #85 ; 0x55
    

    On some platforms recovering from a bricked chip is more painful than others. For the nucleos it is not an issue the debug mcu takes over the chip and can recover it in general. The stm32f4 discovery board exposes the boot0 pin so that is recoverable too. Also it can take you forever to figure out you are not really booting because you did not build the vector table right and/or did not declare the label a thumb function.
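
    Both vector table checks can be done by hand from the listing. A sketch of the arithmetic, assuming the usual Cortex-M layout of 16 system entries ahead of the external interrupts (function names are made up):

```c
#include <stdint.h>

/* Byte offset of an external interrupt's entry in the vector table:
 * 16 system entries (initial SP, reset, faults, ...) come first. */
uint32_t vector_offset(uint32_t irq)
{
    return (16u + irq) * 4u;
}

/* A handler entry needs bit 0 set to mark the target as Thumb code. */
int is_thumb_entry(uint32_t vector_word)
{
    return (vector_word & 1u) != 0;
}

/* Address of the first instruction the entry points at. */
uint32_t handler_address(uint32_t vector_word)
{
    return vector_word & ~1u;
}
```

    With interrupt 28 the entry lands at offset 0xB0, which is where the listing shows 08000209: the address of tim2_handler (08000208) with bit 0 set.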

    And for the nucleo simply copy notmain.bin over to the virtual thumb drive.

    The ARPE bit is for whether you want to change the ARR register at run time and where in the cycle you want the change to be sampled (immediately, or at the next update event). This may have been the key to the prescaler if I had set that, but that was not my goal here.

    I did not read through the gpio code in your ISR other than to note you used a lot of code just to toggle the led. More than necessary.

    Now, naturally you want to set up the GPIO before you start the timer. The write of the enable bit to the timer should be the very last thing you do, not right up front. At that point you don't have anything set up: the interrupt enable in the NVIC, the GPIO, etc. Granted, you have about 4 seconds or more before the first event, but that is not the point. Messing with the ARR after you start does/can get the ARPE bit involved, so you could just set the ARR up front, at least for your first try at this.
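
    On setting ARR up front, one thing worth checking with a polling experiment: the `STRMASKSET` macro in the question is a read-modify-write OR, so what ends up in the register depends on what was there before. Assuming TIM2_ARR comes out of reset as 0xFFFFFFFF (that is what the F4 reference manual appears to say, so print the register to confirm), ORing a value into it changes nothing. The helpers below are hypothetical and just do the arithmetic:

```c
#include <stdint.h>

/* What an OR-style read-modify-write leaves in a register. */
uint32_t after_or_write(uint32_t current, uint32_t bits)
{
    return current | bits;
}

/* Ticks until the first overflow when counting up from cnt to arr.
 * Assumes cnt <= arr. */
uint64_t ticks_to_first_update(uint32_t cnt, uint32_t arr)
{
    return (uint64_t)arr - cnt + 1u;
}
```

    If that assumption holds, a counter preset of 0xF8000000 reaches the first update after 0x08000000 ticks, and every later lap is the full 32-bit range, which would match the symptom in the question. A plain store to ARR avoids the dependence on the old contents.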

    If you have a desire to mess with the ARR while it is running, then you go backward, without any interrupt service stuff, and understand the peripheral, ideally with the UART, if not with the LED using polling. Any time you work with interrupts, whether you have weeks or decades of experience, you want to do everything you can without interrupts before you commit to interrupting the processor. It makes development time 10 to 100 times shorter. (Any time you are learning some new peripheral bare metal, roll your own or really even with a third-party library, you need to write lots of targeted ad-hoc throwaway code, and then glue fragments or write new pieces from the throwaway code to make the real application.)

    If you are willing to burn a memory location (say 0x20000800) you can preload it with 0x00020000 or 0x00000002 and then in the isr load it, xor it with 0x00020002 and then write it to the bsrr...(assuming this gpio peripheral does not have a toggle control register). (you could read odr and xor it with 2 and write it back)
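
    A minimal sketch of that idea, modelled on the host rather than on the real registers. The helper names are hypothetical, and the model assumes the usual BSRR layout where the low half sets pins, the high half resets them, and set wins if both are written:

```c
#include <stdint.h>

#define PC1_SET   0x00000002u  /* BSRR bit 1: set pin 1 */
#define PC1_RESET 0x00020000u  /* BSRR bit 17: reset pin 1 */

/* Flip a stored BSRR word between "set pin 1" and "reset pin 1". */
uint32_t next_bsrr_word(uint32_t word)
{
    return word ^ (PC1_SET | PC1_RESET);
}

/* Model of what a BSRR write does to the 16-bit output data register. */
uint32_t odr_after_bsrr(uint32_t odr, uint32_t bsrr)
{
    odr &= ~(bsrr >> 16);      /* high half resets pins */
    odr |= bsrr & 0xFFFFu;     /* low half sets pins (applied last, so it wins) */
    return odr & 0xFFFFu;
}
```

    In the ISR that would be: load the word from the memory location, write it to BSRR, xor it with 0x00020002 and store it back for next time. No compare or branch is needed.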

    The only concern really with your code is the order in which you do things, I would (did) do the timer last, have all the rest of the system ready before you start that timer, gpio, nvic, all the timer control registers you can.

    Reminder: mine above so far is for the STM32F411 and 446, so the RCC bits and the base address of timer 2 may be different. I would hope the same peripheral was used; ST does have more than one GPIO and UART design, for example, and I am not sure whether one TIM2 is the same as the others. I may try to port this to the 407 later (or not).

    Using asm is fine it just takes longer...you are taking a good brute force approach and not trying to optimize just yet, address, data, store, address, data store, etc. (bare-metal does not mean assembly language, but you probably know that).

    A major problem with bare metal is debugging, which is why I write a few lines of code per experiment, not hundreds, not thousands. You can get through thousands of debugged lines of code in a day that way. As for the approach you take: I write a lot of throwaway code to learn the peripheral, versus writing lots of code and then putting faith in layers of debugger (fill in the blank). From experience that takes longer and is more frustrating. So the problem you are facing at the moment is how to debug this thing. My answer there is to back off the interrupt and get it working with polling first (simply use the UIF bit in the status register before even using DIER) and then work back forward.
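
    As a sketch of what that polling step could look like for the LED in the question (pin 1 of GPIOC): the function takes pointers so it can be tried against plain variables on a host first. The name and the macros are hypothetical; on the real part the two pointers would be the TIM2_SR and GPIOC_ODR addresses.

```c
#include <stdint.h>

#define UIF      0x0001u  /* update flag, bit 0 of the status register */
#define LED_MASK 0x0002u  /* pin 1 in the output data register */

/* One pass of a polling loop: if the update flag is up, clear it and
 * flip the LED bit. Returns 1 when a toggle happened, 0 otherwise. */
int poll_and_toggle(volatile uint32_t *sr, volatile uint32_t *odr)
{
    if ((*sr & UIF) == 0)
        return 0;
    *sr = 0;            /* write 0 to clear, as in the experiments above */
    *odr ^= LED_MASK;   /* read ODR, xor with 2, write it back */
    return 1;
}
```

    Calling this in a `while(1)` with DIER left at zero gives the same blink with no interrupt involved, which separates timer problems from NVIC and handler problems.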

    The wait for interrupt is not required, just for fun. Your main loop is fine.