I'm trying to get the STM32F446 running at full speed, following this tutorial: https://www.youtube.com/watch?v=GJ_LFAlOlSk&t=826s I did everything he does, but the clock speed of my timers is painfully slow: when blinking an LED with a prescaler of 9 and an ARR of 20, the blinking is easily visible by eye. What am I doing wrong?
void setup_clock(void)
{
// Enables HSE and waits until it is ready
*RCC_CR |= (1 << RCC_CR_HSEON);
while (!(*RCC_CR & (1 << RCC_CR_HSERDY)));
// Set the power enable clock and voltage regulator
*RCC_APB1ENR |= (1 << RCC_APB1ENR_PWREN);
*PWR_CR |= PWR_CR_VOS(PWR_CR_VOS_SCALEM1);
// Configure flash
*FLASH_ACR = (1 << FLASH_ACR_DCEN) | (1 << FLASH_ACR_ICEN) | (1 << FLASH_ACR_PRFTEN);
*FLASH_ACR |= FLASH_ACR_LATENCY(5);
// Configures HCLK, PCLK1, PCLK2
*RCC_CFGR &= ~RCC_CFGR_HPRE_MASK;
*RCC_CFGR |= RCC_CFGR_HPRE(RCC_CFGR_HPRE_NODIV); // HCLK 180Mhz
*RCC_CFGR &= ~RCC_CFGR_PPRE1_MASK;
*RCC_CFGR |= RCC_CFGR_PPRE1(RCC_CFGR_PPRE1_DIV4); // PCLK1 45Mhz
*RCC_CFGR &= ~RCC_CFGR_PPRE2_MASK;
*RCC_CFGR |= RCC_CFGR_PPRE2(RCC_CFGR_PPRE2_DIV2); // PCLK2 90Mhz
// Configures the main PLL
*RCC_PLLCFGR = RCC_PLLCFGR_PLLN(180) |
RCC_PLLCFGR_PLLP(RCC_PLLCFGR_PLLP_2) |
RCC_PLLCFGR_PLLR(2) |
RCC_PLLCFGR_PLLM(4) |
(1 << RCC_PLLCFGR_PLLSRC);
// Enable PLL
*RCC_CR |= (1 << RCC_CR_PLLON);
while (!(*RCC_CR & (1 << RCC_CR_PLLRDY)));
// Use PLL as clock source
*RCC_CFGR &= ~RCC_CFGR_SW_MASK;
*RCC_CFGR |= RCC_CFGR_SW(RCC_CFGR_SW_PLL_P);
while ((*RCC_CFGR & RCC_CFGR_SWS_MASK) != RCC_CFGR_SWS(RCC_CFGR_SWS_PLL));
// Sets the CLOCK Ready status LED
*GPIO_ODR(STATUS_BASE) |= (1 << STATUS_CLKREADY);
}
Below is a complete working project for the NUCLEO-F446RE using GNU tools: everything you need to build and run it.
Differences.
I am starting off in the default power mode (it looks like you are too), so I conservatively set the flash latency to 8 wait states (9 clocks). You can lower it afterwards; I would personally set it to 8 first, get it working, then work back down to 5.
I am using neither the I cache nor the D cache.
I switch the system clock to HSE and then set the PLL to use HSE as well. You skip the first step, which is probably fine, as the HSE is up and ready (to be used by the PLL).
This line
*RCC_CFGR &= ~RCC_CFGR_SW_MASK;
switches the clock to HSI and then
*RCC_CFGR |= RCC_CFGR_SW(RCC_CFGR_SW_PLL_P);
switches the clock to PLL. Make up your mind; do not use/abuse the registers this way, as Lundin commented. Do a clean read-modify-write: read the register, zero the bits that need to be zeroed (or the whole field), set the bits that need to be set, then write the register once. Use a temporary variable for this, or some flavor of
reg = (reg&this) | that;
but certainly not
reg &= this;
reg |= that;
in general. I doubt that is your problem though; it is just a comment from a few of us.
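The single-write form can be wrapped in a small helper. This is only a sketch: `set_field` and the `SW_*` names are made up for illustration, with SW taken to be RCC_CFGR bits [1:0], the same bits the `ra&=~3` code further down operates on.

```c
#include <stdint.h>

/* Hypothetical names for illustration: SW is assumed to be RCC_CFGR[1:0]. */
#define SW_MASK 0x3u
#define SW_HSE  0x1u
#define SW_PLL  0x2u

/* Return 'reg' with the bits selected by 'mask' replaced by 'value'
   (already shifted into position). The caller stores the result with a
   single write, so the register never holds the intermediate all-zero
   field that a separate "&=" followed by "|=" would produce. */
static uint32_t set_field(uint32_t reg, uint32_t mask, uint32_t value)
{
    return (reg & ~mask) | (value & mask);
}
```

With the GET32/PUT32 routines used below, the clock switch would then be one read and one write: PUT32(RCC_CFGR, set_field(GET32(RCC_CFGR), SW_MASK, SW_PLL));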
Your RCC_PLLCFGR write leaves PLLQ at 0, which is an invalid setting. That might be a problem; just try giving it a valid value.
I am building for cortex-m0 out of habit and for portability of the code; that is easy to change.
Before PJ brings this up
*RCC_APB1ENR |= (1 << RCC_APB1ENR_PWREN);
*PWR_CR |= PWR_CR_VOS(PWR_CR_VOS_SCALEM1);
is risky. You need to examine the compiled output, and it can vary with the compiler, its version, and the phase of the moon. If the STR to RCC_APB1ENR is immediately followed by the LDR of PWR_CR, that may not work. What I saw in experiments based on PJ's comment on another ticket (the peripheral there was GPIO) was that, for some reason, you can read the MODER register with the peripheral clock off, so an STR of the enable followed by an LDR of MODER works, and the instructions that do the modify and the write then give the enable more than enough time to take effect. But if you write MODER directly, then depending on your compiler and settings it can be optimized into two back-to-back stores; I was able to cause this with one compiler and not another (change the settings, though, and they flip between working and failing). The GET32/PUT32 approach I use ensures there is no problem with touching the peripheral before the enable has had time to settle. YMMV.
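If you would rather keep pointer-style register access, one commonly used alternative is to read the enable register back right after setting the bit, so at least one bus access sits between the enable and the first touch of the peripheral. A minimal sketch; `enable_clock_and_sync` is a made-up name, and whether a single read-back is enough on a given part is something to check against its errata sheet.

```c
#include <stdint.h>

/* Set an enable bit, then read the same register back. Both accesses go
   through volatile objects, so the compiler must emit the store and the
   load, in that order, before any later volatile access. */
static void enable_clock_and_sync(volatile uint32_t *enable_reg,
                                  uint32_t bit_mask)
{
    volatile uint32_t readback;

    *enable_reg |= bit_mask;   /* read-modify-write of the enable register */
    readback = *enable_reg;    /* dummy read-back, value is not used */
    (void)readback;
}
```

For the two lines above, that would be called with the address of RCC_APB1ENR (RCCBASE+0x40) and the PWREN bit (bit 28, as I read the manual) before PWR_CR is touched.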
flash.s
.cpu cortex-m0
.thumb
.thumb_func
.global _start
_start:
.word 0x20001000
.word reset
.thumb_func
reset:
bl notmain
b hang
.thumb_func
hang: b .
.thumb_func
.globl PUT32
PUT32:
str r1,[r0]
bx lr
.thumb_func
.globl GET32
GET32:
ldr r0,[r0]
bx lr
flash.ld
MEMORY
{
rom : ORIGIN = 0x08000000, LENGTH = 0x1000
}
SECTIONS
{
.text : { *(.text*) } > rom
}
notmain.c
void PUT32 ( unsigned int, unsigned int );
unsigned int GET32 ( unsigned int );
void dummy ( unsigned int );
#define RCCBASE 0x40023800
#define RCC_AHB1ENR (RCCBASE+0x30)
#define RCC_CR (RCCBASE+0x00)
#define RCC_PLLCFGR (RCCBASE+0x04)
#define RCC_CFGR (RCCBASE+0x08)
#define FLASH_ACR 0x40023C00
#define GPIOABASE 0x40020000
#define GPIOA_MODER (GPIOABASE+0x00)
#define GPIOA_BSRR (GPIOABASE+0x18)
//PA5
#define STK_CSR 0xE000E010
#define STK_RVR 0xE000E014
#define STK_CVR 0xE000E018
static void clock_init ( void )
{
unsigned int ra;
//switch to external clock.
ra=GET32(RCC_CR);
ra|=1<<16;
PUT32(RCC_CR,ra);
while(1) if(GET32(RCC_CR)&(1<<17)) break;
if(1)
{
ra=GET32(RCC_CFGR);
ra&=~3;
ra|=1;
PUT32(RCC_CFGR,ra);
while(1) if(((GET32(RCC_CFGR)>>2)&3)==1) break;
}
//HSE ready
}
static void pll_init ( void )
{
unsigned int ra;
//clock_init();
ra=GET32(FLASH_ACR);
ra&=(~(0xF<<0));
ra|=( 8<<0);
PUT32(FLASH_ACR,ra);
//poll this?
ra=GET32(RCC_CFGR);
ra&=(~(0x7<<13)); //PPRE2 is a 3 bit field
ra|=( 4<<13); //180/90 = 2
ra&=(~(0x7<<10)); //PPRE1 is a 3 bit field
ra|=( 5<<10); //180/45 = 4
PUT32(RCC_CFGR,ra);
//HSE 8Mhz
//PLLM It is recommended to select a frequency of 2 MHz to limit
// PLL jitter.
//PLLN input is 2, want >=50 and <=432 so between 25 and 216
//PLLM 4, PLLN 180, VCO 360, PLLP 2
//PLLM 8/4 = 2
//PLLN 2 * 180 = 360
//PLLP 360 / 2 = 180
//PLLR 2?
//PLLQ 180 / 48 = 3.75 so 4.
ra=0;
ra|=2<<28; //PLLR
ra|=4<<24; //PLLQ dont care
ra|=1<<22; //PLLSRC HSE
ra|=0<<16; //PLLP field 0 means divide by 2
ra|=180<<6; //PLLN
ra|=4<<0; //PLLM
PUT32(RCC_PLLCFGR,ra);
ra=GET32(RCC_CR);
ra|=1<<24;
PUT32(RCC_CR,ra);
while(1) if(GET32(RCC_CR)&(1<<25)) break;
ra=GET32(RCC_CFGR);
ra&=~3;
ra|=2;
PUT32(RCC_CFGR,ra);
while(1) if(((GET32(RCC_CFGR)>>2)&3)==2) break;
}
static void led_init ( void )
{
unsigned int ra;
ra=GET32(RCC_AHB1ENR);
ra|=1<<0; //enable GPIOA
PUT32(RCC_AHB1ENR,ra);
ra=GET32(GPIOA_MODER);
ra&=~(3<<(5<<1)); //PA5
ra|= (1<<(5<<1)); //PA5
PUT32(GPIOA_MODER,ra);
}
static void led_on ( void )
{
PUT32(GPIOA_BSRR,((1<<5)<< 0));
}
static void led_off ( void )
{
PUT32(GPIOA_BSRR,((1<<5)<<16));
}
void do_delay ( unsigned int sec )
{
unsigned int ra,rb,rc,rd;
rb=GET32(STK_CVR);
for(rd=0;rd<sec;)
{
ra=GET32(STK_CVR);
rc=(rb-ra)&0x00FFFFFF;
if(rc>=16000000)
{
rb=ra;
rd++;
}
}
}
int notmain ( void )
{
unsigned int rx;
led_init();
PUT32(STK_CSR,0x00000004);
PUT32(STK_RVR,0xFFFFFFFF);
PUT32(STK_CSR,0x00000005);
for(rx=0;rx<5;rx++)
{
led_on();
while(1) if((GET32(STK_CVR)&0x200000)!=0) break;
led_off();
while(1) if((GET32(STK_CVR)&0x200000)==0) break;
}
clock_init();
for(rx=0;rx<5;rx++)
{
led_on();
while(1) if((GET32(STK_CVR)&0x200000)!=0) break;
led_off();
while(1) if((GET32(STK_CVR)&0x200000)==0) break;
}
pll_init();
while(1)
{
led_on();
while(1) if((GET32(STK_CVR)&0x200000)!=0) break;
led_off();
while(1) if((GET32(STK_CVR)&0x200000)==0) break;
}
return(0);
}
build
arm-linux-gnueabi-as --warn --fatal-warnings -mcpu=cortex-m0 flash.s -o flash.o
arm-linux-gnueabi-gcc -Wall -O2 -ffreestanding -mcpu=cortex-m0 -mthumb -c notmain.c -o notmain.o
arm-linux-gnueabi-ld -nostdlib -nostartfiles -T flash.ld flash.o notmain.o -o notmain.elf
arm-linux-gnueabi-objdump -D notmain.elf > notmain.list
arm-linux-gnueabi-objcopy -O binary notmain.elf notmain.bin
(You can naturally change the cortex-m0s to cortex-m4s.)
Copy notmain.bin to the nucleo card and watch the user LED change speeds: first on HSI, then half as fast on the 8 MHz HSE, then much, much faster on the PLL.
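The PLL arithmetic in the pll_init comments can be checked on a PC before flashing anything. The sketch below decodes a raw RCC_PLLCFGR value using the field positions from pll_init (PLLM at bit 0, PLLN at bit 6, PLLP at bit 16); the function name is made up, and the PLLP encoding (field 0, 1, 2, 3 meaning divide by 2, 4, 6, 8) is my reading of the reference manual, so check it for your part.

```c
#include <stdint.h>

/* Return the PLLP output frequency in Hz for a raw RCC_PLLCFGR value and a
   given PLL input clock. Returns 0 if PLLM is 0 (not a usable setting). */
static uint32_t pll_sysclk_hz(uint32_t pllcfgr, uint32_t src_hz)
{
    uint32_t pllm = (pllcfgr >> 0) & 0x3Fu;    /* input divider */
    uint32_t plln = (pllcfgr >> 6) & 0x1FFu;   /* VCO multiplier */
    uint32_t pllp_field = (pllcfgr >> 16) & 0x3u;
    uint32_t pllp = 2u * (pllp_field + 1u);    /* 0,1,2,3 -> /2,/4,/6,/8 */

    if (pllm == 0u)
        return 0u;

    /* VCO input = src/PLLM, VCO output = VCO input * PLLN */
    return (src_hz / pllm) * plln / pllp;
}
```

With an 8 MHz HSE, PLLM 4 and PLLN 180, a PLLP field of 0 gives 180000000, while a PLLP field of 2 gives 60000000, so the raw field value matters.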
Hmm...
when VOS[1:0] = '0x11, the maximum value of f HCLK is 168 MHz. It can be extended to 180 MHz by activating the over-drive mode. The over-drive mode is not available when VDD ranges from 1.8 to 2.1 V (refer to Section 5.1.3: Voltage regulator for details on how to activate the over-drive mode).
and
11: Scale 1 mode (reset value)
(so no need to mess with that)
and
Entering Over-drive mode
It is recommended to enter Over-drive mode when the application is not running critical
tasks and when the system clock source is either HSI or HSE. To optimize the configuration
time, enable the Over-drive mode during the PLL lock phase.
To enter Over-drive mode, follow the sequence below:
1. Select HSI or HSE as system clock.
2. Configure RCC_PLLCFGR register and set PLLON bit of RCC_CR register.
3. Set ODEN bit of PWR_CR register to enable the Over-drive mode and wait for the
ODRDY flag to be set in the PWR_CSR register.
4. Set the ODSW bit in the PWR_CR register to switch the voltage regulator from Normal
mode to Over-drive mode. The System will be stalled during the switch but the PLL
clock system will be still running during locking phase.
5. Wait for the ODSWRDY flag in the PWR_CSR to be set.
6. Select the required Flash latency as well as AHB and APB prescalers.
7. Wait for PLL lock.
8. Switch the system clock to the PLL.
9. Enable the peripherals that are not generated by the System PLL (I2S clock, SAI1 and
SAI2 clocks, USB_48MHz clock....).
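Steps 3 to 5 of that sequence could look roughly like this in the GET32/PUT32 style used above. Treat it as a sketch under assumptions: the PWR addresses and bit positions (ODEN and ODSWEN at bits 16 and 17 of PWR_CR, ODRDY and ODSWRDY at bits 16 and 17 of PWR_CSR) are my reading of the reference manual, and the PUT32/GET32 here are host-side stand-ins so the logic can be run on a PC. On the board you would drop the stand-ins, use the real routines from flash.s, and set PWREN in RCC_APB1ENR first.

```c
#include <stdint.h>

#define PWR_CR          0x40007000u
#define PWR_CSR         0x40007004u
#define PWR_CR_ODEN     (1u << 16)
#define PWR_CR_ODSWEN   (1u << 17)
#define PWR_CSR_ODRDY   (1u << 16)
#define PWR_CSR_ODSWRDY (1u << 17)

/* Host-side stand-ins for PUT32/GET32. They model only the two PWR
   registers and make each ready flag follow its enable bit immediately. */
static uint32_t fake_pwr_cr;
static uint32_t fake_pwr_csr;

static void PUT32(uint32_t addr, uint32_t value)
{
    if (addr == PWR_CR) {
        fake_pwr_cr = value;
        fake_pwr_csr = 0;
        if (value & PWR_CR_ODEN)   fake_pwr_csr |= PWR_CSR_ODRDY;
        if (value & PWR_CR_ODSWEN) fake_pwr_csr |= PWR_CSR_ODSWRDY;
    }
}

static uint32_t GET32(uint32_t addr)
{
    if (addr == PWR_CR)  return fake_pwr_cr;
    if (addr == PWR_CSR) return fake_pwr_csr;
    return 0;
}

/* Call after PLLON is set (step 2) and before flash latency and the
   prescalers are programmed (step 6). */
static void overdrive_enable(void)
{
    uint32_t ra;

    ra = GET32(PWR_CR);                 /* step 3: enable over-drive */
    ra |= PWR_CR_ODEN;
    PUT32(PWR_CR, ra);
    while (!(GET32(PWR_CSR) & PWR_CSR_ODRDY)) continue;

    ra = GET32(PWR_CR);                 /* step 4: switch the regulator */
    ra |= PWR_CR_ODSWEN;
    PUT32(PWR_CR, ra);
    while (!(GET32(PWR_CSR) & PWR_CSR_ODSWRDY)) continue;  /* step 5 */
}
```

In pll_init above, the call would sit between the write that sets PLLON and the wait for PLLRDY, with the flash latency and prescaler writes moved after it to match the documented order.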
I am running at room temperature, so the chip is nowhere near its maximum temperature, which is likely why it works fine being overclocked as I have done here. (Technically my code is not complete: it needs to either run at 168 MHz or enable over-drive.)
If you want 180 rather than 168, you should do these steps as documented.
I suspect you are not running your part near max temp either, so you should be able to get away with 180 as well. Try removing your PWR register code and see if that helps, make your flash latency longer, change to 168 MHz, etc.
Did you try for 180 out of the gate, or did you first try some more reasonable speeds that do not push any edges: something less than 45 MHz, then something between 45 and 90, then above 90, and then work up to 180?
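When stepping up through speeds like that, the flash latency has to follow HCLK. As a rough guide, and assuming the 2.7 V to 3.6 V supply range where (as I read the reference manual's wait-state table) each wait state buys another 30 MHz, the minimum can be computed like this. The function name is made up, and lower supply voltages need more wait states than this gives.

```c
#include <stdint.h>

/* Minimum flash wait states for a given HCLK, ASSUMING the 2.7-3.6 V
   table: 0 WS up to 30 MHz, 1 WS up to 60 MHz, ... 5 WS up to 180 MHz. */
static uint32_t flash_wait_states(uint32_t hclk_hz)
{
    if (hclk_hz == 0u)
        return 0u;
    return (hclk_hz - 1u) / 30000000u;
}
```

Setting the latency to 8 as in pll_init is simply a conservative value above this minimum.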
EDIT
The Flash memory interface accelerates code execution with a system of instruction prefetch and cache lines.
Main features
• Flash memory read operations
• Flash memory program/erase operations
• Read / write protections
• Prefetch on I-Code
• 64 cache lines of 128 bits on I-Code
• 8 cache lines of 128 bits on D-Code