We have inherited a project which targets Renesas RX231 microcontroller that I have been looking at.
This uC has only one instruction that locks the bus for atomicity (XCHG).
Because the processor is the only component that accesses RAM memory (no DMA or DTC being used), to manipulate variables in user code that are shared with interrupts, the interrupts are disabled (in the processor status word register) for the access time i.e.
disable_interrupts(); /* set_psw(get_psw() & ~(1 << 16)); */
/* access or modify shared variables */
enable_interrupts(); /* set_psw(get_psw() | (1 << 16)); */
However, there are also "flags" being shared without guarding, which are set in interrupts and polled in user code in the following way:
volatile unsigned char event_request_message = 0;
unsigned char condition_sending_message = 0;
#pragma interrupt
void on_request_message()
{
...
event_request_message = 1; // mov.l #0x3df5, r14
// mov.b #1, [r14]
...
}
void user_code()
{
for(;;)
{
...
/* might be evaluated multiple times before transmit message is completed */
if(event_request_message && !condition_sending_message) // mov.l #0x3df5, r14
// movu.b [r14], r14
// cmp #0, r14
// beq.b 0xfff8e17b <user_code+185>
// mov.l #0x5990, r14
// movu.b [r14], r14
// cmp #0, r14
// bne.b 0xfff8e16f <user_code+173>
{
event_request_message = 0; // mov.l #0x3df5, r14
// mov.b #0, [r14]
condition_sending_message = 1; // mov.l #0x5990, r14
// mov.b #1, [r14]
/* transmit message */
...
}
...
}
}
My understanding of no guarding (by disabling interrupts in user code) in this case would be:
The question is: is my understanding correct? Is such "flag" variables access and manipulation safe in this case?
Or could there be any possible bugs/errors?
What if we needed to upgrade the compiler or change compiler options?
Could such simple operations result in more than "one instruction"?
The justification of using such "flags" without guarding is too limit the amount time interrupts are disabled.
Looking at the code logic, the expected behaviour is that you can request a message one or more times but you only get one answer.
PS. I tried to use the following additional tags: "cc-rx", "rxv2-instruction-set", "rx231".
Depending on your goals, i.e. whether you're writing for a specific platform only or want to ensure portability, you need to keep several additional things in mind:
With optimizations enabled, many compilers will happily reorder accesses to volatile variables with accesses to non volatile variables, as long as the final result of the operation is indistinguishable to a single-thread scenario. This means that code like this:
int a = 0;
volatile int b = 0;
void interrupt_a(void)
{
a = b + 1;
b = 0; // set b to zero when done
}
can be rearranged by the compiler into:
load acc from [b]
store 0 into [b] // set b to zero *before* updating a, to mess with you a bit
add 1 to acc
store acc into [a]
The way to prevent an optimizing compiler from reordering would be to make both variables volatile. (Or if available, to use C11 _Atomic
with memory_order_release
stores and memory_order_acquire
loads to order it with respect to operations on non-atomic variables.)
If you're on a multi-core uC, it can reorder memory operations, so this doesn't solve the problem, and the actual solution is to emit a fence for both the compiler and the CPU, if you care about observers on other cores (or in MMIO even on a single-core uC). Hardware fence instructions aren't needed on a single core or single thread, because even an out-of-order execution CPU sees it's own operations happen in program order.
Again, if the compiler you got with the toolchain for your particular embedded system doesn't know anything about fences, then it's quite likely it will refrain from doing stuff like this. So you need to examine the documentation and check the compiled assembly.
For example, the ARM docs state that the processor is "allowed" to reorder instructions and the programmer should take care to add memory barriers, but right after that it also states (under "implementation detail") that Cortex M processors do not reorder instructions. However, they still insist that proper barriers should be inserted, since it will simplify porting to a newer version of the processor.
Depending on your pipeline length, it might take a couple of instructions until the interrupt is fully enabled or disabled, after you make the request. Again, you need to check the docs for this particular uC/compiler, but sometimes there needs to be a fence of some sort after writing to the register. For example, on ARM Cortex you need to issue both DSB and ISB instructions after disabling the interrupt, to make sure that the interrupt doesn't enter in the next several instructions
// you would have to do this on an ARM Cortex uC
DisableIRQ(device_IRQn); // Disable certain interrupt by writing to NVIC_CLRENA
DSB(); // data memory barrier
ISB(); // instruction synchronization barrier
// <-- this is where the interrupt is "really disabled"
Of course, your library might itself contain all required fence instructions when you call disable_interrupts();
, or they might not be needed at all for this architecture.
Increment operation (x++
) should not be considered atomic, even it might "accidentally" turn out to be atomic on a certain single-core CPU. As you've noticed, it's not atomic on your particular uC, and the only way to guarantee atomicity is to disable interrupts around this operation.
So, ultimately, you should make sure you read the documentation for this platform and understand what the compiler can and cannot do. Something that works today might not work tomorrow if the compiler decides to reorder instructions after you've added a seemingly minor change, especially since a race condition might not be frequent enough to detect it immediately.