Search code examples
cassemblyembeddedrenesas-rx

Is the following "flag" variable access safe between interrupt and user code?


We have inherited a project which targets Renesas RX231 microcontroller that I have been looking at.

This uC has only one instruction that locks the bus for atomicity (XCHG).

Because the processor is the only component that accesses RAM memory (no DMA or DTC being used), to manipulate variables in user code that are shared with interrupts, the interrupts are disabled (in the processor status word register) for the access time i.e.

disable_interrupts(); /* set_psw(get_psw() & ~(1 << 16)); */
/* access or modify shared variables */
enable_interrupts();  /* set_psw(get_psw() | (1 << 16)); */

However, there are also "flags" being shared without guarding, which are set in interrupts and polled in user code in the following way:

volatile unsigned char event_request_message = 0;
unsigned char condition_sending_message = 0;

#pragma interrupt
void on_request_message()
{
     ...
     event_request_message = 1; // mov.l   #0x3df5, r14
                                // mov.b   #1, [r14]
     ... 
}

void user_code()
{
     for(;;)
     {
         ...
         /* might be evaluated multiple times before transmit message is completed */
         if(event_request_message && !condition_sending_message) // mov.l   #0x3df5, r14
                                                                 // movu.b  [r14], r14
                                                                 // cmp     #0, r14
                                                                 // beq.b   0xfff8e17b <user_code+185>
                                                                 // mov.l   #0x5990, r14
                                                                 // movu.b  [r14], r14
                                                                 // cmp     #0, r14
                                                                 // bne.b   0xfff8e16f <user_code+173>
         {
              event_request_message = 0;     // mov.l   #0x3df5, r14  
                                             // mov.b   #0, [r14]                  
              condition_sending_message = 1; // mov.l   #0x5990, r14               
                                             // mov.b   #1, [r14]
              /* transmit message */ 
              ... 
         }
         ...
     }
}

My understanding of no guarding (by disabling interrupts in user code) in this case would be:

  • To read, set or clear a "flag" always two instructions are used, one to put the memory address in a register and one to read/set/clear
  • Memory addresses are always the same so can be discarded from consideration
  • Each read/set/clear operation then is a single instruction and therefore the access/manipulation is atomic

The question is: is my understanding correct? Is such "flag" variables access and manipulation safe in this case?
Or could there be any possible bugs/errors?

  • Assume the compiler used and compiler options are always the same.
  • Assume that the operations described are the only way such "flags" are accessed/manipulated (set to 0 or 1, read (all shown in assembly code)) (no addition, multiplication, etc.)

What if we needed to upgrade the compiler or change compiler options?
Could such simple operations result in more than "one instruction"?

The justification of using such "flags" without guarding is too limit the amount time interrupts are disabled.

Looking at the code logic, the expected behaviour is that you can request a message one or more times but you only get one answer.

PS. I tried to use the following additional tags: "cc-rx", "rxv2-instruction-set", "rx231".


Solution

  • Depending on your goals, i.e. whether you're writing for a specific platform only or want to ensure portability, you need to keep several additional things in mind:

    1. With optimizations enabled, many compilers will happily reorder accesses to volatile variables with accesses to non volatile variables, as long as the final result of the operation is indistinguishable to a single-thread scenario. This means that code like this:

      int a = 0;
      volatile int b = 0;
      
      void interrupt_a(void) 
      {
          a = b + 1;
          b = 0;       // set b to zero when done
      }
      

      can be rearranged by the compiler into:

      load acc from [b]
      store 0 into [b]  // set b to zero *before* updating a, to mess with you a bit
      add 1 to acc
      store acc into [a]
      

      The way to prevent an optimizing compiler from reordering would be to make both variables volatile. (Or if available, to use C11 _Atomic with memory_order_release stores and memory_order_acquire loads to order it with respect to operations on non-atomic variables.)

      If you're on a multi-core uC, it can reorder memory operations, so this doesn't solve the problem, and the actual solution is to emit a fence for both the compiler and the CPU, if you care about observers on other cores (or in MMIO even on a single-core uC). Hardware fence instructions aren't needed on a single core or single thread, because even an out-of-order execution CPU sees it's own operations happen in program order.

      Again, if the compiler you got with the toolchain for your particular embedded system doesn't know anything about fences, then it's quite likely it will refrain from doing stuff like this. So you need to examine the documentation and check the compiled assembly.

      For example, the ARM docs state that the processor is "allowed" to reorder instructions and the programmer should take care to add memory barriers, but right after that it also states (under "implementation detail") that Cortex M processors do not reorder instructions. However, they still insist that proper barriers should be inserted, since it will simplify porting to a newer version of the processor.

    2. Depending on your pipeline length, it might take a couple of instructions until the interrupt is fully enabled or disabled, after you make the request. Again, you need to check the docs for this particular uC/compiler, but sometimes there needs to be a fence of some sort after writing to the register. For example, on ARM Cortex you need to issue both DSB and ISB instructions after disabling the interrupt, to make sure that the interrupt doesn't enter in the next several instructions

      // you would have to do this on an ARM Cortex uC
      DisableIRQ(device_IRQn); // Disable certain interrupt by writing to NVIC_CLRENA
      DSB(); // data memory barrier
      ISB(); // instruction synchronization barrier
      
      // <-- this is where the interrupt is "really disabled"
      

      Of course, your library might itself contain all required fence instructions when you call disable_interrupts();, or they might not be needed at all for this architecture.

    3. Increment operation (x++) should not be considered atomic, even it might "accidentally" turn out to be atomic on a certain single-core CPU. As you've noticed, it's not atomic on your particular uC, and the only way to guarantee atomicity is to disable interrupts around this operation.

    So, ultimately, you should make sure you read the documentation for this platform and understand what the compiler can and cannot do. Something that works today might not work tomorrow if the compiler decides to reorder instructions after you've added a seemingly minor change, especially since a race condition might not be frequent enough to detect it immediately.