I want to use volatile bit-field struct to set hardware register like following code
union foo {
uint32_t value;
struct {
uint32_t x : 1;
uint32_t y : 3;
uint32_t z : 28;
};
};
union foo f = {0};
int main()
{
volatile union foo *f_ptr = &f;
//union foo tmp;
*f_ptr = (union foo) {
.x = 1,
.y = 7,
.z = 10,
};
//*f_ptr = tmp;
return 0;
}
However, the compiler will make it to STR, LDR HW register several times. It is a terrible things that it will trigger hardware to work at once when the register is writed.
main:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
movw r3, #:lower16:.LANCHOR0
movs r0, #0
movt r3, #:upper16:.LANCHOR0
ldr r2, [r3]
orr r2, r2, #1
str r2, [r3]
ldr r2, [r3]
orr r2, r2, #14
str r2, [r3]
ldr r2, [r3]
and r2, r2, #15
orr r2, r2, #160
str r2, [r3]
bx lr
.size main, .-main
.global f
.bss
.align 2
My gcc version is : arm-linux-gnueabi-gcc (Linaro GCC 4.9-2017.01) 4.9.4 and build with -O2 optimation
I have tried to use the local variable to resolve this problem
union foo {
uint32_t value;
struct {
uint32_t x : 1;
uint32_t y : 3;
uint32_t z : 28;
};
};
union foo f = {0};
int main()
{
volatile union foo *f_ptr = &f;
union foo tmp;
tmp = (union foo) {
.x = 1,
.y = 7,
.z = 10,
};
*f_ptr = tmp;
return 0;
}
Well, it will not STR to HW register several times
main:
@ args = 0, pretend = 0, frame = 0
@ frame_needed = 0, uses_anonymous_args = 0
@ link register save eliminated.
movs r1, #10
movs r2, #15
movw r3, #:lower16:.LANCHOR0
bfi r2, r1, #4, #28
movt r3, #:upper16:.LANCHOR0
movs r0, #0
str r2, [r3]
bx lr
.size main, .-main
.global f
.bss
.align 2
I think it is still not a good idea to use local variable, considering the limitation of binary size for embedded system.
Is there any way to handle this problem without using local variable?
I think this is a bug in GCC. Per discussion below, you might consider using:
f_ptr->value = (union foo) {
.x = 1,
.y = 7,
.z = 10,
} .value;
By the C standard, the code a compiler generates for a program may not access a volatile object when the original C code nominally does not access the object. The code *f_ptr = (union foo) { .x = 1, .y = 7, .z = 10, };
is a single assignment to *f_ptr
. So we would expect this to generate a single store to *f_ptr
; generating two stores is a violation of the standard’s requirements.
We could consider an explanation for this to be that GCC is treating the aggregate (the union and/or the structure within it) as several objects, each individually volatile, rather than one aggregated volatile object.1 But, if this were so, then it ought to generate separate 16-bit strh
instructions for the parts (per the original example code, which had 16-bit parts), not the 32-bit str
instructions we see.
While using a local variable appears to work around the issue, I would not rely on that, because the assignment of the compound literal above is semantically equivalent, so the cause of why GCC generates broken assembly code for one sequence of code and not the other is unclear. With different circumstances (such as additional or modified code in the function or other variations that might affect optimization), GCC might generate broken code with the local variable too.
What I would do is avoid using an aggregate for the volatile object. The hardware register is, presumably, physically more like a 32-bit unsigned integer than like a structure of bit-fields (even though semantically it is defined with bit-fields). So I would define the register as volatile uint32_t
and use that type when assigning values to it. Those values could be prepared with bit shifts or structures with bit-fields or whatever other method you prefer.
It should not be necessary to avoid using local variables, as the optimizer should effectively eliminate them. However, if you wish to neither change the register definition nor use local variables, an alternative is the code I opened with:
f_ptr->value = (union foo) {
.x = 1,
.y = 7,
.z = 10,
} .value;
That prepares the value to be stored but then assigns it using the uint32_t
member of the union rather than using the whole union, and testing with ARM GCC 4.6.4 on Compiler Explorer (the closest match I could find on Compiler Explorer to what you are using) suggests it generates a single store with minimal code:
main: ldr r3, .L2 mov r2, #175 str r2, [r3, #0] mov r0, #0 bx lr .L2: .word .LANCHOR0 .LANCHOR0 = . + 0 f:
1 I would consider this to a bug too, as the C standard does not make provision for applying the volatile
qualifier on a union or structure declaration as applying to the members rather than to the whole aggregate. For arrays, it does say that qualifiers apply to the elements, rather than the whole array (C 2018 6.7.3 10). It has no such wording for unions or structures.