ARM assembly: Access array elements residing in a C struct


I have an ARM NEON function that takes a pointer to a C struct as its argument. The struct contains two float* members and one fixed-size float[] array. I can access the elements behind the float* members from my assembly function, but when I try to access elements of the array, my program crashes.

Here is my C side application:

typedef struct{
float*  f1;
float*  f2;
float   f3[4];
}P_STRUCT;

main.c file:

extern void myNeonFunc(P_STRUCT*    p, float* res);
P_STRUCT p;
// memory allocation for f1,f2 and fill array f3 here.
// memory allocation for res

myNeonFunc(&p, res);

And here is my .S file:

.text

.set    P_STRUCT_F1,        0                   @ float* f1
.set    P_STRUCT_F2,        4                   @ float* f2
.set    P_STRUCT_F3,        8                   @ float f3[4]

.globl myNeonFunc

@ void myNeonFunc   (P_STRUCT* p ---->  r0, r1 )

.balign     64                              @ align the function to 64

myNeonFunc:
@save callee-save registers here

ldr         r8,         [r0,P_STRUCT_F1]    @ r8 <- f1 
add         r8,    r8,  #8                  @ r8 points to the f1[2] (2*4 = 8 )
ldr         r9,         [r0,P_STRUCT_F2]    @ r9 <- f2
add         r9,    r9,  #4                  @ r9 points to the f2[1] (1*4 = 4)
ldr         r10,        [r0,P_STRUCT_F3]    @ r10 <- f3
add         r10,   r10, #4                  @ r10 points to the f3[1] (1*4 = 4)

vld1.f32    {d4},       [r8]!               @ d4 now contains the corresponding r8 value
vld1.f32    {d6},       [r9]!               @ d6 now contains the corresponding r9 value
vst1.32     {d4},       [r1]!               @ store f1[2] value in result register
vst1.32     {d6},       [r1]!               @ store f2[1] value in result register

// every thing is ok up to here
// this line probably causes seg fault !!!
vld1.f32    {d8},       [r10]!              @ d8 now contains the corresponding r10 value

//           
vst1.32     {d8},       [r1]!               @ store f3[1] value in result register

// epilog part here ...

I suspect the problem is that r10 does not actually hold the address of the f3 array.

My question: why does accessing the fixed-size array cause a crash here, while accessing elements through the pointers works fine? And what is the solution?


Solution

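  • A pointer is not the same thing as an array. f1 and f2 are 4-byte pointers in the struct: the struct stores only an address, and the floats live elsewhere. f3 is a 16-byte array stored inline in the struct. The struct as a whole is 24 bytes long.

    What your ldr loads into r10 is the first element of f3, that is, the bits of the float f3[0], not an address. The following vld1 then treats that float value as a pointer, which is why it faults. If you want to set r10 to &f3[0], do not load anything: just set r10 to r0 + P_STRUCT_F3 (for example, add r10, r0, #P_STRUCT_F3). To reach f3[1], add a further 4 bytes.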
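
    The same distinction can be sketched in C. This is only an illustration, not code from the question: the helper names f1_elem and f3_elem are hypothetical, and offsetof is used instead of the hard-coded 0/4/8 so the sketch does not assume 4-byte pointers. The pointer member needs a load before indexing; the array member only needs address arithmetic on the struct pointer.

```c
#include <assert.h>   /* assert */
#include <stddef.h>   /* offsetof, size_t */

typedef struct {
    float *f1;
    float *f2;
    float  f3[4];
} P_STRUCT;

/* Pointer member: the struct holds an address, so that address must be
   loaded from the struct first (the job of `ldr r8, [r0, #P_STRUCT_F1]`),
   and only then indexed. */
float *f1_elem(P_STRUCT *p, size_t i)
{
    float *base = *(float **)((char *)p + offsetof(P_STRUCT, f1));
    return base + i;
}

/* Array member: the floats live inside the struct, so the element address
   is computed from p itself with no load (the job of an
   `add r10, r0, #P_STRUCT_F3` style instruction). */
float *f3_elem(P_STRUCT *p, size_t i)
{
    assert(i < 4);    /* f3 has exactly four elements */
    return (float *)((char *)p + offsetof(P_STRUCT, f3)) + i;
}
```

    With these helpers, f3_elem(&p, 1) yields the same address as &p.f3[1], which is the value r10 should hold before the vld1.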