Is there something like "emms" on NEON?


On ARM, the NEON registers q0~q7 alias the floating-point registers s0~s31, so the code below has a bug:

float_t fHorRatio = (float_t)srcWidth/dstWidth;

// NEON asm that modifies q0~q7
MyNeonFunctionPtr1(pData, Stride, (int32_t)(fHorRatio*m_iHorScale));

//  The following statement uses a wrong "fHorRatio",
//  which was modified by "MyNeonFunctionPtr1".

int32_t vertStepLuma = (int32_t)(fHorRatio*m_iVertScale);

On x86, emms can solve this. How do I do the same on NEON? My temporary workaround is to declare vertStepLuma volatile. Is there a better way? Thanks!


Solution

  • Are you using gcc inline assembly? Then use a clobber list. It tells GCC which registers your asm block modifies, so GCC won't keep live values in them across the block. Read here: http://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html#ss5.3

    Otherwise, if it is an external function implemented elsewhere, the ABI (AAPCS) dictates that it may corrupt q0-q3 (and q8-q15, if present) but must preserve q4, q5, q6 and q7 (d8-d15); see "ARM to C calling convention, NEON registers to save". Fix the function to preserve those registers (for example, vpush {d8-d15} on entry and vpop {d8-d15} before returning), or wrap the call in inline assembly where you save these registers yourself.
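
To illustrate the clobber-list approach, here is a minimal sketch, assuming GCC or Clang targeting 32-bit ARM with NEON enabled. The names double4 and scaled_step_after_neon are hypothetical and not from the question. The asm block only touches q0 and lists it as clobbered, so the compiler knows not to keep fHorRatio there across the block. On other targets the sketch falls back to a plain loop so that it still compiles.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical NEON helper: doubles four floats in place. */
void double4(float *p)
{
    assert(p != 0);
#if defined(__arm__) && (defined(__ARM_NEON) || defined(__ARM_NEON__))
    __asm__ volatile(
        "vld1.32  {d0, d1}, [%0]  \n\t"  /* load 4 floats into q0 (d0:d1) */
        "vadd.f32 q0, q0, q0      \n\t"  /* q0 = q0 + q0                  */
        "vst1.32  {d0, d1}, [%0]  \n\t"  /* store the 4 floats back       */
        :                                /* no outputs                    */
        : "r"(p)                         /* input: pointer to the data    */
        : "q0", "memory");               /* clobbers: q0 and memory       */
#else
    /* Portable fallback so the sketch builds on non-ARM targets. */
    for (int i = 0; i < 4; ++i)
        p[i] += p[i];
#endif
}

/* Mirrors the pattern in the question: a float computed before the
 * NEON code is used again after it. */
int32_t scaled_step_after_neon(int32_t srcWidth, int32_t dstWidth,
                               int32_t scale, float *pData)
{
    float fHorRatio = (float)srcWidth / (float)dstWidth;

    double4(pData);  /* NEON code runs here */

    /* Safe: the clobber list told the compiler that q0 was modified. */
    return (int32_t)(fHorRatio * (float)scale);
}
```

Every q register the asm block writes has to appear in the clobber list. If one is missing, the compiler may still keep a live float in it and you get the same bug again.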