How can I convert a u8 mask to a u32 mask with ARM NEON intrinsics?

I have a uint8x8_t mask, obtained from an intrinsic like vcgt_u8(), with values like:

0, 0, 0, 0, 255, 0, 255, 255

I would like to convert this mask into two uint32x4_t masks. It seems vmovl_u8() and vmovl_u16() zero-extend, so each set lane stays 255 instead of becoming 65535 and then 4294967295. How can I do this conversion?

Solution

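A signed widening operation such as vmovl_s8() sign-extends each lane, so an all-ones pattern like 255 (which is -1 when read as signed) becomes 65535, and a second signed widen with vmovl_s16() turns that into 4294967295. So you need to vreinterpret your unsigned vector to signed, widen, and reinterpret back: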

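    uint8x8_t v = ...;  // mask lanes are 0x00 or 0xFF
    // Sign-extend each 8-bit lane to 16 bits: 0xFF (-1) becomes 0xFFFF.
    int16x8_t i = vmovl_s8(vreinterpret_s8_u8(v));
    // Sign-extend each half to 32 bits, then view the result as unsigned.
    uint32x4_t low = vreinterpretq_u32_s32(vmovl_s16(vget_low_s16(i)));
    uint32x4_t high = vreinterpretq_u32_s32(vmovl_s16(vget_high_s16(i)));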
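To see why the signed widen works where the unsigned one does not, here is a rough scalar model in plain C of what happens to each lane. It uses no NEON at all, so it runs on any machine. The function names (widen_lane_unsigned, widen_lane_signed, widen_mask_model) are made up for this illustration and are not part of any NEON API.

```c
#include <stdint.h>

/* Hypothetical scalar model of one lane going through
   vmovl_u8 then vmovl_u16: zero-extension keeps the value as-is. */
static uint32_t widen_lane_unsigned(uint8_t m)
{
    return (uint32_t)m;            /* 255 stays 255 */
}

/* Hypothetical scalar model of one lane going through
   vreinterpret_s8_u8, vmovl_s8, vmovl_s16, vreinterpretq_u32_s32:
   the byte is read as a signed value and sign-extended. */
static uint32_t widen_lane_signed(uint8_t m)
{
    /* Read the byte as two's complement without relying on
       implementation-defined narrowing casts. */
    int32_t s = (m & 0x80) ? (int32_t)m - 256 : (int32_t)m;
    return (uint32_t)s;            /* -1 wraps to 0xFFFFFFFF */
}

/* Apply the signed model to all 8 lanes, splitting the result into
   a low half (lanes 0..3) and a high half (lanes 4..7), mirroring
   vget_low_s16 / vget_high_s16. */
static void widen_mask_model(const uint8_t in[8],
                             uint32_t low[4], uint32_t high[4])
{
    for (int k = 0; k < 4; ++k) {
        low[k]  = widen_lane_signed(in[k]);
        high[k] = widen_lane_signed(in[k + 4]);
    }
}
```

With the mask from the question, the low half comes out as all zeros and the high half as 0xFFFFFFFF, 0, 0xFFFFFFFF, 0xFFFFFFFF. This only works as a mask widen because every lane is either 0x00 or 0xFF; for arbitrary byte values sign extension would change their meaning.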
