In my iOS code, I have a matrix (float *) variable that looks something like this:
[ 1 2 3 4
5 6 7 8
9 0 1 2 ]
I need to build a matrix that has 1's for all the elements equal to a value (let's say 2, for example), and 0's for everything else. So the output would be:
[ 0 1 0 0
0 0 0 0
0 0 0 1 ]
I've been scouring the vDSP docs for a while, but I haven't been able to find an approach to do this. I found the vDSP_vclip() method, but it looks like it would make the values above and below the bounds (i.e., 2) equal to 2. Not exactly what I'm looking for.
Does anyone know how to accomplish this with the Accelerate.framework in iOS? If I'm correct, there's not a direct method for this, but could there be a combination of other methods to accomplish the same thing?
Any advice is much appreciated! I'm totally stuck here.
If you are using the Xcode 6 beta, the clang auto-vectorizer will generate good (though not perfect) vector code for this operation. It won’t be as efficient as an Accelerate call would be, but there isn’t an Accelerate function that does what you want.
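#include <stddef.h>

// Sets ones[i] to 1.0f where matrix[i] == 2.0f, and to 0.0f everywhere else.
// restrict promises the two buffers do not overlap, which helps the vectorizer.
void findTwos(float * restrict matrix, float * restrict ones, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        ones[i] = (matrix[i] == 2.0f);
    }
}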
Compiling with -Ofast, -O3 or -O2 results in decent vector code in my tests (on arm64 and x86_64). If the size of your matrix is known at compile time, replacing the variable size parameter n with a constant length results in vectorization at -Os as well.
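As a rough sketch of that constant-length variant, assuming a 3x4 matrix like the one in the question: the names MATRIX_LEN and findTwosFixed are made up here for illustration and are not part of any library.

```c
#include <stddef.h>

/* Hypothetical compile-time size: 3 rows x 4 columns, as in the question. */
#define MATRIX_LEN 12

/* Same loop as findTwos, but with a constant trip count, so the compiler
   knows the exact length when deciding whether to vectorize. */
void findTwosFixed(const float * restrict matrix, float * restrict ones) {
    for (size_t i = 0; i < MATRIX_LEN; ++i) {
        ones[i] = (matrix[i] == 2.0f) ? 1.0f : 0.0f;
    }
}
```

Whether a given compiler version actually vectorizes this at -Os is worth confirming by inspecting the generated assembly (for example with -S) rather than taking it on faith.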
If this still isn’t fast enough, you can always write your own SIMD code =)
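For what it's worth, a hand-written version might look something like the sketch below. It is only one possible approach, not a tuned implementation: it assumes NEON intrinsics on ARM and SSE intrinsics on x86, falls back to a plain loop elsewhere, and the function name findValueSIMD is hypothetical. The idea is to compare four floats at a time, which yields an all-ones or all-zeros bit mask per lane, and then AND that mask with the bit pattern of 1.0f.

```c
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#elif defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: writes 1.0f where matrix[i] == value, else 0.0f. */
void findValueSIMD(const float *matrix, float *ones, float value, size_t n) {
    size_t i = 0;
#if defined(__ARM_NEON)
    const float32x4_t target  = vdupq_n_f32(value);
    const uint32x4_t  oneBits = vreinterpretq_u32_f32(vdupq_n_f32(1.0f));
    for (; i + 4 <= n; i += 4) {
        /* Lanes that match become 0xFFFFFFFF, all others 0. */
        uint32x4_t mask = vceqq_f32(vld1q_f32(matrix + i), target);
        /* mask AND bits-of-1.0f gives 1.0f in matching lanes, 0.0f elsewhere. */
        vst1q_f32(ones + i, vreinterpretq_f32_u32(vandq_u32(mask, oneBits)));
    }
#elif defined(__SSE2__)
    const __m128 target = _mm_set1_ps(value);
    const __m128 one    = _mm_set1_ps(1.0f);
    for (; i + 4 <= n; i += 4) {
        /* Unaligned loads and stores, so no alignment is assumed. */
        __m128 mask = _mm_cmpeq_ps(_mm_loadu_ps(matrix + i), target);
        _mm_storeu_ps(ones + i, _mm_and_ps(mask, one));
    }
#endif
    /* Scalar loop handles the tail (or everything, without SIMD support). */
    for (; i < n; ++i) {
        ones[i] = (matrix[i] == value) ? 1.0f : 0.0f;
    }
}
```

Calling findValueSIMD(matrix, ones, 2.0f, 12) on the matrix from the question should produce the same result as findTwos. It would be worth benchmarking against the auto-vectorized loop before keeping it, since the compiler's output may already be just as fast.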