I have old code with a lot of methods like long[] toLongArray(int[] array),
but for many different combinations of primitive types (on both sides), and I wonder whether it is possible to write one generic method for this without losing performance.
First I created a simple method using MethodHandles for the int[] -> long[] pair:
static final MethodHandle getIntElement = MethodHandles.arrayElementGetter(int[].class);
static final MethodHandle setLongElement = MethodHandles.arrayElementSetter(long[].class);
static long[] specializedMethodHandle(int[] array) throws Throwable {
    long[] newArray = new long[array.length];
    // read each int through the getter, widen it, and store it through the setter
    for (int i = 0; i < array.length; i++) setLongElement.invokeExact(newArray, i, (long) (int) getIntElement.invokeExact(array, i));
    return newArray;
}
It works great, with the same performance as a manual loop, so I decided to make it generic:
static Map<Class<?>, MethodHandle> metHanGettersObj = Map.of(int[].class,
        MethodHandles.arrayElementGetter(int[].class)
                .asType(MethodType.methodType(Object.class, Object.class, int.class)));
static Map<Class<?>, MethodHandle> metHanSettersObj = Map.of(long[].class,
        MethodHandles.arrayElementSetter(long[].class)
                .asType(MethodType.methodType(void.class, Object.class, int.class, Object.class)));
static <F, T> T genericMethodHandleObject(Class<T> to, F array) throws Throwable {
    int length = Array.getLength(array);
    Object newArray = Array.newInstance(to.getComponentType(), length);
    MethodHandle getElement = metHanGettersObj.get(array.getClass());
    MethodHandle setElement = metHanSettersObj.get(to);
    for (int i = 0; i < length; i++) setElement.invokeExact(newArray, i, getElement.invokeExact(array, i));
    return (T) newArray;
}
But this is much slower: for my example array of 500000 elements it was over 15x slower.
Interestingly, a CompiledScript made with the Nashorn JavaScript engine (with a simple copy loop inside) is around 20% faster than this code.
So I wonder if someone knows another way to do this. I will probably not use it anywhere, as it is getting too "hacky", but now I just need to know whether it is possible at all. The non-generic method with method handles works fine, so why is this one so slow, and is it possible to make it faster?
You can bootstrap an array converter method handle and then cache it in a static map.
Here's a benchmark including the code. The convertBootstrap method creates the converter; that's where the real magic happens:
import static java.lang.invoke.MethodHandles.*;
import static java.lang.invoke.MethodType.methodType;

import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.reflect.Array;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

import org.openjdk.jmh.annotations.*;

@BenchmarkMode({ Mode.AverageTime })
@Warmup(iterations = 10, batchSize = 1)
@Measurement(iterations = 10, batchSize = 1)
@Fork(1)
@State(Scope.Thread)
public class MyBenchmark {

    int[] input;

    static final Map<Class<?>, Map<Class<?>, Function<?, ?>>> cacheGeneric = new HashMap<>();

    @Setup
    public void setup() {
        input = new Random(1).ints().limit(500_000).toArray();
    }

    @Benchmark
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    public long[] manual() {
        long[] result = new long[input.length];
        for (int i = 0; i < input.length; i++) {
            result[i] = input[i];
        }
        return result;
    }

    @Benchmark
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    public long[] cachedGeneric() {
        return getWrapped(int[].class, long[].class).apply(input);
    }

    @Benchmark
    @OutputTimeUnit(TimeUnit.MILLISECONDS)
    public long[] reflective() throws Throwable {
        return genericMethodHandleObject(long[].class, input);
    }

    // The Object-typed variant from the question, kept for comparison.
    static Map<Class<?>, MethodHandle> metHanGettersObj = Map.of(int[].class,
            MethodHandles.arrayElementGetter(int[].class)
                    .asType(MethodType.methodType(Object.class, Object.class, int.class)));
    static Map<Class<?>, MethodHandle> metHanSettersObj = Map.of(long[].class,
            MethodHandles.arrayElementSetter(long[].class)
                    .asType(MethodType.methodType(void.class, Object.class, int.class, Object.class)));

    static <F, T> T genericMethodHandleObject(Class<T> to, F array) throws Throwable {
        int length = Array.getLength(array);
        Object newArray = Array.newInstance(to.getComponentType(), length);
        MethodHandle getElement = metHanGettersObj.get(array.getClass());
        MethodHandle setElement = metHanSettersObj.get(to);
        for (int i = 0; i < length; i++) setElement.invokeExact(newArray, i, getElement.invokeExact(array, i));
        return (T) newArray;
    }

    @SuppressWarnings("unchecked")
    public static <F, T> Function<F, T> getWrapped(Class<F> from, Class<T> to) {
        return (Function<F, T>) cacheGeneric.computeIfAbsent(from, k -> new HashMap<>())
            .computeIfAbsent(
                to, k -> {
                    MethodHandle mh = convertBootstrap(from, to);
                    return arr -> {
                        try {
                            return (T) mh.invoke(arr);
                        } catch (Throwable e) {
                            throw new RuntimeException(e);
                        }
                    };
                });
    }

    public static MethodHandle convertBootstrap(Class<?> from, Class<?> to) {
        MethodHandle getter = arrayElementGetter(from);
        MethodHandle setter = arrayElementSetter(to);
        MethodHandle body = explicitCastArguments(setter, methodType(void.class, to, int.class, from.getComponentType()));
        body = collectArguments(body, 2, getter); // get from 1 array, set in other
        body = permuteArguments(body, methodType(void.class, to, int.class, from), 0, 1, 2, 1);
        body = collectArguments(identity(to), 1, body); // create pass-through for first argument
        body = permuteArguments(body, methodType(to, to, int.class, from), 0, 0, 1, 2);

        MethodHandle lenGetter = arrayLength(from);
        MethodHandle cons = MethodHandles.arrayConstructor(to);
        MethodHandle init = collectArguments(cons, 0, lenGetter);

        MethodHandle loop = countedLoop(lenGetter, init, body);
        return loop;
    }
}
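To try the converter outside JMH, here is a minimal standalone sketch. It repeats the convertBootstrap construction from the benchmark, with the intermediate method types noted in comments. The class name ConverterDemo and the convert helper are made up for this illustration, and it assumes Java 9 or later (arrayLength, arrayConstructor and countedLoop do not exist before that). Unlike getWrapped, it builds a fresh handle on every call, so it shows only that the conversion is correct, not how fast it is.

```java
import static java.lang.invoke.MethodHandles.*;
import static java.lang.invoke.MethodType.methodType;

import java.lang.invoke.MethodHandle;
import java.util.Arrays;

public class ConverterDemo {

    // Same construction as convertBootstrap in the benchmark.
    static MethodHandle convertBootstrap(Class<?> from, Class<?> to) {
        MethodHandle getter = arrayElementGetter(from);       // (from, int)elem
        MethodHandle setter = arrayElementSetter(to);         // (to, int, toElem)void
        // Let the setter accept the source element type, e.g. int instead of long.
        MethodHandle body = explicitCastArguments(setter,
                methodType(void.class, to, int.class, from.getComponentType()));
        body = collectArguments(body, 2, getter);             // (to, int, from, int)void
        body = permuteArguments(body,
                methodType(void.class, to, int.class, from), 0, 1, 2, 1); // share the index
        body = collectArguments(identity(to), 1, body);       // (to, to, int, from)to
        body = permuteArguments(body,
                methodType(to, to, int.class, from), 0, 0, 1, 2);         // (to, int, from)to

        MethodHandle lenGetter = arrayLength(from);           // (from)int
        MethodHandle init = collectArguments(arrayConstructor(to), 0, lenGetter); // (from)to
        return countedLoop(lenGetter, init, body);            // (from)to
    }

    // Hypothetical helper: hides the checked Throwable of MethodHandle.invoke.
    static Object convert(Object source, Class<?> to) {
        try {
            return convertBootstrap(source.getClass(), to).invoke(source);
        } catch (Throwable e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        long[] widened = (long[]) convert(new int[] {1, -2, 3}, long[].class);
        System.out.println(Arrays.toString(widened));   // prints [1, -2, 3]
        double[] asDoubles = (double[]) convert(new short[] {4, 5}, double[].class);
        System.out.println(Arrays.toString(asDoubles)); // prints [4.0, 5.0]
    }
}
```

In real code you would keep the cache from getWrapped, so that each (from, to) pair is bootstrapped only once.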
Benchmark results are about the same for my method and manual (lower score is better):
# JMH version: 1.19
# VM version: JDK 10.0.1, VM 10.0.1+10
Benchmark                  Mode  Cnt   Score   Error  Units
MyBenchmark.cachedGeneric  avgt   10   1.175 ± 0.046  ms/op
MyBenchmark.manual         avgt   10   1.149 ± 0.098  ms/op
MyBenchmark.reflective     avgt   10  10.165 ± 0.665  ms/op
I was actually really surprised how well this is being optimized (unless I made a mistake in the benchmark somewhere, but I can't find it). If you increase the number of elements to 5 million you can see the difference again:
Benchmark                  Mode  Cnt    Score    Error  Units
MyBenchmark.cachedGeneric  avgt   10  277.764 ± 14.217  ms/op
MyBenchmark.manual         avgt   10   14.851 ±  0.317  ms/op
MyBenchmark.reflective     avgt   10   76.599 ±  3.695  ms/op
Those numbers suggest to me that some loop unrolling, inlining, or other limit is being hit, since cachedGeneric is suddenly almost 19x slower than manual (and even slower than reflective).
You will probably also see a performance drop when the array types are not statically known.
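As for why the Object-typed variant from the question is so slow, one plausible contributor (a guess, not something the benchmark above measures) is boxing. Once the getter has been adapted with asType to (Object, int)Object, every int element comes back as a java.lang.Integer, and the adapted setter has to unbox it again. The handles are also fetched from a Map at run time rather than read from static final fields, which generally makes it harder for the JIT to inline them and remove that overhead. The sketch below shows only the boxing part; BoxingDemo and readErased are hypothetical names for this illustration.

```java
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;

public class BoxingDemo {

    // Getter erased to (Object, int)Object, like the entries of metHanGettersObj.
    static final MethodHandle ERASED_GETTER = MethodHandles.arrayElementGetter(int[].class)
            .asType(MethodType.methodType(Object.class, Object.class, int.class));

    // Hypothetical helper: reads one element through the erased handle.
    static Object readErased(Object array, int index) {
        try {
            // The call site type is (Object, int)Object, matching the handle exactly.
            return (Object) ERASED_GETTER.invokeExact(array, index);
        } catch (Throwable e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Object element = readErased(new int[] {42}, 0);
        // The primitive int comes back wrapped in a java.lang.Integer.
        System.out.println(element.getClass().getName()); // prints java.lang.Integer
    }
}
```

The handle built by convertBootstrap keeps primitive element types end to end, so it has no such boxing step to begin with.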