I have many 4KiB buffers, which have a 50% chance of containing only zero values. The non-zero buffers will typically have a non-zero byte early in the buffer.
fn is_zero(buf: &Vec<u8>) -> bool {
for byte in buf.into_iter() {
if *byte != 0 {
return false;
}
}
return true;
}
Is this a performant way of checking in Rust, with --release
? (I am processing many GBs of data.)
(In the C version, I cast the buffer to unsigned long long
before checking. That probably wasn't the best I could do, given SSE etc.)
You an use align_to
to convert the slice of u8
into a slice of u128
, making the comparison more efficient:
fn is_zero(buf: &[u8]) -> bool {
let (prefix, aligned, suffix) = unsafe { buf.align_to::<u128>() };
prefix.iter().all(|&x| x == 0)
&& suffix.iter().all(|&x| x == 0)
&& aligned.iter().all(|&x| x == 0)
}
Running a simple benchmark on my machine shows 16x performance gains!
#![feature(test)]
extern crate test;
fn v() -> Vec<u8> {
std::iter::repeat(0).take(1000000).collect()
}
fn is_zero(buf: &[u8]) -> bool {
buf.into_iter().all(|&b| b == 0)
}
fn is_zero_aligned(buf: &[u8]) -> bool {
let (prefix, aligned, suffix) = unsafe { buf.align_to::<u128>() };
prefix.iter().all(|&x| x == 0)
&& suffix.iter().all(|&x| x == 0)
&& aligned.iter().all(|&x| x == 0)
}
#[bench]
fn bench_is_zero(b: &mut test::Bencher) {
let v = test::black_box(v());
b.iter(|| is_zero(&v[..]))
}
#[bench]
fn bench_is_zero_aligned(b: &mut test::Bencher) {
let v = test::black_box(v());
b.iter(|| is_zero_aligned(&v[..]))
}
running 2 tests
test tests::bench_is_zero ... bench: 455,975 ns/iter (+/- 414)
test tests::bench_is_zero_aligned ... bench: 28,615 ns/iter (+/- 116)
Depending on your machine, different integer types (u64
) may yield better performance.
Thanks to @Globi on the Rust discord server for the idea