I have written the following program out of curiousity. It ticks X amount of times every second, and it should be very close to the aimed tick count, but somehow it always lacks behind a lot.
For example for a 2 second benchmark the output is (it should be 2*64 -> 128). Building in release mode doesn't help the issue:
Starting benchmark 2s with length
Benchmark started.
Ticked 83 in 2 seconds
It is the same for other lenghts, for example 5 second yields 196 which should be 320. It should be actually more cause I let it run a little bit more.
Is it because of Arc Mutex
being slow, or is there anything more to this? Where is the bottleneck?
use std::{ time::{ SystemTime, UNIX_EPOCH }, sync::Arc };
use tokio::{ time::{ sleep, Duration }, sync::Mutex };
const TICK_COUNT: u64 = 64;
const BENCHMARK_LENGTH: u64 = 2;
#[tokio::main]
async fn main() {
let tick_count = 0;
let interval = 1000 / TICK_COUNT;
let shared = Arc::new(Mutex::new(tick_count));
let start_time = SystemTime::now();
println!("Starting benchmark {}s with length", BENCHMARK_LENGTH);
let displayed = shared.clone();
let incrementer = shared.clone();
tokio::spawn(async move {
loop {
if
start_time.duration_since(UNIX_EPOCH).unwrap() +
Duration::from_secs(BENCHMARK_LENGTH) <
SystemTime::now().duration_since(UNIX_EPOCH).unwrap()
{
println!("Ticked {} in {} seconds", displayed.lock().await, BENCHMARK_LENGTH);
std::process::exit(0);
}
sleep(Duration::from_millis(500)).await;
}
});
tokio::spawn(async move {
loop {
sleep(Duration::from_millis(interval)).await;
*incrementer.lock().await += 1;
}
});
let _ = tokio::spawn(async move {
println!("Benchmark started.");
loop {
sleep(Duration::from_secs(1)).await;
}
}).await;
}
I have tried your program on my computer.
Duplicating the *incrementer.lock().await += 1;
statements does not change the behaviour (the count is simply multiplied consequently), thus the slowdown is not due to the Mutex
.
Moreover, I replaced the Mutex
by an atomic integer, and the behaviour is still the same.
cargo flamegraph
reports a lot of time spent in a system-call below ThreadParker::futex_wait()
(of parking_lot
), which confirms the intuition I expressed in the comment: most of the time is spent in the OS suspending/resuming the underlying thread of the tokio
runtime.
About the numerical values, I obtain 122 ticks in 2s, not far from the 128 expected.
With 128 TICK_COUNT
, I obtain 240 in 2s instead of 256 and with 256 TICK_COUNT
, I obtain 466 in 2s instead of 512.
This is not ideal, but not very far from the expected values.
Maybe your OS is not as reactive as mine? (I use a quite usual Linux)