Tags: rust, deadlock, sleep, rust-tokio

Is tokio::time::sleep producing deadlocks in rust?


It seems to me that having too many `tokio::time::sleep` calls in parallel produces a deadlock. What I wanted to do in the example code below is simulate using an event bus to execute code in parallel on the tokio runtime.

use eventador::{Eventador, SinkExt};
use tokio::time::{sleep, Duration, Instant};
use once_cell::sync::Lazy;

static INSTANT: Lazy<Instant> = Lazy::new(|| Instant::now());

pub struct Subscriber {
    eventbus: Eventador,
    i: u16
}

impl Subscriber {
    pub fn new(i: u16, eventbus: Eventador) -> Self {
        Self {
            i, eventbus
        }
    }

    pub async fn start(self) {
        let subscription = self.eventbus.subscribe::<Event>();
        let value = subscription.recv().value;
        println!("pre sleep {} - {}ms since start", self.i, INSTANT.elapsed().as_millis());
        let now = Instant::now();
        sleep(Duration::from_millis(1000)).await;
        println!("{}: {:?} - {}ms - {}ms since start", self.i, value, now.elapsed().as_millis(), INSTANT.elapsed().as_millis());
    }
}

#[derive(Debug)]
pub struct Event {
    pub value: u16
}

#[tokio::main]
async fn main() {
    let eventbus = Eventador::new(1024).unwrap();

    let mut publisher = eventbus.async_publisher::<Event>(512);

    for i in 1..8 {
        let subscriber = Subscriber::new(i, eventbus.clone());
        println!("spawn {}", i);
        tokio::spawn(subscriber.start());
    }

    println!("sending at {}", INSTANT.elapsed().as_millis());
    publisher.send(Event { value: 1234 }).await.expect("Something went wrong");
    println!("send finished");

    sleep(Duration::from_millis(10000)).await;
    println!("sleep finished");
}

The above code will produce this output:

spawn 1
spawn 2
spawn 3
spawn 4
spawn 5
spawn 6
spawn 7
sending at 0
pre sleep 3 - 1ms since start
pre sleep 4 - 1ms since start
pre sleep 1 - 1ms since start
pre sleep 5 - 1ms since start
send finished
pre sleep 2 - 1ms since start
pre sleep 6 - 1ms since start
pre sleep 7 - 1ms since start
5: 1234 - 1000ms - 1002ms since start
1: 1234 - 1000ms - 1002ms since start
4: 1234 - 1000ms - 1002ms since start
3: 1234 - 1000ms - 1002ms since start
7: 1234 - 1001ms - 1003ms since start
6: 1234 - 1001ms - 1004ms since start
2: 1234 - 1002ms - 1004ms since start
sleep finished

This is what I wanted to see. But when I increase the number of subscribers to, say, 10 (in the for loop), I get this output:

spawn 1
spawn 2
spawn 3
spawn 4
spawn 5
spawn 6
spawn 7
spawn 8
spawn 9
spawn 10
sending at 0
pre sleep 4 - 3ms since start
pre sleep 3 - 3ms since start
pre sleep 1 - 3ms since start
pre sleep 5 - 3ms since start
pre sleep 6 - 3ms since start
pre sleep 2 - 3ms since start
pre sleep 7 - 3ms since start
send finished
pre sleep 8 - 3ms since start
7: 1234 - 1001ms - 1004ms since start
3: 1234 - 1001ms - 1005ms since start
6: 1234 - 1001ms - 1005ms since start
4: 1234 - 1002ms - 1005ms since start
1: 1234 - 1002ms - 1005ms since start
2: 1234 - 1002ms - 1005ms since start
8: 1234 - 1001ms - 1005ms since start
5: 1234 - 1001ms - 1005ms since start
sleep finished

Additionally, the program never stops. Why does that matter when I am just running a simulation? Because I want to be sure that in production, when I read, say, 100 files concurrently, I do not run into this deadlock - that would completely invalidate my approach.


Solution

  • The deadlock comes from eventador's sync APIs, not from `sleep`. `subscription.recv()` is a blocking call: it parks the OS worker thread it runs on instead of yielding back to the runtime. Tokio's multi-threaded runtime has a fixed number of worker threads (by default, one per CPU core), so once you spawn more subscribers than there are workers, every worker ends up parked inside `recv()` and no other task can make progress - which is also why `main` cannot shut the runtime down and the program never exits. Use the async versions of the API instead: `async_subscriber()` for receiving (you are already using `async_publisher()` for sending).