
future cannot be sent between threads safely after Mutex


I've been trying to move from postgres to tokio_postgres, but I'm struggling with some of the async parts.

use scraper::Html;
use std::sync::Arc;
use tokio::sync::Mutex;
use tokio::task;

struct Url {}
impl Url {
    fn scrapped_home(&self, symbol: String) -> Html {
        let url = format!(
            "https://finance.yahoo.com/quote/{}?p={}&.tsrc=fin-srch", symbol, symbol
        );
        
        let response = reqwest::blocking::get(url).unwrap().text().unwrap();

        scraper::Html::parse_document(&response)
    }
}

#[derive(Clone)]
struct StockData {
    symbol: String,
}

#[tokio::main]
async fn main() {
    let stock_data = StockData { symbol: "".to_string() };
    let url = Url {};
    
    let mut uri_test: Arc<Mutex<Html>> = Arc::new(Mutex::from(url.scrapped_home(stock_data.clone().symbol)));
    let mut uri_test_closure = Arc::clone(&uri_test);

    let uri = task::spawn_blocking(|| {
        uri_test_closure.lock()
    });
}

Without putting a Mutex around

url.scrapped_home(stock_data.clone().symbol)

I would get the error that a runtime cannot be dropped in a context where blocking is not allowed, so I put it inside spawn_blocking. Then I got the error that Cell cannot be shared between threads safely. This, from what I could gather, is because Cell isn't Sync. I then wrapped it in a Mutex. This, on the other hand, throws 'future cannot be sent between threads safely'.

Now, is that because it contains a reference to a Cell and therefore isn't memory-safe? If so, would I need to implement Sync for Html? And how?

Html is from the scraper crate.

UPDATE:

Sorry, here's the error.

error: future cannot be sent between threads safely
   --> src/database/queries.rs:141:40
    |
141 |           let uri = task::spawn_blocking(|| {
    |  ________________________________________^
142 | |             uri_test_closure.lock()
143 | |         });
    | |_________^ future is not `Send`
    |
    = help: within `tendril::tendril::NonAtomic`, the trait `Sync` is not implemented for `Cell<usize>`
note: required by a bound in `spawn_blocking`
   --> /home/a/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.20.1/src/task/blocking.rs:195:12
    |
195 |         R: Send + 'static,
    |            ^^^^ required by this bound in `spawn_blocking`

UPDATE:

Adding Cargo.toml as requested:

[package]
name = "reprod"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
reqwest = { version = "0.11", features = ["json", "blocking"] }
tokio = { version = "1", features = ["full"] }
tokio-postgres = "0"
scraper = "0.12.0"

UPDATE: Added original sync code:

fn main() {
    let stock_data = StockData { symbol: "".to_string() };
    let url = Url {};
    
    url.scrapped_home(stock_data.clone().symbol);
}

UPDATE: Thanks to Kevin, I was able to get it to work. As he pointed out, Html is neither Send nor Sync. This part of the Rust lang doc helped me understand how message passing works.

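use std::sync::mpsc;
use std::thread;

pub fn scrapped_home(&self, symbol: String) -> Html {
    let (tx, rx) = mpsc::channel();

    let url = format!(
        "https://finance.yahoo.com/quote/{}?p={}&.tsrc=fin-srch", symbol, symbol
    );

    // Run the blocking request on its own thread and send back only the String,
    // which is Send. Html is then built on the calling thread.
    thread::spawn(move || {
        let val = reqwest::blocking::get(url).unwrap().text().unwrap();
        tx.send(val).unwrap();
    });

    scraper::Html::parse_document(&rx.recv().unwrap())
}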

Afterwards I had some sort of epiphany and got it to work with tokio as well, without message passing:

pub async fn scrapped_home(&self, symbol: String) -> Html {
    let url = format!(
        "https://finance.yahoo.com/quote/{}?p={}&.tsrc=fin-srch", symbol, symbol
    );

    let response = task::spawn_blocking(move || {
        reqwest::blocking::get(url).unwrap().text().unwrap()
    }).await.unwrap();

    scraper::Html::parse_document(&response)
}

I hope that this might help someone.
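
For anyone wondering why both versions compile, here is a minimal std-only sketch of the same pattern. ParsedDoc and fetch_body are hypothetical stand-ins (for Html and the reqwest call, respectively) and are not part of scraper or reqwest; the Rc field is only there to make the struct !Send. The point is that only a String crosses the thread boundary, and the !Send value is built on the thread that uses it.

```rust
use std::rc::Rc;
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a `!Send` type such as `Html`:
/// the `Rc` inside makes the whole struct `!Send` and `!Sync`.
struct ParsedDoc {
    title: Rc<str>,
}

/// Hypothetical stand-in for the blocking HTTP request; it only builds a `String`.
fn fetch_body(symbol: &str) -> String {
    format!("<title>{}</title>", symbol)
}

/// Does the blocking work on a worker thread, sends only the `String` back,
/// and constructs the `!Send` value on the calling thread.
fn load_doc(symbol: String) -> ParsedDoc {
    let (tx, rx) = mpsc::channel();

    // Only `symbol` and `tx` move into the worker; both are `Send`.
    let worker = thread::spawn(move || {
        tx.send(fetch_body(&symbol)).unwrap();
    });

    let body = rx.recv().unwrap();
    worker.join().unwrap();

    // "Parsing" happens here, so `ParsedDoc` never crosses a thread boundary.
    let title = body
        .trim_start_matches("<title>")
        .trim_end_matches("</title>");
    ParsedDoc { title: Rc::from(title) }
}
```

Calling load_doc("AAPL".to_string()) from any thread gives a ParsedDoc that stays on that thread. Moving the ParsedDoc itself into thread::spawn would be rejected by the compiler for the same reason Html was.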


Solution

  • This illustrates it a bit more clearly now: you're trying to return a tokio::sync::MutexGuard across a thread boundary. When you call this:

        let mut uri_test: Arc<Mutex<Html>> = Arc::new(Mutex::from(url.scrapped_home(stock_data.clone().symbol)));
        let mut uri_test_closure = Arc::clone(&uri_test);
    
        let uri = task::spawn_blocking(|| {
            uri_test_closure.lock()
        });
    

    The uri_test_closure.lock() call (tokio::sync::Mutex::lock()) doesn't have a semicolon, which means the closure returns the result of that call. Here that result is the future produced by lock(), which resolves to a MutexGuard, and you can't return it across a thread boundary because it isn't Send.

    I suggest you read up on the linked lock() call, as well as blocking_lock() and the related methods documented there.

    I'm not certain of the point of your call to task::spawn_blocking here. If you're trying to illustrate a use case for something, that's not coming across.

    Edit:

    The problem is deeper. Html is both !Send and !Sync, which means wrapping it in an Arc<Mutex<Html>> or Arc<Mutex<Option<Html>>> or whatever won't let you move it to another thread. You need to get the data from another thread in another way, and not as that "whole" object. See this post on the rust user forum for more detailed information. But whatever you're wrapping must be Send, and that struct is explicitly not.

    So if a type is Send and !Sync, you can wrap it in a Mutex and an Arc. But if it's !Send, you're hooped, and need to use message passing (sending only Send data, such as the raw String), or other synchronization mechanisms.
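
To make the Send and !Sync case concrete, here is a small std-only sketch. It uses Cell<usize> (the type named in the error message) as the example, and std::sync::Mutex rather than tokio's, purely to keep it dependency-free. require_send, require_sync and count_with_workers are hypothetical helper names made up for illustration.

```rust
use std::cell::Cell;
use std::sync::{Arc, Mutex};
use std::thread;

// Compile-time probes: a call only type-checks if `T` meets the bound.
fn require_send<T: Send>() {}
fn require_sync<T: Sync>() {}

/// Bumps a shared `Cell<usize>` once from each of `workers` threads
/// and returns the final value.
fn count_with_workers(workers: usize) -> usize {
    // `Cell<usize>` is `Send` but `!Sync`; wrapping it in a `Mutex` makes it `Sync`.
    require_send::<Cell<usize>>();
    require_sync::<Mutex<Cell<usize>>>();
    // Uncommenting either line below fails to compile:
    // require_sync::<Cell<usize>>();                      // Cell is !Sync
    // require_send::<Arc<Mutex<std::rc::Rc<u8>>>>();      // Rc is !Send, Mutex can't fix that

    let counter = Arc::new(Mutex::new(Cell::new(0usize)));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                // The guard gives exclusive access to the inner `Cell`.
                let guard = counter.lock().unwrap();
                guard.set(guard.get() + 1);
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }

    let total = counter.lock().unwrap().get();
    total
}
```

The commented-out Rc line is the situation Html is in: a Mutex only adds Sync to a type that is already Send, so it cannot rescue a !Send type.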