Web-scraping with headless-chrome (Rust), clicking doesn't seem to work

I'm relatively new to Rust and completely new to the web (and to scraping). I tried to implement a web scraper as a pet project to get more comfortable with Rust and with the web stack.

I use headless-chrome to visit a website and scrape a list of links from it, which I will investigate later. So I open a tab, navigate to the website, scrape the URLs, and finally want to click the next button. Even though I find the next button (with a CSS selector) and call click() on it, nothing happens. In the next iteration I scrape the same list again (so it clearly didn't move to the next page).

use headless_chrome::Tab;
use std::error::Error;
use std::sync::Arc;
use std::{thread, time};

pub fn scrape(tab: Arc<Tab>) {
    let url = "";

    if let Err(_) = tab.navigate_to(url) {
        println!("Failed to navigate to {}", url);
        return;
    }

    if let Err(e) = tab.wait_until_navigated() {
        println!("Failed to wait for navigation: {}", e);
        return;
    }

    if let Ok(gdpr_accept_button) = tab.wait_for_element(".sc-gsDKAQ.fILFKg") {
        if let Err(e) = gdpr_accept_button.click() {
            println!("Failed to click GDPR accept button: {}", e);
        }
    } else {
        println!("No GDPR popup to acknowledge found.");
    }

    let mut links = Vec::<String>::new();
    loop {
        let mut skipped: usize = 0;
        let new_urls_count: usize;
        match parse_list(&tab) {
            Ok(urls) => {
                new_urls_count = urls.len();
                for url in urls {
                    if !links.contains(&url) {
                        links.push(url);
                    } else {
                        skipped += 1;
                    }
                }
            }
            Err(_) => {
                println!("No more houses found: stopping");
                break;
            }
        }

        if skipped == new_urls_count {
            println!("Only previously loaded houses found: stopping");
            break;
        }

        if let Ok(button) = tab.wait_for_element("[class=\"arrowButton-20ae5\"]") {
            if let Err(e) = button.click() {
                println!("Failed to click next page button: {}", e);
                break;
            } else {
                println!("Clicked next page button");
            }
        } else {
            println!("No next page button found: stopping");
            break;
        }

        if let Err(e) = tab.wait_until_navigated() {
            println!("Failed to load next page: {}", e);
            break;
        }
    }

    println!("Found {} houses:", links.len());
    for link in links {
        println!("\t{}", link);
    }
}

fn parse_list(tab: &Arc<Tab>) -> Result<Vec<String>, Box<dyn Error>> {
    let elements = tab.find_elements("div[class*=\"EstateItem\"] > a")?; //".EstateItem-1c115"

    let mut links = Vec::<String>::new();
    for element in elements {
        // The arguments after the JS source depend on the headless_chrome version in use.
        if let Some(url) = element
            .call_js_fn(&"function() {{ return this.getAttribute(\"href\"); }}", vec![], false)?
            .value
        {
            links.push(url.to_string());
        }
    }
    Ok(links)
}


When I call this code in main, I get the following output:

No GDPR popup to acknowledge found.
Clicked next page button
Only previously loaded houses found: stopping
Found 20 houses:

My problem is that I don't understand why clicking the next button doesn't do anything. As I am new to both Rust and web applications, I can't tell whether the problem is in how I use the crate (headless-chrome) or in my understanding of web scraping.


  • tl;dr: replace the code that clicks the next-page button with this:

    if let Ok(button) = tab.wait_for_element(r#"*[class^="Pagination"] button:last-child"#) {
        // Expl: the left and right arrow buttons share the same class, so the original selector cannot tell them apart.
        if let Err(e) = button.click() {
            println!("Failed to click next page button: {}", e);
            break;
        } else {
            println!("Clicked next page button");
        }
    } else {
        println!("No next page button found: stopping");
        break;
    }
    // Expl: Rust is too fast, so we need to wait for the page to load
    std::thread::sleep(std::time::Duration::from_secs(5)); // Wait for 5 seconds
    if let Err(e) = tab.wait_until_navigated() {
        println!("Failed to load next page: {}", e);
        break;
    }

    1. The original code would click the right button on the first page and the left button on every page after that: the CSS selector matches the left button as well, and because it comes first in the DOM tree, it is the one returned.
    2. The original code is simply too fast. Chrome needs a moment to load the next page. Should you find the fixed sleep too slow, check the event here and wait for the browser to emit that event instead.
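
As an alternative to a fixed five-second sleep, the waiting in point 2 can be sketched as a polling loop. This is only a minimal sketch that uses the standard library: `wait_until` is a hypothetical helper, not part of headless-chrome, and polling is a stand-in for the event-based waiting mentioned above. The closure in `main` merely simulates a condition such as "the first link on the page has changed".

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Polls `condition` every `interval` until it returns true or `timeout` elapses.
/// Returns true if the condition was met in time, false on timeout.
fn wait_until<F>(mut condition: F, timeout: Duration, interval: Duration) -> bool
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    loop {
        if condition() {
            return true;
        }
        if Instant::now() >= deadline {
            return false;
        }
        thread::sleep(interval);
    }
}

fn main() {
    // Stand-in for a real page check: the condition becomes true on the third poll.
    let mut polls = 0;
    let changed = wait_until(
        || {
            polls += 1;
            polls >= 3
        },
        Duration::from_secs(5),
        Duration::from_millis(10),
    );
    println!("page changed: {changed}, polls: {polls}");
}
```

In the scraper, the closure could compare the first URL returned by `parse_list` with the one from the previous page, so the loop continues as soon as the new list has rendered and gives up after the timeout.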

    As a final suggestion, all the work above is unnecessary: the URL evidently follows a pattern like this: {PAGINATION}. You can find every page of this site simply by scraping the pagination elements, so you might as well ditch Chrome, perform basic HTTP requests, and parse the returned HTML. For this purpose, check out an HTTP client such as reqwest together with an HTML parser. If performance is mission critical for this spider, reqwest can also be used with tokio to scrape the pages asynchronously and concurrently.
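
To make the pagination idea concrete, here is a minimal sketch in plain Rust with no HTTP involved. It assumes the page body contains a `"pagesCount": N` field and that pages are selected with an `sf` query parameter, as in the implementations in this answer. The function names, the base URL, and the body fragment are hypothetical and only serve as illustration.

```rust
/// Extracts the integer that follows `"pagesCount":` in the page body, if any.
fn extract_page_count(body: &str) -> Option<u32> {
    let key = "\"pagesCount\":";
    let start = body.find(key)? + key.len();
    let digits: String = body[start..]
        .trim_start()
        .chars()
        .take_while(|c| c.is_ascii_digit())
        .collect();
    digits.parse().ok()
}

/// Builds one URL per page by appending the `sf` query parameter.
fn page_urls(base: &str, pages: u32) -> Vec<String> {
    (1..=pages).map(|i| format!("{base}&sf={i}")).collect()
}

fn main() {
    // Hypothetical body fragment and base URL, for illustration only.
    let body = r#"{"pagesCount": 3, "items": []}"#;
    let pages = extract_page_count(body).unwrap_or(1);
    for url in page_urls("https://example.com/list?type=house", pages) {
        println!("{url}");
    }
}
```

Once the list of page URLs is known up front, each page can be fetched independently, which is what makes the concurrent approach possible.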


    Below are Rust and Python implementations of the suggestion above. Rust libraries that parse HTML/XML and evaluate XPath seem to be rare and relatively unreliable, however.

    use reqwest::Client;
    use std::error::Error;
    use std::sync::Arc;
    use sxd_xpath::{Context, Factory, Value};

    async fn get_page_count(client: &reqwest::Client, url: &str) -> Result<i32, Box<dyn Error>> {
        let res = client.get(url).send().await?;
        let body = res.text().await?;
        // Take the digits that follow `"pagesCount":` in the response body.
        let pages_count = body
            .split("\"pagesCount\":")
            .nth(1)
            .ok_or("pagesCount not found")?
            .trim_start()
            .chars()
            .take_while(|c| c.is_ascii_digit())
            .collect::<String>()
            .parse::<i32>()?;
        Ok(pages_count)
    }

    async fn scrape_one(client: &Client, url: &str) -> Result<Vec<String>, Box<dyn Error>> {
        let res = client.get(url).send().await?;
        let body = res.text().await?;
        let package = sxd_html::parse_html(&body);
        let doc = package.as_document();
        let factory = Factory::new();
        let ctx = Context::new();
        let houses_selector = factory
            .build("//*[contains(@class, 'EstateItem')]")?
            .unwrap();
        let houses = houses_selector.evaluate(&ctx, doc.root())?;
        if let Value::Nodeset(houses) = houses {
            let mut data = Vec::new();
            for house in houses {
                // Evaluate the title and link relative to each matched house node.
                let title_selector = factory.build(".//h2/text()")?.unwrap();
                let title = title_selector.evaluate(&ctx, house)?.string();
                let a_selector = factory.build(".//a/@href")?.unwrap();
                let href = a_selector.evaluate(&ctx, house)?.string();
                data.push(format!("{} - {}", title, href));
            }
            return Ok(data);
        }
        Err("No data found".into())
    }

    #[tokio::main]
    async fn main() -> Result<(), Box<dyn Error>> {
        let url = "";
        let client = reqwest::Client::builder()
            .user_agent(
                "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/113.0",
            )
            .build()?;
        let client = Arc::new(client);
        let page_count = get_page_count(&client, url).await?;
        // Spawn one task per page so the pages are fetched concurrently.
        let mut tasks = Vec::new();
        for i in 1..=page_count {
            let url = format!("{}&sf={}", url, i);
            let client = client.clone();
            tasks.push(tokio::spawn(async move {
                scrape_one(&client, &url).await.unwrap()
            }));
        }
        let results = futures::future::join_all(tasks).await;
        for result in results {
            println!("{:?}", result?);
        }
        Ok(())
    }
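
    import asyncio
    import re

    import aiohttp  # assumed client: the awaitable req.text() matches aiohttp's API
    from lxml import etree

    async def page_count(session, url):
        req = await session.get(url)
        return int(re.search(r'"pagesCount":\s*(\d+)', await req.text()).group(1))

    async def scrape_one(session, url):
        req = await session.get(url)
        tree = etree.HTML(await req.text())
        houses = tree.xpath("//*[contains(@class, 'EstateItem')]")
        data = [
            dict(title=house.xpath(".//h2/text()")[0], href=house.xpath(".//a/@href")[0])
            for house in houses
        ]
        return data

    async def main():
        url = ""
        async with aiohttp.ClientSession() as session:
            pages = await page_count(session, url + "&sf=1")
            return await asyncio.gather(
                *(scrape_one(session, url + f"&sf={i}") for i in range(1, pages + 1))
            )

    result = asyncio.run(main())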