python, web-scraping, rust, request, encode

Why does the Rust equivalent of this Python code not work?


I have Python code that fetches the HTML body of a website, with the request routed through Tor. This is the Python code:

import requests

session = requests.session()
session.proxies = {'http':  'socks5h://127.0.0.1:9050',
                    'https': 'socks5h://127.0.0.1:9050'}
r = session.get("https://einthusan.tv/movie/results/?lang=kannada&query=love")
print(r.text)

And this is the Rust code I wrote:

use std::fs;
use reqwest::blocking::Client;
fn main() -> Result<(), reqwest::Error> {
    let proxy = reqwest::Proxy::all("socks5h://127.0.0.1:9050").expect("error connecting to tor!");
    let client = Client::builder().proxy(proxy).build()?;
    
    let formatted_url = format!("https://einthusan.tv/movie/results/?lang=malayalam&query=premam");
    
    let response = client
        .get(&formatted_url)
        .send()?;

    // Read and print the response body
    let body = response.text()?;
    println!("{}", &body);
    fs::write("response.txt", &body);

    Ok(())
}

With Python, I get the HTML body. With Rust, I get what looks like encoded text instead. How can I resolve this?

I have tried adding headers that accept "text/html" or "charset=utf-8", but that did not help. I am still getting the encoded text.


Solution

  • The response from that website is gzip-encoded (it carries the header Content-Encoding: gzip), which is why the body looks garbled. In the Python version, requests decodes gzip responses automatically.

    To have reqwest decode gzip automatically in Rust, you need to enable the crate's gzip feature in Cargo.toml.

    See ClientBuilder::gzip() for more information.
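
A minimal sketch of what the Cargo.toml dependency entry might look like with gzip decoding enabled. The version number is an assumption, so use whichever reqwest version the project already depends on. The blocking and socks features are listed because the question's code uses the blocking client and a SOCKS proxy.

```toml
[dependencies]
# Version is an assumption; keep the one already in your project.
# "blocking" -> reqwest::blocking::Client
# "socks"    -> socks5h:// proxy support (needed for Tor)
# "gzip"     -> automatic decoding of gzip-encoded response bodies
reqwest = { version = "0.11", features = ["blocking", "socks", "gzip"] }
```

With the gzip feature compiled in, reqwest decompresses gzip responses by default, so the Rust code in the question should not need changes. Calling .gzip(true) on Client::builder() only makes that behavior explicit.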