
How to avoid zombie processes when running a Command?


A small Iron project runs a Command in one of its routes and returns a Response. Here is the relevant code of the route handler:

fn convert(req: &mut Request) -> IronResult<Response> {

    // ...
    // init some bindings like destination_html and destination_pdf
    // ...

    convert_to_pdf(destination_html, destination_pdf);

    Ok( Response::with((status::Ok, "Done")) )
}

And the code of the called function:

use std::process::{Command, Stdio};

fn convert_to_pdf(destination_html: &str, destination_pdf: &str) {
    Command::new("xvfb-run")
        .arg("-a")
        .arg("wkhtmltopdf")
        .arg(destination_html)
        .arg(destination_pdf)
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .spawn()
        .expect("failed to execute process");
}

The process works (the file is converted from HTML to PDF) and the response is returned to the browser. Everything is fine, but a zombie process is still there as a child of my app:

(screenshot: process list showing a defunct child process of the app)

I don't know why and I don't know how to avoid it. What could I do?

The wkhtmltopdf command is a long process, I don't want to call it synchronously and wait for its return. And I don't want to restart my Rust program (the parent of the zombie child) twice a day to kill zombies.


Solution

  • Your problem is that you never wait() on the spawned child, so the operating system keeps its entry in the process table until the exit status has been collected (see the wait(2) man page for the full explanation). Each zombie pins a process-table slot, and accumulating them will eventually exhaust that resource. You also cannot kill a zombie: it is already dead, and it only disappears once its parent waits on it (or once the parent itself exits and init reaps it). The simplest fix that keeps your handler non-blocking is to call wait() on the child from a separate thread.
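A minimal sketch of that fix, reusing the question's convert_to_pdf (the xvfb-run/wkhtmltopdf invocation is copied from the question; the spawn_and_reap helper is illustrative, not a standard API):

```rust
use std::io;
use std::process::{Command, ExitStatus, Stdio};
use std::thread::{self, JoinHandle};

// Spawn `cmd` without blocking the caller, but make sure the child is
// wait()ed on in a background thread so it never becomes a zombie.
// The JoinHandle can be dropped if the exit status is not needed.
fn spawn_and_reap(mut cmd: Command) -> io::Result<JoinHandle<ExitStatus>> {
    let mut child = cmd.spawn()?;
    Ok(thread::spawn(move || {
        child.wait().expect("failed to wait on child")
    }))
}

fn convert_to_pdf(destination_html: &str, destination_pdf: &str) {
    let mut cmd = Command::new("xvfb-run");
    cmd.arg("-a")
        .arg("wkhtmltopdf")
        .arg(destination_html)
        .arg(destination_pdf)
        .stdout(Stdio::null())
        .stderr(Stdio::null());
    if let Err(e) = spawn_and_reap(cmd) {
        eprintln!("failed to spawn wkhtmltopdf: {}", e);
    }
}
```

The route handler still returns immediately, but the reaper thread collects the exit status as soon as the child terminates, so nothing lingers in the process table.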


    Beyond that...

    You are trying to spawn a command and answer your clients ... without even checking the exit status of wkhtmltopdf. Moreover, you appear to be running as root, which is a bad practice (in development or not). And your application is open to denial of service: if many clients request PDFs at once, your server will spawn that many renderers and exhaust its resources.

    (IMHO) You should split your project in two:

    1. the server without the rendering process
    2. the PDF rendering engine

    The first would send the second a message: "please generate a PDF with the following parameters(..)". The second would watch the message queue, take the first job, generate the PDF and wait for completion/errors. You could even attach a unique ID to each message and expose an endpoint on the rendering engine for querying the status of job #ID.
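Within a single program, the queue half of that design can be sketched with a standard-library channel and one worker thread (the PdfJob type and the bare wkhtmltopdf invocation are illustrative assumptions, not part of the answer):

```rust
use std::process::Command;
use std::sync::mpsc;
use std::thread;

// A hypothetical job description sent from the web server to the worker.
struct PdfJob {
    id: u64,
    html: String,
    pdf: String,
}

fn start_worker() -> mpsc::Sender<PdfJob> {
    let (tx, rx) = mpsc::channel::<PdfJob>();
    // A single rendering worker: jobs queue up in the channel, so a
    // burst of requests cannot spawn an unbounded number of processes.
    thread::spawn(move || {
        for job in rx {
            let status = Command::new("wkhtmltopdf")
                .arg(&job.html)
                .arg(&job.pdf)
                .status(); // status() waits, so no zombie is left behind
            match status {
                Ok(s) if s.success() => println!("job #{} done", job.id),
                Ok(s) => eprintln!("job #{} failed: {}", job.id, s),
                Err(e) => eprintln!("job #{} could not start: {}", job.id, e),
            }
        }
    });
    tx
}
```

A route handler would then just send a PdfJob down the channel and answer immediately; because .status() waits for each child, zombies cannot accumulate, and only one render runs at a time. Tracking the status of job #ID would additionally need a shared map updated by the worker.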

    What you are describing is essentially a job queue, like Celery; Celery, however, is written in Python and relies on third-party software (Redis).