wkhtmltopdf without creating a file in php


I have the wkhtmltopdf module in my Drupal site, which generates the PDF file by running the command 'wkhtmltopdf --options URL filename.pdf' with the shell_exec function.

The output is fine, but I don't want to store the PDF on the file system. I just want to show the output in the browser so the user can choose whether or not to download it.

I searched but couldn't find a way to get the output into a buffer rather than storing it in a PDF file. Is it possible to generate a PDF with wkhtmltopdf without creating a file?


Solution

  • Here is an over-engineered piece of code I wrote just for you :)
    It includes everything from the function to a demo form you can test.

    I can't guarantee that this code is 100% stable or secure, so check it and modify it for your own use.

    Read the documentation for functions such as shell_exec to see why passing user input to them is a security risk.

    My recommendation is to write a PHP extension in C++ that wraps wkhtmltopdf, then load it and use it from PHP.

    I am not sure if one exists for wkhtmltopdf; someone in the comments correct me if I'm wrong.
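To make the shell_exec risk concrete, here is a minimal sketch of building the command line with every argument quoted by escapeshellarg. The helper name build_pdf_command and the example URL are assumptions made up for illustration; they are not part of the code below.

```php
<?php

/**
 * Build a wkhtmltopdf command line with every argument quoted.
 * build_pdf_command() is a hypothetical helper used only for this sketch.
 */
function build_pdf_command(string $url, string $output): string {
    // escapeshellarg() adds its own surrounding quotes and escapes any
    // embedded quotes, so the result must not be wrapped in quotes again.
    return "wkhtmltopdf " . escapeshellarg($url) . " " . escapeshellarg($output);
}

// A URL carrying shell metacharacters stays one inert argument.
echo build_pdf_command("https://example.com/?a=1;id", "/tmp/out.pdf"), PHP_EOL;
```

The point of the design is that the shell never sees unquoted user input, so characters such as ; or $ in the URL are passed to wkhtmltopdf as data instead of being interpreted.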


    Update 1

    I tested this script on http://ifconfig.me and it returned a malformed PDF document.
    So you have three choices: write a PHP extension in C++, wait for someone to come up with a better solution, or save the file to /tmp, read it with PHP, and then delete it.
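As a possible alternative to the temporary file, the sketch below reads a command's standard output as raw bytes through proc_open. It assumes a Linux host (it discards stderr to /dev/null) and that wkhtmltopdf writes the PDF to stdout when the output file is given as -, which is how the over-engineered code further down invokes it. capture_stdout is a hypothetical helper name.

```php
<?php

/**
 * Run a command and return everything it writes to stdout as one string.
 * capture_stdout() is a hypothetical helper used only for this sketch.
 *
 * @return string|null Raw stdout, or null if the process could not be
 *                     started or exited with a non-zero status.
 */
function capture_stdout(string $command): ?string {
    $spec = [
        1 => ["pipe", "w"],              // stdout: the bytes we want
        2 => ["file", "/dev/null", "w"], // stderr: discarded (Linux path)
    ];

    $proc = proc_open($command, $spec, $pipes);
    if (!is_resource($proc)) {
        return null;
    }

    // stream_get_contents() reads raw bytes until EOF; nothing is trimmed
    // or split into lines, so binary data survives intact.
    $out = stream_get_contents($pipes[1]);
    fclose($pipes[1]);

    $status = proc_close($proc);

    return ($status === 0 && $out !== false) ? $out : null;
}

// Assumed usage, with $url validated beforehand:
// $pdf = capture_stdout("wkhtmltopdf " . escapeshellarg($url) . " -");
```

Reading the pipe as a stream matters for binary output: exec() collects the output line by line and drops trailing whitespace from each line, which is not safe for PDF data.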

    Code (Simple)

    <?php
    
    /**
     * --- DO NOT REMOVE THIS DOCBLOCK ---
     * @WebCrawlTrackingId cf9e8c67.3cb7269c.60b1d84b.5b2e5450
     */
    
    /**
     * @file
     * Code for ni_wkhtmltopdf_simple function.
     * Includes a demonstration at the end.
     */
    
    /**
     * Function that saves a PDF file
     * to a temporary directory and
     * returns it.
     * All of this by using wkhtmltopdf.
     *
     * @author [email protected]
     *
     * @param string $url
     *     URL to convert
     *
     * @param string $download
     *     Decide whether to download the
     *     file by specifying a filename
     *     or don't specify anything to
     *     display it in the browsers
     *     built-in PDF viewer.
     *
     * @return int
     *     On success the PDF is sent to the
     *     browser and the script exits.
     *     Return (int) -1 if URL is empty
     *     Return (int) -2 if URL is not a string
     *     Return (int) -3 if URL is not a URL
     */
    function ni_wkhtmltopdf_simple($url = "", $download = false) {
        // URL can't be empty
        if ($url == "") {
            return -1;
        }
    
        // URL must be a string
        if (gettype($url) !== "string") {
            return -2;
        }
    
        // Remove whitespace
        $url = trim($url);
    
        // Explode URL by ':' to Array
        $urla = explode(":", $url, 2);
    
        // URL must be an actual URL (a scheme starting with "http", followed by "//")
        if (count($urla) < 2 || strtolower(substr($urla[0], 0, 4)) !== "http" || substr($urla[1], 0, 2) !== "//") {
            return -3;
        }
        
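        // Escape the URL for the shell (escapeshellarg adds its own quotes)
        $url = escapeshellarg($url);
    
        // Random file name
        $fname = "/tmp/" . bin2hex(random_bytes(10)) . ".pdf";
    
        // Generate a PDF file; both arguments are already quoted,
        // so no extra quotes are wrapped around them
        shell_exec("wkhtmltopdf $url " . escapeshellarg($fname));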
    
        // Load file
        $buffer = file_get_contents("$fname");
        
        // Delete the file after loading
        unlink("$fname");
    
        $buffsz = strlen($buffer);
    
        // Prepare headers
        header("Content-Type:application/pdf");
    
        if ($download) {
            $download = trim($download);
            header("Content-Disposition:attachment;filename=\"$download\"");
        } else {
            header("Content-Disposition:inline");
        }
    
        header("Content-Length:" . $buffsz);
    
        exit($buffer);
    }
    
    // Demonstrate ni_wkhtmltopdf_simple
    
    // Are we getting the URL parameter?
    if (isset($_GET["url"])) {
        // Convert array to string
        if (is_array($_GET["url"])) {
            $_GET["url"] = $_GET["url"][0];
        }
        
        // Remove whitespace
        $url = trim($_GET["url"]);
    
        // URL is empty so unset it
        if ($url == "") {
            unset($_GET["url"], $url);
            header("Location:" . basename(__FILE__));
        }
    
        // Get PDF output
        if (isset($url)) {
            ni_wkhtmltopdf_simple($url);
        }
    } else {
    ?>
    <!DOCTYPE html>
    <html>
    <head>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width">
        <title>PHP wkhtmltopdf_simple Demo ([email protected])</title>
        <style>
            *{outline:0}
            html,body{
                zoom:1.25
            }
        </style>
    </head>
    <body>
        <form action="<?= basename(__FILE__) ?>" method="GET">
            <label for="url">URL:</label>
            <input id="url" name="url" type="text" value="https://" minlength="8" required autofocus />
            <button id="btn" type="submit">Generate PDF</button>
    
            <script type="text/javascript">
                function urlhandler(e) {
                    // URL Value must begin with https://
                    if (url.value.trim() == "") {
                        url.value = "https://" + url.value;
                    }
    
                    // Prevent removal of https://
                    if (e.keyCode == 8 && url.value == "https://") {
                        e.preventDefault();
                    }
    
                    // Prevent Delete key
                    if (e.keyCode == 46) {
                        e.preventDefault();
                    }
    
                    // Add https:// if it was removed during Paste operation
                    if (url.value.substr(0, 8).toLowerCase() !== "https://") {
                        url.value = "https://" + url.value;
                    }
                }
    
                function btnhandler(e) {
                    if (url.value.substr(0, 8).toLowerCase() !== "https://") {
                        url.value = "https://" + url.value;
                    }
    
                    // Prevent submission of the form
                    e.preventDefault();
    
                    // Make sure we've provided a URL
                    if (8 >= url.value.trim().length ||
                        url.value.trim()[9] == ".") {
                        alert("You must provide a URL.");
                        return;
                    }
                    
                    // Automatically guess top-level domain
                    if (url.value.trim().substr(-4, 1) !== "." &&
                        url.value.trim().substr(-3, 1) !== ".") {
                        url.value += ".com";
                    }
    
                    url.parentNode.submit();
                }
                
                // Event listeners
                url.addEventListener("keydown", function(e) {
                    urlhandler(e);
                });
                
                url.addEventListener("paste", function(e) {
                    urlhandler(e);
                });
                
                btn.addEventListener("click", function(e) {
                   btnhandler(e);
                });
            </script>
        </form>
    </body>
    </html>
    <?php
    }
    ?>
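
The URL check in ni_wkhtmltopdf_simple only looks at the first four characters of the scheme, so a value such as httpfoo://example.com passes it. A stricter variant could be sketched with PHP's parse_url, as below. is_http_url is a hypothetical helper name, and the example URLs are made up for illustration.

```php
<?php

/**
 * Return true when $url is an absolute http:// or https:// URL with a host.
 * is_http_url() is a hypothetical helper used only for this sketch.
 */
function is_http_url($url): bool {
    if (!is_string($url)) {
        return false;
    }

    // parse_url() returns false on seriously malformed input and omits
    // the "scheme" and "host" keys when they are absent.
    $parts = parse_url(trim($url));
    if ($parts === false || !isset($parts["scheme"], $parts["host"])) {
        return false;
    }

    // Compare the whole scheme, case-insensitively, against an allow-list.
    return in_array(strtolower($parts["scheme"]), ["http", "https"], true);
}

var_dump(is_http_url("https://example.com/page")); // bool(true)
var_dump(is_http_url("httpfoo://example.com"));    // bool(false)
var_dump(is_http_url("example.com"));              // bool(false)
```

This only checks the shape of the URL; it does not replace quoting the value with escapeshellarg before it reaches the shell.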
    

    Code (Over-engineered)

    <?php
    
    /**
     * --- DO NOT REMOVE THIS DOCBLOCK ---
     * @WebCrawlTrackingId fcc5094e.ccc3a1df.5eb4dbfa.6c3772e1
     */
    
    /**
     * @file
     * Code for ni_wkhtmltopdf function.
     * Includes a demonstration at the end.
     */
    
    /**
     * Function that returns a PDF file
     * from a URL using wkhtmltopdf.
     *
     * @author [email protected]
     *
     * @param string $url
     *     URL to convert
     *
     * @param bool $https
     *     Ensures we're giving it HTTPS
     *
     * @param string $download
     *     Decide whether to download the
     *     file by specifying a filename
     *     or don't specify anything to
     *     display it in the browsers
     *     built-in PDF viewer.
     *
     * @param bool $checkcmd
     *     Ensure we have all commands
     *     required to fulfil the operation.
     *
     *     * On Windows hosts these commands
     *     can be installed using `scoop`.
     *
     * @param bool $checkos
     *     Make sure we're running Linux.
     *
     *     * Optional if we have both commands
     *     available on a Windows host.
     *
     *
     * @return int
     *     On success the PDF is sent to the
     *     browser and the script exits.
     *     Return (int) -1 if URL is empty
     *     Return (int) -2 if URL is not a string
     *     Return (int) -3 if URL is not a URL
     *     Return (int) -4 if protocol is not HTTPS
     *     Return (int) -5 if OS is not Linux
     *     Return (int) -6 if command wkhtmltopdf not found
     *     Return (int) -7 if command cat not found
     *     Return (int) -8 wkhtmltopdf returned nothing
     */
    function ni_wkhtmltopdf($url = "", $https = false, $download = false, $checkcmd = true, $checkos = false) {
        // URL can't be empty
        if ($url == "") {
            return -1;
        }
    
        // URL must be a string
        if (gettype($url) !== "string") {
            return -2;
        }
    
        // Remove whitespace
        $url = trim($url);
    
        // Explode URL by ':' to Array
        $urla = explode(":", $url, 2);
    
        // URL must be an actual URL (a scheme starting with "http", followed by "//")
        if (count($urla) < 2 || strtolower(substr($urla[0], 0, 4)) !== "http" || substr($urla[1], 0, 2) !== "//") {
            return -3;
        }
    
        // Optional: Make sure the URL is HTTPS (Secure)
        if ($https && strtolower(substr($url, 0, 8)) !== "https://") {
            return -4;
        }
    
        // Optional: Check operating system
        if ($checkos && strtolower(PHP_OS) !== "linux") {
            return -5;
        }
    
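        // Optional: (Linux) Make sure the `wkhtmltopdf` command exists.
        // `which` prints the path when the command is found and nothing otherwise.
        if ($checkcmd && trim((string) shell_exec("which wkhtmltopdf")) === "") {
            return -6;
        }
    
        // Optional: (Linux) Make sure the `cat` command exists
        if ($checkcmd && trim((string) shell_exec("which cat")) === "") {
            return -7;
        }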
    
        // Quote the URL so the shell treats it as a single literal argument.
        // (Percent-encoding characters by hand also rewrote the ':' in
        // "https://", which broke the URL.)
        $url = escapeshellarg($url);
    
        // Generate a PDF file; "-" sends the PDF to stdout.
        // shell_exec() returns the raw output as one string, whereas exec()
        // splits it into lines and strips trailing whitespace from each,
        // which corrupts binary data.
        $buffer = (string) shell_exec("wkhtmltopdf $url - | cat");
    
        $buffsz = strlen($buffer);
    
        // Is buffer empty?
        if (0 >= $buffsz) {
            return -8;
        }
    
        // Prepare headers
        header("Content-Type:application/pdf");
    
        if ($download) {
            $download = trim($download);
            header("Content-Disposition:attachment;filename=\"$download\"");
        } else {
            header("Content-Disposition:inline");
        }
    
        header("Content-Length:" . $buffsz);
    
        exit($buffer);
    }
    
    // Demonstrate ni_wkhtmltopdf
    
    // Are we getting the URL parameter?
    if (isset($_GET["url"])) {
        // Convert array to string
        if (is_array($_GET["url"])) {
            $_GET["url"] = $_GET["url"][0];
        }
        
        // Remove whitespace
        $url = trim($_GET["url"]);
    
        // URL is empty so unset it
        if ($url == "") {
            unset($_GET["url"], $url);
            header("Location:" . basename(__FILE__));
        }
    
        // Get PDF output
        if (isset($url)) {
            ni_wkhtmltopdf($url);
        }
    } else {
    ?>
    <!DOCTYPE html>
    <html>
    <head>
        <meta charset="UTF-8">
        <meta name="viewport" content="width=device-width">
        <title>PHP wkhtmltopdf Demo ([email protected])</title>
        <style>
            *{outline:0}
            html,body{
                zoom:1.25
            }
        </style>
    </head>
    <body>
        <form action="<?= basename(__FILE__) ?>" method="GET">
            <label for="url">URL:</label>
            <input id="url" name="url" type="text" value="https://" minlength="8" required autofocus />
            <button id="btn" type="submit">Generate PDF</button>
    
            <script type="text/javascript">
                function urlhandler(e) {
                    // URL Value must begin with https://
                    if (url.value.trim() == "") {
                        url.value = "https://" + url.value;
                    }
    
                    // Prevent removal of https://
                    if (e.keyCode == 8 && url.value == "https://") {
                        e.preventDefault();
                    }
    
                    // Prevent Delete key
                    if (e.keyCode == 46) {
                        e.preventDefault();
                    }
    
                    // Add https:// if it was removed during Paste operation
                    if (url.value.substr(0, 8).toLowerCase() !== "https://") {
                        url.value = "https://" + url.value;
                    }
                }
    
                function btnhandler(e) {
                    if (url.value.substr(0, 8).toLowerCase() !== "https://") {
                        url.value = "https://" + url.value;
                    }
    
                    // Prevent submission of the form
                    e.preventDefault();
    
                    // Make sure we've provided a URL
                    if (8 >= url.value.trim().length ||
                        url.value.trim()[9] == ".") {
                        alert("You must provide a URL.");
                        return;
                    }
                    
                    // Automatically guess top-level domain
                    if (url.value.trim().substr(-4, 1) !== "." &&
                        url.value.trim().substr(-3, 1) !== ".") {
                        url.value += ".com";
                    }
    
                    url.parentNode.submit();
                }
                
                // Event listeners
                url.addEventListener("keydown", function(e) {
                    urlhandler(e);
                });
                
                url.addEventListener("paste", function(e) {
                    urlhandler(e);
                });
                
                btn.addEventListener("click", function(e) {
                   btnhandler(e);
                });
            </script>
        </form>
    </body>
    </html>
    <?php
    }
    ?>
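
The command checks in ni_wkhtmltopdf could also be based on an exit status rather than on printed output. The sketch below assumes a POSIX shell, where the built-in command -v exits with status 0 only when the name resolves. command_exists is a hypothetical helper name.

```php
<?php

/**
 * Return true when $name resolves to a command in the current shell.
 * command_exists() is a hypothetical helper used only for this sketch;
 * it assumes exec() runs the command line through a POSIX shell.
 */
function command_exists(string $name): bool {
    // The third argument of exec() receives the exit status of the command.
    exec("command -v " . escapeshellarg($name) . " 2>/dev/null", $out, $status);

    return $status === 0;
}

var_dump(command_exists("sh"));
var_dump(command_exists("no-such-command-xyz-123"));
```

Checking the status avoids comparing a path string with a number, and quoting the name keeps the helper safe if the name ever comes from configuration.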