python urllib2 pdfkit

Passing an in-memory HTML file to pdfkit


I just downloaded a web page with Python:

import urllib2

p = urllib2.build_opener(urllib2.HTTPCookieProcessor).open('http://www.google.com')
html_content = p.read()

And now I want to write it to a pdf file:

pdfkit.from_file(??????,'test.pdf')

But how do I pass html_content to the function? It expects a file, and I don't want to save the content as an HTML file first. Is there a way to pass the fetched html_content to pdfkit.from_file?

Note: I don't wish to use .from_url; I want to fetch the page with urllib2 first.


Solution

  • There is pdfkit.from_string:

    ....
    html_content = p.read()
    pdfkit.from_string(html_content,'test.pdf')
    

    and pdfkit.from_url:

    pdfkit.from_url('http://www.google.com', 'test.pdf')
    

    Also, pdfkit.from_file takes a filename as its first parameter, but it also accepts a file-like object, so you can pass the return value of urllib....open directly because it is a file-like object.

    See pdfkit usage.
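
For readers on Python 3, where urllib2 became urllib.request and read() returns bytes, the from_string route above might be sketched as follows. The helper name html_to_pdf and its convert parameter are hypothetical, added only so the logic can be exercised without wkhtmltopdf installed. The assumption that the bytes should be decoded to text before reaching pdfkit.from_string is mine, not something the answer states.

```python
def html_to_pdf(html_content, output_path, convert=None, encoding="utf-8"):
    """Render in-memory HTML to a PDF without writing an .html file first.

    html_to_pdf and the convert parameter are hypothetical names used for
    illustration; by default the work is delegated to pdfkit.from_string.
    """
    if convert is None:
        # Imported lazily: pdfkit needs the wkhtmltopdf binary to be installed.
        import pdfkit
        convert = pdfkit.from_string
    # Assumption: on Python 3 the fetched body is bytes, so decode it to text.
    if isinstance(html_content, bytes):
        html_content = html_content.decode(encoding)
    return convert(html_content, output_path)
```

With pdfkit and wkhtmltopdf installed, a call such as html_to_pdf(urllib.request.urlopen('http://www.google.com').read(), 'test.pdf') should write test.pdf. The encoding defaults to UTF-8 here, so a page served in another charset would need it passed explicitly.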