I have been trying to use WeasyPrint and pdfkit to convert a webpage into a PDF. I have managed to save a PDF containing a portion of the page. With WeasyPrint, I cannot work out how to pick up the correct CSS styling from the page. With pdfkit, I seem to be retrieving the mobile version of the site rather than the full page. I'm using Python 3.6.
from urllib.request import Request, urlopen
import webbrowser
import pdfkit
import weasyprint
#pdfkit.from_url('http://google.com', 'out.pdf')
print("started script")
website = 'https://www.bbcgoodfood.com/recipes/3228/chilli-con-carne'
filename = 'savedPDF.pdf'
req = Request(website, headers={'User-Agent': 'Mozilla/5.0'})
# Make a single request and reuse its status code instead of fetching the page twice.
status = urlopen(req).getcode()
print(status)
if status == 200:
    pdfkit.from_url(website, 'out.pdf')
    weasyprint.HTML(website).write_pdf('/Users/me/Documents/weasyprint.pdf')
    weasyprint.HTML(website).write_pdf(filename, stylesheets=[weasyprint.CSS('https://www.bbcgoodfood.com/sites/default/files/advagg_css/css__pDgD1vQBFL4LZ6AO_Uw8wEc3MBEaHOzbhMtPie685P8__Kxa0k0VBbKvV5-TOMN_kW3S7CrkFMM4Zf0LjDvzMFnk__mXPuNFBZ0nocZLk5Qifty02tMfg-gomArSBCcGw1mLo.css')])
I can't see an option in pdfkit to specify what to connect with, such as a user agent. Furthermore, the two PDFs created by WeasyPrint are identical.
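For reference, here is a minimal sketch of how both points might be approached; it is not a confirmed fix for this site. pdfkit forwards an `options` dict to wkhtmltopdf, so a `custom-header` entry can carry a User-Agent. WeasyPrint applies print-media CSS by default, so passing `media_type='screen'` is one thing to try. The helper names `build_pdfkit_options` and `save_pdfs`, the output filenames, and the assumption that the site chooses the mobile layout from the User-Agent are all hypothetical.

```python
def build_pdfkit_options(user_agent):
    """Return wkhtmltopdf options that send a custom User-Agent header."""
    return {
        # pdfkit turns each (name, value) pair into a --custom-header flag.
        'custom-header': [('User-Agent', user_agent)],
        # Flags that take no value are given as None; this one asks
        # wkhtmltopdf to send the header for sub-resource requests too.
        'custom-header-propagation': None,
    }


def save_pdfs(url, user_agent):
    """Render the same URL with pdfkit and WeasyPrint (hypothetical helper)."""
    # Imported here so the options helper works without either library installed.
    import pdfkit
    import weasyprint

    pdfkit.from_url(url, 'out.pdf', options=build_pdfkit_options(user_agent))
    # Assumption: the page's intended styling lives in screen-media rules,
    # which WeasyPrint skips unless told to use the 'screen' media type.
    weasyprint.HTML(url, media_type='screen').write_pdf('weasyprint.pdf')
```

Calling `save_pdfs(website, 'Mozilla/5.0')` would then write both files to the working directory. Whether the output matches the desktop page still depends on how the site decides which layout to serve.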
After quite a while of experimenting with the above-mentioned packages, I was still struggling to get correct-looking output.
I have settled on webkit2png, which works almost perfectly. The only downside is that a cookie popup message appears in some of the saved files.