Search code examples
pythonpython-3.xmodulescreenshothtml

How to make screenshot from web page?


How to make screenshot from any url (web page)?

I was trying:

from .ghost import Ghost
ghost = Ghost(wait_timeout=4)
ghost.open('http://www.google.com')
ghost.capture_to('screen_shot.png')

Result:

No module named '__main__.ghost'; '__main__' is not a package

I was trying also:

Python Webkit making web-site screenshots using virtual framebuffer

Take screenshot of multiple URLs using selenium (python)

Fastest way to take a screenshot with python on windows

Take a screenshot of open website in python script

I've also tried other methods that are not listed here. Nothing succeeded. Or an error or module is not found .. or or or. I'm tired. Is there an easy way to make a screenshot of a web page using Python 3.X?

upd1:

C:\prg\PY\PUMA\tests>py save-web-html.py
Traceback (most recent call last):
File "save-web-html.py", line 2, in <module>
from .ghost import Ghost
ModuleNotFoundError: No module named '__main__.ghost'; '__main__' is not a package

upd2:

C:\prg\PY\PUMA\tests>py save-web-html.py
Exception ignored in: <bound method Ghost.__del__ of <ghost.ghost.Ghost object at 0x0000020A169CF860>>
Traceback (most recent call last):
  File "C:\Users\Coar\AppData\Local\Programs\Python\Python36\lib\site-packages\ghost\ghost.py", line 325, in __del__
    self.exit()
  File "C:\Users\Coar\AppData\Local\Programs\Python\Python36\lib\site-packages\ghost\ghost.py", line 315, in exit
    self._app.quit()
AttributeError: 'NoneType' object has no attribute 'quit'
Traceback (most recent call last):
  File "save-web-html.py", line 4, in <module>
    ghost = Ghost(wait_timeout=4)
TypeError: __init__() got an unexpected keyword argument 'wait_timeout'

Solution

  • In the late 80's this may have been a simple task, just render some html to an image instead of the screen.

    But these days web-pages require client-side execution to build parts of its DOM and re-render based on client-side initiated AJAX (or equivalent) requests... it's a whole thing "web 2.0" thing.

    Rendering a web-site such as http://google.com as a simple html return should be easy, but rendering something like https://www.facebook.com/ or https://www.kogan.com/ will have many back & fourth comms to display what you're expecting to see.

    So restricting this to a pure python solution may not be plausible; I'm not aware of a python-based browser. Consider running a separate service to take the screenshots, and use your core application (in python) to fetch requested screenshots.

    I just tried a few with docker, many of them struggle with https and the aforementioned ajax behaviour.

    earlyclaim/docker-manet appears to work demo page

    edit: from your comments, you need the data from a graph that's rendered using a 2nd request.

    you just need the json return from https://www.minnowbooster.net/limit/chart

    try:
        from urllib.request import urlopen  # py3
    except ImportError:
        from urllib2 import urlopen  # py2
    import json
    
    url = 'https://www.minnowbooster.net/limit/chart'
    
    response = urlopen(url)
    data_str = response.read().decode()
    data = json.loads(data_str)
    
    print(data)