Tags: javascript, python, python-3.x, selenium, screenshot

Getting screenshot of the whole page and hiding the top nav bar doesn't work as expected (Selenium, Python3)


I found a way to take a screenshot of the whole page after reading an answer on Stack Overflow.

The problem with this solution is that I am trying to hide the top nav bar on every scroll, but the code doesn't seem to do this reliably.

I want to hide it because it covers part of the page at the top of each screenshot.

In my code I loop over a set of pages and do exactly the same thing for each of them. On some pages the top bar gets hidden, and on others it does not, so the code that produces the screenshot doesn't seem very stable.

This is the related part of my code that produces the screenshot:

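    import os
    import time

    from PIL import Image

    # browser, base_url and all_pages are defined earlier in the script
    # loop over all pages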
    j = 0
    while j < len(all_pages):
        browser.get(base_url + all_pages[j])

        total_width = browser.execute_script("return document.body.offsetWidth")
        total_height = browser.execute_script("return document.body.parentNode.scrollHeight")
        viewport_width = browser.execute_script("return document.body.clientWidth")
        viewport_height = browser.execute_script("return window.innerHeight")
        rectangles = []

        i = 0
        while i < total_height:
            ii = 0
            top_height = i + viewport_height

            if top_height > total_height:
                top_height = total_height

            while ii < total_width:
                top_width = ii + viewport_width

                if top_width > total_width:
                    top_width = total_width

                rectangles.append((ii, i, top_width, top_height))

                ii = ii + viewport_width

            i = i + viewport_height

        stitched_image = Image.new('RGB', (total_width, total_height))
        previous = None
        part = 0

        for rectangle in rectangles:
            if previous is not None:
                browser.execute_script("window.scrollTo({0}, {1})".format(rectangle[0], rectangle[1]))
                time.sleep(0.2)
                browser.execute_script("document.getElementById('header-container').setAttribute('style', 'position: absolute; top: 0px;');")
                time.sleep(0.4)

            file_name = "part_{0}.png".format(part)

            browser.get_screenshot_as_file(file_name)
            screenshot = Image.open(file_name)

            if rectangle[1] + viewport_height > total_height:
                offset = (rectangle[0], total_height - viewport_height)
            else:
                offset = (rectangle[0], rectangle[1])

            stitched_image.paste(screenshot, offset)

            del screenshot
            os.remove(file_name)
            part = part + 1
            previous = rectangle

        stitched_image.save("C:\\Users\\marialena\\source\\repos\\HTMLtoPDF\\all_files\\" + all_pages[j] + ".png")

        j = j + 1

    browser.quit()
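
The tiling arithmetic in the loop above does not depend on the browser, so it can be pulled out into pure functions and checked on its own. The following is only a minimal sketch of that idea; the names `compute_rectangles` and `paste_offset` are hypothetical and are not part of the original script.

```python
def compute_rectangles(total_width, total_height, viewport_width, viewport_height):
    """Return (left, top, right, bottom) tiles covering the page, row by row."""
    if viewport_width <= 0 or viewport_height <= 0:
        raise ValueError("viewport dimensions must be positive")
    rectangles = []
    for top in range(0, total_height, viewport_height):
        # Clamp the last row to the bottom of the page.
        bottom = min(top + viewport_height, total_height)
        for left in range(0, total_width, viewport_width):
            # Clamp the last column to the right edge of the page.
            right = min(left + viewport_width, total_width)
            rectangles.append((left, top, right, bottom))
    return rectangles


def paste_offset(rectangle, total_height, viewport_height):
    """Return the (x, y) position at which to paste this tile's screenshot.

    Mirrors the original script: a tile that would run past the end of the
    page is anchored to the bottom, because the browser cannot scroll
    beyond the end of the document.
    """
    left, top = rectangle[0], rectangle[1]
    if top + viewport_height > total_height:
        return (left, total_height - viewport_height)
    return (left, top)


if __name__ == "__main__":
    # Example values only: a 1000x2500 page seen through a 1000x1000 viewport.
    for tile in compute_rectangles(1000, 2500, 1000, 1000):
        print(tile, paste_offset(tile, 2500, 1000))
```

With the example values, the third tile starts at y = 2000 but is pasted at y = 1500, which is the same adjustment the `offset` branch in the script makes.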

These are two of the screenshots that were generated when the script ran:

Didn't work - Top Bar at every scroll: bad-scenario

Worked - Top Bar only the first time: good-scenario

Can someone help me understand why it hides the nav bar only sometimes? Does a variable perhaps need to be reset?


Solution

  • Based on @pcalkins's suggestion, I resolved my issue by adding the following:

    browser.execute_script("document.getElementById('header-container').innerHTML = '';")

    right before the line that takes the screenshot: browser.get_screenshot_as_file(file_name)


    No header is visible now:

    no-header-screenshot
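
A possible refinement of the fix above, offered only as a sketch: if any page in the loop has no element with the id `header-container` (an assumption; the question does not say so explicitly), `document.getElementById` returns `null` and assigning to `innerHTML` raises a JavaScript error. The helper below guards against that. The names `clear_element` and `CLEAR_ELEMENT_JS` are hypothetical; the only Selenium call it relies on is `execute_script(script, *args)`, whose extra arguments are exposed to the script as `arguments[0]`, `arguments[1]`, and so on.

```python
# JavaScript that empties an element only if it exists on the current page.
# arguments[0] is supplied by Selenium from the extra execute_script argument.
CLEAR_ELEMENT_JS = (
    "var el = document.getElementById(arguments[0]);"
    "if (el) { el.innerHTML = ''; }"
    "return el !== null;"
)


def clear_element(browser, element_id="header-container"):
    """Empty the element with the given id; return True if it was found.

    `browser` is assumed to be a Selenium WebDriver, or any object with a
    compatible execute_script(script, *args) method.
    """
    return bool(browser.execute_script(CLEAR_ELEMENT_JS, element_id))
```

In the loop from the question, `clear_element(browser)` would go in the same place as the one-line fix, right before `browser.get_screenshot_as_file(file_name)`.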