python caching playwright playwright-python

How to cache playwright-python contexts for testing?

I am doing some web scraping using playwright-python>=1.41, and have to launch the browser in headed mode (e.g. launch(headless=False)).

For CI testing, I would like to somehow cache the headed interactions with Chromium, to enable offline testing:

First invocation: uses Chromium to make real-world HTTP transactions
Later invocations: uses Chromium, but all HTTP transactions read from a cache

How can this be done? I can't find any clear answers on how to do this.

Solution

Recording a HAR file might solve your problem:

Run the first test while recording a HAR file
Store the HAR file as an artifact, in your repo or somewhere similar in your CI environment
Run the test again, replaying from the recorded HAR file

Here is how to do that with playwright==1.41.1 and pytest-playwright==0.3.3:

import pathlib

import pytest
from playwright.sync_api import Browser, Playwright

CACHE_DIR = pathlib.Path(__file__).parent / "cache"


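@pytest.fixture(name="example_har", scope="session")
def fixture_example_har(playwright: Playwright) -> pathlib.Path:
    """Record the traffic for https://example.com/ into a HAR file."""
    har_file = CACHE_DIR / "example.har"
    # Parenthesized context managers need Python 3.10+.
    with (
        playwright.chromium.launch(headless=False) as browser,
        browser.new_page() as page,
    ):
        # update=True records live traffic into the HAR instead of replaying it.
        # The file is written when the page (and its context) is closed.
        page.route_from_har(har_file, url="*/**", update=True)
        page.goto("https://example.com/")
    return har_file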


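def test_caching(browser: Browser, example_har: pathlib.Path) -> None:
    # offline=True emulates a dropped network, so the page can only load
    # if the responses are served from the recorded HAR.
    with browser.new_context(offline=True) as context:
        page = context.new_page()
        page.route_from_har(example_har, url="*/**")
        page.goto("https://example.com/")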
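
One caveat: because the fixture above always passes update=True, it hits the network and re-records at the start of every test session. If you want the "first invocation records, later invocations replay" behaviour from the question, a minimal sketch is to switch on whether the HAR file already exists. The helper name route_with_har_cache is hypothetical (it is not part of Playwright), and page is typed loosely so the helper does not need Playwright at import time; it is assumed to be a playwright.sync_api.Page.

```python
import pathlib
from typing import Any


def route_with_har_cache(page: Any, har_file: pathlib.Path, url: str = "*/**") -> bool:
    """Record to har_file if it is missing, otherwise replay from it.

    Hypothetical helper: `page` is assumed to be a playwright.sync_api.Page.
    Returns True when recording, False when replaying.
    """
    recording = not har_file.exists()
    if recording:
        # First run: let requests reach the network and save them to the HAR.
        page.route_from_har(har_file, url=url, update=True)
    else:
        # Later runs: serve from the HAR and abort requests that were not recorded.
        page.route_from_har(har_file, url=url, not_found="abort")
    return recording
```

In the fixture, you would call route_with_har_cache(page, har_file) in place of the page.route_from_har(...) line. Deleting the HAR file (or committing a fresh one) then forces a new recording on the next run, while CI with the file checked in should only replay.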