Using GTmetrix REST API v2.0 in R


I am trying to integrate performance testing of certain websites using GTmetrix. With the API, I can run a test and pull the results using the SEO connector tool in Microsoft Excel. However, that tool uses the XML interface of an older API version, so some newer tests are not available through it. The latest API version is 2.0.

The link for the xml is here: GTmetrix XML for API 0.1.

I tried using the httr and jsonlite libraries, but I don't know how to authenticate with the API, run the test, and extract the results.

The documentation for the API is available at API Documentation.

library(httr)
library(jsonlite)

url  <- "https://www.berkeley.edu" # URL to be tested
location <- 1 # testing Location
browser <- 3 # Browser to be used for testing
res  <- GET("https://gtmetrix.com/api/gtmetrix-openapi-v2.0.json")
data <- fromJSON(rawToChar(res$content))

Solution

  • Update 2021-11-08:

    I whipped up a small library to talk to GTmetrix via R. There's some basic sanity checking baked in, but obviously this is still a work in progress and there may be (potentially critical) bugs. Feel free to check it out, though. Would love some feedback.

    # Install and load library.
    devtools::install_github("RomanAbashin/rgtmx")
    library(rgtmx)
    

    Update 2021-11-12: It's available on CRAN now. :-)

    # Install and load library.
    install.packages("rgtmx")
    library(rgtmx)
    

    Start test (and get results)

    # Minimal example #1.
    # Returns the final report after checking test status roughly every 3 seconds. 
    result <- start_test("google.com", "[API_KEY]")
    

    This starts a test, waits for the report to be generated, and returns the result as a data.frame. Alternatively, set the parameter wait_for_completion = FALSE to return just the test ID and other metadata without waiting.

    # Minimal example #2.
    # Returns just the test ID and some meta data.
    result <- start_test("google.com", "[API_KEY]", wait_for_completion = FALSE)
    

    Other optional parameters: location, browser, report, retention, httpauth_username, httpauth_password, adblock, cookies, video, stop_onload, throttle, allow_url, block_url, dns, simulate_device, user_agent, browser_width, browser_height, browser_dppx, browser_rotate.

    Show available browsers

    show_available_browsers("[API_KEY]")
    

    Show available locations

    show_available_locations("[API_KEY]")
    

    Get specific test

    get_test("[TEST_ID]", "[API_KEY]")
    

    Get specific report

    get_report("[REPORT_ID]", "[API_KEY]")
    

    Get all tests

    get_all_tests("[API_KEY]")
    

    Get account status

    get_account_status("[API_KEY]")
    

    Original answer:

    Pretty straightforward, actually:

    0. Set test parameters.

    # Your api key from the GTmetrix console.
    api_key <- "[Key]"
    
    # All attributes except URL are optional, and the availability
    # of certain options may depend on the tier of your account.
    
    # URL to test.
    url <- "https://www.worldwildlife.org/"
    # Testing location ID.
    location_id <- 1
    # Browser ID.
    browser_id <- 3
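
Hard-coding the key is fine for a quick test, but a sketch like the following keeps it out of the script by reading it from an environment variable. The helper name `get_api_key()` and the variable name `GTMETRIX_API_KEY` are assumptions made up for illustration; GTmetrix does not prescribe either.

```r
# Hypothetical helper: read the API key from an environment variable.
# The default name GTMETRIX_API_KEY is an assumption; any name works.
get_api_key <- function(var = "GTMETRIX_API_KEY") {
    key <- Sys.getenv(var, unset = "")
    if (!nzchar(key)) {
        stop("Environment variable ", var, " is not set.", call. = FALSE)
    }
    key
}
```

With the variable set (for example in `~/.Renviron`), `api_key <- get_api_key()` would replace the hard-coded assignment above.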
    

    1. Start a test

    res_test_start  <- httr::POST(
        url = "https://gtmetrix.com/api/2.0/tests",
        httr::authenticate(api_key, ""),
        httr::content_type("application/vnd.api+json"),
        body = jsonlite::toJSON(
            list(
                "data" = list(
                    "type" = "test",
                    "attributes" = list(
                        "url" = url,
                        # Optional attributes go here.
                        "location" = location_id,
                        "browser" = browser_id
                    )
                )
            ),
            auto_unbox = TRUE
        ),
        encode = "raw"
    )
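
If you want to look at the request body before sending it, the nested list can be built separately. This is only a sketch: `build_test_body()` is a hypothetical helper, and it covers just the three attributes used above.

```r
library(jsonlite)

# Hypothetical helper: build the JSON request body for a new test.
# Optional attributes left as NULL are dropped before serialising.
build_test_body <- function(url, location = NULL, browser = NULL) {
    attrs <- list(url = url, location = location, browser = browser)
    attrs <- Filter(Negate(is.null), attrs)
    toJSON(
        list(data = list(type = "test", attributes = attrs)),
        auto_unbox = TRUE
    )
}

# Print the body to check its shape before passing it to httr::POST().
body <- build_test_body("https://www.worldwildlife.org/", location = 1, browser = 3)
print(body)
```

The result can be passed as `body = body` in the `httr::POST()` call in place of the inline `jsonlite::toJSON(...)` expression.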
    

    2. Get test ID

    test_id <- jsonlite::fromJSON(rawToChar(res_test_start$content))$data$id
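
The one-liner above returns NULL without complaint if the request failed. A slightly more defensive sketch is shown below. It assumes that error responses carry a top-level `errors` array whose entries have `title` and `detail` fields, as in the JSON:API convention; `extract_id()` is a hypothetical helper and the payload is illustrative, not real API output.

```r
library(jsonlite)

# Hypothetical helper: pull data$id out of a response body given as text.
# Stops with the first error's title and detail if an "errors" member is
# present (assumed JSON:API-style error shape).
extract_id <- function(json_text) {
    parsed <- fromJSON(json_text, simplifyVector = FALSE)
    if (!is.null(parsed$errors)) {
        first <- parsed$errors[[1]]
        stop("API error: ", first$title, " (", first$detail, ")", call. = FALSE)
    }
    parsed$data$id
}

# Illustrative payload only.
extract_id('{"data": {"type": "test", "id": "abc123"}}')
```

With a real response this would be called as `extract_id(httr::content(res_test_start, as = "text", encoding = "UTF-8"))`.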
    

    3. Get report ID

    # Wait a bit, as generating the report can take some time.
    res_test_status <- httr::GET(
        url = paste0("https://gtmetrix.com/api/2.0/tests/", test_id),
        httr::authenticate(api_key, ""),
        httr::content_type("application/vnd.api+json")
    )
    
    # If this still equals the test ID, the report is not ready yet; retry later.
    report_id <- jsonlite::fromJSON(rawToChar(res_test_status$content))$data$id
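
Instead of waiting by hand, the status request can be repeated in a loop. The sketch below is one way to do that: `poll_until()` and `fetch_status()` are hypothetical helpers, the 3-second interval and the attempt limit are arbitrary choices, and the completion check simply reuses the rule from the comment above (the returned ID no longer equals the test ID).

```r
# Hypothetical helper: call fetch() until is_done() accepts its result,
# sleeping `interval` seconds between attempts; stop after `max_tries`.
poll_until <- function(fetch, is_done, interval = 3, max_tries = 40) {
    for (i in seq_len(max_tries)) {
        result <- fetch()
        if (is_done(result)) {
            return(result)
        }
        if (i < max_tries) {
            Sys.sleep(interval)
        }
    }
    stop("Gave up after ", max_tries, " attempts.", call. = FALSE)
}

# Wiring it to the request above; assumes `api_key` and `test_id` exist.
fetch_status <- function() {
    res <- httr::GET(
        url = paste0("https://gtmetrix.com/api/2.0/tests/", test_id),
        httr::authenticate(api_key, "")
    )
    jsonlite::fromJSON(rawToChar(res$content))$data
}
# report_id <- poll_until(fetch_status, function(d) !identical(d$id, test_id))$id
```

Note that a test that ends in an error never produces a different ID, so with this check the loop only stops at the attempt limit in that case.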
    

    4. Get report

    res_report <- httr::GET(
        url = paste0("https://gtmetrix.com/api/2.0/reports/", report_id),
        httr::authenticate(api_key, ""),
        httr::content_type("application/vnd.api+json")
    )
    
    # The report is a nested list with the results as you know them from GTmetrix.
    report <- jsonlite::fromJSON(rawToChar(res_report$content))$data
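
Since the report is a nested list, a flat table is often easier to work with. The sketch below keeps only the single-value entries and binds them into a one-row data.frame. It assumes the metrics sit under `report$attributes`; `report_to_df()` is a hypothetical helper and the field names in the example are placeholders, not a guaranteed schema.

```r
# Hypothetical helper: keep the scalar (atomic, length-1) attributes of a
# report and bind them into a one-row data.frame.
report_to_df <- function(report) {
    attrs <- report$attributes
    is_scalar <- vapply(
        attrs,
        function(x) is.atomic(x) && length(x) == 1,
        logical(1)
    )
    as.data.frame(attrs[is_scalar], stringsAsFactors = FALSE)
}

# Illustrative input; field names are placeholders.
example_report <- list(
    id = "abc123",
    attributes = list(
        performance_score = 95,
        structure_score = 90,
        nested = list(a = 1)
    )
)
report_to_df(example_report)
```

Nested entries are dropped here on purpose; they would need their own handling if you want them in the table.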
    

    I'm kinda tempted to build something for this as there seems to be no R library for it...