r, rcurl, httr

Send expression to website field return dynamic result (picture)


I recently asked a similar question, Send expression to website return dynamic result (picture), and got a terrific response that did not require sending an expression to a field on a web page; instead, it used the URL to get the job done.

I have since discovered a better regex visualizer (pointed out by G. Grothendieck). It can be set to a Python flavor, which is closer to R; for example, it allows lookbehinds like (?<=foo), for which http://www.regexper.com/ throws an error.

Using this regex: "(?<=foo)\\s*foo[A-Z]\\d{2,3}", I'd like to use R to send (?<=foo)\s*foo[A-Z]\d{2,3}, set the drop-down menu to Python, and open or return the visual results as seen here:

[image: the visual results produced by the site]

The same URL trick won't work here because the URL does not change when the expression is entered; instead, the embedded JavaScript returns the results.

MWE

## Expression
x <- "(?<=foo)\\s*foo[A-Z]\\d{2,3}"
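
For reference, the doubled backslashes in x are only R string escapes. A quick sketch (the sample text "foo fooA12" is made up for illustration) prints the pattern as it should reach the site and checks that base R accepts the lookbehind when perl = TRUE is set:

```r
## Expression, escaped for an R string literal
x <- "(?<=foo)\\s*foo[A-Z]\\d{2,3}"

## writeLines() shows the string without R's escaping,
## i.e. the text the website field should receive
writeLines(x)
# (?<=foo)\s*foo[A-Z]\d{2,3}

## Lookbehinds need the PCRE engine in base R, hence perl = TRUE.
## "foo fooA12" is a made-up sample string, not from the site.
sample_text <- "foo fooA12"
m <- regmatches(sample_text, regexpr(x, sample_text, perl = TRUE))
```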

Solution

  • You won't be able to manipulate the JavaScript state of the page via the URL, so if you want to interact with the form you'll need a tool that can drive the page interactively, such as RSelenium. But that's opening up another can of worms. And given that they want you to pay for their services, they might not be too keen on automated scraping.

    As far as I can tell, they don't have an officially documented API, but when you click the "Share" link on the site, it submits a JSON object to their server to get a shareable URL. That payload looks like

    {"title":"Untitled Regex",
    "description":"No description",
    "regex":"(?<=foo)\\s*foo[A-Z]\\d{2,3}\n",
    "flavor":"python",
    "strFlags":"",
    "testString":"My test data",
    "unitTests":"[]",
     "share":true}
    

    So if you bypass the UI and post that JSON content directly, the server returns a unique token that you can append to a URL to view the results. That would look something like

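    library(httr)
    library(jsonlite)

    # Same fields the site's "Share" link submits
    payload <- list(title = "Untitled Regex",
        description = "No description",
        regex = "(?<=foo)\\s*foo[A-Z]\\d{2,3}\n",
        flavor = "python",
        strFlags = "",
        testString = "My test data",
        unitTests = "[]",
        share = TRUE)

    # unbox() makes each length-1 value serialize as a JSON scalar, not an array
    rr <- POST("https://www.debuggex.com/api/regex",
        body = lapply(payload, unbox), encode = "json")

    # The response carries a token identifying the shared regex
    url <- paste0("https://www.debuggex.com/r/", content(rr)$token)
    browseURL(url)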
    

    This is a very fragile solution because they may choose to change their implementation at any time. It's best to use features that they officially support.
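
To reuse this with the x from the question, the steps above could be wrapped up roughly as follows. This is only a sketch under the same assumptions: the endpoint and field names are copied from the captured payload, not from any documented API, and build_payload() and share_regex() are hypothetical helper names. The trailing newline on the regex is kept only because the captured payload had one; whether the server needs it is unknown.

```r
# Hypothetical helper: build the list that mirrors the captured "Share" payload.
# Field names come from the observed request, not from documentation.
build_payload <- function(pattern, flavor = "python",
                          test_string = "My test data") {
  list(
    title = "Untitled Regex",
    description = "No description",
    # The captured payload ended the regex with "\n"; kept here as observed
    regex = paste0(pattern, "\n"),
    flavor = flavor,
    strFlags = "",
    testString = test_string,
    unitTests = "[]",
    share = TRUE
  )
}

# Hypothetical helper: POST the payload and return the shareable URL.
# Needs the httr and jsonlite packages; fails loudly if the response changes.
share_regex <- function(pattern, ...) {
  resp <- httr::POST(
    "https://www.debuggex.com/api/regex",
    body = lapply(build_payload(pattern, ...), jsonlite::unbox),
    encode = "json"
  )
  httr::stop_for_status(resp)  # stop on HTTP 4xx/5xx
  token <- httr::content(resp)$token
  if (is.null(token)) {
    stop("No token in the response; the endpoint may have changed")
  }
  paste0("https://www.debuggex.com/r/", token)
}

# Usage (performs a network request, so it is left commented out):
# x <- "(?<=foo)\\s*foo[A-Z]\\d{2,3}"
# browseURL(share_regex(x))
```

Keeping the payload construction separate from the request makes it possible to inspect what would be sent without touching their server.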