Parsing a URL with regular expressions and KRL's replace method


I want to take the current page's URL (using page:env("caller")) and extract a section of it.

For instance, I want to take

http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=cats

and assign

cats

to a variable.

How would I do this with KRL?

I have tried

url = page:env("caller");
query = url.replace("http://www\.google\.com/search\?sourceid=chrome&ie=UTF-8&q=", "");

but it simply assigns the entire page:env("caller") to the variable query (e.g. http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=cats).

Edit: a jQuery solution would most likely work, as well.

Edit2: @JAM --

The select statement you posted doesn't seem to work. I tested it on http://www.google.com/search?q=cats and it didn't fire. Not sure if the URL doesn't match pageview or what (it looks like it should match to me).

The app I put it in:

ruleset a835x36 {
  meta {
    name "regex testing2"
    description <<
    >>
    author ""
    logging on
  }

  rule get_query {
    select when pageview "http://www.google.com/search.*(?:&|?)q=(\w+)(?:&|$)" setting(query)
    notify("Query",query) with sticky = true;
  }
}

Also, I'm looking for a more robust way to get at the query, since Google has many ways to land on a search results page with URLs that won't look like http://www.google.com/search?q=cats. For example, going to Google and searching for cats just gave http://www.google.com/webhp?hl=en#sclient=psy&hl=en&site=webhp&source=hp&q=cats&aq=f&aqi=&aql=&oq=&gs_rfai=&pbx=1&fp=8ac6b4cea9b27ecb as the URL of the results. I guess I could parse anything with a regex, though...


Solution

  • There are two ways to accomplish what you want.

    1) In the pre block

    pre {
      queryInURL = page:url("query");
      q = queryInURL.replace(re/.*?q=(.*?)(?:$|&.*)/,"$1");
    }
    
    • page:url("query") returns the entire parameter string of the current URL.
    • replace() with a regular expression then captures the value of the one parameter you want (here q) and substitutes it for the whole string.
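
To see what the replacement pattern does outside a KRL app, here is a minimal JavaScript sketch of the same expression. It assumes KRL's regular expressions behave like JavaScript's for this pattern, and the helper name extractQ is hypothetical (it is not part of KRL).

```javascript
// Hypothetical helper: mirrors the KRL replace() call on a parameter string.
function extractQ(queryString) {
  // Lazily skip to the first "q=", capture up to the next "&" or the end,
  // and replace the whole string with the captured value.
  return queryString.replace(/.*?q=(.*?)(?:$|&.*)/, "$1");
}

console.log(extractQ("q=cats&wow=cool"));
console.log(extractQ("sourceid=chrome&ie=UTF-8&q=cats"));
```

Two caveats follow from how the pattern is written: when there is no "q=" at all, the input comes back unchanged rather than empty, and the first "q=" wins even if it is the tail of another parameter name such as "oq=".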

    Full example app (tested)

    Tested on the URL http://example.com/?q=cats&wow=cool

    ruleset a60x439 {
      meta {
        name "url query test"
        description <<
          Getting the query from the current page URL
        >>
        author "Mike Grace"
        logging on
      }
    
      rule get_query {
        select when pageview ".*"
        pre {
          queryInURL = page:url("query");
          q = queryInURL.replace(re/.*?q=(.*?)(?:$|&.*)/,"$1");
        }
        {
          notify("Query",queryInURL) with sticky = true;
          notify("q",q) with sticky = true;
        }
      }
    
    }
    

    2) In the rule's select expression, as JAM has shown.
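
For a pattern that works on the whole URL, note that the `(?:&|?)` group in the question's select expression contains an unescaped `?`. JavaScript rejects that as a syntax error; whether KRL's regex engine reacts the same way is an assumption, but it is a plausible reason the rule never fired. The JavaScript sketch below uses a character class instead and also accepts "#" so that the hash-style Google URL from the question matches. The pattern and the helper name queryFromUrl are illustrative, not taken from JAM's answer.

```javascript
// Assumed pattern: "q" must directly follow "?", "&", or "#",
// so parameters such as "oq=" or "aq=" are not picked up by mistake.
const Q_PARAM = /[?&#]q=([^&#]*)/;

// Hypothetical helper: returns the decoded q value, or null when q is absent.
function queryFromUrl(url) {
  const match = Q_PARAM.exec(url);
  if (match === null) return null;
  // Form-encoded values use "+" for spaces; decode percent-escapes afterwards.
  return decodeURIComponent(match[1].replace(/\+/g, " "));
}

console.log(queryFromUrl("http://www.google.com/search?sourceid=chrome&ie=UTF-8&q=cats"));
```

In a KRL select statement the equivalent capture group would be bound with setting(query), as in the question's rule. decodeURIComponent throws on malformed percent-escapes, so real code may want a try/catch around it.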