
Regular expression to detect the search engine and search words


I need to detect which search engines refer visitors to my website. Since every search engine uses a different query-string parameter for the search terms (e.g. Google uses 'q=', Yahoo uses 'p='), I created a database of search engines with their URL regex patterns.

As an example: http://www.google.com/search?q=blabla&ie=utf-8&oe=utf-8&aq=t&rls=com.ubuntu:en-GB:official&client=firefox-a

The regex I created for Google is:

(http:)(\\/)(\\/)(www)(\\.)(google)(\\.).*(\\/)(search).*(&q=|\\?q=).*

(I am a newbie at regex, but so far it works.)

This detects that the URL belongs to Google. My problem is that I need to extract the search words from the URL above, and from the URLs of other search engines, but I don't know how to capture them with the regular expression. I tried extracting the query string from the URL with PHP functions and matching it against the pattern, but it returned nothing.

I hope I explained this clearly enough.

Any suggestions?


Solution

  • This blog entry about extracting keywords from the referrer seems like a good match for your problem.

    I found it using this search for 'extract query string from google referer url'. The search seems to have a number of helpful hits... I just did a sweep of the first few.
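
For what it's worth, here is a minimal sketch of the general idea: parse the referrer URL first, then look up the search parameter by name instead of trying to capture it with one large regex. It is written in Python purely for illustration; in PHP, `parse_url()` and `parse_str()` should play the same roles. The `SEARCH_PARAMS` table and the `extract_search_terms` function are hypothetical names, only the Google and Yahoo parameter names come from the question, and the substring match on the hostname is a deliberate simplification.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical lookup table: a hostname fragment mapped to the query-string
# key that engine uses for search terms. Only 'q' (Google) and 'p' (Yahoo)
# come from the question; extend it from your own database.
SEARCH_PARAMS = {
    "google.": "q",
    "yahoo.": "p",
}

def extract_search_terms(referrer):
    """Return (engine, terms) for a known search referrer, else None."""
    parts = urlparse(referrer)
    host = parts.hostname or ""
    for fragment, key in SEARCH_PARAMS.items():
        # Simplification: a substring test on the hostname, not a strict match.
        if fragment in host:
            # parse_qs decodes the query string into {name: [values, ...]}.
            values = parse_qs(parts.query).get(key)
            if values:
                return fragment.rstrip("."), values[0]
    return None

url = ("http://www.google.com/search?q=blabla&ie=utf-8&oe=utf-8&aq=t"
       "&rls=com.ubuntu:en-GB:official&client=firefox-a")
print(extract_search_terms(url))  # ('google', 'blabla')
```

Because the query string is decoded by the parser, the order of the parameters and any URL encoding in the search words stop mattering, which is where a hand-written pattern tends to break.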