javascript node.js search-engine google-search-api google-custom-search

Google custom search on the whole web and limitations (gizoogle)


I am working on a search engine that needs access to Google's results. Here are my options:

  • Using the Custom Search API
  • Using a proxy, so that my server sends the searches to Google and returns the data

I am not sure about some things though:

Is the Custom Search API limited? I may need a very large number of queries, so a usage limit would be a problem.

Is it "authorized" to use a proxy in Node that sends search queries to Google and intercepts the results to show them to my users? If I do so, won't I run into some limitations?

The inspiration here is gizoogle, which managed to plug into a Google API (it shows the same results as Google) without using Custom Search (Custom Search displays ads, and there aren't any on that website). So I assume they have some sort of proxy, but how come Google lets them run those queries?

Edit: It turns out that the Custom Search API is also limited. So how did gizoogle do it?


Solution

  • OK, here is how I solved this problem:

    It turns out that Google has a forgotten API (probably deprecated, so be aware of this) for client-side AJAX search. It looks like this:

    http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=test&rsz=large

    Just go to that URL to see what results it gives.

    So basically here is the process:

    • The user types a search
    • It is sent to your server via AJAX
    • The server may modify the search depending on your application (filtering forbidden words, for example)
    • Your server queries Google's AJAX web service. Don't forget to add the GET parameter userip, which is needed to avoid rate limits: Google limits incoming queries per user, so your server has to tell Google that it is making the request on behalf of that user's IP address
    • Your server sends the results back to the client, which then uses JavaScript to display them
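
The server-side part of that process could look roughly like the sketch below. It is only an illustration under some assumptions: the function names and the FORBIDDEN_WORDS list are made up for this example, the lowercase spelling `userip` for the IP parameter is my assumption about the old API, a global `fetch` (Node 18 or later) is assumed, and since the endpoint is deprecated it may no longer answer at all.

```javascript
// Endpoint quoted in the answer; the API is deprecated and may not respond.
const GOOGLE_AJAX_SEARCH = "http://ajax.googleapis.com/ajax/services/search/web";

// Hypothetical filter list standing in for "filtering forbidden words".
const FORBIDDEN_WORDS = ["forbidden"];

// Let the server adjust the query before forwarding it.
function sanitizeQuery(rawQuery) {
  return rawQuery
    .split(/\s+/)
    .filter((word) => word && !FORBIDDEN_WORDS.includes(word.toLowerCase()))
    .join(" ");
}

// Build the upstream URL, passing the end user's IP address along.
function buildSearchUrl(query, userIp) {
  const params = new URLSearchParams({
    v: "1.0",
    q: query,
    rsz: "large",
    userip: userIp, // assumption: lowercase spelling of the parameter name
  });
  return `${GOOGLE_AJAX_SEARCH}?${params.toString()}`;
}

// Call the upstream service and hand the parsed JSON back to the caller.
// `fetchImpl` is injectable so the flow can be exercised without the network.
async function proxySearch(rawQuery, userIp, fetchImpl = fetch) {
  const url = buildSearchUrl(sanitizeQuery(rawQuery), userIp);
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Upstream search failed with status ${response.status}`);
  }
  return response.json();
}
```

In an HTTP handler you would call something like proxySearch(queryFromClient, clientIpAddress) and write the returned object back to the browser as JSON. How you obtain the client's IP depends on your framework and on whether you sit behind a reverse proxy.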

    The only drawback is that the search must be made via AJAX, meaning that the page is empty at load and filled in later. However, you could use GET parameters in the page URL to run the search on the server and fill the page before sending it to the client.
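
As a rough sketch of that preloading idea, the server can read the search terms from the page URL and build the HTML itself. Everything named here is hypothetical: the `q` parameter name, the helper functions, and the `search` callback, which is assumed to return an array of { title, url } objects (mapping Google's actual response into that shape is left out).

```javascript
// Escape text before placing it into HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Read the search terms from a page URL such as "/search?q=test".
function queryFromUrl(requestUrl) {
  // The base is only there so that relative request paths can be parsed.
  const parsed = new URL(requestUrl, "http://localhost");
  return parsed.searchParams.get("q") || "";
}

// Build the page on the server so that it arrives already filled in.
// `search` is a hypothetical async function returning [{ title, url }, ...].
async function renderSearchPage(requestUrl, search) {
  const query = queryFromUrl(requestUrl);
  const results = query ? await search(query) : [];
  const items = results
    .map((r) => `<li><a href="${escapeHtml(r.url)}">${escapeHtml(r.title)}</a></li>`)
    .join("");
  return `<input name="q" value="${escapeHtml(query)}"><ul>${items}</ul>`;
}
```

A request without a query still gets a valid, empty page, so the same template can serve both the first visit and a preloaded search.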