powershell, web-scraping, web-content

Get First Search Result from Google


I am currently attempting to use PowerShell to scrape links from a specific site. I have knocked up a variation of the current code, but it is essentially the same.

I am trying to get the URL of the first Google search result. The screenshot below shows what I am hoping to grab.

[Screenshot: the Google search result the asker wants to extract]

So far I have the following code, which converts the search text into a query string and works as expected. However, when I request the resulting URL with Invoke-WebRequest, I don't get any meaningful results. Opening the same link in a browser works successfully.

function Get-GoogleSEQueryString 
{
    param([string[]] $Query)

    Add-Type -AssemblyName System.Web # To get UrlEncode()
    $QueryString = ($Query | %{ [Web.HttpUtility]::UrlEncode($_)}) -join '+'

    # Return the query string
    $QueryString
}

$SearchString = "Requiem for an American Dream"
$QueryString = Get-GoogleSEQueryString $SearchString
$url = "http://www.google.com.au/?gfe_rd=cr&ei=ZuzTV_v6B7Du8weC8qsY#q="+$QueryString+"+site:IMDB.com"

#(Invoke-WebRequest -Uri $url).links | Where-Object {$_.href -like "http*"}

$t = Invoke-WebRequest -uri $url
$t.AllElements | Where {$_.innerhtml -like '*=*'} |Sort { $_.InnerHtml.Length } | Out-GridView

Can anyone help with this problem?


Solution

  • To summarize the comments as an answer: Google's main search page doesn't contain the search results in its HTML. It only has some containers; the results are fetched while the page loads and are inserted into the HTML DOM dynamically.

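    When you download the page with Invoke-WebRequest, you only get the container HTML without the results, because no script runs to fill it in. Note also that the URL in the question puts the query after #, in the fragment, and a URL fragment is never sent to the server; only the browser's JavaScript reads it. You can see the same thing if you select 'View source' on the Google search result page.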

    You can try other search engines or use web services to fetch the data.

    Google's own web service for this, the Custom Search JSON API, is documented here: https://developers.google.com/custom-search/json-api/v1/reference/cse/list
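
As a rough illustration of the web-service route suggested above, here is a minimal sketch that asks the Custom Search JSON API for a single result and returns its link. It assumes you have created an API key and a search engine ID (the cx value) in Google's console; the function names Get-CustomSearchUri and Get-FirstSearchResultLink and the placeholder values are made up for this example and are not from the original post.

```powershell
# Sketch only: function names, $ApiKey and $EngineId are hypothetical placeholders.
function Get-CustomSearchUri {
    param(
        [Parameter(Mandatory)] [string] $Query,
        [Parameter(Mandatory)] [string] $ApiKey,    # assumed: your own API key
        [Parameter(Mandatory)] [string] $EngineId,  # assumed: your search engine ID (cx)
        [string] $Site                              # optional site restriction
    )

    # EscapeDataString percent-encodes spaces and reserved characters.
    $pairs = @(
        "key=$([uri]::EscapeDataString($ApiKey))"
        "cx=$([uri]::EscapeDataString($EngineId))"
        "q=$([uri]::EscapeDataString($Query))"
        "num=1"   # only the first result is needed
    )
    if ($Site) { $pairs += "siteSearch=$([uri]::EscapeDataString($Site))" }

    # The query goes in the real query string (after ?), not in a # fragment,
    # so it actually reaches the server.
    "https://www.googleapis.com/customsearch/v1?" + ($pairs -join '&')
}

function Get-FirstSearchResultLink {
    param([Parameter(Mandatory)] [string] $Uri)

    # Invoke-RestMethod parses the JSON response into an object.
    $response = Invoke-RestMethod -Uri $Uri

    # The 'items' property is missing when nothing matched; return nothing then.
    if ($response.items) { $response.items[0].link }
}

# Usage (needs network access and valid credentials):
# $uri = Get-CustomSearchUri -Query 'Requiem for an American Dream' `
#            -ApiKey 'YOUR_API_KEY' -EngineId 'YOUR_ENGINE_ID' -Site 'imdb.com'
# Get-FirstSearchResultLink -Uri $uri
```

Building the URI is kept separate from the network call so the URL can be inspected or logged before any request is made. The API returns JSON, so no HTML parsing is involved.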