Tags: javascript, python, google-search

Control JavaScript translation in Google search


The code performs a Google search using the search function from the __init__.py below:

def search(term, num_results=10, lang="en", lr="lang_en"):
    usr_agent = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) '
                      'Chrome/61.0.3163.100 Safari/537.36'}

    def fetch_results(search_term, number_results, language_code):
        escaped_search_term = search_term.replace(' ', '+')

        google_url = 'https://www.google.com/search?q={}&num={}&hl={}&lr={}'.format(escaped_search_term, number_results+1,
                                                                              language_code, lr)
...
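
For reference, the same query string can also be assembled with urllib.parse.urlencode, which escapes more than just spaces. This is only a minimal sketch, assuming the same q, num, hl and lr parameters as above; build_google_url is a hypothetical helper name and is not part of googlesearch:

```python
from urllib.parse import urlencode


def build_google_url(term, num_results=10, lang="en", lr="lang_en"):
    # Hypothetical helper: mirrors the URL format shown above, but lets
    # urlencode escape spaces and other special characters in the query.
    params = {"q": term, "num": num_results + 1, "hl": lang, "lr": lr}
    return "https://www.google.com/search?" + urlencode(params)


print(build_google_url("renault clio"))
```

Printing the URL before sending the request makes it easy to check which values actually end up in hl and lr.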

Some of the returned links use JavaScript to translate the website:

<script type="text/javascript">
        var home = '/de/', root = '/', country = 'ch', language = 'de', w = {
            "download_image": "Bild download (Niedrige Qualität)",
...

Another example:

<script>
                dataLayer.push({
                    'brand' : 'Renault',
                    'countryCode' : 'BE',
                    'googleAccount' : 'UA-23041452-1',
                    'adobeAccount' : 'renaultbeprod',
                    'languageCode' : 'nl',
...

Is there a way to filter out the results translated through JavaScript and get search results in only one language?


Solution

  • My apologies. Please disregard this question. I made a trivial mistake: I put a triple-quoted comment inside a function call in another part of the code. This prevented the parameters from being passed correctly.

    After removing the triple-quoted comment from the call below, I get results only in English:

    for google_url in search(query,  # The query you want to run
                    lang='en',  # User interface language (host language)
                    num_results = 10,  # Number of results per page
                    lr="lang_en"  # Language of the documents received
    '''
    lr - parameter is implemented in __init__.py of googlesearch
        It should be handled only here.
    Other useful search parameters not used yet are:
    cr - restricts search results to documents originating in a particular country.
        (ex. cr=countryCA)
    gl - boosts search results whose country of origin matches the parameter value.
        (ex. gl=uk)
    '''
                    ):
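
    The likely mechanism: in Python, two string literals that sit next to each other with no comma between them are joined into one string, so a triple-quoted "comment" placed right after lr="lang_en" becomes part of the lr value. A minimal sketch of that effect, where show is a hypothetical stand-in for search and the comment text is shortened:

```python
def show(lr):
    # Hypothetical stand-in for search(): return the argument unchanged
    # so we can inspect what was actually passed.
    return lr


# A string literal placed right after another one (no comma in between)
# is concatenated with it, so the "comment" ends up inside lr.
broken = show("lang_en"  # Language of the documents received
'''
    lr - parameter notes
''')

# Real comments use '#', which never changes the value.
fixed = show("lang_en")

print(repr(broken))
print(repr(fixed))
```

    With the triple-quoted block removed (or rewritten as # comments), lr is passed as plain "lang_en" again.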