
Python script to translate via Google Translate


I'm trying to learn Python, so I decided to write a script that translates text using Google Translate. So far I have written this:

import sys
from BeautifulSoup import BeautifulSoup
import urllib2
import urllib

data = {'sl':'en','tl':'it','text':'word'} 
request = urllib2.Request('http://www.translate.google.com', urllib.urlencode(data))

request.add_header('User-Agent', 'Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.11) Gecko/20071127 Firefox/2.0.0.11')
opener = urllib2.build_opener()
feeddata = opener.open(request).read()
#print feeddata
soup = BeautifulSoup(feeddata)
print soup.find('span', id="result_box")
print request.get_method()

And now I'm stuck. I can't see any bugs in it, but it still doesn't work: the script runs, but it won't translate the word.

Does anyone know how to fix it? (Sorry for my poor English)


Solution

  • Google Translate is meant to be used with a GET request, not a POST request. However, urllib2 will automatically submit a POST if you add any data to your request.

    The solution is to construct the URL with a query string, so that you submit a GET.
    You'll need to alter the request = urllib2.Request('http://www.translate.google.com', urllib.urlencode(data)) line of your code.

    Here goes:

    querystring = urllib.urlencode(data)
    request = urllib2.Request('http://www.translate.google.com' + '?' + querystring)
    

    And you will get the following output:

    <span id="result_box" class="short_text">
        <span title="word" onmouseover="this.style.backgroundColor='#ebeff9'" onmouseout="this.style.backgroundColor='#fff'">
            parola
        </span>
    </span>
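
The GET/POST behaviour described above can be checked without touching the network. The sketch below is a Python 3 port, which is an assumption on my part since the question uses Python 2: there, `urllib2.Request` lives in `urllib.request` and `urllib.urlencode` in `urllib.parse`. The names `BASE_URL`, `post_request` and `get_request` are only illustrative.

```python
# Python 3 sketch: urllib2.Request is now urllib.request.Request,
# and urllib.urlencode is now urllib.parse.urlencode.
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = 'http://www.translate.google.com'  # URL taken from the question
data = {'sl': 'en', 'tl': 'it', 'text': 'word'}

# Passing a body as the second argument turns the request into a POST.
# In Python 3 the body must be bytes, hence the encode().
post_request = Request(BASE_URL, urlencode(data).encode('ascii'))

# Appending the same pairs as a query string leaves the body empty,
# so the request is sent as a GET.
get_request = Request(BASE_URL + '?' + urlencode(data))

print(post_request.get_method())  # POST
print(get_request.get_method())   # GET
print(get_request.full_url)
```

Nothing is sent here; the `Request` objects are only built and inspected, so this says nothing about what the server returns today.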
    

    By the way, you're kinda breaking Google's terms of service; look into them if you're doing more than hacking a little script for training.

    Using requests

    I strongly advise you to stay away from urllib if possible, and use the excellent requests library, which makes working with HTTP in Python much simpler.
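
As a rough illustration of the same GET request with requests, here is a minimal sketch in Python 3. It assumes the URL and parameter names from the question; `TRANSLATE_URL`, `build_request` and `fetch_page` are hypothetical helper names, and there is no guarantee the page still responds with the `result_box` markup shown earlier.

```python
import requests  # third-party: pip install requests

# URL and parameter names are taken from the question.
TRANSLATE_URL = 'http://www.translate.google.com'


def make_params(text, source='en', target='it'):
    """Return the query parameters used in the question's script."""
    return {'sl': source, 'tl': target, 'text': text}


def build_request(text, source='en', target='it'):
    """Prepare (but do not send) a GET request, so the URL can be inspected."""
    # Anything passed as `params` is encoded into the query string.
    request = requests.Request('GET', TRANSLATE_URL,
                               params=make_params(text, source, target))
    return request.prepare()


def fetch_page(text, source='en', target='it', timeout=10):
    """Send the GET request and return the response body as text."""
    response = requests.get(TRANSLATE_URL,
                            params=make_params(text, source, target),
                            timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of parsing an error page
    return response.text


prepared = build_request('word')
print(prepared.method)  # GET
print(prepared.url)
```

Only `build_request` runs here, so no network traffic occurs. Calling `fetch_page('word')` would send the request, and its result could then be handed to BeautifulSoup as in the original script.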