python python-2.7 wxpython mechanize jython

How do I write a program to click a particular link in Python?


My program takes user input and searches for it on a particular webpage. Next, I want it to click a particular link on the results page and download the file found there.

Example:

  1. The webpage: http://www.rcsb.org/pdb/home/home.do
  2. The search word: "1AW0"
  3. After you search for the word, the website takes you to: http://www.rcsb.org/pdb/explore/explore.do?structureId=1AW0

I want the program to go to the right-hand side of that page and download the PDB file from the DOWNLOAD FILES option.

I have managed to write a program using the mechanize module that automatically searches for the word, but I cannot find a way to click on a link.

My code:

import urllib2
import re
import mechanize

br = mechanize.Browser()
br.open("http://www.rcsb.org/pdb/home/home.do")
# Name of the form that holds the search text area
br.select_form("headerQueryForm")

# "q" is the name of the textarea in the page's HTML
br["q"] = "1AW0"
response = br.submit()
print response.read() 

Any help or suggestions would be appreciated.

By the way, I am an intermediate Python programmer, and I am trying to learn Jython to try to make this work.

Thanks in advance


Solution

  • Here's how I would have done it:

    '''
    Created on Dec 9, 2012
    
    @author: Daniel Ng
    '''
    
    import urllib
    
    def fetch_structure(structureid, filetype='pdb'):
      # Direct download URL; filled in with the file format and structure ID
      download_url = 'http://www.rcsb.org/pdb/download/downloadFile.do?fileFormat=%s&compression=NO&structureId=%s'
      filetypes = ['pdb', 'cif', 'xml']
      if filetype not in filetypes:
        print "Invalid filetype...", filetype
      else:
        # Save as <structureid>.<filetype> in the current working directory
        filename = '%s.%s' % (structureid, filetype)
        try:
          urllib.urlretrieve(download_url % (filetype, structureid), filename)
        except Exception as e:
          print "Download failed...", e
        else:
          print "Saved to", filename
    
    if __name__ == "__main__":
      fetch_structure('1AW0')
      fetch_structure('1AW0', filetype='xml')
      fetch_structure('1AW0', filetype='png')
    

    This produces the following output:

    Saved to 1AW0.pdb
    Saved to 1AW0.xml
    Invalid filetype... png
    

    It also saves the two files, 1AW0.pdb and 1AW0.xml, to the current working directory (the script directory in this example).
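
The idea in the answer is to skip the "click" entirely and build the download URL directly from the structure ID and file format. A minimal sketch of just that URL-building step, separated from the network call so it can be checked offline, might look like the following. The helper names `build_download_url` and `local_filename` are hypothetical, and the URL pattern is simply the one used in the answer's code; whether the site still serves it is not verified here. The sketch avoids `print` so that it runs on both Python 2.7 and Python 3.

```python
# URL pattern taken from the answer's code (assumed to still be valid).
DOWNLOAD_URL = ('http://www.rcsb.org/pdb/download/downloadFile.do'
                '?fileFormat=%s&compression=NO&structureId=%s')
VALID_FILETYPES = ('pdb', 'cif', 'xml')


def build_download_url(structure_id, filetype='pdb'):
    """Return the direct download URL; raise ValueError for an unknown filetype."""
    if filetype not in VALID_FILETYPES:
        raise ValueError('Invalid filetype: %r' % (filetype,))
    return DOWNLOAD_URL % (filetype, structure_id)


def local_filename(structure_id, filetype='pdb'):
    """Return the name the file would be saved under, e.g. '1AW0.pdb'."""
    return '%s.%s' % (structure_id, filetype)
```

The resulting URL can then be passed to `urllib.urlretrieve` together with `local_filename(...)` as the destination. Raising `ValueError` instead of printing lets the caller decide how to report a bad file type.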

    Reference for urllib.urlretrieve: http://docs.python.org/2/library/urllib.html#urllib.urlretrieve
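
If the download URL cannot be guessed and the link really has to be "clicked", one option is to find the link in the returned HTML by its visible text and resolve its `href` against the page URL. The following is a rough sketch using only the standard library; `LinkCollector` and `find_link` are hypothetical names, and the link text to search for is an assumption, since the actual markup of the page is not shown in the question. The imports are written so that the block works on Python 2.7 as well as Python 3.

```python
try:  # Python 3
    from html.parser import HTMLParser
    from urllib.parse import urljoin
except ImportError:  # Python 2.7
    from HTMLParser import HTMLParser
    from urlparse import urljoin


class LinkCollector(HTMLParser):
    """Collect (text, href) pairs for every <a href=...> element."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.links = []
        self._href = None   # href of the <a> currently being read, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            self._href = dict(attrs).get('href')
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._href is not None:
            self.links.append((''.join(self._text).strip(), self._href))
            self._href = None


def find_link(html, base_url, text):
    """Return the absolute URL of the first link whose text contains `text`."""
    parser = LinkCollector()
    parser.feed(html)
    for link_text, href in parser.links:
        if text.lower() in link_text.lower():
            return urljoin(base_url, href)
    return None
```

With the question's code, this could be called as `find_link(response.read(), br.geturl(), 'PDB File')`, where `'PDB File'` is only a guess at the link text; the returned URL can then be downloaded like any other. mechanize also provides `br.follow_link(text_regex=...)`, which does the lookup and the request in one step.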