
Form submission with mechanize and Python


I'm trying to scrape a website that requires submitting two forms: a first one to log in and a second one to specify my search. I'm using Python and the mechanize package.

No problem with the first one, but I just can't figure out how to get through the second one.

Here is the part of my code related to the form mentioned above:

agemin=18
agemax=25
by='region'
country='France'
region=2
newcustomers=1

browser.select_form(nr=0)
browser['age[min]']=agemin
browser['age[max]']=agemax
browser['country']=country
browser['region']=region
browser['by']=by
browser['new-customers']=newcustomers

response=browser.submit()
content=response.read()

But when I set a variable, 'age[min]' for example, I get the following error message:

TypeError: object of type 'int' has no len()

For more information, here is what I get with 'print br.form':

<POST http://www.adopteunmec.com/qsearch/ajax_quick application/x-www-form-urlencoded
  <SelectControl(age[min]=[, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, *30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])>
  <SelectControl(age[max]=[, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, *45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99])>
  <SelectControl(by=[*region, distance])>
  <SelectControl(country=[*fr, be, ch, ca])>
  <SelectControl(region=[*1, 2, 3, 4, 5, 6, 7, 8, 22, 23, 9, 10, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 11])>
  <SelectControl(distance[min]=[*, 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800, 810, 820, 830, 840, 850, 860, 870, 880, 890, 900, 910, 920, 930, 940, 950, 960, 970, 980, 990, 1000])>
  <SelectControl(distance[max]=[, 0, 10, 20, 30, 40, 50, 60, 70, *80, 90, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680, 690, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800, 810, 820, 830, 840, 850, 860, 870, 880, 890, 900, 910, 920, 930, 940, 950, 960, 970, 980, 990, 1000])>
  <CheckboxControl(new=[*1])>>

My guess is that the form needs an object (like a list) containing all the variables in order to accept them; that's why it refuses the variables submitted one by one.

Thank you in advance for any help!

Alexis


Solution

  • agemin should be a string, or be cast to a string.

    In addition, the setter for the value of a form's select control expects a list.

    So, either

    agemin="25"
    

    and

    browser['age[min]']=[agemin]
    

    or just

    agemin=25           # an int, as in your code
    browser['age[min]']=[str(agemin)]
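
The same rule can be applied to every control of the form at once. The following is only a minimal sketch: `as_select_value` and `fill_form` are hypothetical helper names, not part of mechanize, and it assumes that all controls in this form (the selects and the checkbox) accept a list of strings. The control names and option values are taken from the printed form in the question, which lists the country as 'fr' rather than 'France' and names the checkbox 'new' rather than 'new-customers'.

```python
def as_select_value(value):
    """Wrap a value in the list-of-strings shape that a list control setter expects."""
    if isinstance(value, (list, tuple)):
        return [str(v) for v in value]
    return [str(value)]


# Control names and option values as they appear in the printed form.
search_values = {
    'age[min]': 18,
    'age[max]': 25,
    'by': 'region',
    'country': 'fr',   # the form offers fr, be, ch, ca
    'region': 2,
    'new': 1,          # the checkbox control is named 'new'
}


def fill_form(browser, values):
    """Assign every value to the currently selected form, converted to a list of strings."""
    for name, value in values.items():
        browser[name] = as_select_value(value)


# Hypothetical usage, assuming `browser` is the logged-in mechanize browser
# from the question:
# browser.select_form(nr=0)
# fill_form(browser, search_values)
# response = browser.submit()
# content = response.read()
```

Keeping the conversion in one helper means the values can stay as plain ints or strings in your own code, and only get wrapped at the point where they are assigned to the form.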