Problems with authenticating an aspx page from within Python

There are a couple of related questions on here, but I haven't been able to solve my problem by looking at their answers so I thought I would give this a go.

Basically I am trying to download some *.zip files from a website that requires a username/password. This is the website login page:

Once logged in (in a normal browser session), I can download the *.zip files I need by following the download links, such as:

My attempt so far uses the cookielib, urllib, urllib2 and HTMLParser libraries. I use HTMLParser to read the values of __VIEWSTATE and __EVENTVALIDATION, as I read that it is important to submit the same values back with the form. However, when I POST the correct login data to the login page, I just get the (unauthenticated) login page back. I'm really not sure what I am doing wrong, but any help would be much appreciated.

import cookielib
import urllib
import urllib2
from HTMLParser import HTMLParser

class IceConnection(object):

    def __init__(self, username, password):

        self.username = username
        self.password = password
        self.url = ""
        self.headers = [
                    ('user-agent','Mozilla/5.0 (Windows NT 6.3; WOW64; rv:30.0) Gecko/20100101 Firefox/30.0'),
                    ('accept-encoding','gzip, deflate'),
                    ]

        self.cookies = cookielib.CookieJar()
        #The cookie processor stores session cookies and resends them on later requests
        self.opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(self.cookies))
        self.opener.addheaders = self.headers

        #Extract view_state and event_validation variables:
        field_names = [r'__VIEWSTATE', r'__EVENTVALIDATION']
        field_values = self.extractFields(field_names)

        view_state = field_values[0]
        event_validation = field_values[1]

        self.fields = (
            (r'__EVENTTARGET', r''),
            (r'__EVENTARGUMENT', r''),
            (r'__VIEWSTATE', view_state),
            (r'__EVENTVALIDATION', event_validation),
            (r'ctl00$ContentPlaceHolder1$LoginControl$m_userName', username),
            (r'ctl00$ContentPlaceHolder1$LoginControl$m_password', password)
        )

        login_data = urllib.urlencode(self.fields)
        #Passing a data argument makes the opener send a POST request
        response = self.opener.open(self.url, login_data)
        print response.read()

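    def extractFields(self, field_names):
        #Fetch the login page (GET) so the hidden form fields can be read from it
        response = self.opener.open(self.url)
        html = ''.join(response.readlines())

        ret = list()

        for field in field_names:
            parser = PageParser(field)
            parser.feed(html)
            ret.append(parser.value)

        return ret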

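class PageParser(HTMLParser):
    def __init__(self, field_name):
        #The base class must be initialised before feed() can be called
        HTMLParser.__init__(self)
        self.field = field_name
        self.value = None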

    def handle_starttag(self, tag, attrs):
        if tag == 'input':
            #Create dictionary of attributes
            attributes = dict(attrs)

            #Use get() so inputs without a name or value attribute do not raise KeyError
            if attributes.get('name') == self.field:
                self.value = attributes.get('value')


  • I have actually managed to solve my issue by using my browser (Google Chrome) to look at the POST Headers sent to the server. I noticed this line:


    So I replaced the blank string in my code with the above line and it now works!
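
    The exact line copied from the browser is not reproduced here. As a rough sketch of one way to avoid leaving out a field that the browser sends, the snippet below collects every named input on the login page instead of listing the fields by hand, then overrides only the username and password. The sample HTML, its field names, and the helper names FormFieldCollector and build_login_data are made up for illustration; they are not taken from the real site. The imports fall back to the Python 2 module names used in the question.

```python
try:  # Python 3 module names
    from html.parser import HTMLParser
    from urllib.parse import urlencode
except ImportError:  # Python 2 module names, as used in the question
    from HTMLParser import HTMLParser
    from urllib import urlencode


class FormFieldCollector(HTMLParser):
    """Collect (name, value) pairs for every named <input> tag."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.fields = []

    def handle_starttag(self, tag, attrs):
        if tag != 'input':
            return
        attributes = dict(attrs)
        name = attributes.get('name')
        if name:
            # Inputs without a value attribute are sent as empty strings
            self.fields.append((name, attributes.get('value') or ''))


# Hypothetical login form, for illustration only
SAMPLE_HTML = '''
<form method="post" action="Login.aspx">
  <input type="hidden" name="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" value="def456" />
  <input type="text" name="ctl00$Main$userName" />
  <input type="password" name="ctl00$Main$password" />
  <input type="submit" name="ctl00$Main$loginButton" value="Log in" />
</form>
'''


def build_login_data(html, overrides):
    """Return a URL-encoded POST body built from the page's own inputs."""
    collector = FormFieldCollector()
    collector.feed(html)
    # Keep the page's values unless the caller supplies a replacement
    fields = [(name, overrides.get(name, value))
              for name, value in collector.fields]
    return urlencode(fields)


login_data = build_login_data(SAMPLE_HTML, {
    'ctl00$Main$userName': 'alice',
    'ctl00$Main$password': 's3cret',
})
```

    Note that a browser only sends the submit button that was actually clicked, so on a page with several buttons the collected list would need filtering before it is posted. Comparing the result with the POST data shown in the browser's developer tools is still the most reliable check.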