
How to use Python to submit form within Wunderground History page


I would like to do the following.

1) Go to https://www.wunderground.com/history

2) Submit the form with the following values:

Location = 'Los Angeles, CA'

Month = 'November'

Day = '02'

Year = '2017'

3) Retrieve the response and save to a file

Here is what I have at the moment, but it does not work:

import requests
url = 'https://www.wunderground.com/history'
payload = {'code': 'Los Angeles, CA', 'month': 'November', 'day': '2', 'year':'2017'}
r = requests.post(url, params=payload)

with open('test.html', 'w') as f:
    f.write(r.text)

I'm not getting the expected response, and I'm not sure whether I am using requests properly.

I know Wunderground offers an API, but I would prefer not to use it at the moment.

The content of test.html is basically the original page with no historical data.

I'm expecting this page:

https://www.wunderground.com/history/airport/KCQT/2017/11/2/DailyHistory.html?req_city=Los+Angeles&req_state=CA&req_statename=California&reqdb.zip=90012&reqdb.magic=1&reqdb.wmo=99999


Solution

  • You cannot blindly send payloads to some websites and expect a good result. First, look at the source code of the form element. I removed some unimportant parts:

    <form action="/cgi-bin/findweather/getForecast" method="get" id="trip">
        <input type="hidden" value="query" name="airportorwmo" />
        <input type="hidden" value="DailyHistory" name="historytype" />
        <input type="hidden" value="/history/index.html" name="backurl" />
        <input type="text" value="" name="code" id="histSearch" />
    
        <select class="month" name="month">
            <option  value="1">January</option>
            <option  value="2">February</option>
            ...
            <option  value="12">December</option>
        </select>
    
        <select class="day" name="day">
            <option>1</option>
            <option>2</option>
            ...
            <option>31</option>
        </select>
    
        <select class="year" name="year">
            <option>2018</option>
            <option>2017</option>
            ...
            <option>1945</option>
        </select>
    
        <input type="submit" value="Submit" class="button radius" />
    </form>
    

    Two things follow from this markup. The method attribute of the form element shows that you have to use the GET method, not POST, to send the payload. The action attribute shows that the payload should go to this specific URL, not to the /history page itself:

    https://www.wunderground.com/cgi-bin/findweather/getForecast
    

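    The payload is not just the values you want to send. In many cases additional values have to be sent for the web server to respond correctly, such as the hidden inputs in this form. It is usually best either to send everything (basically every field with a name attribute) or to inspect what the browser actually sends when you submit the form. Also use the values the form itself submits: the month options carry numeric value attributes, so November is sent as 11, not 'November'.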

    This code works for me:

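    import requests
    
    # Target taken from the form's action attribute, not the /history page.
    URL = 'https://www.wunderground.com/cgi-bin/findweather/getForecast'
    CODE = 'Los Angeles, CA'
    DAY = 2
    MONTH = 11  # numeric value of the "November" option
    YEAR = 2017
    
    params = {
        # Hidden inputs that the form submits alongside the visible fields.
        'airportorwmo': 'query',
        'historytype':  'DailyHistory',
        'backurl':      '/history/index.html',
        # Visible fields filled in by the user.
        'code':         CODE,
        'day':          DAY,
        'month':        MONTH,
        'year':         YEAR,
    }
    
    # The form uses method="get", so the values go into the query string.
    r = requests.get(URL, params=params)
    print(r.text)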
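
The advice above to "send everything with a name attribute" can also be automated instead of copied by hand. The following is only a minimal sketch using the standard library: `FormFieldCollector` and `build_request` are hypothetical helper names, `FORM_HTML` is a trimmed copy of the form quoted in the answer, and the `<select>` elements are deliberately skipped because their values are supplied by the caller. It makes no network request, so it only shows how the target URL and parameters would be derived.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

# Trimmed copy of the form quoted in the answer (assumption: the live page
# may differ from this snapshot).
FORM_HTML = """
<form action="/cgi-bin/findweather/getForecast" method="get" id="trip">
    <input type="hidden" value="query" name="airportorwmo" />
    <input type="hidden" value="DailyHistory" name="historytype" />
    <input type="hidden" value="/history/index.html" name="backurl" />
    <input type="text" value="" name="code" id="histSearch" />
    <input type="submit" value="Submit" class="button radius" />
</form>
"""


class FormFieldCollector(HTMLParser):
    """Hypothetical helper: records a form's method, action and named inputs."""

    def __init__(self):
        super().__init__()
        self.method = None
        self.action = None
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self.method = (attrs.get("method") or "get").lower()
            self.action = attrs.get("action") or ""
        elif tag == "input" and attrs.get("name"):
            # Inputs without a name (the submit button here) are not submitted.
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_request(page_url, form_html, user_values):
    """Return (method, target_url, params) for submitting the form."""
    collector = FormFieldCollector()
    collector.feed(form_html)
    params = dict(collector.fields)  # defaults, including the hidden inputs
    params.update(user_values)       # values a user would enter or select
    target = urljoin(page_url, collector.action)  # resolve the relative action
    return collector.method, target, params


method, target, params = build_request(
    "https://www.wunderground.com/history",
    FORM_HTML,
    {"code": "Los Angeles, CA", "month": 11, "day": 2, "year": 2017},
)
full_url = target + "?" + urlencode(params)
print(method)
print(full_url)
```

With these pieces, the request itself would be `requests.get(target, params=params)`, and step 3 of the question (saving the response) can be done by writing `r.text` to a file opened with `encoding='utf-8'`. Whether the site still answers this URL the same way is not something the sketch can confirm.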