python html python-2.7 get httplib

html get request "you didn't select a category"


I'm new to HTTP and need some help. I'm trying to fill out a search form on craigslist so that I can get the link to the page I would normally get by filling out the form manually. By viewing the source, I've found this form:

<form id="search" action="/search/" method="GET">
            <div>search craigslist</div>
            <input type="hidden" name="areaID" value="372">
            <input type="hidden" name="subAreaID" value="">
            <input id="query" name="query" autocorrect="off" autocapitalize="off"><br>
            <select id="catAbb" name="catAbb">
                <option value="ccc">community</option>
                <option value="eee">events</option>
                <option value="ggg">gigs</option>
                <option value="hhh">housing</option>
                <option value="jjj">jobs</option>
                <option value="ppp">personals</option>
                <option value="res">resumes</option>
                <option value="sss" selected="selected">for sale</option>
                <option value="bbb">services</option>
            </select>


<input id="go" type="submit" value="&gt;">
    </form>

So I wrote this code to fill out the form:

import urllib,httplib
conn = httplib.HTTPConnection("auburn.craigslist.org")
params = urllib.urlencode({'query': 'english tutor', 'catAbb': 'bbb'})
conn.request("GET","/search",params)
response = conn.getresponse()
print response.read()

I'm not sure about everything, e.g. how do I specify which form I want to fill out? I assumed it is by specifying "/search" as in the form's "action", but should it really be in the 'url' argument of httplib's request? Anyway, instead of getting a URL to my desired results page, I get this HTML page:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
    <title>auburn craigslist search</title>
    <blockquote>
        <b>You did not select a category to search.</b>
    </blockquote>

But I'm pretty sure I did select a category. What should I do? Thanks!


Solution

  • HTTP GET parameters are sent in the URL's query string, not as an encoded part of the request body as with POST. Change your Python to look like this and you should get what you are after:

    import urllib,httplib
    
    conn = httplib.HTTPConnection("auburn.craigslist.org")
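    # Encode the form fields as a query string, e.g. "query=english+tutor&catAbb=bbb"
    params = urllib.urlencode({'query': 'english tutor', 'catAbb': 'bbb'})
    # For GET, append the encoded fields to the path instead of passing them as the body
    conn.request("GET","/search?%s" % params)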
    response = conn.getresponse()
    
    print response.read()
    

    Also, it will make your life a lot easier if you pass the response to Beautiful Soup for parsing and extracting information.
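
To see what the fix changes without touching the network, here is a minimal sketch of how a GET form submission turns into a request path. It is written for Python 3, where `urllib.urlencode` lives in `urllib.parse` and `httplib` is named `http.client`. The helper `build_search_path` is a hypothetical name used only for illustration. Including the hidden `areaID` field is an assumption about what a browser would submit, not something the server is known to require.

```python
# Python 3 sketch: urllib.urlencode moved to urllib.parse.urlencode.
from urllib.parse import urlencode, urlsplit, parse_qs


def build_search_path(action, fields):
    """Return the path for a GET form: action + '?' + URL-encoded fields.

    Hypothetical helper, not part of any library.
    """
    return "%s?%s" % (action, urlencode(fields))


# Field names and values are taken from the form in the question.
# A list of pairs keeps the fields in a predictable order.
fields = [("areaID", "372"), ("query", "english tutor"), ("catAbb", "bbb")]
path = build_search_path("/search/", fields)
print(path)  # /search/?areaID=372&query=english+tutor&catAbb=bbb

# The server recovers the fields by parsing the query string,
# which is why a GET request body carrying them is not read.
parsed = parse_qs(urlsplit(path).query)
print(parsed["catAbb"])  # ['bbb']
```

The resulting `path` is what you would pass as the second argument to `conn.request("GET", path)`, leaving the body argument unset.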