
Accessing an iTop database (via OQL) with Python urllib2


I am trying to access an itop database via its generic web interface. I got it working with a shell script:

#!/bin/bash

export http_proxy=''
SERVER=itop-test
SELECT_STATEMENT="SELECT Ticket"

wget -q -O - \
--http-user=myusername \
--http-password=$(cat /home/dummy/private/.passwd) \
"http://${SERVER}.acme.org:8000/webservices/export.php?login_mode=basic&format=csv&expression=${SELECT_STATEMENT}&fields=${FIELDS}"

This produces csv output as desired. Now since the application I am building is in python, I would like to do the same in python:

#!/usr/bin/python

import csv
import urllib2
import base64


select_statement = 'SELECT Ticket'
fields = ''

itop_server = 'itop-test'
username = 'myusername'
passwd_file = '/home/dummy/private/.passwd'

# extract passwd
password = open(passwd_file,'r').read().replace('\n','')
# clear http_proxy (sometimes set on ACME systems)
proxy_support = urllib2.ProxyHandler({})
opener = urllib2.build_opener(proxy_support)
urllib2.install_opener(opener)

# build url
url = 'http://' + itop_server + \
     '.acme.org:8000/webservices/export.php?login_mode=basic&format=csv&expression='\
     + select_statement + '&fields=' + fields
request = urllib2.Request(url)
base64string = base64.standard_b64encode('%s:%s' % (username, password)).replace('\n', '')
request.add_header('Authorization', 'Basic %s' % base64string)   
result = urllib2.urlopen(request).read()

print result

However, the Python version does not work. The result contains, among other things:

<p>Error the query can not be executed.</p>
<p>Unexpected token End of Input, found &#039;0&#039; in: <b>S</b>ELECT</p>

I have checked that the URLs used are identical, so I guessed there must be a difference in the HTTP headers that are sent.

Here is some output from tcpdump -s 1024 -l -A dst itop-test.acme.org

First wget:

..........@..#..2\.P.9..t..GET
/webservices/export.php?login_mode=basic&format=csv&expression=SELECT%20Ticket&fields= HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)
Accept: */* 
Host: itop-test.acme.org:8000
Connection: Keep-Alive


..........@Q....=..P.9.....GET
/webservices/export.php?login_mode=basic&format=csv&expression=SELECT%20Ticket&fields= HTTP/1.0
User-Agent: Wget/1.12 (linux-gnu)
Accept: */* 
Host: itop-test.acme.org:8000
Connection: Keep-Alive
Authorization: Basic asdfasdfasdfasdf

Then Python:

..........@...W.@..P.9.....GET
/webservices/export.php?login_mode=basic&format=csv&expression=SELECT Ticket&fields= HTTP/1.1
Accept-Encoding: identity
Host: itop-test.acme.org:8000
Connection: close
Authorization: Basic asdfasdfasdfasdf
User-Agent: Python-urllib/2.6

I changed the User-Agent header for Python, but that did not help. I also tried changing the Connection header, but that did not work either.

Any ideas on what is going on here? What can I try to make this work? Maybe someone even understands what is going on? :)

Edit: It turns out that curl does not work either:

curl --user myusername:$(cat /home/dummy/private/.passwd) \
"http://${SERVER}.acme.org:8000/webservices/export.php?login_mode=basic&format=csv &expression=${SELECT_STATEMENT}&fields=${FIELDS}"

This gives the same result as Python urllib2. I also tried pycurl, with no success (same result as urllib2 and curl on the command line).


Solution

  • It turns out that, of the tools I tried, only wget translates whitespace in the URL into %20. If I replace it myself, it works, so I build my URL like this:

    url = 'http://' + itop_server + \
         '.acme.org:8000/webservices/export.php?login_mode=basic&format=xml&expression='\
         + select_statement.replace(' ','%20') + '&fields=' + fields
    

    This replaces whitespace automatically, so I can still write my select statements with whitespace.
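
Replacing only spaces works for a query like 'SELECT Ticket', but an OQL expression containing characters such as = or ' would presumably need encoding too. A more general variant, as a minimal sketch, is to percent-encode each query value with the standard library's quote function. The helper name build_export_url and its parameters are made up for illustration and are not part of iTop; the host and path are the ones from the question.

```python
try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2, as used in the question


def build_export_url(server, expression, fields='', fmt='csv'):
    """Hypothetical helper: build the export.php URL with encoded values."""
    base = 'http://' + server + '.acme.org:8000/webservices/export.php'
    # safe='' makes quote() encode every reserved character,
    # so a space becomes %20, '=' becomes %3D, and so on.
    query = '&'.join([
        'login_mode=basic',
        'format=' + quote(fmt, safe=''),
        'expression=' + quote(expression, safe=''),
        'fields=' + quote(fields, safe=''),
    ])
    return base + '?' + query


url = build_export_url('itop-test', 'SELECT Ticket')
print(url)
```

The resulting URL carries SELECT%20Ticket in the expression parameter, which matches the request line that wget sends in the tcpdump output above. The rest of the script (proxy handler, Authorization header, urlopen) should be able to stay as it is.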