
Python Web Scraping: trying to control the output


I'm having difficulty getting the output I need when I scrape this web page:

The Web Page

This is what I have:

import urllib2
from html2text import html2text

for line in html2text(urllib2.urlopen("").read()).split(','):
    if "traders_short" in line:
        print "Traders Short AUDUSD: ", line.split(":")[1].strip(' " ')
    if "traders_long" in line:
        print "Traders Long AUDUSD: ", line.split(":")[1].strip(' " ')

This is my output:

Traders Short AUDUSD:  
Traders Long AUDUSD:  88
Traders Long AUDUSD:  88

This is what I would like:

Traders Short AUDUSD: number
Traders Long AUDUSD: number 

So the problem is:

A) The output is repeating. I only want it to tell me how many traders are short or long ONCE.

B) I can't get rid of the ' " ' in the second line of the output, and I want the number to sit next to the ' : ' as it does in the next line.

Now here is some more info. This is what the page looks like once it's been tidied up with html2text:



Obviously 'traders short / long' appears more than once, which is why it's printing twice. But I need it to print only once.

Any help from the experts on this forum would be great!



  • I'd use requests because it's so convenient, e.g. it has a built-in json() method. You can also easily unpack that long URL into a more readable query dict, and pass that in with the basic URL.

    Here's how I would do this:

    import requests

    base_url = ""
    # The query string of the long URL, unpacked into a dict
    query = {'content': 'positions',
             'do': 'positions_graph_data',
             'limit': '',
             'interval': 'M5',
             'currency': 'AUDUSD'}
    r = requests.get(base_url, params=query)
    template = "Traders Short {currency_code}: {traders_short}\n"
    template += "Traders Long {currency_code}: {traders_long}\n"
    for position in r.json()['positions']:
        # Skip entries flagged as hidden, which look like duplicates
        if not position['hidden']:
            print(template.format(**position))

    Importantly, r.json() returns a plain dictionary. I chose to skip the 'hidden' results, which seem to be duplicates, but of course you can do any processing you like at this point. The result of this is:

    Traders Short AUDUSD: 116
    Traders Long AUDUSD: 88
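
Since the real URL isn't shown here, the filtering step can also be tried offline. The following is only a sketch under assumptions: SAMPLE_PAYLOAD is a hypothetical stand-in for what r.json() might return (the field names are taken from the template above, the values from the output above, and the hidden duplicate entry is invented), and format_positions is a helper name made up for illustration.

```python
# Hypothetical stand-in for r.json(); the real response may contain more
# fields or use different value types. The second entry is an invented
# "hidden" duplicate, included only to show that it gets skipped.
SAMPLE_PAYLOAD = {
    "positions": [
        {"currency_code": "AUDUSD", "traders_short": "116",
         "traders_long": "88", "hidden": False},
        {"currency_code": "AUDUSD", "traders_short": "116",
         "traders_long": "88", "hidden": True},
    ]
}

TEMPLATE = (
    "Traders Short {currency_code}: {traders_short}\n"
    "Traders Long {currency_code}: {traders_long}"
)


def format_positions(payload):
    """Return one formatted report per visible (non-hidden) position."""
    reports = []
    for position in payload.get("positions", []):
        if position.get("hidden"):
            continue  # skip entries flagged as hidden
        # str.format ignores unused keyword arguments such as 'hidden'
        reports.append(TEMPLATE.format(**position))
    return reports


if __name__ == "__main__":
    for report in format_positions(SAMPLE_PAYLOAD):
        print(report)
```

Because the data arrives as parsed JSON rather than text split on commas, there are no stray quote characters to strip, and each visible position is reported only once.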