python, xml, web-scraping, type-conversion

Changing scraped strings (convert to float and back) inside a list


I'm practising scraping websites and I get the prices back as strings. I'm not too familiar with lists and how they work, but I want to convert the prices from USD to AUD, which is approximately a 1:1.32 ratio. I assume the strings first need to go through eval() to become a list of floats and then be multiplied by 1.32, but I'm unsure how to actually apply the exchange rate. My code so far:

from tkinter import *
from re import findall, MULTILINE

rss = open('rss.xhtml', encoding="utf8").read()

# prints 10 price values
regex_test = findall(r'([0-9]+[.]*[0-9]*) USD', rss)
price = ["$" + regex_test for regex_test in regex_test] 
for cost in range(10):
    print(price[cost])

That prints 10 USD prices. In the list below, => shows the conversion I want for each price, i.e. 20 USD becomes 26.40 AUD:

  1. $20.00 => $26.40
  2. $20.00 => $26.40
  3. $20.00 => $26.40
  4. $20.00 => $26.40
  5. $16.00 => $21.12
  6. $23.50 => $31.02
  7. $20.00 => $26.40
  8. $16.00 => $21.12
  9. $189.00 => $249.48
  10. $16.00 => $21.12

For anyone who wants to try this out, here is a similar RSS feed from which the same regex pulls prices: https://www.etsy.com/au/shop/ElvenTechnology/rss

A range of 10 is used because I do not want to scrape hundreds of entries, just a few off the top.


Solution

  • Made your for loop a bit more pythonic:

    from tkinter import *
    from re import findall, MULTILINE
    
    rss = open('rss.xhtml', encoding="utf8").read()
    
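    # prints the first 10 price values
    regex_test = findall(r'([0-9]+[.]*[0-9]*) USD', rss)
    # use a separate loop variable so regex_test is not shadowed
    price = ["$" + usd for usd in regex_test]
    # slicing keeps the original limit of 10 entries
    for individual_price in price[:10]:
        print(individual_price)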
    

    To convert the list to AUD, assuming you just want to multiply by a fixed rate, it is easier to go back to the list from before the dollar sign was added (regex_test). Each entry is a string, so float() is enough to convert it and eval() is not needed:

    aud_usd_ratio = 1.32 # 1.32 AUD to 1 USD
    aud_price_list = ["$" + str(float(x)*aud_usd_ratio) for x in regex_test]
    print(aud_price_list)
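
As a small illustration of why eval() is not needed, the sketch below shows what findall returns and how float() converts it. The `sample` text is made up for the example and only stands in for the contents of rss.xhtml.

```python
from re import findall

# Hypothetical sample standing in for the contents of rss.xhtml.
sample = "Ring 20.00 USD ... Bracelet 16.00 USD ... Crown 189.00 USD"

# findall returns a list of strings, one per match of the capture group.
usd_strings = findall(r'([0-9]+[.]*[0-9]*) USD', sample)

# float() converts each string to a number; eval() is not required.
usd_values = [float(s) for s in usd_strings]

print(usd_strings)  # ['20.00', '16.00', '189.00']
print(usd_values)   # [20.0, 16.0, 189.0]
```

Once the values are floats, multiplying each one by the rate is an ordinary list comprehension.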
    

    You could also use str.format if you need exactly two decimal places, since str() on a float does not pad to two decimals and can show floating-point rounding noise:

    aud_price_list = ["${:.2f}".format(float(x) * aud_usd_ratio) for x in regex_test]
    print(aud_price_list)
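
Putting the pieces together, here is a minimal sketch that produces the numbered "USD => AUD" lines from the question. The function name `usd_to_aud_lines`, its `rate` and `limit` parameters, and the `sample` text are all assumptions made for illustration. The 1.32 rate is the approximate figure from the question, not a live exchange rate.

```python
from re import findall

AUD_PER_USD = 1.32  # approximate rate taken from the question


def usd_to_aud_lines(text, rate=AUD_PER_USD, limit=10):
    """Return lines like '1. $20.00 => $26.40' for the first `limit` prices."""
    # Same regex as above; the slice keeps only the first `limit` matches.
    usd_strings = findall(r'([0-9]+[.]*[0-9]*) USD', text)[:limit]
    lines = []
    for number, usd in enumerate(usd_strings, start=1):
        usd_value = float(usd)
        # {:.2f} rounds and pads both amounts to two decimal places.
        lines.append("{}. ${:.2f} => ${:.2f}".format(number, usd_value, usd_value * rate))
    return lines


# Made-up sample text; in the question this would be the contents of rss.xhtml.
sample = "20.00 USD, 16.00 USD, 23.50 USD, 189.00 USD"
for line in usd_to_aud_lines(sample):
    print(line)
```

For real money handling the decimal module is usually a safer choice than float, but for an approximate conversion like this one, formatting to two decimals is enough.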