Hi, I'm pretty new to programming and Python, and this is my first post, so I apologize for any poor form.
I am scraping a website's download counts, and I receive the following error when I try to convert the list of numeric strings to integers so I can sum them:
ValueError: invalid literal for int() with base 10: '1,015'
I have tried .replace() but it does not seem to be doing anything.
I also tried to build an if statement to take the commas out of any string that contains them, based on the question "Does Python have a string contains substring method?"
Here's my code:
downloadCount = pageHTML.xpath('//li[@class="download"]/text()')
downloadCount_clean = []
for download in downloadCount:
    downloadCount_clean.append(str.strip(download))
for item in downloadCount_clean:
    if "," in item:
        item.replace(",", "")
print(downloadCount_clean)
downloadCount_clean = map(int, downloadCount_clean)
total = sum(downloadCount_clean)
Strings are immutable in Python. So when you call item.replace(",", ""), the method returns the new string you want, but that result is not stored anywhere (and so not in item, and not in the list either).
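To see this in isolation, here is a minimal sketch. The value "1,015" is taken from the error message in the question; the names unchanged and cleaned are only illustrative.

```python
# Sample value taken from the error message in the question.
item = "1,015"

# replace() returns a new string; the original is left untouched,
# so discarding the return value has no visible effect.
item.replace(",", "")
unchanged = item

# Assigning the return value keeps the cleaned string.
cleaned = item.replace(",", "")

print(unchanged)     # still contains the comma
print(int(cleaned))  # converts without raising ValueError
```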
EDIT:
I suggest this:
for i in range(len(downloadCount_clean)):
    if "," in downloadCount_clean[i]:
        downloadCount_clean[i] = downloadCount_clean[i].replace(",", "")
SECOND EDIT:
For a bit more simplicity and/or elegance:
for index, value in enumerate(downloadCount_clean):
    downloadCount_clean[index] = int(value.replace(",", ""))
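If you don't need to modify the list in place, the whole cleanup can probably be folded into a single pass. This is only a sketch: raw_counts is a hypothetical stand-in for the strings returned by your xpath() call, and parse_count is a helper name I made up for illustration.

```python
# Hypothetical scraped values, with stray whitespace and a thousands comma.
raw_counts = [" 1,015 ", "230\n", "7"]


def parse_count(text):
    """Strip whitespace, drop thousands separators, and convert to int."""
    return int(text.strip().replace(",", ""))


# Build a new list of integers instead of mutating the original strings.
counts = [parse_count(text) for text in raw_counts]
total = sum(counts)

print(counts)
print(total)
```

This combines the strip step from the question with the comma removal, so no separate "clean" list is needed before summing. It assumes every entry is a whole number with optional commas; anything else would still raise ValueError.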