I'm using Beautiful Soup to scrape pages and get the heights of certain athletes:
import requests
from bs4 import BeautifulSoup

req = requests.get(url)
soup = BeautifulSoup(req.text, "html.parser")
height = soup.find_all("strong")
height = height[2].contents
print height
Unfortunately, this is what gets returned:
[u'6\'0"']
I've also tried:
height = str(height[2].contents)
and
height = unicode(height[2].contents)
but I still get [u'6\'0"'] as a result.
How can I just have 6'0" returned without the extra characters? Thanks for your help!
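Those aren't "extra characters". .contents returns a list, and the element you chose has only one child, so you're getting a list containing one element. Python prints a list using the repr of each item (roughly, the Python source code for it), so you can see what it is and what's in it. That is where the brackets, the u prefix, the quotes, and the backslash come from.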
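To see the difference between the list and the string inside it, here is a minimal sketch that uses a hand-built list in place of the real .contents result (the names `contents`, `as_list`, and `as_text` are only for illustration):

```python
# Stand-in for what .contents returned: a list holding one text node.
contents = [u'6\'0"']

as_list = repr(contents)  # the form you see when you print the list itself
as_text = contents[0]     # the single string stored inside the list

print(as_list)  # brackets, quotes, and the escaped apostrophe are shown
print(as_text)  # 6'0"
```

Indexing into the list gives you the string itself, and printing a string shows only its characters.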
Perhaps you want .string?
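A minimal sketch of how the two attributes differ, assuming bs4 is installed and using a made-up HTML fragment in place of the real page (the markup and the "Name" and "Team" labels are hypothetical):

```python
from bs4 import BeautifulSoup

# Hypothetical fragment standing in for the scraped page.
html = '<p><strong>Name</strong> <strong>Team</strong> <strong>6\'0"</strong></p>'
soup = BeautifulSoup(html, "html.parser")

tag = soup.find_all("strong")[2]

as_list = tag.contents  # list of child nodes, here a single text node
as_text = tag.string    # the text node itself, because the tag has exactly one child

print(as_text)  # 6'0"
```

Note that .string is None when a tag has more than one child; in that case tag.get_text() may be the better fit.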