I'm trying to parse an HTML source with Python, using BeautifulSoup.
I need to get all td tags whose ids have the form nameX,
where X is a number starting from 1: name1, name2, ...
and so on, as many as there are.
How can I achieve this? My simple attempt with a regex-like pattern doesn't work:
soup = BeautifulSoup(response.text, "lxml")
resp = soup.find_all("td", {"id": "name*"})
Error:
IndexError: list index out of range
A plain string passed as the id is matched literally, so 'name*' matches nothing. Use a lambda with startswith:
soup.find_all('td', id=lambda x: x and x.startswith('name'))
Or use a compiled regex (this needs import re):
soup.find_all('td', id=re.compile('^name'))
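Here is a small self-contained sketch comparing the approaches. The sample markup is made up for illustration and stands in for response.text, and I use the built-in "html.parser" instead of "lxml" only so it runs without extra dependencies. Note that both the startswith check and '^name' also match ids like "names"; if you want strictly name followed by digits, a pattern such as r'^name\d+$' is one option. The IndexError most likely comes from indexing the empty list that the literal 'name*' lookup returns.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical sample markup, standing in for response.text
html = """
<table><tr>
<td id="name1">a</td>
<td id="name2">b</td>
<td id="names">c</td>
<td id="other">d</td>
<td>e</td>
</tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

# A plain string is compared literally, so no id equals 'name*'
literal = soup.find_all("td", {"id": "name*"})

# Prefix match; 'x and ...' guards against td tags that have no id (x is None)
prefix = soup.find_all("td", id=lambda x: x and x.startswith("name"))

# Stricter match: 'name' followed only by one or more digits
numbered = soup.find_all("td", id=re.compile(r"^name\d+$"))

prefix_ids = [td["id"] for td in prefix]
numbered_ids = [td["id"] for td in numbered]

print(literal)
print(prefix_ids)
print(numbered_ids)
```

Checking that the result is non-empty before doing something like resp[0] avoids the IndexError when nothing matches.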