How can I replace the string located between the quotes of href=""?
<td><a href="https://forms.office.com/Pages/ResponsePage.aspx?id=uI1n" target="_blank">https://forms.office.com/Pages/ResponsePage.aspx?id=uI1n</a></td>
</tr>
I want to replace this:
href="https://forms.office.com/Pages/ResponsePage.aspx?id=uI1n"
with this:
href="LINK"
For a quick and dirty way, you could use re.sub() to match the href attribute and replace its value with your own:
import re
html = """<td><a href="https://forms.office.com/Pages/ResponsePage.aspx?id=uI1n" target="_blank">https://forms.office.com/Pages/ResponsePage.aspx?id=uI1n</a></td>
</tr>"""
# Match href="..." up to the next double quote and replace the whole attribute
re.sub(r'href="[^"]*"', 'href="LINK"', html)
Output:
'<td><a href="LINK" target="_blank">https://forms.office.com/Pages/ResponsePage.aspx?id=uI1n</a></td>\n</tr>'
But remember that parsing HTML with regular expressions is not recommended, because real-world HTML has many edge cases that a simple pattern will miss. I would only use this as a quick and dirty fix when I know exactly how my input HTML is structured. For a more robust approach, you should look into HTML parsers (e.g. BeautifulSoup).
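As a rough sketch of the parser route, the example below uses Python's built-in html.parser module rather than BeautifulSoup, so it runs without installing anything. The names HrefRewriter and replace_hrefs are made up for this illustration. It assumes you want to replace the href of every <a> tag, and it re-emits the markup as it goes, so attribute quoting and tag case may come out normalized. Processing instructions are not handled.

```python
from html import escape
from html.parser import HTMLParser


class HrefRewriter(HTMLParser):
    """Hypothetical helper: re-emit parsed HTML, swapping every <a href> value."""

    def __init__(self, new_href):
        # convert_charrefs=False leaves entities such as &amp; in text untouched
        super().__init__(convert_charrefs=False)
        self.new_href = new_href
        self.parts = []

    def _render(self, tag, attrs, closing=">"):
        out = "<" + tag
        for name, value in attrs:
            if tag == "a" and name == "href":
                value = self.new_href
            if value is None:
                # Attribute without a value, e.g. <input disabled>
                out += " " + name
            else:
                out += ' {}="{}"'.format(name, escape(value, quote=True))
        return out + closing

    def handle_starttag(self, tag, attrs):
        self.parts.append(self._render(tag, attrs))

    def handle_startendtag(self, tag, attrs):
        self.parts.append(self._render(tag, attrs, " />"))

    def handle_endtag(self, tag):
        self.parts.append("</" + tag + ">")

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append("&" + name + ";")

    def handle_charref(self, name):
        self.parts.append("&#" + name + ";")

    def handle_comment(self, data):
        self.parts.append("<!--" + data + "-->")

    def handle_decl(self, decl):
        self.parts.append("<!" + decl + ">")


def replace_hrefs(html, new_href):
    """Return html with the href of every <a> tag set to new_href."""
    parser = HrefRewriter(new_href)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts)


html = """<td><a href="https://forms.office.com/Pages/ResponsePage.aspx?id=uI1n" target="_blank">https://forms.office.com/Pages/ResponsePage.aspx?id=uI1n</a></td>
</tr>"""

result = replace_hrefs(html, "LINK")
print(result)
```

Unlike the regex, this only touches href attributes that the parser actually recognizes on <a> tags, so the URL in the link text and any href-like strings elsewhere in the page are left alone.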