I have Python code that works fine for parsing data from HTML files. At the end of the code, I must save each HTML file under a name derived from its title tag. For example, I have these 3 HTML files with these 3 title tags:
<title>My name is Prince</title>
<title>I love Madonna</title>
<title>Cars and Candies</title>
Each of them must be saved like this:
my-name-is-prince.html
i-love-madonna.html
cars-and-candies.html
So, I already have a few ways to save a file in Python, but I don't know how to name the saved file after the tag.
try:
    title = re.search('<title.+/title>', html)[0]
    title_content = re.search('>(.+)<', title)[1]
except:
    pass
with open("my-words.html", "w") as some_file_handle:
    some_file_handle.write(finalString)
OR
with open('page_323.txt', 'w') as f:
    f.write(result.text)
OR
with open("somefilename.txt", "w") as some_file_handle:
    for line in data:
        some_file_handle.write(line + "\n")
P.S. I have 500 files. The Python code must find the title tag in each HTML file and save each file as a new HTML file named after it.
Update
Is this what you are looking for?
>>> import re
>>> html = """<title>My name is Prince</title>"""
>>> re.search(r'<title>(?P<title>.+)</title>', html).group('title') \
...     .replace(' ', '-').lower()
'my-name-is-prince'
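If some titles contain punctuation or other characters that are awkward in filenames, the same idea can be wrapped in a small helper. This is only a sketch: `title_to_slug` is a hypothetical name, and it assumes you are happy to replace every run of characters outside `a-z` and `0-9` (including accented letters) with a single hyphen.

```python
import re

def title_to_slug(html):
    """Return a lowercase, hyphen-separated slug from the first <title>, or None."""
    # Non-greedy match so only the first title is captured;
    # DOTALL lets the title span several lines.
    match = re.search(r'<title[^>]*>(.*?)</title>', html, re.IGNORECASE | re.DOTALL)
    if match is None:
        return None
    title = match.group(1).strip().lower()
    # Assumption: anything that is not a-z or 0-9 becomes a hyphen,
    # and runs of such characters collapse into one hyphen.
    slug = re.sub(r'[^a-z0-9]+', '-', title).strip('-')
    return slug or None

print(title_to_slug('<title>My name is Prince</title>'))  # my-name-is-prince
```

Returning `None` when there is no title lets the caller decide what to do with such files instead of hiding the problem behind a bare `except`.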
Old answer
If you have already extracted the title from the HTML, you can do:
title = 'My name is Prince'
filename = f"{title.lower().replace(' ', '-')}.html"
with open(filename, "w") as some_file_handle:
    some_file_handle.write(finalString)
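For the 500 files mentioned in the question, the two steps can be put in a loop. The following is a rough sketch, not a drop-in solution: `save_by_title`, `src_dir` and `dst_dir` are hypothetical names, the files are assumed to be UTF-8 and to end in `.html`, and the file content is written back unchanged (in your script you would write `finalString` instead).

```python
import re
from pathlib import Path

def save_by_title(src_dir, dst_dir):
    """Save each .html file from src_dir into dst_dir, named after its <title>."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    saved = []
    for path in sorted(src.glob('*.html')):
        html = path.read_text(encoding='utf-8')  # assumption: UTF-8 files
        match = re.search(r'<title>(.+?)</title>', html, re.IGNORECASE | re.DOTALL)
        if match is None:
            continue  # no title tag found: skip this file
        # Same transformation as above: lowercase, spaces to hyphens.
        name = match.group(1).strip().lower().replace(' ', '-')
        target = dst / f'{name}.html'
        target.write_text(html, encoding='utf-8')
        saved.append(target.name)
    return saved
```

Calling `save_by_title('pages', 'renamed')` (both directory names are made up) returns the list of filenames it wrote. Note that two files with the same title would overwrite each other, so you may want to check whether `target` already exists.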