Search a string and create a list from the matches in Python


I am very new to Python and I am trying to create a list out of a string in Python.

Input = '<html><body><ul style="padding-left: 5pt"><i>(See attached file: File1.pdf)</i><i>(See attached file: File2.ppt)</i><i>(See attached file: File3.docx)</i></ul></body></html>'

Desired Output = ['File1.pdf', 'File2.ppt', 'File3.docx']

What is the most efficient and Pythonic way to achieve this? Any help will be very much appreciated. Thanks.


Solution

  • You can use BeautifulSoup (the bs4 package), which provides HTML parsing utilities.

    >>> from bs4 import BeautifulSoup
    >>> html = """<html><body><ul style="padding-left: 5pt"><i>(See attached file: File1.pdf)</i><i>(See attached file: File2.ppt)</i><i>(See attached file: File3.docx)</i></ul></body></html>"""
    >>> soup = BeautifulSoup(html, 'html.parser')  # parser name is the second argument (features)
    >>> files_list = [i.text.split('file: ')[1].replace(')', '') for i in soup.find_all('i')]
    >>> print(files_list)
    ['File1.pdf', 'File2.ppt', 'File3.docx']
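
  • Since the question is tagged python-re, a regex-only alternative may also be worth considering. The sketch below is not part of the original answer: it assumes every file name appears inside the literal text "(See attached file: ...)" and contains no closing parenthesis. The names ATTACHMENT_RE and extract_attachment_names are hypothetical, chosen for illustration.

```python
import re

# Hypothetical pattern: captures whatever sits between
# "(See attached file:" and the next closing parenthesis.
ATTACHMENT_RE = re.compile(r"\(See attached file:\s*([^)]+)\)")


def extract_attachment_names(html):
    """Return the attachment file names found in the given HTML string."""
    # findall returns only the captured group for each match.
    return [name.strip() for name in ATTACHMENT_RE.findall(html)]


html = (
    '<html><body><ul style="padding-left: 5pt">'
    "<i>(See attached file: File1.pdf)</i>"
    "<i>(See attached file: File2.ppt)</i>"
    "<i>(See attached file: File3.docx)</i>"
    "</ul></body></html>"
)

files_list = extract_attachment_names(html)
print(files_list)  # ['File1.pdf', 'File2.ppt', 'File3.docx']
```

    This avoids a third-party dependency, but it matches on the text pattern only and ignores the HTML structure, so the BeautifulSoup approach is likely the safer choice if the markup varies.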