
How do I bold the bullet point headers in Markdown using Python?


I have the following Markdown text:

text = '<div style="padding: 10px; border: 1px solid #e5e7eb; font-size: .95rem; border-radius: 8px"><h3>Income Statement</h3>\n\n1. Total Revenue:\n\nThe total revenue for xxxx for the year ended December 31, 2022 was \\$xxxxx million.\n\n2. Revenue by Source:\n\nThe breakdown of xxxx\'s revenue by source for the year ended December 31, 2022 is as follows:\n\n* Billboard:\n\t+ Static displays: \\$xxxx.x million\n\t+ Digital displays: \\$xxxx million\n\t+ Other: \\$xxx million\n\tTotal billboard revenues: \\$xxxx million\n* Transit:\n\t+ Static displays: \\$xxxx million\n\t+ Digital displays: \\$xxxxx million\n\t+ Other: \\$xxx million\n\tTotal transit revenues: \\$xxxxxxxxx million\n* Sports marketing and other: \\$xxxxx million\n\nTotal revenues: \\$xxxxx million\n\n</div>'

How do I bold every bullet point header, that is, any text that follows an ordered or unordered list marker and ends with ":"? Example: "1. Total Revenue:"



Solution

  • A quick-and-dirty way is to use regular expressions to change the text, e.g.:

    import re
    
    text = '<div style="padding: 10px; border: 1px solid #e5e7eb; font-size: .95rem; border-radius: 8px"><h3>Income Statement</h3>\n\n1. Total Revenue:\n\nThe total revenue for xxxx for the year ended December 31, 2022 was \\$xxxxx million.\n\n2. Revenue by Source:\n\nThe breakdown of xxxx\'s revenue by source for the year ended December 31, 2022 is as follows:\n\n* Billboard:\n\t+ Static displays: \\$xxxx.x million\n\t+ Digital displays: \\$xxxx million\n\t+ Other: \\$xxx million\n\tTotal billboard revenues: \\$xxxx million\n* Transit:\n\t+ Static displays: \\$xxxx million\n\t+ Digital displays: \\$xxxxx million\n\t+ Other: \\$xxx million\n\tTotal transit revenues: \\$xxxxxxxxx million\n* Sports marketing and other: \\$xxxxx million\n\nTotal revenues: \\$xxxxx million\n\n</div>'
    
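    # Unordered items: keep the "*" or "+" marker (group 1), bold the label up to the first ":" (group 2).
    text = re.sub(r"(^\s*[*+]\s*)([^:]+):", r"\g<1>**\g<2>:**", text, flags=re.M)
    # Ordered items: same idea for markers such as "1." or "2.".
    text = re.sub(r"(^\s*\d+\.\s*)([^:]+):", r"\g<1>**\g<2>:**", text, flags=re.M)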
    print(text)
    

    Prints:

    <div style="padding: 10px; border: 1px solid #e5e7eb; font-size: .95rem; border-radius: 8px"><h3>Income Statement</h3>
    
    1. **Total Revenue:**
    
    The total revenue for xxxx for the year ended December 31, 2022 was \$xxxxx million.
    
    2. **Revenue by Source:**
    
    The breakdown of xxxx's revenue by source for the year ended December 31, 2022 is as follows:
    
    * **Billboard:**
            + **Static displays:** \$xxxx.x million
            + **Digital displays:** \$xxxx million
            + **Other:** \$xxx million
            Total billboard revenues: \$xxxx million
    * **Transit:**
            + **Static displays:** \$xxxx million
            + **Digital displays:** \$xxxxx million
            + **Other:** \$xxx million
            Total transit revenues: \$xxxxxxxxx million
    * **Sports marketing and other:** \$xxxxx million
    
    Total revenues: \$xxxxx million
    
    </div>
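
If you want something a little less quick-and-dirty, the two substitutions can be folded into one pattern. The sketch below is only an illustration, not part of the answer above: `LIST_HEADER` and `bold_list_headers` are hypothetical names, and it assumes a list marker is always followed by at least one space or tab. Using `[ \t]` and `[^:\n]` instead of `\s` and `[^:]` keeps each match on a single line, so a list item without a colon is left alone instead of being bolded up to a colon on a later line.

```python
import re

# Hypothetical helper, not from the original answer: one pattern covers
# both "*" / "+" bullets and "1." style markers. [ \t] and [^:\n] keep
# every match on a single line.
LIST_HEADER = re.compile(r"^([ \t]*(?:[*+]|\d+\.)[ \t]+)([^:\n]+):", flags=re.M)


def bold_list_headers(markdown_text: str) -> str:
    """Wrap the 'label:' part of each list item in ** ... **."""
    return LIST_HEADER.sub(r"\g<1>**\g<2>:**", markdown_text)


sample = (
    "1. Total Revenue:\n\n"
    "* Billboard:\n"
    "\t+ Other: \\$xxx million\n"
    "* No colon here\n"
    "Total revenues: \\$xxxxx million\n"
)
print(bold_list_headers(sample))
```

Like the original regexes, this bolds only up to the first colon on the line, and running it twice on the same text would wrap the labels in a second pair of asterisks, so apply it once to the raw text.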