
Combining multiple regular-expression substitutions in Python (Django)


I couldn't find a solution to my question among the previously answered ones. I'd like to convert text wrapped in markdown symbols to the corresponding HTML by substitution (in short, markdown to HTML). These are my patterns, which match a title, a subtitle, italic text, and so on:

patterns = [re.compile(r'(^#)(\s?[^#].+\s)'),           # title
            re.compile(r'(##)(\s?.+\n)'),               # subtitle
            re.compile(r'(\s)(\*)([^*\n]+)(\*)'),       # italic
            re.compile(r'(\s\*\*)([^*]+)*(\*\*)'),      # boldface
            re.compile(r'(\*\*\*)(.+)(\*\*\*)'),        # bold italic
            re.compile(r'(\n)'),                        # paragraph
            re.compile(r'(\*|-)(\s\w+.\w+.)'),          # list
            re.compile(r'(\[([^[\]]+)\]\(([^)]+)\))')]  # link

Here is my function in views.py of Django:

def entry(request, title):
    if title not in util.list_entries():
        return render(request, "encyclopedia/error.html", {
            "error": "Page Not Found",
            "query": title
        })
    else:

        return render(request, "encyclopedia/entry.html", {
            "entry": util.get_entry(title),
            "title": title
        })

For context, this function passes the content of the page to the template via the context ("entry": util.get_entry(title)). Currently, the page shows the raw markdown content, i.e. text with the symbols still in it. The function and the other related ones work fine, so there is no need to change them.

If I change the context dictionary to apply one of my patterns, like "entry": p3.sub(r'replacement', util.get_entry(title)), it also works fine. But I want to combine all of my patterns so that every replacement is applied to the text in one go. How can I do this?
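
For illustration, a single substitution of that kind might look like the sketch below. It uses the italic pattern from the list above; the sample text and the exact replacement string are assumptions made for this example, not taken from the actual entry files.

```python
import re

# The italic pattern from the list above.
italic = re.compile(r'(\s)(\*)([^*\n]+)(\*)')

# Hypothetical sample text standing in for util.get_entry(title).
sample = "Django is a *web framework* written in Python."

# Keep the leading whitespace (\1) and wrap the inner text (\3) in <i> tags.
result = italic.sub(r'\1<i>\3</i>', sample)
print(result)
```

Only the text between the asterisks is wrapped; the rest of the string is returned unchanged.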

P.S. I am aware of the markdown2 package, but I am looking for a solution that does not use it, with regex only.

Thank you in advance.


Solution

  • The answer to my question is the following code:

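    import re

    from django.shortcuts import render

    from . import util  # assumed location of the project's own util module


    def entry(request, title):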
    
        if title not in util.list_entries():
            return render(request, "encyclopedia/error.html", {
                "error": "Page Not Found",
                "query": title
            })
        else:
            page = util.get_entry(title)
    
            # List of patterns to be found
            patterns = [re.compile(r'(^#)(\s?[^#].+\s)'),           # title
                        re.compile(r'(##)(\s?.+\n)'),               # subtitle
                        re.compile(r'(\s)(\*)([^*\n]+)(\*)'),       # italic
                        re.compile(r'(\s\*\*)([^*]+)*(\*\*)'),      # boldface
                        re.compile(r'(\*\*\*)(.+)(\*\*\*)'),        # bold italic
                        re.compile(r'(\n)'),                        # paragraph
                        re.compile(r'(\*|-)(\s\w+.\w+.)'),          # list
                        re.compile(r'(\[([^[\]]+)\]\(([^)]+)\))')]  # link
    
            # Replacement template for each pattern, in the same order.
            # These are plain strings: pattern.sub() expands \1, \2, ...
            # itself, so there is no need to call sub() when building the list.
            replacements = [r'<h1>\2</h1>',
                            r'<h2>\2</h2>',
                            r'<i> \3</i>',
                            r'<b> \2</b>',
                            r'<b><i>\2</i></b>',
                            r'<p>',
                            r'<li>\2</li>',
                            r'<a href="\3">\2</a>']

            # Apply the substitutions one after another, feeding the result
            # of each one into the next pattern.
            for pattern, replacement in zip(patterns, replacements):
                page = pattern.sub(replacement, page)
    
            return render(request, "encyclopedia/entry.html", {
                "entry": page,
                "title": title
            })
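
The same chaining idea can be tried outside Django. The following is a minimal, standalone sketch, not the code from the answer: the simplified patterns, the `RULES` and `convert` names, the sample text and the example.com URL are all assumptions made for illustration. It pairs each pattern with its replacement and applies the pairs in order. It uses `re.MULTILINE` so that `^` matches at the start of every line, and it lists the more specific markers (`##`, `***`) before the shorter ones (`#`, `*`) so that the shorter ones do not consume them first.

```python
import re

# (pattern, replacement) pairs, applied in order.
# Illustrative, simplified patterns; not the ones used in the answer above.
RULES = [
    (re.compile(r'^## (.+)$', re.MULTILINE), r'<h2>\1</h2>'),   # subtitle
    (re.compile(r'^# (.+)$', re.MULTILINE), r'<h1>\1</h1>'),    # title
    (re.compile(r'\*\*\*(.+?)\*\*\*'), r'<b><i>\1</i></b>'),    # bold italic
    (re.compile(r'\*\*(.+?)\*\*'), r'<b>\1</b>'),               # boldface
    (re.compile(r'\*(.+?)\*'), r'<i>\1</i>'),                   # italic
    (re.compile(r'\[([^\[\]]+)\]\(([^)]+)\)'), r'<a href="\2">\1</a>'),  # link
]


def convert(text):
    """Apply every rule in turn, feeding each result into the next rule."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


# Hypothetical markdown input.
sample = ("# Django\n"
          "A ***fast*** and **popular** *web* framework, "
          "see [docs](https://example.com).")
print(convert(sample))
```

Because each substitution runs on the output of the previous one, the order of the rules matters: moving the italic rule above the bold rule, for example, would make it match inside `**popular**` and produce broken tags.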