Search code examples
pythondjangoreplacefindhashtag

Django Find Hashtags in String and Replace It by Wrapping It in <a> Tag


I am making a social media site, and I want to enable hashtags. For example, if a user creates a post like this:

Summer is gone. #sad #comeback #summer

I want to use python to replace all occurrences of # and word like this:

Summer is gone. <a href="http://127.0.0.1:8000/c/sad">#sad</a> <a href="http://127.0.0.1:8000/c/comeback">#comeback</a> <a href="http://127.0.0.1:8000/c/summer">#summer</a>

This is what I have so far:

    def clean_content(self):
        content = self.cleaned_data.get('content')
        content = profanity.censor(content) # (Unrelated Code)
        arr = re.findall(r"#(\w+)", content)
        replace_ar = []
        for hsh in arr:
            if len(hsh) < 80:
                if Category.objects.filter(name__iexact=hsh):
                    # Old Category, Added This To Category List (Unrelated Code)
                    replace_ar.append(hsh)
                else:
                    # New Category, Created Category, Then Added This To Category List (Unrelated Code)
                    replace_ar.append(hsh)
            else:
                # Don't do anything, hashtag length too long
       # No Idea What To Do With replace_ar. Need help here.

In the code above I take the html text input and I find all #{{word}}. I then loop through them and I check if a category exists with that name or not. If it does, I just add it to that category, and if it doesn't, I create a category and then add it. In both these situations, I push to hashtag to the replace_ar array.

Now I want to replace all hashtags in the replace_ar array with a url like in the "Summer is gone" example from above. How would I do this?


Solution

  • To replace hashtags (of the form: "#categoryname") with the url of the related category:

    def clean_content(self):
        content = self.cleaned_data.get('content')
        arr = re.findall(r"#(\w+)", content)
        for hsh in arr:
            if len(hsh) < 80:
                full_hash = '#' + hsh
                if Category.objects.filter(name__iexact=hsh):
                    content = content.replace(full_hash, f'<a href="http://127.0.0.1:8000/c/{hsh}/">#{hsh}</a>')
                else:
                    content = content.replace(full_hash, f'<a href="http://127.0.0.1:8000/c/{hsh}/">#{hsh}</a>')
    

    Note that you should use reverse instead of hard coding the url.