Tags: python, parsing, hyperlink, pyqt, qtextbrowser

Create hyperlinks from urls in text file using QTextBrowser


I have a text file with some basic text:

For more information on this topic, go to (http://moreInfo.com)
This tool is available from (https://www.someWebsite.co.uk)
Contacts (https://www.contacts.net)

I would like the URLs to show up as hyperlinks in a QTextBrowser, so that clicking one opens the website in the web browser. I have seen this post, which uses:

<a href="http://foo">Bar</a>

However, the text file can be edited by anyone, so it may or may not contain web addresses. I would like any addresses that are present to be hyperlinked automatically before the text is added to the text browser.

This is how I read the text file:

def info(self):
    text_browser = self.dockwidget.text_browser
    file_path = 'path/to/text.txt'
    # Use a context manager so the file is closed after reading
    with open(file_path, 'r') as f:
        text = f.read()
    text_browser.setText(text)
    text_browser.setOpenExternalLinks(True)
    self.dockwidget.show()

Edit:

I made some headway and managed to create the hyperlinks with the following code (assuming the links are inside parentheses):

import re

def info(self):
    text_browser = self.dockwidget.text_browser
    file_path = 'path/to/text.txt'
    with open(file_path, 'r') as f:
        text = f.read()
    # Raw string so that the backslashes reach the regex engine unchanged
    urls = re.findall(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', text)

    for x in urls:
        # The pattern also matches the closing parenthesis, which is swapped for '">'
        text = text.replace(x, x.replace('http', '<a href="http').replace(')', '">') + x + '</a>')

    text_browser.setHtml(text)
    text_browser.setOpenExternalLinks(True)
    self.dockwidget.show()

However, the text all appears on one line instead of keeping the line breaks from the text file. How could I solve this?



Solution

  • Matching URLs correctly is more complex than your current solution might suggest. For a full breakdown of the issues, see the question "What is the best regular expression to check if a string is a valid URL?".

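    The other problem is much easier to solve. setHtml() parses the text as HTML, where a bare newline counts as ordinary whitespace, so the line breaks have to be converted to <br> tags. To preserve newlines, you can use this: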

    text = '<br>'.join(text.splitlines())
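
Putting both points together, a minimal sketch of one possible approach might look like the following. The helper name linkify and the simplified URL_PATTERN are assumptions made for illustration and are not part of the original code. The pattern is deliberately loose: it does not validate URLs, it simply stops at whitespace, quotes, angle brackets and parentheses.

```python
import html
import re

# Hypothetical, deliberately simple pattern (an assumption, not a URL validator):
# a scheme followed by anything that is not whitespace, a quote, an angle
# bracket or a parenthesis.
URL_PATTERN = re.compile(r'https?://[^\s<>"()]+')


def linkify(text):
    """Return an HTML fragment with URLs wrapped in <a> tags and newlines kept."""
    html_lines = []
    for line in text.splitlines():
        parts = []
        pos = 0
        for match in URL_PATTERN.finditer(line):
            # Escape the plain text before the URL so stray '<' or '&' stay literal
            parts.append(html.escape(line[pos:match.start()]))
            url = html.escape(match.group(0))
            parts.append('<a href="{0}">{0}</a>'.format(url))
            pos = match.end()
        # Escape whatever follows the last URL on this line
        parts.append(html.escape(line[pos:]))
        html_lines.append(''.join(parts))
    # Join with <br> so each line of the file stays on its own line in the browser
    return '<br>'.join(html_lines)
```

The result could then be passed to text_browser.setHtml(linkify(text)), with setOpenExternalLinks(True) left as in the question. Because the pattern is loose, punctuation directly after a URL that is not wrapped in parentheses (a trailing full stop, for example) would end up inside the link.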