
Telegram cannot send this specific URL as a link


Telegram does not like this link for some reason:

https://reuters.com/world/uk/pressure-builds-uks-johnson-fire-health-secretary-2021-06-26/?taid=60d6f3a8b9a1150001df150c&utm_campaign=trueAnthem:+Trending+Content&utm_medium=trueAnthem&utm_source=twitter

For one of my projects I use Telegram to send messages about news articles. Here is one message that my program created. I'm using the MarkdownV2 parse mode and sending the messages with requests in Python.

[2\) UK health minister quits after breaking COVID rules by kissing aide](https://reuters.com/world/uk/pressure-builds-uks-johnson-fire-health-secretary-2021-06-26/?taid=60d6f3a8b9a1150001df150c&utm_campaign=trueAnthem:+Trending+Content&utm_medium=trueAnthem&utm_source=twitter)

Which should appear as:

2) UK health minister quits after breaking COVID rules by kissing aide

The program works great and sends all sorts of links this way with almost no issues. If I replace the link in the code with a different link, it sends fine. If I send the link outside a hyperlink format, it also sends fine. The link itself works fine and opens a functioning webpage. But I'm really scratching my head as to why Telegram won't allow me to hyperlink that URL, and I would like to futureproof my program against this sort of thing happening again, especially without me realizing it.

edit: after testing a little more and sending a lot of different messages to determine the cause, the offending character seems to be the & in https://reuters.com/world/uk/pressure-builds-uks-johnson-fire-health-secretary-2021-06-26/?taid=60d6f3a8b9a1150001df150c & utm_campaign=trueAnthem:+Trending+Content&utm_medium=trueAnthem&utm_source=twitter

edit2: after doing some more testing, I think I figured it out.

First of all, it wasn't that particular &, it was all of them. That was just the first one that appeared. For a URL to work as a hyperlink in Telegram, & (and maybe similar characters; it seems to be working anyway) needs to be percent-encoded.

So normally when I send a message with telegram I need to escape all the special characters with this:

message_body = re.sub(r"([_*\[\]()~`>\#\+\-=|\.!{}])", r"\\\1", message_body)

Except obviously I can't do that with hyperlinked URLs. So I don't escape them.
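As a minimal sketch, the regex above can be wrapped in a small helper so that only the visible text is escaped and the URL is left alone. The function name `escape_markdown_v2` is my own choice for illustration; the character set is the one from the regex in the post.

```python
import re

# The same character set as the regex in the post, compiled once.
_MDV2_SPECIAL = re.compile(r"([_*\[\]()~`>#+\-=|{}.!])")


def escape_markdown_v2(text: str) -> str:
    """Put a backslash in front of each MarkdownV2 special character."""
    return _MDV2_SPECIAL.sub(r"\\\1", text)


# Only the link title goes through the helper, not the URL.
title = escape_markdown_v2(
    "2) UK health minister quits after breaking COVID rules by kissing aide"
)
print(title)
# 2\) UK health minister quits after breaking COVID rules by kissing aide
```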

But there's ANOTHER escape that's important too, one that I wasn't doing with the URLs but SHOULD be doing: percent-encoding the following characters.

# Replace '%' first so the '%' introduced by the later replacements is not encoded again.
message_body = message_body.replace('%', '\\%25')
message_body = message_body.replace('#', '\\%23')
message_body = message_body.replace('+', '\\%2B')
message_body = message_body.replace('*', '\\%2A')
message_body = message_body.replace('&', '\\%26')

Applying these replacements to my URLs fixed my problem!
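Here is a rough sketch of how the same replacements could be applied to just the URL part of a link. The names `encode_link_url` and `make_link` and the example.com URL are hypothetical; the replacement pairs are the ones from the post, and this mirrors what worked for me rather than anything I found documented.

```python
# Replacement pairs from the post. '%' must stay first so the '%' added
# by the later pairs is not encoded a second time.
_URL_REPLACEMENTS = [
    ("%", "\\%25"),
    ("#", "\\%23"),
    ("+", "\\%2B"),
    ("*", "\\%2A"),
    ("&", "\\%26"),
]


def encode_link_url(url: str) -> str:
    """Apply the percent-encoding replacements to a URL, in order."""
    for char, replacement in _URL_REPLACEMENTS:
        url = url.replace(char, replacement)
    return url


def make_link(escaped_title: str, url: str) -> str:
    """Build a MarkdownV2 inline link from an already-escaped title and a raw URL."""
    return f"[{escaped_title}]({encode_link_url(url)})"


# Hypothetical short URL standing in for the Reuters link.
link = make_link("2\\) Some headline", "https://example.com/?a=1&b=2+3")
print(link)
# [2\) Some headline](https://example.com/?a=1\%26b=2\%2B3)
```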

So in a normal text message I escape the percent-encoding characters and then the special characters, and with URLs that I link I normally don't escape at all. What I need to do is apply the percent-encoding escape to my URLs too.
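One possible explanation, which I have not confirmed, is that the message text was being pasted straight into the request URL, so a raw & ended the text parameter early. Under that assumption, a sketch like the one below lets the standard library encode the whole text instead of replacing characters by hand. The chat id and the example.com URL are placeholders.

```python
from urllib.parse import parse_qs, urlencode

# Placeholder message: a MarkdownV2 link whose URL contains '&' and '+'.
text = "[title](https://example.com/?a=1&b=2+3)"

# Naive approach (assumed, not shown in the post): splice the text into the query.
naive_query = "chat_id=123456&text=" + text
# The server-side view of 'text' stops at the first raw '&'.
naive_text = parse_qs(naive_query)["text"][0]

# Letting urlencode build the query percent-encodes '&', '+', '%', '#' and so on.
params = {"chat_id": "123456", "text": text, "parse_mode": "MarkdownV2"}
safe_query = urlencode(params)
# Decoding the encoded query gives back the full original text.
safe_text = parse_qs(safe_query)["text"][0]

print(naive_text == text)  # False: the text was cut off at the '&'
print(safe_text == text)   # True: the text survives the round trip
```

With requests, passing the same dictionary through the `params=` or `data=` argument should do this encoding for you, so the manual replacements for %, #, +, * and & would not be needed for the HTTP layer.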

