python string string-formatting

Does Python have font faces for strings?


I recently used the Google Vision API to extract text from a PDF. Now I am searching for a keyword in the response text from the API. When I compare the given string with the found string, they do not match even though they appear to have the same characters. The only reason I can see is that the two strings use different font faces, which leads to different ASCII/UTF-8 codes for the characters. (I have never come across such a problem before.)

How can I solve this? How can I bring these two strings to the same characters? I am using a Jupyter notebook, but even when I pasted the comparison into a terminal it still evaluated to False.

Here are the strings I am trying to match:

'КА Р5259' == 'KA P5259'

But they look the same on Stack Overflow so here's a screenshot:

[screenshot of the two strings]


Solution

  • Thanks, everyone, for your comments.

    I found the solution and am posting it here in case it helps someone. It is correct that Python does not support font faces. So if one copies a character rendered in a particular font face and pastes it into the Python console or a Jupyter notebook (which renders font faces because it uses HTML to display information), it is treated as a different Unicode character.

    So the idea is to first bring the text response into a plain-text format. I achieved this by storing the response in a .txt file (or, more precisely, a .pkl file), which I had to do anyway to preserve the response objects for later data analysis. Once the response is stored in a plain-text file, you can read it back without the font face problem I faced above.
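A quick way to check what is going on is to print the code point and Unicode name of every character in both strings. The sketch below is not part of the original answer. It assumes the mismatch comes from the OCR returning Cyrillic letters that look like Latin ones, which is how the first string in the question appears to be encoded. The names `describe`, `LOOKALIKES` and `to_latin` are hypothetical helpers, and the mapping table is deliberately incomplete.

```python
import unicodedata

# Assumption: the OCR text uses Cyrillic look-alikes. Escapes are used here so
# the difference is visible in the source, not only at runtime.
ocr_text = "\u041a\u0410 \u04205259"  # Cyrillic KA, A, ER followed by ASCII digits
keyword = "KA P5259"                   # plain Latin letters


def describe(text):
    """Return (character, code point, Unicode name) for each character."""
    return [
        (ch, f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN"))
        for ch in text
    ]


# Hypothetical, incomplete table: Cyrillic capitals mapped to the Latin
# capitals they resemble. Extend it to cover the characters in your data.
LOOKALIKES = str.maketrans(
    "\u0410\u0412\u0415\u041a\u041c\u041d\u041e\u0420\u0421\u0422\u0425",
    "ABEKMHOPCTX",
)


def to_latin(text):
    """Replace known Cyrillic look-alikes with their Latin counterparts."""
    return text.translate(LOOKALIKES)


if __name__ == "__main__":
    for row in describe(ocr_text):
        print(row)
    print(ocr_text == keyword)
    print(to_latin(ocr_text) == keyword)
```

If the names printed for the OCR text start with CYRILLIC, the strings differ in their actual characters rather than in any font. Note that `unicodedata.normalize` would not help in that case, because these are distinct letters and not compatibility variants of the Latin ones, which is why an explicit mapping is used here.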