I am trying to convert Preeti font text to Unicode using Python and the npTTF2UTF library. It works most of the time, but it also converts English words like 'Microwave' into their Unicode equivalents. How can I avoid that?
Here is my code:
import npttf2utf
mapper = npttf2utf.FontMapper("npttf2utf-main/src/npttf2utf/map.json")
text = ''' cGo g]6js{;Fu cfj4 x"g] :yfg, tl/sf / lsl;d like (Microwave/Satellite/Cable etc.) '''
converted_text = mapper.map_to_unicode(text, from_font="Preeti", unescape_html_input=False, escape_html_output=False)
print(converted_text)
I get: अन्य नेटवर्कसँग आवद्ध हूने स्थान, तरिका र किसिम ष्पिभ ९ःष्अचयधबखभरक्बतभििष्तभरऋबदभि भतअ।०
I don't want the text after 'lsl;d' (किसिम) to be converted to Unicode. How can I do that?
Extract the text together with its font information (PyMuPDF has this feature) and convert only the parts that are set in the Preeti font.
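A minimal sketch of that idea, assuming the text comes from a PDF. PyMuPDF's `page.get_text("dict")` returns blocks, lines, and spans, and each span carries a `font` name and its `text`. The helper names `extract_spans` and `convert_preeti_spans`, the file name `input.pdf`, and the case-insensitive substring match on "preeti" (embedded fonts often carry a subset prefix such as `ABCDEF+Preeti`) are my assumptions, not part of either library. Spans are joined without restoring line breaks, so adjust that to your layout.

```python
from typing import Callable, Iterable, List, Tuple


def extract_spans(pdf_path: str) -> List[Tuple[str, str]]:
    """Return (font_name, text) pairs for every text span in the PDF."""
    import fitz  # PyMuPDF; imported here so the converter below works without it

    spans = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            for block in page.get_text("dict")["blocks"]:
                # Image blocks have no "lines" key, so default to an empty list.
                for line in block.get("lines", []):
                    for span in line["spans"]:
                        spans.append((span["font"], span["text"]))
    return spans


def convert_preeti_spans(
    spans: Iterable[Tuple[str, str]],
    to_unicode: Callable[[str], str],
    font_marker: str = "preeti",  # assumed: the font name contains "Preeti"
) -> str:
    """Convert only the spans set in the Preeti font; pass the rest through."""
    parts = []
    for font, text in spans:
        if font_marker in font.lower():
            parts.append(to_unicode(text))
        else:
            parts.append(text)  # e.g. English text in another font
    return "".join(parts)
```

With the mapper from the question, usage would look roughly like this:

```python
import npttf2utf

mapper = npttf2utf.FontMapper("npttf2utf-main/src/npttf2utf/map.json")
spans = extract_spans("input.pdf")  # hypothetical file name
print(convert_preeti_spans(
    spans,
    lambda t: mapper.map_to_unicode(
        t, from_font="Preeti", unescape_html_input=False, escape_html_output=False
    ),
))
```

This only helps if the English words really are set in a different font in the source document. If they are also typed in Preeti, the font information cannot tell them apart.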