My problem concerns a Unicode text file, using the Python Script plugin for Notepad++.
The code below works perfectly and replaces the words listed in wordlist.txt, but only for English. It cannot find non-ASCII words. I tried changing
    with open('C:\Users\Desktop\wordlist.txt') as f:
to
    with io.open('C:\Users\Desktop\wordlist.txt', encoding='utf-8') as f:
but Notepad++ still does not perform the replacement for Unicode words.
How can I pass a Unicode string to the search in the code below? Alternatively, please help with Python code for a batch whole-word replace in an A.text file, using a delimited find-and-replace word list in a B.text file.
    with open('C:\Users\Desktop\wordlist.txt') as f:
        for l in f:
            s = l.split()
            editor.rereplace(r'\b' + s[0] + r'\b', s[1])
Do not use the word boundary \b, which causes problems with UTF-8 characters. Use lookarounds instead:
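    import re

    with open('D:\\temp\\wordlist.txt') as f:
        for l in f:
            # Each line holds the word to find, then its replacement
            s = l.split()
            # Match the word only when no non-space character touches it
            editor.rereplace(r'(?<!\S)' + s[0] + r'(?!\S)', '\t' + s[1])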
Where:
(?<!\S) is a negative lookbehind that makes sure there is no non-space character before the word to be modified.
(?!\S) is a negative lookahead that makes sure there is no non-space character after the word to be modified.
With your 2 sample files, I got:
मारुती
नामशिवाया
जयश्रीराम
जयश्रीराम
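
If you want to check the lookaround logic outside Notepad++, here is a minimal sketch using Python 3's standard re module instead of editor.rereplace (the plugin uses Notepad++'s own regex engine, so behaviour may differ in details). The names parse_pairs and replace_whole_words, and the sample strings, are my own illustrations and not part of the plugin API. I am assuming each word-list line holds two whitespace-separated fields.

```python
import re

def parse_pairs(lines):
    """Yield (find, replace) tuples from whitespace-delimited lines (assumed format)."""
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:  # skip blank or incomplete lines
            yield fields[0], fields[1]

def replace_whole_words(text, pairs):
    """Replace each word only where it is delimited by whitespace or the text edges."""
    for old, new in pairs:
        # (?<!\S): no non-space character directly before the word
        # (?!\S):  no non-space character directly after the word
        pattern = r'(?<!\S)' + re.escape(old) + r'(?!\S)'
        # A function replacement avoids backslash handling in the new text
        text = re.sub(pattern, lambda m, new=new: new, text)
    return text

# Hypothetical sample data, standing in for the word list and the edited file
wordlist = ["राम RAM\n", "\n"]
sample = "राम रामा जयश्रीराम"
print(replace_whole_words(sample, parse_pairs(wordlist)))
```

Only the standalone राम is replaced here. The occurrences inside रामा and जयश्रीराम are left alone, because each has a non-space character directly next to the match.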