My text editor (Vim) can report the position of a string within a string, but it counts bytes, not characters.
Example:
s="I don't take an apéritif après-ski"
When I search for the word apéritif,
my text editor gives the position:
16,25
Python gives this position for the same word:
16,24
Vim can execute Python code inside the editor.
In one of my Python scripts I do a lot of slicing,
but I never get the correct word if the string contains accented characters.
Is there a way to resolve this in Python?
Can I find the byte position of a string within a string in Python?
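This is, admittedly, a naive solution: encode both the text and the word to bytes, then call find() on the encoded text with the encoded word as the argument. Note that find() returns -1 when the word is absent, so the result is only meaningful when the word actually occurs in the text.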
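def f(text, word):
    # Encode both strings so that find() counts bytes, not characters
    en_text = bytes(text, encoding="utf-8")
    en_word = bytes(word, encoding="utf-8")
    start = en_text.find(en_word)  # -1 if the word is not found
    return (start, start + len(en_word))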
When run as:
f("I don't take an apéritif après-ski","apéritif")
it returns (16, 25), the same byte positions that Vim reports.
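Since the question is really about slicing, it may be more convenient to go the other way and convert Vim's byte offsets into character indices before slicing. Below is a minimal sketch of both conversions. The helper names byte_to_char_offset and char_to_byte_offset are made up for illustration, and the code assumes the text is UTF-8 encoded and that the byte offsets fall on character boundaries.

```python
def char_to_byte_offset(text, char_index, encoding="utf-8"):
    """Return the byte offset that corresponds to a character index."""
    # Encode only the prefix and count its bytes
    return len(text[:char_index].encode(encoding))


def byte_to_char_offset(text, byte_offset, encoding="utf-8"):
    """Return the character index that corresponds to a byte offset."""
    # Raises UnicodeDecodeError if byte_offset lands inside a
    # multi-byte character (assumption: offsets are on boundaries)
    return len(text.encode(encoding)[:byte_offset].decode(encoding))


s = "I don't take an apéritif après-ski"

# Byte positions as reported by the editor in the question
start = byte_to_char_offset(s, 16)
end = byte_to_char_offset(s, 25)

print(start, end)    # 16 24
print(s[start:end])  # apéritif
```

Both helpers re-encode a prefix of the string on every call, which is fine for short lines but could be slow when called many times on long text.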