I'm writing a plugin for Gedit which tags certain words depending on a regex. In some cases the tag is applied several characters beyond the intended word.
So it seems the values returned by match.start() and match.end() are not valid for use in get_iter_at_offset.
def on_save(self, doc, location, *args, **kwargs):
    """called when document is saved"""
    for match in WORD_RE.finditer(get_text(doc)):
        if not self._checker.check(match.group().strip()):
            self.apply_tag(doc, match.start(), match.end())

def apply_tag(self, doc, start, end):
    """apply the tag to the text between start and end"""
    istart = doc.get_iter_at_offset(start)
    iend = doc.get_iter_at_offset(end)
    doc.apply_tag(self._spell_error_tag, istart, iend)
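I figured it out in the end; it should have been obvious really. The text in the document contained some non-ASCII characters. The regex was running over the UTF-8 encoded byte string, so match.start() and match.end() counted bytes, while get_iter_at_offset expects character offsets. Every multi-byte character before a match pushed the tag further to the right. Decoding the document's string to unicode fixed the issue.
So: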
get_text(doc).decode('utf-8')
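
To show why the offsets drift, here is a minimal sketch of the same effect. It is written for Python 3, where the bytes/text split is explicit, and the two patterns are hypothetical stand-ins, since the real WORD_RE isn't shown above. It only compares the spans a regex reports on encoded bytes with the spans it reports on decoded text.

```python
import re

# Hypothetical stand-ins for the plugin's WORD_RE (the real pattern is not shown).
WORD_RE_BYTES = re.compile(rb"\S+")  # matches against UTF-8 encoded bytes
WORD_RE_TEXT = re.compile(r"\S+")    # matches against decoded text

text = "café teh"            # 'é' is one character but two bytes in UTF-8
raw = text.encode("utf-8")   # roughly what the undecoded document text looked like


def spans(pattern, subject):
    """Return the (start, end) offsets of every match in subject."""
    return [(m.start(), m.end()) for m in pattern.finditer(subject)]


byte_spans = spans(WORD_RE_BYTES, raw)   # offsets count bytes
char_spans = spans(WORD_RE_TEXT, text)   # offsets count characters

print(byte_spans)  # [(0, 5), (6, 9)]
print(char_spans)  # [(0, 4), (5, 8)]
```

The second word starts at byte 6 but at character 5, so passing the byte offsets to get_iter_at_offset would put the tag one character too far right, and the error grows with each additional non-ASCII character earlier in the text.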