I have this simple regex:
text = re.sub("[إأٱآا]", "ا", text)
However, I get this error (Python 2.7):
TypeError: expected string or buffer
I'm a regex newbie. I imagine this is simple to fix, but I'm not sure how. Thanks.
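Define all your strings as unicode literals (put the u prefix on the pattern, the replacement, and the input text),
and don't forget to add the encoding declaration on the first or second line of the file: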
# -*- coding: utf-8 -*-
import re
# Pattern, replacement, and input are all unicode literals (u"...").
text = re.sub(u"[إأٱآا]", u"ا", u"الآلهة")
print text
To get:
الالهة
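
One more thing worth checking: in Python 2.7, "expected string or buffer" is what re.sub raises when its third argument is not a string at all (None, a list, a number, and so on), so it is worth confirming what `text` actually holds before the call. Below is a minimal sketch of a wrapper that makes both problems explicit. The names `normalize_alef`, `ALEF_VARIANTS`, and `TEXT_TYPE` are made up for illustration, and the UTF-8 default is an assumption about how your input is encoded.

```python
# -*- coding: utf-8 -*-
import re

# Compile once; the class lists the alef variants from the question.
ALEF_VARIANTS = re.compile(u"[إأٱآا]")

# The text type: unicode on Python 2, str on Python 3.
TEXT_TYPE = type(u"")


def normalize_alef(text, encoding="utf-8"):
    """Hypothetical helper: replace every alef variant with a bare alef.

    Byte strings are decoded first (assumed UTF-8 unless told otherwise),
    so the pattern always runs against unicode text.
    """
    if isinstance(text, bytes):
        # On Python 2, a plain "..." literal or file data lands here.
        text = text.decode(encoding)
    elif not isinstance(text, TEXT_TYPE):
        # This is the case re.sub itself reports as
        # "expected string or buffer" on Python 2.7.
        raise TypeError("expected a string, got %s" % type(text).__name__)
    return ALEF_VARIANTS.sub(u"ا", text)
```

Calling normalize_alef(u"الآلهة") returns the same string shown above, and passing None fails immediately with a message that names the offending type instead of the generic one from re.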