I'm trying to split text at all punctuation for English and Russian. This works except for spaces: for some reason \s does not seem to be working. allRussianWords ends up containing spaces, but I do not want it to.
allRussianWords = re.split("[—…();«»!?.:,%\s\n]",words)
This is the string that I am attempting to split:
words = "привет, моё имя Мэтт. Как ты?"
The punctuation is in Russian.
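It seems like you need a + after the closing square bracket, so that a run of consecutive delimiter characters is matched as a single separator. Without it, ", " counts as two separate delimiters, and re.split returns an empty string for the gap between them. One of the other answers points this out, too.
The \n is also redundant, since \s already matches the newline character.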
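A minimal sketch of that change, using the sample string from the question. The raw-string prefix, the snake_case name all_russian_words, and the final filter are my own additions rather than part of the original code: even with the +, a delimiter at the very end of the string (the "?" here) leaves one empty string at the end of the result, so the list comprehension drops it.

```python
import re

words = "привет, моё имя Мэтт. Как ты?"

# Raw string so that \s reaches the regex engine unchanged; the trailing +
# makes a run of delimiters (for example ", ") act as one separator.
pattern = r"[—…();«»!?.:,%\s]+"

# The string ends with a delimiter, so re.split still yields a trailing
# empty string; keep only the non-empty pieces.
all_russian_words = [w for w in re.split(pattern, words) if w]

print(all_russian_words)
# ['привет', 'моё', 'имя', 'Мэтт', 'Как', 'ты']
```

If you would rather keep empty fields in some positions, drop the filter and strip only the leading and trailing delimiters before splitting.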