I am working on a MATLAB script that processes text, and I need my regular expression to work properly.
So far I have the regexp below, which works for almost everything, but I would like to extend it so that it treats an apostrophe as part of a word.
V1 = regexp(inpstr,'\w*[^a-zA-Z0-9\ _\ -\ "\ *\f\n\r\t\v\x20]?','match');
For example, if I have the string:
'Hi, let's play some ball.'
I would like the regexp to give me 'Hi,' - 'let's' - 'play' - 'some' - 'ball.'
and currently it gives me 'Hi,' - 'let' - 's' - 'play' - 'some' - 'ball.'
I guess the problem is that I can't just add \' to the regexp, because MATLAB uses ' as its string delimiter.
I tried just adding it and got this error:
??? Error: File: TestScript.m Line: 13 Column: 38
The input character is not valid in MATLAB statements or expressions.
Any help would be greatly appreciated =)
The solution to my problem was this:
V1 = regexp(inpstr,'\w*[\'']*[^\_\-\"\*\s]*','match')
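Basically, the characters listed between [ ] are the ones you want to match, and the characters listed between [^ ] are the ones you want to exclude. Also, \s is a shortcut for all whitespace. The apostrophe is written as '' because a single quote inside a MATLAB single-quoted string has to be doubled to be taken literally.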
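As a quick check, here is a minimal sketch that runs the pattern above on the example sentence from the question. The variable name pattern is only for illustration, and the input string itself also needs its apostrophe doubled when typed as a literal. I have assumed default regexp options, under which zero-length matches are not returned.

```matlab
% Example input from the question. The apostrophe in "let's" is doubled
% because ' delimits the character vector.
inpstr = 'Hi, let''s play some ball.';

% Same pattern as the solution above; the '' inside the brackets
% contributes a single literal apostrophe to the expression.
pattern = '\w*[\'']*[^\_\-\"\*\s]*';

V1 = regexp(inpstr, pattern, 'match');

% Print one token per line.
fprintf('%s\n', V1{:});
```

With this input, V1 should contain the five tokens asked for in the question, with let's kept as a single word.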