In my regex, I want to say that within the sample text, any characters are allowed, including a-z in upper and lower case, numbers and special characters.
For example, my regular expression may be checking that a document is HTML. Therefore:
"/\n<html>[]+</html>\n/"
I have tried []+, but it does not seem to like this.
Using [XXX]+ means any character that's between [ and ], one or more times.
Here, you didn't put any character between [ and ], hence the problem.
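To make the point about the empty class concrete, here is a minimal sketch. It uses Python's re module purely as an assumed stand-in for whatever engine the question targets, and the sample string is made up for illustration. A class with members matches runs of those members, while an empty [] is rejected at compile time because the ] directly after [ is read as a literal member and the class is never closed.

```python
import re

# A non-empty class followed by + matches one or more of the listed characters.
vowels = re.compile(r"[aeiou]+")
print(vowels.findall("queue and sky"))

# An empty class does not mean "any character". Python's re refuses to compile it,
# because the ] right after [ is treated as a literal member of the class,
# which leaves the class unterminated.
try:
    re.compile(r"[]+")
    empty_class_error = None
except re.error as exc:
    empty_class_error = str(exc)

print(empty_class_error)
```

Other engines may report the problem differently, but the cause is the same: the class has nothing in it to match.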
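If you want to say any letter, you can use:
[a-z] : any lower-case letter
[A-Z] : any upper-case letter
[a-zA-Z] : any lower-case or upper-case letter
And, for numbers:
[0-9] : any digit
[a-zA-Z0-9] : any lower-case or upper-case letter, and any number.
There is also the \w meta-character, which means "any word character".
To match any kind of character, use the . meta-character, as in /.+/s, which should match one or more characters of any kind; the s modifier lets . match newlines as well.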
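As a rough illustration of these classes and of . combined with the s modifier, the sketch below again uses Python's re module as an assumed stand-in; re.DOTALL plays the role of the s modifier there, and the sample string is hypothetical.

```python
import re

# Hypothetical sample containing letters, digits, an underscore, markup and a newline.
sample = "Abc_123 <b>!</b>\nnext"

letters = re.findall(r"[a-zA-Z]+", sample)   # runs of ASCII letters only
digits = re.findall(r"[0-9]+", sample)       # runs of digits only
words = re.findall(r"\w+", sample)           # word characters: letters, digits, underscore

# Without DOTALL, . stops at the newline; with it, . matches the newline too.
first_line = re.match(r".+", sample).group()
everything = re.match(r".+", sample, re.DOTALL).group()

print(letters, digits, words)
print(repr(first_line))
print(repr(everything))
```

Note that \w also accepts the underscore, so it is slightly broader than [a-zA-Z0-9].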
You'll see that it doesn't "stop" when you expect it to. That's because matching is greedy by default: you'll have to use a ? after the +, or use the U modifier; see the Repetition section for more information.
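The difference between greedy and lazy matching can be sketched as follows. This is an illustration under assumptions: it uses Python's re module, which has no counterpart to the U modifier, so only the +? form is shown, and the input string is invented for the example.

```python
import re

# Hypothetical input with two separate <html>...</html> blocks.
text = "<html>first</html> filler <html>second</html>"

# Greedy: .+ grabs as much as possible, so the match runs to the LAST </html>.
greedy = re.search(r"<html>.+</html>", text, re.DOTALL).group()

# Lazy: .+? grabs as little as possible, so the match ends at the FIRST </html>.
lazy = re.search(r"<html>.+?</html>", text, re.DOTALL).group()

print(greedy)
print(lazy)
```

With the greedy pattern the whole string is returned, while the lazy pattern returns only the first block.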
It's generally much better to use a DOM Parser, such as :