I'm trying to extract tokens that satisfy several conditions. I'm using lookaheads to implement the following two: the token must contain at least one digit, and it must not contain two consecutive special characters such as '-', '/', '\', '.', '_', etc.
I want to match strings like: 165271, agya678, yah@123, kj*12-
I don't want to match strings like: ajh12-&, 671%&i^
I'm using a positive lookahead for the first condition: (?=\w*\d\w*)
and a negative lookahead for the second condition: (?!=[\_\.\:\;\-\\\/\@\+]{2})
I'm not sure how to combine these two look-ahead conditions.
Any suggestions would be helpful. Thanks in advance.
Edit 1 :
I would like to extract complete tokens that are part of a larger string too (i.e., they may appear in the middle of the string).
I would like to match all the tokens in the string:
165271 agya678 yah@123 kj*12-
and none of the tokens (not even a part of a token) in the string: ajh12-& 671%&i^
To force the regex to consider the whole token, I've also used \b
in the above regexes: (?=\b\w*\d\w*\b)
and (?!=\b[\_\.\:\;\-\\\/\@\+]{2}\b)
You can use
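^(?!.*[_.:;\-\\\/@+*]{2})(?=[^\d\n]*\d)[\w.:;\-\\\/@+*]+$
The negative lookahead (?!.*[_.:;\-\\\/@+*]{2})
asserts that the string does not contain two consecutive special characters. Note that the syntax is (?!...), not (?!=...): the latter only rejects a literal = followed by the pattern.
The positive lookahead (?=[^\d\n]*\d)
uses a negated character class to match any characters except a digit or a newline, and then matches a digit.
Note that you also have to add *
to the character class, since kj*12- should match, and that most characters don't have to be escaped inside a character class.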
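As a quick check, here is a minimal Python sketch that applies the anchored pattern, with the negative lookahead written as (?!...), to the sample tokens from the question. It assumes Python's re module, since the question does not name a regex flavor; the names TOKEN_RE and is_valid_token are only illustrative.

```python
import re

# Anchored pattern: no two consecutive special characters, at least one digit,
# and only word characters or the listed special characters.
TOKEN_RE = re.compile(r'^(?!.*[_.:;\-\\\/@+*]{2})(?=[^\d\n]*\d)[\w.:;\-\\\/@+*]+$')


def is_valid_token(token: str) -> bool:
    """Return True if the whole string is an acceptable token."""
    return TOKEN_RE.match(token) is not None


# Sample tokens taken from the question.
samples = ['165271', 'agya678', 'yah@123', 'kj*12-', 'ajh12-&', '671%&i^']
results = {token: is_valid_token(token) for token in samples}

for token, ok in results.items():
    print(f'{token!r}: {ok}')
```

The first four tokens are accepted. The last two are rejected because & and % are not in the allowed character class.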
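Using contrast, you could also turn the first .*
into a negated character class to prevent some backtracking:
^(?![^_.:;\-\\\/@+*\n]*[_.:;\-\\\/@+*]{2})(?=[^\d\n]*\d)[\w.:;\-\\\/@+*]+$
Be aware that this variant only inspects the first run of special characters, so it will not reject a token in which a doubled special character comes after a single one; keep the .* version if that can occur.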
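To see how the two lookahead styles behave, the following Python sketch (again assuming Python's re module) compares the .* version with the negated-class version written as (?![^...]*[...]{2}). The tokens ab--1 and a-1-- are hypothetical inputs chosen for illustration, not taken from the question.

```python
import re

# Variant 1: .* scans the whole string for two consecutive special characters.
dot_star = re.compile(
    r'^(?!.*[_.:;\-\\\/@+*]{2})(?=[^\d\n]*\d)[\w.:;\-\\\/@+*]+$'
)

# Variant 2: the negated character class stops at the first special character.
negated = re.compile(
    r'^(?![^_.:;\-\\\/@+*\n]*[_.:;\-\\\/@+*]{2})(?=[^\d\n]*\d)[\w.:;\-\\\/@+*]+$'
)


def compare(token: str) -> tuple:
    """Return (accepted by .* variant, accepted by negated-class variant)."""
    return bool(dot_star.match(token)), bool(negated.match(token))


for token in ['kj*12-', 'ab--1', 'a-1--']:
    print(token, compare(token))
```

Both variants accept kj*12- and reject ab--1. They differ on a-1--, where the doubled hyphen follows an earlier single hyphen: only the .* version rejects it.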
Edit
Without the anchors, you can use whitespace boundaries to the left (?<!\S)
and to the right (?!\S):
(?<!\S)(?!\S*[_.:;\-\\\/@+*]{2})(?=[^\d\s]*\d)[\w.:;\-\\\/@+*]+(?!\S)
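A short Python sketch of the unanchored pattern, assuming Python's re module and with the negative lookahead written as (?!...), run against the two strings from the question. The names TOKEN_IN_TEXT_RE and extract_tokens are illustrative only.

```python
import re

# Whitespace boundaries on both sides replace the ^ and $ anchors, so complete
# tokens can be extracted from the middle of a longer string.
TOKEN_IN_TEXT_RE = re.compile(
    r'(?<!\S)(?!\S*[_.:;\-\\\/@+*]{2})(?=[^\d\s]*\d)[\w.:;\-\\\/@+*]+(?!\S)'
)


def extract_tokens(text: str) -> list:
    """Return every whitespace-delimited token that satisfies the pattern."""
    return TOKEN_IN_TEXT_RE.findall(text)


good = extract_tokens('165271 agya678 yah@123 kj*12-')
bad = extract_tokens('ajh12-& 671%&i^')

print(good)  # ['165271', 'agya678', 'yah@123', 'kj*12-']
print(bad)   # []
```

Because (?<!\S) and (?!\S) require whitespace or a string edge on both sides, a token is either matched in full or not at all, so no partial match such as ajh12 is returned from ajh12-&.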