
Negative regex pattern matching in python for JSON


I believe this has been asked many times before, but I could not find a way to make it work for JSON content. My negative pattern matches every JSON string, even when the substring is present. I'm sure I am doing something wrong.

The idea is to match a JSON string that does not contain the string "key", and to not match one that does.

Note: I need to achieve this via "re.match" with a negative regex (rather than matching and then negating the result in Python), because I run this in bulk over many expressions and can't really change how the function works for one expression alone.

For example, here are my two JSON strings:

'{"key": "success", "name": "peter"}'
'{"name": "sam"}'

And I'm using the regex pattern below as the negative match:

((?!key).).*

The result is:

Python 3.9.5 (default, May 11 2021, 08:20:37) 
[GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import re
>>> pattern = r"((?!key).).*"
>>> jsonstring = '{"key": "success", "name": "peter"}'
>>> re.match(pattern, jsonstring)
<re.Match object; span=(0, 35), match='{"key": "success", "name": "peter"}'>

>>> jsonstring = '{"name": "sam"}'
>>> re.match(pattern, jsonstring)
<re.Match object; span=(0, 15), match='{"name": "sam"}'>

Am I doing anything terribly wrong here? I have tried different patterns, but without success so far.


Solution

  • ((?!key).).* matches a non-empty sequence of characters (..* is equivalent to .+) that does not start with the word key. More precisely, the negative lookahead is checked only once, at the very first position, so it only requires that the string does not begin with key. Both strings begin with { rather than key, so both of them match the pattern. Note that the parentheses are redundant here.

    You may want to use (?!.*"key").*:

    >>> import re
    >>> pattern = r"(?!.*\"key\").*"
    >>> jsonstring = '{"key": "success", "name": "peter"}'
    >>> re.match(pattern, jsonstring)  # returns None, so the REPL prints nothing
    
    >>> jsonstring = '{"name": "sam"}'
    >>> re.match(pattern, jsonstring)
    <re.Match object; span=(0, 15), match='{"name": "sam"}'>
    

    This works in this case, although a regex is not a good way to parse a JSON string.

    The best way is to use a JSON parser:

    >>> import json
    >>> jsonstring = '{"key": "success", "name": "peter"}'
    >>> obj = json.loads(jsonstring)
    >>> "key" not in obj
    False
    >>> jsonstring = '{"name": "sam"}'
    >>> obj = json.loads(jsonstring)
    >>> "key" not in obj
    True