Tags: python, python-3.x, regex, regex-group

Python - split a string into multiple JSON strings


I'm trying to split a string and extract every JSON string it contains.
My string:

{"datas": {"type": "custom", "value": {"cat": "game", "func": "game", "args": ["action", "move", "ball", 0, 55, 223]}}}{"datas": {"type": "auth", "value": 0}}{"datas": {"type": "custom", "value": {"cat": "game", "func": "game", "args": ["action", "move", "ball", 0, 60, 218]}}}{"datas": {"type": "custom", "value": {"cat": "game", "func": "game", "args": ["action", "move", "ball", 0, 65, 213]}}}{"datas": {"type": "custom", "value": {"cat": "game", "func": "game", "args": ["action", "move", "ball", 0, 70, 208]}}}

My regex:

({.*})({.*)

But the first group matches the entire string except for the last JSON string:

{"datas": {"type": "custom", "value": {"cat": "game", "func": "game", "args": ["action", "move", "ball", 0, 55, 223]}}}{"datas": {"type": "auth", "value": 0}}{"datas": {"type": "custom", "value": {"cat": "game", "func": "game", "args": ["action", "move", "ball", 0, 60, 218]}}}{"datas": {"type": "custom", "value": {"cat": "game", "func": "game", "args": ["action", "move", "ball", 0, 65, 213]}}}

I want to get the JSON strings one by one, like this:

{"datas": {"type": "custom", "value": {"cat": "game", "func": "game", "args": ["action", "move", "ball", 0, 55, 223]}}}

I'm not sure how to explain my problem properly; I hope it's understandable.
Thanks for reading.


**EDIT**: In the end, I didn't use a regex. Here is my function:
def packet_to_jsonlist(s):
    jsonlist = []
    count = 0
    current = 0
    for i in range(0, len(s)):
        if s[i] == '{':
            count += 1
        elif s[i] == '}':
            count -= 1
            if count == 0:
                jsonlist.append(s[current:i+1])
                current = i + 1

    return jsonlist

Solution

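  • I don't think this is a great general solution, but in this case you can split the string with a regex that matches the empty position between a closing } and the opening { that follows it. This gives you a list of JSON strings, which you can then parse. Note that `re.split` only splits on an empty match like this in Python 3.7 or later: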

    import re
    import json
    
    s = '{"datas": {"type": "custom", "value": {"cat": "game", "func": "game", "args": ["action", "move", "ball", 0, 55, 223]}}}{"datas": {"type": "auth", "value": 0}}{"datas": {"type": "custom", "value": {"cat": "game", "func": "game", "args": ["action", "move", "ball", 0, 60, 218]}}}{"datas": {"type": "custom", "value": {"cat": "game", "func": "game", "args": ["action", "move", "ball", 0, 65, 213]}}}{"datas": {"type": "custom", "value": {"cat": "game", "func": "game", "args": ["action", "move", "ball", 0, 70, 208]}}}'
    
    js = re.split(r'(?<=})\B(?={)', s)
    
    dicts = [json.loads(part) for part in js]
    

    This makes dicts equal to:

    [{'datas': {'type': 'custom',
       'value': {'cat': 'game',
        'func': 'game',
        'args': ['action', 'move', 'ball', 0, 55, 223]}}},
     {'datas': {'type': 'auth', 'value': 0}},
     {'datas': {'type': 'custom',
       'value': {'cat': 'game',
        'func': 'game',
        'args': ['action', 'move', 'ball', 0, 60, 218]}}},
     {'datas': {'type': 'custom',
       'value': {'cat': 'game',
        'func': 'game',
        'args': ['action', 'move', 'ball', 0, 65, 213]}}},
     {'datas': {'type': 'custom',
       'value': {'cat': 'game',
        'func': 'game',
        'args': ['action', 'move', 'ball', 0, 70, 208]}}}]
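
To illustrate why the split is not a general solution, here is a minimal sketch with a made-up payload (`tricky` is hypothetical and not part of the original data) in which a string value happens to contain `}{`. The lookaround pattern cannot tell that this position is inside a string, so it also splits there and the pieces no longer parse:

```python
import json
import re

# Hypothetical payload: the first object's string value contains "}{".
tricky = '{"msg": "a}{b"}{"n": 1}'

# Same pattern as above; splitting on an empty match needs Python 3.7+.
parts = re.split(r'(?<=})\B(?={)', tricky)
print(parts)  # three pieces, although the input holds only two objects

try:
    [json.loads(p) for p in parts]
except json.JSONDecodeError as exc:
    # The first piece is cut off in the middle of a string value.
    print("split produced invalid JSON:", exc)
```

For data like the packets in the question, where no string value contains braces, the split works as shown.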
    

    For a more general solution, you can write a small parser that tracks the brace depth and yields each complete top-level string:

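    def getGroups(s):
        current = ''
        count = 0  # current brace nesting depth
        for c in s:
            current += c
            if c == '{':
                count += 1
            elif c == '}':
                count -= 1
                # Depth zero right after a closing brace ends one top-level
                # object. Checking only here avoids yielding stray characters
                # (such as whitespace) that sit between objects.
                if count == 0:
                    yield current
                    current = ''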
    
    [json.loads(js) for js in getGroups(s)]
    # same output
    

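    This assumes the braces are balanced properly and that no string value inside the JSON contains a { or } character, since the counter does not know about string literals.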
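
If string values may contain braces, one alternative (a different technique from the brace counter above, not the method used in this answer) is to let the standard library's JSON decoder find where each document ends. A minimal sketch using `json.JSONDecoder.raw_decode`, which returns the parsed value together with the index where it stopped; the helper name `iter_json_objects` and the `sample` string are made up for illustration:

```python
import json

def iter_json_objects(s):
    """Yield each top-level JSON value from a string of concatenated documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(s):
        # raw_decode does not skip leading whitespace, so step past it manually.
        if s[pos].isspace():
            pos += 1
            continue
        # Returns the decoded value and the index just past the end of it.
        obj, pos = decoder.raw_decode(s, pos)
        yield obj

# Hypothetical input: two objects separated by a space, one containing "}{" in a string.
sample = '{"datas": {"type": "auth", "value": 0}} {"msg": "a}{b"}'
print(list(iter_json_objects(sample)))
```

Because the decoder understands string literals, braces inside values do not confuse it, and malformed input raises `json.JSONDecodeError` instead of silently yielding a bad piece.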