Correction of regex function


I'm matching a Discord server response with a regex. It worked some time ago but has since broken. It should check every connected user in the JSON for a voice channel connection (only those users have a channel id) while excluding the two bots, and then return the username and channel id. The negative lookahead on "id" was needed so that the match doesn't jump between users when it looks for channel ids. A user who is not connected to any channel looks like Bot1.

My expression so far was: "username": "((?!Bot1|Bot2)[^"]*)"(?:(?!"id").)*channel_id": "(\d+)".*? It used to work, but now it no longer finds any matches.

The two parts work by themselves but not combined: "username": "((?!Bot1|Bot2)[^"]*)" (?:(?!"id").)*channel_id": "(\d+)".*?

The JSON is:

...
"members": [
        {
            "id": "0",
            "username": "User1",
            "discriminator": "0000",
            "avatar": null,
            "status": "online",
            "avatar_url": "https://cdn.discordapp.com/widget-avatars/...",
            "deaf": false,
            "mute": false,
            "self_deaf": false,
            "self_mute": false,
            "suppress": false,
            "channel_id": "(digits between 0-9)"
        },
        {
            "id": "1",
            "username": "Bot1",
            "discriminator": "0000",
            "avatar": null,
            "status": "online",
            "avatar_url": "https://cdn.discordapp.com/widget-avatars/..."
        },
        {
            "id": "2",
            "username": "Bot2",
            "discriminator": "0000",
            "avatar": null,
            "status": "online",
            "avatar_url": "https://cdn.discordapp.com/widget-avatars/...",
            "game": {
                "name": "music | /help"
            },
            "deaf": false,
            "mute": false,
            "self_deaf": false,
            "self_mute": false,
            "suppress": false,
            "channel_id": "(digits between 0-9)"
        }
    ],
    "presence_count": 3
}

I'm using Rainmeter, which supports regex natively but cannot parse JSON.


Solution

  • The immediate problem is that your combined regex assumes the two values are immediately adjacent, which they are not in the example JSON you posted.

    I have no access to the platform you are using, but try allowing for optional additional fields between the two values.

    "username": "((?!Bot[12]")[^"]*)",
    (?:\s*"[^"]+": (?:[^"]+|"[^"]*"),)*
    \s*"channel_id": "(\d+)"
    

    (I added line breaks for legibility, but you probably want to remove them.)

    The optional group (?:...)* skips zero or more additional "key": value pairs between the two you are interested in (where values can be unquoted JSON keywords like true, false, or null, or quoted strings). Because the regex now cannot straddle boundaries between unrelated JSON dictionaries, and we target the key channel_id exactly (and we rely on the JSON to be valid, so there cannot be any duplicate keys), no negative lookahead is necessary.
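As a rough check of the pattern described above, here is a minimal sketch using Python's re module as a stand-in for Rainmeter (which is not available here, so engine differences may apply). The sample text is a shortened, hypothetical version of the posted JSON, the channel ids 111 and 222 are placeholders, and find_voice_users is a name made up for this illustration. Note the literal comma after the username value, which the sample JSON requires before the next key.

```python
import re

# The three parts of the pattern joined on one line.
# A literal comma follows the username value, as in the JSON.
PATTERN = re.compile(
    r'"username": "((?!Bot[12]")[^"]*)",'
    r'(?:\s*"[^"]+": (?:[^"]+|"[^"]*"),)*'
    r'\s*"channel_id": "(\d+)"'
)

# Hypothetical, shortened sample; the channel ids are placeholders.
SAMPLE = '''
"members": [
        {
            "id": "0",
            "username": "User1",
            "avatar": null,
            "status": "online",
            "deaf": false,
            "suppress": false,
            "channel_id": "111"
        },
        {
            "id": "1",
            "username": "Bot1",
            "avatar": null,
            "status": "online"
        },
        {
            "id": "2",
            "username": "Bot2",
            "game": {
                "name": "music | /help"
            },
            "suppress": false,
            "channel_id": "222"
        }
    ]
'''


def find_voice_users(text):
    """Return (username, channel_id) pairs for non-bot members in a voice channel."""
    return PATTERN.findall(text)


print(find_voice_users(SAMPLE))  # [('User1', '111')]
```

Bot1 and Bot2 are rejected by the negative lookahead right after the opening quote of the username value, so only User1 is reported even though Bot2 also has a channel id.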

    This could still fail if the keys are in the opposite order (that is, "channel_id" comes before "username" within an entry), or if some JSON strings could contain escaped double quotes. The latter is not hard to fix as such; just replace any "[^"]+" or "[^"]*" with

    "(?:\\.|[^"\\]+)+"
    

    where the last quantifier before the closing quotes should be * instead if you want to permit an empty string.
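To illustrate the difference, the following sketch compares the simple string pattern with the escape-aware one, again using Python's re module as an assumed stand-in for the target engine. The input text is a made-up example, and the * variant is used so that empty strings are permitted.

```python
import re

# Simple version: stops at the first double quote, escaped or not.
NAIVE = re.compile(r'"[^"]*"')

# Escape-aware version: a backslash consumes the character that follows it.
ESCAPED = re.compile(r'"(?:\\.|[^"\\]+)*"')

# Made-up input containing escaped double quotes inside a value.
text = r'"name": "say \"hi\" now"'

print(NAIVE.findall(text))    # three fragments, the value is split up
print(ESCAPED.findall(text))  # two complete strings
```

With the simple pattern the value is cut at the first escaped quote, while the escape-aware pattern returns the key and the full value as two complete strings.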

    Demo (using PCRE2 regex syntax): https://regex101.com/r/iGI8Xq/1