
How to convert an invalid JSON-like string with inconsistent quotes into valid JSON using regex in Java?


Here is a sample string. I am having trouble in particular with values such as 'rgba(71,13,226,0.47)', because they contain commas. The same regex should also handle a JSON-array-like string.

varStr ="{
    red : {
        orange: 100,
        dates: {
            later: "4_WEEK",
            now:   "1_WEEK"
        },
        pink: {
            crimson : '#16gcvcn',
            lavender : '#47vdsaj',
            purple : '#h7465',
            baby pink : '#hd576',
            yellow : 'rgba(71,13,226,0.47)',
            magenta : 'rgba(211,25,25,0.01)'
        }
    }
}";

So far I have a series of regex replacements, but they do not handle all cases.

    varStr = varStr.replaceAll("([^:{,\"\\n =\\[}\\]][\\w &|\\/'.\\-)#]+)", "\"$1\"");
    varStr = varStr.replace("\"[", "[");
    varStr = varStr.replace("]\"", "]");
    varStr = varStr.replaceAll("'", "\"");

Solution

  • First, note that converting the string to JSON this way turns every value into a String. Types such as Number or Boolean are lost, so you lose their convenience during deserialization.

    You can achieve what you want as follows; I will update this answer if I find a more elegant solution.

    Code snippet

    String regex = "(?:[{ ])(\\w+)(?!\")";
    varStr = varStr.replaceAll(regex, "\\\"$1\\\"");
    varStr = varStr.replace("\"\"", " ").replace("'", "\"");
    

    By the way, although JSON has no standard naming convention, I have never seen a field name containing spaces. It is therefore better to use babyPink, baby_pink, or baby-pink.
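
To make the answer's snippet easier to try out, here is a minimal, self-contained sketch that wraps the same three replacements in a method and runs them on a shortened version of the sample. The class name `LooseJsonFixer` and the methods `toJson` and `restoreTypes` are hypothetical names chosen for illustration. `restoreTypes` is not part of the original answer: it is an assumed extra step that unquotes values that look like numbers, booleans, or null, as one possible way to address the lost-datatype caveat.

```java
import java.util.regex.Pattern;

public class LooseJsonFixer {

    // Same pattern as the answer's snippet: a bare word preceded by '{' or a
    // space and not directly followed by a double quote. The preceding
    // character is part of the match, so it is dropped by the replacement.
    private static final Pattern BARE_WORD =
            Pattern.compile("(?:[{ ])(\\w+)(?!\")");

    // Assumption (not in the original answer): a quoted number, boolean, or
    // null in value position, i.e. after ':' and before ',', '}' or ']'.
    private static final Pattern QUOTED_LITERAL = Pattern.compile(
            "(:\\s*)\"(-?(?:0|[1-9]\\d*)(?:\\.\\d+)?|true|false|null)\"(?=\\s*[,}\\]])");

    // Applies the three replacements from the answer in the same order.
    public static String toJson(String loose) {
        String s = BARE_WORD.matcher(loose).replaceAll("\"$1\"");
        // Adjacent quoted words ("baby""pink") are merged back into one key,
        // then single quotes become double quotes.
        return s.replace("\"\"", " ").replace("'", "\"");
    }

    // Hypothetical post-processing step: strips the quotes around values
    // that look like JSON numbers, booleans, or null.
    public static String restoreTypes(String json) {
        return QUOTED_LITERAL.matcher(json).replaceAll("$1$2");
    }

    public static void main(String[] args) {
        String loose = "{\n"
                + "    orange: 100,\n"
                + "    baby pink : '#hd576',\n"
                + "    yellow : 'rgba(71,13,226,0.47)'\n"
                + "}";
        String json = toJson(loose);
        System.out.println(json);               // every value is a string here
        System.out.println(restoreTypes(json)); // orange is a number again
    }
}
```

After `toJson`, `orange` comes out as the string `"100"`; after `restoreTypes` it is the number `100`, while the colour values stay quoted. Two limitations are worth keeping in mind. Because the first pattern consumes the character before each word, input such as `{red: 1}`, with no whitespace after the brace, would lose its opening `{`. And `restoreTypes` cannot tell whether a value was originally written as `100` or as `'100'`, so it should be skipped if numeric-looking strings must stay strings.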