How to use sed (or another tool) to remove strings enclosed in both square brackets and braces


We have an unusual CSV data file in which some fields contain JSON data, as shown below:

"00001","str1","[a.b.c] str3, str4",true,false,"2022-04-18T12:00:00+00:00","[{""k1"":""v1"",""k2"":""v2""}]","str5"

We want to remove everything inside square brackets that directly enclose braces (the JSON part), leaving the rest of the line unchanged. However, the sed command sed -e 's/\[.*\]//g' returns undesired output:

"00001","str1","","str5"

The expected output is:

"00001","str1","[a.b.c] str3, str4",true,false,"2022-04-18T12:00:00+00:00","","str5"

We do not know how to match and replace only the part containing the JSON data, and we could not find relevant information on how to do so.

How can we achieve this?


Solution

  • Your current pattern is greedy: it matches from the first [ to the last ] on the line, so everything in between is removed. The g flag also appears to be redundant here.

    Try this sed command:

    $ sed 's/\[{[^]]*]//' input_file
    "00001","str1","[a.b.c] str3, str4",true,false,"2022-04-18T12:00:00+00:00","","str5"
    

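    This matches from [{ (an opening square bracket immediately followed by an opening curly brace) up to the next closing square bracket. The bracket expression [^]]* consumes any run of characters other than ], so the match cannot run past the first ] that follows.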
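The difference between the two patterns can be reproduced without an input file. The following is a minimal sketch that assumes the question's command escaped its brackets as \[.*\] (the backslashes may have been lost in formatting); the variable names line, greedy, and fixed are illustrative only.

```shell
# Sample row from the question; single quotes keep the double quotes literal.
line='"00001","str1","[a.b.c] str3, str4",true,false,"2022-04-18T12:00:00+00:00","[{""k1"":""v1"",""k2"":""v2""}]","str5"'

# Greedy pattern (assumed form of the question's command):
# .* runs from the first [ on the line to the last ].
greedy=$(printf '%s\n' "$line" | sed 's/\[.*\]//')

# Pattern from the answer: starts only at [{ and stops at the first ] after it.
fixed=$(printf '%s\n' "$line" | sed 's/\[{[^]]*]//')

printf '%s\n' "$greedy"
printf '%s\n' "$fixed"
```

The first line printed matches the undesired output in the question, and the second matches the expected output. Note that this approach relies on the JSON field containing no ] before its final closing bracket; a nested array inside the JSON would end the match early.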