We have an unusual CSV data file in which some fields contain JSON data, as shown below:
"00001","str1","[a.b.c] str3, str4",true,false,"2022-04-18T12:00:00+00:00","[{""k1"":""v1"",""k2"":""v2""}]","str5"
We want to remove everything inside the square brackets that are immediately followed by braces (the JSON field near the end of the line), leaving the rest of the line unchanged. However, when I use the sed command sed -e 's/\[.*\]//g', it returns this undesired output:
"00001","str1","","str5"
The expected output is:
"00001","str1","[a.b.c] str3, str4",true,false,"2022-04-18T12:00:00+00:00","","str5"
We do not know how to capture and replace only the part containing the JSON data, and we could not find relevant information on how to do so.
How can we achieve this?
Your current pattern matches greedily from the first [ to the last ], removing everything in between. The g flag also appears to be redundant.
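To see the greedy behaviour in isolation, here is a minimal sketch that runs the greedy pattern on the sample record from the question, with and without the g flag. The variable names (line, greedy, greedy_g) are only for this demo, and the pattern is assumed to be written with escaped brackets.

```shell
# Sample record from the question, held in a variable for the demo.
line='"00001","str1","[a.b.c] str3, str4",true,false,"2022-04-18T12:00:00+00:00","[{""k1"":""v1"",""k2"":""v2""}]","str5"'

# .* is greedy: the match runs from the first [ to the last ] on the line.
greedy=$(printf '%s\n' "$line" | sed 's/\[.*\]//')

# Same pattern with g: after the first match nothing containing [ is left,
# so the flag changes nothing here.
greedy_g=$(printf '%s\n' "$line" | sed 's/\[.*\]//g')

printf '%s\n' "$greedy"     # "00001","str1","","str5"
printf '%s\n' "$greedy_g"   # "00001","str1","","str5"
```

Both commands print the same undesired line, which is why the g flag makes no difference for this input.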
Try this sed command:
$ sed 's/\[{[^]]*]//' input_file
"00001","str1","[a.b.c] str3, str4",true,false,"2022-04-18T12:00:00+00:00","","str5"
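The pattern matches from [{ (an opening square bracket immediately followed by an opening curly brace) up to the next closing square bracket: [^]]* matches any run of characters other than ], and the final ] matches the closing bracket itself.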
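As a quick check, the sketch below applies the pattern to the sample record and to a second, made-up record that has two JSON fields on one line. The second record and the variable names are hypothetical, added only for illustration. In that case the g flag is no longer redundant, because each JSON field is a separate match.

```shell
# Sample record from the question.
line='"00001","str1","[a.b.c] str3, str4",true,false,"2022-04-18T12:00:00+00:00","[{""k1"":""v1"",""k2"":""v2""}]","str5"'

# Only the bracket pair that starts with [{ is removed; [a.b.c] is kept.
fixed=$(printf '%s\n' "$line" | sed 's/\[{[^]]*]//')
printf '%s\n' "$fixed"

# Hypothetical record with two JSON fields (not from the original data).
two='"1","[{""a"":1}]","[x] y","[{""b"":2}]"'

# With g, every [{...}] field on the line is removed.
fixed_two=$(printf '%s\n' "$two" | sed 's/\[{[^]]*]//g')
printf '%s\n' "$fixed_two"   # "1","","[x] y",""
```

Note that [^]]* stops at the first ], so this approach assumes the JSON itself contains no nested square brackets.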