json string bash unix cygwin

Finding a string between two strings in a file


This is a bit of a .json file I need to find information in:

"title":
"Spring bank holiday","date":"2012-06-04","notes":"Substitute day","bunting":true},
{"title":"Queen\u2019s Diamond Jubilee","date":"2012-06-05","notes":"Extra bank holiday","bunting":true},
{"title":"Summer bank holiday","date":"2012-08-27","notes":"","bunting":true},
{"title":"Christmas Day","date":"2012-12-25","notes":"","bunting":true},
{"title":"Boxing Day","date":"2012-12-26","notes":"","bunting":true},
{"title":"New Year\u2019s Day","date":"2013-01-01","notes":"","bunting":true},
{"title":"Good Friday","date":"2013-03-29","notes":"","bunting":false},
{"title":"

The file is much longer, but it is one long line of text.

I would like to display which bank holidays fall after a certain date, and whether each one involves bunting. I've tried grep and sed but I can't figure it out. I'd like something like this:

[command] between [date] and [}] display [title] and [bunting]/[no bunting]

[title] should be just the holiday name, for example "Christmas Day".

Forgot to mention: I would like to achieve this in bash shell, either from the prompt or from a short bit of code.


Solution

  • You can try this with awk:

     awk -F"}," '{for(i=1;i<=NF;i++){print $i}}' file.json | awk -F"\"[:,]\"?" '$4>"2013-01-01"{printf "%s:%s:%s\n" ,$2,$4,$8}'
    

    Since the JSON file is one long line, we first split it into individual JSON records on },. Each record is then split into fields on a ", followed by : or ,, with an optional closing ". We then print a record only if its date is after the given date.

    This prints all records dated after 1 January 2013. The comparison is strict, so 2013-01-01 itself is excluded.

    EDIT:

    The second awk splits each JSON record into keys and values. Its field separator is a substring that starts with ", is followed by either : or ,, and optionally ends with ". In your example it splits on ",", ":" or ": (the last form appears before unquoted values such as true).

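    Odd-numbered fields are keys and even-numbered fields are values, so $2 is the title, $4 the date and $8 the bunting flag. We then check whether $4 (the date) comes after 2013-01-01. Because the dates are written as YYYY-MM-DD, a plain string comparison orders them correctly.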

    I noticed I made a mistake with the optional " in the split (it should be followed by ? instead of *), which I have now corrected. I also used the printf function to display the values.
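
To get closer to the output format asked for in the question (title plus bunting / no bunting), here is a minimal sketch that does both splits inside a single awk call. The function name holidays_after and its argument order are my own assumptions, not part of the original answer. It also assumes every record carries the same four keys in the same order as in the sample (title, date, notes, bunting).

```shell
#!/bin/sh
# holidays_after DATE FILE
# Hypothetical helper: prints "title: bunting" or "title: no bunting"
# for every record whose date sorts strictly after DATE (YYYY-MM-DD).
holidays_after() {
    awk -v since="$1" '
    {
        # The file is one long line: split it into records on "},".
        n = split($0, rec, /[}],/)
        for (i = 1; i <= n; i++) {
            # Split one record on a quote, then : or , then an optional quote.
            # Assumed layout: f[2] = title, f[4] = date, f[8] = bunting flag.
            if (split(rec[i], f, /"[:,]"?/) < 8)
                continue            # skip fragments that are not full records
            if ((f[4] "") > since)  # force a string comparison of the dates
                printf "%s: %s\n", f[2], (f[8] ~ /^true/ ? "bunting" : "no bunting")
        }
    }' "$2"
}
```

Called as holidays_after 2012-12-25 file.json, this should print Boxing Day, New Year's Day and Good Friday from the records shown in the question, with Good Friday marked "no bunting". Note that JSON escapes such as \u2019 are printed as-is, since awk does not decode them. A real JSON parser would be more robust if the key order ever changes.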