I have a complex JSON file, nested four to five levels deep, and I am trying to get the result below using jq. Any help would be appreciated:
{
"Name": "unix-global",
"Title": "AWS cli should be installed",
"desc": "System Package aws-cli should be installed",
"result": "passed"
}
{
"Name": "unix-global",
"Title": "AWS cli should be installed",
"desc": "Service besclient should be installed",
"result": "failed"
}
This is the JSON file that I get as a result of running an InSpec profile. The real aim is to extract only the needed information into a simple JSON so that I can finally update an AWS Redshift database.
{
"version": "1.7.1",
"profiles": [{
"name": "java",
"title": "InSpec Java in system",
"maintainer": "awim",
"copyright": "awim / mtaqwim",
"copyright_email": "[email protected]",
"license": "All Rights Reserved",
"summary": "An InSpec Compliance Profile",
"version": "0.0.1",
"supports": [],
"controls": [{
"title": "identify java in system",
"desc": "identify java in PATH system",
"impact": 0.3,
"refs": [],
"tags": {},
"code": "control 'java-1.0' do\n impact 0.3\n title 'identify java in system'\n desc 'identify java in PATH system'\n\n describe java_info do\n it{ should exist }\n its(:version){ should match '1.7'}\n end\nend",
"source_location": {
"ref": "inspec/java/controls/java_1.0.rb",
"line": 6
},
"id": "java-1.0",
"results": [{
"status": "passed",
"code_desc": "java_info should exist",
"run_time": 0.000895896,
"start_time": "2017-01-20 05:04:47 +0000"
}, {
"status": "passed",
"code_desc": "java_info version should match \"1.7\"",
"run_time": 0.067581113,
"start_time": "2017-01-20 05:04:47 +0000"
}]
}, {
"title": "run java from specific path",
"desc": "run java from specific path",
"impact": 1.0,
"refs": [],
"tags": {},
"code": "control 'java-2.0' do\n impact 1.0\n title 'run java from specific path'\n desc 'run java from specific path'\n\n describe java_info(java_path) do\n it{ should exist }\n its(:version){ should match '1.7'}\n end\nend",
"source_location": {
"ref": "inspec/java/controls/java_2.0.rb",
"line": 8
},
"id": "java-2.0",
"results": [{
"status": "skipped",
"code_desc": "java_info",
"skip_message": "Can't find file \"/opt/jdk/current\"",
"resource": "java_info",
"run_time": 1.6512e-05,
"start_time": "2017-01-20 05:04:47 +0000"
}]
}, {
"title": "identify java home",
"desc": "identify java home match to specific path",
"impact": 0.1,
"refs": [],
"tags": {},
"code": "control 'java-3.0' do\n impact 0.1\n title 'identify java home'\n desc 'identify java home match to specific path'\n\n describe java_info(java_path) do\n its(:java_home){ should match java_path}\n end\nend",
"source_location": {
"ref": "inspec/java/controls/java_3.0.rb",
"line": 8
},
"id": "java-3.0",
"results": [{
"status": "skipped",
"code_desc": "java_info",
"skip_message": "Can't find file \"/opt/jdk/current\"",
"resource": "java_info",
"run_time": 6.139e-06,
"start_time": "2017-01-20 05:04:47 +0000"
}]
}],
"groups": [{
"title": "which(UNIX)/where(Windows) java installed",
"controls": ["java-1.0"],
"id": "controls/java_1.0.rb"
}, {
"title": "which(UNIX)/where(Windows) java installed",
"controls": ["java-2.0"],
"id": "controls/java_2.0.rb"
}, {
"title": "which(UNIX)/where(Windows) java installed",
"controls": ["java-3.0"],
"id": "controls/java_3.0.rb"
}],
"attributes": []
}],
"other_checks": [],
"statistics": {
"duration": 0.069669698
}
}
Here's a jq filter to flatten this out. Note that the piping between filters is essential: you must flatten each parent array before you flatten its child, or you get a Cartesian product of them all, with every parent value paired with every child value (which is very bad).
jq '.profiles[]
| { Name: .name , Controls: .controls[] }
| { Name: .Name, Desc: .Controls.desc , Title: .Controls.title , Results: .Controls.results[] }
| { Name: .Name, Desc: .Desc , Title: .Title , StartTime: .Results.start_time , RunTime: .Results.run_time , Result: .Results.status }'
(Line breaks added to the code for clarity.)
Output:
{
"Name": "java",
"Desc": "identify java in PATH system",
"Title": "identify java in system",
"StartTime": "2017-01-20 05:04:47 +0000",
"RunTime": 0.000895896,
"Result": "passed"
}
…etc
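To see why the staged piping matters, here is a minimal sketch that counts the rows produced both ways. The `sample` document is a hypothetical, cut-down stand-in for the report (one profile, two controls, three results in total), and the variable names `staged` and `crossed` are mine, not part of the original filter.

```shell
# Hypothetical sample: one profile, two controls, three results in total.
sample='{"profiles":[{"name":"java","controls":[
  {"title":"identify java in system","results":[{"status":"passed"},{"status":"passed"}]},
  {"title":"run java from specific path","results":[{"status":"skipped"}]}]}]}'

# Staged: each array is expanded in its own pipeline step, so every result
# stays attached to the control it came from.
staged=$(printf '%s' "$sample" | jq '[ .profiles[]
  | { Name: .name, Controls: .controls[] }
  | { Name: .Name, Title: .Controls.title, Result: .Controls.results[].status } ]
  | length')

# Single step: .controls[] is expanded twice, independently, inside one object,
# so every title is paired with every result of every control.
crossed=$(printf '%s' "$sample" | jq '[ .profiles[]
  | { Name: .name, Title: .controls[].title, Result: .controls[].results[].status } ]
  | length')

echo "staged rows:  $staged"    # 3, one per result
echo "crossed rows: $crossed"   # 6, two titles times three results
```

The staged version yields one row per actual result, while the single-step version also pairs each title with results that belong to the other control.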
Once you've flattened it this far, I would consider saving it as CSV instead of JSON, as that will be somewhat simpler to load into Redshift.
jq '.profiles[]
| { Name: .name , Controls: .controls[] }
| { Name: .Name, Desc: .Controls.desc , Title: .Controls.title , Results: .Controls.results[] }
| [ .Name, .Desc , .Title , .Results.start_time , .Results.run_time , .Results.status ]
| @csv '
Output:
"\"java\",\"identify java in PATH system\",\"identify java in system\",\"2017-01-20 05:04:47 +0000\",0.000895896,\"passed\""
…etc
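The output above is wrapped in an extra pair of quotes with escaped inner quotes because jq prints each `@csv` string as a JSON string by default. If you want plain CSV lines, jq's `-r` (`--raw-output`) option writes the string as-is. A minimal sketch, assuming a hypothetical one-result `sample` cut down from the report above:

```shell
# Hypothetical one-result sample, trimmed from the InSpec report.
sample='{"profiles":[{"name":"java","controls":[{
  "title":"identify java in system",
  "desc":"identify java in PATH system",
  "results":[{"status":"passed","run_time":0.000895896,
              "start_time":"2017-01-20 05:04:47 +0000"}]}]}]}'

# Same filter as above; -r makes jq emit the @csv string without JSON quoting.
csv=$(printf '%s' "$sample" | jq -r '.profiles[]
  | { Name: .name, Controls: .controls[] }
  | { Name: .Name, Desc: .Controls.desc, Title: .Controls.title, Results: .Controls.results[] }
  | [ .Name, .Desc, .Title, .Results.start_time, .Results.run_time, .Results.status ]
  | @csv')

echo "$csv"
```

Each line then starts with `"java","identify java in PATH system",...` rather than `"\"java\",...`, so it can be redirected straight into a `.csv` file.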