I'm reworking some JSON using jq and trying to extract some strings from a larger description and move them into an array of related controls.
Here's my input JSON:
{"description": "Fail-safe procedures include, for example, alerting operator personnel and providing specific instructions on subsequent steps to take (e.g., do nothing, re-establish system settings, shut down processes, restart the system, or contact designated organizational personnel). Related controls: CA-2, CA-7, CM-3, CM-5, CM-8, MA-2, IR-4, RA-5, SA-10, SA-1x, SI-1x"}
The output I want is:
{"description": "Fail-safe procedures include, for example, alerting operator personnel and providing specific instructions on subsequent steps to take (e.g., do nothing, re-establish system settings, shut down processes, restart the system, or contact designated organizational personnel).",
"relatedControls": ["CA-2", "CA-7", "CM-3", "CM-5", "CM-8", "MA-2", "IR-4", "RA-5", "SA-10", "SA-1x", "SI-1x"]}
I've worked out something I think is pretty close, but it emits a separate object for each matched control instead of a single object with an array of controls like I wanted.
jq '. | {description: .description | sub(" Related controls:.*";""), relatedControls: .description | scan("[A-Z]{2}-\\d[0-9x]?") }'
Here's the whole thing on one line so it's easy to test:
echo '{"description": "Fail-safe procedures include, for example, alerting operator personnel and providing specific instructions on subsequent steps to take (e.g., do nothing, re-establish system settings, shut down processes, restart the system, or contact designated organizational personnel). Related controls: CA-2, CA-7, CM-3, CM-5, CM-8, MA-2, IR-4, RA-5, SA-10, SA-1x, SI-1x"}' | jq '. | {description: .description | sub(" Related controls:.*";""), relatedControls: .description | scan("[A-Z]{2}-\\d[0-9x]?") }'
jq wizards... what am I missing to get the output I'm after?
You could just split with / at " Related controls: ", then split again at ", ":
.description / " Related controls: "
| {description: .[0], relatedControls: (.[1] / ", ")}
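To try the split filter end to end, here is a minimal sketch. The shortened description is a made-up stand-in for the real input, and the -c flag is only there to print compact one-line output.

```shell
# Hypothetical shortened input; the real description is much longer.
input='{"description": "Fail-safe procedures. Related controls: CA-2, SA-1x"}'

# Split once at the marker text, then split the tail at ", ".
result=$(printf '%s' "$input" | jq -c '
  .description / " Related controls: "
  | {description: .[0], relatedControls: (.[1] / ", ")}
')

echo "$result"
# prints {"description":"Fail-safe procedures.","relatedControls":["CA-2","SA-1x"]}
```

This sketch assumes the marker text " Related controls: " is present in the description; it does not try to handle entries without it.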
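Alternatively, here's another approach using capture and scan with your regular expression. Wrapping scan in [...] collects its matches into a single array; without the brackets, scan emits one result per match, which is why your attempt produced one object per control: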
.description
| capture("(?<description>.*) Related controls: (?<relatedControls>.*)")
| .relatedControls |= [scan("[A-Z]{2}-\\d[0-9x]?")]
Output:
{
  "description": "Fail-safe procedures include, for example, alerting operator personnel and providing specific instructions on subsequent steps to take (e.g., do nothing, re-establish system settings, shut down processes, restart the system, or contact designated organizational personnel).",
  "relatedControls": [
    "CA-2",
    "CA-7",
    "CM-3",
    "CM-5",
    "CM-8",
    "MA-2",
    "IR-4",
    "RA-5",
    "SA-10",
    "SA-1x",
    "SI-1x"
  ]
}
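For comparison, the smallest change to the filter from the question is probably to wrap the scan in array brackets. The sketch below does that on a made-up, shortened description; the parentheses around the description value and the -c flag are additions for clarity and compact output, not part of the original attempt.

```shell
# Hypothetical shortened input; the real description is much longer.
input='{"description": "Fail-safe procedures. Related controls: CA-2, SA-10, SI-1x"}'

# [ ... ] gathers every scan match into one array, so only one object is emitted.
result=$(printf '%s' "$input" | jq -c '
  {
    description: (.description | sub(" Related controls:.*"; "")),
    relatedControls: [.description | scan("[A-Z]{2}-\\d[0-9x]?")]
  }
')

echo "$result"
# prints {"description":"Fail-safe procedures.","relatedControls":["CA-2","SA-10","SI-1x"]}
```

Note that here scan runs over the whole description, so any control-like identifier in the body text would be picked up as well; the capture approach avoids that by scanning only the text after the marker.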