I have a complex YAML file and want to extract information from it in a bash script.
The YAML looks like this:
content:
  images:
    sha256:4c8b96d4fffdfae29258d94a22ae4ad1fe36139d47288b8960d9958d1e63a9d0:
      annotations:
        kbld.carvel.dev/id: index.docker.io/dkalinin/k8s-simple-app
        kbld.carvel.dev/origins: |
          - resolved:
              tag: latest
              url: index.docker.io/dkalinin/k8s-simple-app
      image: image1
      imageType: Image
      layers:
        - digest: sha256:8ece9ac45f2b7228b2ed95e9f407b4f0dc2ac74f93c62ff1156f24c53042ba54
      origin: index.docker.io/dkalinin/k8s-simple-app@sha256:4c8b96d4fffdfae29258d94a22ae4ad1fe36139d47288b8960d9958d1e63a9d0
    sha256:4c8b96d4fffdfae29258d94a22ae4ad1fe36139d47288b8960d9958d1e63a9d1:
      annotations:
        kbld.carvel.dev/id: index.docker.io/dkalinin/k8s-simple-app
        kbld.carvel.dev/origins: |
          - resolved:
              tag: latest
              url: index.docker.io/dkalinin/k8s-simple-app
      image: image2
      imageType: Image
      layers:
        - digest: sha256:8ece9ac45f2b7228b2ed95e9f407b4f0dc2ac74f93c62ff1156f24c53042ba54
        - digest: sha256:8ece9ac45f2b7228b2ed95e9f407b4f0dc2ac74f93c62ff1156f24c53042ba5h
        - digest: sha256:8ece9ac45f2b7228b2ed95e9f407b4f0dc2ac74f93c62ff1156f24c53042ba6u
      origin: index.docker.io/dkalinin/k8s-simple-app@sha256:4c8b96d4fffdfae29258d94a22ae4ad1fe36139d47288b8960d9958d1e63a9d0
    sha256:4c8b96d4fffdfae29258d94a22ae4ad1fe36139d47288b8960d9958d1e63a9d2:
      annotations:
        kbld.carvel.dev/id: index.docker.io/dkalinin/k8s-simple-app
        kbld.carvel.dev/origins: |
          - resolved:
              tag: latest
              url: index.docker.io/dkalinin/k8s-simple-app
      image: image3
      imageType: Image
      layers:
        - digest: sha256:8ece9ac45f2b7228b2ed95e9f407b4f0dc2ac74f93c62ff1156f24c53042ba54
      origin: index.docker.io/dkalinin/k8s-simple-app@sha256:4c8b96d4fffdfae29258d94a22ae4ad1fe36139d47288b8960d9958d1e63a9d0
The expected result is:
{
"image1":
["8ece9ac45f2b7228b2ed95e9f407b4f0dc2ac74f93c62ff1156f24c53042ba54"],
"image2":
["8ece9ac45f2b7228b2ed95e9f407b4f0dc2ac74f93c62ff1156f24c53042ba54",
"8ece9ac45f2b7228b2ed95e9f407b4f0dc2ac74f93c62ff1156f24c53042ba5h",
"8ece9ac45f2b7228b2ed95e9f407b4f0dc2ac74f93c62ff1156f24c53042ba6u"],
"image3":
["8ece9ac45f2b7228b2ed95e9f407b4f0dc2ac74f93c62ff1156f24c53042ba54"]
}
The keys image1, image2, and image3 are the image value of each entry under .content.images.
Each value is the list of that entry's layer digests, as strings, with the sha256: prefix removed.
What I tried was to first get all the keys with images=$(yq -r '.content.images | keys' file)
and then look up the values for each key. That failed because each value is itself a nested structure. Is there a simpler way to achieve this?
I'm using the Go version of yq.
You can use with_entries to modify each entry's .key and .value. Alternatively, update each subitem with .[] |=, setting key to the image name and using the context as the new value. In both cases, use sub with "" as the second argument to strip the part matched by the regex (here, the sha256: prefix):
yq -oj '.content.images | with_entries(
  .key = .value.image | .value |= [.layers[].digest | sub("^sha256:", "")]
)'
# or
yq -oj '.content.images | .[] |= (
  key = .image | [.layers[].digest | sub("^sha256:", "")]
)'
{
"image1": [
"8ece9ac45f2b7228b2ed95e9f407b4f0dc2ac74f93c62ff1156f24c53042ba54"
],
"image2": [
"8ece9ac45f2b7228b2ed95e9f407b4f0dc2ac74f93c62ff1156f24c53042ba54",
"8ece9ac45f2b7228b2ed95e9f407b4f0dc2ac74f93c62ff1156f24c53042ba5h",
"8ece9ac45f2b7228b2ed95e9f407b4f0dc2ac74f93c62ff1156f24c53042ba6u"
],
"image3": [
"8ece9ac45f2b7228b2ed95e9f407b4f0dc2ac74f93c62ff1156f24c53042ba54"
]
}
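If with_entries is unfamiliar, the following is a rough Python sketch of what the first expression computes: every entry is re-keyed by its image field, and its value becomes the list of layer digests with the sha256: prefix stripped. The sample dict (with shortened, made-up digests) and the function name layers_by_image are assumptions for illustration only; this is not a replacement for the yq command.

```python
import json
import re

# Hypothetical, trimmed-down stand-in for the parsed YAML document.
# The digests are shortened placeholders, not the real values.
doc = {
    "content": {
        "images": {
            "sha256:aaa0": {
                "image": "image1",
                "layers": [{"digest": "sha256:1111"}],
            },
            "sha256:aaa1": {
                "image": "image2",
                "layers": [{"digest": "sha256:1111"}, {"digest": "sha256:2222"}],
            },
        }
    }
}


def layers_by_image(images):
    """Re-key each entry by its image name and keep only the bare layer digests."""
    return {
        # New key: the entry's image field (like `.key = .value.image`).
        entry["image"]: [
            # Strip a leading "sha256:" (like `sub("^sha256:", "")`).
            re.sub(r"^sha256:", "", layer["digest"])
            for layer in entry["layers"]
        ]
        for entry in images.values()
    }


result = layers_by_image(doc["content"]["images"])
print(json.dumps(result, indent=2))
```

The original digest keys are discarded here, just as they are once with_entries overwrites .key.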