I'm looking to extract the programme title and sub-title from the (clipped) XML file below. I was extracting each one separately using xmllint and sed and combining the results into one file, but I have since discovered that occasional entries have only a title and no sub-title, so the two lists fall out of step when pasted together. In that case I would like the sub-title to be left blank. Could someone suggest a way to account for this discrepancy?
XML File
<programme start="20171013170000 +0100" stop="20171013180000 +0100" channel="b492458d826d592ec7c528545a16c757">
<title lang="eng">Accessories Gift Hall</title>
<sub-title lang="eng">Find the perfect gift with fashion accessories by some of our most sought-after brands. From chic purses and wallets to cosy PJs and slippers, there's something for everyone.</sub-title>
</programme>
<programme start="20171013180000 +0100" stop="20171014130000 +0100" channel="b492458d826d592ec7c528545a16c757">
<title lang="eng">..programmes start again at 1pm</title>
</programme>
<programme start="20171014130000 +0100" stop="20171014140000 +0100" channel="b492458d826d592ec7c528545a16c757">
<title lang="eng">Ruth Langsford's Fashion Edit</title>
<sub-title lang="eng">TV personality and QVC fashion ambassador, Ruth Langsford, shares her favourite looks and must-have pieces that will transform your wardrobe and have you looking fabulously stylish.</sub-title>
</programme>
Bash commands v1
xmllint --xpath "//programme/title" xmltv | sed -r 's/\n//g' | sed 's/<\/title>/\n/g' | sed 's/<title lang="eng">//g' > 1.txt
xmllint --xpath "//programme/sub-title" xmltv | sed -r 's/\n//g' | sed 's/<\/sub-title>/\n/g' | sed 's/<sub-title lang="eng">//g' > 2.txt
paste <(cat 1.txt) <(cat 2.txt) > 3.txt
Thanks!
Here's an example using the sel command of xmlstarlet from the command line:
$ xmlstarlet sel -T -t -m '//programme' -v 'concat(normalize-space(title)," ",normalize-space(sub-title))' -n input.xml
Accessories Gift Hall Find the perfect gift with fashion accessories by some of our most sought-after brands. From chic purses and wallets to cosy PJs and slippers, there's something for everyone.
..programmes start again at 1pm
Ruth Langsford's Fashion Edit TV personality and QVC fashion ambassador, Ruth Langsford, shares her favourite looks and must-have pieces that will transform your wardrobe and have you looking fabulously stylish.
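I'm separating the title and sub-title with a single space, but that can be changed by editing the " " argument to concat(). An entry with no sub-title still produces its own line: normalize-space() of a missing element is the empty string, so only the title (plus the separator) is printed.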