I'm writing a small script that parses an RSS feed using xmllint.
Right now I fetch the list of titles with the following commands:
ITEMS=`echo "cat //title" | xmllint --shell rss.xml `
echo $ITEMS > tmpfile
But it returns:
<title>xxx</title> ------- <title>yyy :)</title> ------- <title>zzzzzz</title>
without newlines or spaces. I'm only interested in the text content of the title tags and, if possible, I'd like to iterate over the titles with a for/while loop, something like:
for val in $ITEMS
do
echo $val
done
How can this be done? Thanks in advance.
I had the same kind of requirement at some point: parsing XML in bash. I ended up using xmlstarlet (http://xmlstar.sourceforge.net/), which you might be able to install.
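With xmlstarlet, a minimal sketch could look like the following. It assumes the tool is installed under the name xmlstarlet and that the feed is a plain RSS file whose title elements are not in a namespace; list_titles is just a hypothetical helper name for illustration.

```shell
#!/bin/sh
# Print the text of every <title> element, one per line.
# -t starts a template, -m matches each node,
# -v . prints the node's value, -n appends a newline.
list_titles() {
    xmlstarlet sel -t -m '//title' -v . -n "$1"
}

# Assumes rss.xml is in the current directory; skipped if it is missing.
if [ -f rss.xml ]; then
    list_titles rss.xml
fi
```

Because each title ends up on its own line, the output can be piped straight into a while read loop.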
If not, something like this will remove the surrounding tags:
echo "cat //title/text()" | xmllint --shell rss.xml
Then you will need to clean up the output after piping it. A basic solution would be:
echo "cat //title/text()" | xmllint --shell rss.xml | grep -E '^\w'
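To walk through the titles one at a time, the filtered output can be fed to a while read loop instead of a for loop, so titles that contain spaces stay in one piece. This is only a sketch: each_title is a hypothetical helper name, and the filter keeps only lines that start with a word character, so a title starting with punctuation would be dropped.

```shell
#!/bin/sh
# Read xmllint --shell output on stdin and handle one title per iteration.
each_title() {
    # Drop the "/ >" prompts and " -------" separators, then loop line by line.
    grep -E '^\w' | while IFS= read -r title; do
        # Replace this with whatever should happen for each title.
        printf 'title: %s\n' "$title"
    done
}

# Assumes rss.xml exists and xmllint is installed; skipped otherwise.
if [ -f rss.xml ] && command -v xmllint >/dev/null 2>&1; then
    echo "cat //title/text()" | xmllint --shell rss.xml | each_title
fi
```

As a side note, the newlines in the question's output disappear because $ITEMS is expanded without quotes; echo "$ITEMS" would preserve them.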
Hope this helps