I have a program whose output should be piped to the grep command. The output of my program looks something like this:
<cite>www.site.com/sdds/ass</cite>A-"><div Class="sa_mc"><div class="sb_tlst"><h3><a href=
and so on...
I run a Python script:
./python.py | grep -Po '(?<=<cite>)([^</cite>])'
in order to grep everything between the cite tags...
Can you help me?
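You need to make proper use of the lookaround feature. Your lookbehind is fine, but what follows it is not a lookahead: [^</cite>] is a character class that matches a single character other than <, /, c, i, t, e or >, not "everything up to </cite>". Try this: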
grep -Po "(?<=<cite>).*?(?=</cite>)"
Ex:
echo '<cite>www.site.com/sdds/ass</cite>A-"><div Class="sa_mc"><div class="sb_tlst"><h3><a href=' | grep -Po "(?<=<cite>).*?(?=</cite>)"
www.site.com/sdds/ass
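To see why the pattern from the question falls short, here is a minimal side-by-side sketch. It assumes GNU grep built with PCRE support (the -P flag); the variable names html, broken and fixed are only for illustration, and the sample line is shortened from the question.

```shell
# Sample line, shortened from the question.
html='<cite>www.site.com/sdds/ass</cite>A-"><div Class="sa_mc">'

# Original attempt: [^</cite>] is a character class, so it matches exactly
# one character that is none of  <  /  c  i  t  e  >
broken=$(printf '%s\n' "$html" | grep -Po '(?<=<cite>)([^</cite>])')

# Lazy match that stops at the first closing tag, thanks to the lookahead.
fixed=$(printf '%s\n' "$html" | grep -Po '(?<=<cite>).*?(?=</cite>)')

printf 'broken: %s\n' "$broken"   # broken: w
printf 'fixed:  %s\n' "$fixed"    # fixed:  www.site.com/sdds/ass
```

The lazy quantifier .*? matters when a line contains more than one cite element: a greedy .* would run up to the last </cite> on the line.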
Disclaimer: It's bad practice to parse XML/HTML with regex. You should probably use a parser such as xmllint instead.
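A rough sketch of the parser route, assuming xmllint (from libxml2) is installed and recent enough to support --xpath. The sample markup is a simplified, well-formed stand-in for the real output, and the variable names are hypothetical.

```shell
# Simplified stand-in for the program's output (assumption: one cite element).
html='<cite>www.site.com/sdds/ass</cite><div class="sa_mc"></div>'

if command -v xmllint >/dev/null 2>&1; then
    # --html     : use the lenient HTML parser instead of the strict XML one
    # --xpath    : print the nodes selected by the XPath expression
    # -          : read the document from standard input
    # 2>/dev/null: hide parser warnings about the HTML fragment
    cite=$(printf '%s\n' "$html" | xmllint --html --xpath '//cite/text()' - 2>/dev/null)
    printf '%s\n' "$cite"
else
    echo "xmllint not found" >&2
fi
```

With the real program, the printf would be replaced by ./python.py. How several matching text nodes are separated in the output can differ between xmllint versions, so check the result on your own data.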