I am trying to trim down an output in some code I'm working on, and for whatever reason can't get it to work.
version= wget --output-document=- https://dolphin-emu.org/download 2>/dev/null \
    | grep 'version always-ltr' -m 1
until [[ "${version::2}" == "." ]];
do version= echo "$version" | sed 's/^.//'
done
until [[ "${version: -1}" -ge "0" ]];
do version= echo "$version" | sed 's/.$//'
done
echo $version
Initially, $version equals something long and clunky:
<td class="version always-ltr"><a href="/download/dev/8ecfa537a242de74d2e372e30d9d79b14584b2fb/">5.0-16101</a></td>
However, I only want the 5.0-xxxxx number. How do I do that? (Or what absolutely idiotic mistake am I making?)
If, as you show, your version is of the form:
version='<td class="version always-ltr"><a href="/download/dev/8ecfa537a242de74d2e372e30d9d79b14584b2fb/">5.0-16101</a></td>'
A simple sed expression that captures the wanted value and reinserts it as the first backreference is all that is needed, e.g.
$ echo "$version" | sed 's/^.*">\([^<][^<]*\).*$/\1/'
5.0-16101
Here you can rely on the greedy match from the beginning of the string to the final ">, then capture the wanted text with \([^<][^<]*\) (one or more characters that are not <), and then reinsert it as the substituted text with \1.
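For comparison, the same two cuts (everything before the text, everything after it) can be sketched with plain shell parameter expansion, which is roughly what the loops in the question were reaching for. This is only a sketch that assumes the line always has the shape shown above, with the version text sitting directly before the closing </a>:

```shell
# Sample line copied from the question.
version='<td class="version always-ltr"><a href="/download/dev/8ecfa537a242de74d2e372e30d9d79b14584b2fb/">5.0-16101</a></td>'

# Remove the shortest suffix that starts at "</a>" (drops "</a></td>").
ver=${version%</a>*}
# Remove the longest prefix that ends in ">" (drops all markup before the text).
ver=${ver##*>}

echo "$ver"
```

No external process is started here, but the approach is just as sensitive to changes in the markup as the sed version.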
To capture the result in a variable, just use command substitution, i.e. var=$(command), e.g.
ver=$(echo "$version" | sed 's/^.*">\([^<][^<]*\).*$/\1/')
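This also points at the likely mistake in the question: writing version= command (with a space after the =) sets version to an empty string only in the environment of that one command, and the command's output goes to the terminal instead of into the variable. A minimal sketch of the whole pipeline with command substitution follows; extract_version is a hypothetical helper name, and the sample line stands in for the live page so the snippet runs offline:

```shell
# Hypothetical helper (not from the original post): reads HTML on stdin
# and prints the text of the first "version always-ltr" cell.
extract_version() {
    grep -m 1 'version always-ltr' |
        sed 's/^.*">\([^<][^<]*\).*$/\1/'
}

# The sample line from the question stands in for the live page here.
# Against the real site, the pipeline would instead start with:
#   wget --output-document=- https://dolphin-emu.org/download 2>/dev/null
sample='<td class="version always-ltr"><a href="/download/dev/8ecfa537a242de74d2e372e30d9d79b14584b2fb/">5.0-16101</a></td>'

# No space after "=", and the whole pipeline sits inside $( ... ).
version=$(printf '%s\n' "$sample" | extract_version)
echo "$version"
```

With the capture done this way, neither of the until loops from the question should be needed.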
Note: processing HTML should be done with an HTML/XML-aware application like xmllint or xmlstarlet. There are far too many variations and caveats in what you may get back from wget or curl to rely solely on shell processing to extract data consistently.
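As a rough illustration of the parser-based route, the sketch below uses xmllint with its --html and --xpath options. The surrounding table markup and the XPath expression are assumptions modeled on the single line shown in the question, not taken from the real page, so they may need adjusting:

```shell
# Sample markup modeled on the line in the question; the enclosing
# <table>/<tr> tags are an assumption about the page structure.
html='<table><tr><td class="version always-ltr"><a href="/download/dev/8ecfa537a242de74d2e372e30d9d79b14584b2fb/">5.0-16101</a></td></tr></table>'

if command -v xmllint >/dev/null 2>&1; then
    # --html selects the lenient HTML parser; --xpath evaluates the expression.
    # string(...) yields the text of the first matching node.
    ver=$(printf '%s\n' "$html" |
        xmllint --html --xpath 'string(//td[@class="version always-ltr"]/a)' - 2>/dev/null)
    echo "$ver"
else
    echo "xmllint is not installed" >&2
fi
```

Because the element is selected by its class attribute rather than by its position in the text, this kind of query tends to tolerate whitespace and attribute-order changes better than a regular expression.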