I have a text file that contains html markup. I would like to extract the values in this section:
<th scope="col" class="text-center">158</th>
<th scope="col" class="text-center">139 (87.97%)</th>
<th scope="col" class="text-center">18 (11.39%)</th>
<th scope="col" class="text-center">0 (0.00%)</th>
<th scope="col" class="text-center">1 (0.63%)</th>
<th scope="col" class="text-center">0 (0.00%)</th>
The values change from time to time, but there will always be exactly 6 of these tags. I've tried doing something like this:
text="$(cat email_resp.txt | grep -n '<th scope="col" class="text-center">' | sort)"
I also tried this as well:
text2="$(sed -n '/<th scope="col" class="text-center">/,/<\/th>/p' email_resp.txt)"
But what I get is a "blob" of text that I'm not able to iterate over. This is the output when I use the grep command:
689: <th scope="col" class="text-center">158</th>
690: <th scope="col" class="text-center">139 (87.97%)</th>
691: <th scope="col" class="text-center">18 (11.39%)</th>
692: <th scope="col" class="text-center">0 (0.00%)</th>
693: <th scope="col" class="text-center">1 (0.63%)</th>
694: <th scope="col" class="text-center">0 (0.00%)</th>
This is the output when I use the sed command:
<th scope="col" class="text-center">158</th>
<th scope="col" class="text-center">139 (87.97%)</th>
<th scope="col" class="text-center">18 (11.39%)</th>
<th scope="col" class="text-center">0 (0.00%)</th>
<th scope="col" class="text-center">1 (0.63%)</th>
<th scope="col" class="text-center">0 (0.00%)</th>
Ideally what I would like to do is extract those values between the <th>
tags into an array or variables so that I can use them elsewhere.
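#!/bin/bash
# awk splits each line on the opening and closing tag, so a matching line
# has three fields and the value is field 2. \047 is a single quote.

# Variant 1: awk prints a complete "declare" statement, which is then sourced.
source <(
awk -F'<th scope="col" class="text-center">|</th>' '
BEGIN{print "declare -a myArr1=(" }
NF==3{print "\047"$2"\047"}
END{print ")"}
' file
)

# Variant 2: awk prints only the quoted values, wrapped in "declare" by the shell.
declare -a myArr2="(
$(
awk -F'<th scope="col" class="text-center">|</th>' '
NF==3{print "\047"$2"\047"}
' file
)
)"

# Print both arrays for inspection.
declare -p myArr1
declare -p myArr2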
declare -a myArr1=([0]="158" [1]="139 (87.97%)" [2]="18 (11.39%)" [3]="0 (0.00%)" [4]="1 (0.63%)" [5]="0 (0.00%)")
declare -a myArr2=([0]="158" [1]="139 (87.97%)" [2]="18 (11.39%)" [3]="0 (0.00%)" [4]="1 (0.63%)" [5]="0 (0.00%)")
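As a possible alternative that does not source or evaluate generated code, the values could be read straight into an array with `mapfile`. This is only a sketch under a few assumptions: bash 4 or newer (for `mapfile`), one `<th>` tag per line as in the question, and illustrative names (`input`, `values`) that are not from the original post. The sample file is built from the lines shown in the question so the snippet runs on its own; in practice `"$input"` would be `email_resp.txt`.

```bash
#!/bin/bash
# Sample input for illustration only; replace "$input" with email_resp.txt.
input=$(mktemp)
cat > "$input" <<'EOF'
<tr>
<th scope="col" class="text-center">158</th>
<th scope="col" class="text-center">139 (87.97%)</th>
<th scope="col" class="text-center">18 (11.39%)</th>
<th scope="col" class="text-center">0 (0.00%)</th>
<th scope="col" class="text-center">1 (0.63%)</th>
<th scope="col" class="text-center">0 (0.00%)</th>
</tr>
EOF

# sed prints only the text between the opening and closing tag of each
# matching line; mapfile -t stores every output line as one array element.
mapfile -t values < <(
  sed -n 's|.*<th scope="col" class="text-center">\(.*\)</th>.*|\1|p' "$input"
)
rm -f "$input"

# Iterate over the extracted values by index.
for i in "${!values[@]}"; do
  printf '%d: %s\n' "$i" "${values[i]}"
done
```

Because each value arrives as a plain line of text rather than as shell code, a value containing a single quote would not break the assignment, which is the main reason to consider this variant.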