My initial attempt was to run curl https://stackoverflow.com/users/5825294/enlico and pipe the result into sed/awk. However, as I've frequently read, sed and awk are not the best tools to parse HTML code. Furthermore, the above URL changes if I change my user name.
Oh, this is my quick attempt with sed, written on multiple lines for readability:
curl https://stackoverflow.com/users/5825294/enlico 2> /dev/null | sed -nE '
  /title="reputation"/,/bronze badges/{
    /"reputation"/{
      N
      N
      s!.*>(.*)</.*!\1!p
    }
    /badges/s/.*[^1-9]([1-9]+[0-9]*,*[0-9]* (gold|silver|bronze) badges).*/\1/p
  }'
which prints
10,968
5 gold badges
27 silver badges
56 bronze badges
Obviously this script heavily relies on the peculiar structure of the specific HTML page, the most notable example being that I run N twice because I've verified that the reputation is two lines below the first line in the file containing "reputation".
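To illustrate why the two N commands matter, here is a minimal offline sketch. The HTML fragment is a made-up stand-in for the profile page (it is not the real markup), shaped only so that the number sits two lines below the line containing "reputation"; the function names sample_html and extract_reputation are likewise hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical HTML fragment: NOT the real Stack Overflow markup, only
# shaped so that the value sits two lines below the "reputation" line.
sample_html() {
  cat <<'EOF'
<div title="reputation">
  <span class="label">
    <span class="value">10,968</span>
  </span>
</div>
EOF
}

# On the line matching "reputation", append the next two lines to the
# pattern space (N twice), then keep only the text sitting between the
# last '>' before the closing tag and the '</' of that closing tag.
extract_reputation() {
  sed -nE '/"reputation"/{
    N
    N
    s!.*>(.*)</.*!\1!p
  }'
}

reputation=$(sample_html | extract_reputation)
printf '%s\n' "$reputation"
```

If the page layout moved the number one line up or down, the substitution would no longer find it, which is exactly the fragility described above.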
Léa Gris' answer almost answers my question. The missing bit is that I have 5 gold, 27 silver, and 56 bronze badges, not 5, 18, 7.
In this respect, I've noticed that 18 is the number of silver badges I have if I don't count those awarded multiple times. I therefore played around with jq and discovered that I can query for the award_count beside the rank, and I thought that I could use that to take badges awarded multiple times into account. This kind of works, in the sense that running the following (fetch_user_badges is from Léa Gris' answer) generates the correct number of silver badges but the wrong number of bronze badges:
$ fetch_user_badges stackoverflow 5825294 | jq -r '
  .items
  | map({rank: .rank, count: .award_count})
  | group_by(.rank)
  | map([[.[0].rank],map(.count) | add])'
[
  [
    "bronze",
    22
  ],
  [
    "gold",
    5
  ],
  [
    "silver",
    27
  ]
]
Does anybody know why that is?
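For reference, the aggregation itself can be checked offline. The sketch below runs the same kind of grouping on a small hand-written JSON document; the badge names and counts are invented for illustration, and only the field names rank and award_count come from the query above. The helper names sample_badges and sum_by_rank are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical API-shaped payload: names and counts are made up.
sample_badges() {
  cat <<'EOF'
{"items":[
  {"name":"a","rank":"bronze","award_count":1},
  {"name":"b","rank":"silver","award_count":3},
  {"name":"c","rank":"bronze","award_count":2},
  {"name":"d","rank":"gold","award_count":1}
]}
EOF
}

# Group the badges by rank and sum award_count within each group,
# printing one "rank total" pair per line (group_by sorts by rank).
sum_by_rank() {
  jq -r '
    .items
    | group_by(.rank)
    | map("\(.[0].rank) \(map(.award_count) | add)")
    | .[]'
}

totals=$(sample_badges | sum_by_rank)
printf '%s\n' "$totals"
```

The sum can only cover the items present in the input, so a response that does not contain every badge would give a total that is too low.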
Full example using the Stack Exchange API, with jq for parsing the response.
#!/usr/bin/env bash
# This script fetches and prints some user info
# from a Stack Exchange site using the Stack Exchange API
# Change this to your numerical user ID on the site
STACK_UID=5825294
STACK_SITE='stackoverflow'
STACK_API='https://api.stackexchange.com/2.2'
API_CACHE=~/.cache/stack_api
mkdir -p "$API_CACHE"
# Gets a stack-site user using the Stack Exchange API and caches the result
# @Params:
# $1: the website (example stackoverflow)
# $2: the numerical user ID
# @Output:
# &1: API JSON reply
stack_api::user() {
  stack_site=$1
  stack_uid=$2
  cache_file="${API_CACHE}/${stack_site}-users-${stack_uid}.json"
  yesterday_ref="${API_CACHE}/yesterday.ref"
  touch -d yesterday "$yesterday_ref"
  # Expire the cache if it is older than one day
  [ "$cache_file" -ot "$yesterday_ref" ] && rm -f -- "$cache_file"
  # Call stack API only if no cached answer
  [ -f "$cache_file" ] || curl \
    --silent \
    --output "$cache_file" \
    --request GET \
    --url "${STACK_API}/users/${stack_uid}?site=${stack_site}"
  # Return cached answer (the API reply is gzip-compressed)
  zcat --force -- "$cache_file" 2>/dev/null
}
IFS=$'\n' read -r -d '' username reputation bronze silver gold < <(
  # Fetch user from a stack site
  stack_api::user "$STACK_SITE" "$STACK_UID" |
    # Parse the stack_api user data from the JSON response
    jq -r '
      .items[0] |
      .display_name,
      .reputation,
      ( .badge_counts |
        .bronze,
        .silver,
        .gold
      )
    '
)
printf 'Badges from UserID %d %s on the %s website:\n\n' \
  "$STACK_UID" "$username" "$STACK_SITE"
printf 'Reputation: %6d\n' "$reputation"
printf 'Bronze: %6d\n' "$bronze"
printf 'Silver: %6d\n' "$silver"
printf 'Gold: %6d\n' "$gold"
Example output:
Badges from UserID 5825294 Enlico on the stackoverflow website:
Reputation: 11144
Bronze: 56
Silver: 27
Gold: 5
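The script fills several variables from the jq output with a single read. A minimal offline sketch of that idiom follows; emit_fields is a hypothetical stand-in for the curl and jq pipeline and simply prints fixed values taken from the example output above.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the API call piped through jq:
# prints one field per line, in the order the variables are listed below.
emit_fields() {
  printf '%s\n' 'Enlico' 11144 56 27 5
}

# -d '' makes read consume input up to a NUL byte (here: up to end of input),
# and IFS=$'\n' splits that input on newlines into the named variables.
# read returns non-zero when input ends before the delimiter is found,
# so "|| true" keeps the script going if "set -e" is in effect.
IFS=$'\n' read -r -d '' username reputation bronze silver gold < <(emit_fields) || true

printf '%s has %d reputation and %d/%d/%d gold/silver/bronze badges\n' \
  "$username" "$reputation" "$gold" "$silver" "$bronze"
```

Because newline counts as IFS whitespace, empty lines are skipped, so an empty field in the input would shift the later values by one variable.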