I have extracted a string containing 64-bit Steam IDs and a friend list using web scraping. I want to get the unique steamids so that I can store them in a different file. I used regex, but I think I made a mistake in the pattern.
This is the string.
{"friendslist":{"friends":[{"steamid":"7656xxxxxxx80x76","relationship":"friend","friend_since":1552765824},{"steamid":"76561xxxxxxx4xx89","relationship":"friend","friend_since":1508594830},{"steamid":"765xxxxxxxxxxx3194","relationship":"friend","friend_since":1543773569}]}}
I used this regex:
import re
re.findall("[^:[0-9]+[0-9]+", soup.text)
However, I got this result:
['"7656xxxxxxx80x76',
'"76561xxxxxxx4xx89',
'"765xxxxxxxxxxx3194']
How can I get rid of the quotation marks (") at the beginning of the numbers?
You have a JSON string, so use the json module to parse it instead of regex.
import json
text = '{"friendslist":{"friends":[{"steamid":"7656xxxxxxx80x76","relationship":"friend","friend_since":1552765824},{"steamid":"76561xxxxxxx4xx89","relationship":"friend","friend_since":1508594830},{"steamid":"765xxxxxxxxxxx3194","relationship":"friend","friend_since":1543773569}]}}'
data = json.loads(text)
for friend in data["friendslist"]['friends']:
    print(friend['steamid'])
Result:
7656xxxxxxx80x76
76561xxxxxxx4xx89
765xxxxxxxxxxx3194
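Since the question also asks for unique IDs stored in a separate file, here is a minimal sketch of how that might look. The helper names (unique_steamids, steamids_by_regex, save_ids) and the output filename steamids.txt are my own assumptions, not anything from the original code. The regex variant is included only to show why the original pattern kept the quotation mark: `[^:[0-9]+` is a negated character class, so it happily matches the `"` in front of the digits. A capture group around just the ID avoids that.

```python
import json
import re
from pathlib import Path


def unique_steamids(raw):
    """Parse the JSON and return steamids in first-seen order, without duplicates."""
    data = json.loads(raw)
    ids = [friend["steamid"] for friend in data["friendslist"]["friends"]]
    # dict.fromkeys keeps insertion order and drops repeated keys
    return list(dict.fromkeys(ids))


def steamids_by_regex(raw):
    """Regex alternative: capture only the text between the quotes after "steamid"."""
    return re.findall(r'"steamid"\s*:\s*"([^"]+)"', raw)


def save_ids(ids, path):
    """Write one ID per line to the given file (hypothetical output format)."""
    Path(path).write_text("\n".join(ids) + "\n", encoding="utf-8")


if __name__ == "__main__":
    text = '{"friendslist":{"friends":[{"steamid":"7656xxxxxxx80x76","relationship":"friend","friend_since":1552765824},{"steamid":"76561xxxxxxx4xx89","relationship":"friend","friend_since":1508594830},{"steamid":"765xxxxxxxxxxx3194","relationship":"friend","friend_since":1543773569}]}}'
    save_ids(unique_steamids(text), "steamids.txt")  # filename is an assumption
```

Parsing with json is usually the safer choice, because the regex depends on the exact layout of the text, while json.loads only needs the string to be valid JSON.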