I have a pandas.core.series.Series
where each element is a JSON as shown
0 {"count": 157065, "grp": {"a1": 12, "a2": 32}}
1 {"count": 2342, "grp": {"a1": 4, "a2": 34}}
2 {"count": 543, "grp": {"a1": 1, "a2": 11}}
3 {"count": 156, "grp": {"a1": 56, "a2": 75}}
How do I compute the average value of count
across all the JSONs, and also the average values of a1
and a2?
I'm not entirely sure whether this is what you were asking for.
This calculates the average of "count":
doc1 = {"count": 157065, "grp": {"a1": 12, "a2": 32}}
doc2 = {"count": 2342, "grp": {"a1": 4, "a2": 34}}
doc3 = {"count": 543, "grp": {"a1": 1, "a2": 11}}
doc4 = {"count": 156, "grp": {"a1": 56, "a2": 75}}
lojs = [doc1, doc2, doc3, doc4] # list of all the jsons
countaverage = 0
# For every json, add its count to the running total
for j in lojs:
    countaverage += j["count"]
# Divide the total by the number of documents
countaverage = countaverage / len(lojs)
If you want the average of a1 as well as, or instead of, the one above, you could use this code:
a1average = 0
for j in lojs:
    a1average += j["grp"]["a1"]  # getting "a1" inside of "grp"
a1average = a1average / len(lojs)
You can just swap a1 out for a2 if you want the average of a2.
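Since the question starts from a pandas Series, here is a minimal sketch of doing the same thing in pandas itself. It assumes each element is already a Python dict (if they are JSON strings, they would need to go through json.loads first). The names s, flat and means are just for illustration.

```python
import pandas as pd

# Hypothetical Series mirroring the one in the question (elements are dicts)
s = pd.Series([
    {"count": 157065, "grp": {"a1": 12, "a2": 32}},
    {"count": 2342, "grp": {"a1": 4, "a2": 34}},
    {"count": 543, "grp": {"a1": 1, "a2": 11}},
    {"count": 156, "grp": {"a1": 56, "a2": 75}},
])

# Flatten the nested dicts into columns named "count", "grp.a1" and "grp.a2"
flat = pd.json_normalize(s.tolist())

# Column-wise mean gives all three averages at once
means = flat.mean()
print(means)
```

After this, means["count"], means["grp.a1"] and means["grp.a2"] hold the three averages.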
EXTENSION: For documents that might have different numbers of "a"s:
doc1 = {"count": 157065, "grp": {"a1": 12, "a2": 32}}
doc2 = {"count": 2342, "grp": {"a1": 4, "a2": 34}}
doc3 = {"count": 543, "grp": {"a1": 1, "a2": 11, "a3": 46, "a4": 23}}
doc4 = {"count": 156, "grp": {"a1": 56, "a2": 75, "a3": 23}}
lojs = [doc1, doc2, doc3, doc4]
grps = []  # list that will contain all of the "a"s
for doc in lojs:  # each document in the list of documents
    for a in doc["grp"].keys():  # each key in the grp of that document
        if a not in grps:  # skip any "a" that is already in the list
            grps.append(a)  # add the new "a" to the list
averages = {}  # a dict, because each "a" needs its own pair of values
for grp in grps:  # each "a"
    averages[grp] = [0, 0]  # [running sum, number of occurrences]
for grp in grps:  # each "a"
    for doc in lojs:  # each document
        if grp in doc["grp"].keys():  # only documents that contain this "a"
            averages[grp][0] += doc["grp"][grp]  # add its value to the running sum
            averages[grp][1] += 1  # count one more occurrence of this "a"
for el in averages:  # each "a" in the dict
    averages[el][0] = averages[el][0] / averages[el][1]  # divide the sum by the number of occurrences
And you can get the value of each average using:
averages["a3"][0]
Of course, you can change "a3" to whichever "a" you want. In case it isn't clear, you take the first element because the value for each key is a list that contains both the average and the number of times that "a" occurred in your documents.
This probably isn't the most efficient way, but I mean, it works!
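If you want something shorter, here is a rough sketch of the same idea using collections.defaultdict from the standard library. It collects every value per "a" in one pass and then averages each list. The name values is just something I made up for this example, and the averages dict here maps each "a" straight to its average rather than to a [sum, count] pair.

```python
from collections import defaultdict

lojs = [
    {"count": 157065, "grp": {"a1": 12, "a2": 32}},
    {"count": 2342, "grp": {"a1": 4, "a2": 34}},
    {"count": 543, "grp": {"a1": 1, "a2": 11, "a3": 46, "a4": 23}},
    {"count": 156, "grp": {"a1": 56, "a2": 75, "a3": 23}},
]

# Map each "a" to the list of values seen for it across all documents
values = defaultdict(list)
for doc in lojs:
    for key, value in doc["grp"].items():
        values[key].append(value)

# Average each list; an "a" only counts the documents it appears in
averages = {key: sum(vals) / len(vals) for key, vals in values.items()}
print(averages)
```

With this version you read an average with averages["a3"] directly, since the occurrence count is no longer stored alongside it.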