python python-3.x list dictionary pydictionary

How do I create a nested format json from the list of dicts?


create a dictionary in a list of dictionaries

How do I group this list of dicts by the same month? I tried to implement the answer from this link, but with no luck. I would appreciate any help.

Here is the list of dictionaries I have:

[
{'date':'2020-02-02','id' : '1','dept': '20020','CNT' : '1','rep_level' : 'form1'},
{'date':'2020-02-02','id' : '1','dept': '20020','CNT' : '0','rep_level' : 'form2'},
{'date':'2020-02-02','id' : '1','dept': '20020','CNT' : '4','rep_level' : 'form3'},
{'date':'2020-02-02','id' : '2','dept': '20020','CNT' : '9','rep_level' : 'all'},
{'date':'2020-02-02','id' : '3','dept': '20021','CNT' : '14','rep_level' : 'all'},
{'date':'2020-02-02','id' : '1','dept': '20022','CNT' : '5','rep_level' : 'form1'},
{'date':'2020-02-02','id' : '1','dept': '20022','CNT' : '2','rep_level' : 'form2'},
{'date':'2020-02-02','id' : '1','dept': '20022','CNT' : '3','rep_level' : 'form3'}
]

Desired answer format:

[
{"dept":"20020", "date":"2020-02-02", "answers":[{"id":"1", "answerValue":[1,0,4]},{"id":"2", "answerValue":9}]},
{"dept":"20021", "date":"2020-02-02", "answers":[{"id":"3", "answerValue":14}]},
{"dept":"20022", "date":"2020-02-02", "answers":[{"id":"1", "answerValue":[5,2,3]}]}
]

Thanks,


Solution

  • The solution provided in the answer you linked is correct, but you have to put it all together in a specific way to get the result you're after:

    from itertools import groupby
    
    data = [
        {'date': '2020-02-02', 'id': '1', 'dept': '20020', 'CNT': '1', 'rep_level': 'form1'},
        {'date': '2020-02-02', 'id': '1', 'dept': '20020', 'CNT': '0', 'rep_level': 'form2'},
        {'date': '2020-02-02', 'id': '1', 'dept': '20020', 'CNT': '4', 'rep_level': 'form3'},
        {'date': '2020-02-02', 'id': '2', 'dept': '20020', 'CNT': '9', 'rep_level': 'all'},
        {'date': '2020-02-02', 'id': '3', 'dept': '20021', 'CNT': '14', 'rep_level': 'all'},
        {'date': '2020-02-02', 'id': '1', 'dept': '20022', 'CNT': '5', 'rep_level': 'form1'},
        {'date': '2020-02-02', 'id': '1', 'dept': '20022', 'CNT': '2', 'rep_level': 'form2'},
        {'date': '2020-02-02', 'id': '1', 'dept': '20022', 'CNT': '3', 'rep_level': 'form3'}
    ]
    
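    # groupby() only groups consecutive items, so this relies on data
    # already being ordered by 'dept' and, within each dept, by 'id'.
    result = [{
        'dept': dept,
        'answers': [{
            'id': identifier,
            # convert each 'CNT' string to an int
            'answerValue': [int(a['CNT']) for a in answers]
        } for identifier, answers in groupby(results, key=lambda x: x['id'])]
    } for dept, results in groupby(data, key=lambda x: x['dept'])]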
    

    On the inside, there's:

            'answerValue': [int(a['CNT']) for a in answers]
    

    This list comprehension converts the 'CNT' string of each entry in answers to an integer and collects the results in a list.

    That answers comes from the expression around it:

        'answers': [{
            'id': identifier,
            'answerValue': [int(a['CNT']) for a in answers]
        } for identifier, answers in groupby(results, key=lambda x: x['id'])]
    

    This is another list comprehension, creating one dictionary for each value of identifier and the answers that come with it, after a call to groupby(), grouping results on the 'id' field.
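
    As a small illustration of that inner grouping on its own, here is a sketch using a cut-down list of rows (the names rows and pairs are made up for this example). It shows that groupby() yields one (key, group) pair per run of equal 'id' values:

```python
from itertools import groupby

# Cut-down rows for one department, already ordered by 'id'.
rows = [
    {'id': '1', 'CNT': '1'},
    {'id': '1', 'CNT': '0'},
    {'id': '1', 'CNT': '4'},
    {'id': '2', 'CNT': '9'},
]

# Each group is an iterator over the rows sharing the same key,
# so it is consumed right away into a list of integers.
pairs = [
    (key, [int(r['CNT']) for r in group])
    for key, group in groupby(rows, key=lambda r: r['id'])
]
print(pairs)  # [('1', [1, 0, 4]), ('2', [9])]
```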

    And that results comes from the outer comprehension:

    result = [{
        'dept': dept,
        'answers': [{
            'id': identifier,
            'answerValue': [int(a['CNT']) for a in answers]
        } for identifier, answers in groupby(results, key=lambda x: x['id'])]
    } for dept, results in groupby(data, key=lambda x: x['dept'])]
    

    This is similar to the previous one: it groups the original data by 'dept' and creates one dictionary for each department, containing the results grouped under it.

    If you print(result):

    [{'dept': '20020', 'answers': [{'id': '1', 'answerValue': [1, 0, 4]}, {'id': '2', 'answerValue': [9]}]}, {'dept': '20021', 'answers': [{'id': '3', 'answerValue': [14]}]}, {'dept': '20022', 'answers': [{'id': '1', 'answerValue': [5, 2, 3]}]}]
    

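    This is essentially the result you were after. The differences are that a single value stays wrapped in a list (for example [9] rather than 9) and that the date is left out. You could of course add the date, if you wanted to, but you indicated this is always the same anyway.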
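
    If you do want the output to match the format in the question more closely, one possible sketch is below. It assumes the data is ordered by 'dept' and 'date' as in the question, groups on both fields, and uses a hypothetical helper unwrap() (not part of the original answer) to return a lone value bare. Since the question asks for JSON, it also serializes the result with json.dumps():

```python
import json
from itertools import groupby

data = [
    {'date': '2020-02-02', 'id': '1', 'dept': '20020', 'CNT': '1', 'rep_level': 'form1'},
    {'date': '2020-02-02', 'id': '1', 'dept': '20020', 'CNT': '0', 'rep_level': 'form2'},
    {'date': '2020-02-02', 'id': '1', 'dept': '20020', 'CNT': '4', 'rep_level': 'form3'},
    {'date': '2020-02-02', 'id': '2', 'dept': '20020', 'CNT': '9', 'rep_level': 'all'},
    {'date': '2020-02-02', 'id': '3', 'dept': '20021', 'CNT': '14', 'rep_level': 'all'},
    {'date': '2020-02-02', 'id': '1', 'dept': '20022', 'CNT': '5', 'rep_level': 'form1'},
    {'date': '2020-02-02', 'id': '1', 'dept': '20022', 'CNT': '2', 'rep_level': 'form2'},
    {'date': '2020-02-02', 'id': '1', 'dept': '20022', 'CNT': '3', 'rep_level': 'form3'},
]


def unwrap(values):
    """Hypothetical helper: return a lone value bare, keep several as a list."""
    return values[0] if len(values) == 1 else values


result_with_date = []
# Group on (dept, date) so that each output entry carries both fields.
for (dept, date), rows in groupby(data, key=lambda x: (x['dept'], x['date'])):
    answers = [
        {'id': identifier,
         'answerValue': unwrap([int(a['CNT']) for a in group])}
        for identifier, group in groupby(rows, key=lambda x: x['id'])
    ]
    result_with_date.append({'dept': dept, 'date': date, 'answers': answers})

# json.dumps() turns the nested Python structure into a JSON string.
print(json.dumps(result_with_date, indent=2))
```

    Mixing bare integers and lists under the same key makes the output harder to consume, so keeping every answerValue as a list may still be the safer choice.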

    Note: personally, I think this is a more useful way of doing something similar:

    result = {
        dept: {
            identifier: [int(a['CNT']) for a in answers]
            for identifier, answers in groupby(results, key=lambda x: x['id'])
        }
        for dept, results in groupby(data, key=lambda x: x['dept'])
    }
    

    This gets you (when printed):

    {'20020': {'1': [1, 0, 4], '2': [9]}, '20021': {'3': [14]}, '20022': {'1': [5, 2, 3]}}
    

    And you could access that like this:

    print(result['20020']['2'])  # prints "[9]"
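
    One caveat applies to both versions: groupby() only groups consecutive items with equal keys. The data in the question happens to be ordered by 'dept' and 'id' already. If yours is not, the same key shows up as several separate groups, and in the dictionary version later groups silently overwrite earlier ones. A minimal sketch of sorting first, using made-up out-of-order rows (the names shuffled, ordered and grouped are illustrative only):

```python
from itertools import groupby
from operator import itemgetter

# Illustrative rows, deliberately out of order.
shuffled = [
    {'id': '1', 'dept': '20022', 'CNT': '5'},
    {'id': '2', 'dept': '20020', 'CNT': '9'},
    {'id': '1', 'dept': '20020', 'CNT': '1'},
    {'id': '1', 'dept': '20022', 'CNT': '2'},
    {'id': '1', 'dept': '20020', 'CNT': '0'},
]

# Sort on the same keys that are used for grouping. sorted() is stable,
# so rows with equal keys keep their original relative order.
ordered = sorted(shuffled, key=itemgetter('dept', 'id'))

grouped = {
    dept: {
        identifier: [int(a['CNT']) for a in answers]
        for identifier, answers in groupby(results, key=itemgetter('id'))
    }
    for dept, results in groupby(ordered, key=itemgetter('dept'))
}
print(grouped)
# {'20020': {'1': [1, 0], '2': [9]}, '20022': {'1': [5, 2]}}
```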