python arrays list dictionary rdd

I want to convert this data from my Spark RDD to a dictionary


The data looks like a dictionary but is wrapped in square brackets, which makes it a list. The list is the following:

a = [{'sI': ['17046', '17043'], 'sQ': ['15800', '15789'], 'rid': 572, 'pid': 511, 'uid': 411, 'st': 1594892854.513586, 'et': '16


Solution

  • If your raw data is a one-element list, you only need index zero to get the dictionary.

    Code:

    raw_data = [
        {
            "sI": ["17046", "17043"],
            "sQ": ["15800", "15789"],
            "rid": 572,
            "pid": 511,
            "uid": 411,
            "st": 1594892854.513586,
            "et": "16",
        }
    ]
    
    print("Raw type: {}".format(type(raw_data)))
    
    converted_data = raw_data[0]  # Get the first element of list.
    
    print("Converted type: {}".format(type(converted_data)))
    

    Output:

    $ python3 test.py
    Raw type: <class 'list'>
    Converted type: <class 'dict'>
    

    If your list contains more than one dict, you can get the dicts one by one in a for loop. Here is an example.

    Code:

    raw_data = [
        {
            "sI": ["17046", "17043"],
            "sQ": ["15800", "15789"],
        },
        {
            "sI": ["2", "3"],
            "sQ": ["4", "5"],
        }
    ]  # This list contains 2 dicts
    
    print("Raw type: {}".format(type(raw_data)))
    
    for item in raw_data:
        print("Type inside loop: {}".format(type(item)))  # Getting dicts one-by-one ("item" variable contains it)
    

    Output:

    $ python3 test.py
    Raw type: <class 'list'>
    Type inside loop: <class 'dict'>
    Type inside loop: <class 'dict'>
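
The two cases above can also be wrapped in small helpers. This is only a sketch: `unwrap_single` and `merge_records` are hypothetical names, not part of any library. `merge_records` assumes that every value in every dict is a list (as in the two-dict example) and that combining the dicts into one is what you want.

```python
from collections import defaultdict


def unwrap_single(records):
    """Return the only dict in a one-element list, or raise if the shape differs."""
    if len(records) != 1:
        raise ValueError("expected exactly 1 element, got {}".format(len(records)))
    return records[0]


def merge_records(records):
    """Combine several dicts into one, concatenating the list values per key.

    Assumption: every value is a list, as in the two-dict example.
    """
    merged = defaultdict(list)
    for item in records:
        for key, values in item.items():
            merged[key].extend(values)
    return dict(merged)


# One-element list: index zero, with a guard against unexpected lengths.
single = unwrap_single([{"sI": ["17046", "17043"], "rid": 572}])
print(single["rid"])  # 572

# Several dicts: loop over them and collect the values under each key.
merged = merge_records([
    {"sI": ["17046", "17043"], "sQ": ["15800", "15789"]},
    {"sI": ["2", "3"], "sQ": ["4", "5"]},
])
print(merged["sI"])  # ['17046', '17043', '2', '3']
```

The length check in `unwrap_single` is there because `records[0]` on its own would silently ignore any extra elements if the list grows.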