python json pandas pandas-datareader

How can I normalize JSON?


I can't normalize a JSON string that looks like this:

{
    "Mercy": [
        [
            "Chicago",
            "New Orleans"
        ]
    ],
    "Zarya": [
        [
            "Rio de Janeiro",
            "Capetown"
        ],
        [
            "Rome",
            "Seattle"
        ]
    ],
    "Torbjörn": [
        [
            "Buenos Aires",
            "New York"
        ],
        [
            "Capetown",
            "Juneau"
        ],
        [
            "Istanbul",
            "Cairo"
        ]
    ]
}

I want to get a DataFrame like this:

       name  city1  city2  city3  city4  city5  city6  ...  cityN
    0
    1
    ...

Is it possible?


Solution

  • I believe you need a two-column DataFrame, with one row per (name, city) pair:

    import pandas as pd

    # d is the dict parsed from the JSON above (for example via json.loads)
    df = pd.DataFrame([(k, y) for k, v in d.items() for x in v for y in x],
                      columns=['name', 'city'])
    print(df)
            name            city
    0      Mercy         Chicago
    1      Mercy     New Orleans
    2      Zarya  Rio de Janeiro
    3      Zarya        Capetown
    4      Zarya            Rome
    5      Zarya         Seattle
    6   Torbjörn    Buenos Aires
    7   Torbjörn        New York
    8   Torbjörn        Capetown
    9   Torbjörn          Juneau
    10  Torbjörn        Istanbul
    11  Torbjörn           Cairo
    

    EDIT: If you need each inner list spread across new columns, use:

    # One row per inner list; its elements become city1, city2, ...
    df = pd.DataFrame([(k, *x) for k, v in d.items() for x in v])
    df.columns = ['name'] + [f'city{i}' for i in df.columns[1:]]
    print(df)
           name           city1        city2
    0     Mercy         Chicago  New Orleans
    1     Zarya  Rio de Janeiro     Capetown
    2     Zarya            Rome      Seattle
    3  Torbjörn    Buenos Aires     New York
    4  Torbjörn        Capetown       Juneau
    5  Torbjörn        Istanbul        Cairo
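
If the goal is instead the layout sketched in the question, with one row per name and all of that name's cities spread across city1 ... cityN, a minimal sketch could flatten each name's inner lists before building the DataFrame. It assumes the JSON arrives as a string (held here in a hypothetical variable `raw`) and is parsed with `json.loads`. The variable names are illustrative and not part of the original answer.

```python
import json

import pandas as pd

# Assumption: the JSON from the question is available as a string.
raw = """
{
    "Mercy": [["Chicago", "New Orleans"]],
    "Zarya": [["Rio de Janeiro", "Capetown"], ["Rome", "Seattle"]],
    "Torbjörn": [["Buenos Aires", "New York"],
                 ["Capetown", "Juneau"],
                 ["Istanbul", "Cairo"]]
}
"""
d = json.loads(raw)

# Flatten every inner list of a name into one row: [name, city, city, ...].
rows = [[name, *(city for pair in pairs for city in pair)]
        for name, pairs in d.items()]

# Rows of unequal length are padded with missing values by the constructor.
df = pd.DataFrame(rows)
df.columns = ['name'] + [f'city{i}' for i in df.columns[1:]]
print(df)
```

With this data the result has three rows and the columns name, city1 through city6. Names with fewer cities, such as Mercy, get missing values in the trailing columns.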