
Replacing keys in a dict with their corresponding column indices using Python


I have a list and a dict, as shown below:

col_indices = [df.columns.tolist().index(col) for col in cat_cols]
print(col_indices)  #returns [1,5] 

t = {'thisdict': {
         "Ford": "brand",
         "Mustang": "model",
         1964: "year"
     },
     'thatdict': {
         "jfsak": "af",
         "jhas": "asjf"}}

Basically, I would like to replace dict keys with their corresponding column indices.

For example, column index 1 corresponds to thisdict and column index 5 corresponds to thatdict.

I tried something like the code below, but it doesn't work:

key_map_dict = {'1':'thisdict','5':'thatdict'}
d = {(key_map_dict[k] if k in key_map_dict else k):v  for (k,v) in t.items() }

Instead of defining key_map_dict manually, is there any way to find the matching column names, get their index positions, and replace the dict keys automatically? I cannot do this by hand for a big data frame with a million rows and 200 columns.

I expect my output to look like this:

{1: {
     "Ford": "brand",
     "Mustang": "model",
     1964: "year"
 },
 5: {
     "jfsak": "af",
     "jhas": "asjf"}}

Solution

  • To replace the keys of your dictionary t with their column indices in the DataFrame, look up the position of each matching column with df.columns.get_loc and use that position as the new key, like this:

    import pandas
    # Provided t
    t = {'thisdict': {
        "Ford": "brand",
        "Mustang": "model",
        1964: "year"
    },
        'thatdict': {
        "jfsak": "af",
        "jhas": "asjf"}
    }
    
    # Assumed df looks something like this
    data = {'thisdict': ['abc'],
            'thatdict': ['def']}
    df = pandas.DataFrame(data)
    
    # Map each column name in t to its positional index in df
    output = {df.columns.get_loc(name): inner for name, inner in t.items()}
    print(output)
    

    Output:

    {0: {'Ford': 'brand', 'Mustang': 'model', 1964: 'year'}, 1: {'jfsak': 'af', 'jhas': 'asjf'}}
    

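    Note: This relies on every key in t being a column name in your DataFrame; get_loc raises a KeyError for a name that is not a column. It also assumes the column names are unique, since get_loc returns a single integer only in that case. It would be straightforward to add checks if t is not one-to-one with the DataFrame's columns.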
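
The check mentioned in the note could be sketched as follows. This is only an illustration under stated assumptions: the six-column DataFrame, the extra key not_a_column, and the helper names col_pos, remapped and missing are all hypothetical and not part of the original question or answer. The columns are arranged so that thisdict and thatdict sit at positions 1 and 5, matching the indices in the question. The name-to-position lookup is built once with enumerate, which assumes unique column names.

```python
import pandas as pd

# Hypothetical DataFrame: only the column positions matter here.
# 'thisdict' sits at position 1 and 'thatdict' at position 5.
df = pd.DataFrame(columns=["a", "thisdict", "b", "c", "d", "thatdict"])

t = {
    "thisdict": {"Ford": "brand", "Mustang": "model", 1964: "year"},
    "thatdict": {"jfsak": "af", "jhas": "asjf"},
    "not_a_column": {"x": "y"},  # hypothetical key, deliberately absent from df
}

# Build the name -> position lookup once (assumes unique column names).
col_pos = {name: i for i, name in enumerate(df.columns)}

# Remap only the keys that are real columns; collect the rest for inspection.
remapped = {col_pos[k]: v for k, v in t.items() if k in col_pos}
missing = [k for k in t if k not in col_pos]

print(remapped)
print(missing)
```

Whether unmatched keys should be dropped, kept under their original names, or treated as an error depends on your data; dropping and reporting them is just one option.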