python, python-2.7, dictionary, mapping, nested

How do you create a nested dict in Python?


I have 2 CSV files: 'Data' and 'Mapping':

  • 'Mapping' file has 4 columns: Device_Name, GDN, Device_Type, and Device_OS. All four columns are populated.
  • 'Data' file has these same columns, with Device_Name column populated and the other three columns blank.
  • I want my Python code to open both files and, for each Device_Name in the Data file, look up its GDN, Device_Type, and Device_OS values in the Mapping file.

I know how to use a dict when only 2 columns are present (one key column and one column to be mapped), but I don't know how to do this when 3 columns need to be mapped.

Here is the code I used to try to map Device_Type:

x = dict([])
with open("Pricing Mapping_2013-04-22.csv", "rb") as in_file1:
    file_map = csv.reader(in_file1, delimiter=',')
    for row in file_map:
       typemap = [row[0],row[2]]
       x.append(typemap)

with open("Pricing_Updated_Cleaned.csv", "rb") as in_file2, open("Data Scraper_GDN.csv", "wb") as out_file:
    writer = csv.writer(out_file, delimiter=',')
    for row in csv.reader(in_file2, delimiter=','):
         try:
              row[27] = x[row[11]]
         except KeyError:
              row[27] = ""
         writer.writerow(row)

It raises an AttributeError.

After some research, I think I need to create a nested dict, but I don't know how to do that.


Solution

  • A nested dict is a dictionary within a dictionary, which is simple to build.

    >>> d = {}
    >>> d['dict1'] = {}
    >>> d['dict1']['innerkey'] = 'value'
    >>> d['dict1']['innerkey2'] = 'value2'
    >>> d
    {'dict1': {'innerkey': 'value', 'innerkey2': 'value2'}}
    

    You can also use a defaultdict from the collections module to make creating nested dictionaries easier.

    >>> import collections
    >>> d = collections.defaultdict(dict)
    >>> d['dict1']['innerkey'] = 'value'
    >>> d  # currently a defaultdict type
    defaultdict(<type 'dict'>, {'dict1': {'innerkey': 'value'}})
    >>> dict(d)  # but is exactly like a normal dictionary.
    {'dict1': {'innerkey': 'value'}}
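
    One difference from a plain dict is worth keeping in mind when you read from a defaultdict rather than write to it. The following is a small sketch of that behavior; the key names "missing" and "other" are made up for illustration.

```python
import collections

d = collections.defaultdict(dict)
d["dict1"]["innerkey"] = "value"

# Reading a missing outer key with [] creates an empty inner dict for it.
_ = d["missing"]
print("missing" in d)  # True

# .get() does not call the default factory, so it leaves d unchanged.
print(d.get("other"))  # None
print("other" in d)    # False

# Converting to a plain dict restores the usual KeyError on missing keys.
plain = dict(d)
```

    If you only want to look values up after the dict is built, converting with dict(d) or using .get() avoids adding empty entries by accident.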
    

    You can populate that however you want.

    In your code, I would recommend something like the following:

    d = {}  # can use defaultdict(dict) instead
    
    for row in file_map:
        # derive the row key from the row; using the first column here is an assumption
        row_key = row[0]
        # with defaultdict(dict), the next step of creating the inner dict can be skipped
        d[row_key] = {}
        for idx, col in enumerate(row):
            d[row_key][idx] = col
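
    Applied to the columns from the question, the same idea might look like the sketch below. The sample rows are invented stand-ins for the parsed Mapping file, and using the header names as inner keys (instead of column indexes) is my own choice here.

```python
# Hypothetical sample rows standing in for the parsed Mapping file;
# the first row is assumed to be the header.
rows = [
    ["Device_Name", "GDN", "Device_Type", "Device_OS"],
    ["router-1", "gdn-a", "Router", "IOS"],
    ["switch-7", "gdn-b", "Switch", "NX-OS"],
]

header, body = rows[0], rows[1:]

# Outer key: Device_Name (column 0).
# Inner dict: remaining header names mapped to that row's values.
device_map = {}
for row in body:
    device_map[row[0]] = dict(zip(header[1:], row[1:]))

print(device_map["router-1"]["Device_Type"])  # Router
```

    Keying the inner dict by column name means a lookup such as device_map[name]["Device_OS"] keeps working even if the column order changes.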
    

    According to your comment:

    Maybe the above code is confusing the question. My problem in a nutshell: I have 2 files, a.csv and b.csv. a.csv has 4 columns, i, j, k, l, and b.csv has the same columns. i is the key column for these CSVs. The j, k, l columns are empty in a.csv but populated in b.csv. I want to map the values of the j, k, l columns from b.csv to a.csv, using i as the key column.

    My suggestion would be something like this (without using defaultdict):

    a_file = "path/to/a.csv"
    b_file = "path/to/b.csv"
    
    # read from file a.csv
    with open(a_file) as f:
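        # skip headers
        next(f)
        # collect the first column as a set of keys; it must be built
        # while the file is still open, so a lazy generator would fail here
        keys = set(line.strip().split(',')[0] for line in f)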
    
    # create empty dictionary:
    d = {}
    
    # read from file b.csv
    with open(b_file) as f:
        # gather headers except the first (key) header; strip the trailing newline
        headers = next(f).strip().split(',')[1:]
        # iterate lines
        for line in f:
            # gather the columns
            cols = line.strip().split(',')
            # check to make sure this key should be mapped.
            if cols[0] not in keys:
                continue
            # add key to dict
            d[cols[0]] = dict(
                # inner keys are the header names, values are columns
                (headers[idx], v) for idx, v in enumerate(cols[1:]))
    

    Note, though, that Python has a csv module for parsing CSV files.
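
    As a rough sketch of what that could look like, here is a version built on csv.DictReader and csv.DictWriter. It is written for Python 3; the function name fill_from_mapping and the sample data are made up for illustration, and it assumes both files share the same header row with Device_Name as the key column.

```python
import csv
import io


def fill_from_mapping(data_file, mapping_file, out_file, key="Device_Name"):
    """Fill the non-key columns of data_file from matching rows in mapping_file."""
    # Nested dict: key value -> {column name: value}
    mapping = {row[key]: row for row in csv.DictReader(mapping_file)}

    reader = csv.DictReader(data_file)
    writer = csv.DictWriter(out_file, fieldnames=reader.fieldnames,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Devices missing from the mapping keep their existing (blank) values.
        found = mapping.get(row[key], {})
        for col in reader.fieldnames:
            if col != key:
                row[col] = found.get(col, row[col])
        writer.writerow(row)


# Hypothetical in-memory files standing in for the Mapping and Data CSVs.
mapping_csv = io.StringIO(
    "Device_Name,GDN,Device_Type,Device_OS\n"
    "router-1,gdn-a,Router,IOS\n"
    "switch-7,gdn-b,Switch,NX-OS\n"
)
data_csv = io.StringIO(
    "Device_Name,GDN,Device_Type,Device_OS\n"
    "switch-7,,,\n"
    "unknown-9,,,\n"
)
out = io.StringIO()
fill_from_mapping(data_csv, mapping_csv, out)
print(out.getvalue())
```

    With real files you would pass objects from open(path, newline="") instead of the StringIO stand-ins. On Python 2.7 the csv module expects files opened in binary mode, as in the question's code.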