dict.get returns empty instead of default value


I have a CSV file that contains the columns id, name, firstname and company.

I'm looping over the CSV with csv.DictReader and want to insert default values into name and firstname if they are empty.

dict.get() should do the trick. However, it only works if name and firstname never contain any data. As soon as they contain data at least once, the default value is omitted and an empty string is returned instead.

test.csv

"id","name","firstname","company"
"1","doe","john","jdoe inc"
"2","doe","jane","jdoe inc"
"3",,,"company inc"

import_csv.py

import csv

with open("test.csv") as csv_file:
    reader = csv.DictReader(csv_file)

    for row in reader:
        firstname = row.get("firstname", "Company")
        name = row.get("name", row["company"])
        company = row["company"]

        print(f"Firstname:  {firstname}")
        print(f"Name:       {name}")
        print(f"Company:    {company}\n")

The output of the above test-script is

Firstname:  john
Name:       doe
Company:    jdoe inc

Firstname:  jane
Name:       doe
Company:    jdoe inc

Firstname:
Name:
Company:    company inc

My desired output would be

Firstname:  john
Name:       doe
Company:    jdoe inc

Firstname:  jane
Name:       doe
Company:    jdoe inc

Firstname:  Company       # <- default value of dict.get()
Name:       company inc   # <- default value of dict.get()
Company:    company inc

Solution

  • dict.get() returns the default value only if the key is not set. But DictReader() is setting the key, with an empty string as the value. That's because there is an empty string in that column.

    In fact, DictReader() guarantees that there is a key set for every field name (where the field names are taken from the first row here); if a row has fewer fields than there are field names, the missing values are set to None (the default of the restval argument).

    You can trivially account for this by using or:

    firstname = row["firstname"] or "Company"
    name = row["name"] or row["company"]
    

    There is no point in using dict.get() if a key is always there. But if row["firstname"] is set to either an empty string or None, then that's a value that is considered false, and so Python will produce the other operand to or instead.
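
To see the difference side by side, here is a minimal sketch that reads the same kind of data from an in-memory string instead of test.csv. The extra short row with id 4 and the helper name with_defaults are assumptions added for illustration; they are not part of the original question.

```python
import csv
import io

# Data modelled on test.csv. Row "4" is a hypothetical short row,
# added only to show the None case for missing fields.
CSV_TEXT = '''"id","name","firstname","company"
"1","doe","john","jdoe inc"
"3",,,"company inc"
"4"
'''


def with_defaults(row):
    """Return (firstname, name, company), using `or` so that both
    empty strings and None fall back to the default."""
    firstname = row["firstname"] or "Company"
    name = row["name"] or row["company"]
    return firstname, name, row["company"]


rows = list(csv.DictReader(io.StringIO(CSV_TEXT)))

empty_row = rows[1]   # the row with id 3: name and firstname are ""
short_row = rows[2]   # the row with id 4: every field after id is None

# The key exists in both rows, so dict.get() never uses its default.
via_get_empty = empty_row.get("firstname", "Company")
via_get_short = short_row.get("firstname", "Company")

# `or` checks truthiness of the value, so the fallback applies.
via_or_empty = with_defaults(empty_row)
via_or_short = with_defaults(short_row)
```

Note that for the short row the company is None as well, so the name fallback also ends up as None; if rows like that can occur in real data, a further default would be needed.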