Python Read from csv1 and reference csv2 to write to csv3

I have two csv files, csv1 is the working file that changes, csv2 is static and used as a reference set.

csv1 - input.csv

Name	Type	Owner	Status	Date
bananas	fruit	Joe	In Stock	1/1/23
apples	fruit	jim	Out Of Stock	1/2/23
tomato	veggie	bob	In Stock	1/3/23
potato	vegie	tom	Out Of Stock	1/4/23
kiwi	fruit	jane	In Stock	1/5/23
chicken	meat	francis	Out Of Stock	1/6/23
beef	meat	linda	In Stock	1/7/23

csv2 - reference.csv

Type	Group
fruit	1
veggie	2
meat	3

I found this post here https://stackoverflow.com/a/14257599 which helped me get started but it's the processing that doesn't seem to be working

I'm using this code:

with open("input.csv", "r") as csv_input, open("reference.csv", "r")as assign_csv, open("output.csv", "w") as out_file:
reader = csv.reader(csv_input)
reader2 = csv.reader(assign_csv)
writer = csv.writer(out_file)

for error in reader:
    writer.writerow(error)
    for group in reader2:
        if group[0] in error[1]:
            error[5] = group[1]
            writer.writerow(error)

This reads the input and reference files just fine but at the very bottom in the if statement it's not doing anything and I'm not sure why. Basically I want it to loop through every row in the input.csv and check the value in the Type column and then loop through the reference.csv and if the text is contained there, then write to the Group column in the output.csv.

Currently the code is essentially just replicating input.csv to output.csv without writing anything to the cells for the Group column. I know the loop logic is correct because I tried it with a separate code sample and it worked just fine so I think my problem is the if statement and where I have the writer.writerow(error) line placed.

Solution

This kind of work must be done in separate passes of each file: read the reference file first and store the lookup values, then read the other CSV and use the lookup structure.

Python's dict is perfect for holding the lookup keys and values:

lookup: dict[str, str] = {}
with open("input1.csv", newline="") as f:
    reader = csv.reader(f)
    next(reader)  # discard header

    for row in reader:
        lookup[row[0]] = row[1]

The lookup dict looks like:

{
    "fruit": "1",
    "veggie": "2",
    "meat": "3",
}

With the lookup ready, read the other CSV and check each row's Type to see if it's in lookup. I like to skip rows that don't qualify instead of adding nested if-statements:

new_rows: list[list[str]] = []
with open("input2.csv", newline="") as f:
    reader = csv.reader(f)
    new_rows.append(next(reader))  # append header

    for row in reader:
        group = lookup.get(row[1])
        if group == None:
            continue

        row[5] = group
        new_rows.append(row)

Then write the final file:

with open("output.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerows(new_rows)

| Name    | Type   | Owner   | Status       | Date   | Group |
|---------|--------|---------|--------------|--------|-------|
| bananas | fruit  | joe     | In Stock     | 1/1/23 | 1     |
| apples  | fruit  | Jim     | Out Of Stock | 1/2/23 | 1     |
| tomato  | veggie | bob     | In Stock     | 1/3/23 | 2     |
| kiwi    | fruit  | jane    | In Stock     | 1/5/23 | 1     |
| chicken | meat   | francis | Out Of Stock | 1/6/23 | 3     |
| beef    | meat   | linda   | In Stock     | 1/7/23 | 3     |