Search code examples
pythoncsvreadercsvwriter

How to delete a particular column in csv file without pandas library


I am trying to delete particular columns in csv file.

CSV file:

Name,Age,YearofService,Department,Allocation
Birla,49,12,Welding,Production
Robin,38,10,Molding,Production

I am trying to delete Entire column having column header "Department" and "Allocation".

My code :

 with open(input.csv,'r') as i:
    with open(output.csv,'w',new line='') as o:
       reader=csv.reader(i)
       writer = csv.writer(o)
       for row in reader:
          for i in range(len(row)):
            if row[i]!="Department" and row[i]!="Allocation":
              writer.writerow(row)

My output:

Name
Birla
Robin
Age
49
38
YearofService
12
10

Expected output:

Name,Age,YearofService
Birla,49,12
Robin,38,10

We cannot gaurantee Department and Allocation will be in column header position "3" and "4". thats y am using iteration through length of row


Solution

  • In this case, the csv.DictReader and csv.DictWriter classes are very handy:

    import csv
    
    with open("input.csv") as instream, open("output.csv", "w") as outstream:
        # Setup the input
        reader = csv.DictReader(instream)
    
        # Setup the output fields
        output_fields = reader.fieldnames
        output_fields.remove("Department")
        output_fields.remove("Allocation")
    
        # Setup the output
        writer = csv.DictWriter(
            outstream,
            fieldnames=output_fields,
            extrasaction="ignore",  # Ignore extra dictionary keys/values
        )
    
        # Write to the output
        writer.writeheader()
        writer.writerows(reader)
    

    Notes

    • For input, each of the row will be a dictionary such as

      {'Name': 'Birla', 'Age': '49', 'YearofService': '12', 'Department': 'Welding', 'Allocation': 'Production'}
      
    • For output, we remove those columns (fields) that we don't need, see output_fields

    • The extraaction parameter tells DictReader to ignore extra keys/values from the dictionaries

    Update

    In order to remove columns from a CSV file we need to

    1. Open the input file, reader all the rows, close it
    2. Open it again to write.

    Here is the code, which I modified from the above

    import csv
    
    with open("input.csv") as instream:
        # Setup the input
        reader = csv.DictReader(instream)
        rows = list(reader)
    
        # Setup the output fields
        output_fields = reader.fieldnames
        output_fields.remove("Department")
        output_fields.remove("Allocation")
    
    with open("input.csv", "w") as outstream:
        # Setup the output
        writer = csv.DictWriter(
            outstream,
            fieldnames=output_fields,
            extrasaction="ignore",  # Ignore extra dictionary keys/values
        )
    
        # Write to the output
        writer.writeheader()
        writer.writerows(rows)