I am very new to python so please bear with me. I have a csv file that looks like this:
.
I am trying to loop through the file and for every unique individual there are, create a new csv file and copy rows over. I was successful in doing this for one animal but I am having trouble creating the syntax for a more generic approach. Here is what I currently have:
import arcpy
import csv
from csv import DictReader
WS = arcpy.env.workspace = raw_input("Where if your workspace")
infile = raw_input("where is your file?")
outfile = raw_input("What is your outfile name?")
arcpy.env.overwriteOutput = True
with open(infile, "r") as csvFile, open(outfile, "w") as out, open("outfile2.csv", "w") as out2:
reader = csv.DictReader(csvFile)
writer = csv.writer(out)
writer.writerow(reader.fieldnames)
for row in reader:
if row["Animal"] == "1":
values = [row[field] for field in reader.fieldnames]
writer.writerow(values)
To write each Animal
into its own CSV file, you would need to open a different file for each type of animal. This could be done by using a dictionary to store the file object and csv writer object for each animal. At the end, this could then be used to close all the files correctly:
import csv
output_csvs = {} # e.g. {'1' : [file_object, csv_object]}
with open('input.csv', 'rb') as f_input:
csv_reader = csv.reader(f_input)
header = next(csv_reader)
for row in csv_reader:
animal = row[0]
if animal in output_csvs:
output_csvs[animal][1].writerow(row)
else:
f_output = open('animal_{}.csv'.format(animal), 'wb')
csv_output = csv.writer(f_output)
output_csvs[animal] = [f_output, csv_output]
csv_output.writerow(header)
csv_output.writerow(row)
for csv_file, csv_writer in output_csvs.values():
csv_file.close()
This would give you a set of output CSV files named according to the animal, e.g. animal_1.csv
Alternatively, if the data is small enough to be read into memory, it could be sorted by animal and output one block at a time by making use of Python's itertools.groupby()
function:
from itertools import groupby
import csv
with open('input.csv', 'rb') as f_input:
csv_reader = csv.reader(f_input)
header = next(csv_reader)
for animal, group in groupby(sorted(csv_reader), lambda x: x[0]):
with open('animal_{}.csv'.format(animal), 'wb') as f_output:
csv_output = csv.writer(f_output)
csv_output.writerow(header)
csv_output.writerows(group)
Using sorted()
ensures that all animals of the same kind are grouped together. If this is already the case in the data, sorting is not needed.
To access these files you could use glob.glob()
:
import matplotlib.pyplot as plt
import glob
for animal_filename in glob.glob('animal_*.csv'):
with open(animal_filename, 'rb') as f_input:
csv_input = csv.reader(f_input)
heading = next(csv_input)
x, y = [], []
for row in csv_input:
x.append(int(row[1]))
y.append(int(row[2]))
fig, ax = plt.subplots()
plt.title(animal_filename)
ax.scatter(x, y)
plt.show()