I'm very new to Python and am trying to read a CSV file:
1980,Mark,Male,Student,L,90,56,78,44,88
1982,Cindy,Female,Student,S,45,76,22,42,90
1984,Kevin,Male,Student,L,67,83,52,55,59
1986,Michael,Male,Student,M,94,63,73,60,43
1988,Anna,Female,Student,S,66,50,59,57,33
1990,Jessica,Female,Student,S,72,34,29,69,27
1992,John,Male,Student,L,80,67,90,89,68
1994,Tom,Male,Student,M,23,60,89,78,39
1996,Nick,Male,Student,S,56,98,84,44,50
1998,Oscar,Male,Student,M,64,61,74,59,63
2000,Andy,Male,Student,M,11,50,93,69,90
I'd like to save only specific attributes of this data into a dictionary or a list of lists. For example, I'd like to keep only the year, the name, and the five numbers in each row. I'm not sure how to exclude just the middle three columns.
This is the code I have now:
def read_data(filename):
    f = open("myfile.csv", "rt")
    import csv
    data = {}
    for line in f:
        row = line.rstrip().split(',')
        data[row[0]] = [e for e in row[5:]]
    return data
I only know how to keep contiguous chunks of columns, not how to pick specific columns one by one.
You could do this with a simple list comprehension:
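def read_data(filename):
    # Indices of the columns to keep: year, name, and the five numbers
    col_nums = [0, 1, 5, 6, 7, 8, 9]
    data = {}
    # Open the file that was passed in; "with" closes it when done
    with open(filename, "rt") as f:
        for line in f:
            row = line.rstrip().split(',')
            # Key each record by its year (first column)
            data[row[0]] = [row[i] for i in col_nums]
    return data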
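If you would rather have a list of lists, the same index-picking idea works with the standard library's csv module, which also copes with quoted fields that contain commas. This is only a minimal sketch: the names select_columns and KEEP are made up for illustration, and the two sample rows are copied from your data.

```python
import csv

# Column positions to keep: year, name, and the five numbers.
# (KEEP and select_columns are hypothetical names for this sketch.)
KEEP = (0, 1, 5, 6, 7, 8, 9)

def select_columns(lines, keep=KEEP):
    """Return a list of lists holding only the columns listed in `keep`."""
    rows = []
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        rows.append([row[i] for i in keep])
    return rows

# csv.reader accepts any iterable of strings, so a list works for a quick check
sample = [
    "1980,Mark,Male,Student,L,90,56,78,44,88",
    "1982,Cindy,Female,Student,S,45,76,22,42,90",
]
print(select_columns(sample))
```

For a real file, you could pass an open file object instead of the list, for example by opening it with open("myfile.csv", newline="") inside a with block. Note that every value comes back as a string, so convert the numbers with int() if you need to do arithmetic on them.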
You could also consider using Pandas to help you read and wrangle the data:
import pandas as pd
df = pd.read_csv("myfile.csv", names=['year', 'name', 'gender', 'kind', 'size', 'num1', 'num2', 'num3', 'num4', 'num5'])
data = df[['year', 'name', 'num1', 'num2', 'num3', 'num4', 'num5']]