Cannot use column slicing (correctly) in a matrix with data read from a CSV in Python

I am trying to read a CSV file (containing one column of strings and one of integers) into a matrix using genfromtxt and then use slicing to get only the column containing the string values and load it into an array for further processing.

CSV File:

explore,1043
 sky,   585
 nikon, 552
 2007,  552
 ....

I use genfromtxt to load the csv:

my_data = np.genfromtxt('c:/tags.csv', delimiter=',')

and when I try to slice the matrix to get only the column containing the strings:

print my_data[:,0]

I get the following:

[   nan    nan    nan  2007.    nan    nan    nan    nan    nan    nan ....

This looks like a problem with the data type, so I then tried specifying the data types contained in the CSV:

my_data = np.genfromtxt('c:/tags.csv', dtype = [('mystring','S5'), ('myint','i8')], delimiter=',')

I get an array of tuples instead of a matrix....

[('flower', 1043L) ('sky', 585L) ('nikon', 552L) ('2007', 552L) ..... ]

What am I doing wrong?

Solution

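By default, genfromtxt tries to convert every field to a float, which is why the non-numeric tags come back as nan. If you are only interested in the first column, you can load the CSV as a 2D array of strings: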

my_data = np.genfromtxt('c:/tags.csv', delimiter=',', dtype='S')
print my_data[:, 0]

Result:

['explore' 'sky' 'nikon' '2007']
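
If the integer column is needed as well, the sketch below shows two possible ways to get both columns, written for Python 3 and a recent NumPy. It is only an illustration under some assumptions: the CSV is replaced by an in-memory string holding the four rows from the question, the 'U16' width is an arbitrary choice that happens to fit these tags, and autostrip=True is added to drop the padding spaces around the fields. The names csv_text, as_text and records are made up for this example.

```python
import io

import numpy as np

# Stand-in for 'c:/tags.csv': the four rows shown in the question,
# including the padding spaces around some fields.
csv_text = "explore,1043\n sky,   585\n nikon, 552\n 2007,  552\n"

# Option 1: read every field as text. A plain (non-structured) dtype
# yields a regular 2D array, so column slicing works as expected.
# 'U16' is an assumed width; longer tags would be truncated.
as_text = np.genfromtxt(io.StringIO(csv_text), delimiter=',',
                        dtype='U16', autostrip=True)
tags = as_text[:, 0]                # the string column
counts = as_text[:, 1].astype(int)  # convert the second column afterwards

# Option 2: keep the mixed dtype from the question. The result is a
# 1D structured array, so columns are selected by field name
# rather than by a column index.
records = np.genfromtxt(io.StringIO(csv_text), delimiter=',',
                        dtype=[('mystring', 'U16'), ('myint', 'i8')],
                        autostrip=True)
tag_field = records['mystring']
count_field = records['myint']

print(tags)
print(count_field)
```

With the second option, the "array of tuples" from the question is not an error: each element is one row, and records['mystring'] plays the role that my_data[:, 0] plays for a 2D array.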