Tags: python, csv, pandas, read-write

Python - lists are being read in from CSV as strings


I have a dictionary that uses strings as keys and stores lists of lists as values.

dict = {key1: [[data1],[data2],[data3]], key2: [[data4],[data5]],...etc}

EDIT: where the data variables are rows containing different data types from a converted pandas DataFrame

Ex.

import pandas as pd

df = pd.DataFrame()
df['City'] = ['New York','Austin','New Orleans','New Orleans']
df['State'] = ['NY','TX','LA','LA']
df['Latitude'] = [29.12,23.53,34.53,34.53]
df['Time'] = [1.46420e+09,1.47340e+09,1.487820e+09,1.497820e+09]

City         State    Latitude   Time
New York     NY       29.12      1.46420e+09
Austin       TX       23.53      1.47340e+09
New Orleans  LA       34.53      1.487820e+09
New Orleans  LA       34.53      1.497820e+09

dict = {}
cities = df['City'].unique()
for c in cities:
    temp = df[df['City'] == c]
    dict[c] = temp.values.tolist()  # as_matrix() was removed in pandas 1.0

# which produces this for a given key
dict['New Orleans'] = [['New Orleans', 'LA', 34.53, 1.487820e+09],
    ['New Orleans', 'LA', 34.53, 1.497820e+09]]

I am storing it as a csv using the following:

import csv

filename = 'storage.csv'
with open(filename, 'w', newline='') as f:
    w = csv.writer(f)
    for key in dict.keys():
        w.writerow((key, dict[key]))  # the list is written as its str() form

I then read the file back into a dictionary using the following:

dict = {}
with open(filename, 'r', newline='') as f:
    reader = csv.reader(f)
    for key, val in reader:
        dict[key] = val

val comes in looking perfect, except it is now a string. For example, key1 looks like this:

dict[key1] = "[[data1],[data2],[data3]]"

How can I read the values in as lists, or remove the quotes from the read-in version of val?


Solution

  • Edit: Since you are using a pandas.DataFrame, don't use the csv module or json module. Instead, use the pandas I/O tools (pandas.io), such as DataFrame.to_csv and pandas.read_csv, for both reading and writing.
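
As a rough sketch of what that could look like, the example below writes the question's DataFrame with DataFrame.to_csv and reads it back with pandas.read_csv, then rebuilds the per-city dictionary. The in-memory buffer stands in for 'storage.csv', and the names buffer, restored and rows_by_city are made up for illustration.

```python
import io

import pandas as pd

df = pd.DataFrame({
    'City': ['New York', 'Austin', 'New Orleans', 'New Orleans'],
    'State': ['NY', 'TX', 'LA', 'LA'],
    'Latitude': [29.12, 23.53, 34.53, 34.53],
    'Time': [1.46420e+09, 1.47340e+09, 1.487820e+09, 1.497820e+09],
})

# Write the whole table once; the buffer is a stand-in for a real file path.
buffer = io.StringIO()
df.to_csv(buffer, index=False)

# Read it back; pandas re-infers the column types, so numbers stay numbers.
buffer.seek(0)
restored = pd.read_csv(buffer)

# Rebuild the dictionary of row lists, one entry per city,
# keeping the cities in order of first appearance.
rows_by_city = {
    city: group.values.tolist()
    for city, group in restored.groupby('City', sort=False)
}
```

Storing the flat table and grouping after loading avoids putting a nested list inside a single CSV cell, which is what turned the values into strings in the first place.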


    Original Answer:

    Short answer: use json.

    CSV is fine for saving tables of strings. For anything beyond that, you need to manually convert the strings back into Python objects.
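
If you do want to stay with CSV, one way to do that manual conversion is ast.literal_eval from the standard library, which parses a string containing a Python literal. The following is a minimal sketch, assuming the values contain only plain strings and numbers; the in-memory buffer and the names data and restored are illustrative, not from the question.

```python
import ast
import csv
import io

data = {'New Orleans': [['New Orleans', 'LA', 34.53, 1.48782e+09],
                        ['New Orleans', 'LA', 34.53, 1.49782e+09]]}

# Write the rows as in the question; csv stores each list as its str() form.
buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
for key, value in data.items():
    writer.writerow((key, value))

# Read the rows back and parse each string into a Python object again.
buffer.seek(0)
restored = {}
for key, value in csv.reader(buffer):
    restored[key] = ast.literal_eval(value)
```

This only works while every value is a valid Python literal. A cell holding something like nan is not a literal, and literal_eval would raise a ValueError on it.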

    If your data consists only of lists, dictionaries, and basic literals like strings and numbers, json is the right tool for this job.

    Given:

    example = {'x': [1, 2], 'y': [3, 4]}
    

    Save to file:

    import json

    with open('f.txt', 'w') as f:
        json.dump(example, f)
    

    Load from file:

    with open('f.txt') as f:
        reloaded_example = json.load(f)
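
Applied to a dictionary shaped like the one in the question, the same round trip might look like the sketch below. The temporary directory and the names cities, path and reloaded are assumptions made for the example.

```python
import json
import os
import tempfile

cities = {
    'Austin': [['Austin', 'TX', 23.53, 1.4734e+09]],
    'New Orleans': [['New Orleans', 'LA', 34.53, 1.48782e+09],
                    ['New Orleans', 'LA', 34.53, 1.49782e+09]],
}

# Hypothetical location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), 'storage.json')

with open(path, 'w') as f:
    json.dump(cities, f)

with open(path) as f:
    reloaded = json.load(f)

# The nested lists come back as lists, not as strings.
print(type(reloaded['New Orleans']))
```

Note that JSON has no tuple type and only allows string keys, so tuples would come back as lists and non-string keys as strings.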