Tags: python, dataframe, csv, data-analysis

How to save a multidimensional list to a file when the list has an inhomogeneous part


My data shape is as follows:

[0][0] = Variable 0, yearly values (an array of 20 numbers)  
[0][1] = Variable 0, yearly error values (an array of 20 numbers)  
[0][2] = Variable 0, name of the variable (a string)  
[11][1] = Variable 11, yearly error values  
etc.

There are a total of 50 variables.

I constructed it like this from the original CSV data frame, where variable values and errors were mixed:

a = []
b = []
b.append(Data[Data.columns[1]])  # the values for each variable
b.append(Data[Data.columns[2]])  # the error column follows the value column in my data
b.append("Data of y by person x")  # name of the variable
a.append(b)
b = []  # start a fresh list b for the next variable
b.append(Data[Data.columns[3]]) ...
a.append(b)
b = []

And so on. I wanted to create a new data set where I can examine each variable by inserting a number matching the variable. This made plotting and analyzing the data much easier.
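
The repeated append pattern above could also be written as a loop. This is only a minimal sketch under assumptions that are not in the original post: the helper name `build_variable_list` is hypothetical, column 0 is assumed to be an index or year column, and every value column is assumed to be followed directly by its error column. A plain dict stands in for the data frame so the sketch runs without pandas.

```python
def build_variable_list(data, columns, names):
    """Group columns pairwise into [values, errors, name] entries.

    Assumes (hypothetically) that columns[0] is an index/year column and
    that each value column is immediately followed by its error column.
    """
    a = []
    # Column 1 holds values, column 2 its errors, column 3 the next values, ...
    for k, i in enumerate(range(1, len(columns) - 1, 2)):
        a.append([data[columns[i]], data[columns[i + 1]], names[k]])
    return a


# Toy stand-in for the data frame: a plain dict of column -> yearly values.
# With pandas, one would pass Data and list(Data.columns) instead.
data = {
    "year": [2001, 2002, 2003],
    "x": [1.0, 2.0, 3.0],
    "x_err": [0.1, 0.2, 0.3],
    "y": [4.0, 5.0, 6.0],
    "y_err": [0.4, 0.5, 0.6],
}
columns = list(data)
names = ["Data of x", "Data of y"]

a = build_variable_list(data, columns, names)
print(a[1][2])  # prints "Data of y"
```

Each entry is built as a new list, so there is no shared `b` that has to be reset between variables.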

I have tried the following methods:

np.savetxt("myfile.csv", a)
np.savetxt("myfile.txt", a)
a.tofile('myfile.csv', sep = ',')

None of them worked. I get the error "ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 2 dimensions. The detected shape was (67, 3) + inhomogeneous part".

My list does everything I want it to do when handling the data. The only problem is saving it. It would be nice to load it directly from a file, without having to rerun the entire "file construction" code.


Solution

  • Pickle, suggested by @rockhard4hu, did the trick! Posting the answer in case someone else ends up here via Google:

    import pickle
    with open('filename.pickle', 'wb') as handle:
        pickle.dump(a, handle, protocol=pickle.HIGHEST_PROTOCOL)
    

    Then I simply load the file somewhere else:

    import pickle
    with open('filename.pickle', 'rb') as handle:
        b = pickle.load(handle)
    

    After that, b holds the same data as a, and I can do all I need with it!
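
For anyone who wants to check the round trip before relying on it, here is a minimal, self-contained sketch. The toy data is an assumption: plain Python lists stand in for the 20-number yearly arrays, and the file is written to a temporary directory rather than the real location.

```python
import os
import pickle
import tempfile

# Toy stand-in for the real list: [values, errors, name] per variable.
a = [
    [[1.0, 2.0, 3.0], [0.1, 0.2, 0.3], "Variable 0"],
    [[4.0, 5.0, 6.0], [0.4, 0.5, 0.6], "Variable 1"],
]

# Write to a temporary directory so the sketch leaves no files behind.
with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "filename.pickle")
    with open(path, "wb") as handle:
        pickle.dump(a, handle, protocol=pickle.HIGHEST_PROTOCOL)
    with open(path, "rb") as handle:
        b = pickle.load(handle)

# Compare entry by entry; each part is a plain list or string here.
same = all(
    vals_a == vals_b and errs_a == errs_b and name_a == name_b
    for (vals_a, errs_a, name_a), (vals_b, errs_b, name_b) in zip(a, b)
)
print(same)  # prints True
```

If the entries are NumPy arrays or pandas Series rather than plain lists, `==` compares element by element, so a check such as `np.array_equal(vals_a, vals_b)` is likely the better fit for the values and errors.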