I am trying to load an old pickle file containing the airline dataset ( https://arxiv.org/abs/1611.06740 ) . The pickle is very old and I have problems accessing it. If I try:
objects = []
with (open("airline.pickle", "rb")) as openfile:
while True:
try:
objects.append(pickle.load(openfile))
except EOFError:
break
I get the following warning and error:
FutureWarning: pandas.core.index is deprecated and will be removed in a future version. The public classes are available in the top-level namespace.
objects.append(pickle.load(openfile))
Traceback (most recent call last):
File "c:\Users\LocalAdmin\surfdrive\Code\Python\Airline\pickleToCSV.py", line 9, in <module>
objects.append(pickle.load(openfile))
TypeError: _reconstruct: First argument must be a sub-type of ndarray
Trying with pandas does not work:
File "C:\Users\LocalAdmin\surfdrive\Code\Python\Airline\Airline\lib\site-packages\pandas\io\pickle.py", line 203, in read_pickle
return pickle.load(handles.handle) # type: ignore[arg-type]
TypeError: _reconstruct: First argument must be a sub-type of ndarray
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "c:\Users\LocalAdmin\surfdrive\Code\Python\Airline\pickleToCSV.py", line 7, in <module>
df = pd.read_pickle('airline.pickle')
File "C:\Users\LocalAdmin\surfdrive\Code\Python\Airline\Airline\lib\site-packages\pandas\io\pickle.py", line 208, in read_pickle
return pc.load(handles.handle, encoding=None)
File "C:\Users\LocalAdmin\surfdrive\Code\Python\Airline\Airline\lib\site-packages\pandas\compat\pickle_compat.py",
line 249, in load
return up.load()
File "C:\Users\LocalAdmin\AppData\Local\Programs\Python\Python39\lib\pickle.py", line 1212, in load
dispatch[key[0]](self)
File "C:\Users\LocalAdmin\AppData\Local\Programs\Python\Python39\lib\pickle.py", line 1725, in load_build
for k, v in state.items():
AttributeError: 'tuple' object has no attribute 'items'
How can I access the file and save it to csv? I need the data that is contained there. I am using pandas 1.2.4 and python 3.6.
As mentioned in a previous answer, the error TypeError: _reconstruct: First argument must be a sub-type of ndarray
is due to a change from pandas version 0.14 to 0.15 (Source). The documentation said that pd.read_pickle
would be able to load such old pickle files, but this is not working on recent versions. If you install an older version, I tested 0.17.1 which can be obtained in pypi or conda-forge, it can load that pickle file successfully.
If you are using conda, the following should work:
conda create -n old_pandas -c conda-forge pandas=0.17.* python=3.*
conda activate old_pandas
And then, in a Python prompt,
import pandas as pd
dataset = pd.read_pickle("airline.pickle")