I want to convert a list to a pandas DataFrame, where the first element of the list is a dictionary.
I have the code below:
import pandas as pd
import numpy as np
pd.DataFrame([{'aa' : 10}, np.nan])
However, this fails with the message below:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/lib/python3.11/site-packages/pandas/core/frame.py", line 782, in __init__
arrays, columns, index = nested_data_to_arrays(
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pandas/core/internals/construction.py", line 498, in nested_data_to_arrays
arrays, columns = to_arrays(data, columns, dtype=dtype)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pandas/core/internals/construction.py", line 832, in to_arrays
arr, columns = _list_of_dict_to_arrays(data, columns)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/pandas/core/internals/construction.py", line 912, in _list_of_dict_to_arrays
pre_cols = lib.fast_unique_multiple_list_gen(gen, sort=sort)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "pandas/_libs/lib.pyx", line 374, in pandas._libs.lib.fast_unique_multiple_list_gen
File "/usr/local/lib/python3.11/site-packages/pandas/core/internals/construction.py", line 910, in <genexpr>
gen = (list(x.keys()) for x in data)
^^^^^^
AttributeError: 'float' object has no attribute 'keys'
Could you please help me resolve this issue?
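As the traceback shows, pandas sees a dict as the first element, treats the whole list as a list of dicts, and calls .keys() on every element, which fails for the float NaN. Wrap your list in np.array so that it is handled as a one-dimensional object array instead: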
pd.DataFrame(np.array([{'aa' : 10}, np.nan]))
            0
0  {'aa': 10}
1         NaN
Though your list is quite small, here's a timing comparison against the pd.Series(...).to_frame() alternative, just in case:
In [777]: %timeit pd.DataFrame(np.array([{'aa' : 10}, np.nan]))
26.6 µs ± 220 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
In [778]: %timeit pd.Series([{'aa' : 10}, np.nan]).to_frame()
49.6 µs ± 911 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
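For reference, here is a minimal, self-contained sketch of the two approaches timed above, so you can check on your own setup that both give a single-column frame. The variable names (data, arr, df_from_array, df_from_series) are only illustrative, and the comment about the dtype assumes NumPy's usual fallback for mixed dict/float content:

```python
import numpy as np
import pandas as pd

data = [{'aa': 10}, np.nan]

# Approach 1: wrap the list in a NumPy array first.
# Mixed dict/float content should make NumPy fall back to dtype=object,
# so pandas receives a 1-D array rather than a "list of dicts".
arr = np.array(data)
df_from_array = pd.DataFrame(arr)

# Approach 2: build a Series, then promote it to a one-column DataFrame.
df_from_series = pd.Series(data).to_frame()

print(arr.dtype)
print(df_from_array)
print(df_from_series)
```

In both cases the dictionary is stored as-is in a single object column; neither approach expands its keys into separate columns.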