python, pandas, dataframe, parquet

Cannot Read Parquet File of Multi-Level Complex Index Data Frame


I can create a sample DataFrame with the following code and save it as Parquet. When I try to read it back, it throws "TypeError: unhashable type: 'numpy.ndarray'". Is it possible to save an index made up of tuples, or do I have to reset the index before saving to Parquet? Thanks

import pandas as pd

# Creating sample data
data = {
    'A': [1, 2, 3],
    'B': [6, 7, 8],
    'C': [11, 12, 13],
}

# Creating multi-index
index = pd.MultiIndex.from_tuples(
    [
        ((10, 30), (0.75, 1.0)), 
        ((10, 30), (0.75, 1.25)),
        ((10, 30), (1.0, 1.25))
    ],
    names=['level_0', 'level_1']
)

# Creating DataFrame with multi-index
df = pd.DataFrame(data, index=index)

print(df)

df.to_parquet(path="test.parquet")
pd.read_parquet("test.parquet")

Solution

  • Convert each tuple label to a string, reset the index before writing, and set it again after reading:

    import pandas as pd
    
    data = {
        'A': [1, 2, 3],
        'B': [6, 7, 8],
        'C': [11, 12, 13],
    }
    
    # Store each tuple label as its string form, e.g. "(10, 30)"
    index = pd.MultiIndex.from_tuples(
        [
            (str((10, 30)), str((0.75, 1.0))), 
            (str((10, 30)), str((0.75, 1.25))),
            (str((10, 30)), str((1.0, 1.25)))
        ],
        names=['level_0', 'level_1']
    )
    
    df = pd.DataFrame(data, index=index)
    
    print("Original DataFrame:")
    print(df)
    
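    # Move the index levels into ordinary columns before writing
    df_reset = df.reset_index()
    df_reset.to_parquet(path="test.parquet")
    # Read the file back and rebuild the MultiIndex from the two level columns
    df_read = pd.read_parquet("test.parquet")
    df_read.set_index(['level_0', 'level_1'], inplace=True)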
    
    print("DataFrame read from Parquet:")
    print(df_read)
    

    which returns (note that the index labels are now strings, not tuples)

    Original DataFrame:
                           A  B   C
    level_0  level_1               
    (10, 30) (0.75, 1.0)   1  6  11
             (0.75, 1.25)  2  7  12
             (1.0, 1.25)   3  8  13
    DataFrame read from Parquet:
                           A  B   C
    level_0  level_1               
    (10, 30) (0.75, 1.0)   1  6  11
             (0.75, 1.25)  2  7  12
             (1.0, 1.25)   3  8  13
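
The labels produced by `str((10, 30))` are plain strings such as "(10, 30)", so the frame read back no longer has real tuples in its index. If the tuples are needed again, one option is to parse the strings with `ast.literal_eval`. This is a minimal sketch, not part of the original answer: the helper name `restore_tuple_index` is hypothetical, and the small `df_read` frame below is a stand-in for the one read from Parquet, so the example runs without any file I/O.

```python
import ast

import pandas as pd


def restore_tuple_index(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df whose string index labels are parsed back into tuples.

    Hypothetical helper: assumes every index label is a string produced by
    str() on a tuple of numbers, e.g. "(10, 30)".
    """
    parsed = [
        tuple(ast.literal_eval(label) for label in row)  # "(10, 30)" -> (10, 30)
        for row in df.index
    ]
    out = df.copy()
    out.index = pd.MultiIndex.from_tuples(parsed, names=df.index.names)
    return out


# Stand-in for the frame read back from Parquet: index labels are strings.
df_read = pd.DataFrame(
    {"A": [1, 2, 3]},
    index=pd.MultiIndex.from_tuples(
        [
            ("(10, 30)", "(0.75, 1.0)"),
            ("(10, 30)", "(0.75, 1.25)"),
            ("(10, 30)", "(1.0, 1.25)"),
        ],
        names=["level_0", "level_1"],
    ),
)

df_tuples = restore_tuple_index(df_read)
print(df_tuples.index.tolist())
```

`ast.literal_eval` only evaluates Python literals, so it is a safer choice than `eval` for strings that come from a file.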