python pandas hdf5

Pandas dataframe - what causes this error?


My code:

frame = pd.DataFrame({'a': np.random.randn(100)})
store = pd.HDFStore('mydata.h5')
store['obj1'] = frame
store['obj1_col'] = frame['a']
store.put('obj2',frame,foramt='table')
store.select('obj2',where=['index >= 10 and index <= 15'])

Gives this error message:

TypeError: cannot pass a where specification when reading from 
a Fixed format store. this store must be selected in its entirety

Why does this code give this error if every piece of code is right? How do I avoid similar errors in the future?


Solution

  • (I wanted to comment, but I can't yet due to reputation...)

    Hello, this is interesting -- for some reason it works on my machine. For the sake of completeness, here is the code I ran (with the imports added).

    import pandas as pd
    import numpy as np
    frame = pd.DataFrame({'a': np.random.randn(100)})
    store = pd.HDFStore('mydata.h5')
    store['obj1'] = frame
    store['obj1_col'] = frame['a']
    store.put('obj2',frame,format='table')
    store.select('obj2',where=['index >= 10 and index <= 15'])
    

    Returns

               a
    10 -0.049168
    11  0.130048
    12 -1.553641
    13 -0.978392
    14  0.723070
    15  0.066814
    

    Could you please mention the versions of the libraries you're using? I wonder whether we have different versions. I have:

    import tables
    import sys
    print(sys.version) # 3.10.6 (main, Mar 10 2023, 10:55:28) [GCC 11.3.0]
    print(pd.__version__) # 1.5.0
    print(np.__version__) # 1.23.3
    print(tables.__version__) # 3.8.0 ... (this one is a dependency)
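
If it helps with reporting versions, here is a minimal sketch that collects them in one go without importing each library. The helper name `report_versions` is made up for illustration, and the package list is only an assumption about what is relevant here. pandas also ships `pd.show_versions()`, which prints a longer report.

```python
import sys
from importlib.metadata import PackageNotFoundError, version


def report_versions(packages=("pandas", "numpy", "tables")):
    """Map each package name to its installed version, or None if missing.

    Hypothetical helper; "tables" is the PyPI name of PyTables.
    """
    found = {"python": sys.version.split()[0]}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None  # package is not installed in this environment
    return found


versions = report_versions()
for name, ver in versions.items():
    print(f"{name}: {ver}")
```

Pasting that output into the question would make it easier to compare environments.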
    

    To clarify -- I suspect it might be connected to the PyTables version, as mentioned in this possibly related answer.

    Could you try upgrading PyTables (e.g. with pip install --upgrade tables) and running the code again?
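
One more thing worth checking besides versions: the question's code spells the keyword `foramt='table'`, while the code I ran spells it `format='table'`. If the misspelled keyword is silently ignored (an assumption on my side; depending on the pandas version it may instead raise a TypeError about an unexpected keyword), `obj2` would be written in the default fixed format, which matches the error message exactly. Below is a minimal sketch of the difference between the two formats. The file name `demo.h5` and the keys are made up for illustration, and `is_table` is an attribute of the storer object that I am assuming stays available.

```python
import os
import tempfile

import numpy as np
import pandas as pd

frame = pd.DataFrame({"a": np.random.randn(100)})
path = os.path.join(tempfile.mkdtemp(), "demo.h5")  # throwaway file

with pd.HDFStore(path) as store:
    # No format given: pandas writes the default "fixed" format.
    store.put("fixed_obj", frame)
    # Correctly spelled keyword: the "table" format supports where= queries.
    store.put("table_obj", frame, format="table")

    # Querying the fixed-format object reproduces the error from the question.
    try:
        store.select("fixed_obj", where="index >= 10 and index <= 15")
        fixed_error = None
    except TypeError as exc:
        fixed_error = str(exc)

    # The same query on the table-format object returns rows 10..15.
    subset = store.select("table_obj", where="index >= 10 and index <= 15")

    # Inspect how each key was actually stored.
    fixed_is_table = store.get_storer("fixed_obj").is_table
    table_is_table = store.get_storer("table_obj").is_table

print(fixed_error)
print(subset)
print(fixed_is_table, table_is_table)
```

To avoid similar errors in the future, checking how a key was stored (as in the last lines above, or with `print(store.info())`) right after writing it shows immediately whether the format you asked for was applied.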