How to reshape pandas dataframe as input to keras simpleRNN?


I have a dataframe of time series data like so:

import pandas as pd

df = pd.DataFrame({'TimeStep': [1, 2, 3, 1, 2, 3],
                   'Feature1': [100, 250, 300, 400, 100, 50],
                   'Feature2': [2, 5, 100, 10, 42, 17]})

   TimeStep | Feature1 | Feature2
   1        | 100      | 2
   2        | 250      | 5
   3        | 300      | 100
   1        | 400      | 10
   2        | 100      | 42
   3        | 50       | 17

Now I would like to feed these to a SimpleRNN layer in Keras. For the example above, the batch size would be 2, timesteps = 3 and input_dim = 2.

I tried df.to_numpy().reshape((2, 3, 2)) (with the actual dimensions of the real df, of course), but that shape didn't work.

I'm grateful for any pointers you could give me. A while back I did something similar with a pure numpy array, without specifying the input_dim, and that worked.

Thanks in advance!


Solution

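  • You are close! The reshape fails because the TimeStep column is still included: the frame holds 6 × 3 = 18 values, which cannot fill a (2, 3, 2) array of 12 elements. If you reshape the dataframe without the TimeStep column (via iloc[:, 1:]), it should work: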

    >>> df.iloc[:, 1:].to_numpy().reshape(2, 3, 2)
    array([[[100,   2],
            [250,   5],
            [300, 100]],
    
           [[400,  10],
            [100,  42],
            [ 50,  17]]], dtype=int64)
    

    which has the (batch_size, seq_len, num_features) shape.
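
For the real dataframe, the dimensions can be derived from the data instead of being hard-coded. The following is only a sketch under two assumptions that hold for the example frame but may not hold for yours: every sequence has the same number of time steps, and the rows are already ordered sequence by sequence. The names feature_cols, seq_len, num_features and batch_size are illustrative, not part of any API.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'TimeStep': [1, 2, 3, 1, 2, 3],
                   'Feature1': [100, 250, 300, 400, 100, 50],
                   'Feature2': [2, 5, 100, 10, 42, 17]})

# Assumption: equal-length sequences, rows already grouped per sequence.
feature_cols = ['Feature1', 'Feature2']   # everything except TimeStep
seq_len = df['TimeStep'].nunique()        # distinct time steps per sequence
num_features = len(feature_cols)
batch_size = len(df) // seq_len           # number of sequences

# Drop TimeStep by selecting only the feature columns, then reshape.
X = df[feature_cols].to_numpy().reshape(batch_size, seq_len, num_features)

# Sanity check: the first sequence must match the first seq_len rows.
assert np.array_equal(X[0], df[feature_cols].iloc[:seq_len].to_numpy())

print(X.shape)  # (2, 3, 2)
```

If the rows are not guaranteed to be grouped this way, they would need to be sorted by a sequence identifier and TimeStep before reshaping; the example frame has no such identifier column, so that step is left out here.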