python pandas datetime datetimeindex

Create a dataframe with two columns (hour and minute) using a datetime index


Hi all, I have a list of DatetimeIndex objects whose timestamps are 15 minutes apart (see screenshot).

I would like to create a dataframe with only 2 columns, the first with 'hour' and the second with 'minute', taken from these DatetimeIndex values. I would like the result to look like this:

hour   minute
08     00
08     15
08     30 etc.

Is it possible? Thanks a lot!


Solution

  • # sample data
    import pandas as pd
    L = [pd.Index([x]) for x in pd.date_range('2022-12-23 08:30', freq='15Min', periods=5)]
    print (L)
    [DatetimeIndex(['2022-12-23 08:30:00'], dtype='datetime64[ns]', freq=None),
     DatetimeIndex(['2022-12-23 08:45:00'], dtype='datetime64[ns]', freq=None),
     DatetimeIndex(['2022-12-23 09:00:00'], dtype='datetime64[ns]', freq=None), 
     DatetimeIndex(['2022-12-23 09:15:00'], dtype='datetime64[ns]', freq=None), 
     DatetimeIndex(['2022-12-23 09:30:00'], dtype='datetime64[ns]', freq=None)]
    

    Use a nested list comprehension to flatten the indexes into (hour, minute) tuples:

    df = pd.DataFrame([(y.hour, y.minute) for x in L for y in x], columns=['hour','minute'])
    print (df)
       hour  minute
    0     8      30
    1     8      45
    2     9       0
    3     9      15
    4     9      30
    

    Or flatten the list into a single DatetimeIndex first and use its hour and minute attributes:

    idx = pd.Index([y for x in L for y in x])
    print (idx)
    DatetimeIndex(['2022-12-23 08:30:00', '2022-12-23 08:45:00',
                   '2022-12-23 09:00:00', '2022-12-23 09:15:00',
                   '2022-12-23 09:30:00'],
                  dtype='datetime64[ns]', freq=None)
    
    df = pd.DataFrame({'hour':idx.hour, 'minute':idx.minute})
    print (df)
       hour  minute
    0     8      30
    1     8      45
    2     9       0
    3     9      15
    4     9      30
    

    If each index always contains exactly one element, you can select the first value by indexing:

    df = pd.DataFrame([(x[0].hour, x[0].minute) for x in L], columns=['hour','minute'])
    print (df)
       hour  minute
    0     8      30
    1     8      45
    2     9       0
    3     9      15
    4     9      30
    

    Or build a single DatetimeIndex from the first values:

    idx = pd.Index([x[0] for x in L])
    print (idx)
    DatetimeIndex(['2022-12-23 08:30:00', '2022-12-23 08:45:00',
                   '2022-12-23 09:00:00', '2022-12-23 09:15:00',
                   '2022-12-23 09:30:00'],
                  dtype='datetime64[ns]', freq=None)
    
    df = pd.DataFrame({'hour':idx.hour, 'minute':idx.minute})
    print (df)
       hour  minute
    0     8      30
    1     8      45
    2     9       0
    3     9      15
    4     9      30
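
The question shows zero-padded values such as 08 and 00, while the hour and minute attributes used above return integers. If the padding matters, one possible approach is to format the values as strings with strftime. This is only a sketch: it assumes that string columns are acceptable, and it reuses the sample list L from above (with the frequency written as '15min').

```python
import pandas as pd

# Sample data in the same shape as above: a list of one-element DatetimeIndex objects.
L = [pd.Index([x]) for x in pd.date_range('2022-12-23 08:30', freq='15min', periods=5)]

# Flatten the list into a single DatetimeIndex.
idx = pd.DatetimeIndex([y for x in L for y in x])

# strftime returns strings, so the leading zeros are kept ('08', '00').
df = pd.DataFrame({'hour': idx.strftime('%H'), 'minute': idx.strftime('%M')})
print(df)
```

Both columns then hold strings rather than integers, so convert them with astype(int) before doing any arithmetic on them.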