Search code examples
pythonpandasnumpytimexlsx

Creating categories with rows information with pandas


I'm using a database (xlsx) that's informs the time that the animals were inside of the studied area, like this:

Data

I wanna separate the hours in groups, pick up the time and put it in categories.

24h-6h - Dawn 6h-12h - Morning 12h-18h - Afternoon 18h - 24h - Night

So instead of "21:25:00" would be "night", "10:36:00" would be "morning" and so on.

How can i do it with pandas? I tried to use the same method used to rename the columns, something like this.

pamdf.loc[pamdf['Hora em que foram detectados dentro da área de exclusão'].isin( ['24:00:00':'06:00:00' ]), 'Hora em que foram detectados dentro da área de exclusão'] = 'Dawn'

But it didn't make any sense and obviously didn't work. Can i use "pandas.cut"? how?

Thanks in advance folks and sorry for the bad english.

EXAMPLE:

Hour  
21:25
10:36
23:17
04:55
21:12
23:58
19:55
20:54
23:01
00:20
18:54

Solution

  • You can do this:

    import pandas as pd
    
    def convert_to_category(x):
        if x in set([i for i in range(1,7)]):
            return 'Dawn'
        elif x in set([i for i in range(7,13)]):
            return 'Morning'
        elif x in set([i for i in range(13,19)]):
            return 'Afternoon'
        else:
            return 'Night'
    
    df = pd.DataFrame({'time':['21:25:00','10:36:00','23:17:00']})
    
    df['period'] = pd.to_datetime(df['time']).dt.hour.apply(lambda x: convert_to_category(x))
    
    
        time        period
    0   21:25:00    Night
    1   10:36:00    Morning
    2   23:17:00    Night