Search code examples
pythonpandasdataframe

Get 3 rows from a Pandas Dataframe with the latest timestamp entry for a specific condition


I have put the following log data to a pandas dataframe. Now I want to get the 3 rows with target_label= 'Security Alert' having the most recent timestamps

In the following table the third column is the target_label column,

1759,12/29/2025 20:33,ModernCRM,Unauthorised access attempt from 192.168.99.79 detected,Security Alert,bert
2213,9/1/2025 10:30,AnalyticsEngine,Security alert: suspicious activity from 192.168.214.63,Security Alert,bert
168,5/19/2025 7:56,ModernHR,Multiple incorrect login attempts were made by user 7918,Security Alert,bert
1844,6/10/2025 8:39,ThirdPartyAPI,"Alert: server 9 experienced unusual login attempts, security risk",Security Alert,bert
961,4/26/2025 5:21,ThirdPartyAPI,API access audit trail shows unauthorized entry for user 5627,Security Alert,bert
1077,3/4/2025 9:49,AnalyticsEngine,"Anomalous activity identified on server 47, security review recommended",Security Alert,bert
1356,5/3/2025 13:03,ThirdPartyAPI,Multiple rejected login attempts found for user 1805,Security Alert,bert
43,11/22/2025 11:06,BillingSystem,"Abnormal system behavior on server 40, potential security breach",Security Alert,bert
2062,6/7/2025 1:22,AnalyticsEngine,"Server 11 experienced potential security incident, review required",Security Alert,bert
769,4/28/2025 4:07,ModernHR,API access denied due to unauthorized credentials for user 5914,Security Alert,bert

Solution

  • First get all rows with target_label='Security Alert'

    selected = df[ df['target_label']=='Security Alert' ]
    

    Next sort values by date (doc: pandas.DataFrame.sort_values)

    selected = selected.sort_values(by="date")  # .sort_values(by="date", ascending=True)
    

    Finally get last (or first) three rows (depends on sorting "ascending" or "descending")

    selected = selected[-3:]  # selected[:3]
    

    So it can be

    selected = df[ df['target_label'] == 'Security Alert' ].sort_values(by='date')[-3:]
    

    Of course you have to check if pandas keeps date as datetime object - or you will have to convert them on your own (pandas.to_datetime())


    Minimal working example.

    I use io only to create file like object in memory - so everyone can simply copy and run this example (but you should use filename)

    I had to use pandas.to_datetime() to convert string with date to object datetime.

    data = '''1760,12/29/2025 20:33,ModernCRM,Unauthorised access attempt from 192.168.99.79 detected,Security Alert,bert
    2213,9/1/2025 10:30,AnalyticsEngine,Security alert: suspicious activity from 192.168.214.63,Security Alert,bert
    168,5/19/2025 7:56,ModernHR,Multiple incorrect login attempts were made by user 7918,Security Alert,bert
    1844,6/10/2025 8:39,ThirdPartyAPI,"Alert: server 9 experienced unusual login attempts, security risk",Security Alert,bert
    961,4/26/2025 5:21,ThirdPartyAPI,API access audit trail shows unauthorized entry for user 5627,Security Alert,bert
    1077,3/4/2025 9:49,AnalyticsEngine,"Anomalous activity identified on server 47, security review recommended",Security Alert,bert
    1356,5/3/2025 13:03,ThirdPartyAPI,Multiple rejected login attempts found for user 1805,Security Alert,bert
    43,11/22/2025 11:06,BillingSystem,"Abnormal system behavior on server 40, potential security breach",Security Alert,bert
    2062,6/7/2025 1:22,AnalyticsEngine,"Server 11 experienced potential security incident, review required",Security Alert,bert
    769,4/28/2025 4:07,ModernHR,API access denied due to unauthorized credentials for user 5914,Security Alert,bert
    '''
    
    import pandas as pd
    import io
    
    file_like_object = io.StringIO(data)
    df = pd.read_csv(file_like_object, names=['number', 'date', 'system', 'description', 'target_label', 'person'])
    
    print('data type for column "date":', df['date'].dtype)
    
    # convert string with date to datetime object
    df['date'] = pd.to_datetime(df['date'])   
    
    print('data type for column "date":', df['date'].dtype)
    
    #selected = df[ df['target_label']=='Security Alert' ]
    #selected = selected.sort_values(by="date")
    #selected = selected[-3:]
    
    selected = df[ df['target_label'] == 'Security Alert' ].sort_values(by='date')[-3:]
    print(selected)
    

    Result

    data type for column "date": object
    data type for column "date": datetime64[ns]
    
       number                date           system                                        description    target_label person
    1    2213 2025-09-01 10:30:00  AnalyticsEngine  Security alert: suspicious activity from 192.1...  Security Alert   bert
    7      43 2025-11-22 11:06:00    BillingSystem  Abnormal system behavior on server 40, potenti...  Security Alert   bert
    0    1760 2025-12-29 20:33:00        ModernCRM  Unauthorised access attempt from 192.168.99.79...  Security Alert   bert
    

    EDIT:

    If you need it as timestamps

    # divide by `10**9` to convert `nanoseconds` to `seconds`
    
    timestamps = selected['date'].astype(int) // (10**9)
    print(timestamps)
    
    # check if timestamp gives correct datetime (BTW: problem can be timezone)
    
    import datetime
    
    print(datetime.datetime.fromtimestamp( timestamps.iloc[0] ))
    

    See other methods on Stackoverflow: pandas datetime to unix timestamp seconds