Tags: python, pandas, lambda, pandas-loc

Pandas - .loc / lambda / time series


I scraped a web page that returns a table of port congestion data. Each row represents a ship and its arrival, berthing and departure dates.

Link for data source

I'd like my code to retrieve only the data for the previous seven days, i.e., from a week ago up to and including yesterday.

I tried the script below to retrieve the data for yesterday, and it works fine. I am using the dates in tabela['departure'] as the reference:

today = date.today().strftime("%Y-%m-%d")
today = datetime.strptime(today, '%Y-%m-%d')

yesterday = pd.to_datetime(today - pd.Timedelta('1 days 00:00:00'))

df0 = tabela.loc[lambda x: pd.to_datetime(x['departure'].dt.date) == yesterday, :]

How could I retrieve the whole interval of the previous week?

I've tried the following, but it does not return the expected data frame:

time = ['1 days 00:00:00', '2 days 00:00:00', '3 days 00:00:00', '4 days 00:00:00', '5 days 00:00:00', '6 days 00:00:00', '7 days 00:00:00']

week = pd.to_datetime([today - pd.Timedelta(i) for i in time])

tabela.loc[lambda x: [ x for x in list(pd.to_datetime(x['departure'].dt.date)) if x in week],:]

Solution

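  • In .loc, combine multiple conditions with &, wrapping each condition in parentheses (& binds more tightly than the comparison operators).

    Syntax: df.loc[(condition1) & (condition2)]

    For your case, you can try something like this: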

    import pandas as pd

    # Midnight today, so the whole of yesterday falls inside the window
    today = pd.Timestamp.today().normalize()
    lastweek = today - pd.Timedelta(days=7)

    # Keep departures from 7 days ago up to, but not including, today
    tabela.loc[(tabela['departure'] >= lastweek) & (tabela['departure'] < today)]
    

    Hope this helps.
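
To check the date filter against known values, here is a minimal sketch on a small made-up table. The helper name `previous_week`, the sample rows, and the fixed reference date are all hypothetical. It assumes the 'departure' column is already datetime64 and that the intended window is the seven calendar days before today, with yesterday included in full.

```python
import pandas as pd

def previous_week(df, column="departure", today=None):
    """Return rows whose `column` falls in the 7 calendar days before `today`."""
    # Midnight of the reference day; defaults to the current date.
    if today is None:
        today = pd.Timestamp.today()
    today = pd.Timestamp(today).normalize()
    start = today - pd.Timedelta(days=7)
    # Half-open window [start, today): all of yesterday is in, today is out.
    mask = (df[column] >= start) & (df[column] < today)
    return df.loc[mask]

# Hypothetical sample data standing in for the scraped table.
tabela = pd.DataFrame({
    "ship": ["A", "B", "C", "D"],
    "departure": pd.to_datetime([
        "2024-03-10 08:00",  # today: excluded
        "2024-03-09 23:30",  # yesterday, late evening: included
        "2024-03-03 00:00",  # exactly 7 days back: included
        "2024-03-02 12:00",  # 8 days back: excluded
    ]),
})

result = previous_week(tabela, today="2024-03-10")
print(result["ship"].tolist())  # ['B', 'C']

# Alternative closer to the question's attempt: match calendar dates with isin.
week = pd.date_range(end=pd.Timestamp("2024-03-09"), periods=7)  # 03-03 .. 03-09
same = tabela.loc[tabela["departure"].dt.normalize().isin(week)]
print(same["ship"].tolist())  # ['B', 'C']
```

Passing the reference date explicitly keeps the example reproducible; leaving `today` as None uses the current date. The `isin` variant works because it hands `.loc` a boolean mask, whereas a list comprehension inside the lambda returns a list of timestamps that `.loc` would treat as row labels.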