I scraped a web page that returns a table of port congestion data. Each row represents a ship and its arrival, berthing and departure dates.
I'd like my code to retrieve only the data for the interval from yesterday back through the 7 previous days, i.e., from the previous week until yesterday.
I tried the script below to retrieve the data from yesterday, which works fine. I am using the dates in tabela['departure'] as the reference:
from datetime import date, datetime
import pandas as pd
today = date.today().strftime("%Y-%m-%d")
today = datetime.strptime(today, '%Y-%m-%d')
yesterday = pd.to_datetime(today - pd.Timedelta('1 days 00:00:00'))
df0 = tabela.loc[lambda x: pd.to_datetime(x['departure'].dt.date) == yesterday, :]
How could I retrieve the whole interval of the previous week?
I've tried the following, but it does not return the data frame:
time = ['1 days 00:00:00', '2 days 00:00:00', '3 days 00:00:00', '4 days 00:00:00', '5 days 00:00:00', '6 days 00:00:00', '7 days 00:00:00']
week = pd.to_datetime([today - pd.Timedelta(i) for i in time])
tabela.loc[lambda x: [ x for x in list(pd.to_datetime(x['departure'].dt.date)) if x in week],:]
In .loc, combine multiple conditions using &.
Syntax: df.loc[condition1 & condition2]
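As a small illustration of that syntax, here is a minimal sketch on a made-up DataFrame (the name `df` and the columns `a` and `b` are only assumptions for the example). Each condition gets its own parentheses, because `&` binds more tightly than the comparison operators:

```python
import pandas as pd

# Made-up data, only to illustrate the pattern
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 20, 30, 40]})

# Wrap each comparison in parentheses before combining them with &
result = df.loc[(df["a"] >= 2) & (df["b"] < 40)]
```

Without the parentheses, pandas would try to evaluate `2 & df["b"]` first, which is not the intended comparison.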
For your case, you can try something like this:
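import pandas as pd
# Midnight today, so the bounds do not depend on the current time of day
today = pd.Timestamp.today().normalize()
yest = today - pd.Timedelta('1 days 00:00:00')
lastweek = yest - pd.Timedelta('7 days 00:00:00')
# Keep departures from lastweek up to the end of yesterday (strictly before today)
tabela.loc[(tabela['departure'] >= lastweek) & (tabela['departure'] < today)]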
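To make the date window easier to check, here is a self-contained sketch with a fixed reference date and made-up rows. The helper name `previous_week`, the `ship` column and the sample dates are all hypothetical, and it assumes "previous week" means the seven calendar days ending yesterday; change `days` if a different start is intended.

```python
import pandas as pd

# Hypothetical sample data standing in for the scraped table
tabela = pd.DataFrame({
    "ship": ["A", "B", "C", "D", "E"],
    "departure": pd.to_datetime([
        "2024-03-01 08:00",  # before the window
        "2024-03-03 00:00",  # first instant of the window
        "2024-03-06 12:30",  # inside the window
        "2024-03-09 23:59",  # late yesterday, still inside
        "2024-03-10 06:00",  # today, excluded
    ]),
})

def previous_week(df, today, column="departure", days=7):
    """Return rows whose `column` falls in the `days` calendar days before `today`."""
    today = pd.Timestamp(today).normalize()   # midnight of the reference day
    start = today - pd.Timedelta(days=days)   # midnight `days` days earlier
    mask = (df[column] >= start) & (df[column] < today)
    return df.loc[mask]

# Fixed reference date so the result is reproducible
week_df = previous_week(tabela, "2024-03-10")
```

In real use you would pass `pd.Timestamp.today()` as `today`. Comparing against midnight bounds keeps the time-of-day part of `departure` from dropping rows late on the last day.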
Hope this helps.