I have a list of datetime objects called 'date', and I am trying to remove the elements of the list that fall outside a start date and an end date. Can anyone help me understand how to do this properly, and why I am getting a "list index out of range" error? I feel like I am so close!
My code:
startDate = datetime.strptime('1948-1-1',"%Y-%m-%d")
endDate = datetime.strptime('1950-2-1',"%Y-%m-%d")
for row in range(0, len(date)):
    if date[row] < startDate:
        del date[row]
    elif date[row] > endDate:
        del date[row]
I have also tried the following way and it runs but does not delete the list elements:
count = 0
for row in date:
    if row < startDate:
        del date[count]
    elif row > endDate:
        del date[count]
    count += 1
You are deleting from the list while you loop over it, and that is what pushes the index out of range. The loop bounds come from len(date) as it was when the loop started, but every deletion makes the list shorter, so a later index no longer exists.
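To make the failure concrete, here is a small sketch that reproduces both attempts from the question. The sample dates and the helper `make_dates` are made up for illustration; only the loop bodies come from the question. The second loop does not raise, but it skips the element that slides into the slot just freed by a deletion, so an out-of-range date can survive.

```python
from datetime import datetime

start = datetime(1948, 1, 1)
end = datetime(1950, 2, 1)

def make_dates():
    # Hypothetical sample data: two dates before the window, one inside, one after.
    return [datetime(1947, 1, 1), datetime(1947, 6, 1),
            datetime(1948, 2, 2), datetime(1951, 1, 1)]

# Attempt 1: range() is built once from the original length,
# so the index eventually runs past the end of the shrunken list.
dates1 = make_dates()
error = None
try:
    for i in range(0, len(dates1)):
        if dates1[i] < start:
            del dates1[i]
        elif dates1[i] > end:
            del dates1[i]
except IndexError as exc:
    error = exc
print("attempt 1 raised:", repr(error))

# Attempt 2: no exception, but after each deletion the next element
# moves into the current slot and the loop never looks at it.
dates2 = make_dates()
count = 0
for row in dates2:
    if row < start:
        del dates2[count]
    elif row > end:
        del dates2[count]
    count += 1
print("attempt 2 left:", dates2)
```

With this sample data, the second attempt leaves 1947-06-01 in the list even though it is before the start date.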
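A list comprehension avoids the problem by building a new list instead of deleting from the old one. Please note that I flipped the comparisons (>= and <= instead of < and >), because the comprehension selects the elements to keep rather than the ones to delete. See the example below: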
from datetime import datetime
# datasetup
date=['1947-01-01','1948-01-01','1948-02-02','1951-01-01']
date=[datetime.strptime(each,"%Y-%m-%d") for each in date]
#Control date
startDate = datetime.strptime('1948-1-1',"%Y-%m-%d")
endDate = datetime.strptime('1950-2-1',"%Y-%m-%d")
#list comprehension
date = [each for each in date if each >= startDate and each <= endDate ]
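If the list has to be changed in place (for example because another variable refers to the same list object), two common options are slice assignment and deleting while walking the indexes backwards. This is only a sketch under that assumption; the names `dates`, `alias`, and `other` are illustrative and not from the question.

```python
from datetime import datetime

start = datetime(1948, 1, 1)
end = datetime(1950, 2, 1)

dates = [datetime(1947, 1, 1), datetime(1948, 1, 1),
         datetime(1948, 2, 2), datetime(1951, 1, 1)]
alias = dates  # a second reference to the same list object

# Option 1: slice assignment replaces the contents of the existing list,
# so every reference to it sees the filtered result.
dates[:] = [d for d in dates if start <= d <= end]

# Option 2: delete by index from the end towards the start, so a deletion
# never shifts the positions of elements that are still to be checked.
other = [datetime(1947, 1, 1), datetime(1948, 1, 1),
         datetime(1948, 2, 2), datetime(1951, 1, 1)]
for i in range(len(other) - 1, -1, -1):
    if not (start <= other[i] <= end):
        del other[i]

print(dates)
print(other)
```

The plain comprehension is usually the simplest choice; the in-place variants matter mainly when the identity of the list object has to be preserved.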
To take the solution further: download the data from Google Drive, filter the rows you need with pandas, and then plot the result for analysis.
Step 1: download the data
import pandas as pd
import requests
from io import StringIO
gd_url='https://drive.google.com/file/d/1N2J136mog2CZK_XRyL3pxocaoUV8DByS/view?usp=sharing'
file_id = gd_url.split('/')[-2]
download_url='https://drive.google.com/uc?export=download&id=' + file_id
csv_text = requests.get(download_url).text  # download the file contents as text
csv_raw = StringIO(csv_text)  # wrap the text in a file-like object for pandas
df = pd.read_csv(csv_raw)
print(df.head(1))
Step 2: filter the data
#Control date
startDate = '1948-01-01'
endDate = '1950-02-01'
# This is a string comparison, so it only works if every DATE value is a
# zero-padded 'YYYY-MM-DD' string; otherwise convert the column to datetime first.
df_new = df.loc[(df['DATE'] >= startDate) & (df['DATE'] <= endDate)]
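If the string comparison is not safe for your data, one option is to convert the column with `pd.to_datetime` and filter on real timestamps. The sketch below uses a small made-up frame in place of the downloaded file; the `VALUE` column and the name `df_demo` are assumptions for illustration, and only the `DATE` column name comes from the code above.

```python
import pandas as pd

# Hypothetical stand-in for the downloaded CSV.
df_demo = pd.DataFrame({
    "DATE": ["1947-12-01", "1948-01-01", "1949-06-01", "1950-02-01", "1950-03-01"],
    "VALUE": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# Parse the strings into datetime64 values so comparisons are by date, not by text.
df_demo["DATE"] = pd.to_datetime(df_demo["DATE"])

# Series.between includes both endpoints by default.
mask = df_demo["DATE"].between("1948-01-01", "1950-02-01")
df_filtered = df_demo.loc[mask]
print(df_filtered)
```

Converting once up front also means later steps such as plotting or resampling can treat the column as dates.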
Step 3: show the graph.
import pandas as pd
import matplotlib.pyplot as plt
df_new.plot()
plt.show()