How can I remove datetime elements of a list outside of a specified startdate and enddate period?


I have a list of datetime objects called 'date', and I am trying to remove the elements of the list which are outside of a startdate and an enddate. Can anyone help me understand how to properly do this and why I am getting this list index out of range error? I feel like I am so close!

My code:

startDate = datetime.strptime('1948-1-1',"%Y-%m-%d")
endDate = datetime.strptime('1950-2-1',"%Y-%m-%d")

for row in range(0,len(date)):
  if date[row] < startDate:
    del date[row]
  elif date[row] > endDate:
    del date[row]

This fails with: IndexError: list index out of range

I have also tried the following way and it runs but does not delete the list elements:

count = 0

for row in date:
  if row < startDate:
    del date[count]
  elif row > endDate:
    del date[count]
  count += 1

Solution

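  • The error comes from deleting elements from the list while you loop over it. range(0, len(date)) is computed once, before the loop starts, so row keeps counting up to the original length even though every del makes the list shorter; eventually date[row] points past the end and raises IndexError. Your second attempt has a related problem: after del date[count] the remaining elements shift left by one, so the loop skips the element that follows each deletion and some out-of-range dates survive.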

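    A list comprehension avoids this by building a new list instead of modifying the one being iterated. Note that I flipped the comparisons: instead of deleting what is < startDate or > endDate, it keeps what is >= startDate and <= endDate. See the example below: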

    from datetime import datetime
    # Data setup: parse the sample strings into datetime objects
    date = ['1947-01-01', '1948-01-01', '1948-02-02', '1951-01-01']
    date = [datetime.strptime(each, "%Y-%m-%d") for each in date]
    # Control dates (both bounds are inclusive)
    startDate = datetime.strptime('1948-1-1', "%Y-%m-%d")
    endDate = datetime.strptime('1950-2-1', "%Y-%m-%d")
    # List comprehension: keep only the dates inside the period
    date = [each for each in date if startDate <= each <= endDate]
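
If the list really has to be modified in place (for example because other code holds a reference to it), the same filtering can be sketched in two other ways. The sample data below is hypothetical and only mirrors the setup above; the names by_index and alias are introduced purely for illustration.

```python
from datetime import datetime

# Hypothetical sample data, mirroring the setup in the answer above.
date = [datetime.strptime(s, "%Y-%m-%d")
        for s in ("1947-01-01", "1948-01-01", "1948-02-02", "1951-01-01")]
startDate = datetime.strptime("1948-1-1", "%Y-%m-%d")
endDate = datetime.strptime("1950-2-1", "%Y-%m-%d")

# Option 1: walk the indices from the end, so a deletion only shifts
# elements that have already been checked.
by_index = list(date)  # work on a copy so both options start from the same data
for row in range(len(by_index) - 1, -1, -1):
    if by_index[row] < startDate or by_index[row] > endDate:
        del by_index[row]

# Option 2: slice assignment replaces the contents of the existing list
# object, so other references to it (alias here) see the filtered result too.
alias = date
date[:] = [each for each in date if startDate <= each <= endDate]

print([d.strftime("%Y-%m-%d") for d in date])
```

Both options leave the same two dates in the list; the plain list comprehension is still the simplest choice when nothing else refers to the original list.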
    

    Taking the solution further: download the data from Google Drive, filter the needed rows with pandas, and then plot them for analysis.

    Step 1: download the data

    import pandas as pd
    import requests
    from io import StringIO

    # Turn the Google Drive sharing link into a direct-download link
    gd_url='https://drive.google.com/file/d/1N2J136mog2CZK_XRyL3pxocaoUV8DByS/view?usp=sharing'
    file_id = gd_url.split('/')[-2]
    download_url='https://drive.google.com/uc?export=download&id=' + file_id
    response = requests.get(download_url)
    response.raise_for_status()  # stop early if the download failed
    df = pd.read_csv(StringIO(response.text))  # parse the CSV text
    print(df.head(1))  # show the first row as a quick check
    

    Step 2: filter the data

    # Control dates
    startDate = '1948-01-01'
    endDate = '1950-02-01'
    # This compares strings, which only works if DATE holds zero-padded
    # YYYY-MM-DD values; otherwise convert the column to datetime first.
    df_new = df.loc[(df['DATE'] >= startDate) & (df['DATE'] <= endDate)]
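
If the DATE strings cannot be trusted to sort correctly as text, one option is to convert the column to real timestamps before filtering. This is only a sketch: the small DataFrame below is a made-up stand-in for the downloaded file (only the DATE column name comes from the code above; the VALUE column and all rows are assumptions).

```python
import pandas as pd

# Hypothetical stand-in for the downloaded file; only the DATE column name
# comes from the answer, the VALUE column and all rows are made up.
df = pd.DataFrame({
    "DATE": ["1947-12-01", "1948-01-01", "1949-06-15", "1950-02-01", "1951-01-01"],
    "VALUE": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# Parse the strings into timestamps so the comparison is chronological
# rather than alphabetical.
df["DATE"] = pd.to_datetime(df["DATE"])

startDate = pd.Timestamp("1948-01-01")
endDate = pd.Timestamp("1950-02-01")

# Series.between is inclusive on both ends by default, matching >= and <=.
df_new = df.loc[df["DATE"].between(startDate, endDate)]
print(df_new)
```

With a datetime column the plot in the next step can also use the dates directly, for instance by setting DATE as the index first.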
    

    Step 3: show the graph.

    import matplotlib.pyplot as plt

    df_new.plot()  # line plot of the numeric columns
    plt.show()
    
