pandas conditional-statements delete-row

Delete All Rows with Year != 2021 in Pandas


I have a huge pandas DataFrame with hourly data for the years 1991-2021, and I need to drop all rows where the year is not 2021 (i.e. the current year). The DataFrame has a "year" column with values ranging from 1991 to 2021. I am using the line of code below, but it does not seem to do anything to df1. Is there a better way to delete all rows where year != 2021?

trimmed_df1 = df1.drop(df1[df1.year != '2021'].index)

My data is a 4532472 x 10 DataFrame with these columns:

df1.columns.values
Out[20]: 
array(['plant_name', 'business_name', 'business_code',
   'maint_region_name', 'power_kwh', 'wind_speed_ms', 'mos_time',
   'dataset', 'month', 'year'], dtype=object)

Solution

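  • This should do the job. The query condition describes the rows to keep, so it has to be year == 2021:

    >>> trimmed_df1 = df1.query('year == 2021').reset_index()

    Resetting the index is optional; pass drop=True if you don't want the old index kept as a column. If the year column holds strings rather than integers, compare against "2021" inside the query instead.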
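
An equivalent way to do this is with a boolean mask. The sketch below uses a small made-up DataFrame in place of df1 (the values are hypothetical; only the "year" column name comes from the question). It assumes the dtype of "year" is unknown, so it converts the column to numbers before comparing. If "year" is stored as integers, comparing it against the string '2021' matches no rows, which may be why the original attempt misbehaved.

```python
import pandas as pd

# Hypothetical miniature stand-in for df1; only the "year" column matters here.
df1 = pd.DataFrame(
    {
        "power_kwh": [10.0, 12.5, 9.1, 11.3],
        "year": [1991, 2020, 2021, 2021],
    }
)

# Convert to a numeric dtype first so the comparison works whether the
# column holds integers or strings such as "2021".
year = pd.to_numeric(df1["year"], errors="coerce")

# Keep only the 2021 rows. This builds a new DataFrame; df1 is not modified.
trimmed_df1 = df1[year == 2021].reset_index(drop=True)

print(trimmed_df1)
```

To filter on the current year instead of a hard-coded value, 2021 could be replaced with pd.Timestamp.now().year. Note that neither drop, query, nor boolean indexing changes df1 in place, so the result has to be assigned to a name (or back to df1) to have any visible effect.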