Search code examples
pythonpandasdrop

How can I drop a row if all values in a given set of columns are 0


I have a dataframe with about 8000 records that I would like to go through and drop all rows where columns 1/1/2015 - 1/1/2030 are equal to 0 (columns are labelled as 1/1/2015, 1/1/2016, etc.). Here is my code:

find_zeros = 0
for index, row in df.iterrows():
    for z in range(2015,2031):
        find_zeros += row['1/1/%s' %(z)]
    if (find_zeros <= 0):
        df.drop(index, inplace=True)

I keep running into many issues so I want to know if you know how to make this code work or if there is a better way to do this.

For reference, here is an example of the df:

   name      1/1/2015 1/1/2016 1/1/2017
   a           0.0      0.0      0.0
   b           0.0      0.0      1.0
   c           0.0      0.0      0.0
   d           1.0      0.0      0.0
   e           0.0      0.0      1.0
   f           0.0      0.0      0.0

Expected outcome:

 name      1/1/2015 1/1/2016 1/1/2017
   b           0.0      0.0      1.0
   d           1.0      0.0      0.0
   e           0.0      0.0      1.0

Solution

  • We can filter to keep values where not all the values are (eq) 0 in the subset of cols:

    cols = [f'1/1/{v}' for v in range(2015, 2018)]  # change upper bound to 2031
    df = df[~df[cols].eq(0).all(axis=1)]
    

    or where any are not equal (ne) to 0 in the subset:

    cols = [f'1/1/{v}' for v in range(2015, 2018)]  # change upper bound to 2031
    df = df[df[cols].ne(0).any(axis=1)]
    

    *Note if creating a new DataFrame from this selection:

    new_df = df[df[cols].ne(0).any(axis=1)]
    

    copy or reset_index or loc will be needed to avoid a later SettingWithCopyWarning.

    For example,

    new_df = df[df[cols].ne(0).any(axis=1)].copy()
    

    Sample Data:

    df = pd.DataFrame({
        'name': {0: 'a', 1: 'b', 2: 'c', 3: 'd', 4: 'e', 5: 'f'},
        '1/1/2015': {0: 0.0, 1: 0.0, 2: 0.0, 3: 1.0, 4: 0.0, 5: 0.0},
        '1/1/2016': {0: 0.0, 1: 0.0, 2: 0.0, 3: 0.0, 4: 0.0, 5: 0.0},
        '1/1/2017': {0: 0.0, 1: 1.0, 2: 0.0, 3: 0.0, 4: 1.0, 5: 0.0}
    })
    
      name  1/1/2015  1/1/2016  1/1/2017
    0    a       0.0       0.0       0.0
    1    b       0.0       0.0       1.0
    2    c       0.0       0.0       0.0
    3    d       1.0       0.0       0.0
    4    e       0.0       0.0       1.0
    5    f       0.0       0.0       0.0
    

    Sample Output:

      name  1/1/2015  1/1/2016  1/1/2017
    1    b       0.0       0.0       1.0
    3    d       1.0       0.0       0.0
    4    e       0.0       0.0       1.0