python pandas scikit-learn sklearn-pandas

Dropping a number of columns in a pandas DataFrame on one line


Greetings, I have a pandas DataFrame that looks like this:

    Product_Code  W0  W1  W2  W3  W4  W5  W6  W7  W8      ...        \
806         P815   0   0   1   0   0   2   1   0   0      ...         
807         P816   0   1   0   0   1   2   2   6   0      ...         
808         P817   1   0   0   0   1   1   2   1   1      ...         
809         P818   0   0   0   1   0   0   0   0   1      ...         
810         P819   0   1   0   0   0   0   0   0   0      ...         

     Normalized 42  Normalized 43  Normalized 44  Normalized 45  \
806           0.00           0.33           0.33           0.00   
807           0.43           0.43           0.57           0.29   
808           0.50           0.00           0.00           0.50   
809           0.00           0.00           0.00           0.50   
810           0.00           0.00           0.00           0.00   

I don't need most of these columns; in fact I only need W0 and W4, so I want to remove all the others. This is what I tried:

raw_data = [ raw_data.drop( [i], 1, inplace = True )  for i in raw_data if i is not 'W0' and i is not  'W4'  ]

After half an hour I figured out that, for some reason, the "is not" operator does not work on strings the way != does, and I was wondering why. So now I have a working solution:

#WORKS !!!!
# for i in raw_data:
#     if i != 'W0' and i != 'W4':
#         raw_data.drop( [i], 1, inplace = True )  
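
On why "is not" and != behave differently: a minimal sketch, not tied to the DataFrame above. The variable names are made up for illustration. The `is` operator asks whether two names point at the same object, while `==` asks whether the values are equal, and two equal strings are not guaranteed to be the same object.

```python
# The same text built two ways: a literal and a string assembled at runtime.
literal = "W0"
built = "".join(["W", "0"])

# == compares the characters, so equal text always compares equal.
same_value = literal == built

# `is` compares object identity. Whether two equal strings share one
# object is an implementation detail, so the result should not be relied on.
same_object = literal is built

print(same_value, same_object)
```

This is why a comparison such as `i != 'W0'` is the safer way to test a column name.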

But I don't like this loop at all, and I have commented it out because it takes up a lot of space and isn't pretty. I want to make the one-line comprehension with the if condition work. Is that possible? The problem is that:

  raw_data = [ raw_data.drop( [i], 1, inplace = True )  for i in raw_data if i != 'W0' and i != 'W4'  ]

turns raw_data into a list instead of leaving it a DataFrame. How should it be done?
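
A small sketch of what seems to be happening in that one-liner, using a made-up four-column frame `df` as a stand-in for raw_data. With inplace=True, drop modifies the frame and returns None, so the comprehension collects one None per dropped column, and assigning that list back to the variable replaces the DataFrame.

```python
import pandas as pd

# Hypothetical stand-in for raw_data, with a few of the question's column names.
df = pd.DataFrame({
    "Product_Code": ["P815", "P816"],
    "W0": [0, 0],
    "W1": [0, 1],
    "W4": [0, 1],
})

# Each drop(..., inplace=True) call changes df itself and returns None.
# list(df.columns) takes a snapshot so the loop does not depend on the
# frame that is being modified.
results = [
    df.drop(columns=[c], inplace=True)
    for c in list(df.columns)
    if c != "W0" and c != "W4"
]

print(results)           # a list of None values, one per dropped column
print(list(df.columns))  # the frame itself now holds only W0 and W4
```

So the drops do take effect; it is the `raw_data = [...]` assignment that overwrites the DataFrame with the list.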


Solution

  • You can use:

    raw_data.drop([i for i in raw_data if i is not 'W0' and i is not  'W4'], 
                   axis=1, inplace=True)
    

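    This answers the question, but the condition as stated doesn't do what you want. The condition you have written is if i is not 'W0' and i is not 'W4'. Because "is not" compares object identity rather than string value, it can evaluate to true even for the columns named W0 and W4, in which case every column is dropped. You probably want i != 'W0' and i != 'W4' instead.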
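
    For reference, a minimal sketch of the same one-line drop with a value comparison, run on a made-up frame standing in for raw_data. The `keep` list and the sample values are assumptions for illustration. Selecting the wanted columns directly is shown as an alternative that avoids the comprehension altogether.

```python
import pandas as pd

# Hypothetical stand-in for raw_data.
raw_data = pd.DataFrame({
    "Product_Code": ["P815", "P816", "P817"],
    "W0": [0, 0, 1],
    "W1": [0, 1, 0],
    "W4": [0, 1, 1],
    "Normalized 42": [0.00, 0.43, 0.50],
})

keep = ["W0", "W4"]

# Alternative: select the wanted columns instead of dropping the rest.
selected = raw_data[keep].copy()

# One-line drop: compare names by value (not identity) and drop in place.
raw_data.drop(
    columns=[c for c in raw_data.columns if c not in keep],
    inplace=True,
)

print(list(raw_data.columns))
```

    Both approaches should leave only W0 and W4; the selection form returns a new frame, while the drop with inplace=True changes raw_data itself.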