Search code examples
pythonpython-3.xpandaspython-2.7sklearn-pandas

Do logical operations in 2 column simmultaneously in Pandas


I have one Dataframe name df -

   id   year    month      target1  
0  324  2019.0  1.0        100.0    
1  325  2019.0  3.0        100.0   
2  326  2019.0  10.0       100.0    
3  327  2019.0  11.0       100.0    
4  328  2019.0  12.0       100.0   

Now I want to compare 2 columns of the Dataframe namely 'year' & 'month', to get new dataframe like this -

   id   year    month      amount  
0  324  2019.0  1.0        100.0    
1  325  2019.0  3.0        100.0   
2  326  2019.0  10.0       100.0    

Here the operations was -

year <= 2019
&
month <= 10

How can I achieve this in Pandas.

Similar operations could be -

year < 2019     &      month <= 6
year < 2019     &      month <  6
year < 2019     &      month <= 12

Any logic or way to achieve this result using Python3 or Pandas. I tried a logic where I simply compare the year and month separately, like this -

df_new = df[df['year'] <= year ]
df_new = df_new[df_new['month_num'] <= month_num ]  

But this logic fail in case I have more than 1 year. By this I get this dataframe as output -

year <= 2020  &  month <= 1
   id year    month      amount 
0  3  2019.0  1.0        100.0     
5  9  2020.0  1.0        100.0     

But ideally what I need is this -

   year    month      target1 
0  2019.0  1.0        100.0     
1  2019.0  3.0        100.0     
2  2019.0  10.0       100.0     
3  2019.0  11.0       100.0     
4  2019.0  12.0       100.0     
5  2020.0  1.0        100.0    

Thank you


Solution

  • As far as I understand you want to print everything before the specified month of the specified year. This could be done in several ways

    1. Simply add
    data.query('(year == 2020 and month <= 1) or (year <= 2020)')
    
    1. You can try to create a new column, which would use pd.to_datetime()