
Checking if the dataframe column is filled and searching by string


I have the following dataframe:

      import pandas as pd
      import re

      df = pd.DataFrame({'Column_01': ['Press', 'Temp', '', 'Strain gauge', 'Ultrassonic', ''], 
                         'Column_02': ['five', 'two', 'five', 'five', 'three', 'three']})

I would first like to check whether 'Column_01' is filled. If 'Column_01' is filled OR 'Column_02' contains any of the words 'one', 'two', or 'three', a new column (Classifier) should receive 'SENSOR'.

To identify the 'Column_02' string I implemented the following code:

     df['Classifier'] = df.apply(lambda x: 'SENSOR'
                        if re.search(r'one|two|three', x['Column_02'])
                        else 'Nan', axis = 1)

This code works: it correctly finds the string in each row of the dataframe. However, I also need to check whether 'Column_01' is filled, and I have not been able to solve that with notnull().

I would like the output to be:

      Column_01       Column_02   Classifier
      Press           five        SENSOR        # Column_01 is filled
      Temp            two         SENSOR        # Column_01 is filled; string 'two'
                      five        Nan
      Strain gauge    five        SENSOR        # Column_01 is filled
      Ultrassonic     three       SENSOR        # Column_01 is filled; string 'three'
                      three       SENSOR        # string 'three'

Solution

  • Generally, you should avoid .apply() with axis=1, since it loops over the rows in Python and is slow on large dataframes (ref https://stackoverflow.com/a/54432584/11610186 ).

    This should do the trick:

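    import numpy as np

    # Filled means neither NaN nor ''; a plain alternation avoids pandas' match-group warning.
    mask = df["Column_01"].fillna('').ne('') | df["Column_02"].str.contains("one|two|three")
    # Note: next to the string "SENSOR", np.nan ends up as the string 'nan' (see the output).
    df["Classifier"] = np.where(mask, "SENSOR", np.nan)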

    Outputs:

          Column_01 Column_02 Classifier
    0         Press      five     SENSOR
    1          Temp       two     SENSOR
    2                    five        nan
    3  Strain gauge      five     SENSOR
    4   Ultrassonic     three     SENSOR
    5                   three     SENSOR
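
To make the logic easier to follow, here is a minimal sketch that builds the two conditions separately. It also shows the likely reason notnull() did not help: the blank entries in Column_01 are empty strings, not missing values, so notnull() returns True for every row. The names is_filled and has_keyword are illustrative only, and using Series.where in place of np.where is my own variation, chosen so that unmatched rows hold a real missing value and not the text nan seen in the output above.

```python
import pandas as pd

df = pd.DataFrame({
    "Column_01": ["Press", "Temp", "", "Strain gauge", "Ultrassonic", ""],
    "Column_02": ["five", "two", "five", "five", "three", "three"],
})

# notnull() is True for every row: '' is an empty string, not a missing value.
print(df["Column_01"].notnull().tolist())

# Condition 1: Column_01 holds a non-empty string (NaN is treated as empty).
is_filled = df["Column_01"].fillna("").ne("")

# Condition 2: Column_02 contains one of the keywords.
has_keyword = df["Column_02"].str.contains(r"one|two|three", na=False)

# Rows meeting either condition get 'SENSOR'; all other rows are left missing.
df["Classifier"] = pd.Series("SENSOR", index=df.index).where(is_filled | has_keyword)
print(df)
```

With a real missing value in the Classifier column, the unmatched rows can later be selected with df["Classifier"].isna(), which would not work if the column held the text 'nan'.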