Comparing a dataframe's columns against a list and assigning zero and one


I have a dataframe and a list that contains some of the dataframe's column names, as follows:

 my_frame:
           col1, col2, col3, ..., coln
              2,    3,    4, ..., 2
              5,    8,    5, ..., 1
              6,    1,    8, ..., 9

 my_list:
             ['col1','col3','coln']

Now I want to create an array whose length equals the total number of columns in my original dataframe and which contains only zeros and ones. The array should hold 1 at a column's position if that column's name appears in "my_list", and 0 otherwise. My desired output looks like this:

  my_array={[1,0,1,0,0,...,1]} 

Solution

  • This should help:

    import pandas as pd
    
    dictt = {'a':[1,2,3],
             'b':[4,5,6],
             'c':[7,8,9]}
    
    df = pd.DataFrame(dictt)
    
    my_list = ['a','h','g','c']
    
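    my_array = []
    
    # Walk the columns in order so my_array lines up with df.columns
    for column in df.columns:
        if column in my_list:
            my_array.append(1)   # column name is in my_list
        else:
            my_array.append(0)   # column name is not in my_list
    print(my_array)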
    

    Output:

    [1, 0, 1]
    

    If you want my_array to be a NumPy array instead of a list, use this:

    import pandas as pd
    import numpy as np
    
    dictt = {'a':[1,2,3],
             'b':[4,5,6],
             'c':[7,8,9]}
    
    df = pd.DataFrame(dictt)
    
    my_list = ['a','h','g','c']
    
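    # Start with an empty integer array
    my_array = np.empty(0, dtype=int)
    
    for column in df.columns:
        # np.append returns a new array, so the result must be reassigned
        if column in my_list:
            my_array = np.append(my_array, 1)
        else:
            my_array = np.append(my_array, 0)
    print(my_array)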
    

    Output:

    [1 0 1]
    

    I used test data in my code to make it easier to follow. You can replace the test dataframe with your actual dataframe. Hope this helps!
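
As a possible shortcut, the loops above can probably be replaced with a single vectorized call. The sketch below assumes the same test dataframe and list as in the answer, and uses pandas' `Index.isin` to build a boolean mask over the column names before converting it to integers. It is one way to do it, not the only one.

```python
import pandas as pd

# Same test data as in the answer above
df = pd.DataFrame({'a': [1, 2, 3],
                   'b': [4, 5, 6],
                   'c': [7, 8, 9]})
my_list = ['a', 'h', 'g', 'c']

# Index.isin gives a boolean NumPy array with one entry per column;
# astype(int) turns True/False into 1/0
my_array = df.columns.isin(my_list).astype(int)
print(my_array)

# If a plain Python list is preferred
print(my_array.tolist())
```

With the test data this should give the same result as the loop versions, and names in `my_list` that are not columns (here 'h' and 'g') are simply ignored.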