Tags: python, numpy, loops, conditional-statements, multiple-columns

How to efficiently check conditions on two columns and perform an operation on a third column in Python


I have three columns with thousands of rows. The values in columns 1 and 2 range from 1 to 6. I want to check the combination of values in columns 1 and 2 and divide the value in column 3 by a number that depends on that combination.

1     2    3.036010    
1     3    2.622544    
3     1    2.622544    
1     2    3.036010    
2     1    3.036010  

Column 3 should be divided by the same number when the values in columns 1 and 2 are swapped. For example, the combinations 1 2 and 2 1 use the same divisor. My present approach does the job, but I would have to write every condition by hand. What would be a more efficient way to perform this task? Thanks in advance!

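import numpy as np

my_data = np.loadtxt('abc.dat')

for row in my_data:
    # Indices are 0-based: row[0] and row[1] are the pair, row[2] is the third column
    if row[0] == 1 and row[1] == 2:
        row[2] /= some_value  # in-place division, so my_data itself is updated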
   



  

Solution

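  • NumPy offers np.where, which evaluates a condition on whole columns at once (a vectorized test). The condition below checks whether columns 1 and 2 are equal; substitute whatever test you need: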

    result = np.where(data[:, 0] == data[:, 1], data[:, 2]/some_value, data[:, 2])
    

    or, if you want to change the array in place:

    data[:, 2] = np.where(data[:, 0] == data[:, 1], data[:, 2]/some_value, data[:, 2])
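
For the order-independent pairs described in the question, one possible approach is a symmetric lookup table indexed by the two label columns, which avoids writing one condition per pair. The following is a minimal sketch, not a definitive implementation: the divisors in pair_divisors are made-up placeholders, and the names pair_divisors, divisors, i and j are assumptions introduced for illustration. It assumes the labels are integers from 1 to 6, as stated in the question.

```python
import numpy as np

# Sample rows from the question: columns 1 and 2 hold labels 1..6, column 3 the value.
data = np.array([
    [1, 2, 3.036010],
    [1, 3, 2.622544],
    [3, 1, 2.622544],
    [1, 2, 3.036010],
    [2, 1, 3.036010],
])

# Hypothetical divisors, one per unordered pair; pairs not listed keep divisor 1.
pair_divisors = {(1, 2): 2.0, (1, 3): 4.0}

# 7x7 table so the labels 1..6 can be used directly as indices (index 0 is unused).
divisors = np.ones((7, 7))
for (a, b), value in pair_divisors.items():
    divisors[a, b] = value
    divisors[b, a] = value  # the swapped pair gets the same divisor

# Integer copies of the first two columns serve as row/column indices into the table.
i = data[:, 0].astype(int)
j = data[:, 1].astype(int)

# One vectorized lookup and division replaces the per-row if statements.
result = data[:, 2] / divisors[i, j]
```

To update the array in place instead, the last line could be written as data[:, 2] /= divisors[i, j]. Filling both divisors[a, b] and divisors[b, a] is what makes 1 2 and 2 1 share a divisor, so each pair only has to be listed once.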