Tags: python, pandas, criteria, multiple-columns

Python: search in file with multiple criteria


I have a text file with multiple columns, like this:

1000 1 2 3
1000 1.5 2.5 3.1
2000 4 5 6
3000 7 8 9

I would like to create a Python script in which I enter a series of 3 numbers, search for the closest match in the first 3 columns, and return the corresponding value from the last column. For example, if I enter 1200 1 2, it should return 3.

UPDATE: Is it possible to interpolate linearly between rows that have the same values in the second and third columns? For example, if my data is:

1000 100 2 0.1
1200 100 2 0.2
1000 80 3 0.4

and my input is '1100 100 2', it should return 0.15.


Solution

    1. open the file,

      values = []
      with open("myfile.txt") as inf:
      
    2. read each line,

          for line in inf:
      
    3. convert it into numbers,

              values.append([float(s) for s in line.split()])
      
    4. define what you mean by "closest". Manhattan distance? Euclidean (least-squares) distance?

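      # "from" is a reserved word in Python, so the reference point is named "target"
      def make_manhattan_dist_fn(target):
          def distance_fn(pt):
              # zip stops at the shorter sequence, so the last (value) column is ignored
              return sum(abs(b - a) for a, b in zip(target, pt))
          return distance_fn

      my_dist_fn = make_manhattan_dist_fn([1200, 1, 2])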
      

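      Edit: based on your comment, you want the columns compared in order (first column first, with ties broken by the next column), which a tuple of per-column distances gives you: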

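      # "from" is a reserved word in Python, so the reference point is named "target"
      def make_tuple_dist_fn(target):
          def distance_fn(pt):
              # tuples compare element by element, so column 1 takes priority
              return tuple(abs(b - a) for a, b in zip(target, pt))
          return distance_fn

      my_dist_fn = make_tuple_dist_fn([1200, 1, 2])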
      
    5. find the closest value,

      print(min(values, key = my_dist_fn)[-1])
      

      which results in

      3.0
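
The steps above can be put together into one runnable sketch, together with one possible way to handle the interpolation asked about in the update. This is only an illustration under some assumptions: the names parse_rows, closest_value and interpolated_value are made up for this example, the data is passed in as text lines rather than read from "myfile.txt", rows "match" only when their second and third columns equal the input exactly, and inputs outside the known range are clamped to the nearest end point instead of extrapolated.

```python
def parse_rows(lines):
    """Steps 1-3: turn whitespace-separated lines into lists of floats."""
    return [[float(s) for s in line.split()] for line in lines if line.strip()]


def closest_value(rows, target):
    """Steps 4-5: last column of the row with the smallest tuple distance."""
    def distance(row):
        # zip stops after len(target) items, so the value column is ignored
        return tuple(abs(b - a) for a, b in zip(target, row))
    return min(rows, key=distance)[-1]


def interpolated_value(rows, target):
    """Interpolate linearly on column 1 among rows whose columns 2 and 3
    equal the target exactly (an assumption made for this sketch)."""
    x, key = target[0], tuple(target[1:3])
    points = sorted((r[0], r[-1]) for r in rows if tuple(r[1:3]) == key)
    if not points:
        # no exact match on columns 2 and 3: fall back to the nearest row
        return closest_value(rows, target)
    # clamp to the end points instead of extrapolating (assumption)
    if x <= points[0][0]:
        return points[0][1]
    if x >= points[-1][0]:
        return points[-1][1]
    # x lies strictly inside the range, so some pair satisfies x0 <= x < x1
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= x < x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


original = parse_rows(["1000 1 2 3", "1000 1.5 2.5 3.1", "2000 4 5 6", "3000 7 8 9"])
print(closest_value(original, [1200, 1, 2]))  # 3.0

updated = parse_rows(["1000 100 2 0.1", "1200 100 2 0.2", "1000 80 3 0.4"])
print(round(interpolated_value(updated, [1100, 100, 2]), 6))  # 0.15
```

To read from a file instead, pass the open file object to parse_rows, for example parse_rows(open("myfile.txt")). The result of the interpolation is rounded only for display, since floating-point arithmetic may not give exactly 0.15.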