csv, awk, range, point, minimum

Return the minimum point in a range of points in CSV data using AWK


I want to return the minimum y value observed within a specified range of x values in CSV x, y data using AWK in Bash. So, specifically, I may have data such as the following:

xyData="10, 100
20, 200
30, 300
40, 400
50, 500
60, 600
70, 700
80, 800
90, 900
100, 1000"

I would want to ask a question such as the following: What is the point at which the minimum y value is observed in the range of points starting with x value 50 and ending with x value 90? The answer for this example would be "50, 500", because 500 is the minimum y value observed in the inclusive range of points starting with x value 50 and ending with x value 90.

I'm very new to AWK. Is there some nifty way in which this may be accomplished? Thank you very much for your assistance on this.


Solution

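  • One way (EDIT: this version has bugs, see Scrutinizer's comment; for example, using y == 0 as the "not yet set" test misbehaves when a y value in the range really is 0, and the END block prints ", " when no row matches):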

    awk -F'[, ]+' '
      $1 >= 50 && $1 <= 90 { 
        if (y > $2 || y == 0) { 
          y = $2; x = $1 
        } 
      } 
      END { 
        printf "%s, %s\n", x, y 
      }
    ' infile
    

    It yields:

    50, 500
    

    UPDATE: the solution revised according to Scrutinizer's comment:

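    awk -F'[, ]+' '
      # Consider only rows whose x value lies in the inclusive range 50..90.
      $1 >= 50 && $1 <= 90 {
        # Take the first matching row unconditionally, then any row with a smaller y.
        if (y > $2 || !y_set) {
          y = $2
          x = $1
          y_set = 1
        }
      }
      END {
        # Print only if some row matched; testing y_set also handles the point 0, 0.
        if (y_set) {
          printf "%s, %s\n", x, y
        }
      }
    ' infile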
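
Since the question keeps the data in a shell variable and asks about an arbitrary range, here is a minimal sketch of the same idea wrapped in a function. The name `min_point_in_range` and the AWK variables `lo`, `hi`, and `found` are made up for illustration and are not part of the original answer. It assumes each line has the form "x, y" with no leading whitespace.

```shell
# Sample data from the question, held in a shell variable.
xyData="10, 100
20, 200
30, 300
40, 400
50, 500
60, 600
70, 700
80, 800
90, 900
100, 1000"

# min_point_in_range LO HI  (hypothetical helper)
# Reads "x, y" lines on stdin and prints the point with the smallest y
# among rows where LO <= x <= HI. Prints nothing if no row is in range.
min_point_in_range() {
  awk -F'[, ]+' -v lo="$1" -v hi="$2" '
    $1 >= lo && $1 <= hi {
      # First matching row, or a strictly smaller y than the best so far.
      if (!found || $2 < y) { x = $1; y = $2; found = 1 }
    }
    END { if (found) printf "%s, %s\n", x, y }
  '
}

# Prints: 50, 500
printf '%s\n' "$xyData" | min_point_in_range 50 90
```

Passing the bounds with `-v` avoids editing the AWK program for each query, and the separate `found` flag means a legitimate y value of 0 is not mistaken for "nothing seen yet". On ties the first row with the minimum y is kept, because the comparison is strict.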