Tags: unix, awk, gawk

AWK: extract the shortest string from an array


I have an array containing words such as "gummy", "owl", "table", and so on. I need to find the shortest word and assign it to a variable.

What I've tried

st[$1] = x;
for (i in st)
{
    if(min < st[i])
    {
        min = st[i];
    }
}
ld=min;

Solution

  • To find just the shortest length, consider this:

    $ ./bar.awk
    shortest= -1   i= 1    st[i]= gummy
    first time, now shortest= 5
    shortest= 5   i= 2    st[i]= owl
    found shorter value, now shortest= 3
    shortest= 3   i= 3    st[i]= table
    shortest= 3   i= 4    st[i]= cat
    done. shortest= 3
    
    $ cat bar.awk
    #!/usr/bin/awk -f
    
    BEGIN {
       st[1]="gummy"
       st[2]="owl"
       st[3]="table"
       st[4]="cat"
    
       shortest = -1
       for (i in st)
       {
           print "shortest=", shortest, "  i=", i, "   st[i]=", st[i]
           if( shortest == -1 ) {
              shortest = length( st[i] )
              print "first time, now shortest=", shortest
           } else if( length( st[i] ) < shortest ) {
              shortest = length( st[i] )
              print "found shorter value, now shortest=", shortest
           }
       }
       print "done. shortest=", shortest
    }
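
The question's `st[$1] = x` hints that the words may arrive as input fields rather than being hard-coded. Under that assumption, the same comparison can be done while reading, without an array at all. This is only a sketch: the function name `shortest_word` and the variables `seen` and `shortest` are made up for illustration, and it assumes one word in the first field of each line.

```shell
# Hypothetical helper: print the shortest word read from standard input.
# Assumes the word of interest is the first field of each line.
shortest_word() {
    awk '
        # Skip blank lines so an empty field is never treated as a word.
        NF == 0 { next }

        # First word seen, or a strictly shorter one: remember it.
        !seen || length($1) < length(shortest) { shortest = $1; seen = 1 }

        # Print nothing at all if the input had no words.
        END { if (seen) print shortest }
    '
}

printf 'gummy\nowl\ntable\ncat\n' | shortest_word
```

Because the comparison is a strict `<`, the first of several equally short words wins; with the input above that is `owl`.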
    

    Original post: Here's a short example that should get you started.

    I want to call out the use of printing things to see what the code is doing. If you're not sure why something works a particular way, add prints around it to display the values involved until you understand. The printing doesn't need to be fancy, just enough for you to see what each expression is doing and what a given variable holds at any point in time.

    note 1: We start with candidate set to an element of our array. This is a little redundant because the loop will do one unnecessary compare, but it is easy to write this way, it is clear what is going on, and we avoid a possible error (what happens if you initialize candidate = "" and your array doesn't have any empty string values?).

    note 2: I'm assigning st[i] to a variable 'value' since I think that reads more clearly than st[i] everywhere (either way is fine).

    $ chmod +x foo.awk
    $ cat foo.awk
    #!/usr/bin/awk -f
    
    BEGIN {
       st[1]="gummy"
       st[2]="owl"
       st[3]="table"
       st[4]="cat"
    
       candidate=st[1]
       for (i in st)
       {
           print "candidate=", candidate
           print "        i=", i
           print "    st[i]=", st[i]
           value = st[i]
           if( length( value ) < length(candidate) )
           {
               candidate = value
               print "found shorter value, changing candidate=", candidate
           }
       }
       print "done. candidate=", candidate
    }
    
    $ ./foo.awk 
    candidate= gummy
            i= 1
        st[i]= gummy
    candidate= gummy
            i= 2
        st[i]= owl
    found shorter value, changing candidate= owl
    candidate= owl
            i= 3
        st[i]= table
    candidate= owl
            i= 4
        st[i]= cat
    done. candidate= owl
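
To make the warning in note 1 concrete, here is a small sketch that runs the same loop twice, once seeded with an empty string and once seeded with an array element. The wrapper name `seed_demo` and the variables `bad` and `good` are invented for this demo. Since `length("")` is 0, no word can ever be strictly shorter, so the empty seed never changes.

```shell
# Hypothetical demo: compare two ways of seeding the candidate.
seed_demo() {
    awk 'BEGIN {
        st[1] = "gummy"; st[2] = "owl"; st[3] = "table"; st[4] = "cat"

        # Seed with "": its length is 0, so the test below is never true.
        bad = ""
        for (i in st)
            if (length(st[i]) < length(bad)) bad = st[i]

        # Seed with an element of the array, as foo.awk does.
        good = st[1]
        for (i in st)
            if (length(st[i]) < length(good)) good = st[i]

        printf "empty seed: [%s]\n", bad
        printf "element seed length: %d\n", length(good)
    }'
}

seed_demo
```

The demo prints the length of `good` rather than the word itself because awk does not specify the traversal order of `for (i in st)`, so either `owl` or `cat` could be the one that is kept.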
    

    Question: Suppose you have two (or more) candidates that are all equally short, like "cat" and "owl" in the above example. Which value(s) do you want to produce? Can you think of a way to produce all of the shortest values?
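
One possible answer to the question above, offered as a sketch rather than the only way to do it: make two passes, first finding the minimum length and then printing every word of that length. The wrapper name `all_shortest` and the explicit count `n` are assumptions added for this example. It walks the indices with a counting loop instead of `for (i in st)`, since awk leaves the order of the latter unspecified.

```shell
# Hypothetical helper: print every word that ties for the shortest length.
all_shortest() {
    awk 'BEGIN {
        n = 4
        st[1] = "gummy"; st[2] = "owl"; st[3] = "table"; st[4] = "cat"

        # Pass 1: find the minimum length, seeded from the first element.
        min = length(st[1])
        for (i = 2; i <= n; i++)
            if (length(st[i]) < min) min = length(st[i])

        # Pass 2: print each word of that length, in index order.
        for (i = 1; i <= n; i++)
            if (length(st[i]) == min) print st[i]
    }'
}

all_shortest
```

With the four sample words this prints `owl` and then `cat`, one per line.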