Search code examples
google-sheetsgoogle-sheets-formulavlookupspreadsheetdata-analysis

Get lowest maximum value across various ranges in Google Sheets


I'm doing some recreational data analysis and after spending a the day learning how to use VLOOKUP() I've hit a wall I don't think I can figure out through just googling. I have a dataset of a bunch of ideas, linked to the people I want to do them with, a bunch of names associated with them, and the subjects of the ideas. I've been doing fun stuff like how many ideas each subject has, how much of a person's ideas are about a given subject, and how much of a subject's ideas are with a given person. I've also been working out some stats for the whole sheet: total ideas, average ideas per person, average ideas per subject, who has the largest "subject monopoly", whatever seems interesting.

I want to have a stat for who has the most diverse range of subjects, which I would want to indicate by checking who has the lowest highest "subject focus" (how much of a person's ideas are a given subject). I know how I would do this with javascript or python: I want to run through all the "subject focus" values, dividing them up by person, and discarding all but the highest per-person. Then I compare each of these "maximum subject focuses", and pick out the lowest, and display the name associated.

Despite how clear it is in my head, I have no idea how to actually start doing that in Google Sheets. Any advice? Has anyone had to perform a similar operation?

Edit: So the closest I've gotten is to make a new column populated with the formula =if(MAX(FILTER(G:G, A:A=A3)) = 0, "", MAX(FILTER(G:G, A:A=A3))), where G:G is where I have the "subject focus" of a given person, and A:A is the name of the person (it starts at A3 because rows 1 and 2 are header). Then, to display the stat, I just have =CONCATENATE("Most diverse idea list: ", VLOOKUP(MIN(J3:J), {J3:J,A3:A}, 2, FALSE), " (", TEXT(ROUND(MIN(J3:J) * 100, 2), "#.00"), "% max focus)"), where J:J is the new column I added. It works, but obviously this is incredibly messy and if there's a way to do this without making a whole new column for data storage that would be much preferred.

Edit 2: Here's a general mockup of the sheet I'm using, since I'd rather not share it directly.


Solution

  • Use query() to get the max value in G grouped by the names in column A, and wrap it in another query() to get just the first result. Put this formula in cell N1:

    =query( 
      query( 
        A1:H, 
        "select A, max(G) 
         where A is not null 
         and G is not null 
         group by A", 
        1 
      ), 
      "order by Col2 asc 
       limit 1", 
      1 
    )
    

    ...and this formula in cell I16:

    ="Most diverse idea list: " & N2 & " (" & text(O2, "#.00%") & " max focus)"
    

    To see how this works, try putting this formula in cell K1:

    =query( 
      A1:H, 
      "select A, max(G) 
       where A is not null 
       and G is not null 
       group by A", 
      1 
    )
    

    If you need to place these results in column J for each name, put this formula in cell J3:

    =arrayformula( 
      iferror( 
        vlookup( 
          A3:A, 
          K2:L, 
          columns(K2:L), 
          false 
        ) 
      ) 
    )
    

    This is an array formula that fills whole column in one go, so you will have to clear column J3:J before inserting the formula to make room for the results.

    To put the first two formulas together so that no helper columns are needed, use this:

    ="Most diverse idea list: " & 
    join( " (", 
      query( 
        query( 
          A2:H, 
          "select A, max(G) 
           where A is not null 
           and G is not null 
           group by A 
           label max(G) '' 
           format max(G) '0.00%' ", 
          0 
        ), 
        "order by Col2 asc 
         limit 1", 
        0 
      ) 
    ) & " max focus)"