Search code examples
matrixgoogle-sheets-formulalookup-tables

(Google Sheets) How to find column 1 and row 1 values for a matrix of occurrences in which both are present in data


( edit: Rough idea of what I'm trying to do: https://docs.google.com/spreadsheets/d/12rIUiuE-TSB0E3XRSnQ0YU9BCpuTsHfrUZXgCvxOE44/edit#gid=2100307022 )

I feel like this should have a really simple solution, but I can't work it out...

Basically, I am analysing a table of tagging data in which more than one tag can be applied to the same record. I am interested in seeing which tags occur together most frequently, so I have created a matrix as such, showing tags in the first column and row, and their co-occurrence in the count matrix below:

Simplified version of matrix - the version I'm using has 60 X and 60 Y items

Below it, I've pulled the highest numbers from the matrix (i.e. where the X and Y values appear the most - 5 in the above case), but am really struggling to work out how to simply create a function that will find the column and row header values for each.

Essentially, I ultimately want e.g. in this case:

5: A & D (where 5 = number of times A and D occur together) (plus any others where the number of co-occurrences is greater than X)

It's highly possible that I got stuck in the woods with this one and there's an easier way to do this, or that I'm just overlooking a simple formula I could use, but at the moment I can't think of a code to create for this that wouldn't be incredibly long or time-consuming.

I've thought of ways to do it with MATCH and INDEX, but I'm not particularly strong with these functions tbh and can't work out how to use them in a 2-D array like this. I've also thought about using combinations of FILTER, FLATTEN and different LOOKUPS, but likewise can't work it out.

Any help would be much appreciated.


Solution

  • Here's one approach you may test out:

    =let(Σ,sort(unique(reduce(,O2:X11,lambda(a,c,vstack(a,{c,torow(sort(tocol({index(N1:N11,row(c)),index(A1:X1,column(c))}),1,))})))),1,),
           filter(Σ,index(Σ,,1)>=3,--index(Σ,,1)))
    
    • X value is taken as >=3 for test purposes

    enter image description here