Search code examples
google-sheetsgoogle-sheets-formulauniquespreadsheetarray-formulas

Joining of non-unique cells across multiple columns in Google Sheets


I have some personal data with first/last names, e-mails, institutions people work at, etc. There are many, many duplicates because this was collected from a few sources over 2-3 years. Sometimes the same person provided different versions of their name, a different e-mail address, etc. I'd like to have a compact version of this data, where a single person (identified by a PersonID) is listed on a single row, with unique variants of their name, e-mail, etc. listed in each cell. Bonus points if the values in every cell are sorted, but far from required. Incoming data and expected output Example above also available at https://docs.google.com/spreadsheets/d/1jizgysC1dntZHg8pZ0--dSAPevSfyXyiVyenj02GiwQ/edit#gid=0

I'm looking for a way to display the unique values in each column of a filter result, ideally staying away from =QUERY if at all possible.

This is easy to do when working with just one resulting column:

=FILTER(A4:A9,D4:D9=1) --> =JOIN(", ",UNIQUE(FILTER(A4:A9,D4:D9=1)))

...but the moment the filter spits out results in multiple columns:

=FILTER(A4:C9,D4:D9=1) --> ???

...I have no clue what to do, other than doing the code above for each column separately (which would be a hassle, given the number of columns involved). Is this possible?


Solution

  • Here's one way to do that using MAP:

    =MAP(UNIQUE(D4:D),LAMBDA(id,BYCOL(FILTER(A4:D,D4:D=id),LAMBDA(col,JOIN(CHAR(10),UNIQUE(col))))))