Search code examples
google-sheetsformula

Combine 10 columns of single-word data into 3 columns of max 4 words each, with punctuation


PURPOSE: I am trying to combine 10 columns of names into three columns, following three guidelines. (I could probably work out the punctuation on my own if necessary, but don't know how to combine it with this task in one step.)

  1. Add a parenthesis before the first word and after the last word, and commas in between names;

  2. Add a maximum of four words into each column before going to next column (note the issue in number 3 below);

  3. Ensure that a name is not broken between columns; this means a first-and-last cannot follow three names in a column, as that would mean five words. This break would have to happen after three words and begin the next column with the first-and-last (see rows 5 & 6 in the Desired Output, beginning "Evan" and "Frank").

Here is an image of the example data: enter image description here

Here is an image of the desired output: enter image description here

Here is a link to the data: https://docs.google.com/spreadsheets/d/1rpZvxrBGPndRTmPq9ZXfOXm_XUbE23Uw0XabIqYnxqo/edit?usp=sharing

Thank you for looking!


Solution

  • Here's a possible solution:

    =ARRAYFORMULA(
       LET(data,A2:O8,
           max_words,4,
           GET_COUNT,LAMBDA(str,COUNTA(SPLIT(str,", "))),
           first_last,REDUCE(TOROW(,1),SEQUENCE(COLUMNS(data)/3,1,1,3),
                        LAMBDA(acc,i,HSTACK(acc,BYROW(CHOOSECOLS(data,i,i+2),LAMBDA(names,TEXTJOIN(" ",1,names)))))),
           joined_first_last,BYROW(first_last,LAMBDA(names,TEXTJOIN(", ",1,names))),
           people,SPLIT(joined_first_last,", ",,),
           IFERROR(
             REDUCE(TOCOL(,1),SEQUENCE(ROWS(people)),
               LAMBDA(v_acc,i,IFNA(VSTACK(v_acc,
                 LET(ans,TOROW(
                           REDUCE(,SEQUENCE(COUNTA(INDEX(people,i))),
                             LAMBDA(h_acc,j,
                                IF(j=1,INDEX(people,i,j),         
                                  IF(GET_COUNT(INDEX(people,i,j))+GET_COUNT(CHOOSECOLS(h_acc,-1))<=max_words,
                                    h_acc&IF(SEQUENCE(1,COLUMNS(h_acc))<COLUMNS(h_acc),,", "&INDEX(people,i,j)),
                                    HSTACK(h_acc,INDEX(people,i,j)))))),3),
                     cols,COLUMNS(ans),
                     IF(cols=1,"("&ans&")",
                       IF(SEQUENCE(1,cols)=1,"("&ans&",",
                         IF(SEQUENCE(1,cols)=cols,ans&")",ans&",")))))))))))
    

    enter image description here

    The first part of the formula is transforming the data into a more usable format:

    =ARRAYFORMULA(
       LET(data,A2:O8,
           max_words,4,
           GET_COUNT,LAMBDA(str,COUNTA(SPLIT(str,", "))),
           first_last,REDUCE(TOROW(,1),SEQUENCE(COLUMNS(data)/3,1,1,3),
                        LAMBDA(acc,i,HSTACK(acc,BYROW(CHOOSECOLS(data,i,i+1,i+2),LAMBDA(names,TEXTJOIN(" ",1,names)))))),
           joined_first_last,BYROW(first_last,LAMBDA(names,TEXTJOIN(", ",1,names))),
           people,SPLIT(joined_first_last,", ",,),
           people))
    

    enter image description here

    Then for each row we are iterating over each column and applying the following algorithm:

    • If the count of the words in the current string plus the count of the words in the previous string is less than or equal to `max_words`, append the current string to the previous string, else;
    • (If the count of the words in the current string plus the count of the words in the previous string is greater than `max_words`) create a new string.
    // if the count of the words in the current string plus the count of the words in the previous string is less than or equal to `max_words`
    
    IF(GET_COUNT(INDEX(people,i,j))+GET_COUNT(CHOOSECOLS(h_acc,-1))<=max_words,
    
    // append the current string to the previous string
    
       h_acc&IF(SEQUENCE(1,COLUMNS(h_acc))<COLUMNS(h_acc),,", "&INDEX(people,i,j)),
    
    // otherwise, create a new string
    
       HSTACK(h_acc,INDEX(people,i,j))))