Search code examples
google-sheetslambdagoogle-sheets-formula

Google Sheets ARRAY_ROW issue with sorting a pass-through LAMBDA. Additional empty rows cause formula to error


Recently I was able to figure out how to accurately pass more complex arguments into a LAMBDA via Passing QUERY() as a parameter to a Google Sheets LAMBDA function. I wanted to take that knowledge and apply it elsewhere in my dataset. I found a good candidate where I wanted to programmatically reverse sort a range based on a key column.

The problem is that everything is programmatically generated/retrieved via formula (yellow cells F2:H2). This is very easy to do with the menu sort method, but in order to do this programmatically, I believe I needed to use a LAMBDA (pass-through because I pass the parameters through without any modification within the lambda).

The problem is that this only works when I don't have any additional rows in my sheet. If I add just one more line, it all errors.

Working Code

Every items is correctly sorted and displayed as expected. Reverse sort based on date and then reverse sort based on key. You can notice there are 52 lines in total with no blank lines in my sheet.

enter image description here

In L2 I, have the following code...

=sort(arrayformula(
  lambda(date, key, values, {date, key, values})
  (
    xlookup(G2:G, 'Designation Date Raw'!$A$2:$A, 'Designation Date Raw'!$B$2:$B),
    query('Alias Key Raw'!A:B, "Select A Where A Starts with 'X'", 0),
    split(regexreplace(query('Alias Key Raw'!A:B, "Select B Where A Starts with 'X'", 0), "\*", ""), ","))
  ), 
1, false, 2, false)
// F2
=iferror(arrayformula(xlookup(G2:G, 'Designation Date Raw'!$A$2:$A, 'Designation Date Raw'!$B$2:$B)))
// G2
=query('Alias Key Raw'!A:B, "Select A Where A Starts with 'X'", 0)
// H2
=arrayformula(split(regexreplace(query('Alias Key Raw'!A:B, "Select B Where A Starts with 'X'", 0), "\*", ""), ","))

Added One Additional Line

No change except for an additional Line 53

Error
Function ARRAY_ROW parameter 2 has mismatched row size. Expected: 52. Actual: 51.

enter image description here

Additional Information

My intention is to replace the 3 formulas in F2:H2 with the single formula in L2. Also, all code works as expected, but I don't want to truncate my sheet to remove all empty rows.

Existing Quesiton Difference

This question is different from Error: Function ARRAY_ROW parameter 2 has mismatched row size. Expected: 1. Actual: 10434 as my code works, and it errors due to external factors of the formula such as adding a new sheet row. Also this is using the new LAMBDA function which operates differently


Solution

  • I would get the ARRAY_ROW error simply because there was a mismatch of rows for each input column. While there was the same number of non-empty rows, the fact that I used G2:G for one column of data and query(...) for other columns is the source of the error. The G2:G column had additional blank rows of data that were unaccounted for.

    By simply replacing G2:G with the equivalent query(...) function, the error will go away no matter how many additional empty rows are added to the worksheet.

    enter image description here

    =SORT(arrayformula(
      lambda(date, key, values, {date, key, values})
      (
        xlookup(query('Raw AK'!A:B, "Select A Where A Starts with 'X'", 0), 'Designation Date Raw'!$A$2:$A, 'Designation Date Raw'!$B$2:$B),
        query('Raw AK'!A:B, "Select A Where A Starts with 'X'", 0),
        split(regexreplace(query('Raw AK'!A:B, "Select B Where A Starts with 'X'", 0), "\*", ""), ","))
      ), 
    1, false, 2, false)
    

    Note: The reason G2:G worked when there were no additional blank rows in the worksheet was simply because the 2 different data sets (G2:G and query(...) had the same number of rows).

    Lambda's that avoid ARRAY_ROW errors

    One can expand on the use of lambdas to ensure that ARRAY_ROW errors are much more difficult to produce. This works when the data has a one-to-one relationship.

    With lambda's, you have to pass in the values to be used within the lambda function. By using a lambda inside another lambda's parameter list, one can force a single dataset that can be used in all calculations.

    enter image description here As you can see, I only have a single query(...) function now. All underlying data is based on this one single function call. I can mutate this data with other data from other sheets, but it's all still based on the query(...) call.

    enter image description here I take the query_data and split it up into each individual column and perform additional calculations on those columns. Notice, I can add as many

    =reverse(arrayformula(
      lambda(data, 
        {
          lambda(recombinant_alias, designation_date, csv_unalised_PANGO_lineage, csv_partial_PANGO_lineage, csv_PANGO_lineage, 
            {
              recombinant_alias, 
              designation_date,
              split({switch_array(AS1, 
                "PANGO Lineages", csv_PANGO_lineage, 
                "Partial PANGO Lineages", csv_partial_PANGO_lineage,
                "Unaliased PANGO Lineages", csv_unalised_PANGO_lineage
              )}, ", ")
            }) (getcolumn(data, 0), getcolumn(data, 1), getcolumn(data, 2), getcolumn(data, 3), getcolumn(data, 4))
        })
      (
        lambda(query_data, 
          {
            getcolumn(query_data, 0),
            xlookup(getcolumn(query_data, 0), 'Designation Date Raw'!$A$2:$A, 'Designation Date Raw'!$B$2:$B),
            getcolumn(query_data, 1),
            multireplace(getcolumn(query_data, 1), {"B.1.1.529", "BA"}),
            multireplace(getcolumn(query_data, 1), {'Raw AK'!B2:B, 'Raw AK'!A2:A})
          }) 
          (
            query('Processed AK'!A:B, "Select * Where A Starts with 'X' and B contains 'B.1.1.529'", 0)
          )
        )
    ))
    

    enter image description here Here is some proof this formula works as expected.

    Note: I used Google App Script to create custom functions getcolumn(...), switch_array(...), and multireplace(...)

    switch_array(...) is emulating the built-in switch(...) function but works on arrays.

    getcolumn(...) will retrieve a specific column of data from an array

    multireplace(...) is a non-regex-based multiple, text-replacement function that works on arrays of data.