as per example:
A B C D E F G ∞
|======|=======|=====|=====|=====|=====|=====|=====
1 | |AVERAGE| | | | | |
|======|=======|=====|=====|=====|=====|=====|=====
2 | xx 1 | | 1 | 2 | 0.5 | 10 | |
|======|=======|=====|=====|=====|=====|=====|=====
3 | xx 2 | | 7 | 1 | | | |
|======|=======|=====|=====|=====|=====|=====|=====
4 | | | 0 | | | | |
|======|=======|=====|=====|=====|=====|=====|=====
5 | xx 3 | | 9 | 8 | 7 | 6 | |
|======|=======|=====|=====|=====|=====|=====|=====
6 | xx 4 | | 0 | 1 | 2 | 1 | |
|======|=======|=====|=====|=====|=====|=====|=====
7 | | | 1 | | 4 | | |
|======|=======|=====|=====|=====|=====|=====|=====
8 | xx 5 | | | | | | |
|======|=======|=====|=====|=====|=====|=====|=====
9 | | | | | | | 5 |
|======|=======|=====|=====|=====|=====|=====|=====
∞ | | | | | | | |
what's the most optimal way of getting AVERAGE
for every valid row in the dynamic sense of terms (unknown quantity of rows & unknown quantity of columns) ?
if you are here by accident for running / cumulative / rolling average see: https://stackoverflow.com/a/59120993/5632629
if all 5 cells in range C2:G have values:
=QUERY(QUERY(C2:G, "select (C+D+E+F+G)/5"), "offset 1", )
if not, then rows are skipped:
if empty cells are considered as zeros:
=INDEX(QUERY(QUERY({C2:G*1}, "select (Col1+Col2+Col3+Col4+Col5)/5"), "offset 1", ))
to remove zero values we use IFERROR(1/(1/...))
wrapping:
=INDEX(IFERROR(1/(1/QUERY(QUERY({C2:G*1},
"select (Col1+Col2+Col3+Col4+Col5)/5"), "offset 1", ))))
to make Col
references dynamic we can do:
=INDEX(IFERROR(1/(1/QUERY(QUERY({C2:G*1},
"select "&
"("&JOIN("+", "Col"&ROW(INDIRECT("1:"&COLUMNS(C:G))))&")/"&COLUMNS(C:G)),
"offset 1", ))))
if empty cells are not considered as zeros and shouldn't be skipped:
=INDEX(TRANSPOSE(QUERY(TRANSPOSE(E2:I),
"select "&TEXTJOIN(",", 1, IF(A2:A="",,
"avg(Col"&ROW(A2:A)-ROW(A2)+1&")")))),, 2)
note that this is column A dependant, so missing values in column A will offset the results
fun fact !! we can swap avg
to max
or min
:
to free it from confinement of column A and make it work for any valid row:
=INDEX(IFERROR(1/(1/TRANSPOSE(QUERY(TRANSPOSE(
IF(TRIM(TRANSPOSE(QUERY(TRANSPOSE(C2:G),,9^9)))="", C2:G*0, C2:G)),
"select "&TEXTJOIN(",", 1,
"avg(Col"&ROW(A2:A)-ROW(A2)+1&")"))))),, 2)
if present 0's in range shouldn't be averaged we can add a small IF statement:
=INDEX(IFERROR(1/(1/TRANSPOSE(QUERY(TRANSPOSE(
IF(TRIM(TRANSPOSE(QUERY(TRANSPOSE(
IF(C2:G>0, C2:G, )),,9^9)))="", C2:G*0,
IF(C2:G>0, C2:G, ))),
"select "&TEXTJOIN(",", 1,
"avg(Col"&ROW(A2:A)-ROW(A2)+1&")"))))),, 2)
here we used so-called "vertical query smash" which takes all values in a given range and concentrates it to one single column, where all cells per each row are joined with empty space as a byproduct:
=FLATTEN(QUERY(TRANSPOSE(C2:G),,9^9))
apart from this, there is also "horizontal query smash":
=QUERY(C2:G,,9^9)
and also "ultimate 360° double query smash" which puts all cells from range into one single cell:
=QUERY(FLATTEN(QUERY(TRANSPOSE(C2:G),,9^9)),,9^9)
and finally "the infamous negative 360° reverse double query smash" which prioritizes columns over rows:
=QUERY(FLATTEN(QUERY(C2:G,,9^9)),,9^9)
all query smash names are copyrighted of course
back to the topic... as mentioned above all cells per row in range are joined with empty space even those empty ones, so we got a situation where we getting double or multiple spaces between values. to fix this we use TRIM
and introduce a simple IF
statement to assign 0 values for empty rows in a given range eg. to counter the offset:
MMULT
is a kind of heavy class formula that is able to perform addition, subtraction, multiplication, division even running total on arrays/matrixes... however, bigger the dataset = slower the formula calculation (because in MMULT
even empty rows take time to perform + - × ÷
operation) ...unless we use truly dynamic range infinite in both directions...
to get the last row with values of a given range:
=INDEX(MAX(IF(TRIM(FLATTEN(QUERY(TRANSPOSE(
INDIRECT("C2:"&ROWS(A:A))),,9^9)))="",,ROW(A2:A))))
to get the last column with values of a given range:
=INDEX(MAX(IF(TRIM(QUERY(INDIRECT("C2:"&ROWS(A:A)),,9^9))="",,COLUMN(C2:2))))
now we can construct it in a simple way:
=INDIRECT("C2:"&ADDRESS(9, 7))
which is the same as:
=INDEX(INDIRECT("C2:"&ADDRESS(MAX(IF(TRIM(FLATTEN(QUERY(TRANSPOSE(
INDIRECT("C2:"&ROWS(A:A))),,9^9)))="",,ROW(A2:A))),
MAX(IF(TRIM(QUERY(INDIRECT("C2:"&ROWS(A:A)),,9^9))="",,COLUMN(C2:2))))))
or shorter alternative:
=INDEX(INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2)))))
therefore simplified MMULT formula would be:
=ARRAYFORMULA(IFERROR(
MMULT(N( C2:G9), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)/
MMULT(N(IF(C2:G9<>"", 1, )), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)))
in case we want to exclude zero values from range, the formula would be:
=ARRAYFORMULA(IFERROR(
MMULT(N( C2:G9), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)/
MMULT(N(IF(C2:G9>0, 1, )), ROW(INDIRECT("C1:C"&COLUMNS(C:G)))^0)))
putting together all above to make it infinitely dynamic and still restricted to valid dataset:
=INDEX(IFERROR(
MMULT(N( INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))))), ROW(INDIRECT("C1:C"&
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))-(COLUMN(C2)-1)))^0)/
MMULT(N(IF(INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))))<>"", 1, )), ROW(INDIRECT("C1:C"&
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))-(COLUMN(C2)-1)))^0)))
again, not including cells with zeros in range:
since 20 September 2022, we can use new functions that make stuff easier:
so to jump right in for a closed range we can take an average like:
=IFERROR(BYROW(C2:G9, LAMBDA(x, AVERAGE(x))))
and to get an average column-wise we just replace BYROW with BYCOL. now to make the range open and truly dynamic we can modify the above formula like this:
=IFERROR(BYROW(INDEX(INDIRECT("C2:"&ADDRESS(
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*ROW(A2:A)),
MAX((INDIRECT("C2:"&ROWS(A:A))<>"")*COLUMN(C2:2))))), LAMBDA(x, AVERAGE(x))))
we can do it shorter by 12 characters like:
=IFERROR(BYROW(INDEX(OFFSET(C2,,,
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*ROW(C2:C)),
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*COLUMN(C2:2)))), LAMBDA(x, AVERAGE(x))))
to exclude zeros from the output:
=INDEX(IFERROR(1/(1/BYROW(OFFSET(C2,,,
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*ROW(C2:C)),
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*COLUMN(C2:2))), LAMBDA(x, AVERAGE(x))))))
to exclude zeros from input:
=INDEX(IFERROR(1/(1/BYROW(OFFSET(C2,,,
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*ROW(C2:C)),
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*COLUMN(C2:2))), LAMBDA(x, AVERAGEIF(x, ">0"))))))
or if blank cells should be treated as zeros:
=INDEX(IFERROR(1/(1/BYROW(1*OFFSET(C2,,,
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*ROW(C2:C)),
MAX((INDIRECT("C2:"&ROWS(C:C))<>"")*COLUMN(C2:2))), LAMBDA(x, AVERAGE(x))))))
also, it's worth mentioning the BYROW limitation of ~ 99990 rows
the polar opposite of the previous formula would be to run the MMULT
on
C2:?
(all rows, all columns)
instead ofC2:?
(excluding empty rows and columns)
which avoids mass-calculations of 0 × 0 = 0
including zeros:
=INDEX(IFERROR(
MMULT( INDIRECT("C2:"&ROWS(C:C))*1, SEQUENCE(COLUMNS(C2:2))^0)/
MMULT(IF(INDIRECT("C2:"&ROWS(C:C))<>"", 1)*1, SEQUENCE(COLUMNS(C2:2))^0)))
excluding zeros:
=INDEX(IFERROR(
MMULT( INDIRECT("C2:"&ROWS(C:C))*1, SEQUENCE(COLUMNS(C2:2))^0)/
MMULT(IF(INDIRECT("C2:"&ROWS(C:C))>0, 1)*1, SEQUENCE(COLUMNS(C2:2))^0)))
for a fixed range C2:G9
the MMULT
average would be:
=INDEX(IFERROR(
MMULT( C2:G9*1, FLATTEN(COLUMN(C:G))^0)/
MMULT((C2:G9>0)*1, FLATTEN(COLUMN(C:G))^0)))
=INDEX(IFNA(VLOOKUP(ROW(C2:C),
QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&C2:J), "×"),
"select Col1,avg(Col2)
where Col2 is not null
group by Col1"), 2, )))
=INDEX(QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&OFFSET(C2,,,9^9, 9^9)), "×"),
"select avg(Col2)
group by Col1
label avg(Col2)''"))
excluding zeros:
=INDEX(QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&OFFSET(C2,,,9^9, 9^9)), "×"),
"select avg(Col2)
where Col2 <> 0
group by Col1
label avg(Col2)''"))
including empty cells:
=INDEX(IFERROR(1/(1/QUERY(SPLIT(FLATTEN(ROW(C2:C)&"×"&OFFSET(C2,,,9^9, 9^9)*1), "×"),
"select avg(Col2)
group by Col1
label avg(Col2)''"))))