Search code examples
excelcalculated-columns

How to replicate a single column of data from one excel spreadsheet to another based on the contents of a cell


I have an array in a worksheet (S1) with approximately 100 columns and over 7500 rows. Cells in the first row of S1 (A1, B1, ...) have header identifiers. Column A contains timestamps applicable to the data in the row they're in.

Data are periodically added to S1 typically by inserting a new timestamp in S1!A2 and newly sampled data in the rest of row 2 thus increasing the number of rows by one each update.

In another worksheet (S2) I have:

S2!A2=AVERAGE(B:B)
S2!A3=STDEV.S(B:B)
S2!C1='Z-SCORE'
S2!Ci=(Bi-$A$2) * $A$3

for each i >= 2. Finally, S2!A1 contains text entered by a user. For example, 'P_ID1'

I need a formula that looks up the cell in S1 row 1 that contains the content of S2!A1 and replicates the column in which the heading is found into S2!B:B. It should handle dynamic changes and should not contain 0 values from empty cells in S1. In other words, it should return a dynamic array. Here is an example that does not work.

S2!B1=S1!B:B

This is a dynamic array, but it fills in more than 900,000 cells mostly with the value 0 which causes the average and standard deviation formulas to fail. It also does not search for the correct column to duplicate. A better example is

S2!B1=FILTER(S1!B:B, ISNUMBER(S1!B:B) + ISTEXT(S1!B:B))

In this example, S2!B:B is populated with the right data and is dynamic. However, it also does not depend on S2!A1 which is required.

A third (failing) example:

S2!B1=HLOOKUP(S2!$A$1, S1!$A$1:$??, 1)
S2!B2=HLOOKUP(S2!$A$1, S1!$A$1:$??, 2)
...

These formulas correctly return the right column, but fails in the obvious way. The value that replaces ?? must be already known which makes the resulting column not dynamic and depends on both the number of columns and rows in S1. Also, it is computationally intensive and somewhat difficult to construct.

The array in S1 could be transposed, but I'm not sure that VLOOKUP would be much better than HLOOKUP and charting the data would still be difficult as well.

It is possible, but not realistic to create a worksheet for each label (about 100 of them), to replicate all the arithmetic across those worksheets and render the data with about 100 charts. I would like to find a way to have a cell in one worksheet (S2) as described above into which the user enters the header name of the column for which analysis is desired and then to have the correct column duplicated in S2. This also greatly simplifies charting.

I looked at XLOOKUP, INDEX, ADDRESS, CHOOSE, MATCH, FILTER, SORT and many other methods and haven't found an answer.


Solution

  • One answer I found by experimentation (which looks inefficient, but it works) is below. The essential trick is to use ADDRESS and INDIRECT to construct the dynamic array. (For brevity if not clarity I removed the sheetname S2 in formulas in S2.)

    In my original post I had:

    A2=AVERAGE(B:B)
    A3=STDEV.S(B:B)
    C1='Z-SCORE'
    Ci=(Bi-$A$2) * $A$3
    

    For clarity in the answer I add helper cells and move the locations of the calculation of the mean and standard deviations. The complete answer including the calculation of mean, standard deviation and zscores involves formulas in worksheet S2 and is shown below:

    A2=IF(ISBLANK($A$1), "Enter name of required data above", MATCH($A$1,S1!1:1,0))
    A3=IF(ISBLANK($A$1), "", SUBSTITUTE(ADDRESS(1,$A$2,4),"1",""))
    A4=IF(ISBLANK($A$1), "", "S1!" & $A$3 & ":" & $A$3)
    A5=IF(ISBLANK($A$1), "", AVERAGE(B:B))
    A6=IF(ISBLANK($A$1), "", STDEV.S(B:B))
    
    B1=IF(ISBLANK($A$1), "", FILTER(INDIRECT($A$4), ISNUMBER(INDIRECT($A$4) + ISTEXT(INDIRECT($A$4))
    
    C1="Z_SCORE"
    Ci=IF(ISBLANK($A$1), "", ($Bi-$A$5) / $A$6)
    

    I would still very much appreciate a more efficient answer.