google-sheets google-sheets-formula calculated-columns xlookup

Looking for formula to most recent value in COL A when data in COL B is text

I have a google sheet where

col A is an imported list of TRUE/FALSE values
col B is a series of incrementing integers which resets as T->F or F->T
Col C is either an integer that is unrelated to col B OR it is blank OR it is one of a few string values.
col D is equal to colB when col C is blank or an integer OR static at the most recent integer value of col B when col C is a string OR continues counting up from the static value until col B resets again
When col A is FALSE, col C will be null and cols B and D will match

I have tried brute force. I have tried XLOOKUP. I simply can't figure out how to make col D come out accurate during and after the TEXT skips.

D2=XLOOKUP(1000, $C$2:indirect("C"&row()), $B$2:indirect("B"&row()), "calculate normally", -1, -1)

I think XLOOKUP will likely end up being the answer, but I've spent all evening on it and all I got was a bloody forehead. Obviously, in a perfect world, I'll be able to wrap this in an ARRAYFORMULA and have it populate down

Solution

SOLUTION

For column B you can use:

=SCAN(1,SEQUENCE(ROWS(A3:A)),LAMBDA(v,a,IF(INDEX(A3:A,a)=INDEX(A2:A,a),v+1,1)))

And for column D:

    =SCAN(0,SEQUENCE(ROWS(A3:A)),LAMBDA(v,a,
IF(INDEX(A3:A,a)=FALSE,INDEX(B3:B,a),
IF(INDEX(A2:A,a)=FALSE,1,
IF(ISTEXT(INDEX(C3:C,a)),v,v+1)))))

EXPLANATION:

You asked me to get you through the formulas:

LAMBDA in this case is an auxiliary function that lets you name needed variables in certain formulas that require more than one and a sequential process of several values. For example BYROW, BYCOL, REDUCE and, in this case, SCAN. The basic process is that you stablish names of variables in the first argument/s and then you set a formula that uses them. Yes, you use it by itself (not as auxiliary of others) in order to simplify formulas. If you know about named functions, it would be something similar. You stablish variables, a formula that uses them and then create a new parentheses with the values or cells that are included --> read LAMBDA for further explanation and examples.
SCAN lets you process an array of values sequentially returning a "result" value and letting you use it for calcultions with the next one. The first argument (named by LAMBDA) is the aggregator (where you "save" that last calculated value) and the second one is the array which will be processed one by one. For example, if you use =SCAN(0,A1:A10,LAMBDA(a,v,a+v)) you'll have a cumulative sum, if you use =SCAN(1,A1:A10,LAMBDA(a,v,a*v)) :

For more information read: SCAN - in your particular case, since you had to compare values from different cells of different rows or different columns, I decided to use it in a not very conventional way: instead of giving a group of cells or values, I gave it a sequence of numbers that represents the amount of rows in between A3 and last row in A: SEQUENCE(ROWS(A3:A)) and combined with INDEX it let me in the formula of B and in the formula of D to go through each of the rows of the needed columns to fetch the conditions set by you. Yes, probably with OFFSET you could have solved it too, but to use a similar approach in both formulas I inclined for this. For example in B column: INDEX(A3:A,a)=INDEX(A2:A,a) compares the values from a cell an the one that is one cell up (observe that in second member of equation it starts from A2 instead of A3. And in Column D I can return the value from column B if A is false with INDEX(B3:B,a) and check if C is text with ISTEXT(INDEX(C3:C,a)) - all with that value of the sequence considered each time.

I think that is probably already clear but in B formula I stated that if the cell in A is equal to the one above it sums one to the previous "saved" value (called "v" in my lambda function) or else it "starts" again from 1. In D column you also have those conditionals set, I think they're clearer because of the organizations in different lines. (Hope it's useful despite my English!)