I'm trying to keep lines that contain the word "NOA" in a column A which has many multi-line cells as can be viewed in this Google Spreadsheet.
If "NOA" is present then, I would like to keep the line. The input and output should look like the image which I have "working" with too-many helper cells. Can this be combined into a single formula?
Theoretical Approaches: I have been thinking about three approaches to solve this:
Practical attempts:
FIRST APPROACH: first I attempted using REGEXEXTRACT to extract out everything that did not have NOA in it, the Regex worked in demo but didn't work properly in sheets. I thought this might be a concise way to get the value, perhaps if my REGEX skill was better?
ARRAYFORMULA(REGEXREPLACE(A1:A7, "^(?:[^N\n]|N(?:[^O\n]|O(?:[^A\n]|$)|$)|$)+",""))
I think the Regex because overly complex, didn't work in Google or perhaps the formula could be improved, but because Google RE2 has limitations it makes it harder to do certain things.
SECOND APPROACH:
Then I came up with an alternate approach which seems to work 2 stages (with multiple helper cells) but I would like to do this with one equation.
=TRANSPOSE(split(A2,CHAR(10)))
=TEXTJOIN(CHAR(10),1,FILTER(C2:C7,REGEXMATCH(C2:C7,"NOA")))
Questions:
The shared Google spreadhseet is here.
Thank you in advance for your help.
Here's one way you can do that:
=index(substitute(substitute(transpose(trim(
query(substitute(transpose(if(regexmatch(split(
filter(A2:A,A2:A<>""),char(10)),"NOA"),split(
filter(A2:A,A2:A<>""),char(10)),))," ","❄️")
,,9^9)))," ",char(10)),"❄️"," "))
First, we split the data by the newline (char 10), then we filter out the lines that don't contain NOA and finally we use a "query smush" to join everything back together.