Search code examples
google-sheetsgoogle-sheets-formula

exclusion of stop words with REGEXEXTRACT in Google Sheets


Now I have a formula (cell D2) that extracts the surname and first name of a chess player from raw data (column A). But it doesn't fully work if there are some words in front of the last name that should not be output (blacklisted in column B). I need to add a stop-list of these words to the existing formula, so that only the last name and first name before the "long dash" sign remain. File: https://docs.google.com/spreadsheets/d/1IlzIbHGAe4L7ts-VeQMfkuJaRAE2ErY_pb-s-aLc5YM/edit#gid=0

=ARRAY_CONSTRAIN(ARRAYFORMULA(REGEXEXTRACT(A2:A6;"[0-9]+(.+) –"));COUNTA(A2:A6);1)

enter image description here


Solution

  • You may try:

    =map(A2:A;lambda(Σ;if(Σ="";;trim(regexextract(Σ;"^\d+(?:"&textjoin("|";1;sort(B2:B;len(B2:B);))&")?(.*) –")))))
    

    enter image description here