Search code examples
google-sheetsconditional-statementscharactersubstitution

How do I substitute based on character and position in word


I have one word per cell. I need to substitute characters with other characters based on a range of conditions, as follows.

Condition 1 - if the word contains an 'l' double it to 'll'. Condition 2 - if the first vowel in the word is an 'e', split the word with an apostrophe after said 'e'. Condition 3 - the last vowel of each word becomes an 'i'. Condition 4 - if the word ends in 'a','e','i','o', add an m to the end.

Ideally, I'd like them all to work in one formula, but each working separately would suffice. I can apply in a chain, cell to cell.

Condition 1 - SUBSTITUTE(SUBSTITUTE(E2,"l","ll"),"L","Ll") This is successful.

Condition 2 - SUBSTITUTE("e","e'",1) Applies to every 'e', rather than only when it is the first vowel in the word.

Together, these work as =SUBSTITUTE(SUBSTITUTE(SUBSTITUTE(E2,"l","ll"),"L","Ll"),"e","e'",1)

Condition 3 - NO CURRENT FORMULA

Condition 4 - IF(RIGHT(TRIM(F2),1)="a",F2&"m",F2&"") Works for a single letter (in this case "a"), but not for all required letters at once.


Solution

  • Use regexreplace(), like this:

    =lambda(
      data, regexes, replaceWith, 
      byrow( 
        data, 
        lambda( 
          word, 
          if( 
            len(word), 
            reduce( 
              trim(word), sequence(counta(regexes)), 
              lambda( 
                acc, regexIndex, 
                regexreplace( 
                  acc, 
                  "(?i)" & index(regexes, regexIndex), 
                  index(replaceWith, regexIndex) 
                ) 
              ) 
            ), 
            iferror(1/0)  
          ) 
        ) 
      ) 
    )( 
      A2:A10, 
      { "l", "^([^aeiou]*)(e)", "[aeiou]([^aeiou]*)$", "([aeio])$" }, 
      { "ll", "$1e-", "i$1", "$1m" }  
    )
    

    The formula will only deal with lowercase letters because that is what is specified by the question. To replace uppercase letters as well, prefix the first index() with "(?i)" & . Note that case will not be retained.