Search code examples
google-sheetsgoogle-sheets-formula

In a google sheets query, replace all diacritic caracters with a formula


I would like to be able to search with "contains" in a query, but without being case or diacritics sensitive. For the case sensitivity, I've found how to fix that, but for the diacritics, it's pretty difficult.

My plan here is to use an arrayformula in the range of the query request to replace all diacritics caracters by their corresponding non-diacritics caracters (they are in a two columns range I c/c from this project https://docs.google.com/spreadsheets/d/1cJ4YmvkvmSlUr9mXGR9HadyJanQWw6DXbNFuCApgIr4/edit#gid=0). I stored this in the range R2:S342.

But to construct this arrayformula is pretty difficult.

I am testing it in the range of data P5:P8, containing these 4 cells: café àéïôù àéïôù àéïôù

I would expect my formula to give me : cafe aeiou aeiou aeiou

But what I actually have is : ceio ceio ceio ceio

Which is the first caracter of the first cell, then the second caracter of the second cell, then the third caracter of the third cell, etc...

Here is the formula I have right now :

=ArrayFormula(
   If(
      IsBlank(P5:P8); "";
      Join("";
         iferror(
            Vlookup(
               Mid(P5:P8; Row(Indirect("1:"&Len(P5:P8))); 1);
               R2:S342;
               2;
               False
            );
            Mid(P5:P8; Row(Indirect("1:"&Len(P5:P8))); 1)
         )
      )
   )
)

When I do it on a single cell, it works ("café" -> "cafe"), but not on a range.

How could I make it work the way I want ?

Thank you.


Solution

  • Here's one approach:

    =let(Λ,reduce(wraprows(,4,),sequence(2),lambda(a,c,{a;importhtml("https://docs.oracle.com/cd/E29584_01/webhelp/mdex_basicDev/src/rbdv_chars_mapping.html","table",c)})),
       map(A:A,lambda(Σ,if(Σ="",,index(let(Δ,mid(Σ,sequence(len(Σ)),1),
               join(,xlookup(Δ,index(Λ,,2),index(Λ,,3),Δ))))))))
    
    • this is the source I used as a backend data dump for the diacritic-character mapping but you are welcome to use a different\robust one just in case...

    enter image description here

    Updated formula (to take into account case-senitivity):

    =map(P5:P8;lambda(Σ;if(Σ="";;index(let(Δ;mid(Σ;sequence(len(Σ));1);
               join(;map(Δ;lambda(Λ;xlookup(1;--exact(R2:R342;Λ);S2:S342;Λ)))))))))
    

    enter image description here