Search code examples
excelreplaceexcel-formuladiacriticsexcel-2019

Diacritic normalization in excel formula, without vba


How to replace diacritics (accented characters) with corresponding alphabetical character with regular formula: =SUBSTITUTE(A1,"é","e"), without VBA, for large range of uppercase and lowercase diacritics like this:

áàȧäǎāãåąⱥấầắằǡǻǟẫẵảȁȃẩẳạḁậặăâ =a

ÁÀȦÄǍĀÃÅĄȺẤẦẮẰǠǺǞẪẴẢȀȂẨẲẠḀẬẶĂÂ = A

ḃƀɓḅḇ =b

ḂɃƁḄḆ = B

ćċĉčçȼḉƈɔ =c

ĆĊĈČÇȻḈƇƆ = C

ḋďḑđɗḍḓḏðɖ =d

ḊĎḐĐƊḌḒḎÐƉ =D

etc.


Solution

  • With Excel 2019 (the version you are using) this is quite a stretch. I'd suggest something along the following lines:

    enter image description here

    Formula in C1:

    =CONCAT(IFERROR(IF(EXACT(UPPER(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)),MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)),UPPER(MID(A2,SEARCH(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1),A1),1)),MID(A2,FIND(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1),A1),1)),MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)))
    

    Understandig the limit of SEARCH() (255 chars) I opted to only include the lowercase variants and add the extra needed intelligence inside the formulae through functions like IF(), EXACT() and UPPER(). Again, this is an CSE-entered array formula.


    EDIT: Without helper:

    =CONCAT(IFERROR(IF(EXACT(UPPER(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)),MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)),UPPER(MID("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbcccccccccdddddddddd",SEARCH(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1),"áàȧäǎāãåąⱥấầắằǡǻǟẫẵảȁȃẩẳạḁậặăâḃƀɓḅḇćċĉčçȼḉƈɔḋďḑđɗḍḓḏðɖ"),1)),MID("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbcccccccccdddddddddd",FIND(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1),"áàȧäǎāãåąⱥấầắằǡǻǟẫẵảȁȃẩẳạḁậặăâḃƀɓḅḇćċĉčçȼḉƈɔḋďḑđɗḍḓḏðɖ"),1)),MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)))
    

    EDIT2: With all lowercase diacritics:

    =CONCAT(IFERROR(IF(EXACT(UPPER(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)),MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)),UPPER(MID(CONCATENATE("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbcccccccccddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeffggggggggghhhhhhhhhiiiiiiiiiiiiiiiiiiiijjjkkkkkkkklllllllllllll","mmmnnnnnnnnnnnnooooooooooooooooooooooooooooooooooooooooopppprrrrrrrrrrrrsssssssssttttttttttuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuvvwwwwwwxxyyyyyyyyyyyyzzzzzzz"),SEARCH(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1),CONCATENATE("áàȧäǎāãåąⱥấầắằǡǻǟẫẵảȁȃẩẳạḁậặăâḃƀɓḅḇćċĉčçȼḉƈɔḋďḑđɗḍḓḏðɖéèėêëěĕēẽęȩɇếềḗḕễḝẻȅȇểẹḙḛệéèêëḟƒǵġĝǧğḡģǥɠḣĥḧȟḩħḥḫⱨıíìïǐĭīĩįɨḯỉȉȋịḭíìïîȷĵǰḱǩķƙᶄḳḵⱪĺŀľɫⱡļƚłḷḽḻḹꝈ","ḿṁṃńǹṅňñņɲṇṋṉñŋóòȯôöǒŏōõǫőốồɵøṓṑṍȫỗṏǿȭǭỏȍȏơổọớờỡộởợóòôöõṕṗᵽƥʀŕṙřŗɍɽȑȓṛṟṝśṡŝšṥṧṣṩşṫťƭṭʈṱṯⱦţŧúùûüǔŭūũůųűʉǘǜṹṻủȕȗưụṳứừṷṵữửựúùûüṽṿẃẁẇŵẅẉẋẍýỳẏŷÿȳỹɏỷƴỵýźżẑžƶẓẕ")),1)),MID(CONCATENATE("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaabbbbbcccccccccddddddddddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeffggggggggghhhhhhhhhiiiiiiiiiiiiiiiiiiiijjjkkkkkkkklllllllllllll","mmmnnnnnnnnnnnnooooooooooooooooooooooooooooooooooooooooopppprrrrrrrrrrrrsssssssssttttttttttuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuuvvwwwwwwxxyyyyyyyyyyyyzzzzzzz"),FIND(MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1),CONCATENATE("áàȧäǎāãåąⱥấầắằǡǻǟẫẵảȁȃẩẳạḁậặăâḃƀɓḅḇćċĉčçȼḉƈɔḋďḑđɗḍḓḏðɖéèėêëěĕēẽęȩɇếềḗḕễḝẻȅȇểẹḙḛệéèêëḟƒǵġĝǧğḡģǥɠḣĥḧȟḩħḥḫⱨıíìïǐĭīĩįɨḯỉȉȋịḭíìïîȷĵǰḱǩķƙᶄḳḵⱪĺŀľɫⱡļƚłḷḽḻḹꝈ","ḿṁṃńǹṅňñņɲṇṋṉñŋóòȯôöǒŏōõǫőốồɵøṓṑṍȫỗṏǿȭǭỏȍȏơổọớờỡộởợóòôöõṕṗᵽƥʀŕṙřŗɍɽȑȓṛṟṝśṡŝšṥṧṣṩşṫťƭṭʈṱṯⱦţŧúùûüǔŭūũůųűʉǘǜṹṻủȕȗưụṳứừṷṵữửựúùûüṽṿẃẁẇŵẅẉẋẍýỳẏŷÿȳỹɏỷƴỵýźżẑžƶẓẕ")),1)),MID(B1,ROW(A$1:INDEX(A:A,LEN(B1))),1)))