utf-8 sas string-function

Check that a letter belongs to the Russian alphabet


I want to check that a letter belongs to the Russian alphabet. I can do it by direct comparison with the Cyrillic letters:

letter in ('А', 'Б', 'В', 'Г', 'Д', 'Ж', ...)

Is there a simpler approach? E.g. for the English alphabet I could use the rank() function:

rank('A') <= rank(letter) <= rank('z')

But this function doesn't work with UTF-8 encoding. How can I get the position of a letter in the UTF-8 table?


Solution

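  • I believe you could use the UNICODEC function. With the 'ESC' option it converts each non-ASCII character to its Unicode escape and leaves ASCII characters unchanged, so comparing the result with the original separates the two groups.

    data test;
    input char $;
    datalines;
    Б
    Г
    Д
    Ж
    a
    b
    c
    ;
    run;
    
    data test;
    set test;
    /* ok=1 when the escape leaves the character unchanged, i.e. it is plain ASCII */
    ok = (char=unicodec(char,'ESC'));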
    put char= ok=;
    run;
    

    Returns:

    char=Б ok=0
    char=Г ok=0
    char=Д ok=0
    char=Ж ok=0
    char=a ok=1
    char=b ok=1
    char=c ok=1
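
    The ok flag above only separates ASCII from non-ASCII characters. To test for Russian letters specifically, one option is to read the code point out of the escape and compare it with the Cyrillic range. The following is a minimal sketch, not a verified solution: it assumes a UTF-8 SAS session and that UNICODEC with 'ESC' returns the form \uXXXX (four hex digits) for a non-ASCII character. The data set name letters and the variables esc, codepoint and is_russian are illustrative.

    ```sas
    data letters;
    input char $;
    datalines;
    Б
    ё
    a
    ;
    run;

    data letters;
    set letters;
    length esc $ 16;
    /* Assumption: a non-ASCII character comes back as \uXXXX */
    esc = unicodec(strip(char), 'ESC');
    is_russian = 0;
    if substr(esc, 1, 2) = '\u' then do;
        /* Read the four hex digits after \u as a number */
        codepoint = input(substr(esc, 3, 4), hex4.);
        /* U+0410..U+044F (1040..1103) covers А..я;            */
        /* Ё (U+0401 = 1025) and ё (U+0451 = 1105) lie outside */
        is_russian = (1040 <= codepoint <= 1103) or codepoint in (1025, 1105);
    end;
    put char= esc= is_russian=;
    run;
    ```

    Under these assumptions, is_russian should be 1 only for the 33 letters of the Russian alphabet in either case, and 0 for Latin letters and for Cyrillic letters used only in other languages.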