I want to check whether a letter belongs to the Russian alphabet. I can do it by direct comparison with the Cyrillic letters:
letter in ('А', 'Б', 'В', 'Г', 'Д', 'Ж', ...)
Is there a simpler approach? E.g., for the English alphabet I could use the rank() function:
rank('A') <= rank(letter) <= rank('z')
But this function doesn't work with UTF-8 encoding. How can I get the position of a letter in the UTF-8 table?
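I believe you could use the UNICODEC
function. This will convert the character to its Unicode escape (e.g. \u0411 for Б). ASCII characters come back unchanged, so comparing the result with the original flags every character that had to be escaped:
data test;
input char $;
datalines;
Б
Г
Д
Ж
a
b
c
;
run;
data test;
set test;
/* ok=1 when the character is unchanged (ASCII), ok=0 when it was escaped */
ok = (char=unicodec(char,'ESC'));
put char= ok=;
run;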
Returns:
char=Б ok=0
char=Г ok=0
char=Д ok=0
char=Ж ok=0
char=a ok=1
char=b ok=1
char=c ok=1
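Note that ok=0 only says the character is not ASCII, not that it is Russian. To get at the actual position of the letter, the escape can be turned into a number and range-checked. The following is a rough sketch, not tested here: it assumes a UTF-8 session, that UNICODEC with 'ESC' returns a 6-character string of the form \uXXXX for these letters, and the variable names esc, code and is_russian are made up for illustration. The Russian letters А..я sit at U+0410..U+044F (1040..1103), with Ё and ё separately at U+0401 and U+0451 (1025 and 1105).

```sas
data check;
  set test;
  length esc $ 16;
  /* Unicode escape of the character, assumed to look like \u0411 */
  esc = unicodec(char, 'ESC');
  is_russian = 0;
  /* Only decode values that really are a single \uXXXX escape */
  if lowcase(substr(esc, 1, 2)) = '\u' and length(esc) = 6 then do;
    /* Convert the four hex digits to the numeric code point */
    code = input(upcase(substr(esc, 3, 4)), hex4.);
    /* А..я is 1040..1103; Ё and ё are 1025 and 1105 */
    is_russian = (1040 <= code <= 1103) or code in (1025, 1105);
  end;
  put char= esc= is_russian=;
  drop code;
run;
```

Going through the numeric code point avoids relying on string ordering of the escapes, and it keeps Ё/ё, which lie outside the contiguous range, as an explicit exception.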