I am trying to write a simple program that removes all 'o' letters from a string. Example:
I love cats
Output:
I lve cats
I wrote the following code:
var
  x: integer;
  text: string;
  text_no_o: string;
begin
  text := 'I love cats';
  text_no_o := '';
  // Pascal strings are indexed from 1
  for x := 1 to Length(text) do
    //writeln(Ord(text[6]));
    if Ord(text[x]) <> 111 then  // 111 = Ord('o')
      text_no_o := text_no_o + text[x];
  write(text_no_o);
end.
When the text is in English, the program works fine. But if I change it to Russian, it prints question marks in the console. Here is the code with small modifications for the Russian text:
var
  x: integer;
  text: string;
  text_no_o: string;
begin
  text := 'Русский язык мой родной';
  text_no_o := '';
  // Pascal strings are indexed from 1
  for x := 1 to Length(text) do
    //writeln(Ord(text[6]));
    if Ord(text[x]) <> 190 then  // 190 = $BE
      text_no_o := text_no_o + text[x];
  write(text_no_o);
end.
The result I get in the console is:
Русский язык м�й р�дн�й
I expect to receive:
Русский язык мй рднй
As I understand it, the problem may be caused by incorrect encoding settings in the console, so I should force Pascal to use CP1252 instead of ANSI.
I am using Free Pascal Compiler version 3.2.0+dfsg-12 for Linux. P.S. I am not allowed to use StringReplace or Pos.
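The string is likely to be UTF-8 encoded, so the Cyrillic о is encoded as two chars, $D0 $BE. Your code removes only the second one, $BE (= 190), and the leftover $D0 is what the console shows as a question mark. You need to remove both chars, but you cannot just test the value of a single char, because its meaning depends on the surrounding chars. For example, $D0 is also the first char of к ($D0 $BA).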
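To check the encoding on your own machine, a small sketch like the one below can print the chars of the letter in hexadecimal. It assumes the source file is saved as UTF-8 and that no {$codepage} directive is used, so the literal is stored byte for byte. The program name ShowBytes is only illustrative.

```pascal
program ShowBytes;
// Assumption: this file is saved as UTF-8 and compiled without a
// {$codepage} directive, so the literal below keeps its raw bytes.
var
  s: string;
  i: integer;
begin
  s := 'о';  // Cyrillic small letter o (U+043E), not the Latin 'o'
  // Print every char of the string as a two-digit hex value
  for i := 1 to Length(s) do
    write('$', HexStr(Ord(s[i]), 2), ' ');
  writeln;
end.
```

If the file really is UTF-8, this should print `$D0 $BE`. A single value would instead suggest a one-byte encoding.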
Here is one way to do it, remembering the current state (outside the letter, or just after its first char):
var
  c: char;
  text: string;
  state: (sOutside, sAfterD0);
begin
  text := 'Русский язык мой родной';
  state := sOutside;
  for c in text do
  begin
    if state = sOutside then
    begin
      if c = #$D0 then // may be the start of the letter
        state := sAfterD0
      else
        write(c); // output this char because it is not part of the letter
    end
    else if state = sAfterD0 then
    begin
      if c = #$BE then
        state := sOutside // finished skipping
      else
      begin
        // chars do not form the letter, so output the skipped char too
        write(#$D0, c);
        state := sOutside;
      end;
    end
  end;
  writeln;
end.
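If you need the result in a string rather than written straight to the console, the same idea can be sketched with an index and a one-char lookahead. This is only a variant under the same assumptions (UTF-8 source file, no {$codepage} directive). The helper name RemovePair is hypothetical, and it uses neither Pos nor StringReplace.

```pascal
program RemoveCyrillicO;
// Assumption: this file is saved as UTF-8 and compiled without a
// {$codepage} directive. RemovePair is a hypothetical helper name.

// Returns s with every occurrence of the two-char sequence b1 b2 removed.
function RemovePair(const s: string; b1, b2: char): string;
var
  i: integer;
  res: string;
begin
  res := '';
  i := 1;
  while i <= Length(s) do
  begin
    if (s[i] = b1) and (i < Length(s)) and (s[i + 1] = b2) then
      i := i + 2            // skip both chars of the letter
    else
    begin
      res := res + s[i];    // keep every other char unchanged
      i := i + 1;
    end;
  end;
  RemovePair := res;
end;

begin
  // $D0 $BE is the UTF-8 encoding of the Cyrillic letter о
  writeln(RemovePair('Русский язык мой родной', #$D0, #$BE));
end.
```

Looking for the exact pair works here because in valid UTF-8 $D0 only ever appears as the first char of a two-char sequence, so $D0 $BE cannot straddle two different letters.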