I need to extract the substring after a specific delimiter, but the delimiter should be ignored if it lies between two other tags.
For example, take this test string:
The quick <"@brown fox"> jumps over the lazy dog. The quick @brown fox jumps over the lazy dog
The desired output would be:
brown fox jumps over the lazy dog
This is because the first @ delimiter found is between two " " and so should be ignored; the second @ delimiter is not inside " ", so the text after it should be extracted.
I am able to find the starting position of the @ delimiter by using Pos and extract the text to the right of it, as shown below:
procedure TForm1.Button1Click(Sender: TObject);
var
  S: string;
  I: Integer;
begin
  S := 'The quick <"@brown fox"> jumps over the lazy dog. The quick @brown fox jumps over the lazy dog';
  I := Pos('@', S);
  if I > 0 then
  begin
    ShowMessage(Copy(S, I, Length(S)));
  end;
end;
However, this will always find the first @ delimiter, regardless of whether it is enclosed in " " or not. The result from the above is:
@brown fox"> jumps over the lazy dog. The quick @brown fox jumps over the lazy dog
where the desired result should be:
brown fox jumps over the lazy dog
How can I change the code to ignore @ delimiters when using Pos if the delimiter is between two " " tags? I only want to find the first @ delimiter and copy the text afterwards.
It also does not matter if there are any other @ delimiters after the first valid one is found, for example this should also be valid:
The quick <"@brown fox"> jumps over the lazy dog. The quick @brown fox jumps@ ov@er the lazy@ dog
Should still return:
brown fox jumps over the lazy dog
This is because we are only interested in the first valid @ delimiter, ignoring any delimiters after it and anything between two " " tags.
Please note that although I have tagged Delphi, I primarily use Lazarus, so ideally I need a solution that does not rely on Delphi-only conveniences such as string helpers.
Thanks.
To find out whether the @ is outside the enclosing " tags, parse the string from the beginning.
If a delimiter is found after an opening tag that is never closed, this routine still extracts the text after that delimiter.
function ExtractString(const s: String): String;
var
  tagOpen: Boolean;
  delimiterPos, i, j: Integer;
begin
  tagOpen := false;
  delimiterPos := 0;
  Result := '';
  for i := 1 to Length(s) do begin
    if (s[i] = '"') then begin
      tagOpen := not tagOpen;
      delimiterPos := 0;
    end
    else begin
      if (s[i] = '@') then begin
        if (delimiterPos = 0) then
          delimiterPos := i;
        if not tagOpen then // Found answer
          Break;
      end;
    end;
  end;
  // If there is no closing tag and a delimiter is found
  // since the last opening tag, deliver a result.
  if (delimiterPos > 0) then begin
    // Finally extract the string and remove all `@` delimiters.
    SetLength(Result, Length(s) - delimiterPos);
    j := 0;
    for i := 1 to Length(Result) do begin
      Inc(delimiterPos);
      if (s[delimiterPos] <> '@') then begin
        Inc(j);
        Result[j] := s[delimiterPos];
      end;
    end;
    SetLength(Result, j);
  end;
end;
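If a Pos-like building block is preferred, the same scan can be split into two steps. The following is only a sketch, not the routine above: PosOutsideQuotes and ExtractAfterDelimiter are hypothetical names, and it assumes that after an unclosed " every delimiter counts as quoted, which differs from ExtractString (that routine still returns a result in the unclosed case). It uses only Copy and StringReplace from SysUtils, which should be available in both Delphi and Lazarus.

```pascal
program ExtractDemo;

{$IFDEF FPC}{$mode objfpc}{$H+}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
  SysUtils;

// Hypothetical helper (not part of the RTL): returns the 1-based position of
// the first Delim that is not between two '"' characters, or 0 if none.
// Assumption: after an unclosed '"' every delimiter counts as quoted.
function PosOutsideQuotes(Delim: Char; const S: string): Integer;
var
  I: Integer;
  InQuotes: Boolean;
begin
  Result := 0;
  InQuotes := False;
  for I := 1 to Length(S) do
    if S[I] = '"' then
      InQuotes := not InQuotes
    else if (S[I] = Delim) and not InQuotes then
    begin
      Result := I;
      Exit;
    end;
end;

// Hypothetical wrapper: text after the first unquoted '@', with any later
// '@' characters removed.
function ExtractAfterDelimiter(const S: string): string;
var
  P: Integer;
begin
  Result := '';
  P := PosOutsideQuotes('@', S);
  if P > 0 then
    Result := StringReplace(Copy(S, P + 1, MaxInt), '@', '', [rfReplaceAll]);
end;

begin
  // Both test strings from the question.
  WriteLn(ExtractAfterDelimiter('The quick <"@brown fox"> jumps over the lazy dog. The quick @brown fox jumps over the lazy dog'));
  WriteLn(ExtractAfterDelimiter('The quick <"@brown fox"> jumps over the lazy dog. The quick @brown fox jumps@ ov@er the lazy@ dog'));
end.
```

Both calls should print brown fox jumps over the lazy dog. Keeping the position search separate from the extraction means the helper can be reused wherever Pos was used before.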