I have a THashedStringList with almost 200k items on it(strings and objects). This list will be heavy used on searching items on it. In some cases I need an incremental search on it. I've wrote a rudimentary example for an incremental search using the StartsText routine
unit Unit1;
interface
uses
Winapi.Windows, Winapi.Messages, System.SysUtils, System.Variants, System.Classes, Vcl.Graphics,
Vcl.Controls, Vcl.Forms, Vcl.Dialogs,
System.IniFiles, Vcl.StdCtrls,
System.StrUtils;
type
TForm1 = class(TForm)
Memo1: TMemo;
Edit1: TEdit;
Memo2: TMemo;
procedure FormCreate(Sender: TObject);
procedure Edit1KeyUp(Sender: TObject; var Key: Word; Shift: TShiftState);
private
{ Private declarations }
public
ahsStrLst : THashedStringList;
{ Public declarations }
end;
var
Form1: TForm1;
implementation
{$R *.dfm}
function generate(cantidad: integer): string;
const
letras_mi = 'abcdefghijklmnopqrstuvwxyz';
var
i: Integer;
begin
Result := '';
for I := 0 to cantidad-1 do
Result := Result + letras_mi[Random(Length(letras_mi)) + 1];
end;
procedure TForm1.Edit1KeyUp(Sender: TObject; var Key: Word; Shift: TShiftState);
var iPos: Integer;
begin
Memo2.Clear;
Memo2.Update;
for ipos := 0 to ahsStrLst.Count-1 do
if StartsText(Edit1.Text,ahsStrLst[iPos]) then
Memo2.Lines.Add(ahsStrLst[iPos])
end;
procedure TForm1.FormCreate(Sender: TObject);
var ipos:Integer;
begin
ahsStrLst := THashedStringList.Create;
for iPos := 0 to 50000 do
ahsStrLst.AddObject(generate(4),TObject(ipos));//here there will be diffent type of objects
for iPos := 0 to 50000 do
ahsStrLst.AddObject(generate(6),TObject(ipos));
for iPos := 0 to 50000 do
ahsStrLst.AddObject(generate(8),TObject(ipos));
for iPos := 0 to 50000 do
ahsStrLst.AddObject(generate(10),TObject(ipos));
Memo1.Lines.AddStrings(ahsStrLst);
end;
end.
Is there any way to speed up this incremental search?
For a start, you should stop using THashedStringList
. Once you've got a Delphi with generics you should use TDictionary<K,V>
instead. It presents a much cleaner interface, and performs better. However, in this case, a demand for partial matching renders hash based lookup helpless. You need a different data structure.
For efficient partial match lookup you can consider maintaining an ordered list. Then you can perform the lookup using bisection. Because you are performing partial matches you'll have to craft your own bisection because all the facilities provided by the RTL search for single items.
Suppose that you are searching for 'abc'
. The first of all find the largest value that is < 'abc'
. Your partial matches starts with the item that follows. Then find the largest value that begins with 'abc'
. Your partial matches end with that item. If no such item exists, there are no matches.