Jump to content
dummzeuch

When sorting a “StringList” is very costly

Recommended Posts

They are all adaptive to one degree or another. However, you can check it yourself.

Edited by avk

Share this post


Link to post
32 minutes ago, avk said:

I appreciate your interest.

But it still seems to me that your QuicksortPlus is not in a hurry.

My sorting functions aren't optimized for speed but for convenience. They take function pointers for comparing and swapping two elements rather than directly working on the data. That's a considerable overhead. My current TimSort doesn't do that yet and I'm not sure whether it will be possible to implement it in such a way, but I will try it.

Edited by dummzeuch

Share this post


Link to post
25 minutes ago, avk said:

They are all adaptive to one degree or another. However, you can check it yourself.

Well then they aren't what you say they are. 

Share this post


Link to post

Well, yes, I forgot to mention, in the benchmark all sortings except QuicksortPlus from LGenerics.

Share this post


Link to post
On 1/5/2021 at 6:43 PM, dummzeuch said:

That shouldn’t be any problem for Quicksort, should it? But it takes about 2 seconds on my computer which is muuuuuuch longer than I expected. The same code running on a simple TStringList takes less than 1/10 of a second. Why is that?

Sorting time like 2 seconds is not about sorting at all. It's absolutely crazy :classic_blink:.

The solution is simple: wrap your sorting with Items.BeginUpdate/Items.EndUpdate.  And it's not so important which sort algorithm you will use. QuickSort is good enough.

 

Edited by balabuev

Share this post


Link to post
13 hours ago, balabuev said:

The solution is simple: wrap your sorting with Items.BeginUpdate/Items.EndUpdate.

BeginUpdate / EndUpdate didn't make any noticeable difference in my tests. Otherwise I wouldn't have bothered to make any of the other changes.

Share this post


Link to post
3 hours ago, dummzeuch said:

BeginUpdate / EndUpdate didn't make any noticeable difference in my tests.

Quite strange. Because in my tests they did a drammatic difference (I've used TCheckListBox for test).

Share this post


Link to post

I've uses the following test with 100 items in list (item count as in your initial post):

 

procedure TForm1.FormCreate(Sender: TObject);
var
  i: Integer;
begin
  for i := 0 to 99 do // Just init the list with some data.
  begin
    CheckListBox1.Items.Add('Long long long string ' + IntToStr(Random(1000)));
    CheckListBox1.Items.Objects[i] := TObject(i);
    CheckListBox1.Checked[i] := (Random > 0.5);
  end;
end;

function TForm1.SortCompare(Item1, Item2: Integer): Integer;
begin
  Result := CompareStr(CheckListBox1.Items[Item1],
                       CheckListBox1.Items[Item2]);
end;

procedure TForm1.SortList(L, R: Integer; SCompare: TJCHListSortCompare);
var
  I, J, P: Integer;
  tmpObj: TObject;
  tmpStr: string;
  tmpChecked: Boolean;
begin
  repeat
    I := L;
    J := R;
    P := (L + R) shr 1;
    repeat
      while SCompare(I, P) < 0 do Inc(I);
      while SCompare(J, P) > 0 do Dec(J);
      if I <= J then
      begin
        // exchange I and J
        tmpStr                         := CheckListBox1.Items[I];
        tmpObj                         := CheckListBox1.Items.Objects[I];
        tmpChecked                     := CheckListBox1.Checked[I];

        CheckListBox1.Items[I]         := CheckListBox1.Items[J];
        CheckListBox1.Items.Objects[I] := CheckListBox1.Items.Objects[J];
        CheckListBox1.Checked[I]       := CheckListBox1.Checked[J];

        CheckListBox1.Items[J]         := tmpStr;
        CheckListBox1.Items.Objects[J] := tmpObj;
        CheckListBox1.Checked[J]       := tmpChecked;

        if P = I then
          P := J
        else if P = J then
          P := I;

        Inc(I);
        Dec(J);
      end;
    until I > J;
    if L < J then SortList(L, J, SCompare);
    L := I;
  until I >= R;
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  c: Cardinal;
begin
  c := GetTickCount;
  CheckListBox1.Items.BeginUpdate;
  try
    SortList(0, CheckListBox1.Items.Count - 1, SortCompare);
  finally
    CheckListBox1.Items.EndUpdate;
  end;
  Edit1.Text := IntToStr(GetTickCount - c);
end;

 

With BeginUpdate/EndUpdate  run time is 32 msecs.

Without BeginUpdate/EndUpdate  - 422 msecs.

 

As for me, even 32 msecs is a way too big for sorting 100 items, but it not 2 seconds!

 

Edited by balabuev

Share this post


Link to post
On 1/6/2021 at 6:42 PM, dummzeuch said:

QuickSort+ is my optimized quicksort which:

  • choses a pivot by selecting the median between elements Left, Right and (Left + Right) / 2
  • uses InsertionSort for small sets, if Right - Left < Cutoff (The value in parenthesis is the cutoff.)

Then you are almost at an Introsort (Spring4D uses that instead of plain Quicksort)

Share this post


Link to post

@dummzeuch Your problem is not in sorting algorithm at all. Trying to optimize sorting algorithm you just solving the wrong task.

 

Even faster way will be to copy the data into some array, sort the array and then copy data back:

type
  TDataItem = record
    S:       string;
    O:       TObject;
    Checked: Boolean;
  end;

  TData = array of TDataItem;

var
  FData: TData;

procedure TForm1.Button1Click(Sender: TObject);
var
  c:  Cardinal;
  i:  Integer;
  c2: Cardinal;
begin
  c := GetTickCount;
  CheckListBox1.Items.BeginUpdate;
  try
    SetLength(FData, CheckListBox1.Items.Count);          // Copy data from UI
    for i := 0 to High(FData) do                          // control.
    begin                                                 //
      FData[i].S       := CheckListBox1.Items[i];         //
      FData[i].O       := CheckListBox1.Items.Objects[i]; //
      FData[i].Checked := CheckListBox1.Checked[i];       //
    end;                                                  //

    c2 := GetTickCount;
    SortList(0, High(FData), SortCompare);                // Sort data in array.
    c2 := GetTickCount - c2;

    for i := 0 to High(FData) do
    begin
      CheckListBox1.Items[i]         := FData[i].S;       // Copy sorted data
      CheckListBox1.Items.Objects[i] := FData[i].O;       // back.
      CheckListBox1.Checked[i]       := FData[i].Checked; //
    end;
  finally
    CheckListBox1.Items.EndUpdate;
    c := GetTickCount - c;
  end;

  Edit1.Text := IntToStr(c);  // Whole time, including UI updates.
  Edit2.Text := IntToStr(c2); // Sort only time.
end;

 

With 10000 items (!) this code run in:

 

Whole time, including UI updates - 1.5 secs (the control is slow, nothing to do with that).

Sort only time - 15 msecs.

 

15 msecs for sorting of 10000 items!

Edited by balabuev
  • Like 1

Share this post


Link to post
2 hours ago, balabuev said:

Even faster way will be to copy the data into some array, sort the array and then copy data back

What do you think the article linked in the original post says? 

Share this post


Link to post
1 hour ago, David Heffernan said:

What do you think the article linked in the original post says? 

Ahh, sorry :classic_blush:, missed that link. Then my previous message can be deleted.

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×