Jump to content
Sign in to follow this  
Mike Torrettinni

Why is TArray.BinarySearch slower than normal binary search function?

Recommended Posts

My simple test shows TArray.BinarySearch is about 6x slower than custom binary search function:

GetName_TArrayBinarySearch: 858
GetName_CustomBinarySearch: 131

 

I was expecting similar timings. Is this just a fact or did I set up this example wrongly:

 

program Project1;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils, System.Generics.Collections, System.Generics.Defaults, System.Math, System.Diagnostics;

type
  TDataLine = record
    SeqNo: integer;
    CustomName: string;
  end;

const cNOfORecords = 1000000;
var xData: TArray<TDataLine>;

procedure PrepData;
var i: Integer;
begin
  SetLength(xData, cNOfORecords);
  for i := 0 to cNofORecords do
  begin
    xData[i].SeqNo := i + 1;
    xData[i].CustomName := i.ToString;
  end;
end;

function GetName_TArrayBinarySearch(const aData: TArray<TDataLine>; const aItem: TDataLine): string;
var vIdx: integer;
begin
  if TArray.BinarySearch<TDataLine>(aData, aItem, vIdx,
    TComparer<TDataLine>.Construct(
      function (const Left, Right: TDataLine): Integer
      begin
       Result := CompareValue(Left.SeqNo, Right.SeqNo);
      end
      ))
  then
      Result := aData[vIdx].CustomName;
end;

function GetName_CustomBinarySearch(const aData: TArray<TDataLine>; const aItem: TDataLine): string;
var L, H, i, c: Integer;
begin
  Result := '';
  L := 0; H := High(aData);
  while L <= H do
  begin
    i := L + (H - L) shr 1;
    c := CompareValue(aData[i].SeqNo, aItem.SeqNo);
    if c < 0 then
      L := i + 1
    else
    begin
      if c = 0 then
        Exit(aData[i].CustomName);
      H := i - 1;
    end;
  end;
end;

var vDataLine: TDataLine;
    vName: string;
    i: integer;
    vSW: TStopWatch;
begin
  PrepData; // prepares data, sorted by integer

  vSW := TStopwatch.StartNew;
  for i := 1 to cNOfORecords do
  begin
    vDataLine.SeqNo := i;
    vName := GetName_TArrayBinarySearch(xData, vDataLine);
  end;
  writeln('GetName_TArrayBinarySearch: ' + vSW.ElapsedMilliseconds.ToString);

  vSW := TStopwatch.StartNew;
  for i := 1 to cNOfORecords do
  begin
    vDataLine.SeqNo := i;
    vName := GetName_CustomBinarySearch(xData, vDataLine);
  end;
  writeln('GetName_CustomBinarySearch: ' + vSW.ElapsedMilliseconds.ToString);

  readln;

end.

 

Any advice appreciated, thanks!

 

 

 

 

Share this post


Link to post

Because IComparer<T> created via TComparer.Construct (TDelegatedComparer<T>) suffer from this issue: https://www.idefixpack.de/blog/2016/05/whats-wrong-with-virtual-methods-called-through-an-interface/

 

Furthermore every call to GetName_TArrayBinarySearch constructs the comparer again. Eliminating that as well gives me a result of 159 vs 105 which then can be explained by the additional calls through IComparer<T> and the probably a little less optimal allocated registers in the actual method that performs the search because that is the one with many more arguments in 

class function TArray.BinarySearch<T>(const Values: array of T; const Item: T; out FoundIndex: Integer; const Comparer: IComparer<T>; Index, Count: Integer): Boolean;

 

FWIW your implementation is not exactly the same as the RTL one as you exit as soon as you find a match while the RTL implementation because it returns the index goes on because it returns the first index in case there are successive elements matching.

Edited by Stefan Glienke
  • Like 1
  • Thanks 1

Share this post


Link to post
1 hour ago, Stefan Glienke said:

Furthermore every call to GetName_TArrayBinarySearch constructs the comparer again. Eliminating that

How did you eliminate constructing the comparer each call? I tried to 'pre-construct' comparer like this:

 

// new comparer definitions
var
  xComparison: TComparison<TDataLine>;
  xComparer: IComparer<TDataLine>;


// Function that does the value comparison
function CompareIDs(const Left, Right: TDataLine): Integer;
begin
  Result := CompareValue(Left.SeqNo, Right.SeqNo);
end;

// and I 'pre-construct' Comparer before first GetName_TArrayBinarySearch call
...
  xComparison := CompareIDs;
  xComparer := TComparer<TDataLine>.Construct(xComparison);
...  

// New search function
function GetName_TArrayBinarySearch(const aData: TArray<TDataLine>; const aItem: TDataLine): string;
var vIdx: integer;
begin
  if TArray.BinarySearch<TDataLine>(aData, aItem, vIdx, xComparer) // using pre-constructed xComparer
  then
      Result := aData[vIdx].CustomName;
end;

But this only speeds up a little bit, still: 776 vs 129 ms. So, this is not the correct solution, right?

 

Share this post


Link to post

When I debug step by step, I see that the Comparison assignment to CompareIDs method is made on every binary search call, again and again!? I assumed if I move it before the actual call of BinarySearch method, it will assign and construct Comparer once and use it.

But the debugger keeps jumping to the line to xComparison := CompareIDs; while doing binary search.

 

I assume this is the cause for slower execution?

 

debug.thumb.gif.220323ad76871ae363f1550ee554d50d.gif

Share this post


Link to post

The main reason is in the very first sentence of my previous post.

Also as you are so into benchmarking I suggest you make yourself familiar with profilers so you don't have to guess or ask others what is taking time as you can very well see for yourself.

  • Like 1
  • Thanks 1

Share this post


Link to post
4 minutes ago, Stefan Glienke said:

The main reason is in the very first sentence of my previous post.

Also as you are so into benchmarking I suggest you make yourself familiar with profilers so you don't have to guess or ask others what is taking time as you can very well see for yourself.

Thanks, it makes sense. This jumping of debugger to the same line again and again, even though the flow should be past it, is something new to me. Cool stuff! 😉

Share this post


Link to post
9 minutes ago, Mike Torrettinni said:

Thanks, it makes sense. This jumping of debugger to the same line again and again, even though the flow should be past it, is something new to me. Cool stuff! 😉

That is because technically what the compiler generates for line 93 is this:

 

xComparison := function(const left, right: TDataLine): Integer begin Result := CompareIDs(left, right) end);

When you put a breakpoint there you have them in multiple lines: one time it gets hit for the assignment to xComparison and all other times for the Result := CompareIDs

Share this post


Link to post
4 minutes ago, Stefan Glienke said:

That is because technically what the compiler generates for line 93 is this:

 


xComparison := function(const left, right: TDataLine): Integer begin Result := CompareIDs(left, right) end);

When you put a breakpoint there you have them in multiple lines: one time it gets hit for the assignment to xComparison and all other times for the Result := CompareIDs

Aha, I assumed it's like OnClick assignment to event... the flow would go just to the assigned method.

Share this post


Link to post
On 3/1/2021 at 12:47 AM, Stefan Glienke said:

FWIW your implementation is not exactly the same as the RTL one as you exit as soon as you find a match while the RTL implementation because it returns the index goes on because it returns the first index in case there are successive elements matching.

Yes, this is because I'm implementing common fast class for lookup-up data. And in this case, the searched values are always unique. So, I can exit asap.

 

Well, I didn't really check what would happen with non-unique data, but I guess I lucked out with this.


 

 

Share this post


Link to post

Solution: Here is example of code where TArray.BinarySearch uses custom Comparer, initialized once only!

 

Interesting that 64bit results are much faster w/o custom comparer, but of course initializing custom comparer once and using it is better:

 

32bit:

image.png.ddf31a2bd8fc9748b5a1f08cd16117f5.png

 

64bit:

image.png.7dc27b31bc5c218fd01af77ffad5c8cd.png

 

Source:

 

program Project1;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils, System.Generics.Collections, System.Generics.Defaults, System.Math, System.Diagnostics;

type
  TDataLine = record
    SeqNo: integer;
    CustomName: string;
  end;

type
  TCustomComparer = class(TInterfacedObject, IComparer<TDataLine>)
  public
    function Compare(const Left, Right: TDataLine): Integer;
  end;

const cNOfORecords = 1000000;
var xData: TArray<TDataLine>;
  xComparer:  IComparer<TDataLine>;

function TCustomComparer.Compare(const Left, Right: TDataLine): Integer;
begin
  Result := Left.SeqNo - Right.SeqNo;
end;

procedure PrepData;
var i: Integer;
begin
  SetLength(xData, cNOfORecords);
  for i := 0 to cNofORecords do
  begin
    xData[i].SeqNo := i + 1;
    xData[i].CustomName := i.ToString;
  end;
end;

function GetName_TArrayBinarySearch(const aData: TArray<TDataLine>; const aItem: TDataLine): string;
var vIdx: integer;
begin
  if TArray.BinarySearch<TDataLine>(aData, aItem, vIdx,
    TComparer<TDataLine>.Construct(
      function (const Left, Right: TDataLine): Integer
      begin
       Result := CompareValue(Left.SeqNo, Right.SeqNo);
      end
      ))
  then
      Result := aData[vIdx].CustomName;
end;

function GetName_TArrayBinarySearch_w_Comparer(const aData: TArray<TDataLine>; const aItem: TDataLine): string;
var vIdx: integer;
begin
  if TArray.BinarySearch<TDataLine>(aData, aItem, vIdx, xComparer)
  then
      Result := aData[vIdx].CustomName;
end;

function GetName_CustomBinarySearch(const aData: TArray<TDataLine>; const aItem: TDataLine): string;
var L, H, i, c: Integer;
begin
  Result := '';
  L := 0; H := High(aData);
  while L <= H do
  begin
    i := L + (H - L) shr 1;
    c := CompareValue(aData[i].SeqNo, aItem.SeqNo);
    if c < 0 then
      L := i + 1
    else
    begin
      if c = 0 then
        Exit(aData[i].CustomName);
      H := i - 1;
    end;
  end;
end;

var vDataLine: TDataLine;
    vName: string;
    i: integer;
    vSW: TStopWatch;
begin
  PrepData; // prepares data, sorted by integer

  vSW := TStopwatch.StartNew;
  for i := 1 to cNOfORecords do
  begin
    vDataLine.SeqNo := i;
    vName := GetName_TArrayBinarySearch(xData, vDataLine);
  end;
  writeln('GetName_TArrayBinarySearch: ' + vSW.ElapsedMilliseconds.ToString);

  // New TArray.BinarySearch with Comparer
  xComparer  := TCustomComparer.Create; // initialize Comparer
  vSW := TStopwatch.StartNew;
  for i := 1 to cNOfORecords do
  begin
    vDataLine.SeqNo := i;
    vName := GetName_TArrayBinarySearch_w_Comparer(xData, vDataLine);
  end;
  writeln('GetName_TArrayBinarySearch w Comparer: ' + vSW.ElapsedMilliseconds.ToString);

  vSW := TStopwatch.StartNew;
  for i := 1 to cNOfORecords do
  begin
    vDataLine.SeqNo := i;
    vName := GetName_CustomBinarySearch(xData, vDataLine);
  end;
  writeln('GetName_CustomBinarySearch: ' + vSW.ElapsedMilliseconds.ToString);

  readln;

end.

 

 

Thanks  @Stefan Glienke and @balabuev

 

Edited by Mike Torrettinni

Share this post


Link to post

@Mike Torrettinni folding code is a bit tricky. You have to add the spoiler tag first, then post the message, then edit it again and post the code into the folding area.

 

At least this was my workaround, to get it work. If anybody knows a better way to do it, please share.

Share this post


Link to post
4 hours ago, Mike Torrettinni said:

Interesting that 64bit results are much faster w/o custom comparer

First because 64bit does not suffer from the "virtual method glitch".

Second because even with a class implementing the IComparer interface you get adjustor thunks which you don't have with the default comparers provided by Generics.Collections.Defaults because those are directly crafted IMTs (interface method tables)

 

BTW: you again unnecessarily slow down your benchmark results by putting everything into the dpr main.

And you are comparing different things - your CustomBinarySearch uses CompareValue which is a function call.

So if you decide to roll your custom implementation then don't dumb it down:

 

function GetName_CustomBinarySearch2(const aData: TArray<TDataLine>; const aItem: TDataLine): string;
var L, H, i, c: Integer;
begin
  L := 0; H := High(aData);
  while L <= H do
  begin
    i := L + (H - L) shr 1;
    c := aData[i].SeqNo - aItem.SeqNo;
    if c < 0 then
      L := i + 1
    else if c > 0 then
      H := i - 1
    else
      Break;
  end;
  if L <= H then
    Result := aData[i].CustomName
  else
    Result := '';
end;

makes it almost twice as fast as your implementation. I decided to use the subtract approach as you did with the comparers - keep in mind this only works if there cant be an overflow which wont happen if SeqNo is just 0..MaxInt

Anyhow at least in this benchmark it did not make any noticable difference to writing:

 

    if aData[i].SeqNo < aItem.SeqNo then
      L := i + 1
    else if aData[i].SeqNo > aItem.SeqNo then
      H := i - 1
    else
      Break;

 

Edited by Stefan Glienke
  • Thanks 1

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now
Sign in to follow this  

×