Jump to content
Mike Torrettinni

Micro optimization: IN vs OR vs CASE

Recommended Posts

I have a few functions where I check if 'data' is 'a control', and in some cases I use MatchStr where I have string info, and IN where I have integer ID:

function IsControlGroupA(const aControl: string): boolean;
  Result := MatchStr(aControl, ['Edit', 'Check', 'ComboBox', 'RadioBtn', 'ListBox']);
end;

function IsControlGroupX(aControl: integer): boolean;
  Result := aControl in [xConttrols.EditID, xConttrols.CheckID, xConttrols.ComboBoxID, xConttrols.RadioBtnID, xConttrols.ListBoxID];
end;

Replacing MatchStr with OR is of course much faster - because we know MatchStr is quite inefficient compared to simple = comparison, so this is much faster:

function IsControlGroupA(const aControl: string): boolean;
  Result := (aControl = 'Edit') or (aControl = 'Check') or (aControl = 'ComboBox') or (aControl = 'RadioBtn') or (aControl = 'ListBox');
end;

 

Then I looked at IN and it looks like simple check of integers, which is fast anyway. So, I benchmarked if it's worth replacing it also with OR:

function IsControlGroupX(aControl: integer): boolean;
  Result := (aControl = xConttrols.EditID) or (aControl = xConttrols.CheckID) or (aControl = xConttrols.ComboBoxID) or (aControl = xConttrols.RadioBtnID) or (aControl = xConttrols.ListBoxID);
end;

 

Then I remember reading Marco's response on Case vs If Else about a single comparison and jump:

https://twitter.com/marcocantu/status/1501844578898530308?cxt=HHwWiMC-kf_rz9cpAAAA

 

so I put Case into benchmarking and seems like Case is not suitable for my data, since I don't use a large number of ordinals, like Marco explained when Case is faster, and I don't use If Else, so probably not good case anyway, but I put it in so I know how it go against OR.

 

In test benchmark, the OR is a winner, and Case is a bit slower especially in 64bit - which I'm slowly moving into anyway:

I benchmarked with my current D 10.2.3 and D 11.1 running in VM:

 

image.png.a1bd2e5fe23136fae0e0cfff5d39fc44.png

 

As Marco said, perhaps if I had a large number of checks and if I compared it to If Else, it could win.

 

 

The real-life difference in one data sample, the most used check for a group is 6x faster using OR vs IN:

 

using IN:      image.png.8a6e84c89e2f50fa5edc927c3b9f5b1f.png

using OR:     image.png.81aa39851a4ad3c87568724cc6b74347.png

 

So, definitely quite useful improvement!

 

 

1 fact about Case: as we know you can't use Case on non-constant expressions, so using Case for my functions would require a good amount of code change. If it proved to be significantly faster then OR, I would make the change, but it's not.

 

 

As usual I'm just looking for comment on my observations, perhaps suggestion on something I don't see or didn't take into consideration.

 

And the benchmark code - with the usual limitation of simple repeated calls:

 

program Project1;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils,
  System.Diagnostics;

type
  TControlsRec = record
    EditID, ButtonID, CheckID, FormID, FrameID, ListBoxID, PageControlID, TabControlID, RadioBtnID, ComboBoxID: integer;
    EditStr, ButtonStr, CheckStr, FormStr, FrameStr, ListBoxStr, PageControlStr, TabControlStr, RadioBtnStr, ComboBoxStr: string;
  end;

var
  xControlsRec: TControlsRec;

procedure AssignControlsRec;
begin
  xControlsRec.EditID        := 1; xControlsRec.EditStr        := 'Edit';
  xControlsRec.ButtonID      := 2; xControlsRec.ButtonStr      := 'Button';
  xControlsRec.CheckID       := 3; xControlsRec.CheckStr       := 'Check';
  xControlsRec.FormID        := 4; xControlsRec.FormStr        := 'Form';
  xControlsRec.FrameID       := 5; xControlsRec.FrameStr       := 'Frame';
  xControlsRec.ListBoxID     := 6; xControlsRec.ListBoxStr     := 'ListBox';
  xControlsRec.PageControlID := 7; xControlsRec.PageControlStr := 'PageControl';
  xControlsRec.TabControlID  := 8; xControlsRec.TabControlStr  := 'TabControl';
  xControlsRec.RadioBtnID    := 9; xControlsRec.RadioBtnStr    := 'RadioBtn';
  xControlsRec.ComboBoxID    := 10; xControlsRec.ComboBoxStr   := 'ComboBox';
end;

function IsIN(aID: integer): boolean;
begin
  Result := aID in [xControlsRec.ButtonID,  xControlsRec.FormID, xControlsRec.ListBoxID, xControlsRec.TabControlID, xControlsRec.ComboBoxID];
end;

function IsOR(aID: integer): boolean;
begin
  Result := (aID = xControlsRec.ButtonID) or (aID = xControlsRec.FormID) or (aID = xControlsRec.ListBoxID) or (aID = xControlsRec.TabControlID) or (aID = xControlsRec.ComboBoxID);
end;

function IsCase(aID: integer): boolean;
begin
  case aID of
    2,4,6,8,10: Result := True;
    else
      Result := false;
  end;
end;

function IsCase2(aID: integer): boolean;
begin
  case aID of
    2: Result := True;
    4: Result := True;
    6: Result := True;
    8: Result := True;
    10: Result := True;
    else
      Result := false;
  end;
end;


const
  cLoop = 100000000;

  // Values/ControlIDs to search for
  // 3 values are in the group, 2 are not - similar ratio as in real data
  cSearch: array[1..5] of integer = (2,4,5,8,11);

var vSW: TStopwatch;
    vBool:boolean;
    i: integer;
begin
  AssignControlsRec;

  // IsIN
  vSW := TStopwatch.StartNew;
  for i := 1 to cLoop do
  begin
    vBool := IsIN(cSearch[1]);
    vBool := IsIN(cSearch[2]);
    vBool := IsIN(cSearch[3]);
    vBool := IsIN(cSearch[4]);
    vBool := IsIN(cSearch[5]);
  end;
  vSW.Stop;
  Writeln(Format('IsIN    = %5s', [vSW.ElapsedMilliseconds.ToString]));

  // IsOR
  vSW := TStopwatch.StartNew;
  for i := 1 to cLoop do
  begin
    vBool := IsOR(cSearch[1]);
    vBool := IsOR(cSearch[2]);
    vBool := IsOR(cSearch[3]);
    vBool := IsOR(cSearch[4]);
    vBool := IsOR(cSearch[5]);
  end;
  vSW.Stop;
  Writeln(Format('IsOR    = %5s', [vSW.ElapsedMilliseconds.ToString]));

  // IsCase
  vSW := TStopwatch.StartNew;
  for i := 1 to cLoop do
  begin
    vBool := IsCase(cSearch[1]);
    vBool := IsCase(cSearch[2]);
    vBool := IsCase(cSearch[3]);
    vBool := IsCase(cSearch[4]);
    vBool := IsCase(cSearch[5]);
  end;
  vSW.Stop;
  Writeln(Format('IsCase  = %5s', [vSW.ElapsedMilliseconds.ToString]));

  // IsCase2
  vSW := TStopwatch.StartNew;
  for i := 1 to cLoop do
  begin
    vBool := IsCase2(cSearch[1]);
    vBool := IsCase2(cSearch[2]);
    vBool := IsCase2(cSearch[3]);
    vBool := IsCase2(cSearch[4]);
    vBool := IsCase2(cSearch[5]);
  end;
  vSW.Stop;
  Writeln(Format('IsCase2 = %5s', [vSW.ElapsedMilliseconds.ToString]));

  readln;
end.

 

  • Like 1

Share this post


Link to post

You are mostly benchmarking stack operations, calls etc.. -again- everything which has nothing to do with the actual comparisons and it's asm implementations.

I'd suggest you to buy an assembly book first and try to understand the asm in the Delphi debugger.

 

There are 3 changes in the unit, try to find them and tell us what do you think about the new results.

 

 

 

Project1.dpr

Edited by Attila Kovacs
  • Like 1

Share this post


Link to post
5 hours ago, Mike Torrettinni said:

As usual I'm just looking for comment on my observations, perhaps suggestion on something I don't see or didn't take into consideration.

Okay then, here's a comment: If you really cared about performance then you would profile your code and direct your efforts at areas that would make an actual difference on a higher level.

  • Like 1

Share this post


Link to post
35 minutes ago, Anders Melander said:

Okay then, here's a comment: If you really cared about performance then you would profile your code and direct your efforts at areas that would make an actual difference on a higher level.

The familiar refrain, but the message has not landed yet 

Share this post


Link to post

I am usually happy to help people improve in that subject but seeing that you simply ignored half the advice given to you before in countless similar threads I consider this a waste of time.

  • Like 1
  • Thanks 3

Share this post


Link to post
20 minutes ago, Stefan Glienke said:

I am usually happy to help people improve in that subject but seeing that you simply ignored half the advice given to you before in countless similar threads I consider this a waste of time.

I recently reached the age bracket that many would probably consider 'unteachable', so it's not on purpose.

I was away from Delphi for a couple of months and now preparing projects for 64bit, so it's fun to do some benchmarking again.

 

If I can make some good progress, like in this case reducing some process runtime by half a second, even better. This process is part of search feature, so user interaction feature, and any speed improvement is very useful because users notice them very quickly.

  • Like 1

Share this post


Link to post

@Stefan Glienke I got a little 'nudge' that my reply above was rude and seems inconsiderate to your help in the past. I hope you didn't read it that way. I took your comment as if my benchmark was waste of time, but re-reading it, I see what you meant.

Anyway, I think I was always appreciative to your help and expressed it in past topics. Don't feel like you need to help me in every topic, I have no problem getting no comments on my benchmarks, I take it as "nothing big is wrong with it" or "it's so wrong, it would take me too long to explain", I welcome both options, because they mean either I did something right or I'm slowly getting there.

The 64bit projects will go through a lot of benchmarks, so I will probably post more observations like this one, so be on the lookout for something that peeks your interest, I will not be offended if you skip the rest.

 

Also, come on, you can't say that Moe, Larry and Curly's reprisal of 'First!' comment wasn't worth at least a little chuckle at my expense... I know I did, he he.

  • Like 2

Share this post


Link to post

The point that others already have expressed is that despite being interested in a topic as performance improvement so low level (as in instruction-level instead of algorithmic level) you seem to lack some important knowledge to do so such as assembly - it does not require as much as it does to write assembly code but to understand it in order to be able to look at the code in the debugger and see that some comparisons are apples and bananas.

 

I did not even read through your code but simply placed a breakpoint into your IsIN function and noticed that it contained a function call to System.SetElem (that even was the first time I have ever seen that function being called so a TIL for me). Why was that the case? Because you are not using consts here but variables. Had you simply made consts for all those IDs the code would have almost as fast as the IsOR which does not suffer to extra function calls but from memory reads (not noticeable in the benchmark because its all in L1 cache already).

 

On my CPU InOR is still a little faster than IsIN which is due to the fact how the compiler builds the in - you can see that for yourself in the disassembly and then look at instruction timings, read up on macro-operation fusion and data dependency

 

For reference, this is the assembly for the two functions when using consts

 

Project1.dpr.40: Result := aID in [xControlsRec.ButtonID,  xControlsRec.FormID, xControlsRec.ListBoxID, xControlsRec.TabControlID, xControlsRec.ComboBoxID];
004CEE7C 83E802           sub eax,$02
004CEE7F 7417             jz $004cee98
004CEE81 83E802           sub eax,$02
004CEE84 7412             jz $004cee98
004CEE86 83E802           sub eax,$02
004CEE89 740D             jz $004cee98
004CEE8B 83E802           sub eax,$02
004CEE8E 7408             jz $004cee98
004CEE90 83E802           sub eax,$02
004CEE93 7403             jz $004cee98
004CEE95 33C0             xor eax,eax
004CEE97 C3               ret 
004CEE98 B001             mov al,$01
Project1.dpr.41: end;
004CEE9A C3               ret 
004CEE9B 90               nop 
Project1.dpr.45: Result := (aID = xControlsRec.ButtonID) or (aID = xControlsRec.FormID) or (aID = xControlsRec.ListBoxID) or (aID = xControlsRec.TabControlID) or (aID = xControlsRec.ComboBoxID);
004CEE9C 83F802           cmp eax,$02
004CEE9F 7417             jz $004ceeb8
004CEEA1 83F804           cmp eax,$04
004CEEA4 7412             jz $004ceeb8
004CEEA6 83F806           cmp eax,$06
004CEEA9 740D             jz $004ceeb8
004CEEAB 83F808           cmp eax,$08
004CEEAE 7408             jz $004ceeb8
004CEEB0 83F80A           cmp eax,$0a
004CEEB3 7403             jz $004ceeb8
004CEEB5 33C0             xor eax,eax
004CEEB7 C3               ret 
004CEEB8 B001             mov al,$01
Project1.dpr.46: end;
004CEEBA C3               ret 

 

Depending on the number of IDs you have it might be worth using power of two and bitmasks or an enum directly because that would only require one cmp/test making the function twice as fast and perfect for inlining which would then also eliminate the function call overhead at all.

Edited by Stefan Glienke
  • Like 4
  • Thanks 2

Share this post


Link to post
18 minutes ago, Stefan Glienke said:

Because you are not using consts here but variables.

Thanks for the good explanation above and taking the time to write it up.

Share this post


Link to post

I learned alot from Mike questions and answers.  Here's my stab at it.  

program interSector;

{$APPTYPE CONSOLE}
{$R *.res}
{$o+}

uses
  System.SysUtils,
  System.Diagnostics,
  Classes;

type
  // TsetOPs = (EditStr, ButtonStr, CheckStr, FormStr, FrameStr, ListBoxStr, PageControlStr, TabControlStr, RadioBtnStr, ComboBoxStr);
  TdbKey = (dateID, sqlID, CashID, EditID, ButtonID, CheckID, FormID, FrameID,
    ListBoxID, PageControlID, TabControlID, RadioBtnID, ComboBoxID);
  // : integer;
  TsetKeys = set of TdbKey;

const
  cLoop = 100000000;
  cSet: TsetKeys = [dateID, CheckID, FrameID, ComboBoxID];

var
  vSW: TStopwatch;
  vBool: boolean;

  procedure IsIntersected(const aID: integer; var aB: boolean);
  inline
  var
    Intersected: TsetKeys;
    Op: TdbKey;
  begin
    Intersected := [CheckID, FrameID, ButtonID] * cSet;
    aB := aB or (Intersected <> []);
    //for OPs in Intersected do to build relational DB
  end;

begin
  vBool := false;
  vSW := TStopwatch.StartNew;
  var
    I: CppULongInt := 0;

  while I < cLoop do
  begin
    Inc(I);
    IsIntersected(I, vBool);
  end;

  Writeln(Format('IsIntersect = %5s', [vSW.ElapsedMilliseconds.ToString]));

  vBool := not vBool;
  readln;

end.
interSector.dpr.44: IsInterSected(I,VBool);
000000000052B51F 89C1             mov ecx,eax
000000000052B521 85C9             test ecx,ecx
000000000052B523 7D05             jnl interSector + $6A
000000000052B525 E8A611EEFF       call @BoundErr
000000000052B52A 480FB70D1E010000 movzx rcx,word ptr [rel $0000011e]
000000000052B532 66230D035B0200   and cx,[rel $00025b03]
000000000052B539 803D5831030000   cmp byte ptr [rel $00033158],$00
000000000052B540 750D             jnz interSector + $8F
000000000052B542 663B0D09010000   cmp cx,[rel $00000109]
000000000052B549 7504             jnz interSector + $8F
000000000052B54B 33C9             xor ecx,ecx
000000000052B54D EB02             jmp interSector + $91
000000000052B54F B101             mov cl,$01
000000000052B551 880D41310300     mov [rel $00033141],cl
interSector.dpr.41: while I < cLoop do
000000000052B557 81F800E1F505     cmp eax,$05f5e100
000000000052B55D 72B6             jb interSector + $55
000000000052B55F 90               nop

 

  • Like 3

Share this post


Link to post
13 hours ago, Stefan Glienke said:

when using consts

Something like this, right?:

const
    cControlsRec: TControlsRec = (EditID: 1; ButtonID:2; CheckID: 3; FormID:4; FrameID:5; ListBoxID:6; PageControlID:7; TabControlID:8; RadioBtnID:9; ComboBoxID:10 );

When I use const cControlsRec, the difference is not that big and @SetElem is called anyway

 

non-const:

image.thumb.png.43b791f5576d0e646eab8e8897b178ec.png

 

const:

image.thumb.png.27c7e6f5edb4228c030b85642a9cf1f3.png

 

 

You can see my asm code is quite different than yours. Is it possible because of Delphi versions? I still use 10.2.3.

 

Share this post


Link to post

No, that is a const record - which is not really const but just a write protected variable. You know how to declare integer consts, do you?

Share this post


Link to post
8 hours ago, Stefan Glienke said:

No, that is a const record - which is not really const but just a write protected variable. You know how to declare integer consts, do you?

OK, const record is not the same as const, wasn't aware of that - then how did you manage to get a integer const prefixed with xControlsRec, like xControlsRec.ButtonID? If I try a const with a dot in the name, it doesn't compile:

const xCtrls.EditID: integer = 1;

it complains:

[dcc32 Error] Project1.dpr(93): E2029 '=' expected but '.' found

 

 

If I set integer consts like this:

const
  xButtonID: integer = 2; xFormID: integer = 4; xListBoxID: integer = 6; xTabControlID: integer = 8; xComboBoxID: integer = 10;

the runtime is similar as before, still uses @SetElem:

image.thumb.png.c6416fc3cf7503f24b05d6febad688e5.png

This is D10.2.3.

 

 

When trying this in D11.1 in VM, it generates the same asm, so not sure how you got your completely different asm code:

image.thumb.png.ca4f4ec2ee17366f4f835249ca948df9.png

 

 

Share this post


Link to post
2 minutes ago, Mike Torrettinni said:

how did you manage to get a integer const prefixed with xControlsRec, like xControlsRec.ButtonID?

type
  xConst = record
  const
    ButtonId = 1;
  end;

procedure Test;
var 
  bId: Integer;
begin
  bId := xConst.ButtonId;
end;

 

  • Thanks 1

Share this post


Link to post
Just now, Lars Fosdal said:

type
  xConst = record
  const
    ButtonId = 1;
  end;

procedure Test;
var 
  bId: Integer;
begin
  bId := xConst.ButtonId;
end;

 

OK, I think I see it now, const in a record vs const record.

 

 

Share this post


Link to post

Well, even with a real const in a record, I still get @SetElem:

 

type
    TCtrls = record
    const
      ButtonID: integer = 2;
      FormID: integer = 4;
      ListBoxID: integer = 6;
      TabControlID: integer = 8;
      ComboBoxID: integer = 10;
    end;

 

image.thumb.png.bca72afb5d24411a026ce19d94e2cce6.png

 

 

 

I tried with a var of TCtrls type, just in case if it makes a diff, still the same:

 

var xCtrls: TCtrls;

 

image.thumb.png.13a1c33b3885962c8b5ef5e9f62e54d1.png

 

 

Share this post


Link to post

Typed consts are not consts unless you type them on the right side of the expression.

const

  NotAConst: Integer = -1;

  IsAConst = Integer(-1);

The record type simply becomes a "scope" for const declarations.

Benefit: the consts can be non-uniquely named without polluting your name spaces since you always need to prefix with the type name to access the const.

Added bonus - you can have methods in the record declaration that can deal with logical operations related to the constants.

 

type
  cButton = record
  const
    ThisIsFine = mrOK;
  end;
  
  cAnswer = record
  const
    ThisIsFine = String('This is fine');
  end;
  
  cLimit = record
  const
    ThisIsFine = Double(3.14);
  end;

 

  • Thanks 2

Share this post


Link to post
31 minutes ago, Lars Fosdal said:

Typed consts are not consts unless you type them on the right side of the expression.

I knew there is a difference between them, more for strings - in one of my benchmarks using string typed consts resulted in faster runtime vs non-typed consts.

But still don't know to what extent it affects every case that uses typed conts.

 

Re-reading this thread, but still trying to process all the info there:

 

Share this post


Link to post

What proportion of the overall time your program takes for the task, is the operation you are trying to optimise in this thread? 

  • Like 2

Share this post


Link to post
On 3/21/2022 at 1:32 PM, Stefan Glienke said:

Thanks and after trying to grasp the content of the links I realized I shouldn't be using micro optimization phrase at all! Will drop it from next topics, as I'm interested in whats faster but definitely I'm out of my comfort zone and not ready to dive into this topic the way it was expected because of the title.

Share this post


Link to post
On 3/22/2022 at 1:32 AM, Stefan Glienke said:

Depending on the number of IDs you have it might be worth using power of two and bitmasks or an enum directly because that would only require one cmp/test making the function twice as fast and perfect for inlining which would then also eliminate the function call overhead at all. 

As Stefan said ... without const ...

procedure AssignControlsRec;
begin
  xControlsRec.EditID        :=   1;    xControlsRec.EditStr        := 'Edit'       ;
  xControlsRec.ButtonID      :=   2;    xControlsRec.ButtonStr      := 'Button'     ;
  xControlsRec.CheckID       :=   4;    xControlsRec.CheckStr       := 'Check'      ;
  xControlsRec.FormID        :=   8;    xControlsRec.FormStr        := 'Form'       ;
  xControlsRec.FrameID       :=  16;    xControlsRec.FrameStr       := 'Frame'      ;
  xControlsRec.ListBoxID     :=  32;    xControlsRec.ListBoxStr     := 'ListBox'    ;
  xControlsRec.PageControlID :=  64;    xControlsRec.PageControlStr := 'PageControl';
  xControlsRec.TabControlID  := 128;    xControlsRec.TabControlStr  := 'TabControl' ;
  xControlsRec.RadioBtnID    := 256;    xControlsRec.RadioBtnStr    := 'RadioBtn'   ;
  xControlsRec.ComboBoxID    := 512;    xControlsRec.ComboBoxStr    := 'ComboBox'   ;
end;

function IsIN(aID: integer): boolean ;     // inline;
begin
  const test = xControlsRec.ButtonID  + xControlsRec.FormID + xControlsRec.ListBoxID + xControlsRec.TabControlID + xControlsRec.ComboBoxID;
  IsIn := (test and aID) <> 0;
end;

image.thumb.png.600782fb8f5e55c7f08106c44c2e3f97.png

 

image.thumb.png.d5c2f762fabb5f0d73a96b4112277038.png

Edited by pmcgee
  • Thanks 1

Share this post


Link to post
10 hours ago, David Heffernan said:

What proportion of the overall time your program takes for the task, is the operation you are trying to optimise in this thread? 

Specifically, what is shown by a profiler?

Share this post


Link to post
55 minutes ago, Bill Meyer said:

Specifically, what is shown by a profiler?

I pasted 2 profiler results in first post and you can see replacing IN with OR shaves off 0.5s. It's part of a search feature and verification of data type is significant portion of search runtime. 

 

Share this post


Link to post
11 hours ago, David Heffernan said:

What proportion of the overall time your program takes for the task, is the operation you are trying to optimise in this thread? 

Do you know this? 

Share this post


Link to post

Neater and equally fast ...

type
  Conts = (edit, button, check, form, frame, listbox, pgctrl, tabctrl, radiobtn, combobox);
  Ctest = set of Conts;

  const
  Strs : array[Conts] of string =
                       (    'Edit',       'Button',      'Check',     'Form',      'Frame',
                         'ListBox',  'PageControl', 'TabControl', 'RadioBtn',   'ComboBox'   );

function IsIN(aID: Conts): boolean ;     // inline;
begin
  const test : Ctest = [button, form, listbox, tabctrl, combobox];
  IsIn := aId in test;
end;

image.thumb.png.7f3632614a24cfb56b672ffa7b440145.png

 

 

 

Edited by pmcgee
  • Thanks 2

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×