Mike Torrettinni 198 Posted March 19, 2022 I have a few functions where I check if 'data' is 'a control', and in some cases I use MatchStr where I have string info, and IN where I have integer ID: function IsControlGroupA(const aControl: string): boolean; Result := MatchStr(aControl, ['Edit', 'Check', 'ComboBox', 'RadioBtn', 'ListBox']); end; function IsControlGroupX(aControl: integer): boolean; Result := aControl in [xConttrols.EditID, xConttrols.CheckID, xConttrols.ComboBoxID, xConttrols.RadioBtnID, xConttrols.ListBoxID]; end; Replacing MatchStr with OR is of course much faster - because we know MatchStr is quite inefficient compared to simple = comparison, so this is much faster: function IsControlGroupA(const aControl: string): boolean; Result := (aControl = 'Edit') or (aControl = 'Check') or (aControl = 'ComboBox') or (aControl = 'RadioBtn') or (aControl = 'ListBox'); end; Then I looked at IN and it looks like simple check of integers, which is fast anyway. So, I benchmarked if it's worth replacing it also with OR: function IsControlGroupX(aControl: integer): boolean; Result := (aControl = xConttrols.EditID) or (aControl = xConttrols.CheckID) or (aControl = xConttrols.ComboBoxID) or (aControl = xConttrols.RadioBtnID) or (aControl = xConttrols.ListBoxID); end; Then I remember reading Marco's response on Case vs If Else about a single comparison and jump: https://twitter.com/marcocantu/status/1501844578898530308?cxt=HHwWiMC-kf_rz9cpAAAA so I put Case into benchmarking and seems like Case is not suitable for my data, since I don't use a large number of ordinals, like Marco explained when Case is faster, and I don't use If Else, so probably not good case anyway, but I put it in so I know how it go against OR. In test benchmark, the OR is a winner, and Case is a bit slower especially in 64bit - which I'm slowly moving into anyway: I benchmarked with my current D 10.2.3 and D 11.1 running in VM: As Marco said, perhaps if I had a large number of checks and if I compared it to If Else, it could win. The real-life difference in one data sample, the most used check for a group is 6x faster using OR vs IN: using IN: using OR: So, definitely quite useful improvement! 1 fact about Case: as we know you can't use Case on non-constant expressions, so using Case for my functions would require a good amount of code change. If it proved to be significantly faster then OR, I would make the change, but it's not. As usual I'm just looking for comment on my observations, perhaps suggestion on something I don't see or didn't take into consideration. And the benchmark code - with the usual limitation of simple repeated calls: program Project1; {$APPTYPE CONSOLE} {$R *.res} uses System.SysUtils, System.Diagnostics; type TControlsRec = record EditID, ButtonID, CheckID, FormID, FrameID, ListBoxID, PageControlID, TabControlID, RadioBtnID, ComboBoxID: integer; EditStr, ButtonStr, CheckStr, FormStr, FrameStr, ListBoxStr, PageControlStr, TabControlStr, RadioBtnStr, ComboBoxStr: string; end; var xControlsRec: TControlsRec; procedure AssignControlsRec; begin xControlsRec.EditID := 1; xControlsRec.EditStr := 'Edit'; xControlsRec.ButtonID := 2; xControlsRec.ButtonStr := 'Button'; xControlsRec.CheckID := 3; xControlsRec.CheckStr := 'Check'; xControlsRec.FormID := 4; xControlsRec.FormStr := 'Form'; xControlsRec.FrameID := 5; xControlsRec.FrameStr := 'Frame'; xControlsRec.ListBoxID := 6; xControlsRec.ListBoxStr := 'ListBox'; xControlsRec.PageControlID := 7; xControlsRec.PageControlStr := 'PageControl'; xControlsRec.TabControlID := 8; xControlsRec.TabControlStr := 'TabControl'; xControlsRec.RadioBtnID := 9; xControlsRec.RadioBtnStr := 'RadioBtn'; xControlsRec.ComboBoxID := 10; xControlsRec.ComboBoxStr := 'ComboBox'; end; function IsIN(aID: integer): boolean; begin Result := aID in [xControlsRec.ButtonID, xControlsRec.FormID, xControlsRec.ListBoxID, xControlsRec.TabControlID, xControlsRec.ComboBoxID]; end; function IsOR(aID: integer): boolean; begin Result := (aID = xControlsRec.ButtonID) or (aID = xControlsRec.FormID) or (aID = xControlsRec.ListBoxID) or (aID = xControlsRec.TabControlID) or (aID = xControlsRec.ComboBoxID); end; function IsCase(aID: integer): boolean; begin case aID of 2,4,6,8,10: Result := True; else Result := false; end; end; function IsCase2(aID: integer): boolean; begin case aID of 2: Result := True; 4: Result := True; 6: Result := True; 8: Result := True; 10: Result := True; else Result := false; end; end; const cLoop = 100000000; // Values/ControlIDs to search for // 3 values are in the group, 2 are not - similar ratio as in real data cSearch: array[1..5] of integer = (2,4,5,8,11); var vSW: TStopwatch; vBool:boolean; i: integer; begin AssignControlsRec; // IsIN vSW := TStopwatch.StartNew; for i := 1 to cLoop do begin vBool := IsIN(cSearch[1]); vBool := IsIN(cSearch[2]); vBool := IsIN(cSearch[3]); vBool := IsIN(cSearch[4]); vBool := IsIN(cSearch[5]); end; vSW.Stop; Writeln(Format('IsIN = %5s', [vSW.ElapsedMilliseconds.ToString])); // IsOR vSW := TStopwatch.StartNew; for i := 1 to cLoop do begin vBool := IsOR(cSearch[1]); vBool := IsOR(cSearch[2]); vBool := IsOR(cSearch[3]); vBool := IsOR(cSearch[4]); vBool := IsOR(cSearch[5]); end; vSW.Stop; Writeln(Format('IsOR = %5s', [vSW.ElapsedMilliseconds.ToString])); // IsCase vSW := TStopwatch.StartNew; for i := 1 to cLoop do begin vBool := IsCase(cSearch[1]); vBool := IsCase(cSearch[2]); vBool := IsCase(cSearch[3]); vBool := IsCase(cSearch[4]); vBool := IsCase(cSearch[5]); end; vSW.Stop; Writeln(Format('IsCase = %5s', [vSW.ElapsedMilliseconds.ToString])); // IsCase2 vSW := TStopwatch.StartNew; for i := 1 to cLoop do begin vBool := IsCase2(cSearch[1]); vBool := IsCase2(cSearch[2]); vBool := IsCase2(cSearch[3]); vBool := IsCase2(cSearch[4]); vBool := IsCase2(cSearch[5]); end; vSW.Stop; Writeln(Format('IsCase2 = %5s', [vSW.ElapsedMilliseconds.ToString])); readln; end. 1 Share this post Link to post
Attila Kovacs 629 Posted March 19, 2022 (edited) You are mostly benchmarking stack operations, calls etc.. -again- everything which has nothing to do with the actual comparisons and it's asm implementations. I'd suggest you to buy an assembly book first and try to understand the asm in the Delphi debugger. There are 3 changes in the unit, try to find them and tell us what do you think about the new results. Project1.dpr Edited March 19, 2022 by Attila Kovacs 1 Share this post Link to post
Anders Melander 1795 Posted March 19, 2022 5 hours ago, Mike Torrettinni said: As usual I'm just looking for comment on my observations, perhaps suggestion on something I don't see or didn't take into consideration. Okay then, here's a comment: If you really cared about performance then you would profile your code and direct your efforts at areas that would make an actual difference on a higher level. 1 Share this post Link to post
David Heffernan 2350 Posted March 19, 2022 35 minutes ago, Anders Melander said: Okay then, here's a comment: If you really cared about performance then you would profile your code and direct your efforts at areas that would make an actual difference on a higher level. The familiar refrain, but the message has not landed yet Share this post Link to post
Stefan Glienke 2009 Posted March 21, 2022 I am usually happy to help people improve in that subject but seeing that you simply ignored half the advice given to you before in countless similar threads I consider this a waste of time. 1 3 Share this post Link to post
Mike Torrettinni 198 Posted March 21, 2022 20 minutes ago, Stefan Glienke said: I am usually happy to help people improve in that subject but seeing that you simply ignored half the advice given to you before in countless similar threads I consider this a waste of time. I recently reached the age bracket that many would probably consider 'unteachable', so it's not on purpose. I was away from Delphi for a couple of months and now preparing projects for 64bit, so it's fun to do some benchmarking again. If I can make some good progress, like in this case reducing some process runtime by half a second, even better. This process is part of search feature, so user interaction feature, and any speed improvement is very useful because users notice them very quickly. 1 Share this post Link to post
Mike Torrettinni 198 Posted March 21, 2022 @Stefan Glienke I got a little 'nudge' that my reply above was rude and seems inconsiderate to your help in the past. I hope you didn't read it that way. I took your comment as if my benchmark was waste of time, but re-reading it, I see what you meant. Anyway, I think I was always appreciative to your help and expressed it in past topics. Don't feel like you need to help me in every topic, I have no problem getting no comments on my benchmarks, I take it as "nothing big is wrong with it" or "it's so wrong, it would take me too long to explain", I welcome both options, because they mean either I did something right or I'm slowly getting there. The 64bit projects will go through a lot of benchmarks, so I will probably post more observations like this one, so be on the lookout for something that peeks your interest, I will not be offended if you skip the rest. Also, come on, you can't say that Moe, Larry and Curly's reprisal of 'First!' comment wasn't worth at least a little chuckle at my expense... I know I did, he he. 2 Share this post Link to post
Stefan Glienke 2009 Posted March 21, 2022 (edited) The point that others already have expressed is that despite being interested in a topic as performance improvement so low level (as in instruction-level instead of algorithmic level) you seem to lack some important knowledge to do so such as assembly - it does not require as much as it does to write assembly code but to understand it in order to be able to look at the code in the debugger and see that some comparisons are apples and bananas. I did not even read through your code but simply placed a breakpoint into your IsIN function and noticed that it contained a function call to System.SetElem (that even was the first time I have ever seen that function being called so a TIL for me). Why was that the case? Because you are not using consts here but variables. Had you simply made consts for all those IDs the code would have almost as fast as the IsOR which does not suffer to extra function calls but from memory reads (not noticeable in the benchmark because its all in L1 cache already). On my CPU InOR is still a little faster than IsIN which is due to the fact how the compiler builds the in - you can see that for yourself in the disassembly and then look at instruction timings, read up on macro-operation fusion and data dependency For reference, this is the assembly for the two functions when using consts Project1.dpr.40: Result := aID in [xControlsRec.ButtonID, xControlsRec.FormID, xControlsRec.ListBoxID, xControlsRec.TabControlID, xControlsRec.ComboBoxID]; 004CEE7C 83E802 sub eax,$02 004CEE7F 7417 jz $004cee98 004CEE81 83E802 sub eax,$02 004CEE84 7412 jz $004cee98 004CEE86 83E802 sub eax,$02 004CEE89 740D jz $004cee98 004CEE8B 83E802 sub eax,$02 004CEE8E 7408 jz $004cee98 004CEE90 83E802 sub eax,$02 004CEE93 7403 jz $004cee98 004CEE95 33C0 xor eax,eax 004CEE97 C3 ret 004CEE98 B001 mov al,$01 Project1.dpr.41: end; 004CEE9A C3 ret 004CEE9B 90 nop Project1.dpr.45: Result := (aID = xControlsRec.ButtonID) or (aID = xControlsRec.FormID) or (aID = xControlsRec.ListBoxID) or (aID = xControlsRec.TabControlID) or (aID = xControlsRec.ComboBoxID); 004CEE9C 83F802 cmp eax,$02 004CEE9F 7417 jz $004ceeb8 004CEEA1 83F804 cmp eax,$04 004CEEA4 7412 jz $004ceeb8 004CEEA6 83F806 cmp eax,$06 004CEEA9 740D jz $004ceeb8 004CEEAB 83F808 cmp eax,$08 004CEEAE 7408 jz $004ceeb8 004CEEB0 83F80A cmp eax,$0a 004CEEB3 7403 jz $004ceeb8 004CEEB5 33C0 xor eax,eax 004CEEB7 C3 ret 004CEEB8 B001 mov al,$01 Project1.dpr.46: end; 004CEEBA C3 ret Depending on the number of IDs you have it might be worth using power of two and bitmasks or an enum directly because that would only require one cmp/test making the function twice as fast and perfect for inlining which would then also eliminate the function call overhead at all. Edited March 21, 2022 by Stefan Glienke 2 2 Share this post Link to post
SwiftExpat 65 Posted March 21, 2022 18 minutes ago, Stefan Glienke said: Because you are not using consts here but variables. Thanks for the good explanation above and taking the time to write it up. Share this post Link to post
Pat Foley 51 Posted March 22, 2022 I learned alot from Mike questions and answers. Here's my stab at it. program interSector; {$APPTYPE CONSOLE} {$R *.res} {$o+} uses System.SysUtils, System.Diagnostics, Classes; type // TsetOPs = (EditStr, ButtonStr, CheckStr, FormStr, FrameStr, ListBoxStr, PageControlStr, TabControlStr, RadioBtnStr, ComboBoxStr); TdbKey = (dateID, sqlID, CashID, EditID, ButtonID, CheckID, FormID, FrameID, ListBoxID, PageControlID, TabControlID, RadioBtnID, ComboBoxID); // : integer; TsetKeys = set of TdbKey; const cLoop = 100000000; cSet: TsetKeys = [dateID, CheckID, FrameID, ComboBoxID]; var vSW: TStopwatch; vBool: boolean; procedure IsIntersected(const aID: integer; var aB: boolean); inline var Intersected: TsetKeys; Op: TdbKey; begin Intersected := [CheckID, FrameID, ButtonID] * cSet; aB := aB or (Intersected <> []); //for OPs in Intersected do to build relational DB end; begin vBool := false; vSW := TStopwatch.StartNew; var I: CppULongInt := 0; while I < cLoop do begin Inc(I); IsIntersected(I, vBool); end; Writeln(Format('IsIntersect = %5s', [vSW.ElapsedMilliseconds.ToString])); vBool := not vBool; readln; end. interSector.dpr.44: IsInterSected(I,VBool); 000000000052B51F 89C1 mov ecx,eax 000000000052B521 85C9 test ecx,ecx 000000000052B523 7D05 jnl interSector + $6A 000000000052B525 E8A611EEFF call @BoundErr 000000000052B52A 480FB70D1E010000 movzx rcx,word ptr [rel $0000011e] 000000000052B532 66230D035B0200 and cx,[rel $00025b03] 000000000052B539 803D5831030000 cmp byte ptr [rel $00033158],$00 000000000052B540 750D jnz interSector + $8F 000000000052B542 663B0D09010000 cmp cx,[rel $00000109] 000000000052B549 7504 jnz interSector + $8F 000000000052B54B 33C9 xor ecx,ecx 000000000052B54D EB02 jmp interSector + $91 000000000052B54F B101 mov cl,$01 000000000052B551 880D41310300 mov [rel $00033141],cl interSector.dpr.41: while I < cLoop do 000000000052B557 81F800E1F505 cmp eax,$05f5e100 000000000052B55D 72B6 jb interSector + $55 000000000052B55F 90 nop 3 Share this post Link to post
Mike Torrettinni 198 Posted March 22, 2022 13 hours ago, Stefan Glienke said: when using consts Something like this, right?: const cControlsRec: TControlsRec = (EditID: 1; ButtonID:2; CheckID: 3; FormID:4; FrameID:5; ListBoxID:6; PageControlID:7; TabControlID:8; RadioBtnID:9; ComboBoxID:10 ); When I use const cControlsRec, the difference is not that big and @SetElem is called anyway non-const: const: You can see my asm code is quite different than yours. Is it possible because of Delphi versions? I still use 10.2.3. Share this post Link to post
Stefan Glienke 2009 Posted March 22, 2022 No, that is a const record - which is not really const but just a write protected variable. You know how to declare integer consts, do you? Share this post Link to post
Mike Torrettinni 198 Posted March 23, 2022 8 hours ago, Stefan Glienke said: No, that is a const record - which is not really const but just a write protected variable. You know how to declare integer consts, do you? OK, const record is not the same as const, wasn't aware of that - then how did you manage to get a integer const prefixed with xControlsRec, like xControlsRec.ButtonID? If I try a const with a dot in the name, it doesn't compile: const xCtrls.EditID: integer = 1; it complains: [dcc32 Error] Project1.dpr(93): E2029 '=' expected but '.' found If I set integer consts like this: const xButtonID: integer = 2; xFormID: integer = 4; xListBoxID: integer = 6; xTabControlID: integer = 8; xComboBoxID: integer = 10; the runtime is similar as before, still uses @SetElem: This is D10.2.3. When trying this in D11.1 in VM, it generates the same asm, so not sure how you got your completely different asm code: Share this post Link to post
Lars Fosdal 1792 Posted March 23, 2022 2 minutes ago, Mike Torrettinni said: how did you manage to get a integer const prefixed with xControlsRec, like xControlsRec.ButtonID? type xConst = record const ButtonId = 1; end; procedure Test; var bId: Integer; begin bId := xConst.ButtonId; end; 1 Share this post Link to post
Mike Torrettinni 198 Posted March 23, 2022 Just now, Lars Fosdal said: type xConst = record const ButtonId = 1; end; procedure Test; var bId: Integer; begin bId := xConst.ButtonId; end; OK, I think I see it now, const in a record vs const record. Share this post Link to post
Mike Torrettinni 198 Posted March 23, 2022 Well, even with a real const in a record, I still get @SetElem: type TCtrls = record const ButtonID: integer = 2; FormID: integer = 4; ListBoxID: integer = 6; TabControlID: integer = 8; ComboBoxID: integer = 10; end; I tried with a var of TCtrls type, just in case if it makes a diff, still the same: var xCtrls: TCtrls; Share this post Link to post
Lars Fosdal 1792 Posted March 23, 2022 Typed consts are not consts unless you type them on the right side of the expression. const NotAConst: Integer = -1; IsAConst = Integer(-1); The record type simply becomes a "scope" for const declarations. Benefit: the consts can be non-uniquely named without polluting your name spaces since you always need to prefix with the type name to access the const. Added bonus - you can have methods in the record declaration that can deal with logical operations related to the constants. type cButton = record const ThisIsFine = mrOK; end; cAnswer = record const ThisIsFine = String('This is fine'); end; cLimit = record const ThisIsFine = Double(3.14); end; 1 Share this post Link to post
Mike Torrettinni 198 Posted March 23, 2022 31 minutes ago, Lars Fosdal said: Typed consts are not consts unless you type them on the right side of the expression. I knew there is a difference between them, more for strings - in one of my benchmarks using string typed consts resulted in faster runtime vs non-typed consts. But still don't know to what extent it affects every case that uses typed conts. Re-reading this thread, but still trying to process all the info there: Share this post Link to post
David Heffernan 2350 Posted March 23, 2022 What proportion of the overall time your program takes for the task, is the operation you are trying to optimise in this thread? 2 Share this post Link to post
Mike Torrettinni 198 Posted March 23, 2022 On 3/21/2022 at 1:32 PM, Stefan Glienke said: then look at instruction timings, read up on macro-operation fusion and data dependency Thanks and after trying to grasp the content of the links I realized I shouldn't be using micro optimization phrase at all! Will drop it from next topics, as I'm interested in whats faster but definitely I'm out of my comfort zone and not ready to dive into this topic the way it was expected because of the title. Share this post Link to post
pmcgee 18 Posted March 23, 2022 (edited) On 3/22/2022 at 1:32 AM, Stefan Glienke said: Depending on the number of IDs you have it might be worth using power of two and bitmasks or an enum directly because that would only require one cmp/test making the function twice as fast and perfect for inlining which would then also eliminate the function call overhead at all. As Stefan said ... without const ... procedure AssignControlsRec; begin xControlsRec.EditID := 1; xControlsRec.EditStr := 'Edit' ; xControlsRec.ButtonID := 2; xControlsRec.ButtonStr := 'Button' ; xControlsRec.CheckID := 4; xControlsRec.CheckStr := 'Check' ; xControlsRec.FormID := 8; xControlsRec.FormStr := 'Form' ; xControlsRec.FrameID := 16; xControlsRec.FrameStr := 'Frame' ; xControlsRec.ListBoxID := 32; xControlsRec.ListBoxStr := 'ListBox' ; xControlsRec.PageControlID := 64; xControlsRec.PageControlStr := 'PageControl'; xControlsRec.TabControlID := 128; xControlsRec.TabControlStr := 'TabControl' ; xControlsRec.RadioBtnID := 256; xControlsRec.RadioBtnStr := 'RadioBtn' ; xControlsRec.ComboBoxID := 512; xControlsRec.ComboBoxStr := 'ComboBox' ; end; function IsIN(aID: integer): boolean ; // inline; begin const test = xControlsRec.ButtonID + xControlsRec.FormID + xControlsRec.ListBoxID + xControlsRec.TabControlID + xControlsRec.ComboBoxID; IsIn := (test and aID) <> 0; end; Edited March 23, 2022 by pmcgee 1 Share this post Link to post
Bill Meyer 337 Posted March 23, 2022 10 hours ago, David Heffernan said: What proportion of the overall time your program takes for the task, is the operation you are trying to optimise in this thread? Specifically, what is shown by a profiler? Share this post Link to post
Mike Torrettinni 198 Posted March 23, 2022 55 minutes ago, Bill Meyer said: Specifically, what is shown by a profiler? I pasted 2 profiler results in first post and you can see replacing IN with OR shaves off 0.5s. It's part of a search feature and verification of data type is significant portion of search runtime. Share this post Link to post
David Heffernan 2350 Posted March 23, 2022 11 hours ago, David Heffernan said: What proportion of the overall time your program takes for the task, is the operation you are trying to optimise in this thread? Do you know this? Share this post Link to post
pmcgee 18 Posted March 24, 2022 (edited) Neater and equally fast ... type Conts = (edit, button, check, form, frame, listbox, pgctrl, tabctrl, radiobtn, combobox); Ctest = set of Conts; const Strs : array[Conts] of string = ( 'Edit', 'Button', 'Check', 'Form', 'Frame', 'ListBox', 'PageControl', 'TabControl', 'RadioBtn', 'ComboBox' ); function IsIN(aID: Conts): boolean ; // inline; begin const test : Ctest = [button, form, listbox, tabctrl, combobox]; IsIn := aId in test; end; Edited March 24, 2022 by pmcgee 1 Share this post Link to post