DelphiUdIT 200 Posted December 27, 2024 (edited)

Hi, I was doing some tests and came across some strange behavior. The code below performs a division between a random value taken from an array and a constant via an assembler function, and returns the integer quotient and the remainder.

Program Project1;

{$APPTYPE CONSOLE}

{$R *.res}

uses
  System.SysUtils, System.Diagnostics; //WinApi.Windows,

{$IFDEF CPUX64}
function DivandMod(Dividendo: UInt32; Divisore: UInt32; var Resto: UInt32): UInt32; overload;
asm
  .noframe
  mov EAX, ECX             // EAX := Dividendo (first argument arrives in ECX)
  mov ECX, EDX             // ECX := Divisore (second argument arrives in EDX)
  xor EDX, EDX             // clear the high half of the EDX:EAX dividend
  div ECX                  // EAX := quotient (the result), EDX := remainder
  mov DWORD PTR [R8], EDX  // Resto := remainder (R8 holds the var parameter's address)
end;
{$ENDIF}

const
  counting = 500_000_000;

var
  i, k: UInt32;
  t: TStopWatch;
  a: TArray<UInt32>;
  resto: UInt32;
  //risultato: UInt32;
  //prochandle: THandle;

begin
  // procHandle := GetCurrentProcess;
  // SetProcessAffinityMask(prochandle, $1);
  SetLength(a, counting + 1);
  t := TStopWatch.StartNew;
  for i := 0 to counting do
    a[i] := Random(High(Int32));
  t.Stop;
  WriteLn('Load random values:':20, t.ElapsedMilliseconds);
  for k := 1 to 3 do
  begin
    t := TStopWatch.StartNew;
    for i := 0 to counting do
      DivandMod(a[i], 2039, resto);
    // risultato := DivandMod(-12, 7, resto);
    t.Stop;
    WriteLn('Test:':20, t.ElapsedMilliseconds);
  end;
  writeln('');
  writeln('Press any key to close the program');
  ReadLn;
end.

When the code is compiled with the "Overflow checking" and "Range checking" options enabled (typical for debugging), the times (in ms) are:

Quote
Load random values:876
Test:672
Test:671
Test:687
Press any key to close the program

With the options disabled I would expect the times to improve; instead they get worse:

Quote
Load random values:826
Test:802
Test:812
Test:812
Press any key to close the program

Does anyone have any idea why? I noticed this because the times increased when running the program in RELEASE mode. I wasted some time trying to understand why, then found that simply enabling or disabling those options changes the timings (this is independent of DEBUG or RELEASE mode).

Tested on Windows 11, i9-14900HX, Delphi 12.2 with the latest patch, project built for x64 without optimization and without stack frames.

Edited December 27, 2024 by DelphiUdIT
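For readers who want to sanity-check what the assembler routine in the post above computes, here is a minimal sketch of a pure-Pascal counterpart. It assumes the intent is plain unsigned 32-bit division; `DivandModPascal` is a hypothetical name that does not appear in the original post, and the sketch says nothing about the timings discussed in the thread.

```pascal
program DivModCheck;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical pure-Pascal counterpart of the DivandMod assembler routine.
// Not part of the original post; it only mirrors the routine's semantics.
function DivandModPascal(Dividendo, Divisore: UInt32; var Resto: UInt32): UInt32;
begin
  Result := Dividendo div Divisore;  // quotient, what DIV leaves in EAX
  Resto := Dividendo mod Divisore;   // remainder, what DIV leaves in EDX
end;

var
  q, r: UInt32;

begin
  q := DivandModPascal(10000, 2039, r);
  // 4 * 2039 = 8156 and 10000 - 8156 = 1844
  WriteLn(q, ' ', r);  // prints "4 1844"
end.
```

Comparing this against the assembler version on a handful of inputs is a quick way to confirm both return the same quotient and remainder before timing anything.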
Brian Evans 111 Posted December 27, 2024 (edited)

Sure you don't have the options reversed between debug/release? The processor might be making poor choices in where it sends instructions, but it would be odd to see a couple more instructions execute faster, even with a poor choice of execution-unit scheduling or register usage.

The only difference I see is that the default debug settings generate code that range-checks a[i], verifying that the index is within the array bounds.

-- range and bounds checking off
Project1.dpr.51: DivandMod(a[i], 2039, resto);
000000000036CB5E 488B05834B0300   mov rax,[rel $00034b83]
000000000036CB65 8B0D594B0300     mov ecx,[rel $00034b59]
000000000036CB6B 8B0C88           mov ecx,[rax+rcx*4]
000000000036CB6E BAF7070000       mov edx,$000007f7
000000000036CB73 4C8D05764B0300   lea r8,[rel $00034b76]
000000000036CB7A E881FEFFFF       call DivandMod

-- range and bounds checking on
Project1.dpr.51: DivandMod(a[i], 2039, resto);
000000000036CB83 8B053B4B0300     mov eax,[rel $00034b3b]
000000000036CB89 48833D574B030000 cmp qword ptr [rel $00034b57],$00
000000000036CB91 740D             jz Project1 + $150
000000000036CB93 488B0D4E4B0300   mov rcx,[rel $00034b4e]
000000000036CB9A 483B41F8         cmp rax,[rcx-$08]
000000000036CB9E 7205             jb Project1 + $155
000000000036CBA0 E8BBFFECFF       call @BoundErr
000000000036CBA5 488B0D3C4B0300   mov rcx,[rel $00034b3c]
000000000036CBAC 8B0C81           mov ecx,[rcx+rax*4]
000000000036CBAF BAF7070000       mov edx,$000007f7
000000000036CBB4 4C8D05354B0300   lea r8,[rel $00034b35]
000000000036CBBB E840FEFFFF       call DivandMod

Edited December 27, 2024 by Brian Evans
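The extra instructions in the "checking on" listing can be read as a nil test on the array, an unsigned comparison of the index against the length stored just before the array data, and a call to @BoundErr on failure. A rough Pascal rendering of that logic might look like the sketch below. `CheckedRead` is a hypothetical helper, and raising ERangeError by hand is an assumption standing in for the RTL's @BoundErr routine; the compiler emits the check inline rather than calling a function like this.

```pascal
program RangeCheckSketch;

{$APPTYPE CONSOLE}
{$RANGECHECKS OFF}  // the check is written out by hand below

uses
  System.SysUtils;

// Hypothetical helper mirroring the compiler-generated check on a[i].
function CheckedRead(const A: TArray<UInt32>; Index: UInt32): UInt32;
begin
  // Nil test first, then an unsigned compare of the index with the length.
  if (A = nil) or (NativeUInt(Index) >= NativeUInt(Length(A))) then
    raise ERangeError.Create('Range check error');
  Result := A[Index];
end;

var
  Data: TArray<UInt32>;

begin
  Data := [10, 20, 30];
  WriteLn(CheckedRead(Data, 1));  // index in range, prints 20
  try
    CheckedRead(Data, 3);         // index equals the length, so it is rejected
  except
    on E: ERangeError do
      WriteLn('ERangeError: ', E.Message);
  end;
end.
```

In the common case both branches fall through without raising, so the checked path costs two compares and two conditional jumps per element on top of the unchecked code.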
DelphiUdIT 200 Posted December 28, 2024 (edited)

1 hour ago, Brian Evans said:
Sure you don't have the options reversed between debug/release?

I can change the two options in DEBUG or RELEASE and the final output is the same; there is no difference between RELEASE and DEBUG. If I uncheck them in RELEASE the time increases, just as it does if I uncheck them in DEBUG. And I'm sure the options are not reversed.

1 hour ago, Brian Evans said:
The processor might be making poor choices in where it sends instructions but it would be odd to see a couple more instructions execute faster even with poor choice of execution unit scheduling or register usage.

I think the same. It is really strange, because the instructions executed are the same in both cases (with or without the checks), and the checked build executes additional ones on top. And the "processor logic" seems to prefer the long path ....

Edited December 28, 2024 by DelphiUdIT
David Heffernan 2357 Posted December 28, 2024

Probably the benchmark code is not telling you anything useful.
DelphiUdIT 200 Posted December 28, 2024

1 hour ago, David Heffernan said:
Probably the benchmark code is not telling you anything useful

You might be right.
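One way to probe whether the measurement itself is the problem is to repeat the timed loop more often, report the spread between the best and worst run, and consume the results so the work stays observable. The sketch below is only an illustration of that idea, not something proposed in the thread: the smaller `Count`, the `Runs` constant, the fixed seed, and the pure-Pascal `DivandModPascal` stand-in are all assumptions, and the assembler routine from the first post would be swapped in for a real comparison.

```pascal
program BenchSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Diagnostics;

const
  Count = 10000000;  // assumption: smaller than the original 500_000_000
  Runs = 10;         // assumption: more repetitions than the original 3

// Hypothetical stand-in for the routine under test.
function DivandModPascal(Dividendo, Divisore: UInt32; var Resto: UInt32): UInt32;
begin
  Result := Dividendo div Divisore;
  Resto := Dividendo mod Divisore;
end;

var
  Data: TArray<UInt32>;
  i, Run: Integer;
  Resto: UInt32;
  Sum: UInt64;
  Timer: TStopwatch;
  Ms, Best, Worst: Int64;

begin
  SetLength(Data, Count);
  RandSeed := 12345;  // fixed seed so every build sees the same input
  for i := 0 to Count - 1 do
    Data[i] := UInt32(Random(High(Int32)));

  Best := High(Int64);
  Worst := 0;
  Sum := 0;
  for Run := 1 to Runs do
  begin
    Timer := TStopwatch.StartNew;
    for i := 0 to Count - 1 do
      // Accumulate quotient and remainder so the results are actually used.
      Sum := Sum + DivandModPascal(Data[i], 2039, Resto) + Resto;
    Timer.Stop;
    Ms := Timer.ElapsedMilliseconds;
    if Ms < Best then Best := Ms;
    if Ms > Worst then Worst := Ms;
  end;
  WriteLn('Best: ', Best, ' ms, worst: ', Worst, ' ms, checksum: ', Sum);
end.
```

If the gap between the best and worst run is comparable to the difference seen between the two compiler settings, the original numbers may reflect run-to-run noise or code placement rather than the cost of the checks.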