pyscripter

Impact of debug dcus on performance


I used to think that activating the compiler option "Use debug dcus" had minimal impact on performance. I hadn't done any benchmarking, but from experience I could not tell the difference. So I would include debug dcus even in release versions, so that full stack trace information would be available. But in my recent tests on the performance of regular expressions, I found that activating this option increased run times by as much as 70-80%.

  • Has anyone done any benchmarking, or does anyone have a rough idea about this?
  • Is the large performance hit specific to this case? 
  • Can you still get full stack trace information without using debug dcus?   (I am using jcl's Debug expert for exception reporting, with jdbg info linked into the executable).
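One way to put a figure like "70-80%" on firmer ground is to time each build several times and compare medians. The following is only a sketch, written in Python as a neutral scripting language for driving two pre-built executables. All function names are illustrative, and it assumes the same benchmark project has been built twice, once without and once with "Use debug dcus".

```python
import statistics
import subprocess
import time


def median_seconds(action, runs=5):
    """Call action() `runs` times and return the median wall-clock seconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        action()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def run_build(cmd):
    """Return a callable that runs one build of the benchmark to completion."""
    def action():
        # check=True raises if the benchmark exits with a non-zero code.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
    return action


def percent_increase(baseline, candidate):
    """Run-time increase of `candidate` over `baseline`, in percent."""
    return (candidate - baseline) / baseline * 100.0


def compare_builds(baseline_cmd, candidate_cmd, runs=5):
    """Percent increase in median run time of candidate over baseline."""
    baseline = median_seconds(run_build(baseline_cmd), runs)
    candidate = median_seconds(run_build(candidate_cmd), runs)
    return percent_increase(baseline, candidate)
```

With hypothetical executable names, a call such as `compare_builds(["bench_plain.exe"], ["bench_debugdcus.exe"])` would return the slowdown in percent. Using the median rather than a single run reduces the influence of one-off disturbances, although process start-up time is included in every sample.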

 

Example stack trace:

Quote

Stack list, generated 14/09/2018 15:56:54
(0000F543){PyScripter.exe} [00410543] System.LocaleCharsFromUnicode (Line 39900, "System.pas" + 1) + $17
(001CDF3F){PyScripter.exe} [005CEF3F] Vcl.Controls.TWinControl.CMEnabledChanged (Line 11672, "Vcl.Controls.pas" + 2) + $2
(002B5F41){PyScripter.exe} [006B6F41] VirtualTrees.TBaseVirtualTree.CMEnabledChanged (Line 15752, "VirtualTrees.pas" + 1) + $2
(0003F413){PyScripter.exe} [00440413] System.Generics.Collections.TListHelper.DoRemoveFwd4 (Line 2295, "System.Generics.Collections.pas" + 3) + $6
(000081DB){PyScripter.exe} [004091DB] System.TMonitor.Exit (Line 18722, "System.pas" + 2) + $7
(000A558C){PyScripter.exe} [004A658C] System.Classes.RemoveFixups (Line 9715, "System.Classes.pas" + 14) + $8
(00007620){PyScripter.exe} [00408620] System.TObject.Destroy (Line 16985, "System.pas" + 0) + $0
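To see which frames in a trace like this come from RTL or VCL units (the frames relevant to the debug dcus question), the lines can be parsed mechanically. This is a minimal sketch in Python that assumes exactly the frame layout shown above. Recognising RTL and VCL units by the prefixes `System.` and `Vcl.` is a heuristic introduced here, not something JCL reports.

```python
import re

# One frame in the layout of the trace above:
# (offset){module} [address] Symbol (Line N, "Unit.pas" + k) + $d
FRAME = re.compile(
    r'\((?P<offset>[0-9A-F]{8})\)\{(?P<module>[^}]+)\} '
    r'\[(?P<address>[0-9A-F]{8})\] '
    r'(?P<symbol>\S+) '
    r'\(Line (?P<line>\d+), "(?P<unit>[^"]+)" \+ \d+\) \+ \$[0-9A-F]+'
)

# Assumption: units shipped with the RTL/VCL are recognised by name prefix.
RTL_VCL_PREFIXES = ("System.", "Vcl.")


def parse_frame(text):
    """Parse one trace line; return None if it is not a frame line."""
    match = FRAME.fullmatch(text.strip())
    if match is None:
        return None
    unit = match["unit"]
    return {
        "symbol": match["symbol"],
        "unit": unit,
        "line": int(match["line"]),
        "rtl_or_vcl": unit.startswith(RTL_VCL_PREFIXES),
    }
```

Applied to the trace above, every frame except the `VirtualTrees.pas` one would be flagged as RTL or VCL, so almost all of the line numbers shown come from units supplied with Delphi.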

 

 

Originally the idea was that the debug dcus only contain additional information for the integrated debugger, which should have no performance impact at all. This of course is only true if all compiler (and possibly linker) settings are equal, which I doubt. E.g. enabling range checking (which I always do for debug builds) can have a significant impact on performance. I have no idea what compiler options were used for the supplied debug dcus.

 

The jcldebug stack trace does not require debug dcus, but a detailed map file, which does not have any performance impact.


My understanding has always been that debug dcus are compiled with optimisations etc. and so there should be no performance difference. 

 

Can you provide a cut down program that demonstrates the issue? 

Edited by David Heffernan


But here I am comparing the same app with and without debug dcus, not 32-bit against 64-bit.

Actually I just noticed that the difference only appears with 32-bit in Delphi Rio. Debug dcus make little difference in 64-bit Rio or in Delphi Tokyo.

The surprising thing is that in this benchmark application, most of the computational time is spent inside external C code linked into the application. Could it be that the debug dcus link to different C code in 32-bit Rio compared to the dcus without debug info?

 

Another finding was that in Delphi Tokyo, the benchmark runs significantly faster in 64-bit than in 32-bit (quite the opposite of David's zlib experience).


I don't have the impression that the debug dcu code has been compiled with optimization.

I'm also not sure it should have been. For exception reporting I'm using EurekaLog; the call stack is okay, though maybe not as verbose as with debug dcus.

But in principle I never release with debug dcu's.

4 hours ago, pyscripter said:

But here I am comparing the same app with and without debug dcus, not 32-bit against 64-bit.

Actually I just noticed that the difference only appears with 32-bit in Delphi Rio. Debug dcus make little difference in 64-bit Rio or in Delphi Tokyo.

The surprising thing is that in this benchmark application, most of the computational time is spent inside external C code linked into the application. Could it be that the debug dcus link to different C code in 32-bit Rio compared to the dcus without debug info?

 

Another finding was that in Delphi Tokyo, the benchmark runs significantly faster in 64-bit than in 32-bit (quite the opposite of David's zlib experience).

My point is not that it's a 32/64 bit issue. My point is that it could be a difference in the way the C code is compiled. 


@dummzeuch I checked again, and without debug dcus you do not get line information for VCL and RTL routines in the call stack when using JCL debug. This was the reason I was using the "Use debug dcus" option.

