Steve Maughan

Any Benchmarks Comparing Executable Speeds for MacOS 64 vs Win 64?

Since the new macOS compiler is based on LLVM, I'd be interested to see any executable speed comparisons against Delphi Win64 builds. Won't the LLVM-based compiler produce executables that run significantly faster? Or am I being too optimistic?

 

Steve

Posted (edited)

Why would you expect it to be faster?

Why would you expect it to be significantly faster? 

What sort of code do you expect to be faster? 

Edited by David Heffernan

7 hours ago, David Heffernan said:

Why would you expect it to be faster?

Why would you expect it to be significantly faster? 

What sort of code do you expect to be faster? 

Hi David,

 

I'm certainly not an expert in this field. Over the years I've seen various speed benchmarks suggesting that Delphi's Windows compiler produces executables that are significantly slower than those produced by the top C++ compilers (e.g. Intel). In the chess world (where I am an expert), the rule of thumb is that a Delphi version of a chess engine will run about 30% slower than the equivalent C++ code base (Critter is an engine that was developed in two parallel code bases).

 

Let's face it, there doesn't seem to have been any work done on optimizing the code created by the Delphi compiler in the last 20 years. I'm just hoping the new backend will be better.

 

Thanks,

 

Steve

  • Like 1
  • Sad 1

6 minutes ago, Steve Maughan said:

I'm just hoping the new backend will be better.

Didn't work out that way for the Linux compiler. 

  • Sad 1

17 minutes ago, David Heffernan said:

Didn't work out that way for the Linux compiler. 

Ugh!!!

 

Thanks for sharing

Posted (edited)

If they hard-code many of the LLVM backend's optimization options to off, it won't produce "as fast as it could be" code...

Edited by Stefan Glienke

Posted (edited)
31 minutes ago, Stefan Glienke said:

If they hard-code many of the LLVM backend's optimization options to off, it won't produce "as fast as it could be" code...

... or the emitted IR code of the frontend compiler can't be optimized anymore.

Edited by RonaldK

1 hour ago, Stefan Glienke said:

If they hard-code many of the LLVM backend's optimization options to off, it won't produce "as fast as it could be" code...

Have they done that? Why would they do that?

 

Thanks,

 

Steve

Posted (edited)
2 hours ago, Steve Maughan said:

Have they done that? Why would they do that?

Yes, but don't ask me why. I've been told that some of the optimizations slow down compilation even more than is already the case with the LLVM-based compilers compared to the classic ones, but FWIW I would not mind longer compile times in the release config.

Edited by Stefan Glienke
  • Like 2
  • Sad 1

On 7/29/2019 at 1:38 PM, Stefan Glienke said:

Yes, but don't ask me why. I've been told that some of the optimizations slow down compilation even more than is already the case with the LLVM-based compilers compared to the classic ones, but FWIW I would not mind longer compile times in the release config.

Very strange. Isn't that the main reason for a release config?

Maybe this is a new area for the IDEFixPack ;-) @jbg

1 hour ago, RonaldK said:

Maybe this is a new area for the IDEFixPack ;-) @jbg

Only if you invent a cloning machine first.

  • Haha 1

53 minutes ago, jbg said:

Only if you invent a cloning machine first.

... for setting the optimization flags?

 

 

3 minutes ago, RonaldK said:

... for setting the optimization flags?

Unless you can send me the compiler's source code and the build tools for it...

On 7/29/2019 at 1:38 PM, Stefan Glienke said:

Yes, but don't ask me why. I've been told that some of the optimizations slow down compilation even more than is already the case with the LLVM-based compilers compared to the classic ones, but FWIW I would not mind longer compile times in the release config.

Yes, that is true.
I haven't run any benchmarks, but my gut feeling regarding compile time is that Macos64 currently compiles significantly slower than Macos32.
So I hope that this is only an effect of the initial version, and that both optimization and compile speed will be brought up to normal in the next update(s).

29 minutes ago, Rollo62 said:

So I hope that this is only an effect of the initial version, and that both optimization and compile speed will be brought up to normal in the next update(s).

Judging from the performance of the mobile compilers over the years I would say: no

  • Sad 2


Compiling this simple program:

program test;

{$OPTIMIZATION ON}
{$STACKFRAMES OFF}

procedure loop;
var
  i,sum : integer;
begin
  sum := 0;
  for i := 0 to 1000 do
    sum := sum + 1;
end;

begin
  loop;
end.

Generates the following code (Delphi Mac 64-bit compiler, release build):

__project1_loop PROC
        push    rbp                                  
        mov     rbp, rsp                             
        mov     dword ptr [rbp-8H], 0                
        mov     dword ptr [rbp-0CH], 1001            
ALIGN   16
?_001:  inc     dword ptr [rbp-8H]                   
        dec     dword ptr [rbp-0CH]                  
        jnz     ?_001                                
        pop     rbp                                  
        ret                                          
__project1_loop ENDP


Some points to notice:
- The loop procedure has a stack frame even though we've turned stack frames off.
- It does not try to use registers for local variables. Both "i" and "sum" are allocated on the stack.
- Since the loop body has no side effects and the result is not used, the whole loop can be optimized away. This is what happens when the same code is written in C and compiled with Clang. Clang is an interesting comparison because it also uses the LLVM infrastructure, so it gives an idea of what kinds of optimizations are possible.
- If we change the loop so that the result is used (by printing the content of the "sum" variable), Clang recognizes that the loop can be evaluated at compile time and replaces it with code that prints the resulting constant. Again, Delphi does not do this. See result of Clang here.
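
For anyone who wants to repeat the comparison, here is a minimal C sketch of the two variants described in the points above. It is a straight translation of the Delphi test program, not the exact source behind the Clang result mentioned in the post, and the function names (`loop_unused`, `loop_sum`, `print_sum`) are made up for illustration.

```c
#include <stdio.h>

/* C translation of the Delphi "loop" procedure. The result is never used,
   so an optimizing compiler is free to delete the whole loop. */
void loop_unused(void)
{
    int sum = 0;
    for (int i = 0; i <= 1000; i++)   /* 0..1000 inclusive, like "for i := 0 to 1000" */
        sum = sum + 1;
    (void)sum;                        /* silence the "set but not used" warning */
}

/* Variant where the result is used. The trip count is a compile-time
   constant, so the sum can be computed without running the loop. */
int loop_sum(void)
{
    int sum = 0;
    for (int i = 0; i <= 1000; i++)
        sum = sum + 1;
    return sum;
}

/* Prints the result so the compiler cannot discard the computation. */
void print_sum(void)
{
    printf("%d\n", loop_sum());
}
```

Calling `print_sum()` from `main` prints 1001. Compiling with `clang -O2 -S` and reading the generated assembly should show whether the loop survived; according to the post, Clang removes it in both variants.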

  • Sad 1

Posted (edited)
39 minutes ago, Stefan Glienke said:

Looks pretty much like -O0 assembly

I agree. It seems no LLVM optimization passes are enabled, which matches what others have reported about the mobile LLVM compilers. This is very disappointing.

Edited by VilleK
  • Sad 1


No LLVM optimization is really disappointing. 

 

If you wind the clock back to 1995, many of the early developers coming from VB3 were excited about a compiled executable that ran at the same speed as "C" programs. They wanted and valued fast executables. Borland / CodeGear / Embarcadero have each ignored the value their developers place on fast code and haven't improved the code optimization for 20 years. It seems they take an "it's good enough" view. I'd encourage them to invest some time and enable all the optimizations that LLVM offers (even if it slows down compilation for the release build). I think they'd be surprised at how well this would be received by the Delphi community.

 

Steve

  • Like 3


If it's too slow for you, buy a faster computer

:classic_ninja:

  • Like 1


When optimizations are turned off in the Delphi IDE, the code is worse:

- Each assignment to the "sum" variable is a load from the stack into a register, a modification of the register, and a store back to the stack. The same goes for the loop variable "i".

- The loop start is not on an aligned address.

- The loop counts upwards instead of downwards, requiring a compare at the end instead of just a jump-if-not-zero.

So the compiler does perform some optimizations, but they are very basic.
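
The point about loop direction can be shown at source level. Below is a minimal sketch in C (used here because the thread compares against Clang; the function names are hypothetical). Both functions execute the body 1001 times, but the second counts a remaining-iterations value down to zero, which is the shape of the optimized listing earlier in the thread (`dec` followed by `jnz`).

```c
/* Counts up: every iteration compares i against the upper bound
   before deciding whether to continue. */
int sum_counting_up(void)
{
    int sum = 0;
    for (int i = 0; i <= 1000; i++)
        sum += 1;
    return sum;
}

/* Counts down: the remaining trip count is decremented until it reaches
   zero, so on x86 the decrement itself can set the flag the branch tests. */
int sum_counting_down(void)
{
    int sum = 0;
    for (int remaining = 1001; remaining != 0; remaining--)
        sum += 1;
    return sum;
}
```

The rewrite is only valid because the loop variable is not used inside the body. The sketch just makes visible, by hand, the transformation the compiler applied in the optimized build.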

  • Sad 1


I ran a benchmark with a software AES cipher class, and the Win64 build is nearly twice as fast as the Linux/OSX LLVM builds.

The problem appears to be that LLVM optimizations are not used at all.

  • Like 2

4 hours ago, RDP1974 said:

I ran a benchmark with a software AES cipher class, and the Win64 build is nearly twice as fast as the Linux/OSX LLVM builds.

This difference will no longer exist when Windows also gets this LLVM compiler :classic_biggrin:

  • Like 1
  • Sad 1

Posted (edited)
17 minutes ago, RonaldK said:

This difference will no longer exist when Windows also gets this LLVM compiler :classic_biggrin:

[Image attachment]

Edited by Stefan Glienke
  • Like 1
  • Haha 7
