A.M. Hoornweg

Disabled floating point exceptions are problematic for DLLs

Hello all,

I've noticed that an application I'm working on behaves differently depending on whether it is compiled with Delphi 11 or Delphi 12. With Delphi 12 I was getting all sorts of overflows and unexpected behavior. It took me a while to figure out the cause.

Now the surprising thing: the cause wasn't even in the executable I was working on; it was in a DLL that my program uses. This DLL was built with a previous version of Delphi.

This particular DLL expects that an FP error such as a division by zero will raise an exception. A breaking change in Delphi 12 is that the executables it produces mask FP exceptions by default. This change in behavior affects all code running in the process, including all DLLs. In other words, this DLL no longer raised exceptions where it was supposed to.

My workaround was to re-enable FPU exceptions in the executable, and the situation instantly went "back to normal".
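
A minimal sketch of what such a workaround could look like, assuming the `SetExceptionMask` routine from `System.Math` and assuming that the pre-Delphi 12 Windows default left only the denormal, underflow and precision exceptions masked. The program name and variables are made up for illustration; this is not the code from the actual project.

```pascal
program HostApp;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Math;

var
  Zero, Quotient: Double;

begin
  // Unmask invalid-operation, zero-divide and overflow so they raise again.
  // Denormal, underflow and precision stay masked (assumed old default).
  SetExceptionMask([exDenormalized, exUnderflow, exPrecision]);

  Zero := 0.0;
  try
    Quotient := 1.0 / Zero;
    Writeln('No exception raised, result: ', FloatToStr(Quotient));
  except
    on E: EZeroDivide do
      Writeln('EZeroDivide raised: ', E.Message);
  end;
end.
```

Calling this early in the project file would cover the main thread; whether other threads pick up the same state depends on how the RTL initialises them.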

 

But it got me thinking: how does one write code that copes with both FP situations and can run safely with *any* Delphi version? Should I do something like this?

try
  c := SomeHighlyComplexCalculation();
  if IsNan(c) then
    raise EInvalidOp.Create('Something went wrong');
except
  // the exception is raised either automatically (old Delphi version) or manually (new Delphi version)
end;

 

Currently, my gut feeling tells me to manually enable FPU exceptions in all my Delphi exe projects to prevent such nasty surprises in the future. I use tons of tried-and-tested DLLs written in Delphi and I really don't want them to break.

34 minutes ago, Lajos Juhász said:

It really depends on how many Delphi versions you have to support. The change is documented and was made to make it easier to integrate with other programming languages. More information:

https://dalijap.blogspot.com/2023/09/coming-in-delphi-12-disabled-floating.html

https://docwiki.embarcadero.com/RADStudio/Athens/en/Floating_Point_Operation_Exception_Masks

 

Sure, it is documented. But it may be off the radar that existing binaries such as DLLs may suddenly behave differently if they were written in Delphi. I use some third-party DLLs that were written in Delphi.

Strictly speaking, this is not specific to Delphi 12: your DLLs would have suffered the same problem if they had been called from a non-Delphi application.

I believe that if your code relies on a certain FPU behaviour, it should either fail fast by aborting if the FPU is not set correctly, or temporarily alter it and then restore it to the previous state via try..finally.

In any case, that means your DLL would have to be changed.
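
The fail-fast variant could be sketched roughly as follows. This is only an illustration under assumptions: `RequiredMask`, the status codes and `SafeDivide` are hypothetical names, and `GetExceptionMask` / `SetExceptionMask` are the `System.Math` routines. The routine returns a status code instead of letting an exception cross the DLL boundary.

```pascal
program FailFastDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Math;

const
  // Assumption: the mask the library code was written and tested against.
  RequiredMask: TArithmeticExceptionMask =
    [exDenormalized, exUnderflow, exPrecision];
  // Hypothetical status codes returned to the caller.
  StatusOk = 0;
  StatusWrongFpState = 1;
  StatusMathError = 2;

// Hypothetical exported-style routine that refuses to run under an
// unexpected exception mask.
function SafeDivide(A, B: Double; out Quotient: Double): Integer; stdcall;
begin
  Quotient := 0;
  if GetExceptionMask <> RequiredMask then
    Exit(StatusWrongFpState);  // fail fast: do not compute with the wrong state
  try
    Quotient := A / B;
    Result := StatusOk;
  except
    on EMathError do
      Result := StatusMathError;
  end;
end;

var
  Q: Double;

begin
  // Simulate a host that masks every FP exception.
  SetExceptionMask([exInvalidOp, exDenormalized, exZeroDivide,
    exOverflow, exUnderflow, exPrecision]);
  Writeln('Status with everything masked: ', SafeDivide(1, 2, Q));

  // Simulate a host that honours the expected state.
  SetExceptionMask(RequiredMask);
  Writeln('Status with the required mask: ', SafeDivide(1, 2, Q));
end.
```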

Edited by Der schöne Günther


From my experience, you could never really rely on floating point exceptions. If your code is in a DLL, the host process might change the exception mask. If your code is in the EXE, a loaded DLL might unexpectedly change the exception mask. This can even be caused by showing a common dialog (think Open/Save dialogs) because that will cause Shell extension DLLs to be loaded into your process. Or an ActiveX, an in-process COM client and so on.

 

In the end, you should make your code work both ways and check floating point calculation results with IsNan and IsInfinite, e.g.

try
  F := <some floating point calculation>;
except
  on E: EInvalidOp do
    F := NaN;
  on E: EZeroDivide do
    F := Infinity;
end;
if IsNan(F) then
  <handle InvalidOp>
if IsInfinite(F) then
  <handle zero divide>

 


DLLs should take charge of this. They should either make it part of their contract that the host sets the fp control state to a specific state. Or they should set it on entry, and restore it on exit. My DLL does the latter. 

 

6 hours ago, msohn said:

In the end, you should make your code work both ways and check floating point calculation results with IsNan and IsInfinite, e.g.

try
  F := <some floating point calculation>;
except
  on E: EInvalidOp do
    F := NaN;
  on E: EZeroDivide do
    F := Infinity;
end;
if IsNan(F) then
  <handle InvalidOp>
if IsInfinite(F) then
  <handle zero divide>

This is the worst advice I've seen in quite some time!! 

8 hours ago, David Heffernan said:

This is the worst advice I've seen in quite some time!! 

Thank you for letting me tick off an item from my bucket list. I'm sure googling for similar quotes from you would bring up quite a few hits, so I'll take it the timeframe isn't actually that large.

 

Forget the (obviously not real) code, I'll stand by my advice that it's best to aim for code which works with and without FP exceptions - how to achieve that will vary depending on your requirements.
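
To make that concrete, here is one possible sketch, not a recommendation for every code base: the same routine is run once with all FP exceptions masked and once with the usual three unmasked, and it reports the same outcome either way. `Classify` and `RunAll` are hypothetical names; `SetExceptionMask`, `ClearExceptions`, `IsNan`, `IsInfinite`, `NaN` and `Infinity` are assumed to come from `System.Math`.

```pascal
program BothWaysDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Math;

// Hypothetical helper: divides and describes the outcome, whether the error
// arrives as an exception (unmasked) or as a special value (masked).
function Classify(A, B: Double): string;
var
  R: Double;
begin
  try
    R := A / B;
  except
    on EInvalidOp do
      R := NaN;
    on EZeroDivide do
      R := Infinity;
    on EOverflow do
      R := Infinity;
  end;
  if IsNan(R) then
    Result := 'invalid operation'
  else if IsInfinite(R) then
    Result := 'zero divide or overflow'
  else
    Result := FloatToStr(R);
end;

procedure RunAll(const Caption: string);
begin
  Writeln(Caption);
  Writeln('  1/0 -> ', Classify(1, 0));
  Writeln('  0/0 -> ', Classify(0, 0));
  Writeln('  1/4 -> ', Classify(1, 4));
end;

begin
  // All FP exceptions masked: errors show up as NaN or Infinity.
  SetExceptionMask([exInvalidOp, exDenormalized, exZeroDivide,
    exOverflow, exUnderflow, exPrecision]);
  RunAll('FP exceptions masked:');

  // Drop any pending status flags before unmasking, then let invalid
  // operation, zero divide and overflow raise exceptions.
  ClearExceptions(False);
  SetExceptionMask([exDenormalized, exUnderflow, exPrecision]);
  RunAll('FP exceptions unmasked:');
end.
```

The obvious cost is that every calculation site needs this treatment, which is why the requirements decide whether it is worth it.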

 

9 hours ago, David Heffernan said:

DLLs should take charge of this. They should either make it part of their contract that the host sets the fp control state to a specific state.

What if the host isn't even aware that your DLL is involved or that it has such a contract? Or other parties are involved which you have no control over (again shell extensions, printer drivers etc.) Unless you have full control over where your code is run, this is impossible to enforce.

 

9 hours ago, David Heffernan said:

Or they should set it on entry, and restore it on exit. My DLL does the latter. 

How can you make this work in a thread-safe manner?

 

Now to get back to some real code, I'd love to hear your opinion on the following fragment (sorry, insert code popup again didn't work):

 

function MyFloatingPointAPI: Double; cdecl;
begin
  try
    Result := ComplexCalculation;
  except
    on E: EInvalidOp do
      Result := NaN;
    on E: EZeroDivide do
      Result := Infinity;
    on E: Exception do
      ...log, handle, whatever
  end;
end;

If you properly document that NaN and Infinity are possible results of your API, this should work fine both with and without FP exceptions without introducing much of a performance hit, right?

42 minutes ago, msohn said:

How can you make this work in a thread-safe manner?

FPCR can be set in a thread-safe manner; it is just that the Delphi RTL doesn't do that on the Windows platform. David has written about it, see

and https://stackoverflow.com/a/27125657

 

I have also written about it in my book Delphi Thread Safety Patterns.

 

Working with masked FPU exceptions is the norm across various languages. Also, even before Delphi 12, all Delphi platforms except Windows VCL had FPU exceptions masked. So if you had written an FMX application, it would already have behaved the way everything now behaves across the board.

So the best option is to move to masked FPU exceptions, as this will create the least amount of trouble in the future. Of course, if you are dealing with Delphi-built DLLs which don't honor that, you will have to adapt. Another option is to unmask exceptions and proceed as usual, but doing that in a multithreaded application is not advisable unless you also patch the RTL, as it is not thread-safe. Even having masked FPU exceptions is not thread-safe, but it is much harder to trigger the issue, and much easier to avoid the RTL APIs that use the unsafe parts, than with unmasked exceptions.

10 hours ago, David Heffernan said:

DLLs should take charge of this. They should either make it part of their contract that the host sets the fp control state to a specific state. Or they should set it on entry, and restore it on exit. My DLL does the latter. 

 

Since threads run concurrently on multi-core systems, isn't the fp control state global to all threads? Or does every cpu core have its own fpu core?

 

54 minutes ago, A.M. Hoornweg said:

Since threads run concurrently on multi-core systems, isn't the fp control state global to all threads? Or does every cpu core have its own fpu core?

FPCR is part of the FPU, and its state is preserved across context switches. So each thread works with its own state, independent of the others. If you only handle floating point state directly through FPCR, it will be thread-safe. The problem with Delphi is that it throws a global variable into the equation and then handles FPCR with the help of that global in a thread-unsafe manner, which can leak the "wrong" state into a different thread.

2 hours ago, Dalija Prasnikar said:

FPCR is part of the FPU, and its state is preserved across context switches. So each thread works with its own state, independent of the others. If you only handle floating point state directly through FPCR, it will be thread-safe. The problem with Delphi is that it throws a global variable into the equation and then handles FPCR with the help of that global in a thread-unsafe manner, which can leak the "wrong" state into a different thread.

That is not what I mean, since a "context switch" is done when the scheduler switches from one thread to another (so one is halted, its state is saved and the next thread resumes after its state was restored). 

My question is about threads literally running simultaneously on different CPU cores: does each core always have an independent FPU + FPCR, so that one thread cannot jeopardize another?

12 minutes ago, A.M. Hoornweg said:

My question is about threads literally running simultaneously on different CPU cores: does each core always have an independent FPU + FPCR, so that one thread cannot jeopardize another?

If FPCR were not per core, then one process would change the FPCR of all processes! This can't be.

FPCR must be per core, like all CPU registers.

3 minutes ago, Cristian Peța said:

If FPCR were not per core, then one process would change the FPCR of all processes! This can't be.

FPCR must be per core, like all CPU registers.

I have no way of knowing.

 

In the distant past the FPU (80x87) used to be an expensive separate chip. I simply don't know if the fpu is now part of each cpu core.

25 minutes ago, A.M. Hoornweg said:

My question is about threads literally running simultaneously on different CPU cores: does each core always have an independent FPU + FPCR, so that one thread cannot jeopardize another?

Each core has its own FPU. It is possible that some older multicore processors shared a single FPU, but even in such a case each thread would have access to its own "FPU data copy". If those values were shared among threads, then different threads could trample upon the results of other threads' calculations.

7 hours ago, msohn said:

Now to get back to some real code, I'd love to hear your opinion on the following fragment (sorry, insert code popup again didn't work):

 

function MyFloatingPointAPI: Double; cdecl;
begin
  try
    Result := ComplexCalculation;
  except
    on E: EInvalidOp do
      Result := NaN;
    on E: EZeroDivide do
      Result := Infinity;
    on E: Exception do
      ...log, handle, whatever
  end;
end;

If you properly document that NaN and Infinity are possible results of your API, this should work fine both with and without FP exceptions without introducing much of a performance hit, right?

One very obvious problem is that you need to write code like this everywhere, not just at the top level. I stand by my earlier advice. Either:

1. Make FPCR part of the contract, or

2. Take control on entry, and restore on exit.

1 hour ago, Dalija Prasnikar said:

Each core has its own FPU. It is possible that some older multicore processors shared a single FPU, but even in such a case each thread would have access to its own "FPU data copy".

OK, that massively simplifies matters. My fear was that the FPCR would be a shared resource among threads.

11 minutes ago, David Heffernan said:

One very obvious problem is that you need to write code like this everywhere, not just at the top level. I stand by my earlier advice. Either:

1. Make FPCR part of the contract, or

2. Take control on entry, and restore on exit.

"Take control on entry and restore on exit" would be very cumbersome in the case of DLLs written in Delphi. It would need to be done in every exposed function / method.

(edit) or at least in every method that has to do with FP calculations.

Edited by A.M. Hoornweg

11 minutes ago, A.M. Hoornweg said:

"Take control on entry and restore on exit" would be very cumbersome in the case of DLLs written in Delphi. It would need to be done in every exposed function / method.

procedure Foo; stdcall;
begin
  SetFPCR;
  ...
  RestoreFPCR;
end;

Do you think it is so cumbersome to do this for every exposed function?

You need to write SetFPCR and RestoreFPCR yourself, but only once.

18 minutes ago, Cristian Peța said:

procedure Foo; stdcall;
begin
  SetFPCR;
  ...
  RestoreFPCR;
end;

Do you think it is so cumbersome to do this for every exposed function?

You need to write SetFPCR and RestoreFPCR yourself, but only once.

And for convenience, use a global variable to store the state ... gd&r

5 minutes ago, dummzeuch said:

And for convenience, use a global variable to store the state ... gd&r 

If it's not multithreaded, then it can be.

1 hour ago, Cristian Peța said:

If it's not multithreaded, then it can be.

We are talking about a DLL here, so your code might not be multithreaded, but the caller's still might be.


2 hours ago, Cristian Peța said:

procedure Foo; stdcall;
begin
  SetFPCR;
  ...
  RestoreFPCR;
end;

Do you think it is so cumbersome to do this for every exposed function?

You need to write SetFPCR and RestoreFPCR yourself, but only once.

It would be a can of worms for me, I'm afraid. I am thinking of all my Delphi COM DLLs that expose interfaces and class factories (see image). Each interface is basically an object that can have dozens of methods, so we're not talking about just a few functions; it's more like hundreds of exposed methods. And many of these methods call each other, which complicates matters further because SetFPCR/RestoreFPCR would have to support nesting. And multi-threading would make matters even more complicated.

ridl.png

27 minutes ago, A.M. Hoornweg said:

And many of these methods call each other, which complicates matters further because SetFPCR/RestoreFPCR would have to support nesting.

That's easy to fix. You just make sure that they call internal methods only.

 

27 minutes ago, A.M. Hoornweg said:

And multi-threading would make matters even more complicated.

Not really. You can store the FPCR in a local variable in the exported method.
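
A rough sketch of how both points might fit together, offered as an illustration and not as the code of any real DLL: the exported wrapper keeps the previous mask in a local variable and only calls an internal routine, so nesting never arises. `FpGuardDemo`, `InternalRatio` and `Ratio` are hypothetical names. For brevity it uses `SetExceptionMask` and `ClearExceptions` from `System.Math`; as discussed earlier in the thread, the RTL routines also go through a global, so a strictly thread-safe version would read and write FPCR directly.

```pascal
library FpGuardDemo;

uses
  System.SysUtils, System.Math;

// Internal implementation: never touches the FP control state, so internal
// callers can call it (and each other) freely without nesting concerns.
function InternalRatio(A, B: Double): Double;
begin
  Result := A / B;
end;

// Exported wrapper: sets the state this DLL expects on entry and restores
// the caller's state on exit.
function Ratio(A, B: Double; out Value: Double): LongBool; stdcall;
var
  SavedMask: TArithmeticExceptionMask;  // local, so every call has its own copy
begin
  Value := 0;
  SavedMask := SetExceptionMask([exDenormalized, exUnderflow, exPrecision]);
  try
    try
      Value := InternalRatio(A, B);
      Result := True;
    except
      // Do not let any exception cross the DLL boundary.
      Result := False;
    end;
  finally
    ClearExceptions(False);       // discard pending FP status flags
    SetExceptionMask(SavedMask);  // hand the caller's mask back
  end;
end;

exports
  Ratio;

begin
end.
```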

 

 

The thing is, the change in Delphi 12 isn't actually causing any new issues. These issues always existed. It's just a consequence of the "design" of DLLs not making FPCR part of the ABI.

Edited by David Heffernan