bernhard_LA

StrToFloat () all combinations of decimal separator and lang. settings


I need a good and flexible solution to convert strings to floats.

The string can be 1,99 or 1.99, and the OS can be German or English or ...

My non-working code goes like this :-(

internalFormatSettings := TFormatSettings.Invariant();
Result := StrToFloat(FloatStr, internalFormatSettings);

function MyStrToFloat(const S: string): Extended;
const
  Komma: TFormatSettings = (DecimalSeparator: ',');
  Dot: TFormatSettings = (DecimalSeparator: '.');
begin
  if not TryStrToFloat(S, Result, Komma) then
    Result := StrToFloat(S, Dot);
end;

 


I prefer to replace dot or comma with the locale's decimal sep and then use StrToFloat. 

 

Although in reality I don't use the Emba conversion functions because they are defective, i.e. not accurate. 
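
For readers who want to see the "replace with the locale's separator" idea spelled out, here is a minimal sketch. It is not the poster's actual code: NormalizedStrToFloat is a hypothetical name, and the sketch copies the string (a heap allocation) rather than working in place. Input that still contains a grouping separator, such as 1.234,5, is expected to raise EConvertError.

```pascal
program NormalizeDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper (not from the thread): maps every '.' and ',' to the
// decimal separator of the given settings, then converts with those settings.
function NormalizedStrToFloat(const S: string;
  const Settings: TFormatSettings): Extended;
var
  Tmp: string;
  I: Integer;
begin
  Tmp := S; // work on a copy; the caller's string is left untouched
  for I := Low(Tmp) to High(Tmp) do
    if (Tmp[I] = '.') or (Tmp[I] = ',') then
      Tmp[I] := Settings.DecimalSeparator;
  Result := StrToFloat(Tmp, Settings);
end;

var
  UserSettings: TFormatSettings;
begin
  // Settings of the current user's locale, whatever that happens to be
  UserSettings := TFormatSettings.Create;
  Writeln(FloatToStr(NormalizedStrToFloat('1,99', UserSettings), UserSettings));
  Writeln(FloatToStr(NormalizedStrToFloat('1.99', UserSettings), UserSettings));
end.
```

Both calls should yield the same value regardless of the machine's regional settings, because the separator is rewritten before the conversion.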

Edited by David Heffernan
27 minutes ago, David Heffernan said:

I prefer to replace dot or comma with the locale's decimal sep and then use StrToFloat. 

Unless you do that in place with PChar pointer magic that's a heap allocation! 😉

1 hour ago, Stefan Glienke said:

Unless you do that in place with PChar pointer magic that's a heap allocation! 😉

Yeah I don't do a heap allocation. I use a local fixed length array. 

2 hours ago, David Heffernan said:

Yeah I don't do a heap allocation. I use a local fixed length array. 

That's probably not true. I think dtoa accepts both, or if not then I modified it. Since I'm in control of the conversion code, that's the right way.


When thousand separators come into play, it gets almost impossible (at least over here).

1,006.66 or 1 006,66 or 1 006.66 etc...

And sometimes there are no decimals. How do you tell a US value without decimals, 1,006, from a Finnish one with decimals, 6,66? As far as I know, there can't be one universal routine to rule them all.

In the US, I think the thousand separator is the comma; here it is the decimal separator. This is one of the places where it would have been nice to have a single global standard 🙂

 

-Tee-

Edited by Tommi Prami
Quote

In the US, I think the thousand separator is the comma; here it is the decimal separator. This is one of the places where it would have been nice to have a single global standard 🙂

 

 

The numeric keypad shows . but writes , 😠

Edited by Pat Foley
fixed link
3 hours ago, Tommi Prami said:

When thousand separators come into play, it gets almost impossible (at least over here).

1,006.66 or 1 006,66 or 1 006.66 etc...

And sometimes there are no decimals. How do you tell a US value without decimals, 1,006, from a Finnish one with decimals, 6,66? As far as I know, there can't be one universal routine to rule them all.

In the US, I think the thousand separator is the comma; here it is the decimal separator. This is one of the places where it would have been nice to have a single global standard 🙂

 

-Tee-

So far as I am aware, routines for conversion from string to float do not accept strings with thousand seps so this is just not an issue. 

8 minutes ago, David Heffernan said:

So far as I am aware, routines for conversion from string to float do not accept strings with thousand seps so this is just not an issue. 

Has to be handled anyways.

 

-Tee-

19 minutes ago, Tommi Prami said:

Has to be handled anyways.

 

-Tee-

Try passing such a string to StrToFloat and see how it works out for you. 


You could also prepare the input before processing: remove everything matching [^.,\d], replace , with . and convert using invariant format settings.
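
A minimal sketch of that clean-up, written as a plain loop instead of a regular expression. CleanStrToFloat is a hypothetical name. As an assumption beyond the suggestion above, the sketch also keeps '+' and '-' so negative values survive; exponent notation such as 1e5 is not supported, and a string with two separators (1.234,5) ends up invalid and is rejected.

```pascal
program CleanDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: keeps digits and sign characters, turns both ',' and
// '.' into '.', drops everything else, then converts with invariant settings.
function CleanStrToFloat(const S: string; out Value: Double): Boolean;
var
  Clean: string;
  C: Char;
begin
  Clean := '';
  for C in S do
    if CharInSet(C, ['0'..'9', '+', '-']) then
      Clean := Clean + C
    else if CharInSet(C, ['.', ',']) then
      Clean := Clean + '.';
  // TryStrToFloat returns False instead of raising on malformed input
  Result := TryStrToFloat(Clean, Value, TFormatSettings.Invariant);
end;

var
  V: Double;
begin
  if CleanStrToFloat('EUR 1,99', V) then
    Writeln(FloatToStr(V, TFormatSettings.Invariant));
  if not CleanStrToFloat('1.234,5', V) then
    Writeln('rejected: more than one separator left after clean-up');
end.
```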

On 8/9/2021 at 11:14 AM, David Heffernan said:

Try passing such a string to StrToFloat and see how it works out for you. 

That seems to be correct: thousands separators are not accepted, so you have to handle them yourself first.

38 minutes ago, Tommi Prami said:

That seems to be correct: thousands separators are not accepted, so you have to handle them yourself first.

Well, not really. Certainly in my experience when dealing with floating point data, you can just mandate that it doesn't have thousand separators.

 

Perhaps if you are dealing with currency then you'd need to handle thousand separators but for other data it's not necessary. 


These things are perpetual headaches, as are the time and date separators.

The default Norwegian Windows language setting is using period for both, which confuses the hell out of the Delphi string to datetime decoders.


I'd mark as deprecated all functions that use the global FormatSettings, and the variable itself. Otherwise it's quite hard to track down and kill all hidden usages.

2 hours ago, Lars Fosdal said:

The default Norwegian Windows language setting is using period for both, which confuses the hell out of the Delphi string to datetime decoders.

Not surprising it would. Can't help wondering what the reasoning may have been for such a default.

2 hours ago, Lars Fosdal said:

The default Norwegian Windows language setting is using period for both, which confuses the hell out of the Delphi string to datetime decoders.

Very nice idea for testing BTW - change system settings to something wild and see if software stands it.


The common trait is that both floats and dates have separator character challenges.

 

For floats, the only reliable solution is to KNOW the input format and do the necessary stripping/replacement before passing the string to the converter.

 

In some of my older input parsers, I stripped spaces, then checked for the presence of , and . and did the following processing

- if only one exists, don't touch it

- if more than one of a kind exists, remove them all

- if both exist, remove all but the last one

Which still is hopeless if the 1,000 is 1000 and not 1 with three decimals.
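
The three rules above could be sketched roughly as follows. This is a reconstruction from the description, not the original parser; HeuristicStrToFloat and CountChar are hypothetical names, and the final mapping of a remaining comma to '.' is an assumption made so the result can be converted with invariant settings. The 1,000 ambiguity remains: the sketch reads it as 1.

```pascal
program HeuristicDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Counts how often C occurs in S.
function CountChar(const S: string; C: Char): Integer;
var
  Ch: Char;
begin
  Result := 0;
  for Ch in S do
    if Ch = C then
      Inc(Result);
end;

// Hypothetical reconstruction of the rules described in the post.
function HeuristicStrToFloat(const S: string; out Value: Double): Boolean;
var
  T: string;
  PComma, PDot: Integer;
begin
  T := StringReplace(S, ' ', '', [rfReplaceAll]); // strip spaces
  // More than one of a kind: they must be grouping separators, remove them all
  if CountChar(T, ',') > 1 then
    T := StringReplace(T, ',', '', [rfReplaceAll]);
  if CountChar(T, '.') > 1 then
    T := StringReplace(T, '.', '', [rfReplaceAll]);
  // Both kinds present: keep only the last one
  PComma := Pos(',', T);
  PDot := Pos('.', T);
  if (PComma > 0) and (PDot > 0) then
  begin
    if PComma < PDot then
      Delete(T, PComma, 1)
    else
      Delete(T, PDot, 1);
  end;
  // A single remaining separator is taken as the decimal separator
  T := StringReplace(T, ',', '.', [rfReplaceAll]);
  Result := TryStrToFloat(T, Value, TFormatSettings.Invariant);
end;

var
  V: Double;
begin
  if HeuristicStrToFloat('1 006,66', V) then
    Writeln(FloatToStr(V, TFormatSettings.Invariant));
  if HeuristicStrToFloat('1,000,000.5', V) then
    Writeln(FloatToStr(V, TFormatSettings.Invariant));
end.
```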

2 hours ago, Lars Fosdal said:

For floats, the only reliable solution is to KNOW the input format and do the necessary stripping/replacement before passing the string to the converter.

Correct.

 

And as a developer who does a lot of parsing of scientific / engineering data, I just don't recognise the issue of dealing with thousand separators. In the data that I process, they just don't appear. You don't bother trying to strip them, you just require the data not to have them.

 

2 hours ago, Lars Fosdal said:

In some of my older input parsers, I stripped spaces, then checked for the presence of , and . and did the following processing

- if only one exists, don't touch it

- if more than one of a kind exists, remove them all

- if both exist, remove all but the last one

Which still is hopeless if the 1,000 is 1000 and not 1 with three decimals.

Exactly. It's a mug's game trying to handle data whose formatting is ambiguous, so don't do it.


I was working with various kinds of financial data, weather data and power data (prices, volumes, etc), and thousand separators usage was variable. Spaces, commas, dots, the lot.

It was a hodge-podge of formats since very few standard exchange formats existed at the time.

 

Even vendors that you had contractual agreements with, would change the format on the fly, without notice. 
"Yeah, we changed the format. Nobody told you?"

function StrToInt00(sString: string; out f: Float32): Boolean;
var
  fs: TFormatSettings;
  Value: Double;
begin
  // Settings of the current user's locale (GetLocaleFormatSettings is deprecated)
  fs := TFormatSettings.Create;
  // Remove spaces
  sString := StringReplace(sString, ' ', '', [rfReplaceAll]);
  // Map whichever of ',' or '.' is present to the locale's decimal separator
  if Pos(',', sString) > 0 then
  begin
    if fs.DecimalSeparator = '.' then
      sString := StringReplace(sString, ',', '.', [rfReplaceAll]);
  end
  else if Pos('.', sString) > 0 then
  begin
    if fs.DecimalSeparator = ',' then
      sString := StringReplace(sString, '.', ',', [rfReplaceAll]);
  end;
  // TryStrToFloat reports failure without raising, so no try/except is needed.
  // Pass fs so the conversion uses the same settings as the replacement above.
  Result := TryStrToFloat(sString, Value, fs);
  if Result then
    f := Value
  else
    f := 0;
end;

 

Edited by skyzoframe[hun]
 
If you have problems with thousand separators, remove them from the string!

if ContainsText(sString, ',') and ContainsText(sString, '.') then
begin
  if fs.ThousandSeparator = ',' then
    sString := StringReplace(sString, ',', '', [rfReplaceAll])
  else if fs.ThousandSeparator = '.' then
    sString := StringReplace(sString, '.', '', [rfReplaceAll]);
end;

Hm... it is stupid: if the string is "1,1546.98" and ThousandSeparator = '.', the result is "1,154698".

Edited by skyzoframe[hun]

On 8/17/2021 at 9:48 AM, Lars Fosdal said:

Which still is hopeless if the 1,000 is 1000 and not 1 with three decimals.

I agree with that.

Maybe the best way to catch the remaining formats is to scan the string in reverse, from end to start.

1. Take the last separator in the string (the first one found when scanning in reverse) as the preliminary decimal separator.

2. If there are more separators in the string, they can safely be ignored; they can only be thousands separators.

3. Only if the separator from 1. is the only separator in the string might you run into an unclear situation, as you explained above.

4. But you could still count the number of digits after the last separator.

    If there are not exactly 3, you can be fairly sure that it is NOT a thousands separator but a decimal separator.

5. Even with 6 or more digits it should be a decimal separator, as I would rarely expect a millions separator without a thousands separator.

    (Still, somebody could do nasty things like that.)

6. Only if there are exactly 3 digits do you really need to know the source format.
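
A rough sketch of the reverse-scan steps, for illustration only. ReverseScanStrToFloat and TParseResult are hypothetical names. Two assumptions go beyond the list: spaces are stripped first, and if the last separator's character also occurs earlier (as in 1,000,000) all of them are treated as thousands separators, since a decimal separator can appear only once. The code indexes strings from 1, the default on current compilers.

```pascal
program ReverseScanDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

type
  TParseResult = (prOk, prAmbiguous, prInvalid);

// Hypothetical sketch: decides what the last separator means, then converts.
function ReverseScanStrToFloat(const S: string; out Value: Double): TParseResult;
var
  T, IntPart, FracPart: string;
  I, LastSep, SepCount, SameCount: Integer;
begin
  Value := 0;
  T := StringReplace(S, ' ', '', [rfReplaceAll]);
  // Step 1: find the last separator by scanning from the end
  LastSep := 0;
  for I := Length(T) downto 1 do
    if CharInSet(T[I], ['.', ',']) then
    begin
      LastSep := I;
      Break;
    end;
  // Count all separators, and those equal to the last one
  SepCount := 0;
  SameCount := 0;
  if LastSep > 0 then
    for I := 1 to Length(T) do
      if CharInSet(T[I], ['.', ',']) then
      begin
        Inc(SepCount);
        if T[I] = T[LastSep] then
          Inc(SameCount);
      end;
  IntPart := T;
  FracPart := '';
  if SameCount = 1 then
  begin
    // The last separator is unique, so it may be the decimal separator
    IntPart := Copy(T, 1, LastSep - 1);
    FracPart := Copy(T, LastSep + 1, MaxInt);
    // Steps 3 to 6: a lone separator followed by exactly 3 digits is unclear
    if (SepCount = 1) and (Length(FracPart) = 3) then
      Exit(prAmbiguous);
  end;
  // Step 2: whatever separators remain in the integer part are grouping
  IntPart := StringReplace(IntPart, ',', '', [rfReplaceAll]);
  IntPart := StringReplace(IntPart, '.', '', [rfReplaceAll]);
  if FracPart <> '' then
    T := IntPart + '.' + FracPart
  else
    T := IntPart;
  if TryStrToFloat(T, Value, TFormatSettings.Invariant) then
    Result := prOk
  else
    Result := prInvalid;
end;

var
  V: Double;
begin
  if ReverseScanStrToFloat('1.006,66', V) = prOk then
    Writeln(FloatToStr(V, TFormatSettings.Invariant));
  if ReverseScanStrToFloat('1,006', V) = prAmbiguous then
    Writeln('1,006 is ambiguous: the source format must be known');
end.
```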

 

 

 

Edited by Rollo62