dummzeuch

Guessing the decimal separator


13 minutes ago, Dany Marmur said:

* Not me.

Me neither!

41 minutes ago, Sherlock said:
55 minutes ago, Dany Marmur said:

* Not me.

Me neither!

Just pay me enough(*1) and I will do that for you - for one hour per week.

 

*1: Enough means that that one hour per week must be enough to live on.


Well, "plain text files" are a terrible data interchange format anyway, because they don't say anything about the syntax or even the character set. Guessing the character set is hard enough!

 

 


@dummzeuch perhaps it would be more prudent to process the entire file and write a function that considers, say, between 100 and 10000 numbers. What I'm suggesting is this: since you started off by mentioning the files you received, they would contain a lot of numbers, no? Hopefully all those numbers (at least those in the same column) use the same format. Otherwise we are talking about a Word document where a human entered a handful of numbers without much thought. If you have a set of "quasi-floats", say over a hundred of them, you would be able to write a function that takes ranges into account. And even more.
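
A minimal sketch of that idea, in Python for brevity rather than Delphi: every number string in a column votes on the decimal separator, and a result is only accepted when the votes agree. The names `vote` and `guess_decimal_separator` and the voting rules are assumptions made for illustration, not code from this thread, and it assumes that only `.` and `,` occur as separators.

```python
from collections import Counter
from typing import Iterable, Optional


def vote(token: str) -> Optional[str]:
    """Return the decimal separator this token implies, or None if ambiguous."""
    token = token.strip().lstrip("+-")
    dots, commas = token.count("."), token.count(",")
    if dots and commas:
        # Both present: whichever appears last is taken as the decimal separator.
        return "." if token.rfind(".") > token.rfind(",") else ","
    if dots > 1:
        return ","  # a repeated dot can only be a grouping separator
    if commas > 1:
        return "."
    if dots + commas == 1:
        sep = "." if dots else ","
        head, tail = token.split(sep)
        # A grouping separator needs 1-3 leading digits (no leading zero)
        # followed by exactly 3 digits; anything else must be a decimal.
        looks_grouped = (
            len(tail) == 3
            and tail.isdigit()
            and 1 <= len(head) <= 3
            and not head.startswith("0")
        )
        if not looks_grouped:
            return sep
    return None  # no separator at all, or something like "1.000"


def guess_decimal_separator(tokens: Iterable[str]) -> Optional[str]:
    """Return '.' or ',' if all informative tokens agree, otherwise None."""
    votes = Counter(v for v in map(vote, tokens) if v)
    if not votes or (votes["."] and votes[","]):
        return None  # no evidence, or conflicting evidence
    return votes.most_common(1)[0][0]
```

With this sketch a single unambiguous value such as `12,5` settles a column that otherwise holds only values like `1.000`, while a column of nothing but `1.000` and `100,001` still yields `None`.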

On 2/4/2019 at 8:56 PM, Attila Kovacs said:

Why would you search for a decimal separator in integers?

What formula do you think would be feasible to tell what 1.000 is?

Check the first X rows, if still not obvious check another X, if still not and EOF, panic.

This is parsing strings. You can't tell whether a string contains an integer until you have parsed it. And integers could contain a single thousands separator. Say we have `100,001`. Is that the floating point value 1.00001e+02, the integer value 100001, or the floating point value 1.00001e+05? Hard to tell, IMO, if you don't know whether the comma is a decimal or a thousands separator.
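
To make the ambiguity concrete, here is a small Python sketch that parses the same string under each assumption. The helper `parse_number` is hypothetical and assumes that the separator which is not the decimal one is used for grouping.

```python
def parse_number(text: str, decimal_sep: str) -> float:
    """Parse text, treating decimal_sep as the decimal separator.

    Assumption: the other of '.' and ',' is a grouping separator
    and can simply be dropped.
    """
    group_sep = "," if decimal_sep == "." else "."
    return float(text.replace(group_sep, "").replace(decimal_sep, "."))


# The same input yields two values that differ by a factor of 1000.
as_decimal_comma = parse_number("100,001", decimal_sep=",")
as_grouping_comma = parse_number("100,001", decimal_sep=".")
print(as_decimal_comma, as_grouping_comma)
```

Both calls succeed, which is exactly the problem: nothing in the string itself says which reading was intended.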


@Rudy Velthuis I could not tell whether you are agreeing or arguing with me. If he is running GuessDecimalseparator() on strings and the file contains only "0"s, it will still be treated as float 0, won't it? Otherwise he would run GuessIntegerOrFloatOrNoneOfThem() first and then WhatNow().

On 2/8/2019 at 11:33 PM, Attila Kovacs said:

@Rudy Velthuis I could not tell whether you are agreeing or arguing with me. If he is running GuessDecimalseparator() on strings and the file contains only "0"s, it will still be treated as float 0, won't it? Otherwise he would run GuessIntegerOrFloatOrNoneOfThem() first and then WhatNow().

ISTM that it simply doesn't make a lot of sense to guess the decimal separator. In some scenarios it might work, but not in others. We humans can sometimes extract such information from the context, but you'd need better AI for that.
