David Schwartz 430 Posted July 23, 2020 Where I work right now, we build apps to import all manner of data supplied to us by clients. A lot of the data is numeric, and the code uses functions like StrToInt or StrToCurr or StrToFloat or StrToDate or StrToTime to get the binary values to manipulate ... you get the idea. Here's the rub: there's no Try ... finally fence anywhere in the code that handles these conversions. So if there's a single empty or non-numeric field passed to one of these conversion routines, an exception gets thrown and the import process aborts right then and there. Here's my question .................... I suggested that all of these functions should be replaced with ones that don't throw exceptions, like StrToXxxDef(...), and that at least there should be Try...finally fencing around each method anyway just in case something unexpectedly throws an exception. Then I offhandedly made a comment like, "these are all programmer errors, and they should be fixed". My work colleague took strenuous exception to that statement. (No pun intended) His rebut was that clients give us the specs for their data, and they don't always (usually?) tell us what to do if any of the fields are empty or invalid. My position is that it's the programmer's job to ASK! He disagreed. I spent the first 5-1/2 years of my career working on real-time embedded systems and machine control software. Un-trapped errors (today's unhandled exceptions) can cause damage. Sometimes they cost lives. It's irresponsible to shrug your shoulders and say, "Well, the client knows best, and we shouldn't question them!" That's my thinking, anyway. What do YOU think? Share this post Link to post
Lars Fosdal 1793 Posted July 23, 2020 So, some input is invalid. What would you put instead of that corrupt value? Are the data line by line? Is there a sequence among those lines that gives interdependency (f.x. main/detail, header/line, etc)? Can you skip a line without corrupting the import? I agree that the specification should contain instructions on handling empty or invalid fields. If it doesn't - letting it crash and burn and returning that fact to the client is appropriate. 2 1 Share this post Link to post
Rollo62 539 Posted July 23, 2020 (edited) I would think that especially with external data, you can never be 100% sure that all works fine. You made a contract to the data known at that certain time, but you cannot predict the future. Maybe your customer changes the contract without noticing you. I think also that the programmer should handle that as careful as possible, but in reality you will always see unexpected data and behaviour. UnitTests might help a lot, also to identify known "wrong data", but who can predict the unkown ? In some cases the system can handle the unexpected gracefully, in some systems you will find hard crashs. To throw an exception in these cases at least may help to prevent further damage. To cover everything in try - finally, and to ignore exceptions is maybe not the best choice to ensure a good data quality. That way nobody gets aware of such (hopefully rare) cases. Edited July 23, 2020 by Rollo62 1 Share this post Link to post
Stefan Glienke 2019 Posted July 23, 2020 Giving an error is *always* better than assuming an "instead of" value for invalid data. You can potentially ruin the entire system by doing that. The strategy of giving the error is up to the developer - the "it simply crashes and refuses to do anything with the entire data" is fine. It depends on the data itself if let's say all valid entries can be imported and only the invalid ones can be left out and possibly reported back so the customer can fix them and then import those. 3 1 Share this post Link to post
David Heffernan 2353 Posted July 23, 2020 As a programmer, you are aware of the issues around invalid data, missing data, etc. The client who provides the specification is often less aware of these issues, and their consequences. In terms of ethics, I would say that as a programmer it is your responsibility to raise the issue with the client. That's part of being a good professional. How the client responds to that is down to them. If they insist on ignoring invalid data, then that is their choice. But you should strongly advise them against that, and if they do insist, then document that this was their choice. In your case it sounds like the ultimate client is not being advised of this, and the culture in your organisation is to do what the client asks without questioning it. That's a slightly different problem. If the decision takers in your organisation are not receptive to best professional practise then there's only so much you can do. 1 1 Share this post Link to post
Vincent Parrett 763 Posted July 23, 2020 I would approach this from the point of view that all data provided by the client is bad until proven otherwise - seriously, expect bad data, when it' encountered report that and exit/stop gracefully (with an error message that enables the client to fix the data). 2 1 Share this post Link to post
Fr0sT.Brutal 900 Posted July 23, 2020 Another +1 for throwing an error. You can't tell whether it's OK to replace some value from "foo" to "0" and will take responsibility for this replacement f.ex., if someone's bank balance would become zero when a corrupted input is given to you app 🙂 In the area you described specs are your Bible, your weapon and defense. Don't do anything more or less 1 1 Share this post Link to post
Bill Meyer 337 Posted July 23, 2020 2 hours ago, Vincent Parrett said: I would approach this from the point of view that all data provided by the client is bad until proven otherwise - seriously, expect bad data, when it' encountered report that and exit/stop gracefully (with an error message that enables the client to fix the data). In contractor role, I would, as well. But if my manager determines we're not to do that, then I'd be stuck. Share this post Link to post
Alexander Elagin 143 Posted July 23, 2020 8 minutes ago, Bill Meyer said: But if my manager determines we're not to do that, then I'd be stuck. If the manager assumes the responsibility, then just have a written document signed by him - be it a specification or a simple memo. If things go wrong, you'll have you covered. Share this post Link to post
David Schwartz 430 Posted July 23, 2020 8 hours ago, Lars Fosdal said: I agree that the specification should contain instructions on handling empty or invalid fields. If it doesn't - letting it crash and burn and returning that fact to the client is appropriate. They have no way to reproduce the error. Simply telling them "it failed" isn't very helpful to anybody. So we end up spending quite a bit of time tracking down the issue for them. The problem is they rarely take corrective action b/c these data files are generated by an export routine on some app they're running. Share this post Link to post
David Schwartz 430 Posted July 23, 2020 8 hours ago, Stefan Glienke said: It depends on the data itself if let's say all valid entries can be imported and only the invalid ones can be left out and possibly reported back so the customer can fix them and then import those. We don't currently do this, but it seems like it would be a nice alternative. Share this post Link to post
Mike Torrettinni 198 Posted July 23, 2020 Perhaps this is opportunity to sell them better app that produces constantly accurate exports. Or a little utility that per-processes data and reports on issues, before they send data to you. Share this post Link to post
David Schwartz 430 Posted July 23, 2020 (edited) 18 minutes ago, Mike Torrettinni said: Perhaps this is opportunity to sell them better app that produces constantly accurate exports. Or a little utility that per-processes data and reports on issues, before they send data to you. The first ain't gonna happen -- that's not the business we're in. The second is one where we're not clear why it's not already available -- the only thing we've heard is "This IS the pre-processor" that's failing. Except it's also the front door to the entire process. Edited July 23, 2020 by David Schwartz Share this post Link to post
Attila Kovacs 631 Posted July 23, 2020 if you don't have "Plan B", you should not try to catch exceptions 1 Share this post Link to post
Guest Posted July 23, 2020 2 hours ago, David Schwartz said: They have no way to reproduce the error. Simply telling them "it failed" isn't very helpful to anybody. So we end up spending quite a bit of time tracking down the issue for them. The problem is they rarely take corrective action b/c these data files are generated by an export routine on some app they're running. That sucks and that is morally/ethically WRONG. BUT if the client has been told and "do not care" and still pays for the "lost" hours... bad client, get a better one! At least if you are in this business long-term. Alternative is milking this client for $ w/o actually adding any intellectual/experienced value... if you can do that and sleep at nights, well... you could buy some shares in a lot of companies. Start with CyberCom and continue. MORALLY a lot of coders should not do what they are doing. At all. Share this post Link to post
Rollo62 539 Posted July 24, 2020 10 hours ago, Attila Kovacs said: if you don't have "Plan B", you should not try to catch exceptions Then you better catch exceptions and log them in a "invalid data" file or DB. To allow to find them manual rework them later, that would prevent that you are loosing any important information. Share this post Link to post
aehimself 399 Posted July 24, 2020 I'd simply re-throw exceptions in an understandable way; you now can even use RaiseOuterException to include the data from the first one. When I am working in a service application which should operate 24/7 without interruptions, I'm placing a Try ... Except in the worker thread only. It picks an item to process from the queue, if it fails, it logs why and places it back to the end of the queue. Once an item failed to process 3 times it is discarded. Specifications (even if given by the client) are only specifications. Our client keeps sending invalid XMLs for us to process, even though they created the validating XSD for that very document. So yes, expect bad data; no matter what. But depending on the needs - don't change values and/or swallow errors because the code looks... cleaner. 22 hours ago, David Schwartz said: I suggested that all of these functions should be replaced with ones that don't throw exceptions, like StrToXxxDef(...), and that at least there should be Try...finally fencing around each method anyway just in case something unexpectedly throws an exception. I'd suggest TryStrToInt and it's counterparts, or simply Val. That way you know if/where the expected data is malformed and can Rase Exception.Create('The length of your shoes must be a number, not "' + s + '"'); Share this post Link to post
Lars Fosdal 1793 Posted July 24, 2020 13 hours ago, David Schwartz said: They have no way to reproduce the error. Simply telling them "it failed" isn't very helpful to anybody. So we end up spending quite a bit of time tracking down the issue for them. The problem is they rarely take corrective action b/c these data files are generated by an export routine on some app they're running. Having proper logging of the context of the error is a must - but if they are routinely producing crap data, and rarely take corrective action - the problem definitively is upstream. A specification must exist that describes the appropriate action on erroneous data. Having worked with live data from external sources all my life, I can say with some weight that replacing erroneous data with "neutral/zero" data can have negative effects downstream. There is no joke in the old saying: Garbage in, garbage out. Without knowing the nature of the data, I would say that the only viable option with regards to simply halting, would be to omit import of data that fails validation and instead properly log what fails. The problem here is how much will need to be discarded. This is something that needs to be agreed on. Example for data that are hierarchical: Orders can have a order head, one or more delivery heads, with one or more order lines. If one order line fails, do you fail just that delivery and its order lines, or the entire order? If a delivery head fails, do you fail just that delivery and its order lines, or the entire order? If the order head fails, are you able to eliminate all the deliveries and order lines? Example of data that are sequential: If each line is a time series item, and one line in the series fails - do you fail just that line or the entire series? Share this post Link to post
Rollo62 539 Posted July 24, 2020 4 minutes ago, aehimself said: I'd suggest TryStrToInt and it's counterparts, or simply Val. That way you know if/where the expected data is malformed and can Rase Exception.Create('The length of your shoes must be a number, not "' + s + '"'); Right, I also use NaN a lot, to make sure that float numbers are not valid. Share this post Link to post
Lars Fosdal 1793 Posted July 24, 2020 Basically, this case needs the nullable types that we have been promised for some time. Share this post Link to post
David Heffernan 2353 Posted July 24, 2020 And to think that the question was meant to be about ethics. 1 Share this post Link to post
David Schwartz 430 Posted July 24, 2020 1 hour ago, David Heffernan said: And to think that the question was meant to be about ethics. I was looking for comments relative to my statement that "it's a programmer error" vs. my colleague who disagreed and believes the programmer did exactly what was asked -- and no more. This is a constant problem outsourcing to people from certain countries / cultures because they're taught to never ask questions on the assumption that the person who gave you the directions knew what they're talking about. So to fill gaps in specifications that cannot be ignored, they just make stuff up instead of asking about it. It wastes time and resources, and leaves a lot of people quite upset. "Why didn't you just ASK???" is a common refrain. 1 Share this post Link to post
Rollo62 539 Posted July 24, 2020 1 hour ago, David Heffernan said: And to think that the question was meant to be about ethics. The programmers ethics ... you can find here Quote The CodeEdit A Jedi such as Obi-Wan Kenobi, trained in the ways of the light side of the Force, could take comfort in the words of the mantra of the Jedi Code: There is no emotion, there is peace. There is no ignorance, there is knowledge. There is no passion, there is serenity. There is no chaos, there is harmony. There is no death, there is the Force.[5] There also existed an alternate version of the Code, recited by Jedi younglings during their Initiate Trials, and by Depa Billaba during her full fitness re-assessment after waking up from her 6-month coma: Emotion, yet peace. Ignorance, yet knowledge. Passion, yet serenity. Chaos, yet harmony. Death, yet the Force.[11] Share this post Link to post
Guest Posted July 24, 2020 Get back on track. This is about ethics, not nullable types. However, imho, gullable guys normally do not care about ethics. Share this post Link to post
Stefan Glienke 2019 Posted July 25, 2020 @Rollo62 Reminds me of Zen of Python 1 Share this post Link to post