
David Schwartz

Members
  • Content Count

    366
  • Joined

  • Last visited

  • Days Won

    5

David Schwartz last won the day on February 19

David Schwartz had the most liked content!

Community Reputation

118 Excellent

Technical Information

  • Delphi-Version
    Delphi 10.3 Rio


  1. David Schwartz

    Front-end vs Back-end question

    Nearly every question posted here can be answered with "it depends" and left at that.
  2. David Schwartz

    Something the community should be aware of

    I have mixed feelings about this stuff. One big one is that Borland / Inprise really dropped the ball back in the D6/D7 years when they thought it was a Good Idea to hitch their wagon to .NET and everything Microsoft. They made some "improvements" in the language that left a lot of customers in the dust holding a bag of rocks. Here we are today and they're complaining that these same users STILL don't think it's worthwhile to invest in moving past D6/D7. Sheesh. Developers cost 2x-3x more today than they did back then, and if it didn't make financial sense to upgrade back then, it surely makes even less sense today. Embt is not making any more friends by complaining about the resources these legacy clients are costing them.

    The problem isn't the compiler -- it's the 3rd-party components like Dream Components that died on the vine and couldn't easily move forward. If they want to fix the problem, Embt should consider buying the rights to these old component libs and investing their own resources in making them work on the latest versions of Delphi. Add them to GetIt and give people a legitimate upgrade path. Whoa! What a novel idea! Still, a lot of folks won't consider upgrading because it's harder than ever to find developers with solid Delphi skills today. (I think it's easier to find COBOL programmers today than Delphi folks!) Another option is to have a separate maintenance program for legacy products.

    I have not found a single job in the past decade doing NEW Delphi work -- it's all supporting LEGACY apps that were written in the D4-D7 years. Maybe they're using newer versions of the compiler, but it seems silly to me that the company is COMPLAINING about the fact that all of these old legacy clients are refusing to pay their ridiculous maintenance fees to stay exactly where they are. It's nice that Embt wants people to move forward, but until more jobs start showing up for NEW DELPHI PROJECTS, they're doing little more than Sisyphus pushing a rock up a hill while complaining about the effort involved. They (previous Mgt) created this problem, but they don't seem to want to fix it.

    The world is moving to Open Source Software. Delphi is one of the few remaining products that's not just NOT OSS, but VERY EXPENSIVE for commercial use. Microsoft subsidizes the crap out of their dev tools, as do others like IBM and Oracle. I think the best thing for Delphi would be for Embt to push to get Delphi acquired by a company that can afford to move it in the direction of OSS by subsidizing it from other product revenues. Instead, they keep raising the costs to customers who are mostly using it to MAINTAIN OLD CODE.

    I'm working on my 4th or 5th gig since 2009 that's maintaining code written prior to D2007, and it hasn't changed at all. The company has NO PLANS FOR FURTHER DELPHI DEVELOPMENT beyond maintaining their legacy code. They pay for maintenance updates, but so what? A couple of places I worked are extremely hesitant to allow any sort of large-scale refactoring -- they say if they wanted to invest in that amount of work, they'd just as soon switch to rebuilding the thing from scratch in C#/.NET or something else -- not Delphi.

    WHERE ARE THE NEW PROJECTS THAT ARE CREATING MORE DELPHI JOBS? This is a MARKETING PROBLEM for Embt. I don't think they have any right to complain when they have steadfastly maintained a posture that has gotten them exactly nowhere in the market. There's no evidence that their products are being used more for NEW product development than for supporting LEGACY projects.

    Where's the beef? Or rather, where's the NEW work? (And don't respond with, "well, we're doing new stuff!" If you are, say how many devs you've hired to help with the NEW stuff vs. to maintain the OLD code. Better yet, show me, say, 10 job postings made to any of the popular job boards that are legit posts to hire people for NEW DELPHI-based projects. Nobody hires new devs for new Delphi work -- it's a reward given to long-time employees. The new hires are almost always back-filling open spots maintaining the old code. We've lost 3 people in the past 6 months who worked with Delphi, and I'm the only new hire brought in to replace any of them. Now Mgt is running around like chickens with their heads cut off because they failed to plan for this. Two of these guys left to work on stuff that's "more fun": one on something non-Delphi, and one on another legacy project, but with some slow growth of new features. EVERYTHING I've seen in the past decade, or been contacted by recruiters about, has been MAINTAINING LEGACY CODE. I've found NO NEW Delphi work, especially within 500 miles of where I live.)
  3. David Schwartz

    Front-end vs Back-end question

    depends on what? I'm curious what folks think when it comes to web apps vs. desktop apps.
  4. What's your take on whether a FE and BE should be accessible from the same page / form, or kept completely separate? I've seen desktop apps where the Admin / setup stuff is a totally separate app, and apps where there's a Setup / Config / Options link in a menu. Wordpress is infamous for its "Meta" section with a "Login" link to get to the Admin dashboard. You can't separate them even if you wanted to. SaaS solutions often take you to an Admin area that's separate from where your Users will go, and that generates the User's view elsewhere (frequently a subdomain). I've never given much thought to this. But with things like TMS WebCore, IntraWeb, UniGui, and others for building web apps, now I'm curious.
  5. David Schwartz

    app logins

    @stijnsanders and @Kas Ob. -- I appreciate the depth of your replies, but in this case there's nothing really on the server that's personal other than when a user subscribed, for how long, if they're still active, and their renewal rate, along with their first name, pwd hash, and email address. I suppose I could encrypt that stuff, but it's not particularly sensitive. However, the front-end needs to read some of it -- in particular, the FE needs to know if a visitor is a currently active subscriber, and possibly if their subscription is close to expiring so they can renew it. Is any of this a problem with GDPR? I do like the idea of validating using a 3rd-party login like FB, Twitter, etc., as an option. I'll look into that. I imagine it puts the onus of user validation on these other systems, right?
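    Just to make the "not much personal data" point concrete for the GDPR angle: the back end can keep the email address and pwd hash entirely server-side and hand the front-end only what it needs to render. Here's a minimal sketch in plain Delphi -- the type, function, and field names are made up, and the lookup is stubbed; a real version would query the subscription table and sit behind whatever REST layer the back end uses.

    program SubscriptionStatusSketch;
    {$APPTYPE CONSOLE}
    uses
      System.SysUtils, System.DateUtils;

    type
      // Only the fields the front-end actually needs; email and pwd hash stay on the server.
      TSubscriptionStatus = record
        IsActive: Boolean;
        DaysUntilExpiry: Integer;  // lets the FE prompt for renewal when this gets small
        FirstName: string;         // greeting only
      end;

    // Hypothetical server-side lookup (stubbed here with a hard-coded expiry date).
    function GetSubscriptionStatus(const AUserId: string): TSubscriptionStatus;
    var
      ExpiresOn: TDateTime;
    begin
      ExpiresOn := IncDay(Now, 12);  // stand-in for a value read from the DB
      Result.IsActive := ExpiresOn > Now;
      Result.DaysUntilExpiry := DaysBetween(Now, ExpiresOn);
      Result.FirstName := 'Pat';
    end;

    var
      Status: TSubscriptionStatus;
    begin
      Status := GetSubscriptionStatus('user-123');
      if Status.IsActive and (Status.DaysUntilExpiry <= 14) then
        Writeln(Status.FirstName, ', your subscription expires in ',
          Status.DaysUntilExpiry, ' days.');
    end.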
  6. David Schwartz

    app logins

    This is what Wordpress does ... which I addressed. Seems in your hurry, you have nothing to add. Better off just moving on next time.
  7. David Schwartz

    app logins

    I'd like to hear people's thoughts about this topic. I'm working with TMS WebCore and their MyCloudData to prototype something. There's a kind of utopian idea that you can "build once, deploy anywhere", but there's a fly in the ointment that nobody seems to talk about.

    It seems to me that web apps come in two flavors: those that are open and accessible to all and don't tend to save data; and everything else that lets you do stuff and save data across some notion of "sessions". The former might deliver some kind of utility, like prettyprinting code or translating data from one format to another. The latter is what I'll generically refer to here as a "membership site". (Perhaps another term is more appropriate; this is just how I think of them.)

    Historically speaking, desktop apps had no form of "login" -- they relied on the fact that there was a login on the computer, and assumed anybody who could get on the computer was permitted to access the software on it. This assumption still lives today on desktops as well as mobile devices. Which means you cannot simply take a desktop app that saves user data and drop it on a website to turn it into a web-based app.

    A lot of existing apps DO, in fact, offer if not require a login, and there are a lot of reasons for that besides allowing you to keep your saved data separate from others'. One big reason is to access walled-off services that require a paid subscription to allow access, for example. (At the very least, a registration is required in any case.) The thing is, the front-end or web-app could use something like OAuth2 to verify your login. If it's simply to gain access to some stuff kept behind a paywall, that's fine. But what if it uses your login to partition your data from everybody else's?

    Back-end services typically have a login; in many cases, they're used by the developer or vendor to ensure nobody else can use the resource(s). For example, if my app uses SQL Server or MySQL, I have a login that all of my apps probably use to access my DBs. They may all share the same credentials. But they're MY credentials, as the developer. What about the users? How deep do you push the use of user credentials? The user could log in just to prove they have a current account, then everything else could be done with MY (developer's) credentials. If you need HIPAA or PCI compliance, tho, I'm not sure that would fly. I'm wondering about this b/c I work in an environment now where user credentials go all the way down to the bedrock for desktop stuff. I'm not sure about our web tools, except they do require logins that are integrated into our single-sign-on protocols. I can see that a lot of services my software might access do not need to be partitioned for use by each user with their own credentials. But, in some cases, they might.

    So let's say you have an app and it requires a login to access and maintain some personal (but not very sensitive) data; then it can drop a cookie (in the web-app case) that, say, lasts a month. (I see this on lots of my phone apps. There's a rough sketch of this idea at the end of this post.) The login controls access to some common data as well as a limited set of personal data. This isn't how desktop apps normally work -- Windows or MacOS or *nix logins run the show in most cases. I'm not sure about mobile apps.

    Web apps designed like Delphi apps are still rather new. (Any IntraWeb users wanna chime in here?) But you don't design php or Wordpress sites as if they're Delphi apps. (In Wordpress, everybody gets a login, but the underlying resources all rely on a common access login. Strangely, it's common for membership sites that run inside of Wordpress to have a completely separate way of managing users rather than using the login logic built into Wordpress. I think that's because the membership sites want more meta-data than WP can collect on its users.)

    What do you do when you can build web apps in Delphi that can look and feel more like normal Delphi desktop apps? (I'm not saying they MUST or even should, only that they can.) Have you given this any thought? If so, I'd love to hear your ideas.
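    To make the "log in once, then drop a month-long cookie" idea concrete, here's a minimal sketch in plain Delphi. This is not how TMS WebCore, Wordpress, or any particular framework does it; the token format, names, and secret are all made up for illustration. The idea: the server issues a signed token with an expiry, the web app keeps it in a cookie, and the back end only has to check the signature and the date on each request -- everything past that can run under the developer's own credentials.

    program SessionTokenSketch;
    {$APPTYPE CONSOLE}
    uses
      System.SysUtils, System.DateUtils, System.Hash;

    const
      // Server-side secret; in a real deployment this never ships to the client.
      SECRET_KEY = 'replace-with-a-real-secret';

    // Issue a token the web app can drop into a cookie: "user|expiry|signature".
    function IssueToken(const AUserId: string; ADays: Integer): string;
    var
      Payload: string;
    begin
      Payload := AUserId + '|' + IntToStr(DateTimeToUnix(IncDay(Now, ADays)));
      Result := Payload + '|' + THashSHA2.GetHMAC(Payload, SECRET_KEY);
    end;

    // Validate on every request: signature must match and expiry must be in the future.
    function TokenIsValid(const AToken: string): Boolean;
    var
      Parts: TArray<string>;
      Payload: string;
    begin
      Parts := AToken.Split(['|']);
      if Length(Parts) <> 3 then
        Exit(False);
      Payload := Parts[0] + '|' + Parts[1];
      Result := (THashSHA2.GetHMAC(Payload, SECRET_KEY) = Parts[2]) and
                (UnixToDateTime(StrToInt64Def(Parts[1], 0)) > Now);
    end;

    var
      Token: string;
    begin
      Token := IssueToken('subscriber-42', 30);   // month-long "remember me"
      Writeln('Token valid: ', TokenIsValid(Token));
    end.

    If user data needs real partitioning (the HIPAA / PCI cases), the user id in the token would also have to drive row-level filtering or per-user credentials on the back end; for a simple paywall, proving the account is current is usually enough.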
  8. David Schwartz

    Record and process audio

    Mitov.com has a bunch of things you can use. You want AudioLab. It works great.
  9. While your approach makes a little sense, many of us learned very painfully that centralized collections of things become a huge, ugly mess over time. If you're using OOP, the notion of "encapsulation" carries over to other things besides classes. Classes are contained in things, and those things are contained in things, and so forth. Put stuff in the unit that manages it. Use some form of dependency injection to pass objects into others that need them, and Factories to request things you need from common locations.
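    To illustrate the injection-plus-factories point above, here's a tiny Delphi sketch (all names invented). The consumer is handed its dependency through its constructor instead of reaching into some central collection, and one small factory function is the only place that knows which concrete class to create.

    program InjectionSketch;
    {$APPTYPE CONSOLE}
    uses
      System.SysUtils;

    type
      // The consumer depends on an abstraction, not a concrete class.
      ILogger = interface
        ['{B3C0D2E1-0D5A-4B49-9B33-9A1D4C6E7F10}']
        procedure Log(const AMsg: string);
      end;

      TConsoleLogger = class(TInterfacedObject, ILogger)
        procedure Log(const AMsg: string);
      end;

      TOrderProcessor = class
      private
        FLogger: ILogger;
      public
        // Constructor injection: the dependency is passed in, not looked up globally.
        constructor Create(const ALogger: ILogger);
        procedure Process(const AOrderNum: string);
      end;

    procedure TConsoleLogger.Log(const AMsg: string);
    begin
      Writeln(AMsg);
    end;

    constructor TOrderProcessor.Create(const ALogger: ILogger);
    begin
      inherited Create;
      FLogger := ALogger;
    end;

    procedure TOrderProcessor.Process(const AOrderNum: string);
    begin
      FLogger.Log('Processing order ' + AOrderNum);
    end;

    // A trivial factory: callers ask here instead of constructing concretes themselves.
    function CreateLogger: ILogger;
    begin
      Result := TConsoleLogger.Create;
    end;

    var
      Processor: TOrderProcessor;
    begin
      Processor := TOrderProcessor.Create(CreateLogger);
      try
        Processor.Process('11452907');
      finally
        Processor.Free;
      end;
    end.

    Swapping TConsoleLogger for a file- or DB-backed logger then touches only the factory, not every unit that logs.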
  10. David Schwartz

    How to flip image taken from front camera

    There's a setting somewhere to tell the camera to "mirror" the selfie images. For whatever reason it's set OFF by default, so most people end up with their selfies backwards. I don't know why this is so frigging confusing to people, or why the folks who make the camera software don't make it easier to flip images so they look correct without having to open your Gallery and edit the images one-by-one. (If you want to fix it in code after capture, there's a rough sketch below.)
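    Here's a rough sketch of doing the flip in code, assuming an FMX TBitmap (e.g. the frame a camera component hands back); the procedure name is made up and it isn't tied to any particular camera API.

    uses
      System.UITypes, FMX.Graphics;

    // Mirror a bitmap horizontally in place by swapping pixels across the vertical axis.
    procedure MirrorHorizontally(const ABitmap: TBitmap);
    var
      Data: TBitmapData;
      X, Y: Integer;
      LeftPx, RightPx: TAlphaColor;
    begin
      if ABitmap.Map(TMapAccess.ReadWrite, Data) then
      try
        for Y := 0 to ABitmap.Height - 1 do
          for X := 0 to (ABitmap.Width div 2) - 1 do
          begin
            LeftPx := Data.GetPixel(X, Y);
            RightPx := Data.GetPixel(ABitmap.Width - 1 - X, Y);
            Data.SetPixel(X, Y, RightPx);
            Data.SetPixel(ABitmap.Width - 1 - X, Y, LeftPx);
          end;
      finally
        ABitmap.Unmap(Data);
      end;
    end;

    Per-pixel swapping is slow on large images; for production you'd more likely draw onto a second bitmap with a mirroring transform, but this shows the idea.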
  11. David Schwartz

    Do GetIt libs install any Help files?

    Not here.... I did find the help files, but they're pretty out-of-date. (Look at TRzMRUComboBox: only 2 of its numerous events are documented there. In particular, OnCloseup and OnSelect are missing. Also, TRzRegIniFile is missing a ton of stuff.)
  12. David Schwartz

    Do GetIt libs install any Help files?

    Where do I find them? And how do I then install them into the Help system?
  13. I have Raize Components (errr ... Konopka Bonus thingies, or whatever they're called now) installed, and hitting F1 for help doesn't do anything useful (this is in Tokyo 10.2.3). I know Raize Components had lots of great help files. So I'm curious whether the GetIt facility just chucks them aside, or if they're present but not installed. I'm not sure how to check.
  14. I've got a log file that I want to parse so I can use the data behind some kind of "dashboard". I don't have anything specific in mind yet for the dashboard, except for maybe an approach I describe at the end of this post. Let me describe what's going on briefly first.

    Basically, the client has a bunch of offices around the country, and at the end of the day someone drops a pile of forms into a scanner and it scans them all into a single PDF file. I call this an "aggregate PDF file." These aggregate files are uploaded to an FTP area for us. We get anywhere from 10-40 of them to process daily and they contain anywhere from one to 200+ scanned forms. It's a batch process. I was assigned to maintain the code that performs that batch process.

    When I took this on, there were some problems, but they were totally invisible so nobody knew about them. The log file was only documenting a few errors, and it turned out there were other errors that weren't being detected or reported. So my first task was to enhance the log file to the point where I could use it to audit the results, and ultimately track down errors. I've tracked down most of the errors at this point, and I've found structural issues and other things that have been around for ages that nobody considers as "problems", but that's a story for another day.

    Here's a snippet of the log file showing what's recorded for a typical aggregate file. This is one of 37 aggregate PDFs in this particular batch, and it contains four documents. One of them appears to be junk, which is common. (Some offices will drop dozens of the wrong forms into the scanner; we just ignore them.) Just FYI, there's an OCR process that occurs for each document where we try to extract a couple of numbers; they show up here as DocLink / Pick Ticket# / PTN, and OrderNum / Order#. The database is queried for a corresponding invoice that refers to one of these so they can be matched up. In a lot of cases, there isn't one. (Another issue that nobody sees as a problem.) Note that I separate different sections by ========================== flags. Don't look too closely or try to make sense of the details, as I've doctored it a bit to show some variations.

    ===============================================================================
    >> Processing Aggregate file 3/37: 0100_TEMSTOD4_%D%T_200507162634.pdf
    >> 4 Files extracted --
    =====================================
    -- Now processing single-page PDF (1 of 4)
    -- PDF file: 0100_TEMSTOD4_%D%T_200507162634.1.pdf
    >> Pick Ticket# found : 10058190
    >> Order # found : 11452907 (corrected from: 1[452907)
    =====================================
    -- Now processing single-page PDF (2 of 4)
    -- PDF file: 0100_TEMSTOD4_%D%T_200507162634.2.pdf
    >> Pick Ticket# found : 10052571 (corrected from: |0052571)
    >> Order # found : 11416133 (corrected from: I1416133)
    =====================================
    -- Now processing single-page PDF (3 of 4)
    -- PDF file: 0100_TEMSTOD4_%D%T_200507162634.3.pdf
    ?? didn't find a PTN or OrdNum where we expected them to be ()
    Maybe it's upside-down ... let's rotate it and try again
    :( Nothing useful here. Moving on...
    =====================================
    -- Now processing single-page PDF (4 of 4)
    -- PDF file: 0100_TEMSTOD4_%D%T_200507162634.4.pdf
    ?? didn't find a PTN or OrdNum where we expected them to be ()
    Maybe it's upside-down ... let's rotate it and try again
    >> Pick Ticket# found : 10061327
    >> Order # found : 11547908
    ============================
    == CombineMultiPageOrders ==
    >> RelatesToPage: first-pg=[0800_LVCopierBW05_08_2002_59_44.6.pdf] this-pg=[0800_LVCopierBW05_08_2002_59_44.7.pdf]
    >> RelatesToPage: first-pg=[0800_LVCopierBW05_08_2002_59_44.8.pdf] this-pg=[0800_LVCopierBW05_08_2002_59_44.9.pdf]
    -- Deleting file #8: C:\Loader\Out\0800_LVCopierBW05_08_2002_59_44.9.pdf
    -- Deleting file #7: C:\Loader\Out\0800_LVCopierBW05_08_2002_59_44.8.pdf
    -- Deleting file #6: C:\Loader\Out\0800_LVCopierBW05_08_2002_59_44.7.pdf
    -- Deleting file #5: C:\Loader\Out\0800_LVCopierBW05_08_2002_59_44.6.pdf
    >> 2 pages in C:\Loader\Out\2_1_0800_LVCopierBW05_08_2002_59_44.6.pdf | Doclink=10062731 OrderNum=11561321
    >> 2 pages in C:\Loader\Out\2_1_0800_LVCopierBW05_08_2002_59_44.8.pdf | Doclink=10060874 OrderNum=11558566
    ================================
    == ProcessDocumentAttachments ==
    >> Processing Attachment File 1/4: "C:\Loader\Out\0100_TEMSTOD4_%D%T_200507162634.1.pdf" DocLink=10058190 OrderNum=11452907
    == RemoveDuplicateAttachments ==
    -- Removing 1 Dup attachment(s) [20383603]
    -- TZipStore.Delete#1([20383603])
    -- ZipFileName: \\xxxx.yyyy.com\attachments\2020-05-07\21419503.zip
    -- TZipStore.Delete#2(20383603)
    -- ZipFileName: \\xxxx.yyyy.com\attachments\2020-05-07\21419503.zip
    == LinkAttachmentToDocument ==
    -- Pick Ticket found; linking with DocLink (PTN) = 10058190 docid = 3362739699
    -- INSERTED document attachment named "Pick_Ticket_p10058190.pdf" with document_id=3362739699
    -- TStorageMgr.StoreAttachment.FullName: \\xxxx.yyyy.com\attachments\2020-05-07\21419511.zip
    -- TStorageMgr.StoreAttachment( ZS(Not NIL), 20382703, C:\Loader\Out\0100_TEMSTOD4_%D%T_200507162634.1.pdf )
    -- an = "C:\Loader\Out\20382703"
    -- FileExists(C:\Loader\Out\20382703) = YES!
    -- ZS.Add(an) --> 1 !! SUCCEEDED !!
    -- StorageMgr: storing C:\Loader\Out\0100_TEMSTOD4_%D%T_200507162634.1.pdf --> 20382703 (renamed before adding)
    -- File: C:\Loader\Out\0100_TEMSTOD4_%D%T_200507162634.1.pdf does NOT exist (should NOT)
    -- File: C:\Loader\Out\20382703 DOES exist (should)
    >> Processing Attachment File 2/4: "C:\Loader\Out\0100_TEMSTOD4_%D%T_200507162634.2.pdf" DocLink=10052571 OrderNum=11416133
    ** No matching doc (TicketHash) found for PTN: 10052571
    ** Document with this order number already has a pick ticket (1 #row(s) found) : 11416133
    >> Processing Attachment File 3/4: "C:\Loader\Out\0100_TEMSTOD4_%D%T_200507162634.3.pdf" DocLink=10060319 OrderNum=-UNKNOWN-
    ** No matching doc (TicketHash) found for PTN: 10060319
    ** Page OCR read problem: Missing orderNum
    >> Processing Attachment File 4/4: "C:\Loader\Out\0100_TEMSTOD4_%D%T_200507162634.4.pdf" DocLink=10061327 OrderNum=-UNKNOWN-
    == RemoveDuplicateAttachments ==
    -- No duplicate attachments found
    == LinkAttachmentToDocument ==
    -- Pick Ticket found; linking with DocLink (PTN) = 10061327 docid = 3362739763
    -- INSERTED document attachment named "Pick_Ticket_p10061327.pdf" with document_id=3362739763
    -- TStorageMgr.StoreAttachment.FullName: \\xxxx.yyyy.com\attachments\2020-05-07\21419495.zip
    -- TStorageMgr.StoreAttachment( ZS(Not NIL), 20382705, C:\Loader\Out\0100_TEMSTOD4_%D%T_200507162634.4.pdf )
    -- an = "C:\Loader\Out\20382705"
    -- FileExists(C:\Loader\Out\20382705) = YES!
    -- ZS.Add(an) --> 1 !! SUCCEEDED !!
    -- StorageMgr: storing C:\Loader\Out\0100_TEMSTOD4_%D%T_200507162634.4.pdf --> 20382705 (renamed before adding)
    -- File: C:\Loader\Out\0100_TEMSTOD4_%D%T_200507162634.4.pdf does NOT exist (should NOT)
    -- File: C:\Loader\Out\20382705 DOES exist (should)
    ========================
    == SetFilesAsBillable ==
    -- IDs found that are ready to bill: [8124785177]
    ===================
    == UpdateWebview ==
    -- Checking to see if there are any datafiles to sync with webview ...
    -- IDs found to sync with webview: [21419511,21419495]
    ===============================================================================
    >> Processing Complete for Aggregate file #3/37 : 0100_TEMSTOD4_%D%T_200507162634.pdf ... cleaning up ...
    ===============================================================================

    What I want to do is parse these out to be used to support some kind of "dashboard". Today's log file was nearly 14k lines long and it's pretty useless as-is. It all looks like the same patterns over and over -- only the numbers and filenames look different, and none of them are really meaningful. What's helpful is to see how they relate, and sometimes to be able to view the documents themselves.

    As can be seen, the structure is fairly simple with some variations in each block. It's easy to use regular expressions to recognize and parse different parts. What I'm wondering is what might be the best approach to ingest data like this. I can tell, for instance, when I've started processing an aggregate PDF (the whole example above is one), and I can distinguish each of the different "sections" and "files" that are being processed. This much is easy. But would you build some kind of "parse tree" for this internally? Or would you just take the data as it's parsed and display it with bits and pieces attached as objects for when more details are wanted?

    Here's a statistical summary I show at the very end:

    ===============================================
    ============= S T A T I S T I C S =============
    ===============================================
    == 37 Aggregate PDF files
    == 900 Documents processed
    == 804 Pick Tickets identified -- 89%
    == 51 Corrected PTNs
    == 143 Unreadable PTNs
    == 754 Order Numbers identified -- 83%
    == 246 Corrected Order Nums
    == 142 Unreadable OrdNums
    == 0 Pick Tickets with no PTN found
    == 194 PTNs not matching any documents -- 21%
    == 96 Forms found that were not identified as Pick Tickets -- 10%
    == 187 Docs with no matching Order Nums
    == 2 Docs with OrdNum that already have a PT attached
    == 6 Docs attached using OrdNums -- 0%
    == 52 Pages rotated to get viable data -- 5%
    ===============================================

    I'm thinking it might be nice to have something similar to act as the "entry point" to the dashboard that lets you click on one of the lines and display data starting from that perspective. You could drill down to see details of what went into a given statistic. It might also allow you to see overlaps with other items if they were meaningful. There's some interesting data that could be gathered by looking at this data longitudinally. (Am I right in guessing that this edges into the world of "analytics"?)

    If you've got any experience with things like this, I'd be really interested in your thoughts on how you might approach it. Like ... would you store the parse tree anywhere? It takes less than 2 seconds to parse this file, so I'm not sure what might be gained from saving it. But I don't know ... that's what I'm asking about.
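    For what it's worth, here's a rough sketch of the flat "take the data as it's parsed" route I'm weighing, using System.RegularExpressions. The record fields, the patterns, and the log file name are just illustrative; a real version would also track aggregate-file and section boundaries and hang richer details off each entry for drill-down.

    program LogParseSketch;
    {$APPTYPE CONSOLE}
    uses
      System.SysUtils, System.Classes, System.Generics.Collections,
      System.RegularExpressions;

    type
      // One record per single-page PDF mentioned in the log.
      TDocEntry = record
        PdfFile: string;
        PickTicket: string;
        OrderNum: string;
      end;

    var
      Lines: TStringList;
      Docs: TList<TDocEntry>;
      Cur: TDocEntry;
      Line: string;
      M: TMatch;
    begin
      Lines := TStringList.Create;
      Docs := TList<TDocEntry>.Create;
      try
        Lines.LoadFromFile('loader.log');   // hypothetical file name
        for Line in Lines do
        begin
          M := TRegEx.Match(Line, '-- PDF file: (\S+\.pdf)');
          if M.Success then
          begin
            // A new single-page PDF starts a new entry.
            Cur := Default(TDocEntry);
            Cur.PdfFile := M.Groups[1].Value;
            Docs.Add(Cur);
            Continue;
          end;
          M := TRegEx.Match(Line, '>> Pick Ticket# found\s*:\s*(\d+)');
          if M.Success and (Docs.Count > 0) then
          begin
            Cur := Docs[Docs.Count - 1];
            Cur.PickTicket := M.Groups[1].Value;
            Docs[Docs.Count - 1] := Cur;
            Continue;
          end;
          M := TRegEx.Match(Line, '>> Order # found\s*:\s*(\d+)');
          if M.Success and (Docs.Count > 0) then
          begin
            Cur := Docs[Docs.Count - 1];
            Cur.OrderNum := M.Groups[1].Value;
            Docs[Docs.Count - 1] := Cur;
          end;
        end;
        Writeln(Docs.Count, ' documents parsed');
      finally
        Docs.Free;
        Lines.Free;
      end;
    end.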
  15. Yup. And the comments are not simply stating the obvious, but rather why you put this part here and are doing this other part later. And not just for your own benefit, but for people down the road years later. Here's a CASE STUDY of sorts for anybody who's interested. Grab something to drink first...

    I've been working on updating some code written over 10 years ago that is designed to take some scanned documents and attach them to invoices in a DB that they're related to. When done properly, the user can pull up an invoice and then view the scanned documents. The problem is to extract enough identifying data from the scanned documents to look up related records in the DB and then "attach" them somehow. There's nothing magical or proprietary here. It's just a brute-force process that anybody would have to do in similar situations.

    This client has a bunch of offices scattered around the country. They take orders, post them to a database, and later on send out a truck with the stuff that was ordered. Like what Amazon does -- only the Amazon folks scan a barcode and take a photo upon delivery, and the whole loop is closed. These guys print out a slip of paper, pull stuff from inventory, put it on a truck, send the driver out to deliver the stuff, and he comes back and puts the ticket in a pile. That pile is scanned into a single PDF file that's sent to us for processing. We split it up, scan each page, and attach them to related invoices that are already online. Easy peasy, right? HA!

    This application uses an OCR engine to extract a couple of numbers from each scanned image. The original author did a bunch of OCR processing all at once, and put the results into a big list of records. Then when ALL of the OCR processing was done, they came back and digested that data. I don't know why he did it that way -- it had a side-effect of producing hundreds if not thousands of files in a single folder, and anybody with much experience using Windows knows that's a recipe for a horribly slow process. Maybe they didn't process that many documents per day back when it was first written, but today it's a LOT!

    Part of the "digesting" looked up one of the numbers in a database in order to find an associated invoice#. If they found one, they proceeded to add a database record associating these numbers with the invoice. They went through this data in batches that corresponded to the piles of documents scanned in together in each office. Later on, they renamed the file that was scanned with the OCR, stuffed it into a Zip file, and recorded both the zip file name and the new name of the document in another record. So when you find the invoice online, you can see if it has this attachment and then pull up the attachment and view it.

    I found this process far more complicated than it needed to be. But after a while this approach began to make sense if you look at it as a linear process: scan, extract numbers, look up invoices, annotate the DB records, store the scanned document. It also had a benefit that the original author did recognize, which is that a large number of these scanned documents (15%-25%) did not have corresponding invoices to associate with in the DB. (It may not have been that high originally, but it's that high today. I don't know if anybody is even aware of that.) So if the numbers didn't yield a lookup, the document was chucked aside and they just went on to the next one.

    There's a big problem I ran into, however: due to some as-yet undetermined issues, the documents that were scanned (and renamed) sometimes are not getting added to a zip file. Because this step was further down the assembly line from where the records were added to the database associating the extracted numbers, the zip file, and the filename, the original author didn't take into account what to do if a given file didn't get stored in a zip file for some reason. Oops! So users would click to view one of these files, and they'd get an error saying the system can't find the document, because it's not in the zip file where it was expected to be.

    Another person might take an approach where each document is scanned, its numbers extracted, the DB looks up the invoice, and only then is it added to a zip file and saved. Each one would be processed in its entirety before the next one was looked at. There would appear to be a lot more integrity in this process because the point where the data is recorded to the DB is "closer" to when the files are added to a zip file -- so if the latter fails, the DB work can be reversed by using a "transaction". As it happens, you can't process each one completely before the next one, because some of them represent multi-page documents. We basically get piles of documents that all come from the same office, and they can be processed separately from each other pile. But within that pile, there may be 2, 3, 4, or more that are related and need to be stored together.

    Suffice it to say, none of this was documented in the code. There was just a rough process that I could follow, with bits and pieces sticking off to the side that didn't seem to have any relationship to anything until after observing a sufficiently large number of documents being processed and realizing they were dealing with "exceptions" of sorts.

    One thing I lobbied for early on was to replace the OCR engine with something newer and faster. After I did, I found all sorts of decisions that appeared to have been made because the original OCR process took so damn long -- part of which was a whole multi-threading mechanism set up to parallel-process things while the OCR engine was busy. (They could have just run the OCR as a separate thread, but they dumped most of the logic into the thread as well. I think that was because it processed so fast. Dunno.) With the new OCR engine, it takes 2-3 seconds to process each document in its entirety. The old OCR engine took 2-3 minutes per document. In situations like this, people rarely bother to explain why they organized things the way they did to account for the amount of time needed to do a particular part of the process. They figure it's pretty darn obvious. Well, no it wasn't. Especially after replacing the OCR engine with a much faster one.

    One of the first things I did was refactor things so I could switch between OCR engines to support some kind of A/B testing. In that process, I saw how (needlessly) tightly coupled the code was to the OCR process when all it was doing was pulling out two numbers. In the original approach, there was a ton of logic associated with the OCR process that I was able to pull out into a base class (in a separate unit) and reuse with the new OCR engine. I was able to re-do the overall process to better reflect the different activities, and the code became much easier to read.

    At this point, however, management has not given me the go-ahead to deal with the problem of losing attachments that can't be inserted into archives for whatever reason, but it won't be as hard to do now as it would have been in the original code. Management thought this would be maybe a week's worth of work. I spent several weeks just figuring out how the code worked! No documentation existed on it anywhere, of course, and virtually no comments existed in the code. It's finally coming to a close a couple of months later than expected, and they have no appetite to hear about the problems I ran into along the way. ANY kind of documentation would have been helpful in this case. But unfortunately, this is pretty much the norm from what I've seen over the years.
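    For anyone curious what the OCR-engine refactoring looked like in shape (this is not the actual code -- all names here are invented), it boiled down to an abstract base class in its own unit, thin wrappers around the old and new vendor engines, and a switch so the rest of the batch process never cares which one is running:

    program OcrSwitchSketch;
    {$APPTYPE CONSOLE}
    uses
      System.SysUtils;

    type
      TScanNumbers = record
        PickTicket: string;
        OrderNum: string;
      end;

      // Everything the pipeline needs from "an OCR engine" lives behind this class.
      TOcrEngine = class abstract
      public
        function ExtractNumbers(const APdfFile: string): TScanNumbers; virtual; abstract;
      end;

      // Hypothetical wrappers around the old and new vendor APIs (stubbed here).
      TLegacyOcrEngine = class(TOcrEngine)
      public
        function ExtractNumbers(const APdfFile: string): TScanNumbers; override;
      end;

      TFastOcrEngine = class(TOcrEngine)
      public
        function ExtractNumbers(const APdfFile: string): TScanNumbers; override;
      end;

    function TLegacyOcrEngine.ExtractNumbers(const APdfFile: string): TScanNumbers;
    begin
      // The old vendor call would go here.
      Result.PickTicket := '';
      Result.OrderNum := '';
    end;

    function TFastOcrEngine.ExtractNumbers(const APdfFile: string): TScanNumbers;
    begin
      // The new vendor call would go here.
      Result.PickTicket := '';
      Result.OrderNum := '';
    end;

    // The A/B switch: the batch code asks for "an engine" and doesn't care which it gets.
    function CreateEngine(AUseNewEngine: Boolean): TOcrEngine;
    begin
      if AUseNewEngine then
        Result := TFastOcrEngine.Create
      else
        Result := TLegacyOcrEngine.Create;
    end;

    var
      Engine: TOcrEngine;
      Numbers: TScanNumbers;
    begin
      Engine := CreateEngine(True);
      try
        Numbers := Engine.ExtractNumbers('0100_sample.1.pdf');
        Writeln('PTN=', Numbers.PickTicket, ' Order=', Numbers.OrderNum);
      finally
        Engine.Free;
      end;
    end.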