
David Schwartz

Members
  • Content Count

    1264
  • Joined

  • Last visited

  • Days Won

    26

Everything posted by David Schwartz

  1. Yup. And the comments are not simply stating the obvious, but rather why you put this part here and are doing this other part later. And not just for your own benefit, but for people down the road years later. Here's a CASE STUDY of sorts for anybody who's interested. Grab something to drink first... I've been working on updating some code written over 10 years ago that is designed to take some scanned documents and attach them to invoices in a DB that they're related to. When done properly, the user can pull up an invoice and then view the scanned documents. The problem is to extract enough identifying data from the scanned documents to look up related records in the DB and then "attach" them somehow. There's nothing magical or proprietary here. It's just a brute-force process that anybody would have to do in similar situations. There are a bunch of offices this client has that are scattered around the country. They take orders, post them to a database, and later on send out a truck with the stuff that was ordered. Like what Amazon does -- only the Amazon folks scan a barcode and take a photo upon delivery and the whole loop is closed. These guys print out a slip of paper, pull stuff from inventory, put it on a truck, send the driver out to deliver the stuff, and he comes back and puts the ticket in a pile. That pile is scanned into a single PDF file that's sent to us for processing. We split it up, scan each page, and attach them to related invoices that are already online. Easy peasy, right? HA! This application uses an OCR engine to extract a couple of numbers from each scanned image. The original author did a bunch of OCR processing all at once, and put the results into a big list of records. Then when ALL of the OCR processing was done, they came back and digested that data. 
I don't know why he did it that way -- it had a side-effect of producing hundreds if not thousands of files in a single folder, and anybody with much experience using Windows knows that's a recipe for a horribly slow process. Maybe they didn't process that many documents per day back when it was first written, but today it's a LOT! Part of the "digesting" looked up one of the numbers in a database in order to find an associated invoice#. If they found one, they proceeded to add a database record associating these numbers to the invoice. They went through this data in batches that corresponded to the piles of documents scanned in together in each office. Later on, they renamed the file that was scanned with the OCR, stuffed it into a Zip file, and recorded both the zip file name and the new name of the document in another record. So when you find the invoice online, you can see if it has this attachment and then pull up the attachment and view it. I found this process far more complicated than it needed to be. But after a while this approach began to make sense if you look at it as a linear process: scan, extract numbers, lookup invoices, annotate the DB records, store the scanned document. It also had a benefit that the original author did recognize, which is that a large number of these scanned documents (15%-25%) did not have corresponding invoices to associate with in the DB. (It may not have been that high originally, but it's that high today. I don't know if anybody is even aware of that.) So if the numbers didn't yield a lookup, the document was chucked aside and they just went on to the next one. There's a big problem I ran into, however: due to some as-yet undetermined issues, the documents that were scanned (and renamed) are not getting added to a zip file sometimes. 
Because this process was further down the assembly line from where the records were added to the database associating the extracted numbers, the zip file and filename, the original author didn't take into account what to do if a given file didn't get stored in a zip file for some reason. Oops! So users would click to view one of these files, and they'd get an error saying the system can't find the document, because it's not in the zip file where it was expected to be. Another person might take an approach where each document is scanned, its numbers extracted, the DB looks up the invoice, and only then is it added to a zip file and saved. Each one would be processed in its entirety before the next one was looked at. There would appear to be a lot more integrity in this process because the point where the data is recorded to the DB is "closer" to when the files are added to a zip file -- so if the latter fails, the DB process can be reversed by using a "transaction". As it happens, you can't process each one completely before the next one, because some of them represent multi-page documents. We basically get piles of documents that all come from the same office, and each pile can be processed separately from the others. But within that pile, there may be 2, 3, 4, or more that are related and need to be stored together. Suffice it to say, none of this was documented in the code. There was just a rough process that I could follow, with bits and pieces sticking off to the side that didn't seem to have any relationship to anything until after observing a sufficiently large number of documents being processed and realizing they were dealing with "exceptions" of sorts. One thing I lobbied for early on was to replace the OCR engine with something newer and faster. 
After I did, I found all sorts of decisions that appeared to have been made because the original OCR process took so damn long -- part of which was a whole multi-threading mechanism set up to parallel process things while the OCR engine was busy. (They could have just run the OCR as a separate thread, but they dumped most of the logic into the thread as well. I think that was because it processed so fast. Dunno.) With the new OCR engine, it takes 2-3 seconds to process each document in its entirety. The old OCR engine took 2-3 minutes per document. In situations like this, people rarely bother to explain why they organized things the way they did to account for the amount of time needed to do a particular part of the process. They figure it's pretty darn obvious. Well, no it wasn't. Especially after replacing the OCR engine with a much faster one. One of the first things I did was refactor things so I could switch between OCR engines to support some kind of A/B testing. In that process, I saw how (needlessly) tightly coupled the code was to the OCR process when all it was doing was pulling out two numbers. In the original approach, there was a ton of logic associated with the OCR process that I was able to pull out into a base class (in a separate unit) and reuse with the new OCR engine. I was able to re-do the overall process to better reflect different activities, and the code became much easier to read. At this point, however, management has not given me the go-ahead to deal with the problem of losing attachments that can't be inserted into archives for whatever reason, but it won't be as hard to do now as it would be in the original code. Management thought this would be maybe a week's worth of work. I spent several weeks just figuring out how the code worked! No documentation existed on it anywhere, of course, and virtually no comments existed in the code. 
It's finally coming to a close a couple months later than expected, and they have no appetite to hear about the problems I ran into along the way. ANY kind of documentation would have been helpful in this case. But unfortunately, this is pretty much the norm from what I've seen over the years.
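To make the transaction idea above concrete, here is a minimal sketch, not the actual application code. TFakeDb, AddToZip and AttachDocument are hypothetical names invented for illustration; a real version would use the transaction calls of whatever data-access layer is in place, and the unit of work would probably be a group of related pages rather than a single document.

```pascal
program AttachSketch;
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

type
  // Hypothetical stand-in for a DB connection; it only counts records.
  TFakeDb = class
  private
    FPending: Integer;
    FCommitted: Integer;
  public
    procedure StartTransaction;
    procedure AddAttachmentRecord(const InvoiceNo, DocName: string);
    procedure Commit;
    procedure Rollback;
    property Committed: Integer read FCommitted;
  end;

procedure TFakeDb.StartTransaction;
begin
  FPending := 0;
end;

procedure TFakeDb.AddAttachmentRecord(const InvoiceNo, DocName: string);
begin
  Inc(FPending);  // not visible to anyone until Commit
end;

procedure TFakeDb.Commit;
begin
  Inc(FCommitted, FPending);
  FPending := 0;
end;

procedure TFakeDb.Rollback;
begin
  FPending := 0;
end;

// Hypothetical archive step; fails for one name to simulate a lost attachment.
procedure AddToZip(const DocName: string);
begin
  if DocName = 'bad.pdf' then
    raise Exception.Create('could not add ' + DocName + ' to archive');
end;

// Record and archive one document as a single unit of work.
function AttachDocument(Db: TFakeDb; const InvoiceNo, DocName: string): Boolean;
begin
  Db.StartTransaction;
  try
    Db.AddAttachmentRecord(InvoiceNo, DocName);
    AddToZip(DocName);
    Db.Commit;      // only reached when the archive step succeeded
    Result := True;
  except
    on E: Exception do
    begin
      Db.Rollback;  // no DB record points at a file that is not in the zip
      Writeln('skipped: ', E.Message);
      Result := False;
    end;
  end;
end;

var
  Db: TFakeDb;
begin
  Db := TFakeDb.Create;
  try
    AttachDocument(Db, 'INV-1', 'good.pdf');
    AttachDocument(Db, 'INV-2', 'bad.pdf');
    Writeln('records committed: ', Db.Committed);  // 1
  finally
    Db.Free;
  end;
end.
```

The point of the ordering is that the commit comes last, so a failure in the archive step leaves nothing behind for a user to click on.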
  2. This doesn't seem to have anything to do with either OOP or the subject of Design Patterns. And your description is too microscopic to give the reader (this one, anyway) a grasp of what you're trying to accomplish. Like if you talked about feeding a string through several holes, using different criss-cross patterns, and making a knot at one end ... not many people would conclude you're talking about putting a shoelace on a shoe and tying it. Your description focuses more on the string, the holes, and the criss-cross pattern than what you're doing with them. Does that make sense? It's like you bought 1000 toilet seats and you're trying to figure out what to do with them by describing the kinds of toilets they fit on, when what you're really interested in is something like building public toilets in a part of the world where they don't have many of them. In other words, you're not focused on the right thing. Generally speaking, arrays are good for storing things if you have a finite number of items that can be indexed and accessed by their positions relative to each other. They're good for representing physical associations. Reserved parking spaces in a parking area, desks in a classroom, or seats in a restaurant. You'd typically number them 1..n and then assign each number to a person for some period of time. There's a one-to-one relationship, so you could refer to each one by their assigned number, or their name. At some point, the relationship expires, and the spot can be reallocated to someone else. The spot itself does not disappear, just the relationship it has to another person or object. That is, you don't "destroy" a parking spot, a desk in a classroom, or a table in a restaurant, right? In practice, you clean them up and re-use them. Similarly, you don't destroy a slot in an array, you simply clear it out and re-use it. 
If you have an inventory of something that is refreshed periodically, and it is consumed at a different rate by others, then you want a list on both sides. There's no relationship between items in one and consumers in the other. (It's called a "producer-consumer" problem.) There may be an ordering in terms of how inventory is acquired and then distributed, like First-In-First-Out (FIFO) or Last-In-First-Out (LIFO) or maybe it's random. There may be a maximum capacity (the size of a warehouse is fixed, but rarely filled to capacity). You could also have the inventory broken down by some attribute like size, shape, color, texture, some kind of date, etc., and stored in places related by those attributes. Things on a list usually come inside of containers, and you typically dispose of the containers once you hand out the goods inside of them. While you have them under your control, however, you use the containers to move the stuff around and identify what they contain. Anyway, I've discussed a number of examples here at a level of detail that I bet is sufficient for you to build a model around each one with no further information, based simply on your familiarity with the subject mentioned. You've used a lot of words to explain something at a level of detail that is nearly impossible for someone to know what you're talking about. That's ok, it's a common problem people have when learning new concepts. I myself tend to be very verbose because I like to be very precise and explicit, so I usually say too much. But it's usually all relevant. Why don't you try setting aside what you've written so far, and start over beginning with a one-sentence description of the problem you're trying to solve? If you make it similar to something that most people would understand, you'll need far fewer words to explain it. And you'll have a much clearer understanding of it in your own mind as well. BTW, this is called a "use-case model".
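A small sketch of the two shapes described above, under the assumption that parking spots and crates are just strings. The names Spots and Inventory are made up for illustration.

```pascal
program SlotsAndQueues;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Generics.Collections;

const
  SpotCount = 5;

var
  Spots: array[1..SpotCount] of string;  // '' means the spot is free
  Inventory: TQueue<string>;
begin
  // Fixed slots: the spot never goes away, only its occupant changes.
  Spots[3] := 'Alice';
  Spots[3] := '';     // the relationship expires; the slot is cleared
  Spots[3] := 'Bob';  // same slot, re-used
  Assert(Spots[3] = 'Bob');

  // Inventory: items arrive and leave at different rates, in FIFO order.
  Inventory := TQueue<string>.Create;
  try
    Inventory.Enqueue('crate A');
    Inventory.Enqueue('crate B');
    Writeln(Inventory.Dequeue);  // crate A (first in, first out)
    Writeln(Inventory.Count, ' item(s) left');
  finally
    Inventory.Free;
  end;
end.
```

Swapping TQueue for TStack from the same unit would give the LIFO ordering instead.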
  3. David Schwartz

    GridPanel question

    I have a gridpanel with some buttons on it, and I want to be able to move them around using the mouse (drag-n-drop). I found this article: https://stackoverflow.com/questions/5359948/drag-n-drop-controls-in-a-gridpanel The problem is it simply swaps the source and destination buttons. What I want is to move the source button to the spot where the destination button is and then have all of the others move up or down as needed. For example, if I have a list of 10 words, and I drag line 8 to line 3, I want it to push lines 3-7 down by one and leave this one at line 3. The gridpanel already has some flow logic built-in, but it's unclear whether you can use it to move things around. There's an Insert method, for example, but they say not to use it. Instead you merely change its parent. The general behavior is to add something to the panel by assigning its parent and it's added at the end. But if you delete something, it closes the gap that's left. Several of the methods and properties have no explanation, so it's not clear if this is a linear thing that can be accomplished by moving items around in the ControlCollection, or if you need to move them one-at-a-time from one (col,row) spot in the grid to the next. There aren't many examples I've found online for this control either. Does anybody here have any experience with this?
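The "drag line 8 to line 3" behavior itself can be sketched on a plain TStringList, which is only a model of the ordering. Whether the gridpanel's ControlCollection can be reordered the same way is exactly the open question, so nothing here should be read as the TGridPanel API.

```pascal
program MoveLineDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

var
  Lines: TStringList;
  I: Integer;
begin
  Lines := TStringList.Create;
  try
    for I := 1 to 10 do
      Lines.Add('word' + IntToStr(I));

    // Drag line 8 onto line 3 (indexes are 0-based, so 7 -> 2).
    Lines.Move(7, 2);

    Assert(Lines[2] = 'word8');  // the dragged item now sits at line 3
    Assert(Lines[3] = 'word3');  // old lines 3..7 moved down by one
    Assert(Lines[7] = 'word7');
    Assert(Lines[8] = 'word9');  // lines 9 and 10 are untouched

    for I := 0 to Lines.Count - 1 do
      Writeln(I + 1, ': ', Lines[I]);
  finally
    Lines.Free;
  end;
end.
```

With the order worked out this way, the remaining step would be to reassign each button to its new cell, however the control ends up supporting that.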
  4. David Schwartz

    GridPanel question

    Thanks, I'll check into that!
  5. David Schwartz

    Experience/opinions on FastMM5

    My comment was about the licensing terms, not the product or developer's choice. I expect that people who see no problem with this kind of licensing scheme will never again complain about taxes or having their rights taken away by so-called "government overreach" since they're really the same as a GPL license you agree to by supporting your government officials. And y'all will defend the government's right to seize your stuff because, well, you agreed to the terms when you voted for the shysters in the first place.
  6. David Schwartz

    Experience/opinions on FastMM5

    Oh, that one. "If I borrow your hammer then I have to give away everything I'll ever build with it in the future for free, even if it cost me a lot of time and money to build." Some folks have a strange notion of what "equity" and "balance" are about.
  7. David Schwartz

    multi-media question

    Well, generally speaking, yes. I'm simply trying to automate an otherwise highly mechanical process. It does not, however, come close to creating "every possible combination". Your graph thing simply requires custom drawing to the canvas. I would not bother with html, css, or js. WAY too much work. I think Lars suggested Tee-Chart because it offers a lot of options when it comes to customizing the display. I've seen some amazing things done with Tee-Chart that I never would have guessed are possible. In my case, I've got several different kinds of media that need to be merged and mixed in a synchronized way over some amount of time. It's a very different problem.
  8. David Schwartz

    Experience/opinions on FastMM5

    What is the practical impact of a GPL V3 license for those of us who don't keep up with such things?
  9. David Schwartz

    looking for a lo-fi Delphi Style

    A common practice used by lots of web and mobile app devs is to do the basic design work using what's often called a "lo-fi" theme or style. It looks like hand-drawn figures on paper with handwritten lettering. Balsamiq is a tool I found that offers this, but there are many others. I like using Delphi for basic prototyping, and I can often build a semi-functional prototype as quickly as graphic artists can build a static wire-frame model in drawing tools. Here's a decent article on the topic that I found with Google's help: https://www.justinmind.com/ui-kits/sketching-web-and-mobile-wireframes-with-justinminds-ui-kit The problem comes when the user sees what appears to be a fully-functional app, and thinks it's nearly finished. Uh, no, it's like a Hollywood sound stage. They can't tell, and I'm hard-pressed to prove it to them. I thought perhaps if there's a lo-fi style for Delphi, I could build something that LOOKS like it's hand-drawn so they don't mistake it for something more complete. Does anybody know of anything like this that's available?
  10. David Schwartz

    looking for a lo-fi Delphi Style

    I guess my original request just wasn't very clear. I know that if I ask this on SO, it'll get locked.
  11. David Schwartz

    looking for a lo-fi Delphi Style

    It's not a problem with bosses, but rather when Delphi is used for rapid prototyping and you build a mockup that LOOKS like a fully-functional application to someone (eg., clients or prospects). They can't tell the difference. You show them something that LOOKS "nearly complete" and say you want to charge them $10k or whatever to "build it out" and they think you're pulling their leg. It doesn't go very well in my experience. I can show them the code, but they don't know what they're looking at. Why duplicate effort? Why spend hours building a mockup in one tool and then re-doing that in another just because you can't make the UI look ... uh ... let's say ... less refined? The UI is all anybody has to go by. They can't tell it's analogous to a Hollywood sound stage and there's nothing behind the facades. As an aside, I thought it was fascinating to see the HGTV series on The Brady Bunch home rebuild. They had a street view of a house they used in the TV series to make it appear like what their "home" looked like from the outside, when in fact it never existed except inside of various sound stages in a big warehouse. They had to double the size of the real house and do some massive remodeling to make it come close to what audiences believed was what it should look like. It took 6 months, a couple million dollars, and it still wasn't an exact replica of what was shown on TV. When you're selling your home and you "stage" it, you don't build a mockup in a warehouse of exactly what they're looking for. You pimp up the home itself to give buyers a look at what it might look like. But people know that's not THEIR furniture and they don't expect it to look exactly that way when they move in. Software is different -- they look at what you show them and if it's "too polished" they think THAT'S HOW IT'S GOING TO LOOK AND WORK. And if it LOOKS AND FEELS FINISHED, they don't take you seriously if you try to convince them it's not. 
Why double your workload during the design stage by building a mockup on a sound stage in a warehouse when Delphi lets you build a rapid prototype just as fast?
  12. David Schwartz

    looking for a lo-fi Delphi Style

    I'd really love something that gets this kind of effect right within Delphi. Colors are no problem. Typography is no problem. Maybe buttons and widgets that look like they're hand-drawn would require a different skin or CustomDraw method? Any thoughts?
  13. David Schwartz

    Threading question

    Well, I still haven't been able to install MadExcept, but I did figure out that one problem was a constructor where I neglected to call 'inherited'. It didn't get fully initialized and was the source of one of the AVs. Another one is where I seem to be calling a method on a freed object. Haven't tracked that one down yet. Probably an object allocated via an Interface.
  14. David Schwartz

    Threading question

    I'm blocked waiting on help from our IT Dept getting madExcept installed. I'm having file permissions issues, and I don't have permissions to fix them either. I realize that it's necessary at times for devs to have access to production networks to test their work, but giving us unrestricted access carries with it some pretty severe security protocols that make you wonder who they're trying to keep out exactly. I've never worked anywhere that imposes such draconian restrictions on developers. It makes work extremely cumbersome and often impossible without IT's intervention. I'm sure IT isn't happy with it either.
  15. David Schwartz

    Threading question

    The VSoft.Messaging.Channel unit is referring to VSoft.WeakReference (containing TWeakReferencedObject) in its uses clause. That file is not in the package.
  16. David Schwartz

    Threading question

    @Kas Ob. it does not seem to make a difference. An AV is still happening. Maybe from something else.
  17. David Schwartz

    Threading question

    I cannot change the infrastructure, per se, but I can change how this app works for its own needs. The means of sending data between the thread and the form is of no consequence outside the app. The form does not communicate with the thread once the thread is initiated. The thread does, however, communicate with some supporting services and I can't change that part. But as far as I can tell, they're working fine. Just one question ... mostly what I need to pass is strings, 25-250 characters in length, mostly single lines under 80 chars. Using this mechanism, would I need to pass them as arrays of bytes rather than as pointers? (I think this is where my current problem lies in that the pointers to the objects or strings being sent are getting zapped.) Since these are Records, are they passed by value (ie, copied)? If so, would it make sense to break things up into, say, 127-char blocks and send them that way?
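On the copy-versus-pointer question, one way to sidestep raw pointers entirely is to let the RTL keep the string alive. This is a sketch using TThread.Queue, which is a different mechanism from the one being discussed in the thread. PostStatus and Received are hypothetical names, and the hand-pumped loop is only there because a console program has no message loop.

```pascal
program QueueStringDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

var
  Received: TStringList;  // touched only by the main thread

// Hypothetical helper: hands one status line to the main thread.
procedure PostStatus(const Msg: string);
var
  LocalMsg: string;
begin
  LocalMsg := Msg;  // the closure below holds a reference, so the text stays valid
  TThread.Queue(nil,
    procedure
    begin
      Received.Add(LocalMsg);
    end);
end;

var
  Worker: TThread;
begin
  Received := TStringList.Create;
  try
    Worker := TThread.CreateAnonymousThread(
      procedure
      var
        I: Integer;
      begin
        for I := 1 to 3 do
          PostStatus(Format('status line %d', [I]));
      end);
    Worker.FreeOnTerminate := False;
    Worker.Start;
    try
      // Pump queued calls by hand until the worker is done, then drain the rest.
      while not Worker.Finished do
        CheckSynchronize(10);
      CheckSynchronize;
    finally
      Worker.Free;
    end;
    Writeln(Received.Count, ' lines received');
  finally
    Received.Free;
  end;
end.
```

Because Delphi strings are reference counted, a record with a string field that is copied by value shares the same text until someone modifies it, so breaking messages into fixed-size blocks should not be needed for lifetime reasons alone.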
  18. David Schwartz

    Threading question

    I've been trying to install MadExcept and not having much luck so far. We have severe security restrictions to deal with in our environment. In particular, our normal logins do not have Admin privs. We have a separate login "a-<username>" that DOES have Admin privs, but it's only temporary. It's kinda like sudo but more restrictive. The install is putting the data into the Admin's registry, not the normal user's registry. So it's there if I run Delphi as Admin but not otherwise. When I try to add the BPLs to Delphi with my normal login, I get an error saying I don't have permission to read the files. Not even the Help files! We can't change ownership, and are restricted in terms of granting wider visibility to files. We're even blocked from running RegEdit. Getting a slow response from IT Dept isn't helping. Grrrr....
  19. David Schwartz

    Threading question

    I know there are a couple of very long and detailed books that have been written that amount to "The Tao of Multi-Threading in Windows (and Delphi)". I'll leave it to others to master this topic since I have not really needed to do so up to this point in my career. My experience with multi-threading is with kernels that take perhaps 10 pages to explain the entire model. They're very simple, succinct, consistent, and designed to support the needs of (soft) real-time interrupt-driven programming. I cut my teeth on the "Manager vs. Monitor" debates that flared up in the 1980s as they applied to multi-processing, and I did a lot with both tightly-coupled and loosely-coupled multi-processing. Then DOS showed up and people mostly laughed. Then Windows showed up and people gagged. Need I mention some of the high-profile failures Windows NT caused during the 90's when it was used to control real-time process-control applications? (Does anybody remember the fiasco around the baggage handling system at Denver International Airport that delayed its opening by over a year? The popular press blamed the DBMS they were using, but I used the same DBMS and it worked flawlessly; the guys I talked with there said it was how Win NT was so inconsistent in how it processed things that made it keep locking up.) I think the numerous episodes like this speak quite eloquently for themselves. Several people have pointed out that the original developers of this code may have also lacked a depth of understanding based on what I've shown here. From comments in the code, I'd say the threading was added around 2004 and it does not appear to have been significantly altered since then, except for a few odd bits here and there given the comments that were left. The threading model that this app uses is implemented in several units that form the core of a couple dozen other apps, and I'll get skewered if I make any changes that affect them. 
My task is simply to migrate this thing from Win XP to Win 10, and there are issues showing up in Win 10 that may have been around forever but weren't manifesting in XP. All I know is that with minimal calls to the tracing & logging methods, everything works fine. When I send more and more tracing data out, I hit a point where I'm getting AVs in unpredictable areas. I can tell there are some race conditions going on. I just can't tell exactly where they're happening or why, and how to either fix them or circumvent them. I'm trying different things people have suggested, but to little avail so far. I do appreciate the support and suggestions, tho.
  20. David Schwartz

    Threading question

    It seems to me that if the string is gone at the time this method is called, then copying it is only going to trigger the same AV fault, right? If it's not, then it's unnecessary.
  21. David Schwartz

    Threading question

    Well, there's a problem right off the bat.

        procedure StatusOut(const stat: string);
        var
          AMsg: string;
        begin
          AMsg := UniqueString(stat);

    'stat' is a const arg, and UniqueString is looking for a 'var' arg.

        procedure UniqueString(var str: UnicodeString); overload;

    Perhaps the 'const' argument is part of the problem here?
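For reference, a form of that call that should compile is to assign first and then call UniqueString on the local variable, since it is a procedure taking a var parameter rather than a function. This is only a sketch of the syntax; it says nothing about whether the copy cures the AV.

```pascal
program UniqueStringDemo;
{$APPTYPE CONSOLE}

uses
  System.SysUtils;

procedure StatusOut(const stat: string);
var
  AMsg: string;
begin
  AMsg := stat;        // take a reference to the caller's string
  UniqueString(AMsg);  // then force AMsg to own a private copy
  Writeln(AMsg);
end;

begin
  StatusOut('worker started');
end.
```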
  22. David Schwartz

    issues with non-Win platforms

    I just created a new VM with a fresh copy of Win10 and a new license, then loaded in some common utilities and Delphi 10.3.3 Rio via the ISO. (I realized I forgot to run the setup as Admin. Will that pose any problems down the line?) It looked like all of the various options are enabled by the installer, so I just loaded everything up. I started installing various libraries, and got to some TMS libs. They have the ability to select which of the different platforms you want to use. I seem to have bad luck with them a lot because all I get mostly is Win32 and Win64. All the others generate errors. In this case, the MacOS platforms fail because there's a folder named Import that's not found. Earlier versions of Delphi have some kind of Platform Manager that let you install different options that I guess aren't installed by default, but I can't find it in 10.3.3. I loaded up the REST multi-platform demo and it let me add every platform other than Linux64. I didn't try to build anything, but in the past if you could load it, it would build. Still the TMS installer chokes, but I don't think this is TMS' fault. What changed from 10.3.2 --> 10.3.3 vis a vis the platform installation / setup / selections ? I've had this problem with previous versions but was able to fix it by poking and prodding the Delphi installer and Platform Manager settings. What am I missing here? (Right now, I'm mainly interested in MacOS64, iOS Simulator, and Android64. All I can successfully install is Win32 and Win64.)
  23. David Schwartz

    Threading question

    I'll give it a try and let you know.
  24. David Schwartz

    Threading question

    Can anybody guess why they'd both write to the TMemo and use a separate logger to save the same data to disk instead of just calling memo.Lines.SaveToFile( ) periodically? In a sense, the memo is a queue, although it's not flushed and cleared. Is there a chance the logger is the source of these AVs? (I tried to set up a TStringList as a buffer instead of the memo and logger, but I couldn't get it to flush when the program terminated.)
  25. David Schwartz

    Threading question

    FWIW, I'd like to think I have a better-than-average understanding of multi-threading. I spent the first 10 years of my career in the real-time machine-control world and even spent a couple of years working on a real-time multitasking kernel. The thing is, these platforms were all designed from the ground-up to do this, and they were very consistent and predictable. In contrast, I find Windows to be more of a swampy mess when it comes to multi-threading. And when you layer Delphi on top of that, it only gets worse. Thankfully, I have not had to deal with it much. So I do find this mess rather confusing to deal with. The fact that you guys keep arguing with each other about things that should be quite simple only points to the underlying complexity of how Windows offers up its ugly, hydra-like threading models. It makes me feel like it's not just my own stupidity. But I do really appreciate the help. 🙂