Mike Torrettinni

How to remember units and classes, where is what?


So my first attempt at structuring classes in separate units was very successful. Very good separation of logic and presentation details, classes, inheritance, some poly-thingy probably and more... across multiple units. Nice and good. Happy.

Now, 2 weeks later, I started a new feature and said OK, let me review how I did it then, so I can start with a similar concept:

 

Open the main Form -> the feature is added as a Frame ->

open the Frame... aha, no code coupled with the controls, so I have no idea what is shown where -> I want to see how I present data in one of the Virtual Treeviews ->

I remember I used a name like 'presentation layer', where all the data is 'pushed' to the controls, so let's look for it ->

the Frame only uses the main data class, in the data unit, OK go there -> open the main data unit ->

search for 'presentation'... found TPresentationLayer -> open the presentation unit ->

browse through the TPresentationLayer class in the presentation layer unit ->

find an OnGetText for a Virtual Treeview, not the right one... browse through the next 2 OnGetText methods... aha: finally found the array and what is presented in the selected Virtual Treeview.

 

I don't get it. At the time of implementing the feature it looked great, everything structured, nice - but I knew where everything was because I was working on the feature. 2 weeks later and it's like it's been 10 years: I'm looking for where the data is defined, where the enums are used and how, where the array is filled, processed and then displayed. Luckily Ctrl+Click works very well, so I can jump back and forth through the units; if that wasn't working... I can't imagine.

 

What am I doing wrong - did I do 'too much' separation of logic and UI? Do I have such a bad memory that I can't remember the details of 6 small units for a feature after 2 weeks?

How do you deal with this, how do you find out later on what is where and all the details?

 

Any advice appreciated, even though you can't see my code.

 

 


When using something new or something you use rarely, writing comments in the code can be helpful.

For each unit, mention the related units, how the types relate, and how to "set things up" with regard to how to "link" related types.
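For example, a short header like this at the top of a unit goes a long way (the unit and class names here are just made up for illustration):

unit Feature.Presentation;

{
  Purpose : pushes data from TFeatureData (see Feature.Data) into the
            controls owned by TFeatureFrame (see Feature.Frame).
  Setup   : the frame creates one TPresentationLayer, passing itself and
            the shared TFeatureData instance; the presentation layer then
            hooks the Virtual Treeview events (OnGetText etc.).
  Related : Feature.Data (data classes), Feature.Frame (UI).
}

interface

implementation

end.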

 

It might feel like a waste of time since the source is there, but yeah, memory doesn't really improve with age.

 

Also - the idea you had when you broke it into six units may eventually lead to the removal of some of those units, so taking a note of why the design is the way it is, is helpful.

I usually know how my libs mature by the amount of code I can delete or simplify. When things gel properly, the solutions often turn out simpler than the original idea.

 

And then there is the eternal question: Are you solving a problem, or are you writing a framework?

My favorite pitfall that I keep falling into.

44 minutes ago, Lars Fosdal said:

Are you solving a problem, or are you writing a framework?

In my work there's rarely a difference. I have a lot of clients where the job boils down to "our code sucks - fix it". The problem is often the complete lack of a framework - or a badly designed one.

 

49 minutes ago, Lars Fosdal said:

I usually know how my libs mature by the amount of code I can delete or simplify. When things gel properly, the solutions often turn out simpler than the original idea.

Yes!

In my work process I usually refactor and redesign iteratively, continuously changing stuff toward a distant goal while making sure the new code still works. In the beginning the size of the code and the complexity grows as the solution now contains both the old mess and the new solution. Then at some point it's as if the junk suddenly evaporates and I'm left with amazingly little code. Always a good feeling.

4 hours ago, Lars Fosdal said:

Are you solving a problem, or are you writing a framework?

When I started on that feature, I had an idea that it would be really easy to plug'n'play other features (I guess like a little framework), but eventually it ended up so customized for this specific feature - otherwise I could still be working on it, refactoring minor details.

 

3 hours ago, Anders Melander said:

Then at some point it's as if the junk suddenly evaporates and I'm left with amazingly little code. Always a good feeling. 

I can't wait to get there 🙂

 

So, I guess you guys are saying documentation and comments and refactoring - there is nothing else that could help with understanding the code at a later date.


That sums it up. 

 

Use the xml doc feature for methods and properties, and do it straight away. 

If it is postponed, it never happens. 
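A minimal sketch of what that looks like (class and method names here are just placeholders; the /// comment syntax is what Help Insight picks up):

type
  TPresentationLayer = class
  public
    /// <summary>
    ///   Returns the text to display in a given row and column of the
    ///   Virtual Treeview.
    /// </summary>
    /// <param name="ARow">Zero-based index into the underlying data array.</param>
    /// <param name="AColumn">Column index of the treeview.</param>
    /// <returns>The cell text, or an empty string if the row is out of range.</returns>
    function CellText(ARow, AColumn: Integer): string;
  end;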

 

A colleague asked me how to use a specific feature yesterday...

So... I opened a couple of related units I wrote some time ago, and they had NO xml doc, so yeah, that never happened.


The interface sections alone are 1070 and 350 lines, so...

I guess I really should work more on my own documentation rather than offer advice :classic_blush:

 

In my defense, they are widely used in our projects, so there are plenty of usage examples :classic_rolleyes:


The best way is to understand your design well in the first place. My reading of your various posts is that a lot of the time you are developing your code by following methodologies used like a recipe. I suspect that with a deeper understanding of why these methodologies were being used, you wouldn't find your code alien when you encounter it.


 

2 hours ago, David Heffernan said:

My reading of your various posts is that a lot of the time you are developing your code by following methodologies used like a recipe.

I'm gaining experience, with mistakes and failures along the way. Rome wasn't built in a day.

 

4 hours ago, Lars Fosdal said:

I guess I really should work more on my own documentation rather than offer advice 

It shouldn't, but it gives me a little comfort knowing that not everybody is perfect and I'm not the only one with an odd 'relationship' with documentation.


Good or bad memory doesn't matter. You need to be able to navigate and understand the code even if you didn't write it yourself.

I would guess your biggest problem lies in the hard-to-find coupling of controls with the actual code. In other words, poor code navigation.

Code navigation is, for me personally, more important than separation. Well, I know, controversial. And don't get me wrong, I'm all for separation, but try to find a way that allows you to navigate your code nicely and still keep your separation.
When you have a dozen projects, each with hundreds of units... well, you will be thankful to be able to navigate the code with ease.
Do not follow the hype just because. There should be a reason to do it. Overcomplicating things doesn't solve anything.

You wrote that your main form references the frame - maybe it shouldn't? Maybe it should reference the presentation layer instead. Maybe you should couple the frame with the presentation layer there, or at least have a call to the coupling there?

I know the unpleasant WTF moment when you look at a form and have no idea where the code is that's supposed to be there... then you go on a search, wasting a lot of time... just because someone decided to implement a fully decoupled system just for the sake of it.

I would suggest two solutions:
1. Hide the frame from yourself. The main form (or better said, your entry point) should reference the presentation layer, where you can find the coupling of the controls and the frame that is used.

or...

2. Let the frame couple itself with the presentation layer (see the sketch below). There is something beautiful about being able to click a button in the form designer and see where the call goes.
You need to do the coupling somewhere anyway.
There is little reason to remove *all* the code from the frame. And if you do, ask yourself what the real benefit is for you. Are you solving a problem or are you introducing one?
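A rough sketch of option 2 - class names like TPresentationLayer and TFeatureData are placeholders for whatever you have, and the exact OnGetText signature depends on your VirtualTrees version:

unit Feature.Frame; // hypothetical names throughout

interface

uses
  System.Classes, Vcl.Forms, VirtualTrees, Feature.Data, Feature.Presentation;

type
  TFeatureFrame = class(TFrame)
    vstItems: TVirtualStringTree;
    // Assigned in the designer, so clicking the event in the Object Inspector
    // (or Ctrl+Click in code) jumps straight to where the control gets its data.
    procedure vstItemsGetText(Sender: TBaseVirtualTree; Node: PVirtualNode;
      Column: TColumnIndex; TextType: TVSTTextType; var CellText: string);
  private
    FPresentation: TPresentationLayer; // the "what to show" logic lives here
  public
    destructor Destroy; override;
    // Called by the main form right after it creates the frame.
    procedure BindData(AData: TFeatureData);
  end;

implementation

{$R *.dfm}

procedure TFeatureFrame.BindData(AData: TFeatureData);
begin
  // The frame couples itself to the presentation layer in one obvious place.
  FPresentation := TPresentationLayer.Create(AData);
  vstItems.RootNodeCount := FPresentation.RowCount;
end;

destructor TFeatureFrame.Destroy;
begin
  FPresentation.Free;
  inherited;
end;

procedure TFeatureFrame.vstItemsGetText(Sender: TBaseVirtualTree;
  Node: PVirtualNode; Column: TColumnIndex; TextType: TVSTTextType;
  var CellText: string);
begin
  // Thin handler: just forward to the presentation layer.
  CellText := FPresentation.CellText(Node.Index, Column);
end;

end.

The main form then only calls FeatureFrame.BindData(MainData) and never needs to know about the treeview at all.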

 

 

2 hours ago, Pawel Piotrowski said:

There is little reason to remove *all* the code from the frame. And if you do, ask yourself what the real benefit is for you. Are you solving a problem or are you introducing one?

With each iteration I learn something new and which way is appropriate for which implementation. It's fun 🙂

On 5/4/2020 at 12:54 PM, Mike Torrettinni said:

So, I guess you guys are saying documentation and comments and refactoring - there is nothing else that could help with understanding the code at a later date.

Yup. And the comments should not simply state the obvious, but rather explain why you put this part here and are doing this other part later. And not just for your own benefit, but for the people who come along years down the road.

 

Here's a CASE STUDY of sorts for anybody who's interested. Grab something to drink first...

 

I've been working on updating some code written over 10 years ago that is designed to take some scanned documents and attach them to invoices in a DB that they're related to. When done properly, the user can pull up an invoice and then view the scanned documents. The problem is to extract enough identifying data from the scanned documents to look up related records in the DB and then "attach" them somehow. There's nothing magical or proprietary here. It's just a brute-force process that anybody would have to do in similar situations.

 

This client has a bunch of offices scattered around the country. They take orders, post them to a database, and later on send out a truck with the stuff that was ordered. Like what Amazon does -- only the Amazon folks scan a barcode and take a photo upon delivery and the whole loop is closed. These guys print out a slip of paper, pull stuff from inventory, put it on a truck, send the driver out to deliver the stuff, and he comes back and puts the ticket in a pile. That pile is scanned into a single PDF file that's sent to us for processing. We split it up, scan each page, and attach them to related invoices that are already online. Easy peasy, right? HA!

 

This application uses an OCR engine to extract a couple of numbers from each scanned image. The original author did a bunch of OCR processing all at once, and put the results into a big list of records. Then when ALL of the OCR processing was done, they came back and digested that data.  I don't know why he did it that way -- it had a side-effect of producing hundreds if not thousands of files in a single folder, and anybody with much experience using Windows knows that's a recipe for a horribly slow process. Maybe they didn't process that many documents per day back when it was first written, but today it's a LOT!

 

Part of the "digesting" looked up one of the numbers in a database in order to find an associated invoice#. If they found one, they proceeded to add a database record associating these numbers to the invoice. They went through this data in batches that corresponded to the piles of documents scanned in together in each office. 

 

Later on, they renamed the file that was scanned with the OCR, stuffed it into a Zip file, and recorded both the zip file name and the new name of the document in another record. So when you find the invoice online, you can see if it has this attachment and then pull up the attachment and view it.

 

I found this process far more complicated than it needed to be. But after a while the approach began to make sense if you look at it as a linear process: scan, extract numbers, look up invoices, annotate the DB records, store the scanned document.

 

It also had a benefit that the original author did recognize, which is that a large number of these scanned documents (15%-25%) did not have corresponding invoices in the DB to associate them with. (It may not have been that high originally, but it's that high today. I don't know if anybody is even aware of that.) So if the numbers didn't yield a lookup, the document was chucked aside and they just went on to the next one.

 

There's a big problem I ran into, however: due to some as-yet undetermined issue, the documents that were scanned (and renamed) sometimes don't get added to a zip file. Because that step was further down the assembly line from where the records were added to the database (associating the extracted numbers, the zip file name and the document name), the original author didn't take into account what to do if a given file didn't get stored in a zip file for some reason. Oops! So users would click to view one of these files, and they'd get an error saying the system can't find the document, because it's not in the zip file where it was expected to be.

 

Another person might take an approach where each document is scanned, its numbers extracted, the DB looks up the invoice, and only then is it added to a zip file and saved. Each one would be processed in its entirety before the next one was looked at. There would appear to be a lot more integrity in this process, because the point where the data is recorded to the DB is "closer" to when the file is added to a zip file -- so if the latter fails, the DB change can be reversed with a transaction.
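The transactional idea, roughly - FireDAC-style transaction calls and System.Zip shown here, with the OCR, lookup and renaming helpers as placeholders for whatever the real code does:

// uses System.Zip, FireDAC.Comp.Client
procedure ProcessScannedPage(const AScanFile: string;
  AConnection: TFDConnection; AArchive: TZipFile);
var
  DocumentNo, InvoiceNo: string;
begin
  DocumentNo := ExtractDocumentNumber(AScanFile);       // OCR step
  InvoiceNo  := LookupInvoice(AConnection, DocumentNo); // DB lookup
  if InvoiceNo = '' then
    Exit; // no matching invoice: set the page aside and move on

  AConnection.StartTransaction;
  try
    AddAttachmentRecord(AConnection, InvoiceNo, DocumentNo); // DB record
    AArchive.Add(AScanFile, RenamedFileName(DocumentNo));    // store the page
    AConnection.Commit;
  except
    // If the archive write fails, the DB never claims the attachment exists.
    AConnection.Rollback;
    raise;
  end;
end;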

 

As it happens, you can't process each one completely before the next one, because some of them represent multi-page documents. We basically get piles of documents that all come from the same office, and each pile can be processed separately from the others. But within a pile there may be 2, 3, 4, or more pages that are related and need to be stored together.

 

Suffice it to say, none of this was documented in the code. There was just a rough process that I could follow, with bits and pieces sticking off to the side that didn't seem to have any relationship to anything until after observing a sufficiently large number of documents being processed and realizing they were dealing with "exceptions" of sorts.

 

One thing I lobbied for early on was to replace the OCR engine with something newer and faster. After I did, I found all sorts of decisions that appeared to have been made because the original OCR process took so damn long -- part of which was a whole multi-threading mechanism set up to process other things in parallel while the OCR engine was busy. (They could have just run the OCR as a separate thread, but they dumped most of the logic into the thread as well. I think that was because it processed so fast. Dunno.)

 

With the new OCR engine, it takes 2-3 seconds to process each document in its entirety. The old OCR engine took 2-3 minutes per document.

 

In situations like this, people rarely bother to explain why they organized things the way they did to account for the amount of time needed for a particular part of the process. They figure it's pretty darn obvious. Well, no, it wasn't -- especially after replacing the OCR engine with a much faster one.

One of the first things I did was refactor things so I could switch between OCR engines to support some kind of A/B testing. In that process, I saw how (needlessly) tightly coupled the code was to the OCR process, when all it was doing was pulling out two numbers. In the original approach there was a ton of logic associated with the OCR process that I was able to pull out into a base class (in a separate unit) and reuse with the new OCR engine.
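The shape of that refactoring was roughly this (names invented here; each concrete class wraps one of the OCR engines, implementations omitted):

type
  // Shared logic - file handling, parsing the two numbers out of the raw
  // OCR text, and so on - lives in the base class.
  TOcrEngine = class abstract
  protected
    // Engine-specific: run OCR on one page and return the raw text.
    function RunOcr(const AImageFile: string): string; virtual; abstract;
  public
    // Shared: runs the engine and extracts the identifying numbers.
    function ExtractNumbers(const AImageFile: string): TArray<string>;
  end;

  TLegacyOcrEngine = class(TOcrEngine)
  protected
    function RunOcr(const AImageFile: string): string; override;
  end;

  TNewOcrEngine = class(TOcrEngine)
  protected
    function RunOcr(const AImageFile: string): string; override;
  end;

The rest of the pipeline only ever sees TOcrEngine, so which engine runs - or running both for an A/B comparison - comes down to which constructor gets called.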

 

I was able to re-do the overall process to better reflect the different activities, and the code became much easier to read. At this point, however, management has not given me the go-ahead to deal with the problem of losing attachments that can't be inserted into archives for whatever reason, but it won't be as hard to do now as it would have been in the original code.

 

Management thought this would be maybe a week's worth of work. I spent several weeks just figuring out how the code worked! No documentation existed on it anywhere, of course, and virtually no comments existed in the code. It's finally coming to a close a couple months later than expected, and they have no appetite to hear about the problems I ran into along the way. 

 

ANY kind of documentation would have been helpful in this case. But unfortunately, this is pretty much the norm from what I've seen over the years.

 


I find using unit namespaces to be extremely helpful in structuring code

 

e.g. VSoft.Core.Utils.String

 

I can look at that unit name and know which project it's in (core) and which folder it's in (Utils).
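So a uses clause more or less documents itself (the last unit name below is made up, just following the same scheme):

uses
  System.SysUtils,           // RTL
  VSoft.Core.Utils.String,   // Core project, Utils folder
  VSoft.Ordering.Reports;    // Ordering project, Reports folder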

 

I always prefix my units (in new code; not all older code has been changed yet) with VSoft (the company name) - it helps

a) avoid conflicts with other libraries

b) let me differentiate between the code I wrote and a library.

 

I really wish all library vendors would do the same. Utils.pas in 20 different libraries is not helpful!

 

Other than that, I will say code navigation is not one of RAD Studio's strengths... hopefully with the move to LSP this will improve in the next few releases.

 

