Mike Torrettinni

List of usable RegEx for source code


As we can see in the Range Check Error ERangeError topic (https://en.delphipraxis.net/topic/4825-range-check-error-erangeerror/), a RegEx script could have found the issue.

 

So, I wanted to share my 3 simple scripts I run on my code every now and then to make sure I don't make mistakes:

 

1. To find all array increments by 1, like SetLength(array, Length(array)+1):

SetLength.*\(.*Length\(.*\).*\+.*1

 

2. The same when using High() - this is an error anyway, so it's good to find it!

SetLength.*\(.*High\(.*\).*\+.*1

 

3. If I forgot to add -1 when iterating an array (although I use High most of the time, I still use Length() - 1 sometimes):

0 to Length\(.*\).do
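A minimal sketch of how the three patterns above could be run over source lines, written in Python only because its `re` module is handy for this. The sample Delphi lines and the `scan` helper are made up for illustration; they are not from the original post.

```python
import re

# The three patterns from the post, unchanged.
PATTERNS = {
    "grow by Length+1": r"SetLength.*\(.*Length\(.*\).*\+.*1",
    "grow by High+1": r"SetLength.*\(.*High\(.*\).*\+.*1",
    "loop to Length without -1": r"0 to Length\(.*\).do",
}

# Hypothetical Delphi lines, invented for this illustration.
SAMPLE = [
    "SetLength(Items, Length(Items)+1);",
    "SetLength(Items, High(Items) + 1);",
    "for i := 0 to Length(Items) do",
    "for i := 0 to Length(Items) - 1 do",
    "for i := 0 to High(Items) do",
]


def scan(lines):
    """Return (line_number, pattern_name, text) for every pattern match."""
    hits = []
    for number, text in enumerate(lines, start=1):
        for name, pattern in PATTERNS.items():
            if re.search(pattern, text):
                hits.append((number, name, text))
    return hits


for number, name, text in scan(SAMPLE):
    print(f"{number}: [{name}] {text}")
```

In this sample only the first three lines are reported; the two correct loops are left alone. The patterns work line by line, so a statement split across several lines would be missed.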

 

 

If anybody wants to share any scripts they use, please do!

 

 


I'd rather say that it's impossible to write a RegEx that will reliably catch the issues listed in the original post. The new Delphi LSP may provide improvements in this area.


Yes, I did, keeping in mind that RegEx provides only a false sense of safety. RegEx can be used as an additional tool to well-established tools - static analysis and unit tests.

14 hours ago, 0x8000FFFF said:

RegEx can be used as an additional tool to well-established tools - static analysis and unit tests.

Yes, I agree. 

Edited by Mike Torrettinni

6 minutes ago, Mike Torrettinni said:

Found another one:

 

For a missing - 1 after List.Count:

0 to .*Count[ ]do

With this one I found, in 3rd-party libraries, a few cases of:

ItemsCount := Items.Count - 1;
...
for i := 0 to ItemsCount do

Valid code, I guess, but tricky.
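The false positive described above can be reproduced with a small sketch. The sample lines are hypothetical; only the pattern comes from the post.

```python
import re

# Pattern from the post: a loop bound ending in "Count" right before "do".
COUNT_LOOP = re.compile(r"0 to .*Count[ ]do")

# Hypothetical Delphi loop headers, invented for this illustration.
lines = [
    "for i := 0 to List.Count do",      # off by one
    "for i := 0 to List.Count - 1 do",  # correct
    "for i := 0 to ItemsCount do",      # valid if ItemsCount = Items.Count - 1
]

flagged = [text for text in lines if COUNT_LOOP.search(text)]
for text in flagged:
    print(text)
```

The third line is flagged even though it can be valid, because the pattern only sees the identifier ending in Count, not how the variable was assigned.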

14 hours ago, Mike Torrettinni said:

That should be "raise e[a-z]*\(", right?

Yes.

 

Hm, thinking about this, it would be better as

raise [a-z]*\(

Because there might be exception names that do not start with an "e". And of course the match should be case insensitive.
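A small sketch of the case-insensitive variant. The excerpt does not say what the pattern is meant to catch; the assumption here is that it targets raise statements where the class name is followed directly by a parenthesis, i.e. a missing .Create. The sample lines are invented.

```python
import re

# Pattern from the post, compiled case-insensitively as suggested.
RAISE_CALL = re.compile(r"raise [a-z]*\(", re.IGNORECASE)

# Hypothetical Delphi statements, invented for this illustration.
lines = [
    "raise Exception('boom');",        # class name followed directly by "("
    "raise EMyError.Create('boom');",  # constructor call, not matched
    "RAISE MyFailure('boom');",        # no leading "E", upper-case keyword
    "raise;",                          # re-raise, not matched
]

flagged = [text for text in lines if RAISE_CALL.search(text)]
for text in flagged:
    print(text)
```

Note that `[a-z]*` covers letters only, so a class name containing digits or underscores would slip through this pattern.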

Edited by dummzeuch
17 hours ago, Mike Torrettinni said:

1. To find all array increments by 1, like SetLength(array, Length(array)+1):

SetLength.*\(.*Length\(.*\).*\+.*1

What's wrong with this one?

 

I'd add checks for integer typecasting: "Integer(..." and "Cardinal(..." to catch incorrect pointer casts, which will likely fail on x64.
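One possible shape for such a check, as a sketch. The pattern below is a guess at what was meant, not something posted in the thread, and the sample lines are invented. It cannot tell a pointer cast from a harmless one, so every hit still needs a manual look.

```python
import re

# Hypothetical pattern: any typecast to Integer or Cardinal.
# The \b keeps identifiers such as ToInteger( from matching.
CAST = re.compile(r"\b(Integer|Cardinal)\s*\(", re.IGNORECASE)

# Hypothetical Delphi lines, invented for this illustration.
lines = [
    "Tag := Integer(Obj);",                         # pointer-sized value cut to 32 bits on x64
    "Addr := Cardinal(P) + Offset;",                # same problem
    "Addr := NativeUInt(P) + Offset;",              # pointer-sized type, not matched
    "Count := Integer(Length(S));",                 # harmless, but still matched
    "function ToInteger(Value: string): Integer;",  # not a cast, not matched
]

flagged = [text for text in lines if CAST.search(text)]
for text in flagged:
    print(text)
```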

11 minutes ago, Fr0sT.Brutal said:

What's wrong with this one?

I used to have a lot of cases where I would increase the array size by +1 when adding new records, instead of pre-allocating x records and increasing only if needed. Perhaps it's specific to my old ways of doing things, but now I try not to. Of course, if I know it's going to be only a few records, then I make this known in comments. But I try to avoid it.
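To put rough numbers on the difference, here is a small simulation that only counts how many resize calls each strategy makes. It is a sketch under stated assumptions: the doubling policy and the initial capacity of 16 are picked for illustration, and a resize call is not necessarily a real memory move, since the memory manager may be able to grow a block in place.

```python
def resize_calls_grow_by_one(item_count):
    """Count resize calls when the array grows by one slot per added item."""
    calls = 0
    for _ in range(item_count):
        calls += 1  # stands in for SetLength(arr, Length(arr) + 1)
    return calls


def resize_calls_preallocated(item_count, initial_capacity=16):
    """Count resize calls when capacity is doubled only once it runs out."""
    capacity = 0
    used = 0
    calls = 0
    for _ in range(item_count):
        if used == capacity:
            # Assumed growth policy: start at initial_capacity, then double.
            capacity = initial_capacity if capacity == 0 else capacity * 2
            calls += 1
        used += 1
    return calls


for count in (10, 10_000):
    print(count, resize_calls_grow_by_one(count), resize_calls_preallocated(count))
```

For 10 items the difference is 10 calls against 1, which hardly matters; for 10,000 items it is 10,000 calls against 11.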

Just now, Mike Torrettinni said:

I used to have a lot of cases where I would increase the array size by +1 when adding new records, instead of pre-allocating x records and increasing only if needed. Perhaps it's specific to my old ways of doing things, but now I try not to. Of course, if I know it's going to be only a few records, then I make this known in comments. But I try to avoid it.

I see, but in general it's not a mistake. "Hint" level, not even "Warn" 🙂

Just now, Fr0sT.Brutal said:

I see, but in general it's not a mistake. "Hint" level, not even "Warn" 🙂

Well, yes, of course. But it can bite you really fast when suddenly you have a customer with different data: instead of 10K items with 1-10 properties per item (using Length + 1), it turns into 10 items with 10K properties each. Then it suddenly becomes a bottleneck.

2 hours ago, Mike Torrettinni said:

Well, yes, of course. But it can bite you really fast when suddenly you have a customer with different data: instead of 10K items with 1-10 properties per item (using Length + 1), it turns into 10 items with 10K properties each. Then it suddenly becomes a bottleneck.

On the contrary, you could end up messing with reservation and optimization for 10-item arrays. FastMM already reserves some space after strings and arrays, so reallocation won't happen every time.

2 minutes ago, Fr0sT.Brutal said:

On the contrary, you could end up messing with reservation and optimization for 10-item arrays. FastMM already reserves some space after strings and arrays, so reallocation won't happen every time.

Hm, OK, I didn't know that. Maybe my original case was a bit different, but it eventually proved to be a bottleneck with constant calls to Length()+1.

8 minutes ago, Fr0sT.Brutal said:

On the contrary, you could end up messing with reservation and optimization for 10-item arrays. FastMM already reserves some space after strings and arrays, so reallocation won't happen every time.

What about multithreaded programs?

26 minutes ago, Fr0sT.Brutal said:

What about them?

Well, won't they suffer when you make lots of reallocations? That's always been my experience.

4 minutes ago, David Heffernan said:

Well, won't they suffer when you make lots of reallocations? That's always been my experience.

Sure they will, but we're going too deep in details here. There's no super-universal algo for every case. Dealing with 10 and 10000 items should be done via different approaches.

4 hours ago, Fr0sT.Brutal said:

Sure they will, but we're going too deep in details here. There's no super-universal algo for every case. Dealing with 10 and 10000 items should be done via different approaches.

Really? Contention on a lock has the same impact irrespective of how many items are in the collection.

15 hours ago, David Heffernan said:

Really? Contention on a lock has the same impact irrespective of how many items are in the collection.

There's no super-universal algo for every case. Dealing with single-threaded / several-threaded / multi-threaded and non-performance-demanding / performance-critical applications should be done via different approaches.

😉

31 minutes ago, Fr0sT.Brutal said:

There's no super-universal algo for every case. Dealing with single-threaded / several-threaded / multi-threaded and non-performance-demanding / performance-critical applications should be done via different approaches.

😉

It's always a good idea to minimise contention on process wide locks, which is why it is best not to call SetLength over and over when that can readily be avoided. 
