Jump to content
Mike Torrettinni

DynArraySetLength doesn't check for NewLength = OldLength

Recommended Posts

I use arrays a lot, so of course also SetLength(array, length); after filing it, like this:

 

// initialize array to fit all expected items + a little wiggle room
SetLength(array, initial length + x);
// current item index
curr_index := 0;  
begin
 // fill in the array and increase index with each line
 // if curr_index (+1) > Length(array) : resize array to new length to fit more items
 Inc(curr_index);
end;
// finalize array
SetLength(array, curr_index);

 I thought that I don't really need to SetLength, if it's already filled to the max:

// finalize array, only if needed
if curr_index <> Length(array) then
  SetLength(array, curr_index);

So, I was looking into the DynArraySetLength to see what it does if you try to resize to the same length as it is. And DynArraySetLength checks for 0, <0, erros... but it doesn't check for NewLength = OldLength. This happens only a few calls later in ReallocMem -> GetMem -> AllocateLargeBlock. And it does some memory related stuff, but eventually doesn't resize or allocates new memory space.

 

So it goes through all the memory stuff, instead of just simple check if we are resizing to the same size.

 

Anybody using SetLength with condition? Does it make sense or not?

 

Share this post


Link to post
Guest
52 minutes ago, Mike Torrettinni said:

Anybody using SetLength with condition? Does it make sense or not?

I do that, but after collecting enough evidences that it will make sense, not everywhere.

 

Not only with SetLength but with other situation where loops with hundreds or thousands of iterations will be performed, eg. in a case where searching for TBytes in an array, sometimes i do check length before call the compare, in other places where the length is the same but i have thousands of hash values, so i check the first byte then call the compare, thus way i reduced the overhead of the call to 1 in 256 instead of paying it each compare, sometimes i do that with strings !

 

The moral is : you have the right to do so, just make sure it will bring you a benefit not only longer and less readable code.

Share this post


Link to post
1 hour ago, Mike Torrettinni said:

So it goes through all the memory stuff, instead of just simple check if we are resizing to the same size.

Just a note: A check for old = new length is only sufficient if the array is of one dimension. Even if the length in the current dimension stays the same there can still be changes in the inner dimensions.

  • Like 1

Share this post


Link to post
48 minutes ago, Uwe Raabe said:

Just a note: A check for old = new length is only sufficient if the array is of one dimension. Even if the length in the current dimension stays the same there can still be changes in the inner dimensions.

Very good point, I rarely use dynamic multidimensional arrays, so I didn't consider this.

 

57 minutes ago, Kas Ob. said:

The moral is : you have the right to do so, just make sure it will bring you a benefit not only longer and less readable code.

If DynArraySetLength had this simple check, it would be so much easier - I would ignore and never come back to this question 🙂

Share this post


Link to post
27 minutes ago, Mike Torrettinni said:

If DynArraySetLength had this simple check, it would be so much easier

Perhaps a QP entry might be helpful?

Share this post


Link to post
10 minutes ago, David Heffernan said:

I'm struggling to appreciate what the problem is. I don't think SetLength behaves incorrectly. Am I missing something?

At a quick glance it seems DynArraySetLength could use a simple check before trying to set size which is the same as original. This way it would avoid additional checking and calls, which at the end do nothing or result in the same size as before.

The method has a check if newLength <= 0 and Exit if True, it could have similar check if newLength = oldLength and Exit if True.

 

Of course I could be completely off here.

 

Share this post


Link to post
3 minutes ago, David Heffernan said:

What difference would that make?

🙂 It never ends well when you ask 2 questions in a row - usually means I'm way over my head.

I have no idea what answer are you looking for that would make sense to you, so I can only answer with something like this:  the difference would be that the missing check would not be missing anymore.

Share this post


Link to post
1 minute ago, David Heffernan said:

Why would a check be needed? What is the defect that you are claiming because there is no check?

I can't classify it as defect, is more a wish that it wouldn't do all sorts of calls, runs through asm code, Delphi code... just to end up with the same result as it began with.

I wish I didn't go and check... 🙂

Share this post


Link to post

But, at the end, do you, or do you not, end up with an array that is the correct length, and respects the contents that existed before you called SetLength?

 

If perhaps you are worried that there is a performance hit, measure it. I can't believe we would still be having this debate about premature optimisation. You remember us mentioning this to you before?

 

If you want to make your code run faster, why don't you optimise the parts that take the most time?

Share this post


Link to post
Just now, David Heffernan said:

But, at the end, do you, or do you not, end up with an array that is the correct length, and respects the contents that existed before you called SetLength?

 

If perhaps you are worried that there is a performance hit, measure it. I can't believe we would still be having this debate about premature optimisation. You remember us mentioning this to you before?

 

If you want to make your code run faster, why don't you optimise the parts that take the most time?

I see now where you were going, no, this case is not about performance. It was more curiosity, will it copy to new memory space, will it just do nothing or something else. I would expect a check and exit.

I don't know a lot how memory works, so looking at DynArraySetLength was pretty interesting and not that much complicated, so the missing check was pretty obvious.

 

 

Share this post


Link to post

It's just a realloc of a block the same size, which is a null op and nothing happens. Unless you have a pathologically insane memory manager. But no memory manager I know of would do anything other than null op for this realloc.

  • Thanks 1

Share this post


Link to post
2 hours ago, Uwe Raabe said:

Perhaps a QP entry might be helpful?

I would file it if I had a case where the changes would have meaningful effect, but I don't have one. Maybe it's better not to waste their time on this.

 

  • Like 1

Share this post


Link to post
1 hour ago, David Heffernan said:

It's just a realloc of a block the same size, which is a null op and nothing happens. Unless you have a pathologically insane memory manager. But no memory manager I know of would do anything other than null op for this realloc.

In fact it can be a serious bottleneck !!! 

1 - A copy operation may happen even if the block wasn't resized (same size) ... weak-ref :

procedure DynArraySetLength(var a: Pointer; typeInfo: Pointer; dimCnt: NativeInt; lengthVec: PNativeint);
//....
if SysHasWeakRef(PTypeInfo(ElTypeInfo)) then
      begin
        if newLength < oldLength then
          minLength := newLength
        else
          minLength := oldLength;
        GetMem(pp, neededSize);
        FillChar((PByte(pp) + SizeOf(TDynArrayRec))^, minLength * elSize, 0);
        if p <> nil then
        begin
         // ---> here <---
          MoveArray(PByte(pp) + SizeOf(TDynArrayRec),
                     PByte(p) + SizeOf(TDynArrayRec), ElTypeInfo, minLength);
          if newLength < oldLength then
            FinalizeArray(PByte(p) + SizeOf(TDynArrayRec) + newLength*elSize, ElTypeInfo, oldLength - newLength);
          FreeMem(p);
        end;
      end

2 - The operation executes on O(n) when the array is multidimensional :

 // Take care of the inner dimensions, if any
  if dimCnt > 1 then
  begin
    Inc(lengthVec);
    Dec(dimCnt);
    i := 0;
    try
      while i < newLength do
        begin
          DynArraySetLength(PPointerArray(p)[i], ElTypeInfo, dimCnt, lengthVec);
          Inc(i);
        end;
    except
      // Free arrays on exception
      for j := 0 to i  do
        _DynArrayClear(PPointerArray(p)[j], ElTypeInfo);
      _DynArrayClear(p, typeInfo);
      raise;
    end;
  end;
 //---------------------------------
 var
  LArray: array of array of Integer;
begin
  SetLength(LArray, 100,10);
  SetLength(LArray, 100,10); // DynArraySetLength x100
end;

3 - The function can ruin the cache ! because it dereferences some rtti data and it relies on MM to check for the block size. If the check was implemented inside the function than we won't dereference any unnecessary data.

4 - Adding a simple check to compare old vs new length will be much better and avoids all the issues above.

1 hour ago, Mike Torrettinni said:

I would file it if I had a case where the changes would have meaningful effect, but I don't have one. Maybe it's better not to waste their time on this.

 

I think that you should fire a case.

  • Like 1

Share this post


Link to post
Guest
12 minutes ago, Mahdi Safsafi said:

SetLength(LArray, 100,10); // DynArraySetLength x100

You are right, and to add, this line will call ReallocMem ending at the memory manager 1000 times which in turn will read the length of the allocation causing stress/thrash/pollute of the CPU cache even further.

Share this post


Link to post
1 hour ago, Mahdi Safsafi said:

SetLength(LArray, 100,10); // DynArraySetLength x100

I wouldn't overplay this. Jagged arrays have terrible performance. Your problem is not that all these ReallocMems can be skipped, but aren't. Your problem is that you are using a jagged array rather than a multidimensional array. Your problem is that you did 101 allocations (not 1000) rather than 1.

Edited by David Heffernan

Share this post


Link to post
25 minutes ago, David Heffernan said:

I wouldn't overplay this.

OP was referring to SetLength (not jagged/multidimensional array) and he was wondering whether a check against old-length should be performed before calling SetLength. So I think it makes a sense and worth to be mentioned 🙂 

Share this post


Link to post
9 minutes ago, Mahdi Safsafi said:

OP was referring to SetLength (not jagged/multidimensional array) and he was wondering whether a check against old-length should be performed before calling SetLength. So I think it makes a sense and worth to be mentioned 🙂 

It was you that referred to jagged arrays. Once you start using them, for rectangular data, you've given up caring about performance.

 

I didn't see any copying when I looked at this. I don't see any evidence that performance is a significant issue here. 

Edited by David Heffernan

Share this post


Link to post
2 minutes ago, David Heffernan said:

It was you that referred to jagged arrays. Once you start using them, for rectangular data, you've given up caring about performance.

All the points I demonstrated are related to the use of SetLength without a check.

Quote

I didn't see any copying when I looked at this. I don't see any evidence that performance is a significant issue here. 

A copy happens when array contains weak-references:

// DynArraySetLength:
if SysHasWeakRef(PTypeInfo(ElTypeInfo)) then
begin
...
	GetMem(pp, neededSize);
    FillChar((PByte(pp) + SizeOf(TDynArrayRec))^, minLength * elSize, 0);
    MoveArray(PByte(pp) + SizeOf(TDynArrayRec), PByte(p) + SizeOf(TDynArrayRec), ElTypeInfo, minLength);
end

 

Share this post


Link to post

@David Heffernan Please step into for the second call and you'll clearly see that the second call performs a copy even size didn't changed :

type
 TRec = record //
  [Weak]
    FInterface: IInterface;
  end;

procedure TForm1.Button1Click(Sender: TObject);
var
  LArray: array of TRec;
begin
  SetLength(LArray, 100);
  SetLength(LArray, 100); // step in
end;

 

Share this post


Link to post
3 minutes ago, David Heffernan said:

OK, so no copy, and not much else, unless it's a jagged array or has weak refs. 

Or addressing it at the RTL level by implementing a simple check in DynArraySetLength.

Share this post


Link to post
Just now, Mahdi Safsafi said:

Or addressing it at the RTL level by implementing a simple check in DynArraySetLength.

I'm just saying that it's very unlikely that there will be real world code that suffers. That said, I don't know about weak refs so that could be significant. 

  • Like 1

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×