Jump to content
Mike Torrettinni

Why TList uses a lot more memory than TArray?

Recommended Posts

I'm loading a lot of data from files, so I was testing the difference between TList and TArray memory consumption. And it seems TList uses a lot more memory.

 

This is my simple test:

 

procedure TForm8.Button1Click(Sender: TObject);
type
  TDataRec = record
    ID: integer;
    Name: string;
  end;

var vDataRec: TDataRec;
    vDataList: TList<TDataRec>;
     vArray: TArray<TDataRec>;
    i, vMax: integer;
begin

  vMax := 10000000;

  vDataList := TList<TDataRec>.Create;
  try
    for i := 1 to vMax do
    begin
       vDataRec := Default(TDataRec);
       vDataRec.ID := i;
       vDataRec.Name := 'X';
       vDataList.Add(vDataRec);
    end;
  finally
    vDataList.Free;
  end;

  SetLength(vArray, vMax);
  for i := 0 to vMax-1 do
  begin
    vArray[i].ID := i;
    vArray[i].Name := 'X';
  end;
  vArray := nil;

end;

And here are results as I see them in Task Manager in Memory (active private working set) column - I believe this is the one that can be monitored when/if Out of memory error occurs.

 

TList<TDataRec>:

Starting memory:  3,716 K

Filled list: 369,488 K

Consumption: 365,772 K

 

TArray<TDataRec>:

Starting memory:  5,548 K

Filled list: 316,588 K

Consumption: 311,040 K

 

TList<TDataRec> vs TArray<TDataRec> = 54,732 K = 17%

 

17% is not extreme, but it is significant difference.

 

 

But! If the TDataRec only has 1 integer:

 

TDataRec = record
    ID: integer;
  end;

Now the difference is unacceptable:

 

TList<TDataRec>:

Starting memory:  3,720 K

Filled list: 69,700 K

Consumption: 65,980 K

 

TArray<TDataRec>:

Starting memory:  4,208 K

Filled list: 43,272 K

Consumption: 39,064 K

 

TList<TDataRec> vs TArray<TDataRec> = 26,916 K = 68%

 

68% seems extreme difference!

 

Is this normal, or am I missing something very obvious?

 

 

Edited by Mike Torrettinni

Share this post


Link to post

What happens, if you add

vDataList.Capacity := vMax;

before filling vDataList.

vDataList is using array internally, but it is constantly grown in your example. It might be the cause of those issues.

Also, you really should run those test so, that one test is only one run on that execution of test program. Otherwise testing array might make use of memory that was already allocated  during list testing (Delphi memory manager does not return memory to windows, when it is freed.

  • Like 1
  • Thanks 1

Share this post


Link to post

TList grows its internal array by some factor (1.5 IIRC). So if you use just Add, there are big chances that list will have free space at the end of test

  • Like 1

Share this post


Link to post
9 minutes ago, Fr0sT.Brutal said:

TList grows its internal array by some factor (1.5 IIRC). So if you use just Add, there are big chances that list will have free space at the end of test

Probably depends on Delphi version. In XE it grows by 1 TList<T>.Add.

Or not. I did not check the content of Grow.... It doubles in size in XE. But it can still cause memory fragmentation, if reallocation has to relocate memory block.

 

Edited by Virgo

Share this post


Link to post

Actually the defect is in your test code. Set the capacity when you create the list. Just as you do with the array. 

 

Obviously you need to compare like with like. 

Edited by David Heffernan
  • Like 3
  • Thanks 1

Share this post


Link to post

Like others said, TList has a capacity growing  strategy.

 

The second, and more important  thing is, that the memory manager doesn't release the memory back to windows.
So what you might getting is this:
when you add items to the list, and it tries to resize, it might happen, that there is not sufficient  memory in the current place to just in place resize. so the memory manager copies the whole array to a new location, that can hold the size of the new array.
but the old memory space is still kept reserved by the memory manager.
So when you use the task manager, windows sees both.
Andd BTW, the memory manager doesn't request the exact size either. Like TList, it has a growth strategy, too.

 

 

  • Like 1

Share this post


Link to post
40 minutes ago, Pawel Piotrowski said:

The second, and more important  thing is, that the memory manager doesn't release the memory back to windows.

Stop even trying to justify this. The two pieces of code are not comparable. 

 

You can see the single SetLength call. That's missing for the list. There's nothing more to say. 

Share this post


Link to post

Thanks! By setting Capacity (vDataList.Capacity := vMax;) I can make it work as efficient as TArray. I was under impression TList has this magical power of handling with memory, I guess it still needs a little help here and there.

 

 

Share this post


Link to post
24 minutes ago, Mike Torrettinni said:

Thanks! By setting Capacity (vDataList.Capacity := vMax;) I can make it work as efficient as TArray. I was under impression TList has this magical power of handling with memory, I guess it still needs a little help here and there.

 

 

How could it possibly know how many items were going to be added?

 

In reality you often don't know this. I would not be surprised if your array code has lots of lines like SetLength(arr, Length(arr) + 1) which are hideously inefficient.

Share this post


Link to post
1 hour ago, David Heffernan said:

Stop even trying to justify this. The two pieces of code are not comparable. 

 

You can see the single SetLength call. That's missing for the list. There's nothing more to say. 

I agree. But others before me pointed that already out 🙂

I was trying to explain (not justify) the differences between the two samples given, the one with 17% more memory and the second with 68% more.
You can explain the +17% with the missing SetLength, but not the +68% more memory.
More knowledge doesn't hurt, and if someone uses the task manager to compare memory usage, I suppose it is nice to know about the memory manager itself, isn't it?

  • Like 1

Share this post


Link to post

I thought that TList has the benefit over array because it doesn't need a contiguous memory block to keep data. Array needs it, while TList keeps a list of pointers to memory where it stores data, so it just finds next available free space and never needs a contiguous block. I guess it doesn't work that way.

Share this post


Link to post

No it doesn't. It's just a wrapper for an array. Effectively it's the exact same data type but with a lot of helper methods to let you write higher level code. And an allocation strategy that overallocates to avoid performing heap allocation every time you add an item.

 

I'm not sure that there are many collection libraries that implement array like collections as anything other than a contiguous array. I know my own personal code has some classes that do this, to avoid address space fragmentation. Less of a problem now that we have 64  bit. 

  • Like 1

Share this post


Link to post

This is actually strange realization, right now... seems useful, if it would be like that. I use similar approach in showing data in Virtual Treeview, where records can be added at the end of the array, while displayed as sub node of 1st node - dynamic tree generations, I guess.

Share this post


Link to post

There are other structures that don't actually need contiguous areas of memory, e.g. linked lists or some tree structures. But of course these have other weaknesses.

Share this post


Link to post
9 minutes ago, dummzeuch said:

There are other structures that don't actually need contiguous areas of memory, e.g. linked lists or some tree structures. But of course these have other weaknesses.

I'm talking about an array like structure (i.e. O(1) random access, contiguous indices) whose backing allocation is not contiguous.

 

I'm not aware of many (if any) collections in widespread use that are like this.  I introduced such classes in my codebase in the bad old 32 bit days when address space fragmentation was killing me.

Share this post


Link to post
28 minutes ago, Mike Torrettinni said:

I thought that TList has the benefit over array because it doesn't need a contiguous memory block to keep data. Array needs it, while TList keeps a list of pointers to memory where it stores data, so it just finds next available free space and never needs a contiguous block. I guess it doesn't work that way.

What you describe could avoid memory reallocations and moving items if reallocation could not happen inplace but has a lot of other performance considerations - see https://softwareengineering.stackexchange.com/questions/267870/are-noncontiguous-arrays-performant

  • Like 1

Share this post


Link to post
12 minutes ago, Stefan Glienke said:

What you describe could avoid memory reallocations and moving items if reallocation could not happen inplace but has a lot of other performance considerations - see https://softwareengineering.stackexchange.com/questions/267870/are-noncontiguous-arrays-performant

Thanks, interesting. All disadvantages aside, it would solve many Out of memory errors, when it's getting close to limit, but only goes over because of the need for contiguous block.

Share this post


Link to post
46 minutes ago, David Heffernan said:

Less of a problem now that we have 64  bit.

I agree, less of a problem, not without it completely. I was thinking for may years that 64bit will solve my memory issues... then I realized that many customers don't have 32GB, some are OK with just 4GB of memory.

Share this post


Link to post
37 minutes ago, Mike Torrettinni said:

Thanks, interesting. All disadvantages aside, it would solve many Out of memory errors, when it's getting close to limit, but only goes over because of the need for contiguous block.

64 bit makes that problem go away.

Share this post


Link to post

I once thought on a structure consisting of multiple blocks (arrays) but uniting them with common index. But I never really needed it 😄

Implementation would be quite trivial though.

Share this post


Link to post
11 minutes ago, Fr0sT.Brutal said:

I once thought on a structure consisting of multiple blocks (arrays) but uniting them with common index. But I never really needed it 😄

Implementation would be quite trivial though.

 

{$POINTERMATH ON}
function TBlockAllocatedList<T>.GetItem(Index: Integer): P;
var
  BlockIndex, ItemIndex: Cardinal;
begin
  Assert(InRange(Index, 0, Count-1));
  DivMod(Index, FItemsPerBlock, BlockIndex, ItemIndex);
  Result := P(FBlocks[BlockIndex]) + ItemIndex;
end;
{$POINTERMATH OFF}

 

That's the key function in my implementation.

Share this post


Link to post
23 minutes ago, David Heffernan said:

 


{$POINTERMATH ON}
function TBlockAllocatedList<T>.GetItem(Index: Integer): P;
var
  BlockIndex, ItemIndex: Cardinal;
begin
  Assert(InRange(Index, 0, Count-1));
  DivMod(Index, FItemsPerBlock, BlockIndex, ItemIndex);
  Result := P(FBlocks[BlockIndex]) + ItemIndex;
end;
{$POINTERMATH OFF}

 

That's the key function in my implementation.

David, if you are up for some details... can you explain the context on which you decided to use this approach? What did it help solve and what advantage does it give you.. is it memory consumption or something else? Do you build such blocks of data once and just access it through the life of application? Do you use it for simple types, complex records or both and works good?

 

Share this post


Link to post

I already said. It's to avoid memory address space fragmentation when growing an array, even with the TList growth strategy. Address space fragmentation was a big issue when we had no 64 bit compiler. 

 

This is way beyond where you currently are. You need to fully understand how TList works first. 

Share this post


Link to post
1 hour ago, Mike Torrettinni said:

I was thinking for may years that 64bit will solve my memory issues... then I realized that many customers don't have 32GB, some are OK with just 4GB of memory. 

Please google "virtual memory".

  • Like 2

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×