Mike Torrettinni 198 Posted March 5, 2020 (edited) I'm loading a lot of data from files, so I was testing the difference between TList and TArray memory consumption. And it seems TList uses a lot more memory. Â This is my simple test: Â procedure TForm8.Button1Click(Sender: TObject); type TDataRec = record ID: integer; Name: string; end; var vDataRec: TDataRec; vDataList: TList<TDataRec>; vArray: TArray<TDataRec>; i, vMax: integer; begin vMax := 10000000; vDataList := TList<TDataRec>.Create; try for i := 1 to vMax do begin vDataRec := Default(TDataRec); vDataRec.ID := i; vDataRec.Name := 'X'; vDataList.Add(vDataRec); end; finally vDataList.Free; end; SetLength(vArray, vMax); for i := 0 to vMax-1 do begin vArray[i].ID := i; vArray[i].Name := 'X'; end; vArray := nil; end; And here are results as I see them in Task Manager in Memory (active private working set) column - I believe this is the one that can be monitored when/if Out of memory error occurs. Â TList<TDataRec>: Starting memory:Â 3,716 K Filled list: 369,488 K Consumption: 365,772 K Â TArray<TDataRec>: Starting memory:Â 5,548 K Filled list: 316,588 K Consumption: 311,040 K Â TList<TDataRec> vs TArray<TDataRec> = 54,732 K = 17% Â 17% is not extreme, but it is significant difference. Â Â But! If the TDataRec only has 1 integer: Â TDataRec = record ID: integer; end; Now the difference is unacceptable: Â TList<TDataRec>: Starting memory:Â 3,720 K Filled list: 69,700 K Consumption: 65,980 K Â TArray<TDataRec>: Starting memory:Â 4,208 K Filled list: 43,272 K Consumption: 39,064 K Â TList<TDataRec> vs TArray<TDataRec> = 26,916 K = 68% Â 68% seems extreme difference! Â Is this normal, or am I missing something very obvious? Â Â Edited March 5, 2020 by Mike Torrettinni Share this post Link to post
Virgo 18 Posted March 5, 2020 What happens, if you add vDataList.Capacity := vMax; before filling vDataList. vDataList is using array internally, but it is constantly grown in your example. It might be the cause of those issues. Also, you really should run those test so, that one test is only one run on that execution of test program. Otherwise testing array might make use of memory that was already allocated during list testing (Delphi memory manager does not return memory to windows, when it is freed. 1 1 Share this post Link to post
Fr0sT.Brutal 900 Posted March 5, 2020 TList grows its internal array by some factor (1.5 IIRC). So if you use just Add, there are big chances that list will have free space at the end of test 1 Share this post Link to post
Virgo 18 Posted March 5, 2020 (edited) 9 minutes ago, Fr0sT.Brutal said: TList grows its internal array by some factor (1.5 IIRC). So if you use just Add, there are big chances that list will have free space at the end of test Probably depends on Delphi version. In XE it grows by 1 TList<T>.Add. Or not. I did not check the content of Grow.... It doubles in size in XE. But it can still cause memory fragmentation, if reallocation has to relocate memory block. Â Edited March 5, 2020 by Virgo Share this post Link to post
David Heffernan 2345 Posted March 5, 2020 (edited) Actually the defect is in your test code. Set the capacity when you create the list. Just as you do with the array.  Obviously you need to compare like with like. Edited March 5, 2020 by David Heffernan 3 1 Share this post Link to post
Pawel Piotrowski 18 Posted March 5, 2020 Like others said, TList has a capacity growing strategy.  The second, and more important thing is, that the memory manager doesn't release the memory back to windows. So what you might getting is this: when you add items to the list, and it tries to resize, it might happen, that there is not sufficient memory in the current place to just in place resize. so the memory manager copies the whole array to a new location, that can hold the size of the new array. but the old memory space is still kept reserved by the memory manager. So when you use the task manager, windows sees both. Andd BTW, the memory manager doesn't request the exact size either. Like TList, it has a growth strategy, too.   1 Share this post Link to post
David Heffernan 2345 Posted March 5, 2020 40 minutes ago, Pawel Piotrowski said: The second, and more important thing is, that the memory manager doesn't release the memory back to windows. Stop even trying to justify this. The two pieces of code are not comparable.  You can see the single SetLength call. That's missing for the list. There's nothing more to say. Share this post Link to post
Mike Torrettinni 198 Posted March 5, 2020 Thanks! By setting Capacity (vDataList.Capacity := vMax;) I can make it work as efficient as TArray. I was under impression TList has this magical power of handling with memory, I guess it still needs a little help here and there. Â Â Share this post Link to post
David Heffernan 2345 Posted March 5, 2020 24 minutes ago, Mike Torrettinni said: Thanks! By setting Capacity (vDataList.Capacity := vMax;) I can make it work as efficient as TArray. I was under impression TList has this magical power of handling with memory, I guess it still needs a little help here and there. Â Â How could it possibly know how many items were going to be added? Â In reality you often don't know this. I would not be surprised if your array code has lots of lines like SetLength(arr, Length(arr) + 1) which are hideously inefficient. Share this post Link to post
Pawel Piotrowski 18 Posted March 5, 2020 1 hour ago, David Heffernan said: Stop even trying to justify this. The two pieces of code are not comparable.  You can see the single SetLength call. That's missing for the list. There's nothing more to say. I agree. But others before me pointed that already out 🙂 I was trying to explain (not justify) the differences between the two samples given, the one with 17% more memory and the second with 68% more. You can explain the +17% with the missing SetLength, but not the +68% more memory. More knowledge doesn't hurt, and if someone uses the task manager to compare memory usage, I suppose it is nice to know about the memory manager itself, isn't it? 1 Share this post Link to post
Mike Torrettinni 198 Posted March 5, 2020 I thought that TList has the benefit over array because it doesn't need a contiguous memory block to keep data. Array needs it, while TList keeps a list of pointers to memory where it stores data, so it just finds next available free space and never needs a contiguous block. I guess it doesn't work that way. Share this post Link to post
David Heffernan 2345 Posted March 5, 2020 No it doesn't. It's just a wrapper for an array. Effectively it's the exact same data type but with a lot of helper methods to let you write higher level code. And an allocation strategy that overallocates to avoid performing heap allocation every time you add an item.  I'm not sure that there are many collection libraries that implement array like collections as anything other than a contiguous array. I know my own personal code has some classes that do this, to avoid address space fragmentation. Less of a problem now that we have 64 bit. 1 Share this post Link to post
Mike Torrettinni 198 Posted March 5, 2020 This is actually strange realization, right now... seems useful, if it would be like that. I use similar approach in showing data in Virtual Treeview, where records can be added at the end of the array, while displayed as sub node of 1st node - dynamic tree generations, I guess. Share this post Link to post
dummzeuch 1505 Posted March 5, 2020 There are other structures that don't actually need contiguous areas of memory, e.g. linked lists or some tree structures. But of course these have other weaknesses. Share this post Link to post
David Heffernan 2345 Posted March 5, 2020 9 minutes ago, dummzeuch said: There are other structures that don't actually need contiguous areas of memory, e.g. linked lists or some tree structures. But of course these have other weaknesses. I'm talking about an array like structure (i.e. O(1) random access, contiguous indices) whose backing allocation is not contiguous.  I'm not aware of many (if any) collections in widespread use that are like this. I introduced such classes in my codebase in the bad old 32 bit days when address space fragmentation was killing me. Share this post Link to post
Stefan Glienke 2002 Posted March 5, 2020 28 minutes ago, Mike Torrettinni said: I thought that TList has the benefit over array because it doesn't need a contiguous memory block to keep data. Array needs it, while TList keeps a list of pointers to memory where it stores data, so it just finds next available free space and never needs a contiguous block. I guess it doesn't work that way. What you describe could avoid memory reallocations and moving items if reallocation could not happen inplace but has a lot of other performance considerations - see https://softwareengineering.stackexchange.com/questions/267870/are-noncontiguous-arrays-performant 1 Share this post Link to post
Mike Torrettinni 198 Posted March 5, 2020 12 minutes ago, Stefan Glienke said: What you describe could avoid memory reallocations and moving items if reallocation could not happen inplace but has a lot of other performance considerations - see https://softwareengineering.stackexchange.com/questions/267870/are-noncontiguous-arrays-performant Thanks, interesting. All disadvantages aside, it would solve many Out of memory errors, when it's getting close to limit, but only goes over because of the need for contiguous block. Share this post Link to post
Mike Torrettinni 198 Posted March 5, 2020 46 minutes ago, David Heffernan said: Less of a problem now that we have 64Â bit. I agree, less of a problem, not without it completely. I was thinking for may years that 64bit will solve my memory issues... then I realized that many customers don't have 32GB, some are OK with just 4GB of memory. Share this post Link to post
David Heffernan 2345 Posted March 5, 2020 37 minutes ago, Mike Torrettinni said: Thanks, interesting. All disadvantages aside, it would solve many Out of memory errors, when it's getting close to limit, but only goes over because of the need for contiguous block. 64 bit makes that problem go away. Share this post Link to post
Fr0sT.Brutal 900 Posted March 5, 2020 I once thought on a structure consisting of multiple blocks (arrays) but uniting them with common index. But I never really needed it 😄 Implementation would be quite trivial though. Share this post Link to post
David Heffernan 2345 Posted March 5, 2020 11 minutes ago, Fr0sT.Brutal said: I once thought on a structure consisting of multiple blocks (arrays) but uniting them with common index. But I never really needed it 😄 Implementation would be quite trivial though.  {$POINTERMATH ON} function TBlockAllocatedList<T>.GetItem(Index: Integer): P; var  BlockIndex, ItemIndex: Cardinal; begin  Assert(InRange(Index, 0, Count-1));  DivMod(Index, FItemsPerBlock, BlockIndex, ItemIndex);  Result := P(FBlocks[BlockIndex]) + ItemIndex; end; {$POINTERMATH OFF}  That's the key function in my implementation. Share this post Link to post
Mike Torrettinni 198 Posted March 5, 2020 23 minutes ago, David Heffernan said:  {$POINTERMATH ON} function TBlockAllocatedList<T>.GetItem(Index: Integer): P; var  BlockIndex, ItemIndex: Cardinal; begin  Assert(InRange(Index, 0, Count-1));  DivMod(Index, FItemsPerBlock, BlockIndex, ItemIndex);  Result := P(FBlocks[BlockIndex]) + ItemIndex; end; {$POINTERMATH OFF}  That's the key function in my implementation. David, if you are up for some details... can you explain the context on which you decided to use this approach? What did it help solve and what advantage does it give you.. is it memory consumption or something else? Do you build such blocks of data once and just access it through the life of application? Do you use it for simple types, complex records or both and works good?  Share this post Link to post
David Heffernan 2345 Posted March 5, 2020 I already said. It's to avoid memory address space fragmentation when growing an array, even with the TList growth strategy. Address space fragmentation was a big issue when we had no 64 bit compiler.  This is way beyond where you currently are. You need to fully understand how TList works first. Share this post Link to post
Anders Melander 1784 Posted March 5, 2020 1 hour ago, Mike Torrettinni said: I was thinking for may years that 64bit will solve my memory issues... then I realized that many customers don't have 32GB, some are OK with just 4GB of memory.  Please google "virtual memory". 2 Share this post Link to post