Jump to content
Edwin Yip

Oz-SGL, a new generic collections library based on the idea of C++ STL

Recommended Posts

Oz-SGL is a new generic collections library written by github user marat1961, and I'm trying to add backward compatibilities for earlier versions of Delphi compiler.

 

I'm just not sure if it also has the drawback of the EXE bloatness described by @Eric Grange here: https://delphisorcery.blogspot.com/2014/03/why-delphi-generics-are-annoying.html

 

Anyone else with some low-level code experience want to review it?

Share this post


Link to post

The decision to implement the collections as records look problematic to me. That means these types will have value type semantics. 

Share this post


Link to post

@Edwin Yip The article you linked to is my blog, not Erics 😉

Yes, it also suffers from that however the types in that library are way smaller and have only limited functionality of just storing stuff, no rich IEnumerable API such as Spring4D. That makes the binary overhead very very small or even non existing.

What I observed though is that they are no general purpose collection classes but obviously tailored for some specific needs - you cannot use TsgList<T> for any type as it is very limited as what it handles (only non managed types).

 

I also did not do a in depth performance comparison but just adding some Integers to a list was 3-4 times slower than in my latest Spring4D build (which is a faster than the RTL).

 

If you watch the video by Herb Sutter I linked in the other thread I even wonder why someone coming from C++ would need to create some list type as he could just use TArray<T> because he should have used Vector<T> in C++.

  • Like 1

Share this post


Link to post

@Stefan Glienke, Oops! Sorry,  a little mistake I made at that moment, need not to say, I clearly know that you now maintain Spring4D and Eric mains DWS, both write good low-level code :)

 

Thanks for info about the performance comparison, it's definitely helpful!

Share this post


Link to post
3 hours ago, David Heffernan said:

That means these types will have value type semantics. 

Thanks for your input David. I'm not able to interpret that :P

Share this post


Link to post

While the library's concept is very interesting and worth investigating it, however, I found it incompatible with the Delphi language(at least in its current specs). In other word : what applies to c++ does not necessary apply to Delphi (the reverse holds as well). For example, the library has a sealed architecture ! Being based on records, it means that you can't roll easily your own collection based on the existing one (at least in the way we know all) just because the current status of the language does not permit record inheritance. In the other hand, RTL/Spring4D are more flexible in this area. 

Another thing, a quick look at the Heap manager, it appears that its based on the ordinal memory manager, in other word: a memory manager on the top of another memory manager. This technically may cause all sort of memory troubles.

 

  • Like 2

Share this post


Link to post
On 10/9/2020 at 6:38 PM, Stefan Glienke said:

@Edwin Yip

Yes, it also suffers from that however the types in that library are way smaller and have only limited functionality of just storing stuff, no rich IEnumerable API such as Spring4D. That makes the binary overhead very very small or even non existing.

What I observed though is that they are no general purpose collection classes but obviously tailored for some specific needs - you cannot use TsgList<T> for any type as it is very limited as what it handles (only non managed types).

 

Why did you decide that the type of something is limited?
You want to use a record, an object or a string ...

 

This is not quite true, managed types are supported.

First, when type assignment occurs, all counters for managed types will be updated (Delphi string, interfaces, etc.).

Second, when deleting data, the cleanup method is passed, which will be used when deleting.

If it is a record with managed types, it can be a procedure inside which the Default (TMyRecord) operator contains.

 

Edited by Marat1961
  • Like 1

Share this post


Link to post
21 hours ago, Mahdi Safsafi said:

 For example, the library has a sealed architecture ! Being based on records, it means that you can't roll easily your own collection based on the existing one (at least in the way we know all) just because the current status of the language does not permit record inheritance. In the other hand, RTL/Spring4D are more flexible in this area. 

Another thing, a quick look at the Heap manager, it appears that its based on the ordinal memory manager, in other word: a memory manager on the top of another memory manager. This technically may cause all sort of memory troubles.

 

I rarely inherit from collections, for some reason I rarely have a desire to add something.
If you want to extend functionality without adding fields, use a helper.

 

I am a fan of domain-driven design.
Using inheritance for collections does not seem to be a very good idea to me.
I am very suspicious of thoughtless inheritance.
Most often, inheritance does not solve a problem, usually it creates them.

If it is necessary to reuse code, use the aggregation relation.

 

You can replace inheritance with aggregation and declare only the methods you need in the public section.

1. Place the structure you want to aggregate as a private field
2. Next we open only the necessary part of the interface by redefining it to public 
sections of required methods and properties. 
If you set the inline option, we will avoid additional costs. 
The Delphi compiler will not generate code for overridden methods. 
At the place where the method is called, there will be a direct call to the aggregated structure method.

 

Selecting abstraction levels is not a rarity.
FastMM replaces the memory manager.
Delphi manager works on top of the operating system manager.

 

The memory organization system is very well written here.

https://www.gunsmoker.ru/2009/01/blog-post.html

https://www.gunsmoker.ru/2011/04/windows.html

https://en.wikipedia.org/wiki/Region-based_memory_management

 

I use FastMM with fully enabled diagnostics, no problems with memory leaks.

The code of this library is used in real projects.
Practice is a criterion of truth.

 

Edited by Marat1961

Share this post


Link to post
On 10/9/2020 at 6:38 PM, Stefan Glienke said:

@Edwin Yip

I also did not do a in depth performance comparison but just adding some Integers to a list was 3-4 times slower than in my latest Spring4D build (which is a faster than the RTL).

I was wondering if you've actually done comparative testing of adding to the list?
Which list did you use?
Can I see your code?

I downloaded your project. Installed.
Trying to run the tests resulted in a huge number of exceptions being thrown.
What's wrong with the project?
I am using Delphi Community Edition 10.3.
On 10/9/2020 at 6:38 PM, Stefan Glienke said:

@Edwin Yip The article you linked to is my blog, not Erics 😉

Yes, it also suffers from that however the types in that library are way smaller and have only limited functionality of just storing stuff, no rich IEnumerable API such as Spring4D. That makes the binary overhead very very small or even non existing.

What I observed though is that they are no general purpose collection classes but obviously tailored for some specific needs - you cannot use TsgList<T> for any type as it is very limited as what it handles (only non managed types).

 

I also did not do a in depth performance comparison but just adding some Integers to a list was 3-4 times slower than in my latest Spring4D build (which is a faster than the RTL).

 

If you watch the video by Herb Sutter I linked in the other thread I even wonder why someone coming from C++ would need to create some list type as he could just use TArray<T> because he should have used Vector<T> in C++.

 

Share this post


Link to post
3 hours ago, Marat1961 said:

I was wondering if you've actually done comparative testing of adding to the list?
Which list did you use?

 

As I wrote that was not an in depth benchmark but just to get a rough idea and its apples and oranges anway, ymmv.

unit Unit1;

interface

procedure RunSGL;
procedure RunSpring;
procedure RunRTL;
procedure RunArray;

implementation

uses
  Diagnostics,
  Generics.Collections,
  Spring.Collections,
  Oz.SGL.Collections;

const
  COUNT = 10000000;

procedure RunSGL;
var
  list: TsgList<Integer>;
  sw: TStopwatch;
  i, n: Integer;
begin
  list.From(nil);
  list.Count := COUNT;
  sw := TStopwatch.StartNew;
  for i := 0 to COUNT-1 do
    list[i] := i;
  Writeln('SGL SetItem ', sw.ElapsedMilliseconds);

  sw := TStopwatch.StartNew;
  for i := 0 to COUNT-1 do
  begin
    n := list[i];
    if n = -1 then Break;
  end;
  Writeln('SGL GetItem ', sw.ElapsedMilliseconds);
end;

procedure RunSpring;
var
  list: IList<Integer>;
  sw: TStopwatch;
  i, n: Integer;
begin
  list := TCollections.CreateList<Integer>;
  list.Count := COUNT;
  sw := TStopwatch.StartNew;
  for i := 0 to COUNT-1 do
    list[i] := i;
  Writeln('Spring SetItem ', sw.ElapsedMilliseconds);

  sw := TStopwatch.StartNew;
  for i := 0 to COUNT-1 do
  begin
    n := list[i];
    if n = -1 then Break;
  end;
  Writeln('Spring GetItem ', sw.ElapsedMilliseconds);
end;

procedure RunRTL;
var
  list: TList<Integer>;
  sw: TStopwatch;
  i, n: Integer;
begin
  list := TList<Integer>.Create;
  list.Count := COUNT;
  sw := TStopwatch.StartNew;
  for i := 0 to COUNT-1 do
    list[i] := i;
  Writeln('RTL SetItem ', sw.ElapsedMilliseconds);

  sw := TStopwatch.StartNew;
  for i := 0 to COUNT-1 do
  begin
    n := list[i];
    if n = -1 then Break;
  end;
  Writeln('RTL GetItem ', sw.ElapsedMilliseconds);
end;

procedure RunArray;
var
  list: TArray<Integer>;
  sw: TStopwatch;
  i, n: Integer;
begin
  SetLength(list, COUNT);
  sw := TStopwatch.StartNew;
  for i := 0 to COUNT-1 do
    list[i] := i;
  Writeln('array ', sw.ElapsedMilliseconds);

  sw := TStopwatch.StartNew;
  for i := 0 to COUNT-1 do
  begin
    n := list[i];
    if n = -1 then Break;
  end;
  Writeln('array ', sw.ElapsedMilliseconds);
end;

end.

 

3 hours ago, Marat1961 said:

Trying to run the tests resulted in a huge number of exceptions being thrown.

You know that unit testing also includes testing if exceptions are thrown properly, yes?

All of them are expected exceptions and you saw that all tests are green, yes?

Share this post


Link to post

 

10 hours ago, Stefan Glienke said:

As I wrote that was not an in depth benchmark but just to get a rough idea and its apples and oranges anway, ymmv.

Your library code is functionally complete.
Written well and reliably, you can see a strong love for interfaces and OOP.
I also had a love stage with COM objects and interfaces.

 

I am now developing the protobuf-delphi project and was looking for a suitable collection implementation.
It will be possible to try to make code generation for your library.

 

The integer test will not be able to show the advantage of using memory regions.

 

I have an implementation of two buffers https://sourceforge.net/projects/protobuf-delphi/, which I used on the application server for folding 
there the results sent to the client. 
A segmented buffer is a connected chain of memory regions and when there is not enough memory, it asks for another piece of memory.

The second buffer uses a continuous piece of memory and has a different allocation strategy.
When there is insufficient memory, it asks for a larger piece of memory (usually ReallocMem),

copies the current data (Assign) and then deletes the old block (FreeMem).

 

Now we have to consider a typical memory manager implementation. 

And at my age, I have seen and studied the device dozens of their implementations
(starting with the implementation of Donald Knuth on his UM processor...).


Working with a heap in Delphi uses this structure

  TMemoryManager = record
    GetMem: function(Size: Integer): Pointer;
    FreeMem: function(P: Pointer): Integer;
    ReallocMem: function(P: Pointer; Size: Integer): Pointer;
  end;


  
At first, the memory in the heap is not defragmented and we have one solid piece of memory.
The mechanism of allocation at this stage is sequential.
A correct implementation of ReallocMem checks the input parameter of a variable and

if it is on the free memory pile boundary, it will simply reduce the amount of free memory and move the free memory pile boundary.

In this case there is no need to copy data and delete the old block.

 

You have TArray<Integer> used "overboard" your class. In other words, it's a dynamic array. 
In my case, it uses a continuous memory region.

If you are in a loop, you will increment a single collection 
we have one solid piece of memory and we test the dynamic array operation
in the best conditions imaginable.

So your test is in an ideal world.


In order to see the difference you need to get closer to the combat conditions of use.

 

Integer is not an object, not a string that is placed in a heap.

If it is possible to use records instead of an object I use a record.
I don't like to use controllable types (string, interface, reference to proc). for the record fields in a heavily loaded server.
Because Delphi's finalization operation is not very cheap, it uses rtti and will scan the record fields in order to take account of the referenced fields with a managed type.

 

Why don't we improve the test and bring it closer to the real world?

It is possible to simulate the real working conditions of the program.
First, we bring the heap to a defragmented state.

Defragmentation occurs after multiple memory allocation and memory return.

 

Then we interrupt the work of several collections including a linked list, a hash table and a tree. Memory freeing should be used as well.

 

BSP Test


Take for example the BSP algorithm and implement it 
1. on your collections using objects.
2. on SGL collections using records placed in memory regions.

 

Multithreaded application server

 

Simulate typical server operation state.
There are different types of requests from different clients.
We accept them, service and return the answer.
It can be json, xml or binary data format such as Google protocol buffer.
 

Edited by Marat1961

Share this post


Link to post
On 10/11/2020 at 2:21 AM, Stefan Glienke said:
You know that unit testing also includes testing if exceptions are thrown properly, yes?

All of them are expected exceptions and you saw that all tests are green, yes?

Usually when I implement a library for it, I declare a separate type of exception.

When checking when throwing an exception, I try to use only this type of exception.

 

While running the tests, I added a bunch of "ignore on execution" exceptions.
The most incredible exceptions were thrown.

Especially, I was very confused by the "division by zero" exception.

 

I decided that it was not worth continuing further.

I really decided that the installation of the library was done incorrectly.

Edited by Marat1961

Share this post


Link to post
1 hour ago, Marat1961 said:

Usually when I implement a library for it, I declare a separate type of exception.

Ask yourself why the RTL raises more than a single class of exception. 

Share this post


Link to post
2 hours ago, David Heffernan said:

Ask yourself why the RTL raises more than a single class of exception. 

The Delphi Runtime Library is a collection of levels of abstraction.
RTL itself is implemented on top of the operating system API.

The operating system (already sometimes) runs using hardware and a processor.

 

Take an I / O library for example.
It is pertinent to see an exception like "Oh, I (i / o) cannot open the file".
Or "Oh, why am I (i / o) trying to read something beyond the end of the file."

 

When I see "division by zero" or "Access violation" it is generally a disaster - a tragedy.

This means that the problem did not occur at the library level, but at the hardware and processor levels.

For me, this is a signal that the memory is destroyed, or the code does not understand what.

 

I am an advocate of defensive programming. The library function must validate the input parameters.

Edited by Marat1961

Share this post


Link to post

The RTL and spring4d are actually pretty comparable. Your claim that spring4d code should only raise a single exception class indicates that you don't appreciate its scope.

 

Im an advocate for understanding something before judging it. 

Share this post


Link to post
1 hour ago, Marat1961 said:

When I see "division by zero" or "Access violation" it is generally a disaster - a tragedy.

This means that the problem did not occur at the library level, but at the hardware and processor levels.

For me, this is a signal that the memory is destroyed, or the code does not understand what.

There're a lot of place where an AV must be used/expected by tests to ensure a logic consistence. An example would be to test if a ROM structure holds at a given time ! Simply you can't use a comparison with an initial value because a write with the same initial value will make your test pass successfully. Another example is from madExcept instantly crash on buffer, if you want to validate that functionality, your test must expect a hardware exception when running overrun/underrun!

The point is : its all fine if those exceptions are expected to happen whether being hardware or software ! The not fine is when you expect your test to throw an exception and it does not ! 

 

Share this post


Link to post
3 hours ago, Mahdi Safsafi said:

There're a lot of place where an AV must be used/expected by tests to ensure a logic consistence.

 

I actually talked about "division by zero in the context of working with a library offering work with collections and IOC.

Why not check the parameter for zero and display a message in the place where it might appear?

 

Perhaps the error was in the test: "What will happen with our engine if we are dealing with a buggy UDF?"

 

Access Violation was not met!
I brought them alongside because they are hardware.

Share this post


Link to post
22 minutes ago, Marat1961 said:

I actually talked about "division by zero in the context of working with a library offering work with collections and IOC.

Why not check the parameter for zero and display a message in the place where it might appear?

Can you show the code? 

Share this post


Link to post

The conversation is about the Spring4D project.
For simplicity of heart, I ran the test in the IDE.
program Spring.Tests;

Share this post


Link to post

I'm using Spring4D all the time. Run the Test without debugger......
for me It is one of the best, correction, the best library for Delphi. 

  • Like 1
  • Thanks 1

Share this post


Link to post
1 hour ago, Marat1961 said:

The conversation is about the Spring4D project.
For simplicity of heart, I ran the test in the IDE.
program Spring.Tests;

Show the code that raises exceptions that you think are a sign of poor design. Details matter. 

Share this post


Link to post
10 hours ago, Fritzew said:

I'm using Spring4D all the time. Run the Test without debugger......
for me It is one of the best, correction, the best library for Delphi. 

I have no complaints about the library. I already wrote about this.
I looked at the library code. The spirit of the library is close to java.

I was offended by several phrases that I considered incorrect.

 

1. What I observed though is that they are no general purpose collection classes but obviously tailored for some specific needs - you cannot use TsgList <T> for any type as it is very limited as what it handles (only non managed types).

 

I'm sorry, I was wrong in this case.

 

SgList doesn't really support managed types.


The implementation of this collection was one of the very first.
I actually used it for data that does not contain managed types.

 

It will be necessary to fix this shortfall and do it in about the same way as in other collections, in which this problem should not be.

You will need to look into the code of each collection again and provide test support for the managed types.

 

2. I also did not do a in depth performance comparison but just adding some Integers to a list was 3-4 times slower than in my latest Spring4D build (which is a faster than the RTL).

 

I looked at the test.
First, we specify the size of the list, and then we assign a scalar value to the allocated memory area.

In this case, a collection is used, which is essentially a clone of the standard collection, with the exception of the possibility to refer to a data element by a pointer and using a memory region.

I suggested changing the test a bit, bringing it closer to real conditions.

 

Added a level and we have indirection of access to the data item. In my opinion, an indirect base-index access method is obtained.
Instead of a base-index accessor.
In addition, I now have to check the correctness of the index.
 

procedure TMemSegment.CheckPointer (Ptr: Pointer; Size: Cardinal);
var
  lo, hi: NativeUInt;
begin
  lo: = NativeUInt (@Self) + sizeof (TMemSegment);
  hi: = NativeUInt (@Self) + HeapSize - Size;
  Check (InRange (NativeUInt (Ptr), lo, hi));
end;

I have been developing software for embedded microprocessors for six years and always took my parachute with me during a real flight.
This is me about checks when accessing an array.
Now, though, I have this option enabled at least during the development of the project.

 

This and some other checks can be left only for debugging, but I love to fly with a parachute and identify the problem at the point of origin.

Edited by Marat1961

Share this post


Link to post
4 hours ago, Marat1961 said:

I was offended by several phrases that I considered incorrect.

 

1. What I observed though is that they are no general purpose collection classes but obviously tailored for some specific needs - you cannot use TsgList <T> for any type as it is very limited as what it handles (only non managed types).

 

2. I also did not do a in depth performance comparison but just adding some Integers to a list was 3-4 times slower than in my latest Spring4D build (which is a faster than the RTL).

So instead of adressing those statements you bring up some completely irrelevant and wrong things about my library? Well... I am all for criticism if you find issues in its design or implementation but that was just a low blow. 😉

 

Maybe I am missing something when setting up the list or using the wrong one but there is clearly the lack of handling managed types in TsgList<T> because it simply calls TslListHelper.SetItem which uses ordinal assignments or System.Move.

 

Here is some code that shows that something is wrong - shouldn't it print 0 to 9? But it does print an empty line and then raises an EInvalidPointer.

 

const
  COUNT = 10;

procedure RunSGL;
var
  list: TsgList<string>;
  i: Integer;
  s: string;
begin
  list.From(nil);
  for i := 0 to COUNT-1 do
  begin
    s := i.ToString;
    list.Add(s);
  end;
  
  s := '';

  for i := 0 to COUNT-1 do
  begin
    s := list[i];
    Writeln(s);
  end;
end;

P.S. You can edit your posts - no need for multiposting to address multiple previous comments.

Edited by Stefan Glienke
  • Like 4

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×