Jump to content
Rollo62

Best type for data buffer: TBytes, RawByteString, String, AnsiString, ...

Recommended Posts

42 minutes ago, Dany Marmur said:

consider metal/virtual when making decisions concerning O(n)-stuff?

No but I consider what state of the art hardware likes and what it does not like which sometimes varies from what people learned 20 years ago.

Big O is about scaling of an algorithm - not about absolute speed. You can have an O(1) algo being stomped by some O(n) simply because you only ever have that much data that the constant factor in the O(1) still makes it slower than the O(n). The constant factor can be influenced by either additional computations (such as hashing) or simply because you have to do stuff that is more expensive hardware wise (such as memory allocation or memory indirections). That is why for example the fastest general purpose sorting algorithms out there are hybrid sorting algorithms combining the best sorting algorithms for particular parts of the to be sorted data (such as introsort or timsort) with timsort even being optimized on realistic data appearance such as already, almost or reverse sorted.

 

Having said so imo pondering over that stuff is only important for certain areas of software development such as core library developers or if you roll your own data types and algorithm implementations.

 

Here is another thing about the "stuff that modern hardware loves" argument.

Interesting part starts around 24:00 with all the theory and is followed by two mind blowing examples at around 41:00: 

 

 

Edited by Stefan Glienke
  • Like 4
  • Thanks 1

Share this post


Link to post
On 10/7/2020 at 11:20 AM, Stefan Glienke said:

Additional read on the subject

This article is about random insertion and random deletion, I guess they take into account the time required to locate the record in the list.  If random means generating a random index number, then it is the worst case possible for linked list and best case for array (direct access). If that is a main access pattern, then a linked list is NOT the correct data structure. Insertion/deletion in a linked list is only quick provided the insertion/deletion point is already available. Measuring the performance for a search separately from performance for insert/delete itself is what has to be done to truly compare linked-list and arrays or any other data structure such as a dictionary or hash table.

 

A linked list is excellent for a FIFO QUEUE or LIFO QUEUE (STACK). And that is the problem in the case of this message thread: communication which require a FIFO queue between receiving process and computing process. That is why I proposed a linked-list.

 

 

Share this post


Link to post
1 hour ago, FPiette said:

This article is about random insertion and random deletion, I guess they take into account the time required to locate the record in the list.  If random means generating a random index number, then it is the worst case possible for linked list and best case for array (direct access). If that is a main access pattern, then a linked list is NOT the correct data structure. Insertion/deletion in a linked list is only quick provided the insertion/deletion point is already available. Measuring the performance for a search separately from performance for insert/delete itself is what has to be done to truly compare linked-list and arrays or any other data structure such as a dictionary or hash table.

 

A linked list is excellent for a FIFO QUEUE or LIFO QUEUE (STACK). And that is the problem in the case of this message thread: communication which require a FIFO queue between receiving process and computing process. That is why I proposed a linked-list.

 

 

François, I was very glad to see you. I want to say a big thank you for your ICS components. In my server, I used them.

Edited by Marat1961
  • Like 1

Share this post


Link to post

I always analyze the types of data being processed and the operations that will be performed on them.

It is possible to combine the merits of both array and linked list. To do this, it is enough to use data structures using memory regions.


If you arrange the list items in a memory region, the data will certainly be located side by side and the percentage of cache hits will be maximum.

Using structures using memory regions can also make the job of freeing memory easier.

 

I would also advise to take a closer look at the storage format of the buffer protocol. Which is equally good for storing any data. In addition, it shakes data quite well.
at the expense of default values and does not store empty values.

All messages in this format start with a length value. You can store pointers to their beginning in an array.
You can pass this set of bytes to the reader of an object that reads data. The writer of an object can write its state to a buffer.

 

If this is processing data from client applications, I would do something like a processing context associated with a specific client and perhaps a message builder based on a deterministic finite automaton.

 

https://github.com/marat1961/protobuf-delphi

https://github.com/marat1961/Oz-SGL

Edited by Marat1961

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×