Best type for data buffer: TBytes, RawByteString, String, AnsiString, ...

Stefan Glienke · October 8, 2020

42 minutes ago, Dany Marmur said:

consider metal/virtual when making decisions concerning O(n)-stuff?

No but I consider what state of the art hardware likes and what it does not like which sometimes varies from what people learned 20 years ago.

Big O is about scaling of an algorithm - not about absolute speed. You can have an O(1) algo being stomped by some O(n) simply because you only ever have that much data that the constant factor in the O(1) still makes it slower than the O(n). The constant factor can be influenced by either additional computations (such as hashing) or simply because you have to do stuff that is more expensive hardware wise (such as memory allocation or memory indirections). That is why for example the fastest general purpose sorting algorithms out there are hybrid sorting algorithms combining the best sorting algorithms for particular parts of the to be sorted data (such as introsort or timsort) with timsort even being optimized on realistic data appearance such as already, almost or reverse sorted.

Having said so imo pondering over that stuff is only important for certain areas of software development such as core library developers or if you roll your own data types and algorithm implementations.

Here is another thing about the "stuff that modern hardware loves" argument.

Interesting part starts around 24:00 with all the theory and is followed by two mind blowing examples at around 41:00:

Edited October 8, 2020 by Stefan Glienke

bdw_nz20 · October 9, 2020

Yeah Stefane that was interesting about the game loop, even better its a C++ example.

FPiette · October 10, 2020

On 10/7/2020 at 11:20 AM, Stefan Glienke said:

Additional read on the subject

This article is about random insertion and random deletion, I guess they take into account the time required to locate the record in the list. If random means generating a random index number, then it is the worst case possible for linked list and best case for array (direct access). If that is a main access pattern, then a linked list is NOT the correct data structure. Insertion/deletion in a linked list is only quick provided the insertion/deletion point is already available. Measuring the performance for a search separately from performance for insert/delete itself is what has to be done to truly compare linked-list and arrays or any other data structure such as a dictionary or hash table.

A linked list is excellent for a FIFO QUEUE or LIFO QUEUE (STACK). And that is the problem in the case of this message thread: communication which require a FIFO queue between receiving process and computing process. That is why I proposed a linked-list.

Marat1961 · October 10, 2020

1 hour ago, FPiette said:

This article is about random insertion and random deletion, I guess they take into account the time required to locate the record in the list. If random means generating a random index number, then it is the worst case possible for linked list and best case for array (direct access). If that is a main access pattern, then a linked list is NOT the correct data structure. Insertion/deletion in a linked list is only quick provided the insertion/deletion point is already available. Measuring the performance for a search separately from performance for insert/delete itself is what has to be done to truly compare linked-list and arrays or any other data structure such as a dictionary or hash table.

A linked list is excellent for a FIFO QUEUE or LIFO QUEUE (STACK). And that is the problem in the case of this message thread: communication which require a FIFO queue between receiving process and computing process. That is why I proposed a linked-list.

François, I was very glad to see you. I want to say a big thank you for your ICS components. In my server, I used them.

Edited October 10, 2020 by Marat1961

October 10, 2020

@Marat1961, hit @ then f and p and you can call him.

If you appreciate his "moves" you ought to click "like" or "thanks" (lower right).

Marat1961 · October 10, 2020

I always analyze the types of data being processed and the operations that will be performed on them.

It is possible to combine the merits of both array and linked list. To do this, it is enough to use data structures using memory regions.

If you arrange the list items in a memory region, the data will certainly be located side by side and the percentage of cache hits will be maximum.

Using structures using memory regions can also make the job of freeing memory easier.

I would also advise to take a closer look at the storage format of the buffer protocol. Which is equally good for storing any data. In addition, it shakes data quite well.
at the expense of default values and does not store empty values.

All messages in this format start with a length value. You can store pointers to their beginning in an array.
You can pass this set of bytes to the reader of an object that reads data. The writer of an object can write its state to a buffer.

If this is processing data from client applications, I would do something like a processing context associated with a specific client and perhaps a message builder based on a deterministic finite automaton.

https://github.com/marat1961/protobuf-delphi

https://github.com/marat1961/Oz-SGL

Edited October 11, 2020 by Marat1961

Sign In

Best type for data buffer: TBytes, RawByteString, String, AnsiString, ...

Recommended Posts

Stefan Glienke 1901

Share this post

Link to post

bdw_nz20 10

Share this post

Link to post

FPiette 363

Share this post

Link to post

Marat1961 17

Share this post

Link to post

Guest

Share this post

Link to post

Marat1961 17

Share this post

Link to post

Create an account or sign in to comment

Create an account

Sign in

Browse

Activity