RaelB Posted July 23, 2020

In Data Structures & Algorithms courses there is usually a discussion of Binary Search Trees, and it seems to be an important data structure. Yet in Delphi land I don't see much mention of BSTs. So I'm curious: do people use BSTs, and if so, in what cases? Or does TStringList mostly replace the need for one? If you do use one, which implementation do you use? Most of the libraries I found don't seem to have a BST, although I did find one in https://github.com/Zeus64/alcinoe
vfbb Posted July 24, 2020

The binary tree is essential in programming, especially when you want to insert, remove, search and maintain an ordered list at the same time, with excellent performance. I use it, but on a day-to-day basis we don't need it that often, because usually what we need is just a simple sort (for example List.Sort), or a list with fast lookup (for example TDictionary), or inserting, removing and sorting small lists (for example, lists with fewer than 200 items). Delphi's sort uses QuickSort, which has the same complexity as building a binary tree by repeated insertion, O(N log N), just as the dictionary has similar lookup performance. But as I said, when you want to add to, remove from, search and keep the same list sorted, you are better off with a binary tree.

Delphi does have support for this pattern in TList<T>; you don't need to create a class or write an implementation as alcinoe did. Just use, for example, TList<Integer>, together with List.BinarySearch, to add, remove or search.

Note: to add, you use BinarySearch because although it returns False when the item cannot be found, it also returns the Index where the item should be, so you insert the item at exactly that index. That way the list always stays sorted, and every such operation is very fast because BinarySearch has complexity O(log N), so effectively your sorted list behaves like a binary search tree. 😉
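For illustration, a minimal sketch of that insert pattern (the helper name is mine, not from the RTL):

    uses
      System.Generics.Collections;

    procedure SortedInsert(List: TList<Integer>; const Value: Integer);
    var
      Index: Integer;
    begin
      // BinarySearch returns False when Value is absent, but still sets Index
      // to the position where Value belongs in sort order.
      if not List.BinarySearch(Value, Index) then
        List.Insert(Index, Value);
      // If duplicates should be allowed, insert unconditionally at Index instead.
    end;

Lookup works the same way: if List.BinarySearch(Value, Index) returns True, the item is at List[Index].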
Stefan Glienke Posted July 24, 2020

A binary search tree that uses objects as nodes will be crap if you are after the pure performance advantages of a BST, simply because all the cache misses caused by the indirection will blow them out of the window. Talking about the big O of the algorithm alone does not tell the full story, because it ignores all of those factors.
Mahdi Safsafi Posted July 24, 2020

3 minutes ago, Stefan Glienke said: A binary search tree that uses objects as nodes will be crap ... because all the cache misses caused by the indirection will blow them out of the window.

Can you please explain more? I'm not sure I understand exactly what you mean. An example would help.
Guest Posted July 24, 2020

I can explain it. When you create objects, the chance of two of them being allocated within the same 64 bytes is very low, unlike an array of records, where it depends only on the size of the records. While searching you jump from one node to another, and each jump can trigger a fetch into the cache. A cache line is 64 bytes, and depending on its prefetch heuristics the CPU may fetch more than one line, so there is also a chance it wastes time and energy fetching two or more lines only to use one.

Even that is not the real problem; the real problem is the total cache size. If your L1 cache is 128 KB, only 2048 cache lines fit in it (even if the L3 cache is 8 MB). The access times and latencies given in books and references assume the memory access hits L1; things get unpredictable and ugly in L2 and L3 or beyond, where most accesses to memory not in L1 pay a toll of roughly 30 to 177 cycles, or even more.

Now, when your algorithm has already walked, say, 8k nodes, causing a cache miss at every step, it has completely thrashed the cache of that core, and this hurts the code even after the search has finished, because every subsequent memory access pays the price of waiting for its cache line to be fetched again.

This is exactly what is hindering FastMM5: the data describing a block is stored at index -1 of the allocated pointer (on 16-byte allocated memory), so walking the pointers produces a high rate of cache misses. It is faster than FastMM4, but it could be much faster by eliminating some of those cache misses or reducing them greatly. I suggested a workaround, moving that -1 index data to a known address relative to the pointer, with the block information serialized, so there is a good chance the already fetched data is reused instead of causing a cache miss for each and every block bigger than 16 bytes, which they all are. I don't know whether Pierre has tested the idea.
Stefan Glienke Posted July 24, 2020

What Kas said - it's similar to array vs. linked list performance on modern computers. Using objects, or any dynamically allocated memory without a special suballocator that ensures a clustered, cache-friendly layout, leaves the memory scattered all over the heap the longer the data structure lives, unless you perform some defragmentation. Also, every node that is an object wastes at least SizeOf(Pointer) * 2 bytes of precious memory (for the VMT pointer and the monitor field). If you then throw virtual methods into the mix, you very likely end up with a tree structure that performs pretty terribly compared to what its big O suggests.
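To make "suballocator" concrete, here is a hypothetical sketch (names made up): the nodes live in one growable array of records and refer to each other by index instead of by pointer, so they stay clustered in memory, carry no per-object overhead, and the indices stay valid even when the array reallocates.

    type
      TNode = record
        Key: Integer;
        Left, Right: Integer;  // indices into the pool; -1 means "no child"
      end;

      TNodePool = record
        Nodes: array of TNode;
        Count: Integer;
        function Alloc(const AKey: Integer): Integer;
      end;

    function TNodePool.Alloc(const AKey: Integer): Integer;
    begin
      // Grow geometrically so all nodes stay in one contiguous block.
      if Count = Length(Nodes) then
        SetLength(Nodes, 16 + Length(Nodes) * 2);
      Nodes[Count].Key := AKey;
      Nodes[Count].Left := -1;
      Nodes[Count].Right := -1;
      Result := Count;
      Inc(Count);
    end;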
Mahdi Safsafi Posted July 24, 2020

1 minute ago, Stefan Glienke said: What Kas said - it's similar to array vs. linked list performance on modern computers. ...

Thank you both, now I understand what you are referring to. I guess a flat BST would be much better then: http://bannalia.blogspot.com/2015/06/cache-friendly-binary-search.html?m=1

In addition to what you pointed out, there is another important factor, branch misprediction: http://databasearchitects.blogspot.com/2015/09/trying-to-speed-up-binary-search.html?m=1
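As a sketch of that "flat" idea from the first link (my own naming): store the sorted values in level order in a single array, root at index 1 and the children of node I at 2*I and 2*I+1, so the tree needs no pointers at all and the frequently hit top levels fit in a few cache lines.

    // Fill Tree (index 0 unused) by an in-order walk over the implicit tree,
    // consuming the sorted input in order; call with Pos = 0 and I = 1,
    // where Length(Tree) = Length(Sorted) + 1.
    procedure BuildEytzinger(const Sorted: array of Integer;
      var Tree: array of Integer; var Pos: Integer; I: Integer);
    begin
      if I <= High(Tree) then
      begin
        BuildEytzinger(Sorted, Tree, Pos, 2 * I);      // left subtree
        Tree[I] := Sorted[Pos];
        Inc(Pos);
        BuildEytzinger(Sorted, Tree, Pos, 2 * I + 1);  // right subtree
      end;
    end;

    function EytzingerContains(const Tree: array of Integer; Key: Integer): Boolean;
    var
      I: Integer;
    begin
      I := 1;  // root
      while I <= High(Tree) do
      begin
        if Tree[I] = Key then
          Exit(True);
        if Key < Tree[I] then
          I := 2 * I       // descend left
        else
          I := 2 * I + 1;  // descend right
      end;
      Result := False;
    end;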
vfbb Posted July 24, 2020

The best strategy has to be analyzed case by case, but I can tell you that 99% of the time implementing the binary tree as a simple ordered list will be much faster than creating objects for the nodes, because a node-based tree has to be rebalanced all the time, whereas simulating the binary tree with a simple ordered list requires no balancing at all. List.Insert or List.Delete may trigger a very large Move, but CPUs are heavily optimized for that, and it is still much faster than walking node by node rebalancing the tree to keep it at the ideal depth of a binary tree, log2 N, on top of the cost of creating the node objects.
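Concretely, a removal in the sorted-list "tree" is just a search plus a delete, with no rebalancing step (same hypothetical naming as the earlier sketch):

    procedure SortedRemove(List: TList<Integer>; const Value: Integer);
    var
      Index: Integer;
    begin
      if List.BinarySearch(Value, Index) then  // O(log N) search
        List.Delete(Index);                    // one contiguous Move, nothing to rebalance
    end;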
Stefan Glienke Posted July 24, 2020

@vfbb Depending on the tree, the balancing is pretty damn efficient - but the memory layout after a few iterations can be totally screwed up, which I mentioned earlier. As always, it depends - there is no "best for everything" data structure or algorithm; you typically want one that is "reasonably good for most cases" (bonus points if it is "not exceptionally terrible in some rare cases"). Typically the decision is whether the thing is read often or written often. If both, you have to weigh the cost of going down the rabbit hole of implementing some special-case data structure against simply rolling with a standard implementation.