Jump to content
Mike Torrettinni

Fast lookup tables - TArray.BinarySearch vs Dictionary vs binary search

Recommended Posts

I'm refactoring usage of various different look-up tables and I'm trying to come with 1 fast look-up table design. Most of them have 1 thing in common: search by integer (ID) and search by string (Name).

All data is sorted.

 

So, I bench-marked TArray.BinarySearch, TDictionary and custom binary search.

Even though a flaw in design was pointed out for TArray.BinarySearch, (see https://en.delphipraxis.net/topic/4575-why-is-tarraybinarysearch-slower-than-normal-binary-search-function/), I still wanted to compare results:

 

For search by integer, custom binary search seems to be a little faster than TDictionary, until large tables:

image.png.67388e6df369cfce9ae31490179558c9.png

 

For search by string, custom binary search wins even more, still for not large tables:

image.png.50244b99e8133d6704d0554f0f020b16.png

 

Interesting is that for smaller searches, 10 records, a sequential search is faster than TDictionary.

 

I also tried TGpStringHash (by Primoz Gabrielcic), which is very fast, but unfortunately no implementation for integers.

TSynDictionary (from mORMot) is also very fast, but I don't use mORMotand license is not friendly for my commercial project.

 

Is there any other fast search implementations that is not too specialized and ready to use out of the box, that I can try?

 

 

 

 

  • Like 3

Share this post


Link to post

No surprise there - linear search is O(n), binary search is O(log n) and hash table is O(1), what you see with the differences for certain number of records are the different constant factors - building a hash has a certain cost which is why TDict is slower up to a certain point - this depends on the hash algorithm being used - indirections being caused by extra function calls like IEqualityComparer.GetHashCode and the memory layout of the hash table. You can even see that it's not strictly O(1) as it gets a little slower the more records you have which I would guess is due to memory layout within the hashtable where at a certain point memory access time kicks in because the data does not fit in the fast cache anymore (access pattern of the benchmarking code matters there as well).

 

Without specifying the exact use cases it's hard to give any suggestions - is the data one time fill and then only lookup, is data being changed after - adding/removing records, is data within the records changing. Whats the typical lookup pattern - is data being looked up randomly or in certain patterns (such as ID or Name sequences in certain order).

 

If you aim for maximum performance a custom data structure and algorithm will always win over some standard algorithm by a few percent - the question is if you want to do all that work including risking bugs for that negligible few percent or use some standard data structures and algorithms that are proven to work and easy to be applicable.

  • Like 3
  • Thanks 1

Share this post


Link to post
2 hours ago, Stefan Glienke said:

Without specifying the exact use cases it's hard to give any suggestions - is the data one time fill and then only lookup, is data being changed after - adding/removing records, is data within the records changing. Whats the typical lookup pattern - is data being looked up randomly or in certain patterns (such as ID or Name sequences in certain order).

My data is simple in 99% cases: ID = integer, full range 1.. MaxInt (but I guess that doesn't matter), string= 12-30 characters.

In majority cases it's populated at start, sorted once and then just used for accessing by ID, Name, single random lookup - I mean no lookups in sequence. I guess.

 

2 hours ago, Stefan Glienke said:

If you aim for maximum performance a custom data structure and algorithm will always win over some standard algorithm by a few percent - the question is if you want to do all that work including risking bugs for that negligible few percent or use some standard data structures and algorithms that are proven to work and easy to be applicable.

Currently looking for general fast implementation. The biggest data tables, 1K-10K records are not accessed often, but I still want to implement faster access - because sequential access by string gets crazy slower in 64bit - 5x!

 

That's why I'm looking for ready made solution. I tried to implement search by integer, a copy of TGpStrinhHash, and it got so complicated I made a mess. Like you said, I don't want to use something that can have bugs, for a few percentage. Not in this case.

 

 

 

Share this post


Link to post
5 minutes ago, Mike Torrettinni said:

My data is simple in 99% cases: ID = integer, full range 1.. MaxInt (but I guess that doesn't matter)

It does matter - if your ID values for example are in a certain range maybe even starting at 1 you could get O(1) access if you used them simply as index in the array itself.

 

7 minutes ago, Mike Torrettinni said:

because sequential access by string gets crazy slower in 64bit - 5x!

I don't know what that means - but talking about at most 10k records with strings up to 30 characters looking up one of those is certainly within the ms if not ns range.

 

 

  • Like 1

Share this post


Link to post
32 minutes ago, Stefan Glienke said:

It does matter - if your ID values for example are in a certain range maybe even starting at 1 you could get O(1) access if you used them simply as index in the array itself.

Hm, you are right, some are 1-n in range, some are random. Perhaps I can have a flag that will determine if access is direct or by search mechanism (TDictionary or bin array).

 

38 minutes ago, Stefan Glienke said:

I don't know what that means - but talking about at most 10k records with strings up to 30 characters looking up one of those is certainly within the ms if not ns range.

Well, 2.5x slower (5x was exaggeration) , when run as 64bit, while the rest is still acceptable:

 

image.png.6da13fee5aeaa0842638093dcbdb7aea.png

Share this post


Link to post

You still have not said what your use case is.  My own need for fast string lookup (and finding either an integer or a pointer), is principally to load once, lookup many times, and I am not too concerned about memory usage or load time.  The task usually is to handle up to a couple of hundred strings of about 25 to 200 characters. For this I have long used a ternary search tree, which I first learnt about from Dr Dobbs Journal, April 1998.  I adapted the C version for Delphi 1, I think, and by now the lookup supports Unicode strings  I cannot post useful timings, because you cannot compare my environment to yours, but I just did a test load of quarter of a million random 64-character strings into the tree in 390ms. The speed of lookup is what matters to me and  this is too fast to bother to benchmark it.  But those 250k strings (far more than I need in practice) took a tree size of 116MB.  Whether this search method would be useful to you depends on how you value load speed and memory against lookup time. Lookup time may not the only thing you should be considering.

  • Like 2

Share this post


Link to post
Guest
1 hour ago, Mike Torrettinni said:

Well, 2.5x slower (5x was exaggeration) , when run as 64bit, while the rest is still acceptable:

 

image.png.6da13fee5aeaa0842638093dcbdb7aea.png

There is something very ugly and stupid is been done by Delphi compiler all the time, which is affecting almost every indexed item access causing slowness on x64, where x64 in general should be always faster than x32 but the literal usage of Integer as 32bit even in safe places cause unneeded overhead.

This is not easy to show or explain without code, and my suggestion might not yield the result i suspect for you in your case and code,  but try to replace Integer with NativeInt for the "I" or "J" (or whatever) that control the index for the items in list or arrays or even for chars in string, and please share the result.

Share this post


Link to post

Have you tried in-memory datasets?  There's one included with Firedac.  If you are manageing multiple lists of 10,000 records and indexed dataset might be worth adding to your speed test.

  • Like 1
  • Thanks 1

Share this post


Link to post
2 hours ago, timfrost said:

You still have not said what your use case is.  My own need for fast string lookup (and finding either an integer or a pointer), is principally to load once, lookup many times, and I am not too concerned about memory usage or load time.  The task usually is to handle up to a couple of hundred strings of about 25 to 200 characters. For this I have long used a ternary search tree, which I first learnt about from Dr Dobbs Journal, April 1998.  I adapted the C version for Delphi 1, I think, and by now the lookup supports Unicode strings  I cannot post useful timings, because you cannot compare my environment to yours, but I just did a test load of quarter of a million random 64-character strings into the tree in 390ms. The speed of lookup is what matters to me and  this is too fast to bother to benchmark it.  But those 250k strings (far more than I need in practice) took a tree size of 116MB.  Whether this search method would be useful to you depends on how you value load speed and memory against lookup time. Lookup time may not the only thing you should be considering.

Is this same, similar as Trie (https://en.wikipedia.org/wiki/Trie)? Something might come useful down the line, but right now probably overkill.

 

 

Share this post


Link to post
34 minutes ago, Darian Miller said:

Have you tried in-memory datasets?  There's one included with Firedac.  If you are manageing multiple lists of 10,000 records and indexed dataset might be worth adding to your speed test.

Thanks, but since max records is 10K and rare, I think I can skip this for now, but perhaps in the future.

Share this post


Link to post
1 hour ago, Kas Ob. said:

There is something very ugly and stupid is been done by Delphi compiler all the time, which is affecting almost every indexed item access causing slowness on x64, where x64 in general should be always faster than x32 but the literal usage of Integer as 32bit even in safe places cause unneeded overhead.

This is not easy to show or explain without code, and my suggestion might not yield the result i suspect for you in your case and code,  but try to replace Integer with NativeInt for the "I" or "J" (or whatever) that control the index for the items in list or arrays or even for chars in string, and please share the result.

Thanks for the tip! In next few weeks I will start benchmarking 64bit in my project and see how it goes. But just this simple test showed I have these little slow pieces all over the project, so not neccessarily a stand-out bottleneck, but definitely a refactoring and faster implementation is needed in a lot of places. Will see how 64bit behaves.

Share this post


Link to post
8 hours ago, Mike Torrettinni said:

TSynDictionary (from mORMot) is also very fast, but I don't use mORMotand license is not friendly for my commercial project.

As David wrote, mORMot has a 3 license - if you use MPL it is very commercial-project-friendly.

TL&LR: Nothing to pay, just mention somewhere in your software that you used it, and publish any modification you may do to the source code.

 

Another article worth looking at:
https://www.delphitools.info/2015/03/17/long-strings-hash-vs-sorted-vs-unsorted/

It depends what you expect.
Also note that for long strings, hashing may have a cost - this is why we implemented https://blog.synopse.info/?post/2021/02/12/New-AesNiHash-for-mORMot-2

Edited by Arnaud Bouchez
  • Like 3
  • Thanks 1

Share this post


Link to post
1 hour ago, Mike Torrettinni said:

Is this same, similar as Trie (https://en.wikipedia.org/wiki/Trie)? Something might come useful down the line, but right now probably overkill.

 

 

It's an improvement on a Trie.  The Dr Dobbs article (still on line at https://www.drdobbs.com/database/ternary-search-trees/184410528) claims in their Conclusion, "Ternary search trees are efficient and easy to implement. They offer substantial advantages over both binary search trees and digital search tries. We feel that they are superior to hashing in many applications.".  See also: 

 

  • Like 1
  • Thanks 1

Share this post


Link to post

I re-ran with TSynDictionary to verify again, and it's much faster versus TDictionary for string searches, not so much for integers. But seems to be pretty much same performance in 32bit and 64 bit!

 

32bit:

image.png.11bea3f77812e857bca91205773f7004.png

 

64bit:

image.png.f14d4425bbdf15f394d8a45cca9ea095.png

Edited by Mike Torrettinni
  • Like 1

Share this post


Link to post

I finally realized why binary search by ID (integer) was slower than by Name (string): because function by ID was returning a Name (string). If I return data index or pointer, the speed is same. 

 

Everyday learning something new.

Share this post


Link to post
On 3/7/2021 at 7:35 AM, Mike Torrettinni said:

I re-ran with TSynDictionary to verify again, and it's much faster versus TDictionary for string searches, not so much for integers. But seems to be pretty much same performance in 32bit and 64 bit!

 

32bit:

image.png.11bea3f77812e857bca91205773f7004.png

 

64bit:

image.png.f14d4425bbdf15f394d8a45cca9ea095.png

mORMot never ceases to surprise me ;) Mike, can you share your benchmarking code on github?

Share this post


Link to post
9 minutes ago, Edwin Yip said:

Mike, can you share your benchmarking code on github?

Or simply upload zipped project here...

Share this post


Link to post

As I already said - the high constant factor with the RTL hash function is the problem - both TDict and Spring with faster equalitycomparer from Tiny.Library - results from an i5-3570 @ 3.4GHz

Also testing in the global main is garbage because you are then using global variables that cause way less optimal code - fixing that (simply put all that benchmark code into a procedure) and you get the following results

 

Win32:

Get Name from ID (Search by integer):
records Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict     Spring
   10       26         375       25         24         32         27
   50       56         447       25         29         32         26
  100       83         473       26         32         35         25
 1000      535         585       28         83         37         26
10000     5561         753       46        130         44         30

Get ID from Name (Search by string):
records Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict     Spring
   10       21         382       28         12         30         27
   50      109         466       24         18         28         25
  100      206         491       24         23         31         26
 1000     2044         603       30         88         39         30
10000    20532         810       36        181         49         36

Win64:

Get Name from ID (Search by integer):
records Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict     Spring
   10       24         297       26         23         32         26
   50       60         323       26         30         31         26
  100       94         328       26         40         32         26
 1000      752         381       28         85         47         28
10000     8619         427       33        128         47         30

Get ID from Name (Search by string):
records Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict     Spring
   10       61         306       24         30         27         24
   50      304         342       23         41         31         22
  100      620         347       23         51         31         23
 1000     6466         426       31        127         52         30
10000    64263         552       38        237         58         36

 

Edited by Stefan Glienke
  • Like 4
  • Thanks 1

Share this post


Link to post
27 minutes ago, Stefan Glienke said:

Also testing in the global main is garbage because you are then using global variables that cause way less optimal code - fixing that (simply put all that benchmark code into a procedure) and you get the following results

Oh, didn't know about this.

 

But results are pretty much the same for me, even thought I put all the calls and vars in procedure. Did you make any other changes? Because your TDict results are very good, but I still get same results.

Oh, perhaps is Delphi version, I use 10.2.3.

Share this post


Link to post

As always you again seem to only read half of what I wrote.

 

Base line (code as you posted) - running on an i5-3570K @ 3.4GHz

Get Name from ID (Search by integer):
records Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict     Spring
   10       46         397       60         23         48         59
   50       72         474       67         28         52         59
  100      105         504       62         32         51         59

Get ID from Name (Search by string):
records Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict     Spring
   10       21         373      106         11         25        107
   50      105         443      109         18         26        110
  100      204         487      104         23         29        103

Use equalitycomparer from Tiny.Generics.pas for TDict and Spring - running on the i5-3570K @ 3.4GHz

Get Name from ID (Search by integer):
records Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict     Spring
   10       45         407       44         24         49         44
   50       71         468       42         27         48         42
  100       96         499       43         32         53         43

Get ID from Name (Search by string):
records Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict     Spring
   10       21         379       26         11         27         25
   50      106         445       25         18         27         25
  100      207         483       26         22         29         27

Simply put "procedure Main;" after the CreateRandomData procedure and "end;begin Main;" before the end. - running on the i5-3570K @ 3.4GHz

Get Name from ID (Search by integer):
records Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict     Spring
   10       28         387       27         25         32         26
   50       51         455       25         28         32         24
  100       76         487       27         32         34         27

Get ID from Name (Search by string):
records Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict     Spring
   10       22         380       27         12         25         27
   50      105         446       24         18         28         24
  100      207         489       25         24         30         26
  
Same code as before but running on an i7-8700 @ 3.2Ghz

Get Name from ID (Search by integer):
records Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict     Spring
   10       23         306       21         20         27         20
   50       42         355       22         24         28         20
  100       63         377       22         26         29         21

Get ID from Name (Search by string):
records Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict     Spring
   10       17         299       16          7         18         17
   50       72         359       16         13         19         17
  100      139         383       17         21         22         18
  
Same code compiled with XE8 and also running on the i7-8700 @ 3.2GHz

Get Name from ID (Search by integer):
records Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict     Spring
   10       26         292       22         20         24         21
   50       65         342       22         23         26         21
  100      112         355       21         25         26         23

Get ID from Name (Search by string):
records Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict     Spring
   10       24         281       17          8         18         18
   50      116         331       18         13         20         18
  100      215         347       17         16         22         17

Edit - for bonus points - I fixed the TArray.BinarySearch case by using an IComparer that I prepared and stored as field because that a) solves the return stack buffer glitch caused my TDelegatedComparer and b) avoids producing an anonymous method every time that method is being called. Also got rid of the CompareValue call in BinarySearchByID, can use < and >.

 

Running on the i5-3570k @ 3.4Ghz

Get Name from ID (Search by integer):
records Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict     Spring
   10       26          64       25         21         31         27
   50       52          73       26         23         32         30
  100       77          84       25         23         34         25

Get ID from Name (Search by string):
records Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict     Spring
   10       22          63       25         11         27         27
   50      105          81       24         18         27         26
  100      205          94       24         22         29         27

 

Edited by Stefan Glienke
  • Like 3
  • Thanks 1

Share this post


Link to post

Here is my try. I've changed a lot in existing code (mentioned early comparers, equality comparers, etc.). Also, I've replaced returned string type in functions by PString, just because otherwise string handling takes too much time and algorythms comparison becomes not interesting. 

As well I've added my own minimalistic implementation of hash maps (not universal).

 

mapstest.zip

 

 

Edited by balabuev
  • Like 3
  • Thanks 1

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×