Jump to content

Stefan Glienke

Members
  • Content Count

    1428
  • Joined

  • Last visited

  • Days Won

    141

Everything posted by Stefan Glienke

  1. Stefan Glienke

    Double, default value

    Which - fun fact - does not happen on the method call (unless it's a virtual method) but when accessing any member inside of it. You could still have some method which does not access any member and it will not AV at all. We can argue all day about this - any runtime behavior one might slap onto it will not save the language from being inherently unsafe. It needs to be built into the language/compiler itself (Hi, Rust)
  2. Stefan Glienke

    Double, default value

    The main issue with memory safety in Delphi is not non-initialized local variables, which the compiler warns about, but use-after-free.
  3. Stefan Glienke

    Double, default value

    First, it's wrong to assume that all local variables reside on the stack, depending on optimization they might be in registers. Second, rep stosd is terribly slow for small sizes (see https://stackoverflow.com/a/33485055/587106 and also the discussions on this issue https://github.com/dotnet/runtime/issues/10744) Also, the Delphi compiler arranges managed local variables into one block that it zeroes (in many different and mostly terribly inefficient ways).
  4. Stefan Glienke

    Double, default value

    Why should the CPU waste time zeroing stuff that will be set shortly after anyway? Compilers are there to warn/error when any code path leads to a situation where a variable is not initialized.
  5. type TCustomItem = class; TCustomContainer = class abstract protected procedure HandleItemNotification(AItem: TCustomItem); virtual; abstract; end; TCustomItem = class protected FContainer: TCustomContainer; end; TCustomContainer<T: TCustomItem> = class abstract(TCustomContainer) protected FItems: TArray<T>; procedure HandleItemNotification(AItem: TCustomItem); overload; override; final; procedure HandleItemNotification(AItem: T); reintroduce; overload; end; TFooItem = class(TCustomItem) procedure SomeMethod; end; TFooContainer = class(TCustomContainer<TFooItem>); procedure TCustomContainer<T>.HandleItemNotification(AItem: T); begin // ... end; procedure TCustomContainer<T>.HandleItemNotification(AItem: TCustomItem); begin HandleItemNotification(T(AItem)); end; procedure TFooItem.SomeMethod; begin FContainer.HandleItemNotification(Self); end; var foo: TFooItem; begin foo := TFooItem.Create; foo.FContainer := TFooContainer.Create; foo.SomeMethod; end. I suggest moving as much code into the non generic base classes as possible. Given the constraint on TCustomItem this should even be reasonably easy to achieve. This will prevent causing larger than necessary binary sizes and longer than necessary compiletimes.
  6. Stefan Glienke

    Do you need an ARM64 compiler for Windows?

    Yes, pretty much - we all know they first ship a half-baked feature to generate marketing hype and then spend the next decade tinkering around the edges to make it work while representatives are telling us "eh, its complicated"
  7. It looks like whoever implemented that in the Delphi compiler was overly obsessed with codesize. If you compare what both gcc and clang do you can see that they rather produce a ton of mov instructions (which execute nicely in parallel on modern processors) before using memmov or rep: https://godbolt.org/z/Gjsqn7qas (change the size of the static array to see the assembly code change)
  8. Stefan Glienke

    win11 24h2 msheap fastest

    That benchmark proves almost (*) nothing, the only point where it allocates is during the form creation where it builds the card deck and later when it prints the output into the TListBox. (*) the only thing affected here is the possible layout of the card objects in the heap as they are all read during the hand-processing code. The difference that you can observe here between Delphi buids using different memory managers is most likely caused by the amount of overhead the respective memory manager is using thus fitting more card objects within the same memory pages, thus more of them (most likely all on modern processors) fitting into L1 cache. As for this particular code - removing the name of the Cards from the object and only building it when it is needed for some UI would probably speed up code more than anything else because you get rid of 20 Byte for every object (Name is a string[19]) - on my CPU this makes the code go down from ~900ms to ~680ms - simply because it does not need to copy the strings in CopyCardFromDeck. Circumventing the getter and setter of TList (which contribute around 25% of the remaining time) brings it down to 460ms. And after that we are not done with string stuff - in every loop iteration, it calls Hand.SetHighValues which produces a name for the cards on the hand - removing that gets me down to ~400. Now because I have a run and SamplingProfiler open already I see that now one of the scorers is TcaaPokerHand.CopyCardFromDeck - the Items getter is not inlined which causes it to be called 10 times for the same 2 card objects. Changing that gets me down to 270ms. But how about avoiding repeated access to the same object in the 2 lists altogether? 230ms I could go on because I see a lot more room to optimize - but I think I made my point. Instead of fiddling with the memory manager one should first look if heap allocations are even the issue. And then identify unnecessary work and eliminate that. ... change TcaaEvaluationCard to be 8 Byte size - (that avoids that the compiler creates a movsd/movsb instruction but simply does an 8 byte mov) -> 160ms
  9. Stefan Glienke

    String memory usage

    If you were using the RTL dictionary with its abysmal performance, that does not surprise me at all. Mostly because of its poor hash function, replacing that with a better one speeds it up significantly. However, string interning does not require a dictionary with key/value but just a hashtable for strings. DelphiAST has the option to use one for its parsing - we needed that when using it in the context of the IDE to avoid spikes in memory consumption.
  10. Clever way of introducing thread-unsafety 😉
  11. Stefan Glienke

    TParallelArray Sort Performance...

    Did you precompile Spring or are you compiling directly from the sources? If you precompile (recommended) then you need to precompile with release settings of course. If I had to bet I would say you still have range checking enabled (RTL has that disabled, so does your code) Also tbh I don't care about a 30% improvement of a handwritten algo specifically for one type over the generic one. Yes, compiler could do a better job with more aggressively inlining and stuff but nowadays I am happy already if it generates working code. Apart from the code not compiling I would say it depends on what you want to benchmark - how sort performs on completely cold data? Like some data that you loaded an hour ago, then did completely different things so it does not reside in the cache anymore and now you want to have the fastest sorting possible. Then yes, otherwise I would say that usually you typically are sorting data that is already in cache because you previously filled the array or list with it, or processed it followed by the sort operation. Some additional read on that "clear the LLC for some benchmark" topic: https://stackoverflow.com/a/49077734/587106
  12. Stefan Glienke

    TParallelArray Sort Performance...

    Here is something else to tease you all a bit - PDQSort is using comparer, PDQSort2 does not - others as before QuickSort2 is using the Spring Comparer - test type is Double: Looks like I am getting twice the speed at worst and 50 times at best. For faster to compare types such as Integer the gain goes up to almost 100 times. -------------------------------------------------------------------------------------- Benchmark Time CPU Iterations -------------------------------------------------------------------------------------- random/QuickSort/count:100 2917 ns 2886 ns 100 random/QuickSort2/count:100 2301 ns 2284 ns 100 random/IntroSort/count:100 1582 ns 1566 ns 100 random/PDQSort/count:100 1519 ns 1504 ns 100 random/PDQSort2/count:100 759 ns 746 ns 100 random/QuickSort/count:1000 48127 ns 47750 ns 100 random/QuickSort2/count:1000 26820 ns 26615 ns 100 random/IntroSort/count:1000 16349 ns 16157 ns 100 random/PDQSort/count:1000 24101 ns 23928 ns 100 random/PDQSort2/count:1000 6485 ns 6442 ns 100 random/QuickSort/count:10000 686240 ns 685846 ns 100 random/QuickSort2/count:10000 621977 ns 618258 ns 100 random/IntroSort/count:10000 426604 ns 426023 ns 100 random/PDQSort/count:10000 445646 ns 443871 ns 100 random/PDQSort2/count:10000 338404 ns 336588 ns 100 random/QuickSort/count:100000 8252128 ns 8220957 ns 100 random/QuickSort2/count:100000 7492506 ns 7371508 ns 100 random/IntroSort/count:100000 5723208 ns 5578393 ns 100 random/PDQSort/count:100000 5675312 ns 5650503 ns 100 random/PDQSort2/count:100000 4504745 ns 4482480 ns 100 random/QuickSort/count:1000000 98163494 ns 97011011 ns 100 random/QuickSort2/count:1000000 88461390 ns 87306397 ns 100 random/IntroSort/count:1000000 67687293 ns 66648855 ns 100 random/PDQSort/count:1000000 69019337 ns 68147235 ns 100 random/PDQSort2/count:1000000 55628991 ns 54791149 ns 100 few_unique/QuickSort/count:100 2356 ns 2314 ns 100 few_unique/QuickSort2/count:100 1540 ns 1500 ns 100 few_unique/IntroSort/count:100 1291 ns 1275 ns 100 few_unique/PDQSort/count:100 1559 ns 1483 ns 100 few_unique/PDQSort2/count:100 689 ns 674 ns 100 few_unique/QuickSort/count:1000 29073 ns 29009 ns 100 few_unique/QuickSort2/count:1000 16086 ns 16026 ns 100 few_unique/IntroSort/count:1000 11084 ns 11031 ns 100 few_unique/PDQSort/count:1000 8022 ns 7984 ns 100 few_unique/PDQSort2/count:1000 4015 ns 3983 ns 100 few_unique/QuickSort/count:10000 424766 ns 424046 ns 100 few_unique/QuickSort2/count:10000 292291 ns 288395 ns 100 few_unique/IntroSort/count:10000 224547 ns 222666 ns 100 few_unique/PDQSort/count:10000 145437 ns 144788 ns 100 few_unique/PDQSort2/count:10000 102878 ns 102414 ns 100 few_unique/QuickSort/count:100000 4654357 ns 4640420 ns 100 few_unique/QuickSort2/count:100000 3436405 ns 3421938 ns 100 few_unique/IntroSort/count:100000 2649901 ns 2639618 ns 100 few_unique/PDQSort/count:100000 1487662 ns 1479095 ns 100 few_unique/PDQSort2/count:100000 1172432 ns 1167015 ns 100 few_unique/QuickSort/count:1000000 51130249 ns 50537573 ns 100 few_unique/QuickSort2/count:1000000 38627555 ns 38033922 ns 100 few_unique/IntroSort/count:1000000 30907231 ns 30808893 ns 100 few_unique/PDQSort/count:1000000 14733627 ns 14572366 ns 100 few_unique/PDQSort2/count:1000000 12287426 ns 12233176 ns 100 two_values/QuickSort/count:100 2115 ns 2099 ns 100 two_values/QuickSort2/count:100 1601 ns 1558 ns 100 two_values/IntroSort/count:100 1058 ns 1037 ns 100 two_values/PDQSort/count:100 714 ns 699 ns 100 two_values/PDQSort2/count:100 395 ns 377 ns 100 two_values/QuickSort/count:1000 18838 ns 18787 ns 100 two_values/QuickSort2/count:1000 15509 ns 15468 ns 100 two_values/IntroSort/count:1000 10009 ns 9971 ns 100 two_values/PDQSort/count:1000 3434 ns 3397 ns 100 two_values/PDQSort2/count:1000 1715 ns 1673 ns 100 two_values/QuickSort/count:10000 310630 ns 306744 ns 100 two_values/QuickSort2/count:10000 221981 ns 214977 ns 100 two_values/IntroSort/count:10000 148667 ns 148595 ns 100 two_values/PDQSort/count:10000 42702 ns 42577 ns 100 two_values/PDQSort2/count:10000 24225 ns 24032 ns 100 two_values/QuickSort/count:100000 3353827 ns 3342019 ns 100 two_values/QuickSort2/count:100000 2487005 ns 2476672 ns 100 two_values/IntroSort/count:100000 1870906 ns 1863767 ns 100 two_values/PDQSort/count:100000 541802 ns 540355 ns 100 two_values/PDQSort2/count:100000 378254 ns 378023 ns 100 two_values/QuickSort/count:1000000 40114855 ns 39440910 ns 100 two_values/QuickSort2/count:1000000 29158407 ns 28744654 ns 100 two_values/IntroSort/count:1000000 23100678 ns 22744457 ns 100 two_values/PDQSort/count:1000000 5633201 ns 5601282 ns 100 two_values/PDQSort2/count:1000000 3762100 ns 3740864 ns 100 all_equal/QuickSort/count:100 2151 ns 2136 ns 100 all_equal/QuickSort2/count:100 1647 ns 1618 ns 100 all_equal/IntroSort/count:100 1058 ns 1040 ns 100 all_equal/PDQSort/count:100 570 ns 526 ns 100 all_equal/PDQSort2/count:100 326 ns 310 ns 100 all_equal/QuickSort/count:1000 17604 ns 17531 ns 100 all_equal/QuickSort2/count:1000 12558 ns 12503 ns 100 all_equal/IntroSort/count:1000 8658 ns 8607 ns 100 all_equal/PDQSort/count:1000 2501 ns 2517 ns 100 all_equal/PDQSort2/count:1000 1211 ns 1200 ns 100 all_equal/QuickSort/count:10000 229287 ns 228053 ns 100 all_equal/QuickSort2/count:10000 168158 ns 167545 ns 100 all_equal/IntroSort/count:10000 123868 ns 122653 ns 100 all_equal/PDQSort/count:10000 17648 ns 17566 ns 100 all_equal/PDQSort2/count:10000 6898 ns 6844 ns 100 all_equal/QuickSort/count:100000 3089106 ns 2839190 ns 100 all_equal/QuickSort2/count:100000 1991645 ns 1983108 ns 100 all_equal/IntroSort/count:100000 1517401 ns 1509352 ns 100 all_equal/PDQSort/count:100000 168092 ns 167526 ns 100 all_equal/PDQSort2/count:100000 64056 ns 63376 ns 100 all_equal/QuickSort/count:1000000 31780316 ns 31401768 ns 100 all_equal/QuickSort2/count:1000000 23670745 ns 23308078 ns 100 all_equal/IntroSort/count:1000000 18492352 ns 18165700 ns 100 all_equal/PDQSort/count:1000000 1693063 ns 1685270 ns 100 all_equal/PDQSort2/count:1000000 667052 ns 665049 ns 100 ascending/QuickSort/count:100 1645 ns 1629 ns 100 ascending/QuickSort2/count:100 1249 ns 1233 ns 100 ascending/IntroSort/count:100 993 ns 956 ns 100 ascending/PDQSort/count:100 524 ns 510 ns 100 ascending/PDQSort2/count:100 290 ns 276 ns 100 ascending/QuickSort/count:1000 15279 ns 15237 ns 100 ascending/QuickSort2/count:1000 11455 ns 11389 ns 100 ascending/IntroSort/count:1000 7931 ns 7802 ns 100 ascending/PDQSort/count:1000 2618 ns 2572 ns 100 ascending/PDQSort2/count:1000 1199 ns 1169 ns 100 ascending/QuickSort/count:10000 186722 ns 185998 ns 100 ascending/QuickSort2/count:10000 134269 ns 134014 ns 100 ascending/IntroSort/count:10000 106604 ns 106492 ns 100 ascending/PDQSort/count:10000 19298 ns 19236 ns 100 ascending/PDQSort2/count:10000 6002 ns 5953 ns 100 ascending/QuickSort/count:100000 2229729 ns 2219903 ns 100 ascending/QuickSort2/count:100000 1655304 ns 1647404 ns 100 ascending/IntroSort/count:100000 1298027 ns 1291995 ns 100 ascending/PDQSort/count:100000 190239 ns 189988 ns 100 ascending/PDQSort2/count:100000 53607 ns 53501 ns 100 ascending/QuickSort/count:1000000 27509633 ns 27165759 ns 100 ascending/QuickSort2/count:1000000 19985236 ns 19668315 ns 100 ascending/IntroSort/count:1000000 15326770 ns 15254001 ns 100 ascending/PDQSort/count:1000000 1901178 ns 1890171 ns 100 ascending/PDQSort2/count:1000000 629675 ns 628905 ns 100 slightly_scrambled/QuickSort/count:100 1750 ns 1732 ns 100 slightly_scrambled/QuickSort2/count:100 1373 ns 1343 ns 100 slightly_scrambled/IntroSort/count:100 985 ns 968 ns 100 slightly_scrambled/PDQSort/count:100 1012 ns 997 ns 100 slightly_scrambled/PDQSort2/count:100 471 ns 454 ns 100 slightly_scrambled/QuickSort/count:1000 16791 ns 16735 ns 100 slightly_scrambled/QuickSort2/count:1000 12824 ns 12776 ns 100 slightly_scrambled/IntroSort/count:1000 10458 ns 10399 ns 100 slightly_scrambled/PDQSort/count:1000 16268 ns 15887 ns 100 slightly_scrambled/PDQSort2/count:1000 4547 ns 4494 ns 100 slightly_scrambled/QuickSort/count:10000 235073 ns 233521 ns 100 slightly_scrambled/QuickSort2/count:10000 200321 ns 199399 ns 100 slightly_scrambled/IntroSort/count:10000 151789 ns 151249 ns 100 slightly_scrambled/PDQSort/count:10000 174541 ns 173592 ns 100 slightly_scrambled/PDQSort2/count:10000 89454 ns 89257 ns 100 slightly_scrambled/QuickSort/count:100000 2972211 ns 2964870 ns 100 slightly_scrambled/QuickSort2/count:100000 2376263 ns 2366504 ns 100 slightly_scrambled/IntroSort/count:100000 2305909 ns 2052472 ns 100 slightly_scrambled/PDQSort/count:100000 2093171 ns 2083240 ns 100 slightly_scrambled/PDQSort2/count:100000 982124 ns 978410 ns 100 slightly_scrambled/QuickSort/count:1000000 35292840 ns 34939068 ns 100 slightly_scrambled/QuickSort2/count:1000000 29451877 ns 28875217 ns 100 slightly_scrambled/IntroSort/count:1000000 25058823 ns 24750291 ns 100 slightly_scrambled/PDQSort/count:1000000 25962367 ns 25535252 ns 100 slightly_scrambled/PDQSort2/count:1000000 12483234 ns 12181630 ns 100 sorted_blocks/QuickSort/count:100 2430 ns 2405 ns 100 sorted_blocks/QuickSort2/count:100 2499 ns 2478 ns 100 sorted_blocks/IntroSort/count:100 1061 ns 1019 ns 100 sorted_blocks/PDQSort/count:100 1329 ns 1285 ns 100 sorted_blocks/PDQSort2/count:100 720 ns 652 ns 100 sorted_blocks/QuickSort/count:1000 16224 ns 16169 ns 100 sorted_blocks/QuickSort2/count:1000 19509 ns 19454 ns 100 sorted_blocks/IntroSort/count:1000 14560 ns 14515 ns 100 sorted_blocks/PDQSort/count:1000 9479 ns 9313 ns 100 sorted_blocks/PDQSort2/count:1000 4913 ns 4885 ns 100 sorted_blocks/QuickSort/count:10000 308032 ns 306791 ns 100 sorted_blocks/QuickSort2/count:10000 238446 ns 238183 ns 100 sorted_blocks/IntroSort/count:10000 173146 ns 172243 ns 100 sorted_blocks/PDQSort/count:10000 168365 ns 166934 ns 100 sorted_blocks/PDQSort2/count:10000 65434 ns 65356 ns 100 sorted_blocks/QuickSort/count:100000 4041913 ns 4028550 ns 100 sorted_blocks/QuickSort2/count:100000 3239075 ns 3226690 ns 100 sorted_blocks/IntroSort/count:100000 2578785 ns 2434971 ns 100 sorted_blocks/PDQSort/count:100000 2386599 ns 2375001 ns 100 sorted_blocks/PDQSort2/count:100000 966824 ns 959131 ns 100 sorted_blocks/QuickSort/count:1000000 51596313 ns 50949966 ns 100 sorted_blocks/QuickSort2/count:1000000 39977951 ns 39455223 ns 100 sorted_blocks/IntroSort/count:1000000 29587960 ns 29072445 ns 100 sorted_blocks/PDQSort/count:1000000 29140836 ns 28926711 ns 100 sorted_blocks/PDQSort2/count:1000000 13950878 ns 13703237 ns 100 single_big_swap/QuickSort/count:100 1290 ns 1277 ns 100 single_big_swap/QuickSort2/count:100 1004 ns 1000 ns 100 single_big_swap/IntroSort/count:100 1087 ns 1069 ns 100 single_big_swap/PDQSort/count:100 633 ns 620 ns 100 single_big_swap/PDQSort2/count:100 474 ns 456 ns 100 single_big_swap/QuickSort/count:1000 16889 ns 16794 ns 100 single_big_swap/QuickSort2/count:1000 12697 ns 12655 ns 100 single_big_swap/IntroSort/count:1000 14016 ns 13967 ns 100 single_big_swap/PDQSort/count:1000 7008 ns 6939 ns 100 single_big_swap/PDQSort2/count:1000 2342 ns 2339 ns 100 single_big_swap/QuickSort/count:10000 204246 ns 204182 ns 100 single_big_swap/QuickSort2/count:10000 156179 ns 155306 ns 100 single_big_swap/IntroSort/count:10000 192030 ns 190639 ns 100 single_big_swap/PDQSort/count:10000 78781 ns 78285 ns 100 single_big_swap/PDQSort2/count:10000 23238 ns 23145 ns 100 single_big_swap/QuickSort/count:100000 2414905 ns 2395809 ns 100 single_big_swap/QuickSort2/count:100000 1787565 ns 1778945 ns 100 single_big_swap/IntroSort/count:100000 2523279 ns 2517094 ns 100 single_big_swap/PDQSort/count:100000 889865 ns 884869 ns 100 single_big_swap/PDQSort2/count:100000 249087 ns 246596 ns 100 single_big_swap/QuickSort/count:1000000 29262380 ns 28791769 ns 100 single_big_swap/QuickSort2/count:1000000 21306314 ns 20978264 ns 100 single_big_swap/IntroSort/count:1000000 30104774 ns 29736630 ns 100 single_big_swap/PDQSort/count:1000000 10839585 ns 10689165 ns 100 single_big_swap/PDQSort2/count:1000000 3153072 ns 3134668 ns 100 descending/QuickSort/count:100 1633 ns 1619 ns 100 descending/QuickSort2/count:100 1324 ns 1245 ns 100 descending/IntroSort/count:100 915 ns 904 ns 100 descending/PDQSort/count:100 534 ns 518 ns 100 descending/PDQSort2/count:100 305 ns 290 ns 100 descending/QuickSort/count:1000 15212 ns 15109 ns 100 descending/QuickSort2/count:1000 12131 ns 12071 ns 100 descending/IntroSort/count:1000 8520 ns 8483 ns 100 descending/PDQSort/count:1000 2677 ns 2616 ns 100 descending/PDQSort2/count:1000 1168 ns 1167 ns 100 descending/QuickSort/count:10000 190983 ns 190004 ns 100 descending/QuickSort2/count:10000 142282 ns 141461 ns 100 descending/IntroSort/count:10000 106956 ns 106801 ns 100 descending/PDQSort/count:10000 22199 ns 22098 ns 100 descending/PDQSort2/count:10000 7163 ns 6853 ns 100 descending/QuickSort/count:100000 2210157 ns 2201096 ns 100 descending/QuickSort2/count:100000 1639887 ns 1632072 ns 100 descending/IntroSort/count:100000 1308584 ns 1301543 ns 100 descending/PDQSort/count:100000 186613 ns 185900 ns 100 descending/PDQSort2/count:100000 56812 ns 56648 ns 100 descending/QuickSort/count:1000000 27759185 ns 27197419 ns 100 descending/QuickSort2/count:1000000 19937923 ns 19733711 ns 100 descending/IntroSort/count:1000000 15222011 ns 15155862 ns 100 descending/PDQSort/count:1000000 2159517 ns 1899555 ns 100 descending/PDQSort2/count:1000000 560522 ns 557461 ns 100 pipe_organ/QuickSort/count:100 1853 ns 1835 ns 100 pipe_organ/QuickSort2/count:100 1439 ns 1416 ns 100 pipe_organ/IntroSort/count:100 1029 ns 950 ns 100 pipe_organ/PDQSort/count:100 528 ns 517 ns 100 pipe_organ/PDQSort2/count:100 296 ns 282 ns 100 pipe_organ/QuickSort/count:1000 15940 ns 15886 ns 100 pipe_organ/QuickSort2/count:1000 11960 ns 11911 ns 100 pipe_organ/IntroSort/count:1000 8271 ns 8225 ns 100 pipe_organ/PDQSort/count:1000 2622 ns 2591 ns 100 pipe_organ/PDQSort2/count:1000 1192 ns 1172 ns 100 pipe_organ/QuickSort/count:10000 193559 ns 193175 ns 100 pipe_organ/QuickSort2/count:10000 150531 ns 149370 ns 100 pipe_organ/IntroSort/count:10000 115439 ns 115205 ns 100 pipe_organ/PDQSort/count:10000 19195 ns 19176 ns 100 pipe_organ/PDQSort2/count:10000 6117 ns 6021 ns 100 pipe_organ/QuickSort/count:100000 2486991 ns 2478769 ns 100 pipe_organ/QuickSort2/count:100000 1819950 ns 1810639 ns 100 pipe_organ/IntroSort/count:100000 1388682 ns 1383509 ns 100 pipe_organ/PDQSort/count:100000 190331 ns 188891 ns 100 pipe_organ/PDQSort2/count:100000 52793 ns 52639 ns 100 pipe_organ/QuickSort/count:1000000 29577606 ns 29326755 ns 100 pipe_organ/QuickSort2/count:1000000 21748586 ns 21414611 ns 100 pipe_organ/IntroSort/count:1000000 16696194 ns 16630029 ns 100 pipe_organ/PDQSort/count:1000000 1869925 ns 1864242 ns 100 pipe_organ/PDQSort2/count:1000000 549429 ns 545797 ns 100 push_front/QuickSort/count:100 2289 ns 2269 ns 100 push_front/QuickSort2/count:100 1478 ns 1491 ns 100 push_front/IntroSort/count:100 1605 ns 1561 ns 100 push_front/PDQSort/count:100 1282 ns 1240 ns 100 push_front/PDQSort2/count:100 543 ns 523 ns 100 push_front/QuickSort/count:1000 18117 ns 18055 ns 100 push_front/QuickSort2/count:1000 15428 ns 15155 ns 100 push_front/IntroSort/count:1000 14087 ns 13988 ns 100 push_front/PDQSort/count:1000 12455 ns 12404 ns 100 push_front/PDQSort2/count:1000 4450 ns 4413 ns 100 push_front/QuickSort/count:10000 229186 ns 227644 ns 100 push_front/QuickSort2/count:10000 160690 ns 160380 ns 100 push_front/IntroSort/count:10000 191233 ns 190610 ns 100 push_front/PDQSort/count:10000 145472 ns 144255 ns 100 push_front/PDQSort2/count:10000 41785 ns 41612 ns 100 push_front/QuickSort/count:100000 2922293 ns 2702687 ns 100 push_front/QuickSort2/count:100000 1923425 ns 1917358 ns 100 push_front/IntroSort/count:100000 2488440 ns 2478518 ns 100 push_front/PDQSort/count:100000 1688943 ns 1681909 ns 100 push_front/PDQSort2/count:100000 466741 ns 465112 ns 100 push_front/QuickSort/count:1000000 31411786 ns 31044262 ns 100 push_front/QuickSort2/count:1000000 22438347 ns 22338651 ns 100 push_front/IntroSort/count:1000000 30187199 ns 29784770 ns 100 push_front/PDQSort/count:1000000 19681892 ns 19340073 ns 100 push_front/PDQSort2/count:1000000 5359031 ns 5330950 ns 100 push_middle/QuickSort/count:100 1986 ns 1972 ns 100 push_middle/QuickSort2/count:100 1523 ns 1475 ns 100 push_middle/IntroSort/count:100 1198 ns 1155 ns 100 push_middle/PDQSort/count:100 982 ns 965 ns 100 push_middle/PDQSort2/count:100 437 ns 419 ns 100 push_middle/QuickSort/count:1000 16948 ns 16891 ns 100 push_middle/QuickSort2/count:1000 12581 ns 12432 ns 100 push_middle/IntroSort/count:1000 10488 ns 10440 ns 100 push_middle/PDQSort/count:1000 4969 ns 4929 ns 100 push_middle/PDQSort2/count:1000 2006 ns 1953 ns 100 push_middle/QuickSort/count:10000 206586 ns 206205 ns 100 push_middle/QuickSort2/count:10000 147860 ns 147583 ns 100 push_middle/IntroSort/count:10000 143688 ns 143232 ns 100 push_middle/PDQSort/count:10000 44343 ns 44226 ns 100 push_middle/PDQSort2/count:10000 14500 ns 14441 ns 100 push_middle/QuickSort/count:100000 2641267 ns 2493017 ns 100 push_middle/QuickSort2/count:100000 1782662 ns 1775961 ns 100 push_middle/IntroSort/count:100000 1867750 ns 1861113 ns 100 push_middle/PDQSort/count:100000 426423 ns 423756 ns 100 push_middle/PDQSort2/count:100000 121412 ns 120541 ns 100 push_middle/QuickSort/count:1000000 29463664 ns 29094682 ns 100 push_middle/QuickSort2/count:1000000 21699498 ns 21338872 ns 100 push_middle/IntroSort/count:1000000 22344304 ns 21994267 ns 100 push_middle/PDQSort/count:1000000 4514580 ns 4247772 ns 100 push_middle/PDQSort2/count:1000000 1244558 ns 1236921 ns 100
  13. Stefan Glienke

    TParallelArray Sort Performance...

    Benchmark code: {$APPTYPE CONSOLE} uses System.Diagnostics, System.Generics.Collections, System.SysUtils, Spring; procedure Test(count: NativeInt); var SSortArray, PSortArray: TArray<double>; I: Integer; T: TStopwatch; begin try SetLength(SSortArray, count); SetLength(PSortArray, count); Writeln('Fill array (double), elements count: ', count); for I := Low(SSortArray) to High(SSortArray) do begin SSortArray[I] := Random(MaxInt); PSortArray[I] := SSortArray[I]; end; Writeln('Start sorting ...'); T := TStopwatch.StartNew; System.Generics.Collections.TArray.Sort<double>(SSortArray); T.Stop; Writeln('RTL TArray.Sort (ms.): ', T.ElapsedMilliseconds.ToString); T := TStopwatch.StartNew; Spring.TArray.Sort<double>(PSortArray); T.Stop; Writeln('Spring TArray.Sort (ms.): ', T.ElapsedMilliseconds.ToString); except on E: Exception do Writeln(E.ClassName, ': ', E.Message); end; Writeln; end; begin Test(500_000); Test(5_000_000); Test(100_000_000); Readln; end. Results (win64 release) on i7-12700: Fill array (double), elements count: 500000 Start sorting ... RTL TArray.Sort (ms.): 49 Spring TArray.Sort (ms.): 33 Fill array (double), elements count: 5000000 Start sorting ... RTL TArray.Sort (ms.): 560 Spring TArray.Sort (ms.): 393 Fill array (double), elements count: 100000000 Start sorting ... RTL TArray.Sort (ms.): 13053 Spring TArray.Sort (ms.): 9351
  14. Stefan Glienke

    TParallelArray Sort Performance...

    To me they are not - if you have some backend server code that is already doing multiple things in parallel it won't get any better if you then do parallel sort.
  15. Stefan Glienke

    TParallelArray Sort Performance...

    The issue is that in the Parallel implementation there is no mitigation against QuickSorts well known worst case.
  16. Stefan Glienke

    TParallelArray Sort Performance...

    This is the benchmark code - previous results I posted were with TTestType = Integer, but using Double shows similar results (I ran the benchmark as 64bit release) program SortBench; {$APPTYPE CONSOLE} uses System.Generics.Collections, Spring, Spring.Comparers, Spring.Benchmark, System.SysUtils; type TTestType = Double; TDistFunc = function(size: Integer): TArray<TTestType>; TSortFunc = procedure(var values: array of TTestType); function IntToTestType(i: Integer): TTestType; begin Result := i; end; function Equals(const left, right: TTestType): Boolean; inline; begin Result := left = right; end; function Shuffled(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := 0 to size - 1 do Result[i] := IntToTestType(i); Spring.TArray.Shuffle<TTestType>(Result); end; function Shuffled_16Values(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := 0 to size - 1 do Result[i] := IntToTestType(Random(16)); end; function TwoValues(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := 0 to size - 1 do Result[i] := IntToTestType(Random(2)); end; function AllEqual(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := 0 to size - 1 do Result[i] := IntToTestType(0); end; function Ascending(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := 0 to size - 1 do Result[i] := IntToTestType(i); end; function SlightlyScrambled(size: Integer): TArray<TTestType>; const scramblePercentage = 5; var i, idx1, idx2: Integer; temp: TTestType; scrambleCount: Integer; begin SetLength(Result, size); for i := 0 to size - 1 do Result[i] := IntToTestType(i); scrambleCount := size * scramblePercentage div 100; for i := 1 to scrambleCount do begin idx1 := Random(size); idx2 := Random(size); temp := Result[idx1]; Result[idx1] := Result[idx2]; Result[idx2] := temp; end; end; function SortedBlocks(size: Integer): TArray<TTestType>; type TBlock100 = array[1..100] of TTestType; TBlock10 = array[1..10] of TTestType; var i: Integer; begin SetLength(Result, size); for i := 0 to size - 1 do Result[i] := IntToTestType(i); if size > 100 then Spring.TArray.Shuffle<TBlock100>(Pointer(Result), (size div (SizeOf(TBlock100) div SizeOf(TTestType))) - 1) else Spring.TArray.Shuffle<TBlock10>(Pointer(Result), (size div (SizeOf(TBlock10) div SizeOf(TTestType))) - 1); end; function SingleBigSwap(size: Integer): TArray<TTestType>; var i: Integer; halfSize: Integer; begin SetLength(Result, size); halfSize := size div 2; for i := 0 to halfSize - 1 do Result[i] := IntToTestType(i + halfSize); for i := halfSize to size -1 do Result[i] := IntToTestType(i - halfSize); end; function Descending(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := size-1 downto 0 do Result[i] := IntToTestType(i); end; function PipeOrgan(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := 0 to (size div 2) - 1 do Result[i] := IntToTestType(i); for i := size div 2 to size -1 do Result[i] := IntToTestType(size - 1); end; function PushFront(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := 1 to size - 1 do Result[i - 1] := IntToTestType(i); Result[size - 1] := IntToTestType(0); end; function PushMiddle(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := 0 to (size div 2) - 1 do Result[i] := IntToTestType(i); for i := (size div 2) + 1 to size -1 do Result[i - 1] := IntToTestType(i); Result[size - 1] := IntToTestType(size div 2); end; procedure QuickSort(var values: array of TTestType); begin System.Generics.Collections.TArray.Sort<TTestType>(values); end; procedure QuickSort_SpringComparer(var values: array of TTestType); begin System.Generics.Collections.TArray.Sort<TTestType>(values , Spring.Comparers.TComparer<TTestType>.Default ); end; procedure IntroSort(var values: array of TTestType); begin Spring.TArray.Sort<TTestType>(values); end; const Distributions: array[0..11] of record name: string; func: TDistFunc end = ( (name: 'random'; func: Shuffled), (name: 'few_unique'; func: Shuffled_16Values), (name: 'two_values'; func: TwoValues), (name: 'all_equal'; func: AllEqual), (name: 'ascending'; func: Ascending), (name: 'slightly_scrambled'; func: SlightlyScrambled), (name: 'sorted_blocks'; func: SortedBlocks), (name: 'single_big_swap'; func: SingleBigSwap), (name: 'descending'; func: Descending), (name: 'pipe_organ'; func: PipeOrgan), (name: 'push_front'; func: PushFront), (name: 'push_middle'; func: PushMiddle) ); Sorts: array[0..2] of record name: string; func: TSortFunc end = ( (name: 'QuickSort'; func: QuickSort) ,(name: 'QuickSort2'; func: QuickSort_SpringComparer) ,(name: 'IntroSort'; func: IntroSort) ); Sizes: array of Integer = [100, 1000, 10000, 100000, 1000000]; procedure BenchSort(const state: TState); begin var distIdx := state[0]; var sortIdx := state[1]; var size := state[2]; var distribution: TDistFunc := Distributions[distIdx].func; var sort: TSortFunc := Sorts[sortIdx].func; var data := distribution(size); var copy: TArray<TTestType>; SetLength(copy, size); for var _ in state do begin state.PauseTiming; if IsManagedType(TTestType) then MoveManaged(data, copy, TypeInfo(TTestType), size) else Move(data[0], copy[0], size * SizeOf(TTestType)); state.ResumeTiming; sort(copy); end; QuickSort(data); for var i := 0 to high(copy) do Assert(Equals(copy[i], data[i]), Distributions[distIdx].Name + '/' + Sorts[sortIdx].Name + '/' + size.ToString); end; procedure Main; begin benchmark_format_args := False; for var distIdx := Low(distributions) to High(Distributions) do for var size in Sizes do for var sortIdx := Low(Sorts) to High(Sorts) do begin var bm := Benchmark(BenchSort, Distributions[distIdx].name + '/' + Sorts[sortIdx].name + '/count:' + size.ToString) .Args([distIdx, sortIdx, size]); bm.Iterations(100); end; Benchmark_Main; end; begin try Main; except on E: Exception do Writeln(E.ClassName, ': ', E.Message); end; Readln; end.
  17. Stefan Glienke

    TParallelArray Sort Performance...

    TParallelArray.Sort is broken by the way - try this: uses System.Threading, System.SysUtils; procedure Main; begin var data: TArray<Integer>; SetLength(data, 1000000); for var i := 0 to High(data) do data[i] := Random(16); TParallelArray.Sort<Integer>(data); end; begin try Main; except on E: Exception do Writeln(E.ClassName, ': ', E.Message); end; Writeln('done'); Readln; end.
  18. Stefan Glienke

    TParallelArray Sort Performance...

    Depends on your IntroSort implementation - the one in Spring is usually faster than RTL quicksort. (*) But this also is a result of it using branchless comparers as far as possible. Also, there is a reason why usually benchmarks for sorting algorithms don't only test on complete random data but also on several patterns. Because in reality, you often deal with things such as mostly sorted with a few unsorted elements at the end or reversed sorted. (*) here are the benchmark results for RTL QuickSort vs RTL QuickSort with Spring4d Comparer vs Spring4d IntroSort for several data patterns - you can easily see how the branchless compare affects the performance quite significantly -------------------------------------------------------------------------------------- Benchmark Time CPU Iterations -------------------------------------------------------------------------------------- random/QuickSort/count:100 2463 ns 2435 ns 100 random/QuickSort2/count:100 1802 ns 1782 ns 100 random/IntroSort/count:100 1420 ns 1395 ns 100 random/QuickSort/count:1000 50209 ns 50189 ns 100 random/QuickSort2/count:1000 27645 ns 27612 ns 100 random/IntroSort/count:1000 18489 ns 18468 ns 100 random/QuickSort/count:10000 836172 ns 835798 ns 100 random/QuickSort2/count:10000 728400 ns 727977 ns 100 random/IntroSort/count:10000 523833 ns 523565 ns 100 random/QuickSort/count:100000 9855657 ns 9850263 ns 100 random/QuickSort2/count:100000 8918812 ns 8912401 ns 100 random/IntroSort/count:100000 6839316 ns 6829126 ns 100 random/QuickSort/count:1000000 119455635 ns 119330369 ns 100 random/QuickSort2/count:1000000 104800849 ns 104696480 ns 100 random/IntroSort/count:1000000 81749401 ns 81660293 ns 100 few_unique/QuickSort/count:100 2249 ns 2251 ns 100 few_unique/QuickSort2/count:100 1632 ns 1593 ns 100 few_unique/IntroSort/count:100 1231 ns 1202 ns 100 few_unique/QuickSort/count:1000 35486 ns 35454 ns 100 few_unique/QuickSort2/count:1000 25086 ns 25038 ns 100 few_unique/IntroSort/count:1000 13094 ns 13071 ns 100 few_unique/QuickSort/count:10000 515193 ns 514924 ns 100 few_unique/QuickSort2/count:10000 371173 ns 370283 ns 100 few_unique/IntroSort/count:10000 285188 ns 285036 ns 100 few_unique/QuickSort/count:100000 5664379 ns 5657759 ns 100 few_unique/QuickSort2/count:100000 4348078 ns 4341859 ns 100 few_unique/IntroSort/count:100000 3164370 ns 3162498 ns 100 few_unique/QuickSort/count:1000000 63121983 ns 63076941 ns 100 few_unique/QuickSort2/count:1000000 43872122 ns 43809752 ns 100 few_unique/IntroSort/count:1000000 30719211 ns 30651572 ns 100 two_values/QuickSort/count:100 1538 ns 1514 ns 100 two_values/QuickSort2/count:100 1205 ns 1181 ns 100 two_values/IntroSort/count:100 781 ns 768 ns 100 two_values/QuickSort/count:1000 19267 ns 19242 ns 100 two_values/QuickSort2/count:1000 15309 ns 15261 ns 100 two_values/IntroSort/count:1000 9174 ns 9146 ns 100 two_values/QuickSort/count:10000 281462 ns 276065 ns 100 two_values/QuickSort2/count:10000 206943 ns 206818 ns 100 two_values/IntroSort/count:10000 150871 ns 150572 ns 100 two_values/QuickSort/count:100000 3483592 ns 3467751 ns 100 two_values/QuickSort2/count:100000 2574497 ns 2555053 ns 100 two_values/IntroSort/count:100000 1947766 ns 1943228 ns 100 two_values/QuickSort/count:1000000 39543461 ns 39439575 ns 100 two_values/QuickSort2/count:1000000 29565091 ns 29524866 ns 100 two_values/IntroSort/count:1000000 22888036 ns 22847673 ns 100 all_equal/QuickSort/count:100 1453 ns 1424 ns 100 all_equal/QuickSort2/count:100 1152 ns 1135 ns 100 all_equal/IntroSort/count:100 735 ns 718 ns 100 all_equal/QuickSort/count:1000 16442 ns 16424 ns 100 all_equal/QuickSort2/count:1000 12609 ns 12588 ns 100 all_equal/IntroSort/count:1000 8274 ns 8253 ns 100 all_equal/QuickSort/count:10000 214083 ns 213979 ns 100 all_equal/QuickSort2/count:10000 164971 ns 164921 ns 100 all_equal/IntroSort/count:10000 125590 ns 125514 ns 100 all_equal/QuickSort/count:100000 2819341 ns 2815702 ns 100 all_equal/QuickSort2/count:100000 2120029 ns 2117072 ns 100 all_equal/IntroSort/count:100000 1563758 ns 1560929 ns 100 all_equal/QuickSort/count:1000000 32668228 ns 32563853 ns 100 all_equal/QuickSort2/count:1000000 24450182 ns 24411351 ns 100 all_equal/IntroSort/count:1000000 18799427 ns 18758084 ns 100 ascending/QuickSort/count:100 1238 ns 1208 ns 100 ascending/QuickSort2/count:100 989 ns 962 ns 100 ascending/IntroSort/count:100 723 ns 700 ns 100 ascending/QuickSort/count:1000 15109 ns 15088 ns 100 ascending/QuickSort2/count:1000 11357 ns 11335 ns 100 ascending/IntroSort/count:1000 7988 ns 7969 ns 100 ascending/QuickSort/count:10000 187903 ns 187573 ns 100 ascending/QuickSort2/count:10000 140454 ns 140400 ns 100 ascending/IntroSort/count:10000 119619 ns 119090 ns 100 ascending/QuickSort/count:100000 2250330 ns 2249079 ns 100 ascending/QuickSort2/count:100000 1700410 ns 1699408 ns 100 ascending/IntroSort/count:100000 1470538 ns 1467097 ns 100 ascending/QuickSort/count:1000000 28118193 ns 28071460 ns 100 ascending/QuickSort2/count:1000000 20655432 ns 20619601 ns 100 ascending/IntroSort/count:1000000 17387674 ns 17366389 ns 100 slightly_scrambled/QuickSort/count:100 1260 ns 1221 ns 100 slightly_scrambled/QuickSort2/count:100 992 ns 976 ns 100 slightly_scrambled/IntroSort/count:100 760 ns 729 ns 100 slightly_scrambled/QuickSort/count:1000 16685 ns 16664 ns 100 slightly_scrambled/QuickSort2/count:1000 12477 ns 12462 ns 100 slightly_scrambled/IntroSort/count:1000 9125 ns 9105 ns 100 slightly_scrambled/QuickSort/count:10000 247135 ns 246653 ns 100 slightly_scrambled/QuickSort2/count:10000 183518 ns 183063 ns 100 slightly_scrambled/IntroSort/count:10000 158241 ns 157415 ns 100 slightly_scrambled/QuickSort/count:100000 2975025 ns 2964542 ns 100 slightly_scrambled/QuickSort2/count:100000 2393465 ns 2389448 ns 100 slightly_scrambled/IntroSort/count:100000 2140478 ns 2137574 ns 100 slightly_scrambled/QuickSort/count:1000000 36245900 ns 36023047 ns 100 slightly_scrambled/QuickSort2/count:1000000 29002166 ns 28920883 ns 100 slightly_scrambled/IntroSort/count:1000000 25116044 ns 25034741 ns 100 sorted_blocks/QuickSort/count:100 1588 ns 1557 ns 100 sorted_blocks/QuickSort2/count:100 1377 ns 1369 ns 100 sorted_blocks/IntroSort/count:100 1195 ns 1171 ns 100 sorted_blocks/QuickSort/count:1000 25010 ns 24969 ns 100 sorted_blocks/QuickSort2/count:1000 19958 ns 19927 ns 100 sorted_blocks/IntroSort/count:1000 13229 ns 13200 ns 100 sorted_blocks/QuickSort/count:10000 296620 ns 295986 ns 100 sorted_blocks/QuickSort2/count:10000 246763 ns 246503 ns 100 sorted_blocks/IntroSort/count:10000 173911 ns 173822 ns 100 sorted_blocks/QuickSort/count:100000 4510684 ns 4474993 ns 100 sorted_blocks/QuickSort2/count:100000 3492714 ns 3479751 ns 100 sorted_blocks/IntroSort/count:100000 2468243 ns 2466781 ns 100 sorted_blocks/QuickSort/count:1000000 51631011 ns 51470454 ns 100 sorted_blocks/QuickSort2/count:1000000 41472336 ns 41323492 ns 100 sorted_blocks/IntroSort/count:1000000 29948990 ns 29886920 ns 100 single_big_swap/QuickSort/count:100 1408 ns 1378 ns 100 single_big_swap/QuickSort2/count:100 1124 ns 1097 ns 100 single_big_swap/IntroSort/count:100 1096 ns 1067 ns 100 single_big_swap/QuickSort/count:1000 16770 ns 16747 ns 100 single_big_swap/QuickSort2/count:1000 12769 ns 12753 ns 100 single_big_swap/IntroSort/count:1000 14975 ns 14945 ns 100 single_big_swap/QuickSort/count:10000 205373 ns 204738 ns 100 single_big_swap/QuickSort2/count:10000 153375 ns 153290 ns 100 single_big_swap/IntroSort/count:10000 211204 ns 210621 ns 100 single_big_swap/QuickSort/count:100000 2427136 ns 2416559 ns 100 single_big_swap/QuickSort2/count:100000 1853420 ns 1851812 ns 100 single_big_swap/IntroSort/count:100000 2900139 ns 2884030 ns 100 single_big_swap/QuickSort/count:1000000 29680453 ns 29600956 ns 100 single_big_swap/QuickSort2/count:1000000 22054106 ns 21979630 ns 100 single_big_swap/IntroSort/count:1000000 35043829 ns 34961055 ns 100 descending/QuickSort/count:100 1225 ns 1205 ns 100 descending/QuickSort2/count:100 979 ns 955 ns 100 descending/IntroSort/count:100 726 ns 703 ns 100 descending/QuickSort/count:1000 15322 ns 15138 ns 100 descending/QuickSort2/count:1000 11398 ns 11374 ns 100 descending/IntroSort/count:1000 8001 ns 7982 ns 100 descending/QuickSort/count:10000 187287 ns 187230 ns 100 descending/QuickSort2/count:10000 140713 ns 139754 ns 100 descending/IntroSort/count:10000 119207 ns 119157 ns 100 descending/QuickSort/count:100000 2251193 ns 2250138 ns 100 descending/QuickSort2/count:100000 1697955 ns 1697229 ns 100 descending/IntroSort/count:100000 1468175 ns 1467358 ns 100 descending/QuickSort/count:1000000 28025499 ns 27991254 ns 100 descending/QuickSort2/count:1000000 20643835 ns 20610794 ns 100 descending/IntroSort/count:1000000 17434312 ns 17380960 ns 100 pipe_organ/QuickSort/count:100 1347 ns 1325 ns 100 pipe_organ/QuickSort2/count:100 1068 ns 1052 ns 100 pipe_organ/IntroSort/count:100 712 ns 690 ns 100 pipe_organ/QuickSort/count:1000 15772 ns 15749 ns 100 pipe_organ/QuickSort2/count:1000 11923 ns 11899 ns 100 pipe_organ/IntroSort/count:1000 8073 ns 8059 ns 100 pipe_organ/QuickSort/count:10000 200075 ns 200011 ns 100 pipe_organ/QuickSort2/count:10000 153828 ns 153583 ns 100 pipe_organ/IntroSort/count:10000 121821 ns 121781 ns 100 pipe_organ/QuickSort/count:100000 2564621 ns 2558463 ns 100 pipe_organ/QuickSort2/count:100000 1958687 ns 1951100 ns 100 pipe_organ/IntroSort/count:100000 1516152 ns 1511432 ns 100 pipe_organ/QuickSort/count:1000000 30322479 ns 30281379 ns 100 pipe_organ/QuickSort2/count:1000000 23019458 ns 22963220 ns 100 pipe_organ/IntroSort/count:1000000 18033541 ns 17998660 ns 100 push_front/QuickSort/count:100 1657 ns 1627 ns 100 push_front/QuickSort2/count:100 1208 ns 1193 ns 100 push_front/IntroSort/count:100 1173 ns 1150 ns 100 push_front/QuickSort/count:1000 18850 ns 18774 ns 100 push_front/QuickSort2/count:1000 13694 ns 13669 ns 100 push_front/IntroSort/count:1000 15505 ns 15491 ns 100 push_front/QuickSort/count:10000 232070 ns 231695 ns 100 push_front/QuickSort2/count:10000 164370 ns 164085 ns 100 push_front/IntroSort/count:10000 223185 ns 223119 ns 100 push_front/QuickSort/count:100000 2786631 ns 2780806 ns 100 push_front/QuickSort2/count:100000 1981209 ns 1977728 ns 100 push_front/IntroSort/count:100000 2986261 ns 2979247 ns 100 push_front/QuickSort/count:1000000 32337658 ns 32237857 ns 100 push_front/QuickSort2/count:1000000 22995000 ns 22948793 ns 100 push_front/IntroSort/count:1000000 35586468 ns 35516490 ns 100 push_middle/QuickSort/count:100 1445 ns 1416 ns 100 push_middle/QuickSort2/count:100 1088 ns 1069 ns 100 push_middle/IntroSort/count:100 893 ns 868 ns 100 push_middle/QuickSort/count:1000 17087 ns 17063 ns 100 push_middle/QuickSort2/count:1000 27199 ns 12531 ns 100 push_middle/IntroSort/count:1000 11300 ns 11284 ns 100 push_middle/QuickSort/count:10000 210365 ns 210271 ns 100 push_middle/QuickSort2/count:10000 152367 ns 152240 ns 100 push_middle/IntroSort/count:10000 166248 ns 166154 ns 100 push_middle/QuickSort/count:100000 2529640 ns 2526924 ns 100 push_middle/QuickSort2/count:100000 1844729 ns 1842635 ns 100 push_middle/IntroSort/count:100000 2160489 ns 2158998 ns 100 push_middle/QuickSort/count:1000000 30219881 ns 30136359 ns 100 push_middle/QuickSort2/count:1000000 21940150 ns 21867966 ns 100 push_middle/IntroSort/count:1000000 25940856 ns 25884546 ns 100
  19. Stefan Glienke

    TParallelArray Sort Performance...

    I am not commenting on benchmark results without seeing the code - I have seen too many flawed benchmarks in my life. If you look into the implementation you see that it is using the most simple parallel quicksort imaginable - that means the first run over the entire array which arranges the elements according to the pivot (as in the TArray.Sort the most simple pivot selection - take the middle) runs single threaded. Only then the processing of left and right is being passed to the PPL. Anyone who has researched a bit into sorting algorithms in the past 20 years knows that a pure quicksort is hardly anything good enough to be used as general purpose sorting algo in a runtime - other languages using hybrids for years, .NET and some C++ runtimes use IntroSort, Python uses Timsort, so does Java for objects and some special version of QuickSort for primitive types, lately some languages such as Go adapted PDQSort as their default sorting algorithm. I'd argue that some of these better algorithms running single-threaded outperform parallel quicksort on certain and not so uncommon data patterns.
  20. Somehow I cannot believe that ZEOS did not properly support 64bit until now - you must have been using some old code
  21. Stefan Glienke

    TParallelArray Sort Performance...

    Quicksort with comparer calls compiled by a mediocre compiler operates far below L3 cache speed.
  22. Stefan Glienke

    Records as TDictionary keys

    Of course - TDictionary<K,V> is hashtable based, so it needs GetHashCode and Equals
  23. There should not be any temp, nor assign call tbh. Check out this code: {$APPTYPE CONSOLE} uses SysUtils; type TMyRec = record x: Integer; class operator Initialize(out dest: TMyRec); class operator Finalize(var dest: TMyRec); class operator Assign(var left: TMyRec; const[ref] right: TMyRec); end; function MyDefault: TMyRec; begin Writeln('default - ', IntToHex(IntPtr(@Result))); Result.x := 0; end; class operator TMyRec.Initialize(out dest: TMyRec); begin Writeln('init - ', IntToHex(IntPtr(@dest))); end; class operator TMyRec.Finalize(var dest: TMyRec); begin Writeln('finalize - ', IntToHex(IntPtr(@dest))); end; class operator TMyRec.Assign(var left: TMyRec; const[ref] right: TMyRec); begin Writeln('assign'); end; procedure Main; var r: TMyRec; begin r := MyDefault; end; begin Main; end. Now we can argue why there is no assign call because we are assigning the result of the MyDefault function to r. But even though we have declared an assign operator it does what it always does with managed return type - passing it as hidden var parameter. I said this before and I say it again - CMR are broken as designed and personally I stay the heck away from them. I remember some years ago there was a question on SO about struct default ctors serving as default initializers and why C# does not have them and I don't remember if it was Jon or Eric explained that it would cause all kinds of trouble. Edit: Ah, here it is: https://stackoverflow.com/a/333840/587106 - eventually they added them in C# 10 but there is quite some extensive language spec - when I asked for the CMR spec it was all crickets. FWIW the Delphi implementation suffers from exactly the situation that Jon describes - allocate a dynamic array of some CMR and watch all the initializers run. Which is completely bonkers - C# 10 does not do any of that for array allocation - see https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/proposals/csharp-10.0/parameterless-struct-constructors#array-allocation
  24. Already reported in 2021
  25. Stefan Glienke

    MAP2PDB - Profiling with VTune

    you can generate a dpk with the help of System.SysUtils.GetPackageInfo
×