Steve Maughan 26 Posted September 12 (edited) The new TParallelArray<T> seems like a drop-in replacement for TArray<T>. I tested the performance of the parallel sort algorithm on my Intel i9-12900HK laptop. This has 6 performance cores (with hyperthreading) and 8 efficient cores. Here are the results: One Billion Double Values: TArray<>: 183.8 seconds TParallelArray<>: 34.8 seconds Speed-up: x5.3 Five Million Double Values: TArray<>: 692 ms TParallelArray<>: 122 ms Speed-up: x5.7 500k Double Values: TArray<>: 65 ms TParallelArray<>: 18 ms Speed-up: x3.6 This speed up seems quite good to me. — Steve Edited September 12 by Steve Maughan Share this post Link to post
Anders Melander 1783 Posted September 12 There's something fishy with those numbers. I would have expected the 500K test to have better throughput than the 5M and 1B tests since the 500K array can fit in the 24Mb cache while the 5M and 1B arrays cannot. Did you do a warm-up run before the benchmark to get the thread pool spun up? How many iterations did you do on each test? Share this post Link to post
Steve Maughan 26 Posted September 12 I've just repeated the 500k test and there is indeed quite a bit of variation across the parallel times. They range from 10 ms to 25 ms. The single core times are quite consistent at 63 ms to 65 ms. — Steve Share this post Link to post
Anders Melander 1783 Posted September 12 You didn't answer any of my questions... Regardless, would you care to publish your test code so we can see how the sausages got made? Share this post Link to post
DelphiUdIT 176 Posted September 12 I try this test (changing the lengths of the array) with an Intel I9-14900HX: {$APPTYPE CONSOLE} {$R *.res} uses System.SysUtils, System.Classes, System.Generics.Collections, System.Threading, System.Diagnostics, WinApi.MMSystem; var SSortArray, PSortArray: TArray<double>; I: Integer; T: TStopwatch; begin try SetLength(SSortArray, 1_000_000_000); SetLength(PSortArray, 1_000_000_000); Writeln('Fill array (double), elements count: ', Length(SSortArray).ToString); for I := Low(SSortArray) to High(SSortArray) do begin SSortArray[I] := Random(MaxInt); PSortArray[I] := SSortArray[I]; end; // Boot up t. pool TTask.Create( procedure begin ; end).Start; Writeln('Start sorting ...'); // Serial sorting T := TStopwatch.StartNew; TArray.Sort<double>(SSortArray); T.Stop; Writeln('TArray.Sort (ms.): ', T.ElapsedMilliseconds.ToString); //Insertion of timeBegin... / timeEnd... make the parallel process more fast (10% more) //timeBeginPeriod(1); // Parallel sorting T := TStopwatch.StartNew; TParallelArray.Sort<double>(PSortArray); T.Stop; Writeln('TParallelArray.Sort (ms.): ', T.ElapsedMilliseconds.ToString); //timeEndPeriod(1); except on E: Exception do Writeln(E.ClassName, ': ', E.Message); end; SetLength(SSortArray, 0); SetLength(PSortArray, 0); Readln; Share this post Link to post
Stefan Glienke 2002 Posted September 13 5 hours ago, Anders Melander said: I would have expected the 500K test to have better throughput than the 5M and 1B tests since the 500K array can fit in the 24Mb cache while the 5M and 1B arrays cannot. Quicksort with comparer calls compiled by a mediocre compiler operates far below L3 cache speed. Share this post Link to post
Anders Melander 1783 Posted September 13 12 minutes ago, Stefan Glienke said: Quicksort with comparer calls compiled by a mediocre compiler operates far below L3 cache speed. Yes but that's the same for all three tests. The smaller array should still benefit the most from the cache. To me the posted results indicate that there's a lot of per-sort overhead somewhere. Share this post Link to post
Stefan Glienke 2002 Posted September 13 (edited) I am not commenting on benchmark results without seeing the code - I have seen too many flawed benchmarks in my life. If you look into the implementation you see that it is using the most simple parallel quicksort imaginable - that means the first run over the entire array which arranges the elements according to the pivot (as in the TArray.Sort the most simple pivot selection - take the middle) runs single threaded. Only then the processing of left and right is being passed to the PPL. Anyone who has researched a bit into sorting algorithms in the past 20 years knows that a pure quicksort is hardly anything good enough to be used as general purpose sorting algo in a runtime - other languages using hybrids for years, .NET and some C++ runtimes use IntroSort, Python uses Timsort, so does Java for objects and some special version of QuickSort for primitive types, lately some languages such as Go adapted PDQSort as their default sorting algorithm. I'd argue that some of these better algorithms running single-threaded outperform parallel quicksort on certain and not so uncommon data patterns. Edited September 13 by Stefan Glienke 2 Share this post Link to post
Tommi Prami 130 Posted September 13 4 hours ago, Stefan Glienke said: Anyone who has researched a bit into sorting algorithms in the past 20 years knows that a pure quicksort is hardly anything good enough to be used as general purpose sorting algo in a runtime Average case has most likely very small amount of items to sort. I think I saw somewhere that Bubble Sort would be faster than QuickSort on very small item counts. Not sure tough. But better is always better and faster is faster, PDQSort was quite interesting, quick look had interesting cases when it is very fast. Seems that it has no really bad worst cases either. Did not find very good comparison against many other algorithms. Anyhow getting better default sort algorithms to RTL most likely would not hurt much 🙂 -Tee- Share this post Link to post
Eric Grange 11 Posted September 13 (edited) @Stefan Glienke I've been testing Introsort and other variants as well, they're more resilient to edge cases, but in the typical case, I've found them to be quite underwhelming tbh in Delphi. In C++ that's different, templates vs generics + loop unrolling give a definitive edge. On a slower processor (i7-1165G7), for 5 millions values: - TArray.Sort: 641 ms - TArray.IntroSort: 673 ms - Quicksort (non generic, single-thread): 450 ms - TParallelArray.Sort: 176 ms So 3.6x speedup over TArray.Sort for a quad-core, which is fair. Overhead of generics is about 40%. On larger array sizes (like 1T), timing ratios are similar, but the whole machine became unusably slow during the TParallelArray.Sort run, so that's something to keep in mind (don't use it to sort large arrays in a GUI app where you need to keep things snappy) Edited September 13 by Eric Grange 1 Share this post Link to post
Stefan Glienke 2002 Posted September 13 (edited) Depends on your IntroSort implementation - the one in Spring is usually faster than RTL quicksort. (*) But this also is a result of it using branchless comparers as far as possible. Also, there is a reason why usually benchmarks for sorting algorithms don't only test on complete random data but also on several patterns. Because in reality, you often deal with things such as mostly sorted with a few unsorted elements at the end or reversed sorted. (*) here are the benchmark results for RTL QuickSort vs RTL QuickSort with Spring4d Comparer vs Spring4d IntroSort for several data patterns - you can easily see how the branchless compare affects the performance quite significantly -------------------------------------------------------------------------------------- Benchmark Time CPU Iterations -------------------------------------------------------------------------------------- random/QuickSort/count:100 2463 ns 2435 ns 100 random/QuickSort2/count:100 1802 ns 1782 ns 100 random/IntroSort/count:100 1420 ns 1395 ns 100 random/QuickSort/count:1000 50209 ns 50189 ns 100 random/QuickSort2/count:1000 27645 ns 27612 ns 100 random/IntroSort/count:1000 18489 ns 18468 ns 100 random/QuickSort/count:10000 836172 ns 835798 ns 100 random/QuickSort2/count:10000 728400 ns 727977 ns 100 random/IntroSort/count:10000 523833 ns 523565 ns 100 random/QuickSort/count:100000 9855657 ns 9850263 ns 100 random/QuickSort2/count:100000 8918812 ns 8912401 ns 100 random/IntroSort/count:100000 6839316 ns 6829126 ns 100 random/QuickSort/count:1000000 119455635 ns 119330369 ns 100 random/QuickSort2/count:1000000 104800849 ns 104696480 ns 100 random/IntroSort/count:1000000 81749401 ns 81660293 ns 100 few_unique/QuickSort/count:100 2249 ns 2251 ns 100 few_unique/QuickSort2/count:100 1632 ns 1593 ns 100 few_unique/IntroSort/count:100 1231 ns 1202 ns 100 few_unique/QuickSort/count:1000 35486 ns 35454 ns 100 few_unique/QuickSort2/count:1000 25086 ns 25038 ns 100 few_unique/IntroSort/count:1000 13094 ns 13071 ns 100 few_unique/QuickSort/count:10000 515193 ns 514924 ns 100 few_unique/QuickSort2/count:10000 371173 ns 370283 ns 100 few_unique/IntroSort/count:10000 285188 ns 285036 ns 100 few_unique/QuickSort/count:100000 5664379 ns 5657759 ns 100 few_unique/QuickSort2/count:100000 4348078 ns 4341859 ns 100 few_unique/IntroSort/count:100000 3164370 ns 3162498 ns 100 few_unique/QuickSort/count:1000000 63121983 ns 63076941 ns 100 few_unique/QuickSort2/count:1000000 43872122 ns 43809752 ns 100 few_unique/IntroSort/count:1000000 30719211 ns 30651572 ns 100 two_values/QuickSort/count:100 1538 ns 1514 ns 100 two_values/QuickSort2/count:100 1205 ns 1181 ns 100 two_values/IntroSort/count:100 781 ns 768 ns 100 two_values/QuickSort/count:1000 19267 ns 19242 ns 100 two_values/QuickSort2/count:1000 15309 ns 15261 ns 100 two_values/IntroSort/count:1000 9174 ns 9146 ns 100 two_values/QuickSort/count:10000 281462 ns 276065 ns 100 two_values/QuickSort2/count:10000 206943 ns 206818 ns 100 two_values/IntroSort/count:10000 150871 ns 150572 ns 100 two_values/QuickSort/count:100000 3483592 ns 3467751 ns 100 two_values/QuickSort2/count:100000 2574497 ns 2555053 ns 100 two_values/IntroSort/count:100000 1947766 ns 1943228 ns 100 two_values/QuickSort/count:1000000 39543461 ns 39439575 ns 100 two_values/QuickSort2/count:1000000 29565091 ns 29524866 ns 100 two_values/IntroSort/count:1000000 22888036 ns 22847673 ns 100 all_equal/QuickSort/count:100 1453 ns 1424 ns 100 all_equal/QuickSort2/count:100 1152 ns 1135 ns 100 all_equal/IntroSort/count:100 735 ns 718 ns 100 all_equal/QuickSort/count:1000 16442 ns 16424 ns 100 all_equal/QuickSort2/count:1000 12609 ns 12588 ns 100 all_equal/IntroSort/count:1000 8274 ns 8253 ns 100 all_equal/QuickSort/count:10000 214083 ns 213979 ns 100 all_equal/QuickSort2/count:10000 164971 ns 164921 ns 100 all_equal/IntroSort/count:10000 125590 ns 125514 ns 100 all_equal/QuickSort/count:100000 2819341 ns 2815702 ns 100 all_equal/QuickSort2/count:100000 2120029 ns 2117072 ns 100 all_equal/IntroSort/count:100000 1563758 ns 1560929 ns 100 all_equal/QuickSort/count:1000000 32668228 ns 32563853 ns 100 all_equal/QuickSort2/count:1000000 24450182 ns 24411351 ns 100 all_equal/IntroSort/count:1000000 18799427 ns 18758084 ns 100 ascending/QuickSort/count:100 1238 ns 1208 ns 100 ascending/QuickSort2/count:100 989 ns 962 ns 100 ascending/IntroSort/count:100 723 ns 700 ns 100 ascending/QuickSort/count:1000 15109 ns 15088 ns 100 ascending/QuickSort2/count:1000 11357 ns 11335 ns 100 ascending/IntroSort/count:1000 7988 ns 7969 ns 100 ascending/QuickSort/count:10000 187903 ns 187573 ns 100 ascending/QuickSort2/count:10000 140454 ns 140400 ns 100 ascending/IntroSort/count:10000 119619 ns 119090 ns 100 ascending/QuickSort/count:100000 2250330 ns 2249079 ns 100 ascending/QuickSort2/count:100000 1700410 ns 1699408 ns 100 ascending/IntroSort/count:100000 1470538 ns 1467097 ns 100 ascending/QuickSort/count:1000000 28118193 ns 28071460 ns 100 ascending/QuickSort2/count:1000000 20655432 ns 20619601 ns 100 ascending/IntroSort/count:1000000 17387674 ns 17366389 ns 100 slightly_scrambled/QuickSort/count:100 1260 ns 1221 ns 100 slightly_scrambled/QuickSort2/count:100 992 ns 976 ns 100 slightly_scrambled/IntroSort/count:100 760 ns 729 ns 100 slightly_scrambled/QuickSort/count:1000 16685 ns 16664 ns 100 slightly_scrambled/QuickSort2/count:1000 12477 ns 12462 ns 100 slightly_scrambled/IntroSort/count:1000 9125 ns 9105 ns 100 slightly_scrambled/QuickSort/count:10000 247135 ns 246653 ns 100 slightly_scrambled/QuickSort2/count:10000 183518 ns 183063 ns 100 slightly_scrambled/IntroSort/count:10000 158241 ns 157415 ns 100 slightly_scrambled/QuickSort/count:100000 2975025 ns 2964542 ns 100 slightly_scrambled/QuickSort2/count:100000 2393465 ns 2389448 ns 100 slightly_scrambled/IntroSort/count:100000 2140478 ns 2137574 ns 100 slightly_scrambled/QuickSort/count:1000000 36245900 ns 36023047 ns 100 slightly_scrambled/QuickSort2/count:1000000 29002166 ns 28920883 ns 100 slightly_scrambled/IntroSort/count:1000000 25116044 ns 25034741 ns 100 sorted_blocks/QuickSort/count:100 1588 ns 1557 ns 100 sorted_blocks/QuickSort2/count:100 1377 ns 1369 ns 100 sorted_blocks/IntroSort/count:100 1195 ns 1171 ns 100 sorted_blocks/QuickSort/count:1000 25010 ns 24969 ns 100 sorted_blocks/QuickSort2/count:1000 19958 ns 19927 ns 100 sorted_blocks/IntroSort/count:1000 13229 ns 13200 ns 100 sorted_blocks/QuickSort/count:10000 296620 ns 295986 ns 100 sorted_blocks/QuickSort2/count:10000 246763 ns 246503 ns 100 sorted_blocks/IntroSort/count:10000 173911 ns 173822 ns 100 sorted_blocks/QuickSort/count:100000 4510684 ns 4474993 ns 100 sorted_blocks/QuickSort2/count:100000 3492714 ns 3479751 ns 100 sorted_blocks/IntroSort/count:100000 2468243 ns 2466781 ns 100 sorted_blocks/QuickSort/count:1000000 51631011 ns 51470454 ns 100 sorted_blocks/QuickSort2/count:1000000 41472336 ns 41323492 ns 100 sorted_blocks/IntroSort/count:1000000 29948990 ns 29886920 ns 100 single_big_swap/QuickSort/count:100 1408 ns 1378 ns 100 single_big_swap/QuickSort2/count:100 1124 ns 1097 ns 100 single_big_swap/IntroSort/count:100 1096 ns 1067 ns 100 single_big_swap/QuickSort/count:1000 16770 ns 16747 ns 100 single_big_swap/QuickSort2/count:1000 12769 ns 12753 ns 100 single_big_swap/IntroSort/count:1000 14975 ns 14945 ns 100 single_big_swap/QuickSort/count:10000 205373 ns 204738 ns 100 single_big_swap/QuickSort2/count:10000 153375 ns 153290 ns 100 single_big_swap/IntroSort/count:10000 211204 ns 210621 ns 100 single_big_swap/QuickSort/count:100000 2427136 ns 2416559 ns 100 single_big_swap/QuickSort2/count:100000 1853420 ns 1851812 ns 100 single_big_swap/IntroSort/count:100000 2900139 ns 2884030 ns 100 single_big_swap/QuickSort/count:1000000 29680453 ns 29600956 ns 100 single_big_swap/QuickSort2/count:1000000 22054106 ns 21979630 ns 100 single_big_swap/IntroSort/count:1000000 35043829 ns 34961055 ns 100 descending/QuickSort/count:100 1225 ns 1205 ns 100 descending/QuickSort2/count:100 979 ns 955 ns 100 descending/IntroSort/count:100 726 ns 703 ns 100 descending/QuickSort/count:1000 15322 ns 15138 ns 100 descending/QuickSort2/count:1000 11398 ns 11374 ns 100 descending/IntroSort/count:1000 8001 ns 7982 ns 100 descending/QuickSort/count:10000 187287 ns 187230 ns 100 descending/QuickSort2/count:10000 140713 ns 139754 ns 100 descending/IntroSort/count:10000 119207 ns 119157 ns 100 descending/QuickSort/count:100000 2251193 ns 2250138 ns 100 descending/QuickSort2/count:100000 1697955 ns 1697229 ns 100 descending/IntroSort/count:100000 1468175 ns 1467358 ns 100 descending/QuickSort/count:1000000 28025499 ns 27991254 ns 100 descending/QuickSort2/count:1000000 20643835 ns 20610794 ns 100 descending/IntroSort/count:1000000 17434312 ns 17380960 ns 100 pipe_organ/QuickSort/count:100 1347 ns 1325 ns 100 pipe_organ/QuickSort2/count:100 1068 ns 1052 ns 100 pipe_organ/IntroSort/count:100 712 ns 690 ns 100 pipe_organ/QuickSort/count:1000 15772 ns 15749 ns 100 pipe_organ/QuickSort2/count:1000 11923 ns 11899 ns 100 pipe_organ/IntroSort/count:1000 8073 ns 8059 ns 100 pipe_organ/QuickSort/count:10000 200075 ns 200011 ns 100 pipe_organ/QuickSort2/count:10000 153828 ns 153583 ns 100 pipe_organ/IntroSort/count:10000 121821 ns 121781 ns 100 pipe_organ/QuickSort/count:100000 2564621 ns 2558463 ns 100 pipe_organ/QuickSort2/count:100000 1958687 ns 1951100 ns 100 pipe_organ/IntroSort/count:100000 1516152 ns 1511432 ns 100 pipe_organ/QuickSort/count:1000000 30322479 ns 30281379 ns 100 pipe_organ/QuickSort2/count:1000000 23019458 ns 22963220 ns 100 pipe_organ/IntroSort/count:1000000 18033541 ns 17998660 ns 100 push_front/QuickSort/count:100 1657 ns 1627 ns 100 push_front/QuickSort2/count:100 1208 ns 1193 ns 100 push_front/IntroSort/count:100 1173 ns 1150 ns 100 push_front/QuickSort/count:1000 18850 ns 18774 ns 100 push_front/QuickSort2/count:1000 13694 ns 13669 ns 100 push_front/IntroSort/count:1000 15505 ns 15491 ns 100 push_front/QuickSort/count:10000 232070 ns 231695 ns 100 push_front/QuickSort2/count:10000 164370 ns 164085 ns 100 push_front/IntroSort/count:10000 223185 ns 223119 ns 100 push_front/QuickSort/count:100000 2786631 ns 2780806 ns 100 push_front/QuickSort2/count:100000 1981209 ns 1977728 ns 100 push_front/IntroSort/count:100000 2986261 ns 2979247 ns 100 push_front/QuickSort/count:1000000 32337658 ns 32237857 ns 100 push_front/QuickSort2/count:1000000 22995000 ns 22948793 ns 100 push_front/IntroSort/count:1000000 35586468 ns 35516490 ns 100 push_middle/QuickSort/count:100 1445 ns 1416 ns 100 push_middle/QuickSort2/count:100 1088 ns 1069 ns 100 push_middle/IntroSort/count:100 893 ns 868 ns 100 push_middle/QuickSort/count:1000 17087 ns 17063 ns 100 push_middle/QuickSort2/count:1000 27199 ns 12531 ns 100 push_middle/IntroSort/count:1000 11300 ns 11284 ns 100 push_middle/QuickSort/count:10000 210365 ns 210271 ns 100 push_middle/QuickSort2/count:10000 152367 ns 152240 ns 100 push_middle/IntroSort/count:10000 166248 ns 166154 ns 100 push_middle/QuickSort/count:100000 2529640 ns 2526924 ns 100 push_middle/QuickSort2/count:100000 1844729 ns 1842635 ns 100 push_middle/IntroSort/count:100000 2160489 ns 2158998 ns 100 push_middle/QuickSort/count:1000000 30219881 ns 30136359 ns 100 push_middle/QuickSort2/count:1000000 21940150 ns 21867966 ns 100 push_middle/IntroSort/count:1000000 25940856 ns 25884546 ns 100 Edited September 13 by Stefan Glienke Share this post Link to post
Stefan Glienke 2002 Posted September 13 TParallelArray.Sort is broken by the way - try this: uses System.Threading, System.SysUtils; procedure Main; begin var data: TArray<Integer>; SetLength(data, 1000000); for var i := 0 to High(data) do data[i] := Random(16); TParallelArray.Sort<Integer>(data); end; begin try Main; except on E: Exception do Writeln(E.ClassName, ': ', E.Message); end; Writeln('done'); Readln; end. 1 Share this post Link to post
Lajos Juhász 293 Posted September 13 Stefan you should give at least 48 hours to a new version before you break it. It does handle array of 100_000 integers. Maybe for a larger arrays you have to buy a special SKU for sorting. 8 Share this post Link to post
Eric Grange 11 Posted September 13 I actually used an Introsort implementation you had posted sometime ago, but with the RTL comparare 🙂 However I can't really reproduce your benchmark advantage with latest Spring4D from your repository: in the same conditions as above, Spring4D's sort clocks at 609 ms for 5 million values, just 7% faster than RTL. . The code really goes in the IntroSort_Double methods and the Compare_Double comparer (I checked with the debugger), is there some option or setting required ? When the array is pre-sorted instead of random, the timings are: - RTL TArray.Sort : 265 ms - Spring4D IntroSort : 238 ms - non-generic QuickSort : 78 ms - TParallelArray.Sort: 100 ms So Spring4d gets somewhat faster compared to RTL, but not as a massively as the non-generic Quicksort. The other edge case is an array of 5 millions times the same value: - RTL TArray.Sort : 248 ms - Spring4D IntroSort : 229 ms - non-generic QuickSort : 102 ms - TParallelArray.Sort: CRASH with Stack Overflow !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Ok, this one looks like a death blow to the new parallel sort 😕 For reference, this is the non-generic QuickSort copy-pasta {$R-} {$O+} {$Q-} type TDoubleArray = array [0..MaxInt div 8 - 1] of Double; PDoubleArray = ^TDoubleArray; procedure QuickSort(pArray : PDoubleArray; minIndex, maxIndex : NativeInt); var i, j, p : NativeInt; begin repeat i := minIndex; j := maxIndex; p := (i+j) shr 1; repeat var pv := pArray[p]; while pArray[i] > pv do Inc(i); while pArray[j] < pv do Dec(j); if i <= j then begin var buf := pArray[i]; pArray[i] := pArray[j]; pArray[j] := buf; if p = i then p := j else if p = j then p := i; Inc(i); Dec(j); end; until i > j; if minIndex < j then QuickSort(pArray, minIndex, j); minIndex := i; until i >= maxIndex; end; 1 Share this post Link to post
Eric Grange 11 Posted September 13 Looks like parallel sort largest non-crash single value array size is somewhere between 10000 and 15000 actually at @Lajos Juhász Stefan actually went easy at it with a Random(16) 😉 Share this post Link to post
Anders Melander 1783 Posted September 13 49 minutes ago, Stefan Glienke said: here are the benchmark results [...] How do you get the benchmark results ordered by parameters rather than method? The way I use spring.benchmark I get the results ordered by method and then parameters which makes it hard to compare the different methods: procedure Benchmark(BenchmarkFunc: TFunction; const Name: string); begin Spring.Benchmark.Benchmark(BenchmarkFunc, Name).RangeMultiplier(4).Ranges([Range(1024+1, 8192+13), Range(128, 5120)]).TimeUnit(kMillisecond); end; //------------------------------------------------------------------------------ begin Benchmark(BenchmarkNoTranspose32, 'MemCopy (no transpose)'); Benchmark(BenchmarkReferenceTranspose32, 'ReferenceTranspose32'); Benchmark(BenchmarkCacheObliviousTranspose32, 'CacheObliviousTranspose32'); Benchmark(BenchmarkCacheObliviousTransposeEx32, 'CacheObliviousTransposeEx32'); Benchmark(BenchmarkSuperDuperTranspose32, 'SuperDuperTranspose32'); Spring.Benchmark.Benchmark_Main; end. ------------------------------------------------------------------------------------------------ Benchmark Time CPU Iterations UserCounters... ------------------------------------------------------------------------------------------------ MemCopy (no transpose)/1025/128 0,028 ms 0,025 ms 26353 Rate=5.26859G/s MemCopy (no transpose)/4096/128 0,153 ms 0,143 ms 4480 Rate=3.66644G/s MemCopy (no transpose)/8205/128 0,616 ms 0,578 ms 1000 Rate=1.81663G/s MemCopy (no transpose)/1025/256 0,064 ms 0,049 ms 11200 Rate=5.37395G/s MemCopy (no transpose)/4096/256 0,596 ms 0,502 ms 1120 Rate=2.08783G/s MemCopy (no transpose)/8205/256 1,30 ms 1,03 ms 560 Rate=2.03463G/s MemCopy (no transpose)/1025/1024 0,590 ms 0,558 ms 1120 Rate=1.88088G/s MemCopy (no transpose)/4096/1024 2,56 ms 2,25 ms 299 Rate=1.86656G/s MemCopy (no transpose)/8205/1024 5,48 ms 4,26 ms 154 Rate=1.97165G/s MemCopy (no transpose)/1025/4096 2,63 ms 2,08 ms 345 Rate=2.01523G/s MemCopy (no transpose)/4096/4096 10,9 ms 10,0 ms 64 Rate=1.67608G/s MemCopy (no transpose)/8205/4096 23,6 ms 22,6 ms 29 Rate=1.48514G/s MemCopy (no transpose)/1025/5120 3,36 ms 2,85 ms 236 Rate=1.84339G/s MemCopy (no transpose)/4096/5120 14,3 ms 12,3 ms 56 Rate=1.70823G/s MemCopy (no transpose)/8205/5120 29,3 ms 23,8 ms 21 Rate=1.7644G/s ReferenceTranspose32/1025/128 0,345 ms 0,322 ms 2036 Rate=407.045M/s ReferenceTranspose32/4096/128 1,49 ms 1,35 ms 498 Rate=388.607M/s ReferenceTranspose32/8205/128 3,84 ms 3,07 ms 224 Rate=342.187M/s ReferenceTranspose32/1025/256 0,806 ms 0,628 ms 1120 Rate=417.974M/s ReferenceTranspose32/4096/256 3,91 ms 3,23 ms 213 Rate=324.868M/s ReferenceTranspose32/8205/256 28,4 ms 25,7 ms 28 Rate=81.8274M/s ReferenceTranspose32/1025/1024 7,00 ms 6,77 ms 90 Rate=155.018M/s ReferenceTranspose32/4096/1024 97,6 ms 93,8 ms 7 Rate=44.7392M/s ReferenceTranspose32/8205/1024 184 ms 168 ms 4 Rate=50.0207M/s ReferenceTranspose32/1025/4096 33,4 ms 32,8 ms 20 Rate=127.951M/s ReferenceTranspose32/4096/4096 367 ms 336 ms 2 Rate=49.9415M/s ReferenceTranspose32/8205/4096 720 ms 656 ms 1 Rate=51.2117M/s ReferenceTranspose32/1025/5120 44,2 ms 41,4 ms 17 Rate=126.885M/s ReferenceTranspose32/4096/5120 459 ms 391 ms 2 Rate=53.6871M/s ReferenceTranspose32/8205/5120 969 ms 781 ms 1 Rate=53.7723M/s CacheObliviousTranspose32/1025/128 0,326 ms 0,285 ms 2800 Rate=461.001M/s CacheObliviousTranspose32/4096/128 1,37 ms 1,34 ms 560 Rate=391.468M/s [...] Share this post Link to post
Anders Melander 1783 Posted September 13 9 minutes ago, Eric Grange said: TParallelArray.Sort: CRASH with Stack Overflow But it crashed faster and fast is better, right? Right? If only there was some easy way of getting stuff like this tested before release... I mean, come on, we all make bugs but this is simply not acceptable. 2 Share this post Link to post
Stefan Glienke 2002 Posted September 13 (edited) This is the benchmark code - previous results I posted were with TTestType = Integer, but using Double shows similar results (I ran the benchmark as 64bit release) program SortBench; {$APPTYPE CONSOLE} uses System.Generics.Collections, Spring, Spring.Comparers, Spring.Benchmark, System.SysUtils; type TTestType = Double; TDistFunc = function(size: Integer): TArray<TTestType>; TSortFunc = procedure(var values: array of TTestType); function IntToTestType(i: Integer): TTestType; begin Result := i; end; function Equals(const left, right: TTestType): Boolean; inline; begin Result := left = right; end; function Shuffled(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := 0 to size - 1 do Result[i] := IntToTestType(i); Spring.TArray.Shuffle<TTestType>(Result); end; function Shuffled_16Values(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := 0 to size - 1 do Result[i] := IntToTestType(Random(16)); end; function TwoValues(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := 0 to size - 1 do Result[i] := IntToTestType(Random(2)); end; function AllEqual(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := 0 to size - 1 do Result[i] := IntToTestType(0); end; function Ascending(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := 0 to size - 1 do Result[i] := IntToTestType(i); end; function SlightlyScrambled(size: Integer): TArray<TTestType>; const scramblePercentage = 5; var i, idx1, idx2: Integer; temp: TTestType; scrambleCount: Integer; begin SetLength(Result, size); for i := 0 to size - 1 do Result[i] := IntToTestType(i); scrambleCount := size * scramblePercentage div 100; for i := 1 to scrambleCount do begin idx1 := Random(size); idx2 := Random(size); temp := Result[idx1]; Result[idx1] := Result[idx2]; Result[idx2] := temp; end; end; function SortedBlocks(size: Integer): TArray<TTestType>; type TBlock100 = array[1..100] of TTestType; TBlock10 = array[1..10] of TTestType; var i: Integer; begin SetLength(Result, size); for i := 0 to size - 1 do Result[i] := IntToTestType(i); if size > 100 then Spring.TArray.Shuffle<TBlock100>(Pointer(Result), (size div (SizeOf(TBlock100) div SizeOf(TTestType))) - 1) else Spring.TArray.Shuffle<TBlock10>(Pointer(Result), (size div (SizeOf(TBlock10) div SizeOf(TTestType))) - 1); end; function SingleBigSwap(size: Integer): TArray<TTestType>; var i: Integer; halfSize: Integer; begin SetLength(Result, size); halfSize := size div 2; for i := 0 to halfSize - 1 do Result[i] := IntToTestType(i + halfSize); for i := halfSize to size -1 do Result[i] := IntToTestType(i - halfSize); end; function Descending(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := size-1 downto 0 do Result[i] := IntToTestType(i); end; function PipeOrgan(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := 0 to (size div 2) - 1 do Result[i] := IntToTestType(i); for i := size div 2 to size -1 do Result[i] := IntToTestType(size - 1); end; function PushFront(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := 1 to size - 1 do Result[i - 1] := IntToTestType(i); Result[size - 1] := IntToTestType(0); end; function PushMiddle(size: Integer): TArray<TTestType>; var i: Integer; begin SetLength(Result, size); for i := 0 to (size div 2) - 1 do Result[i] := IntToTestType(i); for i := (size div 2) + 1 to size -1 do Result[i - 1] := IntToTestType(i); Result[size - 1] := IntToTestType(size div 2); end; procedure QuickSort(var values: array of TTestType); begin System.Generics.Collections.TArray.Sort<TTestType>(values); end; procedure QuickSort_SpringComparer(var values: array of TTestType); begin System.Generics.Collections.TArray.Sort<TTestType>(values , Spring.Comparers.TComparer<TTestType>.Default ); end; procedure IntroSort(var values: array of TTestType); begin Spring.TArray.Sort<TTestType>(values); end; const Distributions: array[0..11] of record name: string; func: TDistFunc end = ( (name: 'random'; func: Shuffled), (name: 'few_unique'; func: Shuffled_16Values), (name: 'two_values'; func: TwoValues), (name: 'all_equal'; func: AllEqual), (name: 'ascending'; func: Ascending), (name: 'slightly_scrambled'; func: SlightlyScrambled), (name: 'sorted_blocks'; func: SortedBlocks), (name: 'single_big_swap'; func: SingleBigSwap), (name: 'descending'; func: Descending), (name: 'pipe_organ'; func: PipeOrgan), (name: 'push_front'; func: PushFront), (name: 'push_middle'; func: PushMiddle) ); Sorts: array[0..2] of record name: string; func: TSortFunc end = ( (name: 'QuickSort'; func: QuickSort) ,(name: 'QuickSort2'; func: QuickSort_SpringComparer) ,(name: 'IntroSort'; func: IntroSort) ); Sizes: array of Integer = [100, 1000, 10000, 100000, 1000000]; procedure BenchSort(const state: TState); begin var distIdx := state[0]; var sortIdx := state[1]; var size := state[2]; var distribution: TDistFunc := Distributions[distIdx].func; var sort: TSortFunc := Sorts[sortIdx].func; var data := distribution(size); var copy: TArray<TTestType>; SetLength(copy, size); for var _ in state do begin state.PauseTiming; if IsManagedType(TTestType) then MoveManaged(data, copy, TypeInfo(TTestType), size) else Move(data[0], copy[0], size * SizeOf(TTestType)); state.ResumeTiming; sort(copy); end; QuickSort(data); for var i := 0 to high(copy) do Assert(Equals(copy[i], data[i]), Distributions[distIdx].Name + '/' + Sorts[sortIdx].Name + '/' + size.ToString); end; procedure Main; begin benchmark_format_args := False; for var distIdx := Low(distributions) to High(Distributions) do for var size in Sizes do for var sortIdx := Low(Sorts) to High(Sorts) do begin var bm := Benchmark(BenchSort, Distributions[distIdx].name + '/' + Sorts[sortIdx].name + '/count:' + size.ToString) .Args([distIdx, sortIdx, size]); bm.Iterations(100); end; Benchmark_Main; end; begin try Main; except on E: Exception do Writeln(E.ClassName, ': ', E.Message); end; Readln; end. Edited September 13 by Stefan Glienke 1 Share this post Link to post
DelphiUdIT 176 Posted September 13 I do test with 1 bilions of elements, and there were no issue, but the test that Stefan (the first post, not the last) did was with limited value (only 16 values in 1 milions of elements). There are sure an issue with recursive and stack allocation (or may be the algorithm in the ICompare method). Try to modify the maximum stack allocation and vary the Random value to 24 (Random(24)) and all is running. Of course this is anyway an issue Share this post Link to post
Stefan Glienke 2002 Posted September 13 Just now, DelphiUdIT said: There are sure an issue with recursive and stack allocation (or may be the algorithm in the ICompare method). The issue is that in the Parallel implementation there is no mitigation against QuickSorts well known worst case. Share this post Link to post
DelphiUdIT 176 Posted September 13 (edited) 13 minutes ago, Stefan Glienke said: The issue is that in the Parallel implementation there is no mitigation against QuickSorts well known worst case. If this is so known and clear, Embarcadero was very "light" in proposing such a solution ... let's hope they propose another solution because the times with parallel processing are definitely interesting. Edited September 13 by DelphiUdIT Share this post Link to post
Stefan Glienke 2002 Posted September 13 1 hour ago, DelphiUdIT said: the times with parallel processing are definitely interesting. To me they are not - if you have some backend server code that is already doing multiple things in parallel it won't get any better if you then do parallel sort. Share this post Link to post
Eric Grange 11 Posted September 13 @Stefan Glienke could you try reproducing by invoking your sort calls directly in the minimalistic benchmark code posted in the thread beginning ? Share this post Link to post
Stefan Glienke 2002 Posted September 13 (edited) Benchmark code: {$APPTYPE CONSOLE} uses System.Diagnostics, System.Generics.Collections, System.SysUtils, Spring; procedure Test(count: NativeInt); var SSortArray, PSortArray: TArray<double>; I: Integer; T: TStopwatch; begin try SetLength(SSortArray, count); SetLength(PSortArray, count); Writeln('Fill array (double), elements count: ', count); for I := Low(SSortArray) to High(SSortArray) do begin SSortArray[I] := Random(MaxInt); PSortArray[I] := SSortArray[I]; end; Writeln('Start sorting ...'); T := TStopwatch.StartNew; System.Generics.Collections.TArray.Sort<double>(SSortArray); T.Stop; Writeln('RTL TArray.Sort (ms.): ', T.ElapsedMilliseconds.ToString); T := TStopwatch.StartNew; Spring.TArray.Sort<double>(PSortArray); T.Stop; Writeln('Spring TArray.Sort (ms.): ', T.ElapsedMilliseconds.ToString); except on E: Exception do Writeln(E.ClassName, ': ', E.Message); end; Writeln; end; begin Test(500_000); Test(5_000_000); Test(100_000_000); Readln; end. Results (win64 release) on i7-12700: Fill array (double), elements count: 500000 Start sorting ... RTL TArray.Sort (ms.): 49 Spring TArray.Sort (ms.): 33 Fill array (double), elements count: 5000000 Start sorting ... RTL TArray.Sort (ms.): 560 Spring TArray.Sort (ms.): 393 Fill array (double), elements count: 100000000 Start sorting ... RTL TArray.Sort (ms.): 13053 Spring TArray.Sort (ms.): 9351 Edited September 13 by Stefan Glienke Share this post Link to post
Stefan Glienke 2002 Posted September 13 (edited) Here is something else to tease you all a bit - PDQSort is using comparer, PDQSort2 does not - others as before QuickSort2 is using the Spring Comparer - test type is Double: Looks like I am getting twice the speed at worst and 50 times at best. For faster to compare types such as Integer the gain goes up to almost 100 times. -------------------------------------------------------------------------------------- Benchmark Time CPU Iterations -------------------------------------------------------------------------------------- random/QuickSort/count:100 2917 ns 2886 ns 100 random/QuickSort2/count:100 2301 ns 2284 ns 100 random/IntroSort/count:100 1582 ns 1566 ns 100 random/PDQSort/count:100 1519 ns 1504 ns 100 random/PDQSort2/count:100 759 ns 746 ns 100 random/QuickSort/count:1000 48127 ns 47750 ns 100 random/QuickSort2/count:1000 26820 ns 26615 ns 100 random/IntroSort/count:1000 16349 ns 16157 ns 100 random/PDQSort/count:1000 24101 ns 23928 ns 100 random/PDQSort2/count:1000 6485 ns 6442 ns 100 random/QuickSort/count:10000 686240 ns 685846 ns 100 random/QuickSort2/count:10000 621977 ns 618258 ns 100 random/IntroSort/count:10000 426604 ns 426023 ns 100 random/PDQSort/count:10000 445646 ns 443871 ns 100 random/PDQSort2/count:10000 338404 ns 336588 ns 100 random/QuickSort/count:100000 8252128 ns 8220957 ns 100 random/QuickSort2/count:100000 7492506 ns 7371508 ns 100 random/IntroSort/count:100000 5723208 ns 5578393 ns 100 random/PDQSort/count:100000 5675312 ns 5650503 ns 100 random/PDQSort2/count:100000 4504745 ns 4482480 ns 100 random/QuickSort/count:1000000 98163494 ns 97011011 ns 100 random/QuickSort2/count:1000000 88461390 ns 87306397 ns 100 random/IntroSort/count:1000000 67687293 ns 66648855 ns 100 random/PDQSort/count:1000000 69019337 ns 68147235 ns 100 random/PDQSort2/count:1000000 55628991 ns 54791149 ns 100 few_unique/QuickSort/count:100 2356 ns 2314 ns 100 few_unique/QuickSort2/count:100 1540 ns 1500 ns 100 few_unique/IntroSort/count:100 1291 ns 1275 ns 100 few_unique/PDQSort/count:100 1559 ns 1483 ns 100 few_unique/PDQSort2/count:100 689 ns 674 ns 100 few_unique/QuickSort/count:1000 29073 ns 29009 ns 100 few_unique/QuickSort2/count:1000 16086 ns 16026 ns 100 few_unique/IntroSort/count:1000 11084 ns 11031 ns 100 few_unique/PDQSort/count:1000 8022 ns 7984 ns 100 few_unique/PDQSort2/count:1000 4015 ns 3983 ns 100 few_unique/QuickSort/count:10000 424766 ns 424046 ns 100 few_unique/QuickSort2/count:10000 292291 ns 288395 ns 100 few_unique/IntroSort/count:10000 224547 ns 222666 ns 100 few_unique/PDQSort/count:10000 145437 ns 144788 ns 100 few_unique/PDQSort2/count:10000 102878 ns 102414 ns 100 few_unique/QuickSort/count:100000 4654357 ns 4640420 ns 100 few_unique/QuickSort2/count:100000 3436405 ns 3421938 ns 100 few_unique/IntroSort/count:100000 2649901 ns 2639618 ns 100 few_unique/PDQSort/count:100000 1487662 ns 1479095 ns 100 few_unique/PDQSort2/count:100000 1172432 ns 1167015 ns 100 few_unique/QuickSort/count:1000000 51130249 ns 50537573 ns 100 few_unique/QuickSort2/count:1000000 38627555 ns 38033922 ns 100 few_unique/IntroSort/count:1000000 30907231 ns 30808893 ns 100 few_unique/PDQSort/count:1000000 14733627 ns 14572366 ns 100 few_unique/PDQSort2/count:1000000 12287426 ns 12233176 ns 100 two_values/QuickSort/count:100 2115 ns 2099 ns 100 two_values/QuickSort2/count:100 1601 ns 1558 ns 100 two_values/IntroSort/count:100 1058 ns 1037 ns 100 two_values/PDQSort/count:100 714 ns 699 ns 100 two_values/PDQSort2/count:100 395 ns 377 ns 100 two_values/QuickSort/count:1000 18838 ns 18787 ns 100 two_values/QuickSort2/count:1000 15509 ns 15468 ns 100 two_values/IntroSort/count:1000 10009 ns 9971 ns 100 two_values/PDQSort/count:1000 3434 ns 3397 ns 100 two_values/PDQSort2/count:1000 1715 ns 1673 ns 100 two_values/QuickSort/count:10000 310630 ns 306744 ns 100 two_values/QuickSort2/count:10000 221981 ns 214977 ns 100 two_values/IntroSort/count:10000 148667 ns 148595 ns 100 two_values/PDQSort/count:10000 42702 ns 42577 ns 100 two_values/PDQSort2/count:10000 24225 ns 24032 ns 100 two_values/QuickSort/count:100000 3353827 ns 3342019 ns 100 two_values/QuickSort2/count:100000 2487005 ns 2476672 ns 100 two_values/IntroSort/count:100000 1870906 ns 1863767 ns 100 two_values/PDQSort/count:100000 541802 ns 540355 ns 100 two_values/PDQSort2/count:100000 378254 ns 378023 ns 100 two_values/QuickSort/count:1000000 40114855 ns 39440910 ns 100 two_values/QuickSort2/count:1000000 29158407 ns 28744654 ns 100 two_values/IntroSort/count:1000000 23100678 ns 22744457 ns 100 two_values/PDQSort/count:1000000 5633201 ns 5601282 ns 100 two_values/PDQSort2/count:1000000 3762100 ns 3740864 ns 100 all_equal/QuickSort/count:100 2151 ns 2136 ns 100 all_equal/QuickSort2/count:100 1647 ns 1618 ns 100 all_equal/IntroSort/count:100 1058 ns 1040 ns 100 all_equal/PDQSort/count:100 570 ns 526 ns 100 all_equal/PDQSort2/count:100 326 ns 310 ns 100 all_equal/QuickSort/count:1000 17604 ns 17531 ns 100 all_equal/QuickSort2/count:1000 12558 ns 12503 ns 100 all_equal/IntroSort/count:1000 8658 ns 8607 ns 100 all_equal/PDQSort/count:1000 2501 ns 2517 ns 100 all_equal/PDQSort2/count:1000 1211 ns 1200 ns 100 all_equal/QuickSort/count:10000 229287 ns 228053 ns 100 all_equal/QuickSort2/count:10000 168158 ns 167545 ns 100 all_equal/IntroSort/count:10000 123868 ns 122653 ns 100 all_equal/PDQSort/count:10000 17648 ns 17566 ns 100 all_equal/PDQSort2/count:10000 6898 ns 6844 ns 100 all_equal/QuickSort/count:100000 3089106 ns 2839190 ns 100 all_equal/QuickSort2/count:100000 1991645 ns 1983108 ns 100 all_equal/IntroSort/count:100000 1517401 ns 1509352 ns 100 all_equal/PDQSort/count:100000 168092 ns 167526 ns 100 all_equal/PDQSort2/count:100000 64056 ns 63376 ns 100 all_equal/QuickSort/count:1000000 31780316 ns 31401768 ns 100 all_equal/QuickSort2/count:1000000 23670745 ns 23308078 ns 100 all_equal/IntroSort/count:1000000 18492352 ns 18165700 ns 100 all_equal/PDQSort/count:1000000 1693063 ns 1685270 ns 100 all_equal/PDQSort2/count:1000000 667052 ns 665049 ns 100 ascending/QuickSort/count:100 1645 ns 1629 ns 100 ascending/QuickSort2/count:100 1249 ns 1233 ns 100 ascending/IntroSort/count:100 993 ns 956 ns 100 ascending/PDQSort/count:100 524 ns 510 ns 100 ascending/PDQSort2/count:100 290 ns 276 ns 100 ascending/QuickSort/count:1000 15279 ns 15237 ns 100 ascending/QuickSort2/count:1000 11455 ns 11389 ns 100 ascending/IntroSort/count:1000 7931 ns 7802 ns 100 ascending/PDQSort/count:1000 2618 ns 2572 ns 100 ascending/PDQSort2/count:1000 1199 ns 1169 ns 100 ascending/QuickSort/count:10000 186722 ns 185998 ns 100 ascending/QuickSort2/count:10000 134269 ns 134014 ns 100 ascending/IntroSort/count:10000 106604 ns 106492 ns 100 ascending/PDQSort/count:10000 19298 ns 19236 ns 100 ascending/PDQSort2/count:10000 6002 ns 5953 ns 100 ascending/QuickSort/count:100000 2229729 ns 2219903 ns 100 ascending/QuickSort2/count:100000 1655304 ns 1647404 ns 100 ascending/IntroSort/count:100000 1298027 ns 1291995 ns 100 ascending/PDQSort/count:100000 190239 ns 189988 ns 100 ascending/PDQSort2/count:100000 53607 ns 53501 ns 100 ascending/QuickSort/count:1000000 27509633 ns 27165759 ns 100 ascending/QuickSort2/count:1000000 19985236 ns 19668315 ns 100 ascending/IntroSort/count:1000000 15326770 ns 15254001 ns 100 ascending/PDQSort/count:1000000 1901178 ns 1890171 ns 100 ascending/PDQSort2/count:1000000 629675 ns 628905 ns 100 slightly_scrambled/QuickSort/count:100 1750 ns 1732 ns 100 slightly_scrambled/QuickSort2/count:100 1373 ns 1343 ns 100 slightly_scrambled/IntroSort/count:100 985 ns 968 ns 100 slightly_scrambled/PDQSort/count:100 1012 ns 997 ns 100 slightly_scrambled/PDQSort2/count:100 471 ns 454 ns 100 slightly_scrambled/QuickSort/count:1000 16791 ns 16735 ns 100 slightly_scrambled/QuickSort2/count:1000 12824 ns 12776 ns 100 slightly_scrambled/IntroSort/count:1000 10458 ns 10399 ns 100 slightly_scrambled/PDQSort/count:1000 16268 ns 15887 ns 100 slightly_scrambled/PDQSort2/count:1000 4547 ns 4494 ns 100 slightly_scrambled/QuickSort/count:10000 235073 ns 233521 ns 100 slightly_scrambled/QuickSort2/count:10000 200321 ns 199399 ns 100 slightly_scrambled/IntroSort/count:10000 151789 ns 151249 ns 100 slightly_scrambled/PDQSort/count:10000 174541 ns 173592 ns 100 slightly_scrambled/PDQSort2/count:10000 89454 ns 89257 ns 100 slightly_scrambled/QuickSort/count:100000 2972211 ns 2964870 ns 100 slightly_scrambled/QuickSort2/count:100000 2376263 ns 2366504 ns 100 slightly_scrambled/IntroSort/count:100000 2305909 ns 2052472 ns 100 slightly_scrambled/PDQSort/count:100000 2093171 ns 2083240 ns 100 slightly_scrambled/PDQSort2/count:100000 982124 ns 978410 ns 100 slightly_scrambled/QuickSort/count:1000000 35292840 ns 34939068 ns 100 slightly_scrambled/QuickSort2/count:1000000 29451877 ns 28875217 ns 100 slightly_scrambled/IntroSort/count:1000000 25058823 ns 24750291 ns 100 slightly_scrambled/PDQSort/count:1000000 25962367 ns 25535252 ns 100 slightly_scrambled/PDQSort2/count:1000000 12483234 ns 12181630 ns 100 sorted_blocks/QuickSort/count:100 2430 ns 2405 ns 100 sorted_blocks/QuickSort2/count:100 2499 ns 2478 ns 100 sorted_blocks/IntroSort/count:100 1061 ns 1019 ns 100 sorted_blocks/PDQSort/count:100 1329 ns 1285 ns 100 sorted_blocks/PDQSort2/count:100 720 ns 652 ns 100 sorted_blocks/QuickSort/count:1000 16224 ns 16169 ns 100 sorted_blocks/QuickSort2/count:1000 19509 ns 19454 ns 100 sorted_blocks/IntroSort/count:1000 14560 ns 14515 ns 100 sorted_blocks/PDQSort/count:1000 9479 ns 9313 ns 100 sorted_blocks/PDQSort2/count:1000 4913 ns 4885 ns 100 sorted_blocks/QuickSort/count:10000 308032 ns 306791 ns 100 sorted_blocks/QuickSort2/count:10000 238446 ns 238183 ns 100 sorted_blocks/IntroSort/count:10000 173146 ns 172243 ns 100 sorted_blocks/PDQSort/count:10000 168365 ns 166934 ns 100 sorted_blocks/PDQSort2/count:10000 65434 ns 65356 ns 100 sorted_blocks/QuickSort/count:100000 4041913 ns 4028550 ns 100 sorted_blocks/QuickSort2/count:100000 3239075 ns 3226690 ns 100 sorted_blocks/IntroSort/count:100000 2578785 ns 2434971 ns 100 sorted_blocks/PDQSort/count:100000 2386599 ns 2375001 ns 100 sorted_blocks/PDQSort2/count:100000 966824 ns 959131 ns 100 sorted_blocks/QuickSort/count:1000000 51596313 ns 50949966 ns 100 sorted_blocks/QuickSort2/count:1000000 39977951 ns 39455223 ns 100 sorted_blocks/IntroSort/count:1000000 29587960 ns 29072445 ns 100 sorted_blocks/PDQSort/count:1000000 29140836 ns 28926711 ns 100 sorted_blocks/PDQSort2/count:1000000 13950878 ns 13703237 ns 100 single_big_swap/QuickSort/count:100 1290 ns 1277 ns 100 single_big_swap/QuickSort2/count:100 1004 ns 1000 ns 100 single_big_swap/IntroSort/count:100 1087 ns 1069 ns 100 single_big_swap/PDQSort/count:100 633 ns 620 ns 100 single_big_swap/PDQSort2/count:100 474 ns 456 ns 100 single_big_swap/QuickSort/count:1000 16889 ns 16794 ns 100 single_big_swap/QuickSort2/count:1000 12697 ns 12655 ns 100 single_big_swap/IntroSort/count:1000 14016 ns 13967 ns 100 single_big_swap/PDQSort/count:1000 7008 ns 6939 ns 100 single_big_swap/PDQSort2/count:1000 2342 ns 2339 ns 100 single_big_swap/QuickSort/count:10000 204246 ns 204182 ns 100 single_big_swap/QuickSort2/count:10000 156179 ns 155306 ns 100 single_big_swap/IntroSort/count:10000 192030 ns 190639 ns 100 single_big_swap/PDQSort/count:10000 78781 ns 78285 ns 100 single_big_swap/PDQSort2/count:10000 23238 ns 23145 ns 100 single_big_swap/QuickSort/count:100000 2414905 ns 2395809 ns 100 single_big_swap/QuickSort2/count:100000 1787565 ns 1778945 ns 100 single_big_swap/IntroSort/count:100000 2523279 ns 2517094 ns 100 single_big_swap/PDQSort/count:100000 889865 ns 884869 ns 100 single_big_swap/PDQSort2/count:100000 249087 ns 246596 ns 100 single_big_swap/QuickSort/count:1000000 29262380 ns 28791769 ns 100 single_big_swap/QuickSort2/count:1000000 21306314 ns 20978264 ns 100 single_big_swap/IntroSort/count:1000000 30104774 ns 29736630 ns 100 single_big_swap/PDQSort/count:1000000 10839585 ns 10689165 ns 100 single_big_swap/PDQSort2/count:1000000 3153072 ns 3134668 ns 100 descending/QuickSort/count:100 1633 ns 1619 ns 100 descending/QuickSort2/count:100 1324 ns 1245 ns 100 descending/IntroSort/count:100 915 ns 904 ns 100 descending/PDQSort/count:100 534 ns 518 ns 100 descending/PDQSort2/count:100 305 ns 290 ns 100 descending/QuickSort/count:1000 15212 ns 15109 ns 100 descending/QuickSort2/count:1000 12131 ns 12071 ns 100 descending/IntroSort/count:1000 8520 ns 8483 ns 100 descending/PDQSort/count:1000 2677 ns 2616 ns 100 descending/PDQSort2/count:1000 1168 ns 1167 ns 100 descending/QuickSort/count:10000 190983 ns 190004 ns 100 descending/QuickSort2/count:10000 142282 ns 141461 ns 100 descending/IntroSort/count:10000 106956 ns 106801 ns 100 descending/PDQSort/count:10000 22199 ns 22098 ns 100 descending/PDQSort2/count:10000 7163 ns 6853 ns 100 descending/QuickSort/count:100000 2210157 ns 2201096 ns 100 descending/QuickSort2/count:100000 1639887 ns 1632072 ns 100 descending/IntroSort/count:100000 1308584 ns 1301543 ns 100 descending/PDQSort/count:100000 186613 ns 185900 ns 100 descending/PDQSort2/count:100000 56812 ns 56648 ns 100 descending/QuickSort/count:1000000 27759185 ns 27197419 ns 100 descending/QuickSort2/count:1000000 19937923 ns 19733711 ns 100 descending/IntroSort/count:1000000 15222011 ns 15155862 ns 100 descending/PDQSort/count:1000000 2159517 ns 1899555 ns 100 descending/PDQSort2/count:1000000 560522 ns 557461 ns 100 pipe_organ/QuickSort/count:100 1853 ns 1835 ns 100 pipe_organ/QuickSort2/count:100 1439 ns 1416 ns 100 pipe_organ/IntroSort/count:100 1029 ns 950 ns 100 pipe_organ/PDQSort/count:100 528 ns 517 ns 100 pipe_organ/PDQSort2/count:100 296 ns 282 ns 100 pipe_organ/QuickSort/count:1000 15940 ns 15886 ns 100 pipe_organ/QuickSort2/count:1000 11960 ns 11911 ns 100 pipe_organ/IntroSort/count:1000 8271 ns 8225 ns 100 pipe_organ/PDQSort/count:1000 2622 ns 2591 ns 100 pipe_organ/PDQSort2/count:1000 1192 ns 1172 ns 100 pipe_organ/QuickSort/count:10000 193559 ns 193175 ns 100 pipe_organ/QuickSort2/count:10000 150531 ns 149370 ns 100 pipe_organ/IntroSort/count:10000 115439 ns 115205 ns 100 pipe_organ/PDQSort/count:10000 19195 ns 19176 ns 100 pipe_organ/PDQSort2/count:10000 6117 ns 6021 ns 100 pipe_organ/QuickSort/count:100000 2486991 ns 2478769 ns 100 pipe_organ/QuickSort2/count:100000 1819950 ns 1810639 ns 100 pipe_organ/IntroSort/count:100000 1388682 ns 1383509 ns 100 pipe_organ/PDQSort/count:100000 190331 ns 188891 ns 100 pipe_organ/PDQSort2/count:100000 52793 ns 52639 ns 100 pipe_organ/QuickSort/count:1000000 29577606 ns 29326755 ns 100 pipe_organ/QuickSort2/count:1000000 21748586 ns 21414611 ns 100 pipe_organ/IntroSort/count:1000000 16696194 ns 16630029 ns 100 pipe_organ/PDQSort/count:1000000 1869925 ns 1864242 ns 100 pipe_organ/PDQSort2/count:1000000 549429 ns 545797 ns 100 push_front/QuickSort/count:100 2289 ns 2269 ns 100 push_front/QuickSort2/count:100 1478 ns 1491 ns 100 push_front/IntroSort/count:100 1605 ns 1561 ns 100 push_front/PDQSort/count:100 1282 ns 1240 ns 100 push_front/PDQSort2/count:100 543 ns 523 ns 100 push_front/QuickSort/count:1000 18117 ns 18055 ns 100 push_front/QuickSort2/count:1000 15428 ns 15155 ns 100 push_front/IntroSort/count:1000 14087 ns 13988 ns 100 push_front/PDQSort/count:1000 12455 ns 12404 ns 100 push_front/PDQSort2/count:1000 4450 ns 4413 ns 100 push_front/QuickSort/count:10000 229186 ns 227644 ns 100 push_front/QuickSort2/count:10000 160690 ns 160380 ns 100 push_front/IntroSort/count:10000 191233 ns 190610 ns 100 push_front/PDQSort/count:10000 145472 ns 144255 ns 100 push_front/PDQSort2/count:10000 41785 ns 41612 ns 100 push_front/QuickSort/count:100000 2922293 ns 2702687 ns 100 push_front/QuickSort2/count:100000 1923425 ns 1917358 ns 100 push_front/IntroSort/count:100000 2488440 ns 2478518 ns 100 push_front/PDQSort/count:100000 1688943 ns 1681909 ns 100 push_front/PDQSort2/count:100000 466741 ns 465112 ns 100 push_front/QuickSort/count:1000000 31411786 ns 31044262 ns 100 push_front/QuickSort2/count:1000000 22438347 ns 22338651 ns 100 push_front/IntroSort/count:1000000 30187199 ns 29784770 ns 100 push_front/PDQSort/count:1000000 19681892 ns 19340073 ns 100 push_front/PDQSort2/count:1000000 5359031 ns 5330950 ns 100 push_middle/QuickSort/count:100 1986 ns 1972 ns 100 push_middle/QuickSort2/count:100 1523 ns 1475 ns 100 push_middle/IntroSort/count:100 1198 ns 1155 ns 100 push_middle/PDQSort/count:100 982 ns 965 ns 100 push_middle/PDQSort2/count:100 437 ns 419 ns 100 push_middle/QuickSort/count:1000 16948 ns 16891 ns 100 push_middle/QuickSort2/count:1000 12581 ns 12432 ns 100 push_middle/IntroSort/count:1000 10488 ns 10440 ns 100 push_middle/PDQSort/count:1000 4969 ns 4929 ns 100 push_middle/PDQSort2/count:1000 2006 ns 1953 ns 100 push_middle/QuickSort/count:10000 206586 ns 206205 ns 100 push_middle/QuickSort2/count:10000 147860 ns 147583 ns 100 push_middle/IntroSort/count:10000 143688 ns 143232 ns 100 push_middle/PDQSort/count:10000 44343 ns 44226 ns 100 push_middle/PDQSort2/count:10000 14500 ns 14441 ns 100 push_middle/QuickSort/count:100000 2641267 ns 2493017 ns 100 push_middle/QuickSort2/count:100000 1782662 ns 1775961 ns 100 push_middle/IntroSort/count:100000 1867750 ns 1861113 ns 100 push_middle/PDQSort/count:100000 426423 ns 423756 ns 100 push_middle/PDQSort2/count:100000 121412 ns 120541 ns 100 push_middle/QuickSort/count:1000000 29463664 ns 29094682 ns 100 push_middle/QuickSort2/count:1000000 21699498 ns 21338872 ns 100 push_middle/IntroSort/count:1000000 22344304 ns 21994267 ns 100 push_middle/PDQSort/count:1000000 4514580 ns 4247772 ns 100 push_middle/PDQSort2/count:1000000 1244558 ns 1236921 ns 100 Edited September 13 by Stefan Glienke Share this post Link to post