Stefan Glienke's Content - Page 3

Implementation of procedure <T: Record>

Stefan Glienke replied to Json's topic in General Help

Because the language spec - pardon, the compiler - says so. Constraints on generics are part of the declaration.

Rapid.Generics revamp

Stefan Glienke replied to Alex7691's topic in Delphi Third-Party

"It's not the bottleneck, therefore, I don't care" is the reason why Embarcadero does not care to implement anything that is reasonably optimized.

Rapid.Generics revamp

Stefan Glienke replied to Alex7691's topic in Delphi Third-Party

I was just curious. I expected the fact that it's a drop-in replacement to be a reason. Rapid is quite an achievement and damn those collections are fast - I use them to challenge my implementation from time to time. The design decisions are different, and for the extensive API of spring, I have to sacrifice a few nanoseconds here and there, as much as I dislike that

Rapid.Generics revamp

Stefan Glienke replied to Alex7691's topic in Delphi Third-Party

You could have given spring4d a try, which is actively maintained

Dictionaries, Hashing and Performance

Stefan Glienke replied to Clément's topic in Algorithms, Data Structures and Class Design

The numbers before the last one are all based on the quoted code using RTL Dictionary with the tweaks I explained. There is no setting needed. When you are using collections in spring4d without an explicit comparer, there are several places where they use an optimized path without a comparer at all - for example, when using an Integer as a key in a dictionary or when sorting a list of integers. Did not read the paper, but I watched his presentation. I have to admit I only understand half of all that theory stuff - I am still waiting to see this in an actual implementation because theory-wise things are all fine, but when it comes to the actual code reality kicks in. For example, hardware wants to have a word on these different arrays, which ideally would be laid out in a contiguous block, and then you need some indexing algorithm for the subparts and so on.

March 26
59 replies
- tdictionary
- tstringlist
- (and 2 more)
  Tagged with:

SIMD QSort

Stefan Glienke replied to mikerabat's topic in I made this

Pscht, don't spoil the secret soup that Primoz and I have been cooking As for the simd qsort - I'll let the numbers speak for themselves - that is on Win32 sorting Int32 - pdqsort is the one you can find in spring4d development branch: ---------------------------------------------------------------------------------------------- Benchmark Time CPU Iterations ---------------------------------------------------------------------------------------------- random/QuickSort_RTL/count:100 1937 ns 1888 ns 100 random/QuickSort_Simd/count:100 945 ns 931 ns 100 random/PDQSort/count:100 592 ns 577 ns 100 random/QuickSort_RTL/count:1000 39206 ns 39184 ns 100 random/QuickSort_Simd/count:1000 10290 ns 10272 ns 100 random/PDQSort/count:1000 5355 ns 5342 ns 100 random/QuickSort_RTL/count:10000 652150 ns 651096 ns 100 random/QuickSort_Simd/count:10000 205577 ns 205363 ns 100 random/PDQSort/count:10000 200458 ns 199829 ns 100 random/QuickSort_RTL/count:100000 7759142 ns 7750341 ns 100 random/QuickSort_Simd/count:100000 2565052 ns 2560527 ns 100 random/PDQSort/count:100000 2416112 ns 2412229 ns 100 random/QuickSort_RTL/count:1000000 91422324 ns 91249522 ns 100 random/QuickSort_Simd/count:1000000 27870985 ns 27814505 ns 100 random/PDQSort/count:1000000 27731286 ns 27692879 ns 100 few_unique/QuickSort_RTL/count:100 1871 ns 1855 ns 100 few_unique/QuickSort_Simd/count:100 950 ns 930 ns 100 few_unique/PDQSort/count:100 501 ns 489 ns 100 few_unique/QuickSort_RTL/count:1000 28397 ns 28377 ns 100 few_unique/QuickSort_Simd/count:1000 11415 ns 11401 ns 100 few_unique/PDQSort/count:1000 2635 ns 2628 ns 100 few_unique/QuickSort_RTL/count:10000 405657 ns 405324 ns 100 few_unique/QuickSort_Simd/count:10000 156791 ns 156707 ns 100 few_unique/PDQSort/count:10000 39877 ns 39837 ns 100 few_unique/QuickSort_RTL/count:100000 4705581 ns 4698238 ns 100 few_unique/QuickSort_Simd/count:100000 2006202 ns 2002594 ns 100 few_unique/PDQSort/count:100000 446669 ns 446170 ns 100 few_unique/QuickSort_RTL/count:1000000 52019783 ns 51936730 ns 100 few_unique/QuickSort_Simd/count:1000000 25403129 ns 25348705 ns 100 few_unique/PDQSort/count:1000000 5053615 ns 5046242 ns 100 two_values/QuickSort_RTL/count:100 1626 ns 1602 ns 100 two_values/QuickSort_Simd/count:100 992 ns 978 ns 100 two_values/PDQSort/count:100 371 ns 349 ns 100 two_values/QuickSort_RTL/count:1000 20607 ns 20587 ns 100 two_values/QuickSort_Simd/count:1000 12320 ns 12314 ns 100 two_values/PDQSort/count:1000 1256 ns 1246 ns 100 two_values/QuickSort_RTL/count:10000 293260 ns 293008 ns 100 two_values/QuickSort_Simd/count:10000 170420 ns 170229 ns 100 two_values/PDQSort/count:10000 10936 ns 10887 ns 100 two_values/QuickSort_RTL/count:100000 3664647 ns 3659945 ns 100 two_values/QuickSort_Simd/count:100000 2131772 ns 2128914 ns 100 two_values/PDQSort/count:100000 123928 ns 123845 ns 100 two_values/QuickSort_RTL/count:1000000 41535128 ns 41447869 ns 100 two_values/QuickSort_Simd/count:1000000 27185208 ns 27141671 ns 100 two_values/PDQSort/count:1000000 1279685 ns 1276747 ns 100 all_equal/QuickSort_RTL/count:100 2366 ns 2342 ns 100 all_equal/QuickSort_Simd/count:100 1457 ns 1439 ns 100 all_equal/PDQSort/count:100 527 ns 477 ns 100 all_equal/QuickSort_RTL/count:1000 18446 ns 18411 ns 100 all_equal/QuickSort_Simd/count:1000 13511 ns 13463 ns 100 all_equal/PDQSort/count:1000 1377 ns 1368 ns 100 all_equal/QuickSort_RTL/count:10000 237075 ns 236655 ns 100 all_equal/QuickSort_Simd/count:10000 172246 ns 171992 ns 100 all_equal/PDQSort/count:10000 5947 ns 5847 ns 100 all_equal/QuickSort_RTL/count:100000 3014635 ns 3006448 ns 100 all_equal/QuickSort_Simd/count:100000 2235909 ns 2233777 ns 100 all_equal/PDQSort/count:100000 42432 ns 42382 ns 100 all_equal/QuickSort_RTL/count:1000000 34511832 ns 34427709 ns 100 all_equal/QuickSort_Simd/count:1000000 28091462 ns 27988156 ns 100 all_equal/PDQSort/count:1000000 413582 ns 412323 ns 100 ascending/QuickSort_RTL/count:100 1711 ns 1651 ns 100 ascending/QuickSort_Simd/count:100 1300 ns 1262 ns 100 ascending/PDQSort/count:100 909 ns 886 ns 100 ascending/QuickSort_RTL/count:1000 14402 ns 14355 ns 100 ascending/QuickSort_Simd/count:1000 12382 ns 12293 ns 100 ascending/PDQSort/count:1000 1283 ns 1259 ns 100 ascending/QuickSort_RTL/count:10000 169937 ns 169542 ns 100 ascending/QuickSort_Simd/count:10000 164197 ns 163830 ns 100 ascending/PDQSort/count:10000 5191 ns 5126 ns 100 ascending/QuickSort_RTL/count:100000 2073379 ns 2070924 ns 100 ascending/QuickSort_Simd/count:100000 2021744 ns 2018514 ns 100 ascending/PDQSort/count:100000 47881 ns 47829 ns 100 ascending/QuickSort_RTL/count:1000000 24703110 ns 24661199 ns 100 ascending/QuickSort_Simd/count:1000000 22872824 ns 22813704 ns 100 ascending/PDQSort/count:1000000 528476 ns 527071 ns 100 slightly_scrambled/QuickSort_RTL/count:100 1192 ns 1148 ns 100 slightly_scrambled/QuickSort_Simd/count:100 967 ns 950 ns 100 slightly_scrambled/PDQSort/count:100 523 ns 498 ns 100 slightly_scrambled/QuickSort_RTL/count:1000 16250 ns 16220 ns 100 slightly_scrambled/QuickSort_Simd/count:1000 11613 ns 11424 ns 100 slightly_scrambled/PDQSort/count:1000 4808 ns 4757 ns 100 slightly_scrambled/QuickSort_RTL/count:10000 230412 ns 230013 ns 100 slightly_scrambled/QuickSort_Simd/count:10000 180107 ns 179869 ns 100 slightly_scrambled/PDQSort/count:10000 93826 ns 93569 ns 100 slightly_scrambled/QuickSort_RTL/count:100000 2705736 ns 2702848 ns 100 slightly_scrambled/QuickSort_Simd/count:100000 2204179 ns 2201412 ns 100 slightly_scrambled/PDQSort/count:100000 1355543 ns 1346419 ns 100 slightly_scrambled/QuickSort_RTL/count:1000000 32222921 ns 32171861 ns 100 slightly_scrambled/QuickSort_Simd/count:1000000 24320016 ns 24261911 ns 100 slightly_scrambled/PDQSort/count:1000000 15813751 ns 15769810 ns 100 sorted_blocks/QuickSort_RTL/count:100 2051 ns 2030 ns 100 sorted_blocks/QuickSort_Simd/count:100 1747 ns 1650 ns 100 sorted_blocks/PDQSort/count:100 978 ns 956 ns 100 sorted_blocks/QuickSort_RTL/count:1000 22202 ns 22159 ns 100 sorted_blocks/QuickSort_Simd/count:1000 13048 ns 13006 ns 100 sorted_blocks/PDQSort/count:1000 6170 ns 6131 ns 100 sorted_blocks/QuickSort_RTL/count:10000 259193 ns 257887 ns 100 sorted_blocks/QuickSort_Simd/count:10000 194062 ns 193862 ns 100 sorted_blocks/PDQSort/count:10000 170191 ns 169930 ns 100 sorted_blocks/QuickSort_RTL/count:100000 4040755 ns 4035505 ns 100 sorted_blocks/QuickSort_Simd/count:100000 2505645 ns 2502093 ns 100 sorted_blocks/PDQSort/count:100000 2312275 ns 2310099 ns 100 sorted_blocks/QuickSort_RTL/count:1000000 46132898 ns 46058821 ns 100 sorted_blocks/QuickSort_Simd/count:1000000 27532882 ns 27458613 ns 100 sorted_blocks/PDQSort/count:1000000 27203482 ns 27114899 ns 100 single_big_swap/QuickSort_RTL/count:100 1769 ns 1748 ns 100 single_big_swap/QuickSort_Simd/count:100 1235 ns 1199 ns 100 single_big_swap/PDQSort/count:100 1076 ns 1060 ns 100 single_big_swap/QuickSort_RTL/count:1000 15277 ns 15188 ns 100 single_big_swap/QuickSort_Simd/count:1000 12614 ns 12571 ns 100 single_big_swap/PDQSort/count:1000 2836 ns 2787 ns 100 single_big_swap/QuickSort_RTL/count:10000 174662 ns 174264 ns 100 single_big_swap/QuickSort_Simd/count:10000 190061 ns 187438 ns 100 single_big_swap/PDQSort/count:10000 29452 ns 29394 ns 100 single_big_swap/QuickSort_RTL/count:100000 2094548 ns 2090125 ns 100 single_big_swap/QuickSort_Simd/count:100000 2252041 ns 2245660 ns 100 single_big_swap/PDQSort/count:100000 280203 ns 279238 ns 100 single_big_swap/QuickSort_RTL/count:1000000 25817701 ns 25698143 ns 100 single_big_swap/QuickSort_Simd/count:1000000 24978407 ns 24883864 ns 100 single_big_swap/PDQSort/count:1000000 2840811 ns 2822251 ns 100 descending/QuickSort_RTL/count:100 1364 ns 1326 ns 100 descending/QuickSort_Simd/count:100 1402 ns 1337 ns 100 descending/PDQSort/count:100 617 ns 599 ns 100 descending/QuickSort_RTL/count:1000 13961 ns 13914 ns 100 descending/QuickSort_Simd/count:1000 10914 ns 10875 ns 100 descending/PDQSort/count:1000 842 ns 810 ns 100 descending/QuickSort_RTL/count:10000 168231 ns 167973 ns 100 descending/QuickSort_Simd/count:10000 158758 ns 158528 ns 100 descending/PDQSort/count:10000 4137 ns 4091 ns 100 descending/QuickSort_RTL/count:100000 2044335 ns 2039881 ns 100 descending/QuickSort_Simd/count:100000 1990965 ns 1984400 ns 100 descending/PDQSort/count:100000 33285 ns 32847 ns 100 descending/QuickSort_RTL/count:1000000 24925931 ns 24823017 ns 100 descending/QuickSort_Simd/count:1000000 22903629 ns 22831022 ns 100 descending/PDQSort/count:1000000 480036 ns 477583 ns 100 pipe_organ/QuickSort_RTL/count:100 1351 ns 1326 ns 100 pipe_organ/QuickSort_Simd/count:100 946 ns 931 ns 100 pipe_organ/PDQSort/count:100 530 ns 503 ns 100 pipe_organ/QuickSort_RTL/count:1000 16315 ns 16203 ns 100 pipe_organ/QuickSort_Simd/count:1000 11241 ns 11244 ns 100 pipe_organ/PDQSort/count:1000 1301 ns 1271 ns 100 pipe_organ/QuickSort_RTL/count:10000 203222 ns 202833 ns 100 pipe_organ/QuickSort_Simd/count:10000 158800 ns 158708 ns 100 pipe_organ/PDQSort/count:10000 4128 ns 4063 ns 100 pipe_organ/QuickSort_RTL/count:100000 2543356 ns 2535591 ns 100 pipe_organ/QuickSort_Simd/count:100000 2086136 ns 2081574 ns 100 pipe_organ/PDQSort/count:100000 33407 ns 33379 ns 100 pipe_organ/QuickSort_RTL/count:1000000 29884109 ns 29808067 ns 100 pipe_organ/QuickSort_Simd/count:1000000 25110723 ns 25014817 ns 100 pipe_organ/PDQSort/count:1000000 411686 ns 409344 ns 100 push_front/QuickSort_RTL/count:100 1717 ns 1672 ns 100 push_front/QuickSort_Simd/count:100 1576 ns 1510 ns 100 push_front/PDQSort/count:100 1159 ns 1109 ns 100 push_front/QuickSort_RTL/count:1000 16895 ns 16866 ns 100 push_front/QuickSort_Simd/count:1000 10169 ns 10159 ns 100 push_front/PDQSort/count:1000 834 ns 820 ns 100 push_front/QuickSort_RTL/count:10000 207717 ns 206724 ns 100 push_front/QuickSort_Simd/count:10000 160138 ns 159126 ns 100 push_front/PDQSort/count:10000 6265 ns 6215 ns 100 push_front/QuickSort_RTL/count:100000 2508110 ns 2491772 ns 100 push_front/QuickSort_Simd/count:100000 2049725 ns 2044132 ns 100 push_front/PDQSort/count:100000 57586 ns 57545 ns 100 push_front/QuickSort_RTL/count:1000000 28090747 ns 27987564 ns 100 push_front/QuickSort_Simd/count:1000000 22919136 ns 22856894 ns 100 push_front/PDQSort/count:1000000 586990 ns 585977 ns 100 push_middle/QuickSort_RTL/count:100 1859 ns 1831 ns 100 push_middle/QuickSort_Simd/count:100 1557 ns 1524 ns 100 push_middle/PDQSort/count:100 365 ns 347 ns 100 push_middle/QuickSort_RTL/count:1000 15468 ns 15391 ns 100 push_middle/QuickSort_Simd/count:1000 10691 ns 10661 ns 100 push_middle/PDQSort/count:1000 852 ns 818 ns 100 push_middle/QuickSort_RTL/count:10000 186437 ns 185943 ns 100 push_middle/QuickSort_Simd/count:10000 159570 ns 159464 ns 100 push_middle/PDQSort/count:10000 4892 ns 4843 ns 100 push_middle/QuickSort_RTL/count:100000 2251077 ns 2244548 ns 100 push_middle/QuickSort_Simd/count:100000 2096000 ns 2088067 ns 100 push_middle/PDQSort/count:100000 61318 ns 61166 ns 100 push_middle/QuickSort_RTL/count:1000000 26583284 ns 26476059 ns 100 push_middle/QuickSort_Simd/count:1000000 23082618 ns 22978049 ns 100 push_middle/PDQSort/count:1000000 457556 ns 455193 ns 100 And this is sorting Double on Win64: ------------------------------------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------------------------------------ random/QuickSort_RTL/count:100 2565 ns 2547 ns 100 random/QuickSort_Simd/count:100 3349 ns 3301 ns 100 random/PDQSort/count:100 1168 ns 1183 ns 100 random/QuickSort_RTL/count:1000 47373 ns 47332 ns 100 random/QuickSort_Simd/count:1000 32689 ns 32635 ns 100 random/PDQSort/count:1000 6910 ns 6875 ns 100 random/QuickSort_RTL/count:10000 776390 ns 775674 ns 100 random/QuickSort_Simd/count:10000 503738 ns 478031 ns 100 random/PDQSort/count:10000 167340 ns 167280 ns 100 random/QuickSort_RTL/count:100000 9351122 ns 9337285 ns 100 random/QuickSort_Simd/count:100000 5608265 ns 5599393 ns 100 random/PDQSort/count:100000 2102133 ns 2097889 ns 100 random/QuickSort_RTL/count:1000000 111924175 ns 111232604 ns 100 random/QuickSort_Simd/count:1000000 64520049 ns 63951990 ns 100 random/PDQSort/count:1000000 24124765 ns 23968662 ns 100 few_unique/QuickSort_RTL/count:100 3079 ns 3016 ns 100 few_unique/QuickSort_Simd/count:100 2430 ns 2415 ns 100 few_unique/PDQSort/count:100 1100 ns 1054 ns 100 few_unique/QuickSort_RTL/count:1000 34210 ns 34068 ns 100 few_unique/QuickSort_Simd/count:1000 30551 ns 30475 ns 100 few_unique/PDQSort/count:1000 3884 ns 3821 ns 100 few_unique/QuickSort_RTL/count:10000 478228 ns 475630 ns 100 few_unique/QuickSort_Simd/count:10000 388707 ns 387296 ns 100 few_unique/PDQSort/count:10000 34574 ns 34507 ns 100 few_unique/QuickSort_RTL/count:100000 5266166 ns 5242134 ns 100 few_unique/QuickSort_Simd/count:100000 4469897 ns 4460899 ns 100 few_unique/PDQSort/count:100000 385048 ns 382507 ns 100 few_unique/QuickSort_RTL/count:1000000 57633919 ns 57367821 ns 100 few_unique/QuickSort_Simd/count:1000000 51159478 ns 50926898 ns 100 few_unique/PDQSort/count:1000000 4738638 ns 4716733 ns 100 two_values/QuickSort_RTL/count:100 2416 ns 2404 ns 100 two_values/QuickSort_Simd/count:100 2596 ns 2573 ns 100 two_values/PDQSort/count:100 642 ns 630 ns 100 two_values/QuickSort_RTL/count:1000 21831 ns 21786 ns 100 two_values/QuickSort_Simd/count:1000 26755 ns 26698 ns 100 two_values/PDQSort/count:1000 2256 ns 2209 ns 100 two_values/QuickSort_RTL/count:10000 307295 ns 306226 ns 100 two_values/QuickSort_Simd/count:10000 385912 ns 384743 ns 100 two_values/PDQSort/count:10000 26331 ns 26284 ns 100 two_values/QuickSort_RTL/count:100000 3864743 ns 3848329 ns 100 two_values/QuickSort_Simd/count:100000 4514908 ns 4499180 ns 100 two_values/PDQSort/count:100000 111245 ns 110447 ns 100 two_values/QuickSort_RTL/count:1000000 44169248 ns 43899232 ns 100 two_values/QuickSort_Simd/count:1000000 52890868 ns 52295299 ns 100 two_values/PDQSort/count:1000000 1227367 ns 1222160 ns 100 all_equal/QuickSort_RTL/count:100 2280 ns 2264 ns 100 all_equal/QuickSort_Simd/count:100 2217 ns 2174 ns 100 all_equal/PDQSort/count:100 582 ns 567 ns 100 all_equal/QuickSort_RTL/count:1000 18267 ns 18192 ns 100 all_equal/QuickSort_Simd/count:1000 26927 ns 26839 ns 100 all_equal/PDQSort/count:1000 1483 ns 1474 ns 100 all_equal/QuickSort_RTL/count:10000 231677 ns 230263 ns 100 all_equal/QuickSort_Simd/count:10000 388326 ns 385394 ns 100 all_equal/PDQSort/count:10000 8330 ns 8267 ns 100 all_equal/QuickSort_RTL/count:100000 3072258 ns 3055451 ns 100 all_equal/QuickSort_Simd/count:100000 4560752 ns 4521917 ns 100 all_equal/PDQSort/count:100000 69035 ns 67927 ns 100 all_equal/QuickSort_RTL/count:1000000 36140895 ns 35829412 ns 100 all_equal/QuickSort_Simd/count:1000000 52332564 ns 52036527 ns 100 all_equal/PDQSort/count:1000000 660978 ns 656433 ns 100 ascending/QuickSort_RTL/count:100 1870 ns 1877 ns 100 ascending/QuickSort_Simd/count:100 3392 ns 3352 ns 100 ascending/PDQSort/count:100 1172 ns 1160 ns 100 ascending/QuickSort_RTL/count:1000 16620 ns 16586 ns 100 ascending/QuickSort_Simd/count:1000 31390 ns 31279 ns 100 ascending/PDQSort/count:1000 1498 ns 1433 ns 100 ascending/QuickSort_RTL/count:10000 205486 ns 204808 ns 100 ascending/QuickSort_Simd/count:10000 397228 ns 395426 ns 100 ascending/PDQSort/count:10000 6873 ns 6760 ns 100 ascending/QuickSort_RTL/count:100000 2481823 ns 2478599 ns 100 ascending/QuickSort_Simd/count:100000 4677020 ns 4651949 ns 100 ascending/PDQSort/count:100000 53174 ns 52941 ns 100 ascending/QuickSort_RTL/count:1000000 30165890 ns 30016057 ns 100 ascending/QuickSort_Simd/count:1000000 52585840 ns 52358100 ns 100 ascending/PDQSort/count:1000000 546292 ns 543164 ns 100 slightly_scrambled/QuickSort_RTL/count:100 2265 ns 2179 ns 100 slightly_scrambled/QuickSort_Simd/count:100 2984 ns 2951 ns 100 slightly_scrambled/PDQSort/count:100 1312 ns 1223 ns 100 slightly_scrambled/QuickSort_RTL/count:1000 18443 ns 18415 ns 100 slightly_scrambled/QuickSort_Simd/count:1000 32498 ns 32437 ns 100 slightly_scrambled/PDQSort/count:1000 4900 ns 4885 ns 100 slightly_scrambled/QuickSort_RTL/count:10000 264849 ns 264564 ns 100 slightly_scrambled/QuickSort_Simd/count:10000 425552 ns 424796 ns 100 slightly_scrambled/PDQSort/count:10000 102652 ns 101993 ns 100 slightly_scrambled/QuickSort_RTL/count:100000 3134562 ns 3119682 ns 100 slightly_scrambled/QuickSort_Simd/count:100000 4998249 ns 4978579 ns 100 slightly_scrambled/PDQSort/count:100000 1472946 ns 1466023 ns 100 slightly_scrambled/QuickSort_RTL/count:1000000 37367087 ns 37182092 ns 100 slightly_scrambled/QuickSort_Simd/count:1000000 55740685 ns 55477997 ns 100 slightly_scrambled/PDQSort/count:1000000 17186242 ns 17082706 ns 100 sorted_blocks/QuickSort_RTL/count:100 2478 ns 2433 ns 100 sorted_blocks/QuickSort_Simd/count:100 3095 ns 2992 ns 100 sorted_blocks/PDQSort/count:100 1090 ns 1067 ns 100 sorted_blocks/QuickSort_RTL/count:1000 27677 ns 27633 ns 100 sorted_blocks/QuickSort_Simd/count:1000 34255 ns 34185 ns 100 sorted_blocks/PDQSort/count:1000 7345 ns 7287 ns 100 sorted_blocks/QuickSort_RTL/count:10000 309712 ns 307961 ns 100 sorted_blocks/QuickSort_Simd/count:10000 472346 ns 467668 ns 100 sorted_blocks/PDQSort/count:10000 156489 ns 156120 ns 100 sorted_blocks/QuickSort_RTL/count:100000 4751737 ns 4736686 ns 100 sorted_blocks/QuickSort_Simd/count:100000 5649021 ns 5626166 ns 100 sorted_blocks/PDQSort/count:100000 2090767 ns 2085443 ns 100 sorted_blocks/QuickSort_RTL/count:1000000 55395331 ns 55148208 ns 100 sorted_blocks/QuickSort_Simd/count:1000000 64626584 ns 64356116 ns 100 sorted_blocks/PDQSort/count:1000000 24145373 ns 24023248 ns 100 single_big_swap/QuickSort_RTL/count:100 2154 ns 2105 ns 100 single_big_swap/QuickSort_Simd/count:100 3405 ns 3371 ns 100 single_big_swap/PDQSort/count:100 1271 ns 1280 ns 100 single_big_swap/QuickSort_RTL/count:1000 18941 ns 18884 ns 100 single_big_swap/QuickSort_Simd/count:1000 35627 ns 35369 ns 100 single_big_swap/PDQSort/count:1000 2914 ns 2881 ns 100 single_big_swap/QuickSort_RTL/count:10000 230744 ns 226649 ns 100 single_big_swap/QuickSort_Simd/count:10000 439977 ns 439106 ns 100 single_big_swap/PDQSort/count:10000 24462 ns 24441 ns 100 single_big_swap/QuickSort_RTL/count:100000 2642839 ns 2638752 ns 100 single_big_swap/QuickSort_Simd/count:100000 5123966 ns 5110310 ns 100 single_big_swap/PDQSort/count:100000 212054 ns 211579 ns 100 single_big_swap/QuickSort_RTL/count:1000000 31998557 ns 31831534 ns 100 single_big_swap/QuickSort_Simd/count:1000000 56693968 ns 56385127 ns 100 single_big_swap/PDQSort/count:1000000 2287462 ns 2274715 ns 100 descending/QuickSort_RTL/count:100 1953 ns 1907 ns 100 descending/QuickSort_Simd/count:100 3266 ns 3233 ns 100 descending/PDQSort/count:100 1243 ns 1190 ns 100 descending/QuickSort_RTL/count:1000 16769 ns 16669 ns 100 descending/QuickSort_Simd/count:1000 31242 ns 31187 ns 100 descending/PDQSort/count:1000 1416 ns 1383 ns 100 descending/QuickSort_RTL/count:10000 205746 ns 205463 ns 100 descending/QuickSort_Simd/count:10000 397112 ns 396469 ns 100 descending/PDQSort/count:10000 6385 ns 6357 ns 100 descending/QuickSort_RTL/count:100000 2509532 ns 2479948 ns 100 descending/QuickSort_Simd/count:100000 4654628 ns 4626693 ns 100 descending/PDQSort/count:100000 53043 ns 52875 ns 100 descending/QuickSort_RTL/count:1000000 29567957 ns 29440319 ns 100 descending/QuickSort_Simd/count:1000000 52694899 ns 52342249 ns 100 descending/PDQSort/count:1000000 546650 ns 544430 ns 100 pipe_organ/QuickSort_RTL/count:100 2116 ns 2097 ns 100 pipe_organ/QuickSort_Simd/count:100 1963 ns 1946 ns 100 pipe_organ/PDQSort/count:100 628 ns 611 ns 100 pipe_organ/QuickSort_RTL/count:1000 18278 ns 18198 ns 100 pipe_organ/QuickSort_Simd/count:1000 28385 ns 27872 ns 100 pipe_organ/PDQSort/count:1000 1451 ns 1439 ns 100 pipe_organ/QuickSort_RTL/count:10000 214821 ns 214320 ns 100 pipe_organ/QuickSort_Simd/count:10000 368611 ns 367921 ns 100 pipe_organ/PDQSort/count:10000 7572 ns 6703 ns 100 pipe_organ/QuickSort_RTL/count:100000 2732768 ns 2722013 ns 100 pipe_organ/QuickSort_Simd/count:100000 4673867 ns 4651534 ns 100 pipe_organ/PDQSort/count:100000 54171 ns 53981 ns 100 pipe_organ/QuickSort_RTL/count:1000000 32340442 ns 32204978 ns 100 pipe_organ/QuickSort_Simd/count:1000000 52601173 ns 52242565 ns 100 pipe_organ/PDQSort/count:1000000 1191710 ns 1179642 ns 100 push_front/QuickSort_RTL/count:100 2201 ns 2179 ns 100 push_front/QuickSort_Simd/count:100 3505 ns 3463 ns 100 push_front/PDQSort/count:100 1466 ns 1460 ns 100 push_front/QuickSort_RTL/count:1000 20234 ns 20070 ns 100 push_front/QuickSort_Simd/count:1000 32713 ns 32609 ns 100 push_front/PDQSort/count:1000 2279 ns 1973 ns 100 push_front/QuickSort_RTL/count:10000 259051 ns 249922 ns 100 push_front/QuickSort_Simd/count:10000 408989 ns 407301 ns 100 push_front/PDQSort/count:10000 9560 ns 9541 ns 100 push_front/QuickSort_RTL/count:100000 2963996 ns 2943123 ns 100 push_front/QuickSort_Simd/count:100000 4759393 ns 4743849 ns 100 push_front/PDQSort/count:100000 92058 ns 91788 ns 100 push_front/QuickSort_RTL/count:1000000 32817948 ns 32651555 ns 100 push_front/QuickSort_Simd/count:1000000 52370293 ns 52119300 ns 100 push_front/PDQSort/count:1000000 885186 ns 880839 ns 100 push_middle/QuickSort_RTL/count:100 2182 ns 2131 ns 100 push_middle/QuickSort_Simd/count:100 3385 ns 3379 ns 100 push_middle/PDQSort/count:100 643 ns 630 ns 100 push_middle/QuickSort_RTL/count:1000 18552 ns 18454 ns 100 push_middle/QuickSort_Simd/count:1000 30820 ns 30591 ns 100 push_middle/PDQSort/count:1000 1553 ns 1518 ns 100 push_middle/QuickSort_RTL/count:10000 234531 ns 233782 ns 100 push_middle/QuickSort_Simd/count:10000 392271 ns 389699 ns 100 push_middle/PDQSort/count:10000 8547 ns 8493 ns 100 push_middle/QuickSort_RTL/count:100000 2610121 ns 2600388 ns 100 push_middle/QuickSort_Simd/count:100000 4630818 ns 4609689 ns 100 push_middle/PDQSort/count:100000 71370 ns 69366 ns 100 push_middle/QuickSort_RTL/count:1000000 31046096 ns 30924256 ns 100 push_middle/QuickSort_Simd/count:1000000 52541755 ns 52326830 ns 100 push_middle/PDQSort/count:1000000 711900 ns 708811 ns 100

Dictionaries, Hashing and Performance

Stefan Glienke replied to Clément's topic in Algorithms, Data Structures and Class Design

I just came across this thread while searching for something else, and while reading this code, I want to make a few points: I agree on the essential point: when you know how many items you will add to a collection, it is always beneficial to set the capacity upfront. Now, even though this is some naive benchmark code, it shows a few things: - Calling ContainsKey before the attempt to call Add means performing the lookup twice, which has not been necessary at all since Delphi 10.3 because of the new method TryAdd. On my machine, the tests go down from 51ms and 31ms to 44ms and 26ms. - The hash function is one of the key parts that makes or breaks a dictionary performance-wise. The numbers I posted are from 12.3, which already uses a better hash algo (FNV1a) than it did years ago (BobJenkins) but is still far away from ideal. Especially for Integer keys, if the hashtable is designed in a robust way of dealing with hash collisions and clustering, you can get away with not hashing anything but using the value itself. Unfortunately, the RTL one is not built in that way, but for this benchmark, it is fine to replace the eq compare it uses with the one from Spring4d, which does what I mentioned before. This brings down the numbers to 36ms and 22ms. That is another 15% improvement. Now Spring4d would not be Spring4d if it did not have another trick up its sleeve: when it uses the default eq compare, the hashtable does not need to call GetHashCode and Equals for some intrinsic types such as Integer because the value itself is used as hashcode and for checking Integer equality it does not need to call some method. That leads to the above test with initial capacity taking 14ms.

March 25
59 replies
- tdictionary
- tstringlist
- (and 2 more)
  Tagged with:

What does [ref] attribute actually dows

Stefan Glienke replied to Tommi Prami's topic in RTL and Delphi Object Pascal

It is briefly explained under Constant Parameters - it enforces the parameter to be passed as a reference.

How to check that TValue.Type is array of TValue

Stefan Glienke replied to dmitrybv's topic in RTL and Delphi Object Pascal

FWIW, this function returns true for multidimensional static arrays of TValue.

Namespaces in Spring4d code

Stefan Glienke replied to EugeneK's topic in Delphi Third-Party

To avoid repeating the unit throughout the source code, you can declare an alias at the top and only explicitly put the unit name there. That, however, only works for non-generic types and consts. I wish it would also work for generics and routines.

Namespaces in Spring4d code

Stefan Glienke replied to EugeneK's topic in Delphi Third-Party

The claim that it compiles faster is bogus - prove me wrong. Most compile time from spring4d comes from generics, which I reported years ago. Also, my suggestion for third-party libraries is to pre-compile them, which removes any dependency on the project options in your project. Currently, Spring4d supports down to XE, and as long as that is the case, I am not putting even more conditionals into the code than there already are.

Namespaces in Spring4d code

Stefan Glienke replied to EugeneK's topic in Delphi Third-Party

Why?

TParallelArray Sort Performance...

Stefan Glienke replied to Steve Maughan's topic in RTL and Delphi Object Pascal

Just a few new numbers of a not yet released parallel pdq sort - using the benchmark code from this comment earlier in this thread Fill array (double), elements count: 500000 Start sorting ... RTL TArray.Sort (ms.): 47 RTL TParallelArray.Sort (ms.): 35 Spring TArray.Sort (ms.): 12 Spring TArray.Sort_Parallel (ms.): 3 Fill array (double), elements count: 5000000 Start sorting ... RTL TArray.Sort (ms.): 551 RTL TParallelArray.Sort (ms.): 128 Spring TArray.Sort (ms.): 136 Spring TArray.Sort_Parallel (ms.): 64 Fill array (double), elements count: 100000000 Start sorting ... RTL TArray.Sort (ms.): 12724 RTL TParallelArray.Sort (ms.): 1884 Spring TArray.Sort (ms.): 3035 Spring TArray.Sort_Parallel (ms.): 675 Again - these numbers are fluctuating a bit because the benchmark is a "run once" benchmark and it depends on the current CPU state etc - also I did not tweak the threshold and CPU count yet - simply calling TTask.Run from System.Threading to fork some slices into parallel execution. But overall it does not look too bad, doesn't it?

New forum?

Stefan Glienke replied to Rick_Delphi's topic in Community Management

Of course, he says that - he wants to sell his overpriced garbage

Guidance on FreeAndNil for Delphi noob

Stefan Glienke replied to Paul Dardeau's topic in RTL and Delphi Object Pascal

Because as you can read in this thread people advocate for using it everywhere instead of at those rare places it's designed for: making sure the reference is set to nil before the instance is destroyed because any code being called during destruction might reach back to this very reference. (which is the reason its name is not even correct - it should have been named NilAndFree)

Guidance on FreeAndNil for Delphi noob

Stefan Glienke replied to Paul Dardeau's topic in RTL and Delphi Object Pascal

Using FreeAndNil is the very essence of https://en.wikipedia.org/wiki/Cargo_cult_programming

Virtual class methods and properties

Stefan Glienke replied to pyscripter's topic in RTL and Delphi Object Pascal

It appears that Marco lacks the technical understanding to evaluate this issue. I left a comment.

Guidance on FreeAndNil for Delphi noob

Stefan Glienke replied to Paul Dardeau's topic in RTL and Delphi Object Pascal

The issue with your code is that it captures Self - if you avoid that, your problems go away.

Guidance on FreeAndNil for Delphi noob

Stefan Glienke replied to Paul Dardeau's topic in RTL and Delphi Object Pascal

Reducing instead of eliminating the probability of AVs (or other errors caused by races) in async code is one of the most annoying things you can do - anyone who ever was on the hunt for that "one in a billion times" bug knows that. I cannot believe that people who should know better give such advice.

Guidance on FreeAndNil for Delphi noob

Stefan Glienke replied to Paul Dardeau's topic in RTL and Delphi Object Pascal

The best practice is coding in a way that you don't need it

pasfmt out now!

Stefan Glienke replied to Joshua Gardner's topic in Delphi Third-Party

It always is just a matter of taste - I don't particularly like the parameter and local variable prefixing so I mostly settled on the C# naming conventions for those - I did not want to go that far to name fields with a leading underscore like them though. Though this is not without disadvantages - I had cases where I had a count parameter and wanted to access the Count field but without qualifying with Self I accessed the parameter instead. (those are the days where I wished for case sensitivity) Because being open source means that everyone can go ahead and modify it to their liking, right?

Sign In

Stefan Glienke

Content Count

Joined

Last visited

Days Won

Content Type

Profiles

Forums

Calendar

Everything posted by Stefan Glienke

Implementation of procedure <T: Record>

Rapid.Generics revamp

Rapid.Generics revamp

Rapid.Generics revamp

Dictionaries, Hashing and Performance

SIMD QSort

Dictionaries, Hashing and Performance

What does [ref] attribute actually dows

How to check that TValue.Type is array of TValue

Namespaces in Spring4d code

Namespaces in Spring4d code

Namespaces in Spring4d code

TParallelArray Sort Performance...

New forum?

Guidance on FreeAndNil for Delphi noob

Guidance on FreeAndNil for Delphi noob

Virtual class methods and properties

Guidance on FreeAndNil for Delphi noob

Guidance on FreeAndNil for Delphi noob

Guidance on FreeAndNil for Delphi noob

pasfmt out now!

pasfmt out now!

Looking but not finding...

Since February 2025, the creation of an 'ini' file with Delphi is much slower than in the past.

Custom Managed Record - Compiler Fail "F2084 Internal Error C25335"

Browse

Activity