
Stefan Glienke


Everything posted by Stefan Glienke

  1. A few notes:
     - The position array will very likely be rather short - unless you're going to replace words in a novel or something like that. A handwritten InsertionSort will outperform that.
     - Avoid dynamic allocation for the position array - use a static array with a reasonable standard size and only fall back to dynamic allocation if that is not enough.
     - After fixing that I get an improvement of around 1/3 over the code you posted, and now SamplingProfiler tells me that the code stays in System.Pos for like 50% of the time - that cannot be right.
     Also you have a bug: AddTestCase(vTestCases, 'ReplaceReplaceString', ['Replace', 'FooString'], ['Foo', 'Bar']);
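     The two suggestions above could be sketched roughly like this (a sketch only - the buffer name, the capacity of 64 and the procedure are assumptions for illustration, not the original code):

```pascal
const
  StackCapacity = 64; // "reasonable standard size" - assumed value

procedure SortPositions(var Positions: array of Integer; Count: Integer);
var
  i, j, Pos: Integer;
begin
  // handwritten insertion sort - for the short position arrays this
  // use case produces, it beats a general-purpose sort
  for i := 1 to Count - 1 do
  begin
    Pos := Positions[i];
    j := i - 1;
    while (j >= 0) and (Positions[j] > Pos) do
    begin
      Positions[j + 1] := Positions[j];
      Dec(j);
    end;
    Positions[j + 1] := Pos;
  end;
end;

procedure CollectAndSort;
var
  StackBuf: array[0..StackCapacity - 1] of Integer; // no heap allocation
  HeapBuf: TArray<Integer>;                         // fallback only
begin
  // ... fill StackBuf while the count fits into StackCapacity; only when
  // it does not, copy into HeapBuf (SetLength) and continue there ...
end;
```

     The point of the static buffer is that the common case never touches the memory manager at all.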
  2. First because 64-bit does not suffer from the "virtual method glitch". Second because even with a class implementing the IComparer interface you get adjustor thunks, which you don't have with the default comparers provided by Generics.Defaults because those are directly crafted IMTs (interface method tables).
     BTW: you again unnecessarily slow down your benchmark results by putting everything into the dpr main. And you are comparing different things - your CustomBinarySearch uses CompareValue, which is a function call. So if you decide to roll your own custom implementation then don't dumb it down:

     function GetName_CustomBinarySearch2(const aData: TArray<TDataLine>; const aItem: TDataLine): string;
     var
       L, H, i, c: Integer;
     begin
       L := 0;
       H := High(aData);
       while L <= H do
       begin
         i := L + (H - L) shr 1;
         c := aData[i].SeqNo - aItem.SeqNo;
         if c < 0 then
           L := i + 1
         else if c > 0 then
           H := i - 1
         else
           Break;
       end;
       if L <= H then
         Result := aData[i].CustomName
       else
         Result := '';
     end;

     This makes it almost twice as fast as your implementation. I decided to use the subtract approach as you did with the comparers - keep in mind this only works if there can't be an overflow, which won't happen if SeqNo is just 0..MaxInt. Anyhow, at least in this benchmark it did not make any noticeable difference compared to writing:

     if aData[i].SeqNo < aItem.SeqNo then
       L := i + 1
     else if aData[i].SeqNo > aItem.SeqNo then
       H := i - 1
     else
       Break;
  3. Profiler for Delphi

    Given it's the "official" way to produce these files, I would not say that it's cheating. MS even published the "documentation" (as in "here's the code, figure it out yourself") on GitHub: https://github.com/microsoft/microsoft-pdb If it turns out to be the best way then I could live with the fact that I have to install VS - one would even have to check whether distributing the required dll is permitted, or whether just installing VS to use these dlls requires a license if you are not eligible for the Community Edition.
  4. Profiler for Delphi

    Well, shoot me all the stuff (with fool-proof instructions if there are any ^^) and I'll try it out on VTune 2020.
  5. Profiler for Delphi

    Just a few random thoughts: Could this be the case? https://community.intel.com/t5/Analyzers/Getting-Symbols-in-VTune-Amplifier-2018/td-p/1129907 Have you tried turning the pdb from the sample which seems to work into yaml, analyzing it, and turning it back into a pdb to check if it still works? That would eliminate the differences between the two files and nail down what makes VTune choke. Also, in case you are not afraid of C++, you might look into https://github.com/rainers/cv2pdb which claims to be able to generate pdb files for D that Visual Studio can read (btw, can VS read yours?)
  6. FWIW the approach of using objects as bucket items for a hashmap is a no-go in real code, as your memory will be scattered all over the place, plus there is the memory overhead of objects. In a microbenchmark you will get away with that because there won't be much other memory around. On my i7 the TNameMap already drops behind the RTL dict at around 50000k items. The same seems to be the case for the TSynDict, which drops even more.
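     To illustrate the difference (the types here are made up for illustration, not the benchmark's code): buckets stored as records live in one contiguous array, while buckets as class instances are individual heap allocations with per-object overhead and a pointer indirection on every probe.

```pascal
type
  // cache-friendly: one contiguous allocation holds all buckets,
  // so probing walks linearly through memory
  TRecordBucket = record
    HashCode: Integer;
    Key: string;
    Value: Integer;
  end;
  TRecordBuckets = array of TRecordBucket;

  // cache-hostile: every bucket is its own heap allocation (plus the
  // instance header overhead), and every probe first chases a pointer
  TObjectBucket = class
  public
    HashCode: Integer;
    Key: string;
    Value: Integer;
  end;
  TObjectBuckets = array of TObjectBucket;
```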
  7. Good point - I would even go further and think all the lookup methods should return the found record (or its index) rather than a specific property value, which a) improves performance, b) makes them better comparable and c) makes the code more flexible. Most of them then even turn into easily inlinable one-liners.
  8. Profiler for Delphi

    It might be worth asking over on https://community.intel.com/t5/Analyzers/bd-p/analyzers - maybe someone can point you in the right direction. Also, did you try the pdb in WinDbg?
  9. As always you seem to have read only half of what I wrote.

     Baseline (code as you posted) - running on an i5-3570K @ 3.4GHz

     Get Name from ID (Search by integer):
     records  Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict  Spring
          10         46          397     60          23        48      59
          50         72          474     67          28        52      59
         100        105          504     62          32        51      59

     Get ID from Name (Search by string):
     records  Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict  Spring
          10         21          373    106          11        25     107
          50        105          443    109          18        26     110
         100        204          487    104          23        29     103

     Using the equality comparer from Tiny.Generics.pas for TDict and Spring - running on the i5-3570K @ 3.4GHz

     Get Name from ID (Search by integer):
     records  Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict  Spring
          10         45          407     44          24        49      44
          50         71          468     42          27        48      42
         100         96          499     43          32        53      43

     Get ID from Name (Search by string):
     records  Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict  Spring
          10         21          379     26          11        27      25
          50        106          445     25          18        27      25
         100        207          483     26          22        29      27

     Simply put "procedure Main;" after the CreateRandomData procedure and "end; begin Main;" before the final end - running on the i5-3570K @ 3.4GHz

     Get Name from ID (Search by integer):
     records  Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict  Spring
          10         28          387     27          25        32      26
          50         51          455     25          28        32      24
         100         76          487     27          32        34      27

     Get ID from Name (Search by string):
     records  Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict  Spring
          10         22          380     27          12        25      27
          50        105          446     24          18        28      24
         100        207          489     25          24        30      26

     Same code as before but running on an i7-8700 @ 3.2GHz

     Get Name from ID (Search by integer):
     records  Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict  Spring
          10         23          306     21          20        27      20
          50         42          355     22          24        28      20
         100         63          377     22          26        29      21

     Get ID from Name (Search by string):
     records  Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict  Spring
          10         17          299     16           7        18      17
          50         72          359     16          13        19      17
         100        139          383     17          21        22      18

     Same code compiled with XE8 and also running on the i7-8700 @ 3.2GHz

     Get Name from ID (Search by integer):
     records  Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict  Spring
          10         26          292     22          20        24      21
          50         65          342     22          23        26      21
         100        112          355     21          25        26      23

     Get ID from Name (Search by string):
     records  Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict  Spring
          10         24          281     17           8        18      18
          50        116          331     18          13        20      18
         100        215          347     17          16        22      17

     Edit - for bonus points - I fixed the TArray.BinarySearch case by using an IComparer that I prepared and stored as a field, because that a) solves the return stack buffer glitch caused by TDelegatedComparer and b) avoids producing an anonymous method every time that method is called. Also got rid of the CompareValue call in BinarySearchByID - it can use < and > instead. Running on the i5-3570K @ 3.4GHz

     Get Name from ID (Search by integer):
     records  Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict  Spring
          10         26           64     25          21        31      27
          50         52           73     26          23        32      30
         100         77           84     25          23        34      25

     Get ID from Name (Search by string):
     records  Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict  Spring
          10         22           63     25          11        27      27
          50        105           81     24          18        27      26
         100        205           94     24          22        29      27
  10. As I already said - the high constant factor of the RTL hash function is the problem. Both TDict and Spring use the faster equality comparer from Tiny.Library - results from an i5-3570 @ 3.4GHz. Also, testing in the global main is garbage, because you are then using global variables, which cause far less optimal code. Fix that (simply put all the benchmark code into a procedure) and you get the following results:

      Win32:

      Get Name from ID (Search by integer):
      records  Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict  Spring
           10         26          375     25          24        32      27
           50         56          447     25          29        32      26
          100         83          473     26          32        35      25
         1000        535          585     28          83        37      26
        10000       5561          753     46         130        44      30

      Get ID from Name (Search by string):
      records  Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict  Spring
           10         21          382     28          12        30      27
           50        109          466     24          18        28      25
          100        206          491     24          23        31      26
         1000       2044          603     30          88        39      30
        10000      20532          810     36         181        49      36

      Win64:

      Get Name from ID (Search by integer):
      records  Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict  Spring
           10         24          297     26          23        32      26
           50         60          323     26          30        31      26
          100         94          328     26          40        32      26
         1000        752          381     28          85        47      28
        10000       8619          427     33         128        47      30

      Get ID from Name (Search by string):
      records  Sequenced  TArray.BinS  TDict  CustomBinS  TSynDict  Spring
           10         61          306     24          30        27      24
           50        304          342     23          41        31      22
          100        620          347     23          51        31      23
         1000       6466          426     31         127        52      30
        10000      64263          552     38         237        58      36
  11. Don't let yourself get distracted - I am eagerly awaiting your results. Well, it seems there is a DLL unload callback registered that clears the cache for that module, should it exist. But if I understood the initial problem properly, the case where the resourcestring originally came from the exe itself is not handled by that mechanism.
  12. They store the string reference along with the ident in a hashtable.
  13. Interesting, I did not notice that change until now - but it is actually pretty nice! It avoids tons of identical string instances spread around memory and loading them over and over again. But of course now we get the problem of cache invalidation, hence this routine must be called when the language is changed.
  14. Profiler for Delphi

    I like SamplingProfiler but compared to the capabilities of VTune or μProf it's just a toy.
  15. Profiler for Delphi

    Does it help to put the pdb into one of the directories mentioned in 6. of this doc? https://software.intel.com/content/www/us/en/develop/documentation/vtune-help/top/set-up-project/search-directories/search-order.html
  16. Good Key/Hash for SQL string

    What's the problem with simply using a TDictionary<string,queryresult> where the key is the SQL statement? Even if there is a hash collision, the hashtable handles it. I doubt that the extra comparisons of the colliding SQL strings would affect performance noticeably.
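    A minimal sketch of such a cache, assuming the RTL dictionary (TQueryResult and the wrapper class here are illustrative assumptions, not an existing API):

```pascal
uses
  System.Generics.Collections;

type
  TQueryResult = string; // stand-in for whatever the query returns

  TQueryCache = class
  private
    FResults: TDictionary<string, TQueryResult>;
  public
    constructor Create;
    destructor Destroy; override;
    function TryGet(const SQL: string; out Res: TQueryResult): Boolean;
    procedure Store(const SQL: string; const Res: TQueryResult);
  end;

constructor TQueryCache.Create;
begin
  inherited Create;
  FResults := TDictionary<string, TQueryResult>.Create;
end;

destructor TQueryCache.Destroy;
begin
  FResults.Free;
  inherited;
end;

function TQueryCache.TryGet(const SQL: string; out Res: TQueryResult): Boolean;
begin
  // the dictionary hashes the full statement text; colliding hashes are
  // resolved by comparing the actual key strings, so no extra key scheme
  // (checksum, normalized hash, ...) is needed for correctness
  Result := FResults.TryGetValue(SQL, Res);
end;

procedure TQueryCache.Store(const SQL: string; const Res: TQueryResult);
begin
  FResults.AddOrSetValue(SQL, Res);
end;
```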
  17. "Let's put dots into unit names, add some confusing matching logic for uses clauses that nobody understands and call that namespaces"
  18. Profiler for Delphi

    That's also something that went through my head when I was reading the LLVM documentation.
  19. If you look into the debugger you can see what causes the breakpoint to stop. It looks to me like one of the many situations where the compiler produces debug symbols that are a little off - as you can see in the asm view, two different pieces of code are affected by the breakpoint. The thing is that every instruction generated by the compiler - even the implicit ones for the looping - has to belong to some line of code, and in this case it's the piece of code emitted for the outer loop. We had another situation where this happened just recently:
  20. As Remy rightly assumed, this indeed was a bug - an untyped pointer was assignment compatible with a dynamic array - a very dangerous and long-lasting bug that was finally fixed in 10.2. And because the default for {$TYPEDADDRESS} is OFF, this code actually resulted in the pointer to the TRect variable being passed as a TBytes - and because the code inside that overload never does any range or bounds checks, it happens to work if you pass the correct size of that variable.
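      A sketch of the situation being described (the procedure and its name are illustrative; on 10.2+ the call should no longer compile):

```pascal
{$TYPEDADDRESS OFF} // the default - @R yields an untyped Pointer

procedure Fill(const aBytes: TBytes; aSize: Integer);
begin
  // no range or bounds checks inside - it just touches aSize bytes,
  // so it happens to work when the caller passes the correct size
end;

procedure Caller;
var
  R: TRect;
begin
  // on pre-10.2 compilers this was accepted: the untyped pointer @R was
  // (wrongly) assignment compatible with the dynamic array parameter
  Fill(@R, SizeOf(R));
end;
```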
  21. DEC 6.x issue

    tbh you pretty much butchered the entire code for older versions. I just gave XE a try and it fails on numerous things: unsupported $IF/$ENDIF (it was $IF/$IFEND back then), scoped unit names, usage of intrinsic helpers, AtomicIncrement, and more I guess (I stopped fixing the code at that point).
  22. It does matter - if your ID values, for example, are in a certain range, maybe even starting at 1, you could get O(1) access by simply using them as the index into the array itself. I don't know what that means - but talking about at most 10k records with strings up to 30 characters, looking up one of those is certainly within the ms if not ns range.
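      A sketch of that direct-index idea (the field and array names are assumptions borrowed from the thread's record type): build the lookup array once, then every access is a plain array index.

```pascal
var
  NamesByID: TArray<string>;

// one-time fill; assumes the IDs are reasonably dense and bounded by aMaxID
procedure BuildIndex(const aData: TArray<TDataLine>; aMaxID: Integer);
var
  i: Integer;
begin
  SetLength(NamesByID, aMaxID + 1);
  for i := 0 to High(aData) do
    NamesByID[aData[i].SeqNo] := aData[i].CustomName;
end;

function GetName(ID: Integer): string; inline;
begin
  Result := NamesByID[ID]; // O(1): no hashing, no comparisons, no search
end;
```

      The trade-off is memory: if the ID range is sparse, the array wastes slots, and a hashtable wins again.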
  23. No surprise there - linear search is O(n), binary search is O(log n) and a hash table is O(1). What you see in the differences for certain numbers of records are the different constant factors: computing a hash has a certain cost, which is why TDict is slower up to a certain point - this depends on the hash algorithm being used, on indirections caused by extra function calls like IEqualityComparer.GetHashCode, and on the memory layout of the hash table. You can even see that it's not strictly O(1), as it gets a little slower the more records you have, which I would guess is due to the memory layout within the hashtable: at a certain point memory access time kicks in because the data does not fit into the fast cache anymore (the access pattern of the benchmarking code matters there as well).

      Without specifying the exact use cases it's hard to give any suggestions - is the data filled once and then only looked up, or is it changed afterwards (adding/removing records, changing data within the records)? What's the typical lookup pattern - is data being looked up randomly or in certain patterns (such as ID or Name sequences in a certain order)?

      If you aim for maximum performance, a custom data structure and algorithm will always beat a standard one by a few percent - the question is whether you want to do all that work, including risking bugs, for those negligible few percent, or use standard data structures and algorithms that are proven to work and easy to apply.
  24. That is because technically what the compiler generates for line 93 is this:

      xComparison :=
        function(const left, right: TDataLine): Integer
        begin
          Result := CompareIDs(left, right);
        end;

      When you put a breakpoint there you have it on multiple lines: one time it gets hit for the assignment to xComparison, and all other times for the Result := CompareIDs.