Everything posted by Stefan Glienke
-
http://docwiki.embarcadero.com/Libraries/Sydney/en/System.StrUtils.SplitString
-
Make the duration column smaller. There seems to be a glitch where it sporadically spans the entire width and hides the name column; I have seen this occasionally but could not reproduce it yet.
-
ANN: DDevExtensions and DFMCheck with 10.3 Rio support
Stefan Glienke replied to jbg's topic in Delphi Third-Party
Gogo gadget admin power and fix them 😉 -
ANN: DDevExtensions and DFMCheck with 10.3 Rio support
Stefan Glienke replied to jbg's topic in Delphi Third-Party
The blog moved to https://www.idefixpack.de/ a while ago already -
Fastest Way to Read / Parse a large JSON File?
Stefan Glienke replied to Steve Maughan's topic in Algorithms, Data Structures and Class Design
Aren't "fastest" and "using Delphi's own classes" mutually exclusive? My personal recommendation is https://github.com/ahausladen/JsonDataObjects -
How to handle delphi exception elegantly with logging feature.
Stefan Glienke replied to HalfBlindCoder's topic in Algorithms, Data Structures and Class Design
const true = not true; 😈 -
How to handle delphi exception elegantly with logging feature.
Stefan Glienke replied to HalfBlindCoder's topic in Algorithms, Data Structures and Class Design
Yeah, writing Boolean(0/1) is really much faster than false/true -
How to handle delphi exception elegantly with logging feature.
Stefan Glienke replied to HalfBlindCoder's topic in Algorithms, Data Structures and Class Design
Isn't that what I wrote? Guess what I meant when I wrote "interpret" a bool as a number. -
Spring4D 2.0 sneak peek - the evolution of performance
Stefan Glienke posted a topic in Tips / Blogs / Tutorials / Videos
https://delphisorcery.blogspot.com/2021/06/spring4d-20-sneak-peek-evolution-of.html -
Spring4D 2.0 sneak peek - the evolution of performance
Stefan Glienke replied to Stefan Glienke's topic in Tips / Blogs / Tutorials / Videos
FWIW the difference you saw between an interface-based enumerator and a record-based one might have been bigger than what you see with Spring, because some collection types (I am considering applying this to more of them) do not use classic implemented-by-an-object interfaces for IEnumerator but a handcrafted IMT - see https://bitbucket.org/sglienke/spring4d/src/3e9160575e06b3956c0e90d2aebe5e57d931cd19/Source/Base/Collections/Spring.Collections.Lists.pas#lines-56 - which avoids the adjustor thunks and gives a noticeable performance improvement. I also looked into avoiding the heap allocation for the enumerator by embedding one in the list itself, which the list returns, and only doing a heap allocation once a second enumeration happens. I have not run any benchmarks on that yet to evaluate whether it is worthwhile. After all, if you want maximum raw speed and want to avoid assignments, I already mentioned another approach I am looking into, which also provides for-in loop capability but yields a ^T instead of a T as the loop element. I have that in an experimental branch - I can add a benchmark of that tomorrow. -
Spring4D 2.0 sneak peek - the evolution of performance
Stefan Glienke replied to Stefan Glienke's topic in Tips / Blogs / Tutorials / Videos
Latest Delphi versions inline MoveNext - since 10.3 or so.

The reason to use an interface is simple: API compatibility. Try building something like IEnumerable<T>, and all other collection types based upon and compatible with it, using record-based enumerators that inline. It's not possible. Record-based enumerators are only possible if you make them specifically for each collection type, as in your case, because you don't have a common base type for your collections that is also enumerable. IList<T> in Spring can have multiple implementations (and indeed does), not all of them being wrappers around a dynamic array.

FWIW here are some performance comparisons between the RTL, which uses a class for the enumerator with inlined MoveNext and GetCurrent, a modified RTL version via a helper whose GetEnumerator returns a record, and Spring. The test just iterates over differently sized lists and counts the odd items (the lists are filled with the numbers 1..n in random order).

You can indeed see that the smaller lists have way lower items/sec because the loop overhead is higher. However, I argue that the benefit of the overall design and its flexibility, with all the things you can achieve with IEnumerable<T>, compensates for the higher cost of setting up the loop and not being able to inline MoveNext and GetCurrent. Furthermore, the enumerator in Spring does an additional check in its MoveNext to prevent accidentally modifying the collection that is currently being iterated - which is a not so uncommon mistake.

Also, since IEnumerable<T> and IEnumerator<T> in Spring are composable and are used with the streaming operations, an enumerator always holds the value after a MoveNext. It does not just fetch it from the underlying collection as a naive and fast list iterator would, which only holds the list's array pointer and an index and has an inlined GetCurrent doing Result := items[index].

Remember when I wrote in my article that Spring provides the best balance between the best possible speed and a rich API? This is one of the decisions I had to make, and I am constantly exploring options to make this better. Especially on small lists, the loop overhead can be huge compared to the actual work inside the loop.

FWIW especially for lists, I am currently looking into providing a similar (but safer!) API as the RTL does by giving direct raw access to the backing array. Code using this API can use a record enumerator, which blows the performance totally out of the water.

10.4.2 - win32
Run on (4 X 3410,01 MHz CPU s)
CPU Caches:
  L1 Data 32 K (x4)
  L1 Instruction 32 K (x4)
  L2 Unified 256 K (x4)
  L3 Unified 6144 K (x1)
------------------------------------------------------------------------------------
Benchmark                        Time          CPU  Iterations UserCounters...
------------------------------------------------------------------------------------
iterate-rtl/10                79,6 ns      78,5 ns     8960000 items_per_second=127.431M/s
iterate-rtl/100                291 ns       295 ns     2488889 items_per_second=338.913M/s
iterate-rtl/1000              2286 ns      2302 ns      298667 items_per_second=434.425M/s
iterate-rtl/10000            22287 ns     22461 ns       32000 items_per_second=445.217M/s
iterate-rtl/100000          222090 ns    219702 ns        2987 items_per_second=455.162M/s
iterate-rtl-record/10         17,0 ns      17,3 ns    40727273 items_per_second=579.232M/s
iterate-rtl-record/100         185 ns       184 ns     3733333 items_per_second=543.03M/s
iterate-rtl-record/1000       1737 ns      1716 ns      373333 items_per_second=582.764M/s
iterate-rtl-record/10000     18495 ns     18415 ns       37333 items_per_second=543.025M/s
iterate-rtl-record/100000   179492 ns    179983 ns        3733 items_per_second=555.609M/s
iterate-spring/10             90,2 ns      90,0 ns     7466667 items_per_second=111.132M/s
iterate-spring/100             410 ns       408 ns     1723077 items_per_second=245.06M/s
iterate-spring/1000           3699 ns      3683 ns      186667 items_per_second=271.516M/s
iterate-spring/10000         36136 ns     36098 ns       19478 items_per_second=277.02M/s
iterate-spring/100000       365107 ns    368968 ns        1948 items_per_second=271.026M/s

10.4.2 - win64
Run on (4 X 3410,01 MHz CPU s)
CPU Caches:
  L1 Data 32 K (x4)
  L1 Instruction 32 K (x4)
  L2 Unified 256 K (x4)
  L3 Unified 6144 K (x1)
------------------------------------------------------------------------------------
Benchmark                        Time          CPU  Iterations UserCounters...
------------------------------------------------------------------------------------
iterate-rtl/10                 112 ns       112 ns     6400000 items_per_second=89.0435M/s
iterate-rtl/100                538 ns       530 ns     1120000 items_per_second=188.632M/s
iterate-rtl/1000              4570 ns      4499 ns      149333 items_per_second=222.263M/s
iterate-rtl/10000            45814 ns     46527 ns       15448 items_per_second=214.929M/s
iterate-rtl/100000          457608 ns    455097 ns        1545 items_per_second=219.733M/s
iterate-rtl-record/10         20,1 ns      19,9 ns    34461538 items_per_second=501.259M/s
iterate-rtl-record/100         197 ns       197 ns     3733333 items_per_second=508.369M/s
iterate-rtl-record/1000       1863 ns      1842 ns      373333 items_per_second=543.03M/s
iterate-rtl-record/10000     18664 ns     18834 ns       37333 items_per_second=530.958M/s
iterate-rtl-record/100000   186418 ns    188354 ns        3733 items_per_second=530.916M/s
iterate-spring/10              107 ns       107 ns     6400000 items_per_second=93.0909M/s
iterate-spring/100             493 ns       500 ns     1000000 items_per_second=200M/s
iterate-spring/1000           4298 ns      4332 ns      165926 items_per_second=230.854M/s
iterate-spring/10000         42277 ns     42165 ns       14452 items_per_second=237.161M/s
iterate-spring/100000       422194 ns    423825 ns        1659 items_per_second=235.947M/s
-
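For readers who have not written one, here is a minimal sketch of the "naive and fast" record-based enumerator described above: it only holds the array and an index, and both MoveNext and GetCurrent are marked inline. All names (TIntArrayEnumerator, TIntArrayWrapper, CountOddItems) are made up for illustration; this is neither the RTL helper used in the benchmark nor Spring code, and it has no protection against modifying the collection during iteration.

```pascal
type
  // Hypothetical record enumerator: no heap allocation, no interface calls.
  TIntArrayEnumerator = record
  private
    fItems: TArray<Integer>;
    fIndex: Integer;
  public
    function MoveNext: Boolean; inline;
    function GetCurrent: Integer; inline;
    property Current: Integer read GetCurrent;
  end;

  // Hypothetical collection type; for-in only needs a GetEnumerator method.
  TIntArrayWrapper = record
    Items: TArray<Integer>;
    function GetEnumerator: TIntArrayEnumerator;
  end;

function TIntArrayEnumerator.MoveNext: Boolean;
begin
  Inc(fIndex);
  Result := fIndex < Length(fItems);
end;

function TIntArrayEnumerator.GetCurrent: Integer;
begin
  // Fetches straight from the backing array instead of holding the value.
  Result := fItems[fIndex];
end;

function TIntArrayWrapper.GetEnumerator: TIntArrayEnumerator;
begin
  Result.fItems := Items;
  Result.fIndex := -1; // MoveNext advances to the first element
end;

// Same kind of work as the benchmark loop: count the odd items.
function CountOddItems(const list: TIntArrayWrapper): Integer;
var
  item: Integer;
begin
  Result := 0;
  for item in list do
    if Odd(item) then
      Inc(Result);
end;
```

The enumerator type is tied to this one collection type, which is exactly the limitation mentioned above: there is no common enumerable base type that other collections could share.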
Yes, ditch AnsiString.
-
Google Tests (unit testing framework) working with Embarcadero C++ clang compiler?
Stefan Glienke replied to Roger Cigol's topic in General Help
TestInsight is test-framework agnostic, so it could work with GoogleTests. Apart from direct support for the different run options from the TestInsight window itself, you could probably implement that support yourself just by studying the sources that ship with TestInsight.
-
Spring4D 2.0 sneak peek - the evolution of performance
Stefan Glienke replied to Stefan Glienke's topic in Tips / Blogs / Tutorials / Videos
Yes, developing generic code can be absolutely frustrating depending on the Delphi version. And codegen within generics can be less efficient than it would be if you wrote the identical code directly. I guess you got a glimpse of what I have to endure -
How to handle delphi exception elegantly with logging feature.
Stefan Glienke replied to HalfBlindCoder's topic in Algorithms, Data Structures and Class Design
⚠️ Disclaimer: Micro optimization advice ahead 😉 You only do it the other way around (interpret a bool as a number) to embed conditionals into your algorithm to avoid branching. -
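A small sketch of what "interpret a bool as a number" can look like: the comparison result is added directly instead of guarding an Inc with an if. The function name is made up for illustration, and whether the generated code really ends up branch-free depends on the compiler and version, so treat it as an idea to measure and not as a guarantee.

```pascal
// Hypothetical example: counts odd values without an if statement in the loop body.
function CountOddBranchless(const values: array of Integer): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := Low(values) to High(values) do
    // Ord(False) = 0 and Ord(True) = 1, so the condition itself is the increment.
    Inc(Result, Ord(Odd(values[i])));
end;
```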
Fast lookup tables - TArray.BinarySearch vs Dictionary vs binary search
Stefan Glienke replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
Your assessment on the C# dictionary is incorrect - in fact it uses a pretty nice design - the actual "hashtable" is just an array of integer. Pretty compact. They store the indices to the other array where the actual items reside in and there are no gaps because those items are stored in a contiguous way. No wasting space if you have a low load factor. Yes, there is one indirection but usually, unless you have so many items that these two does not fit into cache anymore this is pretty darn fast. Collisions are being solved by linking the items that collided - can certainly argue there is some wasted space because all items that never were subject to a collision have that next pointer being unused. There are other implementations such as the one in python or the one we implemented in Spring4d (which is very similar to the one in Python) that uses probing with some factor to avoid clustering. More on the C# implementation https://blog.markvincze.com/back-to-basics-dictionary-part-2-net-implementation/ -
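To make the layout described above concrete, here is a heavily simplified sketch: an integer array as the actual "hashtable", a contiguous entries array, and collisions resolved by linking entries through a next index. It is an assumption-laden illustration, not the .NET source and not the Spring4D implementation: the class name is made up, keys are plain integers, and there is no resizing, removal, or duplicate-key check.

```pascal
type
  TEntry = record
    Key: Integer;
    Value: string;
    Next: Integer; // index of the next entry in the same bucket chain, -1 = end
  end;

  // Hypothetical fixed-capacity dictionary, for illustration only.
  TCompactDictionary = class
  private
    fBuckets: array of Integer; // stores entry index + 1, so 0 means "empty"
    fEntries: array of TEntry;  // items are stored contiguously, without gaps
    fCount: Integer;
  public
    constructor Create(capacity: Integer);
    procedure Add(key: Integer; const value: string);
    function TryGetValue(key: Integer; out value: string): Boolean;
  end;

constructor TCompactDictionary.Create(capacity: Integer);
begin
  inherited Create;
  SetLength(fBuckets, capacity); // zero-initialized: all buckets empty
  SetLength(fEntries, capacity);
end;

procedure TCompactDictionary.Add(key: Integer; const value: string);
var
  bucket: Integer;
begin
  Assert(fCount < Length(fEntries), 'this sketch does not grow');
  bucket := (key and MaxInt) mod Length(fBuckets);
  fEntries[fCount].Key := key;
  fEntries[fCount].Value := value;
  // The new entry becomes the chain head and links to the previous head.
  fEntries[fCount].Next := fBuckets[bucket] - 1;
  fBuckets[bucket] := fCount + 1;
  Inc(fCount);
end;

function TCompactDictionary.TryGetValue(key: Integer; out value: string): Boolean;
var
  i: Integer;
begin
  // One indirection: bucket -> entry index, then follow the chain.
  i := fBuckets[(key and MaxInt) mod Length(fBuckets)] - 1;
  while i >= 0 do
  begin
    if fEntries[i].Key = key then
    begin
      value := fEntries[i].Value;
      Exit(True);
    end;
    i := fEntries[i].Next;
  end;
  value := '';
  Result := False;
end;
```

Every entry carries the Next field even if it never collides, which is the wasted space mentioned above.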
Fast lookup tables - TArray.BinarySearch vs Dictionary vs binary search
Stefan Glienke replied to Mike Torrettinni's topic in Algorithms, Data Structures and Class Design
If you say so - people have profiled the crap out of hashtables - and none that is halfway decent is using unnecessary memory indirections or wastes space by storing pointers. The point with strings might be true but then you only pay the price when using strings - and chances are that the string you are using to look up the item is already in your cache. -
Then only +1 when it has decimal places 😉

function RoundUpToFive(AValue: Double): Integer;
begin
  Result := ((Trunc(AValue) div 5) + Byte(Frac(AValue) > 0)) * 5;
end;
-
Spring4D 2.0 sneak peek - the evolution of performance
Stefan Glienke replied to Stefan Glienke's topic in Tips / Blogs / Tutorials / Videos
The opposite - by forcing many specialized types into Spring.Collections.dcu, it avoids stuffing them into each and every other dcu that might use them. Try the following: create 10 units, each with a simple class and a function that calls TCollections.CreateList<TThatClass>, then compile the project once with 1.2 and once with 2.0 and look at the dcus. Also, I can only repeat my suggestion: precompile Spring and use its dcus instead of recompiling it every time you compile your project. That way you gain the main benefit: the most commonly used specializations are in Spring.Collections.dcu and the compiler just needs to reference them. Compile time and memory usage of the compiler for projects using Spring 2.0 have dropped significantly. -
Introducing Spring.Benchmark - a port of Google benchmark
Stefan Glienke posted a topic in Tips / Blogs / Tutorials / Videos
https://delphisorcery.blogspot.com/2021/06/introducing-springbenchmark-port-of.html -
Good question - probably because I did not look into the implementation, as the API does not tell you that it automatically does a UTF-8 conversion, which is usually not the case. But as far as I can see, GetHashString encodes as hex and not as Base64.
-
uses
  System.Hash, System.NetEncoding, System.SysUtils;

function GenerateDigest: string;
var
  bodyText: string;
  payloadBytes: TBytes;
  sha256Hash: THashSHA2;
begin
  bodyText := '{ your JSON payload }';
  // THashSHA2.Create defaults to SHA-256
  sha256Hash := THashSHA2.Create();
  // hash the UTF-8 encoded bytes of the body
  sha256Hash.Update(TEncoding.UTF8.GetBytes(bodyText));
  payloadBytes := sha256Hash.HashAsBytes;
  // Base64-encode the raw hash bytes instead of using a hex string
  Result := 'SHA-256=' + TNetEncoding.Base64.EncodeBytesToString(payloadBytes);
end;
-
Fast Pos & StringReplace for 64 bit
Stefan Glienke replied to Tom de Neef's topic in Algorithms, Data Structures and Class Design
The 64-bit StrPos raises an AV when either string is empty. Also, the overall design of your API is a bit weird to me (or I misunderstand something): why do I need different functions for a one-time search vs. repeated searches? For example, using StrPos in a loop gets terribly slow because all the initialization happens inside that function. -
MulDiv( ) : Integer; in Delphi, cannot find this in the RTL (cross-platform)
Stefan Glienke replied to Rollo62's topic in RTL and Delphi Object Pascal
http://docwiki.embarcadero.com/Libraries/Sydney/en/System.MulDivInt64 - since all platforms apart from Win32 are 64-bit anyway, there is no need to worry that Int64 might be overkill. -
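To show why the 64-bit intermediate matters, here is a hand-rolled sketch that does not call System.MulDivInt64 at all; the function name is made up for illustration. Note the assumptions: it truncates toward zero, whereas the Windows MulDiv API rounds to the nearest integer, and it does not handle a zero denominator or a result that does not fit into an Integer.

```pascal
// Hypothetical helper: multiply in Int64 so the product cannot overflow 32 bits.
function MulDivSketch(value, numerator, denominator: Integer): Integer;
begin
  // Int64(value) promotes the whole multiplication to 64-bit arithmetic.
  Result := Integer((Int64(value) * numerator) div denominator);
end;
```

With plain Integer arithmetic, 100000 * 100000 would already overflow before the division happens; with the Int64 intermediate, MulDivSketch(100000, 100000, 1000) yields 10000000.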
Advantages of record constructor over record class function, reviewed after CustomRecords
Stefan Glienke replied to Rollo62's topic in RTL and Delphi Object Pascal
Exactly - this is one of the biggest gripes I have with Delphi/Pascal: the two separate concerns of allocating memory and initialization are mixed together. This is also what prevents having stack objects. C++, for example, does it differently: new allocates memory and also calls the ctor, but the two are not mixed - you can also just use an object on the stack or a vector/array of objects without memory indirection.