Andrea Magni, posted November 26, 2018:
I just ran a quick test using ScaleMM2 ( https://github.com/andremussche/scalemm ) with MARS (MinimalConsole demo, https://github.com/andrea-magni/MARS/tree/master/Demos/MinimalConsole) and it seems to boost performance from roughly 100% to 170% 🙂 If you are using MARS and want to give it a try, let me know how it works for you.
David Heffernan, posted November 26, 2018:
Doesn't this suggest that reducing the amount of heap allocation would lead to much bigger gains?
Andrea Magni, posted November 26, 2018:
1 hour ago, David Heffernan said: "Doesn't this suggest that reducing the amount of heap allocation would lead to much bigger gains?"
I am not really an expert on such low-level topics, but I am sure there is plenty of room for optimization in MARS. So far I have always focused on functionality and ease of use, and there are a couple of spots I know I can easily optimize. It's on my to-do list but not really a priority at the moment. However, any help would be greatly appreciated, in case somebody is willing. 😉
Stefan Glienke, posted November 27, 2018:
It's easy enough to find out where the bulk of the heap allocations comes from by running FastMM in full debug mode and logging the call stacks. Or run under SamplingProfiler and check what hits the memory routines most.
David Heffernan, posted November 27, 2018:
The point is that faster memory allocation treats the symptom rather than the cause. Such a large boost in performance implies that memory allocation is dominating execution time, at least in this benchmark. That in turn implies that reducing the amount of memory allocation could give far greater boosts.
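To put rough numbers on this argument, here is a back-of-the-envelope sketch using Amdahl's law. It rests on assumptions that are not in the thread: "from 100 to 170%" is read as a 1.7x throughput gain, the whole gain is attributed to cheaper memory-manager calls, and the allocator speedup factor (3x below) is purely hypothetical. Effects such as lock contention between threads are ignored, so treat the output as an illustration of the reasoning, not a measurement of MARS.

```python
"""Amdahl's-law estimate of how much run time the memory manager accounts for.

All function names here are illustrative; nothing in this file comes from
MARS, ScaleMM2, or FastMM.
"""


def min_alloc_fraction(observed_speedup: float) -> float:
    """Lower bound on the share of run time spent allocating.

    Even an infinitely fast allocator can only remove the allocation share f,
    so the observed speedup S satisfies S <= 1 / (1 - f), i.e. f >= 1 - 1/S.
    """
    return 1.0 - 1.0 / observed_speedup


def alloc_fraction(observed_speedup: float, allocator_speedup: float) -> float:
    """Allocation share f implied by Amdahl's law.

    1/S = (1 - f) + f/k  =>  f = (1 - 1/S) / (1 - 1/k),
    where k is how many times faster the new allocator is (an assumption).
    """
    return (1.0 - 1.0 / observed_speedup) / (1.0 - 1.0 / allocator_speedup)


def speedup_without_allocations(fraction: float) -> float:
    """Upper bound on the speedup if the allocations were removed entirely."""
    return 1.0 / (1.0 - fraction)


if __name__ == "__main__":
    observed = 1.7      # assumed reading of "from 100 to 170%"
    hypothetical_k = 3.0  # hypothetical allocator speedup, not measured

    f_min = min_alloc_fraction(observed)
    f = alloc_fraction(observed, hypothetical_k)

    print(f"allocation share is at least {f_min:.0%}")
    print(f"with a {hypothetical_k:.0f}x faster allocator, share = {f:.0%}")
    print(f"ceiling if allocations vanish: {speedup_without_allocations(f):.2f}x")
```

Under these assumptions, at least about 41% of the benchmark's run time goes to the memory manager, and the ceiling from removing allocations is higher than what a faster allocator delivers, which is the point being made above.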
Andrea Magni, posted November 27, 2018:
@David Heffernan So I guess it matters that I was just brute-forcing 10k requests with ab.exe (ApacheBench), since the request served was a simple hello-world resource. The per-request setup time should naturally be much larger than the execution time itself. It was just a simple test, but I am open to better benchmarking. Thanks
Andrea Magni, posted November 27, 2018:
4 hours ago, Stefan Glienke said: "It's easy enough to profile where the mass of heap allocations are coming from by using FastMM full debug and log the callstacks. Or run under SamplingProfiler and check what hits the memory routines most."
Thanks for the suggestions. I will give them a try, but this is a bit outside my comfort zone... Thanks @Stefan Glienke!