Search the Community
Showing results for 'profiling'.
Found 57 results
-
Worker thread queue performance
Anders Melander replied to snowdev's topic in Algorithms, Data Structures and Class Design
I didn't investigate but I got a lot of leaks reported when exiting the application when running in the debugger. Okay. It's expensive to start a thread, but if you are launching the threads at application startup then it doesn't matter. If you create them on-demand then I would use TTask instead. The first task will take the worst of the pool initialization hit. If you use a lock-free structure then you don't need locking. Hence the "free" in the name 🙂 And FTR, the term deadlock means a cycle where two threads each have some resource locked and each is waiting for the other to release their resource. I think what you meant was race condition: two threads modifying the same resource at the same time. PWideChar is supposedly a pointer to a WideString? In that case, please don't. WideString is only for use in COM and it's horribly slow. No, what I meant was that instead of using dynamic strings (which are relatively slow because they must be allocated, sized, resized, freed, etc.) use a static array of chars: Buffer: array[0..BufferSize - 1] of Char. You will waste some bytes but it's fast.- 10 replies
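A minimal sketch of the static-buffer idea from the reply above (BufferSize and the record layout are illustrative assumptions, not taken from the actual project):

const
  BufferSize = 256;

type
  // Fixed-size payload: no heap allocation, no reference counting,
  // and safe to copy between threads with a plain Move.
  TQueueItem = record
    TextLen: Integer;
    Buffer: array[0..BufferSize - 1] of Char;
  end;

procedure SetItemText(var Item: TQueueItem; const S: string);
begin
  // Truncate rather than reallocate; wastes a few bytes but avoids the
  // allocate/resize/free cost of a dynamic string per queue operation.
  Item.TextLen := Length(S);
  if Item.TextLen > BufferSize then
    Item.TextLen := BufferSize;
  if Item.TextLen > 0 then
    Move(PChar(S)^, Item.Buffer[0], Item.TextLen * SizeOf(Char));
end;

The trade-off is the fixed upper bound: anything longer than BufferSize is truncated, so size the constant for the worst realistic payload.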
-
Tagged with: multithreading, queue (and 5 more)
-
Worker thread queue performance
snowdev replied to snowdev's topic in Algorithms, Data Structures and Class Design
I searched the internet when I started the project and found some posts about TMonitor performance. I also found a blog post by gabr42 (the OmniThreadLibrary creator) about this and just decided to use TCriticalSection. That makes sense; I've tested it this way and got similar results, and the simple FIFO queue wins (working with objects or pointers). Not exactly: ReportMemoryLeaksOnShutdown didn't report any leaks when running the tests… every queue variant releases its resources. I'll take a look into that; I usually don't. That's why I didn't include it in the given example… every thread is brought up at application initialization. Thanks for the tip. I don't know of a profiling lib for Delphi, but I'll measure them with stopwatches. I only use locking because I don't know whether there could be a deadlock when another thread is pushing while the worker is popping, so I do it just in case. Are you saying that scenario isn't possible? As for the Windows message queue, it seems as slow as a simple FIFO as well, though I'll continue using this approach. In the next few days I'll build a ring-buffer-like approach and test its performance against TQueue (which internally uses an array of T, btw). As for strings, I could switch to PWideChar as well; I use string for ease. Almost the same performance as TQueue. Thanks for the reply.- 10 replies
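A minimal sketch of the ring-buffer approach mentioned above, guarded by the same TCriticalSection style of locking (fixed capacity; the class and field names are illustrative):

uses
  System.SyncObjs;

type
  TPointerRingBuffer = class
  private
    FItems: TArray<Pointer>;
    FHead, FTail, FCount: Integer;
    FLock: TCriticalSection;
  public
    constructor Create(ACapacity: Integer);
    destructor Destroy; override;
    function TryPush(AItem: Pointer): Boolean;
    function TryPop(out AItem: Pointer): Boolean;
  end;

constructor TPointerRingBuffer.Create(ACapacity: Integer);
begin
  inherited Create;
  SetLength(FItems, ACapacity); // storage is allocated once, up front
  FLock := TCriticalSection.Create;
end;

destructor TPointerRingBuffer.Destroy;
begin
  FLock.Free;
  inherited;
end;

function TPointerRingBuffer.TryPush(AItem: Pointer): Boolean;
begin
  FLock.Acquire;
  try
    Result := FCount < Length(FItems);
    if Result then
    begin
      FItems[FTail] := AItem;
      FTail := (FTail + 1) mod Length(FItems); // wrap around
      Inc(FCount);
    end;
  finally
    FLock.Release;
  end;
end;

function TPointerRingBuffer.TryPop(out AItem: Pointer): Boolean;
begin
  AItem := nil;
  FLock.Acquire;
  try
    Result := FCount > 0;
    if Result then
    begin
      AItem := FItems[FHead];
      FHead := (FHead + 1) mod Length(FItems); // wrap around
      Dec(FCount);
    end;
  finally
    FLock.Release;
  end;
end;

With a single producer and a single consumer, the lock could later be replaced by interlocked head/tail indices, but the locked version is easier to get right first.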
-
Tagged with: multithreading, queue (and 5 more)
-
It took me a bit longer than expected to get here but I believe I've finally reached the goal. The following shows VTune profiling a Delphi application, with symbol, line number and source code resolution.

Download
Get the source here: https://bitbucket.org/anders_melander/map2pdb/
And a precompiled exe here: https://bitbucket.org/anders_melander/map2pdb/downloads/
The source has only been tested with Delphi 10.3 - it uses inline vars, so it will not compile with older versions.

Usage
map2pdb - Copyright (c) 2021 Anders Melander
Version 2.0
Parses the map file produced by Delphi and writes a PDB file.

Usage: map2pdb [options] <map-filename>

Options:
  -v                         Verbose output
  -pdb[:<output-filename>]   Writes a PDB (default)
  -yaml[:<output-filename>]  Writes a YAML file that can be used with llvm-pdbutil
  -bind[:<exe-filename>]     Patches a Delphi compiled exe file to include a reference to the pdb file
  -test                      Works on test data. Ignores the input file

Example:
Configure your project linker options to output a Detailed map file.
Compile the project.
Execute map2pdb <map-filename> -bind
Profile the application with VTune (or whatever).

Known issues
The -bind switch must occur after the filename, contrary to the usage instructions.
PDB files larger than 16Mb are not valid. This is currently by design.
64-bit PE files are not yet supported by the -bind option.

As should be evident, I decided not to go the DWARF route after all. After spending a few days reading the DWARF specification and examining the FPC source, I decided that it would be easier to leverage the PDB knowledge I had already acquired. Not that this has been easy. Even though I've been able to use the LLVM PDB implementation and Microsoft's PDB source as a reference, LLVM's implementation is incomplete and buggy, and the LLVM source is "modern C++", which means that it's close to unreadable in places. Microsoft's source, while written in clean C and guaranteed to be correct, doesn't compile and is poorly commented. Luckily it was nothing a few all-nighters with a disassembler and a hex editor couldn't solve. Enjoy!
-
WebUI framework: Technical preview. Part 1.
Alexander Sviridenkov replied to Alexander Sviridenkov's topic in I made this
Main differences:
1. In WebUI you do not work with simple controls like Label, Edit, Combo, etc., only with high-level entities. E.g. "I want to have a filter on the toolbar for this field".
2. The library is not tied to particular JS/CSS libraries; you can define how it renders the UI by changing templates.
3. The UI is designed in the browser and is stored in the DB. Therefore: a) You can have a different UI for each customer/site using one compiled application, and easily copy/paste forms between sites. b) Changes are simpler. Standard way: customer request, e.g. add a new column to a grid - open the application in the IDE, find the form, find the dataset, change the query, find the grid, add the column, compile, test, send back to the customer. WebUI: press F2 in the browser, open the designer, change the SQL, add the column, press Apply.
4. Very powerful grid/listview. Easy to define cell content, color, background, format, etc.
5. Lots of built-in features (with UI created using WebUI itself) - localization with automatic translation, session management, forms management, SQL and page profiling, server monitoring, etc.
6. Embedded access rights management. All entities are linked to dataclasses and the UI is displayed with respect to the current user role (e.g. the Delete button will be hidden if the role has no delete rights for this class).
7. SQL, JS and script editors with completion and hints.
8. Some parts are created automatically based on DB schema and content analysis. You can download the demo, enter your database details and get a working application in 1-5 minutes.
9. Easy integration with Delphi. Register your class in WebUI and you will be able to call any of its methods from scripts.
-
how to filter on source files using VTune?
Anders Melander replied to merijnb's topic in Delphi Third-Party
Assuming you're using map2pdb to generate the PDB file for VTune, you can specify on the map2pdb command line which units to include and which to exclude. For example, I often use these filters (excludes DevExpress, the VCL, Indy, etc.): -include:0001 -exclude:dx*;cx*;winapi.*;vcl.*;data.*;firedac.*;soap.*;web.*;datasnap.*;id*;vcltee.* However, that will not prevent VTune from profiling that code. It just can't resolve the addresses in those units. As far as I know, the only way to filter data in VTune is to either use the filter fields at the bottom of the window or to select the lines you'd like to exclude/include, right-click and then select Filter In by Selection/Filter Out by Selection from the context menu. https://www.intel.com/content/www/us/en/docs/vtune-profiler/user-guide/2024-2/context-menu-grid.html -
Hello, In a VCL application I am currently trying to optimize a single-threaded task that is doing many complex geometric calculations and that takes around 2 minutes and 20 seconds to execute. It seems like a good candidate for a multithreading strategy. My computer has 8 cores and 16 threads, but I am only using 8 threads for now. Here is the code implementing the Parallel.For loop:

var lNumTasks := 8;
SetLength(lVCalculBuckets, lNumTasks);
Parallel.For<TObject>(lShadingStepListAsObjects.ToArray)
  .NoWait
  .NumTasks(lNumTasks)
  .OnStop(Parallel.CompleteQueue(lResults))
  .Initialize(procInitMultiThread)
  .Finalize(procFinalizeMultiThread)
  .Execute(procExecuteMultiThread);

procInitMultiThread and procFinalizeMultiThread copy and free lVCalculBuckets, which contains one copy of our working objects per thread:

procedure TMyClass.procInitMultiThread(aTaskIndex, aFromIndex, aToIndex: Integer);
var
  lVCalcul: TVCalcul;
begin
  // Copy data
  lVCalcul := TVCalcul.Create(nil);
  lVCalcul.CopyLight(Self.VCalcul);
  lVCalculBuckets[aTaskIndex] := lVCalcul;
end;

procedure TMyClass.procFinalizeMultiThread(aTaskIndex, aFromIndex, aToIndex: Integer);
var
  lVCalcul: TVCalcul;
begin
  // Delete copied data
  lVCalcul := TVCalcul(lVCalculBuckets[aTaskIndex]);
  FreeAndNil(lVCalcul);
end;

procExecuteMultiThread just performs the calculations and posts them back to the calling thread so that they are displayed on the VCL interface:

procedure TMyClass.procExecuteMultiThread(aTaskIndex: Integer; var aValue: TObject);
var
  lVCalcul: TVCalcul;
  lRes: TStepRes;
begin
  // Retrieve data
  lVCalcul := TVCalcul(lVCalculBuckets[aTaskIndex]);
  if Assigned(lVCalcul) then
  begin
    // Calculate factors
    lRes := TShadingStepRes(aValue);
    lVCalcul.CalculateFactors(lRes.Height, lRes.Width);
    // Post results
    lRes.FillResFromVCalcul(lVCalcul);
    lResults.Add(TOmniValue.CastFrom<TStepRes>(lRes));
  end;
end;

Now this implementation runs in about 1 min 50 s, which is faster than the single-threaded version, but far from the gains I expected. I tried simplifying the code by removing the "Post results" part, thinking that it was causing synchronization delays, but it didn't have any effect. Running the application inside SamplingProfiler and profiling a worker thread shows that 80% of the time spent by this thread is in NtDelayExecution. Yet I have no idea why, because in the calculation part itself there isn't any synchronization code that I am aware of. If any of you would be able to point me in the right direction to further debug this, it would be much appreciated.
-
For better results, run the measured process over and over again while profiling. It gets more accurate with every sample you get. I've even run with Monte Carlo enabled (multi-threaded app) for 30 minutes, and then checked the results.
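A minimal harness for that repeat-the-work approach, so the sampling profiler can accumulate samples (RunMeasuredProcess is a placeholder for whatever routine is being profiled):

uses
  System.SysUtils, System.Diagnostics;

procedure ProfileHarness(Iterations: Integer);
var
  SW: TStopwatch;
  I: Integer;
begin
  SW := TStopwatch.StartNew;
  for I := 1 to Iterations do
    RunMeasuredProcess; // placeholder: the code path under the profiler
  SW.Stop;
  Writeln(Format('%d iterations in %d ms (%.3f ms each)',
    [Iterations, SW.ElapsedMilliseconds, SW.ElapsedMilliseconds / Iterations]));
end;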
-
Anders said in another thread ... Hi Anders. Thanks for the feedback, much appreciated. Yes, I have profiled my Clipper library's Delphi code (and not infrequently), but until now I haven't profiled into Delphi's Runtime Library. And I see your point re TList.Get, if you're alluding to its relatively expensive range checking? However, I can't see an easy hack (e.g. by overriding Get, since it isn't virtual). But I could write a wrapper Get function (using TList's public List property) and bypass the default TList.Items property altogether. Is this what you're suggesting, or would you suggest I do something else? And are you also seeing other "low hanging fruit" that would improve performance? Edit: I've replaced all implicit calls to TList.Get with the following function and as a consequence have seen quite a substantial improvement in performance (and Delphi now slightly out-performs C#).

function UnsafeGet(List: TList; Index: Integer): Pointer; inline;
begin
  // caution: no bounds checking
  Result := List.List[Index];
end;
-
Polywick Studio - Delphi and C++ Builder developer.
PolywickStudio posted a topic in Job Opportunities / Coder for Hire
I operate a small business doing Delphi coding and work-for-hire. My services are: App modernization and legacy upgrades, from Delphi 1 (Windows 3.1) to Delphi XE 12.1. See: Case Studies. I've personally developed C++ UI components and newer Delphi Skia-based components. New application development (UI work, coding). You can choose from different team options: Dedicated team - need a team? Staff augmentation (developers on an on-and-off basis), Outsourcing (such as 3 hours a week, or once a month to update a Delphi app). What you get: Full source code in a GitHub/GitLab/Bitbucket/Azure repository. CI/CD builds (where applicable), so when there's a check-in, a build is made. Markdown-documented and unit-tested code. An optional profiling project if code is slow. A warranty (if there's a small fix, it'll get fixed). I'm top-rated on Upwork and on Embarcadero Blogs. My website and contact form. -
What are the performance profilers for Delphi 12?
Brian Evans replied to dmitrybv's topic in General Help
There is Nexus Quality Suite | NexusDB, which "works with Delphi 5 to Rad Studio 12 Yukon and beyond!". It has a long lineage, as it is a successor to the long-defunct TurboPower Sleuth QA Suite. A book on high-performance Delphi is also available, with a free chapter on profiling that covers various tools: Delphi High Performance - Second Edition | Packt (packtpub.com) -
MAP2PDB - Profiling with VTune
Wagner Landgraf replied to Anders Melander's topic in Delphi Third-Party
Probably that's what is happening here. I had an older VTune version that worked, and now I've installed VTune 2024, which doesn't. (Of course, "working" was very limited: it didn't support any hardware-assisted profiling, but at least the Hotspots analysis without hardware support was working, just to profile application logic.) The problem is: where the heck do I find old VTune versions to install? I've searched everywhere and I can't find any information. The closest I found was to register for an Intel account and go to some download/registration center, but nothing is displayed there as it only lists "registered" products. Does anyone happen to have an old offline installer of VTune 2023, or maybe 2022? -
MAP2PDB - Profiling with VTune
Anders Melander replied to Anders Melander's topic in Delphi Third-Party
VTune only supports Intel hardware as it relies on certain CPU features that are only available on Intel CPUs. At least that's what they claim: https://www.intel.com/content/www/us/en/developer/articles/system-requirements/vtune-profiler-system-requirements.html Maybe you can get an older version of VTune to work. For example, the current version of VTune doesn't support hardware-assisted profiling on my (admittedly pretty old) processor. -
MAP2PDB - Profiling with VTune
Vincent Parrett replied to Anders Melander's topic in Delphi Third-Party
This is awesome, Embarcadero have no excuse for poor performance now! 🤣 Nice one. As someone who uses runtime packages (due to the plugin architecture of my application), this is a godsend. I have profiled the crap out of my code and found a lot of small improvements and a couple of large ones, but quite often the profiler shows that the majority of the time ends up somewhere inside the RTL - this will allow me to do another round of profiling. The only downside is that I have an AMD CPU, and uProf is nowhere near as good as VTune (although both suck when it comes to UI/UX). -
Why does Delphi 12 marginally bloat EXE file size compared to 11.1?
PaulM117 replied to PaulM117's topic in RTL and Delphi Object Pascal
In September I restarted my flagship application from scratch and incrementally tracked each increase in file size from a bare Win64 VCL app in Delphi 11.1. So I did have a reasonable accounting. I never bought 11.3 as my update subscription expired. I do of course measure performance. This was about EXE bloat. Your posts and scattered internet writings, along with Dalija, Primoz, Arnaud, Remy, and others I am forgetting, have helped me tremendously through the years to learn performant and efficient coding - I am thankful for the dialogue. Let me try my best to contradict your overall (and practically mostly true) argument for my specific case. It is possible to attain optimal performance in Delphi by the mere fact that we can write Win64 assembler code. Moreover: For SSE2, I use Neslib.FastMath, which beats the previous MS D3DX10 DLL SSE-optimized libraries I was using - this is a graphics application and heavily GPU bound due to my excellent, lean, low-level, cache-efficient coding strategies for CPU code possible in Delphi. I am running at 120fps with less than 2% CPU usage in Task Manager with an unfinished app, and am confident I can keep it that way. A C++ app where the programmer has not laid out memory in a cache-efficient manner will have worse performance than a correctly written Delphi Win64 app using FastMath for SSE (packing into TVector4s). I use inline constants everywhere to pull things into registers and pay as much attention as I can to producing clean assembly with Object Pascal syntax. For instance, I never write for var I := 0 to Length(arr)-1 in hot paths, after I found (admittedly a few versions ago) that it did a repeated call to Length() and/or subtraction - I always go const AHi = High(arr); for var I := 0 to AHi do, etc. If I were using C++ I would have the convenience of trusting that little stupid things like that were already taken care of for me, but the benefits of Delphi for UI design outweigh the minor inconveniences. However, I know you are right about the overall quality of Embarcadero's compiler, which I lament. I greatly lament that Embarcadero has spent more time on FireMonkey/C++/database stuff that's irrelevant to me rather than making a highly optimized Win32/64 compiler and built-in profiling/instrumenting tools. However, this is just a matter of convenience. I still reject that I can't produce performance from a Delphi Win64 application as optimal as from MSVC, only that it requires much more attention in some areas. Let me know if you disagree with that. BTW - I did decide against the EXE compression libraries for that very reason of their apparent likelihood to trigger AV detections.
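A minimal sketch of the hoisted-bound pattern described above (the routine and array names are illustrative):

procedure SumHotPath(const Values: TArray<Double>; out Total: Double);
begin
  Total := 0;
  // Evaluate the upper bound once instead of re-evaluating Length/High
  // on every iteration of the hot loop.
  const AHi = High(Values);
  for var I := 0 to AHi do
    Total := Total + Values[I];
end;
-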
String comparison in HashTable
Arnaud Bouchez replied to Tommi Prami's topic in Algorithms, Data Structures and Class Design
1) TL;DR
Do not try to use those tricks in anything close to a common-use hash table, e.g. your own data library. So much micro-benchmarking for no benefit in practice: all this is not applicable to a common-use library. The Rust RTL has already been optimized and scrutinized by a lot of people and production workloads. I would never try to apply what he found for his own particular case to any hash table implementation.
2) Hash function
From profiling on real workloads, with a good enough hash function, there are only a few collisions. Of course, FNV is a slow function, with a bad collision rate. With a good hash function, e.g. our AesNiHash32 from https://github.com/synopse/mORMot2/blob/master/src/crypt/mormot.crypt.core.asmx64.inc#L6500 which is inspired by the one in the Go RTL, comparing the first char is enough to reject most collisions, due to its almost-random hash spreading. Then, the idea of "tweaking the hash function" is just a pure waste of computer resources for a common-use library, once you have a realistic hash table with thousands (millions) of items. Of course, this guy wants to hash a table of a few elements, which are known in advance. So it is not a problem for him. So no hash function was the fastest. Of course. But this is not a hash table any more - it is a dedicated algorithm for a specific use-case.
3) Security
Only using the first and last characters is an awful assumption for a hash process in a common library. It may work for his own dataset, but it is a very unsafe practice. This is the 101 of hash table security: don't make it guessable, or you would expose yourself to hash flooding http://ocert.org/advisories/ocert-2012-001.html
4) One known algorithm for such a fixed keyword lookup
The purpose of this video is to quickly find a value within a fixed list of keywords. And from what I have seen in practice, some algorithms would perform better because they won't involve a huge hash table and won't pollute the CPU cache. For instance, this code is used on billions of computers, on billions of datasets, and works very well in practice: https://sqlite.org/src/file?name=src/tokenize.c&ci=trunk The code is generated by https://sqlite.org/src/file?name=tool/mkkeywordhash.c&ci=trunk An extract is:

/* Check to see if z[0..n-1] is a keyword. If it is, write the
** parser symbol code for that keyword into *pType. Always
** return the integer n (the length of the token). */
static int keywordCode(const char *z, int n, int *pType){
  int i, j;
  const char *zKW;
  if( n>=2 ){
    i = ((charMap(z[0])*4) ^ (charMap(z[n-1])*3) ^ n*1) % 127;
    for(i=((int)aKWHash[i])-1; i>=0; i=((int)aKWNext[i])-1){
      if( aKWLen[i]!=n ) continue;
      zKW = &zKWText[aKWOffset[i]];
      if( (z[0]&~0x20)!=zKW[0] ) continue;
      if( (z[1]&~0x20)!=zKW[1] ) continue;
      j = 2;
      while( j<n && (z[j]&~0x20)==zKW[j] ){ j++; }
      if( j<n ) continue;
      *pType = aKWCode[i];
      break;
    }
  }
  return n;
}

Its purpose was to reduce the code size, but in practice it also reduces CPU cache pollution and tends to be very fast, thanks to a 128-byte hash table. This code is close to what the video proposes - just even more optimized.
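A minimal Delphi sketch of the "cheap rejection before the full compare" idea from point 2 (illustrative; not taken from mORMot):

function SameKey(const Stored, Candidate: string): Boolean; inline;
var
  L: Integer;
begin
  L := Length(Stored);
  // Cheap rejections first: length, then first character, then the full compare.
  Result := (L = Length(Candidate)) and
            ((L = 0) or ((Stored[1] = Candidate[1]) and (Stored = Candidate)));
end;

With a well-spread hash, most colliding buckets never get past the first two tests.
-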
Hi all, I'm looking to find a decent, easy-to-use Profiler to help identify performance bottlenecks and memory leaks. I'm a small-project user so not looking to pay too much - free and open source options are welcome as long as they're straightforward to use - no command line stuff, please. It's 2023 and I'm using RAD Studio 10.4.2 - I believe SmartBear used to provide a standard version of AQTime with an earlier version of RAD, shame they stopped doing this really because now it seems we have to pay big bucks for this particular profiling system. Recommendations and suggestions welcome, thank you!
-
Hi everyone, I am looking for relocation (or remote) Delphi jobs (special interest if OpenGL and/or gamedev is involved). I have all the equipment needed to work remotely and the determination to relocate if necessary. I have been using Delphi since 2001 and working professionally with it since 2010. Core languages: Delphi 7 – 11, GLSL, SQL, JavaScript. Familiarity with: C++, C#, Java. GameDev: OpenGL 1.2 - 3.3, OpenGLES2, OpenAL, Overbyte ICS, Genetic algorithms, AI. Competence: algorithms and data structures, debugging, profiling, OOP, problem-solving, unit testing and TDD, geometry and math, multithreading, localization and API integration. Additional areas of expertise: SVN, Git, FastReport, JasperReports, FHIR, BPMN, XML, JSON, WebGL, highcharts.js, three.js, Kafka, OpenAL, Overbyte ICS, Genetic algorithms, AI. Asset creation: Photoshop, Illustrator, Lightwave 3D, Substance Designer, MS Office, Google Docs\Sheets, etc. Contacts: kromster380@gmail.com
-
I'm not an expert with PDB, but this is what I've done and it now seems to work:
- Installed the latest version of VTune Profiler (2023.1.0).
- Replaced the original amplxe_media140.dll with an old version from Jan Rysavy (previous posts).
- Enabled the map file in Detailed mode.
- Compiled a project of 1.359.947 code lines (including library sources), which generates a map file of 49.696.190 bytes.
- Executed map2pdb.exe rosettacncpph1.64.map -bind -v

D:\x\develop\qem\rosetta_cnc_1>map2pdb.exe rosettacncpph1.64.map -bind -v
map2pdb - Copyright (c) 2021 Anders Melander
Version 2.8.0
Constructed a new PDB GUID: {F2D8CB4B-DA08-4BBD-A399-DBC449AF1364}
Output filename not specified. Defaulting to rosettacncpph1.64.pdb
Reading MAP file
- Segments
- Modules
- Symbols
Warning: [116390] Failed to resolve symbol to module: [0004:00000000000002C8] SysInit.TlsLast
Warning: [116392] Failed to resolve symbol to module: [0003:00000000FE7F6000] SysInit.__ImageBase
- Line numbers
Collected 3.996 modules, 182.925 symbols, 525.223 lines, 985 source files
Constructing PDB file
- Collecting source file names
- Module streams
- Strings stream
- PDB Info stream
- TPI stream
- Symbols stream
- DBI stream
- IPI stream
- Finalizing PDB file
- 9.068 blocks written in 3 intervals
PE filename not specified. Defaulting to rosettacncpph1.64.exe
Patching PE file
- PE32+ image (64-bit)
- Adding .debug section.
- PDB file name has been stored in debug data.
- PE file has been updated.
Elapsed time: 00:00:00.895

This has generated a PDB file of 37.142.528 bytes.
- Started profiling the EXE in VTune Profiler, which generated this log when profiling stopped:

Data collection is completed successfully May 22 2023 18:06:59
The result file 'XXX\r001hs\r001hs.vtune' is successfully created and added to the project.
Finalization completed with warnings May 22 2023 18:08:33
Result finalization has completed with warnings that may affect the representation of the analysis data. Please see details below.
Cannot locate debugging information for file `C:\WINDOWS\System32\msvcrt.dll'.
Cannot locate debugging information for file `C:\WINDOWS\WinSxS\amd64_microsoft.windows.common-controls_6595b64144ccf1df_6.0.19041.1110_none_60b5254171f9507e\COMCTL32.dll'.
Cannot locate debugging information for file `C:\WINDOWS\WinSxS\amd64_microsoft.windows.gdiplus_6595b64144ccf1df_1.1.19041.2251_none_91a40448cc8846c1\gdiplus.dll'.
Cannot locate debugging information for file `C:\WINDOWS\SYSTEM32\ntdll.dll'.
Cannot locate debugging information for file `C:\WINDOWS\System32\GDI32.dll'.
Cannot locate debugging information for file `C:\WINDOWS\System32\SETUPAPI.DLL'.
Cannot locate debugging information for file `C:\WINDOWS\System32\KERNEL32.DLL'.
Cannot locate debugging information for file `C:\WINDOWS\System32\cfgmgr32.dll'.
Cannot locate debugging information for file `C:\WINDOWS\System32\gdi32full.dll'.
Cannot locate debugging information for file `C:\Program Files\Bitdefender\Endpoint Security\bdhkm\dlls_266262988153465131\bdhkm64.dll'.
Cannot locate debugging information for file `C:\WINDOWS\System32\user32.dll'.
Cannot locate debugging information for file `C:\WINDOWS\System32\combase.dll'.
Cannot locate debugging information for file `C:\Program Files\Bitdefender\Endpoint Security\atcuf\dlls_266575548366517634\atcuf64.dll'.
Cannot locate debugging information for file `C:\WINDOWS\system32\mswsock.dll'.
Cannot locate debugging information for file `C:\WINDOWS\system32\uxtheme.dll'.
Cannot locate debugging information for file `C:\WINDOWS\System32\ole32.dll'.
Cannot locate debugging information for file `C:\WINDOWS\SYSTEM32\opengl32.dll'.
Cannot locate debugging information for file `C:\WINDOWS\SYSTEM32\DEVOBJ.dll'.
Cannot locate debugging information for file `C:\WINDOWS\System32\KERNELBASE.dll'.
Cannot locate debugging information for file `C:\WINDOWS\SYSTEM32\TextShaping.dll'.
Cannot locate debugging information for file `C:\Windows\System32\msxml6.dll'.
Cannot locate debugging information for file `C:\WINDOWS\System32\WS2_32.dll'.
Cannot locate debugging information for file `C:\WINDOWS\System32\DriverStore\FileRepository\nv_dispi.inf_amd64_c1a085cc86772d3f\nvoglv64.dll'.
Cannot locate debugging information for file `C:\WINDOWS\SYSTEM32\HID.DLL'.
Cannot locate debugging information for file `C:\WINDOWS\System32\win32u.dll'.
Cannot locate debugging information for file `C:\Program Files (x86)\Intel\oneAPI\vtune\latest\bin64\tpsstool.dll'.

And all project symbols are visible (just not for the DLLs listed above).
-
Create Class at run time with an AncestorClass and a ClassName
SwiftExpat replied to Robert Gilland's topic in RTL and Delphi Object Pascal
Profiling that code is necessary to figure out which component is taking the time, as Brian Evans said above. I use one component that takes 23 ms to execute its Create. That is a long time considering there are 5 instances on the form. My fix was to delay the creation of the 4 instances that are not visible.
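A minimal sketch of that delayed-creation pattern (a VCL form is assumed; TSlowExpensiveControl and the field names are hypothetical, used only for illustration):

type
  TMainForm = class(TForm)
  private
    FSlowControl: TSlowExpensiveControl; // hypothetical control with an expensive Create
    function GetSlowControl: TSlowExpensiveControl;
  public
    property SlowControl: TSlowExpensiveControl read GetSlowControl;
  end;

function TMainForm.GetSlowControl: TSlowExpensiveControl;
begin
  // Pay the expensive constructor only when the control is actually needed,
  // instead of once per instance while the form is being created.
  if FSlowControl = nil then
  begin
    FSlowControl := TSlowExpensiveControl.Create(Self);
    FSlowControl.Parent := Self;
  end;
  Result := FSlowControl;
end;
-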
Apologies for the delay (illness). Thank you guys for all of your input and comments. CreateTimerQueueTimer seems to be the best method (CPU time vs accuracy). @KodeZwerg The multimedia timer (your first posted code) works very precisely too, however I'm getting an AV on form close (with low intervals <= 50 ms) - not always. As for the latest method, can you show some basic implementation like for the first one? Fire an action every set interval, I mean, for unit kz.Windows.Timer; it would probably be good to add procedures and properties for Period and DueTime (SetPeriod, SetDueTime). QPF/QPC seem to be a no-go for the usage I first wanted; they are very good for measuring the execution time of some portion of code, e.g. for profiling/optimizations. But for other purposes CreateTimerQueueTimer seems to be the king. @Lars Fosdal Fire and run until the app is closed/destroyed (launch tasks every X time with precision where TTimer is too weak).
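A minimal sketch of driving a periodic action with CreateTimerQueueTimer (the kernel32 imports are declared locally here as an assumption, in case your Winapi.Windows version does not expose them; note that the callback fires on a thread-pool thread, not the main thread):

program TimerQueueDemo;

{$APPTYPE CONSOLE}

uses
  Winapi.Windows;

type
  TWaitOrTimerCallback = procedure(Context: Pointer; TimerOrWaitFired: ByteBool); stdcall;

function CreateTimerQueueTimer(out phNewTimer: THandle; TimerQueue: THandle;
  Callback: TWaitOrTimerCallback; Parameter: Pointer; DueTime, Period: DWORD;
  Flags: ULONG): BOOL; stdcall; external kernel32;
function DeleteTimerQueueTimer(TimerQueue, Timer, CompletionEvent: THandle): BOOL;
  stdcall; external kernel32;

procedure TimerTick(Context: Pointer; TimerOrWaitFired: ByteBool); stdcall;
begin
  // Fires every Period ms on a worker thread; marshal to the main thread
  // (e.g. TThread.Queue) before touching any UI.
  Writeln('tick at ', GetTickCount64, ' ms');
end;

var
  Timer: THandle;
begin
  // DueTime = 0: fire immediately, then every 50 ms
  if CreateTimerQueueTimer(Timer, 0, TimerTick, nil, 0, 50, 0) then
  try
    Sleep(1000); // let the timer fire for a while
  finally
    // INVALID_HANDLE_VALUE makes the call wait for in-flight callbacks to finish
    DeleteTimerQueueTimer(0, Timer, INVALID_HANDLE_VALUE);
  end;
end.

DueTime controls the first firing and Period the repeat rate, which maps directly onto the SetDueTime/SetPeriod properties suggested above.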
-
A little deeper: I ran library-loading profiling as described here: https://ten0s.github.io/blog/2022/07/01/debugging-dll-loading-errors The log is big. There is only one error:

2818:37f4 @ 05671921 - LdrpProcessWork - ERROR: Unable to load DLL: "C:\WINDOWS\WinSxS\x86_microsoft.windows.common-controls_6595b64144ccf1df_6.0.19041.1110_none_a8625c1886757984\COMCTL32.dll", Parent Module: "C:\WINDOWS\System32\comdlg32.dll", Status: 0xc000007b

Status: 0xc000007b = (#define STATUS_INVALID_IMAGE_FORMAT ((NTSTATUS)0xC000007BL))
""" C:\WINDOWS\WinSxS\x86 """ ???

My app is a 64-bit COM server containing some side components, whose runtime packages are all compiled as 64-bit code. What does the option [Use debug dcus] add to the code? Has anyone come across this? Thanks!
-
What Stefan wrote 🙂 I had my focus on the exact same three areas and had made almost the same changes. When you look at the profiling of the generated asm it becomes pretty clear what the bottlenecks are - down to the individual statements. In addition to Stefan's modifications, I would try to see if it makes any noticeable difference to test for (i = 0) last in IntersectListSort, because presumably the two other conditions are met more often. It could eliminate one or two branches. I would also consider whether quicksort is the best algorithm to use. For example, if the list is already "almost sorted" (for example because a sorted list was modified), insertion sort might be a better choice.
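A minimal sketch of the insertion-sort alternative for an almost-sorted list (generic pointer array with a TListSortCompare callback; illustrative, not taken from the Clipper code):

uses
  System.Classes; // for TListSortCompare

procedure InsertionSort(var A: TArray<Pointer>; Compare: TListSortCompare);
var
  I, J: Integer;
  Item: Pointer;
begin
  // Roughly O(n) when the array is already nearly sorted, which makes it
  // a good fit for re-sorting a previously sorted list after a few changes.
  for I := 1 to High(A) do
  begin
    Item := A[I];
    J := I - 1;
    while (J >= 0) and (Compare(A[J], Item) > 0) do
    begin
      A[J + 1] := A[J];
      Dec(J);
    end;
    A[J + 1] := Item;
  end;
end;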
-
Is Move the fastest way to copy memory?
Arnaud Bouchez replied to dummzeuch's topic in RTL and Delphi Object Pascal
L1 cache access time makes a huge difference. http://blog.skoups.com/?p=592 You could retrieve the L1 cache size, then work on buffers of about 90% of this size (always keep some space for stack, tables and such). Then, if you work in the API buffer directly, a non-temporal move to the result buffer may help a little. During your process, if you use lookup tables, ensure they don't pollute the cache. But profiling is the key for sure. Guesses are most of the time wrong... -
Profiling is a bit tricky; I think there is no real integrated option for FMX yet. But you could try the macOS Xcode Instruments tools - they show a lot of details for any running app, though I haven't used them in the last few years. I think they should be even more powerful and useful now. https://gist.github.com/loderunner/36724cc9ee8db66db305
-
Looking for photography enthusiasts for continuing a camera calibrator project (CoCa)
hurodal posted a topic in I made this
Hi there, This is my first message here. I'm here because I have been given all the resources of an open-source project called CoCa (ICC Color Camera Calibrator), a piece of software that makes ICC color profiles for cameras and scanners. Its author is getting old and cannot continue it, so I want to continue it. My problem is that I know nothing about Delphi or programming (other than the BASIC I learnt as a child with my Spectrum 48K), but I do know about camera profiling and color management. This is the original webpage of the project (cloned onto my server) so you can see how it looks and works: https://www.hugorodriguez.com/calibracion/coca_web/coca_page.html I'm looking for someone who loves photography, digital imaging and Delphi to continue this project. I've been told that a good step would be to convert this code to Python or another language that doesn't need so much text (as Delphi does) to make it easier to work on, but these are not my words so I cannot say. Please, if anyone likes this and wants to help this open-source project, contact me. Best regards, Hugo Rodriguez