
mitch.terpak

Members
  • Content Count

    21
Community Reputation

5 Neutral


  1. mitch.terpak

    Delphi and "Use only memory safe languages"

    I once tested GPT-4 on some assembly code. It's actually quite good at explaining and improving assembly code, but absolutely horrendous at writing it from scratch.
  2. mitch.terpak

    Delphi and "Use only memory safe languages"

    We do, but we have needed to resort to a C++ DLL for some performance-critical parts. Our Delphi code still outperformed the Intel Math Kernel Library, though. The main reason was just that the Linux-compiled code is unbearably slow. You're underestimating this, quite frankly: the amount of time it would take you to correctly optimize assembly code, instead of relying on the compiler to do a decent job, is insane. Once again, you're better off writing C++, since its compiler will do only a slightly worse job than hand-optimized assembly code, and a better job than the Delphi compiler. Agreed.
  3. mitch.terpak

    C Libraries to Delphi

    There's a large spectrum between an unpaid intern doing the translation and a well-paid software engineer, of course.
  4. mitch.terpak

    C Libraries to Delphi

    Yeah, the header files seem easy enough that it can probably do quite a good job. But if you don't know what you're doing and make type mistakes, you get an external error that will be very hard to track down. It will probably be a bit stubborn since it's so much code, so you'll have to step through it.
  5. mitch.terpak

    C Libraries to Delphi

    Yeah, the articles linked by Brandon are spot on. What you need is to make a C-style interface to the DLLs. The header files just tell you how you can implement the functions. I'd estimate an experienced programmer can easily do this within a week.

    For example, the struct _REGISTRATION_DATA:

        typedef struct _REGISTRATION_DATA {
            VARIANT registrationData;
            VARIANT signatureData;
            VARIANT issuingAuthority;
        } REGISTRATION_DATA;

    becomes:

        type
          REGISTRATION_DATA = record
            registrationData: Variant;
            signatureData: Variant;
            issuingAuthority: Variant;
          end;

    Part of IRegistration:

        #ifndef __IRegistration_INTERFACE_DEFINED__
        #define __IRegistration_INTERFACE_DEFINED__

        /* interface IRegistration */
        /* [unique][helpstring][uuid][dual][object] */

        EXTERN_C const IID IID_IRegistration;

        #if defined(__cplusplus) && !defined(CINTERFACE)

            MIDL_INTERFACE("0BE1D001-A538-476B-8F59-6594741D7720")
            IRegistration : public IUnknown
            {
            public:
                virtual /* [source][helpstring] */ HRESULT STDMETHODCALLTYPE Initialize(
                    /* [retval][out] */ LONG *result) = 0;

                virtual /* [source][helpstring] */ HRESULT STDMETHODCALLTYPE Finalize(
                    /* [retval][out] */ LONG *result) = 0;

                virtual /* [source][helpstring] */ HRESULT STDMETHODCALLTYPE GetReaderName(
                    /* [in] */ LONG index,
                    /* [out] */ BSTR *readerName,
                    /* [retval][out] */ LONG *result) = 0;

    becomes:

        IRegistration = interface(IUnknown)
          ['{0BE1D001-A538-476B-8F59-6594741D7720}']
          function Initialize(out result: LongInt): HRESULT; stdcall;
          function Finalize(out result: LongInt): HRESULT; stdcall;
          function GetReaderName(index: LongInt; out readerName: WideString; out result: LongInt): HRESULT; stdcall;
        end;

    I just quickly scribbled these together to give you an impression.
  6. mitch.terpak

    Delphi Low-code No-code?

    Delphi is fine. We have a huge codebase, and if I had the resources I'd rewrite it in C#. The main gripe that makes it unbearable is the fact that Delphi code compiled for Linux is sometimes up to 20x slower. And no-code could technically be visual programming, no? (No idea if visual programming is used for anything serious except Unreal Engine.)
  7. mitch.terpak

    Delphi 12.0 TParallel.For performance. Threading.pas issues

    Threading.pas from 12.1 is still less performant than the one from 11.3. It grows the worker threads in TThreadPool.TThreadPoolMonitor.GrowThreadPoolIfStarved() via the last statement:

        else if FThreadPool.UnlimitedWorkerThreadsWhenBlocked then
          for i := 1 to Min(FThreadPool.FQueuedRequestCount, FThreadPool.FMaxLimitWorkerThreadCount div 2 + 1) do
            FThreadPool.CreateWorkerThread;

    But it's doing so erroneously. In my case this seems to be because the amount of work I do in a TParallel.For is large. So here:

        procedure TThreadPool.TThreadPoolMonitor.Execute;
        const
          MaxInactiveInterval = 30 * 1000;
          InactiveCountdown = MaxInactiveInterval div TThreadPool.MonitorThreadDelay;
        var
          I: Integer;
          CPUInfo: TThread.TSystemTimes;
          CpuUsageArray: array[0..TThreadPool.NumCPUUsageSamples - 1] of Cardinal;
          CurUsageSlot: Integer;
          ExitCountdown: Integer;
          AvgCPU: Cardinal;
          CurMonitorStatus: TThreadPool.TMonitorThreadStatus;
          Signaled: Boolean;
        begin
          NameThreadForDebugging(Format('Thread Pool Monitor Thread - %s ThreadPool - %p', [ClassName, Pointer(FThreadPool)]));
        {$IFDEF MSWINDOWS}
          if ThreadPoolMonitorHandles <> nil then
          begin
            TMonitor.Enter(ThreadPoolMonitorHandles);
            try
              ThreadPoolMonitorHandles.Add(FThreadPool, Handle);
            finally
              TMonitor.Exit(ThreadPoolMonitorHandles);
            end;
          end;
          try
        {$ENDIF MSWINDOWS}
            FThreadPool.FMonitorThreadWakeEvent.WaitFor(TThreadPool.MonitorThreadDelay);
            TThread.GetSystemTimes(CPUInfo);
            CurUsageSlot := 0;
            FillChar(CPUUsageArray, SizeOf(CPUUsageArray), 0);
            ExitCountdown := InactiveCountdown;
            while not Terminated do
            begin
              if not FThreadPool.FShutdown then
              begin
                Signaled := FThreadPool.FMonitorThreadWakeEvent.WaitFor(TThreadPool.MonitorThreadDelay) = TWaitResult.wrSignaled;
                FThreadPool.FCurrentCPUUsage := TThread.GetCPUUsage(CPUInfo);
                CPUUsageArray[CurUsageSlot] := FThreadPool.FCurrentCPUUsage;
                if CurUsageSlot = TThreadPool.NumCPUUsageSamples - 1 then
                  CurUsageSlot := 0
                else
                  Inc(CurUsageSlot);
                AvgCPU := 0;
                for I := 0 to TThreadPool.NumCPUUsageSamples - 1 do
                  Inc(AvgCPU, CPUUsageArray[I]);
                FThreadPool.FAverageCPUUsage := AvgCPU div TThreadPool.NumCPUUsageSamples;
                if FThreadPool.FCurrentCPUUsage < TThreadPool.CPUUsageLow then
                  GrowThreadPoolIfStarved;
                CurMonitorStatus := FThreadPool.FMonitorThreadStatus;
                if Signaled then
                begin
                  FThreadPool.FMonitorThreadWakeEvent.ResetEvent;
                  Continue;
                end;
                if FThreadPool.FShutdown then
                  ExitCountdown := -1
                else if not (TThreadPool.TMonitorThreadStat.NoWorkers in CurMonitorStatus) then
                  Dec(ExitCountdown)
                else
                  ExitCountdown := InactiveCountdown;
              end else
                ExitCountdown := -1;
              if ExitCountdown <= 0 then
              begin
                if ExitCountdown < 0 then
                begin
                  TInterlocked.Exchange(Integer(FThreadPool.FMonitorThreadStatus), 0);
                  Exit;
                end else if TMonitorThreadStatus(TInterlocked.CompareExchange(Integer(FThreadPool.FMonitorThreadStatus), 0, Integer(CurMonitorStatus))) = CurMonitorStatus then
                  Exit
                else
                  ExitCountdown := InactiveCountdown;
              end;
            end;
        {$IFDEF MSWINDOWS}
          finally
            if ThreadPoolMonitorHandles <> nil then
            begin
              TMonitor.Enter(ThreadPoolMonitorHandles);
              try
                ThreadPoolMonitorHandles.Remove(FThreadPool);
              finally
                TMonitor.Exit(ThreadPoolMonitorHandles);
              end;
            end;
          end;
        {$ENDIF MSWINDOWS}
        end;

    The MonitorThreadDelay is 500 ms, but that's not sufficient for our application. Let's say you do a TParallel.For from 0 to 100, you have 20 threads, and each task takes 10 seconds. It can then happen that none of the threads finishes its work within a 500 ms interval, and this new logic thinks the pool is deadlocked.

    If they'd let us set these at runtime, it would already make a big difference:

        private const
          MaxThreadsPerCPU = 2;
          // Constants used for calculating CPU Usage
          CPUUsageHigh = 95; // Start retiring/removing threads when CPU usage gets this high
          CPUUsageLow = 80; // Add more threads if the CPU usage is below this
          CPUUsageLowest = 20; // Shrink the thread pool when CPU usage falls below this
          NumCPUUsageSamples = 10; // Keep a running list of CPU Usage samples over which the average is calculated
          MonitorThreadDelay = 2000; // Was 500
          SuspendInterval = 5000 + MonitorThreadDelay; // Interval to use for suspending work in worker threads
          SuspendTime = MonitorThreadDelay + 100; // Time to spend in SuspendWork;
          RetirementDelay = 5000; // Delay interval for retiring threads

    Now I still have to maintain my own Threading.pas...
  8. I don't completely understand your problem, but I'm assuming the issue is the following: you're running something that has to be done in the background and you don't want your application to freeze up or be unresponsive until it's done. Or you want to wait until a specific thing is done.

    https://docwiki.embarcadero.com/RADStudio/Athens/en/Using_TTask_from_the_Parallel_Programming_Library

        TTask.Run(procedure
          begin
            // Whatever async thing you want to run
          end);

    Let's say you now want to wait until it's done, or at least check:

        var aTask : ITask;

        aTask := TTask.Run(procedure
          begin
            // Whatever async thing you want to run
          end);
        aTask.Wait();

    What's important is that anything you do with the VCL is done on the main thread. TTask.Run always runs on a thread separate from the main thread. You'll have to use https://docwiki.embarcadero.com/Libraries/Alexandria/en/System.Classes.TThread.Synchronize to call into the main thread and do VCL things from an async thread.

    aTask.Wait might achieve the opposite of what you want: it blocks the main thread. If for some reason you have to use it elsewhere, there's also a TTaskStatus that you could check. Or you can make another TTask.Run and put aTask.Wait inside it.

    Hope this helps!
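The advice above (run the work in a TTask, then hand the result to the main thread instead of calling Wait there) could be sketched roughly as follows. This is only an illustration under assumptions: RunInBackground and OnDone are hypothetical names that do not come from the post, and the loop is a stand-in for the real background work.

```delphi
uses
  System.Classes, System.SysUtils, System.Threading;

// Hypothetical helper: runs some work on a pool thread and reports
// the result on the main thread, without blocking the caller.
procedure RunInBackground(const OnDone: TProc<Integer>);
begin
  TTask.Run(
    procedure
    var
      Total, I: Integer;
    begin
      // Stand-in for the long-running work; executes on a worker thread.
      Total := 0;
      for I := 1 to 1000 do
        Inc(Total, I);

      // Switch to the main thread before touching anything VCL-related.
      TThread.Synchronize(nil,
        procedure
        begin
          OnDone(Total);
        end);
    end);
end;
```

In a VCL form this might be called as `RunInBackground(procedure(Value: Integer) begin Label1.Caption := IntToStr(Value); end);`, where Label1 is an assumed control. TThread.Synchronize relies on the main thread processing synchronize requests, which a VCL message loop does on its own; a console program would have to call CheckSynchronize itself.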
  9. mitch.terpak

    What new features would you like to see in Delphi 13?

    Delphi compiled for Linux is borderline unusable. Whatever the reason is, I'm seeing a 20x slowdown on some code (see screenshot), for example in a procedure that does sparse matrix factorization. I had to port parts of our code to C++ (a DLL) and will probably have to port more as a workaround for this. Sure, it can take longer to compile; even if it compiled for 30 minutes I wouldn't mind.
  10. mitch.terpak

    What new features would you like to see in Delphi 13?

    For Linux64 to not compile to something 5-20x slower than Windows.
  11. mitch.terpak

    Delphi 12.0 TParallel.For performance. Threading.pas issues

    Actual use case:

        if not LoadflowObjectDict.TryGetValue(TThread.CurrentThread.ThreadID, LoadflowObject) then
        begin
          LoadflowObject := TLoadflowObject.Create();
          // Assignment of new object
          // Copy of a large array
          // Setlength in the range of 10.000-50.000
          // Move of a large array
          // Copy of a large array
          // Deepcopy of a 100-300mb object (object size in memory)
          // Deepcopy of a 50-100mb nested object (object size in memory)
          CriticalSection.Acquire;
          LoadflowObjectDict.TryAdd(TThread.CurrentThread.ThreadID, LoadflowObject);
          CriticalSection.Release;
        end;

    My test looked like this. Note how it has nothing in front of the TryAdd that would introduce a lot of variance in the timings, so the test was the absolute worst case. Adding the CriticalSection is quite insignificant; it's only really noticeable on small problems.

        program ParallelLoopConsoleApp;

        {$APPTYPE CONSOLE}

        uses
          System.SysUtils, System.Classes, System.Generics.Defaults, System.Threading,
          System.Generics.Collections;

        type
          // Not shown in the original post; assumed to be a simple record like this
          TThreadData = record
            CalculationResult: Integer;
          end;

        var
          LoopCount: Integer = 0;
          ThreadDataDict: TDictionary<Cardinal, TThreadData>;

        procedure PerformCalculation;
        var
          CoreCount: Integer;
          LoadflowThreadpool: TThreadPool;
        begin
          CoreCount := TThreadPool.Default.MaxWorkerThreads div 2;
          ThreadDataDict := TDictionary<Cardinal, TThreadData>.Create(1000);
          LoadflowThreadpool := TThreadPool.Create;
          LoadflowThreadpool.SetMaxWorkerThreads(CoreCount);
          LoadflowThreadpool.SetMinWorkerThreads(CoreCount);
          while True do
          begin
            Inc(LoopCount);
            Writeln('Loop Count: ', LoopCount);
            TParallel.For(1, 100,
              procedure(Index: Integer)
              var
                ThreadData: TThreadData;
              begin
                // Store or update thread-specific data
                // This is not what actually happens though, before this other things happen
                // that make it very unlikely for threads to be synced. But for argument's sake
                if not ThreadDataDict.TryGetValue(TThread.CurrentThread.ThreadID, ThreadData) then
                  ThreadDataDict.TryAdd(TThread.CurrentThread.ThreadID, ThreadData); // Adding a critical section here would make it safe
                // Perform some calculation here
                ThreadData.CalculationResult := Index * Index div 17 + 231 * 2 - 4;
              end, LoadflowThreadpool);
            // Cleanup and prepare for the next iteration
            ThreadDataDict.Clear;
          end;
        end;

        begin
          try
            PerformCalculation;
          except
            on E: Exception do
              Writeln('An error occurred: ', E.Message);
          end;
          Writeln('Press Enter to exit...');
        end.

    The reason I'm adjusting it now is that I think there's quite a significant chance of a collision when generating the bucket ID, which can't easily be avoided. And I'd rather avoid this causing exceptions in a Docker using a DLL.
  12. mitch.terpak

    Delphi 12.0 TParallel.For performance. Threading.pas issues

    Alright, I tested it in a console app. It has a failure rate of about 1 in 35,000 with 20 cores, and that's too much. I wrapped the adding of new ThreadIDs / thread objects in a TCriticalSection. Thank you for making me reconsider my stubbornness.
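As a rough illustration of the fix described above, a per-thread lookup guarded by a TCriticalSection might look like the sketch below. It is not the poster's actual code: TWorkerState, StateByThread, StateLock, InitState and GetOrCreateState are hypothetical names, and the reads are locked as well because a dictionary can be resized while another thread is adding to it.

```delphi
uses
  System.Classes, System.SyncObjs, System.Generics.Collections;

type
  // Hypothetical stand-in for the expensive per-thread object.
  TWorkerState = class
  end;

var
  StateByThread: TObjectDictionary<TThreadID, TWorkerState>;
  StateLock: TCriticalSection;

// Call once, before any worker thread uses GetOrCreateState.
procedure InitState;
begin
  StateLock := TCriticalSection.Create;
  StateByThread := TObjectDictionary<TThreadID, TWorkerState>.Create([doOwnsValues]);
end;

function GetOrCreateState: TWorkerState;
var
  Id: TThreadID;
  Found: Boolean;
begin
  Id := TThread.CurrentThread.ThreadID;

  // Reads are guarded too: another thread may be adding right now.
  StateLock.Acquire;
  try
    Found := StateByThread.TryGetValue(Id, Result);
  finally
    StateLock.Release;
  end;
  if Found then
    Exit;

  // The expensive construction happens outside the lock. No other
  // thread can insert this key, because the key is our own thread ID.
  Result := TWorkerState.Create;

  StateLock.Acquire;
  try
    StateByThread.Add(Id, Result);
  finally
    StateLock.Release;
  end;
end;
```

Keeping the construction outside the lock means threads only serialize on the short dictionary operations, which fits the observation that the critical section costs very little compared with the deep copies.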
  13. mitch.terpak

    Delphi 12.0 TParallel.For performance. Threading.pas issues

    Not to justify it; you're right, it's technically not thread-safe. But the odds of it going wrong are very low. When it does go wrong, if it doesn't cause an exception it might cause a 300 MB memory leak. It's a conscious trade-off for a tiny bit more performance. A CriticalSection around adding the objects to the dictionary would solve this.
  14. mitch.terpak

    Delphi 12.0 TParallel.For performance. Threading.pas issues

    Note that the person replying there, Dmitry Arefiev, is the same person who mentioned the solution here.
  15. mitch.terpak

    Delphi 12.0 TParallel.For performance. Threading.pas issues

    I tested this and it indeed works well for the smaller problems I have. It still seems to trigger:

        else if FThreadPool.UnlimitedWorkerThreadsWhenBlocked then
        begin
          writeln('entered');
          for i := 1 to Min(FThreadPool.FQueuedRequestCount, FThreadPool.FMaxLimitWorkerThreadCount div 2 + 1) do
            FThreadPool.CreateWorkerThread;
        end;

    even though there seems to be no reason for that. This does end up being a problem for me on the larger problem sizes, since it will still add new worker threads with different ThreadIDs, for which copies will have to be made. But I can confirm that it cuts the issue down to only a 15% performance loss in our benchmark, as opposed to 40% with UnlimitedWorkerThreadsWhenBlocked set to True.

    I also got the ACCESS_VIOLATION again while attempting to test the performance with the debugger. This time Self was inaccessible, and it occurred on the line below. I also had it happen for the first time ever in our DUnitX (console):

        FThreadPool.FCurrentCPUUsage := TThread.GetCPUUsage(CPUInfo);

        Name              Value
        Self              Inaccessible value
        I                 10
        CPUInfo           (594602031250, 37442656250, 616291093750, 0)
        CpuUsageArray     (5, 4, 5, 6, 5, 4, 4, 5, 5, 5)
        CurUsageSlot      3
        ExitCountdown     37
        AvgCPU            48
        CurMonitorStatus  [Created]
        Signaled          False