Posts posted by pyscripter
-
33 minutes ago, Anders Melander said: As far as I can tell it can be solved "simply" by changing the head of the two stacks from a pointer to a record containing the pointer and a counter:
If you have a look at the posted file, this is what I tried to do. Please have a go if you have the time.
-
7 minutes ago, David Heffernan said: You can't be doing multithreaded programming where "whilst still not perfect" as a valid statement. It's got to be right.
Absolutely right. This is why I am passing the challenge to people like you, with much deeper knowledge than mine. I would very much hope that @Primož Gabrijelčič, for instance, has a go at providing a solution, since he has faced the very same issue in OmniThreadLibrary.
-
@Darian Miller has published a very nice article about the state of TThreadedQueue and TMonitor in Delphi. He has also published a stress test on GitHub that shows how TThreadedQueue still fails under stress.
I have played with his stress test and concluded that the problem is almost certainly in TMonitor. TMonitor implements a lock-free stack to recycle events created with the CreateEvent function. The relevant code in SysUtils is
var
  EventCache: PEventItemHolder;
  EventItemHolders: PEventItemHolder;

procedure Push(var Stack: PEventItemHolder; EventItem: PEventItemHolder);
var
  LStack: PEventItemHolder;
begin
  repeat
    LStack := Stack;
    EventItem.Next := LStack;
  until AtomicCmpExchange(Pointer(Stack), EventItem, LStack) = LStack;
end;

function Pop(var Stack: PEventItemHolder): PEventItemHolder;
begin
  repeat
    Result := Stack;
    if Result = nil then
      Exit;
  until AtomicCmpExchange(Pointer(Stack), Result.Next, Result) = Result;
end;

This lock-free stack is used by NewWaitObj and FreeWaitObj, which are part of the Monitor support protocol and used by TMonitor. This works reasonably well, but under stress it fails. The reason it fails is known as the ABA problem and is discussed in a similar context in a series of excellent blog posts by @Primož Gabrijelčič: blog post 1, blog post 2, blog post 3.
His OmniThreadLibrary contains the following routine that he uses to deal with this problem.
//either 8-byte or 16-byte CAS, depending on the platform; destination must be properly aligned (8- or 16-byte)
function CAS(const oldData: pointer; oldReference: NativeInt; newData: pointer;
  newReference: NativeInt; var destination): boolean;
asm
{$IFNDEF CPUX64}
  push  edi
  push  ebx
  mov   ebx, newData
  mov   ecx, newReference
  mov   edi, destination
  lock cmpxchg8b qword ptr [edi]
  pop   ebx
  pop   edi
{$ELSE CPUX64}
  .noframe
  push  rbx                     //rsp := rsp - 8 !
  mov   rax, oldData
  mov   rbx, newData
  mov   rcx, newReference
  mov   r8, [destination + 8]   //+8 with respect to .noframe
  lock cmpxchg16b [r8]
  pop   rbx
{$ENDIF CPUX64}
  setz  al
end; { CAS }

I have tried to use this function to provide a solution for TMonitor similar to the one in OmniThreadLibrary (see the attached iaStressTest.TThreadedQueue.PopItem, which can be used with the original stress test). Whilst still not perfect, it helps a lot in 32 bits with, say, up to 100 threads. However, it crashes in 64 bits and I do not know why. I am posting this here in case anyone with better knowledge than mine of assembler and thread programming can help with the challenge of fixing TMonitor. It would be nice to try and get a fix included in 10.4. And even if it is not included, it can easily be used as a patch in the same way as in the attached code.
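To illustrate the "pointer plus counter" idea discussed in this thread, here is a minimal sketch. It is not the TMonitor code and not the OmniThreadLibrary implementation: it assumes a fixed pool of nodes addressed by index, so that the index and a modification counter fit into one Int64 and the built-in AtomicCmpExchange can update both on 32-bit and 64-bit targets alike. All names (THead, TNode, Nodes, NilIndex, Capacity) are hypothetical.

```delphi
program TaggedStackSketch;
{$APPTYPE CONSOLE}
{$OVERFLOWCHECKS OFF} // the tag is meant to wrap around
{$RANGECHECKS OFF}

const
  NilIndex = -1; // hypothetical sentinel meaning "no node"
  Capacity = 4;  // hypothetical size of the node pool

type
  // Head of the stack: node index and change counter share one Int64,
  // so a single compare-and-swap replaces both at once.
  THead = record
    case Boolean of
      False: (Index: Int32; Tag: UInt32);
      True:  (Raw: Int64);
  end;

  TNode = record
    Next: Int32;
  end;

var
  Nodes: array[0..Capacity - 1] of TNode;
  Head: THead;
  I: Int32;

procedure Push(var Stack: THead; NodeIndex: Int32);
var
  OldHead, NewHead: THead;
begin
  repeat
    // CAS with equal new value and comparand acts as an atomic 64-bit read.
    OldHead.Raw := AtomicCmpExchange(Stack.Raw, 0, 0);
    Nodes[NodeIndex].Next := OldHead.Index;
    NewHead.Index := NodeIndex;
    NewHead.Tag := OldHead.Tag + 1;
  until AtomicCmpExchange(Stack.Raw, NewHead.Raw, OldHead.Raw) = OldHead.Raw;
end;

function Pop(var Stack: THead): Int32;
var
  OldHead, NewHead: THead;
begin
  repeat
    OldHead.Raw := AtomicCmpExchange(Stack.Raw, 0, 0);
    Result := OldHead.Index;
    if Result = NilIndex then
      Exit;
    NewHead.Index := Nodes[Result].Next;
    // A node that was popped and pushed back in the meantime leaves the same
    // index but a different tag, so the CAS below fails and the loop retries.
    NewHead.Tag := OldHead.Tag + 1;
  until AtomicCmpExchange(Stack.Raw, NewHead.Raw, OldHead.Raw) = OldHead.Raw;
end;

begin
  Head.Index := NilIndex;
  Head.Tag := 0;
  for I := 0 to 2 do
    Push(Head, I);
  // Single-threaded check of LIFO order: 2, 1, 0, then NilIndex (-1).
  for I := 0 to 3 do
    Writeln(Pop(Head));
end.
```

Using indices into a static pool also means that Pop never reads Next from memory that has been freed. A pointer-based variant would need a double-width CAS such as the CAS routine quoted above.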
-
9 minutes ago, pyscripter said: Let me thank again everyone that responded. You said that the error code is thread-specific, that this is not the way to check whether a specific API call failed, etc.: things I fully agree with, but which I knew already.
My post just made an observation, probably not of any significance. Whenever you run code in a thread, GetLastError always returns 87, which corresponds to a call with invalid parameters. The original post asked two questions:
- Was this a known fact?
- More importantly why? In other words, what is the API call in the TThread code, that sets this error? Even if it is inconsequential, and I believe it is, I had the curiosity to find out.
I don't think I got an answer to these questions.
-
Let me thank again everyone that responded. You said that the error code is thread-specific, that this is not the way to check whether a specific API call failed, etc.: things I fully agree with, but which I knew already.
My post just made an observation. Whenever you run code in a thread, GetLastError always returns 87, which corresponds to a call with invalid parameters. The original post asked two questions:
- Was this a known fact?
- More importantly why? In other words, what is the API call that sets this error? Even if it is inconsequential, and I believe it is, I had the curiosity to find out.
I don't think I got an answer to these questions.
-
@Lars Fosdal So you do not think this OS error is the result of Delphi making a call to an OS (Windows) function with invalid parameters?
-
@Der schöne Günther I am not sure about the relevance of your quote.
No matter what your thread code is or how you run it (by inheriting from TThread, creating an anonymous thread or whatever), something results in an OS error 87 (it is always the same code) that corresponds to "parameter is incorrect". This OS error has been raised before your thread code has started running. And this has nothing to do with the return code of a thread.
This is a problem because if you do any OS stuff in your thread code, and then you want to check whether an error was raised using CheckOSError, an exception will be raised.
A workaround would be to always start your thread code with the statement:
SetLastError(0);
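A minimal sketch of that workaround, assuming Windows and an anonymous thread as in the original test program. The program name and the success message are made up for illustration, and the place where real OS calls would go is only marked by a comment.

```delphi
program ThreadOSErrorWorkaround;
{$APPTYPE CONSOLE}

uses
  Winapi.Windows, System.SysUtils, System.Classes;

begin
  TThread.CreateAnonymousThread(
    procedure
    begin
      // Clear whatever error code was left behind before this code started.
      SetLastError(0);
      try
        // Real code would call OS functions here; the check below then only
        // reports errors raised by those calls.
        CheckOSError(GetLastError);
        WriteLn('No pending OS error');
      except
        on E: Exception do
          WriteLn(E.Message);
      end;
    end).Start;
  ReadLn;
end.
```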
-
The following code
program ThreadOSError;
{$APPTYPE CONSOLE}
uses
  System.SysUtils, System.Classes;
begin
  Assert(GetLastError=0);
  TThread.CreateAnonymousThread(procedure
    begin
      try
        CheckOSError(GetLastError);
      except
        On E: Exception do
          WriteLn(E.Message);
      end;
    end).Start;
  ReadLn;
end.

produces the following output in 10.3.3 on Windows.
System Error. Code: 87. The parameter is incorrect
The same is true whichever way you run thread code.
Is this a known issue? Any idea why the OS error is raised?
-
7 minutes ago, Anders Melander said: Um... you're responding to a post which is over a year old, written by a person who's no longer alive...
His legacy lives on through people like me using his code...
-
On 2/1/2019 at 1:07 AM, Rudy Velthuis said: The compiler needs a type for Bob so it can know it should return a TTalking (through the Implicit operator) and not a TSmartPtr<TTalking>. There is no way around that.
If you want type inference then add a new method:
function SmartPtr<T>.Value: T;
begin
  if FGuard <> nil then
    Result := FGuard.Value
  else
    Result := nil;
end;
Then you can write:
var Bob := SmartPtr<TTalking>(TTalking.Create('Bob')).Value;
which is slightly more elegant than having TTalking three times on the same line.
-
2 hours ago, RDP1974 said: "They" should move if want to jump to the bandwagon of parallel computing (IMHO? Within 5 years will be the facto with dozens or hundred cpu cores as standard)-> hard to beat Elixir, Erlang, Go or those functional programming that offers built-in horizontal and vertical scalability (userland scheduler with lightweight fibers, kernel threads, multiprocessing over cpu hw cores, machine clustering... without modify a line of code)
🙂
Funnily enough, some of the most popular languages today (Python, JavaScript, R and Ruby) are single-threaded, and you have to go out of your way to use more than one core.
-
6 minutes ago, Anders Melander said: It seems you have just reverted the two sections of code to the way they were in Delphi 10.2
I was not aware of that. It is certainly wrong now.
-
I finally got to the bottom of this issue and I have a simple fix.
See https://quality.embarcadero.com/browse/RSP-28200
Please test and vote for it.
-
8 minutes ago, David Heffernan said: The issue is that x64 trig functions are very slow for very large values. Nobody actually wants to know sin for 99999999/pi radians. Put in sensible values for the argument to sin and it looks more reasonable. For instance try using
T:=Sin(i/99999999);
Yes you are right...
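The effect of the argument range can be checked without the thread pool. The following is a rough single-threaded sketch, not a benchmark from the thread: TimeSin and Count are hypothetical names, and Count is deliberately smaller than the 99999999 iterations used in the PPL test to keep the run short. No timings are claimed here; they will depend on the compiler version and target platform.

```delphi
program SinRangeTiming;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Diagnostics;

const
  Count = 10000000; // hypothetical iteration count

// Sums Sin(I * Scale) for I in 0..Count-1 and returns the elapsed milliseconds.
function TimeSin(Scale: Double; out Sum: Double): Int64;
var
  SW: TStopwatch;
  I: Integer;
begin
  Sum := 0;
  SW := TStopwatch.StartNew;
  for I := 0 to Count - 1 do
    Sum := Sum + Sin(I * Scale);
  Result := SW.ElapsedMilliseconds;
end;

var
  Sum: Double;
  Ms: Int64;
begin
  // Large arguments, as in Sin(i/PI) from the PPL test.
  Ms := TimeSin(1 / Pi, Sum);
  Writeln('Large arguments: ', Ms, ' ms (sum ', Sum:0:4, ')');
  // Arguments kept within [0, 1), as in the suggested Sin(i/99999999).
  Ms := TimeSin(1 / Count, Sum);
  Writeln('Small arguments: ', Ms, ' ms (sum ', Sum:0:4, ')');
end.
```

Building the same program for 32-bit and 64-bit targets makes it possible to compare the two argument ranges on each platform separately.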
-
Oh I get it. sin is highly optimized in 32-bits but apparently not in 64-bits.
-
I revisited this thread and tested the code below:
program Project1;
{$APPTYPE CONSOLE}
{$R *.res}
uses
  System.SysUtils, System.Threading, System.Diagnostics;

var
  SW:TStopWatch;

type
  TThreadPoolStatsHelper = record helper for TThreadPoolStats
    function Formatted: string;
  end;

function TThreadPoolStatsHelper.Formatted: string;
begin
  Result := Format('Worker: %2d, Min: %2d, Max: %2d, Idle: %2d, Retired: %2d, Suspended: %2d, CPU(Avg): %3d, CPU: %3d',
    [self.WorkerThreadCount, self.MinLimitWorkerThreadCount, self.MaxLimitWorkerThreadCount,
     self.IdleWorkerThreadCount, self.RetiredWorkerThreadCount, self.ThreadSuspended,
     self.AverageCPUUsage, self.CurrentCPUUsage]);
end;

procedure Load;
begin
  TParallel.For(0, 99999999, procedure(i: Integer)
    var
      T:Single;
    begin
      T:=Sin(i/PI);
    end);
end;

begin
  try
    Writeln('PPL Test ---------------');
    Writeln('Before: '+ TThreadPoolStats.Current.Formatted);
    SW:=TStopWatch.StartNew;
    Load;
    Writeln('Finished in '+SW.Elapsed.ToString);
    Sleep(1000);
    Writeln('After: '+TThreadPoolStats.Current.Formatted);
  except
    on E: Exception do
      Writeln(E.ClassName, ': ', E.Message);
  end;
  ReadLn;
end.

This is the output:
32-bits
PPL Test ---------------
Before: Worker: 0, Min: 8, Max: 200, Idle: 0, Retired: 0, Suspended: 0, CPU(Avg): 0, CPU: 0
Finished in 00:00:00.7620933
After: Worker: 8, Min: 8, Max: 200, Idle: 7, Retired: 0, Suspended: 0, CPU(Avg): 8, CPU: 15
64-bits
PPL Test ---------------
Before: Worker: 0, Min: 8, Max: 200, Idle: 0, Retired: 0, Suspended: 0, CPU(Avg): 0, CPU: 0
Finished in 00:00:14.0655228
After: Worker: 8, Min: 8, Max: 200, Idle: 7, Retired: 0, Suspended: 0, CPU(Avg): 85, CPU: 1
Can anyone explain the huge difference in times? (It was consistent over many runs.)
-
@Attila Kovacs Thanks, I missed that. Now I get it. The key is in

class function TLocation.FromValue(C: TRttiContext; const AValue: TValue): TLocation;
begin
  Result.FType := C.GetType(AValue.TypeInfo);
  Result.FLocation := AValue.GetReferenceToRawData;
end;

If AValue contains an object, FLocation would be a pointer to a pointer. That was not the case in my testing. He could do the dereferencing in this function, of course.
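A small sketch of why the extra dereference is needed when the location comes from a TValue. It assumes that TValue.GetReferenceToRawData returns the address of the stored object reference rather than the address of the instance, which is how I read the explanation above. TFoo, its field X and ReadXViaLocation are hypothetical names used only for this illustration.

```delphi
program RawDataSketch;
{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Rtti;

type
  TFoo = class
  public
    X: Integer;
  end;

// Reads Obj.X the way TLocation.FieldRef does for a class type:
// dereference the location once to reach the instance, then add the offset.
function ReadXViaLocation(Obj: TFoo): Integer;
var
  Ctx: TRttiContext;
  V: TValue;
  Loc: Pointer;
  Field: TRttiField;
begin
  V := TValue.From<TFoo>(Obj);
  Loc := V.GetReferenceToRawData; // address of the reference held in V
  Field := Ctx.GetType(TFoo).GetField('X');
  Result := PInteger(PByte(PPointer(Loc)^) + Field.Offset)^;
end;

var
  Foo: TFoo;
begin
  Foo := TFoo.Create;
  try
    Foo.X := 42;
    Writeln(ReadXViaLocation(Foo));
  finally
    Foo.Free;
  end;
end.
```

Dropping the PPointer(Loc)^ step would add the field offset to the address of the reference inside the TValue instead of the address of the instance.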
-
I was looking at an old blog post by Barry Kelly.
In particular the function:
function TLocation.FieldRef(const name: string): TLocation;
var
  f: TRttiField;
begin
  if FType is TRttiRecordType then
  begin
    f := FType.GetField(name);
    Result.FLocation := PByte(FLocation) + f.Offset;
    Result.FType := f.FieldType;
  end
  else if FType is TRttiInstanceType then
  begin
    f := FType.GetField(name);
    Result.FLocation := PPByte(FLocation)^ + f.Offset;
    Result.FType := f.FieldType;
  end
  else
    raise Exception.CreateFmt('Field reference applied to type %s, which is not a record or class',
      [FType.Name]);
end;
I am puzzled by the line:
Result.FLocation := PPByte(FLocation)^ + f.Offset;
If FLocation is an object (FType is TRttiInstanceType) and I have a record field inside the object, Result.FLocation should be PByte(FLocation) + f.Offset, i.e. the same as when FType is TRttiRecordType.
Barry Kelly is probably the guy who wrote the RTTI stuff, so he knows what he is talking about. What am I missing?
-
-
GetProperty(Name).PropertyType.Handle
-
https://github.com/ase379/gpprofile2017
A revamped version of the good old instrumenting profiler by @Primož Gabrijelčič.
-
I can confirm that the workaround worked.
-
I have just been told that this is a reported bug with Catalina 10.15.4 and that there is a workaround, which I am going to try.
@Dave Nottage How come this did not affect you?
-
@Dave Nottage Thanks for responding. I am on Delphi 10.3.3.
Here is what I see in verbose mode when I select Add New SDK as above and then press OK.
The -version -sdk call happens while the dialog is showing. Nothing seems to happen when I press OK.
Do I need to download the SDK on the Mac? The Xcode version is 11.4.
Revisiting TThreadedQueue and TMonitor
in RTL and Delphi Object Pascal
Posted
Then I hope the weather will be bad 😀