Jump to content

Anders Melander

Members
  • Content Count

    2312
  • Joined

  • Last visited

  • Days Won

    119

Everything posted by Anders Melander

  1. Anders Melander

    Revisiting TThreadedQueue and TMonitor

    There's RSP-23333 which is about this problem. I've already posted a link to the original article and this thread to the report. Beyond that it would be nice if someone on the 10.4 field test could bring it to the attention of those that can do something about it.
  2. Anders Melander

    Revisiting TThreadedQueue and TMonitor

    I'm aware that it's a noop in this case and I was leaving that detail to the compiler... ...but that's a good point. I get your point but I always start with registers and then refactor the code to use symbols to make it readable. I guess I should keep the register names in a comment.
  3. Anders Melander

    Revisiting TThreadedQueue and TMonitor

    As far as I can tell that's a coincidence. RDX needs a value: https://www.felixcloutier.com/x86/cmpxchg8b:cmpxchg16b
  4. Anders Melander

    Revisiting TThreadedQueue and TMonitor

    Beware that this does not align memory allocated from the heap. For that you can use the following: {$ifdef CPUX64} SetMinimumBlockAlignment(mba16Byte); {$else CPUX64} SetMinimumBlockAlignment(mba8Byte); {$endif CPUX64} This works for me: function InterlockedCompareExchange128(Destination: Pointer; ExchangeHigh, ExchangeLow: Int64; ComparandResult: Pointer): Bool; stdcall; asm .NOFRAME .PUSHNV R10 .PUSHNV RAX .PUSHNV RBX .PUSHNV RCX MOV R10,Destination MOV RBX,ExchangeLow MOV RCX,ExchangeHigh MOV RDX,[ComparandResult+8] MOV RAX,[ComparandResult] LOCK CMPXCHG16B [R10] SETZ AL end; The PUSHNV RBX is translated to push RBX + pop RBX. The others do nothing.
  5. Anders Melander

    Revisiting TThreadedQueue and TMonitor

    Here's a simple test of the queue only. I have only shown the test of the old code. The test of the new is similar. Old stack code fails after a short while [*] on 32-bit and 64 bit with an invalid pointer operation on freeing an item popped from the stack (a short while = ~4 seconds, approximately 1 million stack operations by 100 threads on a 4 core system). The reason it fails so quickly is that I have tried to limit the calls to the memory manager as that is a bottleneck which tends to serialize the operations a bit. New stack code succeeds on 32 bit but fails immediately on 64-bit with an AV on the LOCK CMPXCHG16B [R10] in InterlockedCompareExchange128. uses System.SyncObjs; const THREAD_COUNT = 100; ITERATIONS = 1000000; var Threads: array[0..THREAD_COUNT-1] of TThread; type PEventItemHolder = ^TEventItemHolder; TEventItemHolder = record Next: PEventItemHolder; Event: Pointer; end; TSyncEventItem = record Lock: Integer; Event: Pointer; end; procedure Push(var Stack: PEventItemHolder; EventItem: PEventItemHolder); var LStack: PEventItemHolder; begin repeat LStack := Stack; EventItem.Next := LStack; until AtomicCmpExchange(Pointer(Stack), EventItem, LStack) = LStack; end; function Pop(var Stack: PEventItemHolder): PEventItemHolder; begin repeat Result := Stack; if Result = nil then Exit; until AtomicCmpExchange(Pointer(Stack), Result.Next, Result) = Result; end; type TPushPopThread = class(TThread) private class var FEventCache: PEventItemHolder; class var FStackCount: integer; class var FReadyCount: integer; class var FLiveCount: integer; class var FOperations: int64; class var FFailure: boolean; private FEvent: TEvent; protected procedure Execute; override; public constructor Create(AEvent: TEvent); class property StackCount: integer read FStackCount; class property ReadyCount: integer read FReadyCount; class property LiveCount: integer read FLiveCount; class property Failure: boolean read FFailure; class property Operations: int64 read FOperations; end; constructor TPushPopThread.Create(AEvent: TEvent); begin inherited Create; FEvent := AEvent; end; procedure TPushPopThread.Execute; var Item, PoppedItem: PEventItemHolder; begin TInterlocked.Increment(FReadyCount); try FEvent.WaitFor(INFINITE); TInterlocked.Increment(FLiveCount); try try Item := nil; try for var i := 0 to ITERATIONS-1 do begin if (Terminated) or (FFailure) then exit; if (Random(4) = 0) then // A lot more Pops that Pushes begin if (Item = nil) then New(Item); Item.Event := pointer(TInterlocked.Increment(FStackCount)); Push(FEventCache, Item); Item := nil; end else begin PoppedItem := Pop(FEventCache); if (PoppedItem <> nil) then begin TInterlocked.Decrement(FStackCount); if (Item <> nil) then FreeMem(Item); Item := PoppedItem; end; end; TInterlocked.Increment(FOperations); end; // Pop and free remaining items PoppedItem := Pop(FEventCache); while (PoppedItem <> nil) do begin TInterlocked.Decrement(FStackCount); FreeMem(PoppedItem); PoppedItem := Pop(FEventCache); end; finally if (Item <> nil) then FreeMem(Item); end; except FFailure := True; raise; end; finally TInterlocked.Decrement(FLiveCount); end; finally TInterlocked.Decrement(FReadyCount); end; end; procedure TForm11.ButtonOldStartClick(Sender: TObject); begin var Event := TEvent.Create(nil, True, False, ''); try for var i := 0 to THREAD_COUNT-1 do if (Threads[i] = nil) then Threads[i] := TPushPopThread.Create(Event); // Wait for all threads ready to execute while (TPushPopThread.ReadyCount < THREAD_COUNT) do Sleep(1); // Start all threads Event.SetEvent; // Wait for thread completion while (TPushPopThread.LiveCount > 0) do Sleep(1); finally Event.Free; end; if (TPushPopThread.Failure) then ShowMessage('Failed') else if (TPushPopThread.StackCount > 0) then ShowMessage('Stack lost items') else ShowMessage('Success'); for var i := 0 to THREAD_COUNT-1 do if (Threads[i] <> nil) then begin Threads[i].WaitFor; FreeAndNil(Threads[i]); end; end;
  6. Anders Melander

    Revisiting TThreadedQueue and TMonitor

    The test is IMO doing too much so it's hard to tell where the problem originates. Since we've been focusing on the TEventItemHolder stack I would start by verifying that the original code fails (although it's obvious that is has the ABA problem) and the new code doesn't fail. Then I would validate TMonitor.Wait and only then would I validate TThreadedQueue.PopItem.
  7. Anders Melander

    Record Circular References

    Not in this single case, I agree, but beware of tunnel vision. Regardless the obstacle is probably more a question of resources and priorities. It might also not be easy to shoehorn something like this into the existing compiler. If it's implemented like a traditional one-pass compiler it will be using various optimizations that can only be made because it's a one-pass compiler.
  8. Anders Melander

    Record Circular References

    It isn't. The type and offset of the members needs to be known too: type TMyRecord = record Foo: integer; Bar: string; end; TMyClass = class private FFooBar: TMyRecord; public property Foo: integer read FFooBar.Foo; property Bar: string read FFooBar.Bar; end;
  9. Anders Melander

    Revisiting TThreadedQueue and TMonitor

    You're missing an "n" in Exchange and the NewValue and CurrentValue parameters should be declared as const or [ref].
  10. Anders Melander

    Revisiting TThreadedQueue and TMonitor

    It's been raining so here you go (completely untested): type TEventStack = record Counter: int64; Head: PEventItemHolder; procedure Push(EventItem: PEventItemHolder); function Pop: PEventItemHolder; end; var EventCache: TEventStack; ... EventItemHolders: TEventStack; procedure TEventStack.Push(EventItem: PEventItemHolder); var Current, Next: TEventStack; begin repeat // We don't need to copy atomically since the test below will detect tearing // but since the members should be aligned tearing should not occur anyway. Current := Self; EventItem.Next := Current.Head; Next.Head := EventItem; Next.Counter := Current.Counter + 1; // I'm assuming TInterlocked.CompareExchange returns a boolean: True on success. until TInterlocked.CompareExchange(Self, Next, Current); end; function TEventStack.Pop: PEventItemHolder; var Current, Next: TEventStack; begin repeat Current := Self; if (Current.Head = nil) then Exit(nil); Next.Head := Current.Head.Next; Next.Counter := Current.Counter + 1; until TInterlocked.CompareExchange(Self, Next, Current); Result := Current.Head; end; I've made the two functions members of the record to get rid of the first parameter. What you (or someone else) need to do is provide an implementation of TInterlocked.CompareExchange, AtomicCmpExchange or CAS that handles 16 bytes and returns a boolean indicating success.
  11. Anders Melander

    Revisiting TThreadedQueue and TMonitor

    Oh. I generally don't bother with attached or linked files. However from a brief glance at your code this looks fishy: until (AtomicCmpExchange(Stack.Spin, Spin + 1, Spin) = Spin) and CAS(Result, Spin + 1, Result.Next, Spin + 2, Stack); I would think you'd need to test and update both Spin and Next in one go. That's why you need the 128 bit CAS. I don't see the need for both the AtomicCompareExchange and the CAS. You also need to modify the Push functions since it has the same problems as the Pop. I might give it a go this week-end (if the weather is bad 🙂) but I suspect someone else might beat me to it.
  12. Anders Melander

    Revisiting TThreadedQueue and TMonitor

    Ouch! One would have thought they knew not to write code like that. As far as I can tell it can be solved "simply" by changing the head of the two stacks from a pointer to record containing the pointer and a counter: type TEventStack = record Counter: int64; Head: PEventItemHolder; end; var EventCache: TEventStack; ... EventItemHolders: TEventStack; ...and then change the Push and Pop functions to increment the transaction counter on each operation. Finally we just need a version of AtomicCmpExchange that operates on 128 bits (e.g. the 16-byte CAS mentioned above) to update the transaction Counter and the Head pointer in one go. I believe this will take care of the ABA problem.
  13. Anders Melander

    Record Circular References

    Classes are reference types.
  14. Anders Melander

    Record Circular References

    You mean a preprocessor. Adding a preprocessor would amount to the same as making the compiler multi-pass.
  15. Anders Melander

    Record Circular References

    Yes, probably although I don't think the front end is enough. But more importantly I think it's about cost/benefit and the limited resources available to them.
  16. Anders Melander

    Record Circular References

    You are not stuck. There's always choice. I'm assuming you've chosen to stick with Delphi because you like the language. With the benefits Delphi provides comes some limitations. That's just part of the equation. Because it's a one-pass compiler (mostly). The benefit is that it is fast. The price is that there are some things that are not possible. Reference types can be forward declared because their size are always known (=SizeOf(pointer)). The size of record types are not known until they have been fully declared. This means that the compiler cannot determine the layout of a record type if it contains other record types that have not been fully declared, and it needs to know the layout in order to generate the code that uses it. This problem can be solved, with certain limitations, while still staying a one-pass compiler, but at a cost of added complexity in the compiler and longer compile time.
  17. Anders Melander

    Record Circular References

    Without access to the compiler source it's hard to tell, but I'll bet there are some. For one, as far as I can tell, it would require forward declaration of the record and I know there are good reasons why that isn't possible. How would you envision forward declaration of a record method would look?
  18. Anders Melander

    TThread always raises OS Errror

    Agree. But your wise words are missing the point that the value of GetLastError is irrelevant in the context. Being curious about the value is pointless. Those that are curious about it regardless could just examine the source, and when that turns up no clue, trace though CreateThread in a kernel debugger, but in the end it will amount to nothing because the value [drum roll] doesn't matter.
  19. Anders Melander

    TThread always raises OS Errror

    You are wasting both your own and our time but as usual here the discussion continues ad nauseam even though the question has been resolved in the very first answer to the original post. I'm not sure how to explain it in a way that we haven't already tried. The value is meaningless, undocumented, random, forget about it ffs! If you really want an answer to this question, I suggest you ask it on stackexchange. I'm sure it will get the answers it deserves... Add a link back to this discussion for bonus points.
  20. Anders Melander

    TThread always raises OS Errror

    I think you did. Since the value of GetLastError is meaningless the way you used it: It doesn't matter and It doesn't matter
  21. Anders Melander

    TThread always raises OS Errror

    AFAIK it isn't and there's no reason for you to do it yourself because the value is irrelevant unless you're testing the result of an API function that has indicated that GetLastError should be used to get information about a failure. Here's another example of the exact same mistake: CreateThread() // GetLastError() returns 87 You don't need to "clear LastError" unless your function is using GetLastError as it's own status reporting mechanism and that would be very rare.
  22. Anders Melander

    TThread always raises OS Errror

    Probably not. The interpretation would depend on the API function that set the status value (notice I'm not calling it an error code). For example it could mean that a supplied parameter value didn't apply to the current context. As long as the caller handles the condition somehow then everything is fine.
  23. Anders Melander

    TThread always raises OS Errror

    I agree. The test case is invalid. Nope. There's no API contract that promises you that GetLastError will or should be zero at the point where you are testing it. If it's important to you that it is zero then sure, go ahead and set it to zero but it would be better to not misuse GetLastError that way. You should test GetLastError right after you have made an Win API call because at that point it will have been set to a relevant value.
  24. Um... you're responding to a post which is over a year old, written by a person who's no longer alive...
×