Jump to content
Mike Torrettinni

The Case of Delphi Const String Parameters

Recommended Posts

29 minutes ago, balabuev said:

And this leads to the conclusion, that any potential additional field, like LockCount will also need locking. Moreover locking of two integer fields simultaneously is a much bigger problem (imho).

My LockCount theory wasn't for threading at all, it was for marking the content as mute, simply marking the associated data as immutable (aka temporary const ), to correct and prevent the leak mentioned in earlier posts.

 

32 minutes ago, balabuev said:

1) Even for read-only use cases, because we already described that ref-count value changes even in cases, when string char data is not modified.

2) Also for read-write use cases, even if the access to shared global string variable is protected with critical section in user's code.

Please have a look on how they are implemented for strings and you will see, they are not for thread safety as there is a check then lock (inc/dec), this is wrong and ineffective without a loop for thread safety, in other words useless.

Share this post


Link to post
32 minutes ago, Kas Ob. said:

Please have a look on how they are implemented for strings and you will see, they are not for thread safety as there is a check then lock (inc/dec), this is wrong and ineffective without a loop for thread safety, in other words useless.

Everything related this aspect is veeery tricky.

 

First, the goal of having interlocted ref-count field protection is tricky. It's not because we want thread-safe strings. It's included to be able to simulate value-type semantics for strings in muli-threaded environment (for preventing concepts leaking, in Dalija's terms). This is described in my previous post.

 

Second, the implementation is also tricky. It's a kind of state of the art piece of code. Hard to understand. Anyway, check for -1 can be made without locking because:

 

  • Only string literals have a ref-count equal to -1. And string literals are static. So, this -1 exist at its respective memory location from the start of the program to the end.
  • This -1 is never changes to any positive value and vise versa.
  • So, if you compare any ref-count with -1 even without protection, you will always detect, whether this string data is a static string literal data, or dynamically allocated data. During this comparison the value itself of a positive ref-count does not matter.

 

 

Edited by balabuev

Share this post


Link to post
26 minutes ago, balabuev said:

First, the goal of having interlocted ref-count field protection is tricky. It's not because we want thread-safe strings. It's included to be able to simulate value-type semantics for strings in muli-threaded environment (for preventing concepts leaking, in Dalija's terms). This is described in my previous post.

I am sorry to disagree here, in my opinion were added as futile try to make strings thread safe, and all what achieved is the fake sense of safety, i am sticking to my description of them as useless and wasting at least +18 cycle per call.

 

28 minutes ago, balabuev said:

Second, the implementation is also tricky. It's a kind of state of the art piece of code. Hard to understand. Anyway, check for -1 can be made without locking because:

 

  • Only string literals have a ref-count equal to -1. And string literals are static. So, this -1 exist at its respective memory location from the start of the program to the end.
  • This -1 is never changes to any positive value and vise versa.

You can call it tricky, but lets simplify this algorithm and mechanism to its most basic then build on that

 

Since a call like UStrAsg is been called on any strings (const or not), means we are reaching the comparison to -1, then if it is -1 then branch to skip the addition, so what gain of of not adding 1 to it, and to explain this little deeper, we can have faster const string by initialize them to 1 or even 10 and skip compare and jump, and here i am talking about how CPU's works and where they does waste cycles, and yes a simple add (without LOCK) is way faster and performance friendly than compare and jump.

 

Also if it is initialized to 1 then it is impossible to reach 0 and then trigger the free on global const string, because this will happen with memory corruption or broken try..finally, considering the const strings are already allocated and initialized in read-and-write memory.

35 minutes ago, balabuev said:

So, if you compare any ref-count with -1 even without protection, you will always detect, whether this string data is a static string literal data, or dynamically allocated data. During this comparison the value itself of a positive ref-count does not matter.

if it is -1 or 3 or whatever the overwrite or free is happening in one case, and this is when RefCount goes from 1 to 0, simple and clear.

 

So after we simplified the mechanism so much, don't you think it is possible to do it better ? and faster many folds ?

Share this post


Link to post
5 minutes ago, Kas Ob. said:

I am sorry to disagree here, in my opinion were added as futile try to make strings thread safe, and all what achieved is the fake sense of safety

 

Without interlocking, strings as well as dynamic arrays will not be usable in multi-threaded enviroment at all. Again, given this code:

var
  G: string;
  CriticalSection: ...;

procedure ThreadTask;
var
  s: string;
begin
  CriticalSection.Lock;   // Protected access to shared global G variable.
  s := G;                 //
  CriticalSection.Unlock; //

  DoSomethingWithS(s);
end;

What do you say to the user, when he starts complaing about random AVs? The code is correct.

 

Share this post


Link to post
1 hour ago, balabuev said:

 

Not agree. Imagine several identical tasks running in different threads:


var
  G: string;
  CriticalSection: ...;

procedure ThreadTask;
var
  s: string;
begin
  CriticalSection.Lock;   // Protected access to shared global G variable.
  s := G;                 //
  CriticalSection.Unlock; //

  DoSomethingWithS(s);
end;

 

Logical viewpoint: The code above reads the string from shared global variable G into thread local variable s. Then, since s is local we can do with it anything without futher protection.

 

Phisical viewpoint: Even after assignment of value of G to variable s, they still point to same char memory, so, more than one thread can access it. This also applies to reference count field, which is a part of metioned shared memory.

 

In this case the lock is needless (with interlocked reference counting). The interesting case has to be where G is modified. But the reference counting is interlocked, and I'm sure it's there for a reason, and I'm sure that I've forgotten what that reason is.

Share this post


Link to post
9 minutes ago, David Heffernan said:

In this case the lock is needless

Yes. But my point is: even with this critical section protection and even with this read-only use case - ref-counts still should be interlocked.

Edited by balabuev
  • Like 1

Share this post


Link to post
6 minutes ago, David Heffernan said:

But the reference counting is interlocked, and I'm sure it's there for a reason, and I'm sure that I've forgotten what that reason is.

The reason is, that when S goes out of scope, it decrements the reference count of G. 

 

G should only be destroyed when its reference count reaches zero so that action must be atomic.

 

 

Share this post


Link to post
46 minutes ago, David Heffernan said:

In this case the lock is needless (with interlocked reference counting). The interesting case has to be where G is modified. But the reference counting is interlocked, and I'm sure it's there for a reason, and I'm sure that I've forgotten what that reason is.

I don't know if there is some other reason, but one reason is that with interlocked reference count you can freely use reference counted instances of any type (interfaces, dynamic arrays...) in multithreaded environment for read access if all threads have acquired their strong reference before original reference has been written to (cleared or assigned).

 

In such case you don't need any other more costly locking protection, and you can share same string, so no copying necessary. Clearing such references when they go out of scope is thread safe, because memory deallocation will happen only in case where reference count reached 0. Also getting strong reference from strong reference (variable) held by thread is also thread safe - when I said held by thread it means that no other thread is allowed to write to that variable.

Edited by Dalija Prasnikar
  • Like 2

Share this post


Link to post
31 minutes ago, A.M. Hoornweg said:

The reason is, that when S goes out of scope, it decrements the reference count of G.

Yes. Just two remarks:

  • Reference count is not belong to S or G. It belongs to shared memory to which both S and G points. And also we speaking about several threads, so there are several S variables.
  • Before S goes out of scope, it can be passed by value to some other function(s), like DoSomethingWithS(s) in my code. Or assigned to some other string variables, like s2 := s; or even just re-assigned s := 'abc'. Each such operation will increas and/or descreas shared reference counter. So, it, actually, may be touched many times.
Edited by balabuev

Share this post


Link to post

You are right about interlocked necessity for global const, and here as we discussing mechanism we need to find a way to handle this, and i really like this way of discussing of viability instead of no and wrong and "none sense stuff" !!, thank you 

 

So what if the global const are initialized with 0, plain and simple,

1) We are safe as 0 will not be reached form 1 because it will automatically call free, and from -1 it will be safe as it should not happen or an exception is due.

2) This will introduce little overhead in calculating the addition (1 or -1) and in this case we still can mange to do it branchless like this

        xor     eax, eax
        test    edi, edi
        setg    al
        lea     eax, [rax + rax - 1] //  lea     eax, [eax + eax - 1]
        cmove   eax, edi

Thank to godbolt and clang for this great and god given power in solving such algorithms, saving huge time tryin to figure a fast assembly !

 

The above will -1 for negative value and +1 for positive value and 0 for 0, means we can also go ahead and remove the LOCK (interlocked operation) for the RefCount and as explained in (1), 0 as refcount can only be overwritten by 0, and we still branchless for this case.

 

Did i miss something ? and what do you think ?

Share this post


Link to post
33 minutes ago, Kas Ob. said:

and what do you think ?

I commented only parts I understand. Basically I do not understand what is your initial idea. Even, if we simplify things and will not talk about interlocking, I cannot understand how second counter (LockCount) or negative values in RefCount can help.

 

Given your original code:

 

procedure C(st: string);
begin
  Writeln(st);
end;

procedure B(st: string);
begin
  C(st);
end;

procedure A(st: string);
begin
  B(st);
end;

 

I can rewrite it with some pseudo-code, which will higlight, what is happened inside:

 

procedure C(st: string);
begin
  _AddRef(str);
  try
    Writeln(st);
  finally
    _Release(str);
  end;
end;

procedure B(st: string);
begin
  _AddRef(str);
  try
    C(st);
  finally
    _Release(str);
  end;
end;

procedure A(st: string);
begin
  _AddRef(str);
  try
    B(st);
  finally
    _Release(str);
  end;
end;

Where _AddRef and _Release - are two RTL funcions, which increase and decrease reference count correspondingly. Can you modify this code to highligh your ideas... 

 

I just trying to understand - whether something else should be generated by the compiler directly inside A, B and C functions? Or you speak only about modifying _AddRef and _Release RTL (pseudo-)funcions?

Share this post


Link to post
22 minutes ago, balabuev said:

I just trying to understand - whether something else should be generated by the compiler directly inside A, B and C functions? Or you speak only about modifying _AddRef and _Release RTL (pseudo-)funcions?

Both in fact with the minimum changes.

 

So in code how it should go, as you added the try finally for A,B and C, now we should optimize where parameter is const is an advantage so lets assume only B will have const parameter and currently the compiler will generated code like this

procedure C(st: string);
begin
  _AddRef(str);
  try
    Writeln(st);
  finally
    _Release(str);
  end;
end;

procedure B(const st: string);
begin
    C(st);
end;

procedure A(st: string);
begin
  _AddRef(str);
  try
    B(st);
  finally
    _Release(str);
  end;
end;

we removed _AddRef, _Release and try..finally from B for perfromance, now how to protect against st in B from being changed in C and obey the contract with A, in other words when A issued a call with value of st that should be const it should be returned untouched, we know from earlier posts how the corruption happen (overwritten data), so the idea is changed the mechanism to protect st from the point A called B until return, we are not obliged to implement full LockCounter (we could but why go through all of it) we need to signal ownership only.

 

Here an example omn how to do it, lets add another set of _AddRefConst and _ReleaseConst with little difference from AddRef and Release like this

_AddRef if the RefCount is -1 do nothing else increase RefCount

_Release if -1 do nothing else decrease RefCount if RefCount is 0 then free (or call destroy/unallocate in short release for good) , (notice UStrClr will return nil after free or not)

the suggested behaviour is as the following

Global constant strings will have RefCOunt=0

_AddRef increase the RefCount, (this step i think might be simplified as i mentioned in the last assembly) RefCount will stay 0 if it is 0 will be increased by +1 if positive or decreased by -1 if negative

_Release decrease the RefCount as above means 0 will stay 0 and if the RefCount is +1 then free for good, now check if the RefCount =-1 and raise an exception, we have violation and/or memory leak in other word unpredicted behaviour.

_AddRefConst will do the same as _AddRef except it will negate the RefCount and return boolean true (or anything to distinguish) for ownership if the RefCount was positive

_ReleaseConst will do exactly the same as _Release with one exception, it will accept additional parameter for ownership if true then negate the RefCount

 

Now the code above will look like this

procedure C(st: string);
begin
  _AddRef(str);
  try
    Writeln(st);
  finally
    _Release(str);
  end;
end;

procedure B(const st: string);
begin
    C(st);
end;

procedure A(st: string);
var
  TmpLocal: boolean;
begin
  TmpLocal := _AddRefConst(str);
  try
    B(st);
  finally
    _ReleaseConst(str, TmpLocal); 
  end;
end;

Notice that 

1) the changes from the compiler is done only in A where st is var and will be passed as const, B and C we didn't add any overhead to the existing generated code, so no compiler modification needed there, we handling thing in RTL for B and C

2) in A we need one local var to store ownership, ownership (for lets say strings) will be one and one only, so any consequence functions will not negate what is negative and will return false for ownership, hence we created consistency without implementing full LockCount, this is essential to solve the chaining problem var->const -> var(default) -> const ->..... the first will always have ownership and the data will return untouched.

3) the compiler is already generating temporary local var for its generated code almost everywhere due the small amount of CPU registers, so one more is an overhead of course but still a solution to that memory leak and guarantee its detection.

 

I think that is it, this will guarantee string variable passed as const to be protected, and here i don't mean multithreading, but simple conflict by reusing reference (pointer), so the before mentioned bug ( as i classified it) can be put to rest.

 

Now another approach can be a LockCount by putting it before the others in the string header, we will be breaking strings compatibility between different compilers version as long the RefCount is not reaching 0, this might be a problem and might be not, because from an EXE with an extra counter (replacing the negative algorithm above )  will and can pass a string to dll build with the old scheme and it will work fine as long as the string is referenced from EXE and the bug with const is not in the DLL, so for compatibility this approach is might be little better, though it might need deeper thinking.

Share this post


Link to post

Let's imagine that I want to change the code and call C (instead of B) from A:

 

procedure C(st: string);
begin
  _AddRef(str);
  try
    Writeln(st);
  finally
    _Release(str);
  end;
end;

procedure B(const st: string);
begin
    C(st);
end;

procedure A(st: string);
begin
  _AddRef(str);
  try
    C(st); // Call C directly without B.
  finally
    _Release(str);
  end;
end;

Whether your new _AddRefConst/_ReleaseConst functions need to be used?

 

 

Share this post


Link to post
5 minutes ago, balabuev said:

Whether your new _AddRefConst/_ReleaseConst functions need to be used?

No the old and usual ones, because we didn't made the switch here, we are passing var to var, the new ones will be only added when we make the transition for these managed types (string, TBytes...), when they are var and will be passed as constants.

Just like you did it in the above code.

 

In fact if such mechanism exist we might be able to utilize it in locking managed types.

Also in theory with global consts initialized to 0, with the suggested mechanism, i think we can drop the interlocking instruction, returning/saving very precious +18 cycle per call and the one of most expensive operation with string handling, this is huge, unless i am missing something.

Share this post


Link to post

I am thinking that might be it is easier or possible if the RefCountConst didn't increase RefCount at all, like in A (when calling B) and just negate, this might also save performance, Also in case we are calling C from A, this might be thought deeper there might be a better approach, not sure if it is needed to begin with if C is not calling any assigning function, the same for A, the compiler should be able to check if such called been made, hence the add-try,,finally-dec is needed.

Share this post


Link to post
18 minutes ago, Kas Ob. said:

No the old and usual ones

 

Ok. Then let's call many different procedures :classic_blink:

procedure A(st: string);
begin
  _AddRef(str);
  try
    B(st);
    C(st);
    B(st);
    C(st);
    B(st);
    C(st);
    B(st);
    C(st);    
  finally
    _Release(str);
  end;
end;

The power of RefCount (as it currently implemented), that AddRef/Release and try/finally is introduced in the A procedure only ones. Moreover, this try/finally is shared among all managed variables in this procedure.

 

And what is more, some cases do not need all this stuff at all:

var 
  st: string; // Variable, external to procedure.

procedure A;
begin
  B(st);
  C(st);
  B(st);
  C(st);
  B(st);
  C(st);
  B(st);
  C(st);    
end;

 

 

 

 

Share this post


Link to post
19 minutes ago, balabuev said:

Ok. Then let's call many different procedures :classic_blink:

Tricky one indeed ! 🙂

 

Here in this case if my last suggestion with removing the RefCounting inc/dec and leave the negate, in this case the code will be simplified to this to prevent locking

 

procedure UnLockString(Str); inline;
begin
  // two assembly instruction at most to simple negate or make sure it is positive RefCount
end;

procedure LockString(Str); inline;
begin
  // two assembly instruction at most to simple negate or make sure it is negative RefCount
end;

procedure A(st: string);
var
  TmpLocal: boolean;
begin
  TmpLocal := _AddRefConst(str);  // own the lock for the bulk
  try
    // because B is not locked
    UnLockString(Str);  // MakeTheRefCountPositive(str) unlock before B
    B(st);  
    LockString(str);    // MakeTheRefCountNegative(str) lock before C
    C(st);
    UnLockString(Str);
    B(st);
    LockString(str);
    C(st);
    UnLockString(Str);
    B(st);
    LockString(str);
    C(st);
    UnLockString(Str);
    B(st);
    LockString(str);
    C(st);    
  finally
    _ReleaseConst(str, TmpLocal);  // no lose end with locking means no memory leak; hre if the compiler kept track the locking it can call _Release instead
  end;
end;

And if the compiler can - and it in fact it does- track of the status of the lock then it is one instruction for both LockStr and UnlockStr,

Also the same method can be used with the second code sample when the compiler detect the existence if function like C, but this is more rare code i think.

 

That was fast thinking from me and hope i am still not missing something 🙂

 

But still the possibility of moving the locking/unlocking from A to C is viable, but this should be the compilers developers work to investigate and test, right ?

 

Share this post


Link to post
49 minutes ago, Kas Ob. said:

And if the compiler can - and it in fact it does- track of the status of the lock then...

 

This is impossible because of unpredictable conditions, loops, etc:

procedure A(st: string);
begin
  UnLockString(Str);
  B(st);
  if SomeCondition then
  begin
    LockString(Str);
    C(st);
  end;
  UnLockString(Str);
  B(st);
end;

 

Also, if you add LockString before call of C from A, then you also should add it when calling C from B:

procedure B(const st: string);
begin
  LockString(Str);
  C(st);
end;

And, actually before any call.

 

And, finally, returning to threads (even without interlocking aspect): You have two states of a string - it can be either locked or unlocked during execution. But, you really lock shared string data. So, having several threads we may run into situation, in which this shared string data should be locked in one thread and unlocked in another thread at the same time.

 

 

 

 

 

Share this post


Link to post
8 hours ago, balabuev said:

This is impossible because of unpredictable conditions, loops, etc:

Try,,finally should be there and ownership should be reserved, if the inline variables are correctly implemented by the compiler then it should be able to handle this, i think.

Here i would point this, i don't have IDE with inline variables to see how it is handled where try..finally for begin block been added by the compiler, i asked in this forum and no one commented on that and i can't find on the internet any technical details, so i am assuming here.

 

8 hours ago, balabuev said:

Also, if you add LockString before call of C from A, then you also should add it when calling C from B:

...

And, actually before any call.

No need for this, we are not doing it everywhere, the LockStr/UnlockStr (inside try..finally ) could be only added where mixed call been used just like the case you mentioned with B,C,B,C,B...

 

8 hours ago, balabuev said:

And, finally, returning to threads (even without interlocking aspect): You have two states of a string - it can be either locked or unlocked during execution. But, you really lock shared string data. So, having several threads we may run into situation, in which this shared string data should be locked in one thread and unlocked in another thread at the same time.

If a string been used by multiple threads, then 

1) it been locked and protected as whole by critical section or any other synchronizing mechanism, so its content should be accessed by only one thread at a time.

2) Lets assume two threads are using it, changing it and passing it to a function with const parameter, means the content is locked by the suggestion, here lets discuss what current behaviour is, one thread lets say is using it, so either it is assigning from it means it increased the the RefCount or it did tried to alternate it, here the data might be freed, if this happen then we adding this suggestion to prevent this to begin with by an exception that should be raised, simply this is exactly the case we need to detect and prevent with an exception, or the code will just increase the RefCount ( according to the suggestion) without the ability to take ownership of the lock, miss using strings like type in thread should be protected by the developer code like in this case, as it is his broken code not the compiler failure to guarantee consistency of constant, like the developer will have his exception at first test/run/debug code hence he is the one who should change his thread to copy the string before using it (like locally ..) when and where he see fit. 

 

The lock will be owned by one function only in one place at a time, i think it should be only when the transition is happening, and thinking about flipping with LockStr/UnlockStr this might be unneeded !! as we established the content should be unchangable. 

 

And here lets even expect the worse i mean the suggestion failed or missing critical not discussed here situation, then by changing how the RefCount work, by this suggestion (or any other means i am sure there might be other approachs) , where we can remove the interlocked operation ( removing LOCK instruction) we gained at least 18 cycle and this will always occur in pairs means the direct gain is +36 cycle, i don't know for fact how much a try..finally will cost but i don't expect it to do worse than 10 (worst case scinario) this will happen once, so the gain is substantial, that is one, on other hand passing string most likely it will be handled as string further and in many cases there is either a full try..finally or accessing the content in read only(here we need not to protect), with try..finally we most likely don't need to protect this string per se, only in case we will be passing it further as var to be changed and this will violate the contract as we build on top it is const here, so this might simplify many situations, and rethinking now in your mentioned case with B;C;B;C;B... if it did happen within C then this is not a problem , while if it is in A and due the saved time removing the interlocked operation, we are safe to assume adding the lock/unlock in A only will still faster, as we removed the LOCK operation application wide, here one thing if the case of this multiple switching between these calls is there then we even imagine try..finally for each of them as this function is not built performance wise by the developer and this will contradict his assumption and his code to be performance wise, hence adding the try..finally and lock/unlock is the right and safe action by the compiler, let the developer refactor his code.

 

If we dropped the locking in full ( this whole suggestion), and left the new RefCount in little modified version without the negative behaviour, means with no fix for the memory leak, we still gained huge performance, don't you agree ?

 

I am afraid now we are discussing extra deep details and one wrong example, as it happened in this forum quite many times, the whole idea will be deemed as useless issued by stupidity, as i witnessed quite few times, so i really don't want to lose this thread for the void, and also thank you this discussion.

Share this post


Link to post
1 hour ago, Kas Ob. said:

If a string been used by multiple threads, then 

1) it been locked and protected as whole by critical section or any other synchronizing mechanism, so its content should be accessed by only one thread at a time.

Thats not true. I've showed it in previous posts.

Share this post


Link to post
2 minutes ago, balabuev said:

Thats not true. I've showed it in previous posts.

Sure. but it was global const, and the locking/unlocking will not have any side effect, because it can't bee free and there is no negating what so ever happing on its RefCount.

 

11 hours ago, balabuev said:

And, finally, returning to threads (even without interlocking aspect): You have two states of a string - it can be either locked or unlocked during execution. But, you really lock shared string data. So, having several threads we may run into situation, in which this shared string data should be locked in one thread and unlocked in another thread at the same time.

The exception which was intended to prevent data lose and unpredicted behaviour will be happening when the RefCount goes from -1 to 0, while current code implementation behaviour will free the string when the RefCount goes from from 1 to 0, so in theory if this freeing is already happening (i mean from 1 to 0) on a string that been shared by multiple threads then releasing is occurring and the unpredicted behaviour already does exist, and this either by the bug (short implementation) by the compiler or wrong threading model by the developer, right ? am i missing something ?

 

So having two state of the string is not bad, or may be if the compilers developer can join and discuss this further, because i still can't see how this will badly impact the threading model, sure there will be always a chance that i am missing something, but can that something and its effect be considering as bad side effect, hence start different brain storming to fix it elegantly, in other words where are we with this discussing, is it worthless or we can cross the threshold from it is good/better ?

 

The thing is this suggestion combined few ideas, so it definitely can be separated in at least two ideas and investigated separately, or keep them combined to achieve a fix one and for all.

Share this post


Link to post
27 minutes ago, balabuev said:
2 hours ago, Kas Ob. said:

If a string been used by multiple threads, then 

1) it been locked and protected as whole by critical section or any other synchronizing mechanism, so its content should be accessed by only one thread at a time.

Thats not true. I've showed it in previous posts.

Also this what i meant by one bad example, by me missing explaining in more details.

 

 

And thinking of this case where the interlocked operation is a must, is the case is only with global constant strings ?

because if yes then we easily can stick to initialize the global to 0 and remove the atomic addition/subtraction by utilizing a very same method, ( if we to drop the memory leak protection if full)

By changing the addition/subtraction by 1 we can use an algorithm like this

1) if the RefCount <> -1  then add/sub 1

2) if RefCount = -1 then add/sub 0 ( the RefCount didn't change)

the branchless assembly for that is

        xor     eax, eax
        cmp     edi, -1
        setne   al
        add     eax, eax
        add     eax, -1

And here we need the global constants to be initialized to -1, this can be a separate idea to enhance the performance greatly ! but only if the interlocked operation are there to protect global consts string, which i think it is.

Share this post


Link to post
1 hour ago, Kas Ob. said:

Sure. but it was global const

It was global variable.

Share this post


Link to post
1 hour ago, Kas Ob. said:

The exception which was intended to prevent data lose and unpredicted behaviour will be happening when the RefCount goes from -1 to 0, while current code implementation behaviour will free the string when the RefCount goes from from 1 to 0, so in theory if this freeing is already happening (i mean from 1 to 0) on a string that been shared by multiple threads then releasing is occurring and the unpredicted behaviour already does exist, and this either by the bug (short implementation) by the compiler or wrong threading model by the developer, right ? am i missing something ?

 

Wrong. At least in the context I speaking about. Lets take the code:

procedure A(st: string);
var
  TmpLocal: boolean;
begin
  TmpLocal := _AddRefConst(str); 
  try
    UnLockString(Str);  // MakeTheRefCountPositive(str) unlock before B
    B(st);  
    LockString(str);    // MakeTheRefCountNegative(str) lock before C
    C(st);
    UnLockString(Str);
    B(st);
    LockString(str);
    C(st);
    UnLockString(Str);
    B(st);
    LockString(str);
    C(st);
    UnLockString(Str);
    B(st);
    LockString(str);
    C(st);    
  finally
    _ReleaseConst(str, TmpLocal);
  end;
end;

The initial user's code itself (without our additions) is correct and should work without any bugs or exceptions.

 

During the invokation of this code reference counter will be switched from positive to negative and back several times. Because this is what your LockString/UnlockString function do.

 

So, lets execute this (fully correct) code in two threads concurrently passing the value of the same global variable G into both:

var
  G: string;

procedure TForm11.Button1Click(Sender: TObject);
begin
  G := 7.ToString; // Init with non contant string data.

  TTask.Run(procedure
  begin
    A(G);
  end);

  TTask.Run(procedure
  begin
    A(G);
  end);
end;

So, both running A procedures will share same string data and same reference count. And they both will switch this single shared reference count from negative to positive and back in unpredictable order.

 

This single observation lead me to conclusion, that your way is wrong.

Share this post


Link to post
20 minutes ago, balabuev said:

It was global variable.

Then it can be protected like the local var .

 

But here I think i might be start to understand what are pointing to, i might be wrong though, so please correct me

You are referring to such code

var
  GlobalString: string;

procedure ThreadProc1(st: string); // could be const 
begin 
// we are in a thread and accessing only st (not GlobalString)
end;

procedure ThreadProc2; // could be const 
begin 
// we are in a thread and accessing GlobalString
end;

begin
 GlobalString := something..
 // StartManyThreadsand and call ThreadProc1 (or ThreadProc2 too)
end;

So here lets discuss this,

For ThreadProc1

1) if ThreadProc1 is changing the content of st then we have a problem even with current behavior and code, and this is wrong doing by the developer as strings are not thread safe.

2) If ThreadProc1 is not changing st, then this will work fine in the current implementation, and here the interlocked operation guarantee the safety of RefCount, but lets delay this to ThreadProc2 

 

For ThreadProc2

1) if ThreadProc1 is changing the content of GlobalString then we have a problem even with current behavior and code, and this is wrong doing by the developer as strings are not thread safe. just like ThreadProc1

2) If ThreadProc1 is not changing GlobalString, then this will work fine in the current implementation, and here the interlocked operation guarantee the safety of RefCount, now lets discuss this behaviour, on one hand removing the interlocked operation will definitly break the compatibility with current, most likely many already exist application out there will be broken right away, but this will lead us to the most important question which is

 

Is this assumption sanctioned by Delphi as language and its documentation, i mean global strings are thread safe ?

To my knowledge the answer is no, and would like to be corrected on this.

 

Now if the answer is no then the compiler with new behaviour is not breaking any rules, and the code done by many application is the wrong and should be corrected, (also there is tweak/workaround) can used because we assuming GlobalString is not changed then simply by invoking UniqueString to be used with in thread will solve this, and this new behaviour can be documented and will fix all the broken application with the new behaviour, in other words this should be the developer concern.

if the answer is Yes then we need to find better approach to either modify the suggestion in whole or a remedy to this global strings usage, does that make sense to you ?

Share this post


Link to post

Create an account or sign in to comment

You need to be a member in order to leave a comment

Create an account

Sign up for a new account in our community. It's easy!

Register a new account

Sign in

Already have an account? Sign in here.

Sign In Now

×