dummzeuch

Using dxgettext on Windows 10


This might be completely off beam, but I thought it worth raising.

 

There was a problem in IDEFixPack on some older AMD processors that involved converting UTF8 strings.

 

Quoting from the thread about this issue:

 

<< This fixes the usage of a SSE 4.1 CPU instruction ("ptest") in a code block that only checked for SSE 2.

 

A function that converts UTF8Strings that contain only ASCII characters to UnicodeStrings used an SSE 4.1 CPU instruction. But the SSE instructions are only used if there are more than 15 characters in the UTF8String.

>>

 

Might this relate to the problem I'm having with gnugettext?

 

In the context of IDEFixPack, the bug caused older CPUs to throw a 0xC000001D "illegal instruction" exception. I'm not seeing that exception, but I think there is an exception in the call stack that I haven't yet been able to capture as it occurs.

 

11 hours ago, Sue King said:

There was a problem in IDEFixPack with some older AMD processors which involves converting UTF8 strings.

<< This fixes the usage of a SSE 4.1 CPU instruction ("ptest") in a code block that only checked for SSE 2.

 

A function that converts UTF8Strings that contain only ASCII characters to UnicodeStrings used an SSE 4.1 CPU instruction. But the SSE instructions are only used if there are more than 15 characters in the UTF8String.

>>

 

Might this relate to the problem I'm having with gnugettext ?

Possible. Since I have no idea what caused it, I can't really say. But as far as I understand how IDEFixPack works, it should not affect the compiled executables.

 

Uninstall the IDEFixPack, compile your test program and test it again. Then you'll know.


I don't think that the IDEFixPack has anything to do with gnugettext translations. 

 

I was wondering if the bug found in IDEFixPack might also be in some assembler called somewhere by something in gnugettext.  It was a long shot as the bug is related to converting utf8 strings.


Ah, OK, I understand what you are getting at.

 

Hm, I don't remember seeing any assembler code in gnugettext.pas. And since it is a standalone unit that uses only Delphi system units, I doubt that this could be the case. Even if it were the case, I would not know where to look.

 

Also, IIRC you said that you ran your test program in the debugger. This kind of bug should have triggered the debugger to stop. You didn't mention any such error, so I guess there wasn't one.


I just added a test for using gnugettext in a multithreaded environment:

 

https://svn.code.sf.net/p/dxgettext/code/trunk/tests/MultithreadedResourceStringTest

 

Could you please check out this test and try to run it on your computer? If it's something that happens only there, maybe this simple test case can provide some more information.

 

There are two projects, one for Delphi 2007 and the other for Delphi 10.2. I haven't got Delphi 10.3 here to test, but loading and compiling the Delphi 10.2 project in Delphi 10.3 should work fine.

 

Note that the project is meant to be run in the integrated debugger because in case of failure, the threads just raise an exception. These exceptions are only visible in the debugger.
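For readers who cannot check out the repository, here is a minimal sketch of what a test of this kind might look like. It is not the code from the SVN link; the thread class TResStringThread, the resource string SSample and the iteration count are made up for illustration, and the unit names assume Delphi XE2 or later. Several threads repeatedly call LoadResString and raise an exception if the result is not the expected text. An exception raised inside Execute does not reach the console, which is why a test like this has to be run in the debugger.

```delphi
program ResStringThreadsSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

resourcestring
  // Hypothetical resource string, only used by this sketch.
  SSample = 'Sample resource string';

type
  // Hypothetical worker thread; not taken from the dxgettext test project.
  TResStringThread = class(TThread)
  protected
    procedure Execute; override;
  end;

procedure TResStringThread.Execute;
var
  i: Integer;
begin
  for i := 1 to 100000 do
    // LoadResString is the routine that gnugettext hooks.
    if LoadResString(@SSample) <> 'Sample resource string' then
      raise Exception.Create('Unexpected resource string value');
end;

var
  Threads: array[0..3] of TResStringThread;
  i: Integer;
begin
  // Start all workers, then wait for each of them to finish.
  for i := Low(Threads) to High(Threads) do
    Threads[i] := TResStringThread.Create(False);
  for i := Low(Threads) to High(Threads) do
  begin
    Threads[i].WaitFor;
    Threads[i].Free;
  end;
  Writeln('All threads finished');
end.
```

To exercise gnugettext with a sketch like this, gnugettext.pas would have to be added to the project, as described for the Nexus demo.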


This test does not show the problem.  I was about to try something multi threaded myself, so I'm using your test as a base and looking at the code that does fail to try and make your test more like the failing test.  This is giving me more insight into the code that is actually failing so I will try again to catch the exception that shows in the call stack.

 

In the test I'm running, I am not calling any of the gnugettext functions directly, like AddDomain.  It is a Nexus demo with gnugettext.pas added to the dpr.  All calls to gnugettext are done indirectly using LoadResString (as far as I can see).  There are also no .mo files for translating.

 

Nexus hooks into the exception handler and has its own processing of exceptions before they are raised.

 

Early versions of gnugettext did not cause issues.  I did do some investigating to see when the problem was introduced, which I think I mentioned in an earlier post.

 

I took another approach to see if it is related to the processor - trying it on another machine that has an Intel processor, not an AMD.  It still goes into a loop, so that rules that idea out.

 


I have found a possible solution to this.

 

In TGnuGettextInstance.dgettext I replaced

UTF8Decode

with

UTF8ToUnicodeString

 

Is there any reason why UTF8ToUnicodeString could not be used instead of UTF8Decode?

As far as I can see, the difference is that UTF8Decode returns a WideString and UTF8ToUnicodeString returns a UnicodeString, which is what the result field is defined as. Does this mean there is an extra conversion from WideString to UnicodeString that might be causing the issue?
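To make the difference concrete, here is a small sketch, assuming a Unicode-aware Delphi (2009 or later). The variable names and the sample text are made up. Both calls decode the same UTF8String; the first goes through a WideString result that is converted to UnicodeString on assignment, while the second returns a UnicodeString directly.

```delphi
program Utf8DecodeSketch;

{$APPTYPE CONSOLE}

var
  Raw: UTF8String;
  ViaWide: UnicodeString;
  Direct: UnicodeString;
begin
  // Plain ASCII sample text with more than 15 characters.
  Raw := 'Hello, translated world';

  // UTF8ToWideString returns a WideString (a COM BSTR on Windows);
  // assigning it to a UnicodeString adds an implicit conversion.
  ViaWide := UTF8ToWideString(Raw);

  // UTF8ToUnicodeString returns a UnicodeString, so no extra conversion.
  Direct := UTF8ToUnicodeString(Raw);

  // Both paths should yield the same text.
  Writeln(ViaWide = Direct);
end.
```

This only shows that the two routes differ in the intermediate string type. It does not demonstrate the hang itself.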


Hm, that's odd:

 

if UNICODE is defined:

gnugettext.Utf8Decode calls System.Utf8ToWideString which in turn calls System.Utf8Decode (in Delphi 10.3.1)

I wonder what caused that Utf8Decode function to be added to gnugettext. Maybe there was a bug in early UNICODE-aware Delphis? Or maybe it's simply because Utf8Decode was marked deprecated at some time.

 

System.UTF8ToUnicodeString contains basically the same code as System.Utf8Decode, except that the Temp string is a UnicodeString rather than a _WideStr. I'm too tired to get my head around the implications. (I have been banging my head on the table all day to find a solution for a BDE problem under Windows 10 😞. It seemed to work at last, when all of a sudden I got a privileged instruction error 😞)

 

For now I see no reason why gnugettext.Utf8Decode should not call System.UTF8ToUnicodeString rather than System.Utf8ToWideString, but I'll have to run some tests.


OK, looking at it again, I still see no reason not to change gnugettext.Utf8Decode from calling System.Utf8ToWideString to System.UTF8ToUnicode.

The svn blame function unfortunately does not give any insight because this code was already migrated from Berlios to SourceForge in 2012 and there is no further history available.

 

So I'll commit that change.


Reading the thread it seems to confirm my theory in the other thread you posted @dummzeuch, that the issue is related to reference counting. Some code is probably treating WideStrings (which are COM strings) as UnicodeStrings (which are reference counted) or somehow misinterpreting/casting data types along the way.

 

 @Sue King: It would be helpful if you could provide links (or simply attach) both versions of dxgettext, the old one that worked, and the new one that doesn't. If there is a minimal demo, which works with nexus db trial DCUs, that would be even better.

Edited by mael


I think there is an additional twist: The function gnugettext.Utf8Decode is being inlined, so there is one more possibility for code generation errors. (Took me a while to figure out why my break points in that function never worked.)
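One way to make such breakpoints reachable is to switch inlining off while debugging. The following is only a sketch: DecodeUtf8 is a hypothetical stand-in for an inlined wrapper like gnugettext.Utf8Decode, and it assumes the {$INLINE OFF} compiler directive, which stops calls that follow it from being expanded inline.

```delphi
program InlineOffSketch;

{$APPTYPE CONSOLE}

// Hypothetical stand-in for an inlined wrapper such as gnugettext.Utf8Decode.
function DecodeUtf8(const S: RawByteString): UnicodeString; inline;
begin
  Result := UTF8ToUnicodeString(S);
end;

// From here on, calls to routines marked "inline" are compiled as
// ordinary calls, so a breakpoint inside DecodeUtf8 can be hit.
{$INLINE OFF}

begin
  Writeln(DecodeUtf8('plain ASCII sample text'));
end.
```

In a real project the directive would go into the unit that calls the wrapper, and it would be removed again once debugging is done.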

On 3/6/2019 at 10:04 AM, dummzeuch said:

For now I see no reason why gnugettext.Utf8Decode should not call System.UTF8ToUnicodeString rather than System.Utf8ToWideString, but I'll have to run some tests.

 

On 3/9/2019 at 4:44 AM, dummzeuch said:

OK, looking at it again, I still see no reason not to change gnugettext.Utf8Decode from calling System.Utf8ToWideString to System.UTF8ToUnicode.

I can think of one reason: System.UTF8ToUnicode() and its companion System.UnicodeToUTF8() were broken, or more accurately incomplete, in Delphi 6-2007. They did not support 4-byte UTF-8 sequences (Unicode codepoints outside of the BMP), only 1-3 byte sequences (Unicode codepoints in the BMP). That was not fixed until Delphi 2009, when they were rewritten to use platform conversions instead of manual conversions.

 

Now granted, at the time, 4-byte UTF-8 sequences were pretty rare, typically only seen in strings using Eastern Asian languages.  But in modern Unicode, they are much more common now, especially with the popularity of emojis on the rise, most of which use high codepoint values outside the BMP.
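As an illustration of the 4-byte case, here is a small sketch, assuming Delphi 2009 or later (the unit name System.SysUtils assumes XE2 or later). It builds the UTF-8 encoding of U+1F600, an emoji outside the BMP, byte by byte and decodes it. The variable names are made up. A complete decoder should return a surrogate pair, that is two UTF-16 code units (D83D DE00), whereas a decoder limited to 1-3 byte sequences cannot produce that result.

```delphi
program FourByteUtf8Sketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

var
  Raw: UTF8String;
  S: UnicodeString;
  i: Integer;
begin
  // U+1F600 encoded as the 4-byte UTF-8 sequence F0 9F 98 80.
  SetLength(Raw, 4);
  Raw[1] := AnsiChar($F0);
  Raw[2] := AnsiChar($9F);
  Raw[3] := AnsiChar($98);
  Raw[4] := AnsiChar($80);

  S := UTF8ToUnicodeString(Raw);

  // Print how many UTF-16 code units came back and their hex values.
  Writeln('UTF-16 code units: ', Length(S));
  for i := 1 to Length(S) do
    Writeln(IntToHex(Ord(S[i]), 4));
end.
```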

Edited by Remy Lebeau


R124 does not show the problem in the test application.   All versions until R47 do show the problem.  R47 does not.

 

I have been able to get R115 working by commenting out HookLoadResString.Enable.  In earlier versions that didn't work, there were different changes I made that 'fixed' the error.  It seems to me that under most conditions the error doesn't show, but when it does, its symptoms are unpredictable.  The test that @dummzeuch created generating exceptions in a multithreaded application (which is what is happening in the demo) did not create the right conditions to show the error.  The nexus code is very much more complex.

 

@mael The test program is one of the example projects provided by Nexus.  It is in Examples->Delphi->Remoting->Client to Client Messaging.

I have simply added gnugettext.pas to the project. To reproduce the problem, click on Connect and then Disconnect. The program stops responding and has to be killed, as it enters a never-ending loop trying to close sockets used for messaging.

I think you should be able to recompile this example with the trial dcus.  I've posted nexus to confirm this but haven't had a response yet.

6 hours ago, Sue King said:

 

(How do I delete a quote in the mobile interface? Backspace doesn't work.)

6 hours ago, Sue King said:

R124 does not show the problem in the test application.

OK, that's good. As long as the change does not introduce any new problems, I consider the case closed. Thanks for your contribution everybody.

On 3/11/2019 at 12:15 AM, dummzeuch said:

(How do I delete a quote in the mobile interface? Backspace doesn't work.)

I also find that annoying.  If you click on the quote, an X appears in the top-left corner, then click on that and hit backspace or delete, and SOMETIMES that works, but rarely.

 

In fact, I'm not very impressed with how this site behaves in a mobile browser overall, which sucks because I spend more time lately using mobile browsers than desktop browsers when visiting various support forums.  If I try to select more than a few words of text at a time, the selection jumps all over the place, which makes quoting and copy/pasting near impossible.  And I can't insert code blocks, there is no option for that in the editor toolbar, like there is in a desktop browser.  And it took me forever to find the "Mark site read" option in the mobile site, I finally found it in the top-level menu bar, but think it would make more sense for it to be in the "Activity" section next to "Unread content", not in the "Account" section.

