pyscripter 689
Posted January 13, 2019 (edited)

The code:

    var C: AnsiChar := #$0A;
    if C in [#$A, #$D] then

generates the following assembly code in 32 bits:

    Project2.dpr.26: var C: AnsiChar := #$0A;
    004F9C10 C645FF0A        mov byte ptr [ebp-$01],$0a
    Project2.dpr.41: if C in [#$A, #$D] then
    004F9C14 8A45FF          mov al,[ebp-$01]
    004F9C17 2C0A            sub al,$0a
    004F9C19 7404            jz $004f9c1f
    004F9C1B 2C03            sub al,$03
    004F9C1D 751C            jnz $004f9c3b

On the other hand,

    var C: Char := #$0A0A;
    if C in [#$A, #$D] then

generates the following:

    Project2.dpr.26: var C: Char := #$0A0A;
    004F9C10 66C745FE0A0A    mov word ptr [ebp-$02],$0a0a
    Project2.dpr.41: if C in [#$A, #$D] then
    004F9C16 668B45FE        mov ax,[ebp-$02]
    004F9C1A 6683E80A        sub ax,$0a
    004F9C1E 7406            jz $004f9c26
    004F9C20 6683E803        sub ax,$03
    004F9C24 751C            jnz $004f9c42

Notice that it handles the wide char correctly. However, the compiler issues the following warning:

    [dcc32 Warning] Project2.dpr(41): W1050 WideChar reduced to byte char in set expressions. Consider using 'CharInSet' function in 'SysUtils' unit.

Question 1: Why is the warning issued, given that the generated code does not reduce the wide char to a byte?

Question 2: Doesn't this mean that RSP-13141 has been resolved except for the warning? In the discussion of that issue @Arnaud Bouchez points out that the warning is misleading.
In 64 bits the generated code looks much more complex:

    Project2.dpr.26: var C: Char := #$0A0A;
    00000000005716E8 66C7452E0A0A      mov word ptr [rbp+$2e],$0a0a
    Project2.dpr.41: if C in [#$A, #$D] then
    00000000005716EE 480FB7452E        movzx rax,word ptr [rbp+$2e]
    00000000005716F3 6683E808          sub ax,$08
    00000000005716F7 6683F807          cmp ax,$07
    00000000005716FB 7718              jnbe TestCharInSet + $35
    00000000005716FD B201              mov dl,$01
    00000000005716FF 8BC8              mov ecx,eax
    0000000000571701 80E17F            and cl,$7f
    0000000000571704 D3E2              shl edx,cl
    0000000000571706 480FB60556000000  movzx rax,byte ptr [rel $00000056]
    000000000057170E 84C2              test dl,al
    0000000000571710 0F95C0            setnz al
    0000000000571713 EB02              jmp TestCharInSet + $37
    0000000000571715 33C0              xor eax,eax
    0000000000571717 84C0              test al,al
    0000000000571719 7422              jz TestCharInSet + $5D

Question 3: Why is the code so much more complex in 64 bits?

Please forgive my ignorance.

Edited January 13, 2019 by pyscripter
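As a reading aid for the 64-bit listing above, here is a rough sketch of what those instructions appear to compute, written back as Delphi. It is an interpretation, not compiler output: the function name InCrLfSet is made up for illustration, and the mask value $24 is an assumption, since the listing does not show the byte that is loaded from [rel $00000056].

```pascal
program BitmaskSketch;

{$APPTYPE CONSOLE}

// Hypothetical helper that mirrors the Win64 listing: a range check on
// #$08..#$0F first, then a bit test against a one-byte mask.
function InCrLfSet(C: WideChar): Boolean;
const
  // Assumed mask value: bit 2 stands for #$0A and bit 5 for #$0D,
  // both counted as offsets from #$08.
  Mask = $24;
var
  Offset: Integer;
begin
  Offset := Ord(C) - 8;                      // sub ax,$08
  // The listing does this with a single unsigned compare (cmp ax,$07 / jnbe).
  if (Offset < 0) or (Offset > 7) then
    Exit(False);                             // xor eax,eax
  Result := ((1 shl Offset) and Mask) <> 0;  // shl / test / setnz
end;

begin
  Writeln(InCrLfSet(#$000A));
  Writeln(InCrLfSet(#$000D));
  Writeln(InCrLfSet(#$0A0A));
end.
```

Under these assumptions the helper reports True for #$000A and #$000D and False for #$0A0A, because the range check compares the full 16-bit value before the mask is consulted.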
Rudy Velthuis 91
Posted February 14, 2019 (edited)

Ad 1: The warning is correct. Both #$A and #$D are WideChar values reduced to AnsiChars.

Ad 2: No, that doesn't mean it is resolved (I don't see any signs that we will get built-in Pascal sets with 16-bit wide elements). Note that it is not C that is reduced, only #$000A and #$000D. In such comparisons, if that is simpler, you will often see a set being reduced to a range.

FWIW, instead of

    if A in ['A'..'Z'] then
      ... ;

you can do:

    case A of
      'A'..'Z': ... ;
    end;

That is a little less convenient, but the range syntax lets you use all WideChar values, even the ones > #255.

Ad 3: That is indeed weird. Even with optimizations on, you'll see the same. But this seems to be related to how LLVM handles DEBUG and RELEASE modes. It is quite possible that the actual code in RELEASE mode (no-debug settings) is really optimized, but you can't see this in the CPU view. I say this because this kind of extremely ugly and uselessly complicated output can be seen from the Clang compilers too, in DEBUG mode. But as soon as DEBUG is off, you get extremely fast generated code. I just can't prove this yet.
I managed to show this in the CPU view for a Clang-compiled program a few times, but sometimes that worked, sometimes it didn't. I can see that the code is shorter in RELEASE mode (48 instead of 64 bytes).

Edited February 14, 2019 by Rudy Velthuis
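To make the two alternatives mentioned in this thread concrete (the CharInSet function named in the W1050 warning, and the case statement suggested above), here is a minimal sketch. The helper name IsLineBreak and the choice of #$2028 as an extra label are illustrative assumptions, not something taken from the posts.

```pascal
program WideCharTests;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Hypothetical helper: case labels are compared as full 16-bit values,
// so labels above #255 are allowed.
function IsLineBreak(C: Char): Boolean;
begin
  case C of
    #$000A, #$000D, #$2028: Result := True;  // LF, CR, LINE SEPARATOR
  else
    Result := False;
  end;
end;

var
  C: Char;
begin
  C := #$0A0A;
  // CharInSet takes a set of AnsiChar and compiles without the W1050 warning.
  Writeln(CharInSet(C, [#$A, #$D]));
  Writeln(IsLineBreak(C));
  Writeln(IsLineBreak(#$2028));
end.
```

Both tests should reject #$0A0A, and only the case version can accept a character such as #$2028, since a set passed to CharInSet cannot hold elements above #255.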