
Author Topic: Be careful with shr  (Read 15554 times)

tetrastes

  • Hero Member
  • *****
  • Posts: 766
Re: Be careful with shr
« Reply #30 on: March 06, 2018, 02:02:19 pm »
As mentioned above: "For compatibility with the x64 modes and Delphi write like this". FPC x32 generates incompatible code.

But I see that people want N shr L to be zero if L >= sizeof(N)*8  :-\

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 12901
  • FPC developer.
Re: Be careful with shr
« Reply #31 on: March 06, 2018, 02:12:40 pm »
as I understand.
But for what?

To avoid extra instructions being added for people that just want to set a register to zero in a fancy way.

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: Be careful with shr
« Reply #32 on: March 06, 2018, 02:18:18 pm »
Can you guys try this code:
Code: Pascal
program FPCShr;

{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX} {$IFDEF UseCThreads} cthreads, {$ENDIF} {$ENDIF}
  Classes,
  SysUtils
  { you can add units after this };

function GetVariable(v: QWord): QWord;
begin
  Result := v;
end;

var
  l8: byte;
  l16: word;
  l32: DWord;
  l64: QWord;
begin
  WriteLn('Constants:');
  l8 := high(l8);
  l8 := l8 shr ((sizeof(l8) * 8));
  writeln(sizeof(l8) * 8, ':', IntToHex(l8, sizeof(l8) * 2));

  l16 := high(l16);
  l16 := l16 shr ((sizeof(l16) * 8));
  writeln(sizeof(l16) * 8, ':', IntToHex(l16, sizeof(l16) * 2));

  l32 := high(l32);
  l32 := l32 shr ((sizeof(l32) * 8));
  writeln(sizeof(l32) * 8, ':', IntToHex(l32, sizeof(l32) * 2));

  l64 := high(l64);
  l64 := l64 shr ((sizeof(l64) * 8));
  writeln(sizeof(l64) * 8, ':', IntToHex(l64, sizeof(l64) * 2));
  WriteLn('');

  WriteLn('Right side is variable:');
  l8 := high(l8);
  l8 := l8 shr GetVariable(8);
  writeln(sizeof(l8) * 8, ':', IntToHex(l8, sizeof(l8) * 2));

  l16 := high(l16);
  l16 := l16 shr GetVariable(16);
  writeln(sizeof(l16) * 8, ':', IntToHex(l16, sizeof(l16) * 2));

  l32 := high(l32);
  l32 := l32 shr GetVariable(32);
  writeln(sizeof(l32) * 8, ':', IntToHex(l32, sizeof(l32) * 2));

  l64 := high(l64);
  l64 := l64 shr GetVariable(64);
  writeln(sizeof(l64) * 8, ':', IntToHex(l64, sizeof(l64) * 2));
  WriteLn('');

  WriteLn('Both sides are variable:');
  l8 := high(l8);
  l8 := Byte(GetVariable(l8)) shr Byte(GetVariable(8));
  writeln(sizeof(l8) * 8, ':', IntToHex(l8, sizeof(l8) * 2));

  l16 := high(l16);
  l16 := Word(GetVariable(l16)) shr Byte(GetVariable(16));
  writeln(sizeof(l16) * 8, ':', IntToHex(l16, sizeof(l16) * 2));

  l32 := high(l32);
  l32 := DWord(GetVariable(l32)) shr Byte(GetVariable(32));
  writeln(sizeof(l32) * 8, ':', IntToHex(l32, sizeof(l32) * 2));

  l64 := high(l64);
  l64 := QWord(GetVariable(l64)) shr Byte(GetVariable(64));
  writeln(sizeof(l64) * 8, ':', IntToHex(l64, sizeof(l64) * 2));

  WriteLn('');

  WriteLn('Left side is variable:');
  l8 := high(l8);
  l8 := Byte(GetVariable(l8)) shr 8;
  writeln(sizeof(l8) * 8, ':', IntToHex(l8, sizeof(l8) * 2));

  l16 := high(l16);
  l16 := Word(GetVariable(l16)) shr 16;
  writeln(sizeof(l16) * 8, ':', IntToHex(l16, sizeof(l16) * 2));

  l32 := high(l32);
  l32 := DWord(GetVariable(l32)) shr 32;
  writeln(sizeof(l32) * 8, ':', IntToHex(l32, sizeof(l32) * 2));

  l64 := high(l64);
  l64 := QWord(GetVariable(l64)) shr 64;
  writeln(sizeof(l64) * 8, ':', IntToHex(l64, sizeof(l64) * 2));

  ReadLn;
end.

Edit:
Corrected the code & format.
Added the last case.

The output here on a 32-bit Windows:
Quote
Constants:
8:00
16:0000
32:FFFFFFFF
64:FFFFFFFFFFFFFFFF

Right side is variable:
8:00
16:0000
32:FFFFFFFF
64:0000000000000000

Both sides are variable:
8:00
16:0000
32:FFFFFFFF
64:0000000000000000

Left side is variable:
8:00
16:0000
32:FFFFFFFF
64:FFFFFFFFFFFFFFFF
« Last Edit: March 06, 2018, 02:42:00 pm by engkin »

Phemtik

  • New Member
  • *
  • Posts: 19
Re: Be careful with shr
« Reply #33 on: March 06, 2018, 03:39:22 pm »
With Trunk@38348 I get this:

Code:
Constants:
8:00
16:0000
32:FFFFFFFF
64:FFFFFFFFFFFFFFFF

Right side is variable:
8:00
16:0000
32:FFFFFFFF
64:FFFFFFFFFFFFFFFF

Both sides are variable:
8:00
16:0000
32:FFFFFFFF
64:FFFFFFFFFFFFFFFF

Left side is variable:
8:00
16:0000
32:FFFFFFFF
64:FFFFFFFFFFFFFFFF

But I still don't understand what is wrong with shr/shl when the shift count is greater than 31 or 63,
because the CPU behaves exactly as it should. The AMD and Intel manuals document this behaviour for shl/shr.
Intel i7-3610QM
Fedora 28

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: Be careful with shr
« Reply #34 on: March 06, 2018, 05:02:05 pm »
Thank you Phemtik. I assume you ran it on a 32-bit OS.

ARM CPUs behave in a different way.  Maybe that's irrelevant.

Phemtik

  • New Member
  • *
  • Posts: 19
Re: Be careful with shr
« Reply #35 on: March 06, 2018, 05:46:45 pm »
No, I used a 64-bit OS (Fedora), but I also checked in 32-bit mode.

Well, ARM states the same behaviour as AMD and Intel. ARM uses this equation for shifts (depending on the platform):
Code:
ShiftValue mod 32/64
left shift 0 to 31 and right shift 1 to 31. 32 = 0 on right shift

So it would be strange if it behaved differently from Intel or AMD.

WooBean

  • Sr. Member
  • ****
  • Posts: 303
Re: Be careful with shr
« Reply #36 on: March 06, 2018, 06:27:13 pm »
Hi to all digging CPU/FPC limits!

Thank you Phemtik. I assume you ran it on a 32-bit OS.

ARM CPUs behave in a different way.  Maybe that's irrelevant.

@engkin:
Your test program, run on Win7/64 (compiled as a 64-bit app), gives the same result as Phemtik reported.

@all:
Well, I created a "stupid" program just to illustrate that we have a problem on the FPC side: the compiler sometimes produces quite stupid code, but see below:
Code: Pascal
  1. var
  2.   l: uint64;
  3.   n: integer;
  4. begin
  5.   l:= high(l);
  6.   n:= 63;
  7.   n:=l shr n;
  8.   l:= high(l);
  9.   l:=l shr 63;
  10.   l:= high(l);
  11.   l:=l shr 64;
  12.   l:=l;
  13.   l := high(l);
  14.   n:=64;
  15.   l:=l shr n;
  16.  
  17. end.
  18.  
Disassembled code:
Code:
project2.lpr:4                            begin
0000000100001460 55                       push   %rbp
0000000100001461 4889e5                   mov    %rsp,%rbp
0000000100001464 488d6424e0               lea    -0x20(%rsp),%rsp
0000000100001469 e872230000               callq  0x1000037e0 <fpc_initializeunits>
project2.lpr:5                            l:= high(l);
000000010000146E 48c70587db0000ffffffff   movq   $0xffffffffffffffff,0xdb87(%rip)        # 0x10000f000
project2.lpr:6                            n:= 63;
0000000100001479 c7058ddb00003f000000     movl   $0x3f,0xdb8d(%rip)        # 0x10000f010
project2.lpr:7                            n:=l shr n;
0000000100001483 48630586db0000           movslq 0xdb86(%rip),%rax        # 0x10000f010
000000010000148A 488b156fdb0000           mov    0xdb6f(%rip),%rdx        # 0x10000f000
0000000100001491 4889c1                   mov    %rax,%rcx
0000000100001494 48d3ea                   shr    %cl,%rdx
0000000100001497 891573db0000             mov    %edx,0xdb73(%rip)        # 0x10000f010
project2.lpr:8                            l:= high(l);
000000010000149D 48c70558db0000ffffffff   movq   $0xffffffffffffffff,0xdb58(%rip)        # 0x10000f000
project2.lpr:9                            l:=l shr 63;
00000001000014A8 488b0551db0000           mov    0xdb51(%rip),%rax        # 0x10000f000
00000001000014AF 48c1e83f                 shr    $0x3f,%rax
00000001000014B3 48890546db0000           mov    %rax,0xdb46(%rip)        # 0x10000f000
project2.lpr:10                           l:= high(l);
00000001000014BA 48c7053bdb0000ffffffff   movq   $0xffffffffffffffff,0xdb3b(%rip)        # 0x10000f000
project2.lpr:11                           l:=l shr 64;
00000001000014C5 488b0534db0000           mov    0xdb34(%rip),%rax        # 0x10000f000
00000001000014CC 4889052ddb0000           mov    %rax,0xdb2d(%rip)        # 0x10000f000
project2.lpr:12                           l:=l;
00000001000014D3 488b0526db0000           mov    0xdb26(%rip),%rax        # 0x10000f000
00000001000014DA 48                       mov    %rax,0xdb1f(%rip)        # 0x10000f000
project2.lpr:13                           l := high(l);
00000001000014E1 48c70514db0000ffffffff   movq   $0xffffffffffffffff,0xdb14(%rip)        # 0x10000f000
project2.lpr:14                           n:=64;
00000001000014EC c7051adb000040000000     movl   $0x40,0xdb1a(%rip)        # 0x10000f010
project2.lpr:15                           l:=l shr n;
00000001000014F6 48630d13db0000           movslq 0xdb13(%rip),%rcx        # 0x10000f010
00000001000014FD 488b05fcda0000           mov    0xdafc(%rip),%rax        # 0x10000f000
0000000100001504 48d3e8                   shr    %cl,%rax
0000000100001507 488905f2da0000           mov    %rax,0xdaf2(%rip)        # 0x10000f000
project2.lpr:17                           end.
What do we see?
- when the right operand of shr is less than 64 (we are testing 64-bit code), there is no problem;
- when the right operand of shr is a constant equal to 64 (line 11 of project2), the compiled code loads the data into register rax and then stores it back to the source (equivalent to the Pascal code L:=L; what for?), so the variable remains unchanged!
- when the right operand of shr is a variable equal to 64 (line 15 of project2), the compiled code executes the shr operation. What is interesting is that the result on the (Intel) CPU is ... that the variable remains unchanged!

Summing up, for now you should be careful using shr operator. Do not assume that shifting right all bits of a variable by size (in bits) of that variable will set the variable to zero.

WooBean 



   
 
« Last Edit: March 06, 2018, 07:37:45 pm by WooBean »
Platforms: Win7/64, Linux Mint 22.1 Xia

tetrastes

  • Hero Member
  • *****
  • Posts: 766
Re: Be careful with shr
« Reply #37 on: March 06, 2018, 08:43:39 pm »
Summing up, for now you should be careful using shr operator. Do not assume that shifting right all bits of a variable by size (in bits) of that variable will set the variable to zero.

Be VERY careful.  Do not assume that shifting right EVEN MORE THAN all bits of a variable by size (in bits) of that variable will set the variable to zero.  :-X
« Last Edit: March 06, 2018, 08:51:37 pm by tetrastes »

Phemtik

  • New Member
  • *
  • Posts: 19
Re: Be careful with shr
« Reply #38 on: March 06, 2018, 09:20:22 pm »
@WooBean
Well, FPC doesn't let anything remain of your example after optimization.  :)


But as I said, this behaviour has nothing to do with FPC; it's normal on Intel and AMD.
It is stated in the manuals. It's also nothing special to right shift; left shift behaves the same way.

The manuals state:
For 32-bit values, shl/shr use only the low 5 bits of the shift count (maximum 31)
For 64-bit values, shl/shr use only the low 6 bits of the shift count (maximum 63)

If you shift a 64-bit value by 64, the low 6 bits of the count are all 0 and the 7th bit is 1.
But the CPU only uses the low 6 bits, so it shifts nothing because there are only 0s there.
If you shift by 65, you shift by 1; with 128 you again shift nothing, and so on.

But yes, it should be stated in the FPC documentation / wiki.
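
To make the masking rule above concrete, here is a minimal sketch that models it in Pascal. The helper names (EffectiveCount32, EffectiveCount64, ModelShr64) are made up for illustration and are not part of the RTL; the model assumes the x86 rule described above and says nothing about other CPUs or about what the compiler does with constant shift counts.

```pascal
program ShiftMaskModel;

{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper: the count an x86 CPU actually applies to a
// 32-bit operand, i.e. only the low 5 bits of the requested count.
function EffectiveCount32(Count: Byte): Byte;
begin
  Result := Count and 31;
end;

// Hypothetical helper: the same for a 64-bit operand (low 6 bits).
function EffectiveCount64(Count: Byte): Byte;
begin
  Result := Count and 63;
end;

// Models the value a hardware 64-bit shr leaves behind. The count is
// masked explicitly, so the shift itself is always in the range 0..63.
function ModelShr64(Value: QWord; Count: Byte): QWord;
begin
  Result := Value shr EffectiveCount64(Count);
end;

begin
  WriteLn(EffectiveCount64(64));   // 64 = 1000000b, low 6 bits are 0
  WriteLn(EffectiveCount64(65));   // low 6 bits are 000001b = 1
  WriteLn(EffectiveCount64(128));  // low 6 bits are 0 again
  WriteLn(EffectiveCount32(32));   // low 5 bits are 0
  // With a masked count of 0 the value comes back unchanged.
  WriteLn(IntToHex(ModelShr64(High(QWord), 64), 16));
end.
```

Because the mask is applied in Pascal before the shift, this model should give the same answers on any target, which makes it handy for checking what a given count "really" means on x86.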
« Last Edit: March 06, 2018, 10:28:47 pm by Phemtik »

jamie

  • Hero Member
  • *****
  • Posts: 7768
Re: Be careful with shr
« Reply #39 on: March 07, 2018, 03:11:12 am »
This problem has been around since the 16-bit days with Intel-type processors, nothing new for me :)

 I did write a LargeSHift function once so that large chunks could be handled.
 
  It's not a show stopper; all one needs to do is have a record to split a 64-bit or 32-bit type,
depending on the target.

 Have some overloads to handle it; the smaller values can be done using an inline overloaded
function that simply uses the current form and the full bus-size type, and then goes to the
function.

 I guess one could even do an operator overload on SHR and SHL.
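
A rough sketch of the overload idea, for anyone who wants "count >= width gives zero" semantics regardless of CPU or compiler. ShrOrZero is a hypothetical name, not the LargeSHift function mentioned above and not anything from the RTL. As far as I know FPC does not let you redefine shr for the built-in integer types, so plain overloaded functions are used here instead of an operator overload.

```pascal
program GuardedShift;

{$mode objfpc}{$H+}

// Hypothetical helpers: logical right shift that returns 0 once the
// count reaches the operand width, instead of relying on the CPU.
function ShrOrZero(Value: Byte; Count: Byte): Byte; overload; inline;
begin
  if Count >= 8 then Result := 0 else Result := Value shr Count;
end;

function ShrOrZero(Value: Word; Count: Byte): Word; overload; inline;
begin
  if Count >= 16 then Result := 0 else Result := Value shr Count;
end;

function ShrOrZero(Value: DWord; Count: Byte): DWord; overload; inline;
begin
  if Count >= 32 then Result := 0 else Result := Value shr Count;
end;

function ShrOrZero(Value: QWord; Count: Byte): QWord; overload; inline;
begin
  if Count >= 64 then Result := 0 else Result := Value shr Count;
end;

var
  l32: DWord;
  l64: QWord;
begin
  l32 := High(l32);
  l64 := High(l64);
  // Typed variables select the overload, so the width check matches
  // the operand being shifted.
  WriteLn(ShrOrZero(l32, 32));  // guard triggers: 0
  WriteLn(ShrOrZero(l64, 64));  // guard triggers: 0
  WriteLn(ShrOrZero(l64, 63));  // ordinary shift: 1
end.
```

The extra compare costs an instruction or two per shift, which is exactly the overhead marcov mentioned the compiler avoids adding by default. A matching set for shl would look the same with the operator swapped.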

Have a good day.
The only true wisdom is knowing you know nothing

 
