
Author Topic: A very simple and basic question about avx2 in fpc  (Read 605 times)

beria2

  • New Member
  • *
  • Posts: 16
A very simple and basic question about avx2 in fpc
« on: April 10, 2025, 01:27:08 pm »
A big request for help... I've been sitting here for twenty-four hours and I can't solve the problem: how do I correctly load data into AVX2 registers in the internal FPC assembler? In the program below, the registers are initialized with garbage, and the program outputs the value -1.2556805098698515E+249, which is wrong.
 :'( :'( :'( :'( :'( :'( :'( :'( :'( :'( :'( :'(

  function AsmTest: double; assembler;
  asm
           VMOVAPD YMM0, ymmword ptr [rip + @Data]
           VEXTRACTF128 XMM0, YMM0, 1
           JMP     @Exit
           ALIGN   32
           @Data:
           DB      $40, $08, $FA, $29, $E1, $70, $A4, $F1
           DB      $41, $09, $FB, $2A, $E2, $71, $A5, $F2
           DB      $42, $0A, $FC, $2B, $E3, $72, $A6, $F3
           DB      $43, $0B, $FD, $2C, $E4, $73, $A7, $F4
           @Exit:
  end;
     

ALLIGATOR

  • Full Member
  • ***
  • Posts: 160
Re: A very simple and basic question about avx2 in fpc
« Reply #1 on: April 10, 2025, 03:07:29 pm »
Hmmm, at first glance, it looks like the value loads normally

beria2

  • New Member
  • *
  • Posts: 16
Re: A very simple and basic question about avx2 in fpc
« Reply #2 on: April 10, 2025, 03:30:56 pm »
Quote from: ALLIGATOR on April 10, 2025, 03:07:29 pm
Hmmm, at first glance, it looks like the value loads normally

Where do you get such a nice debugger in Lazarus? This is how I look at the contents of the processor registers, and according to it there is garbage in them. The test result, which should be the second number from the array, is also garbage rather than the value whose binary representation is typed in there. It should be 3.26711296000000E+05. In my opinion the code is correct, but the result is strange...

LV

  • Sr. Member
  • ****
  • Posts: 272
Re: A very simple and basic question about avx2 in fpc
« Reply #3 on: April 10, 2025, 03:52:31 pm »
Maybe use the data as a double? The result should be 10Pi ~ 31.4159265359.

Code: Pascal
program Project1;

{$asmmode intel}
uses
  SysUtils, Math;

var
  RawPtr: Pointer;
  AlignedPtr: Pointer;
  ResultValue: Double;

const
  ALIGNMENT = 32;

type
  TAVXData = array[0..3] of Double;

procedure AllocateAlignedMemory(var Raw, Aligned: Pointer; Size, Align: PtrUInt);
var
  p: PtrUInt;
begin
  GetMem(Raw, Size + Align);              // Reserve extra space for alignment
  p := (PtrUInt(Raw) + Align - 1) and not (Align - 1); // Align upwards
  Aligned := Pointer(p);
end;

function AsmTest(p: Pointer): Double; assembler;
asm
  mov     rax, p                      // Address of the array
  vmovapd ymm0, ymmword ptr [rax]     // Load 4 double values into ymm0

  // Add lower and upper halves of ymm0
  vextractf128 xmm1, ymm0, 1          // Extract upper 2 double values into xmm1
  vaddpd  xmm0, xmm0, xmm1            // xmm0 = (a0+a2, a1+a3)

  // Horizontally add remaining values
  vhaddpd xmm0, xmm0, xmm0            // xmm0 = (a0+a2+a1+a3, ...)

  // The sum is already in the low double of xmm0,
  // which is where a Double result is returned
end;

var
  Data: TAVXData = (pi, 2 * pi, 3 * pi, 4 * pi);

begin
  // Allocate memory
  AllocateAlignedMemory(RawPtr, AlignedPtr, SizeOf(Data), ALIGNMENT);

  // Copy data to aligned memory
  Move(Data, AlignedPtr^, SizeOf(Data));

  Writeln('RawPtr     = ', PtrUInt(RawPtr));
  Writeln('AlignedPtr = ', PtrUInt(AlignedPtr));

  // Check if AlignedPtr is a multiple of 32
  if (PtrUInt(AlignedPtr) mod ALIGNMENT) <> 0 then
    raise Exception.Create('Alignment error!');

  // Call assembler function
  ResultValue := AsmTest(AlignedPtr);
  Writeln('Returned value = ', ResultValue :0:10);

  FreeMem(RawPtr);

  Readln;
end.

Code: Text
RawPtr     = 22902112
AlignedPtr = 22902112
Returned value = 31.4159265359

ALLIGATOR

  • Full Member
  • ***
  • Posts: 160
Re: A very simple and basic question about avx2 in fpc
« Reply #4 on: April 10, 2025, 03:56:44 pm »

 :D It's right here:
https://forum.lazarus.freepascal.org/index.php?topic=68962.0

beria2

  • New Member
  • *
  • Posts: 16
Re: A very simple and basic question about avx2 in fpc
« Reply #5 on: April 10, 2025, 05:12:14 pm »

I did it!!!


  function AsmTest: double; assembler;
  asm
           VMOVAPD YMM0, ymmword ptr [rip + @Data]
           VPERMILPD XMM1, XMM0, 1
           MOVAPS  XMM0, XMM1
           JMP     @Exit
           ALIGN   32
           @Data:
           DB      $00, $00, $00, $00, $00, $00, $F0, $3F  // 1.0
           DB      $00, $00, $00, $00, $00, $00, $00, $40  // 2.0
           DB      $00, $00, $00, $00, $00, $00, $08, $40  // 3.0
           DB      $00, $00, $00, $00, $00, $00, $10, $40  // 4.0
           @Exit:
  end;
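
As far as I can tell, two things differ from the first attempt. First, the bytes after DB are now written lowest byte first, which is the order x86 uses in memory; the first version listed them highest byte first, so each double was read back to front. Second, VEXTRACTF128 with an immediate of 1 copies the upper 128 bits (elements 2 and 3), so the first version returned the third value, while VPERMILPD with an immediate of 1 swaps the two doubles of the low 128 bits and so returns the second one. A minimal sketch for checking the byte order of any double (the program and variable names are only illustrative):

```pascal
program ShowDoubleBytes;

uses
  SysUtils;

var
  d: Double;
  // Overlay a byte array on the same memory as d
  b: array[0..7] of Byte absolute d;
  i: Integer;
begin
  d := 1.0;
  // Print the bytes in the order they sit in memory
  for i := 0 to 7 do
    Write('$', IntToHex(b[i], 2), ' ');
  Writeln;
end.
```

On an x86 machine this should print the same byte sequence as the DB line for 1.0 above, which makes it a quick way to generate DB lines for other values.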
       

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 12206
  • FPC developer.
Re: A very simple and basic question about avx2 in fpc
« Reply #6 on: April 20, 2025, 01:13:18 pm »
You don't need to stuff it in a procedure, just use a const.

Code: Pascal
const
      splitsh6 :  array[0..31] of byte = (  $00,$04,$08,$0C,$01,$05,$09,$0d,
                                            $02,$06,$0A,$0E,$03,$07,$0B,$0F,
                                            $00,$04,$08,$0C,$01,$05,$09,$0d,
                                            $02,$06,$0A,$0E,$03,$07,$0B,$0F);


procedure xx;
asm
 ...
       // in FPC code, loads of constants should now be 32-byte aligned (constmin)
       vmovdqa  ymm9, [rip+splitsh6]
...
end;

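
Applied to the four doubles from earlier in the thread, the same idea might look like the sketch below. It assumes x86-64 and a CPU with AVX. The names Vals and SecondValue are made up for the example, and it uses an unaligned load (vmovupd) so that it does not depend on how the compiler aligns the constant.

```pascal
program ConstLoadSketch;

{$mode objfpc}
{$asmmode intel}

const
  // Hypothetical name: a typed constant holding four doubles
  Vals: array[0..3] of Double = (1.0, 2.0, 3.0, 4.0);

function SecondValue: Double; assembler;
asm
  // Unaligned 256-bit load, so no assumption about the constant's alignment
  vmovupd   ymm0, [rip + Vals]
  // Swap the two doubles in the low 128 bits: element 1 becomes the low double
  vpermilpd xmm0, xmm0, 1
  // The Double result is returned in the low double of xmm0
end;

begin
  Writeln(SecondValue:0:4);
end.
```

With the values above the function should return element 1 of the array, i.e. 2.0. If the constant is known to be 32-byte aligned, vmovapd can be used in place of vmovupd.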

 
