
Author Topic: alloca  (Read 4708 times)

Red_prig

  • Full Member
  • ***
  • Posts: 153
Re: alloca
« Reply #15 on: March 02, 2023, 10:49:33 am »
__builtin_alloca is a compiler intrinsic, not a dynamic symbol.

Thaddy

  • Hero Member
  • *****
  • Posts: 16945
  • Ceterum censeo Trump esse delendam
Re: alloca
« Reply #16 on: March 02, 2023, 12:10:58 pm »
Quote
__builtin_alloca is a compiler intrinsic, not a dynamic symbol

In that case, can it be exposed as an inlined function?
Something like:
Code: C
#include <stdlib.h>

void* _alloca(size_t size) {
    return __builtin_alloca(size);
}
and put that in a library?

The Pascal declaration would be something like:
Code: Pascal
function alloca(size: size_t): pointer; cdecl; external name '_alloca';

I will test whether it works.
« Last Edit: March 02, 2023, 12:18:21 pm by Thaddy »
Due to censorship, I changed this to "Nelly the Elephant". Keeps the message clear.

Bogen85

  • Hero Member
  • *****
  • Posts: 702
Re: alloca
« Reply #17 on: March 02, 2023, 12:23:46 pm »
Quote
__builtin_alloca is a compiler intrinsic, not a dynamic symbol
In that case, can it be exposed as an inlined function?
Something like:
Code: C
#include <stdlib.h>

void* _alloca(size_t size) {
    return __builtin_alloca(size);
}
and put that in a library?

The Pascal declaration would be something like:
Code: Pascal
function alloca(size: size_t): pointer; cdecl; external name '_alloca';

I will test whether it works.

And how would that work?
Quote
DESCRIPTION
       The alloca() function allocates size bytes of space in the stack frame of the caller. This temporary space is automatically freed when the function that called alloca() returns to its caller.

And
Quote
#ifdef __GNUC__
#define alloca(size)   __builtin_alloca (size)
#endif

See man alloca

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 12202
  • FPC developer.
Re: alloca
« Reply #18 on: March 02, 2023, 12:32:07 pm »
Quote
__builtin_alloca is a compiler intrinsic, not a dynamic symbol
In that case, can it be exposed as an inlined function?
Something like:
Code: C
#include <stdlib.h>

void* _alloca(size_t size) {
    return __builtin_alloca(size);
}
and put that in a library?

No, since then __builtin_alloca would operate on the stack frame of the _alloca wrapper, and not on that of the function calling it.

The only decent way would be full alloca intrinsic support in FPC, but that is involved, since it might clash with many things (exceptions, automated types, interaction with any form of stack frame optimization, etc.). And then of course many uses (e.g. static classes) will probably lead to more extensions, and before you know it you have a third (or fourth, if you count Objective-C) object model inside FPC.

One should really consider whether that is a direction one wants to go in. Besides that, there might be some embedded use on the most constrained micros (and I don't mean luxury limousines like Arduinos), but that is quite rare nowadays, and those are usually not really HLL targets in the first place.

But maybe I am mistaken and there are many great reasons for it that ARE supportable. Enlighten me, please :-)
« Last Edit: March 02, 2023, 01:27:42 pm by marcov »
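
To illustrate the point about the wrapper in C terms, here is a minimal sketch, assuming GCC or Clang (where __builtin_alloca exists). The names STACK_ALLOC and copy_and_measure are hypothetical. Because the macro expands inside the caller, the space lives in the caller's own frame; a real function wrapper would hand back memory that is released the moment the wrapper returns.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical macro: it expands in the caller, so the builtin adjusts
   the caller's own stack frame (GCC/Clang assumed). */
#define STACK_ALLOC(n) __builtin_alloca(n)

/* Broken by design, shown only for contrast and never compiled here:
   the space belongs to bad_wrapper's frame and is gone once it returns.

   void *bad_wrapper(size_t n) { return __builtin_alloca(n); }
*/

/* Copies src into a stack buffer and returns the length of the copy. */
size_t copy_and_measure(const char *src)
{
    size_t n = strlen(src) + 1;      /* include the terminating NUL */
    char *buf = STACK_ALLOC(n);      /* released when this function returns */
    memcpy(buf, src, n);
    return strlen(buf);
}
```

This is also why the man page defines alloca as a macro over the builtin rather than as an ordinary library function.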

Bogen85

  • Hero Member
  • *****
  • Posts: 702
Re: alloca
« Reply #19 on: March 02, 2023, 12:44:24 pm »
Quote
No, since then __builtin_alloca would operate on the stack frame of the _alloca wrapper, and not on that of the function calling it.

Exactly....
(My earlier answer was just leaving it up to the reader to see that...)
« Last Edit: March 02, 2023, 12:50:46 pm by Bogen85 »

MathMan

  • Sr. Member
  • ****
  • Posts: 408
Re: alloca
« Reply #20 on: March 02, 2023, 01:47:32 pm »
@Red_prig - there may be existing solutions, depending on your requirement.

If your memory size is fixed, or variable with a known upper limit, then declaring a buffer variable in your function may be sufficient, like

Code: Pascal
function xyz( ... ):result;

var
  buffer: array [ 0..MaxSize-1 ] of <basic type>;

begin
  ... use buffer here ...
end;

Unfortunately, this doesn't work with dynamic arrays if speed is important, because setting the length generates a call to the heap manager and also clears the content, if I'm not mistaken.

Another alternative would be to hand over the heap as a parameter: still fast, and you could do fine-grained heap management in upper layers.

But it really depends on your requirements, which you haven't detailed yet. Or have I missed something in this thread?

HTH,
MathMan
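
The two alternatives above can be sketched in C as follows. This is only an illustration under assumptions: MAX_SIZE, sum_with_local_buffer and sum_with_caller_buffer are hypothetical names, and summing bytes stands in for whatever work the real function does with its scratch space.

```c
#include <stddef.h>
#include <string.h>

enum { MAX_SIZE = 256 };   /* hypothetical known upper limit */

/* Option 1: fixed-size local buffer; inputs above the limit are rejected. */
long sum_with_local_buffer(const unsigned char *data, size_t len)
{
    unsigned char buffer[MAX_SIZE];  /* lives in this function's frame */
    long sum = 0;

    if (len > MAX_SIZE)
        return -1;
    memcpy(buffer, data, len);
    for (size_t i = 0; i < len; i++)
        sum += buffer[i];
    return sum;
}

/* Option 2: the caller owns the scratch memory and passes it in, so the
   upper layers decide where it comes from (stack, heap, pool). */
long sum_with_caller_buffer(const unsigned char *data, size_t len,
                            unsigned char *scratch, size_t scratch_len)
{
    long sum = 0;

    if (len > scratch_len)
        return -1;
    memcpy(scratch, data, len);
    for (size_t i = 0; i < len; i++)
        sum += scratch[i];
    return sum;
}
```

Neither variant touches the heap manager inside the function, which is the property alloca is usually wanted for.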

Red_prig

  • Full Member
  • ***
  • Posts: 153
Re: alloca
« Reply #21 on: March 02, 2023, 02:05:34 pm »
The function described in the post was my solution, I just wanted to hear other people's opinions.

440bx

  • Hero Member
  • *****
  • Posts: 5302
Re: alloca
« Reply #22 on: March 02, 2023, 02:49:41 pm »
It should be noted that allocating memory on the stack isn't a problem.

The real problem is that, in the case of FPC, the compiler doesn't know that memory has been allocated on the stack, so SP/ESP changes in a way the compiler doesn't expect. As a result, whatever memory is allocated on the stack normally has to be de-allocated manually before returning from the function/procedure, so that the compiler's stack pops balance the pushes it did on entry.

In (some?) C/C++ compilers it's different because the compiler "knows" that stack allocations modify SP/ESP and adjusts its generated code accordingly.  Neither FPC nor Delphi v2.0 (I don't know about other versions) do that.

With FPC (and probably Delphi as well), there does not seem to be any advantage in allocating memory on the stack since that memory must be manually de-allocated before exiting the function/procedure.

On Windows, ntdll.dll provides the core functions required to allocate memory on the stack; they touch the stack pages when necessary (important when the allocation straddles a page boundary).

HTH.
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v4.0rc3) on Windows 7 SP1 64bit.
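
The page-touching remark can be made concrete with a small model. On Windows the stack is committed page by page behind a guard page, so a large allocation has to touch each new page in descending order. The sketch below only computes which page addresses would need touching; it does not access any memory. It assumes 4 KiB pages and a page-aligned stack limit, and pages_to_probe is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE ((uintptr_t)4096)   /* assumption: 4 KiB pages */

/* Lists the page base addresses between the current stack limit and the
   page that will contain new_sp, highest page first. stack_limit is
   assumed to be page aligned. Returns the number of pages to touch;
   at most max_probes addresses are stored in probes. */
size_t pages_to_probe(uintptr_t stack_limit, uintptr_t new_sp,
                      uintptr_t *probes, size_t max_probes)
{
    uintptr_t target = new_sp & ~(PAGE_SIZE - 1);  /* page holding new_sp */
    uintptr_t page = stack_limit;
    size_t count = 0;

    while (page > target) {
        page -= PAGE_SIZE;               /* step down one page at a time */
        if (count < max_probes)
            probes[count] = page;
        count++;
    }
    return count;
}
```

If new_sp is still at or above the stack limit, the function returns 0 and nothing needs to be touched.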

Thaddy

  • Hero Member
  • *****
  • Posts: 16945
  • Ceterum censeo Trump esse delendam
Re: alloca
« Reply #23 on: March 02, 2023, 03:13:31 pm »
Some more news:
I got your code working on Windows after you wrote that it works on Windows.
It still does not work on Linux64.

Note that the one I use for my stack-based classes (Windows only) is better because it works around any SEH problems.
See this link in case you have not found it yet:
http://www.atelierweb.com/64-bit-_alloca-how-to-use-from-delphi/
He also has an FPC version, and the asm file is well documented.

Still, I want it on Linux, because I want my stack based classes on Linux too  :D
« Last Edit: March 02, 2023, 03:19:34 pm by Thaddy »

Red_prig

  • Full Member
  • ***
  • Posts: 153
Re: alloca
« Reply #24 on: March 02, 2023, 03:41:08 pm »
I'm wondering what the allocation procedure is and how you use these classes. If I remember correctly, class instances are allocated inside a special procedure that can be overridden. In that particular case the code I published will not work, and another mechanism is needed.

Thaddy

  • Hero Member
  • *****
  • Posts: 16945
  • Ceterum censeo Trump esse delendam
Re: alloca
« Reply #25 on: March 02, 2023, 04:09:47 pm »
I am merely doing this:
Code: Pascal
// snippet from classonstack with your and my code mixed:
  function _alloca(size:qword):Pointer; sysv_abi_default; assembler; nostackframe;
  asm
    movqq       %rsp,%rax
    subq        %rdi,%rax
    lea     -8(%rax),%rax
    andq        $-32,%rax
    movqq     (%rsp),%rdi
    movqq       %rax,%rsp
    lea    -32(%rsp),%rsp
    jmp    %rdi
  end;

  { this does all the magic }
  class function TStackObject.NewInstance:Tobject; // declaration is override
  var
    p : pointer;
  begin
    { allocate on the stack }
    p:=_alloca(instancesize);
    if p <> nil then InitInstance(p);
    NewInstance:=TObject(p);
  end;
This "works" on Win64, but not on Linux64.
I put "works" in quotes because your code also fails on Win64 if SEH (exceptions) is used in the same function, whereas the code I use takes that into account.

It is pretty stable code the way I do it normally. This is adapted, just to show you what I am doing.


« Last Edit: March 02, 2023, 04:15:35 pm by Thaddy »

Red_prig

  • Full Member
  • ***
  • Posts: 153
Re: alloca
« Reply #26 on: March 02, 2023, 04:16:37 pm »
First, I'll try to rewrite the code from the link; let's see how it can be adapted.

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 12202
  • FPC developer.
Re: alloca
« Reply #27 on: March 02, 2023, 05:07:55 pm »
Quote
I quoted "works" because your code also fails on Win64 if in the same function SEH (exceptions) are used, whereas the code I used takes that into account.

It is pretty stable code the way I do it normally. This is adapted, just to show you what I am doing.

Doesn't that unnecessarily copy the object? Basically, you fake-allocate a local object, which is then copied to the return value?

Red_prig

  • Full Member
  • ***
  • Posts: 153
Re: alloca
« Reply #28 on: March 02, 2023, 06:43:18 pm »
So I ported it:  :P

Code: Pascal
//rcx=size
//rdx=alignm
//r8=accum - optional
function _alloca(size,alignm:QWORD;accum:PDWORD):Pointer; MS_ABI_Default; assembler nostackframe;
label
 _1,
 _2,
 {$IFDEF WINDOWS}
 _3,
 _exit,
 {$ENDIF}
 _4;
asm
  mov (%rsp),%r9  // return address
  mov %ecx  ,%ecx // zero-extend
  mov %edx  ,%edx // zero-extend

  cmp $16, %rdx
  jge _1
  mov $16, %rdx   // Minimum alignment to consider in Win 64 is 16 bytes
_1:
  cmp $4096, %rdx
  jle _2
  mov $4096, %rdx
_2:
  lea (%rcx), %rax  //rax:=size

  lea 8(%rsp), %r10 //ptr=rsp+8
  sub %rax, %r10    //ptr:=ptr-size
  neg %rdx          //alignm:=-alignm
  and %rdx, %r10    //ptr:=AlignDown(ptr,alignm)

  xor %r11, %r11    //r11:=0
  lea 8(%rsp), %rax //ptr2:=rsp+8
  sub %r10, %rax    //psize:=ptr2-ptr

{$IFDEF WINDOWS}
  cmovb %r11,%r10
  mov %gs:(0x10),%r11 //StackLimit
  cmp %r11,%r10
  jae _exit

  and $0xF000,%r10w
_3:
  lea -0x1000(%r11),%r11
  movb   $0,(%r11)
  cmp  %r11,%r10
  jne _3

_exit:
{$ENDIF}
  sub %rax, %rsp   //rsp:=rsp-psize

  cmp $0, %r8      //if accum<>nil then
  jz _4
  addl %eax,(%r8)  //accum^:=accum^+psize
_4:

  mov  %r9,(%rsp)  //set return address
  mov %rsp,%rax    //Result:=rsp
  add   $8,%rax    //Result:=Result+8
end;
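
For readers who find the pointer arithmetic hard to follow in assembly, here is a plain C model of it. It is only a sketch: it mirrors the size/alignment computation and the final rsp and result values, and deliberately leaves out the Windows stack probing, the cmovb underflow guard and the optional accum counter. The names alloca_model and model_alloca are hypothetical, and the alignment is assumed to be a power of two, as the neg/and masking requires.

```c
#include <stdint.h>

typedef struct {
    uint64_t result;   /* pointer handed back to the caller */
    uint64_t new_rsp;  /* rsp on exit; the return address is stored here */
    uint64_t psize;    /* number of bytes by which rsp was lowered */
} alloca_model;

/* rsp is the stack pointer on entry, i.e. it points at the return address. */
alloca_model model_alloca(uint64_t rsp, uint32_t size, uint32_t alignm)
{
    alloca_model m;
    uint64_t align = alignm;
    uint64_t top, ptr;

    if (align < 16)   align = 16;       /* minimum alignment */
    if (align > 4096) align = 4096;     /* maximum alignment */

    top = rsp + 8;                      /* first byte above the return address */
    ptr = (top - size) & (0 - align);   /* align down, same as neg + and */

    m.psize   = top - ptr;
    m.new_rsp = rsp - m.psize;
    m.result  = m.new_rsp + 8;          /* works out to ptr */
    return m;
}
```

For example, with rsp = 0x10000, size = 100 and alignm = 32, the model gives result = 0xFFA0, psize = 104 and new_rsp = 0xFF98, so the returned pointer is 32-byte aligned and sits directly above the relocated return address.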

Thaddy

  • Hero Member
  • *****
  • Posts: 16945
  • Ceterum censeo Trump esse delendam
Re: alloca
« Reply #29 on: March 02, 2023, 08:05:41 pm »
I will test it (on linux64).... :D 8-)