Author Topic: Is This A Leak?  (Read 9706 times)

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: Is This A Leak?
« Reply #15 on: October 17, 2019, 09:27:15 am »
He's saying dynamic arrays (and strings!) contain an implicit ref. Which is a good thing - but your experiment showed "passing by value" behavior.

What Marco says is applicable mostly if your code does very low-level operations, say, if you write pure assembler or if you're used to the "C" way of doing things. Very, very basically (it's probably more complex) what the compiler does in the first case ("NoVar") is make a copy of the array or string and pass to the procedure a reference (pointer) to that copy, to avoid pushing the whole array/string into the stack (and other reasons, e.g. to be able to modify it inside the procedure without doing "gymnastics" with it).

The point here is that with var you're treating with a reference to the real variable but without it it's a reference to a "local" copy.
Not quite correct. Both dynamic arrays and strings are in fact pointers to a somewhat more involved structure that contains a reference count. So assigning one to a new variable or passing it as a parameter normally increases the reference count (and later decreases it again). For normal (by-value) parameters the reference count is increased when the reference is passed; for var, const and constref parameters it is not.

Take a look at this example:

Code: Pascal
program tarrtest;

type
  { header the RTL keeps directly in front of the array data }
  tdynarray = { packed } record
     refcount : ptrint;
     high : tdynarrayindex;
  end;
  pdynarray = ^tdynarray;

  TLongIntArray = array of LongInt;

{ returns the reference count stored in the header, or 0 for a nil array }
function ArrayRefCount(aArr: Pointer): PtrInt; inline;
begin
  if Assigned(aArr) then
    ArrayRefCount := pdynarray(aArr - sizeof(tdynarray))^.refcount
  else
    ArrayRefCount := 0;
end;

{ by-value parameter }
procedure Test(aArr: TLongIntArray);
begin
  Writeln('Test: ', HexStr(Pointer(aArr)));
  Writeln(ArrayRefCount(Pointer(aArr)));
  SetLength(aArr, 2);
  Writeln('Test: ', HexStr(Pointer(aArr)));
  Writeln(ArrayRefCount(Pointer(aArr)));
end;

{ var parameter }
procedure Test2(var aArr: TLongIntArray);
begin
  Writeln('Test2: ', HexStr(Pointer(aArr)));
  Writeln(ArrayRefCount(Pointer(aArr)));
  SetLength(aArr, 2);
  { this label reads 'Test: ' (not 'Test2: '), which is why the output shows "Test:" here too }
  Writeln('Test: ', HexStr(Pointer(aArr)));
  Writeln(ArrayRefCount(Pointer(aArr)));
end;

procedure DoTest;
var
  a, b: TLongIntArray;
begin
  SetLength(a, 3);
  Writeln(ArrayRefCount(Pointer(a)));
  Test(a);
  Writeln(HexStr(Pointer(a)));
  a := Nil;
  SetLength(a, 3);
  Writeln(ArrayRefCount(Pointer(a)));
  Test2(a);
  Writeln(HexStr(Pointer(a)));
  b := a; { a and b now share one array }
  Writeln(ArrayRefCount(Pointer(a)));
  Test2(a);
  Writeln(HexStr(Pointer(a)));
  Writeln(HexStr(Pointer(b)));
end;

begin
  DoTest;
end.

It will print the following output:
Code:
1
Test: 002A6940
2
Test: 002A6960
1
002A6940
1
Test2: 002A6940
1
Test: 002A6940
1
002A6940
2
Test2: 002A6940
2
Test: 002A6960
1
002A6960
002A6940

As you can see, in the call to Test the reference count is increased, and calling SetLength on the parameter leads to the creation of a new array. In the first call to Test2 the reference count stays at 1 and SetLength directly affects the variable a in DoTest as well. In the second call to Test2 a new array is created, just as in Test, because the reference count is not 1. The variable a in DoTest is still changed, but b happily keeps a reference to the original array.
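
The same trick can be used to check the statement that const and constref parameters do not touch the reference count. This is only a sketch: the header record (here called TDynArrayHeader, a made-up name) assumes the same RTL-internal layout as the example above, and the procedure names are hypothetical. If the description above holds, only the by-value call should report a count one higher than the caller sees.

```pascal
program constrefcount;
{$mode objfpc}

type
  TLongIntArray = array of LongInt;

  { Assumed layout, taken from the example above. It is RTL-internal and may change. }
  TDynArrayHeader = record
    RefCount: PtrInt;
    High: SizeInt;
  end;
  PDynArrayHeader = ^TDynArrayHeader;

{ Reads the reference count stored in front of the array data, 0 for nil. }
function ArrayRefCount(aArr: Pointer): PtrInt;
begin
  if Assigned(aArr) then
    Result := PDynArrayHeader(PByte(aArr) - SizeOf(TDynArrayHeader))^.RefCount
  else
    Result := 0;
end;

procedure ByValue(aArr: TLongIntArray);
begin
  Writeln('value:    ', ArrayRefCount(Pointer(aArr)));
end;

procedure ByConst(const aArr: TLongIntArray);
begin
  Writeln('const:    ', ArrayRefCount(Pointer(aArr)));
end;

procedure ByConstRef(constref aArr: TLongIntArray);
begin
  Writeln('constref: ', ArrayRefCount(Pointer(aArr)));
end;

var
  a: TLongIntArray;
begin
  SetLength(a, 3);
  Writeln('caller:   ', ArrayRefCount(Pointer(a)));
  ByValue(a);
  ByConst(a);
  ByConstRef(a);
end.
```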

del

  • Sr. Member
  • ****
  • Posts: 258
Re: Is This A Leak?
« Reply #16 on: October 17, 2019, 06:34:48 pm »
"In Free Pascal all non-trivial objects are passed by reference". True. But basically it's just a matter of bandwidth. It's cheaper to send a reference, so references are what get sent. And when you stick "var" in front of it, the called function has read / write access. If you leave the "var" out, then it's just read access. (Let's forget about "const" for the time being)

Now - at runtime - if the called function tries to modify a "no var" (read only) reference then the called function creates a local copy and uses the local copy for its local purposes. And this local copy disappears and its resources get freed when the called function exits.

So the take away for me is that objects are routinely passed as references. No need to declare (class& object) in the parameters. "Var" in the parameters means "non const". Not sure what "const" means. It's kinda covered by the absence of "var". So it must have some other meaning to the compiler.
« Last Edit: October 17, 2019, 06:36:40 pm by del »

Peter H

  • Sr. Member
  • ****
  • Posts: 272
Re: Is This A Leak?
« Reply #17 on: October 17, 2019, 07:16:24 pm »
Recently I debugged a legacy program that suffered from stack overflow.
The program dealt with fixed arrays, and these arrays were passed to procedures by value, without the "const" attribute.
So I know the following from debugging, not from documentation:  8-)

"const" in a parameter list means the parameter is not modified by the procedure.
To some degree the compiler prevents writing.

If, for example, a fixed-size array or string is in the parameter list, it is normally passed by value, but it is passed by reference when it is const.
For contrast: in C fixed size arrays are always passed by reference, no matter if const or not.

"const" means the compiler can decide if a parameter is better passed by value or by reference.
A constref parameter is always passed by reference, even if it fits into a register.
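
A rough way to see these parameter kinds side by side is to compare the address of the parameter with the address of the caller's variable. This is a sketch, not a statement about what any particular compiler version does: TArr and the procedure names are made up, and what gets printed for the by-value and const cases depends on target, calling convention and optimization settings. Only var and constref are defined to refer to the caller's storage.

```pascal
program parammodes;
{$mode objfpc}

type
  TArr = array[0..10] of LongInt;

var
  arr: TArr;

procedure ByValue(a: TArr);
begin
  { by value: the procedure works on a copy, normally at a different address }
  Writeln('value    shares storage with caller: ', @a[0] = @arr[0]);
end;

procedure ByConst(const a: TArr);
begin
  { const: the compiler chooses between by value and by reference }
  Writeln('const    shares storage with caller: ', @a[0] = @arr[0]);
end;

procedure ByConstRef(constref a: TArr);
begin
  { constref: always by reference }
  Writeln('constref shares storage with caller: ', @a[0] = @arr[0]);
end;

procedure ByVar(var a: TArr);
begin
  { var: always by reference, and writable }
  Writeln('var      shares storage with caller: ', @a[0] = @arr[0]);
end;

begin
  ByValue(arr);
  ByConst(arr);
  ByConstRef(arr);
  ByVar(arr);
end.
```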

« Last Edit: October 17, 2019, 07:59:45 pm by Peter H »

440bx

  • Hero Member
  • *****
  • Posts: 3945
Re: Is This A Leak?
« Reply #18 on: October 17, 2019, 08:15:47 pm »
If -for example- a fixed size array or string is in the parameter list, it is passed by value normally, but passed by reference, when it is const.
For contrast: in C fixed size arrays are always passed by reference, no matter if const or not.
An array or just about any variable that does not fit in a register will always be passed by reference.  The compiler has no choice because the value is simply too large to fit in a register (in some rare cases, some values may be passed or returned in two registers but, again, those are rare cases.)

For an array, or any value larger than what fits in a register, what "const" does is tell the compiler whether or not a copy of the variable/structure is necessary.  If "const" is not specified when passing an array, the compiler will make a copy of the array and the code will act on the copy; this is how the semantics of "by value" are implemented for data that does not fit in a register.  If the array is declared "const", the compiler will prevent writes to the array elements (to the extent it can), and because of this no copy is made, since it is not necessary.

As you pointed out "to some degree the compiler prevents writing".  The "degree" is determined by the compiler's ability to determine whether or not an array element is being written to.  For instance, through pointer aliasing, it is always possible to write to any array element passed as "const" since the compiler cannot always determine at compile time the target of a pointer at run time.
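
The aliasing case can be sketched in a few lines. The names (TArr, Sneaky) are invented for illustration, and whether the caller's array really changes depends on the const parameter having been passed by reference, which is the compiler's decision.

```pascal
program constalias;
{$mode objfpc}

type
  TArr = array[0..10] of LongInt;

procedure Sneaky(const a: TArr);
var
  p: PLongInt;
begin
  { a[0] := 99;   a direct write like this is rejected by the compiler }
  p := @a[0];     { taking the address sidesteps the compile-time check }
  p^ := 99;       { writes to whatever storage the parameter refers to }
end;

var
  arr: TArr;
begin
  arr[0] := 1;
  Sneaky(arr);
  { shows 99 if the const parameter was passed by reference }
  Writeln(arr[0]);
end.
```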

HTH.
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

Peter H

  • Sr. Member
  • ****
  • Posts: 272
Re: Is This A Leak?
« Reply #19 on: October 17, 2019, 09:32:19 pm »
An array or just about any variable that does not fit in a register will always be passed by reference.  The compiler has no choice because the value is simply too large to fit in a register

This is not absolutely precise.
"Reference variable" means that a hidden pointer is passed on the stack. In Pascal, the only pointer passed is the stack pointer.

The hidden pointer exists in C, but not in Pascal.
As far as I understand it, the data is copied to the stack and the receiving procedure treats the argument the same way as an ordinary local variable.
There is no hidden pointer for an array that is passed by value in Pascal.
(The data must be copied to the stack frame, otherwise the procedure would not be reentrant.)

Consider this program:

Code: Pascal
program Project1;

type
  Tarr = array[0..10] of integer;

var
  arr: Tarr;
  i: integer = 0;

  procedure arrhog(a: Tarr);
  var
    b: Tarr;
  begin
    { addresses of the parameter and of a local; PtrUInt is wide enough for a pointer on any target }
    writeln(PtrUInt(@a[0]), PtrUInt(@b[0]): 20);
    Inc(i);
    if i < 5 then
      arrhog(a);
  end;

begin
  writeln(PtrUInt(@arr[0]));
  arrhog(arr);
  readln;
end.


It produces this output:

61440
20970968            20971016
20970792            20970840
20970616            20970664
20970440            20970488
20970264            20970312


« Last Edit: October 17, 2019, 09:52:32 pm by Peter H »

440bx

  • Hero Member
  • *****
  • Posts: 3945
Re: Is This A Leak?
« Reply #20 on: October 17, 2019, 10:57:28 pm »
"Reference variable" means, there is a hidden pointer passed on the stack. The only passed pointer is the stackpointer in pascal.
I'm going to interpret "Reference variable" as a "var" parameter.  In that case, a pointer to the variable is what is passed as the parameter to the function/procedure.  I'm not sure I'd consider it "hidden" since "var" implies "pointer to", though I admit, it is not obvious to some programmers.

The hidden pointer is in C, but not in Pascal.
I don't get what you are saying there.  If there is one thing C does not do, it is hide things, least of all pointers.  What do you mean by "the hidden pointer is in C"? Can you give an example?

So far I understand it, the data is copied to the stack and the receiving procedure treats the arguments in the same way as an ordinary local variable.
For a large data structure (one that does not fit in registers), a copy is made on the stack and a reference/pointer to that copy is used by the callee to access it.  It is worth pointing out that in Free Pascal, depending on the program's bitness, the copy may be created by the caller or by the callee, IOW, who makes the copy depends on the program's bitness but, regardless of who made the copy, the structure will always be accessed through a pointer (though this fact may not be obvious to some programmers.)

There is no hidden pointer for an array, that is passed by value in pascal.
The array will _always_ be accessed using a pointer.  The array identifier is a pointer to the start of the array and an array will _always_ be passed by reference.  What changes, when the caller made the copy (if passed by value) is the target of the reference.

(The data must be copied to the stackframe, otherwise the procedure would not be reentrant.)
The data is copied if the parameter is passed by value and "const" isn't specified.  Whether a function/procedure is re-entrant is a different matter: no language, Free Pascal included, guarantees that a function or procedure is re-entrant; it's the programmer's responsibility to make it that way.

I probably missed the point of your example code.  I don't see what it's supposed to be demonstrating.

del

  • Sr. Member
  • ****
  • Posts: 258
Re: Is This A Leak?
« Reply #21 on: October 18, 2019, 01:13:50 am »
The hidden pointer is in C, but not in Pascal.
I don't get what you are saying there.  If there is one thing C does not do is hide much of anything, much less pointers.  What do you mean "the hidden pointer is in C" ? can you give an example ?

Don't want to put words in his mouth but in C++ (a language different than C) a "reference" (&) is a const pointer in the sense that it always points to the same address in memory, as it is "wedded" to the object it references (points to). Its "pointerness" is hidden by friendly "passed by value" member dereferencing syntax: ".", instead of "->".

440bx

  • Hero Member
  • *****
  • Posts: 3945
Re: Is This A Leak?
« Reply #22 on: October 18, 2019, 02:46:14 am »
Don't want to put words in his mouth but in C++ (a language different than C) a "reference" (&) is a const pointer .... etc
You are correct.  In C++, parameters can be passed by reference just as they are in Pascal.  Not so in C.   If anything the pointer is "hidden" in Pascal (and C++ when passing by reference) but never in C.

Just in case, your point is well taken.  Maybe Peter meant C++ instead of C.


del

  • Sr. Member
  • ****
  • Posts: 258
Re: Is This A Leak?
« Reply #23 on: October 18, 2019, 04:40:13 am »
Well in C you pass by reference by explicitly passing the address:

Code: C
double x, xx, n, mean, stDev;

/* ... stuff ... */

GetStats(x, xx, n, &mean, &stDev);

You're passing the pointers (addresses) by value and thereby passing what they're pointing to "by reference". C++ just took this basic idea and slicked it up a bit - mostly to achieve neater syntax, which tends to de-emphasize, or "hide" the reality that you're still passing addresses around. But that's another story for another time.
 :D

Thaddy

  • Hero Member
  • *****
  • Posts: 14204
  • Probably until I exterminate Putin.
Re: Is This A Leak?
« Reply #24 on: October 18, 2019, 06:57:25 am »
You can also do that in Pascal if you like unreadable code with ambiguous looks
Code: Pascal
// bare pointers
GetStats(x, xx, n, @mean, @stDev);
// or
GetStats(x, xx, n, Addr(mean), Addr(stDev));
// Or with pointer cast
GetStats(x, xx, n, PDouble(@mean), PDouble(@stDev));
// or fully de-referenced
GetStats(x, xx, n, PDouble(@mean)^, PDouble(@stDev)^);
« Last Edit: October 18, 2019, 06:59:21 am by Thaddy »
Specialize a type, not a var.

del

  • Sr. Member
  • ****
  • Posts: 258
Re: Is This A Leak?
« Reply #25 on: October 18, 2019, 08:50:38 am »
You can also do that in Pascal if you like unreadable code with ambiguous looks
Code: Pascal
// bare pointers
GetStats(x, xx, n, @mean, @stDev);
// or
GetStats(x, xx, n, Addr(mean), Addr(stDev));
// Or with pointer cast
GetStats(x, xx, n, PDouble(@mean), PDouble(@stDev));
// or fully de-referenced
GetStats(x, xx, n, PDouble(@mean)^, PDouble(@stDev)^);

Wow! Thaddy! Such an embarrassment of riches! I'm going with Option One.  :D

Thaddy

  • Hero Member
  • *****
  • Posts: 14204
  • Probably until I exterminate Putin.
Re: Is This A Leak?
« Reply #26 on: October 18, 2019, 09:07:35 am »
Just remember that this is valid Pascal too:
Code: Pascal
procedure GetStats(x, xx, n, &mean, &stDev:double); // looks eerily similar

 :D O:-) 8-)

(Of course it means something completely different....)
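
For readers wondering what it does mean: in Free Pascal a leading & in front of a name is an escape that lets a reserved word be used as an identifier. It has nothing to do with taking an address. A minimal sketch, with a made-up procedure name:

```pascal
program ampdemo;
{$mode objfpc}

{ &begin and &end are ordinary parameter names here; the & only escapes the keywords }
procedure ShowRange(&begin, &end: Double);
begin
  Writeln(&begin:0:1, ' .. ', &end:0:1);
end;

begin
  ShowRange(1.5, 2.5);
end.
```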
« Last Edit: October 18, 2019, 09:10:05 am by Thaddy »

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: Is This A Leak?
« Reply #27 on: October 18, 2019, 09:42:48 am »
"In Free Pascal all non-trivial objects are passed by reference". True. But basically it's just a matter of bandwidth. It's cheaper to send a reference, so references are what get sent. And when you stick "var" in front of it, the called function has read / write access. If you leave the "var" out, then it's just read access. (Let's forget about "const" for the time being)

Now - at runtime - if the called function tries to modify a "no var" (read only) reference then the called function creates a local copy and uses the local copy for its local purposes. And this local copy disappears and its resources get freed when the called function exits.

So the take away for me is that objects are routinely passed as references. No need to declare (class& object) in the parameters. "Var" in the parameters means "non const". Not sure what "const" means. It's kinda covered by the absence of "var". So it must have some other meaning to the compiler.
Not quite correct. A normal parameter is read/write just like a var-parameter is, but the caller (not the callee) creates a copy of the value.
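
A minimal sketch of that read/write behaviour, using a small record (TPair and the procedure names are invented for the illustration): the by-value procedure may assign to its parameter, but the caller's variable is not affected.

```pascal
program valparam;
{$mode objfpc}

type
  TPair = record
    x, y: LongInt;
  end;

procedure ByValue(p: TPair);
begin
  p.x := 100;   { allowed: a value parameter is writable, but it is a copy }
end;

procedure ByVar(var p: TPair);
begin
  p.x := 100;   { writes through to the caller's variable }
end;

var
  pair: TPair;
begin
  pair.x := 1;
  pair.y := 2;
  ByValue(pair);
  Writeln(pair.x);  { 1: the caller's record is untouched }
  ByVar(pair);
  Writeln(pair.x);  { 100 }
end.
```
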

So far I understand it, the data is copied to the stack and the receiving procedure treats the arguments in the same way as an ordinary local variable.
For a large data structure (one that does not fit in registers), a copy is made on the stack and a reference/pointer to that copy is used by the callee to access it.  It is worth pointing out that in Free Pascal, depending on the program's bitness, the copy may be created by the caller or by the callee, IOW, who makes the copy depends on the program's bitness but, regardless of who made the copy, the structure will always be accessed through a pointer (though this fact may not be obvious to some programmers.)
As so often that depends on the used calling convention and ABI.

Take the following code:
Code: Pascal
type
  TTest = record
    a, b, c: LongInt;
  end;

function TestReg(a: TTest): LongInt; register;
begin
  TestReg := a.a;
end;

function TestStdCall(a: TTest): LongInt; stdcall;
begin
  TestStdCall := a.a;
end;

function TestCDecl(a: TTest): LongInt; cdecl;
begin
  TestCDecl := a.a;
end;

var
  a: TTest;
begin
  a.a := 42;
  a.b := 21;
  a.c := 8;
  TestReg(a);
  TestStdCall(a);
  TestCDecl(a);
end.

This will result in the following assembly code on i386-win32:

Code:
.section .text.n_p$tarrtest_$$_testreg$ttest$$longint,"x"
.balign 16,0x90
.globl P$TARRTEST_$$_TESTREG$TTEST$$LONGINT
P$TARRTEST_$$_TESTREG$TTEST$$LONGINT:
# Temps allocated between ebp-20 and ebp-8
# [tarrtest.pp]
# [66] begin
pushl %ebp
movl %esp,%ebp
leal -20(%esp),%esp
# Var a located at ebp-4, size=OS_32
# Var $result located at ebp-8, size=OS_S32
movl %eax,-4(%ebp)
movl -4(%ebp),%edx
movl (%edx),%eax
movl %eax,-20(%ebp)
movl 4(%edx),%eax
movl %eax,-16(%ebp)
movl 8(%edx),%eax
movl %eax,-12(%ebp)
# [67] TestReg := a.a;
movl -20(%ebp),%eax
movl %eax,-8(%ebp)
# [68] end;
movl -8(%ebp),%eax
movl %ebp,%esp
popl %ebp
ret

.section .text.n_p$tarrtest_$$_teststdcall$ttest$$longint,"x"
.balign 16,0x90
.globl P$TARRTEST_$$_TESTSTDCALL$TTEST$$LONGINT
P$TARRTEST_$$_TESTSTDCALL$TTEST$$LONGINT:
# [71] begin
pushl %ebp
movl %esp,%ebp
leal -4(%esp),%esp
# Var a located at ebp+8, size=OS_NO
# Var $result located at ebp-4, size=OS_S32
# [72] TestStdCall := a.a;
movl 8(%ebp),%eax
movl %eax,-4(%ebp)
# [73] end;
movl -4(%ebp),%eax
movl %ebp,%esp
popl %ebp
ret $12

.section .text.n_p$tarrtest_$$_testcdecl$ttest$$longint,"x"
.balign 16,0x90
.globl P$TARRTEST_$$_TESTCDECL$TTEST$$LONGINT
P$TARRTEST_$$_TESTCDECL$TTEST$$LONGINT:
# [76] begin
pushl %ebp
movl %esp,%ebp
leal -4(%esp),%esp
# Var a located at ebp+8, size=OS_NO
# Var $result located at ebp-4, size=OS_S32
# [77] TestCDecl := a.a;
movl 8(%ebp),%eax
movl %eax,-4(%ebp)
# [78] end;
movl -4(%ebp),%eax
movl %ebp,%esp
popl %ebp
ret

.section .text.n__main,"x"
.balign 16,0x90
.globl _main
_main:
.globl PASCALMAIN
PASCALMAIN:
# [82] begin
pushl %ebp
movl %esp,%ebp
call fpc_initializeunits
# [83] a.a := 42;
movl $42,U_$P$TARRTEST_$$_A
# [84] a.b := 21;
movl $21,U_$P$TARRTEST_$$_A+4
# [85] a.c := 8;
movl $8,U_$P$TARRTEST_$$_A+8
# [86] TestReg(a);
movl $U_$P$TARRTEST_$$_A,%eax
call P$TARRTEST_$$_TESTREG$TTEST$$LONGINT
# [87] TestStdCall(a);
leal -12(%esp),%esp
movl U_$P$TARRTEST_$$_A,%eax
movl %eax,(%esp)
movl U_$P$TARRTEST_$$_A+4,%eax
movl %eax,4(%esp)
movl U_$P$TARRTEST_$$_A+8,%eax
movl %eax,8(%esp)
call P$TARRTEST_$$_TESTSTDCALL$TTEST$$LONGINT
# [88] TestCDecl(a);
leal -12(%esp),%esp
movl U_$P$TARRTEST_$$_A,%eax
movl %eax,(%esp)
movl U_$P$TARRTEST_$$_A+4,%eax
movl %eax,4(%esp)
movl U_$P$TARRTEST_$$_A+8,%eax
movl %eax,8(%esp)
call P$TARRTEST_$$_TESTCDECL$TTEST$$LONGINT
addl $12,%esp
# [90] end.
call fpc_do_exit
movl %ebp,%esp
popl %ebp
ret

As you can see, for both the stdcall and cdecl calls the parameter is passed as a copy on the stack and the called function accesses it using the frame pointer register. For the register one it's passed as a reference in a register (eax), and the called function then copies the record into its own stack frame.
This behaviour depends not only on the calling convention, but also on the passed type: for example, records that contain managed types are always passed as a reference. Sometimes it also depends on the size whether a record is passed as a reference or as a direct copy (e.g. a copy of a record of size 4 could be passed directly inside a register).

del

  • Sr. Member
  • ****
  • Posts: 258
Re: Is This A Leak?
« Reply #28 on: October 18, 2019, 11:34:29 am »
"In Free Pascal all non-trivial objects are passed by reference". True. But basically it's just a matter of bandwidth. It's cheaper to send a reference, so references are what get sent. And when you stick "var" in front of it, the called function has read / write access. If you leave the "var" out, then it's just read access. (Let's forget about "const" for the time being)

Now - at runtime - if the called function tries to modify a "no var" (read only) reference then the called function creates a local copy and uses the local copy for its local purposes. And this local copy disappears and its resources get freed when the called function exits.

So the take away for me is that objects are routinely passed as references. No need to declare (class& object) in the parameters. "Var" in the parameters means "non const". Not sure what "const" means. It's kinda covered by the absence of "var". So it must have some other meaning to the compiler.
Not quite correct. A normal parameter is read/write just like a var-parameter is, but the caller (not the callee) creates a copy of the value.

Oh boy. I initially agreed with this take. For several hours. Until that other guy mentioned that it was reference counted. Which I'm guessing means that no "deep" copies happen until there is an actual need - until one of the variables becomes different from the shared reference. And I'm assuming that the change that triggers the deep copy (the copy that gets modified) usually occurs "inside" the called function. The caller remains isolated from what's going on in the called function (because of "no var"), and continues to reference the original data. It seems inefficient for the caller to always create a deep copy (and pass a reference to it) when there are often cases in which no deep copy is ever needed.
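
One caveat on the copy-on-write assumption, as a hedged sketch with made-up procedure names: strings are copied on write, but for dynamic arrays a plain element assignment does not trigger a copy, so a by-value dynamic array parameter still writes into the caller's data. Only an operation such as SetLength makes the array unique, which is what the Test example earlier in the thread shows.

```pascal
program sharedemo;
{$mode objfpc}{$H+}

type
  TLongIntArray = array of LongInt;

procedure ChangeElement(aArr: TLongIntArray);
begin
  aArr[0] := 99;   { no copy is made: this writes into the caller's array data }
end;

procedure Resize(aArr: TLongIntArray);
begin
  { the array is shared with the caller here, so SetLength creates a private copy }
  SetLength(aArr, Length(aArr) + 1);
  aArr[0] := 7;    { only the private copy changes }
end;

procedure ChangeString(aStr: AnsiString);
begin
  aStr[1] := 'X';  { strings are copy-on-write: the callee gets its own copy here }
end;

var
  a: TLongIntArray;
  s: AnsiString;
begin
  SetLength(a, 3);
  a[0] := 1;
  ChangeElement(a);
  Writeln(a[0]);   { 99 }
  Resize(a);
  Writeln(a[0]);   { still 99 }
  s := 'abc';
  ChangeString(s);
  Writeln(s);      { abc }
end.
```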

440bx

  • Hero Member
  • *****
  • Posts: 3945
Re: Is This A Leak?
« Reply #29 on: October 18, 2019, 02:59:12 pm »
As so often that depends on the used calling convention and ABI.
Yes, definitely.  I should have specified that my comments were applicable to the compiler's default parameter passing convention, namely register.

This behaviour not only depends on the calling convention, but also on the passed type: for example records that contain managed types are always passed as reference.
I stay away from managed types as much as I can, as a result, I know very little about how they are treated. I will take your word for how those are handled.  Thank you for making that clear.

Sometimes it also depends on the size whether a record is passed as reference or a direct copy (e.g. a copy of a record of size 4 could be passed directly inside a register).
That makes perfect sense. If it fits in a register and it is passed by value, it should be passed in a register (if there are any still available, of course).  I presume that in the case of a record of size 4, that implies one (1) field of size 4 and, if there were two fields whose combined size was 4, then two (2) registers would be used, correct?


 
