Recent

Author Topic: Reference Counting and Ownership  (Read 6692 times)

munair

  • Hero Member
  • *****
  • Posts: 798
  • compiler developer @SharpBASIC
    • SharpBASIC
Reference Counting and Ownership
« on: December 31, 2021, 01:10:50 pm »
There's an interesting article I read the other day about an ownership model for reference counting. It was proposed for the Lobster language and it is a very interesting approach. According to the article, this reference counting is mainly done during compile-time and takes away about 95% of the runtime reference count operations.

The ownership model is interesting in that it could tackle (unexpected) reference changes like this (pseudo-code):
Code: Text  [Select][+][-]
  1. var x = "hello"
  2. def f(y):
  3.     x = "world"
  4.     print y       // should it print "world" or "hello"
  5. f(x)
I was wondering where FPC stands in this.
keep it simple

Jonas Maebe

  • Hero Member
  • *****
  • Posts: 1059
Re: Reference Counting and Ownership
« Reply #1 on: December 31, 2021, 01:31:33 pm »
In FPC, ownership is determined by the programmer:
* if they declare the parameter as const, constref or var, it will print 'world' (because then either the programmer assures the compiler that the value won't change, or they are explicitly passing a reference to the string rather than its value)
* if it's a value parameter, it will print 'world' (because then the programmer explicitly told the compiler that they want the value to be passed)

munair

  • Hero Member
  • *****
  • Posts: 798
  • compiler developer @SharpBASIC
    • SharpBASIC
Re: Reference Counting and Ownership
« Reply #2 on: December 31, 2021, 01:45:03 pm »
In FPC, ownership is determined by the programmer:
* if they declare the parameter as const, constref or var, it will print 'world' (because then either the programmer assures the compiler that the value won't change, or they are explicitly passing a reference to the string rather than its value)
* if it's a value parameter, it will print 'world' (because then the programmer explicitly told the compiler that they want the value to be passed)

But that's basically telling the compiler to either reference (to the same address) or make a copy of the string?

An ownership model instead would make a second reference (y) the new owner if the previous owner (x) changes content.
« Last Edit: December 31, 2021, 01:47:02 pm by munair »
keep it simple

Jonas Maebe

  • Hero Member
  • *****
  • Posts: 1059
Re: Reference Counting and Ownership
« Reply #3 on: December 31, 2021, 01:56:53 pm »
Strings are reference-counted, so a value parameter does not make a copy. It just increases the reference count. The copy/ownership changes (dynamically) when the caller changes the value.

If you would change the ownership immediately on calling the routine, then you would change the semantics of existing programs. And even if you consider their behaviour buggy, that won't appease anyone who relies on that behaviour (as we've experienced time and time again when FPC behaves differently from Delphi even if the Delphi behaviour is just a code generation quirk that happens to guarantee the Delphi behaviour rather than something that is defined by the language specification).

munair

  • Hero Member
  • *****
  • Posts: 798
  • compiler developer @SharpBASIC
    • SharpBASIC
Re: Reference Counting and Ownership
« Reply #4 on: December 31, 2021, 02:41:24 pm »
So if I understand correctly, when x is passed by value to parameter y, no copy of x is made. Instead the reference count is increased by 1, so that when x changes in the meantime, the original value is kept alive because a reference (y) still exists, like in this example:
Code: Pascal  [Select][+][-]
  1. program test;
  2.  
  3. var
  4.   x: string;
  5.  
  6. procedure f(y: string);
  7. begin
  8.   x := 'world';
  9.   writeln(y);
  10. end;
  11.  
  12. begin
  13.   writeln;
  14.  
  15.   x := 'hello';
  16.   f(x);
  17. end.

btw, how do I prevent the code box from writing ' for a single quote?
keep it simple

Jonas Maebe

  • Hero Member
  • *****
  • Posts: 1059
Re: Reference Counting and Ownership
« Reply #5 on: December 31, 2021, 03:01:32 pm »
So if I understand correctly, when x is passed by value to parameter y, no copy of x is made. Instead the reference count is increased by 1, so that when x changes in the meantime, the original value is kept alive because a reference (y) still exists, like in this example:
Indeed, as long as string = ansistring ({$h+}, or default in {$mode delphi}). When string = shortstring, then there is no reference counting and instead full copies are made.

Quote
btw, how do I prevent the code box from writing ' for a single quote?
The forum software was recently upgraded to solve some security issues and this is an unfortunate side-effect. They're trying to find out how to fix it.

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11452
  • FPC developer.
Re: Reference Counting and Ownership
« Reply #6 on: December 31, 2021, 03:18:03 pm »
I do note that the article doesn't seem to talk about exception handling.

If I optimize away an operation y:=x+z; where z is a constant, x is an object that is owner, and y is the L-value that will get the new ownership.

If you optimize this away and an exception happens in evaluation of x+z, what happens?

p.s. the 95% number seems to go for all ownership decisions, which is like every statement that uses an object. I don't think that applies to ansistrings that well.

munair

  • Hero Member
  • *****
  • Posts: 798
  • compiler developer @SharpBASIC
    • SharpBASIC
Re: Reference Counting and Ownership
« Reply #7 on: December 31, 2021, 04:10:25 pm »
I do note that the article doesn't seem to talk about exception handling.

If I optimize away an operation y:=x+z; where z is a constant, x is an object that is owner, and y is the L-value that will get the new ownership.

If you optimize this away and an exception happens in evaluation of x+z, what happens?

p.s. the 95% number seems to go for all ownership decisions, which is like every statement that uses an object. I don't think that applies to ansistrings that well.

Interesting questions. I haven't come to exception handling and deeper optimization yet, so I cannot answer them. But the ownership model, especially at compile-time, sounds great and that's what I'm working on right now. In principle, it should be able to handle anything holding an address to allocated memory.
keep it simple

lainz

  • Hero Member
  • *****
  • Posts: 4468
    • https://lainz.github.io/
Re: Reference Counting and Ownership
« Reply #8 on: December 31, 2021, 04:50:20 pm »
So if I understand correctly, when x is passed by value to parameter y, no copy of x is made. Instead the reference count is increased by 1, so that when x changes in the meantime, the original value is kept alive because a reference (y) still exists, like in this example:
Code: Pascal  [Select][+][-]
  1. program test;
  2.  
  3. var
  4.   x: string;
  5.  
  6. procedure f(y: string);
  7. begin
  8.   x := 'world';
  9.   writeln(y);
  10. end;
  11.  
  12. begin
  13.   writeln;
  14.  
  15.   x := 'hello';
  16.   f(x);
  17. end.

btw, how do I prevent the code box from writing ' for a single quote?

Anyways this code is bad programming. Is better to try always using pure functions, constant parameters, single result, no changing the global state of the program. Is hard, but most bugs are preventing global state changes, not by definition but in my own experience as programmer.

munair

  • Hero Member
  • *****
  • Posts: 798
  • compiler developer @SharpBASIC
    • SharpBASIC
Re: Reference Counting and Ownership
« Reply #9 on: December 31, 2021, 05:08:49 pm »
Anyways this code is bad programming. Is better to try always using pure functions, constant parameters, single result, no changing the global state of the program. Is hard, but most bugs are preventing global state changes, not by definition but in my own experience as programmer.

I agree, but it's a minimal example of a situation that a compiler should handle correctly / consistently. So it's not a question of good or bad programming, but rather a question of "what if".
keep it simple

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11452
  • FPC developer.
Re: Reference Counting and Ownership
« Reply #10 on: December 31, 2021, 05:09:41 pm »
Interesting questions. I haven't come to exception handling and deeper optimization yet, so I cannot answer them. But the ownership model, especially at compile-time, sounds great and that's what I'm working on right now. In principle, it should be able to handle anything holding an address to allocated memory.

Threading might be another issue, specially if ownership is runtime in some cases. And event driven (network servers, GUI apps) vs batch apps is another. With short lived events triggered

If ownership is fully compiletime, to me it reads as formalization/systematic approach of common sense optimizing away refcounting.

If ownership is runtime (or can devolve to it if not solved compiletime), I don't see the point. Either update one variable atomically or the other. 

For strings there is also a case that when a string function sometimes returns a parameter unmodified, and sometimes not (e.g. substitution value not found  ) , then owner can't be statically determined, but does the function have some runtime ability to do this? Or just let refcounting without ownership handle it?

The page is a bit thin on details IMHO, and for FPC those would be interesting, since the default case that optimizes away locally used expressions applies less.

munair

  • Hero Member
  • *****
  • Posts: 798
  • compiler developer @SharpBASIC
    • SharpBASIC
Re: Reference Counting and Ownership
« Reply #11 on: December 31, 2021, 05:23:11 pm »
The page is a bit thin on details IMHO, and for FPC those would be interesting, since the default case that optimizes away locally used expressions applies less.

It's definitely thin and I'm just starting to work out a basic concept that might work. But it's a serious one worth exploring given the author's background/experience. Sooner or later during compiler development the question rises: what can/should be done at compile-time and what at run-time. Normally it's a flow that follows logic, but in the case of memory management, as the author correctly states in the introduction, the problem is far from being solved.
keep it simple

ASerge

  • Hero Member
  • *****
  • Posts: 2241
Re: Reference Counting and Ownership
« Reply #12 on: December 31, 2021, 08:49:29 pm »
In FPC, ownership is determined by the programmer:
* if they declare the parameter as const, constref or var, it will print 'world' (because then either the programmer assures the compiler that the value won't change, or they are explicitly passing a reference to the string rather than its value)
* if it's a value parameter, it will print 'world' (because then the programmer explicitly told the compiler that they want the value to be passed)
Due to the fact that literal constants come without reference counter (-1), things can be a little more complicated. Example:
Code: Pascal  [Select][+][-]
  1. {$MODE OBJFPC}
  2. {$LONGSTRINGS ON}
  3. {$APPTYPE CONSOLE}
  4.  
  5. var
  6.   X: string = 'Hello';
  7.  
  8. procedure F( {1} Y: string);
  9. begin
  10.   X := 'World';
  11.   Writeln('X: ', X, ' ', StringRefCount(X));
  12.   Writeln('Y: ', Y, ' ', StringRefCount(Y));
  13. end;
  14.  
  15. begin
  16.   //UniqueString(X); {2}
  17.   F(X);
  18.   Readln;
  19. end.

{1} empty or const, no {2}
X: World -1
Y: Hello -1

{1} constref or var, {2} exists or not
X: World -1
Y: World -1

{1} empty, {2} exists
X: World -1
Y: Hello 1

{1} const, {2} exists
X: World -1
Y:  17517928

The latter, because Y refers to the freed memory.

By the way, recently someone asked about the differences between const and constref. Here it is.

440bx

  • Hero Member
  • *****
  • Posts: 4024
Re: Reference Counting and Ownership
« Reply #13 on: December 31, 2021, 09:29:19 pm »
Interesting conundrum in that code :)

For one thing, it shows one of the many reasons why global variables are a bad thing. 

The other thing is, if a variable is global then it doesn't really make sense to pass it as a parameter to some function since, the receiving function has access to the variable already.

I think compilers should refuse passing global variable(s) by reference.  They should allow passing a static (in the C sense) variable because that makes it visible to the recipient (a global variable is already visible, therefore it doesn't need to be passed by reference.)

The other problem with global variables that the code snippet doesn't show is that, in a multitasking environment, it's value can change at any time, it's volatile by nature.   The only thing that would make sense is to always pass its value at the time of the call but, passing a reference to it is pointless (since being global it is visible to all code.)
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

munair

  • Hero Member
  • *****
  • Posts: 798
  • compiler developer @SharpBASIC
    • SharpBASIC
Re: Reference Counting and Ownership
« Reply #14 on: January 01, 2022, 10:20:28 am »
Interesting conundrum in that code :)

For one thing, it shows one of the many reasons why global variables are a bad thing. 

Global variables are not bad, but they should be defined with care/purpose. As with everything in programming, nothing helps against badly written code or bad programming style. See also my earlier comment #9.

Best wishes for 2022 everyone!
keep it simple

 

TinyPortal © 2005-2018