Proposal: Record Composition

Warfley:
I would like to make a proposal for a small language feature, which could be really useful. I named this feature "Record Composition", but it already crossed my mind that because of the composition pattern in OOP, this may not be the most fitting name.

I do not know what the official way to submit such ideas/contributions is (there seems to be no information on GitLab or the website), so I will just post it here, as I know that many FPC developers watch this forum regularly.

The basic idea is simple, just allow access to the subfields of a record element directly through the parent record. Example:

--- Code: Pascal ---
{$ModeSwitch RecordComposition}
type
  TChildRec = record
    A: Integer;
  end;

  TComposed = record
    uses child: TChildRec;
    B: Double;
  end;

var
  c: TComposed;
begin
  c.A := 42;
  c.B := 3.14;
  WriteLn(c.child.A); // Writes 42
end.

I've already implemented it in the FPC sources, to see how feasible such an implementation would be: https://gitlab.com/freepascal.org/fpc/source/-/merge_requests/498
You can look at the committed test cases to see some more examples.

Why I think this is useful/required
There are basically three main reasons for it:

1. Since C11, C allows so-called anonymous structs and anonymous unions inside structs. Records have so far been mostly equivalent to C structs, which makes porting C code to Pascal easy, but anonymous structs and anonymous unions have no record counterpart yet, so C code that uses them is hard to port. Record composition would resolve this problem through anonymous composition fields (already implemented):

--- Code: C ---
struct foo {
  int x;
  struct bar; // anonymous struct inclusion of struct bar
};

Can now be represented by:

--- Code: Pascal ---
TFoo = record
  x: Integer;
  uses TBar;
end;

2. Allowing non-trailing and multiple variant parts. As was discussed previously in https://forum.lazarus.freepascal.org/index.php/topic,63226.0.html there seems to be a need for having the variant part of a record somewhere other than at the end, or for having multiple variant parts within one record. This can also be solved using record composition:

--- Code: Pascal ---
TFoo = record
  x: Integer;
  uses record
    case Boolean of
    True: (I: Integer);
    False: (D: Double);
  end;
  y: Integer;
end;

3. It makes it easier to organize record contents across multiple definitions. For example, if you have platform-dependent data, you don't need massively nested {$IfDef}-{$Else}-{$EndIf} directives; you can simply write different types and import them:

--- Code: Pascal ---
TFoo = record
  CommonField: Integer;
  {$IfDef Windows}
  uses TFooWinInternal;
  {$Else}
  uses TFooUnixInternal;
  {$EndIf}
end;
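
For reference on reasons 1 and 2, here is a minimal C sketch of the C side that a composed record would have to mirror. Standard C11 only promotes members of struct/union members that are declared inline without a tag and without a name; the tagged form `struct bar;` used as a member is, as far as I know, accepted by GCC and Clang only as an extension (`-fms-extensions`). The type `foo`, its members and the helper `make_foo` are made up for illustration.

```c
#include <assert.h>

/* Hypothetical example type, not taken from any real library. */
struct foo {
    int x;
    struct {        /* C11 anonymous struct: a and b are promoted */
        int a;
        int b;
    };
    union {         /* C11 anonymous union, not the last member */
        int    i;
        double d;
    };
    int y;
};

/* Fills a foo using only the promoted member names. */
static struct foo make_foo(void)
{
    struct foo f;
    f.x = 1;
    f.a = 2;   /* member of the anonymous struct */
    f.b = 3;
    f.i = 4;   /* writes the anonymous union through its int member */
    f.y = 5;
    return f;
}
```

Note that the anonymous union sits in the middle of the struct, which is exactly the "non-trailing variant part" case from reason 2.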

There are also a few minor reasons to do this, which are especially interesting for advanced records. For example, one of the main issues with advanced records is that they don't support inheritance. While this composition is no replacement for inheritance, as it does not provide virtual methods or similar, the ability to compose a record from some base definition plus some additional data and/or functionality is probably sufficient in most situations, giving more options without having to resort to a full inheritance model.
Second, one thing I experimented with was using this to enable records as wrappers for objects and classes. For example, one could imagine a "shared pointer" like this:

--- Code: Pascal ---
TSharedSL = record
private
  ref: TStringList;
  refcount: PInteger;
public
  using ref;
  class operator Initialize(...); // Will create ref
  class operator Finalize(...); // will decrease refcount
  class operator Copy(...); // Decrease -> copy -> increase refcount
  class operator AddRef(...); // Will increase refcount
end;

var
  sl: TSharedSL;
begin
  sl.Add('Hello');
  sl.Add('World');
  WriteLn(sl.Text);
end.

But I'm a bit unsure about that one. While I have currently built it so that classes can also be composed, I think that just including large classes will mostly create irritating types with way too many fields and field collisions everywhere. So I'm not sure there is really any use case for this that makes it worth it.

Lastly, it's a very simple addition to FPC. The diff on the compiler code is only about 200 lines, and some of that is for features such as allowing classes and visibility scoping (so that, as above, ref can have a different visibility than the fields that are imported), which may not even be worth including in the first place. At its core it just adds link symbols which, when resolved, generate the AST for accessing the composite member. There are basically only two contact points within the FPC code base, so it will have only a very limited effect on other FPC functionality and maintenance.

TRon:
proposal introduces ambiguity:


--- Code: Pascal ---
type
  TChildRec1 = record
    A: Integer;
  end;
  TChildRec2 = record
    A: Integer;
  end;

  TComposed = record
    uses child1: TChildRec1;
    uses child2: TChildRec2;
    B: Double;
  end;

Becomes even more problematic (to detect) when more nesting of record structures occurs.

edit: forgot to rename the child field names, introducing my own ambiguity. Fixed now.

Warfley:
As I implemented it, this is quite simple: when there is a collision and there is no named member through which to resolve it, the compiler throws an error. See the reccomp_dup_unnamed test cases: https://gitlab.com/Warfley/FPC_Source/-/blob/recordcomposition/tests/test/reccomp_dup_unnamed1_f.pp

--- Code: Pascal ---
type
  TChildRec = record
    B: Integer;
  end;

  TComposed = record
    A: Integer;
    B: Integer;
    uses TChildRec; // Error duplicate identifier B
    C: Integer;
  end;

If the composition field is named, the compiler only emits a warning and includes the fields on a first-come, first-served basis, with the fields of the owning record always preferred, as can be seen in the reccomp_dup_named test cases: https://gitlab.com/Warfley/FPC_Source/-/blob/recordcomposition/tests/test/reccomp_dup_named1_f.pp (note: because of the -Sew flag this test also fails to compile, since warnings are treated as errors; in a live system you can of course just ignore the warning)

--- Code: Pascal ---
  TChildRec = record
    C: Integer;
  end;

  TComposed = record
    A: Integer;
    B: Integer;
    uses child: TChildRec; // Warning Duplicate Identifier C
    C: Integer;
  end;

var
  c: TComposed;
begin
  c.C := 42; // Sets C of TComposed Record
  c.child.C := 42; // Active disambiguation is required
end;

I think that a warning is of course also a very strong incentive to avoid this, but as active disambiguation is always possible, I don't think it is truly an error case.

One thing I am unsure about: when there is a collision between two composites, as in your example, it currently always uses the first one in top-to-bottom order (so c.A would be c.child1.A). One thing I also thought about would be to include neither in such a case, so you can't write c.A at all and must always use c.child1.A or c.child2.A. That would be the "cleaner" option, I think.

But all in all, those are details. As you can always disambiguate through the field, it does not pose a problem.
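
To make the two collision policies concrete, here is a small C model of the lookup for named composition fields. It is only a sketch of the rules as described in this post, not the code from the merge request: the `entry` table, the `policy` enum and `resolve` are all hypothetical names, and the comparison is case-sensitive for brevity even though Pascal identifiers are not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of one name visible in a composed record.
   `via` is NULL for a field declared in the record itself, otherwise
   the name of the composition field that imports it. */
struct entry {
    const char *name;
    const char *via;
};

enum policy {
    FIRST_WINS,        /* first composite in declaration order wins */
    EXCLUDE_AMBIGUOUS  /* colliding imports are not promoted at all  */
};

/* Resolves an unqualified field name. Returns the matching entry, or
   NULL if the name is unknown or (under EXCLUDE_AMBIGUOUS) imported
   by more than one composition field. */
static const struct entry *resolve(const struct entry *tab, size_t n,
                                   const char *name, enum policy p)
{
    const struct entry *first = NULL;
    size_t hits = 0;

    for (size_t i = 0; i < n; i++) {
        if (strcmp(tab[i].name, name) != 0)
            continue;
        if (tab[i].via == NULL)
            return &tab[i];      /* own fields are always preferred */
        if (first == NULL)
            first = &tab[i];     /* remember the first import seen */
        hits++;
    }
    if (hits > 1 && p == EXCLUDE_AMBIGUOUS)
        return NULL;
    return first;
}

/* TComposed from TRon's example: A is imported twice, B is an own field. */
static const struct entry composed[] = {
    { "A", "child1" },
    { "A", "child2" },
    { "B", NULL     },
};
```

With `FIRST_WINS`, looking up `A` in `composed` yields the entry imported via `child1`; with `EXCLUDE_AMBIGUOUS` it yields NULL, so only the qualified forms remain usable. `B` resolves to the own field under both policies.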

Martin_fr:
The following works. So you can only "include" one parent.

"object" is like record, allocated in the stack. No pointer to heap, no "create" needed.


--- Code: Pascal ---
type
  TChildRec = object
    A: Integer;
  end;

  TComposed = object(TChildRec)
    B: Double;
  end;

Warfley:

--- Quote from: Martin_fr on September 16, 2023, 09:46:05 pm ---The following works. So you can only "include" one parent.

"object" is like record, allocated in the stack. No pointer to heap, no "create" needed.


--- Code: Pascal ---
type
  TChildRec = object
    A: Integer;
  end;

  TComposed = object(TChildRec)
    B: Double;
  end;
--- End quote ---

I know about objects and inheritance; the thing is that they do not serve the purposes outlined in my initial post. Unlike records, objects are not compatible with C, which makes them infeasible as an interface to a C library that uses anonymous structs/unions. Also, inheritance with a VMT and downcasting is much more complex than what I propose: the goal is not to provide OOP functionality but, in the spirit of the original records, simply to provide data access within a structured record.
It is therefore important that when doing this:

--- Code: Pascal ---
TComposite = record
  A: Integer;
  using TChildRecord;
  C: Integer;
end;

The TChildRecord instance is located exactly between A and C, with predictable (and configurable) alignment.
This should be compatible with, and usable as a wrapper for, a C library that publishes the following data structure:

--- Code: C ---
struct composite {
  int a;
  struct childstruct;
  int c;
};

It specifically tries to address some of the limitations records currently have, which were also discussed in, for example: https://forum.lazarus.freepascal.org/index.php/topic,63226.msg479790.html#msg479790
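
The layout requirement can be checked on the C side with `offsetof`. The following is a sketch under assumptions: the child is written as an inline C11 anonymous struct with two made-up members `b1` and `b2`, since the contents of `childstruct` are not given here, and `child_is_between` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the composite above; b1 and b2 are invented members
   representing whatever the child struct actually contains. */
struct composite {
    int a;
    struct {        /* anonymous child, embedded in place */
        int b1;
        int b2;
    };
    int c;
};

/* Returns 1 if the child's members lie between a and c in memory. */
static int child_is_between(void)
{
    return offsetof(struct composite, a)  < offsetof(struct composite, b1)
        && offsetof(struct composite, b1) < offsetof(struct composite, b2)
        && offsetof(struct composite, b2) < offsetof(struct composite, c);
}
```

A Pascal binding would need the same offsets for the fields of the composed record, which is why the position and alignment of the composition field matter.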
