Author Topic: [Feature] High level Variant Records  (Read 970 times)

Warfley

  • Hero Member
  • *****
  • Posts: 1684
[Feature] High level Variant Records
« on: October 08, 2024, 11:26:40 pm »
Hello everyone,

I want to propose a new feature I've built for FPC, which I have, completely uncreatively, named "High level Variant Records". Variant records are quite interesting, but the way they are implemented in FPC is very low level. Basically they are exactly the same as C-style unions, just with extra steps.

That said, syntactically they are quite interesting, because they allow for a selector field and for associating the branches with the values the selector field can take. But while this is syntactically interesting, there is no semantic meaning behind it, and the following is perfectly legal code:
Code: Pascal
TVarRec = record
  case sel: Boolean of
  True: (A: Integer);
  True: (B: Double);
end;

vr: TVarRec;
begin
  vr.Sel := False;
  vr.A:=42;

Furthermore, because there are no semantics, variant records do not allow managed types in the variant branches. Take the following:
Code: Pascal
TVarRec = record
  case sel: Boolean of
  True: (A: String);
  False: (B: Integer);
end;
This is not allowed because FPC doesn't know whether A or B is set. From the syntax this should be obvious: if sel is True then A should be set, and if it is False then B should be set. But this semantic connection is not encoded in the language.

So I set out to change that. With my merge request I've added four new features to do so:

Variant RTTI
First, as a basis, I've added RTTI information about the variants. While this was a necessary base for the management operations, it also allows the user to traverse the variant information easily using the TypInfo unit:
Code: Pascal
td:=GetTypeData(TypeInfo(TVarRec));
for i:=0 to td^.VariantInfo^.BranchCount-1 do
begin
  mf:=td^.VariantInfo^.Branches[i]^.BranchStart;
  while mf<td^.VariantInfo^.Branches[i]^.BranchEnd do
  begin
    // Iterate through all fields of a branch and print the type
    WriteLn(mf^.TypeRef^.Name);
    Inc(mf);
  end
end;

Strict Variants
The next addition is strict variants. This adds compile-time checks that your variant branches have unique values and contain a 0 index:
Code: Pascal
TVarRec = record
case Boolean of
True: (...);
True: (...); // Error because it overlaps with the existing first branch
end;

Variant Access Checks
This adds runtime checks to variant access to make sure the program only accesses the branch currently selected by the selector:
Code: Pascal
TVarRec = record
  case sel: Boolean of
  True: (A: Integer);
  False: (B: Double);
end;

var
  vr: TVarRec;
begin
  vr.sel:=True;
  vr.B:=3.14; // Runtime error because B is on false branch but true is set
end;

Managed Variants
Lastly, with all these precursors in place, it finally becomes possible to use managed types in variant branches:
Code: Pascal
type
  TVarRec = record
  case sel:Boolean of
  True: (s:String); // Managed field
  False: (I: Integer);
  end;

var
  vr: TVarRec;
begin
  vr.sel:=True; // Branch switch -> will initialize managed field s
  vr.s:='Hello World';
  vr.sel:=False; // Branch switch -> will finalize managed field s
end.
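
For readers who want to experiment with these semantics on a released compiler, the branch-switch and access-check behaviour described above can be approximated by hand. The following is only a sketch under stated assumptions: TTagged and its members are hypothetical names, the two payload fields deliberately do not share memory (released FPC versions reject managed fields in a variant part), and the checks are ordinary exceptions rather than whatever the merge request actually emits.

```pascal
program TaggedDemo;
{$mode objfpc}{$H+}
{$modeswitch advancedrecords}

uses
  SysUtils;

type
  // Hypothetical hand-written stand-in for a checked, managed variant record.
  // FS and FI are separate fields here; they do NOT overlap in memory.
  TTagged = record
  private
    FSel: Boolean;
    FS: String;   // meaningful only while FSel = True
    FI: Integer;  // meaningful only while FSel = False
    procedure SetSel(AValue: Boolean);
    function GetS: String;
    procedure SetS(const AValue: String);
  public
    property Sel: Boolean read FSel write SetSel;
    property S: String read GetS write SetS;
  end;

procedure TTagged.SetSel(AValue: Boolean);
begin
  if AValue = FSel then
    Exit;
  // Leaving a branch releases its data; the new branch starts out empty.
  FS := '';
  FI := 0;
  FSel := AValue;
end;

function TTagged.GetS: String;
begin
  // Access check: S belongs to the True branch only.
  if not FSel then
    raise Exception.Create('S accessed while the False branch is active');
  Result := FS;
end;

procedure TTagged.SetS(const AValue: String);
begin
  if not FSel then
    raise Exception.Create('S accessed while the False branch is active');
  FS := AValue;
end;

var
  t: TTagged;
begin
  t := Default(TTagged);
  t.Sel := True;          // switch to the string branch
  t.S := 'Hello World';
  WriteLn(t.S);
  t.Sel := False;         // switch away: the string is released
  try
    WriteLn(t.S);         // wrong branch, so the check raises
  except
    on E: Exception do
      WriteLn('Error: ', E.Message);
  end;
end.
```

The cost of this workaround is that the record is as large as all branches combined and every field needs its own accessor pair, which is the boilerplate the proposed feature is meant to remove.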

Further information can be found in the merge request.

I would be grateful if people could give me feedback and maybe try it out a bit to see what's missing or where problems arise.

440bx

  • Hero Member
  • *****
  • Posts: 4668
Re: [Feature] High level Variant Records
« Reply #1 on: October 09, 2024, 12:00:33 am »
Quote
<snip> .. there is no semantic meaning behind this and the following is perfectly legal code:
Code: Pascal
TVarRec = record
  case sel: Boolean of
  True: (A: Integer);
  True: (B: Double);
end;

vr: TVarRec;
begin
  vr.Sel := False;
  vr.A:=42;
That code is NOT legal. The case should NOT accept the value "True" twice. Every variant is supposed to have a unique identifier, and this was reported as a bug a good while back. That said, FPC v3.2.2 compiles it, but it shouldn't. I guess the bug is or will be fixed in the next release, v3.2.4?


Quote
Furthermore, because there are no semantics, variant records do not allow managed types in the variant branches. Take the following:
Code: Pascal
TVarRec = record
  case sel: Boolean of
  True: (A: String);
  False: (B: Integer);
end;
This is not allowed because FPC doesn't know whether A or B is set.

The compiler accepts that and there is no reason why it shouldn't.  That's perfectly legal code.  The fact that "A" is a managed type is irrelevant.

For the record, the variant tag is for the programmer's convenience, it is not there to tell the compiler what the variant's values are supposed to be (the value is set at runtime, therefore the compiler cannot usually know it at compile time.)


(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

Warfley

  • Hero Member
  • *****
  • Posts: 1684
Re: [Feature] High level Variant Records
« Reply #2 on: October 09, 2024, 12:12:22 am »
Quote
That code is NOT legal. The case should NOT accept the value "True" twice. Every variant is supposed to have a unique identifier, and this was reported as a bug a good while back. That said, FPC v3.2.2 compiles it, but it shouldn't. I guess the bug is or will be fixed in the next release, v3.2.4?

Well, I do remember this discussion, but I can tell you at least that there has been no "fix" for this in trunk/main. This is why I built that myself. I agree that it should not be legal, but it is.

Also, I just checked: it's not changed on the 3.2.4 branch either. In fact, in the current FPC code the compiler forgets the branch values as soon as it has read them (unless in mode ISO). They are completely vestigial and have absolutely no function for the compiler.

Quote
The compiler accepts that and there is no reason why it shouldn't.  That's perfectly legal code.  The fact that "A" is a managed type is irrelevant.

No, the compiler does not accept this, because it can't. Take the following:
Code: Pascal
var
  vr: TVarRec;
begin
  vr.B:=42;
end;
If the runtime now tries to free the string in A, it will try to dereference Pointer(42), which leads to a crash. This is because the runtime does not know which branch is selected, since the selector is just syntactic sugar.
This is what I changed: I made the runtime aware of the variant part and the selector to enable this.
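
The overlap behind that crash can be observed safely on a released compiler by using a plain pointer in place of the string. This is a minimal sketch, not code from the merge request: TOverlap is a hypothetical type, P merely stands in for the hidden pointer an AnsiString field would hold, and B is declared as PtrUInt (rather than Integer) so that both fields have the same size.

```pascal
program OverlapDemo;
{$mode objfpc}{$H+}

type
  // Classic variant record: P and B occupy the same memory.
  // P is an assumption standing in for an AnsiString's internal pointer.
  TOverlap = record
    case sel: Boolean of
      True:  (P: Pointer);
      False: (B: PtrUInt);
  end;

var
  r: TOverlap;
begin
  r.sel := False;
  r.B := 42;
  // Reading the other branch reinterprets the same bytes as an address.
  WriteLn('P viewed as a number: ', PtrUInt(r.P));
  // If P really were a string, finalizing it would dereference that address.
end.
```

Nothing in the record tells the runtime which of the two interpretations is valid, which is the information the selector-aware RTTI is meant to supply.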

Quote
For the record, the variant tag is for the programmer's convenience, it is not there to tell the compiler what the variant's values are supposed to be (the value is set at runtime, therefore the compiler cannot usually know it at compile time.)
Sure, but the compiler must embed that information for the runtime. Right now FPC does not do that, which is why the example above does not work: the runtime can't determine whether A is selected and there is a valid string, or B is selected and there is an integer in that memory location.

What my change does is add this information to the runtime, so the runtime can look at the selector field, see that A is selected and therefore treat it as a string.

Also, in some cases the compiler can be aware of the value:
Code: Pascal
if VR.Sel then
  VR.B:=42;
The compiler could easily deduce here that the branch for A is selected and that this access to B should therefore be illegal.
This would actually be the next thing I'd start to work on once the features mentioned above are finished.
« Last Edit: October 09, 2024, 12:25:17 am by Warfley »

440bx

  • Hero Member
  • *****
  • Posts: 4668
Re: [Feature] High level Variant Records
« Reply #3 on: October 09, 2024, 12:27:17 am »
Quote
No, the compiler does not accept this, because it can't. Take the following:
Code: Pascal
var
  vr: TVarRec;
begin
  vr.B:=42;
end;
You didn't put the assignment after the type definition. You just had the type definition all by itself followed by a statement that it was illegal.  I thought you meant the type definition was not acceptable.

Warfley

  • Hero Member
  • *****
  • Posts: 1684
Re: [Feature] High level Variant Records
« Reply #4 on: October 09, 2024, 12:38:59 am »
FPC will not accept any variant record definition that contains managed fields. It throws an error as soon as it encounters one, even for types that should be compatible (e.g. if both branches hold a dynamic array of an unmanaged element type of the same size, like array of LongInt vs. array of LongWord).

Thaddy

  • Hero Member
  • *****
  • Posts: 15979
  • Censorship about opinions does not belong here.
Re: [Feature] High level Variant Records
« Reply #5 on: October 09, 2024, 06:05:51 am »
Well, whether it is managed depends on the string mode. In mode objfpc it is ShortString by default, and ShortStrings are not managed. For String to be anything else it needs {$H+} or mode Delphi/DelphiUnicode.
For strings to work in a variant record, the length of the ShortString must be known, so it is a string[xxx].
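
As a small illustration of the string-mode point, the sketch below declares String once under {$H-} and once under {$H+} and prints the resulting sizes. The type and field names (TShortAlias, TLongAlias, TFlatRec, Name, Note) are made up for this example.

```pascal
program StringModes;
{$mode objfpc}

{$H-}
type
  TShortAlias = String;  // under {$H-}, String means ShortString

  // Accepted by released FPC versions: both string fields are unmanaged.
  TFlatRec = record
    case sel: Boolean of
      True:  (Name: String[20]);
      False: (Note: TShortAlias);
  end;

{$H+}
type
  TLongAlias = String;  // under {$H+}, String means AnsiString

begin
  // A ShortString is a 256-byte value (length byte plus 255 characters);
  // an AnsiString variable is a single pointer to managed heap data.
  WriteLn('SizeOf(TShortAlias) = ', SizeOf(TShortAlias));
  WriteLn('SizeOf(TLongAlias)  = ', SizeOf(TLongAlias));
  WriteLn('SizeOf(Pointer)     = ', SizeOf(Pointer));
end.
```

Only the second kind, the pointer-sized managed string, is what the proposal is concerned with.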

What I do not understand is why variant records should allow pointer types (which managed types are) in general.
To me a variant record is always a consecutive array of bytes, of the size of the record, with multiple possible layouts. Storing pointers is fine but violates its purpose. Managed types make it not a flat but a tree-like structure.
« Last Edit: October 09, 2024, 06:22:21 am by Thaddy »
If I smell bad code it usually is bad code and that includes my own code.

Khrys

  • Full Member
  • ***
  • Posts: 102
Re: [Feature] High level Variant Records
« Reply #6 on: October 09, 2024, 06:56:42 am »
Quote
To me a variant record is always a consecutive array of bytes, of the size of the record, with multiple possible layouts. Storing pointers is fine but violates its purpose. Managed types make it not a flat but a tree-like structure.

Can you elaborate? Why would you want to avoid pointers inside variant records?

Thaddy

  • Hero Member
  • *****
  • Posts: 15979
  • Censorship about opinions does not belong here.
Re: [Feature] High level Variant Records
« Reply #7 on: October 09, 2024, 09:06:26 am »
I want to avoid managed types in variant records, because they are pointers in a flat memory layout, so they don't make much sense. Store as many pointers as you like; it is possible but not very relevant.

Warfley

  • Hero Member
  • *****
  • Posts: 1684
Re: [Feature] High level Variant Records
« Reply #8 on: October 09, 2024, 09:41:47 am »
Quote
What I do not understand is why variant records should allow pointer types (which managed types are) in general.
To me a variant record is always a consecutive array of bytes, of the size of the record, with multiple possible layouts. Storing pointers is fine but violates its purpose. Managed types make it not a flat but a tree-like structure.

But this tree structure is what variant records are perfectly suited for. Take the following:
Code: Pascal
TTree = record
  case NodeType: (Branch, Leaf) of
  Branch: (children: Array of TTree);
  Leaf: (AValue: String);
end;

This is exactly why I implemented managed variants: to make this possible. Because of the selector field NodeType there is no ambiguity about which case is live, and therefore the runtime can easily manage either the array or the value part.

Thaddy

  • Hero Member
  • *****
  • Posts: 15979
  • Censorship about opinions does not belong here.
Re: [Feature] High level Variant Records
« Reply #9 on: October 09, 2024, 09:55:41 am »
My problem with the concept is storing and reading records, which is easy with a flat layout. Without automatic store and read, such records are rather futile in my opinion, although I see some benefits if store and load are not needed.
Records such as the ones you propose are difficult to stream. It can be done, but requires quite a lot of extra code.
« Last Edit: October 09, 2024, 10:00:39 am by Thaddy »

Khrys

  • Full Member
  • ***
  • Posts: 102
Re: [Feature] High level Variant Records
« Reply #10 on: October 09, 2024, 01:54:17 pm »
Quote
My problem with the concept is storing and reading records, which is easy with a flat layout. Without automatic store and read, such records are rather futile in my opinion, although I see some benefits if store and load are not needed.

Quote
What I do not understand is why variant records should allow pointer types (which managed types are) in general.

I completely agree that streaming pointers themselves is nonsensical in the vast majority of applications, but records don't exist exclusively for serialization/streaming purposes. So "in general" seems a bit overreaching to me.

Thanks @Warfley for your work/effort! Hopefully the MR gets approved, this is a very useful feature IMO.

Thaddy

  • Hero Member
  • *****
  • Posts: 15979
  • Censorship about opinions does not belong here.
Re: [Feature] High level Variant Records
« Reply #11 on: October 09, 2024, 04:19:06 pm »
Code: Pascal
type
  TVarRec = record
  case sel:Boolean of
  True: (s:String); // UnManaged field
  False: (I: Integer);
  end;

var
  vr: TVarRec;
begin
  vr.s:='Hello World';
  writeln(vr.s);
end.
This has always compiled. I still do not get it.
It has always compiled simply because the default string type is ShortString everywhere except in the Delphi modes.
It would be helpful if you wrote which string types this is relevant for, and that it expects {$H+}.
« Last Edit: October 09, 2024, 04:22:30 pm by Thaddy »

Warfley

  • Hero Member
  • *****
  • Posts: 1684
Re: [Feature] High level Variant Records
« Reply #12 on: October 09, 2024, 04:26:04 pm »
Obviously, when talking about managed fields I mean AnsiString. I did not specifically mention this because I think the share of people working without mode Delphi or with {$H-} is probably in the single-digit percentages of all Pascal programmers.

Thaddy

  • Hero Member
  • *****
  • Posts: 15979
  • Censorship about opinions does not belong here.
Re: [Feature] High level Variant Records
« Reply #13 on: October 09, 2024, 04:50:37 pm »
Well, even mode objfpc has ShortString as the default...
Anyway, I will evaluate the feature; maybe it will come to me that there is some applicability.
It was not my intention to dismiss it beforehand.
« Last Edit: October 10, 2024, 06:40:18 am by Thaddy »

PascalDragon

  • Hero Member
  • *****
  • Posts: 5726
  • Compiler Developer
Re: [Feature] High level Variant Records
« Reply #14 on: October 10, 2024, 09:56:53 pm »
Quote
Furthermore, because there are no semantics, variant records do not allow managed types in the variant branches. Take the following:
Code: Pascal
TVarRec = record
  case sel: Boolean of
  True: (A: String);
  False: (B: Integer);
end;
This is not allowed because FPC doesn't know whether A or B is set.
Quote
The compiler accepts that and there is no reason why it shouldn't.  That's perfectly legal code.  The fact that "A" is a managed type is irrelevant.

Assuming that String = AnsiString, it is absolutely not allowed, and the same goes for any other managed type (ShortString is not a managed type). This is because the compiler/RTL must be able to correctly apply the type's management functions to the field. If the field were inside a variant record, the compiler/RTL would not be able to decide which field is set, and thus it might call the management functions on an illegal value.

 
