I am definitely very interested in the possible enhancements to the variants in records. I already explicitly asked for that feature in this thread.
I've read the "composable records" proposal, but I think it falls short of what a genuinely useful multiple-variant record should provide.
Unnamed unions in C are another implementation disgrace. In C, when a struct has multiple unnamed unions, the compiler lets you set fields of one union and then modify fields of a different union that affect the fields just set in the previous one. That's ridiculous. It is an obvious error the compiler should catch or, at the very least, warn the programmer about, but not in C... macho programmers don't need to be told they just shot themselves in the foot with a bazooka. They don't need that, and why should the compiler warn about semantic mistakes?... Whose idea is it that the compiler should actually work and ensure things make sense?... What's next?... Strong typing?
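To make the complaint concrete, here is a minimal C sketch. The struct and field names are made up for illustration, and it assumes the simplest case: a single unnamed union whose members share storage. As far as I know, GCC and Clang accept this without a diagnostic at their usual warning levels.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record with one unnamed union; both members share storage. */
struct Sample {
    int32_t kind;            /* nothing ties this field to the union below */
    union {
        int32_t  as_int32;   /* one view of the storage */
        uint32_t as_bits;    /* another view of the same storage */
    };
};

/* Sets one member, then a different one, and returns what the first now holds. */
static int32_t overwrite_through_other_member(void)
{
    struct Sample s = { 0 };

    s.as_int32 = 1;             /* set a field */
    assert(s.as_int32 == 1);
    s.as_bits = 0xFFFFFFFFu;    /* silently clobbers the field just set */
    return s.as_int32;          /* no longer 1 */
}
```

Nothing in the language connects `kind` to the union, so the compiler has no basis to object when the "wrong" member is written.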
Anyway...
what follows is a possible implementation....
First, let's define the syntactic structure of the variants (some of the following is heavily influenced by Ada variants)... there are two types of variants, tagged and untagged. That's very important because the syntax of one should be a clear superset of the syntax of the other. With that in mind...
1. Records may contain variant "case" parts anywhere in the body — not just at the end. Multiple variants per record allowed. Each variant has its own "end".
Anonymous variant — no tag field, no storage for selector:
```
case (ArmA, ArmB) of
  ArmA: ( ... );
  ArmB: ( ... );
end
```
Named variant — tag field stored in record:
```
case TagField : dword(ArmA, ArmB) of
  ArmA: ( ... );
  ArmB: ( ... );
end
```
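For readers who think in C, the storage difference between the two forms can be sketched like this. It is only an analogy under my own assumptions: the struct names and arm contents are invented, and the proposal itself says nothing about layout beyond "no storage for selector" versus "tag field stored in record".

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Anonymous (untagged) variant: the arms overlap and no selector is stored. */
struct Untagged {
    union {
        int32_t arm_a;
        uint8_t arm_b[4];
    };
};

/* Named (tagged) variant: the same arms plus a dword-sized tag in the record. */
struct Tagged {
    uint32_t tag_field;
    union {
        int32_t arm_a;
        uint8_t arm_b[4];
    };
};

/* Checks the layout claim: the tag is the only extra storage. */
static void check_layout(void)
{
    assert(sizeof(struct Tagged) == sizeof(struct Untagged) + sizeof(uint32_t));
}
```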
2. Tagged variant fields may only be written via write-mode "with". Reading uses read-mode "with" — programmer asserts correct tag value (this is a bit of a departure from Ada and, not necessarily a good one.) Untagged variant fields may be accessed directly via full path.
```
{ Write mode: sets tag, all fields must be assigned }
with r.FirstOption.NamedField := OtherOptionA do
begin
  AsInt32 := 1;
  AnotherField := true;
end
```
The important thing about the above "with" is that it assigns the tag field a value AND requires every field of the selected union/variant arm to be assigned a value. Since the scope is clearly identified, fields that don't belong to that scope cannot be referenced (one of the absurd problems in the C unnamed unions implementation.)
```
{ Read mode: programmer asserts tag value }
with r.FirstOption.NamedField.OtherOptionA do
  n := AsInt32;
```
When simply reading/referencing a field, for practical purposes there are no requirements: as long as the field is in scope, it can be "read".
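The closest thing I can write in today's C is a pair of helper functions, sketched below. Everything here is hypothetical (the `Rec` layout, the second arm, the function names); it only mimics the discipline by convention, whereas the proposed "with" would have the compiler enforce it. The assert in the reader is my own addition as a debug aid; the proposal's read mode leaves the tag check to the programmer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum NamedField { OTHER_OPTION_A, OTHER_OPTION_B };   /* hypothetical tag values */

struct Rec {
    uint32_t named_field;                             /* stored tag (dword) */
    union {
        struct { int32_t as_int32; bool another_field; } option_a;
        struct { double as_double; } option_b;        /* invented second arm */
    } arm;
};

/* Write mode: sets the tag and takes every field of the arm as a parameter,
   so the arm cannot be left partially initialized. */
static void write_option_a(struct Rec *r, int32_t as_int32, bool another_field)
{
    r->named_field = OTHER_OPTION_A;
    r->arm.option_a.as_int32 = as_int32;
    r->arm.option_a.another_field = another_field;
}

/* Read mode: the caller asserts which arm is live. The assert is an extra
   debug check that is NOT part of the proposal. */
static int32_t read_option_a_int32(const struct Rec *r)
{
    assert(r->named_field == OTHER_OPTION_A);
    return r->arm.option_a.as_int32;
}
```

In C nothing stops other code from writing `r->arm.option_b.as_double` directly, which is exactly the hole the write-mode "with" is meant to close.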
The important thing is that the "with" statement becomes a safety net (instead of opening the door to the problems that can happen when it is combined with some poor programming practices). Here are the rules that should govern the "with" statement:
(a.) the with list may contain record variables, enumeration type names, untagged variant paths, and tagged variant write-mode entries
(b.) A record variable brings its fields into unqualified scope (in other words, no longer needs a full reference since the "with record_var" sets the scope)
(c.) An enumeration type name brings its elements into unqualified scope (presuming scoped enums are in effect.)
(d.) An enumeration variable in "with" list is a semantic error — use the type name
(e.) with r.ArmName do — untagged variant arm path — brings arm fields into unqualified scope
(f.) with r.path.TagField.ArmName do — tagged variant read mode — brings arm fields into scope
(g.) with r.path.TagField := ArmName do — tagged variant write mode — sets tag and activates arm
(h.) In write mode all fields of the activated arm must be assigned in the "with" (this ensures there cannot be partially initialized variant/union.)
(i.) In write mode leaving any field unassigned is a semantic error
(j.) The tag field value may not be changed outside of write-mode `with`
(k.) In read mode the compiler does not verify the tag field value (the compiler relies on the programmer to ensure the fields referenced are consistent with the tag value already set. This isn't necessarily a good idea, but it does simplify the code generation and error handling... anyway, it's questionable...)
(l.) A variable may not appear more than once in same with list (I believe this has already been implemented in the "unleashed" version.)
(m.) A type name may not appear more than once in same "with" list. Same as above but explicitly for types.
(n.) If any two entries share a field or element name — semantic error (this prevents the most common complaint about the "with" statement, lack of "obviousness"/"uniqueness" of what field is getting set.)
(o.) Use separate nested "with" statements to resolve name ambiguity
(p.) Field and element names resolved at point of use within innermost active "with" scope
(q.) A name in scope from two or more active "with" scopes simultaneously — semantic error (the multiple "with" use must not itself create a situation where the field being set is not obvious/unique.)
(r.) Explicit qualified reference always valid inside "with" (helps clear possible ambiguities)
(s.) Explicit qualification takes precedence over unqualified resolution
(t.) "with" list may mix record variables and enumeration type names freely
(u.) Nested "with" statements resolve ambiguity for shared names
(v.) Each top-level record variable in a "with" list defines a scope chain — its nested field entries must immediately follow it as a contiguous group before any other top-level entry appears
(w.) Interleaving entries from different scope chains within the same "with" list is a semantic error
(x.) A scope chain entry is a field of the record or nested record most recently introduced by the preceding entry in the same chain
(y.) Each entry in a scope chain must be a valid field of the record type brought into scope by the immediately preceding entry — a field that does not belong to that scope is a semantic error
Example valid: "with r, inner1, inner2, inner3 do" — each entry descends from the previous
Example invalid: "with r, r2, inner1, r2inner1, inner2 do" — entries from r's chain and r2's chain are interleaved
(z.) Enumeration type names in a "with" list are not part of any record scope chain and may appear at any position without violating the contiguous grouping rule
(aa.) "with all variable do" requires every direct fixed field of the record variable to be assigned before the body exits (Note: this form requires a new keyword "all".)
(ab.) "with all" checks direct fields only — nested record fields are treated as a single unit; assigning the nested record as a whole satisfies the check
(ac.) To apply completeness checking to a nested record as well, use a separate "with all nestedfield do"
(ad.) "with all" on a record containing a variant is a semantic error — use write-mode "with" for variants instead
(ae.) FAM fields are excluded from the "with all" completeness check — the FAM field must be explicitly satisfied using "rec.fam := unassigned;" (Note: this requires a new keyword "unassigned", "unassigned" is an escape valve to be used with arrays.)
(af.) Array fields satisfy "with all" either by whole-array assignment or by "MyArray := unassigned;"
(ag.) If any field is assigned only inside a conditional statement the "with all" check is considered satisfied but the compiler issues a warning that one or more fields may be left uninitialised
(ah.) "with all" applies only to the immediately following variable — it does not propagate to scope chain entries
(ai.) "with all" on an enumeration type name is a semantic error — enumeration elements cannot be assigned
(aj.) A tagged variant write-mode entry ("variable := identifier") must be the sole entry in the "with" list — no other entries of any kind may accompany it
(ak.) A "with" list containing a tagged variant write-mode entry alongside any other entry is a semantic error
(al.) Tagged variant read-mode entries and untagged variant path entries are not subject to this isolation rule and may appear alongside other entries subject to the contiguous scope chain rules
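As an illustration of how cheap these checks are, rule (n) boils down to a pairwise name comparison over the "with" list. The sketch below is my own simplification, not compiler code: `WithEntry` and both function names are invented, and each entry is assumed to already carry the list of field or element names it brings into scope. Identifiers are compared case-insensitively, as Pascal does.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* One entry of a "with" list plus the names it brings into unqualified scope. */
struct WithEntry {
    const char *name;            /* e.g. a record variable or an enum type name */
    const char *const *names;    /* field or element names exported by the entry */
    size_t name_count;
};

/* Case-insensitive identifier comparison (Pascal identifiers ignore case). */
static int same_ident(const char *a, const char *b)
{
    while (*a && *b) {
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
        a++;
        b++;
    }
    return *a == *b;   /* equal only if both strings ended together */
}

/* Rule (n): returns the first name exported by two different entries,
   or NULL when every name in the list is unique. */
static const char *find_shared_name(const struct WithEntry *entries, size_t count)
{
    assert(entries != NULL || count == 0);
    for (size_t i = 0; i < count; i++)
        for (size_t j = i + 1; j < count; j++)
            for (size_t a = 0; a < entries[i].name_count; a++)
                for (size_t b = 0; b < entries[j].name_count; b++)
                    if (same_ident(entries[i].names[a], entries[j].names[b]))
                        return entries[i].names[a];
    return NULL;
}
```

A non-NULL result is what the compiler would report as the semantic error; a real implementation would presumably use its symbol tables rather than string arrays.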
Those are the semantic rules the parser should enforce to ensure the "with" is crystal clear, introduces no ambiguity as to which field is being referenced and, for variants, isolates the variant's fields, ensuring all of them are set and no "foreign" uninvited fields "crash the party."
Using those rules, the compiler can ensure consistency, uniqueness and completeness. That's what a compiler should do.
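The completeness part is not hard either. Here is a rough sketch of the bookkeeping for "with all" (rules aa and ag), assuming the compiler tracks one bit per direct field; the enum, the function name and the bitmask representation are all my assumptions. The same bookkeeping would presumably serve for write-mode "with" (rules h and i).

```c
#include <assert.h>
#include <stdint.h>

enum Verdict { COMPLETE, COMPLETE_WITH_WARNING, INCOMPLETE };

/* required : one bit per direct field that must be assigned
   always   : fields assigned on every path through the body
   sometimes: fields assigned only inside a conditional statement */
static enum Verdict check_completeness(uint32_t required, uint32_t always,
                                       uint32_t sometimes)
{
    /* Only fields that belong to the record can be assigned at all. */
    assert(((always | sometimes) & ~required) == 0);

    if (required & ~(always | sometimes))
        return INCOMPLETE;             /* some field is never assigned: error */
    if (required & ~always)
        return COMPLETE_WITH_WARNING;  /* rule (ag): may be left uninitialised */
    return COMPLETE;
}
```

With three fields (mask 0x7), assigning two unconditionally and the third only inside an "if" yields the warning case, while dropping the third assignment altogether yields the error.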
HTH.