Author Topic: FPC Unleashed (inline vars, statement expr, tuples, match, indexed/lazy labels)  (Read 40620 times)

flowCRANE

  • Hero Member
  • *****
  • Posts: 986
Another suggestion taken from my game's source code is to change the calling convention from winapi to nativecall.

The winapi calling convention in the context of the current Free Pascal is misleading because it is not Windows-specific at all—it defines the native calling convention for the current platform. Therefore, changing it to nativecall would resolve the confusion caused by the unfortunate name.

For the purposes of my project, I changed the name of this convention using a macro, but macros cause incorrect syntax highlighting and break code editor features such as code folding.
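
A minimal sketch of such a macro alias, assuming stock FPC 3.2.x where winapi is accepted as a calling convention modifier. The name nativecall is only a project-local alias, and AddInts is a made-up routine used to show the directive in place.

```pascal
program NativeCallMacro;

{$mode objfpc}
{$macro on}
// Assumption: project-local alias, not a real FPC keyword.
{$define nativecall:=winapi}

// AddInts is a hypothetical example routine.
function AddInts(a, b: Integer): Integer; nativecall;
begin
  Result := a + b;
end;

begin
  WriteLn(AddInts(2, 3));
end.
```

The scanner substitutes winapi before the parser runs, so the compiler never sees nativecall. An editor that does not expand macros sees an unknown identifier in that position, which is presumably where the highlighting and folding problems come from.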
« Last Edit: May 12, 2026, 08:44:34 pm by flowCRANE »
Lazarus 4.6 with FPC 3.2.2, Windows 11 — all 64-bit

Working solo on a top-down retro-style action/adventure game (pixel art), programming the engine from scratch, using Free Pascal and SDL3.

flowCRANE

One more suggestion. Since the Unleashed dialect is becoming a distinct dialect, I propose not calling it Free Pascal Unleashed, but rather Unleashed Pascal—short, concise, and clearly indicating its distinctiveness. It would also be a good idea to rename the compiler itself to UPC (Unleashed Pascal Compiler). But that can wait, since it’s not essential right now.

creaothceann

  • Sr. Member
  • ****
  • Posts: 375
Quote
[...] Better path: keep union as-is, and let it take an optional size cap. Three forms:

Code: Pascal
union size 1                  // this union must fit in 1 byte
  BitField: byte;
  bitpacked record
    // 8 single-bit flags - fits in 1 byte, OK
  end;
end;

union type byte               // shorthand for "size sizeof(byte)" = 1 byte
  BitField: byte;
  bitpacked record
    // ...
  end;
end;

union size sizeof(TSomeRec)   // arbitrary compile-time-constant expression
  BitField: byte;
  bitpacked record
    // ...
  end;
end;

I think the second form isn't really needed, SizeOf can already do that. (Which would effectively reduce the necessary compiler change to the first form.)


Quote
One more suggestion. Since the Unleashed dialect is becoming a distinct dialect, I propose not calling it Free Pascal Unleashed, but rather Unleashed Pascal—short, concise, and clearly indicating its distinctiveness. It would also be a good idea to rename the compiler itself to UPC (Unleashed Pascal Compiler). But that can wait, since it’s not essential right now.

It's still 99% based on FPC though, and theoretically upstream is compatible with the changes.

440bx

  • Hero Member
  • *****
  • Posts: 6530
Quote
If any variant of the union exceeds the cap, the compiler errors. Add a 9th bitsize 1 flag to a union size 1 block -> compile error: "union body 9 bits / 2 bytes exceeds declared size 1 byte".
I think that is a significant improvement.  It allows the compiler to do some safety checks and take appropriate action to help the programmer.




This bothers me:
Code: Pascal
union
  BitField: byte;
  bitpacked record
    ImageUsesLargePages:          boolean;
    IsProtectedProcess:           boolean;
    // ... 6 more, no bitsize 1 needed
  end;
end;
The reason it bothers me is that those are bit fields, and that could be made obvious by using bit counts instead of a type that doesn't always imply a single bit. (boolean does not _always_ imply a single bit; in fact, unless there is a "bitpacked" somewhere affecting it, it never does.) For a number of reasons, the construct should be:
Code: Pascal
union
  BitField: byte;
  record
    ImageUsesLargePages:          1;
    IsProtectedProcess:           1;
    // ... 6 more, no bitsize 1 needed
  end;
end;
Among the many reasons:
  • 1. Some fields may need a bit count other than 1, e.g., 3. Using boolean means there will be a mix of type names and bit counts. That's inconsistent, and for that reason alone it should not be allowed. It could easily be confused with definitions allowed in C that mean something completely different layout-wise.
  • 2. The more verbose Ada way may be discarded as too wordy, but at least Ada's wordiness isn't redundant: it is needed to describe everything the construct allows. The point is: why do I have to type "boolean" n times when I can just type the single character "1"? That is clearer than the combination of the two words "bitpacked" and "boolean", which requires me to notice the "bitpacked", because without it boolean is 8 bits.
  • 3. Using "bitpacked" requires defining a record for it to be bitpacked. It is not uncommon to see C definitions where a field of some ordinal type is followed by a few bit fields that are not in a struct of their own; they are just additional fields. The bitpacked approach forces the definition of a record where there isn't one originally, and there are times when this causes the definition to diverge from what was intended in the original C version (this is related to reason 1).

The whole "bitpacked" "boolean" thing is for the birds.  There should be a consistent, unique way of defining bit fields and, if the C way is going to be adopted then stick to bit counts, no "bitpacked", no boolean, just bit counts, clear simple, unambiguous and easy.  Not to mention parallel to what everyone who knows C expects.  IOW, consistent too.  My personal API definitions consist of over 600,000 lines of ported C .h files and every single definition that includes bit fields has been a hassle because of the "bitpacked" thing and the weird way FPC uses to define the fields (using ranges instead of bit counts, on that note, I should mention that Ada regrettably uses that method... among the very few things that should not be copied from Ada, that's one of them.)

One "nicety" that could be added to the compiler when referring to a 1 bit bit field is for the compiler to treat the single bit field as meaning "boolean", IOW, it would allow testing if ImageUsesLargePages then ... and ImageUsesLargePages := 1 or ImageUsesLargePages := TRUE (even though FPC does not really guarantee that TRUE = 1, which is a different problem.)  That would also make it consistent with the field's treatment in C.

I'm done with that thing.  As is probably obvious, it really bothers me.



Another thing I see as being a hidden can of large worms is:
Code: Pascal
TOuter = record
  embed TInner;     // anonymous embed - requires the embed keyword
  named: TInner;    // named subfield - standard Pascal, no keyword
end;
If the record is named then there is no problem, because the name establishes a scope where the fields exist. That's the current model and we all know that works.

The "embed TInner" or unnamed TInner is a huge potential source of problems.  Specifically, when unnamed there is no scope that resolves name conflicts.  Imagine this: in the first iteration of the program everything is hunky-dory, no name collisions. Then, in some maintenance cycle, some fields are added to some record which is later used unnamed in some other record, and now there are name conflicts. How are the conflicts removed? Are fields removed from the standalone records, or are fields removed/renamed in the record that contains them? ... Forget it, that's a headache I don't ever want to have.  Name the field groups, which means there is nothing to modify because that's what the language already does.

IMO, the potential for conflicts and the headaches caused in deciding how to resolve the conflicts make this a very undesirable "feature" (with features like that, who needs enemies ?)



Quote
align example - confirming behaviour, @440bx feedback wanted

For:

Code: Pascal
TRec = record
  a: byte align 8;
  b: byte align 8;
  c: byte;
end;

The layout:
  • a at offset 0 (already 8-aligned)
  • b at offset 8 (next 8-aligned boundary after a)
  • c at offset 9 (immediately after b, no special alignment)
  • sizeof(TRec) = 16 - padded up to a multiple of the largest alignment in the record (8). With packed record, no trailing pad -> sizeof = 10.
The trailing padding to 16 is the standard C-struct convention - it ensures arrays of TRec keep each element 8-aligned.

A couple more cases to confirm the rule:

Code: Pascal
TTest1 = record
  a: byte align 16;
  b: byte align 8;
  c: byte;
end;

Same memory layout as the first one - a at 0, b at 8, c at 9. Why? a's requested align 16 is already satisfied at offset 0, and b only needs align 8, so b still lands at offset 8. The record's overall alignment becomes 16 (the max field alignment), which means sizeof must be a multiple of 16. The last used byte is c at offset 9, so the data extends to byte 10. The smallest multiple of 16 that is >= 10 is 16, so sizeof = 16.

(Why the rounding? So arrays of TTest1 keep every element 16-aligned - if sizeof were 10, the second element would land at offset 10, which isn't 16-aligned.)

Code: Pascal
TTest2 = record
  a: byte align 16;
  b: byte align 32;
  c: byte;
end;

Layout:
  • a at offset 0
  • b at offset 32 (next 32-aligned boundary after a)
  • c at offset 33
  • sizeof = 64 - record's overall alignment is 32 (the max field alignment). The last used byte is c at offset 33, so data extends to byte 34. The smallest multiple of 32 that is >= 34 is 64.
@440bx - does this match what you'd expect? Before it's locked in, I want to confirm the convention.

Yes, it does.  Good stuff.

I would add one detail, which is, if the record is "packed" then no padding is done because it is packed.  This requires clarifying the behavior when packed records that use aligned fields are array elements.  In that case, I'd say the compiler should refuse to create an array of packed records whose fields are individually aligned, because there is no way to resolve the record size conflict: the array cannot use the packed size as its element stride, since that may conflict with the first field's required alignment.  Simple solution: don't allow arrays of packed records that have individually aligned fields.
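
The layout rule confirmed above, and the packed-array conflict, can be checked mechanically. The sketch below is plain FPC, not Unleashed syntax. It assumes every field is one byte wide, as in the three examples, and takes only the requested alignments. AlignUp, FieldOffset and RecordSize are hypothetical helper names.

```pascal
program AlignLayout;

{$mode objfpc}

// Round Value up to the next multiple of Alignment (Alignment >= 1).
function AlignUp(Value, Alignment: Integer): Integer;
begin
  Result := ((Value + Alignment - 1) div Alignment) * Alignment;
end;

// Offset of field number Index (0-based) when every field is one byte wide.
function FieldOffset(const Aligns: array of Integer; Index: Integer): Integer;
var
  i, Offset: Integer;
begin
  Offset := 0;
  for i := 0 to Index do
  begin
    Offset := AlignUp(Offset, Aligns[i]);
    if i < Index then
      Inc(Offset);  // step over the one-byte field
  end;
  Result := Offset;
end;

// Record size: end of the last field, rounded up to the largest field
// alignment unless the record is packed (no trailing padding).
function RecordSize(const Aligns: array of Integer; IsPacked: Boolean): Integer;
var
  i, MaxAlign: Integer;
begin
  MaxAlign := 1;
  for i := 0 to High(Aligns) do
    if Aligns[i] > MaxAlign then
      MaxAlign := Aligns[i];
  Result := FieldOffset(Aligns, High(Aligns)) + 1;
  if not IsPacked then
    Result := AlignUp(Result, MaxAlign);
end;

begin
  WriteLn('TRec   sizeof = ', RecordSize([8, 8, 1], False));
  WriteLn('TTest1 sizeof = ', RecordSize([16, 8, 1], False));
  WriteLn('TTest2 sizeof = ', RecordSize([16, 32, 1], False));
  WriteLn('TTest2.b offset = ', FieldOffset([16, 32, 1], 1));
  WriteLn('packed TRec sizeof = ', RecordSize([8, 8, 1], True));
end.
```

With the packed size of 10 as the array stride, the second element would start at offset 10, so its first field could not be 8-aligned. That is the conflict described in the paragraph above.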

My $0.05  (it used to be $0.02 but between tariffs and inflation, it's up to $0.05 and going... )

« Last Edit: May 12, 2026, 09:51:07 pm by 440bx »
FPC v3.2.2 and Lazarus v4.0rc3 on Windows 7 SP1 64bit.

Fibonacci

  • Hero Member
  • *****
  • Posts: 997
  • Behold, I bring salvation - FPC Unleashed
Quote
A separate question: when a record-level alignment is set and one field is given a different alignment, which takes precedence? (E.g., must a field's alignment always fit within the record's alignment, or can one field have a larger alignment that overrides it? This seems to slightly contradict the existing documentation.)

All of this will end up in the docs. The Unleashed documentation acts as an "override" on top of the stock FPC documentation - by default you read the FPC docs, and where Unleashed adds or changes a feature, the Unleashed docs describe it in detail (including any place where the behaviour differs from stock).



Quote
So I propose adding a new type of loop, designed specifically for infinite loops:

Code: Pascal
loop
  // iterating forever
end;

Going to pass on this one. while true do ... already does the job with a couple of extra tokens, and a dedicated loop ... end; construct earns its keep only if multiple users actually want it - right now it sounds like you'd be the main consumer. If others in the thread chime in saying they'd use it too, I'd reconsider.



Quote
Another suggestion taken from my game's source code is to change the calling convention from winapi to nativecall.

The winapi calling convention in the context of the current Free Pascal is misleading because it is not Windows-specific at all—it defines the native calling convention for the current platform. Therefore, changing it to nativecall would resolve the confusion caused by the unfortunate name.

For the purposes of my project, I changed the name of this convention using a macro, but macros cause incorrect syntax highlighting and break code editor features such as code folding.

A full rename would break a lot of existing code - every unit using winapi (Windows.pas, ShellAPI, GDI, OpenGL bindings, third-party libs, etc.) would stop compiling. Not worth the breakage.

Adding nativecall as an alias for winapi would be doable in theory, but I'd rather not. Reasons:
  • Two names for the same thing = style fragmentation. One file uses winapi, next file uses nativecall, same convention under the hood. Ends up in style debates and inconsistent codebases - exactly what Pascal's "one obvious way" philosophy tries to avoid.
  • nativecall is itself misleading. On Win32, winapi resolves to stdcall, but FPC's actual "native" / default convention on i386 is register. So nativecall would mean "Windows-native", not "platform-native" - the new name trades one mild confusion for another.
  • Stock FPC compatibility loss. Code using nativecall won't compile in stock FPC. Unleashed already adds plenty of one-way features (inline vars, union, embed, etc.) - I'd rather spend that "incompatibility budget" on actual new functionality, not on renaming an existing keyword.
  • Naming churn is expensive. Every keyword in the language is something every reader has to learn. Adding a second name for the same concept doubles the surface area without adding capability.
So: passing on this one too. But appreciate the suggestion - it's the kind of friction-point that's worth flagging even if the answer ends up being "no".

Quote
One more suggestion. Since the Unleashed dialect is becoming a distinct dialect, I propose not calling it Free Pascal Unleashed, but rather Unleashed Pascal—short, concise, and clearly indicating its distinctiveness. It would also be a good idea to rename the compiler itself to UPC (Unleashed Pascal Compiler). But that can wait, since it's not essential right now.

Same answer as the others - passing on this. Reasons:
  • It IS still 99% FPC. Unleashed is a fork that tracks upstream changes. The compiler core, RTL, code generation, target support - all FPC. Unleashed adds a language layer on top via {$mode unleashed}, plus a handful of compiler patches. Calling it a separate language overstates how separate it actually is.

  • Renaming the binary breaks every existing tooling integration. fpcupdeluxe configs, IDE settings, build scripts, CI / CD pipelines, Makefiles - all hardcoded around fpc / ppcx64 / etc. Switching to upc / ppcupc would break every user's setup the day they update. Massive cost for cosmetic gain.

  • Theoretical upstream-compat matters. As @creaothceann pointed out, the changes are theoretically upstream-compatible. Keeping the FPC name signals that intent - this is a fork that might one day merge features back, not a hostile breakup.

  • Brand recognition. "Free Pascal" carries decades of recognition. "FPC Unleashed" leverages that. "Unleashed Pascal" / "UPC" starts from zero and asks every newcomer "what's this thing?" instead of "what's the difference?".
For informal references in conversation, "Unleashed Pascal" or just "Unleashed" is fine - I use it that way myself. But the project name stays "FPC Unleashed" and the compiler binary stays fpc.

That said - I'm not rejecting the idea outright. Unleashed Pascal Compiler sounds good, even kind of proud ;) Just too early for that.



Quote
The whole "bitpacked" "boolean" thing is for the birds. There should be a consistent, unique way of defining bit fields and, if the C way is going to be adopted then stick to bit counts, no "bitpacked", no boolean, just bit counts, clear simple, unambiguous and easy.

Fair point that boolean bitsize 1 is verbose for what's conceptually just "one bit". But before going C-style, let me lay out what Pascal already offers - three types are natively 1-bit in a bitpacked context, no modifier needed:

Code: Pascal
bitpacked record
  a: boolean;      { 1 bit - boolean }
  b: 0..1;         { 1 bit - subrange }
  c: (k0, k1);     { 1 bit - 2-variant enum }
end;

Wider sub-byte fields work the same way - the compiler picks the natural width:
  • 0..3 or a 4-variant enum -> 2 bits
  • 0..7 or a 5-to-8-variant enum -> 3 bits
  • etc.
And bitsize N is the explicit override for cases where the natural width doesn't match - e.g. integer bitsize 3 or bytebool bitsize 1 (take an 8-bit bytebool and force it to 1 bit while keeping the bytebool type identity).
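
The natural widths listed above can be confirmed in stock FPC with BitSizeOf. A minimal sketch, assuming FPC 3.2.x; the record, type and field names are made up for illustration, and bitsize itself is not used because it is Unleashed-only.

```pascal
program BitWidths;

{$mode objfpc}

type
  TTwoState = (k0, k1);

  // Hypothetical record: each field should take its natural bit width.
  TBits = bitpacked record
    a: Boolean;     // expected: 1 bit
    b: 0..1;        // expected: 1 bit
    c: TTwoState;   // expected: 1 bit
    d: 0..3;        // expected: 2 bits
    e: 0..7;        // expected: 3 bits
  end;

  // The same kind of boolean field without bitpacked: one byte each.
  TPlain = record
    a, b: Boolean;
  end;

var
  r: TBits;
begin
  WriteLn('a: ', BitSizeOf(r.a), ' bit(s)');
  WriteLn('b: ', BitSizeOf(r.b), ' bit(s)');
  WriteLn('c: ', BitSizeOf(r.c), ' bit(s)');
  WriteLn('d: ', BitSizeOf(r.d), ' bit(s)');
  WriteLn('e: ', BitSizeOf(r.e), ' bit(s)');
  WriteLn('TBits:  ', SizeOf(TBits), ' byte(s)');
  WriteLn('TPlain: ', SizeOf(TPlain), ' byte(s)');
end.
```

The five fields add up to 8 bits, so TBits should fit in one byte, while the two plain booleans in TPlain take a byte each.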



Comparison - same 1-bit flag field, three styles:

Code: Pascal
flag: boolean bitsize 1;   { 24 chars - verbose, explicit }
flag: 0..1;                { 11 chars - Pascal subrange }
flag: 1;                   { 8 chars  - C-style proposal }

C-style saves 3 chars over subrange. The question is whether that's worth bringing in. My take: no, and here's why.

1. It breaks the "field : typename" idiom. After the colon Pascal expects a type. flag: 1 is a literal where a type should be - the whole "field = name, colon, type" intuition goes out the window. Subrange 0..1 is still a type (range type), so it doesn't break this rule.

2. Ambiguity at N >= 8. What does field: 8 mean? 8 bits unsigned (0..255)? 8 bits signed (-128..127)? In C this depends on the underlying unsigned int vs int declaration - Pascal has no ambient storage type to fall back on, so we'd have to pick one. If field: 8 means 0..255, it overlaps with field: byte - redundant. If signed, it surprises C porters.

3. Narrow useful range. For N in {8, 16, 32, 64} you already have named types (byte, word, longword, int64). C-style is only useful for N in {2..7} - a narrow band where a 0..2^N-1 subrange is just as readable.

4. Composability problem with bitsize. What does flag: 5 bitsize 1 mean? Contradiction - : 5 says 5 bits, bitsize 1 says 1. We'd have to ban the combination or pick a winner. More edge cases to document.

5. bitsize N has a different job. It's the explicit override for types whose natural width != the requested width. bytebool bitsize 1 says "take this 8-bit boolean type and pack it to 1 bit anyway". C-style : 1 can't express this - there's no type to apply an override to.



Where I think the actual fix is - leading with the cleaner pattern in docs.

You're right that boolean bitsize 1 is the worst example I could have led with - it makes the syntax look heavier than it needs to be. The cleaner form is just bitpacked on the inner record:

Code: Pascal
union size 1
  BitField: byte;
  bitpacked record
    ImageUsesLargePages, IsProtectedProcess,
    IsImageDynamicallyRelocated, SkipPatchingUser32Forwarders,
    IsPackagedProcess, IsAppContainer,
    IsProtectedProcessLight, IsLongPathAwareProcess: boolean;
  end;
end;

bitpacked forces each boolean to 1 bit automatically - no per-field annotation. For non-boolean small fields, subrange:

Code: Pascal
bitpacked record
  priority: 0..7;     { 3 bits }
  flags:    0..15;    { 4 bits }
end;

And bitsize N sits on top of all of this as the override when natural width doesn't match what you want. Most code won't need it.



Summary - three mechanisms cover the use cases:
  • boolean / subrange / enum inside a bitpacked record - natural bit width, no annotation
  • bitsize N on any type - explicit width override
  • union size N - container budget check
C-style : N would be a fourth mechanism that mostly duplicates subrange with a 3-char savings, while adding ambiguity, idiom-break and composability headaches. Trade isn't worth it.

Pascal is supposed to read like a book. flag: 0..1 reads as "flag, range 0 to 1" - any Pascal programmer understands it cold, no lookup required. flag: 1 reads as... what? Either you already know C's bit-field syntax, or you have to consult the docs and memorise a new convention. That's a tax on every future reader of the code, paid so the writer can save three characters. Wrong trade for a language that values readability over compression.



EDIT:

Quote
I think the second form isn't really needed, SizeOf can already do that. (Which would effectively reduce the necessary compiler change to the first form.)

You're right, union type T goes out. Wasn't implemented yet anyway - union size sizeof(T) covers the same case without a separate form.
« Last Edit: May 12, 2026, 11:38:26 pm by Fibonacci »
FPC Unleashed - inline vars, tuples, statement expressions, array equality, compound assignments, indexed/lazy labels, no-RTTI & more. ⭐ Star it on GitHub!

440bx

You're right that adding yet another method of specifying bit fields is less than ideal but, currently, the whole thing is a mess.  That's why another way, that is clean and simple is fully justified.

When a user sees "fieldname : 1;", a Pascal programmer who doesn't know C may not realize that it's a 1-bit bit field, but they'll figure it out very quickly and they'll never forget it.

The other reason "fieldname : 1;" should be allowed is because there are plenty of C definitions that include a mix of normal fields and bit fields.  For instance:
Code: C
  /* imagine we are in a struct */

  int the_int;

  BOOLEAN bitfield : 3;
  BOOLEAN anotherbitfield : 4;

  int64  bigint;

  unsigned int yetanotherbitfield : SOME_CONSTANT;
  /* sizeof counts bytes, so convert to bits before subtracting */
  unsigned int reserved : 8 * sizeof(unsigned int) - SOME_CONSTANT;

  unsigned char abytehere;
Representing that with current Pascal syntax is a pain in the neck.  Your proposal allows type names to intermix with bit counts, which you are forced to allow because there is no type for 3 bits unless you first define a range for it (another pain in the neck). That makes it very likely to confuse a bit field with a regular field, since both may be defined using a type name, instead of the bit field with a bit count and the normal field with a type.  In addition to that, there are a few C structs where the number of bits is given by a constant expression, usually to figure out how many bits the trailing "reserved" bit field should have. Using ranges to figure that out? ... Good luck.

As I said previously, the current Pascal method is for the birds.  There are so many problems with it it's not even worth trying to fix.

Lastly, since FPC provides bitsizeof, allowing a constant expression in the field size would do C one better since there would be no need to define a separate constant to get the field's bit size.
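
For reference, this is roughly what the range-based equivalent of a constant-expression bit count looks like in stock FPC today. A minimal sketch, assuming FPC 3.2.x and assuming BitSizeOf is accepted inside the constant expression the same way SizeOf is. SOME_CONSTANT, TPackedWord and the field names are hypothetical stand-ins for the C example.

```pascal
program ReservedBits;

{$mode objfpc}

const
  SOME_CONSTANT = 5;  // hypothetical bit count, stands in for the C macro

type
  // Each width has to be spelled as a range 0..2^N-1 instead of just N.
  TPackedWord = bitpacked record
    value:    0..(1 shl SOME_CONSTANT) - 1;
    reserved: 0..(1 shl (BitSizeOf(LongWord) - SOME_CONSTANT)) - 1;
  end;

var
  w: TPackedWord;
begin
  WriteLn('value:    ', BitSizeOf(w.value), ' bits');
  WriteLn('reserved: ', BitSizeOf(w.reserved), ' bits');
  WriteLn('total:    ', BitSizeOf(TPackedWord), ' bits');
end.
```

The bit count is still derived from a constant expression, but it has to be wrapped in a shift and a subtraction to become a range, which is the extra noise a direct bit count would avoid.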



On a different note, something I forgot to mention about alignment: when a record is packed, if packing causes some fields to no longer be aligned on their natural boundary, the compiler should emit a warning for the fields that are not properly aligned.  For some architectures, that warning could be a crucial piece of information.
« Last Edit: May 13, 2026, 12:17:36 am by 440bx »

Fibonacci

Ok, let's say we add another form, specifically for porting C structs to Pascal. Before that lands, I need to nail down the semantics - and there's a gap in what you've specified.

For 1-bit fields, you wrote:

Quote
the compiler to treat the single bit field as meaning 'boolean', IOW, it would allow testing if ImageUsesLargePages then ... and ImageUsesLargePages := 1 or ImageUsesLargePages := TRUE

Clear - 1-bit field is boolean-like. Good. But what about wider bit fields? Say a 3-bit field:

Code: Pascal
record
  priority: 3;     { 3 bits - but what type?? }
  counter:  5;     { 5 bits - same question }
end;

Concrete questions I need answered before this can be implemented:
  • What type is priority? byte? shortint? Some new 3-bit-int phantom type? Or context-dependent?
  • Signed or unsigned? priority: 3 - is the range 0..7 or -4..3?
  • Arithmetic behaviour: priority + 1 - what type does this produce? Promoted to integer? Stays 3-bit-wide and wraps?
  • Comparison: if priority > 5 then - does this compile? Against what type does the literal 5 resolve?
  • Assignment overflow: priority := 100 - error? Truncate to 3 bits (100 and $07 = 4)? Range check?
  • Pass to a function: Foo(priority) where Foo expects byte - implicit conversion? Compile error?
In C, the answer to most of these comes from the underlying type: unsigned int x : 3 is "3 bits of unsigned int" - the underlying type determines sign and range, and drives integer promotion in expressions (priority + 1 promotes the 3-bit field to int before adding). The bit count is just storage.
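
For comparison, stock FPC already gives definite answers to most of these questions when the field is declared with a subrange type instead of a bare bit count. A minimal sketch, assuming FPC 3.2.x; TFlags and Twice are hypothetical names.

```pascal
program SubrangeFieldSemantics;

{$mode objfpc}

type
  TFlags = bitpacked record
    priority: 0..7;    // 3 bits, unsigned because the range says so
    counter:  0..31;   // 5 bits
  end;

// Hypothetical consumer that expects a plain byte.
function Twice(b: Byte): Integer;
begin
  Result := b * 2;
end;

var
  r: TFlags;
begin
  r.priority := 7;
  r.counter := 0;
  WriteLn(r.priority + 1);     // promoted before adding: prints 8, no 3-bit wrap
  WriteLn(r.priority > 5);     // compiles as an ordinary integer comparison
  WriteLn(Twice(r.priority));  // passed by value, converted to Byte implicitly
  WriteLn(100 and $07);        // what truncating 100 to 3 bits would keep: 4
end.
```

With range checking enabled ({$R+}), assigning an out-of-range run-time value to priority raises a range check error instead of truncating. The open question for field: N is which of these behaviours it should inherit when there is no range or type to read them from.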

In your proposal, the underlying type is gone - just : 3. So either:

(a) You pick a default type (likely unsigned byte/word/longword based on N) - which means field: 8 = field: byte (redundant) and field: 16 = field: word (redundant for N>=8). And ambiguity-by-default for signed vs unsigned.

(b) You require a type after all, e.g. field: unsigned 3 / field: signed 3 / field: byte : 3 - which puts you back to two-keyword syntax, where C already has the answer (unsigned int field : 3) and you're just renaming it.

(c) You introduce a new bit-field family of types (bit3, bit4, etc.) - which is yet another type system addition.

Each of those has its own can of worms. Before field: N can be added, the type interpretation for N > 1 needs to be specified - what's your call?

For 1-bit it's simple because boolean is a natural fit. For 2..7 it's wide open.

creaothceann

Even for 1-bit fields I've had situations where using them as integers was the natural way (specifically, the field being used as an array index). But that can be easily solved via ord.
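
A minimal sketch of that Ord approach in stock FPC, with made-up names (TState, Enabled, StateNames):

```pascal
program FlagAsIndex;

{$mode objfpc}

type
  TState = bitpacked record
    Enabled: Boolean;   // 1 bit inside the bitpacked record
  end;

const
  // Hypothetical lookup table indexed by the flag's ordinal value.
  StateNames: array[0..1] of string = ('off', 'on');

var
  s: TState;
begin
  s.Enabled := True;
  // Ord maps False/True to 0/1, so the flag can be used as an index.
  WriteLn(StateNames[Ord(s.Enabled)]);
end.
```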


Quote
Another thing I see as being a hidden can of large worms is:
Code: Pascal
TOuter = record
  embed TInner;     // anonymous embed - requires the embed keyword
  named: TInner;    // named subfield - standard Pascal, no keyword
end;
If the record is named then there is no problem, because the name establishes a scope where the fields exist. That's the current model and we all know that works.

The "embed TInner" or unnamed TInner is a huge potential source of problems.  Specifically, when unnamed there is no scope that resolves name conflicts.  Imagine this: in the first iteration of the program everything is hunky-dory, no name collisions. Then, in some maintenance cycle, some fields are added to some record which is later used unnamed in some other record, and now there are name conflicts. How are the conflicts removed? Are fields removed from the standalone records, or are fields removed/renamed in the record that contains them? ... Forget it, that's a headache I don't ever want to have.  Name the field groups, which means there is nothing to modify because that's what the language already does.

I'd disagree with that - the "no scope" is the whole point of this change.

Yes, changing any part of a code base can affect other parts of the code base. The same issue occurs when the Free Pascal / Lazarus developers change the RTL or the language (e.g. added keywords), or when a dependency (library, package) changes. That's just part of the job. At least with the embed feature the compiler gives helpful errors. The issue could be fixed by creating a separate type for embedding, or manually copying the old version's fields over, or adjusting the code that uses the embedded type.

The alternative is freezing the old state and only allowing additions that are compatible.

440bx

@Fibonacci,

All of those questions are totally valid, that's why when I proposed the "container" the container had a type associated with it.  That answers all of your questions in one shot.

You're using "union" instead of "container", that's fine with me, I'm not set on "container", "union" is fine and you had suggested a few forms, one of them, IIRC, used a type.  I'd use that one.

One area where your proposal is more powerful is that if every field has its own type instead of just bit counts, then a 3 bit bit field (for example) can be interpreted as unsigned in one case and signed in another but, that would require defining a different range to distinguish each case.  More flexibility at the expense of more typing. No problem. OTOH, if you go that way, you also have to provide some mechanism to set new container boundaries. In C, they use a zero bit bit field to mark an explicit boundary when one is needed.

Honestly, there are lots of possible options but, I'd select the simplest one, which is to select the container type upfront (you can still do that using the keyword "union", it doesn't make any difference) then every field is treated as an instance of n bits of that type with corresponding valid arithmetic operations on them.  Functionally, that's an exact parallel to C, the only difference is that it takes less typing and enables the compiler to do some checks to help the programmer ensure there are no sizing mistakes.



@creaothceann,

Quote
I'd disagree with that - the "no scope" is the whole point of this change.
But that opens the door to all kinds of name conflicts in the future.  I'd stay away from something like that like the plague.

There is a big difference between changing something in the RTL, which can cause problems, and changing the name of a run-of-the-mill field and having that trivial change cause all kinds of problems.  Changes in the RTL are not what programmers normally do; adding fields and changing field names are bread and butter that programmers don't even think about and shouldn't have to think about.

That "feature" creates a naming house of cards and, while the compiler doesn't force anyone to use it, the moment I'd see that feature used in code, I'd instantly send the code to the trash can.  It's a maintenance nightmare, and I am not dealing with that gratuitous headache. No way!

Honestly, it strikes me as one of those things that initially looks like a great idea but, if you give it some real thought and/or use it, it becomes painfully obvious that it was a terrible idea.
« Last Edit: May 13, 2026, 01:54:25 am by 440bx »

flowCRANE

@Fibonacci—do you have a comprehensive draft of the new syntax for records, unions, named/unnamed fields, bit fields, and memory alignment for fields and entire structures? Some examples that summarize what you ultimately chose and plan to implement? I'm curious to see what syntax will ultimately be available in this dialect.

Fibonacci

Quote
@Fibonacci—do you have a comprehensive draft of the new syntax for records, unions, named/unnamed fields, bit fields, and memory alignment for fields and entire structures? Some examples that summarize what you ultimately chose and plan to implement? I'm curious to see what syntax will ultimately be available in this dialect.

I wish I could say no :D but actually I do. ~1000-line MD file with the full details. Problem is, it's full of inconsistencies right now and needs serious cleanup before I can publish it. The situation also keeps shifting - something changes every few hours because active development is happening literally as we speak, and anything I posted now would be outdated within hours. You'll have to wait a bit.



All of those questions are totally valid, that's why when I proposed the "container" the container had a type associated with it.  That answers all of your questions in one shot.

You're using "union" instead of "container", that's fine with me, I'm not set on "container", "union" is fine and you had suggested a few forms, one of them, IIRC, used a type.  I'd use that one.

The "type" form - welcome back! In a slightly modified shape. Proposal:

Code: Pascal
TPEB = packed record
  InheritedAddressSpace:    bytebool;
  ReadImageFileExecOptions: bytebool;
  BeingDebugged:            bytebool;

  union of Byte size 1   // <- the whole union must not exceed 1 byte, otherwise compile error
    BitField: Byte;
    bitpacked record     // <- instead of per-field size/bitsize
      ImageUsesLargePages:         1; // <- every field (this and below) is seen by the compiler as "Byte"
      IsProtectedProcess:          1;
      IsImageDynamicallyRelocated: 1;
      // ...
    end;
  end;

  // ...
end;

Any objections? Feedback?
FPC Unleashed - inline vars, tuples, statement expressions, array equality, compound assignments, indexed/lazy labels, no-RTTI & more. ⭐ Star it on GitHub!

440bx

  • Hero Member
  • *****
  • Posts: 6530
No objections but a question instead, in:
Code: Pascal
union of Byte size 1

Byte implies size 1, so why have both? Wouldn't one or the other be sufficient? Am I missing something?


Fibonacci

  • Hero Member
  • *****
  • Posts: 997
  • Behold, I bring salvation - FPC Unleashed
No objections but a question instead, in:
Code: Pascal
union of Byte size 1

Byte implies size 1, so why have both? Wouldn't one or the other be sufficient? Am I missing something?

Fair question - I had both there just to show some of the syntax in one place. Either alone would do:

Code: Pascal
union of Byte     // type Byte, size implied (1 byte)
  ...
end;

union size 1      // 1-byte cap, no implicit type
  ...
end;

The modifiers (size, bitsize, align, bitalign) are all optional - write them when you need the constraint, skip them when the natural defaults work.

440bx

  • Hero Member
  • *****
  • Posts: 6530


OK, now presuming that in the example you provided there is either one or the other but not both:
Code: Pascal
TPEB = packed record
  InheritedAddressSpace:    bytebool;
  ReadImageFileExecOptions: bytebool;
  BeingDebugged:            bytebool;

  union of Byte size 1   // <- ONE or the other but NOT both.
    BitField: Byte;
    bitpacked record     // <- instead of per-field size/bitsize
      ImageUsesLargePages:         1; // <- every field (this and below) is seen by the compiler as "Byte"
      IsProtectedProcess:          1;
      IsImageDynamicallyRelocated: 1;
      // ...
    end;

    { potentially other structs here, which unions allow }
  end;

  // ...
end;

Any objections? Feedback?

One major concern I have is that a union may hold several structures of different sizes. In the example above, for instance, there could be another struct that uses more than 1 byte, which is normal. Yet since your union says "union of byte", it cannot accommodate a structure larger than a single byte. Changing its declared size doesn't work either, because your comment states that every field in the bitpacked record effectively "inherits" its size from the union.  There is a conflict.

The logical solution I see is to remove the size from the "union" and put it in "bitpacked record of byte" instead, because that record is the bit field container, not the outer union.  Basically, the reworked definition would be:
Code: Pascal
TPEB = packed record
  InheritedAddressSpace:    bytebool;
  ReadImageFileExecOptions: bytebool;
  BeingDebugged:            bytebool;

  union   // no size here
    BitField: Byte;
    bitpacked record of byte    // <- the bitpacked record must not exceed 1 byte, otherwise compile error
      ImageUsesLargePages:         1; // <- every field (this and below) is seen by the compiler as "Byte"
      IsProtectedProcess:          1;
      IsImageDynamicallyRelocated: 1;
      // ...
    end;

    { potentially other structs here, which unions allow }
  end;

  // ...
end;
That way the union can have as many structs as needed, because the size of the bit fields is declared in the bit field container, not in the enclosing union.  One last comment: since the record is "record of byte", that should be read as an explicit declaration that the container holds bit fields, which makes "bitpacked" superfluous, though it can remain there as documentation.

Lastly, relocating the size to the record declaration makes it even clearer to the parser that it is that record and nothing else that is limited and "softly" required to be 1 byte.  Larger than 1 byte is an error; smaller is a note or maybe even a warning.
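
For comparison, the size concern can be seen in stock FPC 3.2.2 today, with a variant record standing in for the proposed union. This is only a minimal sketch, not the Unleashed syntax: the names TPebFlags, TDemo, Flags, Wide and Reserved are hypothetical, the bit fields are written as subranges because that is what the stock compiler accepts, and the variant part has to come last and have named members.

```pascal
program UnionSizes;
{$mode objfpc}

type
  // Bit field container: each subrange gets the minimal number of bits.
  TPebFlags = bitpacked record
    ImageUsesLargePages:         0..1;
    IsProtectedProcess:          0..1;
    IsImageDynamicallyRelocated: 0..1;
    Reserved:                    0..31; // hypothetical filler up to 8 bits
  end;

  // Stock FPC: the variant part must be the last part of the record.
  TDemo = packed record
    InheritedAddressSpace:    ByteBool;
    ReadImageFileExecOptions: ByteBool;
    BeingDebugged:            ByteBool;
    case Integer of
      0: (BitField: Byte);
      1: (Flags: TPebFlags);
      2: (Wide: LongWord); // hypothetical member larger than 1 byte
  end;

begin
  // The container and the enclosing "union" have independent sizes.
  WriteLn('SizeOf(TPebFlags) = ', SizeOf(TPebFlags));
  WriteLn('SizeOf(TDemo)     = ', SizeOf(TDemo));
end.
```

Here the bit field container stays at 1 byte while the variant part grows to fit its largest member, which is the separation argued for above: a size limit attached to the container says nothing about the other members of the union.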

HTH.

flowCRANE

  • Hero Member
  • *****
  • Posts: 986
Code: Pascal
TPEB = packed record
  InheritedAddressSpace:    bytebool;
  ReadImageFileExecOptions: bytebool;
  BeingDebugged:            bytebool;

  union of Byte size 1   // <- the whole union must not exceed 1 byte, otherwise compile error
    BitField: Byte;
    bitpacked record     // <- instead of per-field size/bitsize
      ImageUsesLargePages:         1; // <- every field (this and below) is seen by the compiler as "Byte"
      IsProtectedProcess:          1;
      IsImageDynamicallyRelocated: 1;
      // ...
    end;
  end;

  // ...
end;

Any objections? Feedback?

Yes: there is no data type specified for bit-fields, and this is a huge problem for me. This is consistent with neither the verbose syntax of Pascal nor even the syntax of C. In my opinion, it's a very bad idea to omit data types and rely on type inference in this case. Please keep the syntax as Pascal-ish as possible. 8)

I like the idea of record/union of n, where n is the size of the structure in bytes. But in my opinion, if we're going to specify the structure's size in bytes, it's better for the keyword to be as closely related to the size as possible; that's why I would prefer record/union size n. The same applies to the size of the bit field, i.e., the previous suggestion with the bitsize keyword:

Code: Pascal
TPEB = packed record
  InheritedAddressSpace:    ByteBool;
  ReadImageFileExecOptions: ByteBool;
  BeingDebugged:            ByteBool;

  union size 1 // or "union size Byte"
    BitField: Byte;
    bitpacked record
      ImageUsesLargePages:         Byte bitsize 1; // name, type and size in bits
      IsProtectedProcess:          Byte bitsize 1;
      IsImageDynamicallyRelocated: Byte bitsize 1;
      // ...
    end;
  end;

  // ...
end;

Such syntax is clear, verbose and more Pascal-ish. Here, there should be a clear distinction between size in bytes and size in bits, which means two separate keywords should be used: size and bitsize. This is necessary because the size of a structure or union in bits may not be divisible by 8:

Code: Pascal
type
  TStruct = bitpacked record
    {..}

    union bitsize 20 // 20 bits for this union
      {..}
    end;

    {..}
  end;
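
As a reference point, stock FPC already separates the two measures through SizeOf (bytes) and BitSizeOf (bits). A minimal sketch, assuming stock FPC 3.2.2 and a hypothetical TStruct whose field names Kind and Payload are made up for illustration:

```pascal
program BitSizes;
{$mode objfpc}

type
  TStruct = bitpacked record
    Kind:    0..15;     // needs 4 bits
    Payload: 0..$FFFFF; // needs 20 bits, not a whole number of bytes
  end;

var
  S: TStruct;

begin
  // BitSizeOf reports the field width in bits; SizeOf reports whole bytes.
  WriteLn('bits in Payload:  ', BitSizeOf(S.Payload));
  WriteLn('bytes in TStruct: ', SizeOf(TStruct));
end.
```

A 20-bit field cannot be described with a byte count alone, which is why a separate bitsize keyword seems hard to avoid.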
« Last Edit: May 13, 2026, 11:48:24 am by flowCRANE »

 
