Topic: FPC Unleashed (inline vars, statement expr, tuples, match, indexed/lazy labels)

Fibonacci

  • Hero Member
  • *****
  • Posts: 1000
  • Behold, I bring salvation - FPC Unleashed
Quote
Consider this simple example:

<cut>

after some time, could be months or years, someone decides to add a field to TREC_1, like this:

<cut>

now there is a naming conflict with SomeOtherField. Where is the conflict resolved, in the new TREC_1 or in TComposed ?  The potential for this problem grows as the number of unnamed embedded records grows in the composed record(s).

<cut>

Did that clarify the situation ?

No. I get what you're pointing at, but it only matters in one specific case: if you assume coders are stupid. Are they?

The error is:
Error: Duplicate identifier "SOMEOTHERFIELD" from composition - already present in the surrounding record
(screenshot attached)

That's verbose enough to tell exactly what's wrong, where it came from, and what to fix. No archaeology required, no scrolling through "hundreds of lines or different files" to find the conflict - the compiler hands the answer over on the first build after the change. The maintainer reads "duplicate identifier from composition" and knows immediately what to do.

Yes, the surface for this grows with each anonymous embed. But the surface for any failure mode grows with each new line of code. What matters is whether the failure is silent (bad) or loud (good). This one is loud, the message is informative, and the fix is local.

Where you see a "can of worms", a "house of cards" and a "poison pill", I see great potential. Real use cases - I actually can't wait until it's merged into main, so I can start building projects with it. This feature gives real "record-freedom" - not just the embed, but the feature as a whole. Freedom - that is what I see, and feel.



Quote
I really see no value in that feature but only problems.

I've wanted something like this since @Warfley presented his "Record Composition" - his implementation has the same exact thing, anonymous embed, just named differently. You don't see the value, but I (and I hope others) do. This feature was my little dream.

I've said enough about how "free" I feel having it - as a whole, not just embed on its own. It's a massive addition - it enables really cool structures, like ones with a common header/footer.



But I can see I won't convince you - oh well, I'm not going to keep fighting this. Your preference, fair enough. The feature isn't for you, you don't have to use it, you don't have to test it.

I hope someone else will. Maybe they'll find a bug, maybe suggest an improvement, maybe spot something I missed. As I wrote earlier - I expect bugs. Or maybe there are none. Or maybe they'll surface over time. Or I'll find them myself.

Let's end this little embed war here. Give it a few days, then I'll merge to main and we can move on to other changes/fixes. I hope you won't drop Unleashed over this - records still work exactly as before, this is just an addition. You don't have to use it.



Update on the installer timeline: I'll publish it tomorrow, not today. Too late in the day for that, and this discussion took too much of my energy :o
FPC Unleashed - inline vars, tuples, statement expressions, array equality, compound assignments, indexed/lazy labels, no-RTTI & more. ⭐ Star it on GitHub!

440bx

  • Hero Member
  • *****
  • Posts: 6532
Just to clarify a few things,

1.  I don't think coders are stupid, I just think they make mistakes, just as I do.  No difference.

2. We definitely have diametrically opposed opinions as to how valuable the feature is.  I went through all the trouble of explaining in detail because (a) you asked for it and (b) I thought you didn't realize the potential problems as a result of a simple mental lapse.  I realize now that you are fully aware of the problems yet still value it.  That's fine with me; I have no problem with that but, as you correctly pointed out, it definitely isn't for me.

3. I wasn't fighting.  I was doing my best, in good faith, to point out the problems I see because I thought they might have escaped you.  Sometimes I fail to see the obvious and I've seen that happen to other people too, not just me.  I thought this was one of those instances.  It was not a fight.

Keep up the good work, there is lots to appreciate in what you've added to the language so far.
FPC v3.2.2 and Lazarus v4.0rc3 on Windows 7 SP1 64bit.

Okoba

  • Hero Member
  • *****
  • Posts: 660
@Fibonacci starred  ;)

Fibonacci

New feature: Composable Records

Modeswitch composablerecords, on by default in {$mode unleashed}. The feat/composable-records branch has landed on main, both compiler and Lazarus IDE - rebuild Unleashed from latest and the feature is live.

Three things in one modeswitch:
  • Record composition - embed an existing record into another, anonymously (fields flatten into the outer scope, outer.field) or named (access through the carrier, outer.carrier.field). Inline anonymous record ... end; blocks inside other records work too, with the same flatten semantics.
  • union ... end; - C-style untagged memory overlap. Replaces case TYPE of 0: (...) 1: (...) end; in the common case where there is no real discriminator. Fields after the union are legal (no more "variant must be last").
  • Anonymous enums scoped to the record - inline kind: (kA, kB, kC); declarations put their constants into the record's namespace (TFoo.kA) instead of polluting the unit. Two records can declare the same enumerator names with no clash.
Together they make porting C structs (especially WinAPI / POSIX headers heavy with anonymous unions and anonymous nested structs) faithful 1:1, instead of forcing the legacy "duplicate fields across variant branches" or "give up one variant entirely" workarounds.



The basics

The Windows _PEB header - byte-aligned head fields plus a 1-byte bitfield union sharing the same storage:

Code: Pascal
type
  TPEBHead = record
    InheritedAddressSpace:    bytebool;
    ReadImageFileExecOptions: bytebool;
    BeingDebugged:            bytebool;
    union of Byte                          // default type = Byte
      BitField: Byte;
      bitpacked record                     // inherits Byte from outer scope
        ImageUsesLargePages:          1;
        IsProtectedProcess:           1;
        IsImageDynamicallyRelocated:  1;
        SkipPatchingUser32Forwarders: 1;
        IsPackagedProcess:            1;
        IsAppContainer:               1;
        IsProtectedProcessLight:      1;
        IsLongPathAwareProcess:       1;
      end;
    end;
  end;

SizeOf(TPEBHead) = 4: three 1-byte bytebools + 1-byte union (eight booleans share the byte with BitField). Layout matches the C _PEB header byte-for-byte.

Things to notice:
  • union of Byte - declares a 1-byte memory overlap. of T is shorthand: derives size SizeOf(T) and align AlignOf(T) automatically, plus sets T as the default field type for the C-style name: N; bitfield syntax inside. So union of Byte = "1-byte union, Byte default" - and that 1-byte cap is enforced: add a 9th : 1 to the inner record and the build breaks at the union declaration site with Union variants need 2 bytes but the union was declared "size 1".
  • bitpacked record inside the union - inherits Byte from the enclosing of Byte scope, so name: 1; becomes name: Byte bitsize 1. Eight booleans pack into 8 bits sharing the byte with BitField.
  • Fields after the union are legal. Stock Pascal case TYPE of must be the last thing in a record - the union here can sit anywhere.
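
For comparison, here is a rough approximation of the same head in stock FPC syntax. This is a sketch, not the Unleashed implementation: it assumes FPC 3.2 in objfpc mode, the names TPEBBits and TPEBHeadStock are made up for illustration, and the bit order inside a bitpacked record is compiler-defined, so only the sizes are checked here. It only works because the overlap happens to be the last member of the record.

```pascal
program PebHeadStock;
{$mode objfpc}

type
  // Eight 1-bit fields; 0..1 subranges take one bit each when bitpacked.
  TPEBBits = bitpacked record
    ImageUsesLargePages,
    IsProtectedProcess,
    IsImageDynamicallyRelocated,
    SkipPatchingUser32Forwarders,
    IsPackagedProcess,
    IsAppContainer,
    IsProtectedProcessLight,
    IsLongPathAwareProcess: 0..1;
  end;

  TPEBHeadStock = packed record
    InheritedAddressSpace:    ByteBool;
    ReadImageFileExecOptions: ByteBool;
    BeingDebugged:            ByteBool;
    // Stock variant part: must be the last thing in the record.
    case Integer of
      0: (BitField: Byte);
      1: (Bits: TPEBBits);     // extra qualifier needed: head.Bits.IsAppContainer
  end;

var
  head: TPEBHeadStock;
begin
  head.BitField := 0;
  head.Bits.IsProtectedProcess := 1;
  WriteLn('SizeOf(TPEBBits)      = ', SizeOf(TPEBBits));       // 1
  WriteLn('SizeOf(TPEBHeadStock) = ', SizeOf(TPEBHeadStock));  // 4
  WriteLn('BitField              = ', head.BitField);
end.
```

The stock form needs a separate named type for the bits and an extra `.Bits.` hop at every access, which is the part the inline bitpacked record removes.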


Why this and not the alternatives

Compare a faithful port of OVERLAPPED from the Windows SDK headers (minwinbase.h in current SDKs) - the C version has a union sitting in the middle:

Code: C
typedef struct _OVERLAPPED {
  ULONG_PTR Internal;
  ULONG_PTR InternalHigh;
  union {
    struct { DWORD Offset; DWORD OffsetHigh; };
    PVOID Pointer;
  };
  HANDLE hEvent;
} OVERLAPPED;

Stock FPC's RTL ports it like this:

Code: Pascal
OVERLAPPED = record
  Internal:     ULONG_PTR;
  InternalHigh: ULONG_PTR;
  Offset:       DWORD;
  OffsetHigh:   DWORD;
  hEvent:       HANDLE;
end;

The pointer variant is dropped entirely. WinAPI calls that use it need a manual PPointer(@ov.Offset)^ cast at the call site.

With Composable Records:

Code: Pascal
TOverlapped = record
  Internal:     ULONG_PTR;
  InternalHigh: ULONG_PTR;
  union
    record Offset, OffsetHigh: DWORD; end;     // variant 1 - flattens into outer
    pPointer: PVOID;                           // variant 2 - named field
  end;
  hEvent: THANDLE;
end;

Both variants accessible by name, layout identical to C, no fields duplicated, no fields dropped. Three nested constructs at work: the union block holds the memory overlap; the inline anonymous record variant flattens Offset and OffsetHigh into the outer scope; the single named field pPointer coexists alongside.
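
For contrast with the union block above, the closest stock-FPC form that keeps both variants is probably a named nested record with its own variant part. A minimal sketch, assuming FPC 3.2 in objfpc mode; TOverlappedStock, the carrier name u, and the PtrUInt stand-ins for ULONG_PTR and HANDLE are illustrative choices, not RTL declarations.

```pascal
program OverlappedStock;
{$mode objfpc}

type
  TOverlappedStock = record
    Internal:     PtrUInt;
    InternalHigh: PtrUInt;
    // A variant part must be last in its record, so the overlap is wrapped
    // in a nested record to let hEvent follow it.
    u: record
      case Integer of
        0: (Offset, OffsetHigh: DWord);
        1: (Ptr: Pointer);
    end;
    hEvent: PtrUInt;           // stand-in for a handle type
  end;

var
  ov: TOverlappedStock;
begin
  ov.u.Ptr := nil;
  ov.u.Offset := $1000;        // note the extra ".u." at every access
  WriteLn('Offset        = ', ov.u.Offset);
  WriteLn('SizeOf(ov.u)  = ', SizeOf(ov.u));
  WriteLn('hEvent offset = ', PtrUInt(@ov.hEvent) - PtrUInt(@ov));
end.
```

The layout can match C this way, but the field names no longer do: every access goes through the carrier, which is the cost the flattening variant avoids.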



Embedding an existing record

Pull fields from an existing record into the outer scope via embed:

Code: Pascal
type
  TPoint = record
    x, y: integer;
  end;

  TPixel = record
    embed TPoint;          // anonymous embed - flattens x, y into TPixel
    color: longword;
  end;

var
  p: TPixel;
begin
  p.x := 10;               // direct access via flatten
  p.y := 20;
  p.color := $ff0000;
end;

Methods, properties, and operators of the embedded type lift through the embed too - if TPoint has class operator + (const a, b: TPoint): TPoint;, then TPixel + TPixel compiles and returns TPoint (the operator binds to the embed slice). Three flavours of composition coexist: embed TName; (named type, flatten), inline anonymous record ... end; (no type name, flatten), and classic name: TType; (named subfield, no flatten - access through outer.name.field).

Anonymous enums stay in the record's scope

Inline kind: (kA, kB, kC); enumerators no longer leak into the unit scope - they live inside the record namespace, qualified as TRec.kA from outside, bare kA from inside a method or a with block:

Code: Pascal
type
  TFirst  = record kind: (kA, kB, kC); end;
  TSecond = record kind: (kA, kB, kC); end;     // same names, no clash

var
  a: TFirst;
  b: TSecond;
begin
  a.kind := TFirst.kA;
  b.kind := TSecond.kC;
  // a.kind := kA;                              // compile error - kA unqualified
end;

In stock FPC the same code makes kA, kB, kC top-level identifiers in the enclosing unit - two records cannot declare the same enumerator names without colliding. Composable Records keeps the unit symbol table clean and makes the discriminator pattern (see below) viable without manually-named enum types.
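
To show what "manually-named enum types" means in practice, here is a sketch of the nearest stock-FPC equivalent, assuming FPC 3.2 in objfpc mode with the {$scopedenums on} switch. The type names TFirstKind and TSecondKind are invented for the example.

```pascal
program ScopedEnumsStock;
{$mode objfpc}
{$scopedenums on}

type
  // Each enum needs its own named type; with scoped enums the
  // enumerators are reached through that type name.
  TFirstKind  = (kA, kB, kC);
  TSecondKind = (kA, kB, kC);   // same names, no clash under scopedenums

  TFirst  = record kind: TFirstKind;  end;
  TSecond = record kind: TSecondKind; end;

var
  a: TFirst;
  b: TSecond;
begin
  a.kind := TFirstKind.kA;
  b.kind := TSecondKind.kC;
  WriteLn(Ord(a.kind), ' ', Ord(b.kind));   // 0 2
end.
```

The names stay out of each other's way, but the qualifier is the enum type rather than the record, and each discriminator costs one extra type declaration.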



Layout control

Five modifiers attach right after the record / union keyword (pre-body) or after a single field's type (per-field):
  • of T (pre-body, union and bitpacked record) - default field type for the name: N; C-style bitfield shorthand. On union, also sets size and alignment from T.
  • size N / bitsize N (pre-body and per-field) - byte or bit size assertion + padding. Mutually exclusive. Catches accidental growth: a record outgrowing its declared size breaks the build at the declaration site.
  • align N / bitalign N (pre-body and per-field) - byte or bit alignment override. align N requires N to be a power of two; bypasses the platform's recordalignmax clamp, useful for cache-line padding.
Per-field examples:

Code: Pascal
type
  // explicit 64-byte alignment (cache line)
  TCache = record
    counter: int64 align 64;
    flag:    boolean;
  end;

  // explicit bit width override - wide type narrowed
  TBitfield = packed record
    flags: integer bitsize 3;     // 3 bits, declared type stays integer
    next:  byte;
  end;

Four compile-time intrinsics ride along: OffsetOf / BitOffsetOf for field positions (byte / bit), and AlignOf / BitAlignOf for alignment introspection. Both pairs accept Pascal-style Type.field and C-style Type, field separators (the two can be mixed within the same call). AlignOf / BitAlignOf also accept a bare type (AlignOf(integer) = 4). All are constant expressions and walk through compositions transparently.

Heap allocation gets four new helpers for fields with custom align N: GetMemAligned, AllocMemAligned, ReAllocMemAligned, FreeMemAligned (all in system, no uses clause needed). Default GetMem returns 16-byte-aligned pointers only, so cache-line-padded records (align 64) need the explicit variants.
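
To illustrate why a plain GetMem is not enough, here is a sketch of the classic over-allocate-and-round-up technique in stock FPC. It is not how the Unleashed helpers are implemented (that is not stated here); AllocAligned and FreeAligned are hypothetical names, and the sketch assumes the alignment is a power of two no smaller than SizeOf(Pointer).

```pascal
program AlignedAllocStock;
{$mode objfpc}

// Returns a block of "size" bytes whose address is a multiple of "alignment".
function AllocAligned(size, alignment: PtrUInt): Pointer;
var
  raw: Pointer;
  aligned: PtrUInt;
begin
  // Room for the payload, worst-case padding, and one saved pointer.
  raw := GetMem(size + alignment + SizeOf(Pointer));
  // Round up past the slot that will hold the original pointer.
  aligned := (PtrUInt(raw) + SizeOf(Pointer) + alignment - 1)
             and not (alignment - 1);
  // Remember the pointer GetMem returned, just below the aligned block.
  PPointer(aligned - SizeOf(Pointer))^ := raw;
  Result := Pointer(aligned);
end;

procedure FreeAligned(p: Pointer);
begin
  if p <> nil then
    FreeMem(PPointer(PtrUInt(p) - SizeOf(Pointer))^);
end;

var
  p: Pointer;
begin
  p := AllocAligned(100, 64);
  WriteLn('address mod 64 = ', PtrUInt(p) mod 64);   // 0
  FreeAligned(p);
end.
```

The bookkeeping is easy to get wrong (freeing the aligned pointer directly would corrupt the heap), which is a reasonable argument for having the aligned variants provided rather than hand-rolled.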

The intrinsics are constant expressions, so they work in {$if} directives for layout assertions against C headers:

Code: Pascal
{$if AlignOf(TCounter) <> 64} {$error TCounter not cache-aligned} {$endif}
{$if OffsetOf(TOverlapped, hEvent) <> 24} {$error OVERLAPPED layout drift} {$endif}
{$if SizeOf(TPEBHead) <> 4} {$error PEB head size mismatch} {$endif}

Compile-time guards on layout drift: the build fails the moment a struct outgrows its declared shape, instead of silently misbehaving at runtime against a C library or kernel API expecting an exact byte layout.
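
As a point of reference, stock FPC can express only part of this. The sketch below assumes FPC 3.2 in objfpc mode: SizeOf works inside {$if}, while field offsets have to be checked at run time with the nil-pointer idiom. TSample and PSample are made-up types for the example.

```pascal
program LayoutGuardStock;
{$mode objfpc}
{$assertions on}

type
  TSample = packed record
    tag:   Byte;       // offset 0
    value: LongWord;   // offset 1 (packed, no padding)
    crc:   Word;       // offset 5
  end;
  PSample = ^TSample;

// Compile-time size guard, available in stock FPC.
{$if SizeOf(TSample) <> 7}
  {$error TSample size drift}
{$endif}

begin
  // No OffsetOf in stock FPC: take the field address relative to a nil base.
  Assert(PtrUInt(@PSample(nil)^.value) = 1, 'value offset drift');
  Assert(PtrUInt(@PSample(nil)^.crc)   = 5, 'crc offset drift');
  WriteLn('layout ok');
end.
```

The offset checks here only fire when the program runs, which is the gap the OffsetOf and AlignOf intrinsics close.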



Tagged unions - the modern case TAG: TYPE of

The modern replacement for Pascal's tagged variant section: a record-scoped enum discriminator plus an untagged union. Two orthogonal constructs instead of one conflated form:

Code: Pascal
type
  TPacket = record
    kind: (kAudio, kVideo, kCtrl);                                  // discriminator - record-scoped enum
    union
      record codec, channels: byte; sample_rate: word; end;         // kAudio
      record codec_video: byte; width, height: word; end;           // kVideo
      ctrl: record cmd, status: word; end;                          // kCtrl
    end;
    crc: longword;
  end;

procedure Process(const p: TPacket);
begin
  case p.kind of
    TPacket.kAudio: WriteLn('audio codec=', p.codec);
    TPacket.kVideo: WriteLn('video ', p.width, 'x', p.height);
    TPacket.kCtrl:  WriteLn('ctrl cmd=', p.ctrl.cmd);
  end;
end;

The discriminator and the overlap are independent - swap one set of variants without touching the discriminator type, change the discriminator type without rewriting the union. The discriminator does not pollute the unit (kAudio lives in TPacket, a TFrame declared later in the same unit can have its own kAudio meaning a different thing).
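
For readers who have not used the legacy form recently, this is roughly what the same packet looks like with the stock tagged variant section. A sketch assuming FPC 3.2 in objfpc mode; TPacketKind, TPacketTagged and Describe are illustrative names, and crc has been moved up because the variant part has to come last.

```pascal
program TaggedVariantStock;
{$mode objfpc}

type
  TPacketKind = (pkAudio, pkVideo, pkCtrl);   // unit-level enum, hence the prefix

  TPacketTagged = record
    crc: LongWord;                            // has to precede the variant part
    case kind: TPacketKind of                 // tag and overlap declared together
      pkAudio: (codec, channels: Byte; sample_rate: Word);
      pkVideo: (codec_video: Byte; width, height: Word);
      pkCtrl:  (cmd, status: Word);
  end;

function Describe(const p: TPacketTagged): string;
begin
  Result := '';
  case p.kind of
    pkAudio: WriteStr(Result, 'audio codec=', p.codec);
    pkVideo: WriteStr(Result, 'video ', p.width, 'x', p.height);
    pkCtrl:  WriteStr(Result, 'ctrl cmd=', p.cmd);
  end;
end;

var
  p: TPacketTagged;
begin
  p := Default(TPacketTagged);
  p.kind := pkVideo;
  p.width := 640;
  p.height := 480;
  WriteLn(Describe(p));   // video 640x480
end.
```

Moving crc changes the byte layout relative to the original, so a wire format with a trailing checksum cannot be described this way without a nested carrier record.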



Use cases

Wherever the memory shape is dictated by something outside the language - hardware, protocols, file formats, OS APIs:
  • WinAPI / POSIX struct ports. SYSTEM_INFO, OVERLAPPED, TOKEN_PRIVILEGES, _PEB, LARGE_INTEGER - anything with anonymous unions or anonymous nested structs. Faithful 1:1 ports, no variants lost, no fields duplicated.
  • Hardware register maps. Memory-mapped I/O registers where bits within a word carry distinct meanings (enable, mode, status). union of DWORD with C-style bitfields models the register directly, instead of reg and $0F shl 4 arithmetic at every callsite.
  • Network protocol headers. TCP/UDP/IP/Ethernet/MQTT/WebSocket frames, all packed structures with mixed-width fields and bit flags. Declarative layout instead of helper chains.
  • File formats. BMP info headers (v3/v4/v5 share a core), ELF/PE/Mach-O headers, ZIP local file headers, custom container formats. Anonymous embed expresses shared cores once.
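
As a concrete picture of the mask-and-shift arithmetic mentioned in the register-map item, here is a small stock-FPC sketch. The register layout (a 4-bit mode field at bits 4..7) and the names MODE_SHIFT, MODE_MASK, GetMode and SetMode are invented for the example.

```pascal
program RegMaskStock;
{$mode objfpc}

const
  MODE_SHIFT = 4;      // hypothetical: mode field starts at bit 4
  MODE_MASK  = $0F;    // hypothetical: mode field is 4 bits wide

// Extract the mode field from a register value.
function GetMode(reg: LongWord): LongWord;
begin
  Result := (reg shr MODE_SHIFT) and MODE_MASK;
end;

// Return a copy of reg with the mode field replaced.
function SetMode(reg, mode: LongWord): LongWord;
begin
  Result := (reg and not (LongWord(MODE_MASK) shl MODE_SHIFT))
            or ((mode and MODE_MASK) shl MODE_SHIFT);
end;

begin
  WriteLn(GetMode($A5));               // 10
  WriteLn(GetMode(SetMode($A5, 3)));   // 3
end.
```

Every field needs a pair of helpers like this, and a wrong shift or mask compiles without complaint; a declared bitfield moves that bookkeeping into the type.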



Restrictions
  • Only record types can be embedded. class, legacy object, interface, helper types, primitives, arrays, and pointers are rejected at the embed site with Record type expected after "embed". Class/object embed would embed a pointer or VMT - rejected on purpose.
  • Anonymous embed produces a flat namespace. A field name from TInner colliding with an already-present name in the surrounding record (direct field, earlier embed, earlier flattened name) raises Duplicate identifier "X" from composition. Fix is local: rename one side, or switch to a named subfield (inner: TInner;) which keeps its members in a separate namespace.
  • Composition never upgrades visibility. A private field of an embedded record stays private through the embed; strict private is never reachable from outside the embedded type.
  • embed T; where T is a generic type parameter is rejected at the generic declaration site. Use a named subfield (item: T) instead. Lifting this is a future enhancement.
  • Modifier order on union / record: of T must come first when present, size and bitsize are mutually exclusive, align and bitalign are mutually exclusive, each modifier appears at most once.

Backward compatibility

Strictly additive - every existing record compiles unchanged:
  • Legacy case TYPE of (untagged and tagged variants) still works. union is the modern path, case kept for backward compat.
  • union, embed, and pad are contextual keywords - recognised only inside record bodies, and only when the next token is not : or ,. A field literally named union or embed keeps parsing as a regular field. jwawinuser.pas has union: record and still compiles unchanged.
  • Outside record bodies the new tokens stay plain identifiers - variables, parameters, methods named union or embed are fine.
  • The PPU format stays backward-compatible. Older PPUs without the composition section continue to load.


Full reference: https://github.com/fpc-unleashed/freepascal/blob/main/unleashed/docs/composable-records.md



@Okoba: Thanks

Thaddy

  • Hero Member
  • *****
  • Posts: 19268
  • Glad to be alive.
I don't see how this is much more useful than OpaqueData and family that I introduced well over 5 years ago just to provide for such structures coming from external sources.
https://www.freepascal.org/docs-html/rtl/system/popaquedata.html

This is already in 3.2.0

You can cast OpaqueData to any record structure you need.
(This differs from a pointercast because it is typed)

This needs thorough testing and I will do so.
« Last Edit: May 21, 2026, 02:54:21 pm by Thaddy »
objects are fine constructs. You can even initialize them with constructors.

Fibonacci

Quote
I don't see how this is much more useful than OpaqueData and family that I introduced well over 5 years ago just to provide for such structures coming from external sources.
https://www.freepascal.org/docs-html/rtl/system/popaquedata.html

This is already in 3.2.0

You can cast OpaqueData to any record structure you need.
(This differs from a pointercast because it is typed)

TOpaqueData is an empty record, and POpaqueData is just a pointer to it. Composable Records describes complex byte and bit layouts. If those look equivalent to you, there's nothing to discuss here.

Side note - I've never seen POpaqueData used in any actual code, ever. Maybe for a reason...

Thaddy

In the discussion on why this should be introduced as a separate compiler based option I gave many examples. These are public.
Mostly pass-throughs, anonymous, but also quite a few others.

Again, I will test it, it is not a critique, I just don't see it.
It was accepted - and without much discussion - for a reason.

Same use-case as discussed here....
« Last Edit: May 21, 2026, 03:07:38 pm by Thaddy »

cdbc

  • Hero Member
  • *****
  • Posts: 2816
    • http://www.cdbc.dk
Hi
I've used P/TOpaqueData often in connection with opaque handles from C libraries, which is often exactly what you get back when you 'xxx_init' a library and pass in to 'xxx_done/xxx_fini'.
Regards Benny
If it ain't broke, don't fix it ;)
PCLinuxOS(rolling release) 64bit -> KDE6/QT6 -> FPC Release -> Lazarus Release &  FPC Main -> Lazarus Main

440bx

It looks pretty good... one concern which is likely already solved, in the anonymous enumerations, there is no indication as to the enumeration size, I suppose whatever the setting of $Z applies but, that means the actual size must be "hunted down".

It would be nice if anonymous enumerations allowed a possibly optional cast to indicate their size, e.g, word(e_el_1, e_el_2, ... e_el_n);  that would inform the programmer, right then and there what their size is.  Quite nice to have that information available in the declaration.

In the tagged unions, does the compiler ensure there is a layout present for every enumeration member or is that left to the programmer ?  Also, does the Kind enumeration allow duplicate values and, if yes, how does that affect the branch identification and the number of possible branches ?

440bx

Feature request:

A way of causing the compiler to emit a message if some function or data type is used in the code.  Similar to what "deprecated" does but for general use.

Something along the lines of:

Code: Pascal
// proposed syntax sketch - emit() is not an existing directive
function VirtualAlloc2(all the parameters it takes) : its_result;
  emit(Note, 'this function requires Windows 10 or greater');

In the above example, a note stating that the function requires Windows 10 or greater is emitted but, the programmer is free to emit whatever he/she thinks is notable about using the function.

The first parameter could be the message severity which should parallel what FPC currently has, notes, hints, warnings, errors, followed by the text message itself.  if the severity is error then compilation should stop.
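
For reference, the closest thing stock FPC offers today is the deprecated hint directive with a message string, which reports the text wherever the symbol is referenced but always at warning level and always worded as a deprecation. A minimal sketch, assuming FPC 3.2 in objfpc mode; RequiresWin10 and its return value are made-up stand-ins for a real API wrapper.

```pascal
program DeprecatedNote;
{$mode objfpc}

// Hypothetical stand-in for an API wrapper; name and result are illustrative.
function RequiresWin10: Integer;
  deprecated 'this function requires Windows 10 or greater';
begin
  Result := 42;
end;

begin
  // The compiler reports the deprecation text at this reference.
  WriteLn(RequiresWin10);   // 42
end.
```

The proposal differs in that the severity would be chosen by the author and the message would not be framed as a deprecation.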



Fibonacci

Quote
It looks pretty good... one concern which is likely already solved, in the anonymous enumerations, there is no indication as to the enumeration size, I suppose whatever the setting of $Z applies but, that means the actual size must be "hunted down".

It would be nice if anonymous enumerations allowed a possibly optional cast to indicate their size, e.g, word(e_el_1, e_el_2, ... e_el_n);  that would inform the programmer, right then and there what their size is.  Quite nice to have that information available in the declaration.

Standard enum size. Default 4 bytes unsigned. Controlled by, as you said, {$Z}. Enum fields now accept "of <type>".

Code: Pascal
type
  TPacket = packed record
    kind: (kAudio, kVideo, kCtrl) of Word;
    b: byte;
  end;

begin
  writeln(SizeOf(TPacket));      // 3
  writeln(SizeOf(TPacket.kind)); // 2
  readln;
end.
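
For comparison, the stock way to get a 2-byte enum is the {$packenum} (or {$Z}) directive wrapped around a named type, which is the "hunting down" 440bx describes. A sketch assuming FPC 3.2 in objfpc mode; TKind and TPacketStock are illustrative names.

```pascal
program PackEnumStock;
{$mode objfpc}

{$packenum 2}
type
  TKind = (kAudio, kVideo, kCtrl);   // stored in 2 bytes because of the directive
{$packenum default}

  TPacketStock = packed record
    kind: TKind;
    b: Byte;
  end;

begin
  WriteLn(SizeOf(TKind));          // 2
  WriteLn(SizeOf(TPacketStock));   // 3
end.
```

The size is correct, but it lives in a directive that may sit far from the declaration, whereas "of Word" states it on the field itself.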

Quote
In the tagged unions, does the compiler ensure there is a layout present for every enumeration member or is that left to the programmer ?  Also, does the Kind enumeration allow duplicate values and, if yes, how does that affect the branch identification and the number of possible branches ?

The enum holds whatever values you want. No requirement that the rest of the record provides a matching (sub)type for each value - discriminator and union are independent.

And yes, enumerations allow duplicate values (same as stock FPC). "How does that affect the branch identification" - that's up to the programmer.



Quote
Feature request:

A way of causing the compiler to emit a message if some function or data type is used in the code.  Similar to what "deprecated" does but for general use.

Something along the lines of:

Code: Pascal
// proposed syntax sketch - emit() is not an existing directive
function VirtualAlloc2(all the parameters it takes) : its_result;
  emit(Note, 'this function requires Windows 10 or greater');

In the above example, a note stating that the function requires Windows 10 or greater is emitted but, the programmer is free to emit whatever he/she thinks is notable about using the function.

The first parameter could be the message severity which should parallel what FPC currently has, notes, hints, warnings, errors, followed by the text message itself.  If the severity is error then compilation should stop.

Design question - does emit() always fire when the function is referenced, even on a build that already targets Windows 10+? Without any conditional wrapping around it? Does everyone get the note on every call?
« Last Edit: May 23, 2026, 11:06:58 am by Fibonacci »

440bx

Thank you for the clarifications Fibonacci.

I particularly like the "of word" to size the enum.  I think that's elegant.

Yes, emit always fires.  The message should by its very nature be unconditional. Keep it simple.

As far as emitting the message on every call, it would be elegant if the message was emitted only once no matter how many times the API is called, since as you surmised, emitting every time could add up to a fair amount of message noise in some cases.  Last thing FPC needs is another "var may not have been initialized" torrent of useless, not to mention conceptually incorrect, messages.

Just as food for thought, other implementations could be "onuse" instead of "emit" and have the message importance, i.e, note, message, warning, error, be separate instead of in the parentheses.  Example:  emit note 'this function requires Win 10';

I mention that to make it clear that I am not set on a particular structure.  As long as there is a simple and, preferably elegant way, of outputting the message, I'm happy with it.

That makes me think, removing that extremely annoying message "var may not have been initialized" thing would be a major improvement.

Fibonacci

Unleashed Installer - available now

No point dragging this out any longer. I've tested it as much as I could on my own setup and it works end-to-end. Time to see how it behaves on other people's machines.

Download: https://github.com/fpc-unleashed/installer/releases

What it does

Sets up FPC Unleashed (compiler + Lazarus IDE) from the main branches in one go - no manual git clone / build dance, no fpcupdeluxe configuration steps. Lighter than fpcupdeluxe, focused on Unleashed only.
  • Host platforms: Win64 / Linux64 (x86_64)
  • Cross-compile targets: win32, win64, linux32, linux64, WASM

If something breaks

Two options:
Either works, I'll respond to both. The faster the bug reports come in, the faster the rough edges get smoothed out.

VisualLab

  • Hero Member
  • *****
  • Posts: 742
Quote
The fact that I prefer to stick to basic syntax and data types (e.g., no OOP, tuples, pattern matching) doesn't mean that this basic syntax suits me. And it doesn't suit me, for example, when it comes to unions, bit fields, arrays, etc. - these are basic data types whose syntax is overly complicated.

Regarding unions (variant records) and bit fields, I agree. Pascal has limitations in this regard. A solution to this issue would be useful. Although not necessarily by copying C syntax verbatim. I believe 440bx's comments on this matter are entirely valid and should be taken into account.

Quote
Besides, how can you recognize archaic or (its opposite) modern syntax? Because it is different from other languages?

Mainly because it hasn't changed in decades, it's clunky and deviates from the norm. If a particular syntax can be improved, it should be improved, and I'm glad that work is underway on these aspects (at least some of them).

You must be joking. The syntax of C and C++ has also remained unchanged for decades. Is it also clunky and deviates from the norm? If that's your opinion, I completely agree with you. And should the syntax of these languages be improved? If that's what you think, then I agree on that point too. Unfortunately, no one intends to correct it. On the contrary, C and C++ enthusiasts insist it's great (which is a lie).

Furthermore, what does it mean that Pascal's syntax deviates from the norm? So, from what? What constitutes a programming language norm? There is no standardization document that describes what the syntax of programming languages should look like. Because it would mean that such a document is a description of one specific language! Such a hypothesis (i.e., this "deviation from the norm") cannot be proven mathematically or experimentally (as in physics). This is purely a subjective feeling. The only thing that can be proven is whether, for a given group of people, a given syntax is:
  • easier to learn,
  • clearer to use,
  • more or less error-prone.
And there can be many such groups. And there are many, as evidenced by the number of programming languages in use.

Quote
When declaring arrays, you specify a number in square brackets, and when specifying the element index, you also specify a number in square brackets. This is error-prone.

Why exactly would that be error-prone? An array declaration is a separate and unambiguous context, so it’s impossible to confuse specifying the number of array elements with accessing a specific element. It is not without reason that other languages have a syntax completely different from Pascal’s, because it is simpler, shorter, and still unambiguous.

When I declare an array, I do so for a specific number of elements, and I almost always use zero-based indexing. And since the indexing is the same in 99.9% cases, requiring the index range to be specified in 99.9% of declared arrays is a waste of time and unnecessarily lengthens the code.

Other languages, meaning which ones? There are many syntaxes for declaring arrays, not just the "one right C" one that you apparently like - which is fine, because everyone has the right to prefer a certain syntax. That some languages (C++, Java, C#, etc.) adopted C-like array declaration syntax stemmed from a desire to attract people to the new language, not from any advantages of that syntax. Furthermore, what does it mean that the syntax is unambiguous? That the C syntax is unambiguous and the Pascal syntax isn't? How can this be proven? This is purely subjective. Pascal's array syntax is unambiguous. And the fact that an array declaration uses an index range rather than an element count is an advantage, one that C does not offer. An arbitrary index range allows for greater coding flexibility for certain algorithms when using arrays. I've used it many times and continue to use it. So, for me, declaring an index range isn't a waste of time! And there are many more such people.
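
A small illustration of the index-range point, as a sketch in stock FPC (objfpc mode); the array name histogram and its contents are invented for the example.

```pascal
program IndexRange;
{$mode objfpc}

var
  // The declaration states the valid indices, not just a count.
  histogram: array[-3..3] of Integer;
  i: Integer;
begin
  for i := Low(histogram) to High(histogram) do
    histogram[i] := i * i;

  WriteLn(Low(histogram), '..', High(histogram), ' -> ',
          Length(histogram), ' elements');                       // -3..3 -> 7 elements
  WriteLn(histogram[-3], ' ', histogram[0], ' ', histogram[3]);  // 9 0 9
end.
```

With a zero-based declaration the same code would need an offset of 3 added at every access.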

Quote
Secondly, the idea of "tinkering" with the counter of a "for" loop within that loop is absurd. These two ideas should not be implemented under any circumstances, as they will cause various problems in the future.

It is absurd to prohibit altering the iterator in for loops and force users to use other loops. Not only does this fail to prevent incorrect implementations (since exactly the same problems exist with while or repeat loops), but it also forces a longer implementation and annoys the user by treating them like an underdeveloped child.

Are you suggesting replacing the current Pascal "for" loop syntax with the C syntax? The Pascal version has automatic loop counter changing, while the C one is essentially quarter-automatic. This is a radically different approach to the for loop. I don't think it's a good idea to implement C language quirks into Pascal. I understand that someone prefers a different syntax. Fine. But then you could simply use the other language, since it has so many advantages. This is a simpler and more reliable solution.
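For readers who have not run into this, a minimal sketch of how a loop with a dynamically changing step is typically written in current Free Pascal. It assumes `{$mode objfpc}`, where, as far as I know, assigning to the control variable inside a `for` body is rejected by the compiler. The data and the "skip after zero" rule are invented purely for illustration.

```pascal
program WhileSkipSketch;
{$mode objfpc}

const
  // Hypothetical input: a zero means "also skip the element that follows".
  Data: array[0..6] of Integer = (5, 0, 9, 9, 2, 0, 7);

var
  i, Visited: Integer;
begin
  i := Low(Data);
  Visited := 0;
  // The step depends on the data, so the counter is managed by hand in a while loop.
  while i <= High(Data) do
  begin
    Inc(Visited);
    if Data[i] = 0 then
      Inc(i, 2)   // jump over the next element
    else
      Inc(i);
  end;
  WriteLn(Visited); // prints 5 (indices 0, 1, 3, 4, 5 are visited)
end.
```

Whether this is "longer" than a `for` loop with a modifiable counter is exactly what the two sides above disagree about; the sketch only shows what the status quo looks like.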

As for "longer syntax"—how much longer? A few characters? If it provides additional information for the compiler or the person reading the code, then that is an advantage.

A programming language is a tool that is meant to serve the user. Programming isn't kindergarten - either you know how to program or you don't, and if you don't, the language's features don't matter, because you won't be able to use them correctly anyway, regardless of what's allowed and what isn't. If you can't implement the algorithm using a for loop that alters its iterator, you won't be able to implement it using a while loop either.

Yes, I agree with the statement that "A programming language is a tool meant to serve the user." But to be truly useful, it can't be confusing, cumbersome, or have any quirks.

The statement, "Programming isn't kindergarten - either you can program or you can't..." is pure nonsense and demonstrates a lack of substantive arguments. Guido van Rossum was once criticized for the lack of encapsulation in Python's "classes." The response was, "But why would you want to hide that? We're all adults." Another objection concerned the lack of constants. The response was, "Just don't change it." This demonstrates that this man had insufficient programming experience (he's probably learned a thing or two since then). He did not understand that in the case of complex software, some solutions that seem strange, cumbersome, and redundant are not so at all. They simply allow you to offload some of the work involved in checking for errors in the code to the software. But software isn't a miracle worker or a wizard; there's no magic bullet, so the programmer must include certain information in the code. The less ambiguity there is in the code, the less likely errors become. The following statement: "Programming is not kindergarten - either you can program or you can't..." may indicate: (1) too little programming experience, or (2) a sloppy and reckless approach to programming (excessive convenience at the expense of code quality).

Quote
This idea is redundant, error-prone, and difficult to read compared to the current one.

Absolutely not, and that is precisely why most modern programming languages use exactly this syntax, whereas the one found in Pascal is redundant and rarely found anywhere. In contrast, the Pascal syntax is error-prone because it requires specifying the upper index, which makes it much easier to make a mistake by forgetting the -1 for 0-based indexing.

But which languages? All of them? This is definitely not the case. As I mentioned earlier, some of them simply copied the syntax from C, but not necessarily because of its usefulness.

As for the redundancy of Pascal syntax for array indexing – how can this be proven? Again, this is just a subjective opinion. Moreover, in C you can also forget about correcting indexes. It depends on the algorithm and what the code is processing.
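The "forgotten -1" both posts refer to can be shown in a few lines. This is only an illustrative sketch; `N` and the array names are hypothetical.

```pascal
program UpperBoundSketch;
{$mode objfpc}

const
  N = 4; // intended number of elements

var
  Exact: array[0..N - 1] of Integer;   // N elements, indices 0..3
  OneTooMany: array[0..N] of Integer;  // N + 1 elements: the "- 1" was forgotten
begin
  WriteLn(Length(Exact));      // prints 4
  WriteLn(Length(OneTooMany)); // prints 5
end.
```

The slip compiles without complaint, which is the point of the first post. The reply's point is that iterating with `Low` and `High`, or making a comparable off-by-one slip in C, makes this a question of habit rather than of syntax alone.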

Quote
A typical C-ism.

No, it's just a simplification of what is redundant. On the contrary, yours is a typical case of C-phobia (or even modern-phobia, as such syntax is commonly used in nearly all modern programming languages).

The "ad personam" argument is nonsensical. Neither of us knows the other. We have never met, and we likely never will. I suggest focusing on evaluating the solutions and opinions presented.

My critical opinion of the C language stems from its shortcomings and flaws. This is understandable, as people have different assessments of the suitability of tools for their work. Secondly, C syntax is not some "divine standard." It is just one of many. It has its flaws, which stem from the decisions of people working with computers in the early 1970s. They made certain design decisions without thinking about the future. Today, these decisions are romanticized as the "extraordinary insight and wisdom" of the designers of C (and C++) at the time. But this is an illusion (cognitive error) resulting from temporal distance.

How can we tell if C syntax is widely used in almost all modern programming languages? Is there a reliable and authoritative comparison published somewhere? So, definitely not all imperative languages. What about functional languages? What about declarative languages? I've already explained why some imperative languages have a syntax similar to C. Another factor could be that their designers only knew C or C++.

Quote
Another terrible idea. Especially the one with two or three variables in the loop header. If someone thinks they need this language construct, they should probably rethink their code, because something is flawed.

What are you talking about? You write as if you've never written a loop in your life where the iterator needs to be modified dynamically and on which the final number of iterations depends. Of course such algorithms exist (and I even gave an example of one I recently implemented), and of course it would be easier and simpler to implement it using a for loop rather than a while loop.

And the funniest part is that you can easily implement loops in C or C++ this way, dynamically modifying not only the for loop iterator but also its termination condition. This gives you full control over how the loop behaves and reduces the amount of code you need to write, and this is exactly what my proposal is about. But C/C++ is a tool that supports the programmer, while Pascal is an overprotective mom, unjustifiably forbidding her child from doing this and that so the poor thing doesn’t hurt itself. 8)

If C or C++ are more useful to you, why don't you just use them? It's easier because they already exist. And it's a completely rational approach.

Why port C or C++ syntax to Pascal? To create another oddity like C#, only much more niche? What's the point? Unless it's about divisions among Pascal users. Oh, that makes perfect sense :) But not for Pascal programmers :(

As for the statement: "Pascal is an overprotective mom, unjustifiably forbidding her child from doing this and that so the poor thing doesn’t hurt itself." - such a meaningless platitude could have been written by some fanatic teenager who is just learning to program. Roughly speaking, people who create software can be divided into two groups:
  • those who want to use various bizarre tricks or excessively shortened syntax (e.g., to impress others with their apparent wisdom),
  • those who want to create robust software so that in the future they won't have to constantly patch it due to bugs and vulnerabilities.
These two approaches favor different syntaxes in programming languages. There are more people in the first group. This applies not only to programmers. In every technical industry there are those who want to take shortcuts, or rather believe that it can be done. Those who want robustness are, unfortunately, fewer. And on top of that, there are greedy company managements who do not understand technical issues and put pressure on programmers. And this approach is spreading to the open source world.

440bx

  • Hero Member
  • *****
  • Posts: 6532
Feature request:

A C-like ternary operator with a clean syntax.

Example:
Code: Pascal
  x := iif a > b then a else b;
  y := iif p <> nil then p.value else 0;

Statement characteristics:

  • `iif condition then true_expr else false_expr` is a conditional expression that produces a value
  • `iif` is a reserved keyword
  • The condition must be a boolean expression — a non-boolean condition is a semantic error
  • The `else` branch is mandatory — omitting it is a syntax error; `iif` must always produce a value
  • Only the selected branch is evaluated — if the condition is true only `true_expr` is evaluated; if false only `false_expr` is evaluated
  • Both branches must produce assignment-compatible types — incompatible branch types are a semantic error
  • The result type of an `iif` expression follows the same implicit widening rules as binary expressions — if one branch is wider the result is the wider type
  • `true_expr` and `false_expr` extend as far right as possible — greedy parsing
  • Nested `iif` binds inner-first: the innermost `iif`'s `else` clause binds to the nearest `then`
  • The compiler emits a style note when `iif` expressions are nested beyond one level without parentheses: `consider using parentheses for clarity in nested iif expressions`
  • An `iif` expression is valid anywhere an expression is valid — assignment RHS, function argument, array index, record field initialiser, and so on
  • An `iif` expression is valid in a `const_expr` context only when the condition and both branches are constant expressions, in which case the compiler evaluates the `iif` at compile time; otherwise it is not valid there
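Since `iif` is only a proposal, the "only the selected branch is evaluated" rule can be illustrated today only by comparison. The sketch below, under that assumption, puts the hand-written `if` equivalent next to `IfThen` from the existing `Math` unit. `IfThen` is an ordinary function, so both of its arguments are evaluated before the call. `Counted` and `Calls` are hypothetical helpers added to make the evaluation count visible.

```pascal
program IifSketch;
{$mode objfpc}

uses
  Math; // provides IfThen(Boolean, Integer, Integer)

var
  Calls: Integer = 0;

// Hypothetical helper: returns its argument and counts how often it ran.
function Counted(Value: Integer): Integer;
begin
  Inc(Calls);
  Result := Value;
end;

var
  a, b, x: Integer;
begin
  a := 3;
  b := 7;

  // What the proposed  x := iif a > b then Counted(a) else Counted(b);
  // would mean, written with today's syntax: only one branch runs.
  if a > b then
    x := Counted(a)
  else
    x := Counted(b);
  WriteLn(x, ' ', Calls); // prints: 7 1

  // IfThen is a plain function call, so both arguments are evaluated first.
  Calls := 0;
  x := IfThen(a > b, Counted(a), Counted(b));
  WriteLn(x, ' ', Calls); // prints: 7 2
end.
```

The same difference is why the second example in the request, the `p <> nil` one, cannot be written safely with `IfThen`: the `p.value` argument would be evaluated even when `p` is `nil`.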
FPC v3.2.2 and Lazarus v4.0rc3 on Windows 7 SP1 64bit.

 
