Author Topic: Lazarus for Windows on aarch64 (ARM64) - Native Compiler  (Read 36203 times)

msintle

  • Sr. Member
  • ****
  • Posts: 358
Re: Lazarus for Windows on aarch64 (ARM64) - Native Compiler
« Reply #120 on: October 12, 2025, 03:34:10 pm »
Exceptions a bit working already.

That's the point. They work a bit. That's what I had originally tested when implementing them, but there are situations that I hadn't thought about and how I implemented exceptions back then simply is not compatible with that. So in quite some situations it will work but in others it will break in mysterious, puzzling ways. And as such it's simply not ready for prime time.

Anything that we could do financially to contribute towards a reimplementation, maybe from the ground up, that would work properly in all scenarios?

bbrx

  • New Member
  • *
  • Posts: 18
Re: Lazarus for Windows on aarch64 (ARM64) - Native Compiler
« Reply #121 on: October 15, 2025, 05:39:46 pm »
Has any progress been made in getting Lazarus to work natively on ARM64 (aarch64) on Windows?

This support exists for macOS and Linux, but is ironically absent on Windows, where Lazarus originally started.

Without native code, we can't do certain things, such as compiling shell extensions, and ARM64 is gaining traction on the Windows side.

Are there any development costs which need sponsoring to get over this hurdle?

Hello,

I am also interested in an ARM-based PC (XElite), but how does Lazarus perform with Prism (X86 emulation)? Does it allow existing applications to be compiled while waiting for native ARM64 support?

Thanks

DonAlfredo

  • Hero Member
  • *****
  • Posts: 1845
Re: Lazarus for Windows on aarch64 (ARM64) - Native Compiler
« Reply #122 on: October 16, 2025, 08:05:58 pm »
I did not succeed. Unfortunately.
FPC uses the stackpointer as framepointer on ARM64 and the Windows unwinder does not like this and I was unable to "invent" a workaround.
The FPC devs have invested a lot of time and effort to get things running on Win64/ARM64. It's too bad that it does not work [yet].

Fred vS

  • Hero Member
  • *****
  • Posts: 3716
    • StrumPract is the musicians best friend
Re: Lazarus for Windows on aarch64 (ARM64) - Native Compiler
« Reply #123 on: October 16, 2025, 09:04:51 pm »
FPC uses the stackpointer as framepointer on ARM64 and the Windows unwinder does not like this and I was unable to "invent" a workaround.

Sorry, but I tried: I asked my friend Gork what he thought about it. Here is his answer:  :-[

Quote
Exception Handling and Unwinder: Partial support for exceptions exists in the RTL (run-time library),
but it's unreliable in certain scenarios (e.g., try..finally blocks with exit statements, or more complex nesting).
The compiler generates incorrect or incomplete unwind information, preventing proper stack unwinding during exceptions.
Recent work (e.g., fixes to seh64.inc for data structure alignment with winnt.h) has improved stack unwinding,
but issues like linking errors with "_fin$" symbols (related to finally blocks) persist.
Bug #40203 on GitLab (https://gitlab.com/freepascal.org/fpc/source/-/issues/40203) tracks this with updates showing progress but no full resolution.
Frame Pointer Specifics: FPC on ARM64 omits the dedicated frame pointer (x29) by default for performance reasons,
similar to GCC's -fomit-frame-pointer.
This is fine on other platforms but breaks Windows' table-based unwinding without custom unwind codes.
The {$OPTIMIZATION FORCENOSTACKFRAME} directive exists to force omission,
but it's not used by default on ARM yet—however, the effective behavior is omission where possible.
Enabling a frame pointer isn't straightforward via options; it would require code changes or compiler modifications.

Manual Unwind Info: Adding custom .pdata/.xdata in assembly is theoretically possible (following Microsoft's spec for unwind codes, e.g., save_fplr for <x29, lr> pairs and alloc_m for stack allocation), but it's complex and not a simple "invention"—it would require patching FPC's code generator or RTL.
Emulation Fallback: Run x64-compiled FPC apps under Windows' Prism emulator on ARM64 hardware.
Performance is good for many cases, but it won't work if you need native ARM64 DLLs or drivers.
I use Lazarus 2.2.0 32/64 and FPC 3.2.2 32/64 on Debian 11 64 bit, Windows 10, Windows 7 32/64, Windows XP 32,  FreeBSD 64.
Widgetset: fpGUI, MSEgui, Win32, GTK2, Qt.

https://github.com/fredvs
https://gitlab.com/fredvs
https://codeberg.org/fredvs

DonAlfredo

  • Hero Member
  • *****
  • Posts: 1845
Re: Lazarus for Windows on aarch64 (ARM64) - Native Compiler
« Reply #124 on: October 16, 2025, 10:03:23 pm »
FPC does generate custom unwind codes. And they look good. They do a nice job of reversing the prologue. Checked with all kinds of tools. But nevertheless I have not succeeded.

PascalDragon

  • Hero Member
  • *****
  • Posts: 6195
  • Compiler Developer
Re: Lazarus for Windows on aarch64 (ARM64) - Native Compiler
« Reply #125 on: October 16, 2025, 10:30:23 pm »
FPC does generate custom unwind codes. And they look good. They do a nice job of reversing the prologue. Checked with all kinds of tools. But nevertheless I have not succeeded.

The problem is that the compiler needs to use the framepointer instead of the stackpointer, as you already noticed, but it really dislikes doing that. In addition, the exception handlers need to be properly extracted from the normal functions (instead of being partially copied, as they are now).

Fred vS

  • Hero Member
  • *****
  • Posts: 3716
    • StrumPract is the musicians best friend
Re: Lazarus for Windows on aarch64 (ARM64) - Native Compiler
« Reply #126 on: October 16, 2025, 10:38:00 pm »
FPC does generate custom unwind codes.

Answer of Gork for "FPC does generate custom unwind codes. What must be changed to be ok?"

The issue with FPC's unwind codes on Windows ARM64 stems from the fact that while the compiler does generate custom unwind information (via the .pdata and .xdata sections, as required by the Windows ABI), it currently produces incorrect or incomplete data that doesn't fully comply with Microsoft's specification for table-based exception unwinding. This leads to failures in stack unwinding during exceptions, preventing proper restoration of non-volatile registers (e.g., x19-x28, d8-d15, lr) and invocation of exception handlers. The core problem is exacerbated by FPC's default omission of a dedicated frame pointer (x29), relying instead on the stack pointer (sp) for frame access, a performance optimization that's valid on ARM64 but requires precise unchained unwind codes to work with the Windows unwinder.

Based on recent developer discussions (as of September 2025), the fixes involve a combination of RTL structure alignments, code generator patches, and ensuring the unwind codes accurately mirror the function's prolog and epilog in reverse order. Here's a breakdown of what's wrong and what needs to be changed:

Key Problems with Current Unwind Codes

Incorrect Unwind Information Generation: The compiler's AArch64 backend (e.g., in cgcpu.pas) doesn't fully account for Windows-specific requirements, leading to malformed .xdata. For example, unwind codes fail to properly describe stack allocations, register saves, and restorations, causing the unwinder to skip or corrupt the stack during partial unwinds (e.g., mid-prolog or mid-epilog). This is why simple non-exception code runs, but try-except or try-finally blocks crash or skip handlers.

Omission of Frame Pointer (x29): Without x29 (unchained mode), the unwind must rely solely on sp for recovery. However, current codes don't handle this correctly, especially for non-canonical prologs/epilogs (e.g., variable stack sizes or non-standard save orders). Packed unwind data in .pdata can't represent sp restores from x29, forcing full .xdata, but FPC's implementation often generates incompatible sequences.

Mismatched SEH Structures: Definitions in rtl/win64/seh64.inc deviate from Windows' winnt.h, affecting how the unwinder interprets context during unwinding. For instance:
  • KNONVOLATILE_CONTEXT_POINTERS.IntegerContext is sized incorrectly (should be array[0..11] for x19-x28 + fp + lr).
  • TDispatcherContext misses a Reserved: DWord field for ARM64.
  • RUNTIME_FUNCTION lacks proper flags and comments for ARM64-specific unwind data.

Missing Local Unwind Support: The AArch64 code generator lacks a Windows override for g_local_unwind, which is needed to integrate with SEH for local jumps (e.g., exit in try-finally). This prevents _FPC_local_unwind from being called with the correct sp and label reference.

Incomplete Handling for Exceptions: Even with partial RTL support (added around 2020-2021), the unwind codes don't ensure 1:1 mapping to prolog/epilog instructions, violating the spec's no-code-probing rule. This breaks mid-unwind scenarios, like interrupts or exceptions in partial frames.

What Needs to Be Changed in the Unwind Codes

To comply with the Windows ARM64 unwind spec, FPC's unwind codes must be updated in the code generator to produce accurate .xdata sequences. These codes are stored in reverse prolog order (epilogs mirror them) and must describe every stack-modifying operation precisely. Since FPC omits x29, focus on unchained formats (CR=00 or 01 in packed .pdata). Key changes include:

Align SEH Structures in RTL: Update seh64.inc to match winnt.h:
  • Set KNONVOLATILE_CONTEXT_POINTERS.IntegerContext to array[0..11] of PQWord (comment: "// X19..X28 + Fp + Lr").
  • Keep FloatingContext as array[0..7] of PM128A (comment: "// D8..D15").
  • For RUNTIME_FUNCTION.UnwindData, add ARM64 comment: "// or Flag(2) Len(11) RegF(3) RegI(4) H(1) CR(2) FSize(9)".
  • Add Reserved: DWord to TDispatcherContext for AARCH64, plus ControlPCIsUnwound: Byte and NonVolatileRegisters: PByte.

These ensure the unwinder can correctly restore context pointers during unwinding.

Implement Windows-Specific Local Unwind in Code Generator: Add an override to g_local_unwind in compiler/aarch64/cgcpu.pas for system_aarch64_win64:

Code: Pascal
procedure tcgaarch64.g_local_unwind(list: TAsmList; l: TAsmLabel);
var
  para1, para2: tcgpara;
  href: treference;
  pd: tprocdef;
begin
  { Only the Windows/AArch64 target takes the SEH path; everything else
    falls back to the inherited implementation. }
  if target_info.system<>system_aarch64_win64 then
  begin
    inherited g_local_unwind(list,l);
    exit;
  end;
  { Look up the RTL helper and reserve locations for its two parameters. }
  pd:=search_system_proc('_fpc_local_unwind');
  para1.init;
  para2.init;
  paramanager.getcgtempparaloc(list,pd,1,para1);
  paramanager.getcgtempparaloc(list,pd,2,para2);
  reference_reset_symbol(href,l,0,1,[]);
  { Parameter 1: current stack pointer; parameter 2: address of label l. }
  a_load_reg_cgpara(list,OS_ADDR,NR_STACK_POINTER_REG,para1);
  a_loadaddr_ref_cgpara(list,href,para2);
  paramanager.freecgpara(list,para2);
  paramanager.freecgpara(list,para1);
  g_call(list,'_FPC_local_unwind');
  para2.done;
  para1.done;
end;

This enables custom local unwinding by passing sp and a label to _FPC_local_unwind, fixing issues like skipped finally blocks. However, it may still crash post-unwind if broader codes aren't fixed—test iteratively.

Generate Correct Unwind Code Sequences:

For Stack Allocation: Use alloc_s (for <512 bytes: 000iiiii), alloc_m (<32K: 11000iii iiiiiiii), or alloc_l (<256M: 11100000 iiiiiiii iiiiiiii iiiiiiii). Match the prolog's sub sp,sp,#size and epilog's add sp,sp,#size.
For Register Saves (Unchained, No x29): Avoid set_fp (11100001) or add_fp (11100010 iiiiiiii). Instead:

Use save_regp/save_regp_x for integer pairs (x19-x28: 110010nn nniiiiii or 110011nn nniiiiii, pre-indexed for !).
save_reg/save_reg_x for singles (110100nn nniiiiii or 1101010n nnniiiii).
save_lrpair for lr pairs (1101011n nniiiiii).
save_fregp/save_fregp_x for FP pairs (d8-d15: 1101100n nniiiiii or 1101101n nniiiiii).
save_freg/save_freg_x for FP singles (1101110n nniiiiii or 11011110 nnniiiii).
Chain with save_next (11100110) for consecutive pairs (e.g., after save_regp to save four regs).
For flexible saves (e.g., non-standard offsets), use save_any_reg variants (11100111 followed by 16-bit payload for str/stp/ldr/ldp on x/d/q regs).

End Codes: Always end sequences with end (11100100) for complete unwind, or end_c (11100101) for chained scopes/fragments.
Other: Include nop (11100011) for padding/homing, pac_sign_lr (11111100) if pointer authentication is used.
Custom Codes for Non-Standard Cases: If FPC's sp-as-fp setup doesn't fit standard codes, use custom opcodes (11101xxx) like MSFT_OP_TRAP_FRAME (11101000) or MSFT_OP_CONTEXT (11101010) for bespoke stack handling in assembly routines.
Ensure codes are in big-endian byte order, packed tightly (up to 255 bytes in .xdata), and shared across prolog/epilogs. For large functions (>1M instructions), split into fragments with separate .pdata/.xdata (first has prolog; others may have epilogs or end_c).
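
As an illustration of the three allocation codes listed above, here is a minimal Free Pascal sketch that picks the encoding for a given stack size. It assumes the bit layouts quoted in this post (alloc_s 000iiiii, alloc_m 11000iii iiiiiiii, alloc_l 11100000 plus 24 bits), that the immediate counts 16-byte units, and that multi-byte codes are stored big-endian. EncodeStackAlloc is a hypothetical helper written for this example and is not part of FPC.

```pascal
program AllocCodes;
{$mode objfpc}{$H+}

uses
  SysUtils;

{ Hypothetical helper: returns, as a hex string, the unwind-code bytes that
  describe a stack allocation of Size bytes (Size must be a multiple of 16). }
function EncodeStackAlloc(Size: Integer): string;
var
  Units: Integer;
begin
  if (Size < 0) or ((Size mod 16) <> 0) then
    raise Exception.Create('size must be a non-negative multiple of 16');
  Units := Size div 16;
  if Size < 512 then
    // alloc_s: 000iiiii, one byte
    Result := IntToHex(Units, 2)
  else if Size < 32768 then
    // alloc_m: 11000iii iiiiiiii, two bytes, high bits first
    Result := IntToHex($C0 or (Units shr 8), 2) + ' ' +
              IntToHex(Units and $FF, 2)
  else if Size < 268435456 then
    // alloc_l: 11100000 followed by a 24-bit unit count, high byte first
    Result := 'E0 ' + IntToHex((Units shr 16) and $FF, 2) + ' ' +
              IntToHex((Units shr 8) and $FF, 2) + ' ' +
              IntToHex(Units and $FF, 2)
  else
    raise Exception.Create('too large for a single allocation code');
end;

begin
  WriteLn(EncodeStackAlloc(32));     // prints 02
  WriteLn(EncodeStackAlloc(2416));   // prints C0 97
  WriteLn(EncodeStackAlloc(65536));  // prints E0 00 10 00
end.
```

The size thresholds follow directly from the field widths: 5 bits of 16-byte units cover sizes below 512, 11 bits cover sizes below 32K, and 24 bits cover sizes below 256M.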

Enable Packed Unwind Where Possible:

For canonical unchained functions (no exceptions, standard save order, no x29), use packed .pdata (Flag=01/10/11) to compress data: 11-bit function length, 9-bit frame size, 2-bit CR (00: no lr save; 01: lr saved), 1-bit H (homing), 4-bit RegI (integer saves), 3-bit RegF (FP pairs). This avoids full .xdata but requires exact compliance; fall back to .xdata if non-canonical.
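
The packed fields named above can be pulled apart with plain shifts and masks. The sketch below assumes the field order Flag(2) Len(11) RegF(3) RegI(4) H(1) CR(2) FSize(9) counted from the least significant bit, as in the comment quoted earlier in this post, and that the function length is stored in 4-byte units and the frame size in 16-byte units, as in Microsoft's ARM64 exception handling documentation. TPackedUnwind and DecodePacked are hypothetical names, and the sample word is made up for the example.

```pascal
program PackedPdata;
{$mode objfpc}{$H+}

type
  { Hypothetical record holding the decoded fields of a packed .pdata word. }
  TPackedUnwind = record
    Flag: Byte;               // 2 bits
    FunctionLength: Cardinal; // in bytes (stored as a count of 4-byte units)
    RegF: Byte;               // 3 bits
    RegI: Byte;               // 4 bits
    H: Boolean;               // 1 bit, homing
    CR: Byte;                 // 2 bits
    FrameSize: Cardinal;      // in bytes (stored as a count of 16-byte units)
  end;

function DecodePacked(UnwindData: Cardinal): TPackedUnwind;
begin
  Result.Flag := UnwindData and $3;
  Result.FunctionLength := ((UnwindData shr 2) and $7FF) * 4;
  Result.RegF := (UnwindData shr 13) and $7;
  Result.RegI := (UnwindData shr 16) and $F;
  Result.H := ((UnwindData shr 20) and $1) <> 0;
  Result.CR := (UnwindData shr 21) and $3;
  Result.FrameSize := ((UnwindData shr 23) and $1FF) * 16;
end;

var
  Sample: Cardinal;
  Info: TPackedUnwind;
begin
  // Made-up word: Flag=1, 10 instructions, RegI=2, CR=1, frame of 2 units.
  Sample := 1 or (10 shl 2) or (2 shl 16) or (1 shl 21) or (2 shl 23);
  Info := DecodePacked(Sample);
  WriteLn('Flag=', Info.Flag, ' Len=', Info.FunctionLength,
          ' RegF=', Info.RegF, ' RegI=', Info.RegI, ' H=', Info.H,
          ' CR=', Info.CR, ' FrameSize=', Info.FrameSize);
end.
```

For the sample word this reports a function length of 40 bytes and a frame size of 32 bytes, with CR=1 (unchained, lr saved).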

Recommendations

Apply the patches from the forum (e.g., seh64.inc diff and g_local_unwind override) to FPC trunk and test with simple try-finally/except programs.
For full fix, the AArch64 code generator needs deeper changes to emit spec-compliant .xdata (track bug #40203 on GitLab). If you're contributing, focus on matching prolog/epilog 1:1 with unwind codes and handling unchained mode without x29.
Workaround: Avoid exceptions entirely for now, or force frame pointers via compiler directives if feasible (though this isn't currently straightforward on ARM64).
Monitor the Lazarus forum thread for updates—devs are actively patching, with bounties accelerating progress.


PascalDragon

  • Hero Member
  • *****
  • Posts: 6195
  • Compiler Developer
Re: Lazarus for Windows on aarch64 (ARM64) - Native Compiler
« Reply #127 on: October 16, 2025, 11:36:23 pm »
FPC does generate custom unwind codes.

Answer of Gork for "FPC does generate custom unwind codes. What must be changed to be ok?"

I've only read the first sentence and I already know that the stupid AI is wrong... ::) The unwind codes are ok, it's the generated code of the exception handlers that's wrong. Please don't spam AI responses if you're not able to verify them.

msintle

  • Sr. Member
  • ****
  • Posts: 358
Re: Lazarus for Windows on aarch64 (ARM64) - Native Compiler
« Reply #128 on: October 17, 2025, 01:55:03 am »
FPC does generate custom unwind codes. And they look good. They do a nice job of reversing the prologue. Checked with all kinds of tools. But nevertheless I have not succeeded.

You have already succeeded in galvanizing support for this effort.

Take a day or two off, and the answer will come to you (or to CoPilot when you query it at the right time - just when the AI random seed will be perfect to generate an accurate response!)

DonAlfredo

  • Hero Member
  • *****
  • Posts: 1845
Re: Lazarus for Windows on aarch64 (ARM64) - Native Compiler
« Reply #129 on: October 17, 2025, 06:23:33 am »
As with all AI answers: some things make (a bit of) sense, some are plain wrong. But I must say that this GROK answer makes (MUCH) more sense than anything I got from CodePilot.
I will give an example.
Quote
Add Reserved: DWord to TDispatcherContext for AARCH64, plus ControlPCIsUnwound: Byte and NonVolatileRegisters: PByte.
This is true, but does not help in getting things running. So good answer, but not helping in solving the current issue.
Quote
For example, unwind codes fail to properly describe stack allocations, register saves, and restorations
Completely wrong answer.
I will take the advice from msintle: a few days more distance and then perhaps a (re-)fresh start.

bbrx

  • New Member
  • *
  • Posts: 18
Re: Lazarus for Windows on aarch64 (ARM64) - Native Compiler
« Reply #130 on: October 17, 2025, 08:33:48 am »
I did not succeed. Unfortunately.
FPC uses the stackpointer as framepointer on ARM64 and the Windows unwinder does not like this and I was unable to "invent" a workaround.
The FPC devs have invested a lot of time and effort to get things running on Win64/ARM64. It's too bad that it does not work [yet].

ok, thanks for your feedback

Fred vS

  • Hero Member
  • *****
  • Posts: 3716
    • StrumPract is the musicians best friend
Re: Lazarus for Windows on aarch64 (ARM64) - Native Compiler
« Reply #131 on: October 17, 2025, 01:45:03 pm »
As with all AI answers: some things make (a bit of) sense, some are plain wrong. But I must say that this GROK answer makes (MUCH) more sense than anything I got from CodePilot.
I will give an example.
Quote
Add Reserved: DWord to TDispatcherContext for AARCH64, plus ControlPCIsUnwound: Byte and NonVolatileRegisters: PByte.
This is true, but does not help in getting things running. So good answer, but not helping in solving the current issue.
Quote
For example, unwind codes fail to properly describe stack allocations, register saves, and restorations
Completely wrong answer.

That's the price of the game: sometimes the AI is good, sometimes not.
And regarding Gork, you must perfectly understand and describe the question you're going to ask, and if he makes a mistake, show him what's wrong.
Unfortunately, I don't have the skills and knowledge of FPC/AArch64/Windows to fully appreciate his answers. I've provided them here in the hope that they may be useful to you.

msintle

  • Sr. Member
  • ****
  • Posts: 358
Re: Lazarus for Windows on aarch64 (ARM64) - Native Compiler
« Reply #132 on: October 17, 2025, 03:18:53 pm »
As with all AI answers: some things make (a bit of) sense, some are plain wrong. But I must say that this GROK answer makes (MUCH) more sense than anything I got from CodePilot.

I understand keeping CoPilot running is also a major investment of time and energy. Randomly every other day (or week) it asks for "reverification of identity", which involves supplying a new credit card (even if no charges are being made). A truly ridiculous product with a truly ridiculous customer experience that matches the uselessness and the shoddy quality of the "solution" on offer.

I will take the advice from msintle: a few days more distance and then perhaps a (re-)fresh start.

Always the best. We sometimes forget we are biological beings, and in fact, resting and recovering is far more important than we realize. The right click at the right time can even make a world of difference with AI - it's all about finding the right moment with the right random seed ;)

DonAlfredo

  • Hero Member
  • *****
  • Posts: 1845
Re: Lazarus for Windows on aarch64 (ARM64) - Native Compiler
« Reply #133 on: October 17, 2025, 03:49:18 pm »
I give you an example of how hard it can be to discuss with an AI.
I had a function with an array [0..600] of byte. So some stack allocation needed. In total 2416 bytes.
I asked CoPilot to analyze the .xdata. It came up with this.
Code: Pascal
XCode Word 2 = 0xE3E3E3E4 (signed decimal: -471604252) → little-endian bytes:
E4 E3 E3 E3
The lower 16 bits encode 0x012E → 302 decimal → stack allocation units.
Multiply by 8: 302 * 8 = 2416 bytes, exactly what sub sp, sp, #2416 does.
So it completely made up that in some wonderful universe far away 0xE3E4 is equal to 0x012E, just to fulfil the expectation of a stack of 2416 bytes. Utter nonsense. Utterly useless. And frustrating as well.

And when confronted, this was the answer.
Code: Pascal
The bytes E4 E3 E3 E3 are not literally 0x012E.
They encode the number of 8-byte stack units in the FPC unwind encoding format.
The unwinder decodes it according to the DWARF AArch64 frame instructions, extracting 0x12E as the stack allocation after decoding.

Again, utter nonsense.
« Last Edit: October 17, 2025, 03:55:25 pm by DonAlfredo »
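
For comparison, going by the opcode values in Microsoft's ARM64 exception handling documentation (the same ones listed earlier in this thread), E4 is end and E3 is nop, so the word E4 E3 E3 E3 reads as an end marker followed by padding and carries no stack size at all. A 2416-byte allocation would instead be expressed as alloc_m with 2416 / 16 = 151 (0x97) units, that is, the bytes C0 97. The minimal Free Pascal sketch below classifies both byte sequences. DescribeCode and Dump are hypothetical helpers, only a handful of opcodes are recognised, and the second sequence is constructed for illustration rather than taken from FPC's actual output.

```pascal
program UnwindBytes;
{$mode objfpc}{$H+}

uses
  SysUtils;

{ Names the unwind code starting at Codes[Idx] and returns its length in
  bytes. Only end, nop, alloc_s and alloc_m are handled in this sketch. }
function DescribeCode(const Codes: array of Byte; Idx: Integer;
  out Desc: string): Integer;
var
  B: Byte;
begin
  B := Codes[Idx];
  Result := 1;
  if B = $E4 then
    Desc := 'end'
  else if B = $E3 then
    Desc := 'nop'
  else if (B and $E0) = $00 then
    // alloc_s: 000iiiii, size = i * 16
    Desc := Format('alloc_s %d bytes', [Integer(B and $1F) * 16])
  else if ((B and $F8) = $C0) and (Idx < High(Codes)) then
  begin
    // alloc_m: 11000iii iiiiiiii, size = i * 16
    Result := 2;
    Desc := Format('alloc_m %d bytes',
      [((Integer(B and $07) shl 8) or Codes[Idx + 1]) * 16]);
  end
  else
    Desc := 'not handled in this sketch';
end;

procedure Dump(const Codes: array of Byte);
var
  I, Len: Integer;
  Desc: string;
begin
  I := 0;
  while I <= High(Codes) do
  begin
    Len := DescribeCode(Codes, I, Desc);
    WriteLn(IntToHex(Integer(Codes[I]), 2), ': ', Desc);
    Inc(I, Len);
  end;
end;

const
  Reported: array[0..3] of Byte = ($E4, $E3, $E3, $E3);  // bytes from the post
  Hypothetical: array[0..1] of Byte = ($C0, $97);        // constructed example

begin
  Dump(Reported);      // end, then three nop lines
  Dump(Hypothetical);  // C0: alloc_m 2416 bytes
end.
```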

Fred vS

  • Hero Member
  • *****
  • Posts: 3716
    • StrumPract is the musicians best friend
Re: Lazarus for Windows on aarch64 (ARM64) - Native Compiler
« Reply #134 on: October 17, 2025, 04:01:46 pm »
I asked CoPilot to analyze the .xdata.

I have not tried Copilot yet.
I have only tried ChatGPT, Gemini and Gork.
They all have serious deficiencies in Pascal. But it seems to me that Gork corrects himself better when you point out his mistakes.
In short, AI still has a lot of work to do before it's truly useful. It's a bit like translators: if you don't have a minimum knowledge of the target language, it's best not to trust/use the translation.
« Last Edit: October 17, 2025, 04:05:35 pm by Fred vS »