FPC does generate custom unwind codes.
Gork's answer to "FPC does generate custom unwind codes. What must be changed to be ok?"

The issue with FPC's unwind codes on Windows ARM64 stems from the fact that while the compiler does generate custom unwind information (via the .pdata and .xdata sections, as required by the Windows ABI), it currently produces incorrect or incomplete data that doesn't fully comply with Microsoft's specification for table-based exception unwinding. This leads to failures in stack unwinding during exceptions, preventing proper restoration of non-volatile registers (e.g., x19-x28, d8-d15, lr) and invocation of exception handlers. The core problem is exacerbated by FPC's default omission of a dedicated frame pointer (x29), relying instead on the stack pointer (sp) for frame access. This is a performance optimization that's valid on ARM64, but it requires precise unchained unwind codes to work with the Windows unwinder.

Based on recent developer discussions (as of September 2025), the fixes involve a combination of RTL structure alignments, code generator patches, and ensuring the unwind codes accurately mirror the function's prolog and epilog in reverse order. Here's a breakdown of what's wrong and what needs to be changed:

Key Problems with Current Unwind Codes

Incorrect Unwind Information Generation: The compiler's AArch64 backend (e.g., in cgcpu.pas) doesn't fully account for Windows-specific requirements, leading to malformed .xdata. For example, unwind codes fail to properly describe stack allocations, register saves, and restorations, causing the unwinder to skip or corrupt the stack during partial unwinds (e.g., mid-prolog or mid-epilog). This is why simple non-exception code runs, but try-except or try-finally blocks crash or skip handlers.
Omission of Frame Pointer (x29): Without x29 (unchained mode), the unwinder must rely solely on sp for recovery. However, the current codes don't handle this correctly, especially for non-canonical prologs/epilogs (e.g., variable stack sizes or non-standard save orders). Packed unwind data in .pdata can't represent sp restores from x29, which forces full .xdata, but FPC's implementation often generates incompatible sequences.
Mismatched SEH Structures: Definitions in rtl/win64/seh64.inc deviate from Windows' winnt.h, affecting how the unwinder interprets context during unwinding. For instance:
KNONVOLATILE_CONTEXT_POINTERS.IntegerContext is sized incorrectly (should be array[0..11] for x19-x28 + fp + lr).
TDispatcherContext misses a Reserved: DWord field for ARM64.
RUNTIME_FUNCTION lacks proper flags and comments for ARM64-specific unwind data.
Missing Local Unwind Support: The AArch64 code generator lacks a Windows override for g_local_unwind, which is needed to integrate with SEH for local jumps (e.g., exit in try-finally). This prevents _FPC_local_unwind from being called with the correct sp and label reference.
Incomplete Handling for Exceptions: Even with partial RTL support (added around 2020-2021), the unwind codes don't ensure 1:1 mapping to prolog/epilog instructions, violating the spec's no-code-probing rule. This breaks mid-unwind scenarios, like interrupts or exceptions in partial frames.

What Needs to Be Changed in the Unwind Codes

To comply with the Windows ARM64 unwind spec, FPC's unwind codes must be updated in the code generator to produce accurate .xdata sequences. These codes are stored in reverse prolog order (epilogs mirror them) and must describe every stack-modifying operation precisely. Since FPC omits x29, focus on unchained formats (CR=00 or 01 in packed .pdata). Key changes include:

Align SEH Structures in RTL:
Update seh64.inc to match winnt.h:
Set KNONVOLATILE_CONTEXT_POINTERS.IntegerContext to array[0..11] of PQWord (comment: "// X19..X28 + Fp + Lr").
Keep FloatingContext as array[0..7] of PM128A (comment: "// D8..D15").
For RUNTIME_FUNCTION.UnwindData, add ARM64 comment: "// or Flag(2) Len(11) RegF(3) RegI(4) H(1) CR(2) FSize(9)".
Add Reserved: DWord to TDispatcherContext for AARCH64, plus ControlPCIsUnwound: Byte and NonVolatileRegisters: PByte.
These ensure the unwinder can correctly restore context pointers during unwinding.

Implement Windows-Specific Local Unwind in Code Generator:
Add an override to g_local_unwind in compiler/aarch64/cgcpu.pas for system_aarch64_win64:
procedure tcgaarch64.g_local_unwind(list: TAsmList; l: TAsmLabel);
  var
    para1, para2: tcgpara;
    href: treference;
    pd: tprocdef;
  begin
    { Only the Windows AArch64 target uses SEH-based local unwinding }
    if target_info.system<>system_aarch64_win64 then
      begin
        inherited g_local_unwind(list,l);
        exit;
      end;
    pd:=search_system_proc('_fpc_local_unwind');
    para1.init;
    para2.init;
    paramanager.getcgtempparaloc(list,pd,1,para1);
    paramanager.getcgtempparaloc(list,pd,2,para2);
    reference_reset_symbol(href,l,0,1,[]);
    { Parameter 1: current stack pointer; parameter 2: address of the target label }
    a_load_reg_cgpara(list,OS_ADDR,NR_STACK_POINTER_REG,para1);
    a_loadaddr_ref_cgpara(list,href,para2);
    paramanager.freecgpara(list,para2);
    paramanager.freecgpara(list,para1);
    g_call(list,'_FPC_local_unwind');
    para2.done;
    para1.done;
  end;
This enables custom local unwinding by passing sp and a label to _FPC_local_unwind, fixing issues like skipped finally blocks. However, programs may still crash after the unwind as long as the remaining unwind codes aren't fixed, so test iteratively.

Generate Correct Unwind Code Sequences:
For Stack Allocation: Use alloc_s (for <512 bytes: 000iiiii), alloc_m (<32K: 11000iii iiiiiiii), or alloc_l (<256M: 11100000 iiiiiiii iiiiiiii iiiiiiii). Match the prolog's sub sp,sp,#size and epilog's add sp,sp,#size.
For Register Saves (Unchained, No x29): Avoid set_fp (11100001) or add_fp (11100010 iiiiiiii). Instead:
Use save_regp/save_regp_x for integer pairs (x19-x28: 110010nn nniiiiii or 110011nn nniiiiii; the _x variant describes a pre-indexed store, written with ! in assembly).
save_reg/save_reg_x for singles (110100nn nniiiiii or 1101010n nnniiiii).
save_lrpair for a register paired with lr (1101011n nniiiiii).
save_fregp/save_fregp_x for FP pairs (d8-d15: 1101100n nniiiiii or 1101101n nniiiiii).
save_freg/save_freg_x for FP singles (1101110n nniiiiii or 11011110 nnniiiii).
Chain with save_next (11100110) for consecutive pairs (e.g., after save_regp to save four regs).
For flexible saves (e.g., non-standard offsets), use save_any_reg variants (11100111 followed by 16-bit payload for str/stp/ldr/ldp on x/d/q regs).
End Codes: Always end sequences with end (11100100) for complete unwind, or end_c (11100101) for chained scopes/fragments.
Other: Include nop (11100011) for padding/homing, pac_sign_lr (11111100) if pointer authentication is used.
Custom Codes for Non-Standard Cases: If FPC's sp-as-fp setup doesn't fit standard codes, use custom opcodes (11101xxx) like MSFT_OP_TRAP_FRAME (11101000) or MSFT_OP_CONTEXT (11101010) for bespoke stack handling in assembly routines.
Ensure multi-byte codes are stored most significant byte first, packed tightly (up to 255 32-bit code words in .xdata when the extended header is used), and shared across prolog/epilogs. For large functions (more than 1 MB of code), split into fragments with separate .pdata/.xdata (first has prolog; others may have epilogs or end_c).
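
To make the bit patterns above concrete, here is a minimal Python sketch that encodes a stack allocation and two register-pair saves, then lists them in unwind order. It is an illustration only, not FPC code: the helper names (alloc, save_regp, unwind_codes) and the example prolog are hypothetical, and only the alloc_s/alloc_m/alloc_l, save_regp and end patterns listed above are used.

```python
END = 0xE4  # "end" unwind code (11100100)


def alloc(size):
    """Unwind code for `sub sp, sp, #size` (size in bytes, multiple of 16)."""
    if size <= 0 or size % 16:
        raise ValueError("stack size must be a positive multiple of 16")
    units = size // 16
    if units < 1 << 5:    # alloc_s: 000iiiii
        return bytes([units])
    if units < 1 << 11:   # alloc_m: 11000iii iiiiiiii
        return bytes([0xC0 | (units >> 8), units & 0xFF])
    if units < 1 << 24:   # alloc_l: 11100000 followed by 24 bits
        return bytes([0xE0]) + units.to_bytes(3, "big")
    raise ValueError("stack size too large for alloc_l")


def save_regp(first_reg, offset):
    """Unwind code for `stp x<first_reg>, x<first_reg+1>, [sp, #offset]`."""
    if not 19 <= first_reg <= 27:
        raise ValueError("first register of the pair must be x19..x27")
    if offset % 8 or not 0 <= offset <= 504:
        raise ValueError("offset must be a multiple of 8 in 0..504")
    # save_regp: 110010nn nniiiiii (n = register - 19, i = offset / 8)
    word = 0xC800 | ((first_reg - 19) << 6) | (offset // 8)
    return word.to_bytes(2, "big")  # most significant byte first


def unwind_codes(prolog_ops):
    """Take codes in prolog order; return them in unwind order plus `end`."""
    return b"".join(reversed(prolog_ops)) + bytes([END])


if __name__ == "__main__":
    # Hypothetical unchained prolog:
    #   sub sp, sp, #64
    #   stp x19, x20, [sp, #16]
    #   stp x21, x22, [sp, #32]
    prolog = [alloc(64), save_regp(19, 16), save_regp(21, 32)]
    print(unwind_codes(prolog).hex())
```

The last prolog instruction comes first in the emitted sequence, which is what "reverse prolog order" means in practice. A real emitter would also need the .xdata header and padding to a whole number of 32-bit words, which this sketch leaves out.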

Enable Packed Unwind Where Possible:
For canonical unchained functions (no exception handler, standard save order, no x29), use packed .pdata (Flag=01 or 10; 11 is reserved) to compress data: 11-bit function length, 9-bit frame size, 2-bit CR (00: no lr save; 01: lr saved), 1-bit H (homing), 4-bit RegI (integer saves), 3-bit RegF (FP saves). This avoids full .xdata but requires exact compliance, so fall back to .xdata if the function is non-canonical.
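
The packed layout can be sketched as a simple bit-field pack/unpack, following the field order Flag(2) Len(11) RegF(3) RegI(4) H(1) CR(2) FSize(9) from the lowest bit upward. This is a minimal Python illustration, not FPC code: the names FIELDS, pack and unpack are hypothetical, and the values are the raw field contents (function length counted in 4-byte instructions, frame size in 16-byte units).

```python
# (name, bit width), starting at bit 0 of the second .pdata word
FIELDS = [
    ("flag", 2),
    ("function_length", 11),  # in units of 4 bytes
    ("reg_f", 3),
    ("reg_i", 4),
    ("h", 1),
    ("cr", 2),
    ("frame_size", 9),        # in units of 16 bytes
]


def pack(**values):
    """Combine raw field values into one 32-bit packed unwind word."""
    word, shift = 0, 0
    for name, width in FIELDS:
        value = values.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << shift
        shift += width
    return word


def unpack(word):
    """Split a 32-bit packed unwind word into its raw fields."""
    fields, shift = {}, 0
    for name, width in FIELDS:
        fields[name] = (word >> shift) & ((1 << width) - 1)
        shift += width
    return fields


if __name__ == "__main__":
    # Hypothetical function: 16 instructions, two integer registers saved,
    # lr saved (CR=01), 32-byte frame
    word = pack(flag=1, function_length=16, reg_i=2, cr=1, frame_size=2)
    print(hex(word))
    print(unpack(word))
```

Because pack rejects values that do not fit their field, a code generator built along these lines would notice when a function is too long or its frame too large for the packed form and could fall back to full .xdata.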

Recommendations

Apply the patches from the forum (e.g., seh64.inc diff and g_local_unwind override) to FPC trunk and test with simple try-finally/except programs.
For full fix, the AArch64 code generator needs deeper changes to emit spec-compliant .xdata (track bug #40203 on GitLab). If you're contributing, focus on matching prolog/epilog 1:1 with unwind codes and handling unchained mode without x29.
Workaround: Avoid exceptions entirely for now, or force frame pointers via compiler directives if feasible (though this isn't currently straightforward on ARM64).
Monitor the Lazarus forum thread for updates; developers are actively patching, with bounties accelerating progress.