Constant string-to-ordinal casts (stringordcast)

Following up on @440bx's request - DWORD('abcd') now works in const, var initializers and inline var under {$modeswitch stringordcast} (on by default in unleashed mode). The compiler folds it to a real compile-time constant when the literal's byte count matches the target ordinal's size.
Two clarifications before the demo.

1. MacPas already has a partial version. Under {$mode macpas}, FPC accepts DWORD('abcd') in const and folds it to an immediate, but only for 4-byte literals (byte('X'), word('HI'), qword('12345678') are rejected) and with big-endian packing regardless of the target. stringordcast generalizes this: 1/2/4/8-byte literals, and packing uses the target's native endianness. The practical consequence is that the in-memory byte layout of the folded constant matches the source byte order on both LE and BE targets, so signature checks like PDWORD(@buffer)^ = DWORD('RIFF') work uniformly. On x86/x86_64 that means dword('abcd') folds to $64636261 (bytes 61 62 63 64 = 'a' 'b' 'c' 'd' in memory); on a BE target the numerical value would be $61626364 but the memory layout is still 61 62 63 64.
Result type is the type you cast to, including signed variants (LongInt, Int64).
2. I owe a correction. Earlier in this thread I said DWORD('abcd') "already works with inline variables". It parses and prints the right value, but the generated assembly tells a different story - the compiler puts the string literal in rodata and emits mov reg, [label] at runtime. I should have checked the generated code before claiming "already works".

With stringordcast it's a genuine compile-time fold now (x86_64):

movl $1684234849, -4(%rbp) ; 1684234849 == $64636261, native LE load of 'abcd'

No string in rodata, no runtime load, no pointer cast - the bytes are packed into an ordconstn at typecheck time.
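To make the native-endian packing from point 1 concrete, here is a minimal sketch in stock FPC (no stringordcast required). It only emulates the packing at runtime with Move; it is not the fold itself, and the program and constant names are made up for illustration. It shows that copying the literal's bytes verbatim gives the same memory image on any target, while the numeric value depends on endianness.

```pascal
program native_pack_sketch;
{$mode objfpc}{$H+}

// Runtime emulation of the packing rule, for illustration only.
// Lit is a hypothetical stand-in for the string literal 'abcd'.
const
  Lit: array[0..3] of AnsiChar = ('a', 'b', 'c', 'd');

var
  v: DWord;
  p: PByte;
  i: Integer;
begin
  // Copy the four bytes verbatim into the ordinal: "native endianness"
  // here simply means the bytes stay in source order in memory.
  Move(Lit[0], v, SizeOf(v));

  // The memory image is the same on LE and BE targets.
  p := @v;
  Write('memory:');
  for i := 0 to SizeOf(v) - 1 do
    Write(' ', HexStr(p[i], 2));
  WriteLn;                        // memory: 61 62 63 64

  // The numeric value is what differs between targets.
  {$ifdef ENDIAN_LITTLE}
  WriteLn('value is $64636261: ', v = $64636261);
  {$else}
  WriteLn('value is $61626364: ', v = $61626364);
  {$endif}
end.
```

The folded constant should compare equal to v on the same target, which is why the PDWORD(@buffer)^ = DWORD('RIFF') check needs no byte swapping.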
Demo

program stringordcast_demo;
{$mode unleashed}

const
  // untyped const
  SIG_MZ    = word('MZ');             // $5A4D
  SIG_RIFF  = dword('RIFF');          // $46464952
  MAGIC_8   = qword('DEADBEEF');      // $4645454244414544
  // char-literal variants
  HEX_DWORD = dword(#$DE#$AD#$BE#$EF); // $EFBEADDE
  MIXED     = dword('AB'#$00#$01);    // $01004241

var
  // global var with initializer
  gSig: dword = dword('abcd');        // $64636261
  gTag: word  = word('OK');           // $4B4F

procedure inline_context;
begin
  // inline var, inferred type
  var a := dword('abcd');
  // inline var, typed
  var b: word := word('HI');
  // signed variant
  var c: int64 := int64('abcdefgh');
  writeln(' inline inferred a = $', hexstr(a, 8));
  writeln(' inline typed b = $', hexstr(b, 4));
  writeln(' inline signed c = $', hexstr(c, 16));
end;

procedure signature_check;
var
  buf: array[0..3] of char = ('R', 'I', 'F', 'F');
begin
  if pdword(@buf[0])^ = SIG_RIFF then
    writeln(' RIFF file detected (bytes in memory match SIG_RIFF)');
end;

begin
  writeln('untyped const:');
  writeln(' word(''MZ'') = $', hexstr(SIG_MZ, 4));
  writeln(' dword(''RIFF'') = $', hexstr(SIG_RIFF, 8));
  writeln(' qword(''DEADBEEF'') = $', hexstr(MAGIC_8, 16));
  writeln(' dword(#$DE#$AD...) = $', hexstr(HEX_DWORD, 8));
  writeln(' dword(''AB''#$00#$01) = $', hexstr(MIXED, 8));
  writeln;
  writeln('typed var initializer:');
  writeln(' gSig = $', hexstr(gSig, 8));
  writeln(' gTag = $', hexstr(gTag, 4));
  writeln;
  writeln('inline var:');
  inline_context;
  writeln;
  writeln('signature use-case:');
  signature_check;
  writeln;
  readln;
end.

Output:

untyped const:
 word('MZ') = $5A4D
 dword('RIFF') = $46464952
 qword('DEADBEEF') = $4645454244414544
 dword(#$DE#$AD...) = $EFBEADDE
 dword('AB'#$00#$01) = $01004241

typed var initializer:
 gSig = $64636261
 gTag = $4B4F

inline var:
 inline inferred a = $64636261
 inline typed b = $4948
 inline signed c = $6867666564636261

signature use-case:
 RIFF file detected (bytes in memory match SIG_RIFF)

Details

Size must match exactly. If it doesn't, you get a specific diagnostic instead of the generic "Illegal expression":

Error: Cannot cast string of length 3 to ordinal type "LongWord" (size 4 bytes)

Works with #N-escaped char literals and mixed forms: dword(#$DE#$AD#$BE#$EF) gives bytes DE AD BE EF in memory, dword('AB'#$00#$01) gives 41 42 00 01.

New modeswitch

This is a new modeswitch, stringordcast - off by default in all existing modes, on by default in unleashed mode. In stock Pascal a string literal is a string, not an ordinal, so the cast is formally illegal; the modeswitch makes it explicit opt-in rather than forcing it on. The emitted code for the const-section case is identical to writing the hex value by hand - no storage, no pointer cast, just an immediate.

Implementation

The parser already treats TypeName(expr) as a typecast node, so the fold lives in the constant evaluator: when the cast target is an integer ordinal and the inner node is a cst_conststring whose byte length equals the target size, the bytes are packed in target-native endianness into an ordconstn and the string node is discarded before codegen. No new AST node, no storage path, no change to how non-size-matching casts are diagnosed.
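
For readers who want the packing rule spelled out, below is a rough sketch in stock FPC. It is not the compiler code: PackLiteral and its parameters are hypothetical names, and the target endianness is passed in as a flag where the compiler would consult the target info. It packs the literal's bytes into an integer for a given size and endianness, and refuses when the byte count does not equal the target size.

```pascal
program fold_sketch;
{$mode objfpc}{$H+}

// Hypothetical helper, not taken from the compiler sources.
// Packs the bytes of S into Value for a target of the given endianness.
// Returns False when the byte count differs from TargetSize, which is
// the case where the real fold reports its size-mismatch error.
function PackLiteral(const S: AnsiString; TargetSize: Integer;
  LittleEndian: Boolean; out Value: QWord): Boolean;
var
  i: Integer;
begin
  Value := 0;
  Result := (TargetSize in [1, 2, 4, 8]) and (Length(S) = TargetSize);
  if not Result then
    Exit;
  for i := 1 to TargetSize do
    if LittleEndian then
      // first source byte lands in the least significant byte
      Value := Value or (QWord(Ord(S[i])) shl (8 * (i - 1)))
    else
      // first source byte lands in the most significant byte
      Value := (Value shl 8) or QWord(Ord(S[i]));
end;

var
  v: QWord;
begin
  if PackLiteral('abcd', 4, True, v) then
    WriteLn('LE: $', HexStr(v, 8));    // LE: $64636261
  if PackLiteral('abcd', 4, False, v) then
    WriteLn('BE: $', HexStr(v, 8));    // BE: $61626364
  if not PackLiteral('abc', 4, True, v) then
    WriteLn('length 3 does not fit a 4-byte ordinal');
end.
```

The two results are the LE and BE values quoted earlier for dword('abcd'); either one stored in its own target's byte order gives 61 62 63 64 in memory.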