> The reason the compiler uses a load without sign extension is that your arrays are declared as having only valid unsigned index types (0..0 and 1..1). I know this is a common way to declare variable-sized arrays, but it's still a hack, and these statements are all invalid (they cause range errors, which means the result is undefined).

I don't buy that explanation but, leaving that aside, the fact that the code behaves differently when compiled for 32-bit than for 64-bit is clearly a problem. The result should _not_ depend on the bitness. Either there is a bug when compiling for 64-bit or there is a bug when compiling for 32-bit. You choose.
The correct way to declare such arrays is to either declare them as array[low(ptrint) div sizeof(elementtype)..high(ptrint) div sizeof(elementtype)], or to use a pointer type to the element type (you can index pointer types like arrays both in FPC and in Delphi).
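A sketch of both suggested declaration styles, assuming a Double element type (the names are mine, and the upper bound is trimmed by one element so the array type's total size stays within High(PtrInt); this is a sketch of the idea, not the only valid set of bounds):

```pascal
program DeclDemo;
{$mode objfpc}
type
  TElem = Double;
  { full-range variant: the index subrange spans every addressable element }
  TElemArray = array[0 .. High(PtrInt) div SizeOf(TElem) - 1] of TElem;
  PElemArray = ^TElemArray;
  { pointer variant: a typed pointer can be indexed like an array in FPC }
  PElem = ^TElem;
var
  Buf: array[0..9] of TElem;
  A: PElemArray;
  P: PElem;
begin
  A := PElemArray(@Buf);
  P := @Buf[0];
  A^[3] := 1.5;        // index through the array-typed pointer
  WriteLn(P[3]:0:1);   // index through the element pointer; prints 1.5
end.
```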
> No, it's not a problem, because as Jonas said you've entered the realm of undefined behaviour and it's nowhere written that this undefined behaviour needs to be the same on all platforms.

RowIdx is an integer. Specifically, a 32-bit integer. If it is going to be moved into a 64-bit register then it _must_ be sign extended. Anything else is incorrect and cannot be justified.
Undefined behaviour? In Pascal?? :o So all the crap is copied from C, and the useful things are left off
> Undefined behaviour? In Pascal?? :o So all the crap is copied from C, and the useful things are left off

Think hex, ffíng idiot. $FFFFFFFFFFFFFFFF <> $FFFFFFFF, of course. In any pattern it is undefined.
In Pascal "undefined behavior" is now even better than in C, C++ and any other language. That really takes "undefined behavior" to a new level.
var
  Index32 : int32 = -1;
  Index64 : int64 = -1;
...
AnArray[Index32] <> AnArray[Index64];
> Think hex, ffíng idiot. $FFFFFFFFFFFFFFFF <> $FFFFFFFF, of course. In any pattern it is undefined

Thaddy, you poor thing, at least try to fake some self-control. I'd ask you to think but that would really be undefined behavior for you. Maybe someone close to you can hand-hold you to -1 = -1 regardless of how it is represented. If not, consider applying for government help.
Also in the GNU compiler suite and - let me guess - Visual Studio.
You can add more f's to taste....
$FFFFFFFFFFFFFFFF <> $00000000FFFFFFFF in a well-behaved compiler, which all three are...
> When you index an array with any expression, this expression gets type-converted to the range type of the array. This is also the step at which range checking gets performed (if enabled). So what gets loaded in a 64-bit register is not RowIdx, but implicit_cast(RowIdx, array_range_type). And array_range_type cannot have negative values according to its declaration.

> No, it's not a problem, because as Jonas said you've entered the realm of undefined behaviour and it's nowhere written that this undefined behaviour needs to be the same on all platforms.

RowIdx is an integer. Specifically, a 32-bit integer. If it is going to be moved into a 64-bit register then it _must_ be sign extended.
> The fact that the code behaves differently when compiled for 32bit than in 64bit is clearly a problem.

The code gets interpreted the same on both platforms by the compiler (index = unsigned). However, because on 32 bit platforms an address register is only 32 bit long, you don't notice a difference there.
> When you index an array with any expression, this expression gets type-converted to the range type of the array. This is also the step at which range checking gets performed (if enabled). So what gets loaded in a 64-bit register is not RowIdx, but implicit_cast(RowIdx, array_range_type). And array_range_type cannot have negative values according to its declaration.

If FPC was doing range checking the way you describe then it would _not_ compile the example program shown below (cases 1 and 2).
> The code gets interpreted the same on both platforms by the compiler (index = unsigned). However, because on 32 bit platforms an address register is only 32 bit long, you don't notice a difference there.

That is not a valid argument. The index variable is a signed integer. The compiler cannot decide to turn it into an unsigned type because it's being used to access an array which can be indexed with an unsigned type. If you apply what you are saying then the compiler should not even accept a signed type as an index for such an array.
> simply because the compiler uses the type information it got from the program (and that is also the only information it can use).

You can't use an argument that the compiler "uses type information it got from the program" when the compiler is dropping the sign of a signed data type. The compiler has been told the indexing variable is a signed integer; it cannot simply turn that into an unsigned integer behind the programmer's back. That is incorrect.
@Jonas
The range checking argument you are putting forward simply does not work, and it is not even applicable. A range checking argument cannot be put forward to justify turning a signed type into an unsigned type.
> He doesn't say that. He says you pass a signed value to a range that is only positive. That should generate a runtime error/exception when range checking is on (and Delphi does so too; e.g. change b:=3 to b:=-3 and add {$R+} and uses sysutils).

He is not saying it as clearly as I have because the range checking argument is obviously not applicable but, it is being used to justify incorrect code generation.
> If you turn off range checks, you are in undefined territory, and anything may happen. The exact code generation depends on the target.

That is not the case at all. For a value n, whether positive or negative, the memory referenced by array[n] is addressof(array) + (n * sizeof(arrayelement)). There is no "undefined territory" in any of this.
AnArray[5] := SPACE;  // Delphi won't compile this (which is correct)
AnArray[-5] := SPACE; // nor this but, FPC does. At least, in this case
                      // it generates CORRECT code for it.
b := 3;
AnInt := (b * 2) div 3;
AnArray[AnInt] := SPACE; // FPC generates INCORRECT code (no sign extension)
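The address formula being argued about can be checked directly for an in-range index; a minimal sketch (the names are mine, not from the original program):

```pascal
program AddrDemo;
{$mode objfpc}
var
  AnArray: array[0..9] of Char;
  n: Integer;
begin
  n := 5;
  { the element address is the array base plus index times element size }
  WriteLn(PtrUInt(@AnArray[n]) = PtrUInt(@AnArray) + PtrUInt(n) * SizeOf(Char)); // prints TRUE
end.
```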
> I think you are mixing up range checking and type conversions. When you index an array with any expression, this expression gets type-converted to the range type of the array. This is also the step at which range checking gets performed (if enabled). So what gets loaded in a 64-bit register is not RowIdx, but implicit_cast(RowIdx, array_range_type). And array_range_type cannot have negative values according to its declaration.

If FPC was doing range checking the way you describe then it would _not_ compile the example program shown below (cases 1 and 2).
> The code gets interpreted the same on both platforms by the compiler (index = unsigned). However, because on 32 bit platforms an address register is only 32 bit long, you don't notice a difference there.

That is not a valid argument. The index variable is a signed integer. The compiler cannot decide to turn it into an unsigned type because it's being used to access an array which can be indexed with an unsigned type.
> If you apply what you are saying then the compiler should not even accept a signed type as an index for such an array.

It should, because type conversions between different integer types are defined in the language. It is possible for signed types to contain values that are also valid for unsigned types (any value that is 0 or positive). If it contains an invalid value, then the result is undefined, except if you enable range checking (in which case a range check error/exception will be thrown).
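The conversion behaviour described in the reply above can be observed outside array indexing as well; a minimal sketch (variable names are mine):

```pascal
program ConvDemo;
{$mode objfpc}
{$R-} { range checking off: the out-of-range value is reinterpreted, not rejected }
var
  I: LongInt;
  C: Cardinal;
begin
  I := -5;
  C := I;      // implicit signed-to-unsigned conversion
  WriteLn(C);  // prints 4294967291, i.e. $FFFFFFFB
  // with range checking on ($R+), the assignment above raises a range check error instead
end.
```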
Even with range checking enabled, the compiler cannot simply decide to change the data type(s) the programmer declared. The programmer has declared two different data types:
> Question...

Because when a number is used, the programmer has given the responsibility to do range checking to the compiler. When a variable is used it is the programmer's responsibility unless and until the programmer chooses to relinquish that responsibility by enabling runtime range checking.
AnArray[5] := SPACE;  // Delphi won't compile this (which is correct)
AnArray[-5] := SPACE; // nor this but, FPC does. At least, in this case
                      // it generates CORRECT code for it.
b := 3;
AnInt := (b * 2) div 3;
AnArray[AnInt] := SPACE; // FPC generates INCORRECT code (no sign extension)
@440bx
For that construct, why does AnArray[number_index] not work when AnArray[variable_index] does?
He is not saying it as clearly as I have because the range checking argument is obviously not applicable but, it is being used to justify incorrect code generation.
That is not the case at all. For a value n, whether positive or negative, the memory referenced by array[n] is addressof(array) + (n * sizeof(arrayelement)).
The range checking argument is neither here nor there. FPC is dropping the sign of a signed type. Even with range checking enabled it should NOT do that. It could report an error but, only if the programmer enables range checking, not otherwise.
Because when a number is used, the programmer has given the responsibility to do range checking to the compiler. When a variable is used it is the programmer's responsibility unless and until the programmer chooses to relinquish that responsibility by enabling runtime range checking.
> That the variable holding the index (which the compiler expects to be 0) is of signed type does not matter. After all it is valid to store 0 in a signed variable.

If the argument that has been presented so far held any water then the compiler should just move zero (or xor a register) and use that as the index value but, we both know it isn't doing that.
> No, since the array[n] already implies the conversion to the array index range. So even if it were true, it would be
>
> addressof(array) + (n' * sizeof(arrayelement))
>
> with n' the conversion of n to the range of the array.
>
> That in some cases, with range checking turned off, that conversion might turn out differently is unfortunate, but FPC is a compatible compiler, not an emulator for bad code.

If that were true then the compiler should simply use 0 since that is the only valid index for the array. Claiming that the compiler should convert the indexing variable to the index type is absurd since there isn't a data type for every possible combination of start/end array range.
> Interesting. Thank you kindly for the response.

You're most welcome. It was a good question, I like those. :)
> When using the indexing variable to index the array, the indexing variable gets converted to the indexing type of the array.

You propose that statement as an argument to justify dropping the sign of a signed type while invoking Delphi compatibility to justify compiling AnArray[-5]. I should note that {$MODE DELPHI} was not specified in the code to justify that behavior.
> In the end, the issue is not whether implicit converting from signed to unsigned types should be possible. If it weren't, you would not be able to assign a longint to a byte or cardinal variable without a warning/error or explicit typecast (if you want a language that forbids this, have a look at e.g. Ada, but it takes quite a bit of getting used to this rigidity if you come from Pascal).

Type conversion should be done when necessary, not when unnecessary and particularly not when it is incorrect to do so (as FPC is doing in this case).
@Martin

Two different issues. The absence of sign extension also holds true for an array declared [0..99999].

> That the variable holding the index (which the compiler expects to be 0) is of signed type does not matter. After all it is valid to store 0 in a signed variable.

If the argument that has been presented so far held any water then the compiler should just move zero (or xor a register) and use that as the index value but, we both know it isn't doing that.
There is no valid and sensible argument to justify what FPC is doing.
This discussion is futile. It has become patently obvious that this bug is here to stay no matter how much effort someone puts into explaining why it is wrong. Therefore, I am done discussing this subject.
> In order for the initial program to compile in Delphi (the latest version), you need to replace PtrUInt with NativeUInt. To make it work without errors, disable range checking and replace Char/PChar with AnsiChar/PAnsiChar.

Yes, the initial program does need some minor modifications to make it run under Delphi (Seattle for me).
In order for the program to work correctly in FPC, you need to replace Integer with NativeInt (in FPC, SizeInt is used more often than NativeInt).
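A minimal sketch of the suggested change, using the same array[0..0] hack the thread is about (the buffer and variable names are mine, and it relies on range checking being off, FPC's default):

```pascal
program IndexFix;
{$mode objfpc}
type
  TCharArray = array[0..0] of AnsiChar; // the variable-size-array hack under discussion
  PCharArray = ^TCharArray;
var
  Buf: array[0..15] of AnsiChar;
  A: PCharArray;
  Idx: SizeInt; // pointer-sized signed index: nothing to truncate on 64-bit
begin
  A := PCharArray(@Buf[4]);
  Idx := -2;          // a negative offset from the pointer's position
  A^[Idx] := 'x';     // with a pointer-sized index, the offset is applied correctly
  WriteLn(Buf[2]);    // the write landed two elements before @Buf[4]
end.
```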
> Just for completeness, I've attached the versions I used to test with Delphi (Seattle)

I suggest a shorter example instead of SigExtend (with the same idea).
> It works (showing -1) with FPC 32-bit, Delphi (RIO) 32-bit, Delphi 64-bit, but does not with FPC 64-bit (SIGSEGV).

I think that is a nice and succinct example. Thank you Serge.
> When porting to a 64-bit platform, the code sometimes requires conversion. For example, replacing Integer with SizeInt. In this example, this solves the problem.

> It works (showing -1) with FPC 32-bit, Delphi (RIO) 32-bit, Delphi 64-bit, but does not with FPC 64-bit (SIGSEGV).

These "little details" make porting code from other languages (or even other flavors of Pascal) more work than it should be. Very unfortunate.
> When porting to a 64-bit platform, the code sometimes requires conversion. For example, replacing Integer with SizeInt. In this example, this solves the problem.

That's true but, it shouldn't be necessary to change the type of an index. It is not justifiable that AnArray[int32] <> AnArray[int64] for the same index value still within the range of an int32.
If you look at the RTL, it is done there as well.
The problem is that "array [0..0] of" is interpreted as array[cardinal] (or array[qword])
An integer is a whole number that can be either greater than 0, called positive, or less than 0, called negative. Zero is neither positive nor negative.
> Again, it would be nice, if at least the latter would be treated different. (or both)

I have to agree, it would be rather nice if FPC did it correctly. Unfortunately, it looks like neither "nice" nor correct is going to happen in this case.
Imagine if, in an expression such as:

var
  ANumber : integer;
...
ANumber + 5;

FPC decided to "convert" ANumber into an unsigned type because the constant 5 is positive. THAT is effectively what FPC is doing when openly ignoring and overriding the declared type of the indexing variable. As obviously ludicrous as doing something like that would be, it would actually have greater justification since, at least the numeral 5, unlike zero (0), is positive.
> Afaik, cardinal is "unsigned" not "positive".

Formally, any value other than zero (0) is signed. Cardinal/word/dword/qword/etc means a non-negative-only value, which causes the interpretation of the high bit to change, thereby allowing higher magnitudes; that interpretation is valid because the compiler has been explicitly told that there are no negative values in the type/range. It should also be pointed out that non-negative-only includes zero (0), since zero is neither negative nor positive.
> In any case, "cardinal" contains 0. Yes, so does (signed-)integer.

Int/Int32/int64/integer also contain zero (0); therefore it is not possible to assign a sign to zero, since both "unsigned" and signed data types include that value. It is incorrect to simply choose, or presume, that zero is positive, which is what FPC is doing by "concluding" that sign extension should not be done.
> So (for the discussion's sake) cardinal is chosen (actually the sub-range 0..0 is chosen, and it is chosen as a subrange of cardinal).

0 is also in a subrange of int16/int32/int64/etc; therefore, it is incorrect to presume it is _only_ in a subrange of an unsigned type such as Cardinal.
> SomeCardinal := SomeIntegerVariable // where SomeIntegerVariable = -5
>
> will assign $ffff...fb as a positive SomeCardinal.
>
> Equally you can also assign numbers to a variable of subrange type, even if the number is outside that range.
>
> type TNum = 1..5;
> ANum := SomeIntegerVariable // where SomeIntegerVariable = -5
>
> Not tested, but on 64 bit that probably also gives $00000000fffffffb. If it does not, that would IMHO be inconsistent.

But in that example, you are declaring a type and, in addition to that, you have values other than zero (0), which provide enough information to unambiguously state that the type/range does not include negative values.
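The "not tested" guess above can be tried directly; a minimal sketch (with range checking off, FPC's default, the out-of-range store is accepted and the bits are simply reinterpreted):

```pascal
program SubrangeDemo;
{$mode objfpc}
type
  TNum = 1..5;
var
  ANum: TNum;
  SomeIntegerVariable: LongInt;
begin
  SomeIntegerVariable := -5;
  ANum := SomeIntegerVariable; // out of range for 1..5; accepted when range checking is off
  WriteLn(ANum);               // prints the reinterpreted bit pattern, not -5
end.
```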
> Again, I am not saying that I agree with this design decision. Just pointing out my understanding of the current state.

I understand you are not saying you are in agreement with the decision. I am clear on that.
> It is not the same. 5 is within the range of integer. No conversion needed.

But it is also in the range of a qword, yet FPC didn't decide to "convert" the variable into a qword, which is what it is doing when failing to sign extend an integer because it is assuming a data type for a constant.
> If your program is just changed at line 36 like this, as you know I guess:
>
> RowIdx : SizeInt; // row index
> I : SizeInt; // column index
>
> It works fine, as the indices are then always pointer-size.

Yes, that is correct and that is the workaround. As far as indexing an array goes, as long as the index is numeric and the compiler works as it is supposed to (doing sign extension on signed types, for instance), the size of the integer in use is irrelevant.