Recent

Author Topic: Likely incorrect 64bit code generation in FPC v3.0.4  (Read 4015 times)

Jonas Maebe

  • Hero Member
  • *****
  • Posts: 659
Re: Likely incorrect 64bit code generation in FPC v3.0.4
« Reply #15 on: March 13, 2019, 11:17:29 pm »
When you index an array with any expression, this expression gets type-converted to the range type of the array. This is also the step at which range checking gets performed (if enabled). So what gets loaded in a 64bit registers is not RowIdx, but implicit_cast(RowIdx, array_range_type). And array_range_type cannot have negative values according to its declaration.
If FPC was doing range checking the way you describe then it would _not_ compile the example program shown below. (cases 1 and 2.)
I think you are mixing up range checking and type conversions.

FPC, and Delphi, always insert necessary implicit type conversions, regardless of the state of range checking. They are fundamental part of the language. Without them, most code would not compile at all.

If you enable range checking, then whenever an implicit ordinal type conversion is inserted whereby the target type cannot hold every valid value of the source type, the compiler will insert extra code that checks at run time whether this conversion is in fact valid. However, it will not change the behaviour of the implicit type conversion itself, nor will it insert extra type conversions.

Regarding Delphi giving errors even when range checking is not enabled, while FPC only prints warnings: when this was implemented, the then-current version of Delphi (Delphi 7) did not print anything at all when compiling that program, even when range checking was enabled. We tried to both keep compatibility with Delphi code (i.e., compile the code), and give feedback to the programmer about the issue (print a warning when range checking is disabled, print an error when it is enabled). Delphi's behaviour obviously has changed in the mean time, and maybe we should now also always give an error instead (although you can bet that would also result in complaints and bug reports).

The code gets interpreted the same on both platforms by the compiler (index = unsigned). However, because on 32 bit platforms an address register is only 32 bit long, you don't notice a difference there.
That is not a valid argument.  The index variable is a signed integer.  The compiler cannot decide to turn it into an unsigned type because it's being used to access an array which can be indexed with an unsigned type.
[/quote]
It is a basic property of typed languages: if you use an entity of type A to perform an operation that expects an entity of type B, a compiler will always insert a type conversion. After that type conversion (which may simply reinterpret an existing bit pattern, e.g. when converting an integer to another integer type of the same size, or transform it in a more complex way, e.g. when converting an ansichar into a unicodestring), the result will be seen as an entity of this new type rather than of the original type.

See e.g. https://en.wikibooks.org/wiki/Computer_Programming/Type_conversion for more information about the concept of type conversions.

Quote
  If you apply what you are saying then the compiler should not even accept a signed type as an index for such an array.
It should, because type conversions between different integer types are defined in the language. It is possible for signed types to contain values that are also valid for unsigned types (any value that is 0 or positive). If it contains an invalid value, then the result is undefined, except if you enable range checking (in which case a range check error/exception will be thrown).

Quote
Even with range checking enabled, the compiler cannot simply decide to change the data type(s) the programmer declared.
The programmer has declared two different data types:
* the type of the indexing variable
* the type of the array's range type

When using the indexing variable to index the array, the indexing variable gets converted to the indexing type of the array. That is simply how type conversions work. Also in Delphi. It is even that way in C/C++. This particular scenario with array indexing is, however, irrelevant in C/C++ because there array indexing is simply an alias for pointer arithmetic (unlike in Pascal).

In the end, the issue is not whether implicit converting from signed to unsigned types should be possible. If it weren't, you would not be able to assign a longint to a byte or cardinal variable without a warning/error or explicit typecast (if you want a language that forbids this, have a look at e.g. Ada, but it takes quite a bit of getting used to this rigidity if you come from Pascal).

The issue is about what the index/range type of an array[0..0] should be. If it would be signed rather than unsigned, then your code would work the same as under Delphi. On the other hand, the efficiency of some code that is type correct would be lowered (unnecessary sign extensions), and as mentioned before it would require of the compiler's code to be audited as well.

However, regardless of the above, your code will always be invalid according to any Pascal standard, because you are accessing an array using an invalid index.

440bx

  • Hero Member
  • *****
  • Posts: 1089
Re: Likely incorrect 64bit code generation in FPC v3.0.4
« Reply #16 on: March 13, 2019, 11:18:41 pm »
Question...


Code: Pascal  [Select]
  1.     AnArray[5]  := SPACE;    // Delphi won't compile this (which is correct)
  2.  
  3.     AnArray[-5] := SPACE;    // nor this but, FPC does. At least, in this case
  4.                              // it generates CORRECT code for it.
  5.  
  6.   b := 3;
  7.   AnInt := (b * 2) div 3;
  8.  
  9.   AnArray[AnInt] := SPACE;   // FPC generates INCORRECT code (no sign extension)
  10.  


@440bx

For that construct, why does AnArray[number_index] not work when AnArray[variable_index] does?
Because when a number is used, the programmer has given the responsibility to do range checking to the compiler.  When a variable is used it is the programmer's responsibility unless and until the programmer chooses to relinquish that responsibility by enabling runtime range checking. 

using FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 64bit.

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 5576
    • wiki
Re: Likely incorrect 64bit code generation in FPC v3.0.4
« Reply #17 on: March 13, 2019, 11:19:06 pm »
He is not saying it as clearly as I have because the range checking argument is obviously not applicable but, it is being used to justify incorrect code generation.

While I actually think, it would be nice to be able to do what your example tries, Jonas is right as far as the current state goes.

Jonas did not say (or mean to say), that the compiler casts the index value to unsigned.
Jonas said that the compiler omits the code for extending the value as signed value.

The compiler knows that the range for the index goes from 0..0 (that is the declaration). None of those values is negative. Therefore signed expansion is never needed, and the code not generated.

As far as the compiler goes: If you then later in your code present any value outside that range, then this is an invalid value and the program does not need to work for incorrect input. More to the point the compiler can not make it work for such a case, because the compiler would not know what the intention is.

That the variable holding the index (which the compiler expects to be 0) is of signed type does not matter. After all it is valid to store 0 in a signed variable.

Declare an array as "foo: array [-1..-1] of bar" and signextension should work.
Unfortunately, then the first element is -1. So if you want to access the element before the declared min-index that becomes -2. (And likely that breaks your code too)

---------------
Now for the case [0..0] it is easy to solve by declaring the variable as "foo: PBar // PBar = ^Bar"

But if you want a 1-based array [1..1], then there is no solution.

So it would be nice, if there was a feature that would allow to write arrays like that, and at the same time tell the compiler what the range should be.
The problem is, if you want the first element to be at 0 (or 1) then you can not have a negative element in the defined index range. Yet that is what you currently need.


marcov

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 7363
Re: Likely incorrect 64bit code generation in FPC v3.0.4
« Reply #18 on: March 13, 2019, 11:28:00 pm »
That is not the case at all.  For a value n, whether positive or negative, the memory referenced by array[n] is addressof(array) + (n * sizeof(arrayelement)). 

No, since the array[n]  already implies the conversion to the array index range. So even if it were true, it would be

Code: Pascal  [Select]
  1. addressof(array) + (n' * sizeof(arrayelement)).  

with n' the conversion of n to the range of the array

That in some cases, without range checking turned off, that conversion might turn out differently, is unfortunate, but FPC is a compatible compiler, not an emulator for bad code.

Quote
The range checking argument is neither here nor there.  FPC is dropping the sign of a signed type.  Even with range checking enabled it should NOT do that.  It could report and error but, only if the programmer enables range checking, not otherwise.

Rangechecks do not exist for purity contests only. They guard the boundaries of the typesystem, and with reason. Turn them off, and you better know what you are doing.

In this case, the argument to make an exception is even weaker, since code like this is mostly pre D2009 whatever-works-x86-code, since D2009, pointermath allows to solve the problems more logically.

ASBzone

  • Full Member
  • ***
  • Posts: 241
  • Automation leads to relaxation...
    • BrainWaveCC Utilities
Re: Likely incorrect 64bit code generation in FPC v3.0.4
« Reply #19 on: March 13, 2019, 11:44:36 pm »
Because when a number is used, the programmer has given the responsibility to do range checking to the compiler.  When a variable is used it is the programmer's responsibility unless and until the programmer chooses to relinquish that responsibility by enabling runtime range checking.

Interesting.   Thank you kindly for the response.
-ASB: https://www.BrainWaveCC.com

Lazarus v2.0.5 r61666 / FPC v3.2.0-beta-r42593 (via FpcUpDeluxe) -- Windows 64-bit install w/32-bit cross-compile
Primary System: Windows 10 Pro x64, Version 1903 (Build 18362.295)
Other Systems: Windows 10 Pro x64, Version 1809 or greater

440bx

  • Hero Member
  • *****
  • Posts: 1089
Re: Likely incorrect 64bit code generation in FPC v3.0.4
« Reply #20 on: March 14, 2019, 12:07:29 am »
@Martin
That the variable holding the index (which the compiler expects to be 0) is of signed type does not matter. After all it is valid to store 0 in a signed variable.
if the argument that has been presented so far held any water then, the compiler should just move zero (or xor a register) and use that as the index value but, we both know it isn't doing that.

IOW, if type conversion is going to be used to justify what FPC is doing then, the only acceptable "conversion" is to set the index variable to zero.

There is no valid and sensible argument to justify what FPC is doing.


@Marco
No, since the array[n]  already implies the conversion to the array index range. So even if it were true, it would be

Code: Pascal  [Select]
  1. addressof(array) + (n' * sizeof(arrayelement)).  

with n' the conversion of n to the range of the array

That in some cases, without range checking turned off, that conversion might turn out differently, is unfortunate, but FPC is a compatible compiler, not an emulator for bad code.
If that were true then the compiler should simply use 0 since that is the only valid index for the array.  Claiming that the compiler should convert the indexing variable to the index type is absurd since there isn't a data type for every possible combination of start/end array range.


@ASBzone
Interesting.   Thank you kindly for the response.
You're most welcome.  It was a good question, I like those. :)


@Jonas
When using the indexing variable to index the array, the indexing variable gets converted to the indexing type of the array.
You propose that statement as an argument to justify dropping the sign of a signed type while invoking Delphi compatibility to justify compiling AnArray[-5].  I should note that {$MODE DELPHI} was not specified in the code to justify that behavior.

One thing that is rather nice in a computer language and, required in most, is consistency.

In the end, the issue is not whether implicit converting from signed to unsigned types should be possible. If it weren't, you would not be able to assign a longint to a byte or cardinal variable without a warning/error or explicit typecast (if you want a language that forbids this, have a look at e.g. Ada, but it takes quite a bit of getting used to this rigidity if you come from Pascal).
Type conversion should be done when necessary, not when unnecessary and particularly not when it is incorrect to do so (as FPC is doing in this case.)

The bottom line is:

1. Delphi does it correctly.  C/C++ do it correctly.  FPC does not. 
2. with Delphi and C/C++ the results obtained in 32bit and 64bit are the same, not so in FPC.

Also part of the bottom line is: 

This discussion is futile.  It has become patently obvious that this bug is here to stay no matter how much effort someone puts into explaining why it is wrong.  Therefore, I am done discussing this subject.

Thank you to all who have participated in the discussion and, for presenting your viewpoint.   I appreciate it in spite of the fact that, unfortunately, the arguments are rather flawed.
using FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 64bit.

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 5576
    • wiki
Re: Likely incorrect 64bit code generation in FPC v3.0.4
« Reply #21 on: March 14, 2019, 12:22:44 am »
@Martin
That the variable holding the index (which the compiler expects to be 0) is of signed type does not matter. After all it is valid to store 0 in a signed variable.
if the argument that has been presented so far held any water then, the compiler should just move zero (or xor a register) and use that as the index value but, we both know it isn't doing that.
Two different issues. The absence of signed extension also holds true for an array declared [0..99999]
And for that the index is not constant.

Quote
There is no valid and sensible argument to justify what FPC is doing.
This discussion is futile.  It has become patently obvious that this bug is here to stay no matter how much effort someone puts into explaining why it is wrong.  Therefore, I am done discussing this subject.

As I said I support the idea of this being changed.

But it is not a bug. It is a missing feature.

ASerge

  • Hero Member
  • *****
  • Posts: 1396
Re: Likely incorrect 64bit code generation in FPC v3.0.4
« Reply #22 on: March 14, 2019, 08:56:12 pm »
In order for the initial program to compile in Delphi (the latest version), need to replace PtrUInt with NativeUInt. To make it work without errors, disable range check and replace Char/PChar with AnsiChar/PAnsiChar.
In order for the program to work correctly in FPC, need to replace Integer with NativeInt (in FPC, SizeInt instead of NativeInt is most often used).

Now for RangeChecks. It is clear that the range check program does not work. But I will notice that with R+ the compiler issues not a warning, but an error. In addition, I have generated code does not match the comment:
Code: Pascal  [Select]
  1.   AnArray[-5] := SPACE; // nor this but, FPC does. At least, in this case
  2.   // it generates CORRECT code for it.
Code: ASM  [Select]
  1. # [27] AnArray[-5] := SPACE; // nor this but, FPC does. At least, in this case
  2.         movb    $32,U_$P$RANGECHECKS_$$_ANARRAY+251(%rip)

Code: ASM  [Select]
  1. # [45] b := 3;
  2.         movl    $3,%eax
  3. # Var AnInt located in register eax
  4. .Ll7:
  5. # [46] AnInt := (b * 2) div 3;
  6.         movl    $2,%eax
  7. .Ll8:
  8. # [47] AnArray[AnInt] := SPACE; // FPC generates INCORRECT code (no sign extension)
  9.         movb    $32,U_$P$RANGECHECKS_$$_ANARRAY+2(%rip)
  10. # Var AnInt64 located in register rax
  11. .Ll9:
  12. # [56] AnInt64 := (b * 2) div 3; // for this expression FPC generates correct code
  13.         movq    $2,%rax
  14. .Ll10:
  15. # [57] AnArray[AnInt64] := SPACE; // and doesn't complain about it.
  16.         movb    $32,U_$P$RANGECHECKS_$$_ANARRAY+2(%rip)

440bx

  • Hero Member
  • *****
  • Posts: 1089
Re: Likely incorrect 64bit code generation in FPC v3.0.4
« Reply #23 on: March 14, 2019, 10:33:53 pm »
In order for the initial program to compile in Delphi (the latest version), need to replace PtrUInt with NativeUInt. To make it work without errors, disable range check and replace Char/PChar with AnsiChar/PAnsiChar.
In order for the program to work correctly in FPC, need to replace Integer with NativeInt (in FPC, SizeInt instead of NativeInt is most often used).
Yes, the initial program does need some minor modifications to make it run under Delphi (Seattle for me.)

The second example I provided runs under both, FPC and Delphi (Seattle) without modification.

As you pointed out, if the RowIdx is declared as ptrint/nativeint then the code FPC generates is correct (it doesn't have to do sign extension since the programmer already did it for it.)  If RowIdx is a 32bit signed integer in a 64bit program then the code it generates is incorrect (as is more than known by now, it fails to sign extend the 32bit signed integer.)

Just for completeness, I've attached the versions I used to test with Delphi (Seattle)

Lesson: in FPC declare all integers as ptrint or ptruint or, NativeInt and NativeUInt (or some other equivalent such as sizeint.)  No bad surprises that way.

Thank you for looking into the original code and mentioning the modifications it requires to make it work with a recent version of Delphi.


using FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 64bit.

ASerge

  • Hero Member
  • *****
  • Posts: 1396
Re: Likely incorrect 64bit code generation in FPC v3.0.4
« Reply #24 on: March 16, 2019, 12:03:29 am »
Just for completeness, I've attached the versions I used to test with Delphi (Seattle)
I suggest a shorter example instead of SigExtend (with same idea)
Code: Pascal  [Select]
  1. {$APPTYPE CONSOLE}
  2. {$RANGECHECKS OFF}
  3.  
  4. type
  5.   PCardinalIndexArray = ^TCardinalIndexArray;
  6.   TCardinalIndexArray = array[0..2] of Integer;
  7.   TIntegerIndexArray = array[-1..1] of Integer;
  8.  
  9. const
  10.   A: TIntegerIndexArray = (-1, 0, 1);
  11. var
  12.   P: PCardinalIndexArray;
  13.   Index: Integer;
  14. begin
  15.   P := Pointer(@A[0]);
  16.   Index := 0;
  17.   Dec(Index);
  18.   Writeln(P^[Index]);
  19.   Readln;
  20. end.
It works (showing -1) with FPC 32-bit, Delphi (RIO) 32-bit, Delphi 64-bit, but does not with FPC 64-bit (SIGSEGV).

440bx

  • Hero Member
  • *****
  • Posts: 1089
Re: Likely incorrect 64bit code generation in FPC v3.0.4
« Reply #25 on: March 16, 2019, 12:36:22 am »
It works (showing -1) with FPC 32-bit, Delphi (RIO) 32-bit, Delphi 64-bit, but does not with FPC 64-bit (SIGSEGV).
I think that is a nice and succinct example.  Thank you Serge.

Unfortunately, it looks like that bug won't be corrected but, at least some people will now be aware that, in FPC, they cannot expect a 64bit program to behave as it does in 32bit.

These "little details" make porting code from other languages (or even other flavors of Pascal) more work than it should be.  Very unfortunate.

using FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 64bit.

ASerge

  • Hero Member
  • *****
  • Posts: 1396
Re: Likely incorrect 64bit code generation in FPC v3.0.4
« Reply #26 on: March 16, 2019, 01:32:59 am »
It works (showing -1) with FPC 32-bit, Delphi (RIO) 32-bit, Delphi 64-bit, but does not with FPC 64-bit (SIGSEGV).
These "little details" make porting code from other languages (or even other flavors of Pascal) more work than it should be.  Very unfortunate.
When porting to a 64-bit platform, the code sometimes requires conversion. For example, replacing Integer with SizeInt. In this example, this solves the problem.
If you look at RTL, there and it is made.

440bx

  • Hero Member
  • *****
  • Posts: 1089
Re: Likely incorrect 64bit code generation in FPC v3.0.4
« Reply #27 on: March 16, 2019, 01:42:16 am »
When porting to a 64-bit platform, the code sometimes requires conversion. For example, replacing Integer with SizeInt. In this example, this solves the problem.
If you look at RTL, there and it is made.
That's true but, it shouldn't be necessary to change the type of an index.  It is not justifiable that AnArray[int32] <> AnArray[int64] for the same index value still within the range of an int32.
using FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 64bit.

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 5576
    • wiki
Re: Likely incorrect 64bit code generation in FPC v3.0.4
« Reply #28 on: March 16, 2019, 01:54:57 am »
When porting to a 64-bit platform, the code sometimes requires conversion. For example, replacing Integer with SizeInt. In this example, this solves the problem.
If you look at RTL, there and it is made.
That's true but, it shouldn't be necessary to change the type of an index.  It is not justifiable that AnArray[int32] <> AnArray[int64] for the same index value still within the range of an int32.

The problem is that "array [0..0] of" is interpreted as array[cardinal] (or array[qword])
Unfortunately that is even true for "array[integer(0)..integer(0)]".

Again, it would be nice, if at least the latter would be treated different. (or both)

440bx

  • Hero Member
  • *****
  • Posts: 1089
Re: Likely incorrect 64bit code generation in FPC v3.0.4
« Reply #29 on: March 16, 2019, 04:22:44 am »
The problem is that "array [0..0] of" is interpreted as array[cardinal] (or array[qword])

That by itself is incorrect.  Zero (0) does not have a sign, it is neither positive nor negative.  No correctly implemented compiler can "decide" that zero is positive or negative.  That is simply totally incorrect.

Mathematically:
Quote
An integer is a whole number that can be either greater than 0, called positive, or less than 0, called negative. Zero is neither positive nor negative.

To further aggravate matters, after FPC decides that zero (0) is positive, it forces a variable that has been declared as signed to be unsigned.  That is beyond incorrect. 

And to put a cherry on top of that moldy cake, it only forces the variable to be positive when compiling for 64bit and not for 32bit.

Imagine if in an expression such as:
Code: Pascal  [Select]
  1. var ANumber : integer;
  2.   ...
  3.   ...
  4.   ANumber + 5;
  5.  
FPC decided to "convert" ANumber into an unsigned type because the constant 5 is positive.  THAT is effectively what FPC is doing when openly ignoring and overriding the declared type of the indexing variable.  As obviously ludicrous as doing something like that would be, it would actually have greater justification since, at least the numeral 5, unlike zero (0), is positive.

Again, it would be nice, if at least the latter would be treated different. (or both)
I have to agree, it would be rather nice if FPC did it correctly.  Unfortunately, it looks like neither "nice" nor correct is going to happen in this case.

It seems that, no matter how many different ways I explain why what FPC is doing is simply wrong, it is for naught. What does surprise me is that the majority of mainstream compilers do it correctly which should really clue the developers that their decision to make FPC behave the way it does is, putting it diplomatically, "probably" incorrect.

« Last Edit: March 16, 2019, 04:30:08 am by 440bx »
using FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 64bit.