
Author Topic: RIP-relative addressing - should compiler be more helpful?  (Read 5536 times)

ahydra

  • New Member
  • *
  • Posts: 19
RIP-relative addressing - should compiler be more helpful?
« on: December 31, 2017, 12:19:38 am »
I submitted this bug report: https://bugs.freepascal.org/view.php?id=32905 which was closed because it was felt I was "asking questions" rather than submitting a bug, even though I explained why I felt it was a bug. So clearly what Florian was trying to say is: you're doing it wrong.

I played around a bit, and looking at the first example:

Code: Pascal
program Project1;

{$ASMMODE INTEL}

var
  r: integer;

begin
  asm
    MOV r, 7 // crash!
  end;

  writeln(r);
  readln();
end.

it turns out the asm code must be written as "MOV [RIP+r], 7", as x64 uses RIP-relative addressing. Fair enough, and that works fine. But it made me think (as I noted in the bug report): why does the compiler allow you to write code that doesn't make sense :)?
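For comparison, here is a minimal sketch of the working form described above. It assumes an x86-64 target and FPC's Intel assembler mode; the program name is made up for illustration, and the operand is written exactly as quoted in the post.

```pascal
program RipRelativeDemo; // hypothetical name, not from the original post

{$ASMMODE INTEL}

var
  r: integer;

begin
  asm
    // Explicit RIP-relative store: the operand is encoded as a
    // displacement from the instruction pointer, not as a
    // truncated absolute address.
    MOV [RIP+r], 7
  end;

  writeln(r);
end.
```

The only change from the crashing program is the operand of the MOV; the rest is unchanged.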

Specifically:
1) is there ever any use for loading the absolute address, for example "LEA RAX, r"? I was thinking perhaps for lookup tables there could be, but it wouldn't work very well as the offsets are truncated to 32-bit. For example, in a quick test program "r" is at 0x10000f000, but the instruction is compiled to "LEA RAX, 0xf000".
2) Note that, to add to the fun, you can read from globals just fine: "MOV EAX, r" compiles to (AT&T syntax) "movabs 0x10000f000, %eax", and works as expected - it loads the value of variable "r" into EAX.
3) Given the above, should the compiler not try to help you out by replacing the "r" with "[RIP+offset]" where the code would otherwise make no sense due to offsets being truncated? This would be similar behaviour to stack variables, which are automatically replaced with "[RBP+offset]".
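To illustrate point 1, here is a sketch of taking the variable's address with an explicit RIP-relative LEA instead of the truncated absolute form. The variable p and the program name are hypothetical, and an x86-64 target is assumed.

```pascal
program LeaRipDemo; // hypothetical name

{$ASMMODE INTEL}

var
  r: integer;
  p: pointer; // hypothetical helper that receives the address

begin
  asm
    // Compute the full 64-bit address of r relative to RIP.
    LEA RAX, [RIP+r]
    // Store it in the global p, again via a RIP-relative operand.
    MOV [RIP+p], RAX
  end ['RAX']; // tell the compiler that RAX is modified

  // Compare against the address the compiler itself computes.
  writeln(p = @r);
end.
```

If the two addresses match, the RIP-relative LEA yields the full address with no 32-bit truncation.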

Any opinions? I feel I'm missing something here.

Thanks,

ahydra

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
Re: RIP-relative addressing - should compiler be more helpful?
« Reply #1 on: December 31, 2017, 01:29:50 am »
Quote from: ahydra
it turns out the asm code must be written as "MOV [RIP+r], 7", as x64 uses RIP-relative addressing. Fair enough, and that works fine. But it made me think (as I noted in the bug report): why does the compiler allow you to write code that doesn't make sense :)?

A compiler does not program for you. It checks syntax and converts that to backend code. And it IS possible to access data at addresses that are not RIP-relative in assembler. It is just not very useful.

ahydra

  • New Member
  • *
  • Posts: 19
Re: RIP-relative addressing - should compiler be more helpful?
« Reply #2 on: December 31, 2017, 04:53:29 am »

Quote from: marcov
A compiler does not program for you. It checks syntax and converts that to backend code.

Yes, of course. But what is semantically different between the global variable and stack variable cases, such that it will convert the latter correctly but not the former? This is what I'm puzzled about.

Jonas Maebe

  • Hero Member
  • *****
  • Posts: 1058
Re: RIP-relative addressing - should compiler be more helpful?
« Reply #3 on: December 31, 2017, 03:27:26 pm »
In general, inline assembly should be translated as literally as possible into machine code, because the whole point of inline assembly is to give programmers direct access to the CPU. So if you write that you wish to write something to the absolute address of a global variable, that's what the compiler should generate. Depending on the platform and runtime environment, there may also be multiple ways to access global variables (e.g. GOT-based or not), so changing the (valid) user-written assembly code in a way that may or may not be intended is not a good idea.

For local variables and parameters it's simpler because there is only one way to access them (and they don't have an absolute address that is available at compile/link time).
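A small sketch of the local-variable case, for contrast with the global one. The procedure and variable names are hypothetical, and it assumes the procedure is a regular (non-assembler) routine, so the compiler sets up the stack frame itself.

```pascal
program LocalAsmDemo; // hypothetical name

{$ASMMODE INTEL}

procedure SetLocal; // hypothetical helper
var
  n: longint;
begin
  asm
    // n has no link-time address, so the compiler substitutes a
    // frame-relative operand for the bare name.
    MOV n, 7
  end;
  writeln(n);
end;

begin
  SetLocal;
end.
```

Here the bare name is unambiguous because a stack slot can only be reached through the frame, whereas a global can be addressed in several ways.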

Arioch

  • Sr. Member
  • ****
  • Posts: 421
Re: RIP-relative addressing - should compiler be more helpful?
« Reply #4 on: January 10, 2024, 11:32:00 am »
Quote from: Jonas Maebe
so changing the (valid) user-written assembly code in a way that may or may not be intended

well, then maybe add some syntax to explicitly put the intention in?

like, extending Microsoft ASM style

Code: Pascal
  asm
    MOV  AUTO PTR r, 7 // crash!
  end;

Arioch

  • Sr. Member
  • ****
  • Posts: 421
Re: RIP-relative addressing - should compiler be more helpful?
« Reply #5 on: January 10, 2024, 11:37:49 am »
Quote from: Jonas Maebe
so changing the (valid) user-written assembly code in a way that may or may not be intended

well, then maybe add some syntax to explicitly put the intention in?

like, extending Microsoft ASM style

Code: Pascal
  asm
    MOV  AUTO PTR r, 7 // crash!
  end;

Though personally I do not see what the "may not" clause refers to.

"R" here is not a constant or literal, it is a variable reference. It is not "@R" or "Addr(R)" either.
So the intention seems clear here: getting the variable value, wherever the variable is.

Yes, perhaps inline assembler could be (in Delphi too) more consistent if it always demanded brackets for memory access, as in "mov [@R], 7", like PDP-11 assembler used to. But this would, I think, break compatibility with MASM/TASM/Delphi BASM syntax...

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: RIP-relative addressing - should compiler be more helpful?
« Reply #6 on: January 11, 2024, 11:30:43 pm »
As Jonas said, the point of the inline assembler is a translation that is as direct as possible. In the case of a global variable, the user may indeed not want a RIP-relative access (maybe because they're writing an operating system kernel that sits at a specific address). Inline assembler provides great power, but with it also comes great responsibility: it is up to the user to understand what they're doing and to do it correctly.

 
