@ASerge:
Thanks. So I have to do the extra step in non-ASM methods? Alright.
@Akira1364:
Sometimes the programmer has more information available than the compiler, which can make all the difference. Especially when the compiler's usual approach isn't really designed for the problem.
I'm writing an emulator (several million opcodes per second) with an interpreter core. The naive approach reads a byte (the opcode) from virtual memory in an endless loop and runs the appropriate opcode handler via a case-of construct, which defeats the host CPU's branch predictor: the single dispatch branch has 256 possible targets. A more refined approach uses an array of pointers to labels and jumps between them via computed goto at the end of every opcode handler, so each handler ends in its own indirect branch that the predictor can learn separately. The "problem" is that in my case this burns 2*256*8 bytes = 4 KB of the host CPU's L1 data cache (x64 host CPU, 2 guest CPU modes, 8-byte pointers), which is usually 32 KB per core. That's 12.5% of high-speed cache occupied that could otherwise hold other data.
So my idea was to take the opcode byte and transform it (without further memory accesses) into the opcode handler's memory address. That part is already working, using a fixed virtual memory layout (yes, highly platform specific, but that's OK) where I copy each handler's program code to its own strategic position. (The handlers can't touch global variables, since x64 RIP-relative addressing breaks when code is moved, but that's no problem.) The open problem is how to safely return from there when it's time to run the rest of the program.
(This is a "for fun" project, and "just write a JITter" / "write the whole program in ASM" wouldn't be fun.)