Stream read/write via inline assembler [SOLVED by ASerge]

totya

Hero Member
Posts: 720

« on: June 09, 2019, 06:53:37 pm »

Hi!

I know asm isn't very popular, but I'd like to access stream data (TMemoryStream) via inline assembler (x86). I want to execute a simple byte-wise operation on the whole stream (stream.Size), and I think this would be much faster with asm.

So I'd like an asm equivalent of the following (only the for loop):

Code: Pascal

procedure TForm1.Operation(const StreamIn, StreamOut: TMemoryStream);
var
  i: integer;
  B: byte;
begin
  StreamIn.Position:=0;
  StreamOut.Clear;
 
  for i := 0 to StreamIn.Size - 1 do
  begin
    B := StreamIn.ReadByte;
    B := B + 1; // operation example...
    StreamOut.WriteByte(B);
  end;
 
  StreamOut.Position := 0;
end;      

I guess StreamIn.Memory points to the stream's memory...

Thanks...

« Last Edit: June 10, 2019, 10:53:16 pm by totya »

LemonParty

Jr. Member
Posts: 63

Re: Stream read/write via inline assembler

« Reply #1 on: June 09, 2019, 07:18:49 pm »

Use the Read method.
Then write a separate function that processes the array of bytes read (it may be written in pure assembler).
Then put the processed bytes back using TStream.Write.

This approach should noticeably improve performance.

(Inlining does not work for functions that contain loops.)
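The Read / process / Write steps above can be sketched roughly like this. It is only a minimal sketch: ChunkSize, ProcessBytes and CopyProcessed are hypothetical names made up for illustration, and "add 1" stands in for the real operation.

```pascal
program ChunkedDemo;
{$mode objfpc}{$H+}
uses
  Classes;

const
  ChunkSize = 64 * 1024; // hypothetical buffer size, tune as needed

// Hypothetical helper: applies the example operation (add 1, wrapping at 255)
// to Count bytes starting at P. This is the part that could be written in asm.
procedure ProcessBytes(P: PByte; Count: SizeInt);
var
  i: SizeInt;
begin
  for i := 0 to Count - 1 do
    P[i] := (P[i] + 1) and $FF;
end;

// Reads StreamIn in chunks, processes each chunk, appends it to StreamOut.
procedure CopyProcessed(StreamIn, StreamOut: TStream);
var
  Buf: array[0..ChunkSize - 1] of Byte;
  N: LongInt;
begin
  StreamIn.Position := 0;
  repeat
    N := StreamIn.Read(Buf, SizeOf(Buf));
    if N > 0 then
    begin
      ProcessBytes(@Buf[0], N);
      StreamOut.WriteBuffer(Buf, N);
    end;
  until N <= 0;
end;

var
  A, B: TMemoryStream;
  First, Second: Byte;
begin
  A := TMemoryStream.Create;
  B := TMemoryStream.Create;
  try
    A.WriteByte(100);
    A.WriteByte(255);
    CopyProcessed(A, B);
    B.Position := 0;
    First := B.ReadByte;
    Second := B.ReadByte;
    WriteLn(First, ' ', Second);
  finally
    A.Free;
    B.Free;
  end;
end.
```

This makes one Read and one Write call per chunk instead of one ReadByte/WriteByte pair per byte, which is where most of the per-byte overhead comes from.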

rvk

Hero Member
Posts: 6169

Re: Stream read/write via inline assembler

« Reply #2 on: June 09, 2019, 07:26:10 pm »

As you mentioned, StreamIn.Memory points to the array of bytes, so just loop over that in assembler and change each byte. After that, StreamIn will contain your changed data. No need for read and write.
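For illustration, the in-place idea could look like the following in plain Pascal, before any assembler is involved. This is a sketch only: IncrementInPlace is a hypothetical name and "add 1" is just the example operation from the first post.

```pascal
program InPlaceDemo;
{$mode objfpc}{$H+}
uses
  Classes;

// Hypothetical helper: modifies every byte of the stream's buffer in place.
procedure IncrementInPlace(Stream: TMemoryStream);
var
  P: PByte;
  i, N: SizeInt;
begin
  P := PByte(Stream.Memory); // pointer to the stream's internal buffer
  N := Stream.Size;          // only the first Size bytes are valid data
  for i := 0 to N - 1 do
    P[i] := (P[i] + 1) and $FF;
end;

var
  S: TMemoryStream;
  First, Second: Byte;
begin
  S := TMemoryStream.Create;
  try
    S.WriteByte(100);
    S.WriteByte(7);
    IncrementInPlace(S);
    S.Position := 0;
    First := S.ReadByte;
    Second := S.ReadByte;
    WriteLn(First, ' ', Second);
  finally
    S.Free;
  end;
end.
```

Note that the pointer from Memory is only valid until the stream is resized, so it should be fetched after the stream has its final size.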

But I wonder how much performance gain you will get.

How large is the stream you are talking about?

totya

Hero Member
Posts: 720

Re: Stream read/write via inline assembler

« Reply #3 on: June 09, 2019, 07:30:40 pm »

Hi rvk master!

Well, StreamIn and StreamOut differ in size, so I need to read from StreamIn and write to StreamOut (StreamOut has a header).

Quote

so just loop that in assembler and change the byte

I understand, but please show me an example.

I last used assembler a long time ago... and Google is not my friend in this case...

Edit, for the new question:

Quote from: rvk on June 09, 2019, 07:26:10 pm

How large is the stream you are talking about?

The sources are files, and the total file size is about 20 MB (at the moment). The speed is not too bad with Pascal, but I want to see it with asm...

« Last Edit: June 09, 2019, 08:01:41 pm by totya »

jamie

Hero Member
Posts: 6131

Re: Stream read/write via inline assembler

« Reply #4 on: June 09, 2019, 08:01:47 pm »

There is a property "Memory" which returns a pointer to the memory block.

So if you know assembler you can do this.

I suppose I can code up an example, but why? Unless you are doing a lot of short setups, I can't see a reason for it, but who am I to say. Maybe I'll feel generous and code up an example.

The only true wisdom is knowing you know nothing

rvk

Hero Member
Posts: 6169

Re: Stream read/write via inline assembler

« Reply #5 on: June 09, 2019, 08:03:10 pm »

I don't think it will be much faster in assembler but you can try.
For me it has been even longer since I worked with assembler (around 1988).

But for move() in fpc, it is already in assembler. So you can create a stream (set size) and just do a move. Or don't work with streams at all and just work with arrays.

Code: Pascal

procedure Move(const source;var dest;count:SizeInt);[public, alias: 'FPC_MOVE'];assembler;nostackframe;
asm
  cmp     ecx,SMALLMOVESIZE
  ja      @Large
  cmp     eax,edx
  lea     eax,[eax+ecx]
  jle     @SmallCheck
@SmallForward:
  add     edx,ecx
  jmp     SmallForwardMove_3
@SmallCheck:
  je      @Done {For Compatibility with Delphi's move for Source = Dest}
  sub     eax,ecx
  jmp     SmallBackwardMove_3
@Large:
  jng     @Done {For Compatibility with Delphi's move for Count < 0}
  cmp     eax,edx
  jg      @moveforward
  je      @Done {For Compatibility with Delphi's move for Source = Dest}
  push    eax
  add     eax,ecx
  cmp     eax,edx
  pop     eax
  jg      @movebackward
@moveforward:
  jmp     dword ptr fastmoveproc_forward
@movebackward:
  jmp     dword ptr fastmoveproc_backward {Source/Dest Overlap}
@Done:
end;
 
{Move ECX Bytes from EAX to EDX, where EAX > EDX and ECX > 36 (SMALLMOVESIZE)}
procedure Forwards_SSE_3;assembler;nostackframe;
const
  LARGESIZE = 2048;
asm
  cmp     ecx,LARGESIZE
  jge     @FwdLargeMove
  cmp     ecx,SMALLMOVESIZE+32
  movups  xmm0,[eax]
  jg      @FwdMoveSSE
  movups  xmm1,[eax+16]
  movups  [edx],xmm0
  movups  [edx+16],xmm1
  add     eax,ecx
  add     edx,ecx
  sub     ecx,32
  jmp     SmallForwardMove_3
@FwdMoveSSE:
  push    ebx
  mov     ebx,edx
  {Align Writes}
  add     eax,ecx
  add     ecx,edx
  add     edx,15
  and     edx,-16
  sub     ecx,edx
  add     edx,ecx
  {Now Aligned}
  sub     ecx,32
  neg     ecx
@FwdLoopSSE:
  movups  xmm1,[eax+ecx-32]
  movups  xmm2,[eax+ecx-16]
  movaps  [edx+ecx-32],xmm1
  movaps  [edx+ecx-16],xmm2
  add     ecx,32
  jle     @FwdLoopSSE
  movups  [ebx],xmm0 {First 16 Bytes}
  neg     ecx
  add     ecx,32
  pop     ebx
  jmp     SmallForwardMove_3
@FwdLargeMove:
  push    ebx
  mov     ebx,ecx
  test    edx,15
  jz      @FwdLargeAligned
  {16 byte Align Destination}
  mov     ecx,edx
  add     ecx,15
  and     ecx,-16
  sub     ecx,edx
  add     eax,ecx
  add     edx,ecx
  sub     ebx,ecx
  {Destination now 16 Byte Aligned}
  call    SmallForwardMove_3
  mov     ecx,ebx
@FwdLargeAligned:
  and     ecx,-16
  sub     ebx,ecx {EBX = Remainder}
  push    edx
  push    eax
  push    ecx
  call    AlignedFwdMoveSSE_3
  pop     ecx
  pop     eax
  pop     edx
  add     ecx,ebx
  add     eax,ecx
  add     edx,ecx
  mov     ecx,ebx
  pop     ebx
  jmp     SmallForwardMove_3
end; {Forwards_SSE}

This can be shortened but you will only gain a few cycles because this procedure first determines the best way to do the move and then jumps to the appropriate function.

But I wouldn't focus on the move procedure because it's already in assembler. So create an array, first add the header, then do the move from instream.memory to your array and loop through it to perform your action.

If you look at the assembler "debug view" when you run the following snippet:

Code: Pascal

for i := 0 to 20 * 1024 * 1024 do
begin
  a[i] := a[i] + 1;
end;

You see something like this:

Code: Pascal

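asm
  movl   $0xffffffff,-0xdc(%ebp)   // i := -1
  mov    %esi,%esi                 // no-op, used for alignment
@back:
  mov    -0xdc(%ebp),%eax
  add    $0x1,%eax                 // i := i + 1
  mov    %eax,-0xdc(%ebp)
  movzbl -0x70(%ebp,%eax,1),%eax   // load a[i]
  add    $0x1,%eax                 // a[i] + 1
  mov    -0xdc(%ebp),%edx
  mov    %al,-0x70(%ebp,%edx,1)    // store the low byte back to a[i]
  cmpl   $0x1400000,-0xdc(%ebp)    // loop while i < 20 * 1024 * 1024
  jl     @back
end;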

What would you like to change performance-wise?

It would all depend on your "B := B + 1;" calculation because I don't think you can make the loop any more efficient. (But don't use the stream.read and stream.write because they do add some overhead)

Note: Back when I did assembler we only had ax, ah, al and such (8086 processor).

totya

Hero Member
Posts: 720

Re: Stream read/write via inline assembler

« Reply #6 on: June 09, 2019, 08:11:23 pm »

Quote from: jamie on June 09, 2019, 08:01:47 pm

there is a property "Memory" which returns the pointer of the memory block So if you know assembler you can do this..

If I could see code that reads data from the stream/buffer into a register byte by byte and writes it to the other stream/buffer, I think that would be a good start.

rvk

Hero Member
Posts: 6169

Re: Stream read/write via inline assembler

« Reply #7 on: June 09, 2019, 08:17:07 pm »

The last snippet in my previous post shows the for loop to manipulate an array.

So first create the header in an array, then move the instream.memory after it and do the for loop in assembler.

But even if you don't do the for loop in assembler, you can do just your b := b + 1 in assembler (assuming it does something different than just adding 1).

ASerge

Hero Member
Posts: 2249

Re: Stream read/write via inline assembler

« Reply #8 on: June 09, 2019, 08:22:29 pm »

Quote from: totya on June 09, 2019, 08:11:23 pm

If I could see code that reads data from the stream/buffer into a register byte by byte and writes it to the other stream/buffer, I think that would be a good start.

Code: Pascal

{$ASMMODE INTEL}
procedure Operation(const StreamIn, StreamOut: TMemoryStream);
var
  LSize: SizeInt;
begin
  LSize := StreamIn.Size;
  StreamOut.Size := LSize;
  //repeat
  //  Dec(LSize);
  //  if LSize < 0 then
  //    Break;
  //  PByte(StreamOut.Memory)[LSize] := PByte(StreamIn.Memory)[LSize] + 1;
  //until False;
  asm
    mov  rsi, StreamIn
    mov  rdi, StreamOut
    mov  rcx, LSize
    @@StartLoop:
    dec  rcx
    jl   @@EndLoop
    mov  al, [rsi+rcx].TMemoryStream.Memory
    inc  al
    mov  [rdi+rcx].TMemoryStream.Memory, al
    jmp  @@StartLoop
    @@EndLoop:
  end ['rsi', 'rdi', 'rcx', 'rax'];
  StreamOut.Position := 0;
end;

When "asm" blocks are used, FPC stops optimizing the surrounding code, so the version without asm is likely to be faster.
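As a rough sketch of the asm-free variant, the loop can walk both buffers with plain pointers and leave the rest to the optimizer. Src, Dst and N are names chosen for this sketch, and "add 1" is again only the placeholder operation.

```pascal
program PurePascalDemo;
{$mode objfpc}{$H+}
uses
  Classes;

// Sketch: same job as the asm loop, written with pointers only.
procedure Operation(StreamIn, StreamOut: TMemoryStream);
var
  Src, Dst: PByte;
  N: SizeInt;
begin
  N := StreamIn.Size;
  StreamOut.Size := N;       // allocate first; resizing can move the buffer
  Src := StreamIn.Memory;    // fetch the pointers only after resizing
  Dst := StreamOut.Memory;
  while N > 0 do
  begin
    Dst^ := (Src^ + 1) and $FF;
    Inc(Src);
    Inc(Dst);
    Dec(N);
  end;
  StreamOut.Position := 0;
end;

var
  A, B: TMemoryStream;
  First, Second: Byte;
begin
  A := TMemoryStream.Create;
  B := TMemoryStream.Create;
  try
    A.WriteByte(100);
    A.WriteByte(100);
    Operation(A, B);
    First := B.ReadByte;
    Second := B.ReadByte;
    WriteLn(First, ' ', Second);
  finally
    A.Free;
    B.Free;
  end;
end.
```

Whether this beats a hand-written asm loop depends on the compiler version and optimization level, so it is worth timing both on the real 20 MB input.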

« Last Edit: June 09, 2019, 08:24:41 pm by ASerge »

LemonParty

Jr. Member
Posts: 63

Re: Stream read/write via inline assembler

« Reply #9 on: June 09, 2019, 08:56:29 pm »

Steroids gotta make the cycle running with the speed of light (at least 4 times faster than classic instructions).
But do you need the speed of light?

totya

Hero Member
Posts: 720

Re: Stream read/write via inline assembler

« Reply #10 on: June 09, 2019, 09:10:53 pm »

Quote from: ASerge on June 09, 2019, 08:22:29 pm

Code: Pascal
{$ASMMODE INTEL}...

Thank you for this very readable code! It's enough for me to get started... These look like x64 registers to me, but I don't think that's a big problem (rsi -> esi).

Thanks also to LemonParty, rvk master and jamie for the answers and information.

totya

Hero Member
Posts: 720

Re: Stream read/write via inline assembler

« Reply #11 on: June 09, 2019, 10:10:26 pm »

Quote from: ASerge on June 09, 2019, 08:22:29 pm

Hi!

My operation is more complicated than inc(), but unfortunately I got a SIGSEGV (at StreamOut.Position := 0;) with this untouched simple test code:

Code: Pascal

unit Unit1;
 
{$mode objfpc}{$H+}
 
interface
 
uses
  Classes, SysUtils, Forms, Controls, Graphics, Dialogs, StdCtrls;
 
type
 
  { TForm1 }
 
  TForm1 = class(TForm)
    Button1: TButton;
    Memo1: TMemo;
    procedure Button1Click(Sender: TObject);
  private
    procedure Operation(const StreamIn, StreamOut: TMemoryStream);
 
  end;
 
var
  Form1: TForm1;
 
implementation
 
{$R *.lfm}
 
{ TForm1 }
 
{$ASMMODE INTEL}
procedure TForm1.Operation(const StreamIn, StreamOut: TMemoryStream);
var
  LSize: SizeInt;
begin
  LSize := StreamIn.Size;
  StreamOut.Size := LSize;
 
  //repeat
  //  Dec(LSize);
  //  if LSize < 0 then
  //    Break;
  //  PByte(StreamOut.Memory)[LSize] := PByte(StreamIn.Memory)[LSize] + 1;
  //until False;
  asm
    mov  rsi, StreamIn
    mov  rdi, StreamOut
    mov  rcx, LSize
    @@StartLoop:
    dec  rcx
    jl   @@EndLoop
    mov  al, [rsi+rcx].TMemoryStream.Memory
    inc  al
    mov  [rdi+rcx].TMemoryStream.Memory, al
    jmp  @@StartLoop
    @@EndLoop:
  end ['rsi', 'rdi', 'rcx', 'rax'];
 
  StreamOut.Position := 0;
end;
 
procedure TForm1.Button1Click(Sender: TObject);
var
  StreamIn, StreamOut: TMemoryStream;
begin
  StreamIn := TMemoryStream.Create;
  StreamOut := TMemoryStream.Create;
  try
    StreamIn.WriteByte(100);
    StreamIn.WriteByte(100);
 
    Operation(StreamIn, StreamOut);
 
    Memo1.Lines.Add(IntToStr(StreamOut.ReadByte));
    Memo1.Lines.Add(IntToStr(StreamOut.ReadByte));
  finally
    StreamIn.Free;
    StreamOut.Free;
  end;
end;
 
end.

If I comment out "inc al", this code runs without error, but the result is garbage (120, 204).

jamie

Hero Member
Posts: 6131

Re: Stream read/write via inline assembler

« Reply #12 on: June 09, 2019, 10:36:43 pm »

Or you can use Move:

Move(SourceStream.Memory^, DestinationStream.Memory^, SourceStream.Size);

Set the destination's Size first so its buffer is large enough, then reset its Position back to zero or wherever you need it.

Move is a system-level routine and should be better optimized than going through the stream methods.
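Since the output stream carries a header, a Move-based copy might look something like this. It is a sketch under assumptions: the 4-byte Header constant and the CopyWithHeader name are invented for illustration, and the real header layout is not given in this thread.

```pascal
program MoveDemo;
{$mode objfpc}{$H+}
uses
  Classes;

const
  // Hypothetical 4-byte header, purely for illustration.
  Header: array[0..3] of Byte = ($46, $50, $43, $31);

// Sketch: writes the header, then block-copies the input right after it.
procedure CopyWithHeader(StreamIn, StreamOut: TMemoryStream);
var
  N: SizeInt;
begin
  N := StreamIn.Size;
  StreamOut.Size := SizeOf(Header) + N; // size the buffer before taking pointers
  Move(Header[0], StreamOut.Memory^, SizeOf(Header));
  if N > 0 then
    Move(StreamIn.Memory^, PByte(StreamOut.Memory)[SizeOf(Header)], N);
  StreamOut.Position := 0;
end;

var
  A, B: TMemoryStream;
  Payload: Byte;
begin
  A := TMemoryStream.Create;
  B := TMemoryStream.Create;
  try
    A.WriteByte(100);
    A.WriteByte(200);
    CopyWithHeader(A, B);
    B.Position := SizeOf(Header); // skip the header
    Payload := B.ReadByte;        // first payload byte
    WriteLn(B.Size, ' ', Payload);
  finally
    A.Free;
    B.Free;
  end;
end.
```

After the copy, the per-byte operation can run in place over the payload part of StreamOut's buffer, so no second pass through Read/Write is needed.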

ASerge

Hero Member
Posts: 2249

Re: Stream read/write via inline assembler

« Reply #13 on: June 09, 2019, 11:57:03 pm »

Quote from: totya on June 09, 2019, 10:10:26 pm

Quote from: ASerge on June 09, 2019, 08:22:29 pm

My operation is more complicated than inc(), but unfortunately I got a SIGSEGV (at StreamOut.Position := 0;) with this untouched simple test code:

That's an additional danger with assembler: it's easy to make a mistake. I forgot to dereference Memory, and the property has to be accessed directly through its field (FMemory); otherwise offset 0 is used.

Code: Pascal

procedure Operation(const StreamIn, StreamOut: TMemoryStream);
var
  LSize: SizeInt;
begin
  LSize := StreamIn.Size;
  StreamOut.Size := LSize;
  //repeat
  //  Dec(LSize);
  //  if LSize < 0 then
  //    Break;
  //  PByte(StreamOut.Memory)[LSize] := PByte(StreamIn.Memory)[LSize] + 1;
  //until False;
  asm
    mov  rsi, StreamIn
    mov  rdi, StreamOut
    mov  rcx, LSize
    @@StartLoop:
    dec  rcx
    jl   @@EndLoop
    mov  rdx, [rsi].TMemoryStream.FMemory
    mov  al, [rdx+rcx]
    inc  al
    mov  rdx, [rdi].TMemoryStream.FMemory
    mov  BYTE PTR [rdx+rcx], al
    jmp  @@StartLoop
    @@EndLoop:
  end ['rsi', 'rdi', 'rcx', 'rax', 'rdx'];
  StreamOut.Position := 0;
end;

totya

Hero Member
Posts: 720

Re: Stream read/write via inline assembler

« Reply #14 on: June 10, 2019, 09:50:41 am »

Quote from: ASerge on June 09, 2019, 11:57:03 pm

...

Big thanks to you, the result of this sample code is okay now... But now I have one register fewer to use for my operations.
