
Author Topic: Stream read/write via inline assembler [SOLVED by ASerge]  (Read 6303 times)

totya

  • Hero Member
  • *****
  • Posts: 720
Re: Stream read/write via inline assembler
« Reply #15 on: June 10, 2019, 01:07:05 pm »
...

Thank you again!

With your sample, I was able to do what I wanted. As I remembered, assembler is much faster than a high-level language.

Algorithm self-test (hashes compared!) with 300 MB of data:

pascal implementation: 35,32 MB/s
assembler implementation: 333,88 MB/s

It's about 10 times faster...

But I have a question. To tell the truth, I don't know exactly why your first sample doesn't work (you wrote that you forgot to dereference the pointers), so in my algorithm I use constant array values. My question is: can I safely access this constant array (ADD_ARRAY_SAMPLE) with my code?

Sample (not the real algorithm), x86 version:

Code: Pascal
procedure TForm1.Operation2(const Stream: TMemoryStream);
const
  ADD_ARRAY_SAMPLE: array[0..2] of byte = ($01, $02, $03);
var
  LSize, Counter: SizeInt;
  ADD_ARRAY_SAMPLE_PTR: pointer;
begin
  LSize := Stream.Size;
  if LSize = 0 then
    Exit; // the loop below checks its end condition only after processing a byte
  ADD_ARRAY_SAMPLE_PTR := @ADD_ARRAY_SAMPLE;

  Counter := 0;

  {$ASMMODE INTEL}
  asm
     mov  esi, Stream
     mov  ecx, Counter // ADD_ARRAY_SAMPLE index
     mov  ebx, 0 // Stream index

     @@StartLoop:
     mov  edx, [esi].TMemoryStream.FMemory
     mov  al, [edx+ebx] // read one byte from the stream buffer

     mov  edx, ADD_ARRAY_SAMPLE_PTR
     mov  ah, [edx+ecx] // read the current array element

     add  al, ah

     mov  edx, [esi].TMemoryStream.FMemory
     mov  BYTE PTR [edx+ebx], al // write the result back

     cmp  ecx, 2
     je   @@ARRAY_MAX
     inc  ecx
     jmp  @@Next

     @@ARRAY_MAX:
     mov  ecx, 0 // wrap the array index

     @@Next:
     inc  ebx
     cmp  ebx, LSize
     je   @@EndLoop

     jmp  @@StartLoop
     @@EndLoop:
   end ['esi', 'eax', 'ebx', 'ecx', 'edx'];

  Stream.Position := 0;
end;

... and I have one simple question: why doesn't Ctrl+D (JEDI Code Format) work when the source contains asm code?

Thank you!

rvk

  • Hero Member
  • *****
  • Posts: 6056
Re: Stream read/write via inline assembler
« Reply #16 on: June 10, 2019, 01:17:08 pm »
pascal implementation: 35,32 MB/s
assembler implementation: 333,88 MB/s
It's about 10 times faster...
Can we see your pascal implementation?
Did it also work with TMemoryStream.Memory as an array, or did you use the stream's Read and Write?

Did you try the repeat/until that ASerge showed as commented-out code?
Because I don't think you should get that much of a difference.

I suspect that using repeat/until with just your algorithm in assembler will be just as fast, or maybe slightly slower, but not by a factor of 10.

« Last Edit: June 10, 2019, 01:19:44 pm by rvk »

totya

  • Hero Member
  • *****
  • Posts: 720
Re: Stream read/write via inline assembler
« Reply #17 on: June 10, 2019, 02:29:02 pm »
Can we see your pascal implementation?

rvk master, I knew it, I knew it :)

So, the algorithm's author sent me the algorithm to use (his app doesn't work correctly). I don't have permission to show the original code, but it is something like this:

Code: Pascal
Size := MemStreamIn.Size;
MemStreamOut.Size := Size + HeaderSize;
Counter := 0;

for I := 0 to Size - 1 do
  begin
    B := MemStreamIn.ReadByte;

    { simple bitwise operations with a const array... array element choice: see: Counter }

    MemStreamOut.WriteByte(B);

    if Counter < X then
      Inc(Counter)
    else
      Counter := 0;
  end;

Edit, because I forgot to answer your other questions:
Did it also work with tstream.memory as array or did you use tstream.read and write?

As you can see, I use the plain stream ReadByte/WriteByte.

Did you try the repeat/until ASerge showed as commented code?

I tried it after I saw that the first assembler code was unusable, but I didn't look into it any deeper.
« Last Edit: June 10, 2019, 02:37:02 pm by totya »

rvk

  • Hero Member
  • *****
  • Posts: 6056
Re: Stream read/write via inline assembler
« Reply #18 on: June 10, 2019, 02:42:04 pm »
... but something about this:
There is the problem.
Your Pascal code still calls MemStreamIn.ReadByte and MemStreamOut.WriteByte once for every single byte, and that per-byte call overhead slows your code down considerably. No wonder it is 10 times slower than the assembler implementation.

You need to do something like this:

Code: Pascal
Size := MemStreamIn.Size;
MemStreamOut.Size := Size + HeaderSize;
pIn := 0;
pOut := HeaderSize;
pTo := MemStreamOut.Size;
repeat
  B := PByte(MemStreamIn.Memory)[pIn];

  asm
    // put here your assembler code of just the algorithm.
  end;

  PByte(MemStreamOut.Memory)[pOut] := B;
  Inc(pIn);
  Inc(pOut);
until pOut > pTo;
(just typed off the top of my head, so the > and the begin and end values might be slightly off)

But you will see that with this implementation the Pascal code is still very fast, because you don't use the ReadByte and WriteByte methods.

(And even this probably can be more optimized)
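
For comparison, here is a minimal sketch of the sample operation from earlier in the thread (add a cycling constant array to every byte) written entirely in Pascal with direct access to the stream's memory. It is not the real algorithm, which was not posted; the program name, AddArrayInPlace and TestData are made up for illustration.

```pascal
program DirectAccessSketch;
{$mode objfpc}{$H+}

uses
  Classes;

const
  // Same illustrative values as the x86 sample in the thread.
  ADD_ARRAY_SAMPLE: array[0..2] of Byte = ($01, $02, $03);

// Hypothetical helper: adds the cycling array to every byte, in place.
procedure AddArrayInPlace(Stream: TMemoryStream);
var
  P: PByte;
  LSize, I: SizeInt;
  Counter: Integer;
begin
  LSize := Stream.Size;
  P := PByte(Stream.Memory);        // fetch the buffer pointer once
  Counter := 0;
  for I := 0 to LSize - 1 do        // runs zero times for an empty stream
  begin
    // "and $FF" wraps modulo 256, like "add al, ah" does
    P[I] := (P[I] + ADD_ARRAY_SAMPLE[Counter]) and $FF;
    if Counter < High(ADD_ARRAY_SAMPLE) then
      Inc(Counter)
    else
      Counter := 0;
  end;
end;

const
  TestData: array[0..3] of Byte = ($10, $20, $30, $FF);
var
  MS: TMemoryStream;
  I: Integer;
begin
  MS := TMemoryStream.Create;
  try
    MS.WriteBuffer(TestData, SizeOf(TestData));
    AddArrayInPlace(MS);
    for I := 0 to High(TestData) do
      Write(PByte(MS.Memory)[I], ' ');   // prints 17 34 51 0
    WriteLn;
  finally
    MS.Free;
  end;
end.
```

The for loop does nothing on an empty stream, so no separate size check is needed, and the buffer pointer is read once instead of on every iteration.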

totya

  • Hero Member
  • *****
  • Posts: 720
Re: Stream read/write via inline assembler
« Reply #19 on: June 10, 2019, 03:01:40 pm »
(And even this probably can be more optimized)

Hi, I tried your code with the operation inside written in Pascal, and I got 192,74 MB/s. It's not bad, much faster than the original, but slower than 333,88 MB/s :) "More optimized" indeed.
« Last Edit: June 10, 2019, 03:03:41 pm by totya »

rvk

  • Hero Member
  • *****
  • Posts: 6056
Re: Stream read/write via inline assembler
« Reply #20 on: June 10, 2019, 03:04:33 pm »
Ok, I thought it would be almost as fast because there are no extra calls.

(I take it you do your testing outside the IDE, without the debugger)

totya

  • Hero Member
  • *****
  • Posts: 720
Re: Stream read/write via inline assembler
« Reply #21 on: June 10, 2019, 03:06:56 pm »
(I take it you do your testing outside the ide without debugger)

No speed difference with or without the debugger (MB/s):

193,74
194,12

totya

  • Hero Member
  • *****
  • Posts: 720
Re: Stream read/write via inline assembler
« Reply #22 on: June 10, 2019, 03:15:08 pm »
But what a surprise: when I compiled the same code (rvk's loop, with the operation inside written in Pascal) for x64, I got 389,05 MB/s.

totya

  • Hero Member
  • *****
  • Posts: 720
Re: Stream read/write via inline assembler
« Reply #23 on: June 12, 2019, 10:38:10 pm »
Code: Pascal
Size := MemStreamIn.Size;
MemStreamOut.Size := Size + HeaderSize;
pIn := 0;
pOut := HeaderSize;
pTo := MemStreamOut.Size - 1; // orig: pTo := MemStreamOut.Size;
repeat
  B := PByte(MemStreamIn.Memory)[pIn];

  asm
    // put here your assembler code of just the algorithm.
  end;

  PByte(MemStreamOut.Memory)[pOut] := B;
  Inc(pIn);
  Inc(pOut);
until pOut > pTo;

Small bug corrected ;) See the // comment.
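
The end value is easier to get right with a for loop over the input size. The sketch below is only an illustration under assumptions: HeaderSize = 4 is a made-up value, ProcessToOutput is a hypothetical name, and "xor $FF" is a placeholder for the real per-byte operation, which was not posted.

```pascal
program CopyWithHeaderSketch;
{$mode objfpc}{$H+}

uses
  Classes;

const
  HeaderSize = 4;  // hypothetical header length, for illustration only

// Hypothetical helper: writes one output byte per input byte, after the header.
procedure ProcessToOutput(MemStreamIn, MemStreamOut: TMemoryStream);
var
  Src, Dst: PByte;
  Size, I: SizeInt;
begin
  Size := MemStreamIn.Size;
  MemStreamOut.Size := Size + HeaderSize;
  // Take the pointers only after the output has been resized,
  // because resizing may move the buffer.
  Src := PByte(MemStreamIn.Memory);
  Dst := PByte(MemStreamOut.Memory) + HeaderSize;
  for I := 0 to Size - 1 do        // exactly Size iterations, none if Size = 0
    Dst[I] := Src[I] xor $FF;      // placeholder for the real operation
end;

var
  InS, OutS: TMemoryStream;
  Sample: array[0..2] of Byte = (1, 2, 3);
begin
  InS := TMemoryStream.Create;
  OutS := TMemoryStream.Create;
  try
    InS.WriteBuffer(Sample, SizeOf(Sample));
    ProcessToOutput(InS, OutS);
    WriteLn(OutS.Size);                       // 7 = 3 data bytes + 4 header bytes
    WriteLn(PByte(OutS.Memory)[HeaderSize]);  // 254 = 1 xor $FF
  finally
    OutS.Free;
    InS.Free;
  end;
end.
```

Unlike repeat/until, this form also does nothing when the input stream is empty. The header bytes themselves are left unwritten here.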

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: Stream read/write via inline assembler [SOLVED by ASerge]
« Reply #24 on: June 12, 2019, 11:59:09 pm »
If you are after speed, try using SIMD instructions. Or unroll the loop.
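
As a rough sketch of what unrolling could look like for the sample operation in this thread (not the real algorithm): the constant array has three elements, so handling three bytes per iteration removes the counter and its branch from the main loop. AddArrayUnrolled and Buf are hypothetical names.

```pascal
program UnrollSketch;
{$mode objfpc}{$H+}

const
  // Same illustrative values as the x86 sample in the thread.
  ADD_ARRAY_SAMPLE: array[0..2] of Byte = ($01, $02, $03);

// Hypothetical helper: adds the cycling array to Count bytes at P,
// three bytes per loop iteration.
procedure AddArrayUnrolled(P: PByte; Count: SizeInt);
var
  I, Full: SizeInt;
begin
  Full := Count - (Count mod 3);   // largest multiple of 3 not above Count
  I := 0;
  while I < Full do
  begin
    P[I]     := (P[I]     + ADD_ARRAY_SAMPLE[0]) and $FF;
    P[I + 1] := (P[I + 1] + ADD_ARRAY_SAMPLE[1]) and $FF;
    P[I + 2] := (P[I + 2] + ADD_ARRAY_SAMPLE[2]) and $FF;
    Inc(I, 3);
  end;
  // Tail: at most two leftover bytes; I - Full is the array index (0 or 1).
  while I < Count do
  begin
    P[I] := (P[I] + ADD_ARRAY_SAMPLE[I - Full]) and $FF;
    Inc(I);
  end;
end;

var
  Buf: array[0..4] of Byte = (10, 20, 30, 40, 50);
  I: Integer;
begin
  AddArrayUnrolled(@Buf[0], Length(Buf));
  for I := 0 to High(Buf) do
    Write(Buf[I], ' ');            // prints 11 22 33 41 52
  WriteLn;
end.
```

Whether this is actually faster depends on the compiler and CPU, so it needs measuring on the real data.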

rvk

  • Hero Member
  • *****
  • Posts: 6056
Re: Stream read/write via inline assembler
« Reply #25 on: June 13, 2019, 12:49:43 am »
Small bug corrected ;) Se: //
Not a 'bug'... That's why I added the extra note  :P

(just typed out of my head so the > and begin and end values might be slightly off)

totya

  • Hero Member
  • *****
  • Posts: 720
Re: Stream read/write via inline assembler
« Reply #26 on: June 13, 2019, 05:11:05 pm »
and begin and end values might be slightly off)

Ha master! :)

With my weak English I didn't understand this sentence, but now I can imagine what it means :)

totya

  • Hero Member
  • *****
  • Posts: 720
Re: Stream read/write via inline assembler [SOLVED by ASerge]
« Reply #27 on: June 13, 2019, 05:14:52 pm »
If you are about speed, try using SIMD instructions. Or unroll the loop.

Hi, thanks, the speed is okay for me now, but if you show me a workable sample (like ASerge's asm code) I can try it.

engkin

  • Hero Member
  • *****
  • Posts: 3112
Re: Stream read/write via inline assembler [SOLVED by ASerge]
« Reply #28 on: June 13, 2019, 06:52:59 pm »
With big size like:
The sources are files, and total file size about 20MB (at the moment).

and the type of instructions you want:
Code: Pascal
...
    { simple bitwise operations with a const array... array element choice: see: Counter }

It makes it sound like a perfect candidate for using SIMD instructions.

if you show me workable sample (like as ASerge asm code) I can to try it.

It is exactly the same code, but instead of using XOR you use its SIMD counterpart, PXOR. You don't deal with the normal CPU registers like EAX, EDX, etc. You have a different set of registers, like XMM1, and these registers are bigger: EAX is 4 bytes while XMM1 is 16 bytes. CPUs with AVX or AVX-512 have even bigger SIMD registers (32-byte YMM and 64-byte ZMM), and 64-bit mode gives you more XMM registers (16 instead of 8).
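
To illustrate the "more bytes per operation" idea without inline assembler, here is a sketch in plain Pascal that XORs a buffer with a repeating 16-byte key, 16 bytes per step, using two 64-bit words. This is not actual SIMD code: a single PXOR on an XMM register would do the work of both 64-bit XORs in one instruction. TKey16 and XorWide are hypothetical names, the 16-byte key is an assumption (the real algorithm and its array were not posted), and the unaligned 64-bit accesses assume an x86/x64 CPU.

```pascal
program WideXorSketch;
{$mode objfpc}{$H+}

type
  TKey16 = array[0..15] of Byte;   // hypothetical 16-byte repeating key

// Hypothetical helper: XORs Count bytes at Data with the repeating key.
procedure XorWide(Data: PByte; Count: SizeInt; const Key: TKey16);
var
  K: array[0..1] of QWord;
  Q: PQWord;
  I, Blocks: SizeInt;
begin
  Move(Key, K, SizeOf(K));          // view the key as two 64-bit words
  Blocks := Count div 16;
  Q := PQWord(Data);
  for I := 0 to Blocks - 1 do
  begin
    // Two 8-byte XORs cover one 16-byte block.
    Q[2 * I]     := Q[2 * I]     xor K[0];
    Q[2 * I + 1] := Q[2 * I + 1] xor K[1];
  end;
  // Remaining 0..15 bytes, one at a time.
  for I := Blocks * 16 to Count - 1 do
    Data[I] := Data[I] xor Key[I and 15];
end;

var
  Key: TKey16;
  A, B: array[0..19] of Byte;
  I: Integer;
  Same: Boolean;
begin
  for I := 0 to 15 do
    Key[I] := $A0 + I;
  for I := 0 to 19 do
  begin
    A[I] := I * 7;
    B[I] := A[I] xor Key[I and 15];   // byte-wise reference result
  end;
  XorWide(@A[0], Length(A), Key);
  Same := CompareByte(A, B, SizeOf(A)) = 0;
  WriteLn('wide result matches byte-wise result: ', Same);
end.
```

The program checks the wide version against a simple byte-wise loop on a 20-byte buffer, so both the 16-byte block path and the tail path are exercised.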

Here is an example.

totya

  • Hero Member
  • *****
  • Posts: 720
Re: Stream read/write via inline assembler [SOLVED by ASerge]
« Reply #29 on: June 13, 2019, 11:27:24 pm »
Here is an example.

Thank you! I will look at it at the weekend... if my old Intel Core 2 Duo supports it...

 
