
Author Topic: issue with assembly  (Read 4285 times)

dutchincle

  • New Member
  • *
  • Posts: 20
issue with assembly
« on: August 17, 2021, 04:55:04 am »
I am trying to write a function in assembler that returns the sum of the digits of a number. It seems to work, but only if I put a "write" call at the end of it, which slows everything down. If I remove it, the program crashes. Any ideas?

Code: Pascal
{$ASMMODE intel}
{$mode objfpc}{$h+}

function Get_sum_of_digits2(number : longint) : longint;
begin
  asm
    MOV ECX,10
    XOR EBX, EBX
    MOV EAX, NUMBER
  @1:
    XOR EDX,EDX
    DIV ECX
    ADD EBX, EDX
    CMP EAX, 0
    JA @1
    MOV EAX, EBX
    MOV RESULT, EAX
  end;
  write('');
end;

const
        sample = maxlongint;

begin
        writeln(sample);
        writeln(get_sum_of_digits2(sample));
end.


mika

  • Full Member
  • ***
  • Posts: 102
Re: issue with assembly
« Reply #1 on: August 17, 2021, 06:11:10 am »
When using assembler you have to be aware of the ABI.

Save and restore EBX or RBX, depending on whether your code is 32-bit or 64-bit:
Code: Pascal
{$ASMMODE intel}
{$mode objfpc}{$h+}

function Get_sum_of_digits2(number : longint) : longint; assembler; nostackframe;
  asm
    push rbx            // RBX is callee-saved: preserve it (EBX on 32-bit)
    MOV EAX, NUMBER
    MOV ECX,10
    XOR EBX, EBX        // EBX accumulates the digit sum

  @1:
    XOR EDX,EDX
    DIV ECX             // EAX := quotient, EDX := remainder (one digit)
    ADD EBX, EDX
    CMP EAX, 0
    JA @1
    MOV EAX, EBX
    MOV RESULT, EAX
    pop rbx             // restore RBX before returning
end;

const
        sample = maxlongint;

begin
        writeln(sample);
        writeln(get_sum_of_digits2(sample));
end.


mika

  • Full Member
  • ***
  • Posts: 102
Re: issue with assembly
« Reply #2 on: August 17, 2021, 06:15:40 am »
And still your function does not work with negative values.
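
The assembly version presumably misbehaves on negative inputs because DIV is an unsigned divide, so a negative longint is read as a large unsigned value. Below is a minimal pure-Pascal sketch of one way to handle the sign, assuming the digit sum of a negative number should equal that of its absolute value. The name DigitSumSigned is hypothetical and does not come from the thread.

```pascal
program SignedDigitSum;
{$mode objfpc}{$h+}

// Hypothetical helper: digit sum of the absolute value of number.
function DigitSumSigned(number: LongInt): LongInt;
var
  n: Int64;
begin
  // Widen to Int64 before Abs so that Low(LongInt) does not overflow.
  n := Abs(Int64(number));
  Result := 0;
  while n > 0 do
  begin
    Result := Result + LongInt(n mod 10);
    n := n div 10;
  end;
end;

begin
  WriteLn(DigitSumSigned(-123));
  WriteLn(DigitSumSigned(MaxLongInt));
end.
```

Widening to Int64 costs little here and avoids a special case for the most negative 32-bit value.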

Jonas Maebe

  • Hero Member
  • *****
  • Posts: 1059
Re: issue with assembly
« Reply #3 on: August 17, 2021, 08:25:37 am »
There are two ways to use inline assembly:
  • a pure assembler function (like mika showed) and then you only have to save the registers that are non-volatile according to the ABI.
  • an inline assembler block in a regular Pascal procedure/function, like you used. However, if you use an inline assembler block in a function, you have to either save and restore all modified registers yourself, or tell the compiler which ones you modified (so it can save/restore them in case it's necessary)
So in the second case:
Code:
function Get_sum_of_digits2(number : longint) : longint;
begin
  asm
    MOV ECX,10
..
    MOV EAX, EBX
    MOV RESULT, EAX
  end ['eax', 'ecx', 'ebx', 'edx'];
end;
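
For reference, the second variant might look as follows when written out in full. This is only a sketch: it assumes an x86 target (i386 or x86-64), takes the loop body from the original post, and uses the hypothetical name SumOfDigitsAsm. The register list after `end` is the one given above.

```pascal
program ClobberListDemo;
{$mode objfpc}{$h+}
{$asmmode intel}

// Hypothetical name; assumes an x86 target (i386 or x86-64).
function SumOfDigitsAsm(number: LongInt): LongInt;
begin
  asm
    MOV ECX, 10
    XOR EBX, EBX        // EBX accumulates the digit sum
    MOV EAX, number
  @1:
    XOR EDX, EDX
    DIV ECX             // EAX := quotient, EDX := remainder (one digit)
    ADD EBX, EDX
    CMP EAX, 0
    JA @1
    MOV Result, EBX
  end ['eax', 'ecx', 'ebx', 'edx'];  // registers modified by the block
end;

begin
  WriteLn(SumOfDigitsAsm(MaxLongInt));
end.
```

With the list in place the compiler knows which registers the block changes, so the `write('')` workaround from the original post should no longer be needed.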

dutchincle

  • New Member
  • *
  • Posts: 20
Re: issue with assembly
« Reply #4 on: August 17, 2021, 01:51:12 pm »
Thanks everyone!! This is my first adventure into asm, as I need this to be FAST.

Thanks

lucamar

  • Hero Member
  • *****
  • Posts: 4219
Re: issue with assembly
« Reply #5 on: August 17, 2021, 02:19:48 pm »
Quote
Thanks everyone!! This is my first adventure into asm, as I need this to be FAST.

Did you first try a pure Pascal solution to see whether the compiler would optimize it better than you can by hand?

I'm by no means casting aspersions on your assembly-fu, but the compiler can usually take advantage of implementation details that a human might miss at first sight. Then, if need be, you can take the compiler's assembly output and optimize it further.

mika

  • Full Member
  • ***
  • Posts: 102
Re: issue with assembly
« Reply #6 on: August 17, 2021, 03:00:34 pm »
Quote
Thanks everyone!! This is my first adventure into asm, as I need this to be FAST.

If your concern is speed, then do not use the DIV instruction. In this case you can "replace" division by multiplication, because the divisor is constant. If you rewrite the code in pure Pascal, the compiler will take care of converting the division into a multiplication.
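
What "replacing division by multiplication" means can be sketched by hand. The illustration below assumes 32-bit unsigned input and uses the well-known reciprocal constant $CCCCCCCD with a shift of 35. The names Div10ByMul and DigitSumByMul are hypothetical, and the code a compiler actually emits may differ.

```pascal
program ReciprocalDivDemo;
{$mode objfpc}{$h+}

// Hypothetical helper: n div 10 without a divide instruction.
// $CCCCCCCD = ceil(2^35 / 10), so (n * $CCCCCCCD) shr 35 equals n div 10
// for every 32-bit unsigned n. The product always fits in 64 bits.
function Div10ByMul(n: DWord): DWord;
begin
  Result := DWord((QWord(n) * QWord($CCCCCCCD)) shr 35);
end;

// Digit sum built on the multiply-and-shift quotient.
function DigitSumByMul(n: DWord): DWord;
var
  q: DWord;
begin
  Result := 0;
  while n > 0 do
  begin
    q := Div10ByMul(n);
    Result := Result + (n - q * 10);  // remainder = n - 10 * quotient
    n := q;
  end;
end;

begin
  // Compare the hand-written quotient with the built-in div.
  WriteLn(Div10ByMul(High(DWord)) = High(DWord) div 10);
  WriteLn(DigitSumByMul(12345));
end.
```

In practice there is rarely a reason to write this by hand; the point is only to show what kind of transformation stands in for DIV when the divisor is a constant.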

Nitorami

  • Sr. Member
  • ****
  • Posts: 496
Re: issue with assembly
« Reply #7 on: August 17, 2021, 05:21:23 pm »
Quote
Did you try first a pure Pascal solution to see whether the compiler would optimize it better than you can by hand?

A while ago I experimented with assembler routines I found on the Internet, such as random generators, in a naive attempt to get the fastest possible code. My conclusion: it's not worth the headache. The exact same routine written in plain Pascal and compiled at optimisation level 1 always ran faster than, or at least as fast as, my assembly.

An exception is if you want to take advantage of special CPU instructions, such as floating point vector processing, which are not implemented in FPC. But that requires much more than beginner's knowledge. If you absolutely need more speed, better consider multithreading.

dutchincle

  • New Member
  • *
  • Posts: 20
Re: issue with assembly
« Reply #8 on: August 17, 2021, 06:07:04 pm »


Quote
Did you try first a pure Pascal solution to see whether the compiler would optimize it better than you can by hand?


Below is my original code, probably not optimal.
Code: Pascal
// DivMod comes from unit Math (add "uses Math" to the program)
Function get_sum_of_digits1(num : longint): longint;
var sum,remainder : longint;
Begin
        sum := 0;
        while num > 0 do
        begin
                Divmod(num,10,num,remainder);
                inc(sum,remainder)
        end;
        Result := sum;
end;

mika

  • Full Member
  • ***
  • Posts: 102
Re: issue with assembly
« Reply #9 on: August 17, 2021, 08:40:16 pm »
Curiosity got the best of me, so I measured get_sum_of_digits1 and get_sum_of_digits2: they have about the same performance.

Earlier I proposed replacing division by multiplication. This is what I meant:

Code: Pascal
function Get_sum_of_digits0(num : dword) : dword;
var rem  : dword;
    sum, a : dword;
begin
     sum:=0;
     while num> 0 do
     begin
          a:= num div 10;
          rem:=num-a*10;
          num:=a;
          sum:=sum+rem;
     end;
     Get_sum_of_digits0:=sum;
end;
Get_sum_of_digits0 is about 5x faster.

howardpc

  • Hero Member
  • *****
  • Posts: 4144
Re: issue with assembly
« Reply #10 on: August 17, 2021, 09:11:46 pm »
You can improve the performance slightly by removing a local variable:
Code: Pascal
function Get_sum_of_digitsResult(num: DWord): DWord;
var
  rem, a: DWord;
begin
  Result := 0;
  while num > 0 do
    begin
      a := num div 10;
      rem := num - a*10;
      num := a;
      Result := Result + rem;
    end;
end;


Nitorami

  • Sr. Member
  • ****
  • Posts: 496
Re: issue with assembly
« Reply #11 on: August 17, 2021, 09:24:48 pm »
You can possibly further improve it slightly by making it inline. My loop test (win32) yields:

Pure assembly function Get_sum_of_digits2 -  0.75 sec
"Divmod" version get_sum_of_digits1 - 0.95 sec
Latest version by howardpc Get_sum_of_digitsResult - 0.32 sec

The issue with the DivMod version is probably that procedure DivMod (in unit Math) is written in assembler, and the compiler cannot inline pure assembler functions, so it needs to pass four arguments and then make a procedure call. I guess this makes it less efficient.
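
A loop test of this kind might look like the sketch below. It is not the benchmark used for the numbers above: the iteration count, the checksum, and the name DigitSumInline are assumptions made for illustration, and `inline` is only a hint that the compiler may ignore.

```pascal
program LoopTestSketch;
{$mode objfpc}{$h+}
{$inline on}

uses
  SysUtils;

// Pure-Pascal digit sum as posted in the thread, marked inline so the
// compiler may expand it at the call site.
function DigitSumInline(num: DWord): DWord; inline;
var
  a: DWord;
begin
  Result := 0;
  while num > 0 do
  begin
    a := num div 10;
    Result := Result + (num - a * 10);
    num := a;
  end;
end;

const
  Iterations = 10000000;  // arbitrary, assumed iteration count

var
  i, checksum: DWord;
  t0: QWord;
begin
  checksum := 0;
  t0 := GetTickCount64;
  for i := 1 to Iterations do
    checksum := checksum + DigitSumInline(i);
  // Printing the checksum keeps the loop from being optimised away.
  WriteLn('checksum: ', checksum, '  elapsed ms: ', GetTickCount64 - t0);
end.
```

Timings from such a loop depend heavily on the optimisation level, so it is worth running it at each setting of interest.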
« Last Edit: August 17, 2021, 10:04:01 pm by Nitorami »

mika

  • Full Member
  • ***
  • Posts: 102
Re: issue with assembly
« Reply #12 on: August 17, 2021, 10:57:21 pm »
Quote
You can improve the performance slightly by removing a local variable:

By all means use "Result" if you like the aesthetics of it. The code generated for Get_sum_of_digitsResult and Get_sum_of_digits0 is identical (at least at -O4), so any performance difference is within measurement error.

dutchincle

  • New Member
  • *
  • Posts: 20
Re: issue with assembly
« Reply #13 on: August 18, 2021, 01:35:38 am »
Interestingly (at least I think so), my original code was as efficient as Howard's code until you start adding the -O2 through -O4 optimizations; there, Howard's code is as fast as the assembler program (at least at -O4).
To be honest, I never really paid attention to the optimizations; now I need to read up on those.
Thanks for all the help.
Bas

 
