issue with assembly

dutchincle

New Member
Posts: 20

issue with assembly

« on: August 17, 2021, 04:55:04 am »

I am trying to write a function in assembler that returns the sum of the digits of a number. It seems to work, but only if I put a "write" call at the end of it, which slows everything down. If I remove it, the program crashes. Any ideas?

Code: Pascal [Select][+]

{$ASMMODE intel}
{$mode objfpc}{$h+}
 
function Get_sum_of_digits2(number : longint) : longint;
begin
  asm
    MOV ECX,10
    XOR EBX, EBX
    MOV EAX, NUMBER
  @1:
    XOR EDX,EDX
    DIV ECX 
    ADD EBX, EDX
    CMP EAX, 0
    JA @1
    MOV EAX, EBX
    MOV RESULT, EAX
  end;
  write('');
end;
 
 
const
        sample = maxlongint;
 
begin
        writeln(sample);
        writeln(get_sum_of_digits2(sample));
end.
 
 

Logged

mika

Full Member
Posts: 102

Re: issue with assembly

« Reply #1 on: August 17, 2021, 06:11:10 am »

When using assembler you have to be aware of the ABI.

Save and restore EBX or RBX, depending on whether your code is 32-bit or 64-bit:

Code: Pascal [Select][+]

{$ASMMODE intel}
{$mode objfpc}{$h+}
 
function Get_sum_of_digits2(number : longint) : longint; assembler; nostackframe;
  asm
    push rbx          // RBX is non-volatile, so save it (use "push ebx" in 32-bit code)
    MOV EAX, NUMBER
    MOV ECX,10
    XOR EBX, EBX
 
  @1:
    XOR EDX,EDX
    DIV ECX
    ADD EBX, EDX
    CMP EAX, 0
    JA @1
    MOV EAX, EBX
    MOV RESULT, EAX
    pop rbx           // restore the saved register (use "pop ebx" in 32-bit code)
end;
 
 
const
        sample = maxlongint;
 
begin
        writeln(sample);
        writeln(get_sum_of_digits2(sample));
end.

Logged

mika

Full Member
Posts: 102

Re: issue with assembly

« Reply #2 on: August 17, 2021, 06:15:40 am »

And your function still does not work with negative values: DIV is an unsigned divide, so a negative longint is treated as a large unsigned number.
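
One possible way to cover negative inputs, sketched in plain Pascal. It assumes that the digit sum of a negative number should be the digit sum of its absolute value, which the thread does not specify. The name SumOfDigitsSigned is hypothetical.

```pascal
{$mode objfpc}{$h+}
program NegativeDigits;

// Hypothetical helper: digit sum of the magnitude of a signed value.
// Assumption: -123 should give the same result as 123.
function SumOfDigitsSigned(number: LongInt): LongInt;
var
  n: Int64;
begin
  // Widen to Int64 before Abs so that Low(LongInt) does not overflow.
  n := Abs(Int64(number));
  Result := 0;
  while n > 0 do
  begin
    Inc(Result, n mod 10);
    n := n div 10;
  end;
end;

begin
  writeln(SumOfDigitsSigned(-123));          // prints 6
  writeln(SumOfDigitsSigned(Low(LongInt)));  // most negative value is handled too
end.
```

The widening step is the only subtle part: taking Abs of the most negative longint directly would not fit in a longint.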

Logged

Jonas Maebe

Hero Member
Posts: 1059

Re: issue with assembly

« Reply #3 on: August 17, 2021, 08:25:37 am »

Quote from: dutchincle on August 17, 2021, 04:55:04 am

I am trying to make a function using assembler, that returns the sum of the digits of a number. It seems to be working but I need to have a "write" function at the end of it, which slows it all down. If I remove it, it all crashes, any ideas?

Code: Pascal [Select][+][-]
{$ASMMODE intel}
{$mode objfpc}{$h+}

function Get_sum_of_digits2(number : longint) : longint;
begin
  asm
    MOV ECX,10
    XOR EBX, EBX
    MOV EAX, NUMBER
  @1:
    XOR EDX,EDX
    DIV ECX
    ADD EBX, EDX
    CMP EAX, 0
    JA @1
    MOV EAX, EBX
    MOV RESULT, EAX
  end;
  write('');
end;

There are two ways to use inline assembly:

1. A pure assembler function (like mika showed). Then you only have to save the registers that are non-volatile according to the ABI.
2. An inline assembler block in a regular Pascal procedure/function, like you used. In that case you have to either save and restore all modified registers yourself, or tell the compiler which ones you modified (so it can save/restore them if necessary).

So in the second case:

Code: [Select]

function Get_sum_of_digits2(number : longint) : longint;
begin
  asm
    MOV ECX,10
..
    MOV EAX, EBX
    MOV RESULT, EAX
  end ['eax', 'ecx', 'ebx', 'edx'];
end;

Logged

dutchincle

New Member
Posts: 20

Re: issue with assembly

« Reply #4 on: August 17, 2021, 01:51:12 pm »

Thanks everyone!! This is my first adventure into asm, as I need this to be FAST.

Thanks

Logged

lucamar

Hero Member
Posts: 4219

Re: issue with assembly

« Reply #5 on: August 17, 2021, 02:19:48 pm »

Quote from: dutchincle on August 17, 2021, 01:51:12 pm

Thanks everyone!! This is my first adventure into asm, as I need this to be FAST.

Did you try first a pure Pascal solution to see whether the compiler would optimize it better than you can by hand?

I'm by no means casting aspersions on your assembly-fu, but the compiler can usually take advantage of implementation details that a human might miss at first sight. Then, if need be, you can take the compiler's assembly output and optimize it further.

Logged

Turbo Pascal 3 CP/M - Amstrad PCW 8256 (512 KB !!!)

Lazarus/FPC 2.0.8/3.0.4 & 2.0.12/3.2.0 - 32/64 bits on:
(K|L|X)Ubuntu 12..18, Windows XP, 7, 10 and various DOSes.

mika

Full Member
Posts: 102

Re: issue with assembly

« Reply #6 on: August 17, 2021, 03:00:34 pm »

Quote from: dutchincle on August 17, 2021, 01:51:12 pm

Thanks everyone!! This is my first adventure into asm, as I need this to be FAST.

If your concern is speed, then do not use the DIV instruction. In this case you can "replace" the division by a multiplication, because the divisor is a constant. If you rewrite the code in pure Pascal, the compiler will take care of converting the division into a multiplication.
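
To illustrate the idea, here is a minimal sketch of dividing an unsigned 32-bit value by 10 with a multiplication and a shift. The constant $CCCCCCCD is ceil(2^35 / 10), a well-known reciprocal for this case. The function name Div10 and the sample values are made up for the example, and this is not necessarily the exact code FPC emits.

```pascal
{$mode objfpc}{$h+}
program DivByMul;

// Unsigned division by 10 via multiplication with a scaled reciprocal.
// $CCCCCCCD = ceil(2^35 / 10); shifting right by 35 undoes the scaling.
// The product is computed in 64 bits so that it cannot overflow.
function Div10(n: DWord): DWord;
begin
  Result := DWord((QWord(n) * QWord($CCCCCCCD)) shr 35);
end;

const
  // Hypothetical sample values, including both ends of the DWord range.
  Samples: array[0..5] of DWord = (0, 9, 10, 99, 1234567890, High(DWord));
var
  i: Integer;
  ok: Boolean;
begin
  ok := True;
  for i := Low(Samples) to High(Samples) do
    if Div10(Samples[i]) <> Samples[i] div 10 then
    begin
      writeln('mismatch at ', Samples[i]);
      ok := False;
    end;
  if ok then
    writeln('Div10 agrees with "div 10" for all samples');
end.
```

In practice there is no need to write this by hand: with a constant divisor, "num div 10" in plain Pascal lets the compiler do the transformation.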

Logged

Nitorami

Sr. Member
Posts: 496

Re: issue with assembly

« Reply #7 on: August 17, 2021, 05:21:23 pm »

Quote

Did you try first a pure Pascal solution to see whether the compiler would optimize it better than you can by hand?

A while ago I experimented with assembler routines I found on the Internet, such as random generators, in a naive attempt to get the fastest possible code. My conclusion: It's not worth the headache. The exact same routine written in plain pascal and compiled at optimisation level 1 always ran faster or at least at the same speed as my assembly.

An exception is if you want to take advantage of special CPU instructions, such as floating point vector processing, which are not implemented in FPC. But that requires much more than beginner's knowledge. If you absolutely need more speed, consider multithreading instead.

Logged

dutchincle

New Member
Posts: 20

Re: issue with assembly

« Reply #8 on: August 17, 2021, 06:07:04 pm »

Quote from: lucamar on August 17, 2021, 02:19:48 pm

Did you try first a pure Pascal solution to see whether the compiler would optimize it better than you can by hand?

Below is my original code, probably not optimal.

Code: Pascal [Select][+]

// Note: DivMod lives in the Math unit, so the program needs "uses Math;".
Function get_sum_of_digits1(num : longint): longint;
var sum,remainder : longint;
Begin
        sum := 0;
        while num > 0 do
        begin
                Divmod(num,10,num,remainder);
                inc(sum,remainder)
        end;
        Result := sum;
end;
 

Logged

mika

Full Member
Posts: 102

Re: issue with assembly

« Reply #9 on: August 17, 2021, 08:40:16 pm »

Curiosity got the best of me, so I measured the functions get_sum_of_digits1 and get_sum_of_digits2: they have about the same performance.

Earlier I proposed replacing the division by a multiplication. This is what I meant:

Code: Pascal [Select][+]

function Get_sum_of_digits0(num : dword) : dword;
var rem  : dword;
    sum, a : dword;
begin
     sum:=0;
     while num> 0 do
     begin
          a:= num div 10;
          rem:=num-a*10;
          num:=a;
          sum:=sum+rem;
     end;
     Get_sum_of_digits0:=sum;
end;
 

Get_sum_of_digits0 is about 5x faster.

Logged

howardpc

Hero Member
Posts: 4144

Re: issue with assembly

« Reply #10 on: August 17, 2021, 09:11:46 pm »

You can improve the performance slightly by removing a local variable:

Code: Pascal [Select][+]

function Get_sum_of_digitsResult(num: DWord): DWord;
var
  rem, a: DWord;
begin
  Result := 0;
  while num > 0 do
    begin
      a := num div 10;
      rem := num - a*10;
      num := a;
      Result := Result + rem;
    end;
end;

Logged

Nitorami

Sr. Member
Posts: 496

Re: issue with assembly

« Reply #11 on: August 17, 2021, 09:24:48 pm »

You can possibly further improve it slightly by making it inline. My loop test (win32) yields:

Pure assembly function Get_sum_of_digits2 - 0.75 sec
"Divmod" version get_sum_of_digits1 - 0.95 sec
Latest version by howardpc Get_sum_of_digitsResult - 0.32 sec

The issue with the DivMod version is probably that the procedure DivMod (in unit Math) is written in assembler, and the compiler cannot inline pure assembler functions, so it has to push 4 variables and make a procedure call on every iteration. I guess this makes it less efficient.
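
For anyone who wants to repeat this kind of measurement, here is a rough timing sketch. It is not the loop test used for the numbers above: the iteration count, the function name SumOfDigits and the use of GetTickCount64 from SysUtils are assumptions made for the example.

```pascal
{$mode objfpc}{$h+}
program BenchDigits;

uses
  SysUtils;  // GetTickCount64

// Pure Pascal digit sum in the style of the versions posted in this thread.
function SumOfDigits(num: DWord): DWord;
var
  a: DWord;
begin
  Result := 0;
  while num > 0 do
  begin
    a := num div 10;
    Result := Result + (num - a * 10);  // remainder without a second division
    num := a;
  end;
end;

const
  Iterations = 10000000;  // hypothetical loop count, adjust to taste
var
  i, total: DWord;
  start: QWord;
begin
  total := 0;
  start := GetTickCount64;
  for i := 1 to Iterations do
    total := total + SumOfDigits(i);
  // Printing the checksum keeps the loop from being optimized away.
  writeln('elapsed: ', GetTickCount64 - start, ' ms, checksum: ', total);
end.
```

Swap in the function you want to measure and compile each variant with the same optimization level (for example -O1 and -O4) to get comparable numbers.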

« Last Edit: August 17, 2021, 10:04:01 pm by Nitorami »

Logged

mika

Full Member
Posts: 102

Re: issue with assembly

« Reply #12 on: August 17, 2021, 10:57:21 pm »

Quote from: howardpc on August 17, 2021, 09:11:46 pm

You can improve the performance slightly by removing a local variable:
Code: Pascal [Select][+][-]
function Get_sum_of_digitsResult(num: DWord): DWord;
var
  rem, a: DWord;
begin
  Result := 0;
  while num > 0 do
    begin
      a := num div 10;
      rem := num - a*10;
      num := a;
      Result := Result + rem;
    end;
end;

Please, use "Result" if you like the esthetics of it. The machine code generated for Get_sum_of_digitsResult and Get_sum_of_digits0 is identical (at least at -O4), so any performance difference is within measurement error.

Logged

dutchincle

New Member
Posts: 20

Re: issue with assembly

« Reply #13 on: August 18, 2021, 01:35:38 am »

Interestingly (at least I think it is), my original code is as efficient as Howard's code until you start adding the -O2 through -O4 optimizations; there Howard's code is as fast as the assembler program (at least at -O4).
To be honest, I never really paid attention to the optimizations; now I need to read up on those.
Thanks for all the help.
Bas

Logged


Author Topic: issue with assembly (Read 4285 times)
