Lazarus

Free Pascal => General => Topic started by: Fred vS on November 05, 2021, 03:13:13 pm

Title: Add arrays into one ?
Post by: Fred vS on November 05, 2021, 03:13:13 pm
Hello.

What is the fastest (and best) way to add arrays of floats into one?

I use the "classical" way, something like this:

Code: Pascal
...

var
  arr0, arr1, arr2, arr3: array of cfloat;

...

procedure AddArrays();
var
  i: integer;
begin
  // all arrays have the same size
  for i := 0 to length(arr0) - 1 do
    arr0[i] := arr1[i] + arr2[i] + arr3[i];
end;

Thanks.

Fre;D
Title: Re: Add arrays into one ?
Post by: Thaddy on November 05, 2021, 04:15:22 pm
This came up this week. There is compiler support for it in two ways; see https://forum.lazarus.freepascal.org/index.php/topic,57009.0.html
and the link to Rosetta Code.
Both the + operator and Concat() are fast and safe.
Code: Pascal
{$mode objfpc}
{$modeswitch arrayoperators+}
var
  arr0, arr1, arr2, arr3: array of single;
begin
  arr0 := concat(arr1, arr2, arr3);
  // or
  arr0 := arr1 + arr2 + arr3;
end.
Concat() may be very slightly faster.
In Delphi mode the array + operator is on by default. Concat() can be used in almost all modes.
Also note that FPC can optimize your code using SIMD instructions on Intel and ARM, so experiment with the compiler settings. To get maximum speed out of Concat() and +, you may need to rebuild the compiler and RTL with optimum settings. For audio - I believe your interest -, this is certainly worth the trouble. FPC is built rather conservatively by default.
You could try building FPC and the RTL with -CfAVX2 and -dFASTMATH for Intel. This will still cover most Intel/AMD machines in the wild. For ARM, something similar can be done.
Title: Re: Add arrays into one ?
Post by: BobDog on November 05, 2021, 05:01:43 pm
I think the addition is element by element, into an array of the same dimension as the other three.
In which case I think the OP's method looks OK.
But I could be wrong of course.
Title: Re: Add arrays into one ?
Post by: Fred vS on November 05, 2021, 05:38:23 pm
Thanks Thaddy, I will study your post closely and write back later.

Quote
For audio - I believe your interest -,
Indeed!  ;)

By the way, about applying a multiplication to an array, something like this: is there another solution for that too?

Code: Pascal
var
  arr0: array of cfloat;
  fl: cfloat = 0.3456; // ok Thaddy!

...

procedure MultipliArray();
var
  i: integer;
begin
  for i := 0 to length(arr0) - 1 do
    arr0[i] := arr0[i] * fl;
end;

@BobDog: thanks for your post. Yes, it is OK, but maybe, as Thaddy explained, some optimization can be done.

Fre;D
Title: Re: Add arrays into one ?
Post by: Thaddy on November 05, 2021, 05:46:09 pm
Quote
I think the addition is element for element into an array of the same dimension as the other three.
In which case I think the OP's method looks OK.
But I could be wrong of course.

No, you are not wrong. But the compiler can use SIMD instructions for simple math (vectors here), and that can lead to a considerable speed increase, which is what Fred wants.
This is not about the algorithm per se, but about squeezing the most speed - within reason - out of the compiler. In effect, my suggestion is that Fred ends up with a compiler and RTL that are specifically suited to audio.. :D (because + and concat() will be optimized)
Then you don't need assembler, because you can do everything in Pascal.
Not many people know this, but it can be done with FPC, and the generated code is often better than what you, I, or Fred could write ourselves. So in this case it is not about the algorithm, but about compiler options and the way the RTL is built.
The make script for FPC has an OPT="<whatever option you want>" for that.
Title: Re: Add arrays into one ?
Post by: Thaddy on November 05, 2021, 06:14:50 pm
About this declaration, fl : cfloat = 0.3456; declare it as a (global) const. This will help the compiler and does not take up stack space.
Title: Re: Add arrays into one ?
Post by: Fred vS on November 05, 2021, 06:35:38 pm
Quote
About this : cfloat = 0.3456 ;. declare it as a (global) const. This will help the compiler and does not take up stack space.

OK, but this was only to illustrate the idea. Does an optimization exist for multiplication too, like the one with concat() for addition?
Title: Re: Add arrays into one ?
Post by: BobDog on November 05, 2021, 07:36:19 pm

I shall leave the optimisations to the Free Pascal experts here, for I am a beginner in Free Pascal.
But I do note that things like
Code: Pascal
arr[i]+=arr2[i];
arr[i]*=arr2[i];

do not work with build 3.2.2 or 3.2.0.
(Win 10)

Title: Re: Add arrays into one ?
Post by: Thaddy on November 05, 2021, 07:49:09 pm

Quote
But I do note that things like
Code: Pascal
arr[i]+=arr2[i];
arr[i]*=arr2[i];

do not work with build 3.2.2 or 3.2.0.
(Win 10)

Those C-isms are strongly discouraged. They crept in a long time ago, but even the devs never use them.
(They never matured, and the resulting code is even slower in some cases.)
If you are a beginner: do not use them, but write out the equivalent in true Pascal code.
So a := a + b instead of a += b.
Convincing example: the compiler sources and the RTL do not contain that filth.
Marco commented on this earlier this week.
That said: can you give a compilable example that "does not work"? Because at first glance it should.
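
For reference, a minimal sketch of a compilable example. FPC accepts +=, -=, *= and /= only when C-style operators are enabled, either with the -Sc command-line switch or with the {$COPERATORS ON} directive; whether a missing switch was the cause of the failure reported above is an assumption, since the thread does not say how the code was compiled. The program name, the type TSingleArray and the procedure Accumulate are hypothetical.

```pascal
program COperatorsDemo;
{$mode objfpc}
{$COPERATORS ON}  // same effect as the -Sc command-line switch

type
  TSingleArray = array of single;  // hypothetical helper type

// Adds src into dst element by element; both must have the same length.
procedure Accumulate(var dst: TSingleArray; const src: TSingleArray);
var
  i: SizeInt;
begin
  for i := 0 to High(dst) do
    dst[i] += src[i];  // accepted only with C-style operators enabled
end;

var
  a, b: TSingleArray;
  i: SizeInt;
begin
  SetLength(a, 3);
  SetLength(b, 3);
  for i := 0 to 2 do
  begin
    a[i] := i + 1;         // 1, 2, 3
    b[i] := 10 * (i + 1);  // 10, 20, 30
  end;
  Accumulate(a, b);
  for i := 0 to High(a) do
    write(a[i]:0:1, ' ');  // prints 11.0 22.0 33.0
  writeln;
end.
```

Without the directive (and without -Sc), the line using += is rejected by the compiler, which may be what "do not work" refers to.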
Title: Re: Add arrays into one ?
Post by: Fred vS on November 05, 2021, 07:52:07 pm
Hello Thaddy.

OK, fpc 3.3.1 trunk recompiled using this:

Code: Bash
make all FPC=$COMPILER OPT="-Fl/usr/local/lib -CfAVX2 -dFASTMATH"

I will run some tests; it may take some time before I have results.

Many thanks.

Fre;D
Title: Re: Add arrays into one ?
Post by: Thaddy on November 05, 2021, 08:43:03 pm
Quote
Ok but this was only to show the thing, does it exist optimization too for multiplication, like done with concact() for addition ?

Yes. The -Cf<XXX> flag will optimize that, and more too. You should be able to examine the assembler output.
Title: Re: Add arrays into one ?
Post by: glorfin on November 05, 2021, 09:03:05 pm
But, Thaddy, concat does not do what the topic starter asks for. It really concatenates arrays, appending the second to the end of the first, and so on. And "+" with {$modeswitch arrayoperators+} does the same.
Code: Pascal
program Project1;
var
  Arr1: array of single = (1.0, 2.0, 3.0);
  Arr2: array of single = (4.0, 5.0, 6.0);
  Arr0: array of single;
  I: integer;
begin
  Arr0 := concat(Arr1, Arr2);
  for I := 0 to high(Arr0) do
    writeln(Arr0[I]);
  readln;
end.

Output:
 1.000000000E+00
 2.000000000E+00
 3.000000000E+00
 4.000000000E+00
 5.000000000E+00
 6.000000000E+00
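
To make the distinction concrete, here is a small sketch contrasting Concat() with an element-wise sum. The names TSingleArray and SumArrays are hypothetical, and Concat() on dynamic arrays assumes FPC 3.2.0 or later.

```pascal
program ConcatVsSum;
{$mode objfpc}

type
  TSingleArray = array of single;  // hypothetical helper type

// Element-wise sum of three equally long arrays.
// Returns nil when the lengths do not match.
function SumArrays(const a, b, c: TSingleArray): TSingleArray;
var
  i: SizeInt;
begin
  Result := nil;
  if (Length(b) <> Length(a)) or (Length(c) <> Length(a)) then
    Exit;
  SetLength(Result, Length(a));
  for i := 0 to High(Result) do
    Result[i] := a[i] + b[i] + c[i];
end;

var
  a1, a2, a3, joined, summed: TSingleArray;
  i: SizeInt;
begin
  SetLength(a1, 3);
  SetLength(a2, 3);
  SetLength(a3, 3);
  for i := 0 to 2 do
  begin
    a1[i] := i + 1;          // 1, 2, 3
    a2[i] := 10 * (i + 1);   // 10, 20, 30
    a3[i] := 100 * (i + 1);  // 100, 200, 300
  end;

  joined := Concat(a1, a2, a3);     // 9 elements, nothing is added
  summed := SumArrays(a1, a2, a3);  // 3 elements: 111, 222, 333

  writeln('Concat length: ', Length(joined));
  writeln('Sum length:    ', Length(summed));
  writeln('Sum[0]:        ', summed[0]:0:1);
end.
```

So for mixing audio buffers, the explicit loop (or a helper like this one) is what is needed; Concat() and the array + operator only join buffers end to end.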
Title: Re: Add arrays into one ?
Post by: howardpc on November 05, 2021, 10:06:42 pm
This may be marginally faster than Fred's "classical" way of adding array elements:
Code: Pascal
program AddArrays;

{$mode objfpc}{$H+}

uses
  Types;

var
  arrSum, arr1, arr2, arr3: TDoubleDynArray;
  d: Double;

  function GetArraySum(a1, a2, a3: TDoubleDynArray): TDoubleDynArray;
  var
    arrLen: SizeInt;
    i: Integer;
    d: Double;
  begin
    Result := Nil;
    arrLen := Length(a1);
    if (Length(a2) <> arrLen) or (Length(a3) <> arrLen) then
      Exit;
    SetLength(Result, arrLen);
    Move(a1[0], Result[0], arrLen * SizeOf(Double));
    for i := 0 to High(Result) do
      Result[i] := Result[i] + a2[i] + a3[i];
  end;

begin
  arr1 := [0, 1.1, 2.2, 3.3];
  arr2 := [10.4, 11.5, 12.6, 13.7];
  arr3 := [100.8, 101.9, 102.1, 103];
  arrSum := GetArraySum(arr1, arr2, arr3);
  for d in arrSum do
    Write(d:4:1, ' ');
end.
Title: Re: Add arrays into one ?
Post by: howardpc on November 05, 2021, 10:32:35 pm
On second thoughts (I have not timed any of these routines) I think what I showed is probably slower than your method, since it is basically the same, but with an added memory move.
I think you should look for an optimised way to add (or in your second case, multiply) several floating point values.
Title: Re: Add arrays into one ?
Post by: Fred vS on November 05, 2021, 11:37:42 pm
Quote
I think you should look for an optimised way to add (or in your second case, multiply) several floating point values.

Hm, that is basically the goal of the initial question.
It seems that not much speed can be gained through code changes alone.

@Thaddy, do you think that the -CfAVX2 -dFASTMATH compiler optimization could have a real impact on the "classical" code of the initial post (if you agree to read it, of course)?
Title: Re: Add arrays into one ?
Post by: BobDog on November 06, 2021, 12:19:41 am

I notice a slight boost using parameters instead of globals.
Code: Pascal
uses
  sysutils;

type
  aos = array of single;

var
  arr0, arr1, arr2, arr3: array of single;

procedure AddArrays();
var
  i: integer;
begin
  // all arrays have the same size
  for i := 0 to length(arr0) - 1 do
  begin
    arr0[i] := 0;
    arr0[i] := arr1[i] + arr2[i] + arr3[i];
  end;
end;

procedure AddArrays2(var arr0: aos; arr1: aos; arr2: aos; arr3: aos);
var
  i: integer;
begin
  // all arrays have the same size
  for i := 0 to length(arr0) - 1 do
  begin
    arr0[i] := 0;
    arr0[i] := arr1[i] + arr2[i] + arr3[i];
  end;
end;

var
  i, k: int64;
  lim: int64 = 200000000;
  t, t2, totg, totp: int64;

begin
  writeln('Please wait a second or so . . .');
  setlength(arr0, lim);
  setlength(arr1, lim);
  setlength(arr2, lim);
  setlength(arr3, lim);
  for i := 0 to lim - 1 do
  begin
    arr1[i] := i;
    arr2[i] := i;
    arr3[i] := i;
  end;
  totp := 0;
  totg := 0;

  for k := 1 to 10 do
  begin
    t := gettickcount64;
    AddArrays();
    t2 := gettickcount64 - t;
    totg := totg + t2;
    write(t2, ' global ');
    for i := 1 to 5 do write(arr0[i], ' ');
    writeln;
    t := gettickcount64;
    AddArrays2(arr0, arr1, arr2, arr3);
    t2 := gettickcount64 - t;
    totp := totp + t2;
    write(t2, ' params ');
    for i := 1 to 5 do write(arr0[i], ' ');
    write('  ', k, ' of 10');
    writeln;
    writeln;
  end;
  writeln('Totals');
  writeln('Time with globals ', totg);
  writeln('Time with params  ', totp);
  writeln('Press return to end');
  readln;
end.
Title: Re: Add arrays into one ?
Post by: Thaddy on November 06, 2021, 07:19:13 am
Code: Pascal
procedure AddArrays2(const arr0: aos; arr1: aos; arr2: aos; arr3: aos);
Is even faster.
Title: Re: Add arrays into one ?
Post by: bytebites on November 06, 2021, 08:44:10 am
Yet faster
Code: Pascal
procedure AddArrays2(const arr0, arr1, arr2, arr3: aos);
Title: Re: Add arrays into one ?
Post by: Thaddy on November 06, 2021, 09:18:31 am
Shorter, you mean? Here it is not faster.
Ah, on another PC it is indeed faster, but not on my laptop. Strange.
Title: Re: Add arrays into one ?
Post by: Fred vS on November 06, 2021, 04:32:17 pm

Hello BobDog.

Many thanks for your code.
But here, on Linux Debian 11 64-bit, with an Intel i5 and 16 GB RAM, compiled with fpc 3.2.2, t2:=gettickcount64-t; is always 0, for both methods.

So it is difficult to estimate whether one is faster than the other.

It is strange, because if I add a sleep(1);, t2:=gettickcount64-t; is 1, as expected.
Is a kind of "application.processmessages" needed to get the right gettickcount64? (Or maybe my machine is too fast and the result is less than 1 tick.)

Here is my result:

Code: Bash
fred@fredvs ~> ./testarray
Quote
Please wait a second or so . . .
.
.. // 10 x the code that follow from 1 to 10

0
0 global 0 0 0 0 0
0 params 0 0   10 of 10

Totals
Time with globals 0
Time with params  0
Press return to end

Fre;D
Title: Re: Add arrays into one ?
Post by: Fred vS on November 06, 2021, 06:33:39 pm
Hello.

@BobDog, bytebites, Thaddy: it seems that you have more luck than I do with BobDog's code.

I always get 0 as the result.

But it could be interesting to see the result with a multiplication too, with something like this in the code, and the same for the parameter method:
(I want to test it, but on my machine the tickcount duration is also 0.)

Code: Pascal
var
  ratio: single = 0.123;
  ratio1: single = 0.345;
  ratio2: single = 0.567;
  ratio3: single = 0.890;

procedure AddArrays(); // and multiply
var
  i: integer;
begin
  // all arrays have the same size
  for i := 0 to length(arr0) - 1 do
  begin
    arr0[i] := 0;
    arr0[i] := ratio * ( (ratio1 * arr1[i]) + (ratio2 * arr2[i]) + (ratio3 * arr3[i]) ); // here ratio multiply
  end;
end;

This is to see whether the parameter method has an impact on multiplication too.

Fre;D
Title: Re: Add arrays into one ?
Post by: Fred vS on November 06, 2021, 10:21:56 pm
Hello.

OK, I found the problem.
Here on Linux it failed because, in procedure AddArrays(), i must be int64; otherwise there is a range check error.

Code: Pascal
procedure AddArrays(); // and multiply
var
  i: int64; // change this

procedure AddArrays2(var arr0: aos; arr1: aos; arr2: aos; arr3: aos);
var
  i: int64; // change this

Now there is a tickcount duration.
Time to investigate; I will write back later.
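
A likely explanation for the range check error, offered as an assumption since the thread does not spell it out: the test program has no {$mode} directive, and in FPC's default mode Integer is a 16-bit type, which cannot hold an index close to 200000000. In objfpc and delphi modes Integer is 32 bits. The sketch below (program and function names are hypothetical) prints the sizes and uses SizeInt, the type that Length() returns, as the loop index, which sidesteps the question in any mode.

```pascal
program LoopIndexDemo;
{$mode objfpc}  // here Integer is 32 bits; in FPC's default mode it is 16 bits

type
  TSingleArray = array of single;  // hypothetical helper type

// Fills an array of n elements and returns how many were actually visited.
function CountVisited(n: SizeInt): SizeInt;
var
  data: TSingleArray;
  i: SizeInt;  // pointer-sized signed index, wide enough for any array
begin
  data := nil;
  SetLength(data, n);
  Result := 0;
  for i := 0 to High(data) do
  begin
    data[i] := 1.0;
    Inc(Result);
  end;
end;

begin
  writeln('SizeOf(Integer) = ', SizeOf(Integer), ' bytes');
  writeln('SizeOf(SizeInt) = ', SizeOf(SizeInt), ' bytes');
  // 100000 is deliberately larger than High(SmallInt) = 32767
  writeln('Elements visited: ', CountVisited(100000));
end.
```

With a 16-bit index and range checking off, the upper bound would be silently truncated and the loop could run zero times, which would be consistent with the all-zero timings and values reported earlier.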

Fre;D
Title: Re: Add arrays into one ?
Post by: Fred vS on November 06, 2021, 10:58:40 pm
Hello.

With this code, slightly changed from BobDog's code:
[EDITED x 3] The code was changed to add bytebites' solution and a method with parameters but without var or const.

Code: Pascal
program testarray;

uses
  sysutils;

type
  aos = array of single;

var
  arr0, arr1, arr2, arr3: array of single;
  ratio: single = 0.123;
  ratio1: single = 0.345;
  ratio2: single = 0.567;
  ratio3: single = 0.890;

procedure CalculArrays(); // add and multiply
var
  i: int64;
begin
  // all arrays have the same size
  for i := 0 to length(arr0) - 1 do
  begin
    arr0[i] := 0;
    arr0[i] := ratio * ( (ratio1 * arr1[i]) + (ratio2 * arr2[i]) + (ratio3 * arr3[i]) ); // here ratio multiply
  end;
end;

procedure CalculArrays2(var arr0: aos; arr1: aos; arr2: aos; arr3: aos); // add and multiply
var
  i: int64;
begin
  // all arrays have the same size
  for i := 0 to length(arr0) - 1 do
  begin
    arr0[i] := 0;
    arr0[i] := ratio * ( (ratio1 * arr1[i]) + (ratio2 * arr2[i]) + (ratio3 * arr3[i]) ); // here ratio multiply
  end;
end;

procedure CalculArrays3(const arr0, arr1, arr2, arr3: aos);
var
  i: int64;
begin
  // all arrays have the same size
  for i := 0 to length(arr0) - 1 do
  begin
    arr0[i] := 0;
    arr0[i] := ratio * ( (ratio1 * arr1[i]) + (ratio2 * arr2[i]) + (ratio3 * arr3[i]) ); // here ratio multiply
  end;
end;

procedure CalculArrays4(arr0, arr1, arr2, arr3: aos);
var
  i: int64;
begin
  // all arrays have the same size
  for i := 0 to length(arr0) - 1 do
  begin
    arr0[i] := 0;
    arr0[i] := ratio * ( (ratio1 * arr1[i]) + (ratio2 * arr2[i]) + (ratio3 * arr3[i]) ); // here ratio multiply
  end;
end;

var
  lim: int64 = 200000000;
  i, k, t, t2, totg, totp, totp2, totp3: int64;

begin
  writeln('Filling arrays, please wait a second or so . . .');
  setlength(arr0, lim);
  setlength(arr1, lim);
  setlength(arr2, lim);
  setlength(arr3, lim);
  for i := 0 to lim - 1 do
  begin
    arr1[i] := i;
    arr2[i] := i;
    arr3[i] := i;
  end;
  totp := 0;
  totp2 := 0;
  totp3 := 0;
  totg := 0;

  writeln('OK arrays filled, START the race . . .');

  for k := 1 to 5 do
  begin
    writeln('Pass ', k);
    t := gettickcount64;
    CalculArrays();
    t2 := gettickcount64 - t;
    totg := totg + t2;
    t := gettickcount64;
    CalculArrays2(arr0, arr1, arr2, arr3);
    t2 := gettickcount64 - t;
    totp := totp + t2;
    t := gettickcount64;
    CalculArrays3(arr0, arr1, arr2, arr3);
    t2 := gettickcount64 - t;
    totp2 := totp2 + t2;
    t := gettickcount64;
    CalculArrays4(arr0, arr1, arr2, arr3);
    t2 := gettickcount64 - t;
    totp3 := totp3 + t2;
  end;
  writeln('Time with globals ', totg);
  writeln('Time with params var ', totp);
  writeln('Time with params const ', totp2);
  writeln('Time with params nil ', totp3);
  writeln('Press return to end');
  readln;
end.

I get this as the result on 64-bit Linux with fpc 3.2.2, i5, 16 GB RAM.
As you can see, the difference is very small (here params const is the fastest).

Quote
./testarray
Filling arrays, please wait a second or so . . .
OK arrays filled, START the race . . .
...
Time with globals 9449
Time with params var 9422
Time with params const 9379
Time with params nil 9408

[EDITED]
Same test but compiled with -O3 optimization:  (here params nil is faster).
 
Quote
Time with globals 8835
Time with params var 8813
Time with params const 8936
Time with params nil 8801

Fre;D
Title: Re: Add arrays into one ?
Post by: Josh on November 07, 2021, 01:12:49 am
No debug info, optimized with -O3 or -O4 (g = globals, v = params var, c = params const):

       -O3     -O4
g     5094    5814
v     4966    5484
c     3156    3406

What is the point of arr0[i]:=0; when the element is assigned in the next statement? I removed it and get:

       -O3     -O4
g     3844    4265
v     3999    4110
c     2688    2844

Pentium CPU 5404u, dual core, 2.3 GHz laptop, Windows 11.
Title: Re: Add arrays into one ?
Post by: Fred vS on November 07, 2021, 02:39:07 am
Quote
what the point of arr0[i]:=0;

It is there to make the machine work.  ;)

In my edited previous post, I added to the code the method with parameters but without var/const.
And this one wins with -O3 on my system.

All those differences are strange (but not very impressive).

By the way, your results are much faster than mine; my machine is a ThinkPad X390 laptop.
Title: Re: Add arrays into one ?
Post by: Fred vS on November 27, 2021, 03:00:34 pm
Hello.

About speed with float calculations:

It would be interesting to also see the result using fpc-LLVM.

There is the wiki page https://wiki.lazarus.freepascal.org/LLVM that explains how to build the compiler.
But does an fpc-LLVM release exist (with binaries for 64-bit Linux, for example)?
(I am terribly lazy and without free time these days.)

Even better, if somebody has a working fpc-LLVM binary, could they be so kind as to test the demo program, to see if there is a difference in the result vs. the "classical" fpc version?

Many thanks.

Fre;D
Title: Re: Add arrays into one ?
Post by: mischi on November 27, 2021, 07:58:10 pm
fpc 3.2.0, macOS 11.6.1, i5 2.8GHz, 8GB RAM:

fpc Testarray.pas; ./Testarray
...
Time with globals 5563
Time with params var 5803
Time with params const 4642
Time with params nil 4650

fpc -O3 Testarray.pas; ./Testarray
...
Time with globals 3259
Time with params var 3154
Time with params const 2589
Time with params nil 3128

fpc -O4 Testarray.pas; ./Testarray
...
Time with globals 3727
Time with params var 4051
Time with params const 3114
Time with params nil 3620
Title: Re: Add arrays into one ?
Post by: Fred vS on November 27, 2021, 08:11:16 pm
@ Mischi: thanks for testing.

Did you use the "classical" fpc or the one compiled for LLVM ?
If it was with the "fpc-LLVM", can you compare your result with the "classical" fpc ?

Thanks.

Fre;D
Title: Re: Add arrays into one ?
Post by: mischi on November 28, 2021, 12:13:28 pm
Quote
Did you use the "classical" fpc or the one compiled for LLVM ?
If it was with the "fpc-LLVM", can you compare your result with the "classical" fpc ?

Classic. I tried to build fpc-LLVM with 3.2.2, but I did not manage. So I am giving up for the moment.

MiSchi