performance problem with Free Pascal 2.6.x

As my old Delphi 7 no longer works correctly on Windows 7 Professional 64-bit, I switched to Free Pascal with Lazarus. I am seeing differences in the speed of the compiled programs: programs built from the same code are slower than with Delphi. Here is a simple routine:

--- Code: ---procedure Filter(const A: double);
var ZHelp, Trapezoid : double;
    k, n, Number : integer;
    ZArray, ZArrayFilter : array of double;
begin
 // ZValues and XValues are arrays declared outside this routine.
 // Setting Number and the lengths of ZArray/ZArrayFilter is not shown here.
 For n:= 1 to Number do
 begin
   ZHelp:= 0;
   For k:= 1 to Number do
    ZArray[k-1]:= ZValues[k-1] * A * exp(-Pi*power(A/1000*(XValues[k-1]-XValues[n-1]), 2));
   // integrate (trapezoidal rule)
   For k:= 1 to Number-1 do
   begin
    Trapezoid:= (abs(ZArray[k-1]-ZArray[k])/2 + Min(ZArray[k-1], ZArray[k]))
                * (XValues[k]-XValues[k-1]); //trapezoid  = triangle + rectangle
    ZHelp:= ZHelp + Trapezoid;
   end; //end for k
   ZArrayFilter[n-1]:= ZHelp / 1000;
 end; //end for n
end; //end Filter

--- End code ---

My arrays have up to 30,000 values and Number is of the same order. For Number:=10800 Delphi needs 17 s while Free Pascal 2.6 needs 20 s (18% slower). (I cannot compile using Delphi 7 anymore, so the times were measured by hand with a stopwatch.)

I read in this thread:
that Free Pascal is much slower than Delphi because it has problems if the array size is not a power of 2. But this info is from 2010 and Free Pascal 2.2.

So my question is what I can do to make the code at least as fast as on Delphi 7? I already use the compiler options
-Mdelphi -O3
and also tried
-Cfsse2 and -Cfsse3
without any gain in speed. I also read that there is an optimization called "fastmath" but Lazarus doesn't provide such a compiler option.

Any ideas?

It is not the array size but the ELEMENT size that needs to be a power of two. But you seem to use doubles, which have size 8 = 2**3, so that is already a power of 2.

The posts you refer to seem to indicate that Delphi does some form of strength reduction in such cases to avoid repeated multiplications (I can't remember seeing D7 do that, but that is what they seem to say).

I don't really see anything that can be improved quickly, so you would probably have to compare the generated assembler.

Or just try the development version.

Look at:

Afaik fpc trunk optimizes more, but I do not know how much of this issue is covered.

One thing you can try (if it is not solved otherwise):

ZArray[k] / ZArray[k-1]

before the loop:
ZArrayElementPointer = @ZArray[0]

in the loop
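
The loop body of this suggestion is not shown in the post. As a rough sketch only, and not the poster's code, walking an array with a typed pointer instead of indexing it could look like the program below. The names P and Prev and the test data are made up for illustration; whether this is actually faster than indexed access depends on the compiler version and has to be measured.

```pascal
program PointerWalk;
{$mode delphi}

var
  ZArray: array of Double;
  P: PDouble;          // hypothetical name: pointer that walks over ZArray
  Prev: Double;        // hypothetical name: value of the previous element
  k: Integer;
  SumIndexed, SumPointer: Double;
begin
  // Small made-up data set, just to have something to iterate over
  SetLength(ZArray, 5);
  for k := 0 to High(ZArray) do
    ZArray[k] := k * 1.5;

  // Indexed access to neighbouring elements, as in the original routine
  SumIndexed := 0;
  for k := 1 to High(ZArray) do
    SumIndexed := SumIndexed + (ZArray[k-1] + ZArray[k]);

  // Pointer access: take the address once before the loop,
  // then advance the pointer by one element per iteration
  SumPointer := 0;
  P := @ZArray[0];
  for k := 1 to High(ZArray) do
  begin
    Prev := P^;        // element k-1
    Inc(P);            // Inc on a typed pointer advances by SizeOf(Double)
    SumPointer := SumPointer + (Prev + P^);  // P^ is now element k
  end;

  // Both loops add the same values in the same order
  WriteLn('indexed: ', SumIndexed:0:3, '  pointer: ', SumPointer:0:3);
end.
```

Note that the pointer is only valid as long as the dynamic array is not resized, so it should be taken after the last SetLength.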

Afaik SSA is still a private branch Florian works on from time to time.

Replace every division var / some_num with the corresponding multiplication var * (1/some_num) and see if this makes a difference.
Example: A/1000 = A*0.001
Edit: also, A/1000 is constant where it is used, so calculate it once before the for loop.
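
A minimal sketch of the two ideas above, assuming made-up input values and a hypothetical variable name Scale: it evaluates the exponential term from the original routine three ways (division inside the loop, the invariant A/1000 hoisted out of the loop, and multiplication by 0.001) so the results can be compared. Multiplying by 0.001 is not bit-identical to dividing by 1000, so tiny differences in the last digits are possible.

```pascal
program HoistScale;
{$mode delphi}

uses
  Math;  // provides Power

const
  N = 4;  // tiny made-up array size for the demonstration

var
  XValues: array[0..N-1] of Double;
  A, Scale, Direct, Hoisted, Reciprocal: Double;
  k: Integer;
begin
  A := 250.0;  // made-up filter parameter
  for k := 0 to N-1 do
    XValues[k] := k * 0.5;

  // Loop-invariant value: computed once before the loop
  Scale := A / 1000;

  for k := 0 to N-1 do
  begin
    // As in the original routine: the division is repeated every iteration
    Direct := exp(-Pi * Power(A / 1000 * (XValues[k] - XValues[0]), 2));
    // Same expression with the invariant hoisted out of the loop
    Hoisted := exp(-Pi * Power(Scale * (XValues[k] - XValues[0]), 2));
    // Division replaced by a multiplication with the reciprocal
    Reciprocal := exp(-Pi * Power(A * 0.001 * (XValues[k] - XValues[0]), 2));
    WriteLn(Direct:0:12, ' ', Hoisted:0:12, ' ', Reciprocal:0:12);
  end;
end.
```

In the routine from the first post, Scale would be computed once at the top of Filter, before the loop over n, since A does not change during the call.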

