Author Topic: Question on compiler provided defines and optimisation flags  (Read 666 times)

MathMan

  • Sr. Member
  • ****
  • Posts: 325
Hello all,

I just discovered that there are two interesting compiler-provided defines, FPC_HAS_FAST_FMA_SINGLE & FPC_HAS_FAST_FMA_DOUBLE, which seem to interact with the FASTMATH optimisation flag.

These might come in handy for something I am currently working on and I would like to fully understand their effects - but unfortunately there is little to no information in the manuals for FPC 3.2.2.

My current understanding, after playing around and analysing the assembler output, is as follows:

* the defines are set if a suitable target architecture is selected - e.g. for COREIAVX2 they are set, for COREI they are unset
* in the source I can enable / disable the use via {$optimisation FASTMATH} and {$optimisation NOFASTMATH}

Is the above correct, or is there something else I have to consider, if I want to compile a function with & without FMA operations?

Finally there is the command line parameter -Sv which is only explained as "use vector operations if available".

* is this really for generation of vector operations, or does it only relate to the vectorised parameter passing under Win?
* if the former, what magic must be applied in the source to make use of this? <= I did try some variants, but without any effect

Kind regards,
MathMan

Jonas Maebe

  • Hero Member
  • *****
  • Posts: 1058
Re: Question on compiler provided defines and optimisation flags
« Reply #1 on: July 17, 2022, 11:42:09 am »
Quote
Is the above correct, or is there something else I have to consider, if I want to compile a function with & without FMA operations?
Put it in an include file, and include it once in a unit compiled for one architecture and once for the other. Make the function name either configurable via a macro, or keep the header in the unit so that you can give them different names.
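The include-file approach could look roughly like the sketch below. It shows three small files in one listing (they have to be saved separately to compile), and all file, unit, macro and function names (dotprod.inc, dotprod_fma, dotprod_plain, DOTPROD_NAME, DotProdFma, DotProdPlain) are hypothetical. It assumes the function name is made configurable with an FPC macro, which is the first of the two options mentioned.

```pascal
// ---- file: dotprod.inc (hypothetical; shared body, no unit header) ----
// DOTPROD_NAME is a macro that each including unit defines beforehand.
function DOTPROD_NAME(const a, b: array of Double): Double;
var
  i: Integer;
begin
  Result := 0.0;
  for i := 0 to High(a) do
    Result := Result + a[i] * b[i];   // candidate for FMA when enabled
end;

// ---- file: dotprod_fma.pas (compile this unit for the FMA-capable target) ----
unit dotprod_fma;
{$mode objfpc}
{$macro on}
interface
function DotProdFma(const a, b: array of Double): Double;
implementation
{$define DOTPROD_NAME := DotProdFma}
{$i dotprod.inc}
end.

// ---- file: dotprod_plain.pas (compile this unit for the baseline target) ----
unit dotprod_plain;
{$mode objfpc}
{$macro on}
interface
function DotProdPlain(const a, b: array of Double): Double;
implementation
{$define DOTPROD_NAME := DotProdPlain}
{$i dotprod.inc}
end.
```

Each unit would then be compiled on its own with the target options wanted for it, and the main program can use both units and pick DotProdFma or DotProdPlain as appropriate.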

Quote
Finally there is the command line parameter -Sv which is only explained as "use vector operations if available".
It enables adding (and several other math and logic operations) for array operands, and will use vector operations to calculate the results. It does not perform auto-vectorization.

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: Question on compiler provided defines and optimisation flags
« Reply #2 on: July 17, 2022, 11:51:41 am »
Quote
* the defines are set if a suitable target architecture is selected - e.g. for COREIAVX2 they are set, for COREI they are unset

The defines solely depend on the specified FPU type. Both AVX2 and AVX512 have them set.

Quote
* in the source I can enable / disable the use via {$optimisation FASTMATH} and {$optimisation NOFASTMATH}

FastMath might result in the loss of precision on certain platforms with certain operations (especially on platforms that support 80-bit floating point). Otherwise it will try to use FMA in case of Single or Double types (see compiler/nadd.pas, taddnode.try_fma).
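A minimal sketch of how the define and the optimisation switch might be combined in one source file follows. The program and function names (fmademo, MulAddFast, MulAddExact) are made up for illustration. It assumes that the documented {$optimization} directive applies to the routine that follows it and that {$push}/{$pop} save and restore it. Whether an FMA instruction is actually emitted can only be confirmed by looking at the assembler output.

```pascal
program fmademo;
{$mode objfpc}

const
{$ifdef FPC_HAS_FAST_FMA_DOUBLE}
  HasFastFma = True;    // define is set for the selected FPU type
{$else}
  HasFastFma = False;   // define is not set, no fast FMA expected
{$endif}

{$push}
{$optimization fastmath}
// With FASTMATH the compiler may fuse the multiply and add into one FMA.
function MulAddFast(a, b, c: Double): Double;
begin
  Result := a * b + c;
end;
{$pop}

{$push}
{$optimization nofastmath}
// Without FASTMATH the multiply and add should stay separate operations.
function MulAddExact(a, b, c: Double): Double;
begin
  Result := a * b + c;
end;
{$pop}

begin
  WriteLn('FPC_HAS_FAST_FMA_DOUBLE defined: ', HasFastFma);
  WriteLn(MulAddFast(2.0, 3.0, 1.0):0:1);
  WriteLn(MulAddExact(2.0, 3.0, 1.0):0:1);
end.
```

To see a difference between the two functions, the program would need to be compiled with an FPU type for which the defines are set (for example an AVX2 FPU type selected through the -Cf option) and the generated assembler compared.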

Quote
Finally there is the command line parameter -Sv which is only explained as "use vector operations if available".

* is this really for generation of vector operations, or does it only relate to the vectorised parameter passing under Win?
* if the former, what magic must be applied in the source to make use of this? <= I did try some variants, but without any effect

It's related to the former. You need arrays with 4 fields (or in general?) of an ordinal or floating point type (with a size <= 8 bytes), and you need to have SSE or higher enabled. Then the compiler will allow the use of vector operations on these types and use SIMD to perform them.
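As a small sketch of what this describes, the program below adds two 4-element arrays of Single with a plain + operator. The names (vecadd, TVec4, A, B, Sum) are hypothetical, and the program assumes it is compiled with -Sv on a target with SSE or higher enabled. Without -Sv the array addition is expected to be rejected by the compiler.

```pascal
program vecadd;
{$mode objfpc}

type
  TVec4 = array[0..3] of Single;   // 4 fields of a floating point type

var
  A, B, Sum: TVec4;
  i: Integer;

begin
  // Fill the operands with some sample values.
  for i := 0 to 3 do
  begin
    A[i] := i + 1;
    B[i] := 10 * (i + 1);
  end;

  // Array operands combined with an operator: needs -Sv.
  Sum := A + B;

  for i := 0 to 3 do
    WriteLn(Sum[i]:0:1);
end.
```

A command line along the lines of fpc -Sv vecadd.pas should be enough to try it. Whether SIMD instructions are actually used is best checked in the assembler output.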

MathMan

  • Sr. Member
  • ****
  • Posts: 325
Re: Question on compiler provided defines and optimisation flags
« Reply #3 on: July 17, 2022, 12:10:11 pm »
Many thanks PascalDragon - that clarifies it.

* regarding first point above I assume sometime in the future this may also become available for ARM cores with SVE (or somesuch)?
* regarding the second point - I only intend to use single / double types, so that fits <= I usually stay away from extended as far as I can
* regarding the last, I thought I tried that, but will try again - will also take a look at the intrinsics file mentioned in the other thread. Should it then be something like the below?

Code: Pascal
var
  Src1, Src2: array [ 0..3 ] of single;
  Mult: array [ 0..3 ] of single;

...
  Mult := Src1 * Src2; // Mult is the dot-product of the 4 point vectors Src1, Src2?

Kind regards,
MathMan
« Last Edit: July 17, 2022, 12:19:21 pm by MathMan »

MathMan

  • Sr. Member
  • ****
  • Posts: 325
Re: Question on compiler provided defines and optimisation flags
« Reply #4 on: July 18, 2022, 08:15:29 am »
Thanks Jonas - I don't know how, but I must have overlooked your response yesterday  :-[

Regarding the first point - yes, this approach is also the one I had in mind (I just looked at the MMX unit the other day ...)

Regarding the second - I have it working now. However, it looks like -Sv and FASTMATH do not go hand in hand as of yet, and I still haven't found a way to detect whether -Sv has been set, or to switch it in the source like I can with FASTMATH. So maybe I have to put that on hold and take it up again at some later date.

Kind regards,
MathMan

 
