
Author Topic: Half precision floating point - wish list  (Read 8086 times)

henrytj

  • Jr. Member
  • **
  • Posts: 51
Half precision floating point - wish list
« on: August 31, 2007, 02:14:53 pm »
This is a wish list item. I hope that sometime in the future FP/Lazarus will add support for a relatively new half precision floating point data type. This is gaining use in high dynamic range (HDR) graphics and gaming systems. Here are some links.

http://en.wikipedia.org/wiki/High_dynamic_range_imaging

http://en.wikipedia.org/wiki/Half_precision

http://en.wikipedia.org/wiki/OpenEXR

This is the kind of thing I am interested in working with using FP/Lazarus.

Henry

Marc

  • Administrator
  • Hero Member
  • *
  • Posts: 2599
RE: Half precision floating point - wish list
« Reply #1 on: September 03, 2007, 12:54:24 pm »
Makes some sense...
submit it as feature request in mantis (under FPC project)
//--
{$I stdsig.inc}
//-I still can't read someones mind
//-Bugs reported here will be forgotten. Use the bug tracker

rforcen

  • Jr. Member
  • **
  • Posts: 52
Re: Half precision floating point - wish list
« Reply #2 on: May 04, 2024, 09:26:30 am »
I've been porting some C++ code to support f16 and also testing the OpenCL-compatible half data type. Results and source code can be found at https://github.com/rforcen/fpc/tree/main/openCL, giving:

 half float16 opencl / native pascal ST/MT comparison on complex arithmetic:

  a * b + a - b / a;

  this is for a ryzen 7 5700G w/ embedded radeon graph  2GB RAM

  lap half opencl:    31 ms
  lap half pas   :  1078 ms
  lap half MT    :   172 ms

  ratio ST/CL: 34.7
  ratio MT/CL: 5.5

so using OpenCL for this kind of data type is clearly the best option.

f16 repo: https://github.com/rforcen/fpc/tree/main/f16
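
For readers new to the format, decoding a half (IEEE 754 binary16: 1 sign bit, 5 exponent bits, 10 mantissa bits) to a Single in plain Pascal can be sketched as below. This is only a minimal illustration and not the code from the f16 repo; the name HalfToSingle is hypothetical.

```pascal
program HalfDemo;
{$mode objfpc}

// Hypothetical helper (not taken from the f16 repo): decode an
// IEEE 754 binary16 bit pattern into a Single.
function HalfToSingle(h: Word): Single;
var
  sign, expo, mant, bits: LongWord;
begin
  sign := LongWord(h and $8000) shl 16;   // move sign to bit 31
  expo := (h shr 10) and $1F;             // 5-bit exponent, bias 15
  mant := h and $3FF;                     // 10-bit mantissa
  if expo = 0 then
  begin
    if mant = 0 then
      bits := sign                        // signed zero
    else
    begin
      // Subnormal half: shift until the implicit leading 1 appears.
      expo := 113;                        // 127 - 15 + 1
      while (mant and $400) = 0 do
      begin
        mant := mant shl 1;
        Dec(expo);
      end;
      bits := sign or (expo shl 23) or ((mant and $3FF) shl 13);
    end;
  end
  else if expo = $1F then
    bits := sign or $7F800000 or (mant shl 13)               // Inf or NaN
  else
    bits := sign or ((expo + 112) shl 23) or (mant shl 13);  // rebias 15 -> 127
  Move(bits, Result, SizeOf(Result));
end;

var
  a, b: Single;
begin
  a := HalfToSingle($3C00);   // bit pattern of 1.0
  b := HalfToSingle($C000);   // bit pattern of -2.0
  // The benchmark expression, evaluated in Single after decoding.
  WriteLn(a * b + a - b / a);
end.
```

A pure software path like this does bit manipulation on every element, which is one reason the single-threaded Pascal timing can lag so far behind hardware-assisted conversion.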

« Last Edit: May 04, 2024, 09:29:54 am by rforcen »

Thaddy

  • Hero Member
  • *****
  • Posts: 15488
  • Censorship about opinions does not belong here.
Re: Half precision floating point - wish list
« Reply #3 on: May 04, 2024, 10:05:30 am »
That is not really testing if you do not mention the cpu and fpu types for which the pascal code is compiled. The defaults are very conservative. (-Cp, -Cf and -Op settings)
My great hero has found the key to the highway. Rest in peace John Mayall.
Playing: "Broken Wings" in your honour. As well as taking out some mouth organs.

rforcen

  • Jr. Member
  • **
  • Posts: 52
Re: Half precision floating point - wish list
« Reply #4 on: May 04, 2024, 10:29:40 am »
it's on the post: "ryzen 7 5700G w/ embedded radeon graph  2GB RAM",

this is an average cpu with no specific dedicated gpu which i currently don't use, it would be interesting to see some results on some average gpu boards,

target tweaking doesn't change performance in this case.

Thaddy

  • Hero Member
  • *****
  • Posts: 15488
  • Censorship about opinions does not belong here.
Re: Half precision floating point - wish list
« Reply #5 on: May 04, 2024, 10:38:43 am »
Quote: "target tweaking doesn't change performance in this case."
It can make a difference. I do not expect the same speed as with opencl, but I do expect an improvement.

rforcen

  • Jr. Member
  • **
  • Posts: 52
Re: Half precision floating point - wish list
« Reply #6 on: May 05, 2024, 09:02:31 am »
After some fiddling with hand-coded intrinsics using VCVTPH2PS and VCVTPS2PH, I've matched the performance of OpenCL in FPC.
Source code at https://github.com/rforcen/fpc/blob/main/openCL/testHalf.lpr (procedure testSingleConv).

As these two asm instructions are not directly supported, I've added the hex opcodes, obtained by compiling an intrinsics snippet with g++ -g and disassembling it with objdump:

             
Code: Pascal
DB      $c4, $e2, $79, $13, $c0 // VCVTPH2PS XMM0,XMM0  //  c4 e2 79 13 c0

and

             
Code: Pascal
DB      $c4, $e3, $79, $1d, $c0, $00 // vcvtps2ph xmm0,xmm0,0x0

for avx-512 fp16 instructions

 
Code: Pascal
62 f5 74 08 58 d0       vaddph %xmm0,%xmm1,%xmm2
62 f5 74 08 5c d0       vsubph %xmm0,%xmm1,%xmm2
62 f5 74 08 59 d0       vmulph %xmm0,%xmm1,%xmm2
62 f5 74 08 5e d0       vdivph %xmm0,%xmm1,%xmm2

These require a CPU with the AVX-512 FP16 extension (for example Intel Sapphire Rapids Xeon), which is not my case   :'(

« Last Edit: May 05, 2024, 10:43:57 am by rforcen »

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11721
  • FPC developer.
Re: Half precision floating point - wish list
« Reply #7 on: May 05, 2024, 11:06:16 am »
I did a quick test, and VCVTPH2PS is supported in FPC trunk it seems.

Can't test either since I have no AVX-512. (Afaik Intel gen 12 and gen 13/14 mostly have no AVX-512 either. Only the gen 10 (Ice Lake) and gen 11 laptops with a letter in their name do.)

rforcen

  • Jr. Member
  • **
  • Posts: 52
Re: Half precision floating point - wish list
« Reply #8 on: May 05, 2024, 05:10:10 pm »
Combining SIMD with MT on an average non-AVX-512 APU, the OpenCL performance is matched.

I've removed all AVX-512 specific opcodes, as they require an exotic CPU.

https://github.com/rforcen/fpc/blob/main/openCL/testHalf.lpr

Code: Pascal
procedure _gen_ab_SIMD(i: PtrInt; {%H-}pnt: pointer; Item: TMultiThreadProcItem);
« Last Edit: May 05, 2024, 05:13:12 pm by rforcen »

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11721
  • FPC developer.
Re: Half precision floating point - wish list
« Reply #9 on: May 05, 2024, 05:36:30 pm »
I have a routine for flat field correction (which corrects every column in an image for non-uniform lighting; usually lamp intensity at the sides is slightly less), but I implement that in 16-bit fixed-point math to maximize register utilization.

If your input is 8-bit like mine, scaled/fixedpoint might also be a solution.
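
To illustrate the scaled/fixed-point idea, here is a minimal sketch that assumes 8-bit pixels and one gain per column stored as an 8.8 fixed-point Word (256 = 1.0). It is not the actual routine described above; CorrectPixel, the 8.8 format and the sample gains are all assumptions made for the example.

```pascal
program FlatFieldDemo;
{$mode objfpc}

const
  FracBits = 8;   // assumed 8.8 fixed-point format: 256 represents 1.0

// Hypothetical helper: scale one 8-bit pixel by an 8.8 fixed-point gain,
// rounding to nearest and saturating at 255.
function CorrectPixel(p: Byte; gain: Word): Byte;
var
  v: LongWord;
begin
  v := (LongWord(p) * gain + (1 shl (FracBits - 1))) shr FracBits;
  if v > 255 then
    v := 255;
  Result := Byte(v);
end;

var
  // One image row and made-up per-column gains: the outer columns are
  // boosted by roughly 10% (282 / 256) to compensate for darker edges.
  row: array[0..3] of Byte = (100, 200, 200, 100);
  gains: array[0..3] of Word = (282, 256, 256, 282);
  x: Integer;
begin
  for x := Low(row) to High(row) do
    row[x] := CorrectPixel(row[x], gains[x]);
  for x := Low(row) to High(row) do
    Write(row[x], ' ');
  WriteLn;
end.
```

Because both the pixel and the gain fit in 16 bits, eight such lanes fit in one 128-bit SSE register, which is the register-utilization argument for fixed point over wider floating-point types.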

rforcen

  • Jr. Member
  • **
  • Posts: 52
Re: Half precision floating point - wish list
« Reply #10 on: May 06, 2024, 09:49:00 am »
I plan on using this for AI apps. fp16 flavors are getting popular.

 
