
Author Topic: [SOLVED] FPC Raspberry + ARM Neon  (Read 7198 times)

johnsson

  • New Member
  • *
  • Posts: 22
  • Lazarus Rocks
[SOLVED] FPC Raspberry + ARM Neon
« on: April 16, 2017, 03:08:02 am »
Hello everyone

I'm trying to translate my video and image processing routines, written in x86 SIMD assembly, to ARM NEON. I'm using a Raspberry Pi 3 B with Raspbian, Lazarus 1.7 and FPC 3.0, but I get a lot of weird errors when I try to use NEON instructions such as vadd or registers such as q0 or q1 (they are unrecognized).

Does anyone know something about this, or whether that version of FPC supports the instruction set?
« Last Edit: April 23, 2017, 03:51:43 pm by johnsson »
Just a regular guy

Laksen

  • Hero Member
  • *****
  • Posts: 724
    • J-Software
Re: FPC Raspberry + ARM Neon
« Reply #1 on: April 16, 2017, 09:05:55 am »
Try to post some examples of your routines.

As far as I remember, FPC only supports the pre-UAL mnemonics, where vadd is fadd, etc.

Thaddy

  • Hero Member
  • *****
  • Posts: 14205
  • Probably until I exterminate Putin.
Re: FPC Raspberry + ARM Neon
« Reply #2 on: April 16, 2017, 12:52:25 pm »
This may help:
Code: Pascal
  {$Macro on}
  {$Define VABS:=FABS}
  {$Define VADD:=FADD}
  {$Define VDIV:=FDIV}
  {$Define VMLA:=FMAC}
  {$Define VMLS:=FNMAC}
  // define a convenient alias for these two
  {.$Define VMOV (immediate)      FCONST a}
  {.$Define VMOV (register)       FCPY}
  //
  {$Define VMUL:=FMUL}
  {$Define VNEG:=FNEG}
  {$Define VNMLA:=FNMSC}
  {$Define VNMLS:=FMSC}
  {$Define VNMUL:=FNMUL}
  {$Define VSQRT:=FSQRT}
  {$Define VSUB:=FSUB}
Specialize a type, not a var.

johnsson

  • New Member
  • *
  • Posts: 22
  • Lazarus Rocks
Re: FPC Raspberry + ARM Neon
« Reply #3 on: April 16, 2017, 03:16:40 pm »
Quote from: Laksen
Try to post some examples of your routines.

As far as I remember, FPC only supports the pre-UAL mnemonics, where vadd is fadd, etc.

The code is pretty simple. I just try to compile a vadd instruction with the d0 and q0 registers:

Code:
  vadd.i32 d0, d1 // Here I got the error message "Error: Internal error 2010122301"

I tried to use q0 (a 128-bit register), but apparently it's not supported yet. I also changed the instruction by removing the .i32 suffix:

Code:
  vadd d0, d1 // On Raspbian I got a generic error message, but when I compile this on an x86 machine with the ARM cross-compiler the error message is: Error Asm [vadd vreg vreg] invalid combination
Just a regular guy

johnsson

  • New Member
  • *
  • Posts: 22
  • Lazarus Rocks
Re: FPC Raspberry + ARM Neon
« Reply #4 on: April 16, 2017, 03:18:16 pm »
Quote from: Thaddy
This may help:
Code: Pascal
  {$Macro on}
  {$Define VABS:=FABS}
  {$Define VADD:=FADD}
  {$Define VDIV:=FDIV}
  {$Define VMLA:=FMAC}
  {$Define VMLS:=FNMAC}
  // define a convenient alias for these two
  {.$Define VMOV (immediate)      FCONST a}
  {.$Define VMOV (register)       FCPY}
  //
  {$Define VMUL:=FMUL}
  {$Define VNEG:=FNEG}
  {$Define VNMLA:=FNMSC}
  {$Define VNMLS:=FMSC}
  {$Define VNMUL:=FNMUL}
  {$Define VSQRT:=FSQRT}
  {$Define VSUB:=FSUB}

I got a message about an unrecognized fadd opcode; maybe I need to pass some flag to the compiler?

 ::)
« Last Edit: April 18, 2017, 02:51:27 am by johnsson »
Just a regular guy

Thaddy

  • Hero Member
  • *****
  • Posts: 14205
  • Probably until I exterminate Putin.
Re: FPC Raspberry + ARM Neon
« Reply #5 on: April 16, 2017, 04:54:57 pm »
Internal errors can - and should - be posted on the bug tracker.
"Error: Internal error 2010122301"

Enter it with a compilable small example.

[edit]
It looks like this internal error isn't there anymore in trunk...?
Maybe you are using an old version, like 3.0.0 instead of 3.0.2?

[edit2]
That code is indeed commented out in the raatt unit. Use a newer version of the compiler.
« Last Edit: April 16, 2017, 05:37:16 pm by Thaddy »
Specialize a type, not a var.

Thaddy

  • Hero Member
  • *****
  • Posts: 14205
  • Probably until I exterminate Putin.
Re: FPC Raspberry + ARM Neon
« Reply #6 on: April 16, 2017, 05:43:56 pm »
Note that, judging from the code behind that now-removed "Error: Internal error 2010122301", I suppose the dotted postfix may be .w32, not .i32.

I also deduced from that code (around line 308 of raatt.pas in trunk) that unified syntax should be supported, so the syntax vadd.w32 r,r should work, I suppose. Note that the macros I gave you are rather old but should work with 3.0.0 and 2.6.x. The postfix still needs to be correct, though.

Btw: I don't think it is a good idea to use Neon; it is more or less obsolete for this architecture (Cortex-A53). Using VFPv4 instructions proved faster for me, although I indirectly use Neon through Project Ne10 code for audio. FPC does a pretty good job on armhf with proper compiler settings. Always examine the compiler's assembler output before you optimize.
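
To illustrate that last point, here is a minimal sketch (not from the original post) of a plain Pascal routine whose generated code can be inspected. The names Dot, TVec, VecX and VecY are hypothetical. Compiling with fpc -al keeps the generated assembler file (dotproduct.s) next to the source, with the source lines listed in it. Which FPU and CPU selection switches are available depends on the compiler version, so none are assumed here.

```pascal
program dotproduct;
{$mode objfpc}

const
  N = 4;

type
  TVec = array[0..N - 1] of Single;

// Plain Pascal dot product: a small candidate for inspecting
// what the compiler generates before hand-writing assembler.
function Dot(const a, b: TVec): Single;
var
  i: Integer;
begin
  Result := 0.0;
  for i := 0 to N - 1 do
    Result := Result + a[i] * b[i];
end;

const
  VecX: TVec = (1.0, 2.0, 3.0, 4.0);
  VecY: TVec = (0.5, 0.5, 0.5, 0.5);

begin
  // Compile with: fpc -al dotproduct.pas
  // then read dotproduct.s to see the floating-point instructions used.
  WriteLn(Dot(VecX, VecY):0:2); // prints 5.00
end.
```

If the listing already shows the floating-point instructions you expect in the loop, hand-written assembler may gain little.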
« Last Edit: April 16, 2017, 06:41:34 pm by Thaddy »
Specialize a type, not a var.

johnsson

  • New Member
  • *
  • Posts: 22
  • Lazarus Rocks
Re: FPC Raspberry + ARM Neon
« Reply #7 on: April 16, 2017, 10:00:03 pm »
So, on the Raspberry I'm using FPC 3.0.0 (BTW, I found a really useful install script); on the x86 machine FPC is 3.1.1.

Now I'm doing all tests with FPC 3.1.1.

When I use vadd.i32 I get an "invalid combination" error, and when I replace it with vadd.w32 I get "invalid opcode" instead (maybe I'm doing something wrong).

About VFPv4: if it's better than Neon and supported by FPC, great!  :D I'm translating the functions from x86 SIMD to ARM, and Neon was the first option mentioned in a lot of tutorials and articles (e.g. "SIMD ARM NEON"), so it was my first choice.

[Edit]

I see now about Project Ne10; it uses gcc, right?
« Last Edit: April 16, 2017, 10:16:30 pm by johnsson »
Just a regular guy

Thaddy

  • Hero Member
  • *****
  • Posts: 14205
  • Probably until I exterminate Putin.
Re: FPC Raspberry + ARM Neon
« Reply #8 on: April 17, 2017, 08:07:45 am »
Yes, Linaro + GCC. But it actually switches dynamically to Neon assembler code if available.
3.0.2 still contains the internal error code.
I recommend building 3.1.1 on the RPi, because 3.0.0 can still bootstrap it.
Specialize a type, not a var.

Thaddy

  • Hero Member
  • *****
  • Posts: 14205
  • Probably until I exterminate Putin.
Re: FPC Raspberry + ARM Neon
« Reply #9 on: April 17, 2017, 08:41:58 am »
The following code compiles with a trunk crosscompiler AND with trunk on the RPi3:
Code: Pascal
  program armneontest;
  {$mode objfpc}
  procedure armasmtest;assembler;
  // note vadd takes THREE operands here...
  asm
    vadd.i32 d0, d1, d1
  end;

  begin
  end.
So you must upgrade to trunk for the RPi compiler.
Your cross-compiler seems fine. I guess there really isn't a combination for vadd that only takes two registers, but I have to check that.
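
When checking a translated routine, it can help to have a plain Pascal model of what the instruction computes. The sketch below is only an illustration, not part of the compiler or RTL. TInt32x2 and VAddI32 are hypothetical names. It models just the three-operand form vadd.i32 dd, dn, dm on one 64-bit D register (two 32-bit lanes) and assumes wrap-around on overflow, as integer vadd does.

```pascal
program vaddreference;
{$mode objfpc}
{$Q-}{$R-} // no overflow/range checks: lanes wrap around like integer vadd

type
  // Hypothetical helper type: the two 32-bit lanes of one 64-bit D register.
  TInt32x2 = array[0..1] of LongInt;

// Scalar model of "vadd.i32 dd, dn, dm": dd[i] := dn[i] + dm[i] for each lane.
function VAddI32(const dn, dm: TInt32x2): TInt32x2;
var
  i: Integer;
begin
  for i := Low(Result) to High(Result) do
    Result[i] := dn[i] + dm[i];
end;

var
  d0, d1: TInt32x2;

begin
  d1[0] := 3;
  d1[1] := -7;
  d0 := VAddI32(d1, d1); // mirrors: vadd.i32 d0, d1, d1
  WriteLn(d0[0], ' ', d0[1]); // prints 6 -14
end.
```

The first operand is the destination and the other two are the sources, so the example above doubles each lane of d1 and stores the result in d0.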
« Last Edit: April 17, 2017, 08:45:39 am by Thaddy »
Specialize a type, not a var.

johnsson

  • New Member
  • *
  • Posts: 22
  • Lazarus Rocks
Re: FPC Raspberry + ARM Neon
« Reply #10 on: April 18, 2017, 02:48:35 am »
That's really weird.

I checked my FPC version (x86 machine):

Code:
  fpc -iW

It returns 3.1.1.

I copied and pasted your example and got exactly the same error, Asm [vadd vreg vreg vreg] (with one more register in the error).

I'll try updating FPC to the latest version available in trunk, on Raspbian as well.

Thanks a lot =)
Just a regular guy

johnsson

  • New Member
  • *
  • Posts: 22
  • Lazarus Rocks
Re: FPC Raspberry + ARM Neon
« Reply #11 on: April 18, 2017, 05:20:56 am »
I updated my FPC to the latest trunk version and now it's working. I also installed Linaro + gcc + Code::Blocks (I was trying to use the ARM gcc EABI toolchain, but it's not supported by Code::Blocks, only by Eclipse).


I hope that in the near future FPC will implement the registers q0, q1, q2 ...


Thanks a lot for the help  :D
« Last Edit: April 23, 2017, 05:50:54 pm by johnsson »
Just a regular guy

Thaddy

  • Hero Member
  • *****
  • Posts: 14205
  • Probably until I exterminate Putin.
Re: FPC Raspberry + ARM Neon
« Reply #12 on: April 19, 2017, 08:54:04 am »
Hm. I just realized that this vadd is not only a Neon instruction but also a VFPv4 instruction in this particular case.
There are two options left:
- The Ultibo project has real Neon support (maybe ask them to submit their patch to FPC core?)
- Use the VFPv4 instruction set only.

This is partly my mistake, because I am not really familiar with Neon (as I wrote, I only use VFPx assembler) and I forgot to check that.
The Q registers are indeed Neon only.

Sorry for the mix-up. The whole of my answers should read: this works for VFPx, but only because those instructions are copied from, or the same as, Neon. Neon itself, in full, does not work (yet).
Neon code from other languages will link to FPC, though.

Note that ARM says that IF the architecture supports VFPv4 you don't need Neon anymore. (But I could not find anything about the 128-bit (Q) registers...)

Either Laksen or Florian probably knows much more about the subject. In this case my incomplete but fairly accurate knowledge comes from the compiler sources and not from any other documentation... which is fun, but in my case leads to wrong assumptions.
« Last Edit: April 19, 2017, 09:09:22 am by Thaddy »
Specialize a type, not a var.

Laksen

  • Hero Member
  • *****
  • Posts: 724
    • J-Software
Re: FPC Raspberry + ARM Neon
« Reply #13 on: April 19, 2017, 12:06:15 pm »
Part of the problem is that it's a big mess of VFP, NEON, "Advanced SIMD", and pre-UAL/UAL.

The process of fixing it up will probably be a multistage process, but adding the Q registers should be possible of course.

johnsson

  • New Member
  • *
  • Posts: 22
  • Lazarus Rocks
Re: [SOLVED] FPC Raspberry + ARM Neon
« Reply #14 on: April 23, 2017, 04:03:33 pm »
Thanks for the help.

I installed Linaro with Code::Blocks, and I'm converting the x86 SIMD code to ARM Neon and linking it into Lazarus.

Unfortunately I hate the gcc asm style, so I'm using intrinsics instead of pure asm (it's not the best, but so far so good).

I'll keep track of FPC and will probably migrate the source to FPC + inline assembly  8-)
« Last Edit: April 23, 2017, 05:49:14 pm by johnsson »
Just a regular guy

 
