
Topic: Where's the optimization? ... (ARM)

AlanTheBeast

  • Sr. Member
  • ****
  • Posts: 348
  • My software never cras....
Where's the optimization? ... (ARM)
« on: September 26, 2022, 07:52:30 pm »
On 32-bit ARM:

Code: Pascal
    UBXMSEi := 7;
    UBXCSEi := 7;
    CFLA := 0;
    CFLB := 0;

With or without -O2, this yields:

Code: ARM assembly
# [489] UBXMSEi := 7;
        mov     r1,#7
        ldr     r0,.Lj475
        strh    r1,[r0]
# [490] UBXCSEi := 7;
        mov     r0,#7
        ldr     r1,.Lj473
        strh    r0,[r1]
# [491] CFLA := 0;
        mov     r0,#0
        ldr     r1,.Lj486
        strb    r0,[r1]
# [492] CFLB := 0;
        mov     r1,#0
        ldr     r0,.Lj488
        strb    r1,[r0]

I would have expected a single load of 7 (and a single load of 0 for CFLA/CFLB), followed by stores to each location.

And perhaps a register add to point to the next location, rather than loading a second pointer for each store. (Yes, the assignments are in declaration order, and I even threw in an {$ALIGN 2} there.)

Or does more optimization take place in the assembler pass?

I suppose I could put them in a record with a case overlay (effectively absolute) to improve that, but expected it out of the compiler.
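The record-with-overlay idea could look roughly like the sketch below. The variable names come from the post, but the type name TUbxState, the overlay fields Indices and Flags, and the field sizes (Word and Byte, guessed from the strh/strb stores in the listing above) are assumptions. Whether this actually produces better ARM code would have to be checked in the assembly output.

```pascal
program OverlayDemo;
{$mode objfpc}

type
  // Hypothetical layout. Field sizes are guessed from the strh/strb
  // stores in the listing: two 16-bit values followed by two 8-bit values.
  TUbxState = record
    case Boolean of
      False: (
        UBXMSEi: Word;      // offset 0
        UBXCSEi: Word;      // offset 2
        CFLA: Byte;         // offset 4
        CFLB: Byte          // offset 5
      );
      True: (
        Indices: LongWord;  // overlays UBXMSEi and UBXCSEi
        Flags: Word         // overlays CFLA and CFLB
      );
  end;

var
  State: TUbxState;

begin
  // Two assignments instead of four. $00070007 puts 7 into both 16-bit
  // halves, so the result does not depend on byte order.
  State.Indices := $00070007;
  State.Flags := 0;

  WriteLn(State.UBXMSEi, ' ', State.UBXCSEi, ' ', State.CFLA, ' ', State.CFLB);
end.
```

The record is deliberately not packed, so the 32-bit overlay field keeps its natural alignment.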

NOTE: this is not critical here, since the code normally runs at about 25 to 30 Hz, but I really expected better.
Everyone talks about the weather but nobody does anything about it.
..Samuel Clemens.

Laksen

  • Hero Member
  • *****
  • Posts: 743
    • J-Software
Re: Where's the optimization? ... (ARM)
« Reply #1 on: September 26, 2022, 09:21:41 pm »
It's a known optimization possibility that the compiler isn't smart enough to do currently.
On paper it's simple enough, but once you are in the node tree it can get tricky to implement :)

AlanTheBeast

Re: Where's the optimization? ... (ARM)
« Reply #2 on: September 26, 2022, 09:49:55 pm »
Quote from: Laksen
It's a known optimization possibility that the compiler isn't smart enough to do currently.
On paper it's simple enough, but once you are in the node tree it can get tricky to implement :)

That's actually why I'm curious whether there are assembler-level optimizations that are invisible to us up here in so-called high-level land.

For me, the above would be a no-brainer - but I used to write x86 and other less RISC oriented assembler ...

There is another trick, of course: declare a 64-bit word (absolute) at the location where those 4 vars are located and clobber them with 1 write... I may go there yet.
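A minimal sketch of that absolute trick follows, under some assumptions. The four variables only occupy 6 bytes if their sizes are what the strh/strb stores suggest, so the hypothetical TUbxVars record adds a Pad field to reach 8 bytes; otherwise a 64-bit write would clobber two neighbouring bytes. The names TUbxVars, Pad, Raw and ResetPattern are made up for illustration. The QWord is used as the backing storage here so that it gets QWord alignment, with the record laid over it.

```pascal
program AbsoluteDemo;
{$mode objfpc}

type
  // Hypothetical grouping of the four variables from the post.
  TUbxVars = record
    UBXMSEi: Word;
    UBXCSEi: Word;
    CFLA: Byte;
    CFLB: Byte;
    Pad: Word;    // filler so the block is exactly 8 bytes
  end;

const
  // 7, 7, 0, 0 combined into one 64-bit value; byte order matters here.
{$IFDEF ENDIAN_LITTLE}
  ResetPattern: QWord = $0000000000070007;
{$ELSE}
  ResetPattern: QWord = $0007000700000000;
{$ENDIF}

var
  Raw: QWord;                   // backing storage, aligned as a QWord
  Vars: TUbxVars absolute Raw;  // same memory viewed as separate fields

begin
  // One assignment in the source resets all four variables.
  Raw := ResetPattern;

  WriteLn(Vars.UBXMSEi, ' ', Vars.UBXCSEi, ' ', Vars.CFLA, ' ', Vars.CFLB);
end.
```

On a 32-bit target the single 64-bit assignment is likely still split into more than one store instruction, so the gain is best confirmed in the generated assembly.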

And then of course some time soon, this project will go 64bit and I'll be obsessing over other little bits...

(To be clear, the question concerns a fairly key bit of code that handles incoming data. It does not run that often, but the function needs to finish in real time, close to when the measurements were made...)

PascalDragon

  • Hero Member
  • *****
  • Posts: 5481
  • Compiler Developer
Re: Where's the optimization? ... (ARM)
« Reply #3 on: September 27, 2022, 09:46:58 am »
Quote from: Laksen
It's a known optimization possibility that the compiler isn't smart enough to do currently.
On paper it's simple enough, but once you are in the node tree it can get tricky to implement :)

Quote from: AlanTheBeast
Actually why I'm curious if there are assembler level optimizations that are invisible to us up here in so-called high level land.

Well, essentially anything located in the assembler optimizer (in this case either the AArch64-specific one or the general ARM one). In both cases, however, the results will be visible in the assembly output.

You can always open a feature request for that optimization; I'm rather sure someone like FPK or Gareth would like to play with that...

Seenkao

  • Hero Member
  • *****
  • Posts: 550
    • New ZenGL.
Re: Where's the optimization? ... (ARM)
« Reply #4 on: September 28, 2022, 10:00:54 am »
Quote from: PascalDragon
I'm rather sure someone like FPK or Gareth would like to play with that...
https://gitlab.com/freepascal.org/fpc/source/-/issues/39781

The optimization problem is the same on all architectures. Nobody is going to tackle it in the near future. The compiler fails to notice even trivial things. And on the ARM architecture, optimization is much worse than on x86.
I strive to create applications that are minimal and reasonably fast.
Working on ZenGL

 
