Author Topic: [FPC] Install-time compilation  (Read 2777 times)

srcstorm

  • New Member
  • Posts: 21
[FPC] Install-time compilation
« on: December 17, 2015, 11:13:36 am »
Hi,

Here is my feature request: provide a means to generate an executable optimized for the machine that the compilation is running on, similar to the process described here:
http://www.infoq.com/news/2014/07/ahead-of-time-compiler-os

The resulting executable will be specific to that machine, and should issue a warning if run on another machine.
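One way the machine check could work is sketched below, assuming the installer records a coarse fingerprint of the build machine and the program compares it at startup. This is not an existing FPC feature; the names `machine_fingerprint` and `mismatched_fields`, and the choice of fields, are hypothetical.

```python
import platform


def machine_fingerprint():
    """Return a coarse description of the current machine.

    Hypothetical scheme: a real check would more likely compare CPU
    feature flags than these descriptive strings.
    """
    return {
        "system": platform.system(),
        "machine": platform.machine(),
        "processor": platform.processor(),
    }


def mismatched_fields(recorded, current=None):
    """List the fingerprint fields that differ from the recorded ones."""
    if current is None:
        current = machine_fingerprint()
    return sorted(key for key in recorded if recorded[key] != current.get(key))


if __name__ == "__main__":
    # Assumption: in practice the installer would store this at install time.
    recorded = machine_fingerprint()
    diff = mismatched_fields(recorded)
    if diff:
        print("Warning: built for another machine, differing fields:", diff)
    else:
        print("Running on the machine this executable was built for.")
```

The check only warns and does not refuse to run, which matches the request above.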

The purpose of this kind of compilation is to have the most optimized executable during installation of the project to a client. Install-time compilation can be done on source code, but an additional capability to work on a binary intermediate representation (IR) would be very welcome.

Developers shouldn't have to compile their projects for each and every target. Since fpc is very accessible to end users, we should be able to distribute a single binary IR. This would be very convenient for developers, and would also give users the fastest possible executable.

marcov

  • Global Moderator
  • Hero Member
  • Posts: 7503
Re: [FPC] Install-time compilation
« Reply #1 on: December 17, 2015, 01:38:09 pm »

Quote
The purpose of this kind of compilation is to have the most optimized executable during installation of the project to a client.

Do you have any benchmarks showing how much this would matter? It requires completely new infrastructure, so you would want a ballpark figure of the gains first.

Maybe the non-native backends (JVM and LLVM) offer an alternative route to this goal, and a better one, since their infrastructure is already established (compared to a newly introduced FPC one).

Quote
Developers shouldn't have to compile their projects for each and every target. Since fpc is very accessible to end users, we should be able to distribute a single binary IR. This would be very convenient for developers, and would also give users the fastest possible executable.

That is exchanging customer (user) ease for developer ease, since the customer has to install (and worse: manage/update) a runtime.

srcstorm

  • New Member
  • Posts: 21
Re: [FPC] Install-time compilation
« Reply #2 on: December 20, 2015, 03:34:47 am »
Quote
Do you have any benchmarks showing how much this would matter? It requires completely new infrastructure, so you would want a ballpark figure of the gains first.

Hi marcov,

I prepared a batch file to investigate the performance gain from targeting a specific CPU:
Code: Text
@ECHO OFF
SETLOCAL
REM Adjust the path to match the local FPC installation.
SET PATH=%PATH%;C:\FPC\3.0.0\bin\i386-win32
SET COMPILER=ppcrossx64
SET PRJ=myproject
SET CPU=x86_64
SET TARGET=ATHLON64
SET FPU=AVX2
REM Platform-independent optimization switches.
SET PIOPT=-O3 -OoREGVAR -OoSTACKFRAME -OoPEEPHOLE -OoLOOPUNROLL -OoTAILREC -OoCSE -OoDFA -OoUSERBP -OoORDERFIELDS -OoREMOVEEMPTYPROCS
REM Remove feedback files and executables left over from earlier runs.
DEL *_%CPU%_*.rpt
DEL *_%CPU%*.exe

ECHO ON
@REM Baseline: no optimization switches.
%COMPILER% %PRJ% -B -MObjFPC -P%CPU% -o%PRJ%_%CPU%.exe

@REM Platform-independent build, pass 1: generate whole-program optimization feedback.
%COMPILER% %PRJ% -B -MObjFPC -P%CPU% %PIOPT% -OWall -FW%PRJ%_%CPU%_pi.rpt -CX -XX -Xs- -o%PRJ%_%CPU%_pi.exe
@REM Pass 2: recompile using the feedback file.
%COMPILER% %PRJ% -B -MObjFPC -P%CPU% %PIOPT% -Owall -Fw%PRJ%_%CPU%_pi.rpt -CX -XX -o%PRJ%_%CPU%_pi.exe

@REM Platform-specific build: same two passes, plus -Cp, -Cf and -Op for the target CPU and FPU.
%COMPILER% %PRJ% -B -MObjFPC -P%CPU% -Cp%TARGET% -Cf%FPU% %PIOPT% -Op%TARGET% -OWall -FW%PRJ%_%CPU%_ps.rpt -CX -XX -Xs- -o%PRJ%_%CPU%_ps.exe
%COMPILER% %PRJ% -B -MObjFPC -P%CPU% -Cp%TARGET% -Cf%FPU% %PIOPT% -Op%TARGET% -Owall -Fw%PRJ%_%CPU%_ps.rpt -CX -XX -o%PRJ%_%CPU%_ps.exe
@ECHO OFF

REM Create the benchmark script on the first run only.
SET BENCH=bench_%CPU%.bat
IF EXIST %BENCH% GOTO RUNBENCH
ECHO @ECHO OFF>%BENCH%
ECHO ECHO ***>>%BENCH%
ECHO ECHO Unoptimized:>>%BENCH%
ECHO %PRJ%_%CPU%>>%BENCH%
ECHO ECHO ***>>%BENCH%
ECHO ECHO Platform-independent optimizations:>>%BENCH%
ECHO %PRJ%_%CPU%_pi>>%BENCH%
ECHO ECHO ***>>%BENCH%
ECHO ECHO Platform-specific optimizations:>>%BENCH%
ECHO %PRJ%_%CPU%_ps>>%BENCH%
ECHO ECHO ***>>%BENCH%
ECHO PAUSE>>%BENCH%

:RUNBENCH
ECHO.
%BENCH%

As a sample program I used the one discussed here:
http://free-pascal-general.1045716.n5.nabble.com/code-optimization-td2848157.html

I compiled both the initial inefficient code listing,
http://free-pascal-general.1045716.n5.nabble.com/code-optimization-tp2848157p2849756.html
and a revised efficient listing:
http://free-pascal-general.1045716.n5.nabble.com/code-optimization-tp2848157p2852367.html

The results:
> Inefficient code
>> x86_64 Unoptimized 12.40s
>> x86_64 Optimized 11.31s
>> x86_64+ATHLON64+AVX2 Optimized 11.28s

> Efficient code
>> x86_64 Unoptimized 7.49s
>> x86_64 Optimized 6.19s
>> x86_64+ATHLON64+AVX2 Optimized 6.19s

As you can see, the platform-specific optimizations gave no measurable improvement over the platform-independent ones, although I was hoping for a better result.
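To put the timings above in relative terms, here is a minimal sketch that computes the percentage reduction in run time for each build. The function name `gain_percent` and the dictionary layout are assumptions introduced for illustration; the timings are the ones listed above.

```python
def gain_percent(before, after):
    """Percentage reduction in run time when going from `before` to `after`."""
    return (before - after) / before * 100.0


# Timings in seconds, copied from the results listed above.
results = {
    "Inefficient code": {"unoptimized": 12.40, "pi": 11.31, "ps": 11.28},
    "Efficient code": {"unoptimized": 7.49, "pi": 6.19, "ps": 6.19},
}

if __name__ == "__main__":
    for name, t in results.items():
        print(name)
        # Gain from the platform-independent switches over the baseline.
        print("  pi vs unoptimized: %.1f%%" % gain_percent(t["unoptimized"], t["pi"]))
        # Extra gain from adding the CPU- and FPU-specific switches.
        print("  ps vs pi:          %.1f%%" % gain_percent(t["pi"], t["ps"]))
```

The first figure isolates what the generic optimization switches contribute, and the second isolates what the CPU-specific switches add on top, which is the number that matters for the install-time compilation question.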