Your program will link against, at minimum, the rtl.
The rtl is not optimized beyond -O2, to facilitate debugging while keeping only safe optimizations, i.e. optimizations without side effects.
So if you want full optimization, you also need to optimize the rtl and all other packages that you use.
It is not rocket science and was already well explained to you....
Ok, read my whole post again and let me clarify:
1) I was happy with the speed gain when I used JUST the Optimization Level 4 in "Compilation and Linking".
2) I do NOT want to use Level 4; instead I want to use the bunch of individual switches that make up the -O4 switch (so I can play with the optimizations).
3) The bunch was used in both Custom Options and Additions and Overrides.
4) It seems that the bunch is not working, whereas if I use -O4 it works just like when I was using Level 4 in Optimization Level.
The -O4 is supposed to be a replacement for:
-OoPEEPHOLE
-OoREMOVEEMPTYPROCS
-OoREGVAR
-OoSTACKFRAME
-OoTAILREC
-OoCSE
-OoCONSTPROP
-OoDFA
-USELOADMODIFYSTORE //it gives an error with this switch, as if it's not supported by my compiler version 3.0.0
-OoLOOPUNROLL
-OoORDERFIELDS
-OoDEADVALUES
-OoFASTMATH
(^this is the "bunch")
but it seems that bunch != -O4.
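
As a rough sketch, the comparison described above amounts to the two command lines printed below. The file name project1.lpr is only a placeholder, and USELOADMODIFYSTORE is left out because the 3.0.0 compiler rejected it; the script only prints the commands, it does not run the compiler.

```shell
# The individual switches from the list above
# (USELOADMODIFYSTORE omitted, since it gave an error on 3.0.0).
BUNCH="-OoPEEPHOLE -OoREMOVEEMPTYPROCS -OoREGVAR -OoSTACKFRAME -OoTAILREC -OoCSE -OoCONSTPROP -OoDFA -OoLOOPUNROLL -OoORDERFIELDS -OoDEADVALUES -OoFASTMATH"

# project1.lpr is a placeholder project file name.
CMD_LEVEL="fpc -O4 project1.lpr"
CMD_BUNCH="fpc $BUNCH project1.lpr"

# Print both variants so they can be compared or pasted into a terminal.
echo "$CMD_LEVEL"
echo "$CMD_BUNCH"
```

Building the same project once with each line and timing the results is the comparison being made here.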