
Author Topic: INITFINAL feature modification question.  (Read 1521 times)

Interferon

  • Newbie
  • Posts: 6
INITFINAL feature modification question.
« on: June 20, 2023, 01:06:48 am »
I am writing a bootloader for a RISC-V microcontroller that has only 2 KB of flash available for the bootloader section.
Even when no units are used, an empty Pascal program still generates around 250 bytes of machine code to execute unit initialization and finalization sections.  There is a compiler switch that is supposed to turn init and finalize on or off (-SfINITFINAL), but it only works when compiling the RTL.  And even then, I had to add {$define FPC_HAS_FEATURE_INITFINAL} sections around the fpc_InitializeUnits and FinalizeUnits procedures so that they return immediately when the feature is turned off.
By doing this, I was able to get an empty project to generate only 360 bytes of machine code in the executable.
So my questions are:
1.  Shouldn't this be a feature switch of the compiler code generation instead of the RTL?  It would be better to always have the init/finalize procedures available in the system.o lib, but selectively added, depending on the config of the project when compiling.
2.  If it is added to the compiler, should a new switch be added besides -SfINITFINAL so that it can be used when compiling a project instead of the RTL?  Or should -SfINITFINAL be modified to allow setting when compiling user code?
3.  Would it be even better to simply have the compiler add one jump instruction for each unit init/finalization procedure encountered?  That way, if no units are used, or none of the used units have init/finalization sections, no machine code would be generated at all.  Each jump instruction is only 4 bytes on RISC-V, and this would avoid the overhead of the jump lookup tables currently used.
4.  Is this the correct forum for communicating with the main compiler developers, or do I need to go to the mailing list?

I can do the modifications in the code and do a merge request for whatever the correct solution is.

ccrause

  • Hero Member
  • Posts: 986
Re: INITFINAL feature modification question.
« Reply #1 on: June 20, 2023, 02:21:31 pm »
Quote
I am writing a bootloader for a RISC-V microcontroller that has only 2 KB of flash available for the bootloader section.
Even when no units are used, an empty Pascal program still generates around 250 bytes of machine code to execute unit initialization and finalization sections.  There is a compiler switch that is supposed to turn init and finalize on or off (-SfINITFINAL), but it only works when compiling the RTL.  And even then, I had to add {$define FPC_HAS_FEATURE_INITFINAL} sections around the fpc_InitializeUnits and FinalizeUnits procedures so that they return immediately when the feature is turned off.
By doing this, I was able to get an empty project to generate only 360 bytes of machine code in the executable.

I am also poking around the compiler to try and remove redundant code for AVR bootloaders (where code size is sometimes even more limited).  Manually removing the RTL functions is probably not the correct way, since calls to these functions are automatically inserted by the compiler (here and here). My opinion is also that unit init/final code should be executed if included in the main program, else unexpected things may happen.  For example including the heap manager requires initialization, else the heap functionality will be broken.

One strategy is to check whether any init/final code is registered by the used units and, if not, to omit the init/final calls.  For unit initialization code it seems easy to omit the call if no init routines are registered by used units (there may be exceptions that add init code at a later stage, so this may not be straightforward). During linking the dead code should be eliminated (this does work for AVR), thus automatically removing code if no initialization is required.

For finalization code it is more complicated, since only one call to fpc_do_exit is made, which in turn calls InternalExit and System_Exit.  InternalExit calls ExitProc, then prints out possible error info, then calls FinalizeUnits.  Carving out the FinalizeUnits call to apply the same methodology as for unit initialization would require refactoring the fpc_do_exit call chain.  An alternative could be to allow the use of the noreturn modifier with the main procedure block, which should then be interpreted as meaning that no exit code is needed.  This seems appropriate for embedded targets, since there is nothing to fall back to (provided the end of the code is guarded with an endless loop just in case).
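
As a rough illustration of the exit chain described above, here is a small C model of the call order. It is only a sketch of the sequence as stated in this post, not the RTL source: every function below is a hypothetical stub that records when it runs, and the real System_Exit would of course not return.

```c
#include <assert.h>
#include <stddef.h>

/* Steps of the exit sequence, in the order described in the post. */
enum step {
    STEP_EXIT_PROC,
    STEP_ERROR_INFO,
    STEP_FINALIZE_UNITS,
    STEP_SYSTEM_EXIT
};

#define MAX_STEPS 8
enum step exit_trace[MAX_STEPS];
size_t exit_trace_len = 0;

/* Append one step to the trace (hypothetical helper for this model). */
static void record(enum step s) {
    assert(exit_trace_len < MAX_STEPS);
    exit_trace[exit_trace_len++] = s;
}

/* Stubs standing in for the RTL routines named in the post. */
static void exit_proc(void)        { record(STEP_EXIT_PROC); }
static void print_error_info(void) { record(STEP_ERROR_INFO); }
static void finalize_units(void)   { record(STEP_FINALIZE_UNITS); }
static void system_exit(void)      { record(STEP_SYSTEM_EXIT); }

/* Models InternalExit: ExitProc, then error info, then FinalizeUnits. */
static void internal_exit(void) {
    exit_proc();
    print_error_info();
    finalize_units();
}

/* Models fpc_do_exit: the single entry point the compiler calls. */
void model_do_exit(void) {
    internal_exit();
    system_exit();
}
```

Because finalize_units sits in the middle of internal_exit, the compiler cannot drop it simply by leaving out one call at the program's end, which is why the refactoring mentioned above would be needed.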

Anyway, just my random thoughts on the matter.

Quote
So my questions are:
1.  Shouldn't this be a feature switch of the compiler code generation instead of the RTL?  It would be better to always have the init/finalize procedures available in the system.o lib, but selectively added, depending on the config of the project when compiling.
Yes, I think this should be handled by the compiler.  Any RTL code not called somewhere should be eliminated by the linker.

Quote
2.  If it is added to the compiler, should a new switch be added besides -SfINITFINAL so that it can be used when compiling a project instead of the RTL?  Or should -SfINITFINAL be modified to allow setting when compiling user code?
I'm leaning towards self cleaning initialization logic, i.e. no calls if no initialization is required.  For embedded targets finalization seems more optional.

Quote
3.  Would it be even better to simply have the compiler add one jump instruction for each unit init/finalization procedure encountered?  That way, if no units are used, or none of the used units have init/finalization sections, no machine code would be generated at all.  Each jump instruction is only 4 bytes on RISC-V, and this would avoid the overhead of the jump lookup tables currently used.
I agree that the indirection adds extra code, but it isn't clear to me why this route was taken.

Quote
4.  Is this the correct forum for communicating with the main compiler developers, or do I need to go to the mailing list?

I can do the modifications in the code and do a merge request for whatever the correct solution is.
You can also try the fpc-devel mailing list.

ccrause

  • Hero Member
  • Posts: 986
Re: INITFINAL feature modification question.
« Reply #2 on: June 20, 2023, 07:04:29 pm »
Quote
I am also poking around the compiler to try and remove redundant code for AVR bootloaders (where code size is sometimes even more limited).
My changes to get rid of the FPC_INIT_FUNC_TABLE wrapper and call if no initialization is required can be viewed here.  Note the focus for this was the AVR target, other targets may require further modifications.

Because of the different call sequence on program exit, the same strategy cannot be applied symmetrically to finalization code.  One option could be to move the generation of the individual calls currently wrapped in fpc_do_exit into the compiler; the finalization call could then be omitted if the finalization list is empty. This doesn't seem elegant, though, since some of the RTL logic would shift to the compiler.

PascalDragon

  • Hero Member
  • Posts: 5796
  • Compiler Developer
Re: INITFINAL feature modification question.
« Reply #3 on: June 20, 2023, 10:18:28 pm »
Quote
Even when no units are used, an empty Pascal program still generates around 250 bytes of machine code to execute unit initialization and finalization sections.  There is a compiler switch that is supposed to turn init and finalize on or off (-SfINITFINAL), but it only works when compiling the RTL.  And even then, I had to add {$define FPC_HAS_FEATURE_INITFINAL} sections around the fpc_InitializeUnits and FinalizeUnits procedures so that they return immediately when the feature is turned off.

The intention of the InitFinal modeswitch is not to disable the whole initialization/finalization mechanism, but to enable/disable only the parsing of the initialization and finalization sections in certain language modes that don't support them (e.g. MacPas).

Quote
I am also poking around the compiler to try and remove redundant code for AVR bootloaders (where code size is sometimes even more limited).
My changes to get rid of the FPC_INIT_FUNC_TABLE wrapper and call if no initialization is required can be viewed here.  Note the focus for this was the AVR target, other targets may require further modifications.

Because of the different call sequence on program exit, the same strategy cannot be applied symmetrically to finalization code.  One option could be to move the generation of the individual calls currently wrapped in fpc_do_exit into the compiler; the finalization call could then be omitted if the finalization list is empty. This doesn't seem elegant, though, since some of the RTL logic would shift to the compiler.

Since AVR binaries are, AFAIR, built as ELF binaries, maybe you can try something with weak symbols?
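
To make the weak-symbol idea concrete, here is a minimal C sketch of a weak reference on an ELF target. It assumes GCC or Clang (the `__attribute__((weak))` syntax is a compiler extension), and the names `finalize_units_hook` and `run_exit_sequence` are hypothetical, not FPC symbols. How FPC would emit such a reference is left open here.

```c
#include <stddef.h>

/* Weak reference: if no object file in the link defines
   finalize_units_hook, an ELF linker resolves its address to 0
   instead of failing with an undefined-symbol error. */
extern void finalize_units_hook(void) __attribute__((weak));

/* Calls the hook only if some unit actually provided it.
   Returns 1 if finalization ran, 0 if it was absent. */
int run_exit_sequence(void) {
    if (finalize_units_hook != NULL) {
        finalize_units_hook();
        return 1;
    }
    return 0;
}
```

With this pattern the finalization routine is pulled into the binary only when something defines it. The null check itself still costs a few bytes, so on a very small target it would need measuring against the other approaches in this thread.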

Interferon

  • Newbie
  • Posts: 6
Re: INITFINAL feature modification question.
« Reply #4 on: June 21, 2023, 03:45:25 am »
Thanks for the responses.
It seems like the best option is what ccrause suggested: have the compiler simply not emit init/final code when no units need it.
That way, you only pay for what you use.

PascalDragon, do you know why the init/finalize code uses lookup tables instead of just doing a direct call to each unit in turn?  Because it seems as though in every case a direct jump set would use less space than the overhead of the table lookup/jump.  Possibly even 0 space when no jumps are needed.

I have it working how I want, albeit kind of dirty.  Not sure I want to go down the rabbit hole of hacking up the compiler code generation.

PascalDragon

  • Hero Member
  • Posts: 5796
  • Compiler Developer
Re: INITFINAL feature modification question.
« Reply #5 on: June 22, 2023, 09:42:30 pm »
Quote
PascalDragon, do you know why the init/finalize code uses lookup tables instead of just doing a direct call to each unit in turn?  Because it seems as though in every case a direct jump set would use less space than the overhead of the table lookup/jump.  Possibly even 0 space when no jumps are needed.

On space-restricted platforms (e.g. AVR) it uses a list of direct function calls. For other platforms an array is simply more flexible, and this will be even more the case once support for dynamic packages is added, because then the units need to be initialized in depth-first order: loading a package needs to go through all units and initialize those that are not yet initialized.
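
The two strategies mentioned above can be compared in a small C sketch. This is not FPC's generated code or its table layout; the unit names and both `initialize_units_*` functions are hypothetical and only show the shape of each approach.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LOG 4
int init_log[MAX_LOG];   /* records which unit init ran, in order */
size_t init_count = 0;

/* Two hypothetical unit initialization routines. */
void unit_a_init(void) { assert(init_count < MAX_LOG); init_log[init_count++] = 1; }
void unit_b_init(void) { assert(init_count < MAX_LOG); init_log[init_count++] = 2; }

typedef void (*init_proc)(void);

/* Table-driven variant: a generic loop walks an array of entry points.
   The loop and the table are present even for very short lists. */
const init_proc init_table[] = { unit_a_init, unit_b_init };

void initialize_units_table(void) {
    size_t n = sizeof init_table / sizeof init_table[0];
    for (size_t i = 0; i < n; i++)
        init_table[i]();
}

/* Direct-call variant: one call instruction per unit, nothing else.
   With no units to initialize, this function would be empty. */
void initialize_units_direct(void) {
    unit_a_init();
    unit_b_init();
}
```

Both variants run the same routines in the same order. The table form is easier to extend at run time (for example when a package adds units later), while the direct form costs only the calls themselves.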

Interferon

  • Newbie
  • Posts: 6
Re: INITFINAL feature modification question.
« Reply #6 on: June 29, 2023, 10:26:52 am »
Quote
PascalDragon, do you know why the init/finalize code uses lookup tables instead of just doing a direct call to each unit in turn?  Because it seems as though in every case a direct jump set would use less space than the overhead of the table lookup/jump.  Possibly even 0 space when no jumps are needed.

On space-restricted platforms (e.g. AVR) it uses a list of direct function calls. For other platforms an array is simply more flexible, and this will be even more the case once support for dynamic packages is added, because then the units need to be initialized in depth-first order: loading a package needs to go through all units and initialize those that are not yet initialized.
Is this functionality that should be exposed in a compiler switch so that any space restricted processor can take advantage of it?  The CPU architecture isn't enough information to make the decision, because there are huge resource differences between specific microcontrollers with the same instruction set.

 
