Ah, OK... I'm also interested in an MM under FPC, but keep in mind that the MM cannot be asm-dependent, as it must be cross-platform. I hope Thaddy will make a good point.
Roberto's code is strictly Windows-only and uses *a lot* of Intel assembler. Redoing that cross-platform is not an option. (But it is an option for all Intel platforms.)
What *is* an option is to use some of the algorithmic improvements, which I am focussing on.
Note that the MMs of Delphi and FPC differ in architecture. Not only in the interface, but also in what is expected to be implemented, like storing the block size, which Delphi does not do: Delphi simply relies on the slot/bucket size only.
The FPC compiler and RTL internals rely on that part of the implementation, so it cannot be changed very easily. I will have to implement storage of size.
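To make the "storage of size" point concrete, here is a minimal sketch of one common way to do it: put a small header in front of every block, so the size can be recovered from the pointer alone (which is what the MemSize entry of FPC's TMemoryManager has to do). The names TBlockHeader and Sketch* are hypothetical, made up for this illustration; this is not how FPC's heap or Roberto's code lays out its blocks.

```pascal
program SizeHeaderSketch;
{$mode objfpc}

type
  // Hypothetical header stored directly in front of each user block.
  PBlockHeader = ^TBlockHeader;
  TBlockHeader = record
    Size: PtrUInt; // the size the caller asked for
  end;

// Allocate Size bytes plus room for the header, return the user pointer.
// Note: the header shifts the user pointer by SizeOf(PtrUInt) bytes, so a
// real MM would pad the header to keep the alignment it promises.
function SketchGetMem(Size: PtrUInt): Pointer;
var
  Hdr: PBlockHeader;
begin
  Hdr := System.GetMem(SizeOf(TBlockHeader) + Size);
  if Hdr = nil then
    Exit(nil);
  Hdr^.Size := Size;
  Result := PByte(Hdr) + SizeOf(TBlockHeader);
end;

// Recover the stored size from the user pointer alone.
function SketchMemSize(P: Pointer): PtrUInt;
begin
  Result := PBlockHeader(PByte(P) - SizeOf(TBlockHeader))^.Size;
end;

// Free the block and report the size that was released.
function SketchFreeMem(P: Pointer): PtrUInt;
var
  Hdr: PBlockHeader;
begin
  if P = nil then
    Exit(0);
  Hdr := PBlockHeader(PByte(P) - SizeOf(TBlockHeader));
  Result := Hdr^.Size;
  System.FreeMem(Hdr);
end;

var
  P: Pointer;
begin
  P := SketchGetMem(100);
  WriteLn('stored size: ', SketchMemSize(P));
  SketchFreeMem(P);
end.
```

A bucket-only design skips the header and derives the size from the bucket the block lives in, which saves a few bytes per block but can only report the bucket size, not the requested size.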
OTOH, the core algorithm of FPC's MM is similar, but it is *a lot* better designed in other parts, so the speed gains will not be as big as they were with Delphi.
Don't expect too much from it. The biggest gains will be on Intel only. It will also never be the default, but a plugin MM.
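For those who have not seen how a plugin MM hooks in: FPC lets you read the current manager with GetMemoryManager and install another one with SetMemoryManager. The sketch below only wraps the default manager and counts GetMem calls; CountingGetMem and GetMemCalls are hypothetical names for this illustration, and the counter is not thread-safe. A real replacement would fill in all entries of TMemoryManager and would normally be installed from the initialization section of a unit listed first in the uses clause, the way the bundled cmem unit is used.

```pascal
program PluginMMSketch;
{$mode objfpc}

var
  OldMM, NewMM: TMemoryManager;
  GetMemCalls: PtrUInt = 0; // illustration only, not thread-safe

// Forward to the previously installed manager, counting each call.
function CountingGetMem(Size: PtrUInt): Pointer;
begin
  Inc(GetMemCalls);
  Result := OldMM.GetMem(Size);
end;

var
  P: Pointer;
begin
  // Start from a copy of the current manager and override one entry.
  GetMemoryManager(OldMM);
  NewMM := OldMM;
  NewMM.GetMem := @CountingGetMem;
  SetMemoryManager(NewMM);

  GetMem(P, 64); // now routed through CountingGetMem
  FreeMem(P);    // still handled by the default manager's FreeMem

  // Restoring is safe here because every block came from OldMM anyway.
  SetMemoryManager(OldMM);
  WriteLn('GetMem calls seen: ', GetMemCalls);
end.
```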
Also: once you use the optimizations available in FPC, the pure Pascal implementation of the default MM can be much improved using just the settings that are already there.
FPC is much, much better than Delphi at using vectors, MMX, SSE and the like on Intel, but also e.g. VFPX on ARM, provided you specify the correct options.
So that is really a matter of documenting how to speed up the MM. (This only goes for memory block operations, but that is essentially what an MM does.)
To put it another way: if you used the FPC MM code in Delphi 7 (stripping the storage-of-size part), Delphi 7 would be faster than with its standard MM. You can try that yourself. I did that during the MM competition years ago. (Where I became unofficially slowest with my nifty commm.pas.)