The implementation in FPC is known and can be found in the compiler source:
It is a single locked operand following two instructions (where the platform allows it, which in practice is always).
This is taken care of in the high-level code generator. Read the code, then decide for yourself whether anything is wrong (it is not).
Do not trust the wiki; trust the compiler source code. The wiki is polluted.