The results below appear unusual. When the debugger options are disabled, the program runs more slowly.
Adding or removing debug info may have changed the alignment in memory => the code gets loaded at a different address...
Try to add
{$CodeAlign proc=$40}
at the top of the source file => that may have an impact too ($20 is probably enough).
You can also play with the value for loop=.
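
A minimal sketch of where these directives could go, assuming Free Pascal. The program name is a placeholder and the loop= value is only an arbitrary starting point to experiment with; only proc=$40 comes from the suggestion above.

```pascal
program AlignDemo;  // hypothetical name, stands in for your benchmark program

{$mode objfpc}
{$CodeAlign proc=$40}  // align procedure entry points to 64 bytes
{$CodeAlign loop=$10}  // assumed starting value, try others and re-time

begin
  // benchmark code goes here
end.
```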
Once you have that directive in place, start adding nop instructions at the start of the asm block in Test3, one at a time.
For me: 1 to 3 nop => no diff
4th nop => slower
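
Something along these lines is what I mean. This is only a sketch, assuming an x86 or x86-64 target and Intel asm syntax; SpinLoop is a made-up stand-in, not the original Test3, and the iteration count is arbitrary.

```pascal
program NopShift;  // hypothetical stand-in, not the original benchmark

{$mode objfpc}
{$asmmode intel}
{$CodeAlign proc=$40}  // procedure entry starts on a 64-byte boundary

uses
  SysUtils;

procedure SpinLoop; assembler; nostackframe;
asm
  nop                  // add more nop lines here, one at a time, and re-time
  mov ecx, 100000000   // arbitrary iteration count
@again:
  dec ecx
  jnz @again
end;

var
  StartTick: QWord;
begin
  StartTick := GetTickCount64;
  SpinLoop;
  WriteLn('elapsed ms: ', GetTickCount64 - StartTick);
end.
```

Because the procedure entry is pinned by proc=$40, each extra nop shifts the @again label by exactly one byte, so any change in timing comes from where the loop lands.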
But that is hardly because the nop itself takes that much time: it sits before the loop, so it is executed just once.
The slowdown comes from the loop moving to a different alignment.
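
To illustrate the arithmetic with made-up numbers (the block size, loop size, and starting offset below are all assumptions, not measurements from the actual code): each nop is one byte, so each one pushes the loop one byte further, until at some point the loop body crosses a block boundary.

```pascal
program LoopAlignSketch;

{$mode objfpc}

const
  BlockSize  = 32;  // assumed size in bytes of the block the CPU fetches/decodes
  LoopBytes  = 20;  // assumed size of the loop body in bytes
  BaseOffset = 9;   // assumed offset of the loop start with zero nops

var
  Nops, FirstByte, LastByte: Integer;
begin
  for Nops := 0 to 4 do
  begin
    FirstByte := BaseOffset + Nops;         // each nop is one byte
    LastByte := FirstByte + LoopBytes - 1;
    Write(Nops, ' nop: loop occupies bytes ', FirstByte, '..', LastByte);
    // The loop straddles a boundary when its first and last byte
    // fall into different blocks.
    if (FirstByte div BlockSize) <> (LastByte div BlockSize) then
      WriteLn('  -> straddles a ', BlockSize, '-byte boundary')
    else
      WriteLn('  -> fits in one ', BlockSize, '-byte block');
  end;
end.
```

With these numbers, 0 to 3 nops keep the loop inside one block and only the 4th pushes it across the boundary. Which boundary matters on a real CPU depends on the microarchitecture.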
As I said, benchmarks like this are extremely easy to get wrong.
(There are prior discussions on this hidden somewhere in the forum (or maybe on the mailing list), IIRC including a link to a YouTube video with some explanations.)