I found the culprit, and never in my wildest dreams would I have imagined it was the cause. A single, solitary function call...
Floor.
Yes. Explicitly turning a Double into an Integer apparently takes about 5,000,000 cycles.
Changing all my Floors into Truncs, and relying on the implicit conversion to Integer, improved my numbers.
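For anyone following along, here is a minimal sketch of the kind of swap I mean. The program and variable names are made up for illustration and are not taken from my actual code. One caveat worth keeping in mind: Floor rounds toward negative infinity while Trunc rounds toward zero, so the two only agree for non-negative values (or values that are already whole numbers).

```pascal
program FloorVsTrunc;

{$mode objfpc}

uses
  Math; // Floor comes from the Math unit; Trunc is built into the System unit

var
  X: Double;
  A, B: Integer;
begin
  // Non-negative input: both give the same result
  X := 3.7;
  A := Floor(X);  // explicit Double -> Integer via Math.Floor
  B := Trunc(X);  // Trunc returns Int64, implicitly converted to Integer
  WriteLn(A, ' ', B);  // prints "3 3"

  // Negative input: the results differ
  X := -3.7;
  A := Floor(X);  // rounds toward negative infinity
  B := Trunc(X);  // rounds toward zero
  WriteLn(A, ' ', B);  // prints "-4 -3"
end.
```

So the swap is a drop-in replacement only if the values can never go negative.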
Instead of 7% Idle and 12% Run, I'm now getting...
(wait for it)
5% Idle and 6.5% Run!
Against the C++ 3%/4.5%, color me a Happy Camper!
Free Pascal (except for Floor...) is a winner!
Thanks for all of your help and suggestions! Now to try and shave off that extra 2%...
As an aside: would this be considered a bug? In my test, Trunc was 2x faster than Floor, when the two should be more or less identical in speed.
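If anyone wants to check this on their own machine, a rough timing loop along these lines should do. This is only a sketch, not my actual test: the iteration count, the input values, and the use of GetTickCount64 from SysUtils are all assumptions chosen for illustration, and the result will depend on compiler version, target CPU, and optimization settings.

```pascal
program BenchFloorTrunc;

{$mode objfpc}

uses
  SysUtils, Math;

const
  N = 10000000; // iteration count, picked arbitrarily

var
  I: Integer;
  X: Double;
  Sum: Int64;
  T0: QWord;
begin
  // Time Floor. The running sum keeps the call from being optimized away.
  X := 0.5;
  Sum := 0;
  T0 := GetTickCount64;
  for I := 1 to N do
  begin
    Sum := Sum + Floor(X);
    X := X + 0.25;
  end;
  WriteLn('Floor: ', GetTickCount64 - T0, ' ms (sum=', Sum, ')');

  // Time Trunc on the same non-negative inputs, so both sums should match.
  X := 0.5;
  Sum := 0;
  T0 := GetTickCount64;
  for I := 1 to N do
  begin
    Sum := Sum + Trunc(X);
    X := X + 0.25;
  end;
  WriteLn('Trunc: ', GetTickCount64 - T0, ' ms (sum=', Sum, ')');
end.
```

The inputs are kept non-negative on purpose so that Floor and Trunc return the same values and only the timing differs.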