I don't think that timing with TImage gives an accurate result. With TImage, the screen update is postponed to the OnPaint event of the TImage, so it is not counted in the measured time.
I replaced Code2 with a version that draws pixels directly on the Panel.
There was a problem with the line order and byte order of the BGRABitmap (both can change depending on the platform). I've also used precomputed palettes, which should hopefully speed things up.
I don't get any significant difference between TLazIntfImage and TBGRABitmap: I get about 10 ms in both cases. Note that TLazIntfImage could be optimized further by using a palette.
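To illustrate the line order, byte order and palette points above, here is a minimal sketch in plain Free Pascal. It is not the code from the benchmark and does not use BGRABitmap or TLazIntfImage: the names TChannelShifts, BuildGrayPalette and MemoryRow and the two channel layouts are assumptions made up for the example. The idea is that the channel order is baked into the palette once, so the inner loop is only a table lookup, and the line order is handled by a small row-mapping function.

```pascal
program PaletteSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical description of a pixel layout: bit offset of each
  // channel inside a 32-bit pixel value.
  TChannelShifts = record
    RedShift, GreenShift, BlueShift, AlphaShift: Byte;
  end;
  TPalette = array[0..255] of LongWord;

const
  // Assumed layouts, for illustration only; the real one depends on the platform.
  BGRAOrder: TChannelShifts =
    (RedShift: 16; GreenShift: 8; BlueShift: 0; AlphaShift: 24);
  RGBAOrder: TChannelShifts =
    (RedShift: 0; GreenShift: 8; BlueShift: 16; AlphaShift: 24);

// Build an opaque grayscale palette once, already packed in the target order.
procedure BuildGrayPalette(const Shifts: TChannelShifts; out Pal: TPalette);
var
  i: Integer;
begin
  for i := 0 to 255 do
    Pal[i] := (LongWord(i) shl Shifts.RedShift) or
              (LongWord(i) shl Shifts.GreenShift) or
              (LongWord(i) shl Shifts.BlueShift) or
              (LongWord(255) shl Shifts.AlphaShift);
end;

// Map a logical row (0 = top of the image) to its row index in memory.
function MemoryRow(Y, Height: Integer; BottomUp: Boolean): Integer;
begin
  if BottomUp then
    Result := Height - 1 - Y
  else
    Result := Y;
end;

const
  W = 4;
  H = 3;
var
  Pal: TPalette;
  Pixels: array[0..W * H - 1] of LongWord;
  x, y: Integer;
begin
  BuildGrayPalette(BGRAOrder, Pal);
  for y := 0 to H - 1 do
    for x := 0 to W - 1 do
      // Per pixel: one table lookup, no channel packing.
      Pixels[MemoryRow(y, H, True) * W + x] := Pal[(x + y * W) * 20];
  // Print the first pixel in memory as hexadecimal.
  WriteLn(HexStr(Pixels[0], 8));
end.
```

Switching to the other layout only means passing RGBAOrder to BuildGrayPalette; the drawing loop stays the same. With a real bitmap class, the row mapping and channel order would come from whatever the class reports for the current platform rather than from constants.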
macOS 64-bit
---------------
Code1 (bmp pixels): 88ms
Code2 (panel pixels): 77.5ms
Code3 (DIB): N/A
Code4 (BGRA): 10.5ms
Code5 (Intf): 10ms
Can you give me your times with this version?