I am curious what the difference in execution time is. Could you please explain or demonstrate?
It's a matter of CPU cache usage.
The original code "erases" (which in practice means writing data) all the rows of the canvas at once. This means the CPU has to keep cached writes pending for many rows at the same time, up to the height of the canvas.
Newer CPUs have large caches and good mechanisms for managing them, but older CPUs do not cope as well. Those older CPUs will run the proposed code faster than the original code because its writes are more sequential and touch fewer cache lines at any one time.
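To illustrate the idea, here is a rough sketch that models the cache rather than timing real hardware. It assumes the original code touches every row before advancing (column order) and the proposed code finishes one row before the next (row order). The 64-byte line size, 4 bytes per pixel, the tiny 64-line LRU cache, and all function names are assumptions chosen for illustration, not taken from the code under discussion.

```python
from collections import OrderedDict

LINE_SIZE = 64        # bytes per cache line (assumed, typical value)
BYTES_PER_PIXEL = 4   # e.g. RGBA (assumed)


def count_misses(offsets, cache_lines):
    """Count misses in a toy fully associative LRU cache of `cache_lines` lines."""
    cache = OrderedDict()
    misses = 0
    for off in offsets:
        line = off // LINE_SIZE
        if line in cache:
            cache.move_to_end(line)        # mark as most recently used
        else:
            misses += 1
            cache[line] = True
            if len(cache) > cache_lines:
                cache.popitem(last=False)  # evict least recently used line
    return misses


def row_by_row(width, height):
    """Byte offsets written when each row is finished before the next starts."""
    for y in range(height):
        for x in range(width):
            yield (y * width + x) * BYTES_PER_PIXEL


def column_by_column(width, height):
    """Byte offsets written when every row is touched before moving right."""
    for x in range(width):
        for y in range(height):
            yield (y * width + x) * BYTES_PER_PIXEL


WIDTH, HEIGHT, CACHE_LINES = 256, 256, 64  # hypothetical canvas and cache size
row_misses = count_misses(row_by_row(WIDTH, HEIGHT), CACHE_LINES)
col_misses = count_misses(column_by_column(WIDTH, HEIGHT), CACHE_LINES)
print("row order misses:   ", row_misses)
print("column order misses:", col_misses)
```

In this toy model the row-order pass misses once per cache line, while the column-order pass misses on every single write, because each cache line is evicted before the loop comes back to it. A real CPU is more complicated (multiple cache levels, prefetching, write buffers), so treat this only as a picture of why having many rows "open" at once is expensive when the cache is small.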
Also, some CPUs write data to ascending addresses (forward) significantly faster than to descending addresses (backward). This is another effect of how the CPU cache is used.