Recent

Author Topic: Iterating/accessing an array, index vs pointer element access  (Read 12885 times)

jmm72

  • Jr. Member
  • **
  • Posts: 79
  • Very experienced in being a beginner...
I've seen some (old) messages here and there mentioning that accessing or iterating elements in an array is way faster using pointers than the normal index way array[ i ], and then I have seen a few mentions that the accessing through indices has been improved. I wanted to do a real comparison.

My tests (program attached below) show that for iterating an array, from beginning to end, pointer access is faster, but not as it seemed to be. Pointer iteration takes 91.2% of the time it took to do the same using indices. For random access, the trouble is doing the pointer math, which takes away all the benefits, at least in my implementation, feel free to correct my code if you think it can do faster (but keep the objfpc mode).

The tests were done with several callings to the procedures in my test program, compiling in the Release mode for Win x64 target. Lazarus 1.6.4 with FPC 3.0.2.
Cumulative runs of traversing the array using integer index: 51,223 milliseconds.
Cumulative runs of traversing the array using pointer: 46,721 milliseconds.
Time of pointer runs compared to index runs: 91,21%.

Code: Pascal  [Select][+][-]
  1. program arraytest;
  2. { Fills an array with random values, accesses it with an index (array[index])
  3.   and with a pointer, and times each pass. The random values are valid indices
  4.   in the array for use in the Access* procedures.
  5.   By Juan Miguel Martinez, 2017, released into the Public Domain. }
  6.  
  7. {$mode objfpc}
  8.  
  9. uses
  10.   sysutils, dateutils;
  11.  
  12. const
  13.   NumVal = 268435456 * 4; // 268,435,456 integers = 1GiB
  14.   NumValC = NumVal - 1;
  15.   NumAccesses = 130000000; // takes 4-5 times more than traversing 4gb
  16.  
  17. type
  18.   TProc = procedure;
  19.  
  20. var
  21.   Vector: array of Integer;
  22.  
  23. // Fills the array with random values, static random seed
  24. procedure FillVector;
  25. var
  26.   i: Integer;
  27.  
  28. begin
  29.   Randseed := 4; // Guaranteed to be random by fair dice roll
  30.   for i := 0 to NumValC do
  31.     Vector[i] := Random(NumVal);
  32. end;
  33.  
  34. // Traverses whole array, index version
  35. procedure TraverseIndex;
  36. var
  37.   i, o: Integer;
  38.  
  39. begin
  40.   o := 0;
  41.   for i := 0 to NumValC do
  42.     if Odd(Vector[i]) then
  43.       Inc(o);
  44.  
  45.   Write(IntToStr(o), ' odd numbers, ');
  46. end;
  47.  
  48. // Traverses whole array, pointer version
  49. procedure TraversePointer;
  50. var
  51.   i, o: Integer;
  52.   p: ^Integer;
  53.  
  54. begin
  55.   o := 0;
  56.   p := @Vector[0];
  57.   for i := 0 to NumValC do
  58.   begin
  59.     if Odd(p^) then
  60.       Inc(o);
  61.  
  62.     Inc(p);
  63.         end;
  64.  
  65.   Write(IntToStr(o), ' odd numbers, ');
  66. end;
  67.  
  68. // Accesses the array N times, index version
  69. procedure AccessIndex;
  70. var
  71.   idx, o: Integer;
  72.   i: Int64;
  73.  
  74. begin
  75.   o := 0;
  76.   RandSeed := 4; // xkcd ftw!
  77.   idx := 0;
  78.   for i := 1 to NumAccesses do
  79.   begin
  80.     if Odd(Vector[idx]) then
  81.       Inc(o);
  82.  
  83.     idx := Vector[idx];
  84.         end;
  85.  
  86.   Write(IntToStr(o), ' odd numbers, ')
  87. end;
  88.  
  89. // Accesses the array N times, pointer version
  90. procedure AccessPointer;
  91. var
  92.   o: Integer;
  93.   b, p: ^Integer;
  94.   i: Int64;
  95.  
  96. begin
  97.   o := 0;
  98.   b := @Vector[0];
  99.   p := b;
  100.   RandSeed := 4; // xkcd ftw!
  101.   for i := 1 to NumAccesses do
  102.   begin
  103.     if Odd(p^) then
  104.       Inc(o);
  105.  
  106.     p := b + p^;
  107.         end;
  108.  
  109.   Write(IntToStr(o), ' odd numbers, ');
  110. end;
  111.  
  112. // Clocks a call to one of the accessing procedures
  113. procedure ClockProcedure(const proc: TProc; const msg: string);
  114. var
  115.   t: TDateTime;
  116.  
  117. begin
  118.   Write(msg);
  119.   t := Now();
  120.   proc;
  121.   WriteLn(IntToStr(MilliSecondsBetween(Now(), t)), ' ms');
  122. end;
  123.  
  124. begin
  125.   SetLength(Vector, NumVal);
  126.   ClockProcedure(@FillVector, 'Filling array... ');
  127.   ClockProcedure(@TraverseIndex,   'Traversing array with index...   ');
  128.   ClockProcedure(@TraversePointer, 'Traversing array with pointer... ');
  129.   ClockProcedure(@AccessIndex,     'Accessing array with index...    ');
  130.   ClockProcedure(@AccessPointer,   'Accessing array with pointer...  ');
  131.   WriteLn('Press Enter to continue...');
  132.   ReadLn();
  133. end.
  134.  
Lazarus 1.6.4 + FPC 3.0.2 64bits under Windows 7 64bits
Only as a hobby nowadays
Current proyect release: TBA

Nitorami

  • Sr. Member
  • ****
  • Posts: 481
Re: Iterating/accessing an array, index vs pointer element access
« Reply #1 on: March 30, 2017, 07:58:48 pm »
It is often claimed than pointer access is much faster but I cannot confirm that. This is only the case for multidimensional arrays, where the compiler cannot (yet) determine which indices need to be recalculated, and that may result in a lot of unnecessary index calculations.

I cannot run your code because I am using FPC in the 32-bit version, and I only have an old single-core machine... so I reduced the int64 to in32 and NumVal to 1/4GiB to run it within acceptable time. The difference between index and pointer version is minute.

I don't think your code is particularly well suited to assess performance anyway, because the huge array does not fit into the CPU cache, so the performance may be determined by memory fetches rather than the access method.

Edit: I get some 5% higher performance when exchanging
    if Odd(Vector[idx]) then Inc(o);
with
    inc (o, Vector[idx] mod 2);

Which however may cause different results with negative numbers I guess...

Edit again - what about this version ?
Code: Pascal  [Select][+][-]
  1. // Accesses the array N times, index version
  2. procedure AccessIndex;
  3. var
  4.   idx, o, i: integer;
  5. begin
  6.   o := 0; idx := 0;
  7.   for i := 1 to NumAccesses do
  8.   begin
  9.     idx := Vector[idx];
  10.     inc (o, idx mod 2);
  11.  end;
  12.   Write(IntToStr(o), ' odd numbers, ');
  13. end;
  14.  
« Last Edit: March 30, 2017, 09:04:50 pm by Nitorami »

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
Re: Iterating/accessing an array, index vs pointer element access
« Reply #2 on: March 30, 2017, 11:13:23 pm »
Pointers themselves are not magic. It is the reduction of live values and/or elimination of calculation that causes the savings. Pointers are in some cases the means to do so.

See also:  http://stackoverflow.com/questions/41562081/copying-of-data-in-a-tight-loop-where-the-function-move-can-not-be-used/41611925#41611925


jmm72

  • Jr. Member
  • **
  • Posts: 79
  • Very experienced in being a beginner...
Re: Iterating/accessing an array, index vs pointer element access
« Reply #3 on: March 31, 2017, 01:36:12 am »
I don't think pointers are magic, I wanted to test those cryptic statements about pointers being way faster than indices for arrays, without any real numbers, and then what kind of optimization got the access by indices. Personally I don't really like using pointers with arrays because they make my code less legible, and the few times I've used them was in simple ways where I thought they'd do good.

I'd say that internally, at run-time, an array index is transformed to a pointer, and then the pointer is what's used; and that's the part which got optimized. Using pointers just skips that part, and depending on what kind of pointer arithmetic you're doing, bypassing the compiler's index to pointer transformation with your own pointer arithmetic might be a bit better or somewhat worse. As Marcov wisely points out, using sometimes indices or pointers are the means to save up on operations.

Yes I'm aware not everyone has 64 bits or enough memory to do the test, that's why the constants are there. Unfortunately I missed the Int64 which actually is not needed. I intentionally chose Odd() so it does something besides just moving the index; in a loop you're at least going to access each element. Whether Odd() vs. mod 2 is faster or not doesn't matter, what matters is the difference between one method of accessing and the other, if the work inside the loop is the same (increment index/pointer, check for Odd() and increase a counter). The fact that it gives just 5% of speedup means that Odd() does little work, I'd say.

About idx := Vector[idx], it's just to make some random pointer arithmetic and test which one is faster. I am not doing exact timing on each method, these are just some raw tests to see which methods are faster compared to each one.

And now everyone can keep up with indices or pointers knowing what advantages each one gives. Personally I'll keep up with indices unless there's a simple tight loop with so many elements that pointers are worth using.

(And read the link posted by Marcov on a straigthforward approach to optimize this stuff.)
Lazarus 1.6.4 + FPC 3.0.2 64bits under Windows 7 64bits
Only as a hobby nowadays
Current proyect release: TBA

Remy Lebeau

  • Hero Member
  • *****
  • Posts: 1312
    • Lebeau Software
Re: Iterating/accessing an array, index vs pointer element access
« Reply #4 on: March 31, 2017, 06:37:53 am »
an array index is transformed to a pointer, and then the pointer is what's used;

Yes.  The array's base pointer has to be retrieved and then incremented by an offset calculated from the current index and element size.  Do that in a loop, and the pointer has to be retrieved and re-calculated each time, before it can then be used.

On the other hand, if you start with the base pointer and just increment it on each iteration, the work is lessened since the new pointer value is already to go on the next iteration.
Remy Lebeau
Lebeau Software - Owner, Developer
Internet Direct (Indy) - Admin, Developer (Support forum)

Mr.Madguy

  • Hero Member
  • *****
  • Posts: 844
Re: Iterating/accessing an array, index vs pointer element access
« Reply #5 on: March 31, 2017, 03:19:56 pm »
Using array of integer in this test - isn't good idea. SizeOf(Integer) = 4 and therefore compiler may optimize asm code and turn
Code: Pascal  [Select][+][-]
  1. Vector[i]
  2.  
into something like
Code: Text  [Select][+][-]
  1. MOV EAX,[EBX+ECX*4]
  2.  
that isn't much slower, than just
Code: Text  [Select][+][-]
  1. MOV EAX,[EBX]
  2.  
simply cuz effective address calculation is "hardware accelerated" since some processor - i486, if I remember it correctly.

Try some data size > 8 and not power of 2 or close to power of 2, so MUL/IMUL instruction would be needed.

P.S. Asm code would be even faster, as it would use registers only - no local variables.
Code: Text  [Select][+][-]
  1. LEA EDX,Vector[4]
  2. MOV ECX,Vector
  3. XOR EAX,EAX
  4. @@Loop:
  5. DEC ECX
  6. JB @@Exit
  7. TEST DWORD PTR [EDX],1
  8. JZ @@Even
  9. INC EAX
  10. @@Even:
  11. LEA EDX,DWORD PTR [EDX][4]
  12. JMP @@Loop
  13. @@Exit:
  14.  
« Last Edit: March 31, 2017, 03:46:35 pm by Mr.Madguy »
Is it healthy for project not to have regular stable releases?
Just for fun: Code::Blocks, GCC 13 and DOS - is it possible?

jmm72

  • Jr. Member
  • **
  • Posts: 79
  • Very experienced in being a beginner...
Re: Iterating/accessing an array, index vs pointer element access
« Reply #6 on: April 01, 2017, 04:20:01 pm »
Thanks Mr. Madguy for the insight and tips, I made a second array with a 13-byte element (a packed record). The advantage of using pointers for iterating the array was more obvious, with around 20% less time using a pointer than using an integer index. On the other hand, random access was essentially the same using both methods.

Using a 16-bit record showed less than 5% advantage for pointers on random access.

I really never got into assembler nor about what the compiler was doing on backstage, for me clean code I could understand and change quickly months after last edition was enough. It seems I'll have to change that for performance-critical situations  ;)

However the initial idea, to check what was faster in FPC, indices or pointers, and in what situations, has been achieved, it seems. Next in my 'I'm unsure which one would work faster' list, Inc/Dec versus plus/minus operators. Inc/Dec wins, of course.
Lazarus 1.6.4 + FPC 3.0.2 64bits under Windows 7 64bits
Only as a hobby nowadays
Current proyect release: TBA

Mr.Madguy

  • Hero Member
  • *****
  • Posts: 844
Re: Iterating/accessing an array, index vs pointer element access
« Reply #7 on: April 01, 2017, 04:56:18 pm »
Code: Pascal  [Select][+][-]
  1. program arraytest;
  2. { Fills an array with random values, accesses it with an index (array[index])
  3.   and with a pointer, and times each pass. The random values are valid indices
  4.   in the array for use in the Access* procedures.
  5.   By Juan Miguel Martinez, 2017, released into the Public Domain. }
  6.  
  7. {$mode objfpc}
  8. {$ASMMODE INTEL}
  9.  
  10. uses
  11.   sysutils, dateutils;
  12.  
  13. const
  14.   NumVal = 268435456 * 4; // 268,435,456 integers = 1GiB
  15.   NumValC = NumVal - 1;
  16.   NumAccesses = 130000000; // takes 4-5 times more than traversing 4gb
  17.  
  18. type
  19.   TProc = procedure;
  20.   TMyType=record
  21.     Data:Integer;
  22.     Padding:array[0..5] of byte;
  23.   end;
  24.   PMyType=^TMyType;
  25.  
  26. var
  27.   Vector: array of TMyType;
  28.  
  29. // Fills the array with random values, static random seed
  30. procedure FillVector;
  31. var
  32.   i: Integer;
  33.  
  34. begin
  35.   Randseed := 4; // Guaranteed to be random by fair dice roll
  36.   for i := 0 to NumValC do
  37.     Vector[i].Data := Random(NumVal);
  38. end;
  39.  
  40. // Traverses whole array, index version
  41. procedure TraverseIndex;
  42. var
  43.   i, o: Integer;
  44.  
  45. begin
  46.   o := 0;
  47.   for i := 0 to NumValC do
  48.     if Odd(Vector[i].Data) then
  49.       Inc(o);
  50.  
  51.   Write(IntToStr(o), ' odd numbers, ');
  52. end;
  53.  
  54. // Traverses whole array, pointer version
  55. procedure TraversePointer;
  56. var
  57.   i, o: Integer;
  58.   p: PMyType;
  59.  
  60. begin
  61.   o := 0;
  62.   p := @Vector[0];
  63.   for i := 0 to NumValC do
  64.   begin
  65.     if Odd(p^.Data) then
  66.       Inc(o);
  67.  
  68.     Inc(p);
  69.         end;
  70.  
  71.   Write(IntToStr(o), ' odd numbers, ');
  72. end;
  73.  
  74. procedure TraverseAsm;
  75.   var o:Integer;
  76. begin
  77.   asm
  78.     MOV RAX, Vector
  79.     MOV RCX, NumVal
  80.     XOR RDX, RDX
  81.     JRCXZ @@Exit
  82.     @@Loop:
  83.     TEST DWORD PTR [RAX],1
  84.     JZ @@Even
  85.     INC RDX
  86.     @@Even:
  87.     LEA RAX, [RAX + 12]
  88.     LOOP @@Loop
  89.     @@Exit:
  90.     MOV o, EDX
  91.   end;
  92.   Write(IntToStr(o), ' odd numbers, ');
  93. end;
  94.  
  95. // Accesses the array N times, index version
  96. procedure AccessIndex;
  97. var
  98.   idx, o: Integer;
  99.   i: Int64;
  100.  
  101. begin
  102.   o := 0;
  103.   RandSeed := 4; // xkcd ftw!
  104.   idx := 0;
  105.   for i := 1 to NumAccesses do
  106.   begin
  107.     if Odd(Vector[idx].Data) then
  108.       Inc(o);
  109.  
  110.     idx := Vector[idx].Data;
  111.         end;
  112.  
  113.   Write(IntToStr(o), ' odd numbers, ')
  114. end;
  115.  
  116. // Accesses the array N times, pointer version
  117. procedure AccessPointer;
  118. var
  119.   o: Integer;
  120.   b, p: PMyType;
  121.   i: Int64;
  122.  
  123. begin
  124.   o := 0;
  125.   b := @Vector[0];
  126.   p := b;
  127.   RandSeed := 4; // xkcd ftw!
  128.   for i := 1 to NumAccesses do
  129.   begin
  130.     if Odd(p^.Data) then
  131.       Inc(o);
  132.  
  133.     p := b + p^.Data;
  134.         end;
  135.  
  136.   Write(IntToStr(o), ' odd numbers, ');
  137. end;
  138.  
  139. procedure AccessAsm;
  140.   var o:Integer;
  141. begin
  142.   asm
  143.     PUSH R9
  144.     MOV RAX, Vector
  145.     MOV R9, RAX
  146.     MOV RCX, NumAccesses
  147.     XOR RDX, RDX
  148.     JRCXZ @@Exit
  149.     @@Loop:
  150.     MOV EAX,[RAX]
  151.     TEST EAX,1
  152.     JZ @@Even
  153.     INC RDX
  154.     @@Even:
  155.     IMUL RAX, RAX, 12
  156.     LEA RAX, [R9 + RAX]
  157.     LOOP @@Loop
  158.     @@Exit:
  159.     MOV o, EDX
  160.     POP R9
  161.   end;
  162.   Write(IntToStr(o), ' odd numbers, ');
  163. end;
  164.  
  165. // Clocks a call to one of the accessing procedures
  166. procedure ClockProcedure(const proc: TProc; const msg: string);
  167. var
  168.   t: TDateTime;
  169.  
  170. begin
  171.   Write(msg);
  172.   t := Now();
  173.   proc;
  174.   WriteLn(IntToStr(MilliSecondsBetween(Now(), t)), ' ms');
  175. end;
  176.  
  177. begin
  178.   SetLength(Vector, NumVal);
  179.   ClockProcedure(@FillVector, 'Filling array... ');
  180.   ClockProcedure(@TraverseIndex,   'Traversing array with index...   ');
  181.   ClockProcedure(@TraversePointer, 'Traversing array with pointer... ');
  182.   ClockProcedure(@TraverseAsm,     'Traversing array with pointer asm... ');
  183.   ClockProcedure(@AccessIndex,     'Accessing array with index...    ');
  184.   ClockProcedure(@AccessPointer,   'Accessing array with pointer...  ');
  185.   ClockProcedure(@AccessAsm,   'Accessing array with pointer asm...  ');
  186.   WriteLn('Press Enter to continue...');
  187.   ReadLn();
  188. end.
  189.  
AccessPointer doesn't give you any advantage, cuz p := b + p^ still involves b * SizeOf(TMyType).
« Last Edit: April 01, 2017, 04:57:59 pm by Mr.Madguy »
Is it healthy for project not to have regular stable releases?
Just for fun: Code::Blocks, GCC 13 and DOS - is it possible?

Mr.Madguy

  • Hero Member
  • *****
  • Posts: 844
Re: Iterating/accessing an array, index vs pointer element access
« Reply #8 on: April 01, 2017, 04:57:03 pm »
Also results. Funny thing, but my asm code is slower, than FPC one.
Is it healthy for project not to have regular stable releases?
Just for fun: Code::Blocks, GCC 13 and DOS - is it possible?

Thaddy

  • Hero Member
  • *****
  • Posts: 14205
  • Probably until I exterminate Putin.
Re: Iterating/accessing an array, index vs pointer element access
« Reply #9 on: April 01, 2017, 05:13:05 pm »
Your assembler code is not aligned properly? Or the compiler has better pipelining characteristics?
« Last Edit: April 01, 2017, 05:24:39 pm by Thaddy »
Specialize a type, not a var.

Mr.Madguy

  • Hero Member
  • *****
  • Posts: 844
Re: Iterating/accessing an array, index vs pointer element access
« Reply #10 on: April 01, 2017, 06:04:42 pm »
Your assembler code is not aligned properly? Or the compiler has better pipelining characteristics?
I don't know. I've tried different variants - my code is still slower. Some low level optimization, I guess.

P.S. You should remember, that IMUL isn't much slower on modern computers: on old ones MOV could take 2 ticks, while IMUL could take 32.
« Last Edit: April 01, 2017, 06:06:33 pm by Mr.Madguy »
Is it healthy for project not to have regular stable releases?
Just for fun: Code::Blocks, GCC 13 and DOS - is it possible?

MathMan

  • Sr. Member
  • ****
  • Posts: 325
Re: Iterating/accessing an array, index vs pointer element access
« Reply #11 on: April 03, 2017, 01:24:33 pm »
Your assembler code is not aligned properly? Or the compiler has better pipelining characteristics?

Probably more due to the way it was coded. I'd think the following will be significantly faster for example

Code: Pascal  [Select][+][-]
  1. procedure TraverseAsm;
  2.   var o:Integer;
  3. begin
  4.   asm
  5.     MOV RAX, Vector
  6.     MOV RCX, NumVal
  7.     XOR RDX, RDX
  8.     JRCXZ @@Exit
  9.  
  10.   @@Loop:
  11.  
  12.     MOV RBX, [RAX]
  13.     AND EBX, 1
  14.     ADD RDX, RBX
  15.     ADD RAX, 12
  16.     DEC RCX
  17.     JNE @@Loop
  18.  
  19.   @@Exit:
  20.  
  21.     MOV o, EDX
  22.   end ['RAX','RBX','RCX','RDX'];
  23.  
  24.   Write(IntToStr(o), ' odd numbers, ');
  25. end;
  26.  

Regards,
MathMan

Mr.Madguy

  • Hero Member
  • *****
  • Posts: 844
Re: Iterating/accessing an array, index vs pointer element access
« Reply #12 on: April 03, 2017, 05:11:05 pm »
Probably more due to the way it was coded. I'd think the following will be significantly faster for example

Regards,
MathMan
Yeah, high level optimization always gives you better results, than low level one. But what I want to say, is that my intention was to make initial code work faster via low level optimization. High level optimization means, we need to also change initial code too.

P. S. It was LOOP, that was slow. Dunno, why it's slower, than two instructions, it's equivalent to. Without LOOP my code is as fast, as FPC's one - ~3800ms. I thought, it should have been faster.
Code: Pascal  [Select][+][-]
  1. procedure TraversePointer;
  2. var
  3.   i, o: Integer;
  4.   p: ^Integer;
  5.  
  6. begin
  7.   o := 0;
  8.   p := @Vector[0];
  9.   for i := 0 to NumValC do
  10.   begin
  11.     Inc(o, p^ and 1);
  12.     Inc(p);
  13.   end;
  14.   Write(IntToStr(o), ' odd numbers, ');
  15. end;
  16.  
« Last Edit: April 03, 2017, 05:59:01 pm by Mr.Madguy »
Is it healthy for project not to have regular stable releases?
Just for fun: Code::Blocks, GCC 13 and DOS - is it possible?

Thaddy

  • Hero Member
  • *****
  • Posts: 14205
  • Probably until I exterminate Putin.
Re: Iterating/accessing an array, index vs pointer element access
« Reply #13 on: April 03, 2017, 06:17:21 pm »
Adagium is: write pure pascal, examine assembler out put, think if you can improve.
Otherwise leave it alone. Your first step should always be to examine the assembler that the compiler generates and only THEN optimize if it is worth the trouble.
Specialize a type, not a var.

Mr.Madguy

  • Hero Member
  • *****
  • Posts: 844
Re: Iterating/accessing an array, index vs pointer element access
« Reply #14 on: April 03, 2017, 09:01:19 pm »
Adagium is: write pure pascal, examine assembler out put, think if you can improve.
Otherwise leave it alone. Your first step should always be to examine the assembler that the compiler generates and only THEN optimize if it is worth the trouble.
Simple rule: don't bother about low level optimization - bother about medium and high level ones. Better algorithm is always faster, than same algorithm, but optimized better. Right language tools are always faster too. But assembler isn't useless. It usually allows you to judge, how fast high level things really are, cuz high level things consist of low level things and it's low level things, that actually take real processor ticks. I.e. when you measure effectiveness of your algorithm - you shouldn't assume, that it's high level operations, that should be taken as time quantum.
« Last Edit: April 03, 2017, 09:03:03 pm by Mr.Madguy »
Is it healthy for project not to have regular stable releases?
Just for fun: Code::Blocks, GCC 13 and DOS - is it possible?

 

TinyPortal © 2005-2018