Recent

Author Topic: [Solved] Dynamic arrays much slower than GetMem  (Read 3105 times)

ChrisR

  • Full Member
  • ***
  • Posts: 229
[Solved] Dynamic arrays much slower than GetMem
« on: September 27, 2021, 02:56:44 pm »
The attached project computes a gaussian blur on a 3D volume of dimensions kDim^3. With kDim = 256, the exact same computations are 172 times slower (Apple M1, 1394 vs 239723 ms) and 294 times slower (Ryzen 3900X, 2706 vs 796013 ms ) than using traditional arrays.  I realize that there is some overhead to calling setlength(), where the array will be initialized as zeros, but there are very few calls to setlengths().

I have used dynamic arrays extensively in my projects as they are convenient. I had always thought they incurred a small penalty, but not to this extent. Can anyone suggest modifications to the code that can improve dynamic array performance.

Dynamic arrays:
  TFloat32s = array of single;
  setlength(img3D, n)
  ...
  img3D := nil;

Traditional arrays:
 f32ra = array [0..0] of Single;
 f32p = ^f32ra;
GetMem(img3D, n * 4);
...
FreeMem(img3D);
« Last Edit: October 02, 2021, 03:16:46 pm by ChrisR »

Warfley

  • Hero Member
  • *****
  • Posts: 608
Re: Dynamic arrays much slower than GetMem
« Reply #1 on: September 27, 2021, 03:31:07 pm »
Dynamic arrays have a few overhead factors. The first one is of course the initialization when calling SetLength. The second one is the lazy copy mechanism. Arrays are reference counted to minimize copy operations. But when you change the array it needs to be copied. So on every single write access, code is executed that checks if the reference counter is 1 and if it is greater, it will copy the array into a new memory location before performing the access
To avoid this, you can use pointer access to the array:
Code: Pascal  [Select][+][-]
  1. var
  2.   MyArray: Array of Single;
  3. begin
  4.   SetLength(MyArray, 2);
  5.   PSingle(MyArray)[0] := 42;
  6.   PSingle(MyArray)[1] := 3.14159;
  7. end;

Thaddy

  • Hero Member
  • *****
  • Posts: 10991
Re: Dynamic arrays much slower than GetMem
« Reply #2 on: September 27, 2021, 03:40:07 pm »
Indeed. It is the setlengths in loops that bites you. Never do that for performance code.
The average programmer productivity is 4-5 hours per day. Peak performance 72 hours for short bursts. MTBF is 1 second or less.

MarkMLl

  • Hero Member
  • *****
  • Posts: 3515
Re: Dynamic arrays much slower than GetMem
« Reply #3 on: September 27, 2021, 03:45:37 pm »
Indeed. It is the setlengths in loops that bites you. Never do that for performance code.

Seconded. Memory can be released/reallocated in situations where common sense would have suggested that it would have been retained and that has massive performance implications. That is NOT intended as criticism of the way the compiler and runtimes are written, I'm sure there are reasons that make it inevitable.

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

Thaddy

  • Hero Member
  • *****
  • Posts: 10991
Re: Dynamic arrays much slower than GetMem
« Reply #4 on: September 27, 2021, 04:00:20 pm »
Well, actually: if the resize is smaller than before, there is no relocation. That only happens when the array grows.
The average programmer productivity is 4-5 hours per day. Peak performance 72 hours for short bursts. MTBF is 1 second or less.

ChrisR

  • Full Member
  • ***
  • Posts: 229
Re: Dynamic arrays much slower than GetMem
« Reply #5 on: September 27, 2021, 04:24:47 pm »
There are very few setlength() calls, none in the inner loop. I did try Warfley's PSSingle() trick, but do not see much advantage. Testing this, I found that virtually all the difference is due to a memory copy that DOES happen inside a for loop:

For the dynamic arrays I use:
  tmp := copy(img, imgp, imgp + nx - 1);
while for GetMemory I use
  Move(img^[imgp], tmp^[0],nx*4);//src,dst

I found that I could get similar performance for dynamic arrays by using
  Move(img[imgp], tmp[0],nx*4);//src,dst
So in my case, the culprit was copy. Are there any comments on my hack of using a move instruction? Is there any alternative to the `copy` instruction - here the length of the array never changes.



Warfley

  • Hero Member
  • *****
  • Posts: 608
Re: Dynamic arrays much slower than GetMem
« Reply #6 on: September 27, 2021, 04:38:48 pm »
The reason why copy is so much slower is pretty simple:
Code: Pascal  [Select][+][-]
  1. MyArray := Copy(...)
does the following:
1. It allocates new memory for the result of Copy
2. It calles a move operation from the old array to the newly allocated memory region
3. It releases the memory hold by MyArray

Using move works "in-place" i.e. it moves directly to the memory location without creating a new array first.

Generally speaking, avoid allocations and free operations. Pre allocations are always a good solution. Modern OS use virtual memory, which means that until you touch a memory page it will not be actually loaded into memory. So you can make huge pre-allocations without any punishment.
So if you are writing performance oriented code, you can simply do one large allocation that allocates all the memory you could possibly need for your algorithm and don't do any new allocations afterwards

avk

  • Hero Member
  • *****
  • Posts: 505
    • my self-education project
Re: Dynamic arrays much slower than GetMem
« Reply #7 on: September 27, 2021, 05:36:24 pm »
...
The second one is the lazy copy mechanism. Arrays are reference counted to minimize copy operations. But when you change the array it needs to be copied. So on every single write access, code is executed that checks if the reference counter is 1 and if it is greater, it will copy the array into a new memory location before performing the access
...

Are you sure?
What, then, will print a test like this?
Code: Pascal  [Select][+][-]
  1. program test_cow;
  2. {$mode objfpc}
  3. var
  4.   a, b: array of Integer;
  5. begin
  6.   a := [1, 2, 3];
  7.   b := a;
  8.   b[1] := 42;
  9.   WriteLn('a[1] = ', a[1]);
  10. end.
  11.  

...
To avoid this, you can use pointer access to the array:
Code: Pascal  [Select][+][-]
  1. var
  2.   MyArray: Array of Single;
  3. begin
  4.   SetLength(MyArray, 2);
  5.   PSingle(MyArray)[0] := 42;
  6.   PSingle(MyArray)[1] := 3.14159;
  7. end;


Hmm,
Code: Pascal  [Select][+][-]
  1. program test;
  2. {$mode objfpc}
  3. uses
  4.   SysUtils, DateUtils;
  5. var
  6.   a: array of Single;
  7.   I, J: Integer;
  8.   Start, Stop: TTime;
  9.  
  10. begin
  11.   SetLength(a, 1000000);
  12.   Start := Time;
  13.   for I := 1 to 1000 do
  14.     for J := 0 to High(a) do
  15.      PSingle(a)[J] := Single(5.0);
  16.   Stop := Time;
  17.   WriteLn('PSingle: ', MillisecondsBetween(Stop, Start));
  18.   FillChar(Pointer(a)^, Length(a)*SizeOf(Single), 0);
  19.   Start := Time;
  20.   for I := 1 to 1000 do
  21.     for J := 0 to High(a) do
  22.      a[J] := Single(5.0);
  23.   Stop := Time;
  24.   WriteLn('Index: ', MillisecondsBetween(Stop, Start));  
  25.  
prints
Code: Text  [Select][+][-]
  1. PSingle: 501
  2. Index: 483
  3.  

Warfley

  • Hero Member
  • *****
  • Posts: 608
Re: Dynamic arrays much slower than GetMem
« Reply #8 on: September 27, 2021, 06:10:42 pm »
Looks like I confused this with Strings, strings call fpc_ansistr_unique but arrays don't: Assembly Example

Awkward

  • Jr. Member
  • **
  • Posts: 95
Re: Dynamic arrays much slower than GetMem
« Reply #9 on: September 27, 2021, 06:38:53 pm »
i tried test program on my slightly ol PC and got different results 3 times:
PSingle: 3496
Index: 3873

PSingle: 3834
Index: 3511

PSingle: 3562
Index: 3594

i think, it shows not too good test algorithm

ASerge

  • Hero Member
  • *****
  • Posts: 1887
Re: Dynamic arrays much slower than GetMem
« Reply #10 on: September 27, 2021, 06:40:45 pm »
Can anyone suggest modifications to the code that can improve dynamic array performance.
For me
Quote
Blur time 2618
Blur time (dynamic arrays) 2582
With replace (commented out) in blurSd
Code: Pascal  [Select][+][-]
  1. tmp := copy(img, imgp, {imgp + nx - 1}nx);
because in blurS it is
Code: Pascal  [Select][+][-]
  1. Move(img^[imgp], tmp^[0], nx * 4);

Gustavo 'Gus' Carreno

  • Hero Member
  • *****
  • Posts: 793
  • Professional amateur ;-P
Re: Dynamic arrays much slower than GetMem
« Reply #11 on: September 27, 2021, 06:49:53 pm »
Hey Y'all,

Poor @GetMem, isn't he already busy enough curating OPM?

Why do you have to also burden him with curating your memory?

Give the poor person a break :P

Cheers,
Gus

</sarcasm>
Lazarus 2.3.0(trunk) FPC 3.3.1(trunk) Ubuntu 21.10 64b Dark Theme
Lazarus 2.0.12(stable) FPC 3.2.2(stable) Ubuntu 21.10 64b Dark Theme
http://github.com/gcarreno

avk

  • Hero Member
  • *****
  • Posts: 505
    • my self-education project
Re: Dynamic arrays much slower than GetMem
« Reply #12 on: September 27, 2021, 07:02:14 pm »
i tried test program on my slightly ol PC and got different results 3 times:
PSingle: 3496
Index: 3873

PSingle: 3834
Index: 3511

PSingle: 3562
Index: 3594

i think, it shows not too good test algorithm

In a desktop system, there will always be deviations due to its multitasking. Make statistics of several measurements, if interested. But I suppose if there really was some reason to take this or that approach, it would be immediately visible.

Warfley

  • Hero Member
  • *****
  • Posts: 608
Re: Dynamic arrays much slower than GetMem
« Reply #13 on: September 27, 2021, 07:33:52 pm »
If you look at the assembly the Indexed access and the Pointer access generate the exact same assembly. Any differences in speed are coincidental.

See: Link

It's only dynamic strings that have this lazy copy mechanism on indexed access
« Last Edit: September 27, 2021, 07:37:39 pm by Warfley »

Awkward

  • Jr. Member
  • **
  • Posts: 95
Re: Dynamic arrays much slower than GetMem
« Reply #14 on: September 27, 2021, 07:59:19 pm »
slightly offtop: try to add array prefix "var" for index example and/or "const" for pointer example ^_^
THAT the optimization!

 

TinyPortal © 2005-2018