Lazarus

Free Pascal => General => Topic started by: ChrisR on September 27, 2021, 02:56:44 pm

Title: [Solved] Dynamic arrays much slower than GetMem
Post by: ChrisR on September 27, 2021, 02:56:44 pm
The attached project computes a gaussian blur on a 3D volume of dimensions kDim^3. With kDim = 256, the exact same computations are 172 times slower (Apple M1, 1394 vs 239723 ms) and 294 times slower (Ryzen 3900X, 2706 vs 796013 ms ) than using traditional arrays.  I realize that there is some overhead to calling setlength(), where the array will be initialized as zeros, but there are very few calls to setlengths().

I have used dynamic arrays extensively in my projects as they are convenient. I had always thought they incurred a small penalty, but not to this extent. Can anyone suggest modifications to the code that can improve dynamic array performance.

Dynamic arrays:
  TFloat32s = array of single;
  setlength(img3D, n)
  ...
  img3D := nil;

Traditional arrays:
 f32ra = array [0..0] of Single;
 f32p = ^f32ra;
GetMem(img3D, n * 4);
...
FreeMem(img3D);
Title: Re: Dynamic arrays much slower than GetMem
Post by: Warfley on September 27, 2021, 03:31:07 pm
Dynamic arrays have a few overhead factors. The first one is of course the initialization when calling SetLength. The second one is the lazy copy mechanism. Arrays are reference counted to minimize copy operations. But when you change the array it needs to be copied. So on every single write access, code is executed that checks if the reference counter is 1 and if it is greater, it will copy the array into a new memory location before performing the access
To avoid this, you can use pointer access to the array:
Code: Pascal  [Select][+][-]
  1. var
  2.   MyArray: Array of Single;
  3. begin
  4.   SetLength(MyArray, 2);
  5.   PSingle(MyArray)[0] := 42;
  6.   PSingle(MyArray)[1] := 3.14159;
  7. end;
Title: Re: Dynamic arrays much slower than GetMem
Post by: Thaddy on September 27, 2021, 03:40:07 pm
Indeed. It is the setlengths in loops that bites you. Never do that for performance code.
Title: Re: Dynamic arrays much slower than GetMem
Post by: MarkMLl on September 27, 2021, 03:45:37 pm
Indeed. It is the setlengths in loops that bites you. Never do that for performance code.

Seconded. Memory can be released/reallocated in situations where common sense would have suggested that it would have been retained and that has massive performance implications. That is NOT intended as criticism of the way the compiler and runtimes are written, I'm sure there are reasons that make it inevitable.

MarkMLl
Title: Re: Dynamic arrays much slower than GetMem
Post by: Thaddy on September 27, 2021, 04:00:20 pm
Well, actually: if the resize is smaller than before, there is no relocation. That only happens when the array grows.
Title: Re: Dynamic arrays much slower than GetMem
Post by: ChrisR on September 27, 2021, 04:24:47 pm
There are very few setlength() calls, none in the inner loop. I did try Warfley's PSSingle() trick, but do not see much advantage. Testing this, I found that virtually all the difference is due to a memory copy that DOES happen inside a for loop:

For the dynamic arrays I use:
  tmp := copy(img, imgp, imgp + nx - 1);
while for GetMemory I use
  Move(img^[imgp], tmp^[0],nx*4);//src,dst

I found that I could get similar performance for dynamic arrays by using
  Move(img[imgp], tmp[0],nx*4);//src,dst
So in my case, the culprit was copy. Are there any comments on my hack of using a move instruction? Is there any alternative to the `copy` instruction - here the length of the array never changes.


Title: Re: Dynamic arrays much slower than GetMem
Post by: Warfley on September 27, 2021, 04:38:48 pm
The reason why copy is so much slower is pretty simple:
Code: Pascal  [Select][+][-]
  1. MyArray := Copy(...)
does the following:
1. It allocates new memory for the result of Copy
2. It calles a move operation from the old array to the newly allocated memory region
3. It releases the memory hold by MyArray

Using move works "in-place" i.e. it moves directly to the memory location without creating a new array first.

Generally speaking, avoid allocations and free operations. Pre allocations are always a good solution. Modern OS use virtual memory, which means that until you touch a memory page it will not be actually loaded into memory. So you can make huge pre-allocations without any punishment.
So if you are writing performance oriented code, you can simply do one large allocation that allocates all the memory you could possibly need for your algorithm and don't do any new allocations afterwards
Title: Re: Dynamic arrays much slower than GetMem
Post by: avk on September 27, 2021, 05:36:24 pm
...
The second one is the lazy copy mechanism. Arrays are reference counted to minimize copy operations. But when you change the array it needs to be copied. So on every single write access, code is executed that checks if the reference counter is 1 and if it is greater, it will copy the array into a new memory location before performing the access
...

Are you sure?
What, then, will print a test like this?
Code: Pascal  [Select][+][-]
  1. program test_cow;
  2. {$mode objfpc}
  3. var
  4.   a, b: array of Integer;
  5. begin
  6.   a := [1, 2, 3];
  7.   b := a;
  8.   b[1] := 42;
  9.   WriteLn('a[1] = ', a[1]);
  10. end.
  11.  

...
To avoid this, you can use pointer access to the array:
Code: Pascal  [Select][+][-]
  1. var
  2.   MyArray: Array of Single;
  3. begin
  4.   SetLength(MyArray, 2);
  5.   PSingle(MyArray)[0] := 42;
  6.   PSingle(MyArray)[1] := 3.14159;
  7. end;


Hmm,
Code: Pascal  [Select][+][-]
  1. program test;
  2. {$mode objfpc}
  3. uses
  4.   SysUtils, DateUtils;
  5. var
  6.   a: array of Single;
  7.   I, J: Integer;
  8.   Start, Stop: TTime;
  9.  
  10. begin
  11.   SetLength(a, 1000000);
  12.   Start := Time;
  13.   for I := 1 to 1000 do
  14.     for J := 0 to High(a) do
  15.      PSingle(a)[J] := Single(5.0);
  16.   Stop := Time;
  17.   WriteLn('PSingle: ', MillisecondsBetween(Stop, Start));
  18.   FillChar(Pointer(a)^, Length(a)*SizeOf(Single), 0);
  19.   Start := Time;
  20.   for I := 1 to 1000 do
  21.     for J := 0 to High(a) do
  22.      a[J] := Single(5.0);
  23.   Stop := Time;
  24.   WriteLn('Index: ', MillisecondsBetween(Stop, Start));  
  25.  
prints
Code: Text  [Select][+][-]
  1. PSingle: 501
  2. Index: 483
  3.  
Title: Re: Dynamic arrays much slower than GetMem
Post by: Warfley on September 27, 2021, 06:10:42 pm
Looks like I confused this with Strings, strings call fpc_ansistr_unique but arrays don't: Assembly Example (https://compiler-explorer.com/z/xcYcTnsPj)
Title: Re: Dynamic arrays much slower than GetMem
Post by: Awkward on September 27, 2021, 06:38:53 pm
i tried test program on my slightly ol PC and got different results 3 times:
PSingle: 3496
Index: 3873

PSingle: 3834
Index: 3511

PSingle: 3562
Index: 3594

i think, it shows not too good test algorithm
Title: Re: Dynamic arrays much slower than GetMem
Post by: ASerge on September 27, 2021, 06:40:45 pm
Can anyone suggest modifications to the code that can improve dynamic array performance.
For me
Quote
Blur time 2618
Blur time (dynamic arrays) 2582
With replace (commented out) in blurSd
Code: Pascal  [Select][+][-]
  1. tmp := copy(img, imgp, {imgp + nx - 1}nx);
because in blurS it is
Code: Pascal  [Select][+][-]
  1. Move(img^[imgp], tmp^[0], nx * 4);
Title: Re: Dynamic arrays much slower than GetMem
Post by: Gustavo 'Gus' Carreno on September 27, 2021, 06:49:53 pm
Hey Y'all,

Poor @GetMem, isn't he already busy enough curating OPM?

Why do you have to also burden him with curating your memory?

Give the poor person a break :P

Cheers,
Gus

</sarcasm>
Title: Re: Dynamic arrays much slower than GetMem
Post by: avk on September 27, 2021, 07:02:14 pm
i tried test program on my slightly ol PC and got different results 3 times:
PSingle: 3496
Index: 3873

PSingle: 3834
Index: 3511

PSingle: 3562
Index: 3594

i think, it shows not too good test algorithm

In a desktop system, there will always be deviations due to its multitasking. Make statistics of several measurements, if interested. But I suppose if there really was some reason to take this or that approach, it would be immediately visible.
Title: Re: Dynamic arrays much slower than GetMem
Post by: Warfley on September 27, 2021, 07:33:52 pm
If you look at the assembly the Indexed access and the Pointer access generate the exact same assembly. Any differences in speed are coincidental.

See: Link (https://godbolt.org/z/TGsEWjKe6)

It's only dynamic strings that have this lazy copy mechanism on indexed access
Title: Re: Dynamic arrays much slower than GetMem
Post by: Awkward on September 27, 2021, 07:59:19 pm
slightly offtop: try to add array prefix "var" for index example and/or "const" for pointer example ^_^
THAT the optimization!
Title: Re: Dynamic arrays much slower than GetMem
Post by: Jonas Maebe on September 27, 2021, 08:10:42 pm
The attached project computes a gaussian blur on a 3D volume of dimensions kDim^3. With kDim = 256, the exact same computations are 172 times slower (Apple M1, 1394 vs 239723 ms) and 294 times slower (Ryzen 3900X, 2706 vs 796013 ms ) than using traditional arrays.  I realize that there is some overhead to calling setlength(), where the array will be initialized as zeros, but there are very few calls to setlengths().
In case of a dynamic array, there's (on 64 bit platforms) a 16 byte header at the start of the memory block (containing the reference count and size), but I'm not sure how that could explain such a difference in speed.
Title: Re: Dynamic arrays much slower than GetMem
Post by: ASerge on September 27, 2021, 09:08:16 pm
Maybe I didn't make it clear in the previous message, but the program compares different algorithms! When I made the algorithms the same, the dynamic version became even faster.
Title: Re: Dynamic arrays much slower than GetMem
Post by: ChrisR on September 27, 2021, 10:03:16 pm
ASerge

I agree that the difference in speed can be resolved by changing the dynamic array call from
 tmp := copy(img, imgp, nx);
to
  Move(img^[imgp], tmp^[0], nx * 4);

Even though the array has the same size before and after the copy(), there seems to be a big penalty.

Title: Re: Dynamic arrays much slower than GetMem
Post by: Kays on September 27, 2021, 11:06:27 pm
[…] I have used dynamic arrays extensively in my projects as they are convenient. […]
Dynamic arrays are managed data types (https://www.freepascal.org/docs-html/ref/refse20.html). The FPC will automagically insert try … finally frames when you use them. You can [but shouldn’t] control this with {$implicitExceptions} (https://www.freepascal.org/docs-html/current/prog/progsu34.html). For more on that topic cf. avoiding implicit try finally section (https://wiki.freepascal.org/Avoiding_implicit_try_finally_section) in the Wiki.
Title: Re: Dynamic arrays much slower than GetMem
Post by: Jonas Maebe on September 28, 2021, 08:41:13 am
ASerge

I agree that the difference in speed can be resolved by changing the dynamic array call from
 tmp := copy(img, imgp, nx);
to
  Move(img^[imgp], tmp^[0], nx * 4);

Even though the array has the same size before and after the copy(), there seems to be a big penalty.
copy always allocates a new array.
Title: Re: Dynamic arrays much slower than GetMem
Post by: ASerge on September 28, 2021, 06:27:30 pm
I agree that the difference in speed can be resolved by changing the dynamic array call from
 tmp := copy(img, imgp, nx);
to
  Move(img^[imgp], tmp^[0], nx * 4);
Again, you did not understand me. The problem is that the functions with a dynamic and static array are different. That is, they do not the same thing. That is, it is useless to compare them, no matter what tricks you do with Move. They are different!
In my first message, I just showed where the error is located, and pointed out that if make them logically the same, then the difference in speed disappears.
Was
Code: Pascal  [Select][+][-]
  1. tmp := copy(img, imgp, imgp + nx - 1);
After correction (this is similar to the version without a dynamic array)
Code: Pascal  [Select][+][-]
  1. tmp := copy(img, imgp, nx);
Imagine that imgp equal 100000 to understand where the error is, and why there is such a difference in speed.
Title: Re: Dynamic arrays much slower than GetMem
Post by: ChrisR on September 29, 2021, 01:45:16 pm
ASerge is correct, my original problem was that the copy() command was copying a larger array than required. So while there copy() command may have a slight penalty versus move(), the effect is small.


So this thread is largely a false alarm. It does look like dynamic array can perform reasonably as long as you minimize the use of setlength() calls (which carry some overhead and obligatorily zero arrays).

My fault entirely. Thanks for the community help.

By the way, how do I set a thread to being "solved" with this forum?
Title: Re: Dynamic arrays much slower than GetMem
Post by: dseligo on September 29, 2021, 01:56:16 pm
By the way, how do I set a thread to being "solved" with this forum?

Just change title of the first post to: '[SOLVED] Dynamic arrays much slower than GetMem'
TinyPortal © 2005-2018