Recent

Author Topic: Insert into Dynamic Array.  (Read 1942 times)

dbannon

  • Hero Member
  • *****
  • Posts: 3408
    • tomboy-ng, a rewrite of the classic Tomboy
Insert into Dynamic Array.
« on: April 12, 2022, 09:16:06 am »
In recent FPC extra ways to use dynamic arrays become available, particularly insert and delete, see https://lists.freepascal.org/pipermail/fpc-pascal/2018-May/053892.html  (thanks PascalDragon !)

I am a bit puzzled as to what is happening behind the scenes however.  We all know the danger of repeated calls to setlength() but insert seems to be able to avoid it, my experiments indicate that it does not call setlength each time its called, instead it seems to reserve blocks of pointers 'occasionally' and sensibly the size of those blocks increases as the size of the array increases.

In fact, here is a table of how big those reserved blocks are ('Len' being current length of array)-

Len     Next Reserve
0          1
2          8
9         15
25       30
57       326
383    4,820
5,203 4,744
9,946 88,340

Now, that says to me that a dynamic array variable has some hidden meta data. It must keep track of how much extra space the array has so it knows when to use the pre allocated space and when to allocate some more.

My question -

Is my assessment correct ?  Does using insert on a Dynamic Array avoid the performance issues of calling setlength each time ?   Using insert and delete only, I should avoid setlength() ?

My test code, maybe I have it all wrong ?

Code: Pascal  [Select][+][-]
  1. procedure TestArray();
  2. var
  3.     S : TStringArray;
  4.     i : integer;
  5.     P : qword;
  6. begin
  7.     P := qword(pointer(S));                      // thats Nil, 0
  8.     for i := 0 to 99999 do begin
  9.         insert('blar', S, 0);
  10.         if P <> qword(pointer(S)) then begin
  11.           P := qword(pointer(S));
  12.           writeln('Pointer changed at i=' + inttostr(i) + ' and length=' + inttostr(length(S)));
  13.         end;
  14.     end;
  15. end;
Lazarus 3, Linux (and reluctantly Win10/11, OSX Monterey)
My Project - https://github.com/tomboy-notes/tomboy-ng and my github - https://github.com/davidbannon

bytebites

  • Hero Member
  • *****
  • Posts: 719
Re: Insert into Dynamic Array.
« Reply #1 on: April 12, 2022, 11:31:57 am »
 qword(pointer()) is too complicated. Pchar() does the same.

Zvoni

  • Hero Member
  • *****
  • Posts: 3003
Re: Insert into Dynamic Array.
« Reply #2 on: April 12, 2022, 11:57:06 am »
Having run the test, too, incl. writeln-ing the pointer, i would guess that it's similiar to a StringList: The allocated memory grows in chunks.
This was my Testcode:
Code: Pascal  [Select][+][-]
  1. program Project1;
  2. Uses SysUtils;
  3. var
  4.     S : TStringArray;
  5.     i : integer;
  6.     P : qword;
  7. begin
  8.     P := qword(pointer(S));                      // thats Nil, 0
  9.     for i := 0 to 19 do begin
  10.         insert('blar', S, 0);
  11.         //if P <> qword(pointer(S)) then begin
  12.           P := qword(pointer(S));
  13.           writeln('Pointer P='+Inttostr(p)+' changed at i=' + inttostr(i) + ' and length=' + inttostr(length(S)));
  14.         //end;
  15.     end;
  16.     Readln;
  17. end.
Result:
Code: [Select]
Pointer P=22873808 changed at i=0 and length=1
Pointer P=22742736 changed at i=1 and length=2
Pointer P=22742736 changed at i=2 and length=3
Pointer P=22742736 changed at i=3 and length=4
Pointer P=22742736 changed at i=4 and length=5
Pointer P=22742736 changed at i=5 and length=6
Pointer P=22742736 changed at i=6 and length=7
Pointer P=22742736 changed at i=7 and length=8
Pointer P=22742736 changed at i=8 and length=9
Pointer P=22906600 changed at i=9 and length=10
Pointer P=22906600 changed at i=10 and length=11
Pointer P=22906600 changed at i=11 and length=12
Pointer P=22906600 changed at i=12 and length=13
Pointer P=22906600 changed at i=13 and length=14
Pointer P=22906600 changed at i=14 and length=15
Pointer P=22906600 changed at i=15 and length=16
Pointer P=22906600 changed at i=16 and length=17
Pointer P=22906600 changed at i=17 and length=18
Pointer P=22906600 changed at i=18 and length=19
Pointer P=22906600 changed at i=19 and length=20

EDIT: Just found this on the Wiki-Page for SetLength
https://wiki.freepascal.org/Dynamic_array
Quote
The procedure allocates memory for as many records of the base type as specified, plus some management data.
So at a guess, this "management data" contains stuff like Count, Element-type, Base-Address, reserved memory and similiar
« Last Edit: April 12, 2022, 12:08:13 pm by Zvoni »
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

Warfley

  • Hero Member
  • *****
  • Posts: 1930
Re: Insert into Dynamic Array.
« Reply #3 on: April 12, 2022, 12:07:24 pm »
The code for insert is:
Code: Pascal  [Select][+][-]
  1. procedure fpc_dynarray_insert(var p : pointer;source : SizeInt;data : pointer;count : SizeInt;pti : pointer);compilerproc;
  2.   var
  3.     newhigh,
  4.     i : tdynarrayindex;
  5.     size : sizeint;
  6.     realp,
  7.     newp : pdynarray;
  8.     ti : pointer;
  9.     elesize : sizeint;
  10.     eletype,eletypemngd : pointer;
  11.   begin
  12.     if not assigned(data) or
  13.         (count=0) then
  14.       exit;
  15.  
  16.     if assigned(p) then
  17.       realp:=pdynarray(p-sizeof(tdynarray))
  18.     else
  19.       realp:=nil;
  20.     newp:=realp;
  21.  
  22.     { cap insert index }
  23.     if assigned(p) then
  24.       begin
  25.         if source<0 then
  26.           source:=0
  27.         else if source>realp^.high+1 then
  28.           source:=realp^.high+1;
  29.       end
  30.     else
  31.       source:=0;
  32.  
  33.     { skip kind and name }
  34. {$ifdef VER3_0}
  35.      ti:=aligntoptr(Pointer(pti)+2+PByte(pti)[1]);
  36. {$else VER3_0}
  37.      ti:=aligntoqword(Pointer(pti)+2+PByte(pti)[1]);
  38. {$endif VER3_0}
  39.  
  40.     elesize:=pdynarraytypedata(ti)^.elSize;
  41.     eletype:=pdynarraytypedata(ti)^.elType2^;
  42.     { only set if type needs initialization }
  43.     if assigned(pdynarraytypedata(ti)^.elType) then
  44.       eletypemngd:=pdynarraytypedata(ti)^.elType^
  45.     else
  46.       eletypemngd:=nil;
  47.  
  48.     { determine new memory size }
  49.     if assigned(p) then
  50.       newhigh:=realp^.high+count
  51.     else
  52.       newhigh:=count-1;
  53.     size:=elesize*(newhigh+1)+sizeof(tdynarray);
  54.  
  55.     if assigned(p) then
  56.       begin
  57.         if realp^.refcount<>1 then
  58.           begin
  59.             { make an unique copy }
  60.             getmem(newp,size);
  61.             fillchar(newp^,sizeof(tdynarray),0);
  62.  
  63.             { copy leading elements }
  64.             if source>0 then
  65.               move(p^,(pointer(newp)+sizeof(tdynarray))^,source*elesize);
  66.             { insert new elements }
  67.             move(data^,(pointer(newp)+sizeof(tdynarray)+source*elesize)^,count*elesize);
  68.             { copy trailing elements }
  69.             if realp^.high-source+1>0 then
  70.               move((p+source*elesize)^,(pointer(newp)+sizeof(tdynarray)+(source+count)*elesize)^,(realp^.high-source+1)*elesize);
  71.  
  72.             { increment ref. count of managed members }
  73.             if assigned(eletypemngd) then
  74.               for i:=0 to newhigh do
  75.                 int_addref(pointer(newp)+sizeof(tdynarray)+elesize*i,eletypemngd);
  76.  
  77.             { a declock(ref. count) isn't enough here }
  78.             { it could be that the in MT environments  }
  79.             { in the mean time the refcount was       }
  80.             { decremented                             }
  81.  
  82.             { it is, because it doesn't really matter }
  83.             { if the array is now removed             }
  84.             fpc_dynarray_clear(p,pti);
  85.           end
  86.         else
  87.           begin
  88.             { resize the array }
  89.             reallocmem(realp,size);
  90.  
  91.             { p might no longer be correct }
  92.             p:=pointer(realp)+sizeof(tdynarray);
  93.  
  94.             { move the trailing part after the inserted data }
  95.             if source<=realp^.high then
  96.               move((p+source*elesize)^,(p+(source+count)*elesize)^,(realp^.high-source+1)*elesize);
  97.  
  98.             { move the inserted data to the destination }
  99.             move(data^,(p+source*elesize)^,count*elesize);
  100.  
  101.             { increase reference counts of inserted elements }
  102.             if assigned(eletypemngd) then
  103.               begin
  104.                 for i:=source to source+count-1 do
  105.                   int_addref(p+i*elesize,eletypemngd);
  106.               end;
  107.  
  108.             newp:=realp;
  109.           end;
  110.       end
  111.     else
  112.       begin
  113.         { allocate new array }
  114.         getmem(newp,size);
  115.         fillchar(newp^,sizeof(tdynarray),0);
  116.  
  117.         { insert data }
  118.         move(data^,(pointer(newp)+sizeof(tdynarray))^,count*elesize);
  119.  
  120.         { increase reference counts of inserted elements }
  121.         if assigned(eletypemngd) then
  122.           begin
  123.             for i:=0 to count-1 do
  124.               int_addref(pointer(newp)+sizeof(tdynarray)+i*elesize,eletypemngd);
  125.           end;
  126.       end;
  127.  
  128.     p:=pointer(newp)+sizeof(tdynarray);
  129.     newp^.refcount:=1;
  130.     newp^.high:=newhigh;
  131.   end;

As you can see, it always calls reallocmem for the new size (which is exactly as large as to fit exactly one more element in). So there is no geometric growth in the insert logic

The reason you see this is due to the behavior of the memory manager. Which allows you growing your memory region without having to move the elements.
As a comparison, do the same test with the c memory manager (include cmem as very first unit) and you see:
Code: Text  [Select][+][-]
  1. Pointer changed at i=0 and length=1
  2. Pointer changed at i=1 and length=2
  3. Pointer changed at i=933 and length=934
  4. Pointer changed at i=935 and length=936
  5. Pointer changed at i=937 and length=938
  6. Pointer changed at i=939 and length=940
  7. Pointer changed at i=941 and length=942
  8. Pointer changed at i=943 and length=944
  9. Pointer changed at i=945 and length=946
  10. Pointer changed at i=947 and length=948
  11. Pointer changed at i=949 and length=950
  12. Pointer changed at i=951 and length=952
  13. Pointer changed at i=953 and length=954
  14. Pointer changed at i=955 and length=956
  15. Pointer changed at i=957 and length=958
  16. Pointer changed at i=959 and length=960
  17. [...]

So this behavior is soly dependent on the memory manager. And therefore also probably dependent on the current state of allocations (so called heap fragmentation). So this is not something you should rely on. If you have an inner loop adding large amounts of elements, build the geometric growth yourself
« Last Edit: April 12, 2022, 12:17:24 pm by Warfley »

howardpc

  • Hero Member
  • *****
  • Posts: 4144
Re: Insert into Dynamic Array.
« Reply #4 on: April 12, 2022, 12:09:55 pm »
Likewise you see the difference with (and without) using heaptrc.

Zvoni

  • Hero Member
  • *****
  • Posts: 3003
Re: Insert into Dynamic Array.
« Reply #5 on: April 12, 2022, 12:14:47 pm »
As you can see, it always calls reallocmem for the new size (which is exactly as large as to fit exactly one more element in). So there is no geometric growth in the insert logic
OK, You will have to explain that one (and i admit: I have no idea about stuff like this. You know what curiosity did to the cat?..  :D)
If i read the source-code correctly, reallocmem only gets called if realp^.refcount = 1
How is that "always calls"?
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

Warfley

  • Hero Member
  • *****
  • Posts: 1930
Re: Insert into Dynamic Array.
« Reply #6 on: April 12, 2022, 12:21:54 pm »
It is "always called" in the scenario looked at here, namely that a single array is appended over and over.
Arrays implement lazy copy, meaning they are copied by reference, but if they are changed, it is ensured that you are working on a unique copy. This is ensured through the reference counter.
So when refcount is not 1, i.e. there are two arrays using the same memory, getmem and move is called to ensure a fresh copy is created, so only the array at hand is changed and the other reference to the same array is unchanged. Similarly, if there is no array at the moment that can be reallocated, a new array is created using getmem.

But as the question here was about inserting to a single array, for this case, only the refcount=1 case is of any relevance

Zvoni

  • Hero Member
  • *****
  • Posts: 3003
Re: Insert into Dynamic Array.
« Reply #7 on: April 12, 2022, 12:44:44 pm »
It is "always called" in the scenario looked at here, namely that a single array is appended over and over.
Arrays implement lazy copy, meaning they are copied by reference, but if they are changed, it is ensured that you are working on a unique copy. This is ensured through the reference counter.
So when refcount is not 1, i.e. there are two arrays using the same memory, getmem and move is called to ensure a fresh copy is created, so only the array at hand is changed and the other reference to the same array is unchanged. Similarly, if there is no array at the moment that can be reallocated, a new array is created using getmem.

But as the question here was about inserting to a single array, for this case, only the refcount=1 case is of any relevance
Ahhh, OK. Got it.
Thx.
So, basically a DynArray doesn't grow in chunks, but the Memory-Manager knows, if it still has room left after the last/highest element.
(EDIT: Which would be dbannons assumption, if i read it correctly)
« Last Edit: April 12, 2022, 12:46:27 pm by Zvoni »
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

dbannon

  • Hero Member
  • *****
  • Posts: 3408
    • tomboy-ng, a rewrite of the classic Tomboy
Re: Insert into Dynamic Array.
« Reply #8 on: April 12, 2022, 12:46:23 pm »
...
So this behavior is soly dependent on the memory manager. And therefore also probably dependent on the current state of allocations (so called heap fragmentation). So this is not something you should rely on. If you have an inner loop adding large amounts of elements, build the geometric growth yourself

Yes, you are right, cmem gives a quite different result but interestingly, about the same time to run.

But, and this rocked me, doing setlength and then assigning in the loop is massively faster !  160ms v 3mS type faster. So, my question very very firmly answered.

Thanks Warfley, very useful answer, thanks Zvoni and howardpc for your interest too !

Davo
Lazarus 3, Linux (and reluctantly Win10/11, OSX Monterey)
My Project - https://github.com/tomboy-notes/tomboy-ng and my github - https://github.com/davidbannon

Warfley

  • Hero Member
  • *****
  • Posts: 1930
Re: Insert into Dynamic Array.
« Reply #9 on: April 12, 2022, 01:13:53 pm »
Yes the C memory manager is ridiculusly fast, this is because a, it is extremly simple, basically it does not try to be smart but simply gives you the first memory region that is large enough to hold your data. The standard memory manager on the other hand has much more complex behavior, trying to group up similar sized objects. For example:
Code: Pascal  [Select][+][-]
  1. var
  2.   p1, p2, p3: Pointer;
  3. begin
  4.   p1 := GetMem(4);
  5.   p2 := GetMem(70);
  6.   p3 := GetMem(3000);
  7.   WriteLn('p1->p2: ', p2 - p1);
  8.   WriteLn('p2->p3: ', p3 - p2);
  9.   ReadLn;
  10. end
Without cmem:
Code: Pascal  [Select][+][-]
  1. p1->p2: 32792
  2. p2->p3: 32808
So the three memory regions are 32 kb away from each other, so  basically the memory manager has it's own memory regions for different orders of magnitude of sizes
With cmem:
Code: Pascal  [Select][+][-]
  1. p1->p2: 32
  2. p2->p3: 104
These are very close together, so cmem just has one memory regions and adds the elements wherever they fit (the large jump of 32 is because the c memory manager will not allocate memory less than 32 bytes).

And because it is simpler, it tends to be faster as less logic has to be executed. The second, probably more important, the c memory manager is provided by the system libc, which is provided by you os and extremely optimised, like special optimised versions for different CPU versions.
Thats why even it has to do much more allocations it is still quite fast.

But as I like to say, the best kind of work is the one you don't have to do, so preallocating memory is usually the way to go. While it is of course nice to just write something like:
Code: Pascal  [Select][+][-]
  1. for i in items do
  2.   arr += [i];
The much more efficient way is to do:
Code: Pascal  [Select][+][-]
  1. len := 0;
  2. SetLength(arr, 32);
  3. for i in items do
  4. begin
  5.   if len >= Length(arr) then
  6.     SetLength(arr, Length(arr)*2);
  7.   arr[len] := i;
  8.   Inc(len);
  9. end;
  10. SetLength(arr, len);
It is much more code, but can certainly be much more efficient

dbannon

  • Hero Member
  • *****
  • Posts: 3408
    • tomboy-ng, a rewrite of the classic Tomboy
Re: Insert into Dynamic Array.
« Reply #10 on: April 14, 2022, 05:23:01 am »
I thought I'd get some performance numbers. And the results were interesting enough to be worth sharing IMHO !  Simple test code that 'adds' a small string to a dynamic array or a TstringList using both default and cmem memory manager, linux. Between 30K and 10M times.

The code -
Code: Pascal  [Select][+][-]
  1. Const Iterations = 3000000;
  2. procedure TestArrays();
  3. var
  4.     SL : TStringList;
  5.     SA : TStringArray;
  6.     i : integer;
  7.     Tick : qword;
  8. begin
  9.     setlength(SA, 0);
  10.  
  11.     Tick := Gettickcount64();
  12.     setlength(SA, Iterations);
  13.     for i := 0 to Iterations-1 do begin
  14.         SA[i] := 'blar';
  15.     end;
  16.     setlength(SA, 0);
  17.     writeln('Preallocate Array = ' + inttostr(Gettickcount64() - Tick) + 'mS');
  18.  
  19.     Tick := Gettickcount64();
  20.     for i := 0 to Iterations do begin
  21.         setlength(SA, i+1);
  22.         SA[i] := 'blar';
  23.     end;
  24.     setlength(SA, 0);
  25.     writeln('Allocate Array InLoop = ' + inttostr(Gettickcount64() - Tick) + 'mS');
  26.  
  27.     Tick := Gettickcount64();
  28.     for i := 0 to Iterations do begin
  29.         insert('blar', SA, high(SA));
  30.     end;
  31.     setlength(SA, 0);
  32.     writeln('Insert into Array = ' + inttostr(Gettickcount64() - Tick) + 'mS');
  33.  
  34.     Tick := Gettickcount64();
  35.     SL := TStringList.create;
  36.     for i := 0 to Iterations do begin
  37.         SL.Add('blar');
  38.     end;
  39.     SL.Free;
  40.     writeln('StringList = ' + inttostr(Gettickcount64() - Tick) + 'mS');
  41. end;

cmem
Code: [Select]
      Pre   InLoop  Insert  StringList
30k     1       6       7      4
100K    5      19      15      6
300K   13      35      17     10
1M     38      46      44     35
3M     62     112     127     97
10M   130     600     620    451

Default Memory Manager
Code: [Select]
      Pre   InLoop  Insert  StringList
30K     1       4       6      4
100K    4      15      15      8
300K   14,     46      24     14
1M     38     215     217     52
3M     65    1790    1855    141
10M   124   19677   19797    511

Release Mode. All numbers are mS, lower the better. Less than 30K items challenged resolution of GetTickCount64. Note the non-linearity.
Pre = Pre allocate space in the array using setlength.
InLoop = Adjust size of Array in each iteration of loop.
Insert  = use the Insert Intrinsic that appeared in FPC3.2.0


Picking a few datapoints of interest -

In all cases, using insert or adjusting setlength in the loop was at least 4 times slower than pre allocating for the array.

The memory manager made little difference to the Preallocated Array model but cmem was massively faster in the Insert and In Loop Array models for the bigger data structures. Cmem was a little faster for the StringList too.

The PreAllocate Array was mostly faster than the StringList and often substantially so. But remarkably, using cmem the StringList did creep ahead slightly in the 300K to 1M items range.

If you know how big your data structure will be and don't need the extensive features of a TStringList, a dynamic array is probably a good choice. And, maybe, self managing the allocation is worth consideration ?

Thanks Warfley for making me look at it properly !

Davo
Lazarus 3, Linux (and reluctantly Win10/11, OSX Monterey)
My Project - https://github.com/tomboy-notes/tomboy-ng and my github - https://github.com/davidbannon

 

TinyPortal © 2005-2018