
Topic: A memory handling pattern that makes the FPC memory manager slow down a lot

ALLIGATOR

Code: Pascal
program mmtest;
{$ifdef FPC}{$mode objfpc}{$endIf}
{$pointermath on}

uses
  // My HW: i7-8750H
  //                   t1, ms | t2, ms
  //                   _______|_______
  // FPC default     :  34967 |     78
  // FPC cmem        :     30 |  32949
  // FPC FastMM4-AVX :     13 |     33
  // FPC Mormot      :     12 |     32
  // D12CE default   :     13 |     31

  //mormot.core.fpcx64mm,
  //cmem,
  //FastMM4,
  sysutils;

var
  MyArr: array [0..100000-1] of Pointer;
  t: TDateTime;

procedure FreeEveryN(N: Integer);
var
  i:integer;
begin
  i:=Low(MyArr);
  while i <= High(MyArr) do
  begin
    if MyArr[i]<>nil then
    begin
      FreeMem(MyArr[i]);
      MyArr[i]:=nil;
    end;
    inc(i, N);
  end;
end;

procedure AllocEveryFree(Size: Integer);
var
  i: Integer;
begin
  for i:=Low(MyArr) to High(MyArr) do
    if MyArr[i]=nil then GetMem(MyArr[i], Size);
end;

begin
  FillChar(MyArr[Low(MyArr)], SizeOf(MyArr), 0);

  AllocEveryFree(2048);
  FreeEveryN(2);

  t:=Now;
  AllocEveryFree(1024);
  Write((Now-t)*MSecsPerDay:7:0, ' ms');

  t:=Now;
  FreeEveryN(1);
  WriteLn(', ', (Now-t)*MSecsPerDay:7:0, ' ms');

  ReadLn;
end.

I may seem rude - please don't take it personally

Thaddy

Why do you re-initialize already initialized memory? MyArr is heap, so it is initialized.
Recovered from removal of tumor in tongue following tongue reconstruction with a part from my leg.

440bx

Quote from Thaddy: "Why do you re-initialize already initialized memory? MyArr is heap, so it is initialized."
That's not what GetMem's documentation says.

https://www.freepascal.org/docs-html/rtl/system/getmem.html

Code: Text
The newly allocated memory is not initialized in any way, and may contain garbage data. It must be cleared with a call to FillChar or FillWord.
FPC v3.2.2 and Lazarus v4.0rc3 on Windows 7 SP1 64bit.

cdbc

Hi
'AllocMem' initializes, AFAICR...
Regards Benny
If it ain't broke, don't fix it ;)
PCLinuxOS(rolling release) 64bit -> KDE6/QT6 -> FPC Release -> Lazarus Release &  FPC Main -> Lazarus Main

440bx

Quote from cdbc: "'AllocMem' initializes, AFAICR..."
Yes, it does, but ALLIGATOR's code uses GetMem.
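
To make the difference concrete, here is a minimal sketch, assuming FPC (where AllocMem is available without extra units). The program name and the BlockSize constant are made up for the illustration and are not part of the original test.

```pascal
program zeroinit;
{$ifdef FPC}{$mode objfpc}{$endif}
{$pointermath on}

const
  BlockSize = 1024; // arbitrary size chosen for the illustration

var
  A, B: PByte;
begin
  // GetMem leaves the block contents undefined, so clear it explicitly.
  GetMem(A, BlockSize);
  FillChar(A^, BlockSize, 0);

  // AllocMem returns a block that is already zero-filled.
  B := AllocMem(BlockSize);

  WriteLn('A[0] = ', A[0], ', B[BlockSize - 1] = ', B[BlockSize - 1]);

  FreeMem(A);
  FreeMem(B);
end.
```

Both bytes print as 0. Note that this concerns only the blocks returned by the memory manager; the MyArr array in the test program is a global variable, which is a separate question.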

ASerge

Yes, you picked inconvenient sizes: the memory manager is constantly forced to guess wrong and start over when allocating.
I rewrote the program a bit to make the output more detailed:
Code: Pascal
program mmtest;
{$IFDEF FPC}
  {$MODE OBJFPC}
{$ENDIF}
{$IFDEF WINDOWS}
  {$APPTYPE CONSOLE}
{$ENDIF}

uses
  //mormot.core.fpcx64mm,
  //cmem,
  //FastMM4,
  SysUtils;

var
  MyArr: array [0..100000 - 1] of Pointer;
  TimeStart: TDateTime;

// Prints the description padded to 25 columns and starts the timer.
procedure StartTimer(const Desc: string);
begin
  Write(Desc, '':25-Length(Desc));
  TimeStart := Now;
end;

procedure StopTimerAndReport;
begin
  Writeln((Now - TimeStart) * MSecsPerDay:7:0, ' ms');
end;

// Allocates a block for every slot that is currently nil.
procedure AllocateForAllNotUsed(ABlockSize: NativeInt);
var
  i: NativeInt;
begin
  for i := Low(MyArr) to High(MyArr) do
    if MyArr[i] = nil then
      GetMem(MyArr[i], ABlockSize);
end;

procedure FreeEveryN(N: NativeInt);
var
  i: NativeInt;
begin
  i := Low(MyArr);
  while i <= High(MyArr) do
  begin
    // ReAllocMem with size 0 frees the block and sets the pointer to nil.
    ReAllocMem(MyArr[i], 0);
    Inc(i, N);
  end;
end;

procedure DoTest(AFirstSize, ASecondSize: NativeInt);
begin
  StartTimer('First allocate by ' + IntToStr(AFirstSize));
  AllocateForAllNotUsed(AFirstSize);
  StopTimerAndReport;

  StartTimer('Free every even item');
  FreeEveryN(2);
  StopTimerAndReport;

  StartTimer('Reallocate freed by ' + IntToStr(ASecondSize));
  AllocateForAllNotUsed(ASecondSize);
  StopTimerAndReport;

  StartTimer('Free all');
  FreeEveryN(1);
  StopTimerAndReport;
  Writeln;
end;

begin
  FillChar(MyArr[Low(MyArr)], SizeOf(MyArr), 0);
  DoTest(2048, 1024);
  Writeln;
  DoTest(2000, 1000);
  Write('Press Enter to exit...');
  ReadLn;
end.

My i5-2300.

FPC default:
Code: Text
First allocate by 2048       160 ms
Free every even item          10 ms
Reallocate freed by 1024   48190 ms
Free all                     105 ms


First allocate by 2000       165 ms
Free every even item           0 ms
Reallocate freed by 1000      41 ms
Free all                      96 ms

Press Enter to exit...

cmem:
Code: Text
First allocate by 2048       210 ms
Free every even item          50 ms
Reallocate freed by 1024      30 ms
Free all                   39690 ms


First allocate by 2000       640 ms
Free every even item         430 ms
Reallocate freed by 1000      40 ms
Free all                   64220 ms

Press Enter to exit...

ALLIGATOR

@Thaddy
Yes, you are probably right that global variables are initialized with 0, but I don't use them that often and didn't remember exactly, so I wrote the code like this.

@ASerge
Yes, these are probably not good sizes for a memory manager, but look at how much it slows down because of this: it's a very big slowdown. Perhaps someone who is well versed in the internals of the memory manager can smooth out such cases so that there are no such spikes.

Besides all this, I posted it on the forum to document it, and also in case someone with the relevant expertise finds it interesting.

LV

Let's run @ASerge's program with the following sizes:

Code: Pascal
begin
  FillChar(MyArr[Low(MyArr)], SizeOf(MyArr), 0);
  DoTest(2048, 1992);
  Writeln;
  DoTest(2048, 1993);
  Writeln;
  DoTest(2064, 2024);
  Writeln;
  DoTest(2064, 2025);
  Writeln;
  Write('Press Enter to exit...');
  ReadLn;
end.

output (default memory manager):

Code: Text
First allocate by 2048       106 ms
Free every even item           4 ms
Reallocate freed by 1992   20687 ms
Free all                      76 ms


First allocate by 2048       100 ms
Free every even item           4 ms
Reallocate freed by 1993       6 ms
Free all                      74 ms


First allocate by 2064       104 ms
Free every even item           4 ms
Reallocate freed by 2024   23564 ms
Free all                      76 ms


First allocate by 2064       101 ms
Free every even item           5 ms
Reallocate freed by 2025       6 ms
Free all                      74 ms


Press Enter to exit...

The explanation seems to be that memory is allocated and deallocated in blocks. See reply #63 here:
https://forum.lazarus.freepascal.org/index.php/topic,68982.msg536478.html#msg536478

Code: Text
Your system is 64 bit
MGet = 1991|  GetSize_ = 1992|  FreeMem_ = 2016|  GetSize = 1992|  FreeMem = 2016|
MGet = 1992|  GetSize_ = 1992|  FreeMem_ = 2016|  GetSize = 1992|  FreeMem = 2016|
..............................................  + 32
MGet = 1993|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 1994|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 1995|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 1996|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 1997|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 1998|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 1999|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2000|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2001|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2002|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2003|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2004|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2005|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2006|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2007|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2008|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2009|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2010|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2011|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2012|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2013|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2014|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2015|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2016|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2017|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2018|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2019|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2020|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2021|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2022|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2023|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
MGet = 2024|  GetSize_ = 2024|  FreeMem_ = 2048|  GetSize = 2024|  FreeMem = 2048|
..............................................  + 32
MGet = 2025|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2026|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2027|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2028|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2029|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2030|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2031|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2032|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2033|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2034|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2035|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2036|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2037|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2038|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2039|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2040|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2041|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2042|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2043|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2044|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2045|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2046|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2047|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2048|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2049|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2050|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2051|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2052|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2053|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2054|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2055|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
MGet = 2056|  GetSize_ = 2056|  FreeMem_ = 2080|  GetSize = 2056|  FreeMem = 2080|
..............................................  + 32
MGet = 2057|  GetSize_ = 2088|  FreeMem_ = 2112|  GetSize = 2088|  FreeMem = 2112|
MGet = 2058|  GetSize_ = 2088|  FreeMem_ = 2112|  GetSize = 2088|  FreeMem = 2112|
MGet = 2059|  GetSize_ = 2088|  FreeMem_ = 2112|  GetSize = 2088|  FreeMem = 2112|
MGet = 2060|  GetSize_ = 2088|  FreeMem_ = 2112|  GetSize = 2088|  FreeMem = 2112|
MGet = 2061|  GetSize_ = 2088|  FreeMem_ = 2112|  GetSize = 2088|  FreeMem = 2112|
MGet = 2062|  GetSize_ = 2088|  FreeMem_ = 2112|  GetSize = 2088|  FreeMem = 2112|
MGet = 2063|  GetSize_ = 2088|  FreeMem_ = 2112|  GetSize = 2088|  FreeMem = 2112|
MGet = 2064|  GetSize_ = 2088|  FreeMem_ = 2112|  GetSize = 2088|  FreeMem = 2112|
MGet = 2065|  GetSize_ = 2088|  FreeMem_ = 2112|  GetSize = 2088|  FreeMem = 2112|
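
The numbers in this table are consistent with a simple rounding rule: add 24 bytes, then round up to the next multiple of 32. The sketch below reproduces that rule. Both constants are read off the table (FreeMem minus GetSize is always 24, and the FreeMem values step by 32); they are not taken from the RTL sources, and the names HeaderSize, Granularity and ChunkSizeFor are made up for the illustration.

```pascal
program chunksize;
{$ifdef FPC}{$mode objfpc}{$endif}

const
  // Assumptions inferred from the table, not from the RTL sources.
  HeaderSize  = 24;  // FreeMem value minus GetSize value
  Granularity = 32;  // step between consecutive FreeMem values

// Rounds a requested size up to the chunk size that the table suggests.
function ChunkSizeFor(Requested: NativeUInt): NativeUInt;
begin
  Result := (Requested + HeaderSize + Granularity - 1)
            and not NativeUInt(Granularity - 1);
end;

const
  Sizes: array[0..5] of NativeUInt = (1992, 1993, 2024, 2025, 2048, 2064);

var
  i: Integer;
  Chunk: NativeUInt;
begin
  for i := Low(Sizes) to High(Sizes) do
  begin
    Chunk := ChunkSizeFor(Sizes[i]);
    WriteLn('request ', Sizes[i], ' -> chunk ', Chunk,
            ', usable ', Chunk - HeaderSize);
  end;
end.
```

Under this rule a 1992-byte request maps to a 2016-byte chunk while 1993 maps to 2048, and 2024 maps to 2048 while 2025 maps to 2080, which matches the boundaries in the table. The sketch only describes the rounding; it does not by itself explain why one side of each boundary is slow.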

abouchez

I am never convinced by such microbenchmarks.
Rely on real work instead: use the memory manager to build FPC itself and look at the compilation time, or use it in a multi-threaded web server and benchmark with wrk.

Just calling GetMem/FreeMem in loops is certainly not a realistic scenario: no application does this. Ever.
And you can likely find such a weakness in any MM, given a suitably bad pattern.

Anyway, the fastest heap is the one that is not used: this is why in mORMot (even though we have our own heap unit) we try to avoid most transient strings during processing.
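
As a generic illustration of "not using the heap" in a hot loop (this is not how mORMot does it, only a minimal sketch), one scratch buffer can be allocated up front and reused instead of calling GetMem/FreeMem on every iteration. The constants and the checksum "work" below are hypothetical stand-ins.

```pascal
program reusebuf;
{$ifdef FPC}{$mode objfpc}{$endif}

const
  Rounds  = 100000; // hypothetical number of work items
  BufSize = 1024;   // hypothetical scratch size per work item

var
  Buf: Pointer;
  i: Integer;
  Sum: Int64;
begin
  // One allocation up front instead of one GetMem/FreeMem pair per round.
  GetMem(Buf, BufSize);
  try
    Sum := 0;
    for i := 1 to Rounds do
    begin
      // Stand-in for real work that needs scratch space.
      FillChar(Buf^, BufSize, Byte(i and $FF));
      Inc(Sum, PByte(Buf)^);
    end;
    WriteLn('checksum: ', Sum);
  finally
    FreeMem(Buf);
  end;
end.
```

With this shape the memory manager is called twice in total, so its behaviour on awkward block sizes stops mattering for the loop.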

 
