Author Topic: Why CMem?  (Read 2331 times)

Thaddy

  • Hero Member
  • *****
  • Posts: 8680
Re: Why CMem?
« Reply #45 on: August 21, 2019, 03:50:49 pm »
And there you hit a problem that these are often used functions which a procvar would slow down. Which is probably why it is as it is now. (just doesn't show up in the simplistic benchmarks)
It also shows up in quite complex benchmarks. The cmem implementations as they are now are much more advanced than the simple heap managers they were before ~2008-2010. I fell into the ceteris paribus trap: between cmem implementations, all else is not equal.
And yes, it would require an interface and an implementation .inc file per platform, following the definitions I found for the three major platforms that have an equivalent of memsize.
Most people that want to use threading should learn to patch their jeans first: use a needle.

BrunoK

  • Full Member
  • ***
  • Posts: 174
  • Retired programmer
Re: Why CMem?
« Reply #46 on: August 21, 2019, 04:45:08 pm »
This would make the cmem interface variable. If for some reason you need a specific Windows version of cmem, just make a Windows-specific cmem unit (WCmem).
Not variable, only fully complying with the TMemoryManager requirements as defined in the wiki.
See my post of August 20, 2019, 03:09:15 pm. This is actually the cmem I'm testing in my working copy of FPC / Lazarus.
And there you hit a problem that these are often used functions which a procvar would slow down. Which is probably why it is as it is now. (just doesn't show up in the simplistic benchmarks)
I really don't understand what the problem with a procvar is; can you explain, please?

In my version of FPC's astrings.inc I now have these bits of code, which replace the unfair treatment of non-heap.inc memory managers (and the same for unicodestring and widestring).
Code: Pascal
const
  cChunkRoundUp  = SizeUInt(15);   // Chunk up to 16 bytes not including string terminator
  cChunkRounding = SizeUInt(not 15);
then
Code: Pascal
Function NewAnsiString(Len : SizeInt) : Pointer;
{
  Allocate a new AnsiString on the heap.
  initialize it to zero length and reference count 1.
}
Var
  P : Pointer;
  lAllocSize : SizeInt;
begin
  { request a multiple of 16 because the heap manager allocates chunks of 16 bytes anyway }
  { ~bk 18.08.?? force allocation by multiple of 16 including 2 bytes for
                 null (UTF-16 string terminator). Put alternate memory manager
                 on equal footing as heap.inc }
  // GetMem(P,Len+(AnsiFirstOff+sizeof(char)));
  lAllocSize := (Len+AnsiFirstOff+cChunkRoundUp+2) and cChunkRounding;
  GetMem(P, lAllocSize);
  If P<>Nil then begin
     PAnsiRec(P)^.Ref:=1;         { Set reference count }
     PAnsiRec(P)^.Len:=0;         { Initial length }
     PAnsiRec(P)^.CodePage:=DefaultSystemCodePage;
     PAnsiRec(P)^.ElementSize:=SizeOf(AnsiChar);
     inc(p,AnsiFirstOff);         { Points to string now }
     PAnsiChar(P)^    :=#0;       { Terminating #0 }
     PAnsiChar(P + 1)^:=#0;       { additional in case aRawString is UTF-16}
  end;
  NewAnsiString:=P;
end;
and where MemoryManager.MemSize is called:
Code: Pascal
procedure fpc_AnsiStr_SetLength(var S: RawByteString;
  l: SizeInt{$ifdef FPC_HAS_CPSTRING}; cp: TSystemCodePage{$endif FPC_HAS_CPSTRING});
  [public , alias: 'FPC_ANSISTR_SETLENGTH']; compilerproc;
{
  Sets The length of string S to L.
  Makes sure S is unique, and contains enough room.
}
...some code ...
    else if PAnsiRec(Pointer(S) - AnsiFirstOff)^.Ref = 1 then begin
      Temp := Pointer(s) - AnsiFirstOff;
      lens := MemSize(Temp);
      lena := AnsiFirstOff + L + 2; // 2 for possible UTF-16
      { allow shrinking string if that saves at least half of current size }
      if (lena > lens) or ((lens > 32) and (lena <= (lens div 2))) then begin
        lena := (lena + cChunkRoundUp) and cChunkRounding;
        if lena<>lens then begin
          reallocmem(Temp, lena);
          Pointer(S) := Temp + AnsiFirstOff;
        end;
      end;
    end
The idea behind the double terminating #0 is that it might make it possible to unify ansistring and unicodestring in the future. Anyway, the code cost is marginal relative to the rest, and preallocating all strings with the same overallocation is perfectly all right.
In Task Manager, my tests with Lazarus show that memory use with heap.inc or cmem.pp does not differ much (maybe slightly lower with cmem).
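
For anyone who wants to check the arithmetic in the snippets above in isolation, here is a minimal sketch (not the RTL code itself). The constants and the grow/shrink condition are copied from the post; RoundUp16 and NeedsRealloc are hypothetical helper names introduced only for this illustration.

```pascal
program RoundingSketch;
{$mode objfpc}

const
  cChunkRoundUp  = SizeUInt(15);
  cChunkRounding = SizeUInt(not 15);

// Hypothetical helper: round a byte count up to the next multiple of 16.
function RoundUp16(n: SizeUInt): SizeUInt;
begin
  Result := (n + cChunkRoundUp) and cChunkRounding;
end;

// Hypothetical helper mirroring the test in fpc_AnsiStr_SetLength:
// reallocate when the block must grow, or when a block larger than
// 32 bytes could shrink to half its current size or less.
function NeedsRealloc(needed, current: SizeUInt): Boolean;
begin
  Result := (needed > current) or
            ((current > 32) and (needed <= current div 2));
end;

begin
  WriteLn(RoundUp16(0));          // 0
  WriteLn(RoundUp16(1));          // 16
  WriteLn(RoundUp16(16));         // 16
  WriteLn(RoundUp16(17));         // 32
  WriteLn(NeedsRealloc(40, 32));  // TRUE  (must grow)
  WriteLn(NeedsRealloc(20, 64));  // TRUE  (saves at least half)
  WriteLn(NeedsRealloc(40, 64));  // FALSE (fits, saving too small)
  WriteLn(NeedsRealloc(10, 32));  // FALSE (block not larger than 32)
end.
```

The point of the last case is that small blocks are never shrunk, so short strings do not churn the memory manager.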

I'm currently modifying the compiler to accept the following new directive on the command line (my current version already accepts a -dcmem switch):
Code: Pascal
        { Support for alternate MemoryManager (not the heap.inc default)
          Note to myself : more should contain 'm'<MemoryManager_UnitName> ~bk
                                            or 'm:='<MemoryManager_UnitName> }
and, if successful, I will attempt to create a winheap memory manager that uses the Windows HeapAlloc / HeapReAlloc / HeapFree / HeapSize functions and conforms to the TMemoryManager requirements.
Lazarus trunk r. 59978/03.01.2019 (+/- patches regarding enabled, TScrollBar, TCursorImage). FPC 3.0.4 32 bits. (+heaptrc with leaked ClassName+Revisited TList) , Windows 10 Pro x64 (v. 1903)

PascalDragon

  • Hero Member
  • *****
  • Posts: 573
  • Compiler Developer
Re: Why CMem?
« Reply #47 on: August 23, 2019, 09:30:53 am »
Actually I found at least 2 lines that could cause trouble; they are:
Code: Pascal
D:\fpc-laz-asus\Lazarus\laz-svn-trunk\components\sparta\generics\source\inc\generics.dictionaries.inc
components\sparta\generics\source\inc\generics.dictionaries.inc (278,40) Result := PSizeInt(PByte(@((@Self)^))-SizeOf(SizeInt))^;
components\sparta\generics\source\inc\generics.dictionaries.inc (1318,42) Result := SizeInt((@PByte(@((@Self)^))[-SizeOf(SizeInt)])^);
Good find. :o That should indeed be fixed. Preferably before we release 3.2 which contains that as well...

BrunoK

  • Full Member
  • ***
  • Posts: 174
  • Retired programmer
Re: Why CMem?
« Reply #48 on: August 23, 2019, 11:32:49 am »
bk edit : bullshit -> ignore please
Attached 2 possible (untested) patches.
1° for lazarus trunk
2° for FPC trunk
« Last Edit: August 23, 2019, 04:39:19 pm by BrunoK »

PascalDragon

  • Hero Member
  • *****
  • Posts: 573
  • Compiler Developer
Re: Why CMem?
« Reply #49 on: August 23, 2019, 03:54:01 pm »
I just checked the original code. They're a false alarm: they rely on the internal structure of dynamic arrays, not on the size of the allocated memory. So nothing to see here...
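
To illustrate what "internal structure of dynamic arrays" means here, a small sketch under a stated assumption: in the FPC 3.x RTL a dynamic array's data is preceded by a header whose last field is the array's high index, stored as a SizeInt. That layout is an implementation detail and not a documented contract, so treat the program below as a demonstration, not as code to rely on.

```pascal
program DynArrayHeaderSketch;
{$mode objfpc}

var
  a: array of Integer;
  storedHigh: SizeInt;
begin
  SetLength(a, 5);
  // Assumption (FPC 3.x RTL layout): the SizeInt directly in front of the
  // first element holds the high index. This reads the array header, not
  // any bookkeeping of the memory manager.
  storedHigh := PSizeInt(PByte(Pointer(a)) - SizeOf(SizeInt))^;
  WriteLn('header high = ', storedHigh, ', High(a) = ', High(a));
end.
```

If the assumption holds, both values agree regardless of which memory manager is installed, which is why the two lines in generics.dictionaries.inc are unaffected by removing the size prefix from cmem.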

BrunoK

  • Full Member
  • ***
  • Posts: 174
  • Retired programmer
Re: Why CMem?
« Reply #50 on: August 23, 2019, 04:36:24 pm »
I just checked the original code. They're a false alarm: they rely on the internal structure of dynamic arrays, not on the size of the allocated memory. So nothing to see here...
I rushed a bit too fast on these because they looked so much like a hack to get the size of a memory block. One gets confused pretty easily with indirected indirect pointers.

Thaddy

  • Hero Member
  • *****
  • Posts: 8680
Re: Why CMem?
« Reply #51 on: August 23, 2019, 05:27:06 pm »
So? it looks like it is possible to rip out the size?

PascalDragon

  • Hero Member
  • *****
  • Posts: 573
  • Compiler Developer
Re: Why CMem?
« Reply #52 on: August 24, 2019, 10:55:47 am »
One gets confused pretty easily with indirected indirect pointers.
I hear you :-[

So? it looks like it is possible to rip out the size?
Did you test BrunoK's adjusted cmem without the size with Lazarus? Cause maybe it's a package you have installed in Lazarus which is misbehaving there.

BrunoK

  • Full Member
  • ***
  • Posts: 174
  • Retired programmer
Re: Why CMem?
« Reply #53 on: September 19, 2019, 12:20:38 pm »
So? it looks like it is possible to rip out the size?
Did you test BrunoK's adjusted cmem without the size with Lazarus? Cause maybe it's a package you have installed in Lazarus which is misbehaving there.
Still looking into memory allocation, at least on Windows: I have run some tests on the alignment of memory blocks supplied by heap.inc / cmem / winheap (see the attached rtl\win\winheap.pp unit).

Actually Grumpy may have a point, specifically on Windows with cmem: having the parasitic ptruint just in front of the allocated memory breaks alignment for structures expecting 8-byte / 16-byte (128-bit) alignment.

Testing the pointers returned by heap.inc / cmem / winheap (by starting a modified version of Lazarus with test code in heap.inc) WITHOUT the additional ptruint gives the following alignments:
CPU32 -> pointer to memory block always aligned at least on multiple of 8 bytes boundaries.
CPU64 -> pointer to memory block always aligned at least on multiple of 16 bytes boundaries.

What that means is that blocks for 128-bit structures, which are sometimes expected on x86_64 processors (and by MMX instructions, whatever that means), are wrongly aligned when using cmem with the leading ptruint.
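
A minimal sketch of why the leading ptruint shifts the alignment. IsAligned is a hypothetical helper, and the base address is a made-up value standing in for a 16-byte aligned pointer returned by malloc; nothing is allocated or dereferenced.

```pascal
program AlignmentSketch;
{$mode objfpc}

// Hypothetical helper: True when p is a multiple of boundary
// (boundary must be a power of two).
function IsAligned(p: Pointer; boundary: PtrUInt): Boolean;
begin
  Result := (PtrUInt(p) and (boundary - 1)) = 0;
end;

var
  base, user: Pointer;
begin
  // Made-up address standing in for a 16-byte aligned malloc result.
  base := Pointer(PtrUInt($1000));
  // cmem-style layout: the size is stored first, and the caller receives
  // the address just past it.
  user := base + SizeOf(PtrUInt);
  WriteLn(IsAligned(base, 16));               // TRUE
  WriteLn(IsAligned(user, 16));               // FALSE on 32 and 64 bit
  WriteLn(IsAligned(user, SizeOf(PtrUInt)));  // TRUE
end.
```

So the caller's pointer stays aligned to the pointer size only (offset 4 on CPU32, 8 on CPU64), and the 16-byte guarantee of the underlying allocator is lost.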

See https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019 for requirement for 16 byte alignment.

It should be checked how malloc/realloc align heap memory blocks on other Intel OSes (Linux etc.), and the cmem code should then be corrected to get rid of the leading "size as ptruint" wherever an equivalent of Windows' _msize exists.

About the attachments:
winheap.pp in rtl/win may work when it is listed as the first unit in the project and buildrtl has been modified.
cmem is there just to give an idea of how to change the non-Windows code, but it won't work straight out of the box, because in my FPC the memory manager is loaded BEFORE system (compiler changes) and some modifications in heap.inc are needed to initialize / finalize correctly.

If some developer wants to get the details, tell me where to post the changes I made to get things running correctly.
