Initialize for object

Nitorami

Hero Member
Posts: 501

Re: Initialize for object

« Reply #1 on: March 16, 2023, 06:14:30 pm »

If you use virtual methods within your object then you'll need to declare and call a constructor (conventionally named Init), otherwise the object's VMT (Virtual Method Table) pointer won't be set up and you'll get an AV (access violation). Whether and how "Initialize" is involved in that, I don't know. I did not even know that such a method exists for objects. I only use the Initialize operator for automatic initialisation of advanced records.
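A minimal sketch of the constructor requirement described above. The type TShape and its method Name are made up for illustration; the only point is that the constructor call has to happen before the first virtual call.

```pascal
program ObjectVmtDemo;

{$mode objfpc}{$H+}

type
  // Hypothetical object type, for illustration only
  TShape = object
    constructor Init;
    function Name: String; virtual;
  end;

constructor TShape.Init;
begin
  // Nothing to do here: entering the constructor is what stores
  // the VMT pointer in the instance.
end;

function TShape.Name: String;
begin
  Result := 'shape';
end;

var
  S: TShape;
begin
  S.Init;           // without this call the VMT pointer is not set
  WriteLn(S.Name);  // virtual call, dispatched through the VMT; prints "shape"
end.
```

Removing the S.Init line is the situation that leads to the access violation, because the virtual call then goes through an unset VMT pointer.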

PascalDragon

Hero Member
Posts: 5486
Compiler Developer

Re: Initialize for object

« Reply #2 on: March 17, 2023, 03:42:10 pm »

Quote from: Okoba on March 16, 2023, 04:25:22 pm

What does Initialize do for an object? It seems to do nothing.

When the object contains managed types, Initialize brings them to a valid state.
Normally you don't need to call Initialize yourself; the compiler automatically inserts the necessary calls.

Okoba

Hero Member
Posts: 533

Re: Initialize for object

« Reply #3 on: March 17, 2023, 07:43:25 pm »

So why, in the sample code, does it write 10 even after Initialize? Shouldn't it be set to zero, like it is for a record? Is there a way to initialize the memory besides filling it with zeros?

Thaddy

Hero Member
Posts: 14382
Sensorship about opinions does not belong here.

Re: Initialize for object

« Reply #4 on: March 17, 2023, 07:54:51 pm »

Default() ?
If that fails it is a bug.
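A small sketch of the difference being discussed, assuming an object that has only an unmanaged field (the type TCounter is hypothetical). Initialize has nothing to do for such a type, while assigning Default() resets the value.

```pascal
program DefaultResetDemo;

{$mode objfpc}{$H+}

type
  // Hypothetical object with a single unmanaged field
  TCounter = object
    Count: Integer;
  end;

var
  C: TCounter;
begin
  C.Count := 10;
  Initialize(C);           // no managed fields, so this is a no-op
  WriteLn(C.Count);        // prints 10
  C := Default(TCounter);  // assigns a zeroed value of the type
  WriteLn(C.Count);        // prints 0
end.
```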

Object Pascal programmers should get rid of their "component fetish" especially with the non-visuals.

Okoba

Hero Member
Posts: 533

Re: Initialize for object

« Reply #5 on: March 17, 2023, 08:11:19 pm »

Default works. Thanks.
I'm still confused about why Initialize works like this: https://forum.lazarus.freepascal.org/index.php/topic,61795
After years of working with FPC, I still cannot properly explain to someone what Initialize is and how it behaves. I must be very bad at understanding it.

Warfley

Hero Member
Posts: 1499

Re: Initialize for object

« Reply #6 on: March 17, 2023, 08:23:30 pm »

Initialize only has an effect on managed types; for unmanaged types it does nothing. Also, you only need to call it when you are working with untyped memory (or memory whose type is different from the target type).
I must admit that this is not quite clear from the documentation:

Quote

Initialize is a compiler intrinsic: it initializes a memory area T for any kind of managed variable. Initializing means zeroeing out the memory area. In this sense it is close in functionality to Default, but Default requires an already initialized variable. It performs the opposite operation of finalize, which should be used to clean up the memory block when it is no longer needed.

Note that initialize is different from Default as Default can only be assigned to an initialized object

For each initialize you must also call a finalize. Example:

Code: Pascal

var
  buff: Array[0..SizeOf(String) - 1] of Byte; // Unmanaged memory
  str: PString; // Pointer to managed memory
begin
  str := @buff[0]; // Str points now to unmanaged memory
  Initialize(str^); // because str points to unmanaged memory it must be manually initialized
  ReadLn(str^);
  WriteLn('Hello ', str^);
  Finalize(str^); // The initialize must be matched by a finalize
end;

You only need Initialize if you know the type is managed, or if you are using generics whose type parameter could be managed. But since a record or object can contain a managed type (or a type that contains one), which makes it a managed type by proxy, it's often better to be safe than sorry and call it whenever you allocate untyped memory.
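The "managed by proxy" case can be sketched like this, assuming a hypothetical record TPerson whose String field makes the whole record managed. The memory comes from GetMem, so Initialize and Finalize have to be called by hand.

```pascal
program HeapRecordDemo;

{$mode objfpc}{$H+}

type
  // Hypothetical record; the String field makes the whole record managed
  TPerson = record
    Name: String;
    Age: Integer;
  end;
  PPerson = ^TPerson;

var
  P: PPerson;
begin
  P := GetMem(SizeOf(TPerson));  // raw memory, contents undefined
  Initialize(P^);                // brings Name into a valid (empty) state
  P^.Name := 'Ada';
  P^.Age := 36;
  WriteLn(P^.Name, ' ', P^.Age); // prints "Ada 36"
  Finalize(P^);                  // releases the string again
  FreeMem(P);
end.
```

With New(P) and Dispose(P) instead of GetMem and FreeMem, the compiler inserts both calls itself, which is why the manual pair is only needed for untyped allocations.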

« Last Edit: March 17, 2023, 08:31:08 pm by Warfley »

GitHub: https://github.com/Warfley

dsiders

Hero Member
Posts: 1084

Re: Initialize for object

« Reply #7 on: March 17, 2023, 08:51:15 pm »

Quote from: Okoba on March 17, 2023, 08:11:19 pm

Default works. Thanks.
I'm still confused about why Initialize works like this: https://forum.lazarus.freepascal.org/index.php/topic,61795
After years of working with FPC, I still cannot properly explain to someone what Initialize is and how it behaves. I must be very bad at understanding it.

Perhaps this will help: https://www.freepascal.org/docs-html/ref/refse20.html
Do you see Integer mentioned anywhere on the page?

Preview Lazarus 3.99 documentation at: https://dsiders.gitlab.io/lazdocsnext

Okoba

Hero Member
Posts: 533

Re: Initialize for object

« Reply #8 on: March 17, 2023, 10:08:53 pm »

Thank you both.
So Initialize is almost never needed, and the way to return an object to its default state is to use Default(). That's a little unfortunate because 1- Default creates a new object and copies it to the destination, which is slow, and 2- the Initialize operator for records leads someone like me to think that I need to call Initialize every time.
Now I think I have learned from my mistakes and your kind help. Default() it is.

Warfley

Hero Member
Posts: 1499

Re: Initialize for object

« Reply #9 on: March 17, 2023, 10:12:08 pm »

Maybe a real example of how to use Initialize would be useful. Suppose you need a data structure (for simplicity in this example, a stack) for high-performance use, and you don't want to waste a lot of time in the memory allocator. You can make use of the OS's existing virtual memory and paging functionality: simply preallocate a huge memory region beforehand and then operate on that.

Now if you use an array with SetLength, SetLength will implicitly call Initialize on all the elements of the array. This is good for ease of use, but it means that all the memory will be touched, and all your virtual memory and paging advantages go out the window. You want raw virtual memory, so you use GetMem instead:

Code: Pascal

program Project1;
 
{$mode objfpc}{$H+}
 
uses
  SysUtils;
 
type
  generic TPreallocatedStack<T> = class
  private type PT = ^T;
  private
    FData: PT;
    FLength: SizeInt;
    FSize: SizeInt;
  public
    constructor Create(const ASize: SizeInt);
    destructor Destroy; override;
 
    procedure Push(constref AItem: T);
    function Pop: T;
  end;
 
constructor TPreallocatedStack.Create(const ASize: SizeInt);
begin
  inherited Create;
  FData := GetMem(ASize * SizeOf(T));
  FSize := ASize;
  FLength := 0;
end;
 
destructor TPreallocatedStack.Destroy;
begin
  Freemem(FData);
  inherited Destroy;
end;
 
procedure TPreallocatedStack.Push(constref AItem: T);
begin
  FData[FLength] := AItem;
  Inc(FLength);
end;
 
function TPreallocatedStack.Pop: T;
begin
  Dec(FLength);
  Result := FData[FLength];
end;
 
type
  TIntStack = specialize TPreallocatedStack<Integer>;
 
var
  Stack: TIntStack;
  i: Integer;
begin
  Stack := TIntStack.Create(1024*1024*1024*10); // Allocates around 40 Gigabytes (10 gigs * 4 bytes per int) 
  try
    for i:=0 to 10 do
      Stack.Push(i);
    for i := 0 to 3 do
      WriteLn(Stack.Pop);
  finally
    Stack.Free;
  end;
end.
 

This works fine and is really fast. If, instead of the raw GetMem, we used SetLength, which initializes the memory, the program would hang in SetLength and at some point crash because my computer would run out of memory (I don't have 40 gigs of RAM).

So we have everything we want, right? The problem is that if we use managed types, the management operators will not be called:

Code: Pascal

  TStringStack = specialize TPreallocatedStack<String>;
 
var
  Stack: TStringStack;
  i: Integer;
begin
  Stack := TStringStack.Create(1024*1024*1024); // Reduced to 1 Gi elements because of HeapTrc (about 8 GB on 64-bit, where a String reference is 8 bytes)
  try
    for i:=0 to 10 do
      Stack.Push(i.ToString);
    for i := 0 to 3 do
      WriteLn(Stack.Pop);
  finally
    Stack.Free;
  end;
end. 

I'm now using heaptrc. heaptrc fills the memory with $ff (a security feature), and because this touches all the memory, much like SetLength does, it completely removes the speed advantage and consumes all the virtual memory, so it is for testing purposes only. As a consequence I had to reduce the size of the allocation; otherwise, as with SetLength, it would crash my PC.

Now we get a segfault, because heaptrc fills the memory with $ff, which results in invalid string values. If we mitigate this by zeroing the data manually (by adding FillChar(FData^, ASize * SizeOf(T), 0) to the constructor), we get a bunch of memory leaks.

The reason is that the memory was not initialized at the beginning and finalized at the end (actually, the FillChar above would be a correct initialization for String, but that's rather a coincidence). So to allow this data structure to hold managed types, it needs to call Initialize and Finalize, and not over all the data as SetLength does, but only where needed.
The simplest solution is to do that in Push and Pop, as well as in the destructor (since the destructor removes all the remaining items):

Code: Pascal

destructor TPreallocatedStack.Destroy;
begin
  Finalize(FData^, FLength);
  Freemem(FData);
  inherited Destroy;
end;
 
procedure TPreallocatedStack.Push(constref AItem: T);
begin
  Initialize(FData[FLength]);
  FData[FLength] := AItem;
  Inc(FLength);
end;
 
function TPreallocatedStack.Pop: T;
begin
  Dec(FLength);
  Result := FData[FLength];
  Finalize(FData[FLength]);
end;

Now if we run the same code again it works flawlessly: no segfaults, no memory leaks. And when we remove heaptrc, we can again increase the size to ridiculous amounts and it is still blazing fast.

Of course, there is still room for improvement. For example, pushing, then popping, then pushing again results in Initialize, Finalize, Initialize, where a single Initialize would do. Another counter could be added to track what has already been initialized, so that nothing is finalized in between.
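One way the extra counter could look, as a sketch only. The names TWatermarkStack and FInitialized are made up for this illustration; the idea is that each slot is initialized the first time it is used and finalized only in the destructor.

```pascal
program WatermarkStackDemo;

{$mode objfpc}{$H+}

uses
  SysUtils;

type
  // Hypothetical variant of the stack with an "initialized up to" counter
  generic TWatermarkStack<T> = class
  private type PT = ^T;
  private
    FData: PT;
    FLength: SizeInt;       // number of live items
    FInitialized: SizeInt;  // slots 0 .. FInitialized - 1 have been initialized
  public
    constructor Create(const ASize: SizeInt);
    destructor Destroy; override;
    procedure Push(constref AItem: T);
    function Pop: T;
  end;

constructor TWatermarkStack.Create(const ASize: SizeInt);
begin
  inherited Create;
  FData := GetMem(ASize * SizeOf(T));
  FLength := 0;
  FInitialized := 0;
end;

destructor TWatermarkStack.Destroy;
begin
  // Finalize every slot that was ever initialized, not just the live ones
  Finalize(FData^, FInitialized);
  FreeMem(FData);
  inherited Destroy;
end;

procedure TWatermarkStack.Push(constref AItem: T);
begin
  if FLength = FInitialized then
  begin
    Initialize(FData[FLength]);  // first use of this slot
    Inc(FInitialized);
  end;
  FData[FLength] := AItem;       // a normal assignment releases any stale value
  Inc(FLength);
end;

function TWatermarkStack.Pop: T;
begin
  Dec(FLength);
  Result := FData[FLength];      // the slot stays initialized for reuse
end;

type
  TStringStack = specialize TWatermarkStack<String>;

var
  Stack: TStringStack;
begin
  Stack := TStringStack.Create(1024);
  try
    Stack.Push(IntToStr(1));
    Stack.Push(IntToStr(2));
    WriteLn(Stack.Pop);          // prints 2
    Stack.Push(IntToStr(3));     // reuses the slot without a second Initialize
    WriteLn(Stack.Pop);          // prints 3
    WriteLn(Stack.Pop);          // prints 1
  finally
    Stack.Free;
  end;
end.
```

The trade-off in this sketch is that a popped value stays alive in its slot until it is overwritten or the stack is destroyed, so memory held by popped items is released later than in the version above.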

Another thing is that managed types can add arbitrarily complex code during copying. For example, assume the following managed record:

Code: Pascal

class operator TRec.Copy(constref Source: TRec; var Dest: TRec);
begin
  Sleep(1000);
  Dest.Data := Source.Data;
end;

This would add a 1 second sleep to every assignment of the record, which is really pointless but possible. Now look at the Pop method:

Code: Pascal

function TPreallocatedStack.Pop: T;
begin
  Dec(FLength);
  Result := FData[FLength];
  Finalize(FData[FLength]);
end;

What is done here is an assignment, which is fine as such, but it would trigger the 1 second sleep. What you can do with Initialize and Finalize is implement a clear and a Move:

Code: Pascal

function TPreallocatedStack.Pop: T;
begin
  Dec(FLength);
  Finalize(Result); // Clears the result
  Move(FData[FLength], Result, SizeOf(T)); // Move does not call Copy operator
  FillChar(FData[FLength], SizeOf(T), 0); // Not necessary but for robustness
end;

What this does is clear the Result variable and then move the data from the stack into Result without calling the Copy operator. The clearing is necessary because overwriting without the copy operator would otherwise break the validity assumptions of the managed type (e.g. for a string, the reference count of the old value would not be decreased).
After the move, the data in FData is invalid and should not be used anymore, because the only valid version of the data (no copy was created, since the copy operator was skipped) is now in Result. To ensure this, FillChar is used to write a different value into it, but it is technically not necessary.

Such admittedly rather advanced programming techniques allow you to control managed behavior closely and, e.g. for performance reasons, to avoid copies wherever possible (people familiar with C++ might recognize that this basically emulates std::move).

This is not much of a concern now (except when you really need that last bit of performance for your strings), but assuming that managed records will some day work, you might have list-type records for which an assignment with := copies large lists with gigabytes of data. Take, for example, a hash set of lists, which is quite common (this is actually how I discovered it myself: when managed records came to trunk I built all kinds of data structures with them and wondered why everything was so slow). If every rehash copied all the lists element by element, this could cause long freezes and huge memory consumption for all the copies.

So while it's just a bit of a curiosity right now, if (and when) managed records become popular, this might be necessary to ensure compatibility with very complex copy mechanics.

« Last Edit: March 17, 2023, 10:24:26 pm by Warfley »

GitHub: https://github.com/Warfley

Okoba

Hero Member
Posts: 533

Re: Initialize for object

« Reply #10 on: March 17, 2023, 10:29:20 pm »

Warfley, that is a great example, thank you so much.
So to summarise what Initialize() does: it prepares a managed type to be used? Like creating a class (and its virtual structure)? For example, for a string it prepares the reference counting?
I ask for clarification because the documentation only says that it zeroes out the memory, so it should work like FillZero.

Warfley

Hero Member
Posts: 1499

Re: Initialize for object

« Reply #11 on: March 17, 2023, 11:08:24 pm »

Quote from: Okoba on March 17, 2023, 10:29:20 pm

Warfley, that is a great example, thank you so much.
So to summarise what Initialize() does: it prepares a managed type to be used? Like creating a class (and its virtual structure)? For example, for a string it prepares the reference counting?
I ask for clarification because the documentation only says that it zeroes out the memory, so it should work like FillZero.

Yes, zeroing out the memory was correct when the only managed types were arrays, strings and interfaces, but with the new managed records, initialization can mean anything. As an example:

Code: Pascal

type
  PManagedTest = ^TManagedTest;
  TManagedTest = record
    Initialized: Boolean;
 
    class operator Initialize(var Self: TManagedTest);
  end;
 
class operator TManagedTest.Initialize(var Self: TManagedTest);
begin
  Self.Initialized := True;
end;
 
 
var
  p: PManagedTest;
begin
  p := GetMem(SizeOf(p^));
  FillChar(p^, SizeOf(p^), 0);
  WriteLn(p^.Initialized); // False because 0 initialized
  Initialize(p^);
  WriteLn(p^.Initialized); // True because initialize operator is called
  Finalize(p^); // Every Initialize is matched by a Finalize
  FreeMem(p);
end.

Here Initialized is explicitly set to True during initialization, so you can see that the documentation is no longer up to date.

Managed records are still a bit icky. For example, Default zeroes the fields, so p^ := Default(TManagedTest) will actually set p^.Initialized to False, meaning Default does not yield an initialized value (so by using Default you might end up in an invalid state). Other notable things include that Generics.Collections.TCustomList does not use Finalize correctly either; in its DoRemove it does the following:

Code: Pascal

  FItems[AIndex] := Default(T);
  if AIndex <> FLength then
  begin
    System.Move(FItems[AIndex + 1], FItems[AIndex], (FLength - AIndex) * SizeOf(T));
    FillChar(FItems[FLength], SizeOf(T), 0);
  end;

Here FillChar with 0 is used, which with the type above would result in an uninitialized value.

So yeah, the new managed records aren't really thought through right now. You are not alone in struggling with them.

« Last Edit: March 17, 2023, 11:15:24 pm by Warfley »

GitHub: https://github.com/Warfley

Okoba

Hero Member
Posts: 533

Re: Initialize for object

« Reply #12 on: March 17, 2023, 11:13:37 pm »

Oh, now it clicked for me.
Thank you again.
