Maybe a real example of how to use Initialize would be useful. Assume you need a data structure (a stack, to keep this example simple) for high-performance use, and you don't want to spend a lot of time in the memory allocator. You can lean on the virtual memory and paging functionality the OS already provides: preallocate one huge memory region up front and then operate on that.
Now if you use an array with SetLength, SetLength implicitly calls Initialize on all elements of the array. That is convenient, but it means that all of the memory gets touched, and all your virtual memory and paging advantages go out the window. You want raw virtual memory, so you use GetMem instead:
program Project1;

{$mode objfpc}{$H+}

uses
  SysUtils;

type
  generic TPreallocatedStack<T> = class
  private type
    PT = ^T;
  private
    FData: PT;
    FLength: SizeInt;
    FSize: SizeInt;
  public
    constructor Create(const ASize: SizeInt);
    destructor Destroy; override;
    procedure Push(constref AItem: T);
    function Pop: T;
  end;

constructor TPreallocatedStack.Create(const ASize: SizeInt);
begin
  inherited Create;
  FData := GetMem(ASize * SizeOf(T));
  FSize := ASize;
  FLength := 0;
end;

destructor TPreallocatedStack.Destroy;
begin
  FreeMem(FData);
  inherited Destroy;
end;

procedure TPreallocatedStack.Push(constref AItem: T);
begin
  FData[FLength] := AItem;
  Inc(FLength);
end;

function TPreallocatedStack.Pop: T;
begin
  Dec(FLength);
  Result := FData[FLength];
end;

type
  TIntStack = specialize TPreallocatedStack<Integer>;

var
  Stack: TIntStack;
  i: Integer;
begin
  Stack := TIntStack.Create(1024*1024*1024*10); // Allocates around 40 gigabytes (10 Gi elements * 4 bytes per Integer)
  try
    for i := 0 to 10 do
      Stack.Push(i);
    for i := 0 to 3 do
      WriteLn(Stack.Pop);
  finally
    Stack.Free;
  end;
end.
This works fine and is really fast. If we used SetLength, which initializes the memory, instead of the raw GetMem, the program would hang in SetLength and at some point crash, because my computer would run out of memory (I don't have 40 GB of RAM).
So we have everything we want, right? The problem now is that if we use managed types, the management operators are not called:
type
  TStringStack = specialize TPreallocatedStack<String>;

var
  Stack: TStringStack;
  i: Integer;
begin
  // Reduced to 1 Gi elements because of heaptrc
  // (about 8 GB if a String reference is 8 bytes, as on a 64-bit target)
  Stack := TStringStack.Create(1024*1024*1024);
  try
    for i := 0 to 10 do
      Stack.Push(i.ToString);
    for i := 0 to 3 do
      WriteLn(Stack.Pop);
  finally
    Stack.Free;
  end;
end.
I'm now using heaptrc. heaptrc fills allocated memory with $ff (a security feature), which touches all of the memory, much like SetLength does. That of course completely removes the speed advantage and consumes all the virtual memory, so it is only for testing purposes. As a consequence I had to reduce the size of the allocation, because otherwise, as with SetLength, it would crash my PC.
Now we get a segfault, because when heaptrc initializes the memory, it writes $ff into it, which results in invalid string values. If we work around this by zeroing the data manually (by adding FillChar(FData^, ASize * SizeOf(T), 0) to the constructor), we get a bunch of memory leaks instead.
The reason is that the memory was neither initialized at the beginning nor finalized at the end (actually the FillChar above happens to be a correct initialization for String, but that is rather a coincidence). So for this data structure to support managed types, it needs to call Initialize and Finalize. And not over all the data, as SetLength does, but only where it is needed.
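To make that lifecycle concrete, here is a minimal sketch on a single raw slot rather than a whole stack. The program name RawSlotDemo and the pointer type PStr are made up for this illustration; the calls themselves (GetMem, Initialize, Finalize, FreeMem) are the ones discussed above.

```pascal
program RawSlotDemo;

{$mode objfpc}{$H+}

uses
  SysUtils;

type
  PStr = ^String; // hypothetical helper type, just for this sketch

var
  Slot: PStr;
begin
  // Raw, untouched memory: the bytes in Slot^ are arbitrary at this point.
  Slot := GetMem(SizeOf(String));

  // Put the slot into a valid "empty string" state before the first assignment.
  // Without this, the assignment below would try to release whatever garbage
  // "old value" happens to be in the memory.
  Initialize(Slot^);

  Slot^ := IntToStr(42); // normal managed assignment, reference counted
  WriteLn(Slot^);

  // Release the string the slot refers to before giving the raw memory back,
  // otherwise the string data would leak.
  Finalize(Slot^);
  FreeMem(Slot);
end.
```

The stack needs exactly this pairing, just applied per element and only to the elements that are actually in use.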
The simplest solution is to do that in Push and Pop, as well as in the destructor (since the destructor removes all the remaining items):
destructor TPreallocatedStack.Destroy;
begin
  Finalize(FData^, FLength);
  FreeMem(FData);
  inherited Destroy;
end;

procedure TPreallocatedStack.Push(constref AItem: T);
begin
  Initialize(FData[FLength]);
  FData[FLength] := AItem;
  Inc(FLength);
end;

function TPreallocatedStack.Pop: T;
begin
  Dec(FLength);
  Result := FData[FLength];
  Finalize(FData[FLength]);
end;
Now if we run the same code again, it works flawlessly: no segfaults, no memory leaks. And when we remove heaptrc, we can again increase the size to ridiculous amounts and it is still blazing fast.
Of course there is still room for improvement. For example, pushing, then popping, then pushing again results in Initialize, Finalize, Initialize on the same slot, where a single Initialize would do. So another counter could be added that tracks what has already been initialized, and the Finalize in between could be skipped.
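A rough sketch of that extra counter could look like the following. The class name THighWaterStack and the field FInitialized are invented for this illustration, and it is only one way to do it. Note the trade-off: a popped slot keeps its old value alive until it is overwritten or the stack is destroyed.

```pascal
program HighWaterDemo;

{$mode objfpc}{$H+}

type
  generic THighWaterStack<T> = class
  private type
    PT = ^T;
  private
    FData: PT;
    FLength: SizeInt;      // number of items currently on the stack
    FInitialized: SizeInt; // hypothetical counter: slots 0..FInitialized-1 are initialized
  public
    constructor Create(const ASize: SizeInt);
    destructor Destroy; override;
    procedure Push(constref AItem: T);
    function Pop: T;
  end;

constructor THighWaterStack.Create(const ASize: SizeInt);
begin
  inherited Create;
  FData := GetMem(ASize * SizeOf(T));
  FLength := 0;
  FInitialized := 0;
end;

destructor THighWaterStack.Destroy;
begin
  // Finalize everything that was ever initialized, not just the live items.
  Finalize(FData^, FInitialized);
  FreeMem(FData);
  inherited Destroy;
end;

procedure THighWaterStack.Push(constref AItem: T);
begin
  // Only initialize a slot the first time the stack grows into it.
  if FLength = FInitialized then
  begin
    Initialize(FData[FLength]);
    Inc(FInitialized);
  end;
  FData[FLength] := AItem; // normal assignment releases any stale value in the slot
  Inc(FLength);
end;

function THighWaterStack.Pop: T;
begin
  Dec(FLength);
  Result := FData[FLength]; // slot stays initialized, no Finalize here
end;

type
  TStringStack = specialize THighWaterStack<String>;

var
  Stack: TStringStack;
begin
  Stack := TStringStack.Create(1024);
  try
    Stack.Push('a');
    Stack.Push('b');
    WriteLn(Stack.Pop); // b
    Stack.Push('c');    // reuses the already initialized slot
    WriteLn(Stack.Pop); // c
    WriteLn(Stack.Pop); // a
  finally
    Stack.Free;
  end;
end.
```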
Another thing is that managed types can run arbitrarily complex code during copying. For example, assume a managed record TRec with the following Copy operator:
class operator TRec.Copy(constref Source: TRec; var Dest: TRec);
begin
  Sleep(1000); // artificial delay on every copy
  Dest.Data := Source.Data;
end;
This would add a 1 second sleep to every assignment of the record, which is really pointless but possible. Now look at the Pop method:
function TPreallocatedStack.Pop: T;
begin
  Dec(FLength);
  Result := FData[FLength];
  Finalize(FData[FLength]);
end;
What happens here is an assignment, which is great and all, but it would cause the 1 second sleep. What you can do instead with Initialize and Finalize is implement a clear followed by a Move:
function TPreallocatedStack.Pop: T;
begin
  Dec(FLength);
  Finalize(Result); // Clears the result
  Move(FData[FLength], Result, SizeOf(T)); // Move does not call the Copy operator
  FillChar(FData[FLength], SizeOf(T), 0); // Not necessary, but for robustness
end;
This clears the Result variable and then moves the data from the stack into Result without calling the Copy operator. The clearing is necessary because overwriting Result without going through the Copy operator would otherwise break the validity assumptions of the managed type (e.g. for a string, the reference count of the old value would not be decreased).
After the move, the data in FData is invalid and should not be used anymore, because the only valid version of the data is now in Result (without the Copy operator, no copy was created). To enforce this, FillChar overwrites the slot with a different value, but that is technically not necessary.
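To see the difference between an assignment and a Move in isolation, here is a small sketch that counts Copy operator calls instead of sleeping. The program name MoveDemo and the global CopyCount are made up for this illustration, and it assumes a compiler version with management operator support.

```pascal
program MoveDemo;

{$mode objfpc}{$H+}
{$modeswitch advancedrecords}

type
  TRec = record
    Data: Integer;
    class operator Copy(constref Source: TRec; var Dest: TRec);
  end;

var
  CopyCount: Integer = 0; // hypothetical counter, only for this demo

class operator TRec.Copy(constref Source: TRec; var Dest: TRec);
begin
  Inc(CopyCount); // stands in for the expensive work (the Sleep above)
  Dest.Data := Source.Data;
end;

var
  A, B, C: TRec;
begin
  A.Data := 42;

  // Ordinary assignment: goes through the Copy operator.
  B := A;

  // Clear and Move: the bytes are transferred without the Copy operator.
  Finalize(C);
  Move(A, C, SizeOf(TRec));
  // A is now "moved from"; overwrite it so it is not used as a second owner.
  FillChar(A, SizeOf(TRec), 0);

  WriteLn('B.Data = ', B.Data);
  WriteLn('C.Data = ', C.Data);
  WriteLn('Copy operator calls: ', CopyCount);
end.
```

For a plain Integer field the moved-from record is harmless either way; the zeroing matters once the record owns something like a buffer or a list.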
Such admittedly rather advanced programming techniques allow you to control managed behavior closely and, e.g. for performance reasons, avoid copies whenever possible (people familiar with C++ might recognize that this basically emulates std::move).
This is not that much of a concern right now (except when you really need that last bit of performance for your strings). But assuming that managed records will some day work, you might have list-type records where an assignment with := copies large lists with gigabytes of data. For example, take a hash set of lists, which is quite common: if every rehash copied all the lists element by element, this could cause long freezes and huge memory consumption for all the copies. This is actually how I discovered it myself: when managed records came to trunk, I built all kinds of data structures with them and wondered why everything was so slow.
So while right now it's just a bit of a curiosity, if (and when) managed records become popular, this might be necessary for ensuring compatibility with very complex copy mechanics.