
Author Topic: Memory layout and management questions  (Read 1504 times)

K.IO.s

  • New Member
  • *
  • Posts: 11
Memory layout and management questions
« on: July 05, 2022, 10:45:42 pm »
Hi there, I'd like to know more about when the stack is used for memory and when the heap is used.

In C I'd assume that anything that wasn't malloc'd goes on the stack, but that doesn't seem to be the case with FreePascal.
One of my main doubts is about arrays: it seems that if you declare them at the file level (global), they are allocated on the heap. Is there any way to force them onto the stack?
The same question applies to ShortStrings, or fixed-length strings. Normally this type of thing is bundled in the executable; is that not the case with Pascal?

Another issue: I was reading about "dynamic arrays", and especially in the case of 'arrays of arrays', it seems that they don't have a contiguous memory layout, so the data is allocated all over the place with pointers everywhere.
In C++ the vector class is a dynamic array, but it keeps its layout linear, so you can access it like any regular fixed array. I guess there is no equivalent in FreePascal.

So for performance-oriented tasks, I'm still not sure whether I should use an array[a,b], or a 1D array with C-style index arithmetic to access values as if they were in a 2D space (as an example). This is to try to avoid cache misses.

I'm also trying to figure out how to handle the creation of custom allocators in pascal.
It seems that I need to look into the memory manager, but I haven't understood it completely yet. I'm not sure if I can have multiple allocation strategies in the same program, or only one global handler for everything (which would defeat much of the purpose of custom allocators).

If there is anything else I should know about the memory aspect of the language, please let me know.
And feel free to correct me with regards to anything, still trying to figure this out.

Thanks either way.

440bx

  • Hero Member
  • *****
  • Posts: 3946
Re: Memory layout and management questions
« Reply #1 on: July 06, 2022, 12:13:45 am »
In C i'd assume that anything that wasn't malloc'd goes into the stack, but it seems that is not the case with freePascal.
In some instances, FreePascal somewhat hides where memory is allocated.  For instance, classes are always allocated on the heap and that's not obvious just reading the code.

One of my main doubts was about arrays, it seems that if you declare them at the file level(global), they will be allocated on the heap, is there any way to force them to be allocated on the stack?
If you declare an array, or a variable of any type for that matter, at the _global_ level then it will be allocated in the program's data segment. In that case, it is neither on the heap nor the stack.  It's important to realize that the location _where_ a variable is declared affects where it will be located (data segment, stack, heap) and that in a significant number of cases the actual allocation requires "programmer intervention" (usually for dynamically sized structures/variables). To be clear, no heap allocation depends on where the variable is declared.  Heap allocations are always "manual" (even though the mechanism may be hidden).
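
A minimal sketch of the three cases, assuming FPC in objfpc mode. The names (GlobalTable, LocalTable, HeapTable, Work) are made up for illustration.

```pascal
program StorageDemo;
{$mode objfpc}{$H+}

var
  // Global: lives in the program's data segment (static storage), not on the heap.
  GlobalTable: array[0..255] of Integer;

procedure Work;
var
  // Local fixed-size array: lives in this call's stack frame.
  LocalTable: array[0..255] of Integer;
  // The pointer variable is on the stack; the block it points to is on the heap.
  HeapTable: PInteger;
begin
  LocalTable[0] := 1;
  HeapTable := GetMem(256 * SizeOf(Integer)); // explicit heap allocation
  try
    HeapTable^ := 2;
    WriteLn(GlobalTable[0] + LocalTable[0] + HeapTable^);
  finally
    FreeMem(HeapTable);                       // explicit release
  end;
end;

begin
  GlobalTable[0] := 39;
  Work;
end.
```

So to get a fixed-size array "on the stack", it is enough to declare it as a local variable of a procedure or function, keeping in mind that stack space is limited.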

...it seems that they don't have a contiguous memory layout, and thus the data is allocated all over the place with pointers everywhere.
No, the arrays the compiler builds always have contiguous elements, the compiler depends on that to access the array elements.  IOW, allocation isn't all over the place, linearity is always preserved.

So for performance oriented tasks, I'm still not sure if I should use an array[a,b], or a 1d array, and use the old C style syntax to access values as if they were in a 2d space.(as an example). This is for trying to avoid cache misses.
How an array is accessed is easy to change.  First thing is to get the program to work correctly then, once it works correctly, keep it working equally well but faster.  It's the Toyota way, first do it right, then, keep doing it right but faster.  They proved it's an effective way of getting good things done.
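
To show how small that change is, here is a sketch with both forms side by side, assuming objfpc mode. Rows, Cols, Fixed and Flat are illustrative names. A fixed array[a,b] is already one contiguous row-major block, and a single dynamic array indexed as r * Cols + c gives the same layout on the heap.

```pascal
program FlatGrid;
{$mode objfpc}{$H+}

const
  Rows = 4;
  Cols = 8;

var
  Fixed: array[0..Rows - 1, 0..Cols - 1] of Integer; // one contiguous block, row-major
  Flat: array of Integer;                            // one contiguous heap block
  r, c: Integer;
begin
  SetLength(Flat, Rows * Cols);
  for r := 0 to Rows - 1 do
    for c := 0 to Cols - 1 do
    begin
      Fixed[r, c] := r * 100 + c;
      Flat[r * Cols + c] := r * 100 + c; // manual row-major index
    end;
  WriteLn(Fixed[2, 3], ' ', Flat[2 * Cols + 3]);
end.
```

Keeping the column index in the inner loop, as above, walks memory sequentially in both versions, which is usually what matters for cache behaviour.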


It seems that I need to look into the memory manager, but I haven't understood it completely yet.
If you want performance and simplicity and, don't mind some extra work, manage the memory yourself.  You can get very significant performance gains if the custom memory management routines are a good match to the program's behavior.

HTH.

(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

K.IO.s

  • New Member
  • *
  • Posts: 11
Re: Memory layout and management questions
« Reply #2 on: July 06, 2022, 12:39:58 am »
If you want performance and simplicity and, don't mind some extra work, manage the memory yourself.  You can get very significant performance gains if the custom memory management routines are a good match to the program's behavior.

HTH.

Thanks for the extended reply, awesome.
So basically it works like in C, as expected. I read a bunch of posts in the forum that seemed to indicate otherwise. Maybe they were referring to older implementations.


I have a follow-up question about the memory manager part. If you are familiar with it, is it possible to get scoped memory management with the MM Pascal has, or something of that nature?

C++17 introduced pmr allocators, which allow mixing and matching allocation strategies; I'd be looking for something similar.
For example, using a bump allocator in one part of the code, a pool in another, and a general free list for the rest.
I'm still trying to figure out if this can be managed in freePascal.

Thanks again.

K.IO.s

  • New Member
  • *
  • Posts: 11
Re: Memory layout and management questions
« Reply #3 on: July 06, 2022, 01:10:06 am »
For the sake of reference, here is one example of posts that I saw that seemed to indicate arrays were not contiguous in freePascal: https://forum.lazarus.freepascal.org/index.php/topic,39739.msg273701.html#msg273701

And here for an example saying arrays are auto allocated on the heap:
https://forum.lazarus.freepascal.org/index.php/topic,34989.msg230535.html#msg230535

There were others where people explicitly stated such things. That's what got me confused in the first place. Why would the language have such obscure design choices?

440bx

  • Hero Member
  • *****
  • Posts: 3946
Re: Memory layout and management questions
« Reply #4 on: July 06, 2022, 02:37:32 am »
Thanks for the extended reply, awesome.
You're welcome.

So basically it works like in C as expected.
It pretty much works as it does in C but there are some occasional differences/peculiarities.  For instance, FPC will always allocate classes on the heap.  In C++ it is possible to control where a class resides in memory.  FP allows a programmer to control where an _object_ (not a class but an old-style object) or advanced record (inheritance-less "object") can reside in memory.
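
A small sketch of that difference, assuming objfpc mode. TPointObj, TPointClass and Demo are hypothetical names used only for illustration.

```pascal
program WhereInstancesLive;
{$mode objfpc}{$H+}

type
  TPointObj = object   // old-style object: a value type, stored wherever it is declared
    X, Y: Integer;
  end;

  TPointClass = class  // class: the instance is always created on the heap
    X, Y: Integer;
  end;

procedure Demo;
var
  O: TPointObj;    // the whole instance sits in this stack frame
  C: TPointClass;  // only a reference sits here; Create allocates the instance
begin
  O.X := 1;
  C := TPointClass.Create;
  try
    C.X := 2;
    // The class variable is pointer-sized regardless of how many fields it has.
    WriteLn(SizeOf(O), ' ', SizeOf(C) = SizeOf(Pointer));
  finally
    C.Free;
  end;
end;

begin
  Demo;
end.
```

The same TPointObj declared as a global variable would end up in the data segment instead, which is the kind of control a class variable does not offer.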


I read a bunch of posts in the forum that seemed to indicate otherwise. Maybe they are referring to older implementations.
Some people have misconceptions about what the compiler does and/or how it does it.  When in doubt, I read the assembly code; that always tells you what reality is.

is it possible to get scoped memory management with the MM pascal has, or something of that nature?
Local variables are inherently scoped.  They exist only until the function/procedure in which they are declared exits.  Variables (as well as types and functions) declared in a unit's interface are in the global scope (data segment in the case of variables) but, they are only accessible by code that declares they use the unit. IOW, what's scoped is not the existence of the variable but, its visibility.

In c++17 we had pmr allocators, which allowed for a mix and matching of allocation strategies, i'd be looking for something similar.
I cannot answer that question.  I stay away from all the OOP stuff.  Except in throw away programs, I always do my own memory management.   It's more work but, much more efficient and powerful.

I'm still trying to figure out if this can be managed in freePascal.
If you're willing to do your own memory management, you can do in FreePascal anything you can do in C.  As for what memory management FP can do automatically for you, I cannot provide a good answer because, as I stated before, I deliberately stay away from that stuff.
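
As one illustration of doing it by hand, here is a minimal bump (arena) allocator sketch, assuming objfpc mode. TArena, ArenaInit, ArenaAlloc and ArenaDone are hypothetical names, not part of the RTL, and the sketch has no growth and no thread safety.

```pascal
program ArenaSketch;
{$mode objfpc}{$H+}

type
  TArena = record
    Base: PByte;    // start of the backing block
    Size: PtrUInt;  // capacity in bytes
    Used: PtrUInt;  // bytes handed out so far
  end;

procedure ArenaInit(out A: TArena; Capacity: PtrUInt);
begin
  A.Base := GetMem(Capacity); // one allocation from the default heap
  A.Size := Capacity;
  A.Used := 0;
end;

function ArenaAlloc(var A: TArena; Bytes: PtrUInt): Pointer;
const
  Align: PtrUInt = 16;
begin
  // Round the request up so every offset from Base is a multiple of 16.
  Bytes := (Bytes + Align - 1) and not (Align - 1);
  if Bytes > A.Size - A.Used then
    Exit(nil);                // arena exhausted
  Result := A.Base + A.Used;  // bump the offset, nothing else to do
  Inc(A.Used, Bytes);
end;

procedure ArenaDone(var A: TArena);
begin
  FreeMem(A.Base);            // releases every arena allocation at once
  A.Base := nil;
  A.Size := 0;
  A.Used := 0;
end;

var
  Arena: TArena;
  P, Q: PInteger;
begin
  ArenaInit(Arena, 1024);
  P := ArenaAlloc(Arena, SizeOf(Integer));
  Q := ArenaAlloc(Arena, SizeOf(Integer));
  P^ := 40;
  Q^ := 2;
  WriteLn(P^ + Q^, ' ', Arena.Used);
  ArenaDone(Arena);
end.
```

Because each arena is just a record, several of them (or a pool, or the default heap) can coexist in one program, each used by the code that owns it.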

Thanks again.
You're welcome again. :)


K.IO.s

  • New Member
  • *
  • Posts: 11
Re: Memory layout and management questions
« Reply #5 on: July 06, 2022, 03:28:43 am »
Thanks once more, last reply cleared a few more things.

With regards to the MemoryManager, I was referring to this guy:
https://www.freepascal.org/docs-html/prog/progsu174.html

It seems to be the paradigm I'm looking for. Currently I'm thinking that at the beginning of the program I can get the standard MemMan and save it in a global variable.
Then upon arriving at my unit, or in a routine that I believe requires a bump allocator, I'll set the bumpManager, and at the end of the routine set the main one back.

The real problem will come with multiple threads. I'm guessing SetMemoryManager isn't thread-local or unit-scoped, so I could end up with two different implementations alternating, which would be ugly. In a single-threaded program with predictable control flow, it should be less of a problem.
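
A rough sketch of that save/swap/restore idea, assuming objfpc mode and the GetMemoryManager/SetMemoryManager routines from the System unit. Instead of a real bump allocator it installs a hypothetical CountingGetMem hook that forwards to the saved manager, which keeps the example safe to run. The swap affects the whole process, and the counter is not thread-safe.

```pascal
program CountingMM;
{$mode objfpc}{$H+}

var
  OldMM, NewMM: TMemoryManager;
  AllocCount: PtrUInt = 0;

// Hypothetical hook: counts the call, then forwards to the saved manager.
function CountingGetMem(Size: PtrUInt): Pointer;
begin
  Inc(AllocCount);
  Result := OldMM.GetMem(Size);
end;

var
  P: Pointer;
begin
  GetMemoryManager(OldMM);          // save the standard manager
  NewMM := OldMM;                   // start from a full copy of its callbacks
  NewMM.GetMem := @CountingGetMem;  // replace only the one we care about
  SetMemoryManager(NewMM);          // from here on, every GetMem goes through the hook

  P := GetMem(64);
  FreeMem(P);                       // still handled by the original FreeMem

  SetMemoryManager(OldMM);          // restore before leaving the "scoped" region
  WriteLn('GetMem calls seen: ', AllocCount);
end.
```

With a manager that really owns its own memory, any block still alive when the old manager is restored would later be freed by the wrong one, so the scoped region has to release or abandon everything it allocated.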

Ideally, as with pmr allocators (in a sense¹), the MemoryManager (the obj) in use should be fully localized, or some option in that regard should be there. It's just calls to the heap anyway.
Especially for people who do a lot of manual memory management, this type of thing can greatly streamline the process.

If you have any other hints to add let me know.

Note¹:
For people who aren't aware, in C++17 types were extended so that you can pass an allocator to them, and therefore control the allocation strategy of each type and section of the code more closely and easily. In games, databases and other high-performance software that's useful.
The current FreePascal solution with the MemoryManager obj doesn't seem to be as fine-grained, in that it seems to set one object for global use. It's the closest thing I've found.

440bx

  • Hero Member
  • *****
  • Posts: 3946
Re: Memory layout and management questions
« Reply #6 on: July 06, 2022, 05:08:42 am »
Thanks once more, last reply cleared a few more things.
You're welcome.

With regards to the MemoryManager, i was referring to this guy:
https://www.freepascal.org/docs-html/prog/progsu174.html
I take that as meaning you'd like to have some assistance from the compiler to simplify/streamline memory management for you.  That's quite reasonable, it's what the majority of programmers go for but, there is a price to pay for that convenience.  (more on that follows.)

The real problem will come with multiple threads.
And it can easily become a problem in a single threaded program.  The real problem is that, in any somewhat complex program, even a single thread one, depending on what some parts of the code need, memory should be managed a certain way.  This leads to the optimal ways being different for different parts of the code.  Just about any method that attempts to cater to all different needs is going to fall short in one way or another.

If you have any other hints to add let me know.
I'll tell you what I do but you'll need to "customize" it for your program's needs.  First, I rarely allocate heap memory.  If a heap is needed by some part of the program, that part of the program is then responsible for creating its own heap (HeapCreate in Windows) and destroying it when it is no longer needed.  That way, no critical sections are needed to read/write the heap because the heap is private to that piece of code.  Very often, there is no need to individually free blocks; instead, everything is freed at once when the heap is destroyed.  No memory leaks that way.

More often than not, my programs have a number of VirtualAlloc-ed memory blocks, each of which is written to by, ideally, only one easily identifiable block of code.  Also, quite often a program needs to make some data available, for reading only, to other parts of the program. That's another block (the program's core constant data) and, once set up, it becomes _read_only_ (set it up and freeze it; VirtualProtect it in Windows). If "foreign" parts of the program need to write to it (something to be avoided but sometimes either unavoidable or avoiding it would significantly complicate the code) then that part of the code (usually a unit) provides specific interfaces to carry out those specific actions (nothing general, everything very specific, to prevent undesirable side-effects).

The above makes it obvious that I don't rely in any way on the compiler to do memory management.  It's more work on my part but, the performance, maintainability and debug-ability of the code is vastly better than what could be had using the compiler's facilities (and by compiler, in this case, I mean, _any_ compiler, not just FPC.)

I realize that most programmers don't want to go through that much additional work and will choose to have the assistance of the compiler.

MarkMLl

  • Hero Member
  • *****
  • Posts: 6676
Re: Memory layout and management questions
« Reply #7 on: July 06, 2022, 09:13:49 am »
For the sake of reference, here is one example of posts that I saw that seemed to indicate arrays were not contiguous in freePascal: https://forum.lazarus.freepascal.org/index.php/topic,39739.msg273701.html#msg273701

And here for an example saying arrays are auto allocated on the heap:
https://forum.lazarus.freepascal.org/index.php/topic,34989.msg230535.html#msg230535

There were others where people explicitly stated such things. That's what got me confused in the first place. Why would the language have such obscure design choices.

Those are specifically referring to dynamic arrays of two or more dimensions.

The compiler and runtime handle strings and arrays where the size is not specified in the declaration, as well as instantiated classes, specially: a small block (call it a descriptor if you insist) is inserted at the expected place in the stack or global memory, which should by and large be treated as an opaque structure with pointers to the heap. The detail of this is hidden; there is no need to use a pointer-dereference operator (^).

In the case of strings and dynamic arrays (i.e. those without a declared length), there is an implicit finalisation block in the local function which reference-counts the allocation and frees the heap storage when appropriate. The result of this is that strings and dynamic arrays may be safely returned as function results.

If the data associated with a string or a single-dimensioned dynamic array is to be passed to e.g. an OS-level API, then it must be assumed to start at an indexed element, i.e. somestring[1] or somearray[0]. I believe it is reasonable to assume the data is contiguous, but it might not be safe to assume it is safe from relocation.
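
A short sketch of the "start at an indexed element" rule, assuming objfpc mode, with the RTL's Move standing in for an OS-level call that takes a raw buffer. Src, Dst and S are illustrative names.

```pascal
program RawAccess;
{$mode objfpc}{$H+}

var
  Src, Dst: array of Byte;
  S: AnsiString;
  i: Integer;
begin
  SetLength(Src, 4);
  for i := 0 to High(Src) do
    Src[i] := i + 1;
  SetLength(Dst, Length(Src));

  // Pass the first element, not the array variable: the variable is only
  // a reference to the heap block that holds the elements.
  Move(Src[0], Dst[0], Length(Src) * SizeOf(Src[0]));

  S := 'abc';
  // Strings are 1-based, so the character data starts at S[1].
  WriteLn(Dst[3], ' ', PAnsiChar(@S[1])^);
end.
```

Indexing element 0 (or 1 for a string) is only valid when the length is non-zero, so real code should check Length first. A later SetLength may move the data, so raw addresses should not be kept across it.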

However, a dynamic array containing a further level of dynamic arrays or strings will actually contain opaque control blocks, which themselves point into the heap. The complexity of dereferencing, destruction and where necessary reallocation is (normally) hidden from the user.

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: Memory layout and management questions
« Reply #8 on: July 06, 2022, 01:37:06 pm »
...it seems that they don't have a contiguous memory layout, and thus the data is allocated all over the place with pointers everywhere.
No, the arrays the compiler builds always have contiguous elements, the compiler depends on that to access the array elements.  IOW, allocation isn't all over the place, linearity is always preserved.

The question was in the context of array of array of Type. In this case each element of the outer array is in fact a pointer to another dynamic array (and each sub array can have a different length). This is different from a static multidimensional array which is indeed contiguous.
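
A minimal sketch of the two cases, assuming objfpc mode. Jagged and Fixed are illustrative names.

```pascal
program JaggedVsStatic;
{$mode objfpc}{$H+}

var
  Jagged: array of array of Integer;
  Fixed: array[0..2, 0..3] of Integer;
begin
  SetLength(Jagged, 3);     // outer array: three references
  SetLength(Jagged[0], 2);  // each row is its own heap block...
  SetLength(Jagged[1], 5);  // ...and may have its own length
  SetLength(Jagged[2], 1);
  WriteLn(Length(Jagged[0]), ' ', Length(Jagged[1]), ' ', Length(Jagged[2]));

  // Static 2D array: rows are adjacent, so row 1 starts right after row 0.
  WriteLn(PtrUInt(@Fixed[1, 0]) - PtrUInt(@Fixed[0, 0]) = 4 * SizeOf(Integer));
end.
```

Calling SetLength(Jagged, 3, 4) gives every row the same length in one step, but each row is still a separate dynamic array.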

440bx

  • Hero Member
  • *****
  • Posts: 3946
Re: Memory layout and management questions
« Reply #9 on: July 06, 2022, 02:01:01 pm »
The question was in the context of array of array of Type. In this case each element of the outer array is in fact a pointer to another dynamic array (and each sub array can have a different length). This is different from a static multidimensional array which is indeed contiguous.
Yes, you are right.  Each array is dynamically dimensioned individually which means they each are in separate dynamically allocated blocks and each array can have a different number of elements.

Thank you for making it clear.  I'm sure the OP appreciates it too.



K.IO.s

  • New Member
  • *
  • Posts: 11
Re: Memory layout and management questions
« Reply #10 on: July 06, 2022, 04:00:00 pm »
Just to be clear about this, taking the 2d array example.
The top array is a pointer, which points to a series of bottom arrays, which are also pointers to a memory location.
This memory location, which contains the actual data, is contiguous, correct?
I mean per section, of course. So array[0][0...N] would be a contiguous segment, just as array[1][0...N] would, etc.

SymbolicFrank

  • Hero Member
  • *****
  • Posts: 1313
Re: Memory layout and management questions
« Reply #11 on: July 06, 2022, 04:16:25 pm »
Yes. SetLength copies the array if needed.

Thaddy

  • Hero Member
  • *****
  • Posts: 14205
  • Probably until I exterminate Putin.
Re: Memory layout and management questions
« Reply #12 on: July 06, 2022, 04:20:56 pm »
It only copies when lengthened, not when shortened.
Specialize a type, not a var.

MarkMLl

  • Hero Member
  • *****
  • Posts: 6676
Re: Memory layout and management questions
« Reply #13 on: July 06, 2022, 05:02:41 pm »
Just to be clear about this, taking the 2d array example.
The top array is a pointer, which points to a series of bottom arrays, that are also pointers to a memory location.
This memory location that contains the actual data, is contiguous correct?
I mean per section of course. So array[0][0...N] would be a contiguous segment, just as array[1][0...N] would, etc.

I would suggest that "opaque reference" is more appropriate than "pointer", and to leave considerations of dereferencing, reallocation etc. to the compiler and runtimes.

However I believe your conclusion is broadly correct: the data in an array is contiguous, rather than being allocated as e.g. a tree of system-sized pages.

MarkMLl

K.IO.s

  • New Member
  • *
  • Posts: 11
Re: Memory layout and management questions
« Reply #14 on: July 06, 2022, 05:14:14 pm »
... rather than being allocated as e.g. a tree of system-sized pages.

Exactly! :D

Thank you guys for all the replies, this clears up a lot. For a moment I thought Pascal had strange defaults, but it seems to be as expected.

 
