Strange memory allocation problem
jollytall:
I have a difficult-to-debug error. My program generates points in 3D space, stored as 3x8-byte records in a dynamic array. I also have many much smaller arrays that are dynamically created, destroyed, resized, etc. Hundreds of them, but their total size is only 4 bytes per point: basically I have a fixed-size 3D array where each cell holds a dynamic array of Int32 values/indexes listing which points are in that cell. This makes it easy to find out where a certain point is, as well as which points lie in a certain segment of the space.
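In other words, the layout is roughly the following (a minimal sketch; the type names and grid bounds are made up for illustration, not taken from the actual program):
--- Code: ---type
  TPoint3D = record
    X, Y, Z: Double;   // 3 x 8 bytes per point
  end;

var
  Points: array of TPoint3D;   // the one big dynamic array
  // fixed-size 3D grid; each cell holds a small dynamic array with the
  // indexes (into Points) of the points lying in that cell
  Cells: array[0..99, 0..99, 0..99] of array of Int32;
--- End code ---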
Up to about a million points (it depends on many things) it works OK, but then I run out of memory. At that point I have, let's say, about 28MB allocated on a system with 12GB of memory. I printed GetHeapStatus.Allocated (with the heaptrc unit active) and it is indeed as planned; I also checked that there is no memory leak. Still, if I look in top/System Monitor I see a huge and rapidly growing memory usage after a certain point. When the total memory used (as seen by Linux, not by Pascal!) reaches the system capacity, the kernel kills the process with an out-of-memory error. In dmesg I see:
--- Code: ---[38281.389610] [ 5773] 1000 5773 2736272 2533780 22261760 193261 200 points
[38281.389613] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-55fa8107-2a93-429a-991e-dbdda3704dfa.scope,task=points,pid=5773,uid=1000
[38281.389636] Out of memory: Killed process 5773 (points) total-vm:10945088kB, anon-rss:10135116kB, file-rss:4kB, shmem-rss:0kB, UID:1000 pgtables:21740kB oom_score_adj:200
--- End code ---
My guess is that the array is continuously resized and moved around in memory, leaving a lot of holes. So I tried tricks, e.g. at the beginning I set the size of my points array to a very large number and then set it back to the starting size (100 points), from where it can grow again. I thought that with this trick the array would reserve the large space up front, so that when it grows it does not need to be moved around (for speed and memory efficiency). Unfortunately the used memory visibly jumps back down, so it probably does not keep the heap reserved. What is strange, though, is that with this large area reserved the problems start later: for some time the memory usage seen in top matches what heaptrc reports, and then it suddenly starts growing again.
I think I have a workaround (it needs a rewrite of the program): reserve the array at a very large size, keep it there, and use a "lastindex" variable to track how large it really is.
Still, it bothers me a lot. I guess it has something to do with the other, smaller dynamic arrays being allocated/resized. It could be that, for example, my large array occupies 100MB at some point. Then a small array is created for a cell of the grid, listing the points in it, and this small array happens to be placed immediately after the large array. Then the large array has to grow (once it gets larger than the actually reserved block; I know the reservation is normally somewhat more than the size last asked for), so it claims a new area. Then again a new small array is created (a new cell has points in it) and for some reason it is not placed in, say, the area that was just freed up where the large array used to be, but again right after the new large array. So when the large array has to grow yet again it takes another fresh area, the next small array again acts as a separator, and in the end I have many large holes in the heap with small blocking separators between them. Is that possible? But then why does Linux see the whole area as reserved for my program? Why are the small "separator" arrays not created inside one of the holes, so the large array could grow in place? Does this depend on Linux or on FPC? Is there a way to trace what is happening and to check whether my assumption is correct? Any other ideas?
jamie:
If all the cells have the same array sizes, don't use dynamic allocations for arrays like that; use static arrays within each.
This way you get larger chunks and less fragmentation of memory allocations.
Martin_fr:
A couple of rough workarounds/patches:
- Use a different memory manager, e.g. "uses cmem;" as the very first unit in your "program foo;" file (see the snippet after this list).
Memory managers may allocate larger chunks from the OS and redistribute them, so this change can make a difference.
Also, do not use heaptrc, since each allocation then gets an extra chunk to save the stack trace...
- You seem to be on Linux
man ulimit
You may be able to allow your app more memory.
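A minimal example of the first point (cmem ships with FPC; it must come before any other unit so it is installed before anything allocates from the default heap):
--- Code: ---program foo;

// cmem must be the very first unit in the uses clause, so the C memory
// manager (malloc/free) is installed before any allocation happens.
uses
  cmem;

var
  A: array of Int32;
begin
  SetLength(A, 1000000);   // now served by the C allocator
  WriteLn(Length(A));
end.
--- End code ---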
But the real deal may be to switch from an array to a list (either an existing class, or roll your own).
As you said, over-allocate the array a bit. Then keep it at that size until you know it no longer needs to grow (or, of course, grow it further as needed).
Then "Length(Arr)" is the capacity, and you need a second variable, "ArrCnt: Integer", that keeps track of how many elements you actually use.
That may save a lot of reallocs.
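A minimal sketch of that pattern (illustrative names, reusing the TPoint3D record from the first post):
--- Code: ---var
  Arr: array of TPoint3D;
  ArrCnt: Integer = 0;   // number of elements actually in use

procedure AddPoint(const P: TPoint3D);
begin
  // grow geometrically, so the number of reallocations stays logarithmic
  if ArrCnt = Length(Arr) then
    SetLength(Arr, Length(Arr) * 2 + 100);
  Arr[ArrCnt] := P;
  Inc(ArrCnt);
end;
--- End code ---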
Mind that resizing the inner dimension of a multi-dimensional array (or "array of array") means that each inner array is reallocated, so if you are unlucky you can end up with a big number of gaps.
Other structures (like a linked list) can be appended to without any re-alloc at all. But they have other downsides...
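For example, a singly linked list appends without moving any existing data, at the cost of an extra pointer per element and the loss of O(1) indexed access (again just a sketch):
--- Code: ---type
  PNode = ^TNode;
  TNode = record
    Value: Int32;
    Next: PNode;
  end;

// Appending allocates only the new node; nothing already stored moves.
procedure Append(var Head, Tail: PNode; AValue: Int32);
var
  N: PNode;
begin
  New(N);
  N^.Value := AValue;
  N^.Next := nil;
  if Head = nil then
    Head := N
  else
    Tail^.Next := N;
  Tail := N;
end;
--- End code ---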
Martin_fr:
Mind that with that amount of data, you may also want to google for optimized memory layouts.
If you work on the data, then items that you need to process in one go should ideally sit in a single cache line. That speeds up the processing.
Not your current problem, but maybe of interest.
jamie:
I've run into cases like this handling large amounts of small packets of data, and I ended up making my own managed memory handler for a single instance of a large chunk of memory.
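Roughly along these lines, presumably (a guessed-at sketch of the technique, not the actual code; the sizes are arbitrary):
--- Code: ---const
  ArenaSize = 64 * 1024 * 1024;   // one large 64 MB block
  SliceSize = 64;                 // fixed-size slices handed out from it

var
  Arena: Pointer = nil;
  Offset: PtrUInt = 0;

// Hand out slices from a single big allocation with a bump pointer, so
// the RTL heap never sees the many small allocations.
function AllocSlice: Pointer;
begin
  Result := nil;
  if Arena = nil then
    GetMem(Arena, ArenaSize);          // the single large allocation
  if Offset + SliceSize > ArenaSize then
    Exit;                              // exhausted; a real handler would chain arenas
  Result := Pointer(PtrUInt(Arena) + Offset);
  Inc(Offset, SliceSize);
end;
--- End code ---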
Jamie