really sad sigsegv in freemem. perhaps even tragic.

yogo1212:
Hi :-)

I'm experiencing weird problems with my game engine.

There is a SIGSEGV in FreeMem. Very simple code:

--- Code: ---GetMem(tmpbytes, len);
Move(Data[i * esize], tmpbytes[0], len);
Move(tmpbytes[0], Data[(i + 1) * esize], len);
Freemem(tmpbytes, len);
--- End code ---

It works about twenty times. Then, when the method is called from one particular resource-loader, it crashes.

The only change I made before compiling (it worked before that) was to wrap the stored data in another type (TVec3 -> TCol3; both are records with three floats, one with x, y, z and the other with r, g, b), and I can't see why that would cause the error to appear.

If the error occurred in one of the moves, I would probably start looking at my indices. But ...
You know...

I already checked that tmpbytes and len don't change, and really I am quite puzzled %)

This guy appears to have had the same error:
http://forum.lazarus.freepascal.org/index.php?topic=20403.0

Could one of you help find the cause?

Martin_fr:

--- Quote ---
--- Code: ---Move(tmpbytes[0], Data[(i + 1) * esize], len);
--- End code ---

--- End quote ---

Are you sure that there is enough space in data?

If you write behind the end of data, then that will cause a crash.

If you write behind the end of data and tmpbytes was allocated in memory right after data, then you get an error when freeing tmpbytes

yogo1212:

--- Quote from: Martin_fr on November 16, 2014, 04:26:44 pm ---Are you sure that there is enough space in data?
--- End quote ---
Hmm, maybe I should have posted the complete function:

--- Code: ---procedure TContinuousMemoryManager.Insert(i: cardinal);
var
  tmpbytes: PByte;
  len: cardinal;
begin
  len := (used - i) * esize;
  if used = capacity then
  begin
    Inc(capacity, bsize);
    tmpbytes := Data;
    GetMem(Data, capacity * esize);
    Move(tmpbytes[0], Data[0], i * esize);
    Move(tmpbytes[i * esize], Data[(i + 1) * esize], len);
    Freemem(tmpbytes);
  end
  else if len <> 0 then
  begin
    GetMem(tmpbytes, len);
    Move(Data[i * esize], tmpbytes[0], len);
    Move(tmpbytes[0], Data[(i + 1) * esize], len);
    Freemem(tmpbytes);
  end;
  Inc(used);
end;
--- End code ---


--- Quote ---If you write behind the end of data, then that will cause a crash.
--- End quote ---
'can cause'. but i can step past both moves.


--- Quote ---If you write behind the end of data and tmpbytes was allocated in memory right after data, then you get an error when freeing tmpbytes
--- End quote ---
that's a really nice idea! just give me a moment to check (though i'm sure data is big enough - because i wrote the code  :P ).


UPDATE:
tmpbytes: pbyte($00007FFFF7FDF260)  ' co'
Data: pbyte($00007FFFF7F812B0)  #16'ddk]'#182#168#11#16'ddk'#1
used: 9
capacity: 50
esize: 32
i: 0
len: 288

hmm, doesn't look like it :-(
is there a possibility that i accidentally destroyed fpc's internal allocation table?
is this worth a bug-report?
how do i debug fpc-internals?

Martin_fr:

--- Quote --- but i can step past both moves.
--- End quote ---

--- Quote ---is there a possibility that i accidentally destroyed fpc's internal allocation table?
--- End quote ---

If any of the moves writes outside boundaries of the allocated memory, then it may destroy other data.

The move will not crash, unless you write to memory not owned by your app (then the OS will trap it). Most times crossing the boundaries of one mem block, will keep you in mem owned by your app.

However, at some later time the memory you overwrote will be accessed, and then just anything can happen.

If at any move you happen to write into fpc internal structures, then the next, or second next, or maybe 10 or 20 get/freemem later is going to crash.

Since very likely after the allocated block there will be other mem managed by fpc, there will be an internal fpc structure. So if you write too much, then it is only a question of time until fpc accesses the node that was overwritten.



--- Quote ---is this worth a bug-report?
--- End quote ---
I highly doubt the bug is in free mem. But if it is, you will probably need better proof than your current code. (The error could be in some other procedure, if "move" is used elsewhere.)

Couple of things:

Use heaptrc -gh
It adds a few checks, like a checksum on freed mem. It does not, however, detect the line where the error happens. It will (if it detects your case) warn you at some later time.
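A minimal sketch of what that looks like in practice. This is a made-up test program (the name heaptrcdemo and the 16-byte block are only for illustration), and it assumes you build it with -gh (heaptrc) and -gl (line info in the backtrace). The overwrite is deliberate.

```pascal
program heaptrcdemo;
{$mode objfpc}

// Hypothetical demo, not the code from this thread.
// Build with: fpc -gh -gl heaptrcdemo.pas

var
  p: PByte;
begin
  GetMem(p, 16);
  FillChar(p^, 16, 0);
  // Deliberately write one byte past the end of the 16-byte block.
  p[16] := $FF;
  // With heaptrc active, the damage should be reported here (at FreeMem),
  // not on the line above where it was actually done.
  FreeMem(p);
end.
```

Without -gh the same program will most likely run through without any visible error, which is exactly why this kind of bug shows up so late.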


Add asserts. plenty of them.
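Something along these lines, as a rough sketch only. The globals Data, used, capacity and esize are hypothetical stand-ins for the fields of your class, OpenSlot is a made-up name, and the growing branch is left out. The values in the main block are the ones from your debugger output.

```pascal
program assertdemo;
{$mode objfpc}
{$ASSERTIONS ON}  // same effect as the -Sa command-line switch

var
  // Hypothetical stand-ins for the fields of the posted class.
  Data: PByte;
  used, capacity, esize: Cardinal;

// Open a slot at index i; assumes a free slot exists (no growing here).
procedure OpenSlot(i: Cardinal);
var
  tmpbytes: PByte;
  len: Cardinal;
begin
  Assert(Data <> nil, 'Data is not allocated');
  Assert(i <= used, 'index is beyond the used elements');
  Assert(used < capacity, 'no free slot left');
  len := (used - i) * esize;
  // The second Move writes bytes (i + 1) * esize .. (i + 1) * esize + len - 1.
  Assert(QWord(i + 1) * esize + len <= QWord(capacity) * esize,
    'Move would write past the end of Data');
  if len <> 0 then
  begin
    GetMem(tmpbytes, len);
    Move(Data[i * esize], tmpbytes[0], len);
    Move(tmpbytes[0], Data[(i + 1) * esize], len);
    FreeMem(tmpbytes);
  end;
  Inc(used);
end;

begin
  esize := 32;
  capacity := 50;
  used := 9;
  GetMem(Data, capacity * esize);
  FillChar(Data^, capacity * esize, 0);
  OpenSlot(0);
  WriteLn('used is now ', used);
  FreeMem(Data);
end.
```

A failed Assert stops the program at the offending call instead of some get/freemem later, and the checks disappear again when assertions are switched off.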


--- Code: ---len := (used - i) * esize;
--- End code ---
what if negative?
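To show what I mean, a small sketch with made-up values (i is deliberately larger than used). It assumes all variables are Cardinal, as in the posted procedure, and that range and overflow checks are off.

```pascal
program wrapdemo;
{$mode objfpc}
{$R-}{$Q-}  // range and overflow checks off

var
  used, i, esize, len: Cardinal;
begin
  used := 9;
  i := 10;      // deliberately out of range: i > used
  esize := 32;
  len := (used - i) * esize;
  // len cannot hold a negative value; it wraps around to a huge count,
  // which Move would then try to copy.
  WriteLn('len = ', len);
end.
```

The printed value is close to 4 GiB rather than negative. Compiling with range checking (-Cr, or {$R+}) should turn the assignment into a runtime error instead of a silent wrap.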


--- Code: ---records with three floats
--- End code ---
Just 3 floats, nothing else?

You are aware that if you use "move" on data, that contains ansistring or array, then you will be in for trouble too?

yogo1212:

--- Quote from: Martin_fr on November 16, 2014, 06:17:49 pm ---
The move will not crash, unless you write to memory not owned by your app (then the OS will trap it). Most times crossing the boundaries of one mem block, will keep you in mem owned by your app.

Since very likely after the allocated block there will be other mem managed by fpc, there will be an internal fpc structure. So if you write too much, then it is only a question of time until fpc accesses the node that was overwritten.
--- End quote ---

and i thought i was aware of all this stuff.. do internal and application memory also get mixed when not using cmem?

ok, i wrapped getmem, move and freemem in order to be sure i wasn't doing anything bad with illegal access:

--- Code: ---data getmem: 00007FFFF7F812B0 1600
// ....
tmpbytes getmem: 00007FFFF7EBD180 288
move 00007FFFF7F812B0 to 00007FFFF7EBD180: 288
move 00007FFFF7EBD180 to 00007FFFF7F812D0: 288
freemem 00007FFFF7EBD180

--- End code ---
EDIT:
maybe i should explain what the code does..
it opens up a slot at index i by moving all elements at and after i up by one (to i+1).
in this case, it shifts the whole data by 32 bytes. (B0->D0)
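For reference, a small sketch of the same shift done in place, without the temporary buffer. It relies on Move coping with overlapping source and destination, which as far as I know it does in FPC (same as in Delphi). The element size, capacity and fill values are made up for the demo.

```pascal
program shiftdemo;
{$mode objfpc}

const
  esize = 4;       // hypothetical element size, just for the demo
  capacity = 8;

var
  Data: PByte;
  used, i, k: Cardinal;
begin
  GetMem(Data, capacity * esize);
  used := 3;
  // Fill the three used elements with the byte values 1, 2 and 3.
  for k := 0 to used * esize - 1 do
    Data[k] := k div esize + 1;
  i := 0;
  // Source and destination overlap; Move is expected to handle that.
  Move(Data[i * esize], Data[(i + 1) * esize], (used - i) * esize);
  Inc(used);
  // Print the first byte of every element; slot i still holds its old contents.
  for k := 0 to used - 1 do
    Write(Data[k * esize], ' ');
  WriteLn;
  FreeMem(Data);
end.
```

This does not explain the crash, it only removes one GetMem/FreeMem pair per insert from the picture.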


--- Quote ---I highly doubt the bug is in free mem. But if it is, you will probably need better proof than your current code. (The error could be in some other procedure, if "move" is used elsewhere.)
--- End quote ---
i know that usually it's your own fault when something is broken, but i have seen errors in libraries, the windows-api or gnu-libc.
that's why i provided a link to a thread with some guy who also had segfaults in freemem (i couldn't spot the error in his code either)


--- Quote ---Use heaptrc -gh

--- End quote ---
i have no idea what that is but i've set up valgrind. the output is attached. (wtf ? 0x161ab4c701047125 !!)


--- Quote ---Add asserts. plenty of them.

--- End quote ---
naah, you're right there is hardly any debug-output in my code... i will have to think about doing that by default.
but so far, i haven't been able to reproduce the error reliably in smaller test-applications.


--- Quote ---
--- Code: ---len := (used - i) * esize;
--- End code ---
what if negative?

--- End quote ---
all types are cardinal. and i must always be <= used. i'm adding checks for big and absurd numbers anyway :-)


--- Quote ---Just 3 floats, nothing else?

--- End quote ---
nothing else. packed, even.


--- Quote ---You are aware that if you use "move" on data, that contains ansistring or array, then you will be in for trouble too?

--- End quote ---
that is the downside of having highly-integrated strings; but i can assure you, i always copy strings in arrays of char before getting nasty and make sure i pass the pointer to the first element to c-libraries, for instance.
