Author Topic: [SOLVED] Assembly Debug Code to identify code bug?  (Read 1607 times)

Gizmo

  • Hero Member
  • *****
  • Posts: 831
[SOLVED] Assembly Debug Code to identify code bug?
« on: July 02, 2020, 09:38:17 am »
Hi folks

I have been battling for two days trying to work out something and hoping you guys can help me out.

Basically, I am creating a DLL that serves as a "plugin" to another tool, so the DLL makes various API calls to that tool. When it runs in the tool, it looks at various files presented by that tool, and for almost all of them it works fine. But there is one file, which looks no different to any of the others, that causes the program to crash with run-time error 216 (general protection fault). I've checked with the developer and he has confirmed it is my plugin, judging by the memory address of the faulting assembler instruction.

The problem is basically this: the DLL calls a handful of functions. It gets to the end of the main one (so it processes the file in question), as in, it gets to 'end;' of the function. It is then supposed to go to the final function. But it does not start it. So something happens between the end of the main function and the start of the next, but I cannot identify what. All I have is this debugger assembler code, which I don't understand. I was hoping one of you might be able to give me a clue by explaining what the highlighted line means (I think it is trying to copy some data to another memory address), in the hope that it MIGHT help me work out the problem.

Many thanks
« Last Edit: July 02, 2020, 04:37:27 pm by Gizmo »

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9755
  • Debugger - SynEdit - and more
    • wiki
Re: Assembly Debug Code to identify code bug?
« Reply #1 on: July 02, 2020, 01:42:38 pm »
There is not really much in your description.....

As it is your DLL, you should be able to compile your dll with debug info.
SysGetMem is in fpc => so you would need an RTL compiled with debug info...
However any callback to the main program would be without info.

I assume the "main program" (which is from a 3rd party?) loads your dll and calls a function in your dll? The main prog also passes some sort of handle, which allows you to make calls back into parts of the main prog? At least that is what I read from your text.


Quote
It gets to the end of the main one (so it processes the file in question), as in, it gets to 'end;' of the function. It is then supposed to go to the final function.
I really can't figure out what you mean?

What/Who is "getting to the end" ?
- Code in your dll
- Code in the main prog?


"go to the final function" means
- call / jump?
- return (asm "ret")?

Your asm is neither on a call, nor a ret instruction...

Do you have a callstack, and does the callstack look correct to you? I.e. is GetMem called from code that should call it?

One possibility that you might have to consider, is that the code "getting to the end" actually got there. That it did already process the "ret" instruction.
But, if something did write to memory out of range, the stack may have been corrupted. The return address could have been overwritten, and the code "ret"-urned to a random location. (just one of many possibilities)

Ideal would be if you could reach a breakpoint before the crash, and then step a few instructions to the crash.
After a SigSegv all the data, including asm, may be wrong....

ccrause

  • Hero Member
  • *****
  • Posts: 843
Re: Assembly Debug Code to identify code bug?
« Reply #2 on: July 02, 2020, 01:48:32 pm »
The error suggests that you (or rather GetMem) are trying to access invalid memory (see Run-time errors). Perhaps there is a managed type that you are returning as the result (an uninitialized string or array)? A backtrace may show more context around the problem.

The wiki article on shared libraries has a couple of pointers for possible issues and how to avoid them.

Gizmo

Re: Assembly Debug Code to identify code bug?
« Reply #3 on: July 02, 2020, 03:10:32 pm »
Hi guys

Sorry for not being clear. You are right Martin, in that the "caller" of the DLL is a 3rd party proprietary software tool. The API allows developers to create compiled DLLs that can be executed from within that tool. Using the Lazarus IDE Run Parameters, I can debug the DLL while it is being executed by the calling tool.

The specific assembler line that keeps being reported time and time again, and which seems to be part of "SYSTEM_$$_CONCAT_TWO_BLOCKS$PMEMCHUNK_VAR$PMEMCHUNK_VAR" (whatever that is) is : 

Code:
00000001100134B9 48895020                 mov    %rdx,0x20(%rax)

The memory address of my DLL is 110000000, so 1100134B9 is part of my DLL.

The return value of the function is just an integer, and is zero. No managed types being returned.

I have tried clearing arrays and strings etc at the end of the function in case it was one of those to no avail either.

What is quite odd is that when it gets to the final 'end;' of the main function, rather than jumping to the next function (which the 3rd party tool calls next) the debugger seems to go back to the "begin", and the next step that I step into causes the run-time error 216, but no obvious code is executed. It just goes "begin - corrupt". I don't know why it is returning to the begin though.

ccrause

Re: Assembly Debug Code to identify code bug?
« Reply #4 on: July 02, 2020, 03:52:40 pm »
Quote
The specific assembler line that keeps being reported time and time again, and which seems to be part of "SYSTEM_$$_CONCAT_TWO_BLOCKS$PMEMCHUNK_VAR$PMEMCHUNK_VAR" (whatever that is) is
That points to some problem with heap memory. My guess is that you are using New/Free or creating and freeing classes, but you have to show a bit of your code, because this guessing game can go on for a long time.

Quote
What is quite odd is that when it gets to the final 'end;' of the main function, rather than jumping to the next function (which the 3rd party tool calls next) the debugger seems to go back to the "begin", and the next step that I step into causes the run-time error 216, but no obvious code is executed. It just goes "begin - corrupt". I don't know why it is returning to the begin though.
Jumping to begin when stepping through your code could be a simple issue with how the compiler optimized your code and presented the line information in the debug information. Martin will probably have a better explanation for this behaviour.

Also, it may be insightful to move over to the Assembler view and single-step through the code that gets executed after the last line of your code. My guess is there is something in your code that invalidates some of the assumptions the RTL is relying on.

Martin_fr

Re: Assembly Debug Code to identify code bug?
« Reply #5 on: July 02, 2020, 04:16:22 pm »

Quote
The specific assembler line that keeps being reported time and time again, and which seems to be part of "SYSTEM_$$_CONCAT_TWO_BLOCKS$PMEMCHUNK_VAR$PMEMCHUNK_VAR" (whatever that is) is :
It is part of the heap manager (memory alloc/dealloc).

FPC gets big chunks of memory from the OS (the OS may only hand out big blocks). And it sub-allocates them to your code. When you free them they must be returned to the internal pool (may be deferred to a later point, maybe a later getmem).
For that each block has a few bytes of internal info. (i.e. in front of the address returned by GetMem, are a few bytes that fpc uses).

And that is bad news.
The error indicates that those internal bytes got overwritten by your code.
Somewhere you wrote outside the memory that your code owned.
E.g. SomeArray[lowbound-1] :=
or  SomeFreedObject_NowPointingToFpcInternalStructs.field :=

And the really bad news: there is no easy way of telling where....

You do have range checks enabled?
You could try -gt with one to four "t"s (-gt up to -gtttt), in case it's an uninitialized var.
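To make the range-check question concrete, here is a minimal sketch of what {$R+} (the -Cr compiler switch) is expected to do with a write one element past the end of a dynamic array. The names TByteBuffer and WriteIsRejected are hypothetical and only serve the illustration. It assumes SysUtils is in the uses clause, so that the range error arrives as an ERangeError exception.

```pascal
program RangeCheckSketch;
{$mode objfpc}{$H+}
{$R+}  // range checking on (same effect as compiling with -Cr)

uses
  SysUtils;  // turns the range-check run-time error into ERangeError

type
  TByteBuffer = array of Byte;  // hypothetical name for this sketch

// Hypothetical helper: tries to write one byte at Index and reports
// whether the range check stopped the write.
function WriteIsRejected(var Buf: TByteBuffer; Index: Integer): Boolean;
begin
  Result := False;
  try
    Buf[Index] := $FF;
  except
    on E: ERangeError do
      Result := True;
  end;
end;

var
  Buf: TByteBuffer;
begin
  SetLength(Buf, 16);  // valid indexes are 0..15
  WriteLn('index 15 rejected: ', WriteIsRejected(Buf, 15));
  WriteLn('index 16 rejected: ', WriteIsRejected(Buf, 16));
end.
```

Without range checking, the second write would silently land outside the buffer, which is the kind of stray write that can damage the heap manager's bookkeeping. Note that range checks only cover indexed access; writes through raw pointers (Move, FillChar, API calls filling a buffer) are not checked.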

If on linux, you could try valgrind --tool=memcheck (compile with -gv / for valgrind).
I have never tried a dll in valgrind. No idea how good the results will be. But valgrind is the best tool to find that kind of error.

Otherwise you can use heaptrc with "keepreleased": set the environment variable HEAPTRC="keepreleased".
But that also relies on the same internal structs, so it's a coin toss whether it will help or not....
It may work, if it is a dangling pointer (a freed object).
In that case it tells you where the object was first allocated. You then have to figure out: Where was it freed, and where was it accessed after being freed. (breakpoints in dedicated destroy // and watchpoints may help....)

As a really last resort (I would not go down that route, until I tried all else multiple times)
If you have a reproducible case, and the address which is accessed by " mov    %rdx,0x20(%rax)" is always the same, then you can set a watchpoint on it: ^byte($0123456)^ (or similar; you may need to play around a bit, it's a long time since....)
That will trigger whenever that address is accessed.
Try to set the watchpoint as late as possible. Memory is re-used, and so that will trigger hundreds of times before it gets real (you can use keepreleased to minimize this).
Set the watchpoint to take a snapshot, and to NOT break. Then after the crash go to the debug-history window....
Mind that this sounds hard enough in theory.... It will take countless attempts in practice to get it right....


Quote
What is quite odd is that when it gets to the final 'end;' of the main function, rather than jumping to the next function (which the 3rd party tool calls next) the debugger seems to go back to the "begin", and the next step that I step into causes the run-time error 216, but no obvious code is executed. It just goes "begin - corrupt". I don't know why it is returning to the begin though.

That is "normal".

The "begin" and "end" lines contain compiler-generated code, such as stack management, dealing with managed types, and local vars that need initialization.
FPC often writes debug info so that some code that actually belongs to ending the procedure appears to be in the "begin" line. That is why you see the jump.
« Last Edit: July 02, 2020, 04:21:52 pm by Martin_fr »

Gizmo

Re: Assembly Debug Code to identify code bug?
« Reply #6 on: July 02, 2020, 04:37:16 pm »
Gents

Thank you so much for this epic tutelage in memory management etc. This level of debugging is pretty new to me, and it is helpful to know.

Your explanations gave me a clue, and that was around "amounts of memory", basically. So I went through all my code again and noticed that I use SetLength at the start of the function to set the size of a buffer that depends on the size of the given file. Then later on, the handle I initiate to that file gets released, and a second handle is requested of the same file but with different flags. Those different flags give me a "different view" of the same file. That different view means a different size. I then noticed that I do not have a second call to SetLength, so the buffer is still the same size as the first call. As soon as I added a second SetLength to match the size of the second handle call, it worked!
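For anyone hitting the same thing, a minimal sketch of the pattern described above. The host tool's API is not shown in this thread, so ViewSize and ReadView are hypothetical stand-ins, and the sizes are made up. The point is only that the buffer is resized again before the second, larger view is read into it.

```pascal
program ResizeBufferSketch;
{$mode objfpc}{$H+}

type
  TByteBuffer = array of Byte;  // hypothetical name for this sketch

// Hypothetical stand-in for the host tool's API: the size reported for
// the same file depends on the flags the handle was opened with.
function ViewSize(Flags: Integer): Integer;
begin
  if Flags = 0 then
    Result := 1024
  else
    Result := 4096;
end;

// Hypothetical stand-in for a read call that fills Count bytes of Buf.
procedure ReadView(var Buf: TByteBuffer; Count: Integer);
begin
  // Refuse to write past the end of the buffer instead of corrupting the heap.
  if Count > Length(Buf) then
  begin
    WriteLn('buffer too small: ', Length(Buf), ' < ', Count);
    Exit;
  end;
  if Count > 0 then
    FillChar(Buf[0], Count, $AA);
end;

var
  Buf: TByteBuffer;
  Size: Integer;
begin
  // First view of the file.
  Size := ViewSize(0);
  SetLength(Buf, Size);
  ReadView(Buf, Size);

  // Second view of the same file, opened with different flags.
  Size := ViewSize(1);
  SetLength(Buf, Size);  // the call that was missing in the crashing version
  ReadView(Buf, Size);
  WriteLn('second view read into ', Length(Buf), ' bytes');
end.
```

If the second SetLength is left out, a real read call that trusts the caller's size would write beyond the 1024-byte buffer, which matches the heap corruption discussed earlier in the thread.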

So there we are - we got there in the end. Thanks for the clues guys...I only spent two days on this but your clues helped me solve it within a few hours...

