Author Topic: CPU-View  (Read 10309 times)

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 10996
  • Debugger - SynEdit - and more
    • wiki
Re: CPU-View
« Reply #60 on: February 25, 2025, 01:52:32 pm »
I found an install of mine where the stack for a gtk mouse click did not resolve very far (a different trace than yours, though).

I added some more logic to the asm unwinder. Not yet live, needs some testing.
https://gitlab.com/martin_frb/lazarus/-/compare/main...fpdebug-unwind-asm-push?from_project_id=28419588

The 2nd commit may be extended to store more known registers if they get pushed.

It still has some issues, but gets a bit further now.

There is also a commit that adds debugln output:
https://gitlab.com/martin_frb/lazarus/-/commit/6187969408491c3d963f5887c6995cc08a215265
(It needs to be rebased onto the latest amend.)

Martin_fr

Re: CPU-View
« Reply #61 on: February 25, 2025, 06:19:37 pm »
I actually found a bug in the existing code. Changes + fix: https://gitlab.com/martin_frb/lazarus/-/compare/main...fpdebug-unwind-asm-push-2?from_project_id=28419588

Still testing ...

Martin_fr

Re: CPU-View
« Reply #62 on: February 25, 2025, 09:46:05 pm »
Well, ... about multiple breakpoints.

There is a tiny danger in it. (One that has actually happened with gdb, because older FPC versions wrote incorrect info.)

If you have a breakpoint at the wrong address, then the app may crash.
- Imagine you have an asm statement at 0x223308. That statement is 3 bytes long.
- On the stack you find the address 0x223309.
- Now you may check that the asm at this address is a "call" statement, but the bytes starting at the 2nd byte of the real statement may just happen to decode as a "call".

So you are setting a breakpoint in the middle of a statement. If the code then reaches that statement, it does not see a breakpoint, but instead executes a modified statement. That is most likely to cause trouble.

Actually, if you use that only to return to "user code", i.e. where you have debug info, then you can disassemble from the start of the function and check that the address is the start of a real asm instruction. Since you have line info, you only need to disassemble one line's worth of data. (That can still be a lot...)
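
The boundary check described above can be sketched as follows. This is a minimal Python illustration, not FpDebug or CPU-View code: the length table is a toy stand-in for a real disassembler, and the names `instruction_starts` and `is_instruction_start` are made up for the example. The demo bytes encode `sub eax, 1` (83 E8 01), a 3-byte instruction whose 2nd byte equals the `call rel32` opcode.

```python
# Toy length table keyed on the first opcode byte (an assumption for this
# sketch; a real x86 decoder must handle prefixes, ModRM, SIB, and more).
INSTRUCTION_LENGTHS = {
    0x90: 1,  # nop
    0xC3: 1,  # ret
    0xE8: 5,  # call rel32
    0x83: 3,  # group-1 ALU op on a register with imm8, e.g. "sub eax, 1"
}


def instruction_starts(code, base):
    """Walk linearly from a known-good start and collect instruction starts."""
    starts = set()
    offset = 0
    while offset < len(code):
        length = INSTRUCTION_LENGTHS.get(code[offset])
        if length is None:
            break  # unknown opcode: stop instead of guessing
        starts.add(base + offset)
        offset += length
    return starts


def is_instruction_start(code, base, candidate):
    """True only if candidate is a boundary reached by the linear walk."""
    return candidate in instruction_starts(code, base)


# "sub eax, 1" encodes as 83 E8 01: its 2nd byte equals the "call" opcode E8.
BASE = 0x223308
CODE = bytes([0x83, 0xE8, 0x01, 0x90, 0xC3])

print(is_instruction_start(CODE, BASE, 0x223308))  # start of the real instruction
print(is_instruction_start(CODE, BASE, 0x223309))  # inside it, despite the E8 byte
```

Checking only the byte at the candidate address would accept 0x223309 here; walking from a known start (the function entry or the line start from the debug info) rejects it.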

Alexander (Rouse_) Bagel

  • New Member
  • *
  • Posts: 38
Re: CPU-View
« Reply #63 on: February 26, 2025, 09:27:09 am »
Quote from: Martin_fr on February 25, 2025, 06:19:37 pm
> I actually found a bug in the existing code. Changes + fix: https://gitlab.com/martin_frb/lazarus/-/compare/main...fpdebug-unwind-asm-push-2?from_project_id=28419588
>
> Still testing ...

I checked; unfortunately the stack remains the same and still stops at libgtk.
I also changed MAX_SEARCH_ADDR, etc., but it didn't help.

Alexander (Rouse_) Bagel

Re: CPU-View
« Reply #64 on: February 26, 2025, 09:28:54 am »
Quote from: Martin_fr on February 25, 2025, 09:46:05 pm
> Actually, if you use that only to return to "user code", i.e. where you have debug info, then you can disassemble from the start of the function, and check you have the start of a real asm instruction.  Since you have line info, you only need to disassemble one lines worth of data. (Can still be a lot...)

Yes, I've already come up with a similar strategy to determine the return address more accurately. I'll plug it in a bit later.

Martin_fr

Re: CPU-View
« Reply #65 on: February 26, 2025, 10:17:57 pm »
I made some more improvements. In some cases it now gets through gtk stacks; in others it still does not. (Well, it was always clear that it would never handle all of them.)

Also, there is no guarantee that the returned frames are always correct. There can be code that modifies the stack pointer in conditional blocks, and the unwinder does not know which branch to take. Nor does it analyse whether (or where) branches join again, or how they differ. It simply walks one branch; if that hits a "ret" statement, then that is good (well, if that ret returns to just after a "call" statement). If it does not find a good result, it tries a limited number of other branches.
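
As a rough illustration of that walk, here is a minimal Python sketch. It is not the FpDebug unwinder: the instruction list is a toy representation, and `unwind_frame`, `read_stack`, `is_after_call`, and the limits are hypothetical names and values chosen for the example. It simulates the stack pointer along one path until a "ret", checks that the return address is plausible, and otherwise falls back to a bounded number of remembered branch targets.

```python
def unwind_frame(instrs, start, read_stack, is_after_call,
                 max_alternatives=4, max_steps=200):
    """Return (rsp_offset, return_address) or None.

    instrs is a list of (op, arg) pairs; jump targets are list indexes.
    rsp_offset is the simulated RSP relative to its value at `start`.
    """
    alternatives = [(start, 0)]  # paths still to try: (index, rsp_offset)
    attempts = 0
    while alternatives and attempts <= max_alternatives:
        index, rsp_offset = alternatives.pop()
        attempts += 1
        for _ in range(max_steps):
            if not 0 <= index < len(instrs):
                break  # walked off the known code
            op, arg = instrs[index]
            if op == "ret":
                address = read_stack(rsp_offset)
                if is_after_call(address):
                    return rsp_offset, address
                break  # implausible return address: give up on this path
            if op == "push":
                rsp_offset -= 8
            elif op == "pop":
                rsp_offset += 8
            elif op == "sub_rsp":
                rsp_offset -= arg
            elif op == "add_rsp":
                rsp_offset += arg
            elif op == "jmp":
                index = arg
                continue
            elif op == "jcc":
                # Keep walking the fall-through; remember the taken branch.
                alternatives.append((arg, rsp_offset))
            elif op != "other":
                break  # instruction the parser cannot handle
            index += 1
    return None


stack = {24: 0x401005}            # simulated stack: RSP offset -> stored value
call_returns = {0x401005}         # addresses known to follow a "call"

# The fall-through path hits an unhandled instruction; the branch target works.
code = [("jcc", 2), ("unknown", None),
        ("add_rsp", 16), ("pop", None), ("ret", None)]
result = unwind_frame(code, 0, stack.get, lambda a: a in call_returns)
print(result)
```

Because only one path is simulated at a time, a function whose branches adjust the stack pointer differently can still yield a wrong frame; the plausibility check on the return address catches some of those cases, not all.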

Those libs may even have DWARF info to unwind with, but IIRC there are version diffs, so we can't read that yet. One day....

Alexander (Rouse_) Bagel

Re: CPU-View
« Reply #66 on: February 28, 2025, 11:44:24 am »
Quote from: Martin_fr on February 26, 2025, 10:17:57 pm
> I did some more improvements. In some cases getting through gtk stacks now. In others still not. (well, it was always clear, it would never do all).
>
> Also, there is no guarantee the returned frames are always correct. There can be code that modifies the stack pointer in conditional blocks. The unwinder would not know which branch to take. Nor does it analyse if (and also not where) branches join again, or what differences they have. It simply walks one branch, if that hits a "ret" statement, then that is good (well if that ret returns to after a "call" statement). If it does not find a good result it tries (a limited amount of) other branches.
>
> Those libs may even have dwarf info to unroll, but IIRC there are version diffs, so we can't yet read that. One day....

Martin, I checked the most recent trunk and it works!!!! Great job! Thanks :)

Martin_fr

Re: CPU-View
« Reply #67 on: February 28, 2025, 12:31:02 pm »
Glad to hear.

If you find any that don't:

- test higher limits.
- step out to that frame and single step the asm, until you get to a statement after which it works.
  - however, there may not be a visible change in the frame, if something fails in asm-unwind, but is resolved as fallback by RBP-unwind.
    So that needs to be disabled for those tests / or monitored.

Then it can be checked if that statement could be handled by the parser.

If it is a jump (conditional or not), then there are limits:
1) the number of branches kept for following later.
2) going back only keeps one point.
3) going far forward only keeps one.
The last two would be a lot of work to change.

Background:
  jnz  + 10
This will keep RIP+10 as a potential alternative (but continue at the next statement).
If there is no "jmp -100" in the code, then the walk will join the "+10" target and the branch can be removed. That limits complexity.

ONE jump back will be stored, and it will be the one furthest back, in the hope that all other back jumps (those that go back a lesser amount) will then be reachable from that furthest point. That may not always be the case, but again: complexity (keeping track of blocks already done).

Going forward really far is the latest addition, and it solved some of the issues. Not sure if a 2nd storage slot will help much here...
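
The bookkeeping for limits 1) and 2) can be sketched roughly like this. It is a hedged Python illustration, not the actual unwinder code: the class `BranchTracker`, its methods, and the `max_forward` default are hypothetical, and the single "far forward" slot from 3) is left out.

```python
class BranchTracker:
    """Toy bookkeeping for branch targets seen during a linear walk."""

    def __init__(self, max_forward=8):
        self.max_forward = max_forward  # limit 1: pending forward branches
        self.forward = []               # targets not yet reached by the walk
        self.furthest_back = None       # limit 2: a single backward slot

    def note_jump(self, current, target):
        """Record the target of a jump found at address `current`."""
        if target > current:
            if target not in self.forward and len(self.forward) < self.max_forward:
                self.forward.append(target)
        elif self.furthest_back is None or target < self.furthest_back:
            # Keep only the jump that goes back the furthest.
            self.furthest_back = target

    def reached(self, address):
        """The linear walk arrived at `address`: that branch has joined."""
        if address in self.forward:
            self.forward.remove(address)


tracker = BranchTracker(max_forward=2)
tracker.note_jump(0x100, 0x10A)   # "jnz +10": remembered as an alternative
tracker.note_jump(0x102, 0x0F0)   # a back jump
tracker.note_jump(0x104, 0x0A0)   # a back jump that goes further: replaces it
tracker.reached(0x10A)            # walk joins the "+10" target: drop it
print(tracker.forward, hex(tracker.furthest_back))
```

Dropping a forward target as soon as the walk reaches it is what keeps the pending list short in straight-line code; the list only grows when an unconditional jump skips over remembered targets.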

------------

* I have seen
  jmp [rax]

That is not handled.
Registers are sometimes known, but that knowledge is of variable quality, so following this seems too risky.

* Similarly, reading from memory (to get RSP / RBP / RIP and others) is
- only done from the stack
- only done from areas of the stack that weren't "pushed" to (at least for rsp / rbp)
- for jumps, done for [rip+123], because that is usually in the code section, i.e. constant data.


* On Windows there is a function in exception handling (OS kernel) that goes to the calling stack, by executing "call ...". So the subroutine somehow unrolls to the outer stack. That is impossible to follow, since stepping in isn't part of the game.

-----
The code can probably be sped up a tiny bit by optimizing calls to "ReadData", but that is a different story.
« Last Edit: February 28, 2025, 12:33:05 pm by Martin_fr »

Alexander (Rouse_) Bagel

Re: CPU-View
« Reply #68 on: February 28, 2025, 02:14:05 pm »
Quote from: Martin_fr on February 28, 2025, 12:31:02 pm
> Glad to hear.
>
> If you find any that don't:
>
> - test higher limits.
> - step out to that frame and single step the asm, until you get to a statement after which it works.
>   - however, there may not be a visible change in the frame, if something fails in asm-unwind, but is resolved as fallback by RBP-unwind.
>     So that needs to be disabled for those tests / or monitored.

Got it, I'll do some more research and post if I have any suggestions.

Alexander (Rouse_) Bagel

Re: CPU-View
« Reply #69 on: March 25, 2025, 08:41:01 pm »
Martin, you may be interested in this. I found that the handling of debug symbols for external libraries, namely the list of their exported functions, is not implemented optimally. When getting a function name by address, a plain linear search of the list is performed; not even a binary search is used.
This causes a significant delay when debugging in the address space of system libraries, especially on Linux, where search costs can increase more than 50 times (I profiled this specifically).
As a consequence, the standard AsmView slows down considerably.
In my CpuView I added a local cache of the functions exported by libraries, because I need to support older versions of Lazarus, but in newer versions this can be patched to fix the slowdown in AsmView.
Here is a link to the patch itself so you can see my approach to solving this problem:

https://github.com/AlexanderBagel/CPUView/commit/00c35d102a87fceae9fd10b7e900674a48e4a499#diff-9354d6d7ea0473c3c5f240e475819a5b04b7c009dd362f236ca0c915dbe12dd5

File: src/core/CpuView.FpDebug.pas
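
To make the difference concrete, here is a minimal Python sketch of the two lookup strategies. It is not the code from the linked patch or from FpDebug: `lookup_linear`, `SymbolCache`, and the symbol names and addresses are invented for the illustration. It uses the standard-library `bisect` module for the binary search.

```python
import bisect


def lookup_linear(symbols, address):
    """Scan every (start, name) pair; cost grows with the export count."""
    best = None
    for start, name in symbols:
        if start <= address and (best is None or start > best[0]):
            best = (start, name)
    return best[1] if best else None


class SymbolCache:
    """Sort the exports once, then answer each lookup by binary search."""

    def __init__(self, symbols):
        ordered = sorted(symbols)
        self._starts = [start for start, _ in ordered]
        self._names = [name for _, name in ordered]

    def lookup(self, address):
        # Index of the last symbol whose start address is <= address.
        index = bisect.bisect_right(self._starts, address) - 1
        return self._names[index] if index >= 0 else None


# Hypothetical exports of some library, deliberately unsorted.
exports = [(0x3000, "lib_draw"), (0x1000, "lib_init"), (0x2000, "lib_poll")]
cache = SymbolCache(exports)
print(lookup_linear(exports, 0x2010), cache.lookup(0x2010))
```

Both functions return the same name; the cached version pays for one sort up front and then needs only a logarithmic number of comparisons per address, which matters when a disassembly view resolves a name for every visible line.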
« Last Edit: March 25, 2025, 08:43:51 pm by Alexander (Rouse_) Bagel »

Martin_fr

Re: CPU-View
« Reply #70 on: March 25, 2025, 08:49:41 pm »
Ah, that's (part of) why disass is slow.

I am currently busy with a few non-debugger-related items, but I had it on my list to callgrind it... (there are probably more issues)

I'll try to find some time to look into it...

Thanks.


Martin_fr

Re: CPU-View
« Reply #71 on: March 31, 2025, 12:51:13 pm »
I made a small improvement to the address lookups. They now use a binary search.

There are probably more possible optimisations in the disass code (like the entire prepare ranges, which is a leftover from the old days). But that will have to wait.

It now needs to sort them all when loading.
On Linux that should happen in extra threads (at least most of it). So it may even improve loading times (because name lookups will then be faster).

On Windows the order of loading and looking up is different, and it has to wait for the threads more often. But it seems to have a similar speed to before.
I hope there aren't any big projects that will load slower because of that.
-> the old code had to look up 4 or 5 names in an unsorted list.
-> the new code now has to sort and create the hash dictionary, but will then be faster at looking up the names.
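
The load-time trade-off can be pictured with a small Python sketch. This is an assumption-laden illustration, not the FpDebug change itself: `LibrarySymbols`, `address_of`, and the sample exports are hypothetical. The index is built on a worker thread, and the first lookup waits for it if it is not finished yet.

```python
import threading


class LibrarySymbols:
    """Hypothetical per-library symbol index, built off the loading thread."""

    def __init__(self, exports):
        self._exports = list(exports)  # (address, name) pairs, unsorted
        self._sorted = []
        self._by_name = {}
        self._ready = threading.Event()
        threading.Thread(target=self._build, daemon=True).start()

    def _build(self):
        # One-time cost at load: sort by address and hash the names.
        self._sorted = sorted(self._exports)
        self._by_name = {name: address for address, name in self._sorted}
        self._ready.set()

    def address_of(self, name):
        # Lookups block until the worker is done, then cost one hash probe.
        self._ready.wait()
        return self._by_name.get(name)


lib = LibrarySymbols([(0x2000, "lib_poll"), (0x1000, "lib_init")])
print(hex(lib.address_of("lib_init")))
```

If lookups arrive long after loading, the build cost is hidden entirely; if they arrive immediately, the caller waits for the worker, which matches the observation that the gain depends on the order of loading and looking up.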


Alexander (Rouse_) Bagel

Re: CPU-View
« Reply #72 on: March 31, 2025, 02:10:36 pm »
Quote from: Martin_fr on March 31, 2025, 12:51:13 pm
> I did a little improvement for the address lookup. They now use a bin search.

Great, I'll be testing this out shortly.

Alexander (Rouse_) Bagel

Re: CPU-View
« Reply #73 on: April 04, 2025, 03:14:36 pm »
Quote from: Martin_fr on March 31, 2025, 12:51:13 pm
> I did a little improvement for the address lookup.

Martin, thank you very much. I checked it out, and the debugger is incredibly faster when working in the system libraries area!

 
