
Author Topic: Valgrind stops showing line number at some depth  (Read 976 times)

mm7

  • Full Member
  • ***
  • Posts: 231
  • PDP-11 RSX Pascal, Turbo Pascal, Delphi, Lazarus
Valgrind stops showing line number at some depth
« on: November 12, 2025, 01:19:45 am »
Hi!
I am trying to pinpoint memory issues like dangling pointers.
I start Valgrind the following way:

Code: Bash
valgrind --tool=memcheck --track-origins=yes --leak-check=full --show-leak-kinds=all --log-file=valgrind.trc --num-callers=100 --suppressions=valgrind-suppress.cfg --vgdb-error=0 --vgdb=full -v -v ./FreeShip
listening on port 2345 ...connected.

Valgrind is configured according to the documentation; everything is at its default except the port number.
The compiler debug options are: all checks, no optimization, line numbers, debug info, Dwarf3, Valgrind.

Here are the options passed by Lazarus when I click the Test button.
Quote
/usr/bin/fpc
-Rintel
-MObjFPC
-Scaghi
-Cg
-CirotR
-gw3
-gl
-gv
-k-rpath='$ORIGIN/../lib'
-k-Map
-kout.map
-l
-vewnhildbq
-vm5091,5057,5030,5027,5026,5024,2005

However, Valgrind does not show where the memory was freed.

Quote
==3189039==  Address 0xb08d7b0 is 48 bytes inside a block of size 88 free'd
==3189039==    at 0x484B27F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==3189039==    by 0x488B42: CMEM_$$_CFREEMEM$POINTER$$QWORD (in /fs02/home/mark/MyProjects/freeship-plus-in-lazarus/FreeShip)
==3189039==    by 0x477C5A: SYSTEM$_$TOBJECT_$__$$_FREE (in /fs02/home/mark/MyProjects/freeship-plus-in-lazarus/FreeShip)
==3189039==    by 0x8EC48E: FREEGEOMETRY$_$TFREESUBDIVISIONSURFACE_$__$$_SUBDIVIDE (FreeSubdivisionSurface.inc:5302)
==3189039==    by 0x8E93E1: FREEGEOMETRY$_$TFREESUBDIVISIONSURFACE_$__$$_REBUILD (FreeSubdivisionSurface.inc:4774)
==3189039==    by 0x99C5D3: FREESHIPUNIT$_$TFREESHIP_$__$$_LOADPROJECT$TFREEFILEBUFFER (FreeShip.inc:2257)
==3189039==    by 0x962228: FREESHIPUNIT$_$TFREEEDIT_$__$$_FILE_LOAD$ANSISTRING (FreeEdit.inc:6000)
==3189039==    by 0x53F9A4: MAIN$_$TMAINFORM_$__$$_LOADMOSTRECENTFILE (Main.pas:2118)

The last record with a source line number is
==3189039==    by 0x8EC48E: FREEGEOMETRY$_$TFREESUBDIVISIONSURFACE_$__$$_SUBDIVIDE (FreeSubdivisionSurface.inc:5302)

But in the code that line is just a call to a method:
Code: Pascal
    CtrlFace.Subdivide(Self, True, RefVertexPoints, RefEdges, RefFaces,
      nil, NewEdgeList, nil);
But this method has code of its own and makes further calls to my other methods.
I'd like to see where the code frees that memory.
How can I make Valgrind go deeper?

Lazarus 4.2
FPC 3.2.2
Valgrind 3.18.1
Ubuntu 22.04

Thaddy

  • Hero Member
  • *****
  • Posts: 18529
  • Here stood a man who saw the Elbe and jumped it.
Re: Valgrind stops showing line number at some depth
« Reply #1 on: November 12, 2025, 06:28:34 am »
https://valgrind.org/

Because this is not specific to fpc/lazarus.
Due to censorship, I changed this to "Nelly the Elephant". Keeps the message clear.

ALLIGATOR

  • Sr. Member
  • ****
  • Posts: 306
  • I use FPC [main] 💪🐯💪
Re: Valgrind stops showing line number at some depth
« Reply #2 on: November 12, 2025, 06:45:10 am »

Well, I think Valgrind got the CallStack addresses right

The only question that remains is whether the DWARF information was generated correctly... because there are bugs with this in FPC, at least in FPC git[main] - definitely, but probably in 3.2.2 as well

So I think this is exactly the case when there is a bug in the debugging DWARF

Let's wait for other opinions
I may seem rude - please don't take it personally

ALLIGATOR

Re: Valgrind stops showing line number at some depth
« Reply #3 on: November 12, 2025, 06:47:30 am »

Well, as an option... you can try the FPC git [main] version, maybe 🤷‍♂️ it will help

ALLIGATOR

Re: Valgrind stops showing line number at some depth
« Reply #4 on: November 12, 2025, 06:55:50 am »
By the way... when using Valgrind to search for leaks and other memory management errors, is it necessary to switch the application to use CMEM instead of the default memory manager?

Based on my understanding of how Valgrind works, this is indeed necessary. Since it does not know that FPC has another layer of memory management in addition to the system one, switching to CMEM will provide more detailed results, in my opinion

I would also appreciate hearing other opinions if anyone can provide a more precise answer

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 11932
  • Debugger - SynEdit - and more
    • wiki
Re: Valgrind stops showing line number at some depth
« Reply #5 on: November 12, 2025, 09:20:34 am »
Quote
By the way... when using Valgrind to search for leaks and other memory management errors, is it necessary to switch the application to use CMEM instead of the default memory manager?

-gv does that. It is included in his settings (e.g. it is added by the "Valgrind" checkbox in the IDE).

Quote
But, this method has some code and further calls to my other methods.
I'd like to see where the code frees that memory.

Maybe your code was inlined?

Try
Code: Pascal
{$INLINE OFF}
at the top of that unit.

You can also try -Si-  but not sure that will be enough.

FPC does not provide debug/line info for inlined code. So to valgrind that would look like the entire inlined function is all in the one line of its caller. And then "free" would be called from there.



If you are 100% sure it is not inlined, then it could be the "omit stackframe" optimization of fpc.
I am not sure if valgrind can be affected by that (I personally have not seen that happen). But some debuggers can hide stackframes due to this.

If it is that, you would need to recompile your RTL without this optimization.

About the "omit stackframe" optimization:
Code: Pascal
{$Optimization STACKFRAME} // this enables it
Normally (actually, not, but in fpc: normally) each function sets up a frame on the stack, and the address of that frame is held in a dedicated register. That info can be used to find the caller. (one of many ways to find it, not sure valgrind actually uses this)
However, this may be optimized away.

If it is optimized away, then the caller will be hidden. => That is, the optimized function is in the list, but its caller is not.
So in your case, if "free" had it optimized away, and if that info was used to find the caller for the stack, then the caller of "free" would be hidden. And since "free" is in the RTL, that would need the RTL to be recompiled.

Then again, I used valgrind, with optimized RTL, and I have not seen this. So again, I don't suspect that this affects valgrind. But I don't know.

mm7

Re: Valgrind stops showing line number at some depth
« Reply #6 on: November 12, 2025, 03:55:41 pm »
Wow! Lots of info. Thank you, guys!

No. The code is not implicitly inlined.

And, I assume, the debug info of CtrlFace.Subdivide is there. When I walk there with the debugger, even under Valgrind, I can step into it and go deeper and deeper; I can even see the values of some variables. But some are invisible (showing a dbg command error), which is annoying.

Valgrind is precise when it shows the place where the program accesses the unallocated memory, and where that memory was previously allocated. But it is not precise enough about where exactly it was freed. I suspect it somehow "forgets" that. Maybe it does not have enough memory for something? Any ideas?

« Last Edit: November 13, 2025, 12:48:34 am by mm7 »

Martin_fr

Re: Valgrind stops showing line number at some depth
« Reply #7 on: November 12, 2025, 04:55:56 pm »
Quote
And, I assume, the debug info of CtrlFace.Subdivide is there. When I walk there with debugger, even under Valgrind, I can step into, and go deeper and deeper, I even can see values of some variables. But some are invisible (showing dbg command error), which is annoying.

Try Dwarf-2 vs Dwarf-3  (with/without sets for Dwarf-2 does not matter for valgrind).

Ok, if the debugger can step in, then it's not inlined.



Yes, debug info is there. Finding the caller (stack unwind) is more than just debug info, and each debugger has its own implementation. But as I said, I never saw valgrind having an issue with this in particular. (Yet I also don't know what valgrind relies on, and if there could be an issue).

To get a stack trace, any debugger has to find the return addresses on the stack, to tell what the caller is. And "find the return address on the stack" is not always easy.
Though valgrind afaik single steps all code, so it should have a record of all call statements, and then wouldn't need to search the return addresses, but again I don't know

Also:
- debug info is there for "Subdivide"
- but it is not there for "TObject.Free".
The latter just has some info from the linker. Hence you get that mangled name
   by 0x477C5A: SYSTEM$_$TOBJECT_$__$$_FREE

If valgrind, for some reason, gets confused by something in TObject.Free, then maybe that hides the caller of "TObject.Free".
(I never had that issue with valgrind, but I usually have an RTL with debug info, so that would differ for me.)
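
A quick way to see which frames in a log carry line info and which only have linker symbols is to filter the log. This is a minimal sketch, assuming the log format quoted in the first post; the function name frames_without_lines is made up for illustration, and it only handles symbol names without spaces (which is the case for FPC's mangled names).

```bash
#!/usr/bin/env bash
# Print the symbol names of stack frames that have no file:line info.
# Such frames end in "(in /path/to/binary)" instead of "(Unit.pas:123)".
# Reads a Valgrind log on stdin.
frames_without_lines() {
  # 1) keep only stack frame lines ("at 0x...:" / "by 0x...:")
  # 2) keep only those that end in "(in <path>)"
  # 3) strip everything except the symbol name
  grep -E '^==[0-9]+== +(at|by) 0x[0-9A-Fa-f]+: ' \
    | grep -E ' \(in [^)]*\)$' \
    | sed -E 's/^==[0-9]+== +(at|by) 0x[0-9A-Fa-f]+: ([^ ]+) \(in .*$/\2/'
}
```

For example, "frames_without_lines < valgrind.trc | sort | uniq -c" would list each such symbol with a count. For the trace in the first post that should show free, CMEM_$$_CFREEMEM$POINTER$$QWORD and SYSTEM$_$TOBJECT_$__$$_FREE, i.e. the RTL frames.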

I know, that at least with older gdb (and older fpdebug), some functions could hide their caller in the stack. The problem was never with the hidden function, but always with the function that they had called. I don't know enough about the internals of valgrind to say if that could happen.



Some other guesses.... / Some very very far fetched...

Do you use "smart linking"? I.e. could some functions be dropped from the exe, because they haven't been called? I have seen bugs in debug info caused by that (but IIRC only on Windows / not sure though)

Do you use generics? There are known issues with line info and generics. I don't recall the exact details... But some versions of fpc, under certain conditions, attribute generic code to the wrong unit.
=> Though then valgrind should report the function, but with that wrong unit. It wouldn't skip it.

There are also some issues with Dwarf-3 and with "type Foo = object {..} end;".
Yet IIRC that is not line info / not for valgrind I guess.

mm7

Re: Valgrind stops showing line number at some depth
« Reply #8 on: November 13, 2025, 01:28:19 am »
Do you use "smart linking"?  - No

Do you use generics? - Yes.

TFasterList is generic,

Code: Pascal
  generic TFasterList<TItemType> = class
specialized like
Code: Pascal
  TFasterListTFreeSubdivisionPoint = specialize TFasterList<TFreeSubdivisionPoint>;
  TFasterListTFreeSubdivisionEdge = specialize TFasterList<TFreeSubdivisionEdge>;
  TFasterListTFreeSubdivisionFace = specialize TFasterList<TFreeSubdivisionFace>;

And these lists are actively used.

I tried to compile with Dwarf2. Valgrind stopped even one step earlier, at
==3189039==    by 0x8E93E1: FREEGEOMETRY$_$TFREESUBDIVISIONSURFACE_$__$$_REBUILD (FreeSubdivisionSurface.inc:4774)

I had a similar issue in another place: Valgrind did not show the line number inside another method. For example, Method1 calls Method2. Valgrind showed the line number for Method1 only and skipped Method2. I copied the entire Method2 as a sub-procedure inside Method1, with the name Method2_local.
This way Valgrind showed the line number inside Method2_local.
But going this route is cumbersome, and sometimes not possible, e.g. if Method2 is in a different class.




 

Martin_fr

Re: Valgrind stops showing line number at some depth
« Reply #9 on: November 13, 2025, 10:49:40 am »
Quote
And these lists are actively used.

Given your previous replies this won't be the issue, but I still mention it. Just in case it brings up any ideas...

If any of the functions of those generics were inlined (into the skipped function), then that could have an impact. But then you would not be able to step through it in the normal debugger either. (And you said you can do that).
After inlining code from a generic in a diff unit, the compiler sometimes writes wrong info for the rest of the outer method.... (It attributes it to the unit of the generic, rather than its own unit).

But again, shouldn't be the issue.


Quote
I tried to compile with Dwarf2. Valgrind stopped even one step earlier, at
==3189039==    by 0x8E93E1: FREEGEOMETRY$_$TFREESUBDIVISIONSURFACE_$__$$_REBUILD (FreeSubdivisionSurface.inc:4774)
Interesting... At least some reaction from valgrind.

Maybe it has an issue with mixed Dwarf versions? Purely guessing.... Would be very strange, the line info part does not really have much difference between versions (possibly none?).

In any case, are all the units part of the project itself? Or are some of the units in packages? Because if they are in packages, they may have different dwarf versions.
From the trace in your first post I assume they are all part of the project.
Then the only packages are Lazarus (LCL, etc) and FPC (no debug info, as it looks).

-- You could try to change settings for LCL too: Go to "project Options" > "Additions and overrides", add a new section (it will match project and all packages "*") and add a custom option: "-gw2" or "-gw3"
Or maybe even better "-g- -gw2 -gv" versus "-g- -gw3 -gv" to reset all other debug related settings.
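
To check whether the executable really ends up with mixed Dwarf versions, the compilation unit headers can be counted per version. This is a rough sketch, assuming binutils' readelf is installed and that its --debug-dump=info output prints one "Version:" line per unit header; the function name dwarf_versions is made up for illustration.

```bash
#!/usr/bin/env bash
# Summarize the DWARF versions of the compilation units in a binary.
# Reads the output of "readelf --debug-dump=info <binary>" on stdin.
dwarf_versions() {
  # Each unit header contains a line like "   Version:       3".
  # Count how many units use each version, then print one line per version.
  awk '/^[[:space:]]*Version:/ { n[$2]++ }
       END { for (v in n) print "DWARF " v ": " n[v] " unit(s)" }' \
    | sort
}
```

Usage would be "readelf --debug-dump=info ./FreeShip | dwarf_versions". If more than one line comes out, some units (most likely from packages) were built with a different -gw setting than the project.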

-- If you have "external debug info" enabled, I would recommend to try with it disabled (though I can't see how that would make a diff, as the debug info is found and used - but given that we run out of ideas...)

-- Another far shot (really far...): Make sure to use -O- (all optimizations turned off)
( I don't know what you are currently using, but it may be -O1 ?)
To me -O1 would not explain the absence at all... But then, who knows...

Quote
I had a similar issue in another place: Valgrind did not show the line number inside another method. For example, Method1 calls Method2. Valgrind showed the line number for Method1 only and skipped Method2. I copied the entire Method2 as a sub-procedure inside Method1, with the name Method2_local.
This way Valgrind showed the line number inside Method2_local.
Ok, so that means in that case it was not in the "called" method. (My earlier comment, that the called method may hide the caller)

Another far shot... (And really far / I don't expect it to make a diff). Maybe valgrind trips over something related to the order of the functions. You could in your code just move that function to the start (or end) of the unit.



I did some googling. Haven't found much...

Only: https://stackoverflow.com/questions/16790520/valgrind-stack-misses-a-function-completely

Which indicates that the issue can also happen for programs compiled with a C compiler. So that would point more towards a bug in valgrind than in fpc (or the combination of both).


One of the replies caught my eye (though it does not say that it was the cause / and I don't think it is for you): "disable tail call optimization". With FPC that would only happen at some higher optimization level (not sure which).
And you probably use either -O1 or -O- ?
=> "tail call optimization" means that if the last statement of a function is "CallFoo();", then instead of calling CallFoo as a subroutine, the code jumps to CallFoo, and CallFoo will return to the caller of the current function. If there is no "call" statement, then that could look to a debugger like it's still the same function (though line info would say different).



If none of this works then I don't know... You could still try to get an FPC install with debug info (e.g. with fpcupdeluxe). Then TObject.Free would also have debug info, and maybe if your code isn't the top of the known functions, then valgrind would not skip it? Well maybe, and a lot of work to find out.

Martin_fr

Re: Valgrind stops showing line number at some depth
« Reply #10 on: November 13, 2025, 11:13:07 am »
Just when I thought I had exhausted my realm of ideas... And again, no idea if related to your issue...

There is another "oddity" in debug info by FPC.

FPC inserts some asm code in the middle of functions, which can be attributed to any of:
- "line 0" (zero)
- the "begin" line of the function
- the "end" line of the function

You may have seen the jumping around, when single stepping :(

One thing that causes this are try finally/except blocks (including implicit ones, that you didn't write yourself).
I don't know if there are other triggers.

So, does the function have try ... finally/except blocks?

Or does it have "managed" local variables (ansistring, dyn array)? Then FPC inserts its own try/finally.
You can disable that with
Code: Pascal
{$IMPLICITEXCEPTIONS off}

But only if you don't throw exceptions. If you throw exceptions, then this directive will leak strings.



Also, finally blocks go into their own sub-routine (you may have seen $fin in some function names in debugging...). But then that should still be reported, as it should have all debug info it needs.

Well, actually AFAIK the subroutine holding the finally block, will modify the stack so that it looks like it was called by the caller of the function that it belongs to, not the function that contains the finally block.

This normally makes the stack look correct, as you don't expect the finally block to be a function with an extra caller, but for a debugger this can be strange.
E.g. gdb will show the "$fin..." function, but NOT the function that contains the finally block.

ALLIGATOR

Re: Valgrind stops showing line number at some depth
« Reply #11 on: November 14, 2025, 06:30:30 am »
Quote
-- You could try to change settings for LCL too: Go to "project Options" > "Additions and overrides", add a new section (it will match project and all packages "*") and add a custom option: "-gw2" or "-gw3"

Wow! That's great! I didn't know you could configure settings for all packages included in the project here! That's exactly what I needed for experiments like “What happens if I set -O4 everywhere?”
Until now, such experiments required a lot of manual work
🆒😎

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 12572
  • FPC developer.
Re: Valgrind stops showing line number at some depth
« Reply #12 on: November 14, 2025, 10:02:47 am »
If I google on valgrind and double free, I find

https://stackoverflow.com/questions/5664967/finding-allocation-site-for-double-free-errors-with-valgrind

and they also add

 --keep-stacktraces=alloc-and-free
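
As far as I know, alloc-and-free is already the default in current Valgrind releases, so adding the option explicitly may change nothing. To compare what a flag or compiler setting actually changes, it can help to pull just the free-site stacks out of each log and diff them. This is a minimal sketch, assuming the log format quoted in the first post; the function name freed_stacks is made up for illustration.

```bash
#!/usr/bin/env bash
# Print only the "free-site" stacks from a Valgrind memcheck log on stdin:
# the "Address ... is N bytes inside a block of size M free" + "d" line
# and the stack frames that directly follow it.
freed_stacks() {
  awk '
    / free.d$/ { show = 1; print; next }                  # start of a free-site stack
    show && /^==[0-9]+== +(at|by) 0x/ { print; next }     # frames of that stack
    { show = 0 }                                          # any other line ends it
  '
}
```

For example, "freed_stacks < valgrind.trc > freed-gw3.txt" after a -gw3 build and the same after a -gw2 build gives two small files that can be compared with diff.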

Martin_fr

Re: Valgrind stops showing line number at some depth
« Reply #13 on: November 14, 2025, 12:09:06 pm »
Btw, what happens if you start it without "--vgdb=..."? Just let valgrind run without allowing extra debugging?

I don't expect it to make a difference, but since we are running low on ideas...

mm7

Re: Valgrind stops showing line number at some depth
« Reply #14 on: November 20, 2025, 03:14:41 am »
Unfortunately, nothing I tried helped.
Not even converting the method into a local sub-procedure: Valgrind did not "remember" line numbers inside the sub-procedure; the last line number was at the call of this sub-proc.

Here are parameters for Valgrind I used.

Code: Text
valgrind --tool=memcheck --track-origins=yes --keep-stacktraces=alloc-and-free --error-limit=no --keep-debuginfo=yes --read-inline-info=yes --read-var-info=yes --leak-check=full --show-leak-kinds=all --log-file=valgrind.trc --num-callers=100 --suppressions=valgrind-suppress.cfg --vgdb-error=0 --vgdb=full -v ./FreeShip

BTW I've found the bug using traditional debugging techniques, gradually narrowing down the scope where the bug appeared. It was in a completely different place. :) Valgrind, obviously, captured just the consequences of the wrong object assignments.
I am not saying that Valgrind is useless. It helped me a lot in other situations.

 
