Author Topic: Why do I see variable (record) names in compiled binary  (Read 4042 times)

Fibonacci

  • Sr. Member
  • ****
  • Posts: 419
Re: Why do I see variable (record) names in compiled binary
« Reply #15 on: April 09, 2023, 03:18:57 pm »
Quote
Also legally code has very strong intellectual property protections

I don't even know who I would sue, and I doubt it would stop anyone from using an alternative to my binaries once it is published.

The languages you listed are interpreted. Java/C# are first converted to bytecode IIRC, but either way they need dependencies to run. I need my code to be run by the CPU directly.

It's more about identification than stealing or copying the code. I don't want third parties to know what my binary actually does to accomplish the task, I don't want them to know what libraries I used etc. There is no need to put "why_do_i_see_this_in_binary_file" as plain text into the binary.

It also might help to identify the programmer himself (paranoid mode). Take a few binaries and look for repeating patterns.

Until recently I thought only VCL related stuff was put as plain text in the binary.

I decided to use code like this to avoid putting unnecessary human-readable strings in the binary:

Code: Pascal
program simple_prog;

type
  why_do_i_see_this_in_binary_file = record
    d: DWORD;
  end;

var
  a: array of pointer;

begin
  setlength(a, 999);

  getmem(a[123], sizeof(why_do_i_see_this_in_binary_file));
  why_do_i_see_this_in_binary_file(a[123]^).d := 456;

  //test if works
  writeln(why_do_i_see_this_in_binary_file(a[123]^).d);
  readln;
end.

Yet still, "Pointer" is visible, and because the array is dynamic the plain text "simple_prog" is there too. If I don't define a program name, it defaults to "program" (in plain text in the binary).
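
A quick way to check whether a given name still shows up is to search the compiled file for it. The following is only a minimal sketch and not code from this thread: the program name find_text and both helper functions are made up for illustration, and it only finds plain single-byte text (no UTF-16, no compressed or packed data).

```pascal
program find_text;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils, StrUtils;

{ Counts case-sensitive occurrences of Needle in Haystack
  (overlapping matches are counted too). }
function CountOccurrences(const Needle, Haystack: string): Integer;
var
  P: SizeInt;
begin
  Result := 0;
  if Needle = '' then
    Exit;
  P := Pos(Needle, Haystack);
  while P > 0 do
  begin
    Inc(Result);
    P := PosEx(Needle, Haystack, P + 1);
  end;
end;

{ Reads a whole file into a string, byte for byte. }
function LoadFileAsString(const FileName: string): string;
var
  Stream: TFileStream;
begin
  Result := '';
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    SetLength(Result, Stream.Size);
    if Stream.Size > 0 then
      Stream.ReadBuffer(Result[1], Stream.Size);
  finally
    Stream.Free;
  end;
end;

begin
  if ParamCount <> 2 then
  begin
    WriteLn('usage: find_text <binary> <text>');
    Halt(1);
  end;
  // Print how often the text occurs in the file
  WriteLn(CountOccurrences(ParamStr(2), LoadFileAsString(ParamStr(1))));
end.
```

Run it against the executable with the name in question as the second argument; a result of 0 only means the exact byte sequence was not found.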

Well, I'll have to write my own code to manage dynamic arrays.
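
A minimal sketch of what such hand-managed storage could look like, replacing "array of pointer" with a raw block from ReallocMem. The names TPayload, TPtrList, ListResize and ListFree are hypothetical, and this is an assumption about the approach, not tested against the binary output: which strings remain in the executable depends on the compiler version and settings and has to be checked separately.

```pascal
program manual_ptr_array;

{$mode objfpc}

type
  TPayload = record
    d: DWord;
  end;
  PPayload = ^TPayload;

  { Hand-rolled growable list of untyped pointers (hypothetical name). }
  TPtrList = record
    Items: PPointer;   // raw memory block, indexed like a C array
    Count: SizeInt;
  end;

procedure ListResize(var L: TPtrList; NewCount: SizeInt);
var
  i: SizeInt;
begin
  ReallocMem(L.Items, NewCount * SizeOf(Pointer));
  // ReallocMem does not zero new memory, so clear the added slots
  for i := L.Count to NewCount - 1 do
    L.Items[i] := nil;
  L.Count := NewCount;
end;

procedure ListFree(var L: TPtrList);
var
  i: SizeInt;
begin
  // Release every allocated element, then the block itself
  for i := 0 to L.Count - 1 do
    if L.Items[i] <> nil then
      FreeMem(L.Items[i]);
  FreeMem(L.Items);
  L.Items := nil;
  L.Count := 0;
end;

var
  a: TPtrList;

begin
  a.Items := nil;
  a.Count := 0;
  ListResize(a, 999);

  GetMem(a.Items[123], SizeOf(TPayload));
  PPayload(a.Items[123])^.d := 456;

  // test if it works
  WriteLn(PPayload(a.Items[123])^.d);

  ListFree(a);
end.
```

Unlike a real dynamic array there is no reference counting and no range checking here, so every resize and free has to be done by hand.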

If I wanted my code to be visible I would use an interpreted language, it's easier.

Code: Pascal
# Begin asmlist al_rtti

.section .rodata.n_RTTI_$P$SIMPLE_PROG_$$_def00000004,"aw"
        .balign 8
.globl  RTTI_$P$SIMPLE_PROG_$$_def00000004
RTTI_$P$SIMPLE_PROG_$$_def00000004:
        .byte   21,0
        .long   0,4
        .long   RTTI_$SYSTEM_$$_POINTER$indirect
        .long   -1,0
        .byte   11
        .ascii  "simple_prog"
# End asmlist al_rtti

Another option is to modify the compiler source code. I found out I can remove RTTI completely, but then it won't compile my source. Need to dive deeper.

Code: Pascal
writer.AsmWriteLn(asminfo^.comment+'Begin asmlist '+AsmlistTypeStr[hal]);
if AsmlistTypeStr[hal] = 'al_rtti' then current_asmdata.asmlists[hal].RemoveAll;
file: aggas.pas, line 1933, FPC 3.3.1

KodeZwerg

  • Hero Member
  • *****
  • Posts: 2082
  • Fifty shades of code.
    • Delphi & FreePascal
Re: Why do I see variable (record) names in compiled binary
« Reply #16 on: April 09, 2023, 04:41:20 pm »
What is supposed to happen when someone plainly reads the content in a hex editor?
"OMG, he/she used XYZ as a name, how awful is that, now I automagically know the source..." *LOL*
So when it comes to security, you can either try crazy things on your own to obfuscate stuff,
or pay for a hard-to-crack application by using third-party tools like Themida (on Windows).
If you protect your product without third-party tools, anyone with enough experience and a good disassembler can "crack" your code, and for that the plain readable names are about as useful as putting salt into tea. (At least here nobody would drink tea with salt...)
Reverting your binary back into its original source is not possible and probably never will be.
Patching your binary is always possible, however hard you try to prevent it...
Hiding names is a useless task to me, because nobody needs that information to get at your code.
Why the compiler puts them in has already been explained enough.

So in my humble opinion, just don't think too much about things that are not worth investigating that deeply, because those names give no clues about what your code is doing...

Enjoy coding, not reading and focusing too much on the binary part of your code :-*

jamie

  • Hero Member
  • *****
  • Posts: 6130
Re: Why do I see variable (record) names in compiled binary
« Reply #17 on: April 09, 2023, 05:41:03 pm »
I've looked around the net on this subject and there seems to be more against it than for it, exactly for the reasons of prying eyes and bloat.

 So I fired up my work laptop, which just happens to be home with me today and has Delphi on it. Playing with the switches, it does seem like you can remove most of it, but not all of it. The part that is still there is the expected items needed for the streaming of the properties etc.

 The {$M}, {$TYPEINFO} etc. directives do not seem to do anything with fpc, and {$RTTI....} is not supported either.

{$WeakLink...} is also not supported.

I am not really complaining, though; for many, FPC is a godsend, but...

My employer has voiced their opinions about me using fpc code for final use. I do like Lazarus over the Delphi IDE, but in the end I have to ensure that what I do with it compiles in Delphi for the final distributed product.

  I am sure they aren't aware of this issue; most bosses are just figureheads who have to follow company rules!
The only true wisdom is knowing you know nothing

nanobit

  • Full Member
  • ***
  • Posts: 160
Re: Why do I see variable (record) names in compiled binary
« Reply #18 on: April 09, 2023, 05:43:21 pm »
@Fibonacci
If you find a safe opportunity to reduce RTTI data (provided that programmers don't call RTTI functions), then let us know:)

Warfley

  • Hero Member
  • *****
  • Posts: 1499
Re: Why do I see variable (record) names in compiled binary
« Reply #19 on: April 09, 2023, 07:30:47 pm »
Quote from: Fibonacci
I don't even know who I would sue, and I doubt it would stop anyone from using an alternative to my binaries once it is published.

The publisher of the alternatives: you can force them to take them down and pay the damages. Of course someone might re-upload them, but software piracy is not new, and what's the difference between someone uploading your binaries illegally and someone distributing a remake?

Quote from: Fibonacci
It's more about identification than stealing or copying the code. I don't want third parties to know what my binary actually does to accomplish the task, I don't want them to know what libraries I used etc. There is no need to put "why_do_i_see_this_in_binary_file" as plain text into the binary.

I think programmers should always be quite transparent about the used libraries and technologies. As a programmer you are liable to develop your applications according to the state of the art, and the best way to ensure that is IMHO to provide transparency about the used technologies, libraries and any third party tools you use.

If I get served an application that uses Qt4 (which is EOL), I want to know that, because this is clearly a breach of the state-of-the-art requirement. So there should either be a very good reason, or I need to explicitly consent to using this outdated library within that codebase.
I should not be required to blindly trust that the other party of a trade does what they promise; I want to have as much power as possible to verify whether the other party fulfills their requirements.

Quote from: jamie
I've looked around the net on this subject and there seems to be more against it than for it, exactly for the reasons of prying eyes and bloat.

Whenever someone uses the word "bloat" with respect to code generation I instinctively start rolling my eyes. In the 25 years I have owned a computer, I have never been in the situation where I ran out of storage because of executable size. Currently I have two 512 GB SSDs. Most of the space on them is taken up by assets like graphics, sounds, videos, etc. or private documents (music files, pictures, etc.). If I run out of space, I don't start deleting "bloated" executables, I start deleting those, like the songs I listen to least.
A 1 TB SSD costs 80€, which means that the price per byte is about 8·10^-11 € (roughly 10^-10 €). Bloat is not a problem, except maybe in very fringe cases like microcontroller development, but there I would argue it is better to have special tooling for such special requirements.
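
Worked through with the numbers above, taking 1 TB as 10^12 bytes and, purely as an assumed example, an executable of 10 MB (10^7 bytes):

```latex
\[
\frac{80\ \text{EUR}}{10^{12}\ \text{B}} = 8 \times 10^{-11}\ \text{EUR/B},
\qquad
10^{7}\ \text{B} \times 8 \times 10^{-11}\ \text{EUR/B} = 8 \times 10^{-4}\ \text{EUR}
\]
```

So storing the assumed 10 MB executable costs less than a tenth of a cent.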

Lastly, the thing about RTTI: it can only work well if it is enabled by default. If you add RTTI only to types that explicitly enable it, you can never use RTTI together with legacy code.
I think there may be a case for having an opt-out method, either for specific types or as a global switch, but then the question is whether some internal libraries like the RTL require it, which would make turning it off infeasible.

jamie

  • Hero Member
  • *****
  • Posts: 6130
Re: Why do I see variable (record) names in compiled binary
« Reply #20 on: April 10, 2023, 04:15:48 am »
I am sorry you feel that way. Over the years I've demonstrated many times that I can produce fully working products using my tactics of bloat reduction and speed increases, and by selecting proper tools.

 I can say without doubt that I have made many eyeballs roll when showing off projects that deliver a lot of functionality at impressive speed with very low resources.

  So I don't speak from the lower section of my stump.

  You can think and do as you wish, but the fact is, I've seen this kind of treatment done to nicely working tools that end up as an overloaded tool, to the point where serious coding becomes a thing of the past.

   Maybe the devs are getting old and just don't care anymore; I know I am getting old, but I still care.

 Have a good day.

PascalDragon

  • Hero Member
  • *****
  • Posts: 5486
  • Compiler Developer
Re: Why do I see variable (record) names in compiled binary
« Reply #21 on: April 10, 2023, 11:02:12 am »
Quote from: Fibonacci
It's more about identification than stealing or copying the code. I don't want third parties to know what my binary actually does to accomplish the task, I don't want them to know what libraries I used etc. There is no need to put "why_do_i_see_this_in_binary_file" as plain text into the binary.

If I wanted to know what your binary does, I wouldn't really bother with strings in the binary; instead I would simply stuff it into something like Ghidra and call it a day (simplified, but essentially true).

Quote from: Fibonacci
Another option is to modify the compiler source code. I found out I can remove RTTI completely, but then it won't compile my source. Need to dive deeper.

Surprise. Deactivating generation of metadata that is needed for the code to work leads to the code no longer working...  ::)


Warfley

  • Hero Member
  • *****
  • Posts: 1499
Re: Why do I see variable (record) names in compiled binary
« Reply #22 on: April 10, 2023, 11:16:36 am »
You use the term bloat a bit weirdly; in my previous post I assumed that by bloat you meant functionality that is compiled into the binary but rarely used. This is, for example, what people mean when they call the GNU core utils bloated, because they contain a lot of extra functionality that most people don't use (e.g. GNU cat is almost 1 kloc, versus busybox cat, which is just 64 loc). But the GNU core utils are not slow, they just have much more functionality that some might consider unnecessary (e.g. busybox has no signal handling).
That's why I wrote about bytes and memory previously.

But speed isn't much better to complain about either. Do you get a free plushy after saving enough CPU cycles?
There is just one speed metric that matters, and that is whether the code is fast enough. For interactive systems you just need to be fast enough not to disrupt the interaction. For example, quality-of-experience research on GUIs has shown that taking up to a few hundred milliseconds to react to a user's actions does not impact the user experience, so for a user there is no difference whether something happens 10ms, 50ms or 100ms after pressing a button.
It is similar for network servers, where the processing time can be up to the same order of magnitude as the round-trip time: when you have a round-trip time of 40ms to a server, it doesn't matter if the request takes 1ms, 5ms, or 10ms. In the network jitter that difference is barely measurable.

In those cases you gain absolutely nothing from being faster, as long as you are fast enough. And because of that, code maintainability trumps speed. And I know you agree with this, because you use Pascal and not, for instance, C or even asm, even though most of the Pascal-specific features like strings, dynamic arrays, virtual classes, the heap manager, exceptions, etc. all add overhead. Yet to at least some degree you must be fine with this overhead, because it results in much better code.
That's also why scripting and bytecode languages have been widely used since the 80s: even on the (compared to now) low-resource machines back then, programmers understood that as long as it is fast enough, simpler code trumps additional speed gains. This is still true today, with Python, JavaScript, Java, C#, etc. being some of the most popular languages.

Joanna

  • Hero Member
  • *****
  • Posts: 770
Re: Why do I see variable (record) names in compiled binary
« Reply #23 on: May 12, 2023, 11:32:43 pm »
It seems like the only way to avoid information being leaked is to use older versions of the compiler created before these RTTI issues were introduced.

✨ 🙋🏻‍♀️ More Pascal enthusiasts are needed on IRC .. https://libera.chat/guides/ IRC.LIBERA.CHAT  Ports [6667 plaintext ] or [6697 secure] channel #fpc  Please private Message me if you have any questions or need assistance. 💁🏻‍♀️

dbannon

  • Hero Member
  • *****
  • Posts: 2802
    • tomboy-ng, a rewrite of the classic Tomboy
Re: Why do I see variable (record) names in compiled binary
« Reply #24 on: May 14, 2023, 05:30:53 am »
You could have a pre-release compile script that generates alternative source, replacing all the names of your variables with S__1, r__23 etc. Probably shorter (dealing with "bloat") and obscuring the hopefully informative var names you use.
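
A rough sketch of the renaming step of such a script, written in Pascal to stay with the thread's language. The mapping table and the function names are hypothetical, and the sketch is deliberately incomplete: it copies string literals untouched but does not skip comments or compiler directives, and it knows nothing about unit names that must match file names or about published names that are looked up at runtime.

```pascal
program rename_idents;

{$mode objfpc}{$H+}

uses
  SysUtils;

const
  // Hypothetical mapping; a real script would generate it
  OldNames: array[0..1] of string =
    ('why_do_i_see_this_in_binary_file', 'simple_prog');
  NewNames: array[0..1] of string = ('r__1', 'S__1');

{ Returns the replacement for a known identifier, otherwise the
  identifier itself. Pascal is case-insensitive, hence SameText. }
function Replacement(const Ident: string): string;
var
  i: Integer;
begin
  for i := Low(OldNames) to High(OldNames) do
    if SameText(Ident, OldNames[i]) then
      Exit(NewNames[i]);
  Result := Ident;
end;

function RenameIdentifiers(const Src: string): string;
var
  i, Start: Integer;
begin
  Result := '';
  i := 1;
  while i <= Length(Src) do
  begin
    if Src[i] = '''' then
    begin
      // Copy a string literal verbatim, up to its closing quote
      Start := i;
      Inc(i);
      while (i <= Length(Src)) and (Src[i] <> '''') do
        Inc(i);
      Inc(i);  // step past the closing quote
      Result := Result + Copy(Src, Start, i - Start);
    end
    else if Src[i] in ['A'..'Z', 'a'..'z', '_'] then
    begin
      // Collect a whole identifier so only full names are replaced
      Start := i;
      while (i <= Length(Src)) and
            (Src[i] in ['A'..'Z', 'a'..'z', '0'..'9', '_']) do
        Inc(i);
      Result := Result + Replacement(Copy(Src, Start, i - Start));
    end
    else
    begin
      Result := Result + Src[i];
      Inc(i);
    end;
  end;
end;

begin
  WriteLn(RenameIdentifiers(
    'type why_do_i_see_this_in_binary_file = record d: DWORD; end;'));
end.
```

The demo line at the end rewrites the record declaration from earlier in the thread to use r__1; feeding whole source files through RenameIdentifiers is left out.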

I don't think I'd bother ...

Davo
Lazarus 3, Linux (and reluctantly Win10/11, OSX Monterey)
My Project - https://github.com/tomboy-notes/tomboy-ng and my github - https://github.com/davidbannon

jamie

  • Hero Member
  • *****
  • Posts: 6130
Re: Why do I see variable (record) names in compiled binary
« Reply #25 on: May 14, 2023, 12:28:51 pm »
I have a better idea: just don't do it. Create address references, which is faster, and if external RTTI is required, then the names should be bit-shifted, compressed and/or encrypted. The code behind the scenes can then build a name table from them at runtime to resolve it.
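
The suggestion is about what the compiler would emit, but the basic idea can be sketched at program level: store a name only in encoded form and decode it at runtime. This is an illustration under assumptions, not an existing FPC feature; the key, the constant and DecodeName are made up, and a fixed one-byte XOR is obfuscation, not encryption.

```pascal
program decode_names;

{$mode objfpc}{$H+}

const
  Key = $5A;  // hypothetical one-byte key
  // The text 'name' with every byte XORed with Key, prepared ahead of time
  EncodedName: array[0..3] of Byte = ($34, $3B, $37, $3F);

{ Rebuilds the readable name from its encoded bytes at runtime. }
function DecodeName(const Data: array of Byte): string;
var
  i: Integer;
begin
  Result := '';
  SetLength(Result, Length(Data));
  for i := 0 to High(Data) do
    Result[i + 1] := Chr(Data[i] xor Key);
end;

begin
  WriteLn(DecodeName(EncodedName));
end.
```

Anyone who finds the decode routine in a disassembler can undo this, so it only hides names from a plain text search.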


korba812

  • Sr. Member
  • ****
  • Posts: 396
Re: Why do I see variable (record) names in compiled binary
« Reply #26 on: May 14, 2023, 01:12:01 pm »
I would like more such information contained in compiled binary - the so-called Extended RTTI.

Fibonacci

  • Sr. Member
  • ****
  • Posts: 419
Re: Why do I see variable (record) names in compiled binary
« Reply #27 on: May 14, 2023, 03:02:19 pm »
Quote from: Joanna
It seems like the only way to avoid information being leaked is to use older versions of the compiler created before these RTTI issues were introduced.

And which version is that? It had better not be too old, because I like anonymous functions.

Quote from: dbannon
You could have a pre-release compile script that generates alternative source, replacing all the names of your variables with S__1, r__23 etc. Probably shorter (dealing with "bloat") and obscuring the hopefully informative var names you use.

I have an after-compile script that overwrites the RTTI information read from the asmlists (.s files).
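
The script itself is not shown in the thread. As an assumption about how such a step might look, the sketch below masks the text of .ascii lines between the "Begin asmlist al_rtti" and "End asmlist al_rtti" comments with underscores of the same length, so a preceding length byte such as the .byte 11 in the listing earlier stays valid. All names in it are hypothetical, lines containing backslash escapes are left alone, and anything that looks names up at runtime will stop working.

```pascal
program mask_rtti_names;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

const
  BeginMarker = '# Begin asmlist al_rtti';
  EndMarker   = '# End asmlist al_rtti';

{ Replaces the characters between the first and last double quote of an
  .ascii line with underscores, keeping the length unchanged. }
function MaskAsciiLine(const Line: string): string;
var
  FirstQuote, LastQuote, i: Integer;
begin
  Result := Line;
  if Pos('.ascii', Line) = 0 then
    Exit;
  if Pos('\', Line) > 0 then
    Exit;  // escapes change the byte count, so leave such lines alone
  FirstQuote := Pos('"', Line);
  LastQuote := LastDelimiter('"', Line);
  if (FirstQuote = 0) or (LastQuote <= FirstQuote) then
    Exit;
  for i := FirstQuote + 1 to LastQuote - 1 do
    Result[i] := '_';
end;

{ Masks .ascii lines only while inside the al_rtti asmlist. }
procedure MaskRttiSection(Lines: TStrings);
var
  i: Integer;
  InRtti: Boolean;
begin
  InRtti := False;
  for i := 0 to Lines.Count - 1 do
    if Pos(BeginMarker, Lines[i]) > 0 then
      InRtti := True
    else if Pos(EndMarker, Lines[i]) > 0 then
      InRtti := False
    else if InRtti then
      Lines[i] := MaskAsciiLine(Lines[i]);
end;

var
  Src: TStringList;

begin
  if ParamCount <> 1 then
  begin
    WriteLn('usage: mask_rtti_names <file.s>');
    Halt(1);
  end;
  Src := TStringList.Create;
  try
    Src.LoadFromFile(ParamStr(1));
    MaskRttiSection(Src);
    Src.SaveToFile(ParamStr(1));  // overwrites the input file
  finally
    Src.Free;
  end;
end.
```

Section and label names such as RTTI_$P$SIMPLE_PROG_$$_def00000004 are symbols rather than .ascii data and are not touched here; assembling and linking the modified file is left out of the sketch.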

PascalDragon

  • Hero Member
  • *****
  • Posts: 5486
  • Compiler Developer
Re: Why do I see variable (record) names in compiled binary
« Reply #28 on: May 14, 2023, 05:00:23 pm »
Quote from: Joanna
It seems like the only way to avoid information being leaked is to use older versions of the compiler created before these RTTI issues were introduced.

Have fun using some ancient, pre-2.0 version of FPC then, because the RTTI data has been there ever since support for Delphi-compatible classes was introduced.

runewalsh

  • Jr. Member
  • **
  • Posts: 82
Re: Why do I see variable (record) names in compiled binary
« Reply #29 on: May 15, 2023, 03:00:55 am »
I think these names are not really used so you can patch the part of the compiler that writes RTTI to replace them with random characters or ____s.

 
