
Author Topic: Why do I see variable (record) names in compiled binary  (Read 4019 times)

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11452
  • FPC developer.
Re: Why do I see variable (record) names in compiled binary
« Reply #30 on: May 15, 2023, 10:24:27 am »
Quote
It seems the only way to avoid leaking this information is to use older compiler versions, released before these RTTI issues were introduced.

Quote
Have fun using some ancient, pre-2.0 version of FPC then, because the RTTI data has been there ever since support for Delphi-compatible classes was introduced.

For _every_ record definition, as Fibonacci shows? That is part of the extended (D2010+?) RTTI, not the base (D2-D2009) RTTI for the form designer. Trunk does this; FPC 3.2.2 does not, it seems.

VisualLab

  • Sr. Member
  • ****
  • Posts: 321
Re: Why do I see variable (record) names in compiled binary
« Reply #31 on: May 15, 2023, 09:52:24 pm »
Quote
There is just one speed metric that matters, and that is whether the code is fast enough. For interactive systems you just need to be fast enough not to disrupt the interaction. For example, quality-of-experience research on GUIs has shown that taking up to a few hundred milliseconds to react to a user's actions does not impact the user experience, so for a user there is no difference whether something happens 10 ms, 50 ms or 100 ms after pressing a button.
It is similar for network servers, where the processing time can be up to the same order of magnitude as the round-trip time: when you have a round-trip time of 40 ms to a server, it doesn't matter whether the request takes 1 ms, 5 ms or 10 ms. In the network jitter that difference is barely measurable.

Totally agree when it comes to websites. But when you're sending batch data to the server to perform complex calculations, speed is of paramount importance. Example: Gaussian, ORCA (modeling of chemical molecules). In such a situation, Python, PHP or JavaScript are garbage. Also Java or C# are absolutely not suitable for such tasks. I suspect that searching databases of DNA, RNA or amino acid sequences also requires maximum speed. After all, the point is to get results as quickly as possible. In addition, the customer pays for the duration of the calculation.

Quote
In those cases you gain absolutely nothing from being faster, as long as you are fast enough. And because of that, code maintainability trumps speed. And I know you agree with this, because you use Pascal and not, for instance, C or even asm, even though most of the Pascal-specific features like strings, dynamic arrays, virtual classes, the heap manager, exceptions, etc. all add overhead. Yet to at least some degree you must be fine with this overhead, because it results in much better code.
That's also why scripting and bytecode languages have been widely used since the 80s: even on the (compared to now) low-resource machines of that time, programmers understood that as long as it is fast enough, simpler code trumps additional speed gains. This is still true today, with Python, JavaScript, Java, C#, etc. being some of the most popular languages.

As for the popularity of junk languages (Python, JavaScript, etc.) - there are probably several reasons. I believe that the main (though not the only) reason for their popularity is the belief among young people that everyone can program (and this is not true). This resulted in a tendency to "glue" the "program" from the "pieces". And no, I don't mean building applications from modules (blocks). It's more like making a ball of mud. And something is created that 20-40 years ago was strongly criticized (for obvious reasons). Of course, the heads of companies are often delighted with this, because it lowers the cost of production. Only the finished product has crappy quality. In other industries (thankfully) such behavior has not spread. Imagine cars, planes, medical equipment, processors and similar goods produced by this "cottage-industry" method. Rather, it is unthinkable.

marcov

Re: Why do I see variable (record) names in compiled binary
« Reply #32 on: May 15, 2023, 10:31:25 pm »
The trouble is also: are you going to maintain deep knowledge of multiple vastly different development systems (from high-level Python, to medium-level JIT languages like Java and C#, to C++, to embedded C)?

You can never have all of them in-depth at the same time. 

Warfley

  • Hero Member
  • *****
  • Posts: 1499
Re: Why do I see variable (record) names in compiled binary
« Reply #33 on: May 15, 2023, 11:49:30 pm »
Quote
Totally agree when it comes to websites. But when you're sending batch data to the server to perform complex calculations, speed is of paramount importance. Example: Gaussian, ORCA (modeling of chemical molecules). In such a situation, Python, PHP or JavaScript are garbage. Also Java or C# are absolutely not suitable for such tasks. I suspect that searching databases of DNA, RNA or amino acid sequences also requires maximum speed. After all, the point is to get results as quickly as possible. In addition, the customer pays for the duration of the calculation.

It only depends on the reaction time of the interactive system. Here at home my ping to a nearby server (i.e. one located in my home country) is between 20 and 25 ms, meaning 5 ms of jitter. There is no way to meaningfully measure the difference between a script that runs in 1 ms and one that runs in 2 ms. It is just not statistically distinguishable from the random noise that already exists. There is no meaningful difference.
That said, when you have a special connection, e.g. the European universities that have their own network between them, with special QoS protocols and so on, you may achieve a 5 ms ping with less than 1 ms of jitter over the same distance, and there the difference is measurable and matters a lot.

For an interactive system the threshold always depends on the environment. As soon as you are much faster than your environment, you gain nothing from additional speed.
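
To put a rough number on the jitter argument, here is a minimal Python sketch. It assumes the round-trip time is uniformly distributed between 20 and 25 ms (the post only gives the range, so the distribution is an assumption), and the function name and parameters are made up for illustration. It counts how often a request to the 2 ms script comes back sooner than a request to the 1 ms script.

```python
import random


def fraction_misordered(fast_ms=1.0, slow_ms=2.0, ping_min_ms=20.0,
                        ping_max_ms=25.0, trials=100_000, seed=0):
    """Share of request pairs in which the slower script answers first."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    misordered = 0
    for _ in range(trials):
        # Observed response time = network round trip + processing time.
        # Assumption: the round trip is uniform within the measured range.
        fast_total = rng.uniform(ping_min_ms, ping_max_ms) + fast_ms
        slow_total = rng.uniform(ping_min_ms, ping_max_ms) + slow_ms
        if slow_total < fast_total:
            misordered += 1
    return misordered / trials


if __name__ == "__main__":
    share = fraction_misordered()
    print(f"slower script appeared faster in {share:.0%} of request pairs")
```

Under that uniform assumption, the slower script "wins" in roughly a third of the pairs, so a single observed response says very little about which script handled it.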

Quote
As for the popularity of junk languages (Python, JavaScript, etc.) - there are probably several reasons. I believe that the main (though not the only) reason for their popularity is the belief among young people that everyone can program (and this is not true). This resulted in a tendency to "glue" the "program" from the "pieces". And no, I don't mean building applications from modules (blocks). It's more like making a ball of mud. And something is created that 20-40 years ago was strongly criticized (for obvious reasons). Of course, the heads of companies are often delighted with this, because it lowers the cost of production. Only the finished product has crappy quality. In other industries (thankfully) such behavior has not spread. Imagine cars, planes, medical equipment, processors and similar goods produced by this "cottage-industry" method. Rather, it is unthinkable.

I couldn't disagree more; different languages simply have different scopes. There is no such thing as a junk language (not even JavaScript, even though it is objectively bad). For example, I recently had a problem where someone pushed a large binary file into my git repository without using LFS, and I only noticed after a few weeks of new commits, when git started becoming slow. So I needed to get all the commits that touched that file, and then for each commit retroactively update it and delete the file from the commit. It took me around 2 minutes to create a bash script that does that. I would probably have needed at least 30 minutes to 1 hour to do the same in Pascal.

Or, for my server, I needed a script that would help me manage my nginx and SSL configs. I had previously written a program for that in Pascal; it took me a few days to write, but worked quite well. Then, when I started moving everything to my new server and started dockerizing everything, I needed to extend that script to manage the Docker and reverse proxy configurations. Because I found my original Pascal approach too rigid to make changes easily, I decided to rewrite it using a more generic templating approach, but this time in Python.
It took me 3 hours, for something that had previously taken me days of work in Pascal, and even then the Pascal version had just a tiny subset of the newly introduced functionality. And after years of using both, I can tell you there is literally no benefit to the Pascal application over the new Python script. I would have gained nothing from using Pascal there.

Gluing things together is not just a valid form of programming, but in some cases even a desirable one. Different programming languages make different things easier and others harder. Sometimes (as admin of my server I would even say very often) just gluing things together is exactly what you need.

One of my favorite programming languages is actually Excel (or spreadsheets, to be less branded). It's not usable for all things, e.g. you would have to try for a long time before you get a TCP server working in Excel, but when it comes to e.g. calculating the hours of overtime I've worked, literally nothing beats it.

And Python became so popular from 2018 onwards simply because of the dominance of big data and AI, and that is nothing other than gluing things together. All the components for creating a neural network are provided by underlying libraries like Torch, written in a native language (I think Torch is in C). All you need to do is glue these provided components together and parameterize them, plus add simple input and output transformations (e.g. getting statistics on the performance), and Python is really good at that. Making an autoencoder (encoder and decoder) for images (the base technology behind the new stable-diffusion generative models used by DALL-E and co.) is something like 50 lines of Python with PyTorch. In the time it takes you to do that, I would probably not even have finished setting up a basic Lazarus project.

Whenever someone says that a certain language is junk, e.g. that it's only good at gluing stuff together, I always imagine a carpenter saying that a hammer is junk compared to a saw, because it only drives nails in and is really bad at cutting wood. Different tools have different purposes; there is no one tool that fits all situations. A saw is not better than a hammer, it's just different.
« Last Edit: May 16, 2023, 12:09:29 am by Warfley »

VisualLab

Re: Why do I see variable (record) names in compiled binary
« Reply #34 on: May 16, 2023, 06:26:07 pm »
Quote
It only depends on the reaction time of the interactive system. Here at home my ping to a nearby server (i.e. one located in my home country) is between 20 and 25 ms, meaning 5 ms of jitter. There is no way to meaningfully measure the difference between a script that runs in 1 ms and one that runs in 2 ms. It is just not statistically distinguishable from the random noise that already exists. There is no meaningful difference.
That said, when you have a special connection, e.g. the European universities that have their own network between them, with special QoS protocols and so on, you may achieve a 5 ms ping with less than 1 ms of jitter over the same distance, and there the difference is measurable and matters a lot.

But I guess we're talking about different things. You are talking about generating "on-the-fly" the content of a whole web page (HTML) or its part (CSS, SVG, raster graphics, etc.) by scripts (PHP, Python, etc.). In this case, I completely agree with you. I meant the situation when the software runs in batch mode. For example:
  • a calculation program is running on the server,
  • a classic executable program (ELF),
  • input data (file) are prepared in a separate program on the user's computer,
  • the user submits input via separate software using SSH (or something similar),
  • the sent data with parameters are placed in the input data queue (batches), which are waiting for the calculations to be performed,
  • the calculation program takes a batch from the queue, performs calculations and places the results in the output queue,
  • the user checks from time to time whether the calculation results have appeared, if so, he downloads them (SSH or similar),
  • the results are viewed and analyzed in a separate program on the user's computer.
Calculations (one cycle) can take 3 days, a week, two weeks or even a month. This is how Gaussian (and other such programs) work. Gaussian is written in Fortran. The greatest possible speed of the program is one of the most important features of such a program. In this case, using scripts doesn't make any sense (a set of scripts would be highly inefficient).
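
The batch flow described above (input queue, calculation program, output queue, user polling for results) can be sketched in a few lines of Python. This is only an illustration of the flow, not how Gaussian or any real job scheduler is implemented. The in-process queues stand in for the SSH submission and the server-side queues, and all names (`run_worker`, `process_batches`, `sum_of_squares`) are hypothetical.

```python
import queue
import threading


def run_worker(jobs, results, compute):
    """Take batches from the input queue until a None sentinel arrives."""
    while True:
        item = jobs.get()
        if item is None:          # sentinel: no more batches
            break
        job_id, data = item
        results.put((job_id, compute(data)))  # place result in the output queue


def sum_of_squares(values):
    # Stand-in for the real, long-running calculation.
    return sum(v * v for v in values)


def process_batches(batches, compute=sum_of_squares):
    """Submit every batch, let one worker drain the queue, collect results."""
    jobs, results = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=run_worker, args=(jobs, results, compute))
    worker.start()
    for job_id, data in batches.items():
        jobs.put((job_id, data))  # user submits input; it waits in the queue
    jobs.put(None)
    worker.join()
    collected = {}
    while not results.empty():    # user checks whether results have appeared
        job_id, value = results.get()
        collected[job_id] = value
    return collected


if __name__ == "__main__":
    print(process_batches({"job-1": [1, 2, 3], "job-2": [4, 5]}))
```

In this sketch the queue handling is trivial next to the work done inside `compute`, which is where the speed of a compiled calculation program would matter.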

Quote
I couldn't disagree more; different languages simply have different scopes. There is no such thing as a junk language (not even JavaScript, even though it is objectively bad). For example, I recently had a problem where someone pushed a large binary file into my git repository without using LFS, and I only noticed after a few weeks of new commits, when git started becoming slow. So I needed to get all the commits that touched that file, and then for each commit retroactively update it and delete the file from the commit. It took me around 2 minutes to create a bash script that does that. I would probably have needed at least 30 minutes to 1 hour to do the same in Pascal.

[...]

Maybe I used too "strong" a word without explanation. What I mean is:
  • I'm creating a script to perform some actions once (or several times),
  • I write it quickly (it is simple in construction),
  • I run a script that does the job,
  • the script has done its job, it is no longer needed, so I'm throwing it in the trash.
On the other hand, a separate issue that I mentioned was the increasingly popular way of developing software: quickly and any old way. "Maybe someday we'll fix it, but there's no time for that right now." Plus, it's supposed to be cheap. For small, simple tasks that are performed once or a few times, this is acceptable. However, this is not about scripts that automate tedious and repetitive tasks that urgently need doing (once or a few times and that's it). The problem is that these types of solutions are forcefully promoted for creating large software. And this is exactly what building according to the ball of mud pattern is. Because "brogrammers" don't want to:
  • learn the use of more advanced tools,
  • develop good habits,
  • decently design the program.
It's so bad that they don't even want to know what the downsides of the only (bad) solution they know are. Worse, they react with aggression whenever the subject is brought to their attention.

Comparing programming languages to mechanical tools (especially simple ones) is probably not a good idea (but it is very fashionable). However, if we are to compare languages to tools, I see one thing: programming languages are one and the same type of tool. It's like a carpenter talking about different kinds of, say, saws. For example:
  • a simple hand saw for cutting branches,
  • jigsaw for cutting delicate details,
  • a large two-person saw for cutting logs,
  • mechanical circular saw,
  • a large saw for cutting logs into boards, used in a sawmill,
  • an automated saw with interchangeable blades of various types, adjustable: rotational speed, cutting angle, with an extendable table, used for the production of furniture, windows, doors, roof trusses and the like.
On the other hand, tools such as: a saw, a hammer, a chisel, a screwdriver, a drill (etc.) in the programming industry could be compared to:
  • a compiler (the embodiment of a programming language),
  • debugger,
  • source code editor,
  • a generator of installation files for the created program,
  • help system editor,
  • the RAD suite.
And then it is clear that a simple scripting language (or rather its interpreter, especially one with a GIL) is something like the equivalent of an ordinary saw for cutting branches. In contrast, a complex language whose source code is compiled to machine code and for which a RAD environment was created is something that corresponds to an automated saw. So if I want to cut a branch (let's say, the equivalent of searching TXT files that contain old server logs), a simple saw is really enough. But if I want to make a large and complex piece of furniture, let's say a 4-door wardrobe (the equivalent of a more extensive program), I will use an automated saw, because it will allow me to quickly and accurately cut out all the complicated elements needed to make the wardrobe. However, these automated and extensive tools cost money and time. The more complicated the tool, the more time you need to spend learning how to use it. Therefore, some people think that this problem can be easily circumvented, because the elements for the wardrobe can also be cut with a simple saw. Something like: "he discovered this one strange trick, he cured the whole village of cancer, all oncologists hate him" :) Yes, you can use a regular saw, but:
  • it will take longer than using a more extensive and automated tool,
  • the cut elements will not be very accurate (cutting precision),
  • the cut elements will have various deficiencies, blemishes and faults.
As a consequence, after assembly, the wardrobe will be messy and crooked, with ill-fitting elements and jagged edges. The equivalent of such a wardrobe is an "application" written in Python or Electron (node.js). Of course, an automated saw in the hands of someone who knows little about how to operate it will not help much either. The result will still be a lousy wardrobe. The problem is that today's "IT carpenters" not only do not want to learn how to use the "automated saw", and not only arrogantly refuse to take the time to learn it, but even brazenly and maliciously tell everyone around that they are so talented and wise that they can make a beautiful "wardrobe" without a blemish, using only this handsaw of theirs for cutting branches. And even though their "wardrobes" are crooked, wobbly and chipped, with mismatched elements, the growing horde of these "IT poor-carpenters" brazenly shouts at those people who point out their sloppiness. In the opinion of these impertinent, willful ignoramuses, people are not allowed to say that:
  • their products are sloppy and of poor quality,
  • their tools are not suitable for making a "wardrobe",
  • their knowledge and experience are so poor that they should rather focus on cutting branches from bushes and trees, and leave the production of cabinets to those who know how to do it and have experience in it.
Unfortunately, the majority of people believe stupid statements like "after all, everyone can and even should program". Especially young people, because they do not yet have much knowledge or experience (of course not all of them, but in any population the percentage of reasonable and perceptive people is usually too small). All the more so because a lot of blogs have been created that contain various kinds of nonsense. As one scoundrel once said, "a lie repeated a thousand times becomes the truth" (a twisted term for "brainwashing"). Meanwhile: "paper will accept everything" (or rather: "a text file will store everything").

To sum up - I don't think that scripting languages are unnecessary. I don't think they should be used for more serious software. But they most certainly have their uses. Another thing is that they can often (successfully) be replaced with existing software (like the spreadsheets you mentioned).

Warfley

Re: Why do I see variable (record) names in compiled binary
« Reply #35 on: May 16, 2023, 07:30:51 pm »
Quote
I meant the situation when the software runs in batch mode.

Yeah, then we were talking about completely different things. When I say interactive system, I explicitly mean a program that only needs to produce short-term responses to incoming events. This is explicitly the opposite of batch jobs. Of course, for long-running batch jobs performance matters greatly. I previously worked in software verification, where the analysis of one piece of code could take multiple days. There every second counts.

But what I would argue is that Pascal applications, especially with Lazarus are usually used for more interactive systems, like GUI development, or webservers, etc. And if the compiler is optimized for either one or the other, I would prefer the FPC to be optimized for writing such programs, as this seems to be currently the main use.

[...]

Quote
To sum up - I don't think that scripting languages are unnecessary. I don't think they should be used for more serious software. But they most certainly have their uses. Another thing is that they can often (successfully) be replaced with existing software (like the spreadsheets you mentioned).

I completely agree. There is a trend that once a language gets popular for one thing, everyone wants to use it for everything (that's management brain in action), and this then results in such great things as a whole data analysis toolchain GUI being written in Python with the Qt Python bindings (if you haven't used Python with Qt before, it's an absolute pain), or, something I have worked on, a complete discrete event simulation framework written in Python, where the code was actually Java-style OOP code with a lot of classes and inheritance, and absolutely no reason for it being in Python. In fact, I believe that half of the problems I encountered while working on it would not have happened if a real OOP language with type checking, like Java, Pascal or C#, had been used.

But I think that this is often the other way around: these languages are not popular because they are used for everything; they become popular, and being misused in such ways is the result of that. Python got really popular with the AI and data science boom in 2018, and I still fully believe that its usage there is completely justified, and this is still one of its main uses. JavaScript came with the web boom, and C# got its big boost when Xamarin came along and it was easier to create Android apps with C# than it was with Android Studio (which it still is, btw).
All those languages became popular because they are very good at a specific thing (or, for JavaScript, literally the only option for web development). The misuse of those languages is then the side effect of their popularity.

 
