Fingerprinting source units

MarkMLl:
A few days ago there was discussion relating to making the full path of a unit accessible as a $I expansion, in the context of diagnostic messages etc. This culminated in the addition of the %sourcefile% predefined https://forum.lazarus.freepascal.org/index.php/topic,60793.0.html which will hopefully arrive in the compiler in due course.

Would it be possible to have something similar which presented a checksum or hash of the sourcefile, e.g. (with a nod to whoever selected a Cheetah as the project's mascot) using the Tiger algorithm?

My rationale is this. A few days ago I raised an issue on StackExchange relating to "blessing" a Linux binary with rights to allow it to e.g. access raw sockets https://unix.stackexchange.com/questions/720010/preventing-posix-capabilities-proliferation . Since I've not been shot down in flames I'll take it to the kernel mailing list (the issue isn't doing it, it's preventing it from proliferating).

In principle, an IDE could include code that allowed it to bless any program it built, but that didn't give the user carte blanche to assign enhanced capabilities to an arbitrary binary elsewhere on the system.

The administrator who was asked to bless the IDE would need some degree of confidence that it had been built with unmodified sourcefiles. In this context, a fingerprint of the binary isn't entirely suitable, since it might have been rebuilt for an unfamiliar processor or with an unexpected level of runtime checks.

In order to have some confidence in the fact that the IDE hasn't been modified, a minimal precaution would be if the main unit- which by convention imports all others- had access to every unit's fingerprint which it could combine and report. That's by no means foolproof, but knowing the file that has originated each fingerprint (i.e. the new %sourcefile% expansion) it should be easy enough to check that the fingerprint isn't being spoofed:


--- Code: Pascal ---
unit SomeUnit;

interface

const
//  UnitFingerprint= {$I %sourcehash% } ;
  UnitFingerprint= '1234567890';         // LOOKIT ME: I'M A L33T H4CK3R :-)
...
--- End code ---

As I've said, it's not foolproof, but I think it would be a start particularly for targets such as Linux that don't have agreed conventions for binary signing.
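
The spoof check described above can be sketched in shell. This is only an illustration under assumptions: sha256sum stands in for Tiger, the unit names and the manifest file reported.sha256 are hypothetical, and the manifest stands for whatever list of (fingerprint, %sourcefile%) pairs the main unit would report.

```shell
#!/bin/sh
# Sketch: check fingerprints reported by a binary against the sources on disk.
# Assumptions: sha256sum stands in for Tiger; all file names are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two stand-in source units.
printf 'unit unita;\ninterface\nimplementation\nend.\n' > unita.pas
printf 'unit unitb;\ninterface\nimplementation\nend.\n' > unitb.pas

# What an honest build would report: one "hash  path" line per unit.
sha256sum unita.pas unitb.pas > reported.sha256

# A single combined fingerprint: a hash over the per-unit hashes.
combined=$(cut -d' ' -f1 reported.sha256 | sha256sum | cut -d' ' -f1)

# The administrator re-hashes the named files and compares.
if sha256sum -c reported.sha256 >/dev/null 2>&1; then
  honest=ok
else
  honest=mismatch
fi

# A unit is edited but keeps reporting the old, hard-coded fingerprint.
echo '// local modification' >> unitb.pas
if sha256sum -c reported.sha256 >/dev/null 2>&1; then
  tampered=ok
else
  tampered=mismatch
fi

echo "combined fingerprint: $combined"
echo "unmodified sources:   $honest"
echo "after modification:   $tampered"
```

Because each reported fingerprint names the file it came from, a hard-coded constant like the one above shows up as a mismatch as soon as the file is re-hashed.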

MarkMLl

Kays:

--- Quote from: MarkMLl on October 11, 2022, 10:54:05 am ---[…] Would it be possible to have something similar which presented a checksum or hash of the sourcefile […] ? […]
--- End quote ---
Every compiled unit has a couple of checksums; see

--- Code: Bash ---
ppudump -vh someunit.ppu
--- End code ---

I don’t think anything like a {$I %checksum%} will ever be supported.
* Inclusion of a checksum string will necessarily affect the calculated checksum.
* You may use {$I %checksum%} multiple times.
* And then you have the task to find a checksum string that (embedded in a unit) yields the checksum?

That’s not a compiler’s job, and unless you happen to have a quantum computer at home it can be unreasonably slow.
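
The first bullet can be illustrated in shell, under the assumption that the checksum is taken over the source text after the checksum string has been written into it. sha256sum is a stand-in for whichever checksum would be used, and the PLACEHOLDER token is made up for the example.

```shell
#!/bin/sh
# Sketch: writing a file's own digest into the file changes the digest.
# Assumptions: sha256sum is a stand-in checksum; PLACEHOLDER is a made-up token.
workdir=$(mktemp -d)
src="$workdir/someunit.pas"

# A unit with a placeholder where the checksum text would go.
cat > "$src" << 'EOT'
unit someunit;
interface
const
  C = 'PLACEHOLDER';
implementation
end.
EOT

before=$(sha256sum "$src" | cut -d' ' -f1)

# Substitute the digest into the source text itself.
sed "s/PLACEHOLDER/$before/" "$src" > "$src.new"
after=$(sha256sum "$src.new" | cut -d' ' -f1)

echo "before substitution: $before"
echo "after substitution:  $after"
```

The two digests differ, so a string that hashes to itself once embedded would have to be searched for rather than computed.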

MarkMLl:

--- Quote from: Kays on October 11, 2022, 11:55:30 am ---Inclusion of a checksum string will necessarily affect the calculated checksum.

--- End quote ---

No, it only affects the symbol table value in exactly the same way as %file% etc.

MarkMLl

Kays:

--- Quote from: MarkMLl on October 11, 2022, 12:15:50 pm ---
--- Quote from: Kays on October 11, 2022, 11:55:30 am ---Inclusion of a checksum string will necessarily affect the calculated checksum.

--- End quote ---
No, it only affects the symbol table value in exactly the same way as %file% etc.
--- End quote ---
I cannot verify that
--- Code: Bash ---
$ cat > someunit.pas << EOT
unit someunit;
        interface
                const
                        F = {\$I %file%};
        implementation
end.
EOT
$ ln -s someunit.pas someunit.pp
$ fpc someunit.pas
$ ppudump someunit.ppu | grep Checksum
Checksum                : 37521D2F
Interface Checksum      : A98CDFB9
Indirect Checksum       : E1A3CEBA
$ fpc someunit.pp
$ ppudump someunit.ppu | grep Checksum
Checksum                : 38DB330D
Interface Checksum      : 97E2CB1D
Indirect Checksum       : E1A3CEBA
--- End code ---

MarkMLl:
What the Hell are you on about, man? I said absolutely nothing about the .ppu; I said take a hash of the SOURCE FILE and assign that as a constant value, in EXACTLY THE SAME WAY as the existing %file% expansions work.

Hence tentatively


--- Code: Pascal ---
program test;

const
  fingerprint= {$i %file% } + ' ' + {$i %sourcehash% } ;

begin
  WriteLn('Program generated using ', fingerprint)
end.
--- End code ---

So this has absolutely nothing to do with a preprocessor substitution of %sourcehash% into the source text, which as you observe would obviously change the checksum. It works in entirely the same way as the compiler already recognises and substitutes %file% etc.
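
For comparison, something close to this can be approximated with a feature the compiler already has: {$I %NAME%} expands to the value of the environment variable NAME at compile time. What follows is a minimal sketch, not a proposal for how %sourcehash% would be implemented. It assumes sha256sum as a stand-in for Tiger and a made-up variable name, SOURCEHASH. Since one variable is shared by the whole compiler run, it only suits a single source file, not per-unit fingerprints.

```shell
#!/bin/sh
# Sketch: emulate a source-hash constant with an environment variable.
# Assumptions: sha256sum stands in for Tiger; SOURCEHASH is a made-up name.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Quoted delimiter so the shell leaves {$I ...} untouched.
cat > test.pas << 'EOT'
program test;

const
  fingerprint = {$I %FILE%} + ' ' + {$I %SOURCEHASH%};

begin
  WriteLn('Program generated using ', fingerprint)
end.
EOT

# Hash the file exactly as it sits on disk. The file itself is never edited,
# so the hash does not depend on its own value.
SOURCEHASH=$(sha256sum test.pas | cut -d' ' -f1)
export SOURCEHASH

# Compile only if fpc is installed; the compiler reads SOURCEHASH from the environment.
if command -v fpc >/dev/null 2>&1; then
  fpc test.pas && ./test
else
  echo "fpc not found; would run: SOURCEHASH=$SOURCEHASH fpc test.pas"
fi

# Re-hashing afterwards gives the same value, since the source was not modified.
recheck=$(sha256sum test.pas | cut -d' ' -f1)
echo "source hash before: $SOURCEHASH"
echo "source hash after:  $recheck"
```

The hash is computed over the file as stored, and the value only enters the program as a constant, so the source file's own hash is unchanged by the build.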

It would, obviously, result in a performance penalty: but this is not something that everybody would need, and if they did it might not be needed for every unit of a program (in the case of a program which set capabilities, it might only be needed in the one unit which called the system-level library).

MarkMLl