Author Topic: Modular hashing/checksum library with 150+ algos  (Read 1438 times)

domasz

  • Sr. Member
  • ****
  • Posts: 413
Modular hashing/checksum library with 150+ algos
« on: November 25, 2022, 12:26:09 pm »
I started my own hashing/checksum library:
https://github.com/PascalVault/Lazarus_Hashing
Currently there are 150+ algorithms.

Why? I wanted a modular library. Other libraries I know of are quite bulky. In this library, if you need just one algorithm for your program, you can copy the file with that algorithm plus "HasherBase.pas" and you are good to go. There are no packages and no installation is required: just put the files in your project directory and add them to "uses".

There is no assembly code and there are no DLLs, and it doesn't require any other libraries, so it can be used from a clean Lazarus installation.

You can find here, among other things, almost every possible CRC variant. However, this is the only library on the Internet, in any programming language, where all these CRC variants are lookup-table-based, so they are fast.

The most popular hashing functions, like SHA-1 and MD5, are not yet in this library. You can get those from other libraries. In the (near, hopefully) future these will also be available in my library.

Usage example, hashing a String:

Code: Pascal
uses Hasher;

var Hasher: THasher;
    Hash: AnsiString;
begin
  Hasher := THasher.Create('CRC-32 JAMCRC');
  try
    Hasher.Update('123456789');
    Hash := Hasher.Final;
  finally
    Hasher.Free;  // released even if hashing raises an exception
  end;

  Memo1.Lines.Add( Hash );
end;

Hashing a Stream:

Code: Pascal
uses Hasher;

var Hasher: THasher;
    Hash: String;
    Msg: TMemoryStream;
begin
  Msg := TMemoryStream.Create;  // empty here; load your data into it first
  try
    Hasher := THasher.Create('CRC-32 JAMCRC');
    try
      Hasher.Update(Msg);
      Hash := Hasher.Final;
    finally
      Hasher.Free;
    end;
  finally
    Msg.Free;
  end;

  Memo1.Lines.Add( Hash );
end;

THasher is just a helper class. You can use the algorithm classes directly:

Code: Pascal
uses CRC64;

var Hasher: THasherCRC64;
    Hash: String;
    Msg: AnsiString;
begin
  Msg := '123456789';
  Hasher := THasherCRC64.Create;
  Hasher.Update(@Msg[1], Length(Msg));
  Hash := Hasher.Final;
  Hasher.Free;

  Memo1.Lines.Add( Hash );
end;

There is also a PHP port (but missing many algos):
https://github.com/PascalVault/Lazarus_Hashing/tree/main/PHP

Licensed under the terms of MIT so you can freely use it even in commercial projects.

BeniBela

  • Hero Member
  • *****
  • Posts: 905
    • homepage
Re: Modular hashing/checksum library with 150+ algos
« Reply #1 on: November 26, 2022, 12:20:25 am »
That reminds me that I am using One-at-a-time in my hashmaps, and should eventually replace it with a faster hash.

But you do not have the fastest hashes, like XXH3, City64, Murmur3?

And the output is bad for hashmaps because it needs an integer not a string.
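To illustrate the point about hashmaps wanting an integer, here is a minimal Free Pascal sketch of Bob Jenkins' One-at-a-time hash returning a Cardinal that is reduced to a bucket index. The names OneAtATime and BucketCount are made up for this illustration and are not part of the library discussed in this thread.

```pascal
program OneAtATimeDemo;
{$mode objfpc}{$H+}
{$R-}{$Q-}  // the hash relies on 32-bit wrap-around arithmetic

// Jenkins One-at-a-time hash over a byte buffer (hypothetical helper name).
function OneAtATime(P: PByte; Len: SizeInt): Cardinal;
begin
  Result := 0;
  while Len > 0 do
  begin
    Result := Result + P^;
    Result := Result + (Result shl 10);
    Result := Result xor (Result shr 6);
    Inc(P);
    Dec(Len);
  end;
  Result := Result + (Result shl 3);
  Result := Result xor (Result shr 11);
  Result := Result + (Result shl 15);
end;

const
  BucketCount = 64;  // illustrative table size
var
  Key: AnsiString;
  H: Cardinal;
begin
  Key := 'a';
  H := OneAtATime(PByte(@Key[1]), Length(Key));
  // Known value for the one-byte key 'a'
  if H <> $CA2E9442 then Halt(1);
  // An integer hash maps straight to a bucket, no string parsing needed
  WriteLn('bucket: ', H mod BucketCount);
end.
```

A hasher that returns its digest as text would need an extra conversion step before it could be used this way.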

It would be easier to use in one's own project, if it was procedural code without classes, and we only had to import a single unit, rather than two

Perhaps these Move(Msg^, Tmp2, 4); can be replaced with tmp2 := unaligned(PCardinal(Msg)^)

domasz

  • Sr. Member
  • ****
  • Posts: 413
Re: Modular hashing/checksum library with 150+ algos
« Reply #2 on: November 26, 2022, 12:33:15 pm »
But you do not have the fastest hashes, like XXH3, City64, Murmur3?
I started this library 2 weeks ago and just haven't coded those yet. I've just added Murmur3:
https://github.com/PascalVault/Lazarus_Hashing/blob/main/MurmurHash3.pas

And the output is bad for hashmaps because it needs an integer not a string.
No library will satisfy everyone. You can adapt the code for your needs. I tried to write everything as simply as possible.

It would be easier to use in one's own project, if it was procedural code without classes, and we only had to import a single unit, rather than two
Yes, but if someone needs multiple algorithms in one program then procedural code would be worse for him/her.

Perhaps these Move(Msg^, Tmp2, 4); can be replaced with tmp2 := unaligned(PCardinal(Msg)^)
You are right, thank you.

abouchez

  • Full Member
  • ***
  • Posts: 110
    • Synopse
Re: Modular hashing/checksum library with 150+ algos
« Reply #3 on: November 26, 2022, 03:12:21 pm »
Good idea.
Even if not used directly, it is a good reference for the source code of odd hash algorithms in Pascal.  O:-)

There is a similar hash registration in mORMot 2.
You can either call the low-level hashing functions directly, or use a high-level catalog.
https://github.com/synopse/mORMot2/blob/master/src/crypt/mormot.crypt.secure.pas#L946

Code: Pascal
var hasher: ICryptHash;
begin
  hasher := Hash('crc32c');
  hasher.Update('some text');
  writeln(hasher.Final);
end;

or in a one-liner:

Code: Pascal
writeln(Hash('crc32c').Full('some text'));

The mormot.crypt.secure unit supports 'md5', 'sha1', 'sha256', 'sha384', 'sha512', 'sha3_256', 'sha3_512' and the 32-bit non-cryptographic 'crc32', 'crc32c', 'xxhash32', 'adler32', 'fnv32' for ICryptHash, and there is ICryptSigner for salted digests like 'hmac-sha1', 'hmac-sha256', 'hmac-sha384', 'hmac-sha512', and 'sha3-224', 'sha3-256', 'sha3-384', 'sha3-512', 'sha3-s128', 'sha3-s256'.

There is the same catalog system for encryption, randomness, and also certificates and asymmetric cryptography, using either native mORMot code or OpenSSL.
« Last Edit: November 26, 2022, 03:25:23 pm by abouchez »

abouchez

  • Full Member
  • ***
  • Posts: 110
    • Synopse
Re: Modular hashing/checksum library with 150+ algos
« Reply #4 on: November 26, 2022, 03:22:21 pm »
However this is the only library on the Internet in any programming language where all these CRC variants are lookup-table-based- so they are fast.
I am not sure it is the only one with lookup tables, but the most common variants (e.g. crc32 or crc32c) are certainly slow compared to most production code.
For instance, you could use 8 lookup tables instead of a single one. The zlib code will be an order of magnitude faster than this single-table code.
And... there are crc32c HW asm opcodes that reach more than 20GB/s on any recent CPU, and libdeflate's crc32 reaches similar speed.
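The "8 lookup tables" idea is usually called slicing-by-8. The following is a minimal Free Pascal sketch for the standard reflected CRC-32 (polynomial $EDB88320). It is not taken from this library, zlib or mORMot, and the names InitTables, Crc32Bytewise and Crc32Slice8 are made up for the illustration. It checks the 8-table version against the single-table one.

```pascal
program SlicingBy8Demo;
{$mode objfpc}{$H+}
{$R-}{$Q-}

var
  T: array[0..7, 0..255] of Cardinal;  // T[0] is the classic single table

procedure InitTables;
var
  i, k: Integer;
  c: Cardinal;
begin
  for i := 0 to 255 do
  begin
    c := i;
    for k := 1 to 8 do
      if (c and 1) <> 0 then
        c := (c shr 1) xor $EDB88320
      else
        c := c shr 1;
    T[0, i] := c;
  end;
  // Each further table advances the previous one by one more zero byte
  for i := 0 to 255 do
  begin
    c := T[0, i];
    for k := 1 to 7 do
    begin
      c := T[0, c and $FF] xor (c shr 8);
      T[k, i] := c;
    end;
  end;
end;

// Classic one-byte-per-step table lookup
function Crc32Bytewise(P: PByte; Len: SizeInt): Cardinal;
begin
  Result := $FFFFFFFF;
  while Len > 0 do
  begin
    Result := (Result shr 8) xor T[0, (Result xor P^) and $FF];
    Inc(P);
    Dec(Len);
  end;
  Result := not Result;
end;

// Eight bytes per step, using all eight tables
function Crc32Slice8(P: PByte; Len: SizeInt): Cardinal;
var
  A, B: Cardinal;
begin
  Result := $FFFFFFFF;
  while Len >= 8 do
  begin
    // LEtoN keeps the byte order right on big-endian targets
    A := LEtoN(unaligned(PCardinal(P)^)) xor Result;
    B := LEtoN(unaligned(PCardinal(P + 4)^));
    Result := T[7, A and $FF] xor T[6, (A shr 8) and $FF] xor
              T[5, (A shr 16) and $FF] xor T[4, A shr 24] xor
              T[3, B and $FF] xor T[2, (B shr 8) and $FF] xor
              T[1, (B shr 16) and $FF] xor T[0, B shr 24];
    Inc(P, 8);
    Dec(Len, 8);
  end;
  while Len > 0 do  // remaining 0..7 bytes, one at a time
  begin
    Result := (Result shr 8) xor T[0, (Result xor P^) and $FF];
    Inc(P);
    Dec(Len);
  end;
  Result := not Result;
end;

var
  Msg: AnsiString;
  C1, C8: Cardinal;
begin
  InitTables;
  Msg := '123456789';
  C1 := Crc32Bytewise(PByte(@Msg[1]), Length(Msg));
  C8 := Crc32Slice8(PByte(@Msg[1]), Length(Msg));
  // $CBF43926 is the standard CRC-32 check value for '123456789'
  if (C1 <> C8) or (C8 <> $CBF43926) then Halt(1);
  WriteLn('both variants agree');
end.
```

The gain comes from the main loop doing eight independent table lookups per iteration instead of eight dependent ones. Actual speed-ups depend on the CPU and compiler and are not measured here.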

So performance is not the main point of this library.
Simplicity and algorithms coverage are its main points.
Nice work.
« Last Edit: November 26, 2022, 03:32:23 pm by abouchez »

abouchez

  • Full Member
  • ***
  • Posts: 110
    • Synopse
Re: Modular hashing/checksum library with 150+ algos
« Reply #5 on: November 26, 2022, 03:37:28 pm »
BTW, in THasher.Update I would not use TStream.Size.
Some TStream classes don't support it, e.g. if they are streams redirected from other streams.

This is how I did it in mORMot:
https://github.com/synopse/mORMot2/commit/48254f80662884d30b6f8254cb32ac9e2226a083

That is, just read and hash data until the stream ends.
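A minimal Free Pascal sketch of that read-until-the-end loop, which never touches TStream.Size. It is neither the mORMot commit nor this library's code. UpdateSum is a hypothetical stand-in for a real hasher's Update and simply sums the bytes.

```pascal
program StreamReadLoopDemo;
{$mode objfpc}{$H+}
uses
  Classes;

// Hypothetical stand-in for a hasher's Update: adds up all bytes.
procedure UpdateSum(var Sum: Cardinal; P: PByte; Len: LongInt);
var
  i: LongInt;
begin
  for i := 0 to Len - 1 do
    Inc(Sum, P[i]);
end;

// Feeds the stream to UpdateSum in chunks until Read returns no more data.
function SumStream(S: TStream): Cardinal;
var
  Buf: array[0..4095] of Byte;
  Got: LongInt;
begin
  Result := 0;
  repeat
    Got := S.Read(Buf, SizeOf(Buf));
    if Got > 0 then
      UpdateSum(Result, @Buf[0], Got);
  until Got <= 0;
end;

var
  Msg: TStringStream;
begin
  Msg := TStringStream.Create('123456789');
  try
    Msg.Position := 0;  // make sure reading starts at the beginning
    // The characters '1'..'9' are bytes 49..57, which add up to 477
    if SumStream(Msg) <> 477 then Halt(1);
    WriteLn('stream consumed without using Size');
  finally
    Msg.Free;
  end;
end.
```

The loop relies only on Read, so it also works for streams that cannot report their size or seek.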

domasz

  • Sr. Member
  • ****
  • Posts: 413
Re: Modular hashing/checksum library with 150+ algos
« Reply #6 on: November 26, 2022, 05:08:26 pm »
Thanks for checking out the lib! Your mORMot seems very nice and your one-liner is quite brilliant :)

domasz

  • Sr. Member
  • ****
  • Posts: 413
Re: Modular hashing/checksum library with 150+ algos
« Reply #7 on: November 26, 2022, 05:12:37 pm »
BTW, in THasher.Update I would not use TStream.Size.
Some TStream classes don't support it, e.g. if they are streams redirected from other streams.

This is how I did it in mORMot:
https://github.com/synopse/mORMot2/commit/48254f80662884d30b6f8254cb32ac9e2226a083

That is, just read and hash data until the stream ends.

You are right. I changed my code, thank you!

domasz

  • Sr. Member
  • ****
  • Posts: 413
Re: Modular hashing/checksum library with 150+ algos
« Reply #8 on: November 26, 2022, 05:16:31 pm »
Perhaps these Move(Msg^, Tmp2, 4); can be replaced with tmp2 := unaligned(PCardinal(Msg)^)

Thanks, I updated the code.
Also added SHA-1 and XXHash32.

wp

  • Hero Member
  • *****
  • Posts: 11830
Re: Modular hashing/checksum library with 150+ algos
« Reply #9 on: November 26, 2022, 05:51:57 pm »
I quickly scanned through some of the units and found that some of them have "Dialogs" in "uses" (e.g. BKDRHash, rshash, xxHash32, plus some more). Is this really needed? If not, this would add an unnecessary dependency on the LCL.

domasz

  • Sr. Member
  • ****
  • Posts: 413
Re: Modular hashing/checksum library with 150+ algos
« Reply #10 on: November 26, 2022, 06:02:24 pm »
I quickly scanned through some of the units and found that some of them have "Dialogs" in "uses" (e.g. BKDRHash, rshash, xxHash32, plus some more). Is this really needed? If not, this would add an unnecessary dependency on the LCL.
No, it can be safely removed. It's just some residue I forgot to remove.
Should all be cleaned now.
« Last Edit: November 26, 2022, 09:11:07 pm by domasz »

BeniBela

  • Hero Member
  • *****
  • Posts: 905
    • homepage
Re: Modular hashing/checksum library with 150+ algos
« Reply #11 on: December 01, 2022, 12:47:58 am »
No library will satisfy everyone. You can adapt the code for your needs. I tried to write everything as simply as possible.

Perhaps I will just copy the update function


Perhaps these Move(Msg^, Tmp2, 4); can be replaced with tmp2 := unaligned(PCardinal(Msg)^)
You are right, thank you.

But you need to write unaligned(..), otherwise it might crash on some platforms.
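A small Free Pascal sketch of the two ways to read 4 bytes from a possibly misaligned address. The buffer and variable names are invented for the illustration. Move copies byte by byte, while the direct dereference needs the unaligned() intrinsic so the compiler emits code that is safe on targets that fault on misaligned loads.

```pascal
program UnalignedDemo;
{$mode objfpc}{$H+}

var
  Buf: array[0..7] of Byte = (0, $78, $56, $34, $12, 0, 0, 0);
  Msg: PByte;
  ViaMove, ViaDeref: Cardinal;
begin
  // Buf[1] is very likely not on a 4-byte boundary
  Msg := @Buf[1];

  // Variant 1: byte-wise copy, always safe
  Move(Msg^, ViaMove, 4);

  // Variant 2: direct load, with unaligned() telling the compiler
  // not to assume natural alignment of the pointer
  ViaDeref := unaligned(PCardinal(Msg)^);

  if ViaMove <> ViaDeref then Halt(1);
  WriteLn('same value: ', ViaMove = ViaDeref);
end.
```

On x86 a plain PCardinal(Msg)^ usually works as well, which is why a missing unaligned() can go unnoticed when testing only on Windows.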

domasz

  • Sr. Member
  • ****
  • Posts: 413
Re: Modular hashing/checksum library with 150+ algos
« Reply #12 on: December 01, 2022, 12:50:39 am »

But you need to write unaligned(..) Otherwise it might crash on some platforms

I can only test it on Windows right now so other parts of code might also crash on some platforms.

 
