1) There are definite speed enhancements to SHA1 of between 20% and 35% in my tests. I've also added the same transform function to the file hashing side of QuickHash, and speed improvements of around the same amount are gained there too, filesystem issues permitting.
Just for testing, you might want to try the same compiler I'm using. It's in a package by TrueTom called Laz4Android, which used to be here:
(laz4android1.1-41139). It seems the compiler optimizations change between versions, so you will not see the same results I am getting if you use a different version of the compiler.
2) Further to our private message, I have incorporated the MD5 code for MD5Transform, too. Again, files that took 2m 27s in earlier versions of QuickHash now compute in 1m 36s after a reboot! So that's an astounding improvement with no filesystem caching involved. Clearly the Transform functions in the FPC hashing libraries have been in need of some tender love and care.
Glad you didn't mind switching back to the public thread. I believe you should get better results. Just try a small project with nothing more than MD5 hashing using laz4android1.1-41139. As you mentioned previously, your compiler does not recognize USEEBP, which has a great impact on MD5Transform.
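A small test project along those lines could look like the sketch below. It assumes the stock md5 unit that ships with FPC; the program name md5bench and the 64 MiB buffer size are only illustrative choices, not something from QuickHash. To compare, build it once as usual and once with -OoUSEEBP added to the compiler options (for example fpc -O2 -OoUSEEBP md5bench.pas), using a copy of the md5 unit that contains the modified MD5Transform.

```pascal
program md5bench;
{$mode objfpc}{$H+}

uses
  SysUtils, DateUtils, md5;

const
  // Illustrative size only; large enough that the hashing time dominates.
  BufSize = 64 * 1024 * 1024;

var
  Data: array of Byte;
  Digest: TMD5Digest;
  T0: TDateTime;
  i: Integer;

begin
  // Sanity check against the well-known MD5 test vector for 'abc',
  // so a broken transform is caught before any timing is trusted.
  if MD5Print(MD5String('abc')) <> '900150983cd24fb0d6963f7d28e17f72' then
  begin
    WriteLn('MD5 self-test failed');
    Halt(1);
  end;

  // Hash an in-memory buffer so that filesystem caching plays no part.
  SetLength(Data, BufSize);
  for i := 0 to High(Data) do
    Data[i] := i and $FF;

  T0 := Now;
  Digest := MD5Buffer(Data[0], Length(Data));
  WriteLn('Digest : ', MD5Print(Digest));
  WriteLn('Elapsed: ', MilliSecondsBetween(Now, T0), ' ms');
end.
```

The timings are only comparable between runs on the same machine with the same buffer size; the digest line should be identical for every build, otherwise the modified transform is wrong.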
3) You and I both agree that for the benefit of Freepascal itself and the other forum members, it would be useful if you post the MD5 implementation too, just as you did for SHA1. Then we have a public record of our fixes and tests. So we can then submit patches for both MD5 and SHA1 at the same time. I am happy to contribute the completed and commented customised modules from QuickHash if it helps.
MD5Transform code:
procedure MD5Transform(var Context: TMDContext; Buffer: Pointer);
var
  a, b, c, d: Cardinal;
  Block: array[0..15] of Cardinal;
begin
  // Copy the 64-byte input block into 16 32-bit words
  Invert(Buffer, @Block, 64);
  a := Context.State[0];
  b := Context.State[1];
  c := Context.State[2];
  d := Context.State[3];
  {$push}
  {$r-,q-} // range and overflow checks off: the additions must wrap
  {$I md5replaced.txt} //<---- from the attachment
  inc(Context.State[0],a);
  inc(Context.State[1],b);
  inc(Context.State[2],c);
  inc(Context.State[3],d);
  {$pop}
  inc(Context.Length,64);
end;
md5replaced.inc is in the attachment. I got a major speed boost when I used -OoUSEEBP.
4) So come on then...spill the beans! Help us all understand exactly why and how your code works so much faster. I am intrigued and must confess to not fully understanding it at the moment. I understand about rotation in cryptographic ciphers and stuff, and I think this is what the Transform functions do. But I don't understand why your structures work so much better than the current FPC 2.6.4 implementation.
The current implementation is fine, readable and maintainable and I would not be able to do "beans" without it. The changes I did were inspired by Avra's post (Thanks Avra)
What I did is already documented on Wikipedia and elsewhere. For SHA1 it was mainly "loop unrolling", while for MD5 it was simply removing calls and replacing them with the code being called, similar to using "inline".
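To illustrate the idea of replacing calls with the code being called, here is a minimal sketch using the first four steps of MD5 round 1. It is not the content of the attachment; the names F, StepFF, ViaCalls, Inlined and StatesMatch are made up for this example, and the input words are arbitrary. It assumes RolDWord from the FPC system unit for the rotation.

```pascal
program inlinesketch;
{$mode objfpc}
{$r-,q-} // MD5 arithmetic relies on 32-bit wrap-around

type
  TState = array[0..3] of Cardinal;
  TWords = array[0..3] of Cardinal;

const
  Init: TState = ($67452301, $EFCDAB89, $98BADCFE, $10325476);
  X: TWords = ($80636261, 0, 0, 0); // arbitrary example input words

// MD5 round 1 auxiliary function.
function F(x, y, z: Cardinal): Cardinal;
begin
  Result := (x and y) or ((not x) and z);
end;

// One round 1 step, written as a separate routine that gets called.
procedure StepFF(var a: Cardinal; b, c, d, w: Cardinal; s: Byte; ac: Cardinal);
begin
  a := b + RolDWord(a + F(b, c, d) + w + ac, s);
end;

// Four steps using calls.
function ViaCalls: TState;
var
  a, b, c, d: Cardinal;
begin
  a := Init[0]; b := Init[1]; c := Init[2]; d := Init[3];
  StepFF(a, b, c, d, X[0], 7,  $D76AA478);
  StepFF(d, a, b, c, X[1], 12, $E8C7B756);
  StepFF(c, d, a, b, X[2], 17, $242070DB);
  StepFF(b, c, d, a, X[3], 22, $C1BDCEEE);
  Result[0] := a; Result[1] := b; Result[2] := c; Result[3] := d;
end;

// The same four steps with the bodies of StepFF and F written out in place.
function Inlined: TState;
var
  a, b, c, d: Cardinal;
begin
  a := Init[0]; b := Init[1]; c := Init[2]; d := Init[3];
  a := b + RolDWord(a + ((b and c) or ((not b) and d)) + X[0] + $D76AA478, 7);
  d := a + RolDWord(d + ((a and b) or ((not a) and c)) + X[1] + $E8C7B756, 12);
  c := d + RolDWord(c + ((d and a) or ((not d) and b)) + X[2] + $242070DB, 17);
  b := c + RolDWord(b + ((c and d) or ((not c) and a)) + X[3] + $C1BDCEEE, 22);
  Result[0] := a; Result[1] := b; Result[2] := c; Result[3] := d;
end;

// True when both variants produce the same state.
function StatesMatch: Boolean;
var
  p, q: TState;
  i: Integer;
begin
  p := ViaCalls;
  q := Inlined;
  Result := True;
  for i := 0 to 3 do
    if p[i] <> q[i] then
      Result := False;
end;

begin
  if not StatesMatch then
    Halt(1);
  WriteLn('Both variants produce the same state');
end.
```

Both variants compute the same values; the second one just avoids the call overhead and gives the compiler one long block to allocate registers for, which is also why the number of available registers starts to matter.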
Adding -OoUSEEBP here boosted the speed a lot because Intel CPUs have a limited number of registers. 32 bit CPUs have EAX, EBX, ECX, EDX, ESI, EDI, EBP and ESP. The last two are usually reserved (by the compiler) to deal with the stack, and to refer to parameters and variables. That leaves only six registers. If a code segment needs more than six registers, the compiler generates extra code to save registers to temporary memory locations, which slows things down. -OoUSEEBP instructs the compiler to use EBP as a general register, which improves speed by increasing the number of available registers from six to seven. In the case of MD5Transform the enhancement was huge (at least on my system) because the code that was generated by the compiler needed precisely seven registers.