OK, I have managed to implement the SHA1 assembler version (thanks to EngKin for the {$ASMMODE INTEL} compiler directive tip!)
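For anyone wondering what that directive actually changes: it tells Free Pascal to read asm blocks in Intel syntax (destination operand first) rather than its default AT&T syntax. Below is a minimal sketch of the idea, not the real SHA1Transform. The function name RotateLeft5 is made up for illustration; it rotates a 32-bit value left by 5 bits, which is one of the rotations SHA-1 uses. It assumes an x86 or x86_64 target.

```pascal
program AsmModeDemo;
{$mode objfpc}
{$ASMMODE INTEL}   // read asm blocks as Intel syntax: "op dest, src"

// Hypothetical helper for illustration only: rotate a 32-bit value left by 5.
function RotateLeft5(x: LongWord): LongWord;
var
  r: LongWord;
begin
  asm
    mov eax, x      // load the parameter
    rol eax, 5      // rotate left by 5 bits
    mov r, eax      // store into a local so Pascal code can return it
  end ['eax'];      // tell the compiler that EAX was modified
  Result := r;
end;

begin
  // Cross-check against the compiler's built-in rotate from the System unit.
  if RotateLeft5($80000001) = RolDWord($80000001, 5) then
    WriteLn('asm and RolDWord agree')
  else
    WriteLn('mismatch');
end.
```

Without the directive, the same block would have to be written AT&T style (source operand first, registers prefixed with %), which is why Intel-syntax listings copied from elsewhere fail to compile by default.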
SHA1
For files, running on a 32-bit Windows virtual machine hosted on a 64-bit AMD Linux workstation, I saw:
An average of 13 to 14 seconds using the Pascal enhanced SHA1Transform for a 1 GB file
An average of 6 to 8 seconds using the new assembly compiled SHA1Transform for a 1 GB file
For disks, running on the same platform:
Well, I need to check my time computations. They seem to average about 7 GB per minute (which is still very fast), but from time to time the rate leaps to nearly 14 GB per minute (which, if true, is mind-blowingly fast considering my workstation is about 6 years old!). I'm not sure if that is due to disk activity, disk caching etc., or if it's the way my program is computing the time. I will need to do more tests on native hardware rather than through virtualisation.
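As a sanity check on the time computation, here is a rough sketch of how a GB-per-minute figure could be derived from a byte count and an elapsed time in milliseconds. This is not the code from my program; the function name GBPerMinute and its parameters are assumptions for illustration, and it takes 1 GB to mean 1024^3 bytes.

```pascal
program ThroughputDemo;
{$mode objfpc}

// Hypothetical helper: converts bytes processed and elapsed milliseconds
// into gigabytes per minute (1 GB taken as 1024^3 bytes here).
function GBPerMinute(BytesRead, ElapsedMs: QWord): Double;
const
  BytesPerGB  = 1024.0 * 1024.0 * 1024.0;
  MsPerMinute = 60000.0;
begin
  if ElapsedMs = 0 then
    Exit(0.0);  // avoid dividing by zero on very short runs
  Result := (BytesRead / BytesPerGB) / (ElapsedMs / MsPerMinute);
end;

const
  SevenGB = QWord(7) * 1024 * 1024 * 1024;

begin
  WriteLn(GBPerMinute(SevenGB, 60000):0:2);  // 7 GB in 60 s -> 7.00
  WriteLn(GBPerMinute(SevenGB, 30000):0:2);  // 7 GB in 30 s -> 14.00
end.
```

The second call shows how sensitive the figure is: if the elapsed time is under-measured by half, or half of the reads are served from cache, the reported rate doubles from 7 to 14.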
For MD5
An average of 10 seconds using the Pascal enhanced MDTransform for a 1 GB file
An average of 5 seconds using the new assembly compiled MDTransform for a 1 GB file.
So, on my system, with this architecture, both look very promising: roughly a 50% reduction in hashing time for files with MD5, and around 40% for files with SHA1!
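For reference, the percentages come from comparing the old and new times for the same 1 GB file. Worked through with the averages quoted above (the SHA1 range just takes the least and most favourable pairings of the timings):

```latex
\text{reduction} = \frac{t_{\text{Pascal}} - t_{\text{asm}}}{t_{\text{Pascal}}}

\text{MD5:}\quad \frac{10 - 5}{10} = 50\%

\text{SHA1:}\quad \frac{13 - 8}{13} \approx 38\%
\quad\text{up to}\quad
\frac{14 - 6}{14} \approx 57\%
```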
EngKin's new paragraph for his CV (or 'Resume' if he's American): Developed the fastest open source transform procedure on Earth using assembly!