Recent

Author Topic: [SOLVED] Is Parralellism Possible with Hash LIbraries?  (Read 4556 times)

Gizmo

  • Hero Member
  • *****
  • Posts: 831
[SOLVED] Is Parralellism Possible with Hash LIbraries?
« on: April 20, 2015, 09:52:47 pm »
A user has asked me about the possibility of introducing "parallelism" to my program, so that multiple CPU power can be used to analyse a single object (files, typically).

In my case, my program computes hashes of files typically using the FPC hash library and sometimes the DCPCrypt library for larger algorithms. Having done some reading, I notice there are wiki entries for "multi-threading" and "parallelism" - one seems to relate more to program responsive and the latter appears to relate more to computational speed. I'm curious to know just how feasible such a concept is when using the FPC hash libraries?

For example, the following computes a hash of a file in my program currently :

Code: [Select]
var
  GeneratedHash : string;
begin
GeneratedHash := MD5Print(MD5File( OpenDialog.Filename));
end;

Lets say that file is 1Tb is size, then the program is unresponsive until MD5Print returns the computed value. That issue might benefit from multithreading, and though desirable, is not key to the issue. More significantly is whether it is possible to speed up the process by splitting the task between available CPUs.  Is it even possible to say "CPU1 compute 250Gb of the file, CPU2 250Gb of it, CPU3 250Gb of it, CPU4 250Gb of it"?
« Last Edit: April 21, 2015, 12:49:40 pm by Gizmo »

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
Re: Is Parralellism Possible with Hash LIbraries?
« Reply #1 on: April 20, 2015, 10:06:53 pm »
I looked this up for sha, and it seems the answer is no. Some hashes (like some of the SHA256 candidates) have special parallel modes, but if a non-parallel mode is chosen for you (by an existing hash value) then you are stuck.

MD5, being old, probably doesn't have such mode.

Gizmo

  • Hero Member
  • *****
  • Posts: 831
Re: Is Parralellism Possible with Hash LIbraries?
« Reply #2 on: April 20, 2015, 11:14:19 pm »
My program uses md5, sha1, from fpc and sha256, and sha512 from dcpcrypt library. What are these modes you speak of marcov.

skalogryz

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 2770
    • havefunsoft.com
Re: Is Parralellism Possible with Hash LIbraries?
« Reply #3 on: April 21, 2015, 01:50:45 am »
A user has asked me about the possibility of introducing "parallelism" to my program....
...Lets say that file is 1Tb is size, then the program is unresponsive until MD5Print returns the computed value...
Since the question is coming from a user, maybe she/he means: "Can parallelism be introduced to prevent program from being unresponsive"? Just calculate the hash in another thread and leave the program running.

It's user's language :)
« Last Edit: April 21, 2015, 01:58:29 am by skalogryz »

Septe

  • Jr. Member
  • **
  • Posts: 68
Re: Is Parralellism Possible with Hash LIbraries?
« Reply #4 on: April 21, 2015, 06:21:59 am »
I'm curious myself.  I have doubts that fpc was designed with parallelism.  This is a very specific type of compiler with access to specific cores built into the language itself.  It's not enough to "break the file into smaller pieces" as you'd have to be able to direct a specific thread to a specific core.  My knowledge isn't sufficient to know if fpc can even do that.

cdbc

  • Hero Member
  • *****
  • Posts: 1026
    • http://www.cdbc.dk
Re: Is Parralellism Possible with Hash LIbraries?
« Reply #5 on: April 21, 2015, 06:31:49 am »
Hi
I think "aminer" is the one to ask, he does a lot of paralel libraries.
Maybe send him a PM  ;)
Regards Benny
If it ain't broke, don't fix it ;)
PCLinuxOS(rolling release) 64bit -> KDE5 -> FPC 3.2.2 -> Lazarus 2.2.6 up until Jan 2024 from then on it's: KDE5/QT5 -> FPC 3.3.1 -> Lazarus 3.0

Septe

  • Jr. Member
  • **
  • Posts: 68
Re: Is Parralellism Possible with Hash LIbraries?
« Reply #6 on: April 21, 2015, 07:05:05 am »
After researching how a sha1 hash works, it's my uninformed opinion that breaking the file up wouldn't work as it'd change how the sha1 hashes.  If I read the tutorial correctly, each piece is dependent on the previous pieces.  You can design a hash system for parallel processing but it's no longer sha1 but a new hashing system and thus, need to use the same process of parallelism to reverse the hash to the original form.


marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
Re: Is Parralellism Possible with Hash LIbraries?
« Reply #7 on: April 21, 2015, 10:08:39 am »
I'm curious myself.  I have doubts that fpc was designed with parallelism.  This is a very specific type of compiler with access to specific cores built into the language itself. 

Even a parallel language can't parallelise an algorithm that is basically F(n)=G(F(n-1),some inputbytes)

Simply because F(n-1) must be known before calculating F(n). I believe the said SHA256 variants were more block oriented to fix that.

And few popular languages are truly parallel. They all have some parallel or functional extensions to their normal imperative behaviour, but it is not something that goes to the core.

Gizmo: I doubt you'll find it in dcpcrypt or readily made Pascal code. As said it was a candidate for the SHA256/3 hashes that had a block mode. Not necessarily the winner. So you'll probably have to research it yourself.
« Last Edit: April 25, 2015, 04:17:59 pm by marcov »

Gizmo

  • Hero Member
  • *****
  • Posts: 831
Re: Is Parralellism Possible with Hash LIbraries?
« Reply #8 on: April 21, 2015, 12:49:29 pm »
Thanks all for your input and thanks Marcov for the details.

To be clear - the user who submitted the request was a genome science guy. They're using my program (QuickHash GUI) to hash enormous files of genome data. And while QuickHash is doing it OK, they are simply suggesting that if they could utilise more of their CPU cores, it would be done faster. So they weren't referring to application responsiveness.

DCPCrypt does indeed use a more block based system but I was also not sure if it was capabale of parralellism.

I have asked the question to my fellow devloping heroes, and got some fairly conclusive replies. I can go back to the user now and basically say "Thanks, but I can't implement that request".

Thread marked as Solved.

 

TinyPortal © 2005-2018