Recent

Author Topic: Parallelhashlist was updated to version 1.3 ...  (Read 5159 times)

aminer

  • Hero Member
  • *****
  • Posts: 956
Parallelhashlist was updated to version 1.3 ...
« on: May 17, 2012, 11:35:23 pm »

Hello,


In my previous version of ParallelHashList even if i have used
lock striping for the hash chains, synchronizing access to the counter
that computes the number of entries in the hashmap was reintroducing
the scalability problem of exclusive locking, this counter was  called
a hot field because every mutative operation needs to access it.
In this new version 1.3 of parallelhashlist i have splited the global
counter to many counters to enhance the scalability... also i have  changed
parallelhashlist to use only 100 lightweight  MREWs (multiple-readers -exclusive-writer)
this will lower the memory consumption  and this will allow to parallelize
the writes and reads in separate chains , and also to parallelize the reads
in the same chain of the hashmap , so this will give a good performance
and a good scalability . 

Description:

A Parallel HashList with O(1) (best case) and O(log(n)(worst case)
access that use a hash based method that uses lock striping and
100 lightweight  MREWs (multiple-readers -exclusive-writer).
This will allow to parallelize the writes and reads in separate chains , and
also to parallelize the reads in the same chain.


You can download parallelhashlist from:

http://pages.videotron.com/aminer/


Language: FPC Pascal v2.2.0+ / Delphi 5+: http://www.freepascal.org/


Operating Systems: Win , Linux and Mac (x86).

and please take a look at the benchmarks here:

http://pages.videotron.com/aminer/parallelhashlist/queue.htm

Note: When i have done those benchmarks , there was not enough/much items
organized as a self-balancing tree in the individual chains of the hashtable, so ,
almost all the items was found and inserted in O(1) , so the parallel part in the
Amdahl equation was not much bigger compared to to the serial part. But you
will notice in pratice that as soon as you will have more items on the chains of
the Hashtable, organized as self-balancing tree, with a worst case log(n)  , the
parallel part will become bigger in the Amdahl equation and you will have better
performance and scalability than the numbers in the graph of the benchmarks ...



Thank you.

Amine Moulay Ramdane.



aminer

  • Hero Member
  • *****
  • Posts: 956
Re: Parallelhashlist was updated to version 1.3 ...
« Reply #1 on: May 18, 2012, 01:03:15 am »

Hello again,


I have designed the Iterate() method to be scalable.

The iterate method calls the AIterateFunc function on each item in the hash

And you can use Iterate like this:

First declare two variables like this

var
:
IterFunc: TIterateFunc;
MyKey:pchar;


And after that declare your TIterateFunc like this:

---

function Iter(AUserData: Pointer; const AStr: string;
var APtr: Pointer): Boolean;

begin

Iter:=true;

writeln(Astr);

if Astr=Pchar(AUserData) then Iter:=false;

end;

---

and after that you add for example the following in the main body:



---

IterFunc:=Iter;

MyKey:='amine moulay ramdane 1';

hash.Iterate(MyKey,IterFunc);

---



That's all, i have just put a test1.pas example that uses the iterate() method

inside the zipfile, you have to uncomment three lines in the test1.pas example

to be able to use this method , please download again the zipfile from:

http://pages.videotron.com/aminer/





Thank you.

Amine Moulay Ramdane.





aminer

  • Hero Member
  • *****
  • Posts: 956
Re: Parallelhashlist was updated to version 1.3 ...
« Reply #2 on: May 18, 2012, 01:30:12 am »

I wrote:
> I have designed the Iterate() method to be scalable.


I mean parallel.


Amine Moulay Ramdane.


aminer

  • Hero Member
  • *****
  • Posts: 956
Re: Parallelhashlist was updated to version 1.3 ...
« Reply #3 on: May 18, 2012, 02:20:17 am »

Hello again,

You can use the Parallel Iterate() method of parallelhashlist
when also you want to do a sort by keys...



Amine Moulay Ramdane.

aminer

  • Hero Member
  • *****
  • Posts: 956
Re: Parallelhashlist was updated to version 1.3 ...
« Reply #4 on: May 18, 2012, 02:23:40 am »


When do we use parallelhashlist ?


"We use hash tables when their magic fits our problem.


For example, caching frequently ends up using a hash table -- for example, let's say we have 45,000 students in a university and some process needs to hold on to records for all of them. If you routinely refer to student by ID number, then a ID => student cache makes excellent sense. The operation you are optimizing for this cache is fast lookup.

Hashes are also extraordinarily useful for storing relationships between data when you don't want to go whole hog and alter the objects themselves. For example, during course registration, it might be a good idea to be able to relate students to the classes they are taking. However, for whatever reason you might not want the Student object itself to know about that. Use a studentToClassRegistration hash and keep it around while you do whatever it is you need to do.

They also make a fairly good first choice for a data structure except when you need to do one of the following:"



aminer

  • Hero Member
  • *****
  • Posts: 956
Re: Parallelhashlist was updated to version 1.3 ...
« Reply #5 on: May 18, 2012, 02:59:04 am »

Hello again,

One final touch to parallelhashlist: i have just added this
to the constructor:

if AHashSize < 1000
then
begin
 writeln('Constructor''s Hash size argument must be superior or equal to 1000');
 exit;
end;



So as you have noticed the hash size argument must be superior or equal to 1000


Please download again the parallelhashlist zipfile from:

http://pages.videotron.com/aminer/



Amine Moulay Ramdane.




aminer

  • Hero Member
  • *****
  • Posts: 956
Re: Parallelhashlist was updated to version 1.3 ...
« Reply #6 on: May 18, 2012, 03:01:38 am »



I wrote:
>One final touch to parallelhashlist


I mean a final touch to version 1.3



Amine Moulay Ramdane.


aminer

  • Hero Member
  • *****
  • Posts: 956
Re: Parallelhashlist was updated to version 1.3 ...
« Reply #7 on: May 18, 2012, 04:03:02 am »

When i wrote:

"This will allow to parallelize the writes and reads in separate chains ,
and also to parallelize the reads in the same chain."

I was speaking about the previous version...

Here is the new description of parallelhashlist:

 
Description:

A parallel HashList with O(1) (best case) and O(log(n)(worst case)
access  that use a hash based method that uses lock striping and
100 lightweight MREWs (multiple-readers -exclusive-writer) .
This allows multiple threads to write and read concurently.



Thank you.


Amine Moulay Ramdane.



aminer

  • Hero Member
  • *****
  • Posts: 956
Re: Parallelhashlist was updated to version 1.3 ...
« Reply #8 on: May 18, 2012, 04:24:11 am »


I am using lock striping with 100 lightweight MREWs (multiple-readers -exclusive-writer)

this allows up to 100 parallel writes, but this upper bound on writes will not affect parallel reads...



Amine Moulay Ramdane.



 

TinyPortal © 2005-2018