Recent

Author Topic: [SOLVED] Hash list with integer as key  (Read 13989 times)

Dibo

  • Hero Member
  • *****
  • Posts: 1048
[SOLVED] Hash list with integer as key
« on: December 12, 2014, 07:34:23 pm »
Hi,

I need a list where I add some integers. Size of list can be very
small but also very big (>10k). I need fast solution to quickly check
if integer already exists. TList is too slow. Im using now TFPHashList
with key as integer converted to string. Is exist something with key
as integer? Because converting integer to string cost CPU, not much if
used periodically but I also need compare two big hashed lists.

Regards
« Last Edit: December 16, 2014, 10:41:04 pm by Dibo »

Blaazen

  • Hero Member
  • *****
  • Posts: 2952
  • POKE 54296,15
    • Eye-Candy Controls
Re: Hash list with integer as key
« Reply #1 on: December 12, 2014, 08:27:39 pm »
Maybe TFPGMap from unit FGL?
Lazarus 2.1.0 r64044 FPC 3.3.1 r40507 x86_64-linux-qt Chakra, Qt 4.8.7/5.13.2, Plasma 5.17.3
Lazarus 1.8.2 r57369 FPC 3.0.4 i386-win32-win32/win64 Wine 3.21

Try Eye-Candy Controls: https://sourceforge.net/projects/eccontrols/files/

Dibo

  • Hero Member
  • *****
  • Posts: 1048
Re: Hash list with integer as key
« Reply #2 on: December 12, 2014, 08:54:12 pm »
Yes I thought about this too but I saw fight on news group that it is not hashed, just sequence search like in TList. Didn't check speed in my own tests yet

MathMan

  • Full Member
  • ***
  • Posts: 205
Re: Hash list with integer as key
« Reply #3 on: December 12, 2014, 09:29:03 pm »
Hi,

I need a list where I add some integers. Size of list can be very
small but also very big (>10k). I need fast solution to quickly check
if integer already exists. TList is too slow. Im using now TFPHashList
with key as integer converted to string. Is exist something with key
as integer? Because converting integer to string cost CPU, not much if
used periodically but I also need compare two big hashed lists.

Regards

Hi Dibo,

Just to check that I got it right - the items in the list are integers (32 / 64 bit?), you need to check if an integer is in the list and you need to compare two lists. And all this is time critical so you are looking for the fastest solution.

First I have a question regarding the comparison you mention. Shall the comparison only state if two lists are equal or do you need a comparison that shows if elements are in one list but not the other?

Regarding your approach - calculation of a hash value (at least a reasonable one that does not generate too many collisions) is time consuming. The values you maintain in the list can easily be compared directly - so you have a fast comparison of items. In this case I would guess that a balanced binary tree is faster - even for the sizes you mention (>10k entries). The comparison of two balanced binary trees is slower than comparing two hash lists - so the decision which way to go also depends on how often you need to compare full lists.

Kind regards,
MathMan

Windsurfer

  • Sr. Member
  • ****
  • Posts: 364
    • Windsurfer
Re: Hash list with integer as key
« Reply #4 on: December 12, 2014, 10:34:11 pm »
If the list is sorted, a binary search is fastest.

If this is the case, I have a routine to search a sorted array of strings TDateTime values that could be modified.
« Last Edit: December 12, 2014, 10:45:43 pm by Windsurfer »

Leledumbo

  • Hero Member
  • *****
  • Posts: 8273
  • Programming + Glam Metal + Tae Kwon Do = Me
Re: Hash list with integer as key
« Reply #5 on: December 13, 2014, 08:37:32 am »
Maybe TFPGMap from unit FGL?
It's a list with map-like behavior, not a hash based map.

Take a look at fcl-stl's ghashmap unit (need fpc trunk for best result, latest stable is not fully supported).

mse

  • Sr. Member
  • ****
  • Posts: 286
Re: Hash list with integer as key
« Reply #6 on: December 13, 2014, 01:12:39 pm »
MSEgui has tintegerhashdatalist, lib/common/kernel/msehash.pas.

Blaazen

  • Hero Member
  • *****
  • Posts: 2952
  • POKE 54296,15
    • Eye-Candy Controls
Re: Hash list with integer as key
« Reply #7 on: December 13, 2014, 01:37:13 pm »
Quote
It's a list with map-like behavior, not a hash based map.

I guess the only difference is that you have to hash keys yourself (and sort them, otherwise binary search fails).
Lazarus 2.1.0 r64044 FPC 3.3.1 r40507 x86_64-linux-qt Chakra, Qt 4.8.7/5.13.2, Plasma 5.17.3
Lazarus 1.8.2 r57369 FPC 3.0.4 i386-win32-win32/win64 Wine 3.21

Try Eye-Candy Controls: https://sourceforge.net/projects/eccontrols/files/

marcov

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8780
  • FPC developer.
Re: Hash list with integer as key
« Reply #8 on: December 13, 2014, 07:38:34 pm »
Quote
It's a list with map-like behavior, not a hash based map.

I guess the only difference is that you have to hash keys yourself (and sort them, otherwise binary search fails).

And that makes insertion performance going down the drain.

Blaazen

  • Hero Member
  • *****
  • Posts: 2952
  • POKE 54296,15
    • Eye-Candy Controls
Re: Hash list with integer as key
« Reply #9 on: December 13, 2014, 08:03:18 pm »
OK, instead of
Code: [Select]
Add();
Sort();
you should do
Code: [Select]
Find();
InsertKeyData();
The hashmap has to do it too (internally).
But yes, if fcl-stl has implemented hash-map, then it's the choice.
Lazarus 2.1.0 r64044 FPC 3.3.1 r40507 x86_64-linux-qt Chakra, Qt 4.8.7/5.13.2, Plasma 5.17.3
Lazarus 1.8.2 r57369 FPC 3.0.4 i386-win32-win32/win64 Wine 3.21

Try Eye-Candy Controls: https://sourceforge.net/projects/eccontrols/files/

MathMan

  • Full Member
  • ***
  • Posts: 205
Re: Hash list with integer as key
« Reply #10 on: December 13, 2014, 08:06:32 pm »
Quote
It's a list with map-like behavior, not a hash based map.

I guess the only difference is that you have to hash keys yourself (and sort them, otherwise binary search fails).

And that makes insertion performance going down the drain.

Exact - and that's why I initially talked about a "balanced binary tree" as alternative. Seems this has somehow magically turned to "sorted list with binary search" on the way - yeah I love chinese whispers even on a written forum  ;)

Dibo

  • Hero Member
  • *****
  • Posts: 1048
Re: Hash list with integer as key
« Reply #11 on: December 16, 2014, 10:40:48 pm »
Sorry for delay. I finally used TFPGMap<Int64, Byte> . I compared speed and memory use with TList, TStringList and TFPHashList. TFPGMap wins in all tests. Filling 1 million items is same quick as TList and check if exists beat all other lists. Memory use also low. Very handy class. Problem solved

Leledumbo

  • Hero Member
  • *****
  • Posts: 8273
  • Programming + Glam Metal + Tae Kwon Do = Me
Re: Hash list with integer as key
« Reply #12 on: December 16, 2014, 11:08:47 pm »
Sorry for delay. I finally used TFPGMap<Int64, Byte> . I compared speed and memory use with TList, TStringList and TFPHashList. TFPGMap wins in all tests. Filling 1 million items is same quick as TList and check if exists beat all other lists. Memory use also low. Very handy class. Problem solved
Interesting! Would you mind sharing how you compare them? I see you didn't put THashMap to the list, any reason?

Dibo

  • Hero Member
  • *****
  • Posts: 1048
Re: [SOLVED] Hash list with integer as key
« Reply #13 on: December 17, 2014, 01:12:24 am »
Well, I just used EpickTimer. This is my whole unit, nothing amazing, just connect events to buttons:
Code: Pascal  [Select][+][-]
  1. unit Unit1;
  2.  
  3. {$mode objfpc}{$H+}
  4.  
  5. interface
  6.  
  7. uses
  8.   Classes, SysUtils, FileUtil, Forms, Controls, Graphics, Dialogs, StdCtrls, EpikTimer,
  9.   contnrs, fgl;
  10.  
  11. type
  12.  
  13.   TMyMap = specialize TFPGMap<Int64, Byte>;
  14.  
  15.   { TForm1 }
  16.  
  17.   TForm1 = class(TForm)
  18.     Button1: TButton;
  19.     Button2: TButton;
  20.     Button3: TButton;
  21.     Button4: TButton;
  22.     procedure Button1Click(Sender: TObject);
  23.     procedure Button2Click(Sender: TObject);
  24.     procedure Button3Click(Sender: TObject);
  25.     procedure Button4Click(Sender: TObject);
  26.   private
  27.     { private declarations }
  28.     tm: TEpikTimer;
  29.     l1: TList;
  30.     l2: TStringList;
  31.     l3: TFPHashList;
  32.     l4: TMyMap;
  33.   public
  34.     { public declarations }
  35.     procedure AfterConstruction; override;
  36.     procedure BeforeDestruction; override;
  37.   end;
  38.  
  39. var
  40.   Form1: TForm1;
  41.  
  42. implementation
  43.  
  44. {$R *.lfm}
  45.  
  46. const
  47.   MAX_ITEMS = 1000000;
  48.   FIND_ITEM = 500000;
  49.  
  50. { TForm1 }
  51.  
  52. procedure TForm1.Button1Click(Sender: TObject);
  53. var
  54.   i: Integer;
  55. begin
  56.   tm.Clear();
  57.   tm.Start();
  58.   for i:=1 to MAX_ITEMS do
  59.     l1.Add(Pointer(i));
  60.   tm.Stop();
  61.   WriteLn('List fill time: ', tm.ElapsedStr);
  62.   tm.Clear();
  63.   tm.Start();
  64.   l1.IndexOf(pointer(FIND_ITEM));
  65.   tm.Stop();
  66.   WriteLn('List find time: ', tm.ElapsedStr);
  67. end;
  68.  
  69. procedure TForm1.Button2Click(Sender: TObject);
  70. var
  71.   i: Integer;
  72. begin
  73.   tm.Clear();
  74.   tm.Start();
  75.   for i:=1 to MAX_ITEMS do
  76.     l2.Add(IntToStr(i));
  77.   tm.Stop();
  78.   WriteLn('String fill time: ', tm.ElapsedStr);
  79.   tm.Clear();
  80.   tm.Start();
  81.   l2.Find(IntToStr(FIND_ITEM), i);
  82.   tm.Stop();
  83.   WriteLn('String find time: ', tm.ElapsedStr);
  84. end;
  85.  
  86. procedure TForm1.Button3Click(Sender: TObject);
  87. var
  88.   i: Integer;
  89. begin
  90.   tm.Clear();
  91.   tm.Start();
  92.   for i:=1 to MAX_ITEMS do
  93.     l3.Add(IntToStr(i), nil);
  94.   tm.Stop();
  95.   WriteLn('Hash fill time: ', tm.ElapsedStr);
  96.   tm.Clear();
  97.   tm.Start();
  98.   l3.Find(IntToStr(FIND_ITEM));
  99.   tm.Stop();
  100.   WriteLn('Hash find time: ', tm.ElapsedStr);
  101. end;
  102.  
  103. procedure TForm1.Button4Click(Sender: TObject);
  104. var
  105.   i: Integer;
  106. begin
  107.   tm.Clear();
  108.   tm.Start();
  109.   for i:=1 to MAX_ITEMS do
  110.     l4.Add(i, 0);
  111.   tm.Stop();
  112.   WriteLn('Map fill time: ', tm.ElapsedStr);
  113.   tm.Clear();
  114.   tm.Start();
  115.   Writeln(l4.Find(FIND_ITEM+1, i));
  116.   writeln(i);
  117.   tm.Stop();
  118.   WriteLn('Map find time: ', tm.ElapsedStr);
  119. end;
  120.  
  121. procedure TForm1.AfterConstruction;
  122. begin
  123.   inherited AfterConstruction;
  124.   tm := TEpikTimer.Create(nil);
  125.   l1 := TList.Create;
  126.   l2 := TStringList.Create;
  127.   l3 := TFPHashList.Create;
  128.   l4 := TMyMap.Create;
  129. end;
  130.  
  131. procedure TForm1.BeforeDestruction;
  132. begin
  133.   l1.Free;
  134.   l2.Free;
  135.   l3.Free;
  136.   l4.Free;
  137.   tm.Free;
  138.   inherited BeforeDestruction;
  139. end;
  140.  
  141. end.
  142.  

For RAM, I just checked task manager after button click

Leledumbo

  • Hero Member
  • *****
  • Posts: 8273
  • Programming + Glam Metal + Tae Kwon Do = Me
Re: [SOLVED] Hash list with integer as key
« Reply #14 on: December 17, 2014, 04:56:50 am »
Well, I just used EpickTimer. This is my whole unit, nothing amazing, just connect events to buttons:
Tested myself, it's amazing to know how TFPGMap can beat all other maps though this might be key type specific. THashmap depends much on the hash function, but even a simple `a mod n` is not enough to beat TFPGMap. The final result is a tight gap on insertion, on searching both are equal.

 

TinyPortal © 2005-2018