Print Page - (hash)maps: the final benchmark

Announcements => Third party => Topic started by: BeniBela on October 09, 2016, 07:19:19 pm

Title: (hash)maps: the final benchmark
Post by: BeniBela on October 09, 2016, 07:19:19 pm

Without a good default associative array, there is always some confusion about the maps available in fpc:

Quote from: Gizmo on September 16, 2016, 01:44:56 pm

Urgh...I feel blue :-( But by the looks of it, hashtables are just not scaled for this volume of data. I thought there might be away of ensuring only less bytes were used or that perhaps I could plumb for a different hash type to whatever TFPHashList uses by default. e.g. if it was using SHA256 and I could perhaps have knocked it down to MD5 or even a basic CRC or something. But AFAIK, you can't do that.

Therefore I have compared all the maps (http://www.benibela.de/fpc-map-benchmark_en.html).

Turns out TFPHashList actually works well, although only for short keys.

For larger keys the rtl-generics.collections package is better. If you have a fpc version that can compile it.

Otherwise you can try a 3rd party map (http://forum.lazarus.freepascal.org/index.php?topic=25005.0)

Title: Re: (hash)maps: the final benchmark
Post by: marcov on October 09, 2016, 09:59:47 pm

Where is lightcontainers ? Though admitted while it is a map, it is not a hashmap. (since the main secondary purpose is being ordered iterable, as replacement for tstringlist)

Title: Re: (hash)maps: the final benchmark
Post by: BeniBela on October 09, 2016, 10:30:26 pm

Quote from: marcov on October 09, 2016, 09:59:47 pm

Where is lightcontainers ? Though admitted while it is a map, it is not a hashmap. (since the main secondary purpose is being ordered iterable, as replacement for tstringlist)

They did not compile. Didn't you get the forum PM?

Title: Re: (hash)maps: the final benchmark
Post by: guest58172 on October 12, 2016, 04:04:27 pm

Quote from: BeniBela on October 09, 2016, 07:19:19 pm

Without a good default associative array, there is always some confusion about the maps available in fpc:
Therefore I have compared all the maps (http://www.benibela.de/fpc-map-benchmark_en.html).

Thanks, very nice work. I find this very useful, to the extent that I've begun to use ghashmap and discovered that it's not the best choice. What would you recommend for small maps (up to 100 entries max) ? The most important criterion would be the time spent to build the map. (not benchmarked ?).

Also there's a minor issue in the axis definition, both linear and logarithmic are allowed.

Title: Re: (hash)maps: the final benchmark
Post by: marcov on October 12, 2016, 05:39:09 pm

Quote from: BeniBela on October 09, 2016, 10:30:26 pm

They did not compile. Didn't you get the forum PM?

I checked and I did get it, and seeing it jogged my memory, I got it 10 mins before I went off for a long weekend. I planned to reply later but forgot completely.

It is simply a matter of enabling delphi mode. Apparently I added a "uses sysutils;" in delphi at some point, and because the $mode was below it, it was skipped. Some of the demoes might still need delphi mode manually.

The new archive (still here (http://www.stack.nl/~marcov/lightcontainers.zip)) fixes that and synchronizes the generics version (genlight) with current SVN.

In practice I nowadays mostly use the generics version, but my datasets are smaller atm, so if there is a performance difference it matters less. I would be obliged if you benchmarked both. Genlight works a bit like generics.collections, so should be easy to test.

As said basically it is a generic tstringlist alike, but with one level extra indirection (to avoid too slow ordered insertion) and that keys can also be different from strings. Even then, keys are still ordered (which is useful for e.g. tdatetime ranges)

The original design aspect was to avoid tstringlist ordered insertion performance wall, where insertions/second totally breaks down with several hundred thousands of strings, and quick iteration as secondary case. But a tree was too memory hungry (application was then still 32-bit, and index overhead had to be limited).

Lookup was also important (because of the ordered insertion alone), but not as important as iteration as long as it was decent.

The procedural version was the core of my first production 64-bit program back somewhere in the 2002-2005 period.

Title: Re: (hash)maps: the final benchmark
Post by: BeniBela on October 13, 2016, 02:09:16 pm

Quote from: BBasile on October 12, 2016, 04:04:27 pm

Thanks, very nice work. I find this very useful, to the extent that I've begun to use ghashmap and discovered that it's not the best choice. What would you recommend for small maps (up to 100 entries max) ? The most important criterion would be the time spent to build the map. (not benchmarked ?).

Building would be the "only writes" model, but at lower count it is too fast for my benchmark. It will all get 0 or 1ms, even ghashmap. Just the generics.collection is rather slow.

Quote from: BBasile on October 12, 2016, 04:04:27 pm

Also there's a minor issue in the axis definition, both linear and logarithmic are allowed.

Both is allowed, and then you get two plots. If you enable all options, you can get hundreds of plots.

Quote from: marcov on October 12, 2016, 05:39:09 pm

It is simply a matter of enabling delphi mode. Apparently I added a "uses sysutils;" in delphi at some point, and because the $mode was below it, it was skipped. Some of the demoes might still need delphi mode manually.

The new archive (still here (http://www.stack.nl/~marcov/lightcontainer.zip)) fixes that and synchronizes the generics version (genlight) with current SVN.

lightcontainers.zip

But it also fails in function int2ln(i:integer):integer; because it is 32-assembly and I compile it on 64-bit.

Title: Re: (hash)maps: the final benchmark
Post by: marcov on October 13, 2016, 03:20:40 pm

Quote from: BeniBela on October 13, 2016, 02:09:16 pm

lightcontainers.zip

But it also fails in function int2ln(i:integer):integer; because it is 32-assembly and I compile it on 64-bit.

Yes. Add -Rintel. I thought FPC did that automatically.

Genlight still doesn't compile under 64-bit. (added later: Fixed I hope Same location (but lightcontainerS.zip))

The assembler works in both 32 and 64-bit, and mainly to exploit the bsr instruction. In FPC there is now an intrinsic for that I believe.

I didn't check the tests, only the core units (am at work)

Title: Re: (hash)maps: the final benchmark
Post by: hnb on October 14, 2016, 11:42:49 am

Quote from: BeniBela on October 09, 2016, 07:19:19 pm

For larger keys the rtl-generics.collections package is better. If you have a fpc version that can compile it.

Really nice work, thanks for the tests. I wonder how other hash algorithms can improve rtl-generics (any hashmap inside generics.collections can be customized in this aspect - each map has additional constraint for this, for example: TObjectCuckooD2<TKey, TValue, THashFactory>):

https://plus.google.com/u/0/+EricGrange/posts/VDAT1Mg7pR1

xxHash looks interesting. Current implementation of rtl-generics uses Bob Jenkins hash.

Title: Re: (hash)maps: the final benchmark
Post by: marcov on October 15, 2016, 02:28:40 pm

I played a bit with it to get a feel for memory consumption, but couldn't get it to draw a graph for tstringlist.

Title: Re: (hash)maps: the final benchmark
Post by: BeniBela on October 18, 2016, 02:32:06 pm

Quote from: BBasile on October 12, 2016, 04:04:27 pm

Thanks, very nice work. I find this very useful, to the extent that I've begun to use ghashmap and discovered that it's not the best choice. What would you recommend for small maps (up to 100 entries max) ?

That depends what you want to do with it.

Dictionary words, low read count: TFPHashList

Dictionary words, middle read count: TFPHashList, Bero's flre cache

Long random words: Bero's flre cache, TStringList or lightcontainers

But they are all below 2ms in the test

Quote from: hnb on October 14, 2016, 11:42:49 am

I wonder how other hash algorithms can improve rtl-generics (any hashmap inside generics.collections can be customized in this aspect - each map has additional constraint for this, for example: TObjectCuckooD2<TKey, TValue, THashFactory>):

Well, I have only tried the default settings.

The growth strategy might also change a lot

contnrs uses this hash

Code: [Select]

p:=@s[1];
  pmax:=@s[length(s)+1];
  while (p<pmax) do
    begin
      Result:=LongWord(LongInt(Result shl 5) - LongInt(Result)) xor LongWord(P^);
      Inc(p);
    end;

which seems faster, and with the random keys the distribution of the result might not matter much

Quote from: marcov on October 15, 2016, 02:28:40 pm

I played a bit with it to get a feel for memory consumption, but couldn't get it to draw a graph for tstringlist.

I forgot to switch it to case-sensitive, so it did not run it for TStringList

Title: Re: (hash)maps: the final benchmark
Post by: marcov on October 18, 2016, 04:15:07 pm

Benibela: thanks it works now, and lightcontainers too.

One can make the TStringlist problem also very visible with the website:

- model = "only write" 1,0,0
- metric= "keys/time" and keys/mem (2 graphs)
- dataset leave at dictionary words

First graph keys/time shows that the maps with one array (TFPGMAP and TStringlist) have a sharp slowdown with higher key numbers.

The second graph (keys/mem) shows that these however are the lowest memory usage.

The lightcontainers are just below this in this graph, but don't suffer the breakdown of the first.

Title: Re: (hash)maps: the final benchmark
Post by: circular on October 19, 2016, 05:49:22 pm

What about using the memory address instead of the string content?

Title: Re: (hash)maps: the final benchmark
Post by: BeniBela on October 21, 2016, 12:30:16 am

Quote from: circular on October 19, 2016, 05:49:22 pm

What about using the memory address instead of the string content?

I have not tried that. I think it will probably behaves similar to short random keys with shortstrings.

But there are not many maps for that key type anyways. ghashmap and rtl-generics with fpc, and ghashmap is clearly worse

Title: Re: (hash)maps: the final benchmark
Post by: circular on October 21, 2016, 12:25:16 pm

I guess there could be some problem with non-unique strings, i.e. strings that have the same content but have a different memory address. In fact, having a hashmap for all strings could be a way to ensure a single address for each string.

Title: Re: (hash)maps: the final benchmark
Post by: Thaddy on October 21, 2016, 12:38:18 pm

Quote from: BeniBela on October 21, 2016, 12:30:16 am

But there are not many maps for that key type anyways. ghashmap and rtl-generics with fpc, and ghashmap is clearly worse

I can't remember you suggested a better algorithm to solve that Issue?.....
I seem to remember you just compared some?

Quote from: circular

I guess there could be some problem with non-unique strings, i.e. strings that have the same content but have a different memory address. In fact, having a hashmap for all strings could be a way to ensure a single address for each string.

No. hashes will clash at some point and that can be mathematically proven. Hashes trade in proof for fast. As opposed to a real index. That can be proven to be exact [edit: and in any given dimension].
A hash is non-unique by default over an infinite population, but also in practice over a finite population. Reduction of information theorum.

Resolving hash collisions efficiently is another matter and not related to hashes perse, just in case someone wants the to comment on a solution ;).

Note that ALL hashes have this trait, including one-way hashes.

Title: Re: (hash)maps: the final benchmark
Post by: circular on October 21, 2016, 03:29:24 pm

Of course there can be multiple entries with the same hashcode even if it is rare. I suppose you misunderstood what I said.

What I am saying is that hash can be used to sort strings in order to have a list of all strings.

See: https://en.wikipedia.org/wiki/Hash_function

Quote

Typically, the domain of a hash function (the set of possible keys) is larger than its range (the number of different table indices), and so it will map several different keys to the same index. Therefore, each slot of a hash table is associated with (implicitly or explicitly) a set of records, rather than a single record. For this reason, each slot of a hash table is often called a bucket, and hash values are also called bucket indices.

Title: Re: (hash)maps: the final benchmark
Post by: Thaddy on October 21, 2016, 04:04:52 pm

@Circular
That is correct. The FPC compiler uses that mechanism already indeed ;) when it generates code for immutables.
I merely want to refer (and warn) that a hash is not capable of precise uniqueness.
That is a mistake that loads of programmers make and simply misguided. If you want, consider It is an approximation of uniqueness.

Title: Re: (hash)maps: the final benchmark
Post by: circular on October 21, 2016, 07:31:34 pm

Yes, it is obvious to me that a hash is not unique.

I hear you're worried about programmers considering hashcodes to be unique.

Title: Re: (hash)maps: the final benchmark
Post by: BeniBela on October 22, 2016, 01:19:31 am

Quote from: Thaddy on October 21, 2016, 12:38:18 pm

I can't remember you suggested a better algorithm to solve that Issue?.....
I seem to remember you just compared some?

http://bugs.freepascal.org/view.php?id=30664

Quote from: circular

In fact, having a hashmap for all strings could be a way to ensure a single address for each string.

I was thinking about using these hashmaps just for that. But it seems hard to make it thread safe, and to expire the strings, i.e. remove old unused strings

Title: Re: (hash)maps: the final benchmark
Post by: circular on October 22, 2016, 08:37:06 am

Hmm.. maybe using critical sections for reference counting and garbage collecting?

Title: Re: (hash)maps: the final benchmark
Post by: heX on August 22, 2018, 12:05:05 pm

My result of compare HashMap/Map.
Testing: fcl-stl: THashMap, TMap. fcl-base: TFPGMap, TFPHashList.

Result:

Code: [Select]

Test key: string
Count: 100000

THashMap:
Create.   Used mem: 640
Adding. Time: 141  Used mem: 23634176
Read.   Time: 780
Delete. Time: 125  Used mem: 17234336
Free.   Time: 31  Used mem: 64

TMap:
Create.   Used mem: 0
Adding. Time: 421  Used mem: 12799936
Read.   Time: 3479
Delete. Time: 1139  Used mem: 0
Free.   Time: 0  Used mem: -64

TFPGMap:
Create.   Used mem: 64
Adding. Time: 2480  Used mem: 8076384
Read.   Time: 1607
Delete. Time: 5179  Used mem: 3392
Free.   Time: 0  Used mem: -64

TFPHashList:
Create.   Used mem: 32
Adding. Time: 47  Used mem: 5211456
Read.   Time: 406
Delete. Time: 32620  Used mem: 2239200
Free.   Time: 0  Used mem: -64

Code: [Select]

Test key: int64 (or string[8] for TFPHashList)
Count: 100000

THashMap:
Create.   Used mem: 640
Adding. Time: 47  Used mem: 16832096
Read.   Time: 156
Delete. Time: 31  Used mem: 16832192
Free.   Time: 47  Used mem: 64

TMap:
Create.   Used mem: 0
Adding. Time: 47  Used mem: 6400000
Read.   Time: 312
Delete. Time: 125  Used mem: 0
Free.   Time: 0  Used mem: -64

TFPGMap:
Create.   Used mem: 64
Adding. Time: 2324  Used mem: 1676448
Read.   Time: 359
Delete. Time: 4992  Used mem: 3392
Free.   Time: 0  Used mem: -64

TFPHashList:
Create.   Used mem: 32
Adding. Time: 15  Used mem: 4063296
Read.   Time: 141
Delete. Time: 32760  Used mem: 1091040
Free.   Time: 0  Used mem: -64
End.

Code: Pascal [Select][+]

program project1;
 
{.$DEFINE TestStr}
 
{$mode objfpc}{$H+}
 
uses
  {$IFDEF UNIX}{$IFDEF UseCThreads}
  cthreads,
  {$ENDIF}{$ENDIF}
  Classes, SysUtils, CustApp,
  { you can add units after this }
  CRC,
  fgl,
  gmap,
  ghashmap,
  contnrs;
 
const
  DefValue = pointer(1);
 
type
 
  { TMyApplication }
 
  TValue = pointer;
 
  {$IFDEF TestStr}
  TKey = ansistring;
  {$ELSE}
  TKey = int64;
  {$ENDIF}
 
  { THashDirectFunc }
 
  THashDirectInt64 = class
    // return uniformly distributed i value in range <0,n-1> base only on arguments,
    // n will be always power of 2
    class function hash(a: int64; n: SizeUInt): SizeUInt;
 
    // [when STL_INTERFACE_EXT is defined]
    // return the boolean test for equality of the two keys.  Typically this is operator=,
    // but it doesn't have to be (e.g. case-insensitive string comparison)
    class function equal(const AKey1, AKey2: int64): Boolean;
  end;
 
  THashFuncString = class
    // return uniformly distributed i value in range <0,n-1> base only on arguments,
    // n will be always power of 2
    class function hash(a: string; n: SizeUInt): SizeUInt;
 
    // [when STL_INTERFACE_EXT is defined]
    // return the boolean test for equality of the two keys.  Typically this is operator=,
    // but it doesn't have to be (e.g. case-insensitive string comparison)
    class function equal(const AKey1, AKey2: string): Boolean;
  end;
 
  { TMapCompare }
 
  {$IFDEF TestStr}
  TTestGHashMap = specialize THashMap<TKey, TValue, THashFuncString>;
  {$ELSE}
  TTestGHashMap = specialize THashMap<TKey, TValue, THashDirectInt64>;
  {$ENDIF}
 
  TMapCompare=class
    class function c(a,b :TKey):boolean;
  end;
 
  TTestGMap = specialize TMap<TKey, TValue, TMapCompare>;
 
  TTestFPGMap = specialize TFPGMap<TKey, TValue>;
 
  TMyApplication = class(TCustomApplication)
  protected
    procedure DoRun; override;
  public
    list1: TFPHashList;
    map1: TTestGHashMap;
    map2: TTestFPGMap;
    map3: TTestGMap;
    constructor Create(TheOwner: TComponent); override;
    destructor Destroy; override;
    procedure WriteHelp; virtual;
  end;
 
//// //// //// //// //// //// //// //// //// //// //// //// //// //// //// ////
 
function RandomOf(RandKey: LongInt): integer;
Begin
 Result := RandKey * 1103515245 + 12345;
enD;
 
{$IFDEF TestStr}
function RndKeyStr(i: LongInt): ansistring;
begin
  Result := IntToStr(Int64( Abs(RandomOf(i)) or $1000000000000000) );
end;
{$ELSE}
function RndKeyStr(i: LongInt): ansistring;
begin
  // create strong 8 byte string
  Result := IntToHex(RandomOf(i), 8);
end;
{$ENDIF}
 
{$IFDEF TestStr}
function RndKey(i: LongInt): ansistring;
begin
  Result := RndKeyStr(i);
end;
{$ELSE}
function RndKey(i: LongInt): int64;
begin
  Result := RandomOf(i);
end;
{$ENDIF}
 
{ TMapCompare }
 
class function TMapCompare.c(a, b: TKey): boolean;
begin
  {$IFDEF TestStr}
  Result := CompareStr(a, b) > 0;
  {$ELSE}
  Result := a > b;
  {$ENDIF}
end;
 
 
//// //// //// //// //// //// //// //// //// //// //// //// //// //// //// ////
 
class function THashFuncString.hash(a: string; n: SizeUInt): SizeUInt;
begin
  Result := CRC.CRC32(n, @a[1], Length(a)) mod n;
end;
 
class function THashFuncString.equal(const AKey1, AKey2: string): Boolean;
begin
  Result := AKey1 = AKey2;
end;
 
class function THashDirectInt64.hash(a: int64; n: SizeUInt): SizeUInt;
begin
  Result := SizeUInt(RandomOf(a)) mod n;
end;
 
class function THashDirectInt64.equal(const AKey1, AKey2: int64): Boolean;
begin
  Result := AKey1 = AKey2;
end;
 
//// //// //// //// //// //// //// //// //// //// //// //// //// //// //// ////
 
function UsedMem(): integer;
var mem: TFPCHeapStatus;
begin
  mem := SysGetFPCHeapStatus();
  Result := mem.CurrHeapUsed;
end;
 
{ TMyApplication }
 
procedure TMyApplication.DoRun;
var
  i, i2, itmp: integer;
  t: int64;
  c: int64;
  value: TValue;
  p: pointer;
  basemem: integer;
begin
  c := StrToInt64Def(self.GetOptionValue('c', 'count'), 100000);
 
  {$IFDEF TestStr}
  writeln('Test key: string');
  {$ELSE}
  writeln('Test key: int64 (or string[8] for TFPHashList)');
  {$ENDIF}
  writeln('Count: ' + IntToStr(c));
 
  writeln('');
  writeln('THashMap:');
  basemem := UsedMem();
  map1 := TTestGHashMap.create();
  writeln('Create. ' + '  Used mem: ' + IntToStr(UsedMem()-basemem));
 
  t := GetTickCount64();
 
  for i:=1 to c do
  begin
    map1.insert(RndKey(i), DefValue);
  end;
  writeln('Adding. Time: ' + IntToStr(GetTickCount64() - t) +
    '  Used mem: ' + IntToStr(UsedMem()-basemem));
  t := GetTickCount64();
 
  for i2:=1 to 10 do
    for i:=1 to c do
    begin
      if not map1.GetValue(RndKey(i), value) then
        raise EKeyNotFound.Create('THashMap find error') else
        if value <> DefValue then
          raise EKeyNotFound.Create('THashMap find error 2') else
    end;
  for i:=1 to c do
  begin
    map1.GetValue(RndKey(c+i), value);
  end;
  writeln('Read.   Time: ' + IntToStr(GetTickCount64() - t));
  t := GetTickCount64();
 
  for i:=1 to c do
  begin
    if map1.contains(RndKey(i)) then
      map1.delete(RndKey(i));
  end;
  writeln('Delete. Time: ' + IntToStr(GetTickCount64() - t) +
    '  Used mem: ' + IntToStr(UsedMem()-basemem));
  t := GetTickCount64();
 
  FreeAndNil(map1);
  writeln('Free.   Time: ' + IntToStr(GetTickCount64() - t) +
    '  Used mem: ' + IntToStr(UsedMem()-basemem));
  t := GetTickCount64();
 
 
 
  writeln('');
  writeln('TMap:');
  basemem := UsedMem();
  map3 := TTestGMap.create();
  writeln('Create. ' + '  Used mem: ' + IntToStr(UsedMem()-basemem));
 
  t := GetTickCount64();
 
  for i:=1 to c do
  begin
    map3.insert(RndKey(i), DefValue);
  end;
  writeln('Adding. Time: ' + IntToStr(GetTickCount64() - t) +
    '  Used mem: ' + IntToStr(UsedMem()-basemem));
  t := GetTickCount64();
 
  for i2:=1 to 10 do
    for i:=1 to c do
    begin
      if not map3.TryGetValue(RndKey(i), value) then
        raise EKeyNotFound.Create('THashMap find error') else
        if value <> DefValue then
          raise EKeyNotFound.Create('THashMap find error 2') else
    end;
  for i:=1 to c do
  begin
    map3.TryGetValue(RndKey(c+i), value);
  end;
  writeln('Read.   Time: ' + IntToStr(GetTickCount64() - t));
  t := GetTickCount64();
 
  for i:=1 to c do
  begin
    if map3.TryGetValue(RndKey(i), value) then
      map3.delete(RndKey(i));
  end;
  writeln('Delete. Time: ' + IntToStr(GetTickCount64() - t) +
    '  Used mem: ' + IntToStr(UsedMem()-basemem));
  t := GetTickCount64();
 
  FreeAndNil(map3);
  writeln('Free.   Time: ' + IntToStr(GetTickCount64() - t) +
    '  Used mem: ' + IntToStr(UsedMem()-basemem));
  t := GetTickCount64();
 
 
 
  writeln('');
  writeln('TFPGMap:');
  basemem := UsedMem();
  map2 := TTestFPGMap.Create();
  map2.Sorted:=true;
  writeln('Create. ' + '  Used mem: ' + IntToStr(UsedMem()-basemem));
 
  t := GetTickCount64();
 
  for i:=1 to c do
  begin
    map2.Add(RndKey(i), DefValue);
  end;
  writeln('Adding. Time: ' + IntToStr(GetTickCount64() - t) +
    '  Used mem: ' + IntToStr(UsedMem()-basemem));
  t := GetTickCount64();
 
  for i2:=1 to 10 do
    for i:=1 to c do
    begin
      map2.Find(RndKey(i), itmp);
      if itmp = -1 then
        raise EKeyNotFound.Create('TFPGMap find error');
    end;
  for i:=1 to c do
  begin
    map2.Find(RndKey(c+i), itmp);
  end;
  writeln('Read.   Time: ' + IntToStr(GetTickCount64() - t));
  t := GetTickCount64();
 
  for i:=1 to c do
  begin
    map2.Find(RndKey(i), itmp);
    if itmp<>-1 then
      map2.Delete(itmp);
  end;
  writeln('Delete. Time: ' + IntToStr(GetTickCount64() - t) +
    '  Used mem: ' + IntToStr(UsedMem()-basemem));
  t := GetTickCount64();
 
  FreeAndNil(map2);
  writeln('Free.   Time: ' + IntToStr(GetTickCount64() - t) +
    '  Used mem: ' + IntToStr(UsedMem()-basemem));
  t := GetTickCount64();
 
 
 
  writeln('');
  writeln('TFPHashList:');
  basemem := UsedMem();
  list1 := TFPHashList.Create();
  writeln('Create. ' + '  Used mem: ' + IntToStr(UsedMem()-basemem));
 
  t := GetTickCount64();
 
  for i:=1 to c do
  begin
    list1.Add(RndKeyStr(i), DefValue);
  end;
  writeln('Adding. Time: ' + IntToStr(GetTickCount64() - t) +
    '  Used mem: ' + IntToStr(UsedMem()-basemem));
  t := GetTickCount64();
 
  for i2:=1 to 10 do
    for i:=1 to c do
    begin
      if list1.Find(RndKeyStr(i)) = nil then
        raise EKeyNotFound.Create('TFPHashList find error');
    end;
  for i:=1 to c do
  begin
    list1.Find(RndKeyStr(c+i));
  end;
  writeln('Read.   Time: ' + IntToStr(GetTickCount64() - t));
  t := GetTickCount64();
 
  for i:=1 to c do
  begin
    itmp := list1.FindIndexOf(RndKeyStr(i));
    if itmp <> -1 then
      list1.Delete(itmp);
  end;
  writeln('Delete. Time: ' + IntToStr(GetTickCount64() - t) +
    '  Used mem: ' + IntToStr(UsedMem()-basemem));
  t := GetTickCount64();
 
  FreeAndNil(list1);
  writeln('Free.   Time: ' + IntToStr(GetTickCount64() - t) +
    '  Used mem: ' + IntToStr(UsedMem()-basemem));
  t := GetTickCount64();
 
 
 
  writeln('End.');
  ReadLn();
  // stop program loop
  Terminate;
end;
 
constructor TMyApplication.Create(TheOwner: TComponent);
begin
  inherited Create(TheOwner);
  StopOnException:=True;
end;
 
destructor TMyApplication.Destroy;
begin
  inherited Destroy;
end;
 
procedure TMyApplication.WriteHelp;
begin
  { add your help code here }
  writeln('Usage: ', ExeName, ' -h');
end;
 
var
  Application: TMyApplication;
begin
  Application:=TMyApplication.Create(nil);
  Application.Title:='My Application';
  Application.Run;
  Application.Free;
end.
 

Title: Re: (hash)maps: the final benchmark
Post by: BeniBela on April 24, 2020, 11:53:49 pm

I have reran the benchmark with fpc 3.3.1 (http://www.benibela.de/fpc-map-benchmark_en.html) and added another dozen maps. And now there are bubbles and rankings.

They results cannot really compared to the old results, since I have changed the key generation algorithm (so it could generate 2.5x as many keys) and forgot the old compilation mode, but I will try anyways.

ghashmap has improved, 40% to 14% faster. But the memory usage has increased by 11%.

Generic collections have improved slight by 10%. The old benchmark did not include the quadratic probing map of it, because that one crashed, but in 3.3.1 it is working, and also slightly faster than the other ones in the usual setting.

TFPGMap has slightly improved, but it is not a real map.

TStringList has dropped massively. Of course TStringList has never been a hashmap, but up to 1000 keys it was one of the fastest maps. Now it has become 2 to 5 times slower. Also the benchmark failed until I changed the codepage from utf8 to latin1. Perhaps it is now checking if the keys are valid utf-8

I have added the std::unordered_map from C++, to see how bad it is all is. For short keys std::unordered_map is 20% to 40% faster than all FPC maps. But for a lot of long keys, generic collection's quadratic is 14% faster, although with higher memory usage.

The fastest maps are now LGenerics by avk959. They are even 15% to 35% faster than std::unordered_map.

Although in some settings Yamer's map, which was the fastest in the previous benchmark, is still the fastest.

I have also run it twice, once with -O3 and once with -O4, but with different revisions.

-O4 is really good for TFPDataHashTable. With -O3 it is worse than TFPHashList, but with -O4 it becomes the fastest map of all of them for small keys with a low to middle number of insertations, especially for not-contained lookups.

The generic collection cuckoo maps are worse with -O4 than -O3? In some cases over 50% slower around 100000 insertations? I wonder if that is true or there is something wrong with the benchmark. Perhaps I should rerun that part.

Title: Re: (hash)maps: the final benchmark
Post by: marcov on April 25, 2020, 02:00:31 pm

Quote from: BeniBela on April 24, 2020, 11:53:49 pm

TStringList has dropped massively. Of course TStringList has never been a hashmap, but up to 1000 keys it was one of the fastest maps. Now it has become 2 to 5 times slower. Also the benchmark failed until I changed the codepage from utf8 to latin1. Perhaps it is now checking if the keys are valid utf-8

Does Light containers (generic) also suffers from that? It is more or less a tstringlist, but then with a multidimensional instead of a single dimensional array. (which avoids the sudden cliff-edge at higher key numbers)

Title: Re: (hash)maps: the final benchmark
Post by: BeniBela on April 25, 2020, 02:39:06 pm

Quote from: marcov on April 25, 2020, 02:00:31 pm

Does Light containers (generic) also suffers from that? It is more or less a tstringlist, but then with a multidimensional instead of a single dimensional array. (which avoids the sudden cliff-edge at higher key numbers)

It is only affecting TStringList. Perhaps because it is sorting itself with AnsiCompareStr. That might put invalid utf-8 at a random position

Title: Re: (hash)maps: the final benchmark
Post by: marcov on April 25, 2020, 02:58:21 pm

Quote from: BeniBela on April 25, 2020, 02:39:06 pm

Quote from: marcov on April 25, 2020, 02:00:31 pm
Does Light containers (generic) also suffers from that? It is more or less a tstringlist, but then with a multidimensional instead of a single dimensional array. (which avoids the sudden cliff-edge at higher key numbers)

It is only affecting TStringList. Perhaps because it is sorting itself with AnsiCompareStr. That might put invalid utf-8 at a random position

In 3.3.1 (but not in 3.2) sorting became configurable (revisions, see http://www.stack.nl/~marcov/mergelogs32/stringlistsort.html)

Title: Re: (hash)maps: the final benchmark
Post by: BeniBela on April 26, 2020, 02:21:06 pm

Quote from: marcov on April 25, 2020, 02:58:21 pm

In 3.3.1 (but not in 3.2) sorting became configurable (revisions, see http://www.stack.nl/~marcov/mergelogs32/stringlistsort.html)

Sorting has always been configurable by overriding doCompareText

I have overriden it to call compareStr and that tstringlist is like 5 times faster

Title: Re: (hash)maps: the final benchmark
Post by: PascalDragon on April 26, 2020, 02:23:12 pm

What marcov meant is that the sorting algorithm is changeable, not merely the comparison that is done for sorting.

Title: Re: (hash)maps: the final benchmark
Post by: julkas on June 01, 2020, 02:43:02 pm

I think benchmark results with other memory manager (FastMM4-AVX - https://github.com/maximmasiutin/FastMM4-AVX) would be very interesting.

Title: Re: (hash)maps: the final benchmark
Post by: alantelles on July 14, 2020, 02:51:56 pm

A little point for those who find this topic.

FPC provides a class called TFPObjectHashTable https://www.freepascal.org/docs-html/current/fcl/contnrs/tfpobjecthashtable.html (https://www.freepascal.org/docs-html/current/fcl/contnrs/tfpobjecthashtable.html). Don't use it! It has a badly really really poor performance and extremely high memory usage.

Title: Re: (hash)maps: the final benchmark
Post by: BeniBela on July 18, 2020, 11:49:44 pm

Quote from: julkas on June 01, 2020, 02:43:02 pm

I think benchmark results with other memory manager (FastMM4-AVX - https://github.com/maximmasiutin/FastMM4-AVX) would be very interesting.

You could do run that. I only use the default memory manager

Testing other hash functions might be even more interesting

Quote from: alantelles on July 14, 2020, 02:51:56 pm

A little point for those who find this topic.

FPC provides a class called TFPObjectHashTable https://www.freepascal.org/docs-html/current/fcl/contnrs/tfpobjecthashtable.html (https://www.freepascal.org/docs-html/current/fcl/contnrs/tfpobjecthashtable.html). Don't use it! It has a badly really really poor performance and extremely high memory usage.

That is like TFPDataHashTable in the benchmark

Lazarus

Announcements => Third party => Topic started by: BeniBela on October 09, 2016, 07:19:19 pm