Recent

Author Topic: (hash)maps: the final benchmark  (Read 13203 times)

BeniBela

  • Hero Member
  • *****
  • Posts: 689
    • homepage
(hash)maps: the final benchmark
« on: October 09, 2016, 07:19:19 pm »
Without a good default associative array, there is always some confusion about the maps available in fpc:

Urgh...I feel blue :-( But by the looks of it, hashtables are just not scaled for this volume of data. I thought there might be away of ensuring only less bytes were used or that perhaps I could plumb for a different hash type to whatever TFPHashList uses by default. e.g. if it was using SHA256 and I could perhaps have knocked it down to MD5 or even a basic CRC or something. But AFAIK, you can't do that.

Therefore I have compared all the maps.

Turns out TFPHashList actually works well, although only for short keys.

For larger keys the rtl-generics.collections package is better. If you have a fpc version that can compile it.

Otherwise you can try a 3rd party map

marcov

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 7639
Re: (hash)maps: the final benchmark
« Reply #1 on: October 09, 2016, 09:59:47 pm »
Where is lightcontainers ? Though admitted while it is a map, it is not a hashmap. (since the main secondary purpose is being ordered iterable, as replacement for tstringlist)
« Last Edit: October 09, 2016, 10:09:06 pm by marcov »

BeniBela

  • Hero Member
  • *****
  • Posts: 689
    • homepage
Re: (hash)maps: the final benchmark
« Reply #2 on: October 09, 2016, 10:30:26 pm »
Where is lightcontainers ? Though admitted while it is a map, it is not a hashmap. (since the main secondary purpose is being ordered iterable, as replacement for tstringlist)

They did not compile. Didn't you get the forum PM?

Basile-B

  • Sr. Member
  • ****
  • Posts: 457
Re: (hash)maps: the final benchmark
« Reply #3 on: October 12, 2016, 04:04:27 pm »
Without a good default associative array, there is always some confusion about the maps available in fpc:
Therefore I have compared all the maps.

Thanks, very nice work. I find this very useful, to the extent that I've begun to use ghashmap and discovered that it's not the best choice. What would you recommend for small maps (up to 100 entries max) ? The most important criterion would be the time spent to build the map. (not benchmarked ?).

Also there's a minor issue in the axis definition, both linear and logarithmic are allowed.

marcov

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 7639
Re: (hash)maps: the final benchmark
« Reply #4 on: October 12, 2016, 05:39:09 pm »
They did not compile. Didn't you get the forum PM?

I checked and I did get it, and seeing it jogged my memory, I got it 10 mins before I went off for a long weekend. I planned to reply later but forgot completely.

It is simply a matter of enabling delphi mode. Apparently I added a "uses sysutils;" in delphi at some point, and because the $mode was below it, it was skipped. Some of the demoes might still need delphi mode manually.

The new archive (still here) fixes that and synchronizes the generics version (genlight) with current SVN.

In practice I nowadays mostly use the generics version, but my datasets are smaller atm, so if there is a performance difference it matters less. I would be obliged if you benchmarked both. Genlight works a bit like generics.collections, so should be easy to test.

As said basically it is a generic tstringlist alike, but with one level extra indirection (to avoid too slow ordered insertion) and that keys can also be different from strings. Even then, keys are still ordered (which is useful for e.g. tdatetime ranges)

The original design aspect was to avoid tstringlist ordered insertion performance wall, where insertions/second totally breaks down with several hundred thousands of strings, and quick iteration as secondary case. But a tree was too memory hungry (application was then still 32-bit, and index overhead had to be limited).

Lookup was also important (because of the ordered insertion alone), but not as important as iteration as long as it was decent.

The procedural version was the core of my first production 64-bit program back somewhere in the 2002-2005 period.
« Last Edit: October 14, 2016, 11:22:31 am by marcov »

BeniBela

  • Hero Member
  • *****
  • Posts: 689
    • homepage
Re: (hash)maps: the final benchmark
« Reply #5 on: October 13, 2016, 02:09:16 pm »
Thanks, very nice work. I find this very useful, to the extent that I've begun to use ghashmap and discovered that it's not the best choice. What would you recommend for small maps (up to 100 entries max) ? The most important criterion would be the time spent to build the map. (not benchmarked ?).

Building would be the "only writes" model, but at lower count it is too fast for my benchmark. It will all get 0 or 1ms, even ghashmap.  Just the generics.collection is rather slow. 

Also there's a minor issue in the axis definition, both linear and logarithmic are allowed.

Both is allowed, and then you get two plots. If you enable all options, you can get hundreds of plots.


It is simply a matter of enabling delphi mode. Apparently I added a "uses sysutils;" in delphi at some point, and because the $mode was below it, it was skipped. Some of the demoes might still need delphi mode manually.

The new archive (still here) fixes that and synchronizes the generics version (genlight) with current SVN.


lightcontainers.zip

But it also fails in function int2ln(i:integer):integer; because it is 32-assembly and I compile it on 64-bit.


marcov

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 7639
Re: (hash)maps: the final benchmark
« Reply #6 on: October 13, 2016, 03:20:40 pm »

lightcontainers.zip

But it also fails in function int2ln(i:integer):integer; because it is 32-assembly and I compile it on 64-bit.

Yes. Add -Rintel. I thought FPC did that automatically.

Genlight still doesn't compile under 64-bit. (added later: Fixed I hope Same location (but lightcontainerS.zip))

The assembler works in both 32 and 64-bit, and mainly to exploit the bsr instruction. In FPC there is now an intrinsic for that I believe.

I didn't check the tests, only the core units (am at work)
« Last Edit: October 14, 2016, 11:24:52 am by marcov »

hnb

  • Sr. Member
  • ****
  • Posts: 268
Re: (hash)maps: the final benchmark
« Reply #7 on: October 14, 2016, 11:42:49 am »
For larger keys the rtl-generics.collections package is better. If you have a fpc version that can compile it.

Really nice work, thanks for the tests. I wonder how other hash algorithms can improve rtl-generics (any hashmap inside generics.collections can be customized in this aspect - each map has additional constraint for this, for example: TObjectCuckooD2<TKey, TValue, THashFactory>):

https://plus.google.com/u/0/+EricGrange/posts/VDAT1Mg7pR1

xxHash looks interesting. Current implementation of rtl-generics uses Bob Jenkins hash.
Checkout NewPascal initiative and donate beer - ready to use tuned FPC compiler + Lazarus for mORMot project

best regards,
Maciej Izak

marcov

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 7639
Re: (hash)maps: the final benchmark
« Reply #8 on: October 15, 2016, 02:28:40 pm »
I played a bit with it to get a feel for memory consumption, but couldn't get it to draw a graph for tstringlist.

BeniBela

  • Hero Member
  • *****
  • Posts: 689
    • homepage
Re: (hash)maps: the final benchmark
« Reply #9 on: October 18, 2016, 02:32:06 pm »
Thanks, very nice work. I find this very useful, to the extent that I've begun to use ghashmap and discovered that it's not the best choice. What would you recommend for small maps (up to 100 entries max) ?

That depends what you want to do with it.

Dictionary words, low read count: TFPHashList

Dictionary words, middle read count: TFPHashList, Bero's flre cache

Long random words: Bero's flre cache, TStringList or lightcontainers


But they are all below 2ms in the test

I wonder how other hash algorithms can improve rtl-generics (any hashmap inside generics.collections can be customized in this aspect - each map has additional constraint for this, for example: TObjectCuckooD2<TKey, TValue, THashFactory>):


Well, I have only tried the default settings.

The growth strategy might also change a lot

contnrs uses this hash
Code: [Select]
p:=@s[1];
  pmax:=@s[length(s)+1];
  while (p<pmax) do
    begin
      Result:=LongWord(LongInt(Result shl 5) - LongInt(Result)) xor LongWord(P^);
      Inc(p);
    end;           
which seems faster, and with the random keys the distribution of the result might not matter much

I played a bit with it to get a feel for memory consumption, but couldn't get it to draw a graph for tstringlist.

I forgot to switch it to case-sensitive, so it did not run it for TStringList

marcov

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 7639
Re: (hash)maps: the final benchmark
« Reply #10 on: October 18, 2016, 04:15:07 pm »
Benibela: thanks it works now, and lightcontainers too.

One can make the TStringlist problem also very visible with the website:

- model = "only write" 1,0,0
- metric= "keys/time" and keys/mem  (2 graphs)
- dataset leave at dictionary words

First graph keys/time shows that the maps with one array (TFPGMAP and TStringlist) have a sharp slowdown with higher key numbers.

The second graph (keys/mem) shows that these however are the lowest memory usage.

The lightcontainers are just below this in this graph, but don't suffer the breakdown of the first.
« Last Edit: October 21, 2016, 08:43:23 am by marcov »

circular

  • Hero Member
  • *****
  • Posts: 3058
    • Personal webpage
Re: (hash)maps: the final benchmark
« Reply #11 on: October 19, 2016, 05:49:22 pm »
What about using the memory address instead of the string content?
Conscience is the debugger of the mind

BeniBela

  • Hero Member
  • *****
  • Posts: 689
    • homepage
Re: (hash)maps: the final benchmark
« Reply #12 on: October 21, 2016, 12:30:16 am »
What about using the memory address instead of the string content?

I have not tried that. I think it will probably behaves similar to short random keys with shortstrings.

But there are not many maps for that key type anyways. ghashmap and rtl-generics with fpc, and ghashmap is clearly worse

circular

  • Hero Member
  • *****
  • Posts: 3058
    • Personal webpage
Re: (hash)maps: the final benchmark
« Reply #13 on: October 21, 2016, 12:25:16 pm »
I guess there could be some problem with non-unique strings, i.e. strings that have the same content but have a different memory address. In fact, having a hashmap for all strings could be a way to ensure a single address for each string.
Conscience is the debugger of the mind

Thaddy

  • Hero Member
  • *****
  • Posts: 9303
Re: (hash)maps: the final benchmark
« Reply #14 on: October 21, 2016, 12:38:18 pm »

But there are not many maps for that key type anyways. ghashmap and rtl-generics with fpc, and ghashmap is clearly worse

I can't remember you suggested a better algorithm to solve that Issue?.....
I seem to remember you just compared some?

Quote from: circular
I guess there could be some problem with non-unique strings, i.e. strings that have the same content but have a different memory address. In fact, having a hashmap for all strings could be a way to ensure a single address for each string.

No. hashes will clash at some point and that can be mathematically proven. Hashes trade in proof for fast. As opposed to a real index. That can be proven to be exact [edit: and in any given dimension].
A hash is non-unique by default over an infinite population, but also in practice over a finite population. Reduction of information theorum.

Resolving hash collisions efficiently is another matter and not related to hashes perse, just in case someone wants the to comment on a solution ;).

Note that ALL hashes have this trait, including one-way hashes.
« Last Edit: October 21, 2016, 12:58:40 pm by Thaddy »
also related to equus asinus.