Recent

Author Topic: Checking Speed of SynEdit Highlighters  (Read 105099 times)

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9794
  • Debugger - SynEdit - and more
    • wiki
Re: Checking Speed of SynEdit Highlighters
« Reply #45 on: October 20, 2013, 02:13:16 pm »
With some refactoring, the speed of the hash bashed implementation can probably be increased (maybe even drastically):

1) Ideally choose a hash that is "perfect" for the set of all keywords. So no hash value has more than one matching keyword

2) the important part:

Instead of a hash table containing function-refs, have a hash table with the keywords. (well keyword + function ref)

Then the compare with the keyword can be made before calling the function. That may saves a lot of unnecessary calls.

And if (1) the hash is perfect for the list of keywords, then the hash table only needs one entry per hash value. Otherwise a sorted list or prefix tree were needed.

Further more: the entries in the hash table could contain:
- keyword
- function (may be nil)
- type of ident (e.g. number, keyword, ...)

for keywords which only return a specific attribute, but do nothing else, no function needs to be called.

Morton

  • Full Member
  • ***
  • Posts: 111
Re: Checking Speed of SynEdit Highlighters
« Reply #46 on: October 20, 2013, 03:42:23 pm »
A hash is simply  a kind of sieve, a preselection to avoid many comparisons.
Which sieve is the cheapest in computing time depends on the string keys to process
and may vary a lot not only by implementation, but also from compiler to compiler.

An algorithm using character selection for every character of the string keys would be the fastest possible  as it reduces the needed comparisons to zero.

Initializing boolean tables for every sieve level needed to process a given keylist
and store them in a level table to access them by a given level is feasible on todays PCs.
The amount of boolean tables ( sieve levels ) is determined by the longest key.
If  the boolean table in all seeve levels needed for a given string lengt return true
then the result is true without the need of any comparisons.
« Last Edit: October 20, 2013, 04:09:08 pm by Morton »

Morton

  • Full Member
  • ***
  • Posts: 111
Re: Checking Speed of SynEdit Highlighters
« Reply #47 on: October 20, 2013, 04:49:02 pm »
A dynamic two dimensional boolean table accessed by level and character could be used too.
Don't know if the level table with boolean tables or the two dimensional boolean table
would be cheaper in computing time for FPC.

The tables will have to indicate the possible token kinds for each level.
Two dimensions may not be enough to avoid comparision completely.
« Last Edit: October 20, 2013, 05:47:42 pm by Morton »

Edson

  • Hero Member
  • *****
  • Posts: 1301
Re: Checking Speed of SynEdit Highlighters
« Reply #48 on: October 20, 2013, 05:21:36 pm »
Quote
An algorithm using character selection for every character of the string keys would be the fastest possible  as it reduces the needed comparisons to zero..

That's is the objetive of the algorithm I proposed. The first-char-prefix is just the first level. I have indicated before:

"That's right. I have used a Prefix Tree, but implemented for only the first level. ... Of course this method could be slower if the sintax has many words starting with the same character.  In that case, it could be improved using a second level of the tree."

I mean, the algorithm can be implemented for the second char, the third, ... Just add the branch you need in the necessary points (Could be a simple case of...).

Doing that way, there is no comparison. The hash will be always lose.
 
Lazarus 2.2.6 - FPC 3.2.2 - x86_64-win64 on Windows 10

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9794
  • Debugger - SynEdit - and more
    • wiki
Re: Checking Speed of SynEdit Highlighters
« Reply #49 on: October 20, 2013, 05:41:35 pm »
Quote
An algorithm using character selection for every character of the string keys would be the fastest possible  as it reduces the needed comparisons to zero..

That's is the objetive of the algorithm I proposed. The first-char-prefix is just the first level. I have indicated before:
Quote
Doing that way, there is no comparison. The hash will be always lose.

Depends:

How well optimized is your prefix tree? Memory is ont at issue. Cache is. If it need a lot of memory access with cach misses, then it looses.
Depends on architecture and implementation.

Also code flow.
As far as I can see, the hash implementation is very good for cpu feature such as branch prediction (only one branch for end-of-word, almost always false (only true once per word))  and as a consequence data pre-fetching.
Again: Depends on architecture and implementation.

---
That does not mean that a prefix can not beat a hash (on implementation level). It may well do.


Morton

  • Full Member
  • ***
  • Posts: 111
Re: Checking Speed of SynEdit Highlighters
« Reply #50 on: October 20, 2013, 06:04:07 pm »
Doing that way, there is no comparison. The hash will be always lose.

Obviously not true, as the prefix is a hash too.

Morton

  • Full Member
  • ***
  • Posts: 111
Re: Checking Speed of SynEdit Highlighters
« Reply #51 on: October 24, 2013, 05:13:43 pm »
Attached is an example implementation and test project of a key matcher, which can perform about twenty million matches per second on my AMD A8-5500, without doing any comparisons.

The principle can be used to match any kind of string.
« Last Edit: October 24, 2013, 06:10:38 pm by Morton »

Morton

  • Full Member
  • ***
  • Posts: 111
Re: Checking Speed of SynEdit Highlighters
« Reply #52 on: October 24, 2013, 05:24:24 pm »
Implemented in a HighLighter the macher would perform even faster than by maching against a string list as in the example project.

The number of keys, even a huge one, hast little influence on the speed.
« Last Edit: October 24, 2013, 05:37:58 pm by Morton »

Morton

  • Full Member
  • ***
  • Posts: 111
Re: Checking Speed of SynEdit Highlighters
« Reply #53 on: October 24, 2013, 06:13:12 pm »
Made a little mistake:
It is twenty million matches per second, not just two million. :D

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9794
  • Debugger - SynEdit - and more
    • wiki
Re: Checking Speed of SynEdit Highlighters
« Reply #54 on: October 24, 2013, 06:26:14 pm »
And it matches the word "cabel".

Because it starts like "case" and ends like "label"

Morton

  • Full Member
  • ***
  • Posts: 111
Re: Checking Speed of SynEdit Highlighters
« Reply #55 on: October 24, 2013, 06:33:13 pm »
And it matches the word "cabel".

Because it starts like "case" and ends like "label"

Thanks for testing.
I think I have a solution.

Morton

  • Full Member
  • ***
  • Posts: 111
Re: Checking Speed of SynEdit Highlighters
« Reply #56 on: October 24, 2013, 06:52:13 pm »
And it matches the word "cabel".

Because it starts like "case" and ends like "label"

Sorry, I cannot repeat that.

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9794
  • Debugger - SynEdit - and more
    • wiki
Re: Checking Speed of SynEdit Highlighters
« Reply #57 on: October 24, 2013, 07:00:28 pm »
I dont have the modified highlighter. But adding to your app:

procedure TForm1.Button1Click(Sender: TObject);
begin
  if IsKeyWord('label') then Caption :=  Caption + '1';
  if IsKeyWord('xabel') then Caption :=  Caption + '2';
  if IsKeyWord('cabel') then Caption :=  Caption + '3';
end;

and press that button, will give 13

Morton

  • Full Member
  • ***
  • Posts: 111
Re: Checking Speed of SynEdit Highlighters
« Reply #58 on: October 24, 2013, 07:22:32 pm »
I dont have the modified highlighter. But adding to your app:

procedure TForm1.Button1Click(Sender: TObject);
begin
  if IsKeyWord('label') then Caption :=  Caption + '1';
  if IsKeyWord('xabel') then Caption :=  Caption + '2';
  if IsKeyWord('cabel') then Caption :=  Caption + '3';
end;

and press that button, will give 13

Fixed with a bigger matcher.
A sparse implementation would be possible.

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9794
  • Debugger - SynEdit - and more
    • wiki
Re: Checking Speed of SynEdit Highlighters
« Reply #59 on: October 24, 2013, 07:25:24 pm »
For pascal you need at least a sequence of 6 
operator  + property = operty



 

TinyPortal © 2005-2018