Recent

Author Topic: Pls help to optimize this WideChar buffer search  (Read 11348 times)

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 12715
  • FPC developer.
Re: Pls help to optimize this WideChar buffer search
« Reply #15 on: January 04, 2022, 09:02:14 pm »
Yes, WideChar buffer with AnsiString; widechar buffer can have any text, but StringList has only ASCII text.

Why do I see shortstr conversions then? Are you sure you enabled $h+ ?

Code: [Select]
callq  0x100005d80  <fpc_ansistr_to_shortstr>
00000001000016F6 488d95f0feffff           lea    -0x110(%rbp),%rdx
00000001000016FD 4889e9                   mov    %rbp,%rcx
0000000100001700 e87b000000               callq  0x100001780 <COMPAREFUNC>
[/code

AlexTP

  • Hero Member
  • *****
  • Posts: 2680
    • UVviewsoft
Re: Pls help to optimize this WideChar buffer search
« Reply #16 on: January 04, 2022, 09:16:51 pm »
Quote
Are you sure you enabled $h+ ?
$H+ was forgotten in original test; now it's there in test version2.

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 12715
  • FPC developer.
Re: Pls help to optimize this WideChar buffer search
« Reply #17 on: January 04, 2022, 10:12:26 pm »
I played a bit and came to this ( very quick and dirty):
Code: Delphi  [Select][+][-]
  1. //compares String item with WideChar buffer (ASCII string in wide buffer)
  2. function CompareFunc(ABuf: PWideChar; ABufLen: integer; const SItem: AnsiString): integer; inline;
  3. var
  4.   NCmp,hb: integer;
  5.   i: PtrInt;
  6.   pp : pchar;
  7. begin
  8.   //compare 1st char
  9.   NCmp:= Ord(ABuf[0])-Ord(SItem[1]);
  10.   if NCmp<>0 then Exit(NCmp);
  11.  
  12.   //compare others
  13.   hb:=Min(ABufLen, Length(SItem));
  14.   pp:=@sitem[2];
  15.   inc(abuf);
  16.   for i:= 1{>0} to hb-1 do
  17.   begin
  18.     NCmp:= ord(abuf^)-ord(pp^) ;
  19.     inc(abuf); inc(pp);
  20.     if NCmp<>0 then break; // avoid an additional jump due to preparing exit() argument
  21.   end;
  22.  if NCmp<>0 then exit(ncmp);
  23.  
  24.   Result:= ABufLen-Length(SItem);
  25. end;
  26.  

it seems to be slightly better, but the results of both vary too much,  (maybe because I'm on a laptop?)

AlexTP

  • Hero Member
  • *****
  • Posts: 2680
    • UVviewsoft
Re: Pls help to optimize this WideChar buffer search
« Reply #18 on: January 04, 2022, 10:16:50 pm »
@macrov, thanks but the result is almost the same (custom find is even slower now, seems).

Quote
user@PC:~/Documents/stringlist_widebuffer$ ./cmp_buf
StrList.Find: 77 / custom find: 84
StrList.Find/bad: 86 / custom find/bad: 91
user@PC:~/Documents/stringlist_widebuffer$ ./cmp_buf
StrList.Find: 68 / custom find: 105
StrList.Find/bad: 86 / custom find/bad: 79
user@PC:~/Documents/stringlist_widebuffer$ ./cmp_buf
StrList.Find: 85 / custom find: 96
StrList.Find/bad: 80 / custom find/bad: 79

balazsszekely

  • Guest
Re: Pls help to optimize this WideChar buffer search
« Reply #19 on: January 05, 2022, 03:14:56 pm »
@Alextp

Did you try a HashList as in the attachment? Is much faster at my side.
Quote
D:\Test>cmp_buf_2
StrList.Find: 517 / custom find: 16
StrList.Find/bad: 546 / custom find/bad: 15


AlexTP

  • Hero Member
  • *****
  • Posts: 2680
    • UVviewsoft
Re: Pls help to optimize this WideChar buffer search
« Reply #20 on: January 05, 2022, 04:34:51 pm »
@GetMem
Nice idea.... but result (after you add 'sl.UseLocale:= False') is faster only in 'bad case', search for non-existing str.

Quote
StrList.Find: 79 / custom find: 89
StrList.Find/bad: 74 / custom find/bad: 64

StrList.Find: 75 / custom find: 88
StrList.Find/bad: 82 / custom find/bad: 67

StrList.Find: 76 / custom find: 86
StrList.Find/bad: 81 / custom find/bad: 71

balazsszekely

  • Guest
Re: Pls help to optimize this WideChar buffer search
« Reply #21 on: January 05, 2022, 05:09:38 pm »
@Alextp
Quote
Nice idea.... but result (after you add 'sl.UseLocale:= False') is faster only in 'bad case', search for non-existing str.
Yes, with X86_64 you get similar results, however on X86 is way faster. I'm trying to figure out why is much slower on 64 bit, also I'm wondering what would happen with a few optimization(-o, ...), unfortunately it's not my area of expertise(to be perfectly honest I suck at it :D).

PS: Please note that as the list grows, the hash method will be increasingly faster. You will see the difference when the list is bigger then 10k.
« Last Edit: January 05, 2022, 05:18:51 pm by GetMem »

AlexTP

  • Hero Member
  • *****
  • Posts: 2680
    • UVviewsoft
Re: Pls help to optimize this WideChar buffer search
« Reply #22 on: January 05, 2022, 05:47:34 pm »
My app has StringList with 30 to e.g. 500 items. Not 10k items.
Also a note for HashList: I am not sure you do it correct:

Code: Pascal  [Select][+][-]
  1.   for I := 0 to ASL.Count - 1 do
  2.   begin
  3.     Hash := MD5Print(MD5String(ASL.Strings[I]));
  4.     New(PStr);
  5.     PStr^ := ASL.Strings[I];
  6.     AHL.Add(Hash, PStr);
  7.   end;

because on collision (same hash str for different long strings) it fails, yes?

balazsszekely

  • Guest
Re: Pls help to optimize this WideChar buffer search
« Reply #23 on: January 05, 2022, 05:58:00 pm »
@Alextp
Quote
My app has StringList with 30 to e.g. 500 items. Not 10k items.
Then I don't see the point of optimization.

Quote
Also a note for HashList: I am not sure you do it correct:
because on collision (same hash str for different long strings) it fails, yes?
Different strings cannot have the same hash, even with md5 the chances of random collisions are incredibly small.
You put sl.Duplicates:= dupIgnore, so there will be no collision.

AlexTP

  • Hero Member
  • *****
  • Posts: 2680
    • UVviewsoft
Re: Pls help to optimize this WideChar buffer search
« Reply #24 on: January 05, 2022, 06:25:31 pm »
@GetMem
Search-str will be the 'words' from source code, and StrList (HashList) will have list of 'std IDs of syntax'. Collisions are possible - hashes may be equal but strings are not.

balazsszekely

  • Guest
Re: Pls help to optimize this WideChar buffer search
« Reply #25 on: January 05, 2022, 06:46:24 pm »
@Alextp
Quote
Search-str will be the 'words' from source code, and StrList (HashList) will have list of 'std IDs of syntax'. Collisions are possible - hashes may be equal but strings are not.
Now I have no idea what are you talking about.
There is a surjective relation between the stringlist and hashlist. Each string has a unique hash equivalent, more over the hashlist also contains the original strings too:
Code: Pascal  [Select][+][-]
  1. Hash := HL.NameOfIndex(I);
  2. PStr := HL.Items[I];
  3. OrigStr := PStr^

Hash and OrigStr are in pair, that's the whole point. Searching in Hash list is much faster because it was build for speed.
I have nothing more to say, good luck with the optimization.

AlexTP

  • Hero Member
  • *****
  • Posts: 2680
    • UVviewsoft
Re: Pls help to optimize this WideChar buffer search
« Reply #26 on: January 05, 2022, 07:10:46 pm »
Thanks, you gave a useful direction for me. I will think about this more.
(I am not sure I was right about collision)
« Last Edit: January 05, 2022, 07:13:03 pm by Alextp »

 

TinyPortal © 2005-2018