Free Pascal => FPC development => Topic started by: edgarrod71 on March 23, 2023, 05:13:12 am

Title: CompareText improvement
Post by: edgarrod71 on March 23, 2023, 05:13:12 am

Hi, I'm always trying to improve the language I love, so I want to share a faster function for comparing strings, would you add it to sysstrh.inc?

Code: Pascal

type
  TFoT = (Falso, Verdadero);
 
class function CompareText(const S1, S2: string): integer; inline;  // sysstrh.inc
var
  i, count, count1, count2: integer; Chr1, Chr2: byte;
  P1, P2: PChar;
begin
  Count1 := Length(S1);         // 5
  Count2 := Length(S2);         // 5
  if (Count1>Count2) then       // 3
    Count := Count2             // 3
  else
    Count := Count1;            // 2
  i := 0;                       // 1 mov %eax, -0x18(%rbp)
  if count>0 then               // 2 ____ tot 21 
    begin
      P1 := @S1[1];             // 2
      P2 := @S2[1];             // 2
      while i < Count do        // 2
        begin
          Chr1 := byte(p1^);    // 3
          Chr2 := byte(p2^);    // 3
          if Chr1 <> Chr2 then  // 3
            begin
              if Chr1 in [97..122] then  // 4
                dec(Chr1,32);            // 1 subb
              if Chr2 in [97..122] then  // 4
                dec(Chr2,32);            // 1
              if Chr1 <> Chr2 then       // 3 Break implicit jmp?
                Break;
            end;
          Inc(P1); Inc(P2); Inc(I);      // 3
        end;
    end;
  if i < Count then   // simply rest?    // 3
    result := Chr1-Chr2                  // 5
  else
    result := count1-count2;             // 4   tot 64 asm instructions
end;
 
Class function TextComp(const S1, S2: string): boolean; inline;
var L1, L2: integer;
  i: integer = 0;
  P1, P2: PChar;
  Chr1, Chr2: byte;
begin
  L1 := length(S1);                  // 5
  L2 := length(S2);                  // 5
  Result := L1 = L2;                 // 3
  if (L1 > 0) and Result then begin  // 4 Tot 17
    P1 := @S1[1];                    // 2
    P2 := @S2[1];                    // 2
    repeat
      Chr1 := ord(P1^);              // 3
      Chr2 := ord(P2^);              // 3
      if Chr1 <> Chr2 then begin     // 3
        if Chr1 in [$41..$5A] then   // 4
          Chr1 := Chr1 or $20;       // 1 orb
        if Chr2 in [$41..$5A] then   // 4
          Chr2 := Chr2 or $20;       // 1
        if Chr1 <> Chr2 then begin   // 3
          Result := False;           // 1
          Exit;                      // 1
        end;
      end;
      inc(i); inc(P1); inc(P2);   // 3
    until i >= L1;                // 3    tot 51
  end;
end;
 
const Maxi = 10000000;
var
  S1: string = 'FrEe PaScAl and DELPHI are easier THAN C++';
  S2: String = 'fReE pAsCaL AND delphi ARE EASIER than c++';
  i, j: integer;
  B: boolean;
  t: longint;
  S: String;
initialization
  writeln(Format('Comparing S1=''%s'' and S2=''%s''', [S1, S2]));
 
  t := GetTickCount64();
  for i := -Maxi to Maxi do
    j := CompareText(S1, S2);
  t := GetTickCount64() - t;
  writeln(Format('CompareText: %d, %d ms', [j, t]));
 
  t := GetTickCount64();
  for i := -Maxi to Maxi do
    B := TextComp(S1, S2);
  t := GetTickCount64() - t;
  WriteStr(S, TFoT(B));
  writeln(Format('TextComp: %s, %d ms', [S, t]));
 
  halt(0);
end.

--- Benchmarks ---

Code: Text

Comparing S1='FrEe PaScAl and DELPHI are easier THAN C++' and S2='fReE pAsCaL AND delphi ARE EASIER than c++'
CompareText: 0, 10117 ms
TextComp: Verdadero, 8839 ms

more than 10% faster!

if you add this function, please, mention my email somewhere: edgarrod71@gmail.com

Title: Re: CompareText improvement
Post by: marcov on March 23, 2023, 09:37:38 am

The standard comparetext on the Windows target supports casing of accents and other special letters.

Title: Re: CompareText improvement
Post by: Martin_fr on March 23, 2023, 09:58:04 am

Somewhere in LazUtils there are some similar functions "Compare....Fast"

The idea was to compare like you do, until you hit a unicode char > 127.
So when comparing terms from the English language you would benefit.

The problem is that even that approach causes failures (and I am not sure whether they are currently fixed).

One issue is, that those comparisons could not be used for sorting (even if sort order did not matter). Because they were not transitive.
They may have reported that
- String2 goes AFTER String1
- String3 goes AFTER String2
- String3 goes BEFORE String1
And sorting can not fulfil the last statement.

Some of that was fixable (but I am not sure whether it is currently fixed).
Other issues (don't recall their nature) might not be fixable.

So those functions can be used for equality check, but not sorting/ordering.
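
A minimal sketch of how such a cycle can arise. It assumes a hypothetical hybrid comparison, HybridCompare, that folds to upper case when both strings are pure ASCII and folds to lower case otherwise. This is not the LazUtils code; the function names and the two folding rules are made up purely to show the effect.

```pascal
program NonTransitiveDemo;
{$mode objfpc}{$H+}

// Hypothetical hybrid comparison, NOT the LazUtils implementation.

function IsAscii(const S: AnsiString): Boolean;
var
  i: SizeInt;
begin
  Result := True;
  for i := 1 to Length(S) do
    if Ord(S[i]) > 127 then
      Exit(False);
end;

// Byte-wise compare after folding A..Z / a..z to a single case.
function FoldCompare(const S1, S2: AnsiString; ToUpper: Boolean): Integer;
var
  i, n: SizeInt;
  c1, c2: Integer;
begin
  n := Length(S1);
  if Length(S2) < n then
    n := Length(S2);
  for i := 1 to n do
  begin
    c1 := Ord(S1[i]);
    c2 := Ord(S2[i]);
    if ToUpper then
    begin
      if (c1 >= 97) and (c1 <= 122) then Dec(c1, 32);
      if (c2 >= 97) and (c2 <= 122) then Dec(c2, 32);
    end
    else
    begin
      if (c1 >= 65) and (c1 <= 90) then Inc(c1, 32);
      if (c2 >= 65) and (c2 <= 90) then Inc(c2, 32);
    end;
    if c1 <> c2 then
      Exit(c1 - c2);
  end;
  Result := Length(S1) - Length(S2);
end;

function HybridCompare(const S1, S2: AnsiString): Integer;
begin
  // Assumed rule: ASCII-only pairs fold to upper case, all others to lower case.
  Result := FoldCompare(S1, S2, IsAscii(S1) and IsAscii(S2));
end;

var
  A, B, C: AnsiString;
begin
  A := 'a';
  B := '_';
  SetLength(C, 2);
  C[1] := '`';
  C[2] := Chr(200);   // any byte > 127 forces the fallback path
  WriteLn('A before B: ', HybridCompare(A, B) < 0);
  WriteLn('B before C: ', HybridCompare(B, C) < 0);
  WriteLn('C before A: ', HybridCompare(C, A) < 0);
end.
```

All three lines report TRUE, because '_' and '`' sit between 'Z' and 'a' in ASCII, so the two folding rules disagree about where they sort. A sort routine cannot satisfy all three answers at once.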

Title: Re: CompareText improvement
Post by: Stefan Glienke on March 23, 2023, 05:07:58 pm

If that CompareText is a copy from the RTL then there is more to be gained than just 10% 8-)

Comparing byte by byte will not get you anywhere performance-wise.

For reference: https://fastcode.sourceforge.net/challenge_content/CompareText.html

Title: Re: CompareText improvement
Post by: edgarrod71 on March 23, 2023, 07:49:41 pm

@Stefan Glienke 10% in everything is really an improvement. When you get any discount 10% on expensive things you start to smile... ;)

By the way, where is your attachment in toll lazarus version?

@Martin_fr I know that, it's simply a replacement for CompareText or an option to it.

Title: Re: CompareText improvement
Post by: PascalDragon on March 23, 2023, 09:39:06 pm

Quote from: edgarrod71 on March 23, 2023, 07:49:41 pm

@Martin_fr I know that, it's simply a replacement for CompareText or an option to it.

It is not a replacement, because CompareText can be used for sorting, yours can not. And it won't be added as an alternative, because there already is the existing CompareText.

Title: Re: CompareText improvement
Post by: edgarrod71 on March 24, 2023, 06:57:28 pm

I'm not saying you're wrong... but! Inside fpcsrc there are 246 calls to CompareText and in Lazarus source code there are 180+ (depending on the components you install) and not all of them are meant for sorting!

When both strings are the same, the benchmarks give us approx. 10%, but when they are different the benchmarks give us more than 90% acceleration, depending on what we want to achieve. For instance, I discovered there was a problem with RegisterFileLocation in weblaz, and when I saw another error in it, I found that CompareText is not the best solution.

So, I think we must put ego aside and take a humble view; our community will benefit from faster apps.

Title: Re: CompareText improvement
Post by: wp on March 24, 2023, 07:10:46 pm

10% only if every program would consist only of CompareText calls! Your improvement will not be detectable in practice. Conversely, such micro-optimization usually introduces new bugs.

Title: Re: CompareText improvement
Post by: domasz on March 24, 2023, 08:18:04 pm

How about working in 32/64 bits? Perhaps I am not seeing some details but something like below should be faster:

Code: Pascal

function AnsiCompare(S1,S2: String);
var A1: array of Int64;
    A2: array of Int64;
begin
  SetLength...
  FillChar...
  Move(S1[1], A1[0], Length(S1));
  Move(S2[1], A2[0], Length(S2));
 
  for i:=0 to Length-1 do if A1[i] < A2[i] then ...
end;

Title: Re: CompareText improvement
Post by: edgarrod71 on March 24, 2023, 09:03:13 pm

@domasz, looking above the lake, water seems to be delicious, nevertheless, going deep, I found that this calls all of these:

Code: Pascal

const
  MB_CUR_MAX = 10;
 
type
  TFoT = (Falso, Verdadero);
  wint_t = longint;
  clonglong = wint_t;
  mbstate_t = record
    case byte of
      0: (__mbstate8: array[0..127] of char);
      1: (_mbstateL: clonglong); { for alignment }
  end;
  size_t = qword;
  wchar_t = longint;
  pmbstate_t = ^mbstate_t;
 
function wcrtomb(s: pchar; wc: wchar_t; ps: pmbstate_t): size_t; cdecl; external clib name 'wcrtomb';
 
procedure EnsureAnsiLen(var S: AnsiString; const len: SizeInt); inline;
begin
  if (len>length(s)) then
    if (length(s) < 10*256) then
      setlength(s,length(s)+10)
    else
      setlength(s,length(s)+length(s) shr 8);
end;
 
procedure ConcatCharToAnsiStr(const c: char; var S: AnsiString; var index: SizeInt);
begin
  EnsureAnsiLen(s,index);
  pchar(@s[index])^:=c;
  inc(index);
end;
 
{ concatenates an utf-32 char to a widestring. S *must* be unique when entering. }
{$if not(defined(beos) and not defined(haiku))}
procedure ConcatUTF32ToAnsiStr(const nc: wint_t; var S: AnsiString; var index: SizeInt; var mbstate: mbstate_t);
{$else not beos}
procedure ConcatUTF32ToAnsiStr(const nc: wint_t; var S: AnsiString; var index: SizeInt);
{$endif beos}
var
  p     : pchar;
  mblen : size_t;
begin
  { we know that s is unique -> avoid uniquestring calls}
  p:=@s[index];
  if (nc<=127) then
    ConcatCharToAnsiStr(char(nc),s,index)
  else
    begin
      EnsureAnsiLen(s,index+MB_CUR_MAX);
{$if not(defined(beos) and not defined(haiku))}
      mblen:=wcrtomb(p,wchar_t(nc),@mbstate);
{$else not beos}
      mblen:=wctomb(p,wchar_t(nc));
{$endif not beos}
      if (mblen<>size_t(-1)) then
        inc(index,mblen)
      else
        begin
          { invalid wide char }
          p^:='?';
          inc(index);
        end;
    end;
end;
 
function UpperAnsiString(const s : AnsiString) : AnsiString;
  var
    i, slen,
    resindex : SizeInt;
    mblen    : size_t;
{$if not(defined(beos) and not defined(haiku))}
    ombstate,
    nmbstate : mbstate_t;
{$endif beos}
    wc       : wchar_t;
  begin
{$if not(defined(beos) and not defined(haiku))}
    fillchar(ombstate,sizeof(ombstate),0);
    fillchar(nmbstate,sizeof(nmbstate),0);
{$endif beos}
    slen:=length(s);
    SetLength(result,slen+10);
    i:=1;
    resindex:=1;
    while (i<=slen) do
      begin
        if (s[i]<=#127) then
          begin
            wc:=wchar_t(s[i]);
            mblen:= 1;
          end
        else
{$if not(defined(beos) and not defined(haiku))}
          mblen:=mbrtowc(@wc, pchar(@s[i]), slen-i+1, @ombstate);
{$else not beos}
          mblen:=mbtowc(@wc, pchar(@s[i]), slen-i+1);
{$endif beos}
        case mblen of
          size_t(-2):
            begin
              { partial invalid character, copy literally }
              while (i<=slen) do
                begin
                  ConcatCharToAnsiStr(s[i],result,resindex);
                  inc(i);
                end;
            end;
          size_t(-1), 0:
            begin
              { invalid or null character }
              ConcatCharToAnsiStr(s[i],result,resindex);
              inc(i);
            end;
          else
            begin
              { a valid sequence }
              { even if mblen = 1, the uppercase version may have a }
              { different length                                     }
              { We can't do anything special if wchar_t is 16 bit... }
{$if not(defined(beos) and not defined(haiku))}
              ConcatUTF32ToAnsiStr(towupper(wint_t(wc)),result,resindex,nmbstate);
{$else not beos}
              ConcatUTF32ToAnsiStr(towupper(wint_t(wc)),result,resindex);
{$endif not beos}
              inc(i,mblen);
            end;
          end;
      end;
    SetLength(result,resindex-1);
  end;
 
function StrCompAnsiIntern(s1,s2 : PChar; len1, len2: PtrInt; canmodifys1, canmodifys2: boolean): PtrInt;
  var
    a,b: pchar;
    i: PtrInt;
  begin
    if not(canmodifys1) then
      getmem(a,len1+1)
    else
      a:=s1;
    for i:=0 to len1-1 do
      if s1[i]<>#0 then
        a[i]:=s1[i]
      else
        a[i]:=#32;
    a[len1]:=#0;
 
    if not(canmodifys2) then
      getmem(b,len2+1)
    else
      b:=s2;
    for i:=0 to len2-1 do
      if s2[i]<>#0 then
        b[i]:=s2[i]
      else
        b[i]:=#32;
    b[len2]:=#0;
    result:=strcoll(a,b);
    if not(canmodifys1) then
      freemem(a);
    if not(canmodifys2) then
      freemem(b);
  end;
 
 
function AnsiCompareText(const S1, S2: ansistring): PtrInt;
  var
    a, b: AnsiString;
  begin
    a:=UpperAnsistring(s1);
    b:=UpperAnsistring(s2);
    result:=StrCompAnsiIntern(pchar(a),pchar(b),length(a),length(b),true,true);
  end;
 

So I think it is not faster... :(

Title: Re: CompareText improvement
Post by: edgarrod71 on March 24, 2023, 09:11:20 pm

Benchmarks with same text and different...

Code: Bash

fer-mb-pro:programming stuff fernandodager$ ./string_comparison
Comparing S1='definitely, FrEe PaScAl and DELPHI are easier THAN C++ and Bill Sucks' and S2='DEFiNITELY, fReE pAsCaL AND delphi ARE EASIER than c++ AND bILL sUCKS'
CompareText: 0, 18504 ms
TextComp: Verdadero, 16585 ms
AnsiCompareText: 0, 60569 ms
fer-mb-pro:programming stuff fernandodager$ ./string_comparison
Comparing S1='definitely, FrEe PaScAl and DELPHI are easier THAN C++ and Bill Sucks' and S2='REFiNITELY, fReE pAsCaL AND delphi ARE EASIER than c++ AND bILL sUCKS'
CompareText: -14, 622 ms
TextComp: Falso, 575 ms
AnsiCompareText: -14, 59806 ms
fer-mb-pro:programming stuff fernandodager$ 

Title: Re: CompareText improvement
Post by: edgarrod71 on March 24, 2023, 09:18:10 pm

For instance, I'm really sure if we change all of these function calls with the proposed one, we'll have a better debugger, synedit, and propedits, and everything even!

Title: Re: CompareText improvement
Post by: PascalDragon on March 24, 2023, 10:11:51 pm

Quote from: edgarrod71 on March 24, 2023, 06:57:28 pm

So, I think we must put ego aside and take a humble view; our community will benefit from faster apps.

This has nothing to do with ego, but with maintainability which is the most important factor in a project like this, even more important than performance! There will only be one CompareText function and that must allow for sort comparison or it's absolutely useless as a replacement. Any performance improvement must be secondary to that requirement.

Title: Re: CompareText improvement
Post by: Martin_fr on March 24, 2023, 10:13:27 pm

Quote from: edgarrod71 on March 24, 2023, 09:18:10 pm

For instance, I'm really sure if we change all of these function calls with the proposed one, we'll have a better debugger, synedit, and propedits, and everything even!

How so?

Ignoring the fact that some of those may need to compare non-Latin text, which ones are sooooo slow for you?
Some of them might already use the "compare...fast" methods I mentioned (where and when actually possible).

For example the debugger (FpDebug) is using its own methods where possible. But where it needs UTF-8 it needs to have the full deal. And also it minimizes comparisons by using pre-computed hashes, therefore even if you double the speed of text comparison, it would likely make less than one percent of an improvement.
Actually I happen to have a recent (Laz 2.3 based) valgrind/callgrind analysis. It spends about
1% in various fpc comparetext
1% in uppercase
0.1% in its own compare text
0.05% in lowercase
So about 2%. If you could write replacements with full Unicode (non-Latin incl.) support, and make them twice as fast, then the total speedup of the debugger would be 1%. But your replacement doesn't double the speed and doesn't have full Unicode support.
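
The arithmetic behind that estimate, as a small sketch (this is Amdahl's law). TimeSaved is a name made up for this example, and the factor 1.1 is an assumed stand-in for "about 10% faster", not a measured value.

```pascal
program SpeedupEstimate;
{$mode objfpc}{$H+}

// Fraction of the total run time saved when a part that takes
// Share of the time (0..1) becomes Factor times faster.
function TimeSaved(Share, Factor: Double): Double;
begin
  Result := Share * (1.0 - 1.0 / Factor);
end;

begin
  // 2% of the run time in comparisons, made twice as fast
  WriteLn('2x faster:   ', (TimeSaved(0.02, 2.0) * 100):0:2, '% saved');
  // same share, comparisons only 1.1x faster (assumed factor)
  WriteLn('1.1x faster: ', (TimeSaved(0.02, 1.1) * 100):0:2, '% saved');
end.
```

The first line gives the 1% mentioned above; the second shows how little remains when the routine itself gains only around a tenth.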

Mind however, that the speed of those things that you listed depends heavily on "release builds". If you compile the IDE with heaptrc (-gh) then that has an impact. (not due to compare, but affecting the listed items in other ways).

In that context: If you use fpc 3.2.2 => do NOT use -O2 or higher. It is broken. The IDE will crash (most likely when you debug your project).
But even a release build with -O1 does quite well.

Title: Re: CompareText improvement
Post by: Martin_fr on March 24, 2023, 10:20:31 pm

Btw, about benchmarking.

If you test functions, there is a chance that the result will be influenced by other code (even unrelated code) around your comparison code. The caching and processing the CPU uses can be influenced by this, and speed of the exact same code can vary by up to 30%.

So time differences of up to 30% may not be caused by your code. They may be caused by other side effects.
Those side effects may then not happen in a real application.
Or in the real app the side effects may speed up the current fpc code, but not yours.....

That said, some of your measurements are in an expected range, and therefore likely not affected by this.
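
One way to reduce such side effects is to time each candidate several times, keep the fastest round, and repeat the measurements in the opposite order. This is only a rough sketch: the Rounds and Iterations values, the wrapper functions and the test strings are arbitrary choices for the example, and it does not remove layout or caching effects entirely.

```pascal
program BenchSketch;
{$mode objfpc}{$H+}
uses
  SysUtils;

type
  TCompareFunc = function(const S1, S2: string): Integer;

const
  Rounds = 5;             // arbitrary, tune for your machine
  Iterations = 1000000;   // arbitrary, tune for your machine

var
  Sink: Integer = 0;      // consumes results so the calls are not dead code

// Thin wrappers so both candidates share one signature.
function RunCompareText(const S1, S2: string): Integer;
begin
  Result := CompareText(S1, S2);
end;

function RunAnsiCompareText(const S1, S2: string): Integer;
begin
  Result := AnsiCompareText(S1, S2);
end;

// Fastest of several rounds, in milliseconds.
function BestOf(F: TCompareFunc; const S1, S2: string): QWord;
var
  r, i: Integer;
  t: QWord;
begin
  Result := High(QWord);
  for r := 1 to Rounds do
  begin
    t := GetTickCount64;
    for i := 1 to Iterations do
      Sink := Sink + F(S1, S2);
    t := GetTickCount64 - t;
    if t < Result then
      Result := t;
  end;
end;

var
  A, B: string;
begin
  A := 'FrEe PaScAl and DELPHI';
  B := 'fReE pAsCaL AND delphi';
  WriteLn('CompareText:     ', BestOf(@RunCompareText, A, B), ' ms');
  WriteLn('AnsiCompareText: ', BestOf(@RunAnsiCompareText, A, B), ' ms');
  // Same measurements again in the opposite order.
  WriteLn('AnsiCompareText: ', BestOf(@RunAnsiCompareText, A, B), ' ms');
  WriteLn('CompareText:     ', BestOf(@RunCompareText, A, B), ' ms');
end.
```

If the two passes disagree noticeably, the difference is more likely an artifact of the test setup than of the functions themselves.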

Title: Re: CompareText improvement
Post by: edgarrod71 on March 25, 2023, 01:59:40 am

You're right, I had that in mind. The best way to test is to restart the machine and change the order of the functions before testing. Even so, it gives me the same results.

Title: Re: CompareText improvement
Post by: domasz on March 25, 2023, 10:50:08 am

Quote from: edgarrod71 on March 24, 2023, 09:03:13 pm

@domasz, looking above the lake, water seems to be delicious, nevertheless, going deep, I found that this calls all of these:

I kinda think we should have multiple different CompareText for different uses. If I want to compare strings, e.g. ones containing SHA-1 hashes, then I don't need all this fancy code.

Title: Re: CompareText improvement
Post by: Martin_fr on March 25, 2023, 11:45:22 am

Quote from: domasz on March 25, 2023, 10:50:08 am

I kinda think we should have multiple different CompareText for different uses. If I want to compare strings, e.g. ones containing SHA-1 hashes, then I don't need all this fancy code.

Then you use CompareByte. It already exists.

And if you end up having so many calls to CompareWhateverText that speed becomes a noticeable issue, then maybe you should use an entirely different solution? Trees, Hashes, ...
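
A short sketch of the CompareByte route for hash strings. SameBytes is a helper name invented for this example, and the hash literals are just sample values. It assumes both strings are already stored in the same letter case, since CompareByte is a plain binary comparison.

```pascal
program CompareByteDemo;
{$mode objfpc}{$H+}

// Binary equality for strings that are already in one fixed case.
function SameBytes(const A, B: AnsiString): Boolean;
begin
  Result := Length(A) = Length(B);
  if Result and (Length(A) > 0) then
    Result := CompareByte(A[1], B[1], Length(A)) = 0;
end;

var
  H1, H2, H3: AnsiString;
begin
  H1 := 'da39a3ee5e6b4b0d3255bfef95601890afd80709';
  H2 := 'da39a3ee5e6b4b0d3255bfef95601890afd80709';
  H3 := 'DA39A3EE5E6B4B0D3255BFEF95601890AFD80709';
  WriteLn('H1 = H2: ', SameBytes(H1, H2));
  WriteLn('H1 = H3: ', SameBytes(H1, H3));  // differs only in case, so not equal here
end.
```

If the hashes can arrive in mixed case, convert them to one case when they are stored, not on every comparison.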

Title: Re: CompareText improvement
Post by: domasz on March 25, 2023, 12:39:36 pm

Quote from: Martin_fr on March 25, 2023, 11:45:22 am

Then you use CompareByte. It already exists.

Interesting, thanks! I don't know anything like this from my Delphi times.

Title: Re: CompareText improvement
Post by: Thaddy on March 25, 2023, 01:14:59 pm

Stupid questions get stupid answers.
It is highly platform dependent how to handle this.

Title: Re: CompareText improvement
Post by: BrunoK on March 26, 2023, 02:50:58 pm

Quote from: edgarrod71 on March 23, 2023, 05:13:12 am

Hi, I'm always trying to improve the language I love, so I want to share a faster function for comparing strings, would you add it to sysstrh.inc?

1 - I could not compile your code.
2 - After making a few changes to compile it the results are very mixed, with no evident gain of one of the functions.
3 - Your TextComp returns TRUE for equal or FALSE for different. It should return <0, 0 or >0 depending on the comparison. Conclusion, your function does not do what is expected.
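
To illustrate point 3, here is a minimal insertion sort (SortText is a name made up for this sketch) that relies only on the sign of the CompareText result. A function that answers just TRUE or FALSE cannot tell it which of two different strings goes first.

```pascal
program SortNeedsOrder;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Insertion sort driven only by the sign of CompareText.
procedure SortText(var Items: array of string);
var
  i, j: Integer;
  Tmp: string;
begin
  for i := 1 to High(Items) do
  begin
    Tmp := Items[i];
    j := i - 1;
    // "> 0" means Items[j] sorts after Tmp, so shift it up.
    while (j >= 0) and (CompareText(Items[j], Tmp) > 0) do
    begin
      Items[j + 1] := Items[j];
      Dec(j);
    end;
    Items[j + 1] := Tmp;
  end;
end;

var
  Names: array[0..3] of string = ('delphi', 'Pascal', 'C++', 'fpc');
  i: Integer;
begin
  SortText(Names);
  for i := 0 to High(Names) do
    WriteLn(Names[i]);
end.
```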

Title: Re: CompareText improvement
Post by: Stefan Glienke on March 27, 2023, 12:29:42 pm

Quote from: edgarrod71 on March 23, 2023, 07:49:41 pm

By the way, where is your attachment in toll lazarus version?

Sorry, I cannot parse that.

WRT CompareText if you want to knock out some speed there then compare 4 (32bit) or 8 (under 64bit) bytes at once.

Title: Re: CompareText improvement
Post by: marcov on March 27, 2023, 01:57:28 pm

Quote from: domasz on March 25, 2023, 12:39:36 pm

Quote from: Martin_fr on March 25, 2023, 11:45:22 am

Then you use CompareByte. It already exists.

Interesting, thanks! I don't know anything like this from my Delphi times.

Some functions for certain intrinsic operations were identified as nice-to-haves when FPC went multi-architecture in the 2003-2005 timeframe. Using these in low-level and string routines cut back the need for assembler significantly without compromising performance much.

Things like comparing a block of memory (compare*, with * = byte/word/dword/qword), scanning for a byte/word/dword in a block of memory (index*), filling a block of memory (fill*), etc.
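
A small sketch of three of these System unit helpers on a byte buffer. The buffer size and values are arbitrary.

```pascal
program BlockHelpers;
{$mode objfpc}{$H+}

var
  A, B: array[0..15] of Byte;
begin
  FillByte(A, SizeOf(A), 0);   // fill a block with one value
  FillByte(B, SizeOf(B), 0);
  B[9] := $2A;

  // IndexByte: 0-based position of the first match, or -1 if absent
  WriteLn('IndexByte:    ', IndexByte(B, SizeOf(B), $2A));
  // CompareByte: 0 when the blocks are equal, non-zero otherwise
  WriteLn('CompareByte:  ', CompareByte(A, B, SizeOf(A)));
  // CompareDWord takes its length in DWords, not in bytes
  WriteLn('CompareDWord: ', CompareDWord(A, B, SizeOf(A) div 4));
end.
```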

Title: Re: CompareText improvement
Post by: edgarrod71 on March 30, 2023, 06:04:03 pm

Quote

This has nothing to do with ego, but with maintainability which is the most important factor in a project like this, even more important than performance! There will only be one CompareText function and that must allow for sort comparison or it's absolutely useless as a replacement. Any performance improvement must be secondary to that requirement.

This kind of thinking makes C++ programmers stay there and not consider Pascal as an option. I love Pascal, but I keep in mind that we must find better code to make it faster. Industry is seeking performance; that's how everybody succeeds.

Title: Re: CompareText improvement
Post by: PascalDragon on March 30, 2023, 09:16:55 pm

Quote from: edgarrod71 on March 30, 2023, 06:04:03 pm

This kind of thinking makes C++ programmers stay there and not consider Pascal as an option. I love Pascal, but I keep in mind that we must find better code to make it faster. Industry is seeking performance; that's how everybody succeeds.

For most applications performance isn't that important. The main point of Pascal is ease of use and that you can easily create cross platform applications.

Title: Re: CompareText improvement
Post by: BeniBela on March 31, 2023, 12:31:51 am

An actually fast version would need to use SSE/AVX and be optimized for specific processor versions

Title: Re: CompareText improvement
Post by: Thaddy on March 31, 2023, 10:24:24 am

And a sane answer would leave out any CPU reference.

Title: Re: CompareText improvement
Post by: jcmontherock on March 31, 2023, 10:56:05 am

Did you compare your version with everything we already have?

Code: Pascal

WideCompareStr(WideString(s1), WideString(s2)); 
WideCompareText(WideString(s1), WideString(s2));
AnsiStrLIComp(PChar(s1), PChar(s2), Length(s1));
UTF8CompareText(s1, s2);
UTF8CompareLatinTextFast(s1, s2);
UTF8CompareStrCollated(s1, s2);
...
and the windows version with 'shlwapi.dll'

Title: Re: CompareText improvement
Post by: Martin_fr on March 31, 2023, 10:59:41 am

The real question is, how much time does an application actually spend in text comparison? (And that is case insensitive in this case)
And how much of this time should actually be spent in Unicode normalized text comparison?

Take the IDE. By all probability the place that would be most affected is the text search in the editor. But that is more than fast enough. And you have to take into account that even that is not spending all of its time in TextCompare. It has to look up each line (the text is not a continuous blob), store results, ...
If you do search on disk, it comes down to disk speed rather than search speed.
And after all, searching for text like that (if done case insensitive) should do Unicode normalization (which is missing). So it would have to use an even more complex processing. And hence not even gaining from the proposal.

There are other bits of code that currently call TextCompare. Not all of them should do that, or at least they should only do a small percentage of the calls that they do. But no one has bothered to do the real optimization on them.
Here the question is, if we decided they are too slow, should we give them a 1% or 2% uplift (which is what would very optimistically remain of the 10%, if we consider that CompareText is only a fraction of what they do), or should we change the logic they use, and gain an actual two-digit percentage?

And if we really were to provide some code that can compare "only English alphabet" case insensitive, why do we decide to only optimize for such a small part of the world? Then we should also have "only Chinese" and "only Arabic" and "only ...." versions (which all would be faster for the respective languages).

One example was comparing UUID. Well, I don't see why that needs a "English only" compare text. If I have plenty of UUID to compare, I make sure I store them all uppercase (I.e. convert them before storing), and then compare them binary. Which is even faster.
In fact, I would not even store them as text, I may consider them a base-36 number, convert them to a series of QWord, and have even less data to compare.

The same if I have to compare (maybe sort) a large amount of text. Uppercase it once, then sort it. This will work for all languages, and speed up the work more than using an "English only" version. (assuming a large enough amount of text, but if the text is small then there is no need for speed up)
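
A minimal sketch of the "convert once, then compare binary" idea for UUID-like keys. The UUID values are made-up samples. Note that SysUtils.UpperCase only converts a..z, which is enough for hex digits; for general text you would substitute a Unicode-aware uppercase routine at that one spot.

```pascal
program NormalizeOnce;
{$mode objfpc}{$H+}
uses
  SysUtils;

var
  Raw: array[0..2] of string = (
    '6F9619FF-8B86-D011-B42D-00C04FC964FF',
    '6f9619ff-8b86-d011-b42d-00c04fc964ff',
    '6F9619FF-8B86-D011-B42D-00C04FC96400');
  Keys: array[0..2] of string;
  i: Integer;
begin
  // Pay for the case conversion once, when the value is stored.
  for i := 0 to High(Raw) do
    Keys[i] := UpperCase(Raw[i]);

  // Every later comparison is a plain case-sensitive one.
  WriteLn('0 = 1: ', CompareStr(Keys[0], Keys[1]) = 0);
  WriteLn('0 = 2: ', CompareStr(Keys[0], Keys[2]) = 0);
end.
```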

All that said, yes there can be very special use cases that would benefit. But they are not common cases. They don't require the RTL to provide them with specialized pre-written code. They can easily have their own function doing such a very specialized comparison. And if they do have their own code, they may be able to tweak it even more than any RTL-provided code could be.

Title: Re: CompareText improvement
Post by: BeniBela on March 31, 2023, 05:13:58 pm

I had a case where like half of the time was spent in case-insensitive text comparison. But it was doing a case-insensitive Pos. For just 10 kilobytes of text, it would call the text comparison 10000 times. It became much faster when I changed it to compare the first character before calling the full comparison function

This one is faster if the strings are completely equal:

Code: Pascal

function CompareTextIsEqual(p1, p2: pansichar; const l: SizeInt): boolean;
var i: SizeInt = 0;
    alignedlen: sizeint;
    c1, c2:integer;
    block1, block2: qword;
begin
  alignedlen := l - l and 7;
  result := true;
  while i < l do begin
    while i < alignedlen do begin
      block1 := unaligned(PQWord(p1)^);
      block2 := unaligned(PQWord(p2)^);
      if block1 = block2 then begin
        inc(p1, 8);
        inc(p2, 8);
        inc(i, 8);
      end else break;
    end;
    c1 := ord(p1^);
    c2 := ord(p2^);
    if c1 <> c2 then begin
      if c1 in [97..122] then dec(c1, 32);
      if c2 in [97..122] then dec(c2, 32);
      if c1 <> c2 then begin result := false; exit; end;
    end;
    inc(p1); inc(p2); inc(i);
  end;
end;  

Unfortunately, it is slower when the strings are case-sensitively unequal but case-insensitively equal.

Title: Re: CompareText improvement
Post by: Martin_fr on March 31, 2023, 06:05:19 pm

Quote from: BeniBela on March 31, 2023, 05:13:58 pm

I had a case where like half of the time was spent in case-insensitive text comparison. But it was doing a case-insensitive Pos. For just 10 kilobytes of text, it would call the text comparison 10000 times. It became much faster when I changed it to compare the first character before calling the full comparison function

Could you not have lowercased the entire string once, perform the search/pos and take the results on the mixed case string?

Of course that does not work for Unicode where a lowercase char may have a different byte-len than its upper char. But since the "dec(c1, 32);" only works for English anyway...

Maybe even more clever (again leaving out stuff like normalization), get the "lookup char" (for which you want the pos in the long string) in both: upper and lower (that can still be done via true upper/lower handling all languages). Then compare each pos in the string against the 2 lookup chars. That should save almost all upper/lower conversions.

Title: Re: CompareText improvement
Post by: BeniBela on April 01, 2023, 01:25:35 am

Quote from: Martin_fr on March 31, 2023, 06:05:19 pm

Could you not have lowercased the entire string once, perform the search/pos and take the results on the mixed case string?

But string allocations are also slow

And it would need to lower case both strings.

Quote from: Martin_fr on March 31, 2023, 06:05:19 pm

Maybe even more clever (again leaving out stuff like normalization), get the "lookup char" (for which you want the pos in the long string) in both: upper and lower (that can still be done via true upper/lower handling all languages). Then compare each pos in the string against the 2 lookup chars. That should save almost all upper/lower conversions.

That is what I did. Which is why checking the first char of the searched string worked so well.
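
A rough sketch of that first-character filter, limited to ASCII. PosTextAscii and SameTextAscii are names invented for this example and are not the code from the application described above. The full comparison only runs at positions where the first character already matches in either case.

```pascal
program PosTextSketch;
{$mode objfpc}{$H+}

// ASCII-only, case-insensitive equality of two buffers of Len bytes.
function SameTextAscii(P1, P2: PAnsiChar; Len: SizeInt): Boolean;
var
  c1, c2: Byte;
begin
  while Len > 0 do
  begin
    c1 := Ord(P1^);
    c2 := Ord(P2^);
    if c1 <> c2 then
    begin
      if c1 in [97..122] then Dec(c1, 32);
      if c2 in [97..122] then Dec(c2, 32);
      if c1 <> c2 then
        Exit(False);
    end;
    Inc(P1); Inc(P2); Dec(Len);
  end;
  Result := True;
end;

// 1-based position of Needle in Haystack, or 0 if not found.
function PosTextAscii(const Needle, Haystack: AnsiString): SizeInt;
var
  i, Last: SizeInt;
  Lo, Up: AnsiChar;
begin
  Result := 0;
  if (Needle = '') or (Length(Needle) > Length(Haystack)) then
    Exit;
  // Precompute both cases of the first needle character once.
  Up := Needle[1];
  Lo := Needle[1];
  if Up in ['a'..'z'] then Up := Chr(Ord(Up) - 32);
  if Lo in ['A'..'Z'] then Lo := Chr(Ord(Lo) + 32);
  Last := Length(Haystack) - Length(Needle) + 1;
  for i := 1 to Last do
    if (Haystack[i] = Lo) or (Haystack[i] = Up) then
      if SameTextAscii(@Haystack[i], @Needle[1], Length(Needle)) then
        Exit(i);
end;

begin
  WriteLn(PosTextAscii('PASCAL', 'Free pascal rocks'));
  WriteLn(PosTextAscii('delphi', 'Free pascal rocks'));
end.
```

The first call finds the match at position 6, the second finds nothing and returns 0.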

Title: Re: CompareText improvement
Post by: BeniBela on April 04, 2023, 05:54:04 pm

This is generally fast:

Code: Pascal

function CompareTextFast(const S1, S2: string): Integer; overload;
 
var
  i, count, count1, count2, AlignedCount, CharByCharCount: sizeint;
  Chr1, Chr2: byte;
  P1, P2: PChar;
  Block1, Block2: qword;
begin
  Count1 := Length(S1);
  Count2 := Length(S2);
  if (Count1>Count2) then
    Count := Count2
  else
    Count := Count1;
  i := 0;
  if count>0 then
    begin
      AlignedCount := count - count and 7;
      P1 := @S1[1];
      P2 := @S2[1];
      while i < Count do
        begin
          while i < AlignedCount do begin
            Block1 := unaligned(PQWord(P1)^);
            Block2 := unaligned(PQWord(P2)^);
            if Block1 = Block2 then begin
              inc(p1, 8);
              inc(p2, 8);
              inc(i, 8);
            end else break;
          end;
          CharByCharCount := i + 512;
          if CharByCharCount > count then
            CharByCharCount := count;
 
          while i < CharByCharCount do begin
            Chr1 := byte(p1^);
            Chr2 := byte(p2^);
            if Chr1 <> Chr2 then
              begin
                if Chr1 in [97..122] then
                  dec(Chr1,32);
                if Chr2 in [97..122] then
                  dec(Chr2,32);
                if Chr1 <> Chr2 then
                  Break;
              end;
            Inc(P1); Inc(P2); Inc(I);
          end;
        end;
    end;
  if i < Count then
    result := Chr1-Chr2
  else
    // CAPSIZEINT is no-op if Sizeof(Sizeint)<=SizeOF(Integer)
    result:=(Count1-Count2);
end;
  

Even faster than CompareByte.