Forum > FPC development
CompareText improvement
Martin_fr:
--- Quote from: BeniBela on March 31, 2023, 05:13:58 pm ---I had a case where like half of the time was spent in case-insensitive text comparison. But it was doing a case-insensitive Pos. For just 10 kilobytes of text, it would call the text comparison 10000 times. It became much faster when I changed it to compare the first character before calling the full comparison function
--- End quote ---
Could you not have lowercased the entire string once, perform the search/pos and take the results on the mixed case string?
Of course that does not work for Unicode where a lowercase char may have a different byte-len than its upper char. But since the "dec(c1, 32);" only works for English anyway...
Maybe even more clever (again leaving out stuff like normalization), get the "lookup char" (for wich you want the pos in the long string) in both: upper and lower (that can still be done via true upper/lower handling all languages). Then compare each pos in the string against the 2 lookup chars. That should save almost all upper/lower conversions.
BeniBela:
--- Quote from: Martin_fr on March 31, 2023, 06:05:19 pm ---Could you not have lowercased the entire string once, perform the search/pos and take the results on the mixed case string?
--- End quote ---
But string allocations are also slow
And it would need to lower case both strings.
--- Quote from: Martin_fr on March 31, 2023, 06:05:19 pm ---Maybe even more clever (again leaving out stuff like normalization), get the "lookup char" (for wich you want the pos in the long string) in both: upper and lower (that can still be done via true upper/lower handling all languages). Then compare each pos in the string against the 2 lookup chars. That should save almost all upper/lower conversions.
--- End quote ---
That is what I did. Which is why checking the first char of the searched string worked so well.
BeniBela:
This is generally fast:
--- Code: Pascal [+][-]window.onload = function(){var x1 = document.getElementById("main_content_section"); if (x1) { var x = document.getElementsByClassName("geshi");for (var i = 0; i < x.length; i++) { x[i].style.maxHeight='none'; x[i].style.height = Math.min(x[i].clientHeight+15,306)+'px'; x[i].style.resize = "vertical";}};} ---function CompareTextFast(const S1, S2: string): Integer; overload; var i, count, count1, count2, AlignedCount, CharByCharCount: sizeint; Chr1, Chr2: byte; P1, P2: PChar; Block1, Block2: qword;begin Count1 := Length(S1); Count2 := Length(S2); if (Count1>Count2) then Count := Count2 else Count := Count1; i := 0; if count>0 then begin AlignedCount := count - count and 7; P1 := @S1[1]; P2 := @S2[1]; while i < Count do begin while i < AlignedCount do begin Block1 := unaligned(PQWord(P1)^); Block2 := unaligned(PQWord(P2)^); if Block1 = Block2 then begin inc(p1, 8); inc(p2, 8); inc(i, 8); end else break; end; CharByCharCount := i + 512; if CharByCharCount > count then CharByCharCount := count; while i < CharByCharCount do begin Chr1 := byte(p1^); Chr2 := byte(p2^); if Chr1 <> Chr2 then begin if Chr1 in [97..122] then dec(Chr1,32); if Chr2 in [97..122] then dec(Chr2,32); if Chr1 <> Chr2 then Break; end; Inc(P1); Inc(P2); Inc(I); end; end; end; if i < Count then result := Chr1-Chr2 else // CAPSIZEINT is no-op if Sizeof(Sizeint)<=SizeOF(Integer) result:=(Count1-Count2);end;
Even faster than CompareByte.
Navigation
[0] Message Index
[*] Previous page