Well, they have more than one issue: they also introduce branching, which can be quite slow. This is my go-to function for UTF-8 char handling:
function ValidFirstUTF8Char(FirstChar: Char): Boolean; inline;
begin
  Result := ((ord(FirstChar) And %10000000) = %00000000)  // One char sequence
         Or ((ord(FirstChar) And %11100000) = %11000000)  // Two char sequence
         Or ((ord(FirstChar) And %11110000) = %11100000)  // Three char sequence
         Or ((ord(FirstChar) And %11111000) = %11110000); // Four char sequence
end;
Let's benchmark this function with short-circuit evaluation vs. without:
program UTF8Benchmark;
{$mode objfpc} // Needed for Result and a 32-bit Integer

uses
  sysutils;

// Variant using the default short-circuit evaluation ({$B-})
function ValidFirstUTF8CharSC(FirstChar: Char): Boolean; inline;
begin
  Result := ((ord(FirstChar) And %10000000) = %00000000)  // One char sequence
         Or ((ord(FirstChar) And %11100000) = %11000000)  // Two char sequence
         Or ((ord(FirstChar) And %11110000) = %11100000)  // Three char sequence
         Or ((ord(FirstChar) And %11111000) = %11110000); // Four char sequence
end;

// Variant using complete boolean evaluation ({$B+})
function ValidFirstUTF8Char(FirstChar: Char): Boolean; inline;
begin
  {$Push}
  {$B+} // Disable short circuit to avoid jump instructions
  Result := ((ord(FirstChar) And %10000000) = %00000000)  // One char sequence
         Or ((ord(FirstChar) And %11100000) = %11000000)  // Two char sequence
         Or ((ord(FirstChar) And %11110000) = %11100000)  // Three char sequence
         Or ((ord(FirstChar) And %11111000) = %11110000); // Four char sequence
  {$Pop}
end;

const
  Iter = 10000000;
var
  i: Integer;
  start: QWord;
begin
  // Time the short-circuit variant on random bytes (ticks are milliseconds)
  start := GetTickCount64;
  for i := 1 to Iter do
    ValidFirstUTF8CharSC(chr(random(256)));
  WriteLn('Short Circuit: ', GetTickCount64 - start);

  // Time the complete-evaluation variant the same way
  start := GetTickCount64;
  for i := 1 to Iter do
    ValidFirstUTF8Char(chr(random(256)));
  WriteLn('Long Circuit: ', GetTickCount64 - start);
end.
Results in milliseconds (compiled with -O4):
Short Circuit: 119
Long Circuit: 69
Short circuit takes about 1.7 times as long as "long circuit" here. Short-circuit evaluation can provide massive benefits if the conditions are expensive to evaluate, such as anything involving memory accesses, function calls, floating point operations, etc. But if the operations are very simple, the branch instructions cost more time than skipping the remaining conditions saves.
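To illustrate the other side of that trade-off, here is a minimal sketch of a case where short circuit does save work. ExpensiveCheck, AnyShortCircuit and AnyComplete are hypothetical names made up for this example; the call counter just stands in for whatever real cost (memory access, function call, and so on) the condition would have.

```pascal
program ShortCircuitDemo;
{$mode objfpc}

var
  Calls: Integer = 0; // Counts how often the "expensive" condition runs

// Hypothetical stand-in for a condition that is costly to evaluate
function ExpensiveCheck(Value: Integer): Boolean;
begin
  Inc(Calls);
  Result := Value > 100;
end;

function AnyShortCircuit(A, B: Integer): Boolean;
begin
  {$Push}
  {$B-} // Short circuit: stop as soon as the result is known
  Result := ExpensiveCheck(A) Or ExpensiveCheck(B);
  {$Pop}
end;

function AnyComplete(A, B: Integer): Boolean;
begin
  {$Push}
  {$B+} // Complete evaluation: every operand is evaluated
  Result := ExpensiveCheck(A) Or ExpensiveCheck(B);
  {$Pop}
end;

begin
  Calls := 0;
  AnyShortCircuit(200, 300); // First operand is already True
  WriteLn('Short circuit calls: ', Calls);

  Calls := 0;
  AnyComplete(200, 300);
  WriteLn('Complete evaluation calls: ', Calls);
end.
```

With the first operand already True, the short-circuit version should only need one call while the complete version makes both. When each call is expensive, that skipped call is worth far more than the jump it costs, which is the opposite situation from the bit-mask checks above.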