correct
if not (ALine[I] in ['A' .. 'Z']) then
That's right. Sorry about that, I typed the code straight into the reply box and did not pay attention to the brackets.
Oxygene has a good solution for this, the "not in" operator:
if ALine[I] not in ['A' .. 'Z'] then
And no brackets are needed. But this operator is not supported in Free Pascal.
Note that this only works for ASCII letters (a-z / A-Z).
This is because in UTF-8 each of them is encoded in one octet (1 byte, or one Pascal "char").
This follows from how UTF-8 is designed:
- The ASCII chars (ordinal value below 128, i.e. 0 to 127) are encoded exactly as in ASCII, one byte each.
- No other codepoint (char) in UTF-8 contains an octet (byte) with a value between 0 and 127; every byte of a multibyte sequence is 128 or above.
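To make the two properties above concrete, here is a minimal sketch. IsPureASCII is a hypothetical helper name (not from the RTL), and the non-ASCII test string is written as raw bytes so the result does not depend on how the source file is saved:

```pascal
program Utf8Bytes;
{$mode objfpc}{$H+}

// Hypothetical helper: True when every byte of S is below 128,
// i.e. S contains only ASCII characters.
function IsPureASCII(const S: String): Boolean;
var
  I: Integer;
begin
  Result := True;
  for I := 1 to Length(S) do
    if Ord(S[I]) >= 128 then
      Exit(False);
end;

begin
  WriteLn(IsPureASCII('az'));      // TRUE
  // U+017C (z with dot above) is the two bytes C5 BC in UTF-8;
  // both are >= 128, so neither can be mistaken for an ASCII char.
  WriteLn(IsPureASCII(#$C5#$BC));  // FALSE
end.
```

So a byte-wise test such as "in ['A' .. 'Z']" can never match in the middle of a multibyte character.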
[...]
Yep, good that you mentioned it. I did not write about it myself because I took it for granted. But even if the input string contains multibyte codepoints, iterating over it byte by byte still works when we are only looking for ASCII characters (in the case of UTF-8):
function CountASCIIChars(const ALine: String): Integer;
var
  Character: Char;
begin
  Result := 0;
  // Ord(Boolean) is 1 for True and 0 for False, so this adds 1 per ASCII byte.
  for Character in ALine do
    Result += Ord(Character in [#0 .. #127]);
end;

CountASCIIChars('zażółć gęślą jaźń'); // gives 8, which is correct
If we are looking for a multibyte character, though, we have to use a different algorithm.
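As one example of such a different algorithm, here is a rough sketch that counts codepoints instead of bytes. It relies on the fact that continuation bytes in UTF-8 always have the bit pattern 10xxxxxx, so it counts every byte that is not a continuation byte. CountUTF8CodePoints is a made-up name for illustration, and the sketch assumes the input is valid UTF-8:

```pascal
program CountCodePoints;
{$mode objfpc}{$H+}

// Hypothetical helper: counts UTF-8 codepoints by skipping
// continuation bytes (bit pattern 10xxxxxx). Assumes valid UTF-8.
function CountUTF8CodePoints(const ALine: String): Integer;
var
  Character: Char;
begin
  Result := 0;
  for Character in ALine do
    if (Ord(Character) and $C0) <> $80 then
      Inc(Result);
end;

begin
  // 'a' followed by U+017C (C5 BC) and U+00F3 (C3 B3), given as raw bytes:
  // 5 bytes, but only 3 codepoints.
  WriteLn(CountUTF8CodePoints('a'#$C5#$BC#$C3#$B3)); // 3
end.
```

Searching for one specific multibyte character would need a substring comparison instead, since a single Char can only hold one of its bytes.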