In the UTF-8 code set, 62 (55%) of the characters are 'European' characters!
(fpc for FreeDOS + code set 850 was nirvana.)
Does anybody KNOW how to manipulate strings/arrays of 'European' characters?
Proven examples, please. I would faint with gratitude. Many thanks.

What do you mean? UTF-8 can encode all of Unicode.
If by 'European' characters you mean 7-bit ASCII, then it gets easy, because UTF-8 is compatible with 7-bit ASCII.

Use unit LazUTF8.

If you are working on a terminal/console app, you need to add the LazUtils package, which contains LazUTF8. You do that in:
  Project - Project Inspector
    Add - Add New Requirement
      Type LazU and choose LazUtils

What you call a character is actually more of a string: in UTF-8 a single character can take several bytes.

Use UTF8Copy, UTF8Insert, UTF8Delete, etc.


--- Code: Pascal ---
program Project1;

{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}{$IFDEF UseCThreads}
  cthreads,
  {$ENDIF}{$ENDIF}
  Classes
  ,LazUTF8
  { you can add units after this };

var
  s: string;
  s1, s2: string;
begin
  s := 'ÄÇ';
  WriteLn(s);
  WriteLn(Length(s));     //===> 4 (bytes)
  WriteLn(UTF8Length(s)); //===> 2 (codepoints)
  s1 := UTF8Copy(s, 1, 1);
  WriteLn(s1); // Ä
  s2 := UTF8Copy(s, 2, 1);
  WriteLn(s2); // Ç
  UTF8Insert(s2, s, 1); // s is ÇÄÇ
  WriteLn(s);
  UTF8Delete(s, 2, 1);  // s is ÇÇ
  WriteLn(s);
  ReadLn;
end.

Judging from your previous post, add the cwstring unit if you are using Linux.


Not sure where you got that.

UTF-8 is an ASCII-compatible encoding of Unicode. A Unicode codepoint takes from one to four bytes when encoded in UTF-8.

A is one byte and is ASCII compatible.
Ä is two bytes in UTF-8 and is not part of ASCII.
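
To make the byte counts concrete, here is a minimal sketch that needs no extra packages. CodepointCount is a hypothetical helper written for this post, not part of LazUTF8; it assumes the string holds valid UTF-8 and simply skips continuation bytes (the ones of the form 10xxxxxx). The Ä is spelled out as its two UTF-8 bytes so the result does not depend on how the source file is saved.

```pascal
program ByteCountDemo;

{$mode objfpc}{$H+}

// Hypothetical helper (not from LazUTF8): counts codepoints in a UTF-8
// string by counting every byte that is NOT a continuation byte (10xxxxxx).
// Assumes S contains valid UTF-8.
function CodepointCount(const S: string): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to Length(S) do
    if (Ord(S[i]) and $C0) <> $80 then
      Inc(Result);
end;

var
  Plain, Umlaut: string;
begin
  Plain := 'A';         // U+0041, one byte in UTF-8
  Umlaut := #$C3#$84;   // U+00C4 (Ä), two bytes in UTF-8

  WriteLn(Length(Plain), ' ', CodepointCount(Plain));   // 1 1
  WriteLn(Length(Umlaut), ' ', CodepointCount(Umlaut)); // 2 1
end.
```

For real work, UTF8Length from LazUTF8 is the better choice; the helper is only there to show what is being counted.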

A "character" can use more than one codepoint (for example, a base letter followed by a combining accent).

The same "character" can be represented by different codepoint sequences.
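
A small sketch of that last point, assuming the LazUtils package has been added to the project as described earlier in the thread. It builds Ä in two ways: as the single precomposed codepoint U+00C4, and as A (U+0041) followed by the combining diaeresis U+0308. The variable names are only for illustration, and the bytes are written out explicitly so the source file encoding does not matter.

```pascal
program CombiningDemo;

{$mode objfpc}{$H+}

uses
  LazUTF8; // from the LazUtils package

var
  Precomposed, Decomposed: string;
begin
  Precomposed := #$C3#$84;     // U+00C4: Ä as one codepoint
  Decomposed := 'A'#$CC#$88;   // U+0041 + U+0308: A plus combining diaeresis

  WriteLn(UTF8Length(Precomposed));   // 1 codepoint
  WriteLn(UTF8Length(Decomposed));    // 2 codepoints
  WriteLn(Precomposed = Decomposed);  // FALSE: the bytes differ
end.
```

Both strings normally display as the same Ä, but a plain comparison treats them as different, and UTF8Copy(Decomposed, 1, 1) would return only the A without its accent.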


