So, I guess after FPC 3.0 the compiler directive {$H+} means UnicodeString, not AnsiString?
No, {$H+} just defines type String = AnsiString. Without this directive, type String = ShortString - which you typically don't want. It has nothing to do with Unicode.
What I meant was that AnsiString (aka String when {$H+}) may contain UTF-8 data, but it doesn't have to - each string instance has a codepage field in its metadata (right next to its reference count & length).
This just means that, unlike in Rust for example, strings in FPC aren't guaranteed to use UTF-8 under the hood, so it isn't sound to assume that every AnsiString holds UTF-8 encoded text.
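Here is a small sketch you can compile to check this yourself. It assumes FPC 3.0 or later and {$mode objfpc}; the program and variable names are made up for illustration. As far as I know, String only becomes UnicodeString if you opt in with {$mode delphiunicode} or {$modeswitch unicodestrings}, not through {$H+} alone.

```pascal
program HPlusDemo;
{$mode objfpc}{$H+}

var
  S: String;       // AnsiString in objfpc mode with long strings enabled
  U: UTF8String;   // AnsiString whose declared code page is CP_UTF8
begin
  S := 'abc';
  U := 'abc';

  // Char is a 1-byte AnsiChar here; it would be a 2-byte WideChar
  // if String were UnicodeString.
  WriteLn('SizeOf(Char)      = ', SizeOf(Char));

  // Each string instance reports the code page stored in its header.
  WriteLn('StringCodePage(S) = ', StringCodePage(S));
  WriteLn('StringCodePage(U) = ', StringCodePage(U));
  WriteLn('CP_UTF8           = ', CP_UTF8);
end.
```

The value printed for S depends on the platform and on the source file's code page, so don't rely on it being UTF-8.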
I thought that after FPC 3.0, String = UnicodeString...
In Modern Free Pascal (3.0+):
{$H+}: String = UnicodeString (UTF-16) by default
{$H-}: String = ShortString (255 chars max)
In Older Free Pascal (pre-3.0):
{$H+}: String = AnsiString
{$H-}: String = ShortString
My idea was to access each character without dealing with surrogate pairs...
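One way to get there, as a rough sketch rather than the definitive approach: convert to UCS4String, where every element is a whole code point, so surrogate pairs never show up. This assumes FPC 3.0+ and uses UnicodeStringToUCS4String from the System unit; the test string (the letter A followed by U+1F600) and the names are just for illustration. Note that a code point still isn't always a user-perceived character, since combining marks are separate code points.

```pascal
program CodePointDemo;
{$mode objfpc}{$H+}

var
  Wide: UnicodeString;  // UTF-16: one or two code units per code point
  Ucs4: UCS4String;     // UTF-32: one element per code point
  I: Integer;
begin
  // Build 'A' followed by U+1F600, which needs a surrogate pair in UTF-16.
  SetLength(Wide, 3);
  Wide[1] := WideChar($0041);
  Wide[2] := WideChar($D83D);  // high surrogate
  Wide[3] := WideChar($DE00);  // low surrogate

  Ucs4 := UnicodeStringToUCS4String(Wide);

  WriteLn('UTF-16 code units: ', Length(Wide));
  // The converted array ends with a terminating zero element,
  // so it is excluded from the count and from the loop.
  WriteLn('Code points:       ', Length(Ucs4) - 1);

  for I := 0 to Length(Ucs4) - 2 do
    WriteLn('U+', HexStr(LongInt(Ucs4[I]), 6));
end.
```

If your data starts out as a UTF-8 AnsiString, UTF8Decode gives you the UnicodeString to feed into this conversion.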