
Author Topic: LoadFromFile Copy and Utf-8  (Read 1147 times)

Nicola Gorlandi

  • Jr. Member
  • **
  • Posts: 76
LoadFromFile Copy and Utf-8
« on: July 07, 2018, 08:26:39 am »
I have a file with this content:
ò1234567890

I want to read it and extract characters 2 to 4 from each row.

If I use this code


Quote
Linee := TStringList.Create;
Linee.LoadFromFile(FileName);
ShowMessage(Copy(Linee[0], 2, 4));


Copy counts the UTF-8 character as if it were two characters long.


Could you please give me an explanation? I am using the latest version, Lazarus 1.8.2 32-bit, on Windows.
« Last Edit: July 07, 2018, 08:28:14 am by Nicola Gorlandi »

Thaddy

  • Hero Member
  • *****
  • Posts: 7433
Re: LoadFromFile Copy and Utf-8
« Reply #1 on: July 07, 2018, 09:10:25 am »
Quote
Copy counts the UTF-8 character as if it were two characters long.

No, a UTF-8 code point takes 1 to 4 bytes (Chars), not necessarily 2. https://en.wikipedia.org/wiki/UTF-8
Ad Brexinitum (can't help it)
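
A small console sketch of the point above, assuming the LazUTF8 unit from the LazUtils package is available: Length reports bytes, while UTF8Length reports code points. The string is built from explicit bytes (#$C3#$B2 is the UTF-8 encoding of ò) so the result does not depend on the encoding of the source file. The program name is made up for illustration.

```pascal
program ByteVsCodePoint;  // hypothetical name, for illustration only

{$mode objfpc}{$H+}

uses
  LazUTF8;  // from the LazUtils package; provides UTF8Length

var
  S: String;
begin
  // 'ò' is encoded in UTF-8 as the two bytes $C3 $B2
  S := #$C3#$B2 + '1234567890';

  WriteLn('Bytes (Length):           ', Length(S));      // 12
  WriteLn('Code points (UTF8Length): ', UTF8Length(S));  // 11
end.
```

This is why Copy(S, 2, 4) starts in the middle of ò: the second Char of the string is the second byte of that code point, not the digit 1.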

Nicola Gorlandi

  • Jr. Member
  • **
  • Posts: 76
Re: LoadFromFile Copy and Utf-8
« Reply #2 on: July 07, 2018, 09:13:28 am »
Thank you, 

Whatever the length of a UTF-8 character is, I would like to count characters as a human does, so in my case ò should be counted as one character, just like o. Is that possible with Lazarus, or do I have to handle it myself?

Many thanks again.

wp

  • Hero Member
  • *****
  • Posts: 5353
Re: LoadFromFile Copy and Utf-8
« Reply #3 on: July 07, 2018, 01:48:57 pm »
In this context it is good to know the LazUnicode unit, which provides an easy enumerator over the UTF-8 code points:
Code: Pascal
uses
  LazUnicode;

procedure TForm1.Button1Click(Sender: TObject);
var
  s, ch, str: String;
  n: Integer;
begin
  Memo1.Lines.Clear;
  str := 'ò1234567890äöüßÄÖÜ';
  s := '';  // start with an empty accumulator
  n := 0;
  // With LazUnicode in the uses clause, for-in walks the string
  // code point by code point instead of byte by byte.
  for ch in str do begin
    s := s + ch;
    inc(n);
    Memo1.Lines.Add('The first ' + IntToStr(n) + ' code point(s): ' + s);
  end;
end;
Lazarus trunk / fpc 3.0.4 / all 32-bit on Win-10

Bart

  • Hero Member
  • *****
  • Posts: 3311
    • Bart en Mariska's Webstek
Re: LoadFromFile Copy and Utf-8
« Reply #4 on: July 07, 2018, 01:59:15 pm »
Utf8Copy()

Bart
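
A minimal sketch of how UTF8Copy could be applied to the original question, assuming the file is saved as UTF-8 without a byte order mark and that the LazUTF8 unit (LazUtils package) is available. UTF8Copy takes its start index and count in code points rather than bytes. The program name and the use of a command-line argument for the file name are placeholders, not part of the original code.

```pascal
program Utf8CopyDemo;  // hypothetical name, for illustration only

{$mode objfpc}{$H+}

uses
  Classes, SysUtils, LazUTF8;  // LazUTF8 provides UTF8Copy

var
  Linee: TStringList;
  i: Integer;
begin
  if ParamCount < 1 then
  begin
    WriteLn('Usage: Utf8CopyDemo <file>');
    Halt(1);
  end;

  Linee := TStringList.Create;
  try
    // Assumption: the file named on the command line is UTF-8 encoded
    Linee.LoadFromFile(ParamStr(1));
    for i := 0 to Linee.Count - 1 do
      // 4 code points, starting at code point 2 (not byte 2)
      WriteLn(UTF8Copy(Linee[i], 2, 4));
  finally
    Linee.Free;
  end;
end.
```

For a line such as ò1234567890 this should yield 1234, whereas plain Copy with the same arguments starts at the second byte of ò.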

Nicola Gorlandi

  • Jr. Member
  • **
  • Posts: 76
Re: LoadFromFile Copy and Utf-8
« Reply #5 on: July 08, 2018, 09:49:42 am »
Many thanks to all, I solved it.
Just a note: with the latest version of Lazarus I assumed that by using String I could handle UTF-8 without any code changes. That was my misunderstanding.