Author Topic: [Solved]Format function with accented char in text

jcmontherock

  • Full Member
  • ***
  • Posts: 234
[Solved]Format function with accented char in text
« on: July 24, 2023, 05:01:14 pm »
After a MySQL query I have a dynamic array containing the results. Both the database and the Lazarus app are UTF-8 encoded. I want to show the results in a Memo with a fixed-width font (Consolas).
clArray is defined as: "Array of Array of String"

I am using the following procedure:
Code: Pascal
  line: String;
  ...
  for iRow := 0 to High(clArray) do begin
    line := Format('%-2d', [iRow+1]) + '|';
    for iCol := 0 to High(clArray[iRow]) do begin
      if iCol = 0 then line += Format('%0:-6s',  [clArray[iRow, iCol]]) + '|'
      else             line += Format('%0:-17s', [clArray[iRow, iCol]]) + '|' + ' ---> String length: ' +
                               IntToStr(Length(clArray[iRow, iCol]));
    end;
    Memo1.Append(line);
  end;
The output of the Format function shows a problem with accented characters. In a string containing the character "è" (c3a8), that character takes 2 bytes, so the string's Length is larger than the number of visible characters. That means we really are in UTF-8.
 
1 |3882  |Bernex           | ---> String length: 6
2 |125527|Carouge GE       | ---> String length: 10
3 |125527|Genèève        | ---> String length: 9
4 |162175|Geneve           | ---> String length: 6
5 |84493 |Genève          | ---> String length: 7
6 |254782|Hermance         | ---> String length: 8
7 |103866|Mont Saxonnex    | ---> String length: 13
8 |103865|Mont Saxonnex    | ---> String length: 13
9 |185163|Prevessin Moens  | ---> String length: 15
10|212913|Prevessin Moens  | ---> String length: 15

Why are lines with accented characters not aligned? What did I do wrong?
« Last Edit: July 25, 2023, 05:18:09 pm by jcmontherock »
Windows 11 UTF8-64 - Lazarus 3.2-64 - FPC 3.2.2

jamie

  • Hero Member
  • *****
  • Posts: 6090
Re: Format function with accented char in text
« Reply #1 on: July 24, 2023, 06:03:22 pm »
Short of using UTF8Length or something like that, have you tried inserting a TAB character prior to the break?
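For context, here is a minimal sketch of the byte-versus-character mismatch that UTF8Length (from the LazUTF8 unit) is meant to address. It assumes plain FPC with only SysUtils. The program and variable names are illustrative and not from the thread. The string is built from explicit byte values so the result does not depend on how the source file is saved.

```pascal
program ByteLengthDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;

var
  s, hex: String;
  i: Integer;
begin
  // 'Genève' written with the explicit UTF-8 bytes C3 A8 for 'è'
  s := 'Gen'#$C3#$A8've';

  // Collect every byte of the string as two hex digits
  hex := '';
  for i := 1 to Length(s) do
    hex := hex + IntToHex(Ord(s[i]), 2) + ' ';

  WriteLn('Bytes:  ', hex);        // 47 65 6E C3 A8 76 65
  WriteLn('Length: ', Length(s));  // 7 (bytes), although only 6 characters are visible
end.
```

Length() counts the two bytes of "è" separately, and Format() pads according to that byte count.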
The only true wisdom is knowing you know nothing

wp

  • Hero Member
  • *****
  • Posts: 11854
Re: Format function with accented char in text
« Reply #2 on: July 24, 2023, 08:00:56 pm »
This works:
Code: Pascal
uses
  Math, LazUTF8;

var
  clArray: array of array of string;

procedure TForm1.FormCreate(Sender: TObject);
var
  i, j: Integer;
  line: String;
  w: array of Integer = nil;   // w[0]: width of the row number column, w[1..]: widths of the data columns
begin
  SetLength(clArray, 10, 2);
  clArray[0, 0] := '3882';     clArray[0, 1] := 'Bernex';
  clArray[1, 0] := '125527';   clArray[1, 1] := 'Carouge GE';
  clArray[2, 0] := '125527';   clArray[2, 1] := 'Genèève';
  clArray[3, 0] := '162175';   clArray[3, 1] := 'Geneve';
  clArray[4, 0] := '84493';    clArray[4, 1] := 'Genève';
  clArray[5, 0] := '254782';   clArray[5, 1] := 'Hermance';
  clArray[6, 0] := '103866';   clArray[6, 1] := 'Mont Saxonnex';
  clArray[7, 0] := '103865';   clArray[7, 1] := 'Mont Saxonnex';
  clArray[8, 0] := '185163';   clArray[8, 1] := 'Prevessin Moens';
  clArray[9, 0] := '212913';   clArray[9, 1] := 'Prevessin Moens';

  // First pass: find the widest entry of each column, counted in code points
  SetLength(w, Length(clArray[0])+1);
  for i := 0 to High(clArray) do
  begin
    w[0] := Max(w[0], Length(IntToStr(i+1)));
    for j := 1 to High(w) do
      w[j] := Max(w[j], UTF8Length(clArray[i, j-1]));   // Length would be enough for the number column
  end;

  // Second pass: print the rows; each '*' takes its field width from the argument list
  Memo1.Lines.Clear;
  for i := 0 to High(clArray) do
  begin
    line := Format(' %*d | %*s | %-*s | --> String length: %d', [
      w[0], i+1,
      w[1], clArray[i, 0],
      // Format pads by bytes, so widen the field by the number of extra UTF-8 bytes
      w[2] + Length(clArray[i, 1]) - UTF8Length(clArray[i, 1]), clArray[i, 1],
      UTF8Length(clArray[i, 1])
    ]);
    Memo1.Lines.Add(line);
  end;
end;

Explanation: Suppose the column width should be 10, and suppose that the string 'Genève' should be put in there. This is 6 "code points" ("visual characters"), but 7 bytes due to the accented è. The Format function uses the Length() function to determine how many characters are printed, so it "thinks" that it needs 3 spaces to fill the column. But in UTF-8 only 6 characters are actually displayed, so we need 4 spaces, i.e. 1 space more, which is exactly the difference between Length() and UTF8Length(). Adding this difference to the width argument compensates for it.
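The same idea can also be sketched as a small padding helper that bypasses the width argument of Format entirely. This is an alternative illustration, not the code posted above. PadRightUTF8 and CodePointCount are hypothetical names, and CodePointCount is a hand-written stand-in for LazUTF8's UTF8Length so that the sketch compiles with plain FPC. It assumes valid UTF-8 input and one display column per code point, which does not hold for combining marks or wide (e.g. CJK) characters.

```pascal
program PadByCodePoints;
{$mode objfpc}{$H+}

// Counts UTF-8 code points by skipping continuation bytes (bit pattern 10xxxxxx).
// Stand-in for LazUTF8.UTF8Length; assumes the input is valid UTF-8.
function CodePointCount(const s: String): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to Length(s) do
    if (Ord(s[i]) and $C0) <> $80 then
      Inc(Result);
end;

// Hypothetical helper: left-aligns s in a column that is `width` code points wide.
// Strings that are already too long are returned unchanged.
function PadRightUTF8(const s: String; width: Integer): String;
var
  missing: Integer;
begin
  missing := width - CodePointCount(s);
  if missing > 0 then
    Result := s + StringOfChar(' ', missing)
  else
    Result := s;
end;

var
  city: String;
begin
  city := 'Gen'#$C3#$A8've';  // 'Genève' as explicit UTF-8 bytes (7 bytes, 6 code points)

  // Both cells end up 10 code points wide; the first one is 11 bytes long.
  WriteLn('[', PadRightUTF8(city, 10), ']');
  WriteLn('[', PadRightUTF8('Geneve', 10), ']');
end.
```

With such a helper the cells can be joined with plain string concatenation, at the cost of no longer using Format's own alignment flags.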
« Last Edit: July 24, 2023, 08:16:53 pm by wp »

jcmontherock

  • Full Member
  • ***
  • Posts: 234
Re: Format function with accented char in text
« Reply #3 on: July 25, 2023, 03:10:36 pm »
Thanks a lot. I will test it.  :D

...It works fine.
« Last Edit: July 25, 2023, 05:17:41 pm by jcmontherock »
Windows 11 UTF8-64 - Lazarus 3.2-64 - FPC 3.2.2

 
