Topic: [SOLVED] TStringList save as UTF16 LE/BE

totya

  • Hero Member
  • *****
  • Posts: 722
[SOLVED] TStringList save as UTF16 LE/BE
« on: March 05, 2023, 02:09:33 pm »
Hi!

I'd like to save a TStringList (UTF-8 by default) as UTF-16 (LE or BE), with either Linux or Windows line endings. What is the easiest way to do this?

Thanks!
« Last Edit: March 05, 2023, 06:03:56 pm by totya »

jamie

  • Hero Member
  • *****
  • Posts: 7601
Re: TStringList save as UTF16 LE/BE
« Reply #1 on: March 05, 2023, 02:40:13 pm »
Create a UnicodeString, assign the UTF-8 string to it, and then save the UnicodeString?

Just a guess.
The only true wisdom is knowing you know nothing

totya

  • Hero Member
  • *****
  • Posts: 722
Re: TStringList save as UTF16 LE/BE
« Reply #2 on: March 05, 2023, 03:00:00 pm »
Quote from: jamie on March 05, 2023, 02:40:13 pm
Create a UnicodeString, assign the UTF-8 string to it, and then save the UnicodeString? Just a guess.

Could you show me what you mean with code?

jamie

  • Hero Member
  • *****
  • Posts: 7601
Re: TStringList save as UTF16 LE/BE
« Reply #3 on: March 05, 2023, 03:25:39 pm »
Off the top of my head, as a save procedure:

Study this.
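Code: Pascal
// Needs Classes and SysUtils in the uses clause.
procedure TForm1.Button1Click(Sender: TObject);
var
  S: TStringList;
  U: UnicodeString;
  F: TFileStream;
begin
  S := TStringList.Create;
  try
    S.Add('Test');
    // Convert the UTF-8 text to UTF-16 in native byte order (LE on x86).
    U := UTF8Decode(S.Text);
    F := TFileStream.Create('The File name.UTXT', fmCreate);
    try
      // Write the raw UTF-16 code units, two bytes each; no BOM is written.
      if U <> '' then
        F.WriteBuffer(U[1], Length(U) * SizeOf(WideChar));
    finally
      F.Free;
    end;
  finally
    S.Free;
  end;
end;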

totya

  • Hero Member
  • *****
  • Posts: 722
Re: TStringList save as UTF16 LE/BE
« Reply #4 on: March 05, 2023, 03:32:18 pm »
Quote from: jamie on March 05, 2023, 03:25:39 pm
Off the top of my head, as a save procedure...

Thanks for this code, but as I asked in the question: where can I choose between LE and BE, and between Linux and Windows line endings for the output?

jamie

  • Hero Member
  • *****
  • Posts: 7601
Re: TStringList save as UTF16 LE/BE
« Reply #5 on: March 05, 2023, 03:49:01 pm »
Write a BOM at the start?

On Windows it would be LE, but I guess there are ways to change that.

totya

  • Hero Member
  • *****
  • Posts: 722
Re: TStringList save as UTF16 LE/BE
« Reply #6 on: March 05, 2023, 03:54:45 pm »
Quote from: jamie on March 05, 2023, 03:49:01 pm
Write a BOM at the start? On Windows it would be LE, but I guess there are ways to change that.

Yes, indeed, but LE and BE also need a different byte order within each character.

jamie

  • Hero Member
  • *****
  • Posts: 7601
https://en.wikipedia.org/wiki/Byte_order_mark

totya

  • Hero Member
  • *****
  • Posts: 722
Re: TStringList save as UTF16 LE/BE
« Reply #8 on: March 05, 2023, 04:07:18 pm »
Quote from: jamie
https://en.wikipedia.org/wiki/Byte_order_mark

Thanks for the help. I know about the BOM, but as I wrote before, the two bytes inside every character are swapped as well. LE and BE differ not only in the BOM header but also in the byte order of the text itself.
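For anyone on an older FPC without the TEncoding overloads, the point about byte order can be handled by hand. The following is only a sketch: `EncodeUtf16` is a made-up helper name (not an RTL routine), and the output file name is arbitrary. It writes the BOM and then every 16-bit code unit with its two bytes in the requested order, so it does not depend on the byte order of the machine it runs on.

```pascal
program Utf16ByHand;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// EncodeUtf16 is a hypothetical helper, not an RTL routine.
// It returns the BOM followed by the UTF-16 code units of Utf8Text,
// with the two bytes of every unit in the requested order.
function EncodeUtf16(const Utf8Text: RawByteString; BigEndian: Boolean): TBytes;
var
  U: UnicodeString;
  I: Integer;
  W: Word;
begin
  Result := nil;
  U := UTF8Decode(Utf8Text);
  SetLength(Result, 2 + Length(U) * 2);
  // The BOM is U+FEFF: FE FF in big-endian, FF FE in little-endian.
  if BigEndian then
  begin
    Result[0] := $FE;
    Result[1] := $FF;
  end
  else
  begin
    Result[0] := $FF;
    Result[1] := $FE;
  end;
  for I := 1 to Length(U) do
  begin
    W := Ord(U[I]);
    if BigEndian then
    begin
      Result[2 * I] := Hi(W);      // high byte first
      Result[2 * I + 1] := Lo(W);
    end
    else
    begin
      Result[2 * I] := Lo(W);      // low byte first
      Result[2 * I + 1] := Hi(W);
    end;
  end;
end;

var
  Bytes: TBytes;
  F: TFileStream;
begin
  Bytes := EncodeUtf16('Test', True);
  F := TFileStream.Create('test_utf16be.txt', fmCreate);
  try
    F.WriteBuffer(Bytes[0], Length(Bytes));
  finally
    F.Free;
  end;
end.
```

Line endings are ordinary characters in the text, so they pass through this conversion unchanged; whatever the string contains (LF or CR LF) is what ends up in the file.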

paweld

  • Hero Member
  • *****
  • Posts: 1572
Re: TStringList save as UTF16 LE/BE
« Reply #9 on: March 05, 2023, 04:08:06 pm »
Code: Pascal
procedure TForm1.FormCreate(Sender: TObject);
var
  sl: TStringList;
begin
  sl := TStringList.Create;
  sl.Add('zażółć gęslą jaźń');
  sl.Add('łąka');
  sl.Add('Wiki: переводы статей');
  sl.Add('جهة النّص وبعض المُصطلحات');
  sl.SaveToFile('d:\stringlist_utf8.txt');
  sl.SaveToFile('d:\stringlist_utf16be.txt', TEncoding.BigEndianUnicode);
  sl.SaveToFile('d:\stringlist_utf16le.txt', TEncoding.Unicode);
  sl.Free;
end;
Best regards / Pozdrawiam
paweld

jamie

  • Hero Member
  • *****
  • Posts: 7601
Re: TStringList save as UTF16 LE/BE
« Reply #10 on: March 05, 2023, 04:10:54 pm »
That would most likely work too, but like a few others around here, I still use older versions of FPC.

totya

  • Hero Member
  • *****
  • Posts: 722
Re: TStringList save as UTF16 LE/BE
« Reply #11 on: March 05, 2023, 04:16:35 pm »
Thanks, nice short code, but the BOM is missing, and nothing I tried can read the stringlist_utf16be.txt file.
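One way to confirm whether a saved file actually starts with a BOM is to look at its first bytes. This is a rough sketch, not something from the thread: `DetectBom` and the sample file name are made up for illustration. It only knows the UTF-8 and UTF-16 marks, so a UTF-32 LE file (FF FE 00 00) would be reported as UTF-16 LE.

```pascal
program BomCheck;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// DetectBom is a hypothetical helper, not an RTL routine.
// It reads up to three bytes and names the BOM they form, if any.
function DetectBom(const FileName: string): string;
var
  F: TFileStream;
  B: array[0..2] of Byte;
  N: Integer;
begin
  Result := 'none';
  F := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    FillChar(B, SizeOf(B), 0);
    N := F.Read(B, SizeOf(B));
    if (N >= 3) and (B[0] = $EF) and (B[1] = $BB) and (B[2] = $BF) then
      Result := 'UTF-8'
    else if (N >= 2) and (B[0] = $FE) and (B[1] = $FF) then
      Result := 'UTF-16 BE'
    else if (N >= 2) and (B[0] = $FF) and (B[1] = $FE) then
      Result := 'UTF-16 LE';
  finally
    F.Free;
  end;
end;

var
  F: TFileStream;
  // BOM FE FF followed by 'A' (00 41) in big-endian order.
  Sample: array[0..3] of Byte = ($FE, $FF, $00, $41);
begin
  F := TFileStream.Create('bom_sample.bin', fmCreate);
  try
    F.WriteBuffer(Sample, SizeOf(Sample));
  finally
    F.Free;
  end;
  WriteLn(DetectBom('bom_sample.bin'));  // prints UTF-16 BE
end.
```

If a UTF-16 file reports 'none', many editors have to guess the encoding, which would match the "unreadable" result described here.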

DomingoGP

  • Full Member
  • ***
  • Posts: 113
Re: TStringList save as UTF16 LE/BE
« Reply #12 on: March 05, 2023, 04:30:45 pm »
Try setting sl.WriteBOM := True, and for Linux sl.LineBreak := #10;

Code: Pascal
procedure TForm1.Button1Click(Sender: TObject);
var
  SL: TStringList;
begin
  SL := TStringList.Create;
  try
    SL.Add('AáB');

    SL.LineBreak := #$0A;        // Linux (LF)
    //SL.LineBreak := #$0D#$0A;  // Windows (CR LF)

    SL.WriteBOM := False;
    SL.SaveToFile('utf8-no-boom.txt', TEncoding.UTF8);
    SL.WriteBOM := True;
    SL.SaveToFile('utf8-boom.txt', TEncoding.UTF8);

    SL.WriteBOM := False;
    SL.SaveToFile('unicode-no-boom.txt', TEncoding.Unicode);
    SL.WriteBOM := True;
    SL.SaveToFile('unicode-boom.txt', TEncoding.Unicode);

    SL.WriteBOM := False;
    SL.SaveToFile('unicodeBE-no-boom.txt', TEncoding.BigEndianUnicode);
    SL.WriteBOM := True;
    SL.SaveToFile('unicodeBE-boom.txt', TEncoding.BigEndianUnicode);
  finally
    SL.Free;
  end;
end;

jamie

  • Hero Member
  • *****
  • Posts: 7601
Re: TStringList save as UTF16 LE/BE
« Reply #13 on: March 05, 2023, 05:26:38 pm »
Looking at TStringList in newer versions of the compiler, I don't understand the use of Shift, Map, Pop, Reduce, Reverse, etc. in that class.

totya

  • Hero Member
  • *****
  • Posts: 722
Re: TStringList save as UTF16 LE/BE
« Reply #14 on: March 05, 2023, 06:02:41 pm »
Quote from: DomingoGP on March 05, 2023, 04:30:45 pm
Try setting sl.WriteBOM := True, and for Linux sl.LineBreak := #10;

WriteBOM is the solution, thank you!
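Putting the pieces of the thread together, the original question (LE or BE, Linux or Windows line endings) might be wrapped up roughly like this. It is only a sketch for an FPC version that has WriteBOM and the SaveToFile overload taking a TEncoding; `SaveAsUtf16` and the file names are made up for illustration. Note that the helper changes LineBreak and WriteBOM on the list that is passed in.

```pascal
program SaveUtf16Demo;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// SaveAsUtf16 is a hypothetical helper, not an RTL routine.
// It picks the line ending and the byte order, then saves with a BOM.
procedure SaveAsUtf16(SL: TStrings; const FileName: string;
  BigEndian, WindowsLineEnding: Boolean);
begin
  if WindowsLineEnding then
    SL.LineBreak := #13#10   // CR LF
  else
    SL.LineBreak := #10;     // LF
  SL.WriteBOM := True;
  if BigEndian then
    SL.SaveToFile(FileName, TEncoding.BigEndianUnicode)
  else
    SL.SaveToFile(FileName, TEncoding.Unicode);
end;

var
  SL: TStringList;
begin
  SL := TStringList.Create;
  try
    SL.Add('Test');
    SaveAsUtf16(SL, 'demo_utf16be_lf.txt', True, False);
    SaveAsUtf16(SL, 'demo_utf16le_crlf.txt', False, True);
  finally
    SL.Free;
  end;
end.
```

The demo sticks to ASCII text on purpose: in a plain console program (outside Lazarus) the default string code page may not be UTF-8, so non-ASCII literals can need extra care.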
