Questions of new Strings in FPC 3.0

--- Quote from: Michl on October 30, 2015, 09:30:27 am ---So here we have the issue that with Lazarus it is never possible to build clean CP_ACP projects, because Lazarus saves the project file as UTF8,

--- End quote ---

You can change the encoding in editor File settings -> Encoding.

--- Quote ---so the compiler interprets all the time the saved strings as UTF8.

--- End quote ---

Without {$codepage utf8} / -FcUTF8 it should follow the file's encoding.

Thank you! I didn't know that (I had seen it before, but forgot again :-[).

But now my own project does not compile, and the project from cyrax aborts with a runtime error. I will investigate further, and if I can identify the problem, I'll make a minimal example and post it in the bug tracker, if necessary.


--- Quote from: Michl on October 30, 2015, 10:02:59 am --- I'll make a minimal example and post it in the bugtracker, if it is necessary.
--- End quote ---
Here it is

I found it helpful to put my UTF8 and UTF16 test data into files on disk, which I could check with multiple editors to make sure they were in particular formats and encodings before involving FPC (or Delphi). For me, that obviates the need to declare string constants with complex encodings and to worry whether Lazarus etc. is handling the contents as intended. Please see attached -- a selection of 1-2 kbyte files. The file extension is .utx so you can associate the files with the editor of your choice. Note that tests where the file name includes non-Latin letters can prove more difficult, which is why there are a couple of file *names* in Russian and Arabic.

The downside to this approach is that the file I/O has to be rock solid, and I am stuck there with my FPC 3.0 tests, which is why I wanted to jump in here. This in particular seems "wrong" compared to Delphi's behavior:

* TStringList.LoadFromFile on a file that contains a UTF16 Byte Order Mark ("BOM") does not seem to be able to load that data directly into a string (UnicodeString). The length of the data is roughly 2x larger than I would expect: measured with Length(), Delphi gives 21 and FPC gives 43. This is for the attached chinese.16.utx file.

A quote from the Delphi docwiki: "If the Encoding parameter is not given, then the strings are loaded using the appropriate encoding. The value of the encoding is obtained by calling the GetBufferEncoding routine of the TEncoding class."
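
To make the BOM-based detection in that docwiki passage more concrete, here is a minimal sketch of the general idea. It is not Delphi's GetBufferEncoding implementation: the name SniffBom and the strings it returns are made up for this example, and it only recognizes the three most common byte order marks (a UTF-32 LE file, which also starts with FF FE, would be misreported as UTF-16LE).

```pascal
program BomSniffSketch;
{$mode objfpc}{$H+}

// Hypothetical helper (not part of the RTL): names the encoding implied by
// a leading byte order mark, or returns '' when no known BOM is present.
function SniffBom(const Buf: array of Byte): string;
begin
  Result := '';
  if (Length(Buf) >= 3) and (Buf[0] = $EF) and (Buf[1] = $BB) and (Buf[2] = $BF) then
    Result := 'UTF-8'
  else if (Length(Buf) >= 2) and (Buf[0] = $FF) and (Buf[1] = $FE) then
    Result := 'UTF-16LE'
  else if (Length(Buf) >= 2) and (Buf[0] = $FE) and (Buf[1] = $FF) then
    Result := 'UTF-16BE';
end;

begin
  // FF FE, then one UTF-16 LE code unit (U+4E2D stored as 2D 4E)
  WriteLn(SniffBom([$FF, $FE, $2D, $4E]));
  // EF BB BF, then the single byte 'A'
  WriteLn(SniffBom([$EF, $BB, $BF, $41]));
end.
```

The two calls print UTF-16LE and UTF-8 respectively. A loader that honors the BOM would run a check like this on the first bytes of the file and only then decide how to decode the rest.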

In terms of sanity checking the contents of files, I can highly recommend two editors: TedNPad (for very quick display of encoding and BOM on the status bar) and Unipad (for excellent display of details about individual chars).

This is my function for loading via TStringList - confirmed working in Unicode Delphi for quite a few years:

--- Code: Pascal ---
function TTest_ucLogFil.TStringList_File_To_String(
  const InFilespec: string): string;
var
  y: TStringList;
begin
  y := nil;
  Result := '';
  try
    y := TStringList.Create;
    y.LoadFromFile(InFilespec);
    // strip trailing CRLF, which was not in the disk file
    Result := Copy(y.Text, 1, Length(y.Text) - 2);
  finally
    FreeAndNil(y);
  end;
end;
--- End code ---
To summarize, FPC 3 is not loading the chinese.16.utx file the way I think it should.
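
One possible workaround, offered only as a sketch: read the file as raw bytes, check for the FF FE mark by hand, and copy the remaining bytes straight into a UnicodeString. LoadUtf16LEFile is a hypothetical name, not an RTL routine. The sketch handles only UTF-16 little-endian files that start with a BOM, assumes a little-endian host (such as Windows on x86), and has not been run against chinese.16.utx.

```pascal
program LoadUtf16Sketch;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Hypothetical helper: loads a UTF-16 LE file that starts with a BOM.
// Assumes a little-endian host, so file bytes map directly onto WideChars.
function LoadUtf16LEFile(const InFilespec: string): UnicodeString;
var
  fs: TFileStream;
  bom: array[0..1] of Byte;
  charCount: Integer;
begin
  Result := '';
  fs := TFileStream.Create(InFilespec, fmOpenRead or fmShareDenyWrite);
  try
    if fs.Size < 2 then
      raise Exception.Create('File too short to hold a UTF-16 BOM');
    fs.ReadBuffer(bom, SizeOf(bom));
    if (bom[0] <> $FF) or (bom[1] <> $FE) then
      raise Exception.Create('No UTF-16 LE BOM found');
    // The rest of the file is 2-byte code units; the BOM is not copied.
    charCount := Integer((fs.Size - 2) div SizeOf(WideChar));
    SetLength(Result, charCount);
    if charCount > 0 then
      fs.ReadBuffer(Result[1], charCount * SizeOf(WideChar));
  finally
    fs.Free;
  end;
end;

var
  s: UnicodeString;
begin
  if ParamCount < 1 then
  begin
    WriteLn('usage: LoadUtf16Sketch <file>');
    Halt(1);
  end;
  s := LoadUtf16LEFile(ParamStr(1));
  WriteLn('Length in UTF-16 code units: ', Length(s));
end.
```

Because the BOM is skipped and every two bytes become one WideChar, Length() here counts UTF-16 code units rather than bytes, which is the kind of count the Delphi result reflects. Line endings stay exactly as they are in the file, so no trailing CRLF needs to be stripped.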

