
Topic: Creating (handling) files with Unicode filenames

CCRDude

  • Hero Member
  • *****
  • Posts: 596
Creating (handling) files with Unicode filenames
« on: April 28, 2016, 11:26:26 am »
Environments: Lazarus 1.4.4 with FPC 3.0.1 (an older fpcup build), and Lazarus 1.6 with FPC 3.0.0 (public release version)

Situation: a file system with international filenames; I am trying to access a filename containing special Polish characters on English or German computers.

Experienced: filename garbled when creating the file.

Expected: proper handling.

Here's my code:
Code: Pascal
program UnicodeFilenameAccessTest;

{$mode objfpc}{$H+}
//{$ModeSwitch UnicodeStrings}

uses
   Windows,
   SysUtils,
   LazFileUtils,
   LazUTF8;

   function xCreateFileW(lpFileName: LPCWSTR; dwDesiredAccess: DWORD; dwShareMode: DWORD; lpSecurityAttributes: LPSECURITY_ATTRIBUTES; dwCreationDisposition: DWORD;
      dwFlagsAndAttributes: DWORD; hTemplateFile: HANDLE): HANDLE; stdcall; external 'kernel32' Name 'CreateFileW';

   procedure CreateTestFile;
   const
      TestFileNameUTF8: UTF8String = 'C:\Temp\Przepraszam Cię UTF-8.txt';
      TestFileNameUnicode: UnicodeString = 'C:\Temp\Przepraszam Cię Unicode.txt';
      TestFileNameWideChar: PWideChar = 'C:\Temp\Przepraszam Cię PWideChar.txt';
   var
      h: THandle;
   begin
      // Using LazFileUtils
      h := FileCreateUTF8(TestFileNameUTF8);
      if h > 0 then begin
         WriteLn('Opened file');
         CloseHandle(h);
      end else begin
         WriteLn(SysErrorMessage(GetLastError));
      end;

      // Using fileutilh.inc unicode variant
      h := FileCreate(TestFileNameUnicode);
      if h > 0 then begin
         WriteLn('Opened file');
         CloseHandle(h);
      end else begin
         WriteLn(SysErrorMessage(GetLastError));
      end;

      // Direct API access
      h := xCreateFileW(TestFileNameWideChar, GENERIC_ALL, 7, nil, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, 0);
      if h > 0 then begin
         WriteLn('Opened file');
         CloseHandle(h);
      end else begin
         WriteLn(SysErrorMessage(GetLastError));
      end;
   end;

begin
   CreateTestFile;
end.

In the filesystem, I get:
  • Przepraszam Cię PWideChar.txt
  • Przepraszam Cię Unicode.txt
  • Przepraszam Cię UTF-8.txt

It looks like some unwanted UTF-8 conversion is being applied to the filename, regardless of which method I choose to create the file (and I've tried even more, including TFileStream and others), even when calling the Windows API directly with a PWideChar.

What am I doing wrong here?

Is there a chance FPC 3.1.1 would behave better? fpcup fails for me currently due to BGRA issues, so I can't try right now.

balazsszekely

  • Guest
Re: Creating (handling) files with Unicode filenames
« Reply #1 on: April 28, 2016, 12:10:51 pm »
Under Windows, always use the wide (*W) API functions, like this:
Code: Pascal
program UnicodeFilenameAccessTest;

{$mode objfpc}{$H+}
uses
   Windows,
   SysUtils,
   LazFileUtils,
   LazUTF8;

   function xCreateFileW(lpFileName: LPCWSTR; dwDesiredAccess: DWORD; dwShareMode: DWORD; lpSecurityAttributes: LPSECURITY_ATTRIBUTES; dwCreationDisposition: DWORD;
      dwFlagsAndAttributes: DWORD; hTemplateFile: HANDLE): HANDLE; stdcall; external 'kernel32' Name 'CreateFileW';

   procedure CreateTestFile;
   const
      TestFileName: String = 'C:\Temp\Przepraszam Cię PWideChar.txt';
   var
      h: THandle;
   begin
      // Direct API access
      h := xCreateFileW(PWideChar(WideString(TestFileName)), GENERIC_ALL, 7, nil, CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, 0);
      if h > 0 then begin
         WriteLn('Opened file');
         CloseHandle(h);
      end else begin
         WriteLn(SysErrorMessage(GetLastError));
      end;
   end;

begin
   CreateTestFile;
end.

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
Re: Creating (handling) files with Unicode filenames
« Reply #2 on: April 28, 2016, 12:24:00 pm »
Are you missing a {$codepage utf8} directive?

ChrisF

  • Hero Member
  • *****
  • Posts: 542
Re: Creating (handling) files with Unicode filenames
« Reply #3 on: April 28, 2016, 02:34:44 pm »
I agree with marcov (unless you get the file name from another source, for instance by reading it from the file system).

BTW, I'm not sure your test for the returned handle value is correct. You'd better use something like:
Code: Pascal
//      if h > 0 then begin
      if h <> (THandle(-1)) then begin
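For illustration, here is a minimal sketch of that handle check in a complete program. It assumes FPC 3.0+ and uses the portable SysUtils.FileCreate instead of the Windows API; the scratch file name is hypothetical. As far as I know, THandle is an unsigned type on Windows targets, so a test like "h > 0" would also accept the failure value, which is why the comparison against THandle(-1) is safer.

```pascal
program HandleCheckSketch;

{$mode objfpc}{$H+}

uses
  SysUtils;

const
  // Hypothetical scratch file name, created in the current directory.
  ScratchName = 'handle-check-sketch.tmp';
var
  h: THandle;
begin
  h := FileCreate(ScratchName);
  // THandle(-1) is the failure value; SysUtils also names it feInvalidHandle.
  if h <> THandle(-1) then
  begin
    WriteLn('Opened file');
    FileClose(h);
    DeleteFile(ScratchName);  // clean up the scratch file
  end
  else
    WriteLn(SysErrorMessage(GetLastOSError));
end.
```

When calling CreateFileW directly, the Windows unit's INVALID_HANDLE_VALUE constant plays the same role as THandle(-1) here.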

marcov

Re: Creating (handling) files with Unicode filenames
« Reply #4 on: April 28, 2016, 02:40:45 pm »
FileCreate should work in 3.0+, but of course the problem might lie in where you obtain the data. String literals are interpreted in the default system encoding unless a {$codepage} directive is specified.
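As a rough sketch of what that looks like in practice (assuming the source file is saved as UTF-8, FPC 3.0+, and that C:\Temp exists as in the first post; the file name itself is only an example):

```pascal
program CodepageLiteralSketch;

{$mode objfpc}{$H+}
{$codepage utf8}  // tell the compiler that this source file is UTF-8 encoded

uses
  SysUtils;

const
  // Example path in the style of the first post; C:\Temp is assumed to exist.
  TestFileName: UnicodeString = 'C:\Temp\Przepraszam Cię.txt';
var
  h: THandle;
begin
  // FPC 3.0+ provides a UnicodeString overload of FileCreate.
  h := FileCreate(TestFileName);
  if h <> THandle(-1) then
  begin
    WriteLn('Created file');
    FileClose(h);
  end
  else
    WriteLn('FileCreate failed: ', SysErrorMessage(GetLastOSError));
end.
```

The only change compared with the failing test is the directive near the top; the literal and the FileCreate call are otherwise the same kind of code as in the original program.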

ChrisF

Re: Creating (handling) files with Unicode filenames
« Reply #5 on: April 28, 2016, 03:02:02 pm »
According to a quick test, all three approaches work with FPC 3.0+ (after adding the UTF-8 code page directive to the source code).

But you are quite right about the origin of the data.

Anyway, for his test I guess that CCRDude's source code file is UTF-8 encoded (probably written in the Lazarus IDE, given the use of LazFileUtils and LazUTF8).
« Last Edit: April 28, 2016, 03:03:54 pm by ChrisF »

CCRDude

Re: Creating (handling) files with Unicode filenames
« Reply #6 on: April 28, 2016, 09:00:32 pm »
Many thanks!

marcov, you were right! I did set DefaultSystemCodePage and DefaultFileSystemCodePage, but I didn't know about the {$codepage} directive yet. Since the IDE (the debugger's Locals window) showed the strings correctly, I assumed they were fine. Thank you so much, I already had nightmares doubting my decision to finally port this project!
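For anyone who runs into the same thing: as I understand it, DefaultSystemCodePage only influences conversions at run time, while {$codepage} tells the compiler how to decode the literals in the source file. A small check like the following sketch (assuming the file is saved as UTF-8; the sample word is taken from the test filename) prints what the compiler actually stored, independent of what the debugger displays:

```pascal
program LiteralEncodingCheck;

{$mode objfpc}{$H+}
{$codepage utf8}  // remove this line to compare how the literal is decoded without it

const
  Sample: UnicodeString = 'Cię';
var
  i: Integer;
begin
  // With the directive and a UTF-8 source file, the literal should be three
  // UTF-16 code units long, the last one being $0119 (e with ogonek).
  WriteLn('Length: ', Length(Sample));
  for i := 1 to Length(Sample) do
    WriteLn('  #', i, ': $', HexStr(LongInt(Ord(Sample[i])), 4));
end.
```

If the length or the code units differ from that, the literal was already decoded incorrectly at compile time, before any file API was called.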

ChrisF, you're right of course, that was a quick hack to test the file creation, but even a quick test should be correct :)



 
