Author Topic: Passing Unicode filenames to TUTF8Process  (Read 11937 times)

Mike.Cornflake

  • Hero Member
  • *****
  • Posts: 1250
Passing Unicode filenames to TUTF8Process
« on: June 28, 2014, 09:19:40 pm »
I'm having a problem with Unicode filenames (again with mplayer) on Windows 8 64-bit running Lazarus 32-bit Trunk/FPC Trunk...

Code:
Var
  FPlayerProcess: TProcessUTF8;
...
  FPlayerProcess.Executable:=FMPlayerPath;
  FPlayerProcess.Parameters.Add(AnsiToUTF8(Filename));

  FPlayerProcess.Execute;
If I pass a non-Unicode filename to this, everything works, and I get the following by querying FPlayerProcess.Output:
Quote
MPlayer Redxii-SVN-r37216-4.8.2 (i686) (C) 2000-2014 MPlayer Team
Compiled against FFmpeg version N-63644-ge1bd40f
Build date: Sat May 31 22:08:38 EDT 2014

Playing B:\Code\Compile\Test Data\Clip_1080_5sec_VC1_15mbps.wmv.
...
On the other hand, if I pass a file with a unicode name ("skiing - ǤǥǦ.avi") into the procedure, then instead I see
Quote
MPlayer Redxii-SVN-r37216-4.8.2 (i686) (C) 2000-2014 MPlayer Team
Compiled against FFmpeg version N-63644-ge1bd40f
Build date: Sat May 31 22:08:38 EDT 2014

Playing B:\Code\Compile\Test Data\skiing - ǤǥǦ.avi.

Exiting... (End of file)
I've confirmed from the command line that mplayer itself will play the file with the Unicode filename.  So it looks like it's down to my handling of Unicode filenames.  Somehow I'm mangling "skiing - ǤǥǦ.avi" into "skiing - ǤǥǦ.avi"

I've tried this with both
Code:
  FPlayerProcess.Parameters.Add(AnsiToUTF8(Filename)); // mplayer reports can't find "skiing - ǤǥǦ.avi"
and
Code:
  FPlayerProcess.Parameters.Add(Filename);  // mplayer reports can't find "skiing - GgG.avi"

I've also tried UTF8ToAnsi, and UTF8ToSys (both of which return "skiing - GgG.avi")...

Incidentally, in ALL cases the process ends with both an ExitCode and ExitStatus of 0.

So, how do I pass a Unicode filename into UTF8Process.Parameters?
« Last Edit: June 28, 2014, 09:48:39 pm by Mike.Cornflake »
Lazarus Trunk/FPC Trunk on Windows [7, 10]
  Have you tried searching this forum or the wiki?:   http://wiki.lazarus.freepascal.org/Alternative_Main_Page
  BOOKS! (Free and otherwise): http://wiki.lazarus.freepascal.org/Pascal_and_Lazarus_Books_and_Magazines

bigeno

  • Full Member
  • ***
  • Posts: 248
Re: Passing Unicode filenames to TUTF8Process
« Reply #1 on: June 28, 2014, 11:25:18 pm »
Did you try TProcess (not TProcessUTF8) and UTF8ToSys ?

Mike.Cornflake

  • Hero Member
  • *****
  • Posts: 1250
Re: Passing Unicode filenames to TUTF8Process
« Reply #2 on: June 28, 2014, 11:40:54 pm »
I tried UTF8ToSys (and SysToUTF8).  No bananas each time :-(

I didn't try TProcess. Do you have experience of it working with Unicode parameters?

Bart

  • Hero Member
  • *****
  • Posts: 3546
    • Bart en Mariska's Webstek
Re: Passing Unicode filenames to TUTF8Process
« Reply #3 on: June 28, 2014, 11:54:57 pm »
AFAIK you do not need to use Utf8ToSys; just supply the name as UTF8. The conversion is done inside TProcessUTF8.

Bart

bigeno

  • Full Member
  • ***
  • Posts: 248
Re: Passing Unicode filenames to TUTF8Process
« Reply #4 on: June 28, 2014, 11:56:55 pm »
Quote
I tried UTF8ToSys (and SysToUTF8).  No bananas each time :-(

I didn't try TProcess. Do you have experience of it working with Unicode parameters?

I use TProcessUTF8 without problems (though not for mplayer). If you use TProcessUTF8 then the parameters must be UTF8, so you can't use UTF8ToSys; use TProcess for that. If the command line works then the Sys encoding works, hm... Can you remove the spaces from the file name as a test?

Mike.Cornflake

  • Hero Member
  • *****
  • Posts: 1250
Re: Passing Unicode filenames to TUTF8Process
« Reply #5 on: June 29, 2014, 12:16:04 am »
I've now tried TProcess with Filename, UTF8ToSys(Filename), SysToUTF8(Filename).
I've tried TProcessUTF8 with the same set of combinations.

In every case, the filename is garbled.  The closest I get is with TProcess and SysToUTF8(Filename).  At least the filename reported by mplayer is very close to the filename mplayer reports when I run it from the commandline.  Still not close enough though....

I'm beginning to wonder if Microsoft is doing something odd in the command line.  I've just noticed that at the prompt, the filename ISN'T Unicode, it's what I get when I call UTF8ToSys.  I've been using Tab-Completion to get the filename...

Mike.Cornflake

  • Hero Member
  • *****
  • Posts: 1250
Re: Passing Unicode filenames to TUTF8Process
« Reply #6 on: June 29, 2014, 12:24:01 am »
Quote
Can you remove spaces (from file name) for test  ?

Spaces in filenames were the first issue I resolved when working with this code.  But I haven't tested the specific scenario of Unicode + no spaces.  Hang on...
OK, I only tested with TProcess and Filename, then SysToUTF8(Filename) and finally UTF8ToSys(Filename).  This time with no spaces in either the folder name or the filename.  Nope, no change...

marcov

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 7603
Re: Passing Unicode filenames to TUTF8Process
« Reply #7 on: June 29, 2014, 01:17:14 am »
It seems that TProcessUTF8 does UTF8ToSys() internally too.  You can try the same with TProcess, but TProcess fundamentally doesn't understand Unicode on Windows. Moreover, it does some string processing that makes it shaky.

If your filename contains characters not in your windows codepage, it will probably fail miserably.
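
A quick way to check whether a given name is at risk is to round-trip it through the ANSI string type and compare. This is only a sketch: FitsSystemCodePage is a made-up helper name, and it assumes an FPC version where casting a UnicodeString to AnsiString converts to the system code page.

```pascal
program CodePageCheck;
{$mode objfpc}{$H+}
{$ifdef unix}uses cwstring;{$endif}  // widestring manager, needed on Unix only

// Hypothetical helper: True when S survives a round trip through the
// system ANSI code page, i.e. no character was replaced or dropped.
function FitsSystemCodePage(const S: UnicodeString): Boolean;
var
  A: AnsiString;
begin
  A := AnsiString(S);              // lossy if S has characters outside the code page
  Result := UnicodeString(A) = S;  // convert back and compare
end;

var
  TestName: UnicodeString;
begin
  // Build the test name from code points (U+01E4..U+01E6) so the encoding
  // of this source file does not matter.
  TestName := 'skiing - ' + WideChar($01E4) + WideChar($01E5) + WideChar($01E6) + '.avi';
  WriteLn(FitsSystemCodePage('Clip_1080_5sec_VC1_15mbps.wmv'));
  WriteLn(FitsSystemCodePage(TestName));
end.
```

On a Windows system whose ANSI code page lacks these three characters, the second call should report FALSE, which would match the "GgG" substitution seen earlier in the thread. On a Linux desktop with a UTF-8 locale both calls would be expected to report TRUE.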

Mike.Cornflake

  • Hero Member
  • *****
  • Posts: 1250
Re: Passing Unicode filenames to TUTF8Process
« Reply #8 on: June 29, 2014, 10:08:52 am »
Quote
AFAIK you do not need to use Utf8ToSys; just supply the name as UTF8. The conversion is done inside TProcessUTF8.

Bart

Thanks, but as described elsewhere, I tried that, and didn't get anywhere.

Quote
It seems that TProcessUTF8 does UTF8ToSys() internally too.  You can try the same with TProcess, but TProcess fundamentally doesn't understand Unicode on Windows. Moreover, it does some string processing that makes it shaky.

I think there's a closer relationship between TProcessUTF8 and TProcess than you suspect.  All TProcessUTF8 does is automagically do some SysToUTF8 for you.  However I *think* (read "am convinced but haven't actually stepped through the code for confirmation") that TProcessUTF8 calls the bits of TProcess that do the flaky stuff.  Possibly CommandToList, and definitely TProcess.Execute, which ultimately appears to call CreateProcessA, not CreateProcessW.

Quote
If your filename contains characters not in your windows codepage, it will probably fail miserably.

And this (failing miserably) is my case for both TProcessUTF8 and TProcess (which, now that I've stepped through the code, doesn't surprise me: there's no effective difference I can see between carefully handling TProcess and letting TProcessUTF8 do some stuff for you).  And in my Unicode ignorance, I think you might have shed some light on why it works for some, but not others.  Well, why it works for everyone else, but not me :)

Many thanks for your time, everyone.  I'll produce a simple case somehow and lodge an item in the bugtracker, then I'm going to walk away from this one.  Upgrading win\process.inc to call CreateProcessW will require far more time than I have available in the foreseeable future...

marcov

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 7603
Re: Passing Unicode filenames to TUTF8Process
« Reply #9 on: June 29, 2014, 12:48:24 pm »
Quote
definitely TProcess.Execute which ultimately appears to call CreateProcessA, not CreateProcessW.

That's what I meant, yes. 2.6.x uses -A everywhere.  2.7.1 is being changed in that regard to use -W, but that hasn't yet progressed to anything class-based (most System and SysUtils routines have been converted, but ExecuteProcess() hasn't  >:D )
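
Until the classes are switched over, one possible workaround on Windows is to bypass TProcess and call CreateProcessW directly. What follows is a minimal sketch, not the RTL's implementation: RunWithUTF8Arg and CreateProcessWide are names made up for this example, 'mplayer.exe' is assumed to be on the PATH, the quoting is naive, and no output is captured (TProcess uses pipes for that). The import is declared locally with untyped pointers so the sketch does not depend on how a particular FPC version declares CreateProcessW in the Windows unit.

```pascal
program RunWideSketch;
{$mode objfpc}{$H+}
uses
  Windows;

// Local import of the wide API (hypothetical name, real kernel32 entry point).
function CreateProcessWide(lpApplicationName, lpCommandLine: PWideChar;
  lpProcessAttributes, lpThreadAttributes: Pointer; bInheritHandles: LongBool;
  dwCreationFlags: DWORD; lpEnvironment: Pointer; lpCurrentDirectory: PWideChar;
  lpStartupInfo, lpProcessInformation: Pointer): LongBool; stdcall;
  external 'kernel32' name 'CreateProcessW';

// Hypothetical helper: start Exe with a single argument, both given as UTF-8,
// and wait for the child process to finish.
function RunWithUTF8Arg(const ExeUTF8, ArgUTF8: UTF8String): Boolean;
var
  CmdLine: UnicodeString;
  StartInfo: TStartupInfo;  // same layout as STARTUPINFOW; string fields stay nil
  ProcInfo: TProcessInformation;
begin
  // Naive quoting: assumes neither string contains a double quote.
  // The result is a freshly built heap string, which matters because
  // CreateProcessW may modify the command-line buffer.
  CmdLine := '"' + UTF8Decode(ExeUTF8) + '" "' + UTF8Decode(ArgUTF8) + '"';
  FillChar(StartInfo, SizeOf(StartInfo), 0);
  StartInfo.cb := SizeOf(StartInfo);
  Result := CreateProcessWide(nil, PWideChar(CmdLine), nil, nil, False, 0,
    nil, nil, @StartInfo, @ProcInfo);
  if Result then
  begin
    WaitForSingleObject(ProcInfo.hProcess, INFINITE);
    CloseHandle(ProcInfo.hProcess);
    CloseHandle(ProcInfo.hThread);
  end;
end;

var
  FileName: UnicodeString;
begin
  // Build the test name from code points so the source file encoding does not matter.
  FileName := 'skiing - ' + WideChar($01E4) + WideChar($01E5) + WideChar($01E6) + '.avi';
  if not RunWithUTF8Arg('mplayer.exe', UTF8Encode(FileName)) then
    WriteLn('CreateProcessW failed, error ', GetLastError);
end.
```

The point of the sketch is that the filename never passes through the ANSI code page: it goes from UTF-8 to UTF-16 and straight into the wide API.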



Mike.Cornflake

  • Hero Member
  • *****
  • Posts: 1250
Re: Passing Unicode filenames to TUTF8Process
« Reply #10 on: June 29, 2014, 01:59:01 pm »
Quote
2.7.1 is being changed in that regard to use -W, but that hasn't progressed yet to anything classes based yet (but most system and sysutils routines are.

Simply fantastic news :-)  Good to know that the problem is being worked on.  Many thanks for the confirmation.

Mike.Cornflake

  • Hero Member
  • *****
  • Posts: 1250
Re: Passing Unicode filenames to TUTF8Process
« Reply #11 on: June 29, 2014, 05:04:44 pm »
One last word on this issue, I promise :-)

Just letting people know that the issue appears limited to Windows.  Identical code compiled under Linux (Mint 14/Mate) successfully opened the Unicode file first try.

marcov

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 7603
Re: Passing Unicode filenames to TUTF8Process
« Reply #12 on: June 29, 2014, 05:44:15 pm »
Quote
One last word on this issue promise :-)

Just letting people know that the issue appears limited to Windows.  Identical code compiled under Linux (Mint 14/Mate) successfully opened the unicode file first try.

The 2.6.x RTL is defined as defaulting to the system 1-byte encoding. On Linux that happens to be UTF8 in full desktop distros.

mdalacu

  • Full Member
  • ***
  • Posts: 202
    • dmSimpleApps
Re: Passing Unicode filenames to TUTF8Process
« Reply #13 on: December 04, 2015, 08:08:58 am »
Did you manage to get it working? I have the same problem... I cannot launch a process with Unicode filenames passed as parameters.

parcel

  • Full Member
  • ***
  • Posts: 135
Re: Passing Unicode filenames to TUTF8Process
« Reply #14 on: December 04, 2015, 08:31:03 am »
To avoid string encoding problems, using "PChar(UTF8Encode(<unicodestring>))" is a good approach in FPC >= 2.7.

The RTL string functions always assume the destination encoding is "DefaultSystemCodePage", which sometimes causes trouble.
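
A small sketch of what that explicit conversion looks like, assuming FPC 2.7 or later. The variable names are made up for illustration, and the filename is built from code points so the encoding of the source file plays no part.

```pascal
program ExplicitUtf8;
{$mode objfpc}{$H+}
{$ifdef unix}uses cwstring;{$endif}  // widestring manager, needed on Unix only

var
  WideName: UnicodeString;
  Utf8Name: UTF8String;
  P: PChar;
begin
  WideName := 'skiing - ' + WideChar($01E4) + WideChar($01E5) + WideChar($01E6) + '.avi';

  // Explicit conversion: the result is UTF-8 regardless of DefaultSystemCodePage.
  Utf8Name := UTF8Encode(WideName);

  // Keep the converted string in a variable: P is only valid while
  // Utf8Name is alive and unchanged.
  P := PChar(Utf8Name);

  WriteLn(Length(WideName), ' UTF-16 code units, ', Length(Utf8Name), ' UTF-8 bytes');
  WriteLn('First byte via PChar: ', Ord(P^));
  WriteLn('Round trip intact: ', UTF8Decode(Utf8Name) = WideName);
end.
```

Holding the result of UTF8Encode in its own variable, rather than casting a temporary to PChar, avoids ending up with a pointer into a string that has already been released.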