I have just ported my app from Linux to Mac OS, and I have a problem with Unicode file names!
I use findfirst to enumerate all files in a directory and get their names. The strange thing is that every Unicode character comes back encoded in a form different from the one used on Windows/Linux.
Example:
the character "ệ" (code = 7879) comes back with length = 3, and the individual characters have the codes 101, 803, 770.
The Mac way is called "decomposed unicode". The reason they do that is so that you cannot have two different files in a directory that are both called ệ. On Linux, you can (one encoded without decomposition, one with -- EDIT: you can actually have 4 different files named ệ in Linux, since you can decompose it in three different ways).
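To make the "4 different files" point concrete, here is a small sketch in Python rather than Pascal (it only uses the standard unicodedata module; the variable names are made up for illustration). It assumes that standard NFD normalization is close enough to what the Mac file system does; Apple's variant may differ for some characters.

```python
import unicodedata

# Four canonically equivalent spellings of "ệ" (illustrative list).
spellings = [
    "\u1EC7",         # fully precomposed: ệ
    "\u1EB9\u0302",   # ẹ + combining circumflex
    "\u00EA\u0323",   # ê + combining dot below
    "e\u0323\u0302",  # fully decomposed: e + dot below + circumflex
]

# As raw strings they are all different, so a file system that does not
# normalize names would accept them as four distinct file names.
distinct_raw = len(set(spellings))

# After canonical decomposition (NFD) they collapse to one sequence.
decomposed = {unicodedata.normalize("NFD", s) for s in spellings}
code_points = [ord(c) for c in next(iter(decomposed))]

print(distinct_raw, len(decomposed), code_points)
# prints: 4 1 [101, 803, 770]
```

The code points 101, 803, 770 are exactly the ones reported in the question.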
They could of course also have picked the composed way for standardizing on, but I guess it's quicker to convert all possible precomposed characters into decomposed ones than to convert all possible partially precomposed and decomposed characters into precomposed ones.
See http://developer.apple.com/library/mac/#qa/qa2001/qa1235.html for more info.
You can pass precomposed strings to file API functions on Mac OS X, but the system will internally convert them to decomposed form before using them.
I have attached two files: one contains the character as produced on Mac OS, the other as produced on Windows. They display identically but differ in size, and both are encoded in Unicode.
Is there any way to convert between the two forms?
If you only want to compare file names, use the LCL function filectrl.CompareFilenames (or filectrl.CompareFilenamesIgnoreCase).
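To illustrate what a comparison that ignores the composed/decomposed difference amounts to, here is a rough Python sketch. It is not how the LCL functions are implemented; same_filename is a hypothetical helper, and it assumes that normalizing both names to NFD before comparing is acceptable.

```python
import unicodedata

def same_filename(a: str, b: str, ignore_case: bool = False) -> bool:
    """Hypothetical helper: compare two names after NFD normalization."""
    a = unicodedata.normalize("NFD", a)
    b = unicodedata.normalize("NFD", b)
    if ignore_case:
        # casefold() is Python's locale-independent lowercase for matching.
        a, b = a.casefold(), b.casefold()
    return a == b

precomposed = "\u1EC7.txt"        # as typically produced on Windows/Linux
decomposed = "e\u0323\u0302.txt"  # as returned by the Mac file system

print(precomposed == decomposed)               # plain string comparison
print(same_filename(precomposed, decomposed))  # normalized comparison
```

The plain comparison reports the names as different, while the normalized one treats them as the same file name.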
I don't know whether there's a standard Lazarus function to convert to precomposed/decomposed form. The link above contains a C example that you can convert to Pascal on Mac OS X though (using the declarations from the MacOSAll unit).
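For reference, the conversion itself looks like this in Python (a sketch only, not a Lazarus solution; to_precomposed and to_decomposed are hypothetical names wrapping the standard unicodedata.normalize call). It assumes the standard NFC/NFD forms, which may not match the Mac file system's variant for every character.

```python
import unicodedata

def to_precomposed(name: str) -> str:
    """Hypothetical helper: convert to the composed form (NFC)."""
    return unicodedata.normalize("NFC", name)

def to_decomposed(name: str) -> str:
    """Hypothetical helper: convert to the decomposed form (NFD)."""
    return unicodedata.normalize("NFD", name)

mac_name = "e\u0323\u0302"           # "ệ" as read from a Mac directory
win_name = to_precomposed(mac_name)  # "ệ" as a single code point

composed_codes = [ord(c) for c in win_name]
decomposed_codes = [ord(c) for c in to_decomposed(win_name)]

# The two forms also differ in size once stored as UTF-8.
composed_size = len(win_name.encode("utf-8"))
decomposed_size = len(mac_name.encode("utf-8"))

print(composed_codes, decomposed_codes, composed_size, decomposed_size)
# prints: [7879] [101, 803, 770] 3 5
```

The size difference in the last two numbers is the kind of difference seen between the two attached files, assuming they are stored as UTF-8.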