From an application-programming standpoint, the main justification is not needing to differentiate between octet streams and UTF-8 data (because most octet streams will end up being treated as Unicode data at some point). If you don't re-encode before passing strings to the OS, you will be doing it before you save to disk or send them over the network.
Nearly all such formats are dynamically encoded anyway (via a BOM, an encoding field in the protocol, or metadata annotations as in HTTP). There are precious few raw wire and file formats that are guaranteed UTF-8 (except in Unix-derived software).
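To make "dynamically encoded" concrete, here is a minimal BOM-sniffing sketch in C++. The function name and the convention of returning "unknown" when no BOM is present are illustrative, not from any particular library:

    #include <cstddef>
    #include <cstdint>
    #include <string>

    // Sketch: detect a document's encoding from its byte-order mark, if any.
    // The UTF-32 checks must come before UTF-16, since "FF FE" is a prefix
    // of the UTF-32LE mark. No BOM (the usual case for plain UTF-8) means
    // the protocol or metadata has to decide instead.
    std::string sniff_bom(const std::uint8_t* buf, std::size_t len) {
        if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
            return "UTF-8";
        if (len >= 4 && buf[0] == 0xFF && buf[1] == 0xFE &&
            buf[2] == 0x00 && buf[3] == 0x00)
            return "UTF-32LE";
        if (len >= 4 && buf[0] == 0x00 && buf[1] == 0x00 &&
            buf[2] == 0xFE && buf[3] == 0xFF)
            return "UTF-32BE";
        if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE) return "UTF-16LE";
        if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF) return "UTF-16BE";
        return "unknown";
    }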
Moreover, that concerns a separate issue, namely document encoding, which is quite different from API/ABI encoding.
Since it is Windows that is the odd one out, it seems to make the most sense to quarantine its use of UTF-16 as close to its API as you can.
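As a sketch of what that quarantine can look like in C++ (the wrapper names to_utf16 and open_for_read are hypothetical; the Win32 calls are real): application code passes UTF-8 around, and only the thin wrapper converts to UTF-16 at the W-suffixed call, so UTF-16 never escapes the boundary.

    #include <windows.h>
    #include <stdexcept>
    #include <string>

    // Convert UTF-8 to the UTF-16 that the wide Win32 API expects.
    static std::wstring to_utf16(const std::string& utf8) {
        if (utf8.empty()) return std::wstring();
        int n = MultiByteToWideChar(CP_UTF8, 0, utf8.data(),
                                    (int)utf8.size(), nullptr, 0);
        if (n <= 0) throw std::runtime_error("invalid UTF-8");
        std::wstring out(n, L'\0');
        MultiByteToWideChar(CP_UTF8, 0, utf8.data(), (int)utf8.size(),
                            &out[0], n);
        return out;
    }

    // Hypothetical UTF-8 facade over one Win32 call; the conversion is
    // confined to this single line at the API boundary.
    HANDLE open_for_read(const std::string& utf8_path) {
        return CreateFileW(to_utf16(utf8_path).c_str(), GENERIC_READ,
                           FILE_SHARE_READ, nullptr, OPEN_EXISTING,
                           FILE_ATTRIBUTE_NORMAL, nullptr);
    }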
That depends on your situation. It is more than a simple platform count, since a lot of SME development will have Windows as its majority target. And mobile targets are increasingly separate codebases, which have supplanted the minimal desktop apps for, e.g., OS X.
If you read the sample code, it should be pretty clear that re-encoding is essentially the same in FreePascal as it is in C++: you call a function. It's very likely you will have to change encoding at some point, and there is a best place to do it if you want to stay cross-platform.
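To back up "you call a function", the reverse direction is a single call too. A C++ sketch is below; in FreePascal the analogous one-liners would be UTF8Encode/UTF8Decode.

    #include <windows.h>
    #include <string>

    // UTF-16 back to UTF-8: again, one function call each way (sketch).
    static std::string to_utf8(const std::wstring& utf16) {
        if (utf16.empty()) return std::string();
        int n = WideCharToMultiByte(CP_UTF8, 0, utf16.data(),
                                    (int)utf16.size(), nullptr, 0,
                                    nullptr, nullptr);
        std::string out(n, '\0');
        WideCharToMultiByte(CP_UTF8, 0, utf16.data(), (int)utf16.size(),
                            &out[0], n, nullptr, nullptr);
        return out;
    }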
Well, first, cross-platform is not a given, and second, it is a weighted count rather than a straight one. It does not make sense to do a hatchet job on incoming Delphi code, in an incompatible way, for some naive sense of multi-platform support when the likely target is again Delphi.
And doing it at the OS interface (which would mean wrapping thousands of Windows calls) is IMHO very bad modularization. You typically modularize to minimize interactions (read: conversions).
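A sketch of the interface shape that argument implies (an interface sketch only, with hypothetical names, not an implementation): conversion lives with document I/O, once per load and once per save, rather than being repeated at every one of those thousands of OS call sites.

    #include <string>

    // Placeholder alias for whatever the platform/RTL uses internally
    // (e.g. UTF-16 on Windows).
    using native_string = std::wstring;

    // Conversion happens exactly twice per document lifetime:
    native_string decode_document(const std::string& utf8_bytes); // once, on load
    std::string   encode_document(const native_string& text);     // once, on save

    // Everything in between manipulates native_string and calls the OS
    // directly, with no per-call re-encoding.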