
Author Topic: Convert wide chars  (Read 3049 times)

arneolav

  • Full Member
  • ***
  • Posts: 195
    • ElTranslador
Convert wide chars
« on: January 24, 2015, 12:08:33 am »
Got this string from an API service, as JSON and as XML:

JSON "name":"\u00c5\u00d8\u00c6 \u00e5\u00f8\u00e6"
XML  name="ÅØÆ åøæ"

I'm trying to convert these strings to UTF-8 and/or ANSI, but can't get any conversion to work.
I have been reading everything I can find about Lazarus/UTF conversion, but still have no idea.

From server:
Server: Apache/2.2.22 (Debian)
X-Powered-By: PHP/5.4.4-14+deb7u8


Lazarus 1.2.6 r46529 FPC 2.6.4 i386-win32-win32/win64
 
Win XP, Win7, Win 10, Win 11, win64 , Lazarus 3.0RC1
Delphi/DevExpress

skalogryz

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 2770
    • havefunsoft.com
Re: Convert wide chars
« Reply #1 on: January 24, 2015, 03:50:30 am »
If you're parsing the string with the FPC jsonscanner, it should convert these wide chars to UTF-8 for you.
However, the result of the conversion doesn't make much sense, unless this is some sort of encryption API, or some kind of Unicode-escaped ANSI coding :) That might occur if previously pushed data was wrong, or if the server is misconfigured.

The result looks like this: ÅØÆ åøæ

Is it some kind of upper/lower case test? Is there a specific reason why the code points in the first and second sets of characters differ by 32?
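
For reference, a minimal sketch of what that parsing could look like, assuming FPC 3.x with the fpjson and jsonparser units, where GetJSON converts \uXXXX escapes to UTF-8 by default. The program and variable names are only illustrative. It dumps the bytes of the parsed value in hex so the console code page doesn't get in the way; if the value really is UTF-8, Å (U+00C5) shows up as C3 85.

```pascal
program jsondemo;
{$mode objfpc}{$H+}
uses
  SysUtils,    // IntToHex
  fpjson,      // TJSONData, TJSONObject, GetJSON
  jsonparser;  // registers the parser used by GetJSON
var
  Data: TJSONData;
  Value: String;
  i: Integer;
begin
  // The backslashes are literal here: Pascal strings have no escape sequences,
  // so the JSON parser sees the \uXXXX escapes unchanged.
  Data := GetJSON('{"name":"\u00c5\u00d8\u00c6 \u00e5\u00f8\u00e6"}');
  try
    Value := TJSONObject(Data).Strings['name'];
    // Print every byte of the result as two hex digits
    for i := 1 to Length(Value) do
      Write(IntToHex(Ord(Value[i]), 2), ' ');
    WriteLn;
  finally
    Data.Free;
  end;
end.
```

On older compilers such as FPC 2.6.4 the parser API may differ, so treat this as a starting point rather than a drop-in fix.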
« Last Edit: January 24, 2015, 03:55:54 am by skalogryz »

arneolav

Re: Convert wide chars
« Reply #2 on: January 24, 2015, 09:50:43 am »
Thanks,
Sorry I didn't mention it: yes, it is a test of the characters ÅØÆ åøæ.

There is no encryption, except that the communication is over https:/... and the transport is Synapse, trunk version (OAuth).

As XML I can receive a text like this: name="Hjemme Nedbør"
The char "ø" should be converted to "ø".

To solve this I know I can build a table and write my own conversion, but I think something like that already exists.
No need to reinvent the wheel...

« Last Edit: January 24, 2015, 12:40:41 pm by arneolav »

arneolav

Re: Convert wide chars
« Reply #3 on: January 24, 2015, 11:11:38 am »
Seems to be solved by:

Removed lazutf8 and replaced it with: uses ... lazutils
« Last Edit: January 24, 2015, 12:40:53 pm by arneolav »

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
Re: Convert wide chars
« Reply #4 on: January 24, 2015, 04:51:30 pm »
Code:

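uses cwstring;  // widestring manager; swap for fpwidestring to compare

const xx = #$00C5#$00D8#$00C6#$00E5#$00E6;  // UCS code points for ÅØÆåæ (no ø, #$00F8)

var
  s  : utf8string;
  a  : ansistring;
  s2 : unicodestring;

begin
  a := xx;      // converted to the default ANSI code page
  writeln(a);
  s := xx;      // converted to UTF-8
  writeln(s);
  s2 := xx;     // kept as UTF-16
  writeln(s2);
end.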


Prints (FreeBSD, utf8 console, FPC 3.1.1):

Code:
ÅØÆåæ
Ã
ÃÃåæ
ÅØÆåæ

with "uses cwstring"

Code:
ÅØÆåæ
AOAEaae
AOAEaae

with "uses fpwidestring"

Code:
ÅØÆåæ
ÅØÆåæ
ÅØÆåæ

Which results come out right or wrong probably depends on whether the current code page is properly initialized by whichever Unicode conversion driver unit is in use (cwstring, fpwidestring).
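
One way to take the console out of the picture is to convert explicitly and look at the bytes. A small sketch, assuming UTF8Encode from the System unit (which, as far as I know, does not go through the installed widestring manager); the program and variable names are made up for the example.

```pascal
program utf8bytes;
{$mode objfpc}{$H+}
uses
  SysUtils;  // IntToHex
var
  u: UnicodeString;
  s: UTF8String;
  i: Integer;
begin
  // Build the UTF-16 string from explicit code points (ÅØÆ),
  // avoiding any dependence on the source file encoding.
  SetLength(u, 3);
  u[1] := WideChar($00C5);
  u[2] := WideChar($00D8);
  u[3] := WideChar($00C6);

  // Explicit UTF-16 to UTF-8 conversion
  s := UTF8Encode(u);

  // Print every byte as two hex digits
  for i := 1 to Length(s) do
    Write(IntToHex(Ord(s[i]), 2), ' ');
  WriteLn;  // expected bytes: C3 85 C3 98 C3 86
end.
```

If the bytes are right here but the text still looks wrong on screen, the problem is more likely the console or the code page setup than the conversion itself.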

 
