* * *

Author Topic: [SOLVED] convert all char(Unicode) to integer(or HEX) and inverse??  (Read 2408 times)

Thaddy

  • Hero Member
  • *****
  • Posts: 4525
Re: convert char to integer and inverse??
« Reply #30 on: August 13, 2017, 10:01:01 am »
@Handoko
@majid.ebru

Much too complex:
Reduce to three edits, 2 buttons, reduce code to:
Code: Pascal  [Select]
procedure TForm1.Button1Click(Sender: TObject);
var
  T: Word = $0643; // just a default. I also connected this to TForm1.OnShow...
  C: UnicodeChar absolute T; // C and T share the same memory, so assigning C sets T
begin
  // Keep the default when Edit1 is empty: indexing [1] on an empty string is invalid.
  if Edit1.Text <> '' then
    C := UnicodeString(Edit1.Text)[1]; // first UTF-16 code unit of the text
  Edit2.Text := IntToHex(T, 4);
end;

procedure TForm1.Button2Click(Sender: TObject);
begin
  // The '$' prefix makes StrToInt parse the digits in Edit2 as hexadecimal.
  Edit3.Text := UnicodeChar(StrToInt('$' + Edit2.Text));
end;
{ tested all of them
name    Persian    code point    Arabic    code point
0    ۰    U+06F0    ٠    U+0660
1    ۱    U+06F1    ١    U+0661
2    ۲    U+06F2    ٢    U+0662
3    ۳    U+06F3    ٣    U+0663
4    ۴    U+06F4    ٤    U+0664
5    ۵    U+06F5    ٥    U+0665
6    ۶    U+06F6    ٦    U+0666
7    ۷    U+06F7    ٧    U+0667
8    ۸    U+06F8    ٨    U+0668
9    ۹    U+06F9    ٩    U+0669
ye    ی    U+06CC    ي    U+064A
kāf    ک    U+06A9    ك    U+0643
}

No silly extra units that bloat code needed.
« Last Edit: August 13, 2017, 10:15:03 am by Thaddy »
"Logically, no number of positive outcomes at the level of experimental testing can confirm a scientific theory, but a single counterexample is logically decisive."

majid.ebru

  • Sr. Member
  • ****
  • Posts: 266
Re: convert char to integer and inverse??
« Reply #31 on: August 13, 2017, 10:21:05 am »
I just say: oh my GOD  :o :o :o :o :o

Thank you very much, @Thaddy

Thaddy

  • Hero Member
  • *****
  • Posts: 4525
Re: [SOLVED] convert all char(Unicode) to integer(or HEX) and inverse??
« Reply #32 on: August 13, 2017, 05:13:20 pm »
Glad to be of help.
As I showed, by using the correct UTF-16 string/char types (UnicodeString/UnicodeChar), everything becomes much simpler.
Just typecasts.
The UTF-16 -> UTF-8 conversion (for the Lazarus controls) is handled transparently in that case. This is likely to improve even more in the future.
And UTF-8 is simply a dog to handle on its own, as the eloquent (and working) code from Handoko demonstrates.

UTF8 is like shooting yourself in the foot on purpose. :D
I hope you now understand what my first question to you meant? There is a huge difference between the Unicode string types, and that is often very confusing.

Rule of thumb: when in doubt, start with UTF-16 (UnicodeString), even in Lazarus (which defaults to a UTF-8/ANSI hybrid), because converting from UnicodeString to UTF-8 is much simpler than calling all kinds of utility functions and mappings. There are rare cases where those are still necessary, though.

Note for Lazarus developers: I was really impressed by the fact that right-to-left languages (as per my $0643) are handled so well! Compliments!
Note for FPC developers: thanks for such a great typecasting system!
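
A minimal console sketch of the same char <-> hex round trip without a form, assuming plain FPC with only SysUtils. The helper names FirstUnitToHex and HexToUnicode are hypothetical, and the sketch only handles a single UTF-16 code unit, just like the button handlers above:

```pascal
program CharHexDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hex value of the first UTF-16 code unit of S ('' if S is empty).
// Hypothetical helper, not part of any library.
function FirstUnitToHex(const S: UnicodeString): string;
begin
  if S = '' then
    Result := ''
  else
    Result := IntToHex(Ord(S[1]), 4);
end;

// Inverse: build a one-code-unit UnicodeString from a hex value.
// Hypothetical helper; raises EConvertError if Hex is not valid hex.
function HexToUnicode(const Hex: string): UnicodeString;
begin
  Result := UnicodeChar(StrToInt('$' + Hex));
end;

var
  U: UnicodeString;
  Utf8: UTF8String;
begin
  U := HexToUnicode('0643');        // ARABIC LETTER KAF
  WriteLn(FirstUnitToHex(U));       // prints 0643
  Utf8 := UTF8Encode(U);            // explicit UTF-16 -> UTF-8 conversion
  WriteLn(Length(Utf8));            // prints 2 (bytes D9 83)
  WriteLn(UTF8Decode(Utf8) = U);    // prints TRUE: the round trip is lossless
end.
```

In an LCL application the explicit UTF8Encode/UTF8Decode calls are usually not needed, because assigning a UnicodeString to a control's Text converts implicitly.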
« Last Edit: August 13, 2017, 05:42:52 pm by Thaddy »

engkin

  • Hero Member
  • *****
  • Posts: 1655
Re: [SOLVED] convert all char(Unicode) to integer(or HEX) and inverse??
« Reply #33 on: August 14, 2017, 05:12:52 am »
Quote
UTF8 is like shooting yourself in the foot on purpose. :D

UTF16 is like shooting yourself in the foot by mistake.  :P

If you try any of the following emojis or maybe some Arabic Mathematical Alphabetic Symbols.
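
A rough sketch of why such characters break the single-UnicodeChar approach: code points above U+FFFF (emoji such as U+1F606, or the Arabic Mathematical Alphabetic Symbols block starting at U+1EE00) take two UTF-16 code units, a surrogate pair. The function names CodePointToUtf16 and Utf16CodePointCount are hypothetical, and the sketch assumes plain FPC:

```pascal
program SurrogateDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Encode one Unicode code point (0..$10FFFF) as UTF-16.
// Hypothetical helper; does no validation of CP.
function CodePointToUtf16(CP: Cardinal): UnicodeString;
begin
  if CP < $10000 then
    Result := UnicodeChar(CP)
  else
  begin
    Dec(CP, $10000);
    // High surrogate carries the top 10 bits, low surrogate the bottom 10.
    Result := UnicodeString(UnicodeChar($D800 + (CP shr 10))) +
              UnicodeChar($DC00 + (CP and $3FF));
  end;
end;

// Count code points, treating a high+low surrogate pair as one.
function Utf16CodePointCount(const S: UnicodeString): Integer;
var
  I: Integer;
begin
  Result := 0;
  I := 1;
  while I <= Length(S) do
  begin
    if (Ord(S[I]) >= $D800) and (Ord(S[I]) <= $DBFF) and (I < Length(S)) and
       (Ord(S[I + 1]) >= $DC00) and (Ord(S[I + 1]) <= $DFFF) then
      Inc(I, 2)   // valid pair: two code units, one code point
    else
      Inc(I);     // BMP character (or an unpaired surrogate)
    Inc(Result);
  end;
end;

var
  S: UnicodeString;
begin
  S := CodePointToUtf16($1F606);      // an emoji outside the BMP
  WriteLn(Length(S));                 // prints 2: two code units
  WriteLn(IntToHex(Ord(S[1]), 4));    // prints D83D: only half of the character
  WriteLn(Utf16CodePointCount(S));    // prints 1
end.
```

Taking S[1] of such a string, as the button handler earlier in the thread does, yields only the high surrogate, so the hex value shown would not be the character's code point.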

Thaddy

  • Hero Member
  • *****
  • Posts: 4525
Re: [SOLVED] convert all char(Unicode) to integer(or HEX) and inverse??
« Reply #34 on: August 14, 2017, 07:31:15 am »
Quote
UTF16 is like shooting yourself in the foot by mistake.  :P
Indeed. There are rare cases; I mentioned that.

majid.ebru

  • Sr. Member
  • ****
  • Posts: 266
Re: [SOLVED] convert all char(Unicode) to integer(or HEX) and inverse??
« Reply #35 on: October 22, 2017, 07:46:08 am »
Hi,

So sorry, I have a problem again.

I want to split a string into its characters, but I cannot find out whether the length of a character is 1 or 2.

When I use:
Code: Pascal  [Select]
ShowMessage(IntToStr(Length("E")));   // Length = 1

but when I use:
Code: Pascal  [Select]
ShowMessage(IntToStr(Length("س")));   // Length  = 2

How can I find out which type of char (ASCII or UTF or ...) was entered in the edit box?

Or

how can I find out whether the length of a char is 1 or 2?

Thank you

JuhaManninen

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 3135
  • I like bugs.
Re: [SOLVED] convert all char(Unicode) to integer(or HEX) and inverse??
« Reply #36 on: October 22, 2017, 11:04:49 am »
Code: Pascal  [Select]
ShowMessage(IntToStr(Length("E")));   // Length = 1
ShowMessage(IntToStr(Length("س")));   // Length  = 2
I don't think that code compiles: Pascal string literals need single quotes, not double quotes.

Quote
How can I find out which type of char (ASCII or UTF or ...) was entered in the edit box?
In LCL's edit boxes it is always UTF-8.

Quote
How can I find out whether the length of a char is 1 or 2?
Or 3 or 4? ... Short answer: UTF8Length().
For more info see:
 http://wiki.freepascal.org/UTF8_strings_and_characters
The same examples could be written in an encoding-agnostic way using the unit LazUnicode in LazUtils. It also defines iterators for both codepoints and Unicode "characters".

Some people promote treating UTF-16 as a fixed-width encoding, which is just plain wrong. Already about half of the assigned codepoints are outside the BMP, and the number grows whenever Unicode is extended.
Usually programmers want to squash even small bugs in their code. Ignoring half of the Unicode codepoints, however, is a major bug.
Fortunately, variable-width codepoints are easy to get right, regardless of encoding.
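
To illustrate why Length() returns 2 for "س", here is a hand-rolled sketch of what counting UTF-8 codepoints involves. It assumes plain FPC without LazUtils; in a Lazarus project UTF8Length() is the function to use. The names Utf8SeqSize and Utf8CodePointCount are hypothetical, and the sketch does not validate malformed input:

```pascal
program Utf8LengthDemo;
{$mode objfpc}{$H+}

// Byte length (1..4) of the UTF-8 sequence that starts with lead byte B,
// or 0 if B is a continuation byte or not a valid lead byte.
function Utf8SeqSize(B: Byte): Integer;
begin
  if B < $80 then Result := 1                   // 0xxxxxxx: ASCII
  else if (B and $E0) = $C0 then Result := 2    // 110xxxxx
  else if (B and $F0) = $E0 then Result := 3    // 1110xxxx
  else if (B and $F8) = $F0 then Result := 4    // 11110xxx
  else Result := 0;
end;

// Number of codepoints in S: count every byte that is not a
// continuation byte (10xxxxxx).
function Utf8CodePointCount(const S: RawByteString): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to Length(S) do
    if (Ord(S[I]) and $C0) <> $80 then
      Inc(Result);
end;

var
  S: RawByteString;
begin
  S := 'E';
  WriteLn(Length(S), ' ', Utf8CodePointCount(S));   // prints 1 1
  S := #$D8#$B3;                  // UTF-8 bytes of U+0633 ARABIC LETTER SEEN
  WriteLn(Length(S), ' ', Utf8CodePointCount(S));   // prints 2 1
  S := #$F0#$9F#$98#$86;          // UTF-8 bytes of U+1F606, outside the BMP
  WriteLn(Length(S), ' ', Utf8CodePointCount(S));   // prints 4 1
  WriteLn(Utf8SeqSize($D8));                        // prints 2
end.
```

Length() counts bytes, so the answer to "1 or 2?" comes from the lead byte of each sequence, and it can also be 3 or 4.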
« Last Edit: October 22, 2017, 11:16:47 am by JuhaManninen »

engkin

  • Hero Member
  • *****
  • Posts: 1655
Re: [SOLVED] convert all char(Unicode) to integer(or HEX) and inverse??
« Reply #37 on: October 22, 2017, 06:46:53 pm »
Quote
Some people promote treating UTF-16 as a fixed width encoding which is just plain wrong.

😆

JuhaManninen

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 3135
  • I like bugs.
Re: [SOLVED] convert all char(Unicode) to integer(or HEX) and inverse??
« Reply #38 on: October 22, 2017, 07:15:03 pm »
Emoji outside BMP
How did you manage to enter the emoji on this forum?
If I do it, or just quote your post without changing anything, I get on a pink background:

The following error or errors occurred while posting this message:
The message body was left empty.

engkin

  • Hero Member
  • *****
  • Posts: 1655
Re: [SOLVED] convert all char(Unicode) to integer(or HEX) and inverse??
« Reply #39 on: October 22, 2017, 07:37:04 pm »
😆
How did you manage to enter the emoji on this forum?
If I do it, or just quote your post without changing anything, I get on a pink background:

The following error or errors occurred while posting this message:
The message body was left empty.


😄
Use the "Quick Reply" at the bottom of the page.
« Last Edit: October 22, 2017, 07:41:14 pm by engkin »

JuhaManninen

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 3135
  • I like bugs.
Re: [SOLVED] convert all char(Unicode) to integer(or HEX) and inverse??
« Reply #40 on: October 22, 2017, 11:13:58 pm »
Quote
Use the "Quick Reply" at the bottom of the page.
I don't have "Quick Reply" anywhere.
Did you just copy the emoji and it worked? Could my OS (Linux) affect it? I don't think so; it is the server that gives the error.

engkin

  • Hero Member
  • *****
  • Posts: 1655
Quote
Use the "Quick Reply" at the bottom of the page.
I don't have "Quick Reply" anywhere.
Did you just copy the emoji and it worked? Could my OS (Linux) affect? I don't think so, it is the server that gives the error.

Check the attached image for "Quick Reply" location.

Yes I just copied the emoji and it worked. I noticed that code points outside BMP work in the Quick Reply only. Not sure why.

JuhaManninen

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 3135
  • I like bugs.
Quote
Check the attached image for "Quick Reply" location.
Interesting! See my screenshot.
Earlier I could not see attachments from others. Martin fixed it by giving me more rights. Maybe he can fix this one, too.
@Martin, ping...

Quote
Yes I just copied the emoji and it worked. I noticed that code points outside BMP work in the Quick Reply only. Not sure why.
Uhhh! Anyway this is a warning example of what can happen when programmers treat UTF-16 as a fixed-width encoding. This SMF system is widely used and is supposed to support Unicode. Unfortunately it does not, except maybe with the "Quick Reply" button.

 
