
Author Topic: How to check if a string has only extended Ascii characters in it ?  (Read 4793 times)

fjabouley

  • Full Member
  • ***
  • Posts: 127
Hello all!
I'm quite confused, and maybe it's a newbie question, but I'm trying to check whether a string contains only (extended) ASCII characters, and to return false if it does not. (I'm trying to split strings that contain characters such as 'ç', 'ô', etc. (UCS2), and not the ones that contain only ASCII characters.)
I've tried many things, but they fail every time, and I'm lost...
Best regards!

skalogryz

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 2587
    • havefunsoft.com
Re: How to check if a string has only extended Ascii characters in it ?
« Reply #1 on: March 08, 2021, 07:24:06 am »
Code: Pascal
// Returns True only if every byte of s is above #127,
// i.e. as soon as a plain 7-bit ASCII byte is found the result is False.
function isAsciiExtOnly(const s: string): Boolean;
var
  i : integer;
begin
  for i:=1 to length(s) do
    if s[i]<=#127 then begin
      Result:=false;
      exit;
    end;
  Result:=true;
end;
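If the goal is the opposite test, namely whether a string is pure 7-bit ASCII, a byte-wise loop is enough for UTF-8 strings, because every byte of a multi-byte UTF-8 sequence is above #127. A minimal sketch, assuming UTF-8 input; the name IsPureAscii is made up for illustration, and the test string is written as explicit bytes so the result does not depend on the source file encoding:

```pascal
program PureAsciiDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: True if s contains no byte above #127.
function IsPureAscii(const s: string): Boolean;
var
  i: Integer;
begin
  Result := True;
  for i := 1 to Length(s) do
    if s[i] > #127 then
      Exit(False);
end;

begin
  WriteLn(IsPureAscii('hello'));             // ASCII letters only
  WriteLn(IsPureAscii('gar'#$C3#$A7'on'));   // contains ç as the UTF-8 bytes C3 A7
end.
```

This only distinguishes ASCII from non-ASCII; it says nothing about which non-ASCII characters are present.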
« Last Edit: March 08, 2021, 07:26:37 am by skalogryz »
Patron Cocoa Widgetset development https://www.patreon.com/skalogryz

fjabouley

  • Full Member
  • ***
  • Posts: 127
Re: How to check if a string has only extended Ascii characters in it ?
« Reply #2 on: March 08, 2021, 08:55:54 am »
Thanks a lot for your answer.
In fact, that's not quite what I want to do. I'm trying to find out whether a string contains certain characters, so that I know how to split a larger string (the goal is to use a Huawei API to send messages).


I'd like to do something like this...


Code: Pascal
function hasSpecialChars(const s:string):boolean;
var
  i : integer;
const spch : array of string = ('ç','â','ê','î','ô','û','ä','ë','ï','ö','ü','µ','²');
begin
  Result := false;
  for i:=1 to length(s) do
    if s[i] in spch then begin
      Result:=true;
      exit;
    end;
end;
But the characters are not all encoded with the same number of bytes, and I don't know how to handle that. I tried UTF8Length, etc., but I can't manage to do what I want...
Best regards

speter

  • Full Member
  • ***
  • Posts: 197
Re: How to check if a string has only extended Ascii characters in it ?
« Reply #3 on: March 08, 2021, 09:45:48 am »
Would this work!?

Code: Pascal
function hasSpecialChars(const s:string):boolean;
var
  i : integer;
const spch = 'çâêîôûäëïöüµ²';
begin
  Result := false;
  for i:=1 to length(s) do
    if pos(s[i],spch) > 0 then begin
      Result:=true;
      exit;
    end;
end;

cheers
S.
I climbed mighty mountains, and saw that they were actually tiny foothills. :)

Laz 2.0.10 / FPC 3.2.0 / Windows 10 (64bit)

winni

  • Hero Member
  • *****
  • Posts: 2356
Re: How to check if a string has only extended Ascii characters in it ?
« Reply #4 on: March 08, 2021, 10:28:07 am »
Hi!

You have to forget the old "one byte logic" and get used to the functions from LazUTF8.


Code: Pascal
uses ......,lazUTF8;
...
function hasSpecialChars(const s:string):boolean;
var
  i : integer;
  ch : string;
const spch = 'çâêîôûäëïöüµ²';
begin
  Result := false;
  // walk through spch one UTF-8 character (not one byte) at a time
  for i:=1 to UTF8length(spch) do
  begin
    ch := UTF8Copy(spch,i,1);
    if pos(ch, s) > 0 then begin
      Result:=true;
      exit;
    end;
  end;
end;

Winni

JuhaManninen

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 4002
  • I like bugs.
Mostly Lazarus trunk and FPC 3.2 on Manjaro Linux 64-bit.

engkin

  • Hero Member
  • *****
  • Posts: 2721
Re: How to check if a string has only extended Ascii characters in it ?
« Reply #6 on: March 08, 2021, 03:22:31 pm »
Hi,

ç, for instance, could be written using one of two forms:
U+00E7 Latin Small Letter C With Cedilla

or

U+0063 Latin Small Letter C and U+0327 Combining Cedilla

Code: Pascal
var
  u1,u2: unicodestring;
  s1,s2: string;
...
  u1 := #$00E7;        // precomposed form
  s1 := string(u1);

  u2 := 'c'+#$0327;    // 'c' followed by the combining cedilla
  s2 := string(u2);

I think your code will work ok with s1, and it will fail with s2.
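
The difference is easy to see at the byte level. A small sketch, with the UTF-8 bytes written out explicitly so that it does not depend on the source file encoding; the program and constant names are illustrative only:

```pascal
program TwoFormsDemo;
{$mode objfpc}{$H+}

const
  Precomposed = #$C3#$A7;      // U+00E7 encoded as UTF-8
  Decomposed  = 'c'#$CC#$A7;   // U+0063 followed by U+0327, encoded as UTF-8

var
  text1, text2: string;
begin
  text1 := 'fa' + Precomposed + 'ade';
  text2 := 'fa' + Decomposed + 'ade';

  // Both strings render as the same word, but their byte lengths differ.
  WriteLn('bytes: ', Length(text1), ' and ', Length(text2));

  // A byte-wise search for the precomposed form only finds it in text1.
  WriteLn(Pos(Precomposed, text1) > 0);
  WriteLn(Pos(Precomposed, text2) > 0);
end.
```

A search list that contains only the precomposed form therefore misses the second string, even though both look identical on screen.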

wp

  • Hero Member
  • *****
  • Posts: 8410
Re: How to check if a string has only extended Ascii characters in it ?
« Reply #7 on: March 08, 2021, 04:03:52 pm »
the goal is to use a huawei api to send messages
Before inventing fancy routines, make clear what exactly you want to achieve: you want to send messages (mails? chat?) via a Huawei API. What does this API require regarding strings? UTF-8? UTF-16? UCS2, as you mention? Since you work with Lazarus, the standard encoding is UTF-8.
  • If Huawei wants UTF-8 there's nothing to do. Simply pass the Lazarus strings on to the API.
  • If they want UTF-16, declare the strings that you want to pass to the API as widestring and simply assign your Lazarus strings to them; this way the UTF-8 of Lazarus will be converted to the UTF-16 required by Huawei. (You need FPC 3.0+ for this, but this is fairly standard nowadays.)
  • I don't know the best way to handle UCS2. What always works is to use the encoding routines in unit lconvencoding: UTF8ToUCS2LE or UTF8ToUCS2BE. Check the API specification to see whether you need the "LE" routine ("LE" = "little endian" = low-order byte of each character first) or the "BE" routine ("BE" = "big endian" = high-order byte first).
These are all ready-made functions waiting for you to apply. There's no need to write character conversion yourself.
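
As a rough illustration of the UTF-8 to UTF-16 step, here is a minimal console sketch. It is an assumption-laden variant: instead of the implicit assignment described above, it calls UTF8Decode from FPC's System unit, so the result does not depend on the default code page that a Lazarus application sets up. The variable names are made up.

```pascal
program EncodingDemo;
{$mode objfpc}{$H+}

var
  src: RawByteString;    // UTF-8 bytes, as a Lazarus string would hold them
  wide: UnicodeString;   // the same text as UTF-16 code units
  i: Integer;
begin
  src := 'gar'#$C3#$A7'on';   // "garçon" with ç written as explicit UTF-8 bytes
  wide := UTF8Decode(src);

  WriteLn('UTF-8 bytes:       ', Length(src));
  WriteLn('UTF-16 code units: ', Length(wide));

  // Dump each code unit as four hex digits (high byte first, i.e. big-endian order).
  for i := 1 to Length(wide) do
    Write(HexStr(LongInt(Ord(wide[i])), 4), ' ');
  WriteLn;
end.
```

For text limited to the Basic Multilingual Plane, these UTF-16 code units are the same values that UCS2 uses; only the byte order on the wire still has to match what the API expects.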

Mainly Lazarus trunk / fpc 3.2.0 / all 32-bit on Win-10, but many more...

winni

  • Hero Member
  • *****
  • Posts: 2356
Re: How to check if a string has only extended Ascii characters in it ?
« Reply #8 on: March 08, 2021, 04:40:02 pm »

Hi!

Yes, it is your mistake.
You have to put both  - s1 and s2 - in  spch.

Winni

engkin

  • Hero Member
  • *****
  • Posts: 2721
Re: How to check if a string has only extended Ascii characters in it ?
« Reply #9 on: March 08, 2021, 04:52:38 pm »
Yes, it is your mistake.
You have to put both  - s1 and s2 - in  spch.

That will not solve the problem. Even if I put both s1 and s2 in spch, your code checks one code point at a time, so it will return true for a plain 'c' in this case.

winni

  • Hero Member
  • *****
  • Posts: 2356
Re: How to check if a string has only extended Ascii characters in it ?
« Reply #10 on: March 08, 2021, 05:07:38 pm »
Hi!

You are right, engkin.

So he has to use an array of string for his "special chars".
With a string you can't solve it.
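
A minimal sketch of that idea, assuming UTF-8 strings: each array entry is a complete byte sequence, so the decomposed form is matched as a whole and a plain 'c' is not. Only three entries are listed to keep it short; the entries are written as explicit bytes, and the names are illustrative.

```pascal
program ArrayOfStringDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: True if s contains any of the listed sequences.
function HasSpecialChars(const s: string): Boolean;
const
  Special: array[0..2] of string = (
    #$C3#$A7,       // ç, precomposed (U+00E7)
    'c'#$CC#$A7,    // c + combining cedilla (U+0063 U+0327)
    #$C3#$B4        // ô (U+00F4)
  );
var
  k: Integer;
begin
  Result := False;
  for k := Low(Special) to High(Special) do
    if Pos(Special[k], s) > 0 then
      Exit(True);
end;

begin
  WriteLn(HasSpecialChars('cat'));                  // plain c only
  WriteLn(HasSpecialChars('fa'#$C3#$A7'ade'));      // precomposed ç
  WriteLn(HasSpecialChars('fac'#$CC#$A7'ade'));     // decomposed ç
end.
```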

Winni

fjabouley

  • Full Member
  • ***
  • Posts: 127
Re: How to check if a string has only extended Ascii characters in it ?
« Reply #11 on: March 08, 2021, 08:47:49 pm »
Thank you all for your answers!


Actually, I have to split the messages I send to the Huawei API depending on what the Huawei device detects.
Here is the JavaScript code used by the Huawei device:
Code: Javascript
var SMS_TEXT_MODE_UCS2 =  0;
var SMS_TEXT_MODE_7BIT =  1;
var SMS_TEXT_MODE_8BIT =  2;
var ASCII_CODE    = 127;
var g_SMS_UCS2_MAX_SIZE;
var g_SMS_8BIT_MAX_SIZE;
var g_SMS_7BIT_MAX_SIZE;
var g_content = null;
var g_text_mode = SMS_TEXT_MODE_7BIT;
var g_sms_length = 0;
var g_sms_num = 1;
var g_ucs2_num = 0;
var g_station;
var GSM_7BIT_NUM = 128;
var SMS_STR_NUM = 620;
var EXTENSION_ASCII = 9;
var g_ext_7bit_tab = [
  [20, 0x005E], [40, 0x007B], [41, 0x007D],
  [47, 0x005C], [60, 0x005B], [61, 0x007E],
  [62, 0x005D], [64, 0x007C], [101, 0x20AC]
];
var g_ext_7bit_tab_turkish = [
  [13, 0x001D], [20, 0x005E],
  [40, 0x007B], [41, 0x007D], [47, 0x005C], [60, 0x005B],
  [61, 0x007E], [62, 0x005D], [64, 0x007C], [71, 0x011E],
  [73, 0x0130], [83, 0x015E], [99, 0x00E7], [101, 0x20AC],
  [103, 0x011F], [105, 0x0131], [115, 0x015F]
];
var g_ext_7bit_tab_spanish = [
  [9, 0x00E7],  [20, 0x005E],
  [40, 0x007B], [41, 0x007D], [47, 0x005C], [60, 0x005B],
  [61, 0x007E], [62, 0x005D], [64, 0x007C], [65, 0x00C1],
  [73, 0x00CD], [79, 0x00D3], [85, 0x00DA], [97, 0x00E1],
  [101, 0x20AC], [105, 0x00ED], [111, 0x00F3], [117, 0x00FA]
];
var g_ext_7bit_tab_Portuguese = [
  [5, 0x00EA], [9, 0x00E7],   [11, 0x00D4],
  [12, 0x00F4], [14, 0x00C1], [15, 0x00E1],[18, 0x03A6],
  [19, 0x0393], [20, 0x005E], [21, 0x03A9], [22, 0x03A0],
  [23, 0x03A8], [24, 0x03A3], [25, 0x0398],
  [31, 0x00CA], [40, 0x007B], [41, 0x007D], [47, 0x005C],
  [60, 0x005B], [61, 0x007E], [62, 0x005D], [64, 0x007C],
  [65, 0x00C0], [73, 0x00CD], [79, 0x00D3], [85, 0x00DA],
  [91, 0x00C3], [92, 0x00D5], [97, 0x00C2], [101, 0x20AC],
  [105, 0x00ED], [111, 0x00F3], [117, 0x00FA], [123, 0x00E3],
  [124, 0x00F5], [127, 0x00E2]
];
var extension_char = 27;
var ENTER_CHAR = 10;
var CR_CHAR = 13;
var arrayGSM_7bit =
[
  0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08, 0x09, 0x0A, 0x0B, 0x0C, 0x0D, 0x0E, 0x0F,
  0x10, 0x11, 0x12, 0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19, 0x1A, 0x1B, 0x1C, 0x1D, 0x1E, 0x1F,
  0x20, 0x21, 0x22, 0x23, 0x24, 0x25, 0x26, 0x27, 0x28, 0x29, 0x2A, 0x2B, 0x2C, 0x2D, 0x2E, 0x2F,
  0x30, 0x31, 0x32, 0x33, 0x34, 0x35, 0x36, 0x37, 0x38, 0x39, 0x3A, 0x3B, 0x3C, 0x3D, 0x3E, 0x3F,
  0x40, 0x41, 0x42, 0x43, 0x44, 0x45, 0x46, 0x47, 0x48, 0x49, 0x4A, 0x4B, 0x4C, 0x4D, 0x4E, 0x4F,
  0x50, 0x51, 0x52, 0x53, 0x54, 0x55, 0x56, 0x57, 0x58, 0x59, 0x5A, 0x5B, 0x5C, 0x5D, 0x5E, 0x5F,
  0x60, 0x61, 0x62, 0x63, 0x64, 0x65, 0x66, 0x67, 0x68, 0x69, 0x6A, 0x6B, 0x6C, 0x6D, 0x6E, 0x6F,
  0x70, 0x71, 0x72, 0x73, 0x74, 0x75, 0x76, 0x77, 0x78, 0x79, 0x7A, 0x7B, 0x7C, 0x7D, 0x7E, 0x7F
];
var arrayGSM_7DefaultTable =
[
  0x0040, 0x00A3, 0x0024, 0x00A5, 0x00E8, 0x00E9, 0x00F9, 0x00EC, 0x00F2, 0x00C7, 0x000A, 0x00D8, 0x00F8, 0x000D, 0x00C5, 0x00E5,
  0x0394, 0x005F, 0x03A6, 0x0393, 0x039B, 0x03A9, 0x03A0, 0x03A8, 0x03A3, 0x0398, 0x039E, 0x001B, 0x00C6, 0x00E6, 0x00DF, 0x00C9,
  0x0020, 0x0021, 0x0022, 0x0023, 0x00A4, 0x0025, 0x0026, 0x0027, 0x0028, 0x0029, 0x002A, 0x002B, 0x002C, 0x002D, 0x002E, 0x002F,
  0x0030, 0x0031, 0x0032, 0x0033, 0x0034, 0x0035, 0x0036, 0x0037, 0x0038, 0x0039, 0x003A, 0x003B, 0x003C, 0x003D, 0x003E, 0x003F,
  0x00A1, 0x0041, 0x0042, 0x0043, 0x0044, 0x0045, 0x0046, 0x0047, 0x0048, 0x0049, 0x004A, 0x004B, 0x004C, 0x004D, 0x004E, 0x004F,
  0x0050, 0x0051, 0x0052, 0x0053, 0x0054, 0x0055, 0x0056, 0x0057, 0x0058, 0x0059, 0x005A, 0x00C4, 0x00D6, 0x00D1, 0x00DC, 0x00A7,
  0x00BF, 0x0061, 0x0062, 0x0063, 0x0064, 0x0065, 0x0066, 0x0067, 0x0068, 0x0069, 0x006A, 0x006B, 0x006C, 0x006D, 0x006E, 0x006F,
  0x0070, 0x0071, 0x0072, 0x0073, 0x0074, 0x0075, 0x0076, 0x0077, 0x0078, 0x0079, 0x007A, 0x00E4, 0x00F6, 0x00F1, 0x00FC, 0x00E0
];
var arrayGSM_7ExtTable =
[
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0x000A, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x005E, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0x0020, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0x007B, 0x007D, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x005C,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x005B, 0x007E, 0x005D, 0xFFFF,
  0x007C, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x20AC, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF
];
var arrayGSM_7TurkishExtTable  =
[
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x000A, 0xFFFF, 0xFFFF, 0x001D, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x005E, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x0020, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x007B, 0x007D, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x005C,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x005B, 0x007E, 0x005D, 0xFFFF,
  0x007C, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x011E, 0xFFFF, 0x0130, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0x015E, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0x00E7, 0xFFFF, 0x20AC, 0xFFFF, 0x011F, 0xFFFF, 0x0131, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0x015F, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF
];
var arrayGSM_7PortugueseExtTable =
[
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x00EA, 0xFFFF, 0xFFFF, 0xFFFF, 0x00E7, 0x000A, 0x00D4, 0x00F4, 0xFFFF, 0x00C1, 0x00E1,
  0xFFFF, 0xFFFF, 0x03A6, 0x0393, 0x005E, 0x03A9, 0x03A0, 0x03A8, 0x03A3, 0x0398, 0xFFFF, 0x0020, 0xFFFF, 0xFFFF, 0xFFFF, 0x00CA,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x007B, 0x007D, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x005C,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x005B, 0x007E, 0x005D, 0xFFFF,
  0x007C, 0x00C0, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x00CD, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x00D3,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x00DA, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x00C3, 0x00D5, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0x00C2, 0xFFFF, 0xFFFF, 0xFFFF, 0x20AC, 0xFFFF, 0xFFFF, 0xFFFF, 0x00ED, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x00F3,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x00FA, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x00E3, 0x00F5, 0xFFFF, 0xFFFF, 0x00E2
];
var arrayGSM_7SpanishExtTable =
[
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x00E7, 0x000A, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x005E, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x0020, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x007B, 0x007D, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x005C,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x005B, 0x007E, 0x005D, 0xFFFF,
  0x007C, 0x00C1, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x00CD, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x00D3,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x00DA, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0x00E1, 0xFFFF, 0xFFFF, 0xFFFF, 0x20AC, 0xFFFF, 0xFFFF, 0xFFFF, 0x00ED, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x00F3,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0x00FA, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF
];
var arrayGSM_7SpanishSpecialTable=[
  0x00A2, 0x00C0, 0x00C1, 0x00C2, 0x00C3, 0x00C8, 0x00CA, 0x00CB, 0x00CC, 0x00CD, 0x00CE, 0x00CF, 0x00D0, 0x00D2, 0x00D3, 0x00D4,
  0x00D5, 0x00D6, 0x00D9, 0x00DA, 0x00DB, 0x00DD, 0x00DE, 0x00E1, 0x00E2, 0x00E3, 0x00E7, 0x00EA, 0x00EB, 0x00ED, 0x00EE, 0x00EF,
  0x00F0, 0x00F3, 0x00F4, 0x00F5, 0x00F6, 0x00FA, 0x00FB, 0x00FD, 0x00FE, 0x00FF, 0x0102, 0x0104, 0x0105, 0x0106, 0x0107, 0x010C,
  0x010D, 0x010E, 0x010F, 0x0111, 0x0114, 0x0118, 0x0119, 0x011B, 0x0132, 0x0133, 0x0139, 0x013D, 0x0141, 0x0142, 0x0143, 0x0144,
  0x0147, 0x0148, 0x0154, 0x0155, 0x0158, 0x0159, 0x015A, 0x015B, 0x015E, 0x015F, 0x0160, 0x0161, 0x0162, 0x0163, 0x0164, 0x0165,
  0x0168, 0x016E, 0x016F, 0x0179, 0x017A, 0x017B, 0x017C, 0x017D, 0x017E, 0x01CE, 0x01D4, 0x0490, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF,
  0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF
];
function sms_isPhoneNumber(str) {
  var bRet = true;
  var rgExp = /^[+]{0,1}[*#0123456789]{1,20}$/;
  if (!(str.match(rgExp))) {
    bRet = false;
  }
  return bRet;
}
function check_extension_ascii_for_char_number(str) {
  var i = 0;
  var char_i;
  var char_i_code;
  var k = 0;
  var extension_ascii_num = 0;
  var charLenAtFirstSMSEnd = 1;
  var ext_tab = g_ext_7bit_tab;
  var normal_max_len = 160;
  var long_max_len = 153;
  switch( g_smsFeature.smscharlang ) {
    case 0:
      ext_tab = g_ext_7bit_tab;
      break;
    case 1:
      ext_tab = g_ext_7bit_tab_turkish;
      break;
    case 2:
      ext_tab = g_ext_7bit_tab_spanish;
      break;
    case 3:
      ext_tab = g_ext_7bit_tab_Portuguese;
      break;
    default:
      break;
  }
  if( 0 == g_smsFeature.smscharlang || "undefined" == typeof(g_smsFeature.smscharlang) ) {
    normal_max_len = 160;
    long_max_len = 153;
  } else {
    normal_max_len = 155;
    long_max_len = 149;
  }
  for(i=0; i<str.length; i++) {
    var charLen = 1;
    char_i = str.charAt(i);
    char_i_code = char_i.charCodeAt();
    for(charLen=1,k = 0;k< ext_tab.length;k++) {
      if(char_i_code == ext_tab[k][1]) {
        charLen = 2;
        break;
      }
    }
    if(1 == charLen) {
      extension_ascii_num++;
    } else {
      if(1 == charLenAtFirstSMSEnd) {
        if( (long_max_len-1) == extension_ascii_num ) {
          extension_ascii_num+=2;
          charLenAtFirstSMSEnd=2;
        } else if(( (long_max_len*2-1) == extension_ascii_num)
            || ( (long_max_len*3-1) == extension_ascii_num )
            || ( (long_max_len*4-1) == extension_ascii_num)) {
          extension_ascii_num+=3;
        } else {
          extension_ascii_num+=2;
        }
      } else {
        if(( (long_max_len-1)*2 == extension_ascii_num)
            || (((long_max_len-1)*3+1) == extension_ascii_num)
            || (((long_max_len-1)*4+2) == extension_ascii_num)) {
          extension_ascii_num+=3;
        } else {
          extension_ascii_num+=2;
        }
      }
    }
  }
  if(extension_ascii_num > normal_max_len && 2 == charLenAtFirstSMSEnd) {
    extension_ascii_num++;
  }
  return extension_ascii_num;
}
function check_extension_ascii_for_char_number_new(str) {
  var i = 0;
  var char_i;
  var char_i_code;
  var k = 0;
  var extension_ascii_num = 0;
  var charLenAtFirstSMSEnd = 1;
  var ext_tab = g_ext_7bit_tab;
  var normal_max_len = 160;
  var long_max_len = 153;
  var ext_tab_ = '';
  var tab_7bit_ext = true;
  switch( g_smsFeature.smscharlang ) {
    case 0:
      ext_tab_ = g_ext_7bit_tab;
      break;
    case 1:
      ext_tab_ = g_ext_7bit_tab_turkish;
      break;
    case 2:
      ext_tab_ = g_ext_7bit_tab_spanish;
      break;
    case 3:
      ext_tab_ = g_ext_7bit_tab_Portuguese;
      break;
    default:
      break;
  }
  g_sms_smscharlang = false;
  for(i=0; i<str.length; i++) {
    tab_7bit_ext = true;
    var charLen = 1;
    char_i = str.charAt(i);
    char_i_code = char_i.charCodeAt();
    for(charLen=1,k = 0;k< ext_tab.length;k++) {
      if(char_i_code == ext_tab[k][1]) {
        charLen = 2;
        normal_max_len = 160;
        long_max_len = 153;
        tab_7bit_ext = false;
        break;
      }
    }
    if(tab_7bit_ext) {
      for(charLen=1,k = 0;k< ext_tab_.length;k++) {
        if(char_i_code == ext_tab_[k][1]) {
          charLen = 2;
          normal_max_len = 155;
          long_max_len = 149;
          g_sms_smscharlang = true;
          break;
        }
      }
    }
    if(1 == charLen) {
      extension_ascii_num++;
    } else {
      if(1 == charLenAtFirstSMSEnd) {
        if( (long_max_len-1) == extension_ascii_num ) {
          extension_ascii_num+=2;
          charLenAtFirstSMSEnd=2;
        } else if(( (long_max_len*2-1) == extension_ascii_num)
            || ( (long_max_len*3-1) == extension_ascii_num )
            || ( (long_max_len*4-1) == extension_ascii_num)) {
          extension_ascii_num+=3;
        } else {
          extension_ascii_num+=2;
        }
      } else {
        if(( (long_max_len-1)*2 == extension_ascii_num)
            || (((long_max_len-1)*3+1) == extension_ascii_num)
            || (((long_max_len-1)*4+2) == extension_ascii_num)) {
          extension_ascii_num+=3;
        } else {
          extension_ascii_num+=2;
        }
      }
    }
  }
  if(extension_ascii_num > normal_max_len && 2 == charLenAtFirstSMSEnd) {
    extension_ascii_num++;
  }
  return extension_ascii_num;
}
function ucs2_number_check(str) {
  var i = 0;
  var char_i;
  var num_char_i;
  var j = 0;
  var flag;
  var ucs2_num_temp=0;
  var ext_Table = arrayGSM_7ExtTable;
  if (str.length ==0) {
    return 0;
  }
  switch( g_smsFeature.smscharlang ) {
    case 0:
      if("2" == g_convert_type) {
        ext_Table= arrayGSM_7SpanishExtTable;
      } else {
        ext_Table= arrayGSM_7ExtTable;
      }
      break;
    case 1:
      ext_Table = arrayGSM_7TurkishExtTable;
      break;
    case 2:
      ext_Table = arrayGSM_7SpanishExtTable;
      break;
    case 3:
      ext_Table = arrayGSM_7PortugueseExtTable;
      break;
    default:
      break;
  }
  for(i=0; i<str.length; i++) {
    flag = 0;
    char_i = str.charAt(i);
    num_char_i = char_i.charCodeAt();
    for(j = 0; j < GSM_7BIT_NUM; j++) {
      if ("2" == g_convert_type) {
        if (num_char_i == arrayGSM_7DefaultTable[j] ||
            (num_char_i == ext_Table[j] ||
            (num_char_i == arrayGSM_7SpanishSpecialTable[j]))) {
          flag = 1;
          break;
        }
      } else {
        if (num_char_i == arrayGSM_7DefaultTable[j] || (num_char_i == ext_Table[j] )) {
          flag = 1;
          break;
        }
      }
    }
    if (0 == flag) {
      ucs2_num_temp++;
    }
  }
  return ucs2_num_temp;
}
function CDMA_textmode_check(str) {
  var i = 0;
  var char_i;
  var num_char_i;
  var codeFormat = SMS_TEXT_MODE_7BIT;
  var ucs2_num_temp=0;
  if (str.length ==0) {
    return SMS_TEXT_MODE_7BIT;
  }
  for(i=0; i<str.length; i++) {
    char_i = str.charAt(i);
    num_char_i = char_i.charCodeAt();
    if((SMS_TEXT_MODE_7BIT == codeFormat)
        &&(0 <= num_char_i && 0x7F >= num_char_i)) {
      codeFormat = SMS_TEXT_MODE_7BIT;
    } else if((SMS_TEXT_MODE_7BIT == codeFormat || SMS_TEXT_MODE_8BIT == codeFormat)
        &&(0x7F < num_char_i && 0xFF >= num_char_i)) {
      codeFormat = SMS_TEXT_MODE_8BIT;
    } else if(0xFF < num_char_i) {
      codeFormat = SMS_TEXT_MODE_UCS2;
      break;
    }
  }
  return codeFormat;
}

function sms_numberCheck(str) {
  var N_or_Y_isCDMA_sms_hint_max_ucs2_characters_268=0;
  var N_or_Y_isCDMA_sms_hint_max_8bit_characters_532=0;
  var N_or_Y_isCDMA_sms_hint_max_ascii_characters_612=0;
  if (g_isCDMA)
  {
    g_SMS_UCS2_MAX_SIZE = 260;
    g_SMS_8BIT_MAX_SIZE = 540;
    g_SMS_7BIT_MAX_SIZE = 620;
    N_or_Y_isCDMA_sms_hint_max_ucs2_characters_268 = sms_hint_max_ucs2_characters_268.replace(/268/, "260");
    N_or_Y_isCDMA_sms_hint_max_8bit_characters_532 = sms_hint_max_8bit_characters_532.replace(/532/, "540");
    N_or_Y_isCDMA_sms_hint_max_ascii_characters_612 = sms_hint_max_ascii_characters_612.replace(/612/, "620");
  } else {
    g_SMS_UCS2_MAX_SIZE = 268;
    g_SMS_8BIT_MAX_SIZE = 532;
    g_SMS_7BIT_MAX_SIZE = 612;
    N_or_Y_isCDMA_sms_hint_max_ucs2_characters_268 = sms_hint_max_ucs2_characters_268;
    N_or_Y_isCDMA_sms_hint_max_8bit_characters_532 = sms_hint_max_8bit_characters_532;
    N_or_Y_isCDMA_sms_hint_max_ascii_characters_612 = sms_hint_max_ascii_characters_612;
  }
  var sms_left_length;
  var sms_num;
  var temp_length;
  var temp_enter_number;
  var normal_max_len = 160;
  var long_max_len = 153;
  var err_info = null;
  temp_length = str.length;
  if(SMS_TEXT_MODE_UCS2 == g_text_mode) {
    if (g_isCDMA) {
      normal_max_len = 70;
      long_max_len = 65;
    } else {
      normal_max_len = 70;
      long_max_len = 67;
    }
    if(temp_length > g_SMS_UCS2_MAX_SIZE) {
      err_info = N_or_Y_isCDMA_sms_hint_max_ucs2_characters_268;
    }
  } else if (SMS_TEXT_MODE_8BIT == g_text_mode) {
    if (g_isCDMA) {
      normal_max_len = 140;
      long_max_len = 135;
    } else {
      normal_max_len = 140;
      long_max_len = 133;
    }
    if(temp_length > g_SMS_8BIT_MAX_SIZE) {
      err_info = N_or_Y_isCDMA_sms_hint_max_8bit_characters_532;
    }
  } else if (SMS_TEXT_MODE_7BIT == g_text_mode && !g_isCDMA )
  {
    if(g_lang_edit == '-1') {
      temp_length = check_extension_ascii_for_char_number(str);
    } else {
      temp_length = check_extension_ascii_for_char_number_new(str);
    }
    if( g_lang_edit != '-1'&& !g_sms_smscharlang ) {
      normal_max_len = 160;
      long_max_len = 153;
      if(temp_length > g_SMS_7BIT_MAX_SIZE) {
        err_info = N_or_Y_isCDMA_sms_hint_max_ascii_characters_612;
      }
    } else if(g_lang_edit == '-1' && (0 == g_smsFeature.smscharlang || "undefined" == typeof(g_smsFeature.smscharlang)) ) {
      normal_max_len = 160;
      long_max_len = 153;
      if(temp_length > g_SMS_7BIT_MAX_SIZE) {
        err_info = N_or_Y_isCDMA_sms_hint_max_ascii_characters_612;
      }
    } else {
      normal_max_len = 155;
      long_max_len = 149;
      if(temp_length > long_max_len*4 ) {
        err_info = sms_hint_max_ascii_characters_596;
      }
    }
  } else if(SMS_TEXT_MODE_7BIT == g_text_mode && g_isCDMA) {
    normal_max_len = 160;
    long_max_len = 155;
    if(temp_length > long_max_len*4 ) {
      err_info = N_or_Y_isCDMA_sms_hint_max_ascii_characters_612;
    }
  }
  if( null != err_info ) {
    sms_clearAllErrorLabel();
    showErrorUnderTextbox("sms_count", err_info);
    g_sms_length = temp_length;
    button_enable("pop_send", "0");
    button_enable("pop_save_to_drafts", "0");
  } else {
    button_enable("pop_send", "1");
    button_enable("pop_save_to_drafts", "1");
    sms_clearAllErrorLabel();
  }
  if( temp_length <= normal_max_len ) {
    document.getElementById("sms_count").innerHTML = normal_max_len - temp_length + "(" + 1 + ")";
    sms_num = 1;
    if(temp_length <= 0) {
      g_content = str.substring(0);
    }
  } else if( (temp_length > normal_max_len ) && (temp_length <= long_max_len*4) ) {
    sms_num = parseInt(temp_length/long_max_len, 10)+1;
    if( 0 == (temp_length%long_max_len) ) {
      sms_num -= 1;
    }
    document.getElementById("sms_count").innerHTML = long_max_len*sms_num - temp_length + "(" + sms_num + ")";
  } else {
    var tmp =  parseInt((temp_length - long_max_len*4)/long_max_len, 10);
    var tmp2 = Math.floor(tmp);
    var tmp3 = (long_max_len*4 +(tmp2+1)*long_max_len) - temp_length;
    document.getElementById("sms_count").innerHTML =tmp3 + "(" + (tmp2+4+1)+ ")";
  }
  g_sms_num = sms_num;
  g_sms_length = temp_length;
}

function sms_contentChange(str) {
  if(g_isCDMA) {
    g_text_mode = CDMA_textmode_check(str);
  } else {
    if( $.browser.msie ) {
      if(g_net_mode_type==MACRO_NET_DUAL_MODE && g_net_mode_change==MACRO_NET_MODE_CHANGE) {
        g_ucs2_num=ucs2_number_check(str);
      } else {
        sms_contentDiffUCS2Num( str );
      }
    } else {
      g_ucs2_num =  ucs2_number_check(str);
    }
    if (g_ucs2_num >0) {
      g_text_mode = SMS_TEXT_MODE_UCS2;
    } else {
      g_text_mode = SMS_TEXT_MODE_7BIT;
    }
  }
  sms_numberCheck(str);
  g_content = str;
}
function sms_contentCheck() {
  var checkResult = true;
  if(SMS_TEXT_MODE_UCS2 == g_text_mode) {
    if(g_sms_length > g_SMS_UCS2_MAX_SIZE) {
      sms_clearAllErrorLabel();
      showErrorUnderTextbox("sms_count", IDS_sms_hint_content_too_long);
      $("#message_content").select();
      checkResult = false;
    }
  } else if(SMS_TEXT_MODE_7BIT == g_text_mode) {
    if(g_sms_length > g_SMS_7BIT_MAX_SIZE) {
      sms_clearAllErrorLabel();
      showErrorUnderTextbox("sms_count", IDS_sms_hint_content_too_long);
      $("#message_content").select();
      checkResult = false;
    }
  } else if(SMS_TEXT_MODE_8BIT == g_text_mode) {
    if(g_sms_length > g_SMS_8BIT_MAX_SIZE) {
      sms_clearAllErrorLabel();
      showErrorUnderTextbox("sms_count", IDS_sms_hint_content_too_long);
      $("#message_content").select();
      checkResult = false;
    }
  }
  return checkResult;
}


Best regards !
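
The check that decides between 7-bit and UCS2 in the code above (ucs2_number_check with smscharlang 0) can be sketched in Pascal roughly as follows. This is only an approximation under stated assumptions: the input is UTF-8, the two tables are copied from arrayGSM_7DefaultTable and from the non-0xFFFF entries of arrayGSM_7ExtTable, and the names InTable and Ucs2Count are invented for the sketch.

```pascal
program Gsm7CheckDemo;
{$mode objfpc}{$H+}

const
  // Copied from arrayGSM_7DefaultTable in the JavaScript above.
  Gsm7Default: array[0..127] of Word = (
    $0040, $00A3, $0024, $00A5, $00E8, $00E9, $00F9, $00EC, $00F2, $00C7, $000A, $00D8, $00F8, $000D, $00C5, $00E5,
    $0394, $005F, $03A6, $0393, $039B, $03A9, $03A0, $03A8, $03A3, $0398, $039E, $001B, $00C6, $00E6, $00DF, $00C9,
    $0020, $0021, $0022, $0023, $00A4, $0025, $0026, $0027, $0028, $0029, $002A, $002B, $002C, $002D, $002E, $002F,
    $0030, $0031, $0032, $0033, $0034, $0035, $0036, $0037, $0038, $0039, $003A, $003B, $003C, $003D, $003E, $003F,
    $00A1, $0041, $0042, $0043, $0044, $0045, $0046, $0047, $0048, $0049, $004A, $004B, $004C, $004D, $004E, $004F,
    $0050, $0051, $0052, $0053, $0054, $0055, $0056, $0057, $0058, $0059, $005A, $00C4, $00D6, $00D1, $00DC, $00A7,
    $00BF, $0061, $0062, $0063, $0064, $0065, $0066, $0067, $0068, $0069, $006A, $006B, $006C, $006D, $006E, $006F,
    $0070, $0071, $0072, $0073, $0074, $0075, $0076, $0077, $0078, $0079, $007A, $00E4, $00F6, $00F1, $00FC, $00E0
  );
  // The entries of arrayGSM_7ExtTable that are not 0xFFFF.
  Gsm7Ext: array[0..10] of Word = (
    $000A, $005E, $0020, $007B, $007D, $005C, $005B, $007E, $005D, $007C, $20AC
  );

// True if code occurs anywhere in the given table.
function InTable(code: Word; const table: array of Word): Boolean;
var
  k: Integer;
begin
  Result := False;
  for k := 0 to High(table) do
    if table[k] = code then
      Exit(True);
end;

// Counts the UTF-16 code units of s that are in neither table.
// A result above zero corresponds to SMS_TEXT_MODE_UCS2 in the JavaScript.
function Ucs2Count(const s: RawByteString): Integer;
var
  u: UnicodeString;
  i: Integer;
begin
  Result := 0;
  u := UTF8Decode(s);   // assumes s holds UTF-8
  for i := 1 to Length(u) do
    if not (InTable(Ord(u[i]), Gsm7Default) or InTable(Ord(u[i]), Gsm7Ext)) then
      Inc(Result);
end;

begin
  WriteLn(Ucs2Count('hello'));             // every letter is in the default table
  WriteLn(Ucs2Count('gar'#$C3#$A7'on'));   // ç (U+00E7) is in neither table; only Ç (U+00C7) is
end.
```

Working on decoded code units rather than on bytes sidesteps the variable byte length of UTF-8. The language-specific tables (Turkish, Spanish, Portuguese) are left out of this sketch.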

winni

  • Hero Member
  • *****
  • Posts: 2356
Re: How to check if a string has only extended Ascii characters in it ?
« Reply #12 on: March 08, 2021, 09:36:38 pm »

Here is the code used by the huawei device in js
Code: Javascript
....
return checkResult;
}


And are you sure that this is allowed?


Winni

fjabouley

  • Full Member
  • ***
  • Posts: 127
Re: How to check if a string has only extended Ascii characters in it ?
« Reply #13 on: March 09, 2021, 07:04:30 am »
Why wouldn't it be allowed? You mean sending code?

I just pasted some code from the web interface of the device.
It contains the function that checks whether the length of the message you want to send is OK, depending on which characters are in it.
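
The part count in sms_numberCheck boils down to a ceiling division once the limits are known. A simplified sketch, assuming the non-CDMA limits from the pasted code (160/153 for 7-bit, 70/67 for UCS2); it ignores the extension-table characters that count double and the separate branch for more than four parts. The function name SmsParts is invented here.

```pascal
program SmsPartsDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: number of SMS parts for a message of Len units,
// where NormalMax is the limit of a single SMS and LongMax the payload
// of each part of a concatenated SMS.
function SmsParts(Len, NormalMax, LongMax: Integer): Integer;
begin
  if Len <= NormalMax then
    Result := 1
  else
    Result := (Len + LongMax - 1) div LongMax;   // ceiling of Len / LongMax
end;

begin
  WriteLn(SmsParts(160, 160, 153));   // 7-bit text that still fits a single SMS
  WriteLn(SmsParts(161, 160, 153));   // one unit more, so it must be split
  WriteLn(SmsParts(71, 70, 67));      // UCS2 text just over the single-SMS limit
end.
```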
« Last Edit: March 09, 2021, 07:06:43 am by fjabouley »

PascalDragon

  • Hero Member
  • *****
  • Posts: 2985
  • Compiler Developer
Re: How to check if a string has only extended Ascii characters in it ?
« Reply #14 on: March 09, 2021, 09:10:37 am »
  • If they want UTF-16, declare the strings that you want to pass to the API as widestring and simply assign your Lazarus strings to them; this way the UTF-8 of Lazarus will be converted to the UTF-16 required by Huawei. (You need FPC 3.0+ for this, but this is fairly standard nowadays.)

Note: you should use UnicodeString, not WideString. While this only makes a difference on Windows, there it is an important difference: UnicodeString is reference counted (like AnsiString), while WideString is not.

 
