Author Topic: Is this string function efficient?  (Read 13305 times)

AlexTP

  • Hero Member
  • *****
  • Posts: 2672
    • UVviewsoft
Is this string function efficient?
« on: January 10, 2018, 09:43:24 am »
Is this function (it copies a UTF-8 string containing pure ASCII to a UnicodeString) OK in terms of CPU time?
Or does it need to be optimized using PChar/PWChar?

Code: Pascal
function SConvertUtf8ToWideForAscii(const S: string): UnicodeString;
var
  i: integer;
begin
  SetLength(Result, Length(S));
  for i:= 1 to Length(S) do
    Result[i]:= WideChar(Ord(S[i]));
end;

Thaddy

  • Hero Member
  • *****
  • Posts: 18729
  • To Europe: simply sell USA bonds: dollar collapses
Re: Is this string function efficient?
« Reply #1 on: January 10, 2018, 10:08:19 am »
Where's the UTF8 in your "string"? I assume you are using Lazarus?
But anyway:
Code: Pascal
program untitled;
{$ifdef fpc}{$mode delphi}{$H+}{$I-}{$endif}
var
  // Or a Lazarus "string"...Juha... FPC strings are either ShortString,
  // AnsiString or UnicodeString, never UTF8: that's only the case in Lazarus.
  // Confused???
  s1: UTF8String = 'Whatever';
  s2: UnicodeString;
begin
  s2 := s1;  // conversion is lossless and automatic and fast. Faster than you can do by hand...
  writeln(s2);
end.

So it is a simple assignment. And your code is wrong anyway: you should have used UnicodeChar, and the length of a UTF-8 string may differ from the length of the corresponding UnicodeString (the latter is a bug in your code).
But as I said: just a simple assignment will do.
« Last Edit: January 10, 2018, 10:16:59 am by Thaddy »
If Europe sells its USA bonds the USD will collapse. Europe can afford that given average state debts. The USA can't afford that. Just some advice...

AlexTP

  • Hero Member
  • *****
  • Posts: 2672
    • UVviewsoft
Re: Is this string function efficient?
« Reply #2 on: January 10, 2018, 10:20:27 am »
Yes, I use Lazarus, and the "string" variable holds raw lines of text. Usually I call WS:= UTF8Decode(s).
For pure ASCII I want a faster call, hence SConvertUtf8ToWideForAscii.
I think a simple assignment is slower than SConvertUtf8ToWideForAscii?

mse

  • Sr. Member
  • ****
  • Posts: 286
Re: Is this string function efficient?
« Reply #3 on: January 10, 2018, 10:28:04 am »
In my experience
Code: Pascal
type
 card8 = byte;
 pcard8 = ^card8;
 card16 = word;
 pcard16 = ^card16;

function SConvertUtf8ToWideForAscii(const S: string): UnicodeString;
var
 ps,pe: pcard8;
 pd: pcard16;
 i1: sizeint;
begin
 i1:= Length(S);
 SetLength(Result,i1);
 ps:= pointer(s);
 pe:= ps + i1;
 pd:= pointer(result);
 while ps < pe do begin
  pd^:= ps^;
  inc(ps);
  inc(pd);
 end;
end;
is faster.
« Last Edit: January 10, 2018, 10:34:20 am by mse »

Thaddy

  • Hero Member
  • *****
  • Posts: 18729
  • To Europe: simply sell USA bonds: dollar collapses
Re: Is this string function efficient?
« Reply #4 on: January 10, 2018, 10:30:29 am »
Quote from: AlexTP
Yes, I use Lazarus, and the "string" variable holds raw lines of text. Usually I call WS:= UTF8Decode(s).
For pure ASCII I want a faster call, hence SConvertUtf8ToWideForAscii.
I think a simple assignment is slower than SConvertUtf8ToWideForAscii?

No, it is not. I just timed it. A simple assignment is faster by about 20% (mse: take note).
And YOU could have timed it yourself. >:( ;D And fix the bug, btw: Length(UTF8) is different from Length(Unicode)...
« Last Edit: January 10, 2018, 10:37:45 am by Thaddy »
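For anyone who wants to time this themselves, here is a minimal benchmark sketch. It is not the test Thaddy ran; the program name, the helper name WidenAscii, the sample text and the iteration count are all assumptions made for illustration. It compares the manual loop from the first post, UTF8Decode, and the plain assignment on a pure-ASCII input. Results will depend on compiler version, optimization settings and platform, so no numbers are given here.

```pascal
program bench;
{$mode objfpc}{$H+}
uses
  SysUtils;

const
  Iterations = 1000000; // arbitrary count (assumption); adjust for your machine

// Byte-by-byte widening as in the first post; only valid for pure ASCII input.
function WidenAscii(const S: RawByteString): UnicodeString;
var
  i: SizeInt;
begin
  SetLength(Result, Length(S));
  for i := 1 to Length(S) do
    Result[i] := WideChar(Ord(S[i]));
end;

var
  Src: UTF8String;
  Dst: UnicodeString;
  i: Integer;
  T0: QWord;
begin
  Src := 'The quick brown fox jumps over the lazy dog'; // pure ASCII sample

  T0 := GetTickCount64;
  for i := 1 to Iterations do
    Dst := WidenAscii(Src);
  WriteLn('manual loop : ', GetTickCount64 - T0, ' ms');

  T0 := GetTickCount64;
  for i := 1 to Iterations do
    Dst := UTF8Decode(Src);
  WriteLn('UTF8Decode  : ', GetTickCount64 - T0, ' ms');

  T0 := GetTickCount64;
  for i := 1 to Iterations do
    Dst := Src; // implicit UTF8String -> UnicodeString conversion
  WriteLn('assignment  : ', GetTickCount64 - T0, ' ms');
end.
```

GetTickCount64 only has millisecond resolution, which is why the loop count is large. Try several string lengths before drawing conclusions, since per-call overhead and per-character cost scale differently.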

mse

  • Sr. Member
  • ****
  • Posts: 286
Re: Is this string function efficient?
« Reply #5 on: January 10, 2018, 10:36:14 am »
Quote from: Thaddy
No it is not. I just timed it. A simple assignment is faster by about 20% (mse: take note)

Did you test my solution?

Thaddy

  • Hero Member
  • *****
  • Posts: 18729
  • To Europe: simply sell USA bonds: dollar collapses
Re: Is this string function efficient?
« Reply #6 on: January 10, 2018, 10:40:00 am »
No need to test, you just fell into the same trap: Length(UTF8) <> Length(UnicodeString), except in common cases and then only by accident.
And you know that! That's even worse...

mse

  • Sr. Member
  • ****
  • Posts: 286
Re: Is this string function efficient?
« Reply #7 on: January 10, 2018, 10:41:53 am »
For ASCII length(utf8) = length(utf16).
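
A small sketch to illustrate the length argument (the program and variable names are made up for this example). It uses UTF8Decode rather than the implicit assignment so that it does not depend on which widestring manager is installed. The non-ASCII string is built from raw bytes to avoid any dependence on the source file encoding.

```pascal
program lengths;
{$mode objfpc}{$H+}
var
  U8: UTF8String;
  W: UnicodeString;
begin
  // Pure ASCII: each character is one byte in UTF-8 and one UTF-16 code unit.
  U8 := 'abc';
  W := UTF8Decode(U8);
  WriteLn(Length(U8), ' ', Length(W)); // expected: 3 3

  // U+00E9 (e with acute) is two bytes in UTF-8 (C3 A9) but one UTF-16 code unit.
  SetLength(U8, 2);
  U8[1] := #$C3;
  U8[2] := #$A9;
  W := UTF8Decode(U8);
  WriteLn(Length(U8), ' ', Length(W)); // expected: 2 1
end.
```

So a loop that copies Length(S) bytes one-to-one is only correct when the caller has already established that S is pure ASCII.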

Thaddy

  • Hero Member
  • *****
  • Posts: 18729
  • To Europe: simply sell USA bonds: dollar collapses
Re: Is this string function efficient?
« Reply #8 on: January 10, 2018, 10:47:18 am »
Quote from: mse
For ASCII length(utf8) = length(utf16).

Just for ASCII and ONLY for ASCII... >:( >:( ;D ;D You know that... >:D

The question is about converting a UTF8String (which only became clear after I asked) to a UnicodeString. If you do that in a generic function like yours or the OP's, you will run into trouble. It is plain wrong.
The simple assignment is foolproof and takes everything into account.
« Last Edit: January 10, 2018, 10:50:30 am by Thaddy »

mse

  • Sr. Member
  • ****
  • Posts: 286
Re: Is this string function efficient?
« Reply #9 on: January 10, 2018, 10:50:07 am »
Please read the first post from Alextp.

Thaddy

  • Hero Member
  • *****
  • Posts: 18729
  • To Europe: simply sell USA bonds: dollar collapses
Re: Is this string function efficient?
« Reply #10 on: January 10, 2018, 10:51:19 am »
Please read the second post from Alextp... I am not spending any more time on this. Your answer is wrong (it is not complete, and you use pointers). Pure ASCII? Which codepage?
« Last Edit: January 10, 2018, 10:53:15 am by Thaddy »

AlexTP

  • Hero Member
  • *****
  • Posts: 2672
    • UVviewsoft
Re: Is this string function efficient?
« Reply #11 on: January 10, 2018, 10:53:08 am »
I use the function only for ASCII text. Earlier in the code there is a check: if Length(UTF8Decode(S))=Length(S), then the item is marked as ASCII.

@mse
thanks...

The question was only about ASCII strings.
« Last Edit: January 10, 2018, 10:54:42 am by Alextp »
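
As a side note on the ASCII check described above: decoding the whole string just to compare lengths does the expensive work up front. A possible alternative, sketched here under the assumption that "ASCII" means every byte is below $80, is to scan the bytes directly. The name IsPureAscii is hypothetical and not part of any library.

```pascal
program asciicheck;
{$mode objfpc}{$H+}

// True when every byte of S is below $80, i.e. S is pure 7-bit ASCII.
// In UTF-8, all bytes of a multi-byte sequence are >= $80, so one
// pass over the bytes is enough and no decoding is needed.
function IsPureAscii(const S: RawByteString): Boolean;
var
  i: SizeInt;
begin
  for i := 1 to Length(S) do
    if Ord(S[i]) >= $80 then
      Exit(False);
  Result := True;
end;

begin
  WriteLn(IsPureAscii('plain text'));      // ASCII only
  WriteLn(IsPureAscii('caf'#$C3#$A9));     // contains the UTF-8 bytes of U+00E9
end.
```

The scan stops at the first non-ASCII byte and allocates nothing, whereas the UTF8Decode comparison always builds a temporary UnicodeString.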

Thaddy

  • Hero Member
  • *****
  • Posts: 18729
  • To Europe: simply sell USA bonds: dollar collapses
Re: Is this string function efficient?
« Reply #12 on: January 10, 2018, 10:54:42 am »
You should not thank mse.  Even pure ASCII will run you into trouble. Use the simple assignment.

mse

  • Sr. Member
  • ****
  • Posts: 286
Re: Is this string function efficient?
« Reply #13 on: January 10, 2018, 10:56:22 am »
Quote from: AlexTP
@mse
thanks...

Any numbers on speed comparison?

AlexTP

  • Hero Member
  • *****
  • Posts: 2672
    • UVviewsoft
Re: Is this string function efficient?
« Reply #14 on: January 10, 2018, 10:58:01 am »
@Thaddy
A simple assignment will try to decode the UTF-8, i.e. parse it. @mse's function doesn't parse it, so isn't it faster?

 
