Lazarus

Announcements => Third party => Topic started by: avk on January 19, 2022, 06:45:12 am

Title: Alternative set of string-to-int conversion routines
Post by: avk on January 19, 2022, 06:45:12 am
Hi!
For a long time I had been meaning to write a string-to-integer conversion function that would be noticeably faster than the built-in Val().
Something like that now seems to exist (https://github.com/avk959/str2int-alter). Anyway, a quick and dirty benchmark against Val() from the current development version of FPC on my machine looks like this:
Code: Text
Int32:
Val(), score:        2853
TryChars2Int, score: 1137
Int64:
Val(), score:        3045
TryChars2Int, score: 1121
Parsing follows essentially the same rules as the built-in Val() (well, as far as I understand them). IMO there is only one significant difference from Val(): a lone leading zero is treated as the prefix of an octal string, so decimal strings cannot start with a zero.
The set also includes functions that accept an array of char and a PChar.
Feedback and comments are highly appreciated, if any.
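For readers who have not used Val() directly, here is a minimal sketch of the baseline that TryChars2Int is being compared against. It only calls the built-in Val(); the program name and the Show helper are made up for illustration, and the inputs are arbitrary samples, not taken from the benchmark.

```pascal
program ValBaseline;
{$mode objfpc}{$H+}

// Hypothetical helper: prints how the built-in Val() treats one input string.
procedure Show(const s: string);
var
  value: Int32;
  code: Integer;
begin
  Val(s, value, code);
  if code = 0 then
    WriteLn('"', s, '" -> ', value)
  else
    WriteLn('"', s, '" rejected at position ', code);
end;

begin
  Show('123');   // plain decimal
  Show('-42');   // a leading sign is accepted
  Show('$FF');   // hexadecimal prefix
  Show('%101');  // binary prefix
  Show('&17');   // octal prefix
  Show('12x');   // invalid character
end.
```

A non-zero code is the 1-based position of the first character Val() could not accept, so '12x' is rejected at position 3.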
Title: Re: Alternative set of string-to-int conversion routines
Post by: Thaddy on January 19, 2022, 08:06:46 am
Nice. One remark: there is a lot of duplicate code and that can be avoided.
There are two options I can see:
1. You could write generic functions, implemented and declared in the implementation section. That will (should) not affect speed since generics are templates.
2. Use the technique sometimes found in the RTL (it predates generics): a header include file redefines the type and then pulls in the same implementation include file for each routine, so the implementation include sees the correct type. This should not affect speed either.

But my compliments. I see the same speed gain, percentage-wise.
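Option 1 might look roughly like the following sketch. It is not code from str2int-alter: the unit name, TryDigits, and the two wrappers are hypothetical, and the helper only handles unsigned decimal digits (no sign, prefix, or whitespace), so it is far simpler than the real routines. The point is only that the generic helper lives in the implementation section while the interface exposes plain functions.

```pascal
unit GenericHelperSketch;
{$mode objfpc}{$H+}

interface

// Only the non-generic wrappers are visible to users of the unit.
function TryDigitsToInt32(const s: string; out aValue: Int32): Boolean;
function TryDigitsToInt64(const s: string; out aValue: Int64): Boolean;

implementation

// Hypothetical generic helper, declared in the implementation section only.
// aMax is the largest value T can hold; the caller supplies it.
generic function TryDigits<T>(const s: string; const aMax: T; out aValue: T): Boolean;
var
  i: SizeInt;
  digit: Int32;
  acc: T;
begin
  aValue := Default(T);
  acc := Default(T);
  if s = '' then
    exit(False);
  for i := 1 to Length(s) do
  begin
    if not (s[i] in ['0'..'9']) then
      exit(False);
    digit := Ord(s[i]) - Ord('0');
    // Reject the input before acc * 10 + digit could exceed aMax.
    if acc > (aMax - digit) div 10 then
      exit(False);
    acc := acc * 10 + digit;
  end;
  aValue := acc;
  Result := True;
end;

function TryDigitsToInt32(const s: string; out aValue: Int32): Boolean;
begin
  Result := specialize TryDigits<Int32>(s, High(Int32), aValue);
end;

function TryDigitsToInt64(const s: string; out aValue: Int64): Boolean;
begin
  Result := specialize TryDigits<Int64>(s, High(Int64), aValue);
end;

end.
```

Each wrapper gets its own specialization, so the duplicated bodies are produced by the compiler rather than maintained by hand.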
Title: Re: Alternative set of string-to-int conversion routines
Post by: zamtmn on January 19, 2022, 08:15:50 am
Please add
Code: Pascal
function TryStr2xxx(const s: string; const aIndex, aCount: SizeInt; out aValue: xxx): Boolean; inline;
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 19, 2022, 08:35:43 am
@Thaddy, thank you.
You're right, a lot of code is duplicated, and I don't yet see a way to avoid that without sacrificing performance. On the other hand, this set is still a proof of concept.

@zamtmn, doesn't this code
Code: Pascal
  if TryChars2Int(s[5..15], MyInt) then
    ...
cover your case?
Title: Re: Alternative set of string-to-int conversion routines
Post by: zamtmn on January 19, 2022, 08:50:10 am
Code: Pascal
TryStrToInt(Copy(s,5,10),MyInt)
also covers my case, but the point is to make it fast, without unnecessary allocations and copies
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 19, 2022, 09:27:07 am
I just compiled this example:
Code: Pascal
program project1;
{$MODE OBJFPC}{$H+}
uses
  SysUtils, Str2IntAlter;

procedure Test(const s: string);
var
  I: Integer;
begin
  if TryChars2Int(s[22..Length(s)], I) then
    WriteLn('I = ', I)
  else
    WriteLn('Ooooppss!');
end;

var
  s: string = '';
begin
  Randomize;
  s := 'it is random integer:  ' + Random(MaxInt).ToString;
  Test(s);
  ReadLn;
end.

To call TryChars2Int(s[22..Length(s)], I) FPC generated the following code:
32 bits
Code: ASM
# [10] if TryChars2Int(s[22..Length(s)], I) then
        movl    %eax,%edx
        testl   %eax,%eax
        je      .Lj5
        movl    -4(%edx),%edx
.Lj5:
        subl    $22,%edx
        movl    %esp,%ecx
        addl    $21,%eax
        call    STR2INTALTER_$$_TRYCHARS2INT$array_of_CHAR$LONGINT$$BOOLEAN
        testb   %al,%al
        je      .Lj7
# [11] WriteLn('I = ', I)
64 bits
Code: ASM
# [10] if TryChars2Int(s[22..Length(s)], I) then
        movq    %rcx,%rdx
# Peephole Optimization: %rdx = %rcx; changed to minimise pipeline stall (MovXXX2MovXXX)
        testq   %rcx,%rcx
        je      .Lj5
        movq    -8(%rdx),%rdx
.Lj5:
        subq    $22,%rdx
        leaq    32(%rsp),%r8
        leaq    21(%rax),%rcx
        call    STR2INTALTER_$$_TRYCHARS2INT$array_of_CHAR$LONGINT$$BOOLEAN
        testb   %al,%al
        je      .Lj7
# [11] WriteLn('I = ', I)

There doesn't seem to be any unnecessary copying going on.
Title: Re: Alternative set of string-to-int conversion routines
Post by: zamtmn on January 19, 2022, 09:54:08 am
Oh yes, I completely forgot that
Code: Pascal
TryChars2Int(const a: array of char; ...
this is not equivalent to
Code: Pascal
type
  TTest=array of char;
procedure TryChars2Int(const a:TTest; ...
the request is closed :-[
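The distinction can be seen in a small sketch. The function names and the sample string below are hypothetical; the only thing taken from the thread is that a range of a string can be passed to a `const a: array of char` parameter, as in avk's test program.

```pascal
program OpenArrayVsDynArray;
{$mode objfpc}{$H+}

type
  TCharDynArray = array of Char;  // a dynamic array TYPE, not an open array

// Open array parameter: accepts a range of a string directly.
function CountDigits(const a: array of Char): SizeInt;
var
  i: SizeInt;
begin
  Result := 0;
  for i := 0 to High(a) do
    if a[i] in ['0'..'9'] then
      Inc(Result);
end;

// Dynamic array parameter: the caller must supply a TCharDynArray value.
function CountDigitsDyn(const a: TCharDynArray): SizeInt;
var
  i: SizeInt;
begin
  Result := 0;
  for i := 0 to High(a) do
    if a[i] in ['0'..'9'] then
      Inc(Result);
end;

var
  s: string = 'abc12345xyz';
  d: TCharDynArray;
begin
  // Characters 4..8 ('12345') are passed as an open array.
  WriteLn(CountDigits(s[4..8]));

  // For the dynamic-array version the same characters are first
  // copied into a separately allocated array.
  SetLength(d, 5);
  Move(s[4], d[0], 5 * SizeOf(Char));
  WriteLn(CountDigitsDyn(d));
end.
```

Both calls count five digits; only the second one needs the extra allocation and copy that zamtmn wanted to avoid.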
Title: Re: Alternative set of string-to-int conversion routines
Post by: Seenkao on January 19, 2022, 09:22:23 pm
Thanks for posting the code!
Now I will know that I am going in the right direction! :)
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 20, 2022, 02:25:55 pm
Following Thaddy's remarks, I tried to reduce the code duplication somehow.
Generic functions need to be declared in the interface part of the unit, and since they are helpers, this is not very good.
Includes are also not suitable, because I want everything to stay in one unit.
As a result, as an experiment, I settled on macros. The code looks unreadable, of course, but it seems that it would not be much better with includes.
Title: Re: Alternative set of string-to-int conversion routines
Post by: zamtmn on January 20, 2022, 03:09:18 pm
It has become very ugly. In my opinion, the best solution would still be generics and two units:
Str2IntAlter - for use in the main unit
Str2IntAlterInternal - used by Str2IntAlter, contains the generic helper functions
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 20, 2022, 03:36:03 pm
OK, I get it, I need to think about it some more. In any case, rolling back to the previous version is no problem.
But I really want everything to be in one self-sufficient unit.
Title: Re: Alternative set of string-to-int conversion routines
Post by: Thaddy on January 20, 2022, 04:38:47 pm
In principle, using generics (as I suggested too) should not have a speed penalty at all.
Title: Re: Alternative set of string-to-int conversion routines
Post by: MarkMLl on January 20, 2022, 06:58:36 pm
Quote from: Thaddy
In principle, using generics (as I suggested too) should not have a speed penalty at all.

In principle. In practice I find that debug sessions on my code regularly take me through the generics unit, which suggests that it /does/ interpose code in various places even when not used explicitly.

MarkMLl
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 20, 2022, 07:20:06 pm
Lyrical digression.
As I mentioned above, I have wanted to write a faster function for converting a string to an integer for quite a long time, or more precisely, since this epic story (https://forum.lazarus.freepascal.org/index.php/topic,46103.msg327460.html#msg327460).
Back then, beating Windows file mapping combined with NtDll took some completely crazy solutions.
I hope it can now be done without much strain.
Title: Re: Alternative set of string-to-int conversion routines
Post by: Seenkao on January 20, 2022, 10:43:05 pm
Quote
Int32:
Val(), score:        3073
rejected:            0

TryChars2Int, score: 1486
rejected:            0

sc_StrToLongWord or sc_StrToInt, score: 1104
rejected:            0

Int64:
Val(), score:        3075
rejected:            0

TryChars2Int, score: 1627
rejected:            0

sc_StrToQWord or sc_StrToInt64, score: 1072
rejected:            0

Press any key to exit...
I checked my code in your demo. Don't take it as a "challenge". People like your code more, and they think that your code can be used to "pull" numbers out of text. Perhaps they are right.
"Fancy" solutions are not always the best. When converting strings, lookup tables are convenient only for the hexadecimal system. Your code helped me solve a few small problems in my code. The ideas are far from new, though... I had just forgotten about them.
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 21, 2022, 06:44:01 am
I'm glad I was able to help you.
Title: Re: Alternative set of string-to-int conversion routines
Post by: Thaddy on January 21, 2022, 08:53:02 am
Quote from: MarkMLl
Quote from: Thaddy
In principle, using generics (as I suggested too) should not have a speed penalty at all.

In principle. In practice I find that debug sessions on my code regularly take me through the generics unit, which suggests that it /does/ interpose code in various places even when not used explicitly.

MarkMLl

I believe this is only true if you use one of the generic libraries, NOT for completely self defined generic code as one can see from the generated assembler.
E.g. this code, that only uses the system unit:
Code: Pascal
{$mode delphi}
type
  // Type-level specializations of TArray<T>
  TIntArray = TArray<integer>;
  TStringArray = TArray<string>;

  // Exchange two values of any type T
  procedure swap<T>(var left,right:T);inline;
  var
    temp:T;
  begin
    temp:=left;
    left:=right;
    right:=temp;
  end;

  // Reverse a dynamic array in place by swapping from both ends
  procedure Reverse<T>(var Value:TArray<T>);inline;
  var
    i: Integer = 0;
  begin
    if Length(Value) > 0 then
      for i := Low(Value) to High(Value) div 2 do
        Swap<T>(Value[i],Value[High(Value) - i]);
  end;

var
  i:integer;
  s:string;
  si:TintArray = [1,2,3,4,5,6,7,8,9,10];
  ss:TStringArray = ['there', 'are', 'more', 'things', 'in', 'heaven', 'and', 'earth',', ', 'Horatio'];
begin
  reverse<integer>(si);
  for i in si do write(i:2);
  writeln;
  reverse<string>(ss);
  for s in ss do write(s:length(s)+1);
end.
The assembler shows that the "templates" are filled in *before* any assembler code is generated.
There is simply no reference to anything "generics" in the generated code apart from possibly internal namings.
There is simply no codepath related to generics at all.
I compiled this with -al -CX -XXs -O4
I also tested a non-generic version of Int reverse: Hey, near identical code!
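The non-generic counterpart was not posted, so the following is only a guess at what such a comparison program could look like; the names ReversePlain and ReverseInts are hypothetical. Compiling it with the same switches (-al -CX -XXs -O4) and comparing the .s file against the generic version is one way to check the claim on your own machine.

```pascal
program ReversePlain;
{$mode delphi}

type
  TIntArray = TArray<Integer>;

// Hand-written, non-generic counterpart of Reverse<T> for Integer elements.
procedure ReverseInts(var Value: TIntArray);
var
  i, temp: Integer;
begin
  if Length(Value) > 0 then
    for i := Low(Value) to High(Value) div 2 do
    begin
      // Inlined equivalent of Swap<T>
      temp := Value[i];
      Value[i] := Value[High(Value) - i];
      Value[High(Value) - i] := temp;
    end;
end;

var
  i: Integer;
  si: TIntArray = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
begin
  ReverseInts(si);
  for i in si do
    Write(i:3);
  WriteLn;
end.
```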
Title: Re: Alternative set of string-to-int conversion routines
Post by: MarkMLl on January 21, 2022, 09:31:58 am
Quote from: Thaddy
I believe this is only true if you use one of the generic libraries, NOT for completely self defined generic code as one can see from the generated assembler.

I can only assume that it's somewhere in the standard FPC or LCL libraries. It's definitely not anything that I've done explicitly.

MarkMLl
Title: Re: Alternative set of string-to-int conversion routines
Post by: Thaddy on January 21, 2022, 09:41:37 am
Quote from: MarkMLl
I can only assume that it's somewhere in the standard FPC or LCL libraries. It's definitely not anything that I've done explicitly.

MarkMLl

That seems strange to me. My example only uses system and no libraries at all.
In fact, it shows using generics for those string to int conversions is a viable option without loss of speed.
The generated code is even inlined.

If you can disprove this, please show me what happens on your side.
Title: Re: Alternative set of string-to-int conversion routines
Post by: zamtmn on January 21, 2022, 10:02:32 am
Generics still make debugging very difficult, and may even break the build because of compiler bugs. But they make the code readable, and the bugs will be fixed (though not as fast as I would like).
Title: Re: Alternative set of string-to-int conversion routines
Post by: Thaddy on January 21, 2022, 10:09:06 am
So what is difficult? As long as you specialize at type level the debugger has no issues, at least I did not encounter them. I am aware that if you specialize at var level the debugger does not always play nice. That is an important distinction.
Code: Pascal
type
  TIntArray = TArray<integer>;
  TStringArray = TArray<string>;
Note I use fp + GDB.
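As I read the distinction, "type level" means giving the specialization its own type name, while "var level" means specializing inline in a variable declaration. A minimal sketch of the two spellings follows; the program and variable names are hypothetical, and it only illustrates the syntax, not any debugger behaviour.

```pascal
program SpecializeLevels;
{$mode delphi}

type
  // Type-level specialization: the specialized type gets its own name.
  TIntArray = TArray<Integer>;

var
  a: TIntArray;        // uses the named specialization
  b: TArray<Integer>;  // var-level: specialized inline at the declaration
begin
  SetLength(a, 3);
  SetLength(b, 3);
  a[0] := 1;
  b[0] := 1;
  WriteLn(Length(a), ' ', Length(b));
end.
```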
Title: Re: Alternative set of string-to-int conversion routines
Post by: zamtmn on January 21, 2022, 10:14:27 am
I have often seen the currently executed line displayed incorrectly: it is shifted 1-2 lines up or down.
Title: Re: Alternative set of string-to-int conversion routines
Post by: Thaddy on January 21, 2022, 10:15:49 am
Quote from: zamtmn
I have often seen the currently executed line displayed incorrectly: it is shifted 1-2 lines up or down.
How did you specialize? type level or var level? Can you give us a small example to reproduce?

GDB is GNU gdb unicode (GDB) 9.2
Title: Re: Alternative set of string-to-int conversion routines
Post by: MarkMLl on January 21, 2022, 10:22:27 am
Quote from: Thaddy
Quote from: MarkMLl
I can only assume that it's somewhere in the standard FPC or LCL libraries. It's definitely not anything that I've done explicitly.

MarkMLl

That seems strange to me. My example only uses system and no libraries at all.
In fact, it shows using generics for those string to int conversions is a viable option without loss of speed.
The generated code is even inlined.

If you can disprove this, please show me what happens on your side.

I'll report back if I can spot what's going on.

Please note that I'm not commenting on an observed performance issue, merely that the generics library is insinuating itself /somewhere/ so there's the /potential/ for a performance issue if a shim is being traversed e.g. inside a tight loop.

MarkMLl
Title: Re: Alternative set of string-to-int conversion routines
Post by: zamtmn on January 21, 2022, 10:23:59 am
I specialize in different ways, but I try to do it at type level. I'm talking about debugging inside generic classes.
Title: Re: Alternative set of string-to-int conversion routines
Post by: Thaddy on January 21, 2022, 10:25:38 am
Quote from: MarkMLl
Please note that I'm not commenting on an observed performance issue, merely that the generics library is insinuating itself /somewhere/ so there's the /potential/ for a performance issue if a shim is being traversed e.g. inside a tight loop.

MarkMLl

I think the latter is wrong. But I am open to further testing.
Title: Re: Alternative set of string-to-int conversion routines
Post by: Thaddy on January 21, 2022, 10:27:29 am
Quote from: zamtmn
I specialize in different ways, but I try to do it at type level. I'm talking about debugging inside generic classes.
That is no issue if you specialize at type level and debug the specialization, not the template - which is not there for the debugger to see.
Title: Re: Alternative set of string-to-int conversion routines
Post by: zamtmn on January 21, 2022, 10:32:26 am
Quote from: Thaddy
That is no issue if you specialize at type level and debug the specialization, not the template - which is not there for the debugger to see.
I know your point of view - it's not a compiler problem, it's a bad programmer's problem)) But no. Debugging is also necessary inside the generic classes, and it is very difficult.
Title: Re: Alternative set of string-to-int conversion routines
Post by: Thaddy on January 21, 2022, 10:38:18 am
Quote from: zamtmn
I know your point of view - it's not a compiler problem, it's a bad programmer's problem)) But no. Debugging is also necessary inside the generic classes, and it is very difficult.
Well, no, that is not what I wrote. Both ways to specialize are fully legal.
Specialized at type level it isn't even possible - by sheer logic - to debug the template code, afaik there is none.
Maybe @PascalDragon can help us here.... Sarah? Or @FPK himself?

Note that this may seem to some a side discussion but it isn't: once we fully understand what is going on, we can help and improve on the original subject's code. At least that is how I think: we have to establish the facts.
Title: Re: Alternative set of string-to-int conversion routines
Post by: zamtmn on January 21, 2022, 12:03:08 pm
Maybe this is a Lazarus problem; I can't reproduce it in a minimal example. I think that after some changes to the generic source code the EXE or PPU files stay stale, and this causes the executed lines to be displayed incorrectly. It's related to https://gitlab.com/freepascal.org/fpc/source/-/issues/39387
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 21, 2022, 01:08:26 pm
Of course, I have some understanding of what generics can currently do, but for the reason given above that has not yet moved me to choose them.
On the other hand, is duplication of code in a library so bad? After all, in that case a change to the code of one function will not affect the others.
Title: Re: Alternative set of string-to-int conversion routines
Post by: PascalDragon on January 21, 2022, 03:51:40 pm
Quote from: avk
Generic functions need to be declared in the interface part of the unit, and since they are helpers, this is not very good.

Why do you say that generic functions need to be declared in the interface part of the unit?

Quote from: Thaddy
Quote from: zamtmn
I know your point of view - it's not a compiler problem, it's a bad programmer's problem)) But no. Debugging is also necessary inside the generic classes, and it is very difficult.

Well, no, that is not what I wrote. Both ways to specialize are fully legal.
Specialized at type level it isn't even possible - by sheer logic - to debug the template code, afaik there is none.
Maybe @PascalDragon can help us here.... Sarah? Or @FPK himself?

If there are concrete issues with debugging, then we need a reproducible example whatever the issue is, because, yes, such things should be fixed, whether it's about stepping through generic code correctly or about not accidentally entering generic code when one doesn't want to, as MarkMLl hinted at.

One potential issue might be (not tested): the generic code is compiled without debug information, but code that uses it is compiled with and thus the specialization will have debug information as well. Thus when stepping through the latter code one might suddenly land in the generic code which one assumed is without debug information.
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 21, 2022, 04:13:01 pm
Quote from: PascalDragon
Why do you say that generic functions need to be declared in the interface part of the unit?

I'm probably behind the times: it used to be like that some time ago, and this time I didn't even try to check. :-[
Thank you.
Title: Re: Alternative set of string-to-int conversion routines
Post by: MarkMLl on January 21, 2022, 04:49:00 pm
Quote from: PascalDragon
One potential issue might be (not tested): the generic code is compiled without debug information, but code that uses it is compiled with and thus the specialization will have debug information as well. Thus when stepping through the latter code one might suddenly land in the generic code which one assumed is without debug information.

On the other hand, I invariably use a locally-built FPC+Lazarus and looking at my build log for FPC 3.2.0 I see

make NO_GDB=1 CPU_TARGET=i386 OPT='-V3.0.4 -O- -gl -Xs- -vt' all |/usr/local/bin/fpc-filter-vt

Lazarus+LCL though is built with default options, so it could be there.

I'll try to keep an eye open for what I think I've seen, and will report back.

MarkMLl
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 22, 2022, 11:40:01 am
Some news from str2int-alter:
  current version has switched to generics;
  added define to force full compatibility with Val();
  added benchmarks for unsigned integers, now it looks like this on my virtual Linux machine:
Code: Text
SInt32:
Val(), score:        3071
TryChars2Int, score: 1250

SInt64:
Val(), score:        3131
TryChars2Int, score: 1177

UInt32:
Val(), score:        5619
TryChars2Int, score: 1248

UInt64:
Val(), score:        7015
TryChars2Int, score: 1280
Title: Re: Alternative set of string-to-int conversion routines
Post by: Thaddy on January 22, 2022, 12:01:58 pm
Nice gain.
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 24, 2022, 09:37:56 am
Sorry, couldn't resist, found the old benchmark archive from "Sorting and Counting" and ran it again using Str2IntAlter on a win-64 machine:
Code: Text
RandomRange = 1
Julkas1's time: 3.3850  #unique: 100000 #total: 10000000
Julkas2's time: 3.4820  #unique: 100000 #total: 10000000
  Akira's time: 3.1700  #unique: 100000 #total: 10000000
 Howard's time: 4.6500  #unique: 100000 #total: 10000000
   Avk1's time: 1.3300  #unique: 100000 #total: 10000000
   Avk2's time: 0.6100  #unique: 100000 #total: 10000000
  440bx's time: 1.5100  #unique: 100000 #total: 10000000
 BrunoK's time: 1.6900  #unique: 100000 #total: 10000000
BrunoK1's time: 1.1200  #unique: 100000 #total: 10000000

RandomRange = 2
Julkas1's time: 3.4000  #unique: 200000 #total: 10000000
Julkas2's time: 3.4500  #unique: 200000 #total: 10000000
  Akira's time: 3.3400  #unique: 200000 #total: 10000000
 Howard's time: 4.6900  #unique: 200000 #total: 10000000
   Avk1's time: 1.3600  #unique: 200000 #total: 10000000
   Avk2's time: 0.6400  #unique: 200000 #total: 10000000
  440bx's time: 1.6200  #unique: 200000 #total: 10000000
 BrunoK's time: 1.7800  #unique: 200000 #total: 10000000
BrunoK1's time: 1.1700  #unique: 200000 #total: 10000000

RandomRange = 3
Julkas1's time: 3.4300  #unique: 300000 #total: 10000000
Julkas2's time: 3.4300  #unique: 300000 #total: 10000000
  Akira's time: 3.4300  #unique: 300000 #total: 10000000
 Howard's time: 4.7500  #unique: 300000 #total: 10000000
   Avk1's time: 1.3900  #unique: 300000 #total: 10000000
   Avk2's time: 0.6700  #unique: 300000 #total: 10000000
  440bx's time: 1.6800  #unique: 300000 #total: 10000000
 BrunoK's time: 1.8300  #unique: 300000 #total: 10000000
BrunoK1's time: 1.2500  #unique: 300000 #total: 10000000

RandomRange = 4
Julkas1's time: 3.5400  #unique: 400000 #total: 10000000
Julkas2's time: 3.4700  #unique: 400000 #total: 10000000
  Akira's time: 3.6200  #unique: 400000 #total: 10000000
 Howard's time: 4.8200  #unique: 400000 #total: 10000000
   Avk1's time: 1.4200  #unique: 400000 #total: 10000000
   Avk2's time: 0.7100  #unique: 400000 #total: 10000000
  440bx's time: 1.7400  #unique: 400000 #total: 10000000
 BrunoK's time: 1.8900  #unique: 400000 #total: 10000000
BrunoK1's time: 1.3100  #unique: 400000 #total: 10000000

RandomRange = 5
Julkas1's time: 3.5000  #unique: 500000 #total: 10000000
Julkas2's time: 3.5000  #unique: 500000 #total: 10000000
  Akira's time: 3.7200  #unique: 500000 #total: 10000000
 Howard's time: 4.9000  #unique: 500000 #total: 10000000
   Avk1's time: 1.4600  #unique: 500000 #total: 10000000
   Avk2's time: 0.7400  #unique: 500000 #total: 10000000
  440bx's time: 1.7900  #unique: 500000 #total: 10000000
 BrunoK's time: 1.9300  #unique: 500000 #total: 10000000
BrunoK1's time: 1.3400  #unique: 500000 #total: 10000000

RandomRange = 6
Julkas1's time: 3.5400  #unique: 600000 #total: 10000000
Julkas2's time: 3.5200  #unique: 600000 #total: 10000000
  Akira's time: 3.7700  #unique: 600000 #total: 10000000
 Howard's time: 5.0000  #unique: 600000 #total: 10000000
   Avk1's time: 1.4800  #unique: 600000 #total: 10000000
   Avk2's time: 0.7800  #unique: 600000 #total: 10000000
  440bx's time: 1.8400  #unique: 600000 #total: 10000000
 BrunoK's time: 1.9900  #unique: 600000 #total: 10000000
BrunoK1's time: 1.3900  #unique: 600000 #total: 10000000

RandomRange = 7
Julkas1's time: 3.5700  #unique: 700000 #total: 10000000
Julkas2's time: 3.5800  #unique: 700000 #total: 10000000
  Akira's time: 3.8300  #unique: 700000 #total: 10000000
 Howard's time: 5.0600  #unique: 700000 #total: 10000000
   Avk1's time: 1.5300  #unique: 700000 #total: 10000000
   Avk2's time: 0.8000  #unique: 700000 #total: 10000000
  440bx's time: 1.9000  #unique: 700000 #total: 10000000
 BrunoK's time: 2.0100  #unique: 700000 #total: 10000000
BrunoK1's time: 1.4200  #unique: 700000 #total: 10000000

RandomRange = 8
Julkas1's time: 3.5900  #unique: 799994 #total: 10000000
Julkas2's time: 3.6100  #unique: 799994 #total: 10000000
  Akira's time: 3.9000  #unique: 799994 #total: 10000000
 Howard's time: 5.1400  #unique: 799994 #total: 10000000
   Avk1's time: 1.5800  #unique: 799994 #total: 10000000
   Avk2's time: 0.8400  #unique: 799994 #total: 10000000
  440bx's time: 1.9400  #unique: 799994 #total: 10000000
 BrunoK's time: 2.0800  #unique: 799994 #total: 10000000
BrunoK1's time: 1.4700  #unique: 799994 #total: 10000000

RandomRange = 9
Julkas1's time: 3.6300  #unique: 899990 #total: 10000000
Julkas2's time: 3.6300  #unique: 899990 #total: 10000000
  Akira's time: 3.9400  #unique: 899990 #total: 10000000
 Howard's time: 5.2100  #unique: 899990 #total: 10000000
   Avk1's time: 1.5800  #unique: 899990 #total: 10000000
   Avk2's time: 0.8600  #unique: 899990 #total: 10000000
  440bx's time: 1.9800  #unique: 899990 #total: 10000000
 BrunoK's time: 2.0900  #unique: 899990 #total: 10000000
BrunoK1's time: 1.5200  #unique: 899990 #total: 10000000

RandomRange = 10
Julkas1's time: 3.6600  #unique: 999962 #total: 10000000
Julkas2's time: 3.6400  #unique: 999962 #total: 10000000
  Akira's time: 4.0100  #unique: 999962 #total: 10000000
 Howard's time: 5.2700  #unique: 999962 #total: 10000000
   Avk1's time: 1.6100  #unique: 999962 #total: 10000000
   Avk2's time: 0.9000  #unique: 999962 #total: 10000000
  440bx's time: 2.0200  #unique: 999962 #total: 10000000
 BrunoK's time: 2.1300  #unique: 999962 #total: 10000000
BrunoK1's time: 1.5500  #unique: 999962 #total: 10000000

repeatMillionsCount = 2
Julkas1's time: 0.8800  #unique: 734359 #total: 2000000
Julkas2's time: 0.8600  #unique: 734359 #total: 2000000
  Akira's time: 1.0700  #unique: 734359 #total: 2000000
 Howard's time: 1.2010  #unique: 734359 #total: 2000000
   Avk1's time: 0.4200  #unique: 734359 #total: 2000000
   Avk2's time: 0.2700  #unique: 734359 #total: 2000000
  440bx's time: 0.5000  #unique: 734359 #total: 2000000
 BrunoK's time: 0.5800  #unique: 734359 #total: 2000000
BrunoK1's time: 0.4700  #unique: 734359 #total: 2000000

repeatMillionsCount = 4
Julkas1's time: 1.7400  #unique: 794586 #total: 4000000
Julkas2's time: 1.5600  #unique: 794586 #total: 4000000
  Akira's time: 1.8300  #unique: 794586 #total: 4000000
 Howard's time: 2.1900  #unique: 794586 #total: 4000000
   Avk1's time: 0.7000  #unique: 794586 #total: 4000000
   Avk2's time: 0.4100  #unique: 794586 #total: 4000000
  440bx's time: 0.8900  #unique: 794586 #total: 4000000
 BrunoK's time: 0.9500  #unique: 794586 #total: 4000000
BrunoK1's time: 0.7400  #unique: 794586 #total: 4000000

repeatMillionsCount = 6
Julkas1's time: 2.2500  #unique: 799541 #total: 6000000
Julkas2's time: 2.2400  #unique: 799541 #total: 6000000
  Akira's time: 2.5200  #unique: 799541 #total: 6000000
 Howard's time: 3.1700  #unique: 799541 #total: 6000000
   Avk1's time: 0.9900  #unique: 799541 #total: 6000000
   Avk2's time: 0.5700  #unique: 799541 #total: 6000000
  440bx's time: 1.2500  #unique: 799541 #total: 6000000
 BrunoK's time: 1.3400  #unique: 799541 #total: 6000000
BrunoK1's time: 0.9800  #unique: 799541 #total: 6000000

repeatMillionsCount = 8
Julkas1's time: 2.9200  #unique: 799966 #total: 8000000
Julkas2's time: 2.9200  #unique: 799966 #total: 8000000
  Akira's time: 3.2100  #unique: 799966 #total: 8000000
 Howard's time: 4.5520  #unique: 799966 #total: 8000000
   Avk1's time: 1.8860  #unique: 799966 #total: 8000000
   Avk2's time: 0.9000  #unique: 799966 #total: 8000000
  440bx's time: 1.8020  #unique: 799966 #total: 8000000
 BrunoK's time: 1.8100  #unique: 799966 #total: 8000000
BrunoK1's time: 1.3400  #unique: 799966 #total: 8000000

repeatMillionsCount = 10
Julkas1's time: 3.7200  #unique: 799998 #total: 10000000
Julkas2's time: 3.6100  #unique: 799998 #total: 10000000
  Akira's time: 3.9300  #unique: 799998 #total: 10000000
 Howard's time: 5.1600  #unique: 799998 #total: 10000000
   Avk1's time: 1.5500  #unique: 799998 #total: 10000000
   Avk2's time: 0.8500  #unique: 799998 #total: 10000000
  440bx's time: 1.9400  #unique: 799998 #total: 10000000
 BrunoK's time: 2.0800  #unique: 799998 #total: 10000000
BrunoK1's time: 1.4800  #unique: 799998 #total: 10000000

repeatMillionsCount = 12
Julkas1's time: 4.3900  #unique: 799999 #total: 12000000
Julkas2's time: 4.2600  #unique: 799999 #total: 12000000
  Akira's time: 4.6000  #unique: 799999 #total: 12000000
 Howard's time: 6.1300  #unique: 799999 #total: 12000000
   Avk1's time: 1.9300  #unique: 799999 #total: 12000000
   Avk2's time: 0.9700  #unique: 799999 #total: 12000000
  440bx's time: 2.2900  #unique: 799999 #total: 12000000
 BrunoK's time: 2.4300  #unique: 799999 #total: 12000000
BrunoK1's time: 1.7100  #unique: 799999 #total: 12000000

repeatMillionsCount = 14
Julkas1's time: 4.9900  #unique: 799999 #total: 14000000
Julkas2's time: 4.9600  #unique: 799999 #total: 14000000
  Akira's time: 5.3200  #unique: 799999 #total: 14000000
 Howard's time: 7.1200  #unique: 799999 #total: 14000000
   Avk1's time: 2.0900  #unique: 799999 #total: 14000000
   Avk2's time: 1.0800  #unique: 799999 #total: 14000000
  440bx's time: 2.6300  #unique: 799999 #total: 14000000
 BrunoK's time: 2.8000  #unique: 799999 #total: 14000000
BrunoK1's time: 1.9800  #unique: 799999 #total: 14000000

repeatMillionsCount = 16
Julkas1's time: 5.6700  #unique: 800000 #total: 16000000
Julkas2's time: 5.6200  #unique: 800000 #total: 16000000
  Akira's time: 5.9900  #unique: 800000 #total: 16000000
 Howard's time: 8.1000  #unique: 800000 #total: 16000000
   Avk1's time: 2.3600  #unique: 800000 #total: 16000000
   Avk2's time: 1.2200  #unique: 800000 #total: 16000000
  440bx's time: 2.9800  #unique: 800000 #total: 16000000
 BrunoK's time: 3.1800  #unique: 800000 #total: 16000000
BrunoK1's time: 2.2000  #unique: 800000 #total: 16000000

repeatMillionsCount = 18
Julkas1's time: 6.3200  #unique: 800000 #total: 18000000
Julkas2's time: 6.2900  #unique: 800000 #total: 18000000
  Akira's time: 6.6800  #unique: 800000 #total: 18000000
 Howard's time: 9.0600  #unique: 800000 #total: 18000000
   Avk1's time: 2.6700  #unique: 800000 #total: 18000000
   Avk2's time: 1.3900  #unique: 800000 #total: 18000000
  440bx's time: 3.3400  #unique: 800000 #total: 18000000
 BrunoK's time: 3.5700  #unique: 800000 #total: 18000000
BrunoK1's time: 2.4400  #unique: 800000 #total: 18000000

repeatMillionsCount = 20
Julkas1's time: 7.0000  #unique: 800000 #total: 20000000
Julkas2's time: 6.9700  #unique: 800000 #total: 20000000
  Akira's time: 7.4000  #unique: 800000 #total: 20000000
 Howard's time: 10.0600 #unique: 800000 #total: 20000000
   Avk1's time: 2.9500  #unique: 800000 #total: 20000000
   Avk2's time: 1.5100  #unique: 800000 #total: 20000000
  440bx's time: 3.6700  #unique: 800000 #total: 20000000
 BrunoK's time: 3.9000  #unique: 800000 #total: 20000000
BrunoK1's time: 2.6900  #unique: 800000 #total: 20000000
Procedure Avk1 reads the input data from stdin, Avk2 loads the entire input file into TMemoryStream.
Title: Re: Alternative set of string-to-int conversion routines
Post by: PascalDragon on January 24, 2022, 01:31:49 pm
Why do you say that generic functions need to be declared in the interface part of the unit?
Probably lagged behind life, some time ago it was like that, but now I didn’t even try to check. :-[

This works since the introduction of generic functions. Maybe you confused it with the common workaround before generic functions (though even then one does not need to use the interface section):

Code: Pascal
{$mode objfpc}
{$modeswitch advancedrecords}

type
  generic TGenFuncs<T> = record
    class function Add(aArg1, aArg2: T): T; static;
  end;

class function TGenFuncs.Add(aArg1, aArg2: T): T; static;
begin
  Result := aArg1 + aArg2;
end;

begin
  // Call Add on the inline specialization of the record
  Writeln(specialize TGenFuncs<LongInt>.Add(3, 5));
end.
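For comparison, a minimal sketch of a generic function that lives only in the implementation section and is used by an ordinary exported function. The unit name GenDemo and the names GAdd and SumInts are made up for illustration; the sketch assumes a compiler version that has generic functions.

```pascal
unit GenDemo; // hypothetical unit name
{$mode objfpc}{$H+}

interface

// Only the non-generic wrapper is exported.
function SumInts(a, b: LongInt): LongInt;

implementation

// Generic helper declared and implemented in the implementation section only.
generic function GAdd<T>(a, b: T): T;
begin
  Result := a + b;
end;

function SumInts(a, b: LongInt): LongInt;
begin
  // The specialization happens inside this unit, where GAdd is visible.
  Result := specialize GAdd<LongInt>(a, b);
end;

end.
```

Callers only see SumInts; the generic helper stays private to the unit.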
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 24, 2022, 01:57:25 pm
IIRC at the time when FPC-3.2.0 was not yet released, this method was the only possible one and attempts to declare a generic type in the implementation section were unconditionally suppressed by the compiler.
I just haven't had a need for something like this in a long time.
Title: Re: Alternative set of string-to-int conversion routines
Post by: Zoran on January 24, 2022, 03:55:12 pm
I find it strange that RTL doesn't have a low-level string-to-int function which accepts only decimal (base 10) numbers.

Okay, we don't have it because Delphi doesn't have it. ::)
Which is, again... strange.

Of course it is easy to write a naive implementation, but...
Now we have to check if the two starting characters are decimal digits (not only the first, but the second as well, as val accepts 0x... notation for hex numbers) and only then call val (or TryStrToInt, or your Char2Int), which will again check for binary or hexadecimal notation.

For example:
Code: Pascal
uses
  SysUtils;

// Accepts decimal input only: after an optional '-', the first characters
// must be decimal digits, otherwise the string is rejected before the
// general-purpose TryStrToInt is called.
function StrToIntDecimal(const S: String; out N: Int32): Boolean;
var
  P: PAnsiChar;
begin
  P := PAnsiChar(TrimLeft(S));
  if P^ = '-' then
    Inc(P);
  case P^ of
    '0':
      // A leading zero may only be followed by the end of the string or a digit
      if not ((P + 1)^ in [#0, '0'..'9']) then
        Exit(False);
    '1'..'9':
      ;
  otherwise
    Exit(False);
  end;

  Result := TryStrToInt(S, N);
end;
Title: Re: Alternative set of string-to-int conversion routines
Post by: zamtmn on January 24, 2022, 05:04:56 pm
avk
Perhaps it is worth adding unicode versions? Are there any plans for StrToFloat?
Title: Re: Alternative set of string-to-int conversion routines
Post by: Bart on January 24, 2022, 06:44:33 pm
I find it strange that RTL doesn't have a low-level string-to-int function which accepts only decimal (base 10) numbers.

...

Of course it is easy to write a naive implementation, but...
Now we have to check if the two starting characters are decimal digits (not only the first, but the second as well, as val accepts 0x... notation for hex numbers) and only then call val (or TryStrToInt, or your Char2Int), which will again check for binary or hexadecimal notation.

Val() is the standard function for string to a number, and has been for a very long time.
Specialized and optimized functions (e.g. only base 2) should go into a separate unit IMO.
The checking for base and negativity does not take that much of the time. Most of the time is spent doing the multiplications and additions and checking that the result won't overflow.
It does the job, and is reliable (at least it is supposed to be and I think it is now).
TryStrToInt and family are supposed to handle the same input as Val() in the same way, so you cannot specialize them for e.g. only base 10.
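
The per-digit work described above (multiply, add, overflow check) can be sketched as follows. This is only an illustration under stated assumptions, not the code of Val() or of any library in this thread: TryDecToInt32 is a hypothetical name, it accepts an optional sign followed by decimal digits only, and it does not skip whitespace.

```pascal
program DecimalSketch;
{$mode objfpc}{$H+}

// Hypothetical helper for illustration; not an RTL function.
function TryDecToInt32(const S: string; out N: LongInt): Boolean;
var
  I: SizeInt;
  Neg: Boolean;
  Acc, MaxAcc: Int64;
  D: Integer;
begin
  Result := False;
  N := 0;
  if S = '' then
    Exit;
  I := 1;
  Neg := S[1] = '-';
  if S[1] in ['+', '-'] then
    Inc(I);
  if I > Length(S) then
    Exit; // a sign with no digits
  // Largest magnitude that still fits: 2^31 for negatives, 2^31 - 1 otherwise
  if Neg then
    MaxAcc := Int64(High(LongInt)) + 1
  else
    MaxAcc := High(LongInt);
  Acc := 0;
  while I <= Length(S) do
  begin
    if not (S[I] in ['0'..'9']) then
      Exit;
    D := Ord(S[I]) - Ord('0');
    // Overflow check, done before the multiply-add
    if Acc > (MaxAcc - D) div 10 then
      Exit;
    Acc := Acc * 10 + D;
    Inc(I);
  end;
  if Neg then
    N := LongInt(-Acc)
  else
    N := LongInt(Acc);
  Result := True;
end;

var
  V: LongInt;
  Ok: Boolean;
begin
  Ok := TryDecToInt32('-2147483648', V);
  WriteLn(Ok, ' ', V);
  Ok := TryDecToInt32('2147483648', V);
  WriteLn(Ok, ' ', V);
  Ok := TryDecToInt32('0x10', V);
  WriteLn(Ok, ' ', V);
end.
```

The accumulator is an Int64 so that the bound for the most negative Int32 value can be represented without special cases.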

Personally I think that when you require zillions of string to a number conversions in your program, you should go for a specialized solution.
Such a function can (and probably will) assume certain things about the input (always base 10, not negative, no leading whitespace etc. etc) and may raise exceptions if the input does not match the expectation. That's the trade off.
Such a function most likely will not work on (or be optimized for) all supported FPC platforms (e.g. 8-bit CPUs).
As you can see in this thread many have already implemented such a solution, just pick what you need.

Just my 2 cents.

Bart

Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 24, 2022, 06:59:26 pm
@Zoran:
  It seems that such a function, unless it resorts to various tricks, will be only slightly faster than the universal one. But I can give it a try.

@zamtmn:
  To be honest, I have not even thought about unicodestring yet.
Some implementation of the string-to-double conversion function is available in the LGenerics package in the lgJson unit, but it was made purely for the needs of JSON and without much flexibility.
Have you seen this (https://github.com/BeRo1985/pasdblstrutils) library? 
Title: Re: Alternative set of string-to-int conversion routines
Post by: zamtmn on January 24, 2022, 09:17:31 pm
avk
>>Have you seen this library?
no, thanks!
Title: Re: Alternative set of string-to-int conversion routines
Post by: BobDog on January 24, 2022, 11:02:34 pm
Here is an int64 conversion from ansistring.
A float version would use the const array extended with 1/10, 1/100, 1/1000, etc., but I have not tried it yet.
EDITED 25/01/22
Code: Pascal
uses
  sysutils;

// Converts a decimal ansistring to int64, scanning from the last character.
// Returns 0 for input it does not accept. There is no overflow check.
function val64(const x: ansistring): int64; inline;
const
  // p[k] is the weight of the k-th character counted from the right
  p: array[1..20] of qword = (1, 10, 100, 1000, 10000, 100000, 1000000,
    10000000, 100000000, 1000000000, 10000000000, 100000000000,
    1000000000000, 10000000000000, 100000000000000, 1000000000000000,
    10000000000000000, 100000000000000000, 1000000000000000000,
    10000000000000000000);
var
  count: integer = 0;
  sp: boolean = false; // a space has been seen
  sg: boolean = false; // a minus sign has been seen
  n: integer;
  ans: int64 = 0;
  sign: shortint = 1;
  b: byte;
begin
  // Longer input would index past the end of p
  if length(x) > high(p) then
    exit(0);
  for n := length(x) downto 1 do
  begin
    count := count + 1;
    b := byte(x[n]);
    case b of
      32: // space
        begin
          sp := true;
          continue;
        end;
      45: // '-'
        begin
          sg := true;
          sign := -sign;
          if (sign <> -1) then exit(0); // a second '-' is rejected
          continue
        end;
      48..57: // a digit to the left of a space or sign is rejected
        if sp or sg then exit(0);
    else
      exit(0);
    end;
    ans := ans + p[count] * (b - 48);
  end;
  exit(sign * ans);
end;

var
  t: int64;
  res: int64;
  code: word;
  k, i: longword;
  number: ansistring = ' -998880776665550';

begin
  for k := 1 to 5 do
  begin
    t := gettickcount64;
    for i := 0 to 10000000 do
      res := val64(number);
    writeln(gettickcount64 - t, ' milliseconds val64 = ', res);

    t := gettickcount64;
    for i := 0 to 10000000 do
      val(number, res, code);
    writeln(gettickcount64 - t, ' milliseconds val =   ', res);
    writeln;
  end;
  writeln('Press enter to end . . .');
  readln;
end.
Title: Re: Alternative set of string-to-int conversion routines
Post by: Bart on January 24, 2022, 11:43:21 pm
Any illegal string will return 0?
You can have multiple '-'?
'1 2 3' converts to 10203?
'--1' converts to -1
'-1-1-1' converts to -10101
'999999999999999999999999999999999999999999999999999999' gives Arithmetic Overflow.
'000000000000000000000' gives Range Check Error (accessing p[21]).

But yes, it's faster than Val().

This one is also faster than Val():
Code: Pascal
function BartsVal(const S: String): Int64; inline;
begin
  Result := 42;
end;

It's only slightly less accurate than yours, and it will never raise an exception.
And it's the answer to life, the universe and everything.

Bart
Title: Re: Alternative set of string-to-int conversion routines
Post by: Seenkao on January 25, 2022, 02:13:19 am
This one is also faster than Val():
Code: Pascal
function BartsVal(const S: String): Int64; inline;
begin
  Result := 42;
end;
Here it is, the perfect solution! :D I can't make this method any faster. :(
Title: Re: Alternative set of string-to-int conversion routines
Post by: PascalDragon on January 25, 2022, 09:12:48 am
IIRC at the time when FPC-3.2.0 was not yet released, this method was the only possible one and attempts to declare a generic type in the implementation section were unconditionally suppressed by the compiler.

You could always declare generic types in the implementation section, so I don't know what problems exactly you remember.
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 25, 2022, 10:34:33 am
It seems there is no more problem (at least so far :)), thank you.
Title: Re: Alternative set of string-to-int conversion routines
Post by: zamtmn on January 25, 2022, 11:18:13 am
PascalDragon
I think it means this
Code: Pascal
unit Unit1;
{$mode objfpc}{$H+}
interface

generic procedure GTest<T>(aArg:T);

implementation

generic procedure GImpl<T>(aArg:T);
begin
  writeln(aArg);
end;

generic procedure GTest<T>(aArg:T);
begin
  specialize GImpl<T>(aArg);
end;

end.
If there are concrete issues with debugging then we need a reproducible example no matter the issue, cause, yes, such things should be fixed, no matter if it's about stepping through generic code correctly or not accidentally entering generic code when one doesn't want it as MarkMLI hinted at.
It looks like it's a Lazarus bug: https://gitlab.com/freepascal.org/lazarus/lazarus/-/issues/39584
Could you comment on which way to dig here: https://gitlab.com/freepascal.org/fpc/source/-/issues/39387
Title: Re: Alternative set of string-to-int conversion routines
Post by: zamtmn on January 25, 2022, 01:18:38 pm
Why? You only need to place
Code: Pascal
generic procedure GImpl<T>(aArg:T);
in the interface section, and everything works. I'm not saying it's a compiler error; this is just an illustration of how I understood avk.
As per Thaddy's remarks, tried to somehow reduce code duplication.
Generic functions need to be declared in the interface part of the unit, and since they are helpers, this is not very good.
Includes are also not suitable, because I want everything to stay in one unit.
As a result, as an experiment, I settled on macros. The code looks unreadable, of course, but it seems that it would not be much better with includes.
Title: Re: Alternative set of string-to-int conversion routines
Post by: Thaddy on January 25, 2022, 01:22:26 pm
That is simply not needed.
Title: Re: Alternative set of string-to-int conversion routines
Post by: PascalDragon on January 25, 2022, 01:33:30 pm
PascalDragon
I think it means this
Code: Pascal
unit Unit1;
{$mode objfpc}{$H+}
interface

generic procedure GTest<T>(aArg:T);

implementation

generic procedure GImpl<T>(aArg:T);
begin
  writeln(aArg);
end;

generic procedure GTest<T>(aArg:T);
begin
  specialize GImpl<T>(aArg);
end;

end.

avk had said pre-3.2.0. Pre-3.2.0 generic functions didn't exist at all.

But yes, in the example you mentioned it is indeed the case that GImpl<> needs to be declared in the interface section. This is by design.
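
A sketch of the unit with the change described in this exchange applied, i.e. GImpl declared in the interface section next to GTest. Apart from moving that one declaration, the code is the example quoted above.

```pascal
unit Unit1;
{$mode objfpc}{$H+}

interface

// GImpl is declared here as well, so that it is visible wherever
// GTest gets specialized.
generic procedure GImpl<T>(aArg: T);
generic procedure GTest<T>(aArg: T);

implementation

generic procedure GImpl<T>(aArg: T);
begin
  WriteLn(aArg);
end;

generic procedure GTest<T>(aArg: T);
begin
  specialize GImpl<T>(aArg);
end;

end.
```

A program in objfpc mode that uses this unit would then call it as specialize GTest<LongInt>(42);.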

Could you comment on which way to dig here https://gitlab.com/freepascal.org/fpc/source/-/issues/39387

No, I can't, because if I knew that, then I'd already be at the fix essentially.

(Edit: fixed quote)
Title: Re: Alternative set of string-to-int conversion routines
Post by: BobDog on January 26, 2022, 12:17:17 am
Any illegal string will return 0?
You can have multiple '-'?
'1 2 3' converts to 10203?
'--1' converts to -1
'-1-1-1' converts to -10101
'999999999999999999999999999999999999999999999999999999' gives Arithmetic Overflow.
'000000000000000000000' gives Range Check Error (accessing p[21]).

But yes, it's faster than Val().

This one is also faster than Val():
Code: Pascal
function BartsVal(const S: String): Int64; inline;
begin
  Result := 42;
end;

It's only slightly less accurate than yours, and it will never raise an exception.
And it's the answer to life, the universe and everything.

Bart
I have edited.
I liked a little drink myself when I was 42.
No harm in a drop of John Barleycorn; heck, I live in the country where it was invented.
Have a nice Burns Night.
"And it's the answer to life, the universe and everything."
Very true.


Title: Re: Alternative set of string-to-int conversion routines
Post by: Seenkao on January 26, 2022, 01:12:23 am
I have edited.
What changed? Have you tried testing your code?
Title: Re: Alternative set of string-to-int conversion routines
Post by: BobDog on January 26, 2022, 01:32:13 am
I have edited.
What changed? Have you tried testing your code?
Yes, tested 32 and 64 bit Windows on Geany ide.
It is better than it was anyway.
Not perfect by any means of course.

Title: Re: Alternative set of string-to-int conversion routines
Post by: Seenkao on January 26, 2022, 04:45:22 am
Yes, tested 32 and 64 bit Windows on Geany ide.
It is better than it was anyway.
Not perfect by any means of course.
Have you tried outputting negative numbers? Numbers preceded by a space?

Look here (https://forum.lazarus.freepascal.org/index.php/topic,57819.0.html) - a solution has already been provided and it works a little faster. Quite a bit faster than yours.
Title: Re: Alternative set of string-to-int conversion routines
Post by: BobDog on January 26, 2022, 10:40:00 am
Yes, tested 32 and 64 bit Windows on Geany ide.
It is better than it was anyway.
Not perfect by any means of course.
Have you tried outputting negative numbers? Numbers preceded by a space?

Look here (https://forum.lazarus.freepascal.org/index.php/topic,57819.0.html) - a solution has already been provided and it works a little faster. Quite a bit faster than yours.
Hi Seenkao
My example uses a negative number preceded by a space
(' -998880776665550')
I cannot run project files here on Geany, but I compiled your unit and used
sc_StrToInt64(number,res);
in my code instead of val.
If I set a space in front(as in my example) your function gives 0.
Maybe the solution has already been provided as you say, but surely a little input from members does no harm.
And yes, your library function is faster for int64 for strings >5 characters, and a little slower for smaller strings.
Thank you for the unit.
I will try out some code for floats, maybe later.
Title: Re: Alternative set of string-to-int conversion routines
Post by: Thaddy on January 26, 2022, 03:00:39 pm
I cannot run project files here on Geany, but I compiled your unit and used
Edit the geany configuration file. filetype_extensions.conf
You may also want to add the missing keywords. That is another conf file that I forgot, but I will look it up.
You may also want to edit the templates, add a unit.pas etc, and remove the licence if so required.

There are many more options to let geany play nice with Pascal sources.
I am thinking about a wiki entry, since I tend to use geany over fp for simple fpc code.
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 26, 2022, 03:46:04 pm
As an experiment, I added a couple of functions (TryDecimals2Int()) that accept only strings in decimal notation, using an idea I saw on the forum.

Some quick and dirty benchmark:
Code: Pascal
program bench2;

{$MODE OBJFPC}{$H+}

uses
  SysUtils, DateUtils, StrUtils, Str2IntAlter;

var
  a: TStringArray = nil;
  Trash: TStringArray = (
    ' --', ' ++', ' +-',' --0', ' ++0', ' +-0',' -0-1', ' +0--1', ' 0-1-1',
    '9223372036854775808', '-9223372036854775809');

function NextInt: string;
begin
  case Random(4) of
    0: Result := '  '#9'-' + Random(High(Int64)).ToString;
    1: Result := '  '#9'+' + Random(High(Int64)).ToString;
    2: Result := '  '#9'-' + Random(Int64(High(Dword))).ToString;
  else
    Result := '  '#9'+' + Random(Int64(High(Dword))).ToString;
  end;
end;

procedure GenTestData;
var
  I: Integer;
const
  TestSize = 1000;
begin
  SetLength(a, TestSize);
  for I := 0 to High(a) do
    a[I] := NextInt;
end;

procedure RunVal;
var
  I, J, r, c: Integer;
  Start: TTime;
  v, Score: Int64;
begin
  for I := 0 to High(Trash) do
    begin
      Val(Trash[I], v, c);
      if c = 0 then
        begin
          WriteLn('Val() accepts trash');
          exit;
        end;
    end;
  r := 0;
  Start := Time;
  for J := 1 to 30000 do
    for I := 0 to High(a) do
      begin
        Val(a[I], v, c);
        Inc(r, Ord(c <> 0));
      end;
  Score := MillisecondsBetween(Time, Start);
  WriteLn('Val(), score:           ', Score);
  WriteLn('rejected:               ', r);
end;

procedure RunAlt;
var
  I, J, r: Integer;
  Start: TTime;
  v, Score: Int64;
begin
  for I := 0 to High(Trash) do
    if TryChars2Int(Trash[I][1..Length(Trash[I])], v) then
      begin
        WriteLn('TryChars2Int accepts trash');
        exit;
      end;
  r := 0;
  Start := Time;
  for J := 1 to 30000 do
    for I := 0 to High(a) do
      if not TryChars2Int(a[I][1..Length(a[I])], v) then
        Inc(r);
  Score := MillisecondsBetween(Time, Start);
  WriteLn('TryChars2Int, score:    ', Score);
  WriteLn('rejected:               ', r);
end;

procedure RunAltDec;
var
  I, J, r: Integer;
  Start: TTime;
  v, Score: Int64;
begin
  for I := 0 to High(Trash) do
    if TryDecimals2Int(Trash[I][1..Length(Trash[I])], v) then
      begin
        WriteLn('TryDecimals2Int accepts trash');
        exit;
      end;
  r := 0;
  Start := Time;
  for J := 1 to 30000 do
    for I := 0 to High(a) do
      if not TryDecimals2Int(a[I][1..Length(a[I])], v) then
        Inc(r);
  Score := MillisecondsBetween(Time, Start);
  WriteLn('TryDecimals2Int, score: ', Score);
  WriteLn('rejected:               ', r);
end;

begin
  Randomize;
  GenTestData;

  RunVal;
  WriteLn;
  RunAlt;
  WriteLn;
  RunAltDec;
  WriteLn;

  WriteLn('Press any key to exit...');
  ReadLn;
end.

output x64:
Code: Text
Val(), score:           2445
rejected:               0

TryChars2Int, score:    923
rejected:               0

TryDecimals2Int, score: 770
rejected:               0

Press any key to exit...

output x32:
Code: Text
Val(), score:           4598
rejected:               0

TryChars2Int, score:    1891
rejected:               0

TryDecimals2Int, score: 1178
rejected:               0

Press any key to exit...

I would like to remind everyone that this thread is about functions to replace Val(), and therefore they should parse strings according to the same rules as Val().
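
To see which rules those are on a given compiler, a small probe such as the following may help. It is only a sketch: the input list is an assumption chosen for illustration (leading whitespace, a sign, the 0x notation mentioned earlier in the thread, and the usual Pascal $, % and & prefixes), and no particular output is claimed here; the program simply reports what Val() returns.

```pascal
program ValRules;
{$mode objfpc}{$H+}

const
  // Illustrative inputs only; extend as needed.
  Inputs: array[0..5] of string =
    ('  42', '-42', '$2A', '0x2A', '%101010', '&52');
var
  I: Integer;
  V: Int64;
  Code: Integer;
begin
  for I := Low(Inputs) to High(Inputs) do
  begin
    // Code = 0 means the whole string was accepted; otherwise it is
    // the position of the first offending character.
    Val(Inputs[I], V, Code);
    WriteLn('"', Inputs[I], '" -> value ', V, ', code ', Code);
  end;
end.
```

Running the same inputs through a replacement routine and comparing the accept/reject decisions is one way to check that both follow the same rules.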
Title: Re: Alternative set of string-to-int conversion routines
Post by: Seenkao on January 26, 2022, 06:43:45 pm
BobDog, actually it is not critical. The difference in speed is not large enough to worry about.
Yes, the functions I provided check numbers in their pure form, without spaces or anything else (apart from the minus sign); all of that can be checked before the call and stripped as needed. That is why it returned zero: it encountered a space.

avk, I did not think that this method would speed up number handling that much. I was wrong! The speedup is about 1.5 times. ))) But only for long numbers.
Title: Re: Alternative set of string-to-int conversion routines
Post by: Seenkao on January 27, 2022, 09:30:40 am
avk, I did not think that this method would speed up number handling that much. I was wrong! The speedup is about 1.5 times. ))) But only for long numbers.
Annoyingly, I was right in the first place... as soon as we start working with real strings rather than a PChar array, the whole gain is lost.
If necessary, you can avoid strings entirely, and then the speed may double. But these are supposed to be procedures for users, not for your own use. Although that is up to you.
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 28, 2022, 10:20:12 am
Added more overloaded functions for converting strings in decimal notation to integers.

@Seenkao, could you clarify what you are talking about?
Title: Re: Alternative set of string-to-int conversion routines
Post by: Thaddy on January 28, 2022, 01:23:56 pm
as soon as we start working with real strings, and not an array from Pchar, everything is immediately lost.
If necessary, you can completely avoid using strings and then the speed can double.
On the contrary: Pascal strings are more flexible and faster than PChars. That is because determining the length of a PChar requires strlen (slow, O(n)), whereas a Pascal string stores its length, which is read at once (fast, O(1)). Also, PChars cannot contain embedded zeros; you would need a char array with a known length for that. Pascal strings can contain embedded zeros. Using PChars mixed with Pascal strings can lead to strange results, as I already demonstrated.
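
A small sketch of the difference described here, using a string with an embedded zero. It assumes an AnsiString and the StrLen function from SysUtils; the variable name is arbitrary.

```pascal
program LenSketch;
{$mode objfpc}{$H+}

uses
  SysUtils; // StrLen

var
  S: AnsiString;
begin
  S := '12x34';
  S[3] := #0; // put a zero byte in the middle of the string

  // Length reads the stored length of the Pascal string.
  WriteLn(Length(S));        // prints 5

  // StrLen scans the characters until it reaches the first #0.
  WriteLn(StrLen(PChar(S))); // prints 2
end.
```

Any routine that receives only the PChar sees '12' here, which is the kind of strange result that mixing the two representations can produce.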
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 30, 2022, 06:57:20 am
Added some overloads to support unicodestring.
A quick and dirty benchmark against Val() on my machine looks like this:
Code: Text
SInt32:
Val(), score:        3559
rejected:            0
TryChars2Int, score: 780
rejected:            0

SInt64:
Val(), score:        4820
rejected:            0
TryChars2Int, score: 1232
rejected:            0

UInt32:
Val(), score:        5195
rejected:            0
TryChars2Int, score: 811
rejected:            0

UInt64:
Val(), score:        8549
rejected:            0
TryChars2Int, score: 1248
rejected:            0
Title: Re: Alternative set of string-to-int conversion routines
Post by: zamtmn on January 30, 2022, 06:59:42 am
Thanks!
Please add unicode version to bench (TryChars2Int(widechars) vs Val(unicodestring))
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 30, 2022, 07:19:05 am
Located in the folder bench_uni, this is its output.
Title: Re: Alternative set of string-to-int conversion routines
Post by: Thaddy on January 30, 2022, 10:24:58 am
Works well, and thanks for including proper tests!
Title: Re: Alternative set of string-to-int conversion routines
Post by: Seenkao on January 30, 2022, 03:59:51 pm
@Seenkao, could you clarify what you are talking about?
About the fact that the user will write StrToInt('-758343'); rather than StrToInt('-758343'[1..length('-758343')]). Congratulations, you have achieved speed. But you have lost usability. And when we go back to the usual String representation, all of your speed is lost.
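
The usability point can be illustrated with a thin string front end over an open-array routine, so that callers never write the slice themselves. This is a sketch only: CountDigits is a hypothetical stand-in for a parser that takes an array of char, CountDigitsStr is an equally hypothetical wrapper, and nothing is claimed here about how much speed such a wrapper keeps or loses.

```pascal
program SliceSketch;
{$mode objfpc}{$H+}

// Hypothetical stand-in for a routine that takes an open array of char.
function CountDigits(const A: array of Char): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 0 to High(A) do
    if A[I] in ['0'..'9'] then
      Inc(Result);
end;

// String front end: the caller passes a plain string and the slice
// S[1..Length(S)] is taken here, in one place.
function CountDigitsStr(const S: string): Integer;
begin
  if S = '' then
    Exit(0); // avoid slicing an empty string
  Result := CountDigits(S[1..Length(S)]);
end;

begin
  WriteLn(CountDigitsStr('-758343')); // prints 6
end.
```

The slice syntax is the same one used in the benchmark program earlier in the thread.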

On the contrary: Pascal strings are more flexible and faster than PChars. That is because determining the length of a PChar requires strlen (slow, O(n)), whereas a Pascal string stores its length, which is read at once (fast, O(1)). Also, PChars cannot contain embedded zeros; you would need a char array with a known length for that. Pascal strings can contain embedded zeros. Using PChars mixed with Pascal strings can lead to strange results, as I already demonstrated.
Excuse me, but I took the code from my own project: fast_StrToInt (https://github.com/Seenkao/fast_StrToInt)
Code: Pascal
function geStrToInt(const Str: String; out Value: maxIntVal; Size: LongWord = isInteger): Boolean;
var
  lenStr, i: maxUIntVal;
  m, n, z: maxUIntVal;
  useParametr: PgeUseParametr;
  IntMinus: Boolean;
  correct: maxUIntVal = 0;
label
  jmpEndStr, loopZero;
begin
  {$push}
  {$Q-}{$R-}
  Result := False;
  Value := 0;
  if Size > maxSize then
    Exit;
  IntMinus := False;
  lenStr := Length(Str);
  if lenStr = 0 then
    exit;
  i := 1;
  m := Byte(Str[i]);

  if (lenStr = 1) and (m = 48) then
  begin
    Result := True;
    exit;
  end;
  if m = 45 then
  begin
    if lenStr = 1 then
      exit;
    IntMinus := True;
    inc(i);
    m := Byte(Str[2]);
  end;

loopZero:
  if m = 48 then
  begin
    inc(i);
    inc(correct);
    m := Byte(Str[i]);
    goto loopZero;
  end;

  inc(i);
  m := m - 48;
  if m > 9 then
    exit;

  useParametr := @allIntParametr[Size];
  if i > lenStr then
  begin
    z := 0;
    goto jmpEndStr;
  end;
  if (lenStr - correct) > useParametr^.maxLen then
    Exit;
  while i < lenStr do
  begin
    n := (Byte(Str[i]) - 48);
    if n > 9 then
      Exit;
    m := m * 10 + n;
    inc(i);
  end;

  if m > useParametr^.maxNumDiv10 then
    exit;
  m := m * 10;
  z := Byte(Str[i]) - 48;
  if z > 9 then
    exit;

jmpEndStr:
  if IntMinus then
    n := useParametr^.maxNumeric + 1 - m
  else
    n := useParametr^.maxNumeric - m;
  if z > n then
    exit;

  if IntMinus then
    Value := - m - z
  else
    Value := m + z;
  Result := true;
  {$pop}
end;
Reworked:
Code: Pascal
function geStrToInt(const Str: array of char; out Value: maxIntVal; Size: LongWord = isInteger): Boolean;
var
  lenStr, i: maxUIntVal;
  m, n, z: maxUIntVal;
  useParametr: PgeUseParametr;
  IntMinus: Boolean = false;
  correct: maxUIntVal = 0;
label
  jmpEndStr, loopZero;
begin
  {$push}
  {$Q-}{$R-}
  Result := False;
  Value := 0;
  if Size > maxSize then
    Exit;
  lenStr := Length(Str);
  if lenStr = 0 then
    exit;
  // Open arrays are zero-based, so the first character is Str[0]
  i := 0;
  m := Byte(Str[i]);
  if (lenStr = 1) and (m = 48) then
  begin
    Result := True;
    exit;
  end;
  if m = 45 then
  begin
    if lenStr = 1 then
      exit;
    IntMinus := True;
    inc(i);
    // The character after the sign is Str[i] (index 1), not Str[2]
    m := Byte(Str[i]);
  end;
loopZero:
  if m = 48 then
  begin
    inc(i);
    inc(correct);
    m := Byte(Str[i]);
    goto loopZero;
  end;
  inc(i);
  m := m - 48;
  if m > 9 then
    exit;
  useParametr := @allIntParametr[Size];
  if i > lenStr - 1 then
  begin
    z := 0;
    goto jmpEndStr;
  end;

  if (lenStr - correct) > useParametr^.maxLen then
    Exit;
  dec(lenStr);
  while i < lenStr do
  begin
    n := (Byte(Str[i]) - 48);
    if n > 9 then
      Exit;
    m := m * 10 + n;
    inc(i);
  end;
  if m > useParametr^.maxNumDiv10 then
    exit;
  m := m * 10;
  z := Byte(Str[i]) - 48;
  if z > 9 then
    exit;

jmpEndStr:
  if IntMinus then
    n := useParametr^.maxNumeric + 1 - m
  else
    n := useParametr^.maxNumeric - m;
  if z > n then
    exit;

  if IntMinus then
    Value := - m - z
  else
    Value := m + z;
  Result := true;
  {$pop}
end;
In my tests, the first variant (working with String), function geStrToInt(const Str: String; out Value: maxIntVal; Size: LongWord = isInteger): Boolean;, is slower than the second (working with a PChar array), function geStrToInt(const Str: array of char; out Value: maxIntVal; Size: LongWord = isInteger): Boolean;. The second function is more than twice as fast!
Would you care to comment on what is going on? And why do you claim the opposite?

avk, in essence I attached an example (and I am not the only one) in which simple code beats overly elaborate code. This function, function geStrToInt(const Str: array of char; out Value: maxIntVal; Size: LongWord = isInteger): Boolean;, is hard to beat on speed; there is practically nothing left to speed up, even using your method with QWord and DWord.

My test:
Quote
StrToInt standard     7.8757494776072852E+000
StrToInt made by me  1.5074538423153692E+000
StrToInt made by avk  2.0989100899762976E+000
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 30, 2022, 04:56:22 pm
First of all, I would like to remind you that this section of the forum is called "Third Party announcements", and this topic discusses a specific library.
If you want to discuss something different, it is better to create a new thread for this.
As for Str2IntAlter, it doesn't seem to restrict the user in any way in choosing a particular version of the function.
One could try to compare their performance (adding the function from fast_StrToInt at the same time), for example:
Code: Pascal  [Select][+][-]
program bench;

{$MODE OBJFPC}{$H+}

uses
  SysUtils, DateUtils, StrUtils, Str2IntAlter, ge_external_Utils;

function NextRandomQWord: QWord;
begin
  Result := QWord(Random(Int64($100000000))) shl 32 or QWord(Random(Int64($100000000)));
end;

function NextUInt64: string;
begin
  case Random(4) of
    0: Result := ' '#9'+%' + NextRandomQWord.ToBinString;
    1: Result := ' '#9'+&' + OctStr(NextRandomQWord, 22);
    2: Result := ' '#9'+' + NextRandomQWord.ToString;
  else
    Result := ' '#9'+$' + NextRandomQWord.ToHexString;
  end;
end;

var
  a: TStringArray = nil;

procedure GenTestData;
var
  I: Integer;
const
  TestSize = 1000;
begin
  SetLength(a, TestSize);
  for I := 0 to High(a) do
    a[I] := NextUInt64;
end;

procedure RunChars;
var
  I, J, r: Integer;
  Start: TTime;
  Score: Int64;
  v: QWord;
begin
  r := 0;
  Start := Time;
  for J := 1 to 30000 do
    for I := 0 to High(a) do
      if not TryChars2Int(a[I][1..Length(a[I])], v) then
        Inc(r);
  Score := MillisecondsBetween(Time, Start);
  WriteLn('TryChars2Int, score:  ', Score);
  WriteLn('rejected:             ', r);
end;

procedure RunStr;
var
  I, J, r: Integer;
  Start: TTime;
  Score: Int64;
  v: QWord;
begin
  r := 0;
  Start := Time;
  for J := 1 to 30000 do
    for I := 0 to High(a) do
      if not TryStr2Int(a[I], v) then
        Inc(r);
  Score := MillisecondsBetween(Time, Start);
  WriteLn('TryStr2Int, score:    ', Score);
  WriteLn('rejected:             ', r);
end;

procedure RunSeen;
var
  I, J, r: Integer;
  Start: TTime;
  Score: Int64;
  v: QWord;
begin
  r := 0;
  Start := Time;
  for J := 1 to 30000 do
    for I := 0 to High(a) do
      if not sc_StrToQWord(a[I], v) then
        Inc(r);
  Score := MillisecondsBetween(Time, Start);
  WriteLn('sc_StrToQWord, score: ', Score);
  WriteLn('rejected:             ', r);
end;

begin
  Randomize;
  GenTestData;

  RunChars;
  WriteLn;
  RunStr;
  WriteLn;
  RunSeen;
  WriteLn;

  WriteLn('Press enter to exit...');
  ReadLn;
end.

I get this result:
Code: Text  [Select][+][-]
TryChars2Int, score:  1762
rejected:             0

TryStr2Int, score:    1740
rejected:             0

sc_StrToQWord, score: 130
rejected:             30000000

Press enter to exit...

So I still don't understand what you mean.
Title: Re: Alternative set of string-to-int conversion routines
Post by: Seenkao on January 30, 2022, 06:36:02 pm
Quote
sc_StrToQWord, score: 130
rejected:             30000000
From the start I did not set out to handle any extra characters beyond those that lead a number. My library accepts numbers, not incoming gibberish. Sorry!
That is exactly why you see nothing but "errors". If I put letters into your numbers, you would see the same errors from your own functions.
All the junk can be stripped before the call. That costs little execution time, and when working with numbers the user is processing numeric characters anyway. In a correct program there will be no extraneous characters, apart from leading zeros and the radix prefix. I was aiming for speed from the beginning, but I did not expect to gain this much.
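A minimal sketch of "strip before the call", under stated assumptions: StrictDigitsToQWord and SkipLeading are hypothetical names made up for this illustration (they are not sc_StrToQWord or any routine from either library), and the slice syntax is the same one used for TryChars2Int in the benchmark above. It only shows that skipping blanks, tabs and an optional '+' can be done by advancing an index, without building a new string.

```pascal
program StripThenParse;

{$MODE OBJFPC}{$H+}

// Hypothetical strict parser: decimal digits only, with a QWord overflow guard.
function StrictDigitsToQWord(const a: array of char; out Value: QWord): Boolean;
var
  i: Integer;
  d: QWord;
begin
  Value := 0;
  Result := False;
  if Length(a) = 0 then
    Exit;
  for i := 0 to High(a) do
  begin
    if (a[i] < '0') or (a[i] > '9') then
      Exit;
    d := Ord(a[i]) - Ord('0');
    // Reject input whose next digit would overflow a QWord.
    if (Value > High(QWord) div 10) or
       ((Value = High(QWord) div 10) and (d > High(QWord) mod 10)) then
      Exit;
    Value := Value * 10 + d;
  end;
  Result := True;
end;

// Hypothetical helper: skips leading blanks/tabs and one optional '+',
// returning the 1-based index of the first remaining character.
function SkipLeading(const s: string): Integer;
begin
  Result := 1;
  while (Result <= Length(s)) and (s[Result] in [' ', #9]) do
    Inc(Result);
  if (Result <= Length(s)) and (s[Result] = '+') then
    Inc(Result);
end;

var
  s: string;
  p: Integer;
  v: QWord;
begin
  s := ' '#9'+12345';
  p := SkipLeading(s);
  // Pass only the remaining characters as an open-array slice.
  if (p <= Length(s)) and StrictDigitsToQWord(s[p..Length(s)], v) then
    WriteLn('parsed: ', v)
  else
    WriteLn('rejected');
end.
```

Radix prefixes ('%', '&', '$') are deliberately left out; handling them would need a per-base digit loop, which is the part where the two libraries differ.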

Quote
So I still don't understand what you mean.
When I wrote that, you had published only function TryDecimals2Int(const a: array of char; out aValue: LongWord): Boolean;, with no String version! I was delighted to see that your function beat mine on speed. But later I rewrote it for String, and all the speed was lost at that step.
After that I rewrote my own function to take a PChar array, and saw where the real speedup came from: not from the function itself, but from not using String inside the function.
That is probably why you don't understand what I am writing about. You did not yet have function TryDecStr2Int(const s: string; out aValue: LongWord): Boolean; at the time.

But this raises a question for me: why does your code show no speedup when the PChar array is used? The call TryChars2Int(a[1..Length(a)], v) is clearly faster. At which point in the code does the conversion of a String into a PChar array start to show?
TryChars2Int(a[1..Length(a)], v) runs twice as fast as TryStr2Int(a, v). Why isn't it faster in your code?
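One way to look into this question is to time two routines with the same body that differ only in the parameter kind. The sketch below is only an assumption-laden measuring aid: SumStr and SumChars are hypothetical stand-ins (they do not parse anything), the round count is arbitrary, and the numbers will depend on compiler version, optimisation level and target, so no result is claimed here.

```pascal
program ParamKind;

{$MODE OBJFPC}{$H+}

uses
  SysUtils;

// Same loop twice; only the parameter kind differs.
function SumStr(const s: string): QWord;
var
  i: Integer;
begin
  Result := 0;
  for i := 1 to Length(s) do
    Result := Result + Ord(s[i]);
end;

function SumChars(const a: array of char): QWord;
var
  i: Integer;
begin
  Result := 0;
  for i := 0 to High(a) do
    Result := Result + Ord(a[i]);
end;

const
  Rounds = 10000000; // arbitrary; adjust to taste

var
  s: string;
  i: Integer;
  acc, t: QWord;
begin
  s := '18446744073709551615';
  acc := 0;

  t := GetTickCount64;
  for i := 1 to Rounds do
    acc := acc + SumStr(s);
  WriteLn('const string:  ', GetTickCount64 - t, ' ms');

  t := GetTickCount64;
  for i := 1 to Rounds do
    acc := acc + SumChars(s[1..Length(s)]);
  WriteLn('array of char: ', GetTickCount64 - t, ' ms');

  // Print the checksum so the loops cannot be optimised away.
  WriteLn('checksum: ', acc);
end.
```

If the two timings come out close, the difference seen elsewhere more likely comes from the body of the parsing routine than from how the characters are passed.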
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 30, 2022, 07:17:41 pm
The library that is discussed in the current topic tries to be compatible with Val().
Other options are not interesting.
Title: Re: Alternative set of string-to-int conversion routines
Post by: BobDog on January 30, 2022, 08:10:10 pm

Val, of course, also covers float numbers.
Actually, Free Pascal's Val is quite fast compared to C.
It is also much faster than FreeBASIC's Val.

Code: Pascal  [Select][+][-]
uses
  sysutils;

function atof(p: pchar): double; cdecl; external 'msvcrt.dll' name 'atof';
function strtod(p: pchar; b: pointer): double; cdecl; external 'msvcrt.dll' name 'strtod';

var
  t: int64;
  res: double;
  code: word;
  k, i: longword;
  number: ansistring = '  -12345.6789';
  b: ^pbyte = nil;
begin
  for k := 1 to 5 do
  begin
    t := gettickcount64;
    for i := 0 to 1000000 do
      res := strtod(pchar(number), b);
    writeln(gettickcount64 - t, ' milliseconds strtod = ', res);

    t := gettickcount64;
    for i := 0 to 1000000 do
      val(number, res, code);
    writeln(gettickcount64 - t, ' milliseconds  val =   ', res);

    t := gettickcount64;
    for i := 0 to 1000000 do
      res := atof(pchar(number));
    writeln(gettickcount64 - t, ' milliseconds   atof = ', res);
    writeln;
  end;
  writeln('Press enter to end . . .');
  readln;
end.


It shouldn't be too hard to include floats in Pascal conversions, but searching for the decimal point will slow things down.
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 31, 2022, 07:26:54 am

Quote
Val, of course, also covers float numbers.
...

Does this have anything to do with "set of string-to-int conversion routines"?
Title: Re: Alternative set of string-to-int conversion routines
Post by: Seenkao on January 31, 2022, 08:21:03 am
Quote
The library that is discussed in the current topic tries to be compatible with Val().
Other options are not interesting.

Val, of course, also covers float numbers.
...

Does this have anything to do with "set of string-to-int conversion routines"?
Hmm... contradicting yourself? )))
And your functions are not compatible with Val, because Val is a single function that works with a wide variety of numbers. Your functions can at most be compatible with routines derived from Val, until they become a single function.
Just for your information. Good luck!
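For readers unfamiliar with the point about Val being a single entry point, here is a small sketch using only the standard Val procedure. The same name accepts integer, unsigned and floating-point destinations; Code is zero on success and otherwise holds the position of the first offending character. The variable names and sample strings are chosen for illustration only.

```pascal
program ValDemo;

{$MODE OBJFPC}{$H+}

var
  i: LongInt;
  q: QWord;
  d: Double;
  code: Integer;
begin
  // Signed decimal into a LongInt.
  Val('-42', i, code);
  WriteLn('LongInt: ', i, ' code=', code);

  // Hexadecimal ('$' prefix) into a QWord.
  Val('$FF', q, code);
  WriteLn('QWord:   ', q, ' code=', code);

  // The same procedure name also parses floating-point values.
  Val('3.5', d, code);
  WriteLn('Double:  ', d:0:2, ' code=', code);

  // Invalid input: code reports where parsing stopped.
  Val('12x', i, code);
  WriteLn('bad input, code=', code);
end.
```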
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on January 31, 2022, 08:53:27 am
Thank you, I would like to hope that there are still no contradictions here.
Title: Re: Alternative set of string-to-int conversion routines
Post by: MarkMLl on February 01, 2022, 12:38:03 pm
One potential issue might be (not tested): the generic code is compiled without debug information, but code that uses it is compiled with and thus the specialization will have debug information as well. Thus when stepping through the latter code one might suddenly land in the generic code which one assumed is without debug information.

On the other hand, I invariably use a locally-built FPC+Lazarus and looking at my build log for FPC 3.2.0 I see

make NO_GDB=1 CPU_TARGET=i386 OPT='-V3.0.4 -O- -gl -Xs- -vt' all |/usr/local/bin/fpc-filter-vt

Lazarus+LCL though is built with default options, so it could be there.

I'll try to keep an eye open for what I think I've seen, and will report back.

I've caught it in the act. If a method is declared virtual+abstract and then overridden in a subclass, stepping into the method's implementation takes you through fpc_check_object_ext() which is in generic.inc. So it gives the impression that generic stuff is being pulled in, even if that's not in fact happening.

MarkMLl
Title: Re: Alternative set of string-to-int conversion routines
Post by: PascalDragon on February 01, 2022, 01:38:03 pm
Quote
One potential issue might be (not tested): the generic code is compiled without debug information, but code that uses it is compiled with and thus the specialization will have debug information as well. Thus when stepping through the latter code one might suddenly land in the generic code which one assumed is without debug information.

On the other hand, I invariably use a locally-built FPC+Lazarus and looking at my build log for FPC 3.2.0 I see

make NO_GDB=1 CPU_TARGET=i386 OPT='-V3.0.4 -O- -gl -Xs- -vt' all |/usr/local/bin/fpc-filter-vt

Lazarus+LCL though is built with default options, so it could be there.

I'll try to keep an eye open for what I think I've seen, and will report back.

I've caught it in the act. If a method is declared virtual+abstract and then overridden in a subclass, stepping into the method's implementation takes you through fpc_check_object_ext() which is in generic.inc. So it gives the impression that generic stuff is being pulled in, even if that's not in fact happening.

Well, not everything that's named “generic” does involve generics. ;)

Though I wonder why it steps through fpc_check_object_check? Do you have the RTL compiled with debug information?

Sidenote: This check is only active if either object checks or range checks are enabled
Title: Re: Alternative set of string-to-int conversion routines
Post by: MarkMLl on February 01, 2022, 02:12:25 pm
Quote
Well, not everything that's named “generic” does involve generics. ;)

Note that I'm not complaining- particularly while I can see what I've seen.

Quote
Though I wonder why it steps through fpc_check_object_check? Do you have the RTL compiled with debug information?

Compiled with -O- -gl -Xs- which I presume would be sufficient... that's a legacy of when I'd also got SPARC etc. running years ago and some things were a bit less than robust.

Quote
Sidenote: This check is only active if either object checks or range checks are enabled

Also enabled at the project level, although probably not for the LCL etc.

MarkMLl
Title: Re: Alternative set of string-to-int conversion routines
Post by: avk on February 02, 2022, 05:43:17 pm
This is of course so important for the current topic.

I don't even know how to thank you.