Author Topic: IntToStr with national convention?  (Read 1529 times)

Bart

  • Hero Member
  • *****
  • Posts: 3897
    • Bart en Mariska's Webstek
Re: IntToStr with national convention?
« Reply #15 on: August 03, 2020, 12:55:34 pm »
Or do you print your numbers in UTF8-Cherokee?

Some national thousand separators are not single byte (at least not when you use Lazarus, where everything is UTF8).
Hence my use of a string instead of a char.

Bart

Bart

  • Hero Member
  • *****
  • Posts: 3897
    • Bart en Mariska's Webstek
Re: IntToStr with national convention?
« Reply #16 on: August 03, 2020, 01:04:33 pm »
For those who live in a world with seconds:
0.000000569 seconds

Sorry, but I just couldn't resist posting that.

Mind you: I do not really care that much (you might have noticed the emoticon I used).
There are however plenty of speedfreaks around here.

In the distant past I had a discussion in the bugtracker because my new implementation of some conversion to string was 4 times slower than the original.
The original, however, gave you the wrong results.
My opponent in that discussion found that a wrong result was less of an issue than speed.
Even then you would have to do more than 100 thousand conversions to notice the difference. That might actually be a use case, but in such a scenario the input was probably read from a file, which would have been the bottleneck for the entire operation (IO is way slower than conversion code).
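
To put a rough number on that, here is a minimal timing sketch. It is only an illustration: the program name and the constant N are made up, it times the plain IntToStr rather than any of the routines in this thread, and GetTickCount64 only has millisecond resolution, so treat the result as an order of magnitude.

```pascal
program ConvTiming;
{$mode objfpc}{$H+}

uses
  SysUtils;

const
  N = 100000; // assumed workload: 100 thousand conversions

var
  i: Integer;
  s: string;
  t0, t1: QWord;
  total: Int64;
begin
  total := 0;
  t0 := GetTickCount64;
  for i := 1 to N do
  begin
    s := IntToStr(i);
    // use the result so the conversion is not dead code
    Inc(total, Length(s));
  end;
  t1 := GetTickCount64;
  WriteLn(N, ' conversions took about ', t1 - t0, ' ms (',
    total, ' characters produced)');
end.
```

Reading the same number of values from a file first and timing both parts separately would show which side dominates on a given machine.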

Your code has the added advantage that it can be used for strings of any length (my code could easily be adapted for that if the need arises) and it can do 2 jobs (mine only one).

So, no hard feelings I hope?

Bart

process_1

  • Guest
Re: IntToStr with national convention?
« Reply #17 on: August 03, 2020, 02:07:02 pm »
Milsa,

Try this. I made and used something similar when I needed groups of 4 digits and the full int64 range; it is actually universal. This is just recreated from memory:

Code: Pascal
function IntToStrFormat(Value: int64; ASeparator: char; ALen: integer = 3): string;
var
  i, c: integer;
  s: string;
begin
  s := IntToStr(Value);

  // c = number of digits (the minus sign does not count)
  c := length(s);
  if Value < 0 then
    Dec(c);

  // c = number of separators to insert
  if c mod ALen = 0 then
    c := c div ALen - 1
  else
    c := c div ALen;

  // position of the first (leftmost) separator
  i := length(s) - ALen * c + 1;

  while c > 0 do
  begin
    insert(ASeparator, s, i);
    Dec(c);
    // skip one group plus the separator just inserted
    i := i + ALen + 1;
  end;

  Result := s;
end;

Code: Pascal
procedure TForm1.Button1Click(Sender: TObject);
var
  len: integer;
  sep: char;
begin
  len := 3;
  sep := ' ';

  Memo1.Lines.Add(IntToStrFormat(0, sep, len));
  Memo1.Lines.Add(IntToStrFormat(-1, sep, len));
  Memo1.Lines.Add(IntToStrFormat(1, sep, len));
  Memo1.Lines.Add(IntToStrFormat(-12, sep, len));
  Memo1.Lines.Add(IntToStrFormat(12, sep, len));
  Memo1.Lines.Add(IntToStrFormat(-123, sep, len));
  Memo1.Lines.Add(IntToStrFormat(123, sep, len));
  Memo1.Lines.Add(IntToStrFormat(-1234, sep, len));
  Memo1.Lines.Add(IntToStrFormat(1234, sep, len));
  Memo1.Lines.Add(IntToStrFormat(-12345, sep, len));
  Memo1.Lines.Add(IntToStrFormat(12345, sep, len));
  Memo1.Lines.Add(IntToStrFormat(-123456789, sep, len));
  Memo1.Lines.Add(IntToStrFormat(123456789, sep, len));

  // full int64 range: $8000000000000000 and $7FFFFFFFFFFFFFFF
  Memo1.Lines.Add(IntToStrFormat(Low(int64), sep, len));
  Memo1.Lines.Add(IntToStrFormat(High(int64), sep, len));
end;
« Last Edit: August 03, 2020, 02:30:09 pm by process_1 »

Kays

  • Full Member
  • ***
  • Posts: 211
  • Whasup!?
    • KaiBurghardt.de
Re: IntToStr with national convention?
« Reply #18 on: August 03, 2020, 03:15:29 pm »
I like to write expressions if possible. They are usually easier to understand than actually having to think like a computer, “iterating” through the source code lines. So:
Code: Pascal
{$mode objFPC}
{$longStrings on}

uses
        math;

resourceString
        groupSeparator = ',';

function groupedIntegerImage(const x: nativeUInt): string;
var
        /// temporarily stores the length of the result string
        /// (signed, so that n-20 and friends can go negative instead of wrapping around)
        n: nativeInt;
        /// allows to choose between strings based on a Boolean expression
        separatorOption: array[Boolean] of string;
begin
        // initial conversion to string
        str(x, result);

        n := length(result);
        // separating a single thousands digit is not recommended
        // thus we start at values 10,000 and above.
        if n > 4 then
        begin
                separatorOption[false] := '';
                separatorOption[true] := groupSeparator;

                result :=
                        result[max(1, n-20)..max(0, n-18)] + separatorOption[n>18] +
                        result[max(1, n-17)..max(0, n-15)] + separatorOption[n>15] +
                        result[max(1, n-14)..max(0, n-12)] + separatorOption[n>12] +
                        result[max(1, n-11)..max(0, n-9)] + separatorOption[n>9] +
                        result[max(1, n-8)..max(0, n-6)] + separatorOption[n>6] +
                        result[max(1, n-5)..n-3] + groupSeparator +
                        result[n-2..n];
        end;
end;
There’s probably a nicer way to retrieve sub-strings and I deliberately avoided the “problem” of the negative sign, because I just wanted to show the principle.
Yours Sincerely
Kai Burghardt
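
For the negative sign that the expression version leaves out, a loop with Copy is one possible way to do it. This is only a sketch under my own assumptions: the name GroupedInt is made up, the separator is passed as a string so it may be longer than one byte, and unlike the function above it also groups four-digit values.

```pascal
program GroupDemo;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper: groups the digits of Value in threes and
// keeps a leading minus sign in front of the first group.
function GroupedInt(Value: Int64; const Sep: string): string;
var
  Digits, Sign: string;
  FirstLen, i: Integer;
begin
  Digits := IntToStr(Value);
  Sign := '';
  if Value < 0 then
  begin
    Sign := '-';
    Delete(Digits, 1, 1); // strip the sign, keep only digits
  end;

  // the leading group has 1..3 digits, all later groups exactly 3
  FirstLen := (Length(Digits) - 1) mod 3 + 1;
  Result := Sign + Copy(Digits, 1, FirstLen);

  i := FirstLen + 1;
  while i <= Length(Digits) do
  begin
    Result := Result + Sep + Copy(Digits, i, 3);
    Inc(i, 3);
  end;
end;

begin
  WriteLn(GroupedInt(1234567, ','));    // 1,234,567
  WriteLn(GroupedInt(-1234, ' '));      // -1 234
  WriteLn(GroupedInt(Low(Int64), '.')); // -9.223.372.036.854.775.808
end.
```

Repeated string concatenation is not the fastest approach, but it keeps the sign handling and the group arithmetic easy to follow.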

Thaddy

  • Hero Member
  • *****
  • Posts: 10436
Re: IntToStr with national convention?
« Reply #19 on: August 03, 2020, 06:41:10 pm »
Why not try write/writeln/writestr with the formatting options already available, instead of any superfluous formatxxx()  calls?
See last part of https://www.freepascal.org/docs-html/rtl/system/write.html (the manual)
As long as the installed OS conforms to what is demanded, then the formatting is automatically correct.
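
For reference, a minimal sketch of what those formatting options look like with WriteStr. The program name and the values are made up. As far as I can tell from the linked page, the `:width` and `:width:decimals` specifiers control padding and the number of decimals; whether they also give you digit grouping is something to verify against the manual rather than take from this example.

```pascal
program WriteStrDemo;
{$mode objfpc}{$H+}

var
  s: string;
begin
  // integer right-aligned in a field of 12 characters
  WriteStr(s, 1234567:12);
  WriteLn('[', s, ']');

  // real value in a field of 10 characters with 2 decimals
  WriteStr(s, 3.14159:10:2);
  WriteLn('[', s, ']');
end.
```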
« Last Edit: August 03, 2020, 06:42:47 pm by Thaddy »
When you ask a question that is actually answered in the documentation, you are either lazy or a moron.

winni

  • Hero Member
  • *****
  • Posts: 1755
Re: IntToStr with national convention?
« Reply #20 on: August 03, 2020, 06:51:27 pm »

Sorry, but I just couldn't resist posting that.
...
So, no hard feelings I hope?

Bart

Hi!

No hard feeling!

But when I read those speed junkies I don't know whether to laugh or cry.

They should be put in front of an Apple II and made to code the sieve of Eratosthenes,
only from 1 to 32767. It was so awfully slow that you never knew if it was still working
or if you had sent it into an endless loop.
That is the experience they miss.
So much for that.

You write:
Some national thousand separators are not single byte (at least not when you use Lazarus, where everything is UTF8).
I only know separators out of ASCII:

.,;:-/\#$

Or do you need a special separator for Disney-Land??

🚦😈🧲👀🚬🤖


Something like that?

Winni
« Last Edit: August 03, 2020, 06:53:01 pm by winni »

Kays

  • Full Member
  • ***
  • Posts: 211
  • Whasup!?
    • KaiBurghardt.de
Re: IntToStr with national convention?
« Reply #21 on: August 03, 2020, 07:42:48 pm »
[…]I only know separators out of ASCII:[…]
Well, there’s the recommendation to use U+2009 “thin space”.
Yours Sincerely
Kai Burghardt

winni

  • Hero Member
  • *****
  • Posts: 1755
Re: IntToStr with national convention?
« Reply #22 on: August 03, 2020, 08:06:07 pm »
Well, there’s the recommendation to use U+2009 “thin space”.

Thanx - I didn't know that.

But the situation is quite confusing:

* If we work with ISO 31-0 we take a national separator from the well-known ASCII signs, including #32. Then we don't need the "thin space".

* If we work with the recommendation for the "thin space", which supersedes ISO 31-0, then we don't need the other ASCII signs anymore.

In the end there will be not only the confusion with the national signs but also the "thin space".

But as we don't want to win a typography contest, we can take the "normal space".

Winni

Bart

  • Hero Member
  • *****
  • Posts: 3897
    • Bart en Mariska's Webstek
Re: IntToStr with national convention?
« Reply #23 on: August 03, 2020, 10:20:49 pm »
Or do you need a special separator for Disney-Land??
🚦😈🧲👀🚬🤖

See for example https://bugs.freepascal.org/view.php?id=26803.
Quote
e.g. the Slovak date separator is 2 chars: a dot followed by a space

Also, as an example, ThousandSeparator can be U+00A0 (non-breaking space), which in UTF-8 is represented by two bytes (C2 A0).

Bart
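
The byte counts mentioned above can be checked with a few lines. A minimal sketch, with made-up constant names, that writes the UTF-8 bytes out explicitly so it does not depend on the encoding of the source file:

```pascal
program SepBytes;
{$mode objfpc}{$H+}

const
  // U+00A0 NO-BREAK SPACE is C2 A0 in UTF-8
  NbspUtf8: string = #$C2#$A0;
  // U+2009 THIN SPACE is E2 80 89 in UTF-8
  ThinSpaceUtf8: string = #$E2#$80#$89;

begin
  // Length counts bytes here, not characters
  WriteLn('NBSP bytes:       ', Length(NbspUtf8));
  WriteLn('thin space bytes: ', Length(ThinSpaceUtf8));

  // neither separator fits into a single-byte Char,
  // but both concatenate fine as strings
  WriteLn('1' + ThinSpaceUtf8 + '234' + NbspUtf8 + '567');
end.
```

Whether the last line displays properly depends on the console being set to UTF-8.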

Bart

  • Hero Member
  • *****
  • Posts: 3897
    • Bart en Mariska's Webstek
Re: IntToStr with national convention?
« Reply #24 on: August 07, 2020, 10:11:45 pm »
Slightly adapted to accept values and separators of any length.

Code: Pascal
function InsertThousandSep(const ValueS, AThousandSep: String): String;
var
  MaxLen, ResPos, SLen, i, j: Integer;
begin
  Result := '';
  SLen := Length(ValueS);
  MaxLen := SLen + ((SLen - 1) div 3) * Length(AThousandSep); //Max needed seps = ((SLen - 1) div 3)
  SetLength(Result, MaxLen);
  ResPos := MaxLen;
  for i := Length(ValueS) downto 1 do
  begin
    if (SLen <> i) and ((SLen-i) mod 3 = 0) then
    begin
      for j := Length(AThousandSep) downto 1 do
      begin
        Result[ResPos] := AThousandSep[j];
        Dec(ResPos);
      end;
    end;
    Result[ResPos] := ValueS[i];
    Dec(ResPos);
  end;
end;

(And a full 1% faster than my previous effort, yeah... )

Bart

marcov

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8715
  • FPC developer.
Re: IntToStr with national convention?
« Reply #25 on: August 07, 2020, 10:44:21 pm »
Awful slow.

0.000569  Milliseconds. Awful slow.

Each. How many integers are there in a database dump?

Anyway, I don't think it would be wise to change IntToStr. Changing that now would break too much, since most integer->string conversions are probably not for presentation/GUI purposes. Just make a different one with a different name for internationalized versions.
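
A separate routine along those lines could be sketched like this. The name IntToStrGrouped is made up, and going through Format with the %n specifier is just one possible route, not something proposed in this thread. Note two limits of the sketch: the value passes through a floating-point type, so very large Int64 values may lose precision, and TFormatSettings.ThousandSeparator is a single Char, so multi-byte separators do not fit.

```pascal
program LocaleInt;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper, kept apart from IntToStr on purpose.
function IntToStrGrouped(Value: Int64; const FS: TFormatSettings): string;
begin
  // %n formats a floating-point value with thousand separators;
  // .0 asks for no decimals
  Result := Format('%.0n', [Value * 1.0], FS);
end;

var
  FS: TFormatSettings;
begin
  // start from the settings of the running program and
  // pick a separator explicitly for the demonstration
  FS := DefaultFormatSettings;
  FS.ThousandSeparator := '.';

  WriteLn(IntToStrGrouped(1234567, FS)); // grouped, for presentation
  WriteLn(IntToStr(1234567));            // IntToStr itself stays untouched
end.
```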
« Last Edit: August 07, 2020, 10:46:22 pm by marcov »

winni

  • Hero Member
  • *****
  • Posts: 1755
Re: IntToStr with national convention?
« Reply #26 on: August 07, 2020, 11:02:48 pm »

Each. How many integers are there in a database dump?


1 / 0.000569 ms = ~ 1757 per millisecond
                =   1 757 000 per second

Enough for your DB?

First compute, then whine.

Winni

marcov

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8715
  • FPC developer.
Re: IntToStr with national convention?
« Reply #27 on: August 07, 2020, 11:07:03 pm »
                       =    1 757 000 per second

20 per row, 87000 rows: that is 1.74 million conversions, so about a second at that rate. And even that is at full CPU, and I assume the database also has other things to do than converting integers. Like, shudder, floats :_)

Quote
Enough for your DB?

First compute, then whine.

First think if a value is useful, then compute it.

winni

  • Hero Member
  • *****
  • Posts: 1755
Re: IntToStr with national convention?
« Reply #28 on: August 07, 2020, 11:17:06 pm »
Hi!

Most of the time in a database dump is spent by the I/O process waiting until ready, even with a cached Linux disc.

Winni

marcov

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8715
  • FPC developer.
Re: IntToStr with national convention?
« Reply #29 on: August 08, 2020, 12:15:12 am »
Most of the time in a database dump is spent by the I/O process waiting until ready, even with a cached Linux disc.

For binary dumping maybe. For textual output it is usually easily measurable. 
