Recent

Author Topic: How to efficiently generate a list of random integers within a range?  (Read 12676 times)

Zoran

  • Hero Member
  • *****
  • Posts: 1573
    • http://wiki.lazarus.freepascal.org/User:Zoran
Re: How to efficiently generate a list of random integers within a range?
« Reply #15 on: April 10, 2017, 02:09:12 pm »
"Decrement max by 1 and continue. When max is 0, set max back to the size of the array - 1"
Is the wrong  order. element zero will never be processed... N=1000... as the comments tell you.

I'm not sure about that -- "and continue" in that sentence should be be interpreted "jump to loop start". Okay, not precise enough, I admit, but when you see the following example, which finishes with
Quote
After 11 iterations, all numbers in the array have been selected, max == 0, and the array elements are shuffled:

+--+--+--+--+--+--+--+--+--+--+--+
| 4|10| 8| 6| 2| 0| 9| 5| 1| 7| 3|
+--+--+--+--+--+--+--+--+--+--+--+

At this point, max can be reset to 10 and the process can continue.

Then it becomes clear.

Bart

  • Hero Member
  • *****
  • Posts: 3853
    • Bart en Mariska's Webstek
Re: How to efficiently generate a list of random integers within a range?
« Reply #16 on: April 12, 2017, 04:08:47 pm »
Sorry for the late response.
Thanks all for your input (both public and via PM).

It'll put me on my way.

Such a function might be a nice addition to the math unit perhaps?

Bart

Thaddy

  • Hero Member
  • *****
  • Posts: 10276
Re: How to efficiently generate a list of random integers within a range?
« Reply #17 on: April 14, 2019, 03:29:38 pm »
Bit late, but I will suggest it as an unsort method for TArray<T>
I am more like donkey than shrek

Thaddy

  • Hero Member
  • *****
  • Posts: 10276
Re: How to efficiently generate a list of random integers within a range?
« Reply #18 on: April 19, 2019, 10:32:38 am »
Can we agree on this one? Then I will create a patch:
Code: Pascal  [Select][+][-]
  1. // Shuffles a sorted or unsorted dynamic array using Sattolo's algorithm
  2. // Like Fisher-Yates shuffle, but:
  3. // - produces exactly one cycle, with (length-1)! different possibilities.
  4. // - no element ends up in its starting position
  5. generic procedure shuffle<T>(var Values:array of T);
  6. var
  7.   i,j:integer;
  8.   temp:T;
  9. begin
  10.   for i := High(Values) downto 1 do
  11.   begin
  12.     j := RandomRange(Low(Values),i);
  13.     temp:=Values[i];
  14.     Values[i]:=Values[j];
  15.     Values[j]:=temp;
  16.   end;
  17. end;
Note there was a mistake in my first version, because the same mistake was (is!) at Rosetta code in many of the implementations ( 20 vs 19 iterations). This one is 100% correct.

While we're at it, may be this trivial one too?
Code: Pascal  [Select][+][-]
  1. generic procedure Swap<T>(var a,b:T);inline;
  2. var
  3.  temp:T;
  4. begin
  5.   temp :=a;a:=b;b:=temp;
  6. end;

Suggested target unit is math, but can also be sysutils.
Feed-back appreciated.
« Last Edit: April 19, 2019, 10:42:37 am by Thaddy »
I am more like donkey than shrek

Mike.Cornflake

  • Hero Member
  • *****
  • Posts: 1251
Re: How to efficiently generate a list of random integers within a range?
« Reply #19 on: April 19, 2019, 12:37:28 pm »
Quote
Code: Pascal  [Select][+][-]
  1.   for i := High(Values) downto 1 do

Shouldn't that be:

Code: Pascal  [Select][+][-]
  1.   for i := High(Values) downto Low(Values)+1 do
Lazarus Trunk/FPC Trunk on Windows [7, 10]
  Have you tried searching this forum or the wiki?:   http://wiki.lazarus.freepascal.org/Alternative_Main_Page
  BOOKS! (Free and otherwise): http://wiki.lazarus.freepascal.org/Pascal_and_Lazarus_Books_and_Magazines

Zoran

  • Hero Member
  • *****
  • Posts: 1573
    • http://wiki.lazarus.freepascal.org/User:Zoran
Re: How to efficiently generate a list of random integers within a range?
« Reply #20 on: April 19, 2019, 12:43:31 pm »
Quote
Code: Pascal  [Select][+][-]
  1.   for i := High(Values) downto 1 do

Shouldn't that be:

Code: Pascal  [Select][+][-]
  1.   for i := High(Values) downto Low(Values)+1 do

No. Here, Values is an open array parameter, so Low(Values) is always zero and you can freely use 1 instead of Low(Values) + 1.

avk

  • Sr. Member
  • ****
  • Posts: 273
    • my self-education project
Re: How to efficiently generate a list of random integers within a range?
« Reply #21 on: April 19, 2019, 01:19:38 pm »
@Thaddy, in LGenerics, a similar method exists from the very beginning, only implemented a little differently:
Code: Pascal  [Select][+][-]
  1. class procedure TGArrayHelpUtil.RandomShuffle(var A: array of T);
  2. var
  3.   I, J: SizeInt;
  4.   v: TFake;
  5. begin
  6.   for I := System.High(A) downto 1 do
  7.     begin
  8.       J := Random(I);
  9.       v := TFake(A[I]);
  10.       TFake(A[I]) := TFake(A[J]);
  11.       TFake(A[J]) := v;
  12.     end;
  13. end;
  14.  
What is the meaning of TFake?
Here is a rough performance comparison example for an array of strings:
Code: Pascal  [Select][+][-]
  1. program shuffle_cmp;
  2.  
  3. {$MODE OBJFPC}{$H+}
  4.  
  5. uses
  6.   SysUtils, DateUtils, LGArrayHelpers;
  7.  
  8. type
  9.   TStrHelper = specialize TGComparableArrayHelper<string>;
  10.  
  11. // Shuffles a sorted or unsorted dynamic array using Sattolo's algorithm
  12. // Like Fisher-Yates shuffle, but:
  13. // - produces exactly one cycle, with (length-1)! different possibilities.
  14. // - no element ends up in its starting position
  15. generic procedure shuffle<T>(var Values:array of T);
  16. var
  17.   i,j:integer;
  18.   temp:T;
  19. begin
  20.   for i := High(Values) downto 1 do
  21.   begin
  22.     j := Random(i);
  23.     temp:=Values[i];
  24.     Values[i]:=Values[j];
  25.     Values[j]:=temp;
  26.   end;
  27. end;
  28.  
  29. function RandomString(aLen: Integer): string;
  30. var
  31.   I: Integer;
  32. begin
  33.   SetLength(Result, aLen);
  34.   for I := 1 to aLen do
  35.     Result[I] := Char(Random(95) + 32);
  36. end;
  37.  
  38. const
  39.   TEST_SIZE = 1000000;
  40.  
  41. var
  42.   I: Integer;
  43.   a, b: array of string;
  44.   StartTime: TTime;
  45.   Elapsed: Int64;
  46.  
  47. begin
  48.   SetLength(a, TEST_SIZE);
  49.   for I := 0 to High(a) do
  50.     a[I] := RandomString(16);
  51.   b := Copy(a);
  52.   StartTime := Time;
  53.   specialize shuffle<string>(b);
  54.   Elapsed := MillisecondsBetween(Time, StartTime);
  55.   WriteLn('shuffle, elapsed time = ', Elapsed);
  56.   b := Copy(a);
  57.   StartTime := Time;
  58.   TStrHelper.RandomShuffle(b);
  59.   Elapsed := MillisecondsBetween(Time, StartTime);
  60.   WriteLn('RandomShuffle, elapsed time = ', Elapsed);
  61. end.  
  62.  
Result on my machine is:
Code: Pascal  [Select][+][-]
  1. shuffle, elapsed time = 101
  2. RandomShuffle, elapsed time = 12
  3.  

Thaddy

  • Hero Member
  • *****
  • Posts: 10276
Re: How to efficiently generate a list of random integers within a range?
« Reply #22 on: April 19, 2019, 01:31:07 pm »
I have to check if it satisfies Satollo, but indeed the randomrange is not strictly necessary.
I never use lgenerics because of several issues - may have to check again - compared to standard generic libraries.
I am more like donkey than shrek

avk

  • Sr. Member
  • ****
  • Posts: 273
    • my self-education project
Re: How to efficiently generate a list of random integers within a range?
« Reply #23 on: April 19, 2019, 01:47:44 pm »
I'm not campaigning for LGenerics at all.
TFake is a simply fake of T. 8)
It is needed only to suppress reference counting and is declared as:
Code: Pascal  [Select][+][-]
  1. type
  2.  TFake = {$IFNDEF FPC_REQUIRES_PROPER_ALIGNMENT}array[0..Pred(SizeOf(T))] of Byte{$ELSE}T{$ENDIF};
  3.  

Thaddy

  • Hero Member
  • *****
  • Posts: 10276
Re: How to efficiently generate a list of random integers within a range?
« Reply #24 on: April 19, 2019, 03:38:36 pm »
Well, at least my code doesn't have a reference counting issue.... :D
I still suggest to include my version in the RTL.
I am more like donkey than shrek

BobDog

  • New Member
  • *
  • Posts: 40
Re: How to efficiently generate a list of random integers within a range?
« Reply #25 on: April 22, 2019, 01:28:07 am »

Another
Code: Pascal  [Select][+][-]
  1.  
  2.  
  3.   program shuffle;
  4.  
  5.    var
  6.   a:array[10 .. 100] of integer;
  7.   testrange:array[-2 .. 10] of integer;
  8.   i,counter:integer;
  9.  
  10. function range(first:integer;last:integer):integer;
  11.     begin
  12.     result:= trunc( Random()*(last-first+1)) + first
  13.     end;
  14.  
  15.     procedure shuffle(var a:array of integer);
  16.     var  n,k,tmp:integer ;
  17.     begin
  18.     For n := low(a) To high(a)-1 do
  19.     begin
  20.     k:= range(n,high(a));
  21.     tmp:=a[n];
  22.     a[n]:=a[k];
  23.     a[k]:=tmp;
  24.         end;
  25.      End;
  26.  
  27.  
  28.      begin
  29.      randomize;
  30.      for i:=low(testrange) to high(testrange) do testrange[i]:=0; //initialize testrange
  31.      writeln('Range test');
  32.      writeln;
  33.      writeln('value',' ','frequency');
  34.  
  35.      for i:=1 to 50000000 do testrange[range(low(testrange),high(testrange))]+=1;
  36.      for i:=low(testrange) to high(testrange) do writeln(i,'    ',testrange[i]);
  37.      writeln;
  38.  
  39.      //do five
  40.         counter:=0;
  41.       repeat
  42.       counter+=1;
  43.      for i:=low(a) to high(a) do a[i]:=i;   //setup array
  44.      shuffle(a);
  45.      writeln('Shuffled array ', low(a),' to ', high(a));
  46.       for i:=low(a) to high(a) do write(a[i],' ');
  47.       writeln;
  48.       writeln;
  49.       until counter=5;
  50.       writeln('Press return to end');
  51.  
  52.      readln;
  53.      end.
  54.  
  55.  

Thaddy

  • Hero Member
  • *****
  • Posts: 10276
Re: How to efficiently generate a list of random integers within a range?
« Reply #26 on: April 22, 2019, 09:00:49 am »
That does sort, but not efficient. RTL functions need to be correct in every sense. Even if it looks simple, there needs to be a solid foundation in theory.
I will describe an example why that should be the case:
A card game with 52 cards needs a period of 2^233 for the random generator to have an equidistributed chance on every permutation possible.
That means that the very simple but extremely fast LCG as used by Delphi (2^64) is not very useful, because it will generate distinct patterns, but FPC's Mersenne twister PRNG satisfies this requirement (a period over 2^6000) but is slower. So even the choice of PRNG has influence on such a simple piece of code as a Sattolo cycle. Taken that into account I have settled on this one, but note the comments of the previous one are still valid:
Code: Pascal  [Select][+][-]
  1. generic procedure shuffle<T>(var Values:array of T);
  2. var
  3.   i,j:integer;
  4.   temp:T;
  5. begin
  6.   for i := High(Values) downto 1 do
  7.   begin
  8.     j := Random(i);
  9.     temp:=Values[i];
  10.     Values[i]:=Values[j];
  11.     Values[j]:=temp;
  12.   end;
  13. end;
This removes the speed penalty for randomrange that avk made me aware of and still satisfies both Sattolo and the PRNG limitation as described above.
It has (N-1)! (  <- permutation sign, not exclamation mark)  and a single cycle per iteration. So for N=20 it has 19 cycles, which is optimal. As side effect, none of the members will end up in it starting place, what your code actually does.

If you are interested I wrote an elaborate wiki article about the subject in general, complete with chart.
« Last Edit: April 22, 2019, 09:10:47 am by Thaddy »
I am more like donkey than shrek

WayneSherman

  • Jr. Member
  • **
  • Posts: 70
Re: How to efficiently generate a list of random integers within a range?
« Reply #27 on: February 12, 2020, 04:02:37 am »
LINE 8:     j := Random(i);

According to the Wikipedia article for Fisher-Yates shuffle, the random number j is selected:
Quote
j ← random integer such that 0 ≤ j ≤ i

But according to fpc docs the Random function:
Quote
Random returns a random number larger or equal to 0 and strictly less than argument L

So, should the code be corrected to this?:
j := Random(i+1)

If so, the same correction should be made to TGArrayHelpUtil.RandomShuffle in LGenerics.

« Last Edit: February 12, 2020, 04:07:52 am by WayneSherman »

Thaddy

  • Hero Member
  • *****
  • Posts: 10276
Re: How to efficiently generate a list of random integers within a range?
« Reply #28 on: February 12, 2020, 09:03:46 am »
You are right. Note I would recommend a Sattolo cycle instead of Fisher-Yates. That is a specialized Fisher Yates that is more efficient.
Here's my example I wrote for Rosetta code:
Code: Pascal  [Select][+][-]
  1. program sattolocycle;
  2. {$ifdef fpc}{$mode delphi}{$endif}
  3. uses math;
  4. var
  5.   a:Array of cardinal;
  6.   i,j:integer;
  7.   t:cardinal;
  8. begin
  9.   randomize;
  10.   a:=[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19];
  11.   i := length(a);
  12.   while i > 0 do
  13.   begin
  14.     dec(i);
  15.     j :=randomrange(Low(a),i);  // actually low(a) is always 0 for dynarrays.
  16.     t:=a[i];a[i]:=a[j];a[j]:=t;
  17.     write(a[i]:4);
  18.   end;
  19. end.
This example specifies the correct range, btw, since length is used and we count from zero. (i.e. to i-1 vs i) 0>= i <=length(a)
Sample output proves this:   10  19  12  13   9  17   7  18   4   6   5   3   8   1  11    2  15  16  14
(Uncomment randomize to reproduce the exact same cycle)
« Last Edit: February 12, 2020, 09:36:50 am by Thaddy »
I am more like donkey than shrek

WayneSherman

  • Jr. Member
  • **
  • Posts: 70
Re: How to efficiently generate a list of random integers within a range?
« Reply #29 on: February 12, 2020, 03:31:34 pm »
I would recommend a Sattolo cycle instead of Fisher-Yates. That is a specialized Fisher Yates that is more efficient.

From the references I found, both algorithms have equal computational efficiency (they are basically exactly the same except for the random range that is used).  In both algorithms, for an array of length N, it requires N-1 iterations to shuffle.  Fisher-Yates (j ← random integer such that 0 ≤ j ≤ i) allows the possibility that a shuffled element will remain the same value.  This is generally what you want for a truly random shuffle.  Whereas Sattolo (j ← random integer such that 0 ≤ j < i) always forces a swap, i.e. each element will end up with a different value.  Sattolo does not allow the possibility that, in a truly random shuffle, a given element can remain the same value.

from Wikipedia:
Quote
In fact, as described below, it is quite easy to accidentally implement Sattolo's algorithm when the ordinary Fisher–Yates shuffle is intended. This will bias the results by causing the permutations to be picked from the smaller set of (n−1)! cycles of length N, instead of from the full set of all n! possible permutations.


For your Sattolo implementation, you have an error that requires an extra loop iteration.  When i = 0 then j = 0, so you are swapping a[0] with a[0].  Therefore:
Line 12:    while i > 0 do
should be
Line 12:    while i > 1 do

« Last Edit: February 12, 2020, 03:35:10 pm by WayneSherman »

 

TinyPortal © 2005-2018