Lazarus

Free Pascal => General => Topic started by: Bart on April 08, 2017, 07:47:38 pm

Title: How to efficiently generate a list of random integers within a range?
Post by: Bart on April 08, 2017, 07:47:38 pm
I want to generate a list (array?) of NRequired random integers that range from 0 to Max (both inclusive). NReq is not known at compile time.
The list may not contain duplicate entries.
Is there a more efficient way of doing it like this (pseudocode):

Code: [Select]
  0. N := 0;
  1. NewValue := RandomRange(0, Max+1)
  2. Loop through each value in the list: is it the same as NewValue?
  3. If so, then repeat 1..2 until you come up with a value that is not in the array yet
  4. Add NewValue to array
  5. Inc(N)
  6. If (N = NRequired) then STOP

I don't care what structure is used for the list.

Bart
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: molly on April 08, 2017, 08:04:54 pm
It was discussed on this forums a while ago as well, but have a look at this SO Question (http://stackoverflow.com/questions/196017/unique-non-repeating-random-numbers-in-o1) or this blogpost (http://preshing.com/20121224/how-to-generate-a-sequence-of-unique-random-integers/).

So, it can be done more efficiently as shown in your example.
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: howardpc on April 08, 2017, 11:23:46 pm
I doubt that this is the most efficient algorithm, but it is pretty efficient.

Code: Pascal  [Select]
  1. uses
  2.   math;
  3.  
  4. type TIntArray = array of integer;
  5.  
  6.   function RandomArray(aMaxValue, aRequiredCount: integer): TIntArray;
  7.   var
  8.     usedArr: TIntArray;
  9.     maxVal, i: integer;
  10.  
  11.     function GetUniqueRandom: integer;
  12.     begin
  13.       repeat
  14.         Result:=RandomRange(0, maxVal);
  15.       until (usedArr[Result] = 0);
  16.     end;
  17.  
  18.   begin
  19.     Assert(aRequiredCount<aMaxValue,'RandomArray: impossible requirement');
  20.     Randomize;
  21.     maxVal:=aMaxValue+1;
  22.     SetLength(usedArr, maxVal);
  23.     SetLength(Result, aRequiredCount);
  24.     for i:=0 to High(Result) do begin
  25.       Result[i]:=GetUniqueRandom;
  26.       Inc(usedArr[Result[i]]);
  27.     end;
  28.   end;
           
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Thaddy on April 09, 2017, 09:25:31 am
Here's a class translated from https://github.com/preshing/RandomSequence/blob/master/randomsequence.h
It seems actually pretty good, provided you know it doesn't really pass any serious random tests. It merely fools you.
But it looks random and is unique ;) It's also faster than a linked lists + swap and remove
Code: Pascal  [Select]
  1. program uniquerand32;
  2. {$mode objfpc}
  3. type
  4.  
  5.   { UniqueRandom32 }
  6.   UniqueRandom32 = class sealed
  7.   private
  8.       class var m_index:cardinal;
  9.       class var m_intermediateOffset:Cardinal;
  10.       class var Fseedbase,Fseedoffset:Cardinal;
  11.       class function permuteQPR(x:cardinal):cardinal;static;inline;
  12.   public
  13.       class constructor UniqueRandom32;
  14.       class function Next:Cardinal;inline;static;
  15.     end;
  16.  
  17. class function UniqueRandom32.permuteQPR(x: cardinal): cardinal;
  18. const
  19.    prime:Cardinal = 4294967291;// closest prime to High(Cardinal);
  20.    Residue:Cardinal = 0;
  21. begin
  22.   if x >= prime then
  23.     Result := x
  24.   else // The 5 integers out of range are mapped to themselves.
  25.     begin
  26.       residue := (qword(x) * x) mod prime;
  27.       if x <= (prime div 2) then
  28.         result:= residue
  29.       else
  30.         result := prime - residue;
  31.     end;
  32. end;
  33.  
  34. class constructor UniqueRandom32.UniqueRandom32;
  35. begin
  36.   FSeedBase :=RandSeed;
  37.   FSeedOffset := not Randseed;
  38.   m_index := permuteQPR(permuteQPR(FseedBase) + $682f0161);
  39.   m_intermediateOffset := permuteQPR(permuteQPR(FseedOffset) + $46790905);
  40. end;
  41.  
  42. class function UniqueRandom32.Next: Cardinal;
  43. begin
  44.   Result := permuteQPR((permuteQPR(m_index) + m_intermediateOffset) xor $5bf03635);
  45.   inc(m_index);
  46. end;
  47.  
  48. { demo }
  49. var
  50.   i:integer;
  51. begin
  52.   randseed := 5469;  // must seed
  53.   //randomize;       // or randomize
  54.   for i := 1 to 100 do
  55.   begin
  56.     write(UniqueRandom32.Next:20);
  57.     if i mod 5 = 0 then writeln;
  58.   end;
  59.   readln;
  60. end.


Note that if you do any math scaling, you will loose that uniqueness, though, so it may not be suitable for Bart without further work.
Quick tested with a TList<Cardinal> from rtl-generics, List.sort and   Assert(List.IndexOf(List[i-1]) = List.LastIndexOf(List[i-1]),'Has duplicates'); over 3.000.000 entries: no duplicates as expected.
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Eugene Loza on April 09, 2017, 11:12:13 am
I've never did it that way, but shuffling looks efficient when the amount of numbers one needs to get is high:
Code: Pascal  [Select]
  1. var A: array of integer;
  2. i,j: integer;
  3. ...
  4. setlength(A,max_random_number);
  5. //fill the array
  6. for i := 0 to max_random_number do A[i] := i;
  7. //shuffle the array (double inefficiency)
  8. for i := 0 to how_many_random_numbers_needed do begin
  9.   j := random(max_random_number)
  10.   tmp := A[i];
  11.   A[i] := A[j];
  12.   A[j] := tmp;
  13. end;
  14. //and get the numbers
  15. for i := 0 to how_many_random_numbers_needed do
  16.   some_result := A[i];
  17.  
Might also be done with generic lists, etc.

However, pay attention, most of the methods are CPU-efficient, but RAM-inefficient.
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Thaddy on April 09, 2017, 08:05:54 pm
I have a version of my above code that stores m_index at hi(qword) and the random at lo(qword) in one go.
When  the number of required randoms are stored in a list or array and sorted on random, the m_index value becomes a unique random for the range.
The result is a very fast random - because the random sequence is pre-calculated - after initial setup and sort at the cost of memory for the list. If this is acceptable for Bart I can put it here as well.

I also did some tests on the randomness for the original code with diehard. It's pretty good for its purpose.
Note it is more or less an LCG and the uniqueness comes from its modulo (superhigh prime). Other LCG's can do the same or massaged to do the same given a high prime for a 32 bits  unsigned. An LCG will never repeat itself for values: 0 < random < prime unless you perform an extra operation like a multiplication. (e.g 2 x 6 = 3 x 4 = 1 x 12 etc)
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Thaddy on April 10, 2017, 08:11:43 am
You may also like Sattolo's algorithm, which goes like this:
Code: Pascal  [Select]
  1. program sattolocycle;
  2. {$mode delphi}
  3. uses math;
  4. var
  5.   a:Array of cardinal;
  6.   i,j:integer;
  7.   t:cardinal;
  8. begin
  9.   a:=a.Create(0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19);
  10.   i := length(a);
  11.   while i > 0 do
  12.   begin
  13.     dec(i);
  14.     j :=randomrange(Low(a),i);
  15.     t:=a[i];a[i]:=a[j];a[j]:=t;
  16.     writeln(a[i]);
  17.   end;
  18.   readln;
  19. end.
  20.  

This is close to what Eugene wrote but here the random range decreases. It is the most efficient I found yet for defined number ranges other that full cardinal.
See: https://en.wikipedia.org/wiki/Fisher%E2%80%93Yates_shuffle#Sattolo.27s_algorithm
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Eugene Loza on April 10, 2017, 08:30:22 am
Thaddy, wow, that's fantastic!!! I knew there should have been a more efficient way to!
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Zoran on April 10, 2017, 08:54:00 am
Thaddy, wow, that's fantastic!!! I knew there should have been a more efficient way to!

Actually, Molly gave the link to stackoverflow page with this solution in the first reply in this topic. See it, as it is very well explained there.
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Thaddy on April 10, 2017, 09:02:16 am
When you google you will often find the same sources. Note that I found a direct link to the C++ sourcecode for the class that I translated. But indeed, after a few clicks on molly's links you finally arrive at the same spot.

And that reaction by Eugene was about the sattoro code, which is different from Molly's links.
[edit] Ah I see, Molly links to Fisher Yates. Sattoro is not Fisher Yates...It is an improvement on Fisher Yates (just one cycle) . Same answer:independent.
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Zoran on April 10, 2017, 09:34:26 am
When you google you will often find the same sources. Note that I found a direct link to the C++ sourcecode for the class that I translated. But indeed, after a few clicks on molly's links you finally arrive at the same spot.

And that reaction by Eugene was about the sattoro code, which is different from Molly's links.

Isn't this sattoro code same as the chosen solution in molly's link (http://stackoverflow.com/questions/196017/unique-non-repeating-random-numbers-in-o1)?  %)

I just wanted to point Eugene to that link, because it has a very nice and easy-to-understand explanation.
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Zoran on April 10, 2017, 09:40:51 am
[edit] Ah I see, Molly links to Fisher Yates. Sattoro is not Fisher Yates...It is an improvement on Fisher Yates (just one cycle) . Same answer:independent.

I think that this link explains just that algorithm - Sattoro, just doesn't call it so, but says it's modified Fisher Yates.

Edit: I didn't check myself what Sattoro is and what Fisher-Yates is, but the explained algorithm is just what you posted. :)
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Thaddy on April 10, 2017, 09:43:40 am
No, that answer on stackoverflow is a naive implementation of the same (Sattoro) optimization if I read it correctly.
He actually initializes a whole array for the range needed + 1. (1001)
Thus it wastes space for one element.
Sattoro uses space exactly as is needed. (1000)

So, yes, he came up with a solution that in effect is like Sattoro's, but his implementation is not optimal (one may say wrong, but that doesn't do him justice).
My implementation is EXACT Sattoro and allocates space for the EXACT number of elements.

Note Sattoro is indeed a modified Fisher-Yates.
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Zoran on April 10, 2017, 09:51:54 am
No, that answer on stackoverflow is a naive implementation of the same (Sattoro) optimization if I read it correctly.
He actually initializes a whole array for the range needed + 1. (1001)
Thus it wastes space for one element.
Sattoro uses space exactly as is needed. (1000)

So, yes, he came up with a solution that in effect is like Sattoro's, but his implementation is not optimal (one may say wrong, but that doesn't do him justice).
My implementation is EXACT Sattoro and allocates space for the EXACT number of elements.

You allocate 20 numbers for range 0..19, he allocates 1001 for range 0..1000.
For wanted range 0..n, both of you allocate n + 1 numbers. :)
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Thaddy on April 10, 2017, 01:17:17 pm
"Decrement max by 1 and continue. When max is 0, set max back to the size of the array - 1"
Is the wrong  order. element zero will never be processed... N=1000... as the comments tell you.
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Zoran on April 10, 2017, 02:09:12 pm
"Decrement max by 1 and continue. When max is 0, set max back to the size of the array - 1"
Is the wrong  order. element zero will never be processed... N=1000... as the comments tell you.

I'm not sure about that -- "and continue" in that sentence should be be interpreted "jump to loop start". Okay, not precise enough, I admit, but when you see the following example, which finishes with
Quote
After 11 iterations, all numbers in the array have been selected, max == 0, and the array elements are shuffled:

+--+--+--+--+--+--+--+--+--+--+--+
| 4|10| 8| 6| 2| 0| 9| 5| 1| 7| 3|
+--+--+--+--+--+--+--+--+--+--+--+

At this point, max can be reset to 10 and the process can continue.

Then it becomes clear.
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Bart on April 12, 2017, 04:08:47 pm
Sorry for the late response.
Thanks all for your input (both public and via PM).

It'll put me on my way.

Such a function might be a nice addition to the math unit perhaps?

Bart
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Thaddy on April 14, 2019, 03:29:38 pm
Bit late, but I will suggest it as an unsort method for TArray<T>
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Thaddy on April 19, 2019, 10:32:38 am
Can we agree on this one? Then I will create a patch:
Code: Pascal  [Select]
  1. // Shuffles a sorted or unsorted dynamic array using Sattolo's algorithm
  2. // Like Fisher-Yates shuffle, but:
  3. // - produces exactly one cycle, with (length-1)! different possibilities.
  4. // - no element ends up in its starting position
  5. generic procedure shuffle<T>(var Values:array of T);
  6. var
  7.   i,j:integer;
  8.   temp:T;
  9. begin
  10.   for i := High(Values) downto 1 do
  11.   begin
  12.     j := RandomRange(Low(Values),i);
  13.     temp:=Values[i];
  14.     Values[i]:=Values[j];
  15.     Values[j]:=temp;
  16.   end;
  17. end;
Note there was a mistake in my first version, because the same mistake was (is!) at Rosetta code in many of the implementations ( 20 vs 19 iterations). This one is 100% correct.

While we're at it, may be this trivial one too?
Code: Pascal  [Select]
  1. generic procedure Swap<T>(var a,b:T);inline;
  2. var
  3.  temp:T;
  4. begin
  5.   temp :=a;a:=b;b:=temp;
  6. end;

Suggested target unit is math, but can also be sysutils.
Feed-back appreciated.
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Mike.Cornflake on April 19, 2019, 12:37:28 pm
Quote
Code: Pascal  [Select]
  1.   for i := High(Values) downto 1 do

Shouldn't that be:

Code: Pascal  [Select]
  1.   for i := High(Values) downto Low(Values)+1 do
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Zoran on April 19, 2019, 12:43:31 pm
Quote
Code: Pascal  [Select]
  1.   for i := High(Values) downto 1 do

Shouldn't that be:

Code: Pascal  [Select]
  1.   for i := High(Values) downto Low(Values)+1 do

No. Here, Values is an open array parameter, so Low(Values) is always zero and you can freely use 1 instead of Low(Values) + 1.
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: avk on April 19, 2019, 01:19:38 pm
@Thaddy, in LGenerics, a similar method exists from the very beginning, only implemented a little differently:
Code: Pascal  [Select]
  1. class procedure TGArrayHelpUtil.RandomShuffle(var A: array of T);
  2. var
  3.   I, J: SizeInt;
  4.   v: TFake;
  5. begin
  6.   for I := System.High(A) downto 1 do
  7.     begin
  8.       J := Random(I);
  9.       v := TFake(A[I]);
  10.       TFake(A[I]) := TFake(A[J]);
  11.       TFake(A[J]) := v;
  12.     end;
  13. end;
  14.  
What is the meaning of TFake?
Here is a rough performance comparison example for an array of strings:
Code: Pascal  [Select]
  1. program shuffle_cmp;
  2.  
  3. {$MODE OBJFPC}{$H+}
  4.  
  5. uses
  6.   SysUtils, DateUtils, LGArrayHelpers;
  7.  
  8. type
  9.   TStrHelper = specialize TGComparableArrayHelper<string>;
  10.  
  11. // Shuffles a sorted or unsorted dynamic array using Sattolo's algorithm
  12. // Like Fisher-Yates shuffle, but:
  13. // - produces exactly one cycle, with (length-1)! different possibilities.
  14. // - no element ends up in its starting position
  15. generic procedure shuffle<T>(var Values:array of T);
  16. var
  17.   i,j:integer;
  18.   temp:T;
  19. begin
  20.   for i := High(Values) downto 1 do
  21.   begin
  22.     j := Random(i);
  23.     temp:=Values[i];
  24.     Values[i]:=Values[j];
  25.     Values[j]:=temp;
  26.   end;
  27. end;
  28.  
  29. function RandomString(aLen: Integer): string;
  30. var
  31.   I: Integer;
  32. begin
  33.   SetLength(Result, aLen);
  34.   for I := 1 to aLen do
  35.     Result[I] := Char(Random(95) + 32);
  36. end;
  37.  
  38. const
  39.   TEST_SIZE = 1000000;
  40.  
  41. var
  42.   I: Integer;
  43.   a, b: array of string;
  44.   StartTime: TTime;
  45.   Elapsed: Int64;
  46.  
  47. begin
  48.   SetLength(a, TEST_SIZE);
  49.   for I := 0 to High(a) do
  50.     a[I] := RandomString(16);
  51.   b := Copy(a);
  52.   StartTime := Time;
  53.   specialize shuffle<string>(b);
  54.   Elapsed := MillisecondsBetween(Time, StartTime);
  55.   WriteLn('shuffle, elapsed time = ', Elapsed);
  56.   b := Copy(a);
  57.   StartTime := Time;
  58.   TStrHelper.RandomShuffle(b);
  59.   Elapsed := MillisecondsBetween(Time, StartTime);
  60.   WriteLn('RandomShuffle, elapsed time = ', Elapsed);
  61. end.  
  62.  
Result on my machine is:
Code: Pascal  [Select]
  1. shuffle, elapsed time = 101
  2. RandomShuffle, elapsed time = 12
  3.  
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Thaddy on April 19, 2019, 01:31:07 pm
I have to check if it satisfies Satollo, but indeed the randomrange is not strictly necessary.
I never use lgenerics because of several issues - may have to check again - compared to standard generic libraries.
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: avk on April 19, 2019, 01:47:44 pm
I'm not campaigning for LGenerics at all.
TFake is a simply fake of T. 8)
It is needed only to suppress reference counting and is declared as:
Code: Pascal  [Select]
  1. type
  2.  TFake = {$IFNDEF FPC_REQUIRES_PROPER_ALIGNMENT}array[0..Pred(SizeOf(T))] of Byte{$ELSE}T{$ENDIF};
  3.  
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Thaddy on April 19, 2019, 03:38:36 pm
Well, at least my code doesn't have a reference counting issue.... :D
I still suggest to include my version in the RTL.
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: BobDog on April 22, 2019, 01:28:07 am

Another
Code: Pascal  [Select]
  1.  
  2.  
  3.   program shuffle;
  4.  
  5.    var
  6.   a:array[10 .. 100] of integer;
  7.   testrange:array[-2 .. 10] of integer;
  8.   i,counter:integer;
  9.  
  10. function range(first:integer;last:integer):integer;
  11.     begin
  12.     result:= trunc( Random()*(last-first+1)) + first
  13.     end;
  14.  
  15.     procedure shuffle(var a:array of integer);
  16.     var  n,k,tmp:integer ;
  17.     begin
  18.     For n := low(a) To high(a)-1 do
  19.     begin
  20.     k:= range(n,high(a));
  21.     tmp:=a[n];
  22.     a[n]:=a[k];
  23.     a[k]:=tmp;
  24.         end;
  25.      End;
  26.  
  27.  
  28.      begin
  29.      randomize;
  30.      for i:=low(testrange) to high(testrange) do testrange[i]:=0; //initialize testrange
  31.      writeln('Range test');
  32.      writeln;
  33.      writeln('value',' ','frequency');
  34.  
  35.      for i:=1 to 50000000 do testrange[range(low(testrange),high(testrange))]+=1;
  36.      for i:=low(testrange) to high(testrange) do writeln(i,'    ',testrange[i]);
  37.      writeln;
  38.  
  39.      //do five
  40.         counter:=0;
  41.       repeat
  42.       counter+=1;
  43.      for i:=low(a) to high(a) do a[i]:=i;   //setup array
  44.      shuffle(a);
  45.      writeln('Shuffled array ', low(a),' to ', high(a));
  46.       for i:=low(a) to high(a) do write(a[i],' ');
  47.       writeln;
  48.       writeln;
  49.       until counter=5;
  50.       writeln('Press return to end');
  51.  
  52.      readln;
  53.      end.
  54.  
  55.  
Title: Re: How to efficiently generate a list of random integers within a range?
Post by: Thaddy on April 22, 2019, 09:00:49 am
That does sort, but not efficient. RTL functions need to be correct in every sense. Even if it looks simple, there needs to be a solid foundation in theory.
I will describe an example why that should be the case:
A card game with 52 cards needs a period of 2^233 for the random generator to have an equidistributed chance on every permutation possible.
That means that the very simple but extremely fast LCG as used by Delphi (2^64) is not very useful, because it will generate distinct patterns, but FPC's Mersenne twister PRNG satisfies this requirement (a period over 2^6000) but is slower. So even the choice of PRNG has influence on such a simple piece of code as a Sattolo cycle. Taken that into account I have settled on this one, but note the comments of the previous one are still valid:
Code: Pascal  [Select]
  1. generic procedure shuffle<T>(var Values:array of T);
  2. var
  3.   i,j:integer;
  4.   temp:T;
  5. begin
  6.   for i := High(Values) downto 1 do
  7.   begin
  8.     j := Random(i);
  9.     temp:=Values[i];
  10.     Values[i]:=Values[j];
  11.     Values[j]:=temp;
  12.   end;
  13. end;
This removes the speed penalty for randomrange that avk made me aware of and still satisfies both Sattolo and the PRNG limitation as described above.
It has (N-1)! (  <- permutation sign, not exclamation mark)  and a single cycle per iteration. So for N=20 it has 19 cycles, which is optimal. As side effect, none of the members will end up in it starting place, what your code actually does.

If you are interested I wrote an elaborate wiki article about the subject in general, complete with chart.