Recent

Author Topic: Output of Random(SomeInteger) appears to be biased  (Read 1067 times)

Thaddy

  • Hero Member
  • *****
  • Posts: 9167
Re: Output of Random(SomeInteger) appears to be biased
« Reply #15 on: July 27, 2019, 10:30:08 pm »
Did you test with at least BigCrush?
also related to equus asinus.

Pangea

  • New member
  • *
  • Posts: 8
Re: Output of Random(SomeInteger) appears to be biased
« Reply #16 on: July 27, 2019, 11:48:48 pm »
I checked TestU01, but i seems like it only accepts powers of 2 as the output range of your PRNG. So i was not able to run the samplers against it.

Edit: If the upper range of the interval is divides the upper bound of the generator, no bias for any sampler is expected.
Edit2: Unless, of course, the underlying PRNG is also biased.
Edit3: Actually, I can convert the output of the sampler to some floating point number and assume some precision. Then i can feed to the testsuit. Maybe i try this.

Edit4: Ok, i tested the samplers against SmallCrush. Find the details and results below. However i don't think i will run this against BigCrush, because completing this takes forever and i cannot run my PC overnight. Also i have read somwhere, that the Mersenne Twister fails some test in big crush? Would this mean it will also fail for the composite number, even when the rejecting maps are working correctly?

Results:

As expected both naive non rejecting methods fail several tests. To be precise:

Code: [Select]
========= Summary results of SmallCrush =========

 Version:          TestU01 1.2.3
 Generator:        TMulSampler
 Number of statistics:  15
 Total CPU time:   00:00:41.60
 The following tests gave p-values outside [0.001, 0.9990]:
 (eps  means a value < 1.0e-300):
 (eps1 means a value < 1.0e-15):

       Test                          p-value
 ----------------------------------------------
  3  Gap                              eps 
  4  SimpPoker                        eps 
  5  CouponCollector                  eps 
  7  WeightDistrib                    eps 
  9  HammingIndep                     eps 
 10  RandomWalk1 H                    eps 
 10  RandomWalk1 M                    eps 
 10  RandomWalk1 J                    eps 
 10  RandomWalk1 C                   2.5e-5
 ----------------------------------------------
 All other tests were passed


========= Summary results of SmallCrush =========

 Version:          TestU01 1.2.3
 Generator:        TModSampler
 Number of statistics:  15
 Total CPU time:   00:00:41.26
 The following tests gave p-values outside [0.001, 0.9990]:
 (eps  means a value < 1.0e-300):
 (eps1 means a value < 1.0e-15):

       Test                          p-value
 ----------------------------------------------
  2  Collision                        eps 
  6  MaxOft                           eps 
  6  MaxOft AD                      1 - eps1
  7  WeightDistrib                    eps 
  9  HammingIndep                   5.6e-16
 10  RandomWalk1 H                    eps 
 10  RandomWalk1 M                    eps 
 10  RandomWalk1 J                    eps 
 10  RandomWalk1 R                    eps 
 10  RandomWalk1 C                    eps 
 ----------------------------------------------
 All other tests were passed
TMulSampler refers to the multiply and shift method and TModSampler refers to the modulo maping, both (at least in a somewhat similar form) currently implemented in the RTL.

 All other implemented rejecting methods pass SmallCrush without failing any test. I attached the full output log to this post.

Used method:

To feed the output of the samplers into testsuite i construct a floating point number iterative (via Random(L)/(L-1) + Random(L)/(L-1)^2 + ...) until the number has absorbed at least 52 random bits. For the test i used a generator width of 31 bit (backed by the mersenne twister generator in the RTL) and target output interval size of L= 1610612736.

I attached the source code to this post (The code is a bit ugly, but well...). To build it you need fpc, gcc, testu01 installed and all header files and libraries in the respective search path. First call "gcc -c bat.c" and after this build the pascal application. If you do not want to run against testu01, remove the testu01 define in the pascal program and it should compile on any fpc installation. Since all samplers are tested against testu01 directly after each other it is helpful to pipe the output in some file and just watch it with tail -f or something.

Feel free to run it against big or small crush yourself.

I will also remove the attached code on the older forums post, to avoid having two different sources online.

Edit5: Actually it should be Random(L)/L + Random(L)/L^2 + ... . I will rerun the test with the correct formula ....
Edit6: Ok, it yields basically the same results. I updated  the attachments.
« Last Edit: July 28, 2019, 07:14:08 pm by Pangea »

Pangea

  • New member
  • *
  • Posts: 8
Re: Output of Random(SomeInteger) appears to be biased
« Reply #17 on: July 31, 2019, 04:03:02 am »
I have seen, that the issue has been marked as resolved. However i still see two minor problems with this:

- The generated output interval for negative input appears to be one to small? Or is this intended?
- The output is still biased.

What is the stance on this? Is the bias in this version of the function deemed acceptable (since it has the same bias the 32 bit version has)?

This is the current code in trunk:
Code: Pascal  [Select]
  1. function random(l:int64): int64;
  2. var
  3.  a, b, c, d: cardinal;
  4.  q, bd, ad, bc, ac: qword;
  5.  carry: qword;
  6. begin
  7.   if (l < 0) then
  8.     begin
  9.       inc(l);
  10.       q:=qword(-l)
  11.     end
  12.   else
  13.     q:=qword(l);
  14.  
  15.   a:=mtwist_u32rand;
  16.  b:=mtwist_u32rand;
  17.  
  18.  c:=q shr 32;
  19.  d:=cardinal(q);
  20.  
  21.  bd:=b*d;
  22.  ad:=a*d;
  23.  bc:=b*c;
  24.  ac:=a*c;
  25.  
  26.  // We only need the carry bit
  27.  carry:=((bd shr 32)+cardinal(ad)+cardinal(bc)) shr 32;
  28.  
  29.  // Calculate the final result
  30.  result:=int64(carry+(ad shr 32)+(bc shr 32)+ac);
  31.  if l<0 then
  32.    result:=-result;
  33. end;
  34.