Random number challenge
Jacob Fugal
lukfugl at gmail.com
Fri Jun 27 09:06:35 MDT 2008
On Fri, Jun 27, 2008 at 2:03 AM, Corey Edwards <tensai at zmonkey.org> wrote:
>> Everybody counts off 1-$MAX
>> Flip a coin, Heads sets the bit, tails is a blank bit.
>> Flip until you have the bits necessary to represent $MAX
>> Congrats, you have a number for the winner of the first item.
>
> Something about that method just didn't sit right with me. Seemed like
> some numbers would come out more often than others. So I sat down and
> coded up a simulator to see if I was right. According to my results,
> this particular distribution method unfairly favors some numbers
> whenever the sample size ($MAX) is not an even multiple of 2 (2, 4, 8,
> etc.).
>
> I've attached my simulator. Here are a couple of simple examples which I
> think will illustrate the point:
<snip>
Woah woah woah. You're not using the technique that was advocated.
What Hans and Jason were referring to is flip *all* the coins. If the
full result is outside the range, reflip *all* the coins. In your
implementation, you only reflip the most recent coin when the result
is out of bounds. There's a big difference.
Here's a sketch of a proof for the uniformity of the distribution
where you always reflip all the coins:
Assume the valid range is 0..6 (7 people). We flip 3 fair coins
(ceiling of log base 2 of 7) to get a uniform distribution among 0..7,
each number having probability 1/8. If the result is 7 (outside the
valid range), we reflip all coins.
After the first flip, we have:
0 wins with probability 1/8
1 wins with probability 1/8
2 wins with probability 1/8
...
6 wins with probability 1/8
We need to reflip all coins with probability 1/8
Of the 1/8 of sets where we reflip, we get:
0 wins with probability 1/8
1 wins with probability 1/8
2 wins with probability 1/8
...
6 wins with probability 1/8
We need to reflip all coins again with probability 1/8
Combining the chances of winning without needing a reflip with the
chances of winning after one reflip, we have:
0 wins with probability 1/8 * (1 + 1/8)
1 wins with probability 1/8 * (1 + 1/8)
2 wins with probability 1/8 * (1 + 1/8)
...
6 wins with probability 1/8 * (1 + 1/8)
We need to reflip all coins again with probability 1/8^2
If we have to reflip all coins a second time, the results of the
reflip are again:
0 wins with probability 1/8
1 wins with probability 1/8
2 wins with probability 1/8
...
6 wins with probability 1/8
We need to reflip all coins again with probability 1/8
Combining the chances of winning with fewer than 2 reflips with the
chances of winning after exactly two reflips, we have:
0 wins with probability 1/8 * (1 + 1/8 + 1/8^2)
1 wins with probability 1/8 * (1 + 1/8 + 1/8^2)
2 wins with probability 1/8 * (1 + 1/8 + 1/8^2)
...
6 wins with probability 1/8 * (1 + 1/8 + 1/8^2)
We need to reflip all coins again with probability 1/8^3
We can see the pattern; the results after at most N reflips is:
0 wins with probability 1/8 * (1 + 1/8 + 1/8^2 + ... + 1/8^N)
1 wins with probability 1/8 * (1 + 1/8 + 1/8^2 + ... + 1/8^N)
2 wins with probability 1/8 * (1 + 1/8 + 1/8^2 + ... + 1/8^N)
...
6 wins with probability 1/8 * (1 + 1/8 + 1/8^2 + ... + 1/8^N)
We need to reflip all coins again with probability 1/8^(N+1)
If we allow ourselves to keep reflipping as many times as necessary,
we can represent the result by the limit:
K = sum{ k=0..inf | 1/8^k }
= 1/(1-0.125) = 8/7
0 wins with probability 1/8 * K
1 wins with probability 1/8 * K
2 wins with probability 1/8 * K
...
6 wins with probability 1/8 * K
We need to reflip all coins again with probability 0
Now, we know from school that sum{ k=0..inf | r^k } for r<1 is equal
to 1/(1 - r). So for r = 0.125, K = 1/(1-1/8) = 8/7. So the
probability of each number in 0..6 winning is 1/8 * 8/7 = 1/7. Just
what we wanted.
The problem I see with this technique from a purists standpoint is
that there's no *guarantee* that you'll ever stop reflipping. But the
need to keep reflipping never imbalances the probabilities -- the
subset of results for the valid range 0..6 always had the uniform
probability, they just summed less than one.
>From a practical standpoint, the probability of having to reflip more
than a few times depends on the number of people in the room. If you
have (2^N)+1 people in the room, your chances of having to reflip are
the worst at 0.5 - 1/2^(N+1). For most N, this is pretty close to 50%
and you may have to reflip several times to get a valid winner. On the
other hand, if you have (2^N)-1 people in the room, your chances of
having to reflip are very low, in which case this technique works very
well.
Jacob Fugal
More information about the PLUG
mailing list