Great Expectations: Part 1

1. Introduction

HHTTTHHHHHHTTTTHHTTHHHTHHHTHTTHHHTTTHHHHTTHTHHTTHH

Displayed above is the result of 50 consecutive random coin flips. In total there are 29 heads and 21 tails, not too outlandish given our relatively small sample size. We can ask several interesting questions about strings of random coin flips like the one above, and I was working on a mathematical research project recently when I encountered one such problem:

Given a string of n independent flips of a fair coin, what is the expected length of the longest consecutive run of heads? What if the coin were weighted differently?

In the example above, the coin was fair and the longest consecutive string of heads is found from the 6^{th} flip to the 11^{th}, and it has a length of 6:

HHTTTHHHHHHTTTTHHTTHHHTHHHTHTTHHHTTTHHHHTTHTHHTTHH

This particular instance of 6 consecutive heads came from just one possible combination of 50 flips, but there are many more combinations we haven’t checked (one down, 2^{50}-1 to go!). We should be able to come up with an average over all these situations: the expected longest run of heads among 50 flips. But how do we go about solving this, or the more general cases? In today’s post, I will explain the reasoning I came up with to obtain an answer, and compare my result to an established formula from the literature to see how it fares.
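Before diving into the math, it helps to have a baseline to check against. Here is a small Monte Carlo sketch (the helper names `longest_heads_run` and `simulate` are my own, not from any library) that scans a flip string for its longest heads run and averages over many random trials:

```python
import random

def longest_heads_run(flips):
    """Length of the longest consecutive run of 'H' in a flip string."""
    best = cur = 0
    for f in flips:
        cur = cur + 1 if f == 'H' else 0
        best = max(best, cur)
    return best

def simulate(n, trials=100_000, seed=0):
    """Monte Carlo estimate of the expected longest heads run among n fair flips."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        flips = ''.join(rng.choice('HT') for _ in range(n))
        total += longest_heads_run(flips)
    return total / trials

print(longest_heads_run("HHTTTHHHHHHTTTTHHTTHHHTHHHTHTTHHHTTTHHHHTTHTHHTTHH"))  # 6
print(simulate(50))  # roughly 5 for 50 fair flips
```

The first line confirms the run of 6 from the example above; the second gives an empirical target that any closed-form answer should land near.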

2. Sticks and Stones

Because we are considering consecutive strings of a particular output, I thought it would be a good idea to view the entire collection of n flips as a row of n indistinguishable objects with dividers placed in between them, where each divider corresponds to the place where one string ends and the next one begins. Our example flips would be represented as follows:

XX|XXX|XXXXXX|XXXX|XX|XX|XXX|X|XXX|X|X|XX|XXX|XXX|XXXX|XX|X|X|XX|XX|XX

Note that there is some ambiguity on whether the first block is heads or tails, but after that, all the other signs are determined as they simply alternate. To tackle the original problem, I decided to first obtain an intermediate result: if we flipped n coins, what would be the expected number of dividers? In other words, what is the expected number of places where the sign of the coin flip changes? Once we know that, we can say something about the gaps between the dividers, which correspond to consecutive runs of the same side.
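Counting dividers is easy to do in code: a divider sits exactly where two adjacent flips differ. A quick sketch (the function name `count_dividers` is my own) applied to our example string:

```python
def count_dividers(flips):
    """Number of positions where two adjacent flips differ (i.e. sign changes)."""
    return sum(1 for a, b in zip(flips, flips[1:]) if a != b)

s = "HHTTTHHHHHHTTTTHHTTHHHTHHHTHTTHHHTTTHHHHTTHTHHTTHH"
print(count_dividers(s))  # 20 dividers, separating the 21 runs in the picture above
```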

This problem is easier than the original one, and can be approached using recursion. Let f(n) be the expected number of dividers among n flips. We would like a recursive formula involving f(n), so we consider what would happen if we formed a string of n+1 flips by taking a pre-made string of n flips and adding one flip to the end. There are two possibilities. Suppose the (n+1)th flip is the same as the nth flip: this happens with probability \frac{1}{2} and the number of dividers does not change. In the second case, the (n+1)th flip could be different from the nth flip with probability \frac{1}{2}, and this would add one divider in between the last two flips. Hence, we have

    \begin{equation*} f(n+1)=\frac{1}{2}f(n)+\frac{1}{2}(f(n)+1)=f(n)+\frac{1}{2} \end{equation*}

This is a very convenient recursion, telling us that f(n) is an arithmetic sequence with common difference \frac{1}{2}. Clearly, we have f(1)=0, so in general,

    \begin{equation*} f(n)=\frac{1}{2}(n-1) \end{equation*}
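We can sanity-check this lemma empirically. The sketch below (helper names `count_dividers` and `mean_dividers` are my own) averages the divider count over many random flip sequences; for n = 50 it should come out near f(50) = (50-1)/2 = 24.5:

```python
import random

def count_dividers(flips):
    """Number of positions where two adjacent flips differ."""
    return sum(1 for a, b in zip(flips, flips[1:]) if a != b)

def mean_dividers(n, trials=50_000, seed=1):
    """Monte Carlo estimate of the expected number of dividers among n fair flips."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        flips = [rng.random() < 0.5 for _ in range(n)]
        total += count_dividers(flips)
    return total / trials

print(mean_dividers(50))  # close to 24.5
```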

3. The Result (for fair coins)

Now, we will apply this lemma to the original problem in the following way: since the expected number of dividers is f(n)=\frac{1}{2}(n-1), it follows that the expected total number of runs is f(n)+1=\frac{1}{2}(n+1). This is because each divider simply acts as the transition point between two consecutive runs of different sides: a run of heads followed by a run of tails, or vice versa. Since there is also a run before the first divider and a run after the last divider, we add 1 to f(n). Now, because the runs alternate between heads and tails, we expect half of the total number to be runs of heads. Hence, we expect to have

    \begin{equation*} H(n)=\frac{1}{2}\cdot\frac{1}{2}(n+1)=\frac{1}{4}(n+1) \end{equation*}

consecutive runs of heads. From here, we have a better handle on the problem, since we at least know how many runs of heads we need to think about in our search for the longest one.

Now, instead of just sampling whether individual coins are heads or tails, we consider sampling entire runs. Observe that the probability that a given run of heads has length k is equal to \frac{1}{2^k}: by the definition of a run, we are guaranteed that the first flip is heads, and the probability that the next k-1 flips are all heads and that the flip after that is tails (to end the run) is \frac{1}{2^{k-1}}\cdot\frac{1}{2}=\frac{1}{2^k}. So we consider the following process: take the discrete probability distribution p(x)=\frac{1}{2^x}, and sample it a total of H(n)=\frac{1}{4}(n+1) times. We wish to find the expected value of the largest value obtained in this sampling process. To do this, I will make an approximation: treat the distribution as a continuous one rather than a discrete one.
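The claim that run lengths follow p(k)=\frac{1}{2^k} is easy to verify by experiment. This sketch (the helper name `heads_run_lengths` is my own) extracts every maximal heads run from a long random sequence and compares the empirical length frequencies to the prediction:

```python
import random

def heads_run_lengths(flips):
    """Lengths of all maximal runs of heads (True) in a flip sequence."""
    runs, cur = [], 0
    for is_heads in flips:
        if is_heads:
            cur += 1
        elif cur:
            runs.append(cur)
            cur = 0
    if cur:
        runs.append(cur)
    return runs

rng = random.Random(0)
flips = [rng.random() < 0.5 for _ in range(500_000)]
runs = heads_run_lengths(flips)
for k in range(1, 6):
    frac = runs.count(k) / len(runs)
    print(f"k={k}: empirical {frac:.3f} vs predicted {2**-k:.3f}")
```

The empirical fractions should sit very close to 1/2, 1/4, 1/8, and so on.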

The above diagram shows two distributions. The black rectangles correspond to the discrete distribution p(x)=\frac{1}{2^x}, where the midpoint of each rectangle corresponds to x and the height corresponds to \frac{1}{2^x}. The blue curve will be our approximation: it corresponds to the same distribution, P(x)=\frac{1}{2^x} (we have used a capital P), but it is continuous. Note that technically speaking, P(x) is not a probability distribution, because the area under it from 0 to \infty is not exactly 1. However, it turns out this will not matter very much for us, as we will mainly be looking at the portion of P(x) farther to the right, where it much more accurately approximates p(x). Why bother to make this seemingly fussy distinction? I actually found it easier to calculate the expected largest sampled value in the continuous case, and this is how:

Consider the above distribution (I used e^{-x} as an example). Suppose we sample it 5 times, and we want to find the expected maximum of the results. As shown in the diagram, we draw 5 vertical lines that split the entire area under the curve into 6 equal pieces, each with area \frac{1}{6}. Essentially, this means that the chance that any randomly sampled point is less than the first line equals the chance it is between the first and second lines, which equals the chance it is between the second and third lines, and so on. Probabilistically speaking, these intervals are symmetric, and hence the expected value of the largest of the five samples should be the fifth (rightmost) line.
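For e^{-x}, the equal-area lines have closed forms: the k-th line x_k satisfies \int_0^{x_k} e^{-t}\, dt = \frac{k}{6}, so x_k = -\ln(1 - k/6). A quick sketch of their positions:

```python
import math

# Positions of the 5 vertical lines cutting the unit area under e^{-x}
# into 6 equal pieces: solve 1 - exp(-x_k) = k/6 for k = 1..5.
lines = [-math.log(1 - k / 6) for k in range(1, 6)]
print([round(x, 3) for x in lines])  # [0.182, 0.405, 0.693, 1.099, 1.792]
```

The fifth line sits at \ln 6 \approx 1.792.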

Now, we apply this to our situation: with H(n) sampled points, we want to divide the total area into H(n)+1 equal pieces, and look for the rightmost dividing segment. Let the position of this segment be x(n), our desired answer to the original problem. Then, the area to the right of x(n) is

    \begin{equation*} \int_{x(n)}^{\infty} \frac{1}{2^x}\, dx=\frac{1}{\ln(2)}\cdot\frac{1}{2^{x(n)}}. \end{equation*}

However, this is also just equal to \frac{1}{H(n)+1}, so we have

    \begin{equation*} \ln(2)\cdot 2^{x(n)}=H(n)+1, \end{equation*}

and therefore

    \begin{equation*} x(n)=\log_2\left(\frac{1}{\ln(2)}(H(n)+1)\right)=\log_2\left(\frac{1}{\ln(2)}\cdot\frac{n+5}{4}\right). \end{equation*}
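Plugging in numbers makes the formula concrete. This sketch (the function names `H` and `x` mirror the notation above) evaluates it for n = 50:

```python
import math

def H(n):
    # Expected number of heads runs among n fair flips, from Section 3.
    return (n + 1) / 4

def x(n):
    # Equal-area approximation for the expected longest heads run.
    return math.log2((H(n) + 1) / math.log(2))

print(x(50))  # roughly 4.3, noticeably below the simulated value of about 5
```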

So are we done? So I thought… but I have made a mistake. It turns out that my “method” of dividing the area into equal pieces to find the expected value of a random variable does not actually hold up under closer inspection (take the simple example of f(x)=e^{-x} sampled once: the expected value is 1, but my method would suggest it is \ln(2)). I will soon post an update to see if I can correct this situation, so stay tuned…
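The one-sample counterexample can be checked numerically. For a single draw from the exponential distribution e^{-x}, the true expected value is 1, while the single equal-area line (the point splitting the area in half, i.e. the median) sits at \ln 2:

```python
import math
import random

rng = random.Random(0)
trials = 200_000
# Mean of one sample from the exponential distribution with rate 1:
mean = sum(rng.expovariate(1.0) for _ in range(trials)) / trials
print(round(mean, 2))         # ~1.0 (the true expectation)
print(round(math.log(2), 2))  # 0.69 (what the equal-area method predicts)
```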
