# Collecting Coupons

## Problem

Coupons in cereal boxes are numbered \(1\) to \(5\), and a set of one of each is required for a prize. With one coupon per box, how many boxes on the average are required to make a complete set?

## Solution

Let \(X_{1}\), \(X_{2}\), …, \(X_{5}\) be random variables representing the number of boxes opened until receiving the \(1\)^{st}, \(2\)^{nd}, …, and \(5\)^{th} unique coupon, respectively, where the number of boxes is counted since the last box containing a unique coupon was opened. For example, \(X_{3}\) represents the number of boxes opened after the \(2\)^{nd} unique coupon was found up until the \(3\)^{3rd} unique coupon was found.

Each \(X_{i}\) is a geometric random variable associated with a different probability of success. For example, \(X_{1}\) is associated with \(p_{1} = 1\) because the \(1\)^{st} box will obviously contain a unique coupon. \(X_{2}\) is associated with \(p_{2} = \frac{4}{5}\) because only \(4\) of the \(5\) coupons are unique at that point. Probabilities for the remaining \(X_{i}\) can be calculated in a similar fashion. For geometric random variables, we also know that \(E[X_{i}] = \frac{1}{p_{i}}\).

To solve the problem, we can apply the *Linearity of Expectation* property as follows:

\[\begin{split}
E \left[\sum_{i=1}^5 X_{i} \right] &= \sum_{i=1}^5 E[X_{i}] \\
&= \sum_{i=1}^5 \frac{1}{p_{i}} \\
&= 5 \left( \frac{1}{5} + \frac{1}{4} +
\frac{1}{3} + \frac{1}{2} + \frac{1}{1} \right) \\
&= \frac{137}{12} \approx 11.42
\end{split}\]

## Notes

When generalized to \(n\) coupons, the solution to this problem can be written as:

\[C(n) = n \sum_{i=1}^n \frac{1}{i} = nH_{n}\]

In the above, \(H_{n}\) represents the \(n\)^{th} *Harmonic Number*, which is the sum of the reciprocals of integers from \(1\) to \(n\).