# Coupon Collecting Problem: Find the Expectation of Boxes to Collect All Toys

## Problem 750

Each box of a certain snack contains one of five toys. Each of the five toys is equally likely to be in a given box, independently of the previous results.

(a) Suppose that you keep buying boxes until you have collected all five toys. Find the expected number of boxes that you need to buy.

(b) Find the variance and the standard deviation of the number of boxes in part (a).

## Solution.

### Solution of (a)

Let $X$ be the number of boxes that you need to buy until you have collected all five toys. Our goal is to compute the expected value $E[X]$.
To achieve this, we introduce the following random variables. Let $X_i$ be the number of boxes you need to buy to get the $i$th new toy after you have collected $i-1$ distinct toys. Then it is clear from the definition that
$X = X_1 + X_2 + X_3 + X_4 + X_5.$ For example, $X_1$ is the number of boxes you need to buy to get the first toy. Since the first box you open is guaranteed to contain a new toy, we have $X_1 = 1$.
Next, once you have the first toy, each box has a $4/5$ chance of containing a new toy and a $1/5$ chance of containing the same toy as the first one. Thus, $X_2$ is a geometric random variable with parameter $4/5$, counting the boxes up to and including the first one with a new toy. We denote this as $X_2 \sim G_{4/5}$. Similarly, we get
$X_3 \sim G_{3/5}, \quad X_4 \sim G_{2/5}, \quad \text{ and } X_5 \sim G_{1/5}.$
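The decomposition into stages can be checked numerically. The following is a minimal simulation sketch, not part of the original solution: the function name `stage_waits`, the seed, and the number of trials are all illustrative choices. It records how many boxes each stage takes and averages them over many runs. If the geometric description is right, the averages should land near $1, 5/4, 5/3, 5/2, 5$, the reciprocals of the success probabilities.

```python
import random

def stage_waits(n_toys, rng):
    """Simulate one full collection and return [X_1, ..., X_n],
    the number of boxes bought to obtain each successive new toy."""
    seen = set()
    waits = []
    count = 0  # boxes bought since the last new toy appeared
    while len(seen) < n_toys:
        toy = rng.randrange(n_toys)  # each toy equally likely
        count += 1
        if toy not in seen:
            seen.add(toy)
            waits.append(count)
            count = 0
    return waits

rng = random.Random(0)  # fixed seed (arbitrary) for reproducibility
trials = 20000          # arbitrary sample size
runs = [stage_waits(5, rng) for _ in range(trials)]

# Empirical mean of each stage X_1, ..., X_5
stage_means = [sum(run[i] for run in runs) / trials for i in range(5)]
print(stage_means)
```

The first entry is always exactly 1, matching $X_1 = 1$; the later entries fluctuate from run to run, with the last stage varying the most.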

By the linearity of expectation, we have
\begin{align*}
E[X] &= E[X_1 + X_2 + X_3 + X_4 + X_5]\\
&= E[X_1] + E[X_2] + E[X_3] + E[X_4] + E[X_5]\\
&= E[1] + E[G_{4/5}] + E[G_{3/5}] + E[G_{2/5}] + E[G_{1/5}].
\end{align*}
Now, the expected value of a geometric random variable $G_p$ is given by
$E[G_p] = \frac{1}{p}.$ It follows that
\begin{align*}
E[X] &= 1 + \frac{5}{4} +\frac{5}{3} + \frac{5}{2} + \frac{5}{1}\\
&= 5\left(1+ \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5}\right)\\
&= \frac{137}{12} \approx 11.41667.
\end{align*}
Thus, the expected number of boxes that you need to buy to collect all five toys is $137/12 \approx 11.41667$.
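The sum above can be reproduced in exact arithmetic. This is a small sketch using Python's standard `fractions` module; the helper name `expected_boxes` is hypothetical, and the generalization to `n_toys` equally likely toys is an assumption that goes slightly beyond the five-toy problem.

```python
from fractions import Fraction

def expected_boxes(n_toys):
    """Sum of E[G_p] = 1/p over the stages, where p = (n_toys - k) / n_toys
    is the chance of a new toy once k distinct toys are already collected."""
    return sum(Fraction(n_toys, n_toys - k) for k in range(n_toys))

exact = expected_boxes(5)
print(exact, float(exact))  # the exact fraction and its decimal value
```

For five toys this gives the fraction $137/12$, in agreement with the hand computation.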

### Solution of (b)

Now we compute the variance of $X$. Recall that the variance of a geometric random variable $G_p$ is given by
$V(G_p) = \frac{1-p}{p^2}.$ We have seen that
\begin{align*}
X &= X_1 + X_2 + X_3 + X_4 + X_5\\
&\sim 1 + G_{4/5} + G_{3/5} + G_{2/5} + G_{1/5}
\end{align*}
and since the random variables $X_i$ are mutually independent, we obtain
\begin{align*}
V(X) &= V(1 + G_{4/5} + G_{3/5} + G_{2/5} + G_{1/5})\\[6pt]
&= V(1) + V(G_{4/5}) + V(G_{3/5}) + V(G_{2/5}) + V(G_{1/5})\\[6pt]
&= 0 + \frac{1-\frac{4}{5}}{\left(\frac{4}{5}\right)^2} + \frac{1-\frac{3}{5}}{\left(\frac{3}{5}\right)^2} + \frac{1-\frac{2}{5}}{\left(\frac{2}{5}\right)^2} + \frac{1-\frac{1}{5}}{\left(\frac{1}{5}\right)^2}\\[6pt]
&\approx 25.17361.
\end{align*}
Thus, the variance is $V(X) \approx 25.17361$. The standard deviation is the square root of the variance. Hence, we obtain
\[\sigma(X) \approx \sqrt{25.17361} \approx 5.01733.\]
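As a check on the arithmetic, the variance can be summed exactly with the formula $V(G_p) = \frac{1-p}{p^2}$. The sketch below is illustrative only; the helper name `variance_boxes` is hypothetical, and it relies on the independence of the stages in the same way as the derivation above.

```python
from fractions import Fraction
from math import sqrt

def variance_boxes(n_toys):
    """Sum of V(G_p) = (1 - p) / p^2 over the stages, assuming the stage
    waiting times are independent so that their variances add."""
    total = Fraction(0)
    for k in range(n_toys):
        p = Fraction(n_toys - k, n_toys)  # chance of a new toy with k collected
        total += (1 - p) / p**2           # k = 0 contributes 0, matching V(1) = 0
    return total

var = variance_boxes(5)
std = sqrt(float(var))
print(var, float(var), std)
```

For five toys the exact variance is $3625/144$, which rounds to the value $25.17361$ used above.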

## Remark.

This type of problem is called a coupon collecting problem, also known as the coupon collector's problem.