Consider the probability space generated by \(5\) tosses of a fair coin. The sample space \(\Omega\) consists of all sequences of length \(5\) whose terms are letters \(H\) and \(T\). The probability of each outcome is \(\frac1{32}\). Let us denote by \(D\) the event that the number of heads in the last three tosses is at least \(1\). Let \(X\) be the total number of heads in all \(5\) tosses. The quantity \(X\) is a random variable. We want to calculate the expected value of \(X\) given the event \(D\).

Consider the event \(B_2\) that the total number of heads in all \(5\) tosses is \(2\). We know how to calculate the conditional probability \(\mathbb P\left(\left.B_2\right|D\right)\). This conditional probability is equal to \[\mathbb P\left(\left.B_2\right|D\right)= \frac{\mathbb P\left(B_2\cap D\right)}{\mathbb P(D)}. \] The probability of the event \(D\) is a little bit harder to calculate directly than the probability of its complement \(D^C\). It is therefore easier to calculate \(\mathbb P\left(D^C\right)\) first and then subtract that number from \(1\) to obtain \(\mathbb P(D)\).

The event \(D^C\) is precisely the event that the number of heads in the last three tosses is exactly \(0\). The probability of this event is \(\frac1{2^3}\). Therefore \(\mathbb P(D)=\frac78\).

The event \(B_2\cap D\) is the following event: In \(5\) tosses of the fair coin there are exactly \(2\) heads and at least one of the two heads is in the last three tosses. We can list all outcomes of this event \[B_2\cap D=\left\{HTHTT, HTTHT, HTTTH, THHTT,THTHT,THTTH, TTHHT, TTHTH, TTTHH\right\}.\] Therefore \(\mathbb P\left(B_2\cap D\right)=\frac9{32}\) and \[\mathbb P\left(\left.B_2\right|D\right)= \frac{\mathbb P\left(B_2\cap D\right)}{\mathbb P(D)}=\frac{\frac9{32}}{\frac78}=\frac9{28}.\]
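The counting above can be double-checked by brute force. The following is a minimal sketch that enumerates all \(32\) outcomes and uses exact fractions; the helper names `prob`, `in_D` and `in_B2` are chosen here for illustration and do not come from the text.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 32 sequences of H/T of length 5, each with probability 1/32.
omega = ["".join(seq) for seq in product("HT", repeat=5)]

def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def in_D(w):
    # At least one head among the last three tosses.
    return "H" in w[2:]

def in_B2(w):
    # Exactly two heads in all five tosses.
    return w.count("H") == 2

p_D = prob(in_D)
p_B2_and_D = prob(lambda w: in_B2(w) and in_D(w))
p_B2_given_D = p_B2_and_D / p_D

print(p_D, p_B2_and_D, p_B2_given_D)  # 7/8 9/32 9/28
```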

The event \(B_2\) can be written as \(B_2=\{X=2\}\). Therefore we have calculated \[\mathbb P\left(\left. X=2\right|D\right)=\frac{9}{28}.\] The quantity \(\mathbb P\left(\left. X=2\right|D\right)\) is also denoted by \(f_{X|D}(2)\). In general we can define the probability mass function of the random variable \(X\) conditioned on the event \(D\) as \[f_{X|D}(k)=\mathbb P\left(\left. X=k\right|D\right)=\frac{\mathbb P\left(\{X=k\}\cap D\right)}{\mathbb P(D)}.\]

Once we have the conditional probability mass function of the random variable \(X\) given the event \(D\) we can define and calculate the conditional expectation as \[\mathbb E\left[\left.X\right|D\right]=\sum_kkf_{X|D}(k).\]
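For the coin example, the conditional probability mass function and the conditional expectation can be computed by enumeration. The sketch below is one way to do it, assuming the same encoding of outcomes as strings of `H` and `T`; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

# All 32 equally likely outcomes of 5 tosses.
omega = ["".join(seq) for seq in product("HT", repeat=5)]

# Outcomes in D: at least one head among the last three tosses.
d_outcomes = [w for w in omega if "H" in w[2:]]

# Conditional pmf f_{X|D}(k) = P({X = k} ∩ D) / P(D).
# Outcomes are equally likely, so this is a ratio of counts.
cond_pmf = {}
for k in range(6):
    count = sum(1 for w in d_outcomes if w.count("H") == k)
    cond_pmf[k] = Fraction(count, len(d_outcomes))

# Conditional expectation E[X | D] = sum over k of k * f_{X|D}(k).
cond_expectation = sum(k * p for k, p in cond_pmf.items())

print(cond_pmf[2])        # 9/28
print(cond_expectation)   # 19/7
```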

It is easy to prove that conditional expectations satisfy the linearity property, i.e. \[\mathbb E\left[\left.X+Y\right|D\right]=\mathbb E\left[\left.X\right|D\right]+\mathbb E\left[\left.Y\right|D\right].\] It is also easy to prove that if the random variable \(X\) is independent of the event \(D\), then \(\mathbb E\left[\left.X\right|D\right]=\mathbb E\left[X\right]\).
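
As an illustration, both properties can be combined in the coin example; the symbols \(U\) and \(V\) are introduced only for this sketch. Let \(U\) be the number of heads in the first two tosses and \(V\) the number of heads in the last three tosses, so that \(X=U+V\) and \(D=\{V\geq 1\}\). The event \(D\) depends only on the last three tosses, hence \(U\) is independent of \(D\) and \(\mathbb E\left[\left.U\right|D\right]=\mathbb E\left[U\right]=1\). For \(k\in\{1,2,3\}\) we have \[f_{V|D}(k)=\frac{\binom3k\cdot\frac18}{\frac78}=\frac17\binom3k,\] and therefore \[\mathbb E\left[\left.V\right|D\right]=\frac{1\cdot 3+2\cdot 3+3\cdot 1}{7}=\frac{12}7,\qquad \mathbb E\left[\left.X\right|D\right]=\mathbb E\left[\left.U\right|D\right]+\mathbb E\left[\left.V\right|D\right]=1+\frac{12}7=\frac{19}7.\]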

In the previous problem we established that \(\mathbb E\left[\left.Y\right|Z=k\right]=\frac12+\frac{2k}3\) for every \(k\) for which \(\mathbb P\left(Z=k\right)\neq 0\). We can write the above equality as \[\mathbb E\left[\left.Y\right|Z\right]=\frac12+\frac{2Z}3.\]

Three frogs are jumping on a regular triangle, and they are skillful enough to always land at vertices. A vertex can be occupied by more than one frog. Every minute each frog must jump from the vertex where it is located to one of the other two vertices. The frogs choose where to jump independently of each other, and each frog is equally likely to jump to the left or to the right. If initially each vertex contains exactly one frog, how long does it take on average for all the frogs to meet at the same vertex?
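
An answer obtained analytically can be sanity-checked by simulation. The following is a rough Monte Carlo sketch, not a solution: it assumes the vertices are labelled \(0,1,2\) and that a jump to the left or right corresponds to subtracting or adding \(1\) modulo \(3\). The function names, the number of trials and the seed are arbitrary choices made here.

```python
import random

def minutes_until_all_meet(rng):
    """Simulate one run; return the minutes elapsed until all three frogs share a vertex."""
    positions = [0, 1, 2]  # one frog on each vertex initially
    minutes = 0
    while len(set(positions)) > 1:
        # Each frog independently jumps left (-1) or right (+1) with probability 1/2.
        positions = [(p + rng.choice((-1, 1))) % 3 for p in positions]
        minutes += 1
    return minutes

def estimate_mean_meeting_time(trials=20_000, seed=0):
    """Average the meeting time over many independent simulated runs."""
    rng = random.Random(seed)
    return sum(minutes_until_all_meet(rng) for _ in range(trials)) / trials

print(estimate_mean_meeting_time())
```

The printed value is only an estimate; increasing `trials` reduces the random error at the cost of a longer run time.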