
Conditional expectation of continuous random variables

Conditioning on events of positive probability

Assume that \(X\) is a uniform random variable on \([0,1]\). We want to calculate the expected value of \(X\) given that \(X\) is bigger than \(\frac12\). Intuitively the answer is \(\frac34\) because if \(X > \frac12\) and \(X\) is uniform, then \(X\) should be uniform on \(\left[\frac12,1\right]\).

To make this precise we should identify one important event in this example. Let \(A\) be the event that the random variable \(X\) is bigger than \(\frac12\), that is, \(A=\left\{X > \frac12\right\}\). We want to calculate \(\mathbb E\left[\left.X\right|A\right]\). We can now define the conditional cumulative distribution function of \(X\) given the event \(A\) in the following way \[F_{X|A}(t)=\mathbb P\left(\left.X\leq t\right|A\right)=\frac{\mathbb P\left(\left\{X\leq t\right\}\cap A\right)}{\mathbb P\left(A\right)}.\] Let us first consider the case \(t < \frac12\). Then \(A\cap \{X\leq t\}=\emptyset\), hence \(F_{X|A}(t)=0\) for \(t < \frac12\). Similarly, if \(t > 1\), then \(\{X\leq t\}\cap A=A\) and \(F_{X|A}(t)=1\). Assume now that \(t\in\left[\frac12,1\right]\). We rewrite the equation for \(F_{X|A}(t)\) as \[F_{X|A}(t)=\frac{\mathbb P\left(\frac12 < X\leq t\right)}{\mathbb P\left(A\right)} =\frac{t-\frac12}{\frac12}=2\left(t-\frac12\right).\] We have now formally verified that, conditioned on the event \(\left\{X > \frac12\right\}\), the random variable \(X\) has uniform distribution on \(\left[\frac12,1\right]\). Its expected value is therefore easy to calculate, and it equals \(\frac34\).
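The calculation above can be checked empirically. The following sketch (a hypothetical illustration, not part of the notes) estimates \(\mathbb E\left[\left.X\right|A\right]\) by simulating uniform samples and averaging only those that land in the conditioning event:

```python
import random

# Monte Carlo estimate of E[X | X > 1/2] for X uniform on [0, 1].
# The exact value derived above is 3/4.
random.seed(0)
n = 1_000_000
samples = [random.random() for _ in range(n)]
conditioned = [x for x in samples if x > 0.5]  # keep only the event A
estimate = sum(conditioned) / len(conditioned)
print(estimate)  # close to 0.75
```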

Definition. The conditional probability mass function of a discrete random variable \(X\) given an event \(D\) with \(\mathbb P(D) > 0\) is the function \(f_{X|D}:\mathbb R\to[0,1]\) defined as \[f_{X|D}(k)=\mathbb P\left(\left.X=k\right|D\right).\] For a random variable with continuous distribution, the analogous object is the conditional probability density function, obtained by differentiating the conditional cumulative distribution function: \(f_{X|D}(t)=\frac{d}{dt}F_{X|D}(t)\).
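A small discrete example of this definition (hypothetical, chosen only for illustration): a fair die roll \(X\) conditioned on the event \(D=\{X\text{ is even}\}\) puts mass \(\frac13\) on each of \(2,4,6\).

```python
from fractions import Fraction

# Conditional pmf of a fair die roll X given D = {X is even}:
# f_{X|D}(k) = P(X = k, D) / P(D).
pmf = {k: Fraction(1, 6) for k in range(1, 7)}
D = {2, 4, 6}
p_D = sum(pmf[k] for k in D)  # P(D) = 1/2
cond_pmf = {k: (pmf[k] / p_D if k in D else Fraction(0)) for k in pmf}
print(cond_pmf)  # mass 1/3 on each of 2, 4, 6; mass 0 elsewhere
```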

Conditioning on events of the form \(\{X=\alpha\}\) where \(X\) is a random variable with continuous distribution

We will now condition on random variables instead of on events. Here is one type of problem that we want to solve:

Problem 1. If \(A\) and \(B\) are independent normal random variables with \(A\sim N(3,16)\) and \(B\sim N(5,9)\), what is the conditional probability density function of \(A\) given that \(A+B=10\)?

The probability of the event \(\{A+B=10\}\) is equal to \(0\) because the random variable \(C=A+B\) has a continuous distribution.

However, we can still make sense of conditional probabilities given events of the type \(\{C=10\}\). The above problem will be solved later in this document.

Our goal is to develop a way to calculate conditional distributions where conditioning is performed over certain events whose probability is \(0\). We will not be able to condition over all events of probability \(0\); for example we will never be able to condition over the empty set. However, there are special events of zero probability that arise from random variables. If \(X\) is a random variable with continuous distribution, then we will look at the events of the type \(\{X=\alpha\}\) where \(\alpha\) is some constant real number. Such events have probability \(0\). However, we will approximate them by events \(\{\alpha\leq X\leq \alpha+\varepsilon\}\) that have non-zero probability whenever \(\varepsilon > 0\). Then we will let \(\varepsilon\to 0\).

Assume that \(Y\) is another random variable with continuous distribution. We want to find the conditional probability density function of \(Y\) given the event \(\{X=\alpha\}\). We will first find the conditional cumulative distribution function.

Assume that \(t\) is a real number. We define the conditional cumulative distribution function as \begin{eqnarray*} F_{Y|X=\alpha}(t)=\mathbb P\left(Y\leq t|X=\alpha\right)=\lim_{\varepsilon\to0}\frac{\mathbb P\left(Y\leq t, \alpha\leq X\leq\alpha+\varepsilon\right)}{\mathbb P\left(\alpha\leq X\leq \alpha+\varepsilon\right)}. \end{eqnarray*} The denominator of the last fraction is \(F_X(\alpha+\varepsilon)-F_X(\alpha)\), where \(F_X\) is the cumulative distribution function of \(X\). We can use the joint cumulative distribution function \(F_{X,Y}\) of the random variables \(X\) and \(Y\) to express the numerator of the fraction as \[\mathbb P\left(Y\leq t, \alpha\leq X\leq\alpha+\varepsilon\right)=\mathbb P\left(Y\leq t, X\leq \alpha+\varepsilon\right)-\mathbb P\left(Y\leq t, X\leq \alpha\right)=F_{X,Y}(\alpha+\varepsilon,t)-F_{X,Y}(\alpha,t).\] The conditional cumulative distribution function of \(Y\) given \(\{X=\alpha\}\) now becomes \begin{eqnarray*} F_{Y|X=\alpha}(t)&=&\lim_{\varepsilon\to0}\frac{F_{X,Y}(\alpha+\varepsilon,t)-F_{X,Y}(\alpha,t)}{F_X(\alpha+\varepsilon)-F_X(\alpha)} =\lim_{\varepsilon\to0}\frac{\frac{F_{X,Y}(\alpha+\varepsilon,t)-F_{X,Y}(\alpha,t)}{\varepsilon}}{\frac{F_X(\alpha+\varepsilon)-F_X(\alpha)}{\varepsilon}} = \frac{\frac{\partial}{\partial x}F_{X,Y}(\alpha,t)}{f_X(\alpha)}. \end{eqnarray*} The conditional probability density function of \(Y\) given \(\{X=\alpha\}\) is \[f_{Y|X=\alpha}(t)=\frac{d}{dt}F_{Y|X=\alpha}(t)=\frac{\frac{\partial^2}{\partial t\,\partial x}F_{X,Y}(\alpha,t)}{f_X(\alpha)}=\frac{f_{X,Y}(\alpha,t)}{f_X(\alpha)}.\]
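The \(\varepsilon\)-limit can be observed numerically on a concrete example (hypothetical, not from the notes): take the joint density \(f(x,y)=x+y\) on \([0,1]^2\), for which \(f_X(a)=a+\frac12\) and the exact conditional cdf is \(F_{Y|X=a}(t)=\frac{at+t^2/2}{a+1/2}\). The quotient \(\frac{\mathbb P(Y\leq t,\,a\leq X\leq a+\varepsilon)}{\mathbb P(a\leq X\leq a+\varepsilon)}\) should approach this value as \(\varepsilon\to0\):

```python
# Joint density f(x, y) = x + y on the unit square (hypothetical example).
# Exact limit: F_{Y|X=a}(t) = (a*t + t^2/2) / (a + 1/2).
a, t = 0.3, 0.7
exact = (a * t + t**2 / 2) / (a + 0.5)

def quotient(eps):
    # P(Y <= t, a <= X <= a + eps) and P(a <= X <= a + eps),
    # both computed in closed form for this density.
    num = t * (eps * a + eps**2 / 2) + (t**2 / 2) * eps
    den = eps * a + eps**2 / 2 + eps / 2
    return num / den

for eps in (0.1, 0.01, 0.001):
    print(eps, quotient(eps), exact)  # quotient approaches exact
```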

Remark. Observe that for discrete random variables we would obtain exactly the same final equation, except that in the discrete case \(f_{X,Y}\) would be the joint probability mass function instead of the joint probability density function. The same would hold for \(f_X\).

Notation. It is common to use the notation \(f_{Y|X}(t|\alpha)\) instead of \(f_{Y|X=\alpha}(t)\).

Problem 2. The joint density of \(X\) and \(Y\) is given by \[f(x,y)=\frac{Cx}{y^2}e^{-y^2-\frac{x^2}{y^2}},\quad x > 0, y > 0.\]
  • (a) Determine the constant \(C\).
  • (b) Calculate the conditional expectation \(\mathbb E\left[X|Y=y\right]\).
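A numerical cross-check for Problem 2: the values \(C=\frac{4}{\sqrt\pi}\) and \(\mathbb E\left[X|Y=y\right]=\frac{\sqrt\pi}{2}y\) are derived by hand here (they are not stated in the notes), and the quadrature below is only a sanity check of that derivation.

```python
import math

def trapezoid(f, lo, hi, n=1000):
    # Composite trapezoid rule; the integrands decay fast, so a finite
    # upper limit stands in for infinity.
    h = (hi - lo) / n
    s = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        s += f(lo + i * h)
    return s * h

def g(x, y):
    # The density of Problem 2 without the constant C.
    return (x / y**2) * math.exp(-y**2 - x**2 / y**2)

# (a) Normalization: for fixed y the x-integrand lives on the scale of y,
# so integrate x over [0, 8y].
inner = lambda y: trapezoid(lambda x: g(x, y), 0.0, 8 * y)
total = trapezoid(inner, 1e-9, 6.0)
C = 1.0 / total
print(C)  # ~ 2.2568 = 4/sqrt(pi)

# (b) E[X | Y = y0] = integral of x * f_{X|Y}(x | y0) dx.
y0 = 1.3
f_y0 = C * trapezoid(lambda x: g(x, y0), 0.0, 8 * y0)
cond_mean = (C / f_y0) * trapezoid(lambda x: x * g(x, y0), 0.0, 10 * y0)
print(cond_mean)  # ~ 1.1521 = 1.3 * sqrt(pi)/2
```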

Problem 3. The joint density of the random variables \(X\) and \(Y\) is given by \[f(x,y)=Cx^2e^{-xy}\cdot 1_{[0,1)}(x)\cdot 1_{[0,x)}(y),\] where \(C\) is a constant.
  • (a) Determine the constant \(C\).
  • (b) Determine the probability of the event \(Y > \frac{X}2\).
  • (c) Determine the conditional density of \(X\) given \(Y=y\).
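For Problem 3, a similar hedged cross-check: integrating by hand gives \(C=2e\) and \(\mathbb P\left(Y > \frac X2\right)\approx 0.4208\) (these values are derived here, not stated in the notes), and the double quadrature below confirms them numerically.

```python
import math

def trapezoid(f, lo, hi, n=500):
    # Composite trapezoid rule on [lo, hi].
    h = (hi - lo) / n
    s = 0.5 * (f(lo) + f(hi))
    for i in range(1, n):
        s += f(lo + i * h)
    return s * h

def g(x, y):
    # Density of Problem 3 without the constant C, on 0 <= y < x < 1.
    return x**2 * math.exp(-x * y)

# (a) Normalizing constant: integrate y over (0, x), then x over (0, 1).
total = trapezoid(lambda x: trapezoid(lambda y: g(x, y), 0.0, x), 0.0, 1.0)
C = 1.0 / total
print(C)  # ~ 5.4366 = 2e

# (b) P(Y > X/2): same double integral, with y restricted to (x/2, x).
p = C * trapezoid(lambda x: trapezoid(lambda y: g(x, y), x / 2, x), 0.0, 1.0)
print(p)  # ~ 0.4208
```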

Bivariate normal random variables

When dealing with bivariate normal random variables, we can often use a trick to avoid dealing with conditional probability density functions. The trick is the following: if \(X\) and \(Y\) have a bivariate normal distribution, then \(X\) can be expressed as \(X=\alpha Y+\beta Z\), where \(Z\) is a normal random variable independent of \(Y\). Alternatively, \(Y\) can be expressed as \(Y=\gamma X+ \delta W\), where \(W\) is a normal random variable independent of \(X\). You would need to calculate the constants \(\alpha\) and \(\beta\) (or the constants \(\gamma\) and \(\delta\) if you choose to express \(Y\) in terms of \(X\) and \(W\)). These constants are found by solving the system of equations obtained by matching the variances and the covariance from the covariance matrix.
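For zero-mean variables the matching equations are \(\operatorname{Cov}(X,Y)=\alpha\sigma_Y^2\) and \(\operatorname{Var}(X)=\alpha^2\sigma_Y^2+\beta^2\) when \(Z\) is standard normal. The sketch below (with a hypothetical covariance matrix, chosen for illustration) solves for \(\alpha\) and \(\beta\) and verifies the decomposition by simulation:

```python
import math
import random

# Match covariances in X = alpha*Y + beta*Z, with Z ~ N(0,1) independent
# of Y: alpha = Cov(X,Y)/Var(Y), beta = sqrt(Var(X) - Cov(X,Y)^2/Var(Y)).
sx2, sy2, c = 4.0, 1.0, 1.2  # hypothetical Var(X), Var(Y), Cov(X,Y)
alpha = c / sy2
beta = math.sqrt(sx2 - c**2 / sy2)

# Simulate the reconstructed pair and check its covariance structure.
random.seed(1)
n = 200_000
xs, ys = [], []
for _ in range(n):
    y = math.sqrt(sy2) * random.gauss(0, 1)
    x = alpha * y + beta * random.gauss(0, 1)
    xs.append(x)
    ys.append(y)
cov = sum(x * y for x, y in zip(xs, ys)) / n
var_x = sum(x * x for x in xs) / n
print(alpha, beta)   # 1.2 and 1.6
print(cov, var_x)    # ~ 1.2 and ~ 4.0
```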

To illustrate the technique, we will solve Problem 1 from the beginning of this document.

Problem 1. If \(A\) and \(B\) are independent normal random variables with \(A\sim N(3,16)\) and \(B\sim N(5,9)\), what is the conditional probability density function of \(A\) given that \(A+B=10\)?
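Applying the trick here with \(C=A+B\sim N(8,25)\) gives \(\alpha=\frac{\operatorname{Cov}(A,C)}{\operatorname{Var}(C)}=\frac{16}{25}\), so the conditional distribution of \(A\) given \(\{A+B=10\}\) should be \(N\left(3+\frac{16}{25}\cdot 2,\; 16-\frac{16^2}{25}\right)=N\left(\frac{107}{25},\frac{144}{25}\right)\). These numbers are derived here as a hedged check of the method, and the Monte Carlo sketch below approximates the zero-probability event by a thin slab \(\{|A+B-10|<\varepsilon\}\):

```python
import random

# Approximate conditioning on {A + B = 10} by keeping samples with
# A + B within a small window around 10, then compare the conditional
# sample mean and variance with 107/25 = 4.28 and 144/25 = 5.76.
random.seed(2)
kept = []
for _ in range(1_000_000):
    a = random.gauss(3, 4)      # A ~ N(3, 16): standard deviation 4
    b = random.gauss(5, 3)      # B ~ N(5, 9):  standard deviation 3
    if abs(a + b - 10) < 0.05:  # thin slab around the event {A + B = 10}
        kept.append(a)
mean = sum(kept) / len(kept)
var = sum((a - mean) ** 2 for a in kept) / len(kept)
print(mean, var)  # ~ 4.28 and ~ 5.76
```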

Problem 4. Assume that \(X\) and \(Y\) are bivariate normal random variables with mean \(0\) and covariance matrix \(\Sigma\). Assume further that \(\Sigma=\left[\begin{array}{cc} 1& \rho\\ \rho&1\end{array}\right]\). Evaluate \(\mathbb E\left[\left.e^X\right|Y=y_0\right]\).
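The trick applies directly to Problem 4: with this covariance matrix, \(X=\rho Y+\sqrt{1-\rho^2}\,Z\) with \(Z\sim N(0,1)\) independent of \(Y\), so \(\mathbb E\left[\left.e^X\right|Y=y_0\right]=e^{\rho y_0}\,\mathbb E\left[e^{\sqrt{1-\rho^2}Z}\right]=\exp\left(\rho y_0+\frac{1-\rho^2}{2}\right)\). This closed form is derived here as a hedged check; the sketch below compares it against a simulation for one hypothetical choice of \(\rho\) and \(y_0\):

```python
import math
import random

# E[e^X | Y = y0] via the decomposition X = rho*Y + sqrt(1-rho^2)*Z:
# conditioning fixes the rho*Y term, and the remaining factor is the
# mean of a lognormal, E[e^{sZ}] = e^{s^2/2}.
rho, y0 = 0.6, 1.0  # hypothetical values for the check
s = math.sqrt(1 - rho**2)
closed_form = math.exp(rho * y0 + (1 - rho**2) / 2)

random.seed(3)
n = 1_000_000
mc = math.exp(rho * y0) * sum(math.exp(s * random.gauss(0, 1)) for _ in range(n)) / n
print(closed_form, mc)  # both ~ 2.509
```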