# Conditional expectation of continuous random variables

### Conditioning on events of positive probability

Assume that $$X$$ is a uniform random variable on $$[0,1]$$. We want to calculate the expected value of $$X$$ given that $$X$$ is bigger than $$\frac12$$. Intuitively, the answer is $$\frac34$$: if $$X > \frac12$$ and $$X$$ is uniform, then $$X$$ should be uniform on $$\left[\frac12,1\right]$$, whose midpoint is $$\frac34$$.

To make this more precise we should identify one important event in this example. Let $$A$$ be the event that the random variable $$X$$ is bigger than $$\frac12$$. Then we want to calculate $$\mathbb E\left[\left.X\right|A\right]$$. We can now define the conditional cumulative distribution function of $$X$$ given the event $$A$$ in the following way: $F_{X|A}(t)=\mathbb P\left(\left.X\leq t\right|A\right)=\frac{\mathbb P\left(\left\{X\leq t\right\}\cap A\right)}{\mathbb P\left(A\right)}.$ Since $$\mathbb P\left(X=\frac12\right)=0$$, we may take $$A=\left\{X\geq \frac12\right\}$$. Let us first consider the case $$t < \frac12$$. Then $$A\cap \{X\leq t\}=\emptyset$$, hence $$F_{X|A}(t)=0$$ for $$t < \frac12$$. Similarly, if $$t \geq 1$$, then $$\{X\leq t\}\cap A=A$$ and $$F_{X|A}(t)=1$$. Assume now that $$t\in\left[\frac12,1\right]$$. We rewrite the equation for $$F_{X|A}(t)$$ as $F_{X|A}(t)=\frac{\mathbb P\left(\frac12\leq X\leq t\right)}{\mathbb P\left(A\right)} =\frac{t-\frac12}{\frac12}=2\left(t-\frac12\right).$ We have thus formally obtained that, conditioned on the event $$\left\{X\geq \frac12\right\}$$, the random variable $$X$$ has uniform distribution on $$\left[\frac12,1\right]$$. Now it is easy to calculate its expected value and obtain $$\frac34$$.
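As a sanity check, here is a short Monte Carlo sketch (plain Python, no external libraries; the sample size and seed are arbitrary choices) that estimates $$\mathbb E\left[\left.X\right|X>\frac12\right]$$ by sampling uniform variates and averaging those that exceed $$\frac12$$:

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

samples = [random.random() for _ in range(200_000)]  # X ~ Uniform[0,1]
conditioned = [x for x in samples if x > 0.5]        # keep only the event {X > 1/2}
estimate = sum(conditioned) / len(conditioned)       # empirical E[X | X > 1/2]
print(estimate)  # close to 0.75
```

With this many samples the estimate agrees with the exact value $$\frac34$$ to about two decimal places.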

Definition. The conditional probability mass function of the discrete random variable $$X$$ given the event $$D$$ is the function $$f_{X|D}:\mathbb R\to[0,1]$$ defined as $f_{X|D}(k)=\mathbb P\left(\left.X=k\right|D\right).$
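A minimal discrete illustration of this definition (the die and the event are my own choices, not from the text): let $$X$$ be a fair six-sided die and $$D$$ the event that $$X$$ is even; then $$f_{X|D}(k)=\frac13$$ for $$k\in\{2,4,6\}$$ and $$0$$ otherwise.

```python
from fractions import Fraction

support = range(1, 7)                     # faces of a fair die
p = {k: Fraction(1, 6) for k in support}  # pmf of X
D = {k for k in support if k % 2 == 0}    # the event {X is even}
pD = sum(p[k] for k in D)                 # P(D) = 1/2

# conditional pmf: f_{X|D}(k) = P(X = k | D) = P({X = k} ∩ D) / P(D)
f_cond = {k: (p[k] / pD if k in D else Fraction(0)) for k in support}
print(f_cond[2], f_cond[3])  # 1/3 and 0
```

Note that the conditional pmf again sums to $$1$$, as any pmf must.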

### Conditioning on events of the form $$\{X=\alpha\}$$ where $$X$$ is a random variable with continuous distribution

We will now condition on random variables instead of on events. Here is one type of problem that we want to solve:

Problem 1. If $$A$$ and $$B$$ are independent normal random variables with $$A\sim N(3,16)$$ and $$B\sim N(5,36)$$, what is the conditional probability density function of $$A$$ given that $$A+B=10$$?

The probability of the event $$\{A+B=10\}$$ is equal to $$0$$ because $$C=A+B$$ is a random variable with continuous distribution.

However, we can still calculate conditional probabilities given events of the type $$\{C=10\}$$. The above problem will be solved later in this document.

Our goal is to develop a way to calculate conditional distributions where the conditioning is on certain events of probability $$0$$. We will not be able to condition on all events of probability $$0$$; for example, we will never be able to condition on the empty set. However, there are special events of zero probability that arise from random variables. If $$X$$ is a random variable with continuous distribution, then we will look at events of the type $$\{X=\alpha\}$$, where $$\alpha$$ is some constant real number. Such events have probability $$0$$. However, we will approximate them by the events $$\{\alpha\leq X\leq \alpha+\varepsilon\}$$, which have non-zero probability whenever $$\varepsilon > 0$$. Then we will let $$\varepsilon\to 0$$.
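The approximation step can be checked numerically. For instance (my choice of example, with $$X\sim N(0,1)$$ and $$\alpha=1$$), the ratio $$\mathbb P\left(\alpha\leq X\leq\alpha+\varepsilon\right)/\varepsilon$$ approaches the density $$f_X(\alpha)$$ as $$\varepsilon\to 0$$:

```python
import math

def Phi(x):
    """Standard normal CDF, written via the error function."""
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

alpha = 1.0
pdf = math.exp(-alpha ** 2 / 2) / math.sqrt(2 * math.pi)  # f_X(alpha)

for eps in (0.1, 0.01, 0.001):
    # P(alpha <= X <= alpha + eps) / eps, a difference quotient of the CDF
    ratio = (Phi(alpha + eps) - Phi(alpha)) / eps
    print(eps, ratio)  # ratio approaches pdf as eps shrinks
```

This is just the statement that the density is the derivative of the cumulative distribution function.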

Assume that $$Y$$ is another random variable with continuous distribution. We want to find the conditional probability density function of $$Y$$ given the event $$\{X=\alpha\}$$. We will first find the conditional cumulative distribution function.

Assume that $$t$$ is a real number. We define the conditional cumulative distribution function as \begin{eqnarray*} F_{Y|X=\alpha}(t)=\mathbb P\left(Y\leq t|X=\alpha\right)=\lim_{\varepsilon\to0}\frac{\mathbb P\left(Y\leq t, \alpha\leq X\leq\alpha+\varepsilon\right)}{\mathbb P\left(\alpha\leq X\leq \alpha+\varepsilon\right)}. \end{eqnarray*} The denominator of the last fraction is $$F_X(\alpha+\varepsilon)-F_X(\alpha)$$, where $$F_X$$ is the cumulative distribution function of $$X$$. We can use the joint cumulative distribution function $$F_{X,Y}$$ of the random variables $$X$$ and $$Y$$ to express the numerator of the fraction as $\mathbb P\left(Y\leq t, \alpha\leq X\leq\alpha+\varepsilon\right)=\mathbb P\left(Y\leq t, X\leq \alpha+\varepsilon\right)-\mathbb P\left(Y\leq t, X\leq \alpha\right)=F_{X,Y}(\alpha+\varepsilon,t)-F_{X,Y}(\alpha,t).$ The conditional cumulative distribution function of $$Y$$ given $$\{X=\alpha\}$$ now becomes \begin{eqnarray*} F_{Y|X=\alpha}(t)&=&\lim_{\varepsilon\to0}\frac{F_{X,Y}(\alpha+\varepsilon,t)-F_{X,Y}(\alpha,t)}{F_X(\alpha+\varepsilon)-F_X(\alpha)} =\lim_{\varepsilon\to0}\frac{\frac{F_{X,Y}(\alpha+\varepsilon,t)-F_{X,Y}(\alpha,t)}{\varepsilon}}{\frac{F_X(\alpha+\varepsilon)-F_X(\alpha)}{\varepsilon}} = \frac{\frac{\partial}{\partial x}F_{X,Y}(\alpha,t)}{f_X(\alpha)}. \end{eqnarray*} The conditional probability density function of $$Y$$ given $$\{X=\alpha\}$$ is $f_{Y|X=\alpha}(t)=\frac{d}{dt}F_{Y|X=\alpha}(t)=\frac{\frac{\partial^2}{\partial y\partial x}F_{X,Y}(\alpha,t)}{f_X(\alpha)}=\frac{f_{X,Y}(\alpha,t)}{f_X(\alpha)}.$
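As a quick numerical check of the final formula (with a joint density of my own choosing, $$f_{X,Y}(x,y)=x+y$$ on $$[0,1]^2$$, whose marginal is $$f_X(x)=x+\frac12$$), the conditional density $$f_{Y|X=\alpha}(t)=f_{X,Y}(\alpha,t)/f_X(\alpha)$$ should integrate to $$1$$ in $$t$$:

```python
# Hypothetical example: joint density f(x,y) = x + y on the unit square.
def f_joint(x, y):
    return x + y

def f_X(x):
    return x + 0.5  # marginal: integral of x + y over y in [0,1]

alpha = 0.3
n = 100_000
h = 1.0 / n
# midpoint rule for the integral of f_joint(alpha, t) / f_X(alpha), t in [0,1]
total = sum(f_joint(alpha, (k + 0.5) * h) / f_X(alpha) for k in range(n)) * h
print(total)  # ≈ 1.0
```

Any other value of $$\alpha$$ in $$(0,1)$$ gives the same answer, which is exactly what makes $$t\mapsto f_{X,Y}(\alpha,t)/f_X(\alpha)$$ a genuine probability density.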

Remark. Observe that for discrete random variables we would obtain exactly the same final equation, except that in the discrete case $$f_{X,Y}$$ would be the joint probability mass function rather than the joint probability density function, and similarly the marginal density in the denominator would be replaced by the corresponding marginal probability mass function.

Notation. It is common to use the notation $$f_{Y|X}(t|\alpha)$$ instead of $$f_{Y|X=\alpha}(t)$$.

Problem 2. The joint density of $$X$$ and $$Y$$ is given by $f(x,y)=\frac{Cx}{y^2}e^{-y^2-\frac{x^2}{y^2}},\quad x > 0, y > 0.$
• (a) Determine the constant $$C$$.
• (b) Calculate the conditional expectation $$\mathbb E\left[X|Y=y\right]$$.

Problem 3. The joint density of the random variables $$X$$ and $$Y$$ is given by $f(x,y)=Cx^2e^{-xy}\cdot 1_{[0,1)}(x)\cdot 1_{[0,x)}(y),$ where $$C$$ is a constant.
• (a) Determine the constant $$C$$.
• (b) Determine the probability of the event $$Y > \frac{X}2$$.
• (c) Determine the conditional density of $$X$$ given $$Y=y$$.

### Bivariate normal random variables

When dealing with bivariate normal random variables, we can often use a trick to avoid working with conditional probability density functions directly. The trick is the following: if $$X$$ and $$Y$$ have a bivariate normal distribution, then $$X$$ can be expressed as $$X=\alpha Y+\beta Z$$, where $$Z$$ is a normal random variable independent of $$Y$$. Alternatively, $$Y$$ can be expressed as $$Y=\gamma X+ \delta W$$, where $$W$$ is a normal random variable independent of $$X$$. You would need to calculate the constants $$\alpha$$ and $$\beta$$ (or the constants $$\gamma$$ and $$\delta$$ if you choose to express $$Y$$ in terms of $$X$$ and $$W$$). These constants are found by solving a system of equations obtained from the covariance matrix.
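For concreteness, here is a sketch with hypothetical numbers (means $$0$$, $$\mathrm{Var}(X)=4$$, $$\mathrm{Var}(Y)=9$$, $$\mathrm{Cov}(X,Y)=3$$). Taking $$\alpha=\mathrm{Cov}(X,Y)/\mathrm{Var}(Y)$$ makes $$X-\alpha Y$$ uncorrelated with $$Y$$ (hence independent of it, by joint normality), and $$\beta$$ is then fixed by matching $$\mathrm{Var}(X)$$. The simulation checks that $$\alpha Y+\beta Z$$ reproduces the prescribed variance and covariance:

```python
import math
import random

random.seed(1)

# Hypothetical parameters: Var(X) = 4, Var(Y) = 9, Cov(X,Y) = 3, means zero.
var_x, var_y, cov_xy = 4.0, 9.0, 3.0
alpha = cov_xy / var_y                       # makes Cov(X - alpha*Y, Y) = 0
beta = math.sqrt(var_x - alpha**2 * var_y)   # matches Var(X) = alpha^2 Var(Y) + beta^2

n = 200_000
ys = [random.gauss(0, math.sqrt(var_y)) for _ in range(n)]
zs = [random.gauss(0, 1) for _ in range(n)]           # Z independent of Y
xs = [alpha * y + beta * z for y, z in zip(ys, zs)]   # X = alpha*Y + beta*Z

emp_cov = sum(x * y for x, y in zip(xs, ys)) / n  # empirical Cov(X,Y), ≈ 3
emp_var = sum(x * x for x in xs) / n              # empirical Var(X), ≈ 4
print(emp_cov, emp_var)
```

The same two moment equations (for the covariance and the variance) are the "system of equations obtained from the covariance matrix" mentioned above.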

To illustrate the technique, we will solve Problem 1 from the beginning of this document.

Problem 1. If $$A$$ and $$B$$ are independent normal random variables with $$A\sim N(3,16)$$ and $$B\sim N(5,36)$$, what is the conditional probability density function of $$A$$ given that $$A+B=10$$?
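A sketch of how the trick plays out here, using the parameters from the original statement of Problem 1 ($$A\sim N(3,16)$$, $$B\sim N(5,36)$$): write $$C=A+B$$ and $$A=\lambda C+V$$ with $$V$$ normal and independent of $$C$$, where $$\lambda=\mathrm{Cov}(A,C)/\mathrm{Var}(C)$$; independence of $$A$$ and $$B$$ gives $$\mathrm{Cov}(A,C)=\mathrm{Var}(A)$$. Conditioned on $$\{C=10\}$$, the term $$\lambda C$$ is fixed at $$10\lambda$$, so $$A$$ is normal with the mean and variance computed below.

```python
mu_a, mu_b = 3.0, 5.0
var_a, var_b = 16.0, 36.0

var_c = var_a + var_b  # Var(C) for C = A + B with A, B independent
lam = var_a / var_c    # lambda = Cov(A,C) / Var(C), and Cov(A,C) = Var(A)

# A = lam*C + V with V independent of C, so conditioning on {C = 10}
# only fixes the lam*C term; V keeps its own normal distribution.
cond_mean = mu_a + lam * (10 - (mu_a + mu_b))
cond_var = var_a - lam**2 * var_c  # = Var(V) = Var(A)*Var(B)/Var(C)
print(cond_mean, cond_var)
```

The conditional density of $$A$$ given $$\{A+B=10\}$$ is then the normal density with this mean and variance.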

Problem 4. Assume that $$X$$ and $$Y$$ are bivariate normal random variables with mean $$0$$ and covariance matrix $$\Sigma$$. Assume further that $$\Sigma=\left[\begin{array}{cc} 1& \rho\newline \rho&1\end{array}\right]$$. Evaluate $$\mathbb E\left[\left.e^X\right|Y=y_0\right]$$.