Conditional Mean, Variance, and MMSE

In this blog post, we'll explore the connections among the conditional mean, the conditional variance, and conditional probability density functions (PDFs), and how they relate to the minimum mean square error (MMSE) estimator.

Conditional Probability Density Function

Consider two random variables, $X$ and $Y$ (which could be complex). Let $f_{X|Y}(x|y)$ denote the conditional PDF of $X$ given $Y$.

Conditioning on a Realization $Y=y$

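When conditioned on a specific realization $Y=y$ (i.e., $X|Y=y$), the corresponding PDF is $f_{X|Y=y}(x)$, a function of $x$ only. Its mean and variance are deterministic quantities: $$ \begin{align} \mathbb{E}_{X}[X|Y=y] &= \int x\, f_{X|Y=y}(x)\, dx \\ \mathbb{V}_{X}[X|Y=y] &= \mathbb{E}_{X}[\vert X - \mathbb{E}_{X}[X|Y=y] \vert^2 | Y=y] \end{align} $$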

Conditioning on a Random Variable $Y$

When conditioned on the random variable $Y$ (i.e., $X|Y$), the corresponding PDF is $f_{X|Y}(x|y)$, a function of both $x$ and $y$. Its mean and variance are functions of $Y$, denoted $\mathbb{E}[X|Y]$ and $\mathbb{V}[X|Y]$. In contrast to the previous case (a given realization of $Y$), these quantities are random variables. Specifically, each realization of $\mathbb{E}[X|Y]$ takes the form $\mathbb{E}[X|Y=y]$, and each realization of $\mathbb{V}[X|Y]$ takes the form $\mathbb{V}[X|Y=y]$. In other words, the posterior density of $X$ differs from one realization $Y=y$ to another.
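
As a small illustration of this distinction, here is a sketch on a discrete toy example (a PMF instead of a PDF, so that everything can be computed exactly). The joint PMF values and the helper names `marginal_y`, `conditional_mean`, and `conditional_variance` are hypothetical and chosen only for this example.

```python
from fractions import Fraction

# Hypothetical joint PMF p(x, y), keyed by (x, y). Illustration only.
JOINT = {
    (0, 0): Fraction(1, 4),
    (2, 0): Fraction(1, 4),
    (4, 1): Fraction(3, 8),
    (10, 1): Fraction(1, 8),
}


def marginal_y(joint):
    """Return P(Y = y) for every y that appears in the joint PMF."""
    p_y = {}
    for (_, y), p in joint.items():
        p_y[y] = p_y.get(y, 0) + p
    return p_y


def conditional_mean(joint, y):
    """E[X | Y = y]: a plain number once the realization y is fixed."""
    p_y = marginal_y(joint)[y]
    return sum(x * p for (x, yy), p in joint.items() if yy == y) / p_y


def conditional_variance(joint, y):
    """V[X | Y = y] = E[X^2 | Y = y] - (E[X | Y = y])^2."""
    p_y = marginal_y(joint)[y]
    second_moment = sum(x * x * p for (x, yy), p in joint.items() if yy == y) / p_y
    return second_moment - conditional_mean(joint, y) ** 2


if __name__ == "__main__":
    # E[X|Y] is a random variable: it takes the value E[X|Y=y] with probability P(Y=y).
    for y, p in sorted(marginal_y(JOINT).items()):
        print(f"P(Y={y}) = {p}: E[X|Y={y}] = {conditional_mean(JOINT, y)}, "
              f"V[X|Y={y}] = {conditional_variance(JOINT, y)}")
```

In this toy example $\mathbb{E}[X|Y=0] = 1$ and $\mathbb{E}[X|Y=1] = 11/2$, so the random variable $\mathbb{E}[X|Y]$ takes each of these two values with probability $1/2$.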

MMSE Estimator

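The MMSE estimator is the function of $Y$ that minimizes the mean square error $\mathbb{E}[\vert X - g(Y) \vert^2]$, and it is given by the conditional mean: $$ \begin{align} \widehat{X}= g(Y) &= \mathbb{E}[X|Y] \end{align} $$ Here, $g(\cdot)$ denotes a function of the observation. Hence, the MMSE estimator of $X$ is a function of the random variable $Y$, making the estimator itself a random variable.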

For a specific realization $Y=y$, the estimate is: $$ \begin{align} \widehat{x}= g(y) &= \mathbb{E}_{X}[X|Y=y] \end{align} $$ This is the mean of the posterior density $f_{X|Y=y}(x)$.

The MSE of the MMSE estimator for a particular realization $Y=y$ is $$ \begin{align} \mathbb{E}_{X}[\vert X - \widehat{x} \vert^2| Y = y]= \mathbb{V}_{X}[X|Y=y] \end{align} $$ This quantity is the variance of the posterior density $f_{X|Y=y}(x)$.

However, our primary interest lies in the MSE of the estimator across all realizations of $Y$. Averaging over $Y$, the error variance $\mathbb{V}_{\epsilon}$ of the MMSE estimator is $$ \begin{align} \mathbb{V}_{\epsilon}&= \mathbb{E}_{X,Y}[\vert X - \mathbb{E}[X|Y] \vert^2]\\ & = \mathbb{E}_{Y}[\mathbb{E}_{X}[\vert X - \mathbb{E}[X|Y] \vert^2|Y]]\\ &=\mathbb{E}_{Y}[\mathbb{V}[X|Y]] \end{align} $$
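
This chain of equalities can be checked numerically. The sketch below is a Monte Carlo experiment under an assumed toy model that is not part of the post: a real-valued $X \sim \mathcal{N}(0,1)$ observed as $Y = X + N$ with independent $N \sim \mathcal{N}(0, \sigma^2)$. For this model $\mathbb{E}[X|Y] = Y/(1+\sigma^2)$ and $\mathbb{V}[X|Y=y] = \sigma^2/(1+\sigma^2)$ for every $y$. The function name `simulate_mse` and the naive comparison estimator $\widehat{X} = Y$ are illustrative choices.

```python
import random


def simulate_mse(num_samples=200_000, noise_var=0.5, seed=0):
    """Monte Carlo MSE of two estimators of X from Y = X + N.

    Assumed toy model (not from the post): X ~ N(0, 1) and N ~ N(0, noise_var),
    independent and real-valued. Returns (MSE of conditional mean, MSE of X_hat = Y).
    """
    rng = random.Random(seed)
    gain = 1.0 / (1.0 + noise_var)  # E[X | Y] = gain * Y for this model
    se_mmse = 0.0   # accumulated squared error of the conditional mean
    se_naive = 0.0  # accumulated squared error of the naive estimate X_hat = Y
    for _ in range(num_samples):
        x = rng.gauss(0.0, 1.0)
        y = x + rng.gauss(0.0, noise_var ** 0.5)
        se_mmse += (x - gain * y) ** 2
        se_naive += (x - y) ** 2
    return se_mmse / num_samples, se_naive / num_samples


if __name__ == "__main__":
    noise_var = 0.5
    mse_mmse, mse_naive = simulate_mse(noise_var=noise_var)
    print("empirical MSE, conditional mean:", mse_mmse)
    print("theory, E_Y[V[X|Y]]            :", noise_var / (1.0 + noise_var))
    print("empirical MSE, naive X_hat = Y :", mse_naive)
```

With $\sigma^2 = 0.5$, the empirical MSE of the conditional mean should land close to $\mathbb{E}_{Y}[\mathbb{V}[X|Y]] = 1/3$, while the naive estimate should sit near $0.5$.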

An intriguing result emerges when exploring the relationship between the expectation of conditional variance and the variance of conditional expectation: $$ \begin{align} \mathbb{V}[X] = \mathbb{E}[\mathbb{V}[X|Y]] + \mathbb{V}[\mathbb{E}[X|Y]] \end{align} $$


This result is called the law of total variance. It is analogous to the law of total expectation, $\mathbb{E}_Y[\mathbb{E}_X[X|Y]] = \mathbb{E}_X[X]$. For completeness, note that the following also holds: $$ \begin{align} \mathbb{E}_Y[\mathbb{E}_X[g(X)|Y]] = \mathbb{E}_X[g(X)] \end{align} $$ where $g(\cdot)$ is any function for which the expectation exists.
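
One way to see why the law of total variance holds is sketched below, assuming finite second moments. The labels $A$ and $B$ are introduced here only for this derivation. Split the deviation of $X$ from its mean around the conditional mean: $$ \begin{align} X - \mathbb{E}[X] &= \underbrace{(X - \mathbb{E}[X|Y])}_{A} + \underbrace{(\mathbb{E}[X|Y] - \mathbb{E}[X])}_{B} \\ \mathbb{V}[X] &= \mathbb{E}[\vert A \vert^2] + \mathbb{E}[\vert B \vert^2] + 2\,\mathrm{Re}\,\mathbb{E}[A B^{*}] \\ &= \mathbb{E}[\mathbb{V}[X|Y]] + \mathbb{V}[\mathbb{E}[X|Y]] \end{align} $$

The cross term vanishes because $B$ is a function of $Y$ alone and $\mathbb{E}[A|Y] = 0$, so $\mathbb{E}[AB^{*}] = \mathbb{E}_{Y}[\mathbb{E}[A|Y]\,B^{*}] = 0$. The first term is $\mathbb{E}_{Y}[\mathbb{E}[\vert A \vert^2|Y]] = \mathbb{E}[\mathbb{V}[X|Y]]$, and the second is the variance of $\mathbb{E}[X|Y]$, since its mean is $\mathbb{E}[X]$ by the law of total expectation.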

MMSE Estimator for Linear Model

As a supplementary note, consider random vectors (denoted by lowercase bold letters). Assume a Gaussian prior on the signal vector $\mathbf{x} \sim \mathcal{CN}(\boldsymbol{\mu}_{\mathbf{x}}, \mathbf{C}_{\mathbf{x}})$, independent additive Gaussian noise $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \mathbf{C}_{\mathbf{n}})$, and a linear receive model with a known channel $\mathbf{H} \in \mathbb{C}^{N\times K}$: $$ \begin{align} \mathbf{y} = \mathbf{Hx} + \mathbf{n} \end{align} $$

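The posterior density of $\mathbf{x}|\mathbf{y}$ is again Gaussian, $\mathbf{x}|\mathbf{y} \sim \mathcal{CN}(\boldsymbol{\mu}_{\mathbf{x}|\mathbf{y}}, \mathbf{C}_{\mathbf{x}|\mathbf{y}})$, with the standard conditional mean and covariance for the linear Gaussian model: $$ \begin{align} \boldsymbol{\mu}_{\mathbf{x}|\mathbf{y}} &= \boldsymbol{\mu}_{\mathbf{x}} + \mathbf{C}_{\mathbf{x}}\mathbf{H}^{H}(\mathbf{H}\mathbf{C}_{\mathbf{x}}\mathbf{H}^{H} + \mathbf{C}_{\mathbf{n}})^{-1}(\mathbf{y} - \mathbf{H}\boldsymbol{\mu}_{\mathbf{x}}) \\ &= \boldsymbol{\mu}_{\mathbf{x}} + (\mathbf{C}_{\mathbf{x}}^{-1} + \mathbf{H}^{H}\mathbf{C}_{\mathbf{n}}^{-1}\mathbf{H})^{-1}\mathbf{H}^{H}\mathbf{C}_{\mathbf{n}}^{-1}(\mathbf{y} - \mathbf{H}\boldsymbol{\mu}_{\mathbf{x}}) \\ \mathbf{C}_{\mathbf{x}|\mathbf{y}} &= \mathbf{C}_{\mathbf{x}} - \mathbf{C}_{\mathbf{x}}\mathbf{H}^{H}(\mathbf{H}\mathbf{C}_{\mathbf{x}}\mathbf{H}^{H} + \mathbf{C}_{\mathbf{n}})^{-1}\mathbf{H}\mathbf{C}_{\mathbf{x}} \\ &= (\mathbf{C}_{\mathbf{x}}^{-1} + \mathbf{H}^{H}\mathbf{C}_{\mathbf{n}}^{-1}\mathbf{H})^{-1} \end{align} $$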

Note that in both expressions, the alternative forms apply when the covariance matrices of the signal vector and the noise are invertible. Furthermore, we use $\mathbf{C}$ rather than $\mathbb{V}$ to denote a covariance matrix, as it is the more prevalent convention.

Remarkably, the conditional covariance is independent of $\mathbf{y}$. This is an uncommon property: in general, the conditional covariance depends on $\mathbf{y}$. It also implies that the error covariance of the MMSE estimator equals the conditional covariance:

$$ \begin{align} \mathbf{C}_{\epsilon}= \mathbb{E}_{\mathbf{y}}[\mathbf{C}_{\mathbf{x}|\mathbf{y}}] = \mathbf{C}_{\mathbf{x}|\mathbf{y}} \end{align} $$
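
To make this concrete, here is a minimal sketch for the scalar special case $N = K = 1$, where every matrix reduces to a number. It uses the standard Gaussian posterior formulas quoted in the docstrings, checks that the two algebraic forms agree, and compares the posterior variance with a Monte Carlo estimate of the error variance. All function names and parameter values are hypothetical, and only the Python standard library is used.

```python
import random


def complex_gaussian(rng, mean, var):
    """Draw one sample from CN(mean, var): real and imaginary parts each carry var/2."""
    s = (var / 2.0) ** 0.5
    return mean + complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def posterior_mean(y, h, mu_x, c_x, c_n):
    """Posterior mean: mu_x + c_x h^* (h c_x h^* + c_n)^{-1} (y - h mu_x)."""
    return mu_x + c_x * h.conjugate() / (abs(h) ** 2 * c_x + c_n) * (y - h * mu_x)


def posterior_mean_alt(y, h, mu_x, c_x, c_n):
    """Alternative form of the posterior mean (needs c_x > 0 and c_n > 0)."""
    c_post = 1.0 / (1.0 / c_x + abs(h) ** 2 / c_n)
    return mu_x + c_post * h.conjugate() / c_n * (y - h * mu_x)


def posterior_variance(h, c_x, c_n):
    """Posterior variance c_x - c_x h^* (h c_x h^* + c_n)^{-1} h c_x; y does not appear."""
    return c_x - (c_x * abs(h)) ** 2 / (abs(h) ** 2 * c_x + c_n)


def empirical_error_variance(h, mu_x, c_x, c_n, num_samples=100_000, seed=1):
    """Monte Carlo estimate of E[|x - E[x|y]|^2] over random draws of x and n."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        x = complex_gaussian(rng, mu_x, c_x)
        y = h * x + complex_gaussian(rng, 0.0, c_n)
        total += abs(x - posterior_mean(y, h, mu_x, c_x, c_n)) ** 2
    return total / num_samples


if __name__ == "__main__":
    h, mu_x, c_x, c_n = 1 + 2j, 1 - 1j, 2.0, 0.5  # hypothetical parameter values
    y = 0.3 + 0.7j  # an arbitrary observation
    print("mean (first form)  :", posterior_mean(y, h, mu_x, c_x, c_n))
    print("mean (alt. form)   :", posterior_mean_alt(y, h, mu_x, c_x, c_n))
    print("posterior variance :", posterior_variance(h, c_x, c_n))
    print("empirical error var:", empirical_error_variance(h, mu_x, c_x, c_n))
```

Because `posterior_variance` takes no observation as input, averaging it over $\mathbf{y}$ changes nothing, which is why the empirical error variance should match it up to Monte Carlo noise.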

In conclusion, understanding the interplay between the conditional mean, the conditional variance, and conditional PDFs provides valuable insight into the MMSE estimator. The linear model with a Gaussian prior and independent additive Gaussian noise makes the relationship between the posterior density, the conditional mean, the conditional covariance, and the MSE of the MMSE estimator explicit.

Zakir Hussain Shaik
PhD Student in Communications Systems

My research interests include wireless communications, distributed signal processing, and convex optimization