Conditional Mean, Variance, and MMSE

Nov 30, 2023 · Zakir Hussain Shaik · 3 min read

In this blog post, we’ll explore the connections among the conditional mean, the conditional variance, and conditional Probability Density Functions (PDFs), and their relationship with the Minimum Mean Square Error (MMSE) estimator.

Conditional Probability Density Function

Consider two random variables, $X$ and $Y$ (which could be complex). Let $f_{X|Y}(x|y)$ represent the conditional PDF.

Conditioning on a Realization $Y=y$

When conditioned on a specific realization $Y=y$ (i.e., $X|Y=y$), the corresponding PDF becomes $f_{X|Y=y}(x)$, a function of $x$ alone, with its mean and variance represented as:

$$\mathbb{E}[X|Y=y] = \int x\, f_{X|Y=y}(x)\, dx, \qquad \mathbb{V}[X|Y=y] = \int \big|x - \mathbb{E}[X|Y=y]\big|^2\, f_{X|Y=y}(x)\, dx$$

Since $y$ is a fixed realization, both of these are deterministic quantities.

Conditioning on a Random Variable YY

When conditioned on the random variable $Y$ (i.e., $X|Y$), the corresponding PDF becomes $f_{X|Y}(x|y)$, a function of both $x$ and $y$. Its mean and variance are functions of $Y$, denoted as $\mathbb{E}[X|Y]$ and $\mathbb{V}[X|Y]$. In contrast to the previous case (a given realization of $Y$), these terms are random variables. Specifically, each realization of $\mathbb{E}[X|Y]$ takes the form $\mathbb{E}[X|Y=y]$, and similarly, each realization of $\mathbb{V}[X|Y]$ is of the form $\mathbb{V}[X|Y=y]$. In other words, the posterior density of $X$ is different for different realizations $Y=y$.
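To make this distinction concrete, here is a minimal Python sketch of a hypothetical toy model (not from the post): $Y \sim \mathrm{Exp}(1)$ and $X|Y=y \sim \mathcal{N}(y, y)$, so that $\mathbb{E}[X|Y] = Y$ and $\mathbb{V}[X|Y] = Y$ are both random variables.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy model: Y ~ Exp(1) and X | Y=y ~ N(y, y).
# Then E[X|Y] = Y and V[X|Y] = Y, so each realization y of Y
# yields a different posterior mean and variance.
y_samples = rng.exponential(scale=1.0, size=5)
for y in y_samples:
    print(f"Y = {y:.3f}:  E[X|Y=y] = {y:.3f},  V[X|Y=y] = {y:.3f}")
```

Each printed line corresponds to a different posterior density $f_{X|Y=y}(x)$.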

MMSE Estimator

The MMSE estimator is defined as:

$$\widehat{X} = g(Y) = \mathbb{E}[X|Y]$$

Here, $g(\cdot)$ represents a function. Hence, the MMSE estimator of $X$ is a function of the random variable $Y$, making the estimator itself a random variable.

For a specific realization Y=yY=y, the estimate is:

$$\widehat{x} = g(y) = \mathbb{E}_{X}[X|Y=y]$$

This represents the mean of the posterior density $f_{X|Y=y}(x)$.

The MSE of the MMSE estimator for a particular realization $Y=y$ is

$$\mathbb{E}_{X}\big[|X - \widehat{X}|^2 \,\big|\, Y = y\big] = \mathbb{V}_{X}[X|Y=y]$$

This quantity represents the variance of the posterior density $f_{X|Y=y}(x)$.

However, our primary interest lies in the MSE of the estimator across all realizations of $Y$. Accordingly, the error variance $\mathbb{V}_{\epsilon}$ of the MMSE estimator is

$$\mathbb{V}_{\epsilon} = \mathbb{E}_{X,Y}\big[|X - \mathbb{E}[X|Y]|^2\big] = \mathbb{E}_{Y}\Big[\mathbb{E}_{X}\big[|X - \mathbb{E}[X|Y]|^2 \,\big|\, Y\big]\Big] = \mathbb{E}_{Y}\big[\mathbb{V}[X|Y]\big]$$

An intriguing result emerges when exploring the relationship between the expectation of conditional variance and the variance of conditional expectation:

$$\mathbb{V}[X] = \mathbb{E}[\mathbb{V}[X|Y]] + \mathbb{V}[\mathbb{E}[X|Y]]$$

Proof:
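One standard derivation uses $\mathbb{V}[X] = \mathbb{E}[|X|^2] - |\mathbb{E}[X]|^2$ together with the law of total expectation stated below:

$$\begin{aligned}
\mathbb{V}[X] &= \mathbb{E}[|X|^2] - |\mathbb{E}[X]|^2 \\
&= \mathbb{E}_Y\big[\mathbb{E}_X[|X|^2 \mid Y]\big] - \big|\mathbb{E}_Y[\mathbb{E}_X[X|Y]]\big|^2 \\
&= \mathbb{E}_Y\big[\mathbb{V}[X|Y] + |\mathbb{E}[X|Y]|^2\big] - \big|\mathbb{E}_Y[\mathbb{E}[X|Y]]\big|^2 \\
&= \mathbb{E}_Y[\mathbb{V}[X|Y]] + \Big(\mathbb{E}_Y\big[|\mathbb{E}[X|Y]|^2\big] - \big|\mathbb{E}_Y[\mathbb{E}[X|Y]]\big|^2\Big) \\
&= \mathbb{E}[\mathbb{V}[X|Y]] + \mathbb{V}[\mathbb{E}[X|Y]]
\end{aligned}$$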

This result is called the law of total variance. It is analogous to the law of total expectation, $\mathbb{E}_Y[\mathbb{E}_X[X|Y]] = \mathbb{E}_X[X]$. For completeness, note that the following holds:

$$\mathbb{E}_Y[\mathbb{E}_X[g(X)|Y]] = \mathbb{E}_X[g(X)]$$

where $g(\cdot)$ is any function.
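As a numerical sanity check, the following sketch reuses the toy model from the earlier snippet ($Y \sim \mathrm{Exp}(1)$, $X|Y=y \sim \mathcal{N}(y, y)$), for which $\mathbb{E}[\mathbb{V}[X|Y]] = \mathbb{V}[\mathbb{E}[X|Y]] = 1$ and hence $\mathbb{V}[X] = 2$:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1_000_000

# Toy model: Y ~ Exp(1), X | Y=y ~ N(y, y), so E[X|Y] = V[X|Y] = Y.
y = rng.exponential(scale=1.0, size=n)
x = rng.normal(loc=y, scale=np.sqrt(y))

print(np.var(x))               # V[X]                  -> approx 2.0
print(np.mean(y) + np.var(y))  # E[V[X|Y]] + V[E[X|Y]] -> approx 2.0
print(np.mean((x - y) ** 2))   # MSE of estimate E[X|Y] = Y -> approx 1.0
```

The last line also confirms the earlier identity $\mathbb{V}_{\epsilon} = \mathbb{E}_{Y}[\mathbb{V}[X|Y]]$.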

MMSE Estimator for Linear Model

As a supplementary note, when dealing with random vectors (denoted by lowercase bold letters), consider a scenario with a Gaussian prior on the signal vector $\mathbf{x} \sim \mathcal{CN}(\boldsymbol{\mu}_{\mathbf{x}}, \mathbf{C}_{\mathbf{x}})$, independent additive Gaussian noise $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \mathbf{C}_{\mathbf{n}})$, and a linear receive model incorporating a known channel $\mathbf{H} \in \mathbb{C}^{N \times K}$:

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n}$$

The posterior density of $\mathbf{x}|\mathbf{y}$ is again Gaussian and can be expressed as:

$$\mathbf{x}|\mathbf{y} \sim \mathcal{CN}\big(\boldsymbol{\mu}_{\mathbf{x}|\mathbf{y}}, \mathbf{C}_{\mathbf{x}|\mathbf{y}}\big)$$

Here,

$$\boldsymbol{\mu}_{\mathbf{x}|\mathbf{y}} = \boldsymbol{\mu}_{\mathbf{x}} + \mathbf{C}_{\mathbf{x}}\mathbf{H}^{H}\big(\mathbf{H}\mathbf{C}_{\mathbf{x}}\mathbf{H}^{H} + \mathbf{C}_{\mathbf{n}}\big)^{-1}\big(\mathbf{y} - \mathbf{H}\boldsymbol{\mu}_{\mathbf{x}}\big) = \boldsymbol{\mu}_{\mathbf{x}} + \big(\mathbf{C}_{\mathbf{x}}^{-1} + \mathbf{H}^{H}\mathbf{C}_{\mathbf{n}}^{-1}\mathbf{H}\big)^{-1}\mathbf{H}^{H}\mathbf{C}_{\mathbf{n}}^{-1}\big(\mathbf{y} - \mathbf{H}\boldsymbol{\mu}_{\mathbf{x}}\big)$$

and

$$\mathbf{C}_{\mathbf{x}|\mathbf{y}} = \mathbf{C}_{\mathbf{x}} - \mathbf{C}_{\mathbf{x}}\mathbf{H}^{H}\big(\mathbf{H}\mathbf{C}_{\mathbf{x}}\mathbf{H}^{H} + \mathbf{C}_{\mathbf{n}}\big)^{-1}\mathbf{H}\mathbf{C}_{\mathbf{x}} = \big(\mathbf{C}_{\mathbf{x}}^{-1} + \mathbf{H}^{H}\mathbf{C}_{\mathbf{n}}^{-1}\mathbf{H}\big)^{-1}$$

Note that in both expressions, the alternative (second) forms are applicable only when the covariance matrices of the information signal vector and the noise are invertible. Furthermore, we opted for the notation $\mathbf{C}$ to denote the covariance matrix of a random vector, rather than $\mathbb{V}$, as it is the more prevalent convention.

Remarkably, the conditional covariance is independent of $\mathbf{y}$, which is an uncommon property, as in general it depends on $\mathbf{y}$. This also implies that the error covariance (MSE) of the MMSE estimator coincides with the conditional covariance:

$$\mathbf{C}_{\epsilon} = \mathbb{E}_{\mathbf{y}}\big[\mathbf{C}_{\mathbf{x}|\mathbf{y}}\big] = \mathbf{C}_{\mathbf{x}|\mathbf{y}}$$
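For completeness, here is a minimal real-valued sketch of these expressions (the complex case is identical with conjugate transposes; the dimensions and covariances below are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 8, 4                          # arbitrary measurement/signal dimensions

# Real-valued analogue of the linear model: x ~ N(mu_x, C_x), n ~ N(0, C_n),
# y = H x + n.
H = rng.standard_normal((N, K))
mu_x = rng.standard_normal(K)
C_x = np.eye(K)
C_n = 0.1 * np.eye(N)

x = rng.multivariate_normal(mu_x, C_x)
n = rng.multivariate_normal(np.zeros(N), C_n)
y = H @ x + n

# Posterior mean (the MMSE estimate) and covariance, first form:
G = C_x @ H.T @ np.linalg.inv(H @ C_x @ H.T + C_n)   # MMSE gain
x_hat = mu_x + G @ (y - H @ mu_x)
C_post = C_x - G @ H @ C_x

# Alternative form, valid here since C_x and C_n are invertible:
C_post_alt = np.linalg.inv(np.linalg.inv(C_x) + H.T @ np.linalg.inv(C_n) @ H)
assert np.allclose(C_post, C_post_alt)
```

Averaging $\|\mathbf{x} - \widehat{\mathbf{x}}\|^2$ over many trials would recover $\mathrm{tr}(\mathbf{C}_{\mathbf{x}|\mathbf{y}})$, since the error covariance does not depend on $\mathbf{y}$.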

In conclusion, understanding the interplay between the conditional mean, conditional variance, and conditional PDFs provides valuable insight into the MMSE estimator. Exploring these concepts, particularly for the linear model with a Gaussian prior and independent additive Gaussian noise, sheds light on the relationship between the posterior density, the conditional mean and covariance, and the MSE of the MMSE estimator.