Let <math>(X,Y)</math> be a pair of random variables taking values in the space <math>\mathcal{X}\times\mathcal{Y}</math>. If their joint distribution is <math>P_{(X,Y)}</math> and their marginal distributions are <math>P_X</math> and <math>P_Y</math>, the mutual information is defined as