Intuitively, mutual information measures the information that <math>X</math> and <math>Y</math> share: it measures how much knowing one of these variables reduces uncertainty about the other. For example, if <math>X</math> and <math>Y</math> are independent, then knowing <math>X</math> gives no information about <math>Y</math> and vice versa, so their mutual information is zero. At the other extreme, if <math>X</math> is a deterministic function of <math>Y</math> and <math>Y</math> is a deterministic function of <math>X</math>, then all information conveyed by <math>X</math> is shared with <math>Y</math>: knowing <math>X</math> determines the value of <math>Y</math> and vice versa. In this case the mutual information is the same as the uncertainty contained in <math>Y</math> (or <math>X</math>) alone, namely the entropy of <math>Y</math>, which here equals the entropy of <math>X</math>. (A very special case of this is when <math>X</math> and <math>Y</math> are the same random variable.)
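The two extremes can be checked numerically. The following is a minimal illustrative sketch (not part of the article's formal development); the helper names <code>entropy</code> and <code>mutual_information</code> are hypothetical, and it uses the identity that the mutual information equals <math>H(X) + H(Y) - H(X,Y)</math>, computing entropies in bits from a joint probability table.

<syntaxhighlight lang="python">
import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a probability vector, ignoring zero entries."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def mutual_information(joint):
    """Mutual information I(X;Y) in bits from a joint probability table p(x, y)."""
    px = joint.sum(axis=1)   # marginal distribution of X
    py = joint.sum(axis=0)   # marginal distribution of Y
    # I(X;Y) = H(X) + H(Y) - H(X,Y)
    return entropy(px) + entropy(py) - entropy(joint.flatten())

# Independent case: the joint factors as p(x)p(y), so the mutual information is 0.
px = np.array([0.5, 0.5])
py = np.array([0.25, 0.75])
print(mutual_information(np.outer(px, py)))          # ~0.0 bits

# Deterministic case: Y = X for a fair coin, so I(X;Y) = H(X) = H(Y) = 1 bit.
deterministic_joint = np.array([[0.5, 0.0],
                                [0.0, 0.5]])
print(mutual_information(deterministic_joint))       # ~1.0 bit
</syntaxhighlight>

In the independent example the joint entropy is the sum of the marginal entropies, so the difference is zero; in the deterministic example knowing <math>X</math> fixes <math>Y</math>, so the shared information is the full entropy of either variable (one bit for a fair coin).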