第245行:
mutual information is the [[Kullback–Leibler divergence]] of the product of the [[marginal distribution]]s, <math>p_X \cdot p_Y</math>, from the [[joint distribution]] <math>p_{(X,Y)}</math>, that is,

− mutual information is the Kullback–Leibler divergence of the product of the marginal distributions, <math>p_X \cdot p_Y</math>, from the joint distribution <math>p_{(X,Y)}</math>, that is,
+ mutual information is the Kullback–Leibler divergence of the product of the marginal distributions, <math>p_X \cdot p_Y</math>, from the joint distribution <math>p_{(X,Y)}</math>, that is,

− 互信息是边际分布乘积的 Kullback-Leibler 散度,也就是联合分布<math>p_{(X,Y)}</math>的乘积,即:
+ 互信息是联合分布<math>p_{(X,Y)}</math>相对于边缘分布乘积<math>p_X \cdot p_Y</math>的相对熵(Kullback–Leibler 散度),即:
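下面是一个简要的计算示意(补充示例,其中的联合分布取值为假设数据),演示按上式把互信息算作联合分布<math>p_{(X,Y)}</math>相对于边缘分布乘积<math>p_X \cdot p_Y</math>的相对熵(以比特为单位):

<syntaxhighlight lang="python">
import numpy as np

# 假设的 2x2 离散联合分布 p(x, y):行对应 X 的取值,列对应 Y 的取值
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])

# 边缘分布:p_X 对 y 求和,p_Y 对 x 求和
p_x = p_xy.sum(axis=1)
p_y = p_xy.sum(axis=0)

# 边缘分布的乘积 p_X * p_Y(与联合分布同形状)
p_x_p_y = np.outer(p_x, p_y)

# 互信息 I(X;Y) = D_KL( p_{(X,Y)} || p_X p_Y ),约定 0·log(0/q) = 0
mask = p_xy > 0
mi = np.sum(p_xy[mask] * np.log2(p_xy[mask] / p_x_p_y[mask]))
print(mi)  # 本例约为 0.278 比特
</syntaxhighlight>
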
第257行:
Furthermore, let <math>p_{X|Y=y}(x) = p_{(X,Y)}(x,y) / p_Y(y)</math> be the conditional mass or density function. Then, we have the identity

− 进一步地,设<math>p_{X|Y=y}(x) = p_{(X,Y)}(x,y) / p_Y(y)</math>是条件质量或密度函数。那么,我们就有了''身份(这里翻译存疑)'':
+ 进一步地,设<math>p_{X|Y=y}(x) = p_{(X,Y)}(x,y) / p_Y(y)</math>为条件概率质量函数或密度函数。那么,我们有如下''恒等式'':
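结合下文(第287行处)英文原文的表述,这里所说的恒等式大致可写成如下示意形式:

:<math>\operatorname{I}(X;Y) = \mathbb{E}_Y\left[D_\text{KL}\!\left(p_{X|Y} \parallel p_X\right)\right]</math>
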
第277行:
Similarly this identity can be established for jointly continuous random variables.

− 同样,这个恒等式也可以建立在联合连续的随机变量上。
+ 类似地,对于联合连续的随机变量,也可以建立这个恒等式。
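作为联合连续情形的一个补充示例(非原文内容):若<math>(X,Y)</math>服从相关系数为<math>\rho</math>的二元正态分布,则互信息有熟知的闭式表达(以纳特为单位):

:<math>\operatorname{I}(X;Y) = -\frac{1}{2}\ln\left(1-\rho^2\right)</math>
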
第287行:
Note that here the Kullback–Leibler divergence involves integration over the values of the random variable <math>X</math> only, and the expression <math>D_\text{KL}(p_{X|Y} \parallel p_X)</math> still denotes a random variable because <math>Y</math> is random. Thus mutual information can also be understood as the expectation of the Kullback–Leibler divergence of the univariate distribution <math>p_X</math> of <math>X</math> from the conditional distribution <math>p_{X|Y}</math> of <math>X</math> given <math>Y</math>: the more different the distributions <math>p_{X|Y}</math> and <math>p_X</math> are on average, the greater the information gain.

− 因此,互信息也可以理解为X的单变量分布<math>p_X</math>与给定<math>Y</math>的<math>X</math>的条件分布<math>p_{X|Y}</math>的Kullback–Leibler散度的期望:平均分布<math>p_{X|Y}</math>和<math>p_X</math>的分布差异越大,信息增益越大。
+ 注意,这里的相对熵只涉及对随机变量<math>X</math>的取值进行积分;由于<math>Y</math>是随机的,表达式<math>D_\text{KL}(p_{X|Y} \parallel p_X)</math>本身仍然是一个随机变量。因此,互信息也可以理解为:给定<math>Y</math>时<math>X</math>的条件分布<math>p_{X|Y}</math>相对于<math>X</math>的单变量分布<math>p_X</math>的相对熵的期望。分布<math>p_{X|Y}</math>与<math>p_X</math>的平均差异越大,信息增益就越大。
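沿用前面离散示例中的记号(p_xy 等均为假设的示例数据),也可以按这一期望形式来计算互信息:对每个<math>y</math>先算<math>D_\text{KL}(p_{X|Y=y} \parallel p_X)</math>,再按<math>p_Y</math>取期望,所得数值与前一种算法一致:

<syntaxhighlight lang="python">
import numpy as np

# 与前例相同的假设联合分布(本例中所有概率均为正,无需单独处理 0·log0)
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])
p_x = p_xy.sum(axis=1)
p_y = p_xy.sum(axis=0)

# I(X;Y) = E_Y[ D_KL( p_{X|Y} || p_X ) ]
mi = 0.0
for j, py in enumerate(p_y):
    p_x_given_y = p_xy[:, j] / py                    # 条件分布 p_{X|Y=y}
    kl = np.sum(p_x_given_y * np.log2(p_x_given_y / p_x))
    mi += py * kl
print(mi)  # 与 D_KL(p_{(X,Y)} || p_X p_Y) 的结果一致,约 0.278 比特
</syntaxhighlight>
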
=== 互信息的贝叶斯估计 Bayesian estimation of mutual information ===