For jointly discrete or jointly continuous pairs <math>(X,Y)</math>,
对于联合离散或联合连续变量对<math>(X,Y)</math>,

mutual information is the [[Kullback–Leibler divergence]] of the product of the [[marginal distribution]]s, <math>p_X \cdot p_Y</math>, from the [[joint distribution]] <math>p_{(X,Y)}</math>, that is,
互信息是联合分布<math>p_{(X,Y)}</math>相对于边际分布乘积<math>p_X \cdot p_Y</math>的 Kullback–Leibler 散度,即:

{{Equation box 1
|indent =
|title=
|equation = <math>\operatorname{I}(X; Y) = D_\text{KL}\left(p_{(X,Y)} \parallel p_X p_Y\right)</math>
|cellpadding= 1
|border
|border colour = #0073CF
|background colour=#F5FFFA}}

[[文件:MI pic6.png|居中|缩略图]]

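The boxed definition lends itself to a direct numerical check. The short Python sketch below is an editorial illustration rather than part of the original article: the function name and the 2×2 joint table are hypothetical choices, and the logarithm is natural, so the result is in nats.

<syntaxhighlight lang="python">
import numpy as np

def mutual_information(p_xy: np.ndarray) -> float:
    """I(X;Y) = D_KL( p_(X,Y) || p_X p_Y ) for a discrete joint distribution.

    p_xy[i, j] = P(X = i, Y = j); entries are non-negative and sum to 1.
    Cells with p_xy = 0 contribute nothing, by the 0 * log 0 = 0 convention.
    """
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal distribution of X (column vector)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal distribution of Y (row vector)
    mask = p_xy > 0
    return float(np.sum(p_xy[mask] * np.log(p_xy[mask] / (p_x @ p_y)[mask])))

# Hypothetical 2x2 joint table, chosen only for illustration.
p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])
print(mutual_information(p_xy))   # about 0.1927 nats; 0 would mean X and Y are independent
</syntaxhighlight>
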
Furthermore, let <math>p_{X|Y=y}(x) = p_{(X,Y)}(x,y) / p_Y(y)</math> be the conditional mass or density function. Then, we have the identity
进一步地,设<math>p_{X|Y=y}(x) = p_{(X,Y)}(x,y) / p_Y(y)</math>为条件概率质量函数或条件密度函数。于是我们有如下恒等式:

{{Equation box 1
|indent =
|title=
|equation = <math>\operatorname{I}(X;Y) = \mathbb{E}_Y\left[D_\text{KL}\!\left(p_{X|Y} \parallel p_X\right)\right]</math>
|cellpadding= 1
|border
|border colour = #0073CF
|background colour=#F5FFFA}}

[[文件:MI pic7.png|居中|缩略图]]

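The identity above can be verified numerically in the same hedged spirit (again an added illustration, not article content): average, over the values of <math>y</math>, the Kullback–Leibler divergence between the conditional distribution <math>p_{X|Y=y}</math> and the marginal <math>p_X</math>, weighting each term by <math>p_Y(y)</math>. With the same hypothetical joint table as in the previous sketch, this reproduces the value obtained from the direct definition.

<syntaxhighlight lang="python">
import numpy as np

def mi_via_conditional_kl(p_xy: np.ndarray) -> float:
    """I(X;Y) = E_Y[ D_KL( p_{X|Y} || p_X ) ] for a discrete joint distribution."""
    p_x = p_xy.sum(axis=1)                  # marginal distribution of X
    p_y = p_xy.sum(axis=0)                  # marginal distribution of Y
    total = 0.0
    for j, py in enumerate(p_y):
        if py == 0:
            continue                        # values of Y with zero probability contribute nothing
        p_x_given_y = p_xy[:, j] / py       # conditional distribution p_{X|Y=y_j}
        mask = p_x_given_y > 0
        kl = np.sum(p_x_given_y[mask] * np.log(p_x_given_y[mask] / p_x[mask]))
        total += py * kl                    # weight each D_KL term by p_Y(y_j)
    return float(total)

p_xy = np.array([[0.4, 0.1],
                 [0.1, 0.4]])
print(mi_via_conditional_kl(p_xy))          # about 0.1927 nats, matching the direct computation
</syntaxhighlight>
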
联合离散随机变量的证明如下:

:<math>
\begin{align}
\operatorname{I}(X;Y) &= \sum_{y \in \mathcal{Y}} p_Y(y) \sum_{x \in \mathcal{X}} p_{X|Y=y}(x) \log \frac{p_{X|Y=y}(x)}{p_X(x)} \\
&= \sum_{y \in \mathcal{Y}} p_Y(y) \; D_\text{KL}\!\left(p_{X|Y=y} \parallel p_X\right) \\
&= \mathbb{E}_Y \left[D_\text{KL}\!\left(p_{X|Y} \parallel p_X\right)\right].
\end{align}
</math>

[[文件:MI pic8.png|居中|缩略图]]

Similarly this identity can be established for jointly continuous random variables.

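For the jointly continuous case, a rough numerical check is also possible. The sketch below is illustrative only: the correlation value, grid bounds and step are arbitrary choices, and the standard bivariate normal pair is used simply because its mutual information has the well-known closed form <math>-\tfrac{1}{2}\ln(1-\rho^2)</math>, which the grid integration should approximately reproduce.

<syntaxhighlight lang="python">
import numpy as np

rho = 0.6                                    # hypothetical correlation coefficient
xs = np.linspace(-6.0, 6.0, 601)
dx = xs[1] - xs[0]
X, Y = np.meshgrid(xs, xs, indexing="ij")

# Standard bivariate normal joint density with correlation rho; both marginals are N(0, 1).
p_xy = np.exp(-(X**2 - 2*rho*X*Y + Y**2) / (2*(1 - rho**2))) / (2*np.pi*np.sqrt(1 - rho**2))
p_x = np.exp(-xs**2 / 2) / np.sqrt(2*np.pi)

# Integrate p(x,y) * log( p(x,y) / (p(x) p(y)) ) over the grid.
integrand = p_xy * np.log(p_xy / (p_x[:, None] * p_x[None, :]))
print(integrand.sum() * dx * dx)             # numerical estimate of I(X;Y)
print(-0.5 * np.log(1 - rho**2))             # closed form for the bivariate normal, ~0.2231 nats
</syntaxhighlight>
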
Note that here the Kullback–Leibler divergence involves integration over the values of the random variable <math>X</math> only, and the expression <math>D_\text{KL}(p_{X|Y} \parallel p_X)</math> still denotes a random variable because <math>Y</math> is random. Thus mutual information can also be understood as the expectation of the Kullback–Leibler divergence of the univariate distribution <math>p_X</math> of <math>X</math> from the conditional distribution <math>p_{X|Y}</math> of <math>X</math> given <math>Y</math>: the more different the distributions <math>p_{X|Y}</math> and <math>p_X</math> are on average, the greater the information gain.
注意,这里的 Kullback–Leibler 散度只对随机变量<math>X</math>的取值进行积分,而由于<math>Y</math>是随机的,表达式<math>D_\text{KL}(p_{X|Y} \parallel p_X)</math>本身仍是一个随机变量。因此,互信息也可以理解为<math>X</math>的单变量分布<math>p_X</math>与给定<math>Y</math>时<math>X</math>的条件分布<math>p_{X|Y}</math>之间的 Kullback–Leibler 散度的期望:分布<math>p_{X|Y}</math>与<math>p_X</math>平均而言差异越大,信息增益就越大。

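To make the last point concrete with the same hypothetical table as in the earlier sketches (an added illustration, not article text): for each fixed <math>y</math> the quantity <math>D_\text{KL}(p_{X|Y=y} \parallel p_X)</math> is an ordinary number, and the collection of these numbers, indexed by the random <math>Y</math>, is itself a random variable whose expectation under <math>p_Y</math> is the mutual information.

<syntaxhighlight lang="python">
import numpy as np

p_xy = np.array([[0.4, 0.1],                 # hypothetical joint table, as above
                 [0.1, 0.4]])
p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)

# One D_KL value per realisation y of Y: indexed by the random Y, this array defines a random variable.
kl_per_y = np.array([np.sum((p_xy[:, j] / p_y[j]) * np.log((p_xy[:, j] / p_y[j]) / p_x))
                     for j in range(p_xy.shape[1])])
print(kl_per_y)                              # D_KL(p_{X|Y=y} || p_X) for each y
print(kl_per_y @ p_y)                        # its expectation under p_Y equals I(X;Y)
</syntaxhighlight>
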
=== 互信息的贝叶斯估计 Bayesian estimation of mutual information ===