The Kullback-Leibler divergence formulation of the mutual information is predicated on the assumption that one is interested in comparing <math>p(x,y)</math> to the fully factorized [[outer product]] <math>p(x) \cdot p(y)</math>. In many problems, such as [[non-negative matrix factorization]], one is interested in less extreme factorizations; specifically, one wishes to compare <math>p(x,y)</math> to a low-rank matrix approximation in some unknown variable <math>w</math>; that is, to what degree one might have

:<math>p(x,y)\approx \sum_w p^\prime (x,w) p^{\prime\prime}(w,y)</math>
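As a minimal numerical sketch (not part of the original article), the approximation above is just a matrix product of two nonnegative factor matrices. The matrices <code>p1</code> and <code>p2</code> below are hypothetical stand-ins for <math>p^\prime</math> and <math>p^{\prime\prime}</math>, normalized so that the reconstruction is itself a joint distribution.

<syntaxhighlight lang="python">
import numpy as np

# Hypothetical factors, chosen only for illustration:
# p1[x, w] plays the role of p'(x, w) (a joint distribution over x and w);
# p2[w, y] plays the role of p''(w, y) (each row here is a conditional
# p(y | w), so the reconstruction below sums to 1).
p1 = np.array([[0.35, 0.15],
               [0.05, 0.45]])
p2 = np.array([[0.60, 0.40],
               [0.20, 0.80]])

# sum_w p'(x, w) p''(w, y) is exactly the matrix product p1 @ p2.
approx = p1 @ p2
print(approx)          # rank-|W| approximation of p(x, y)
print(approx.sum())    # 1.0, so it is a valid joint distribution
</syntaxhighlight>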
Alternately, one might be interested in knowing how much more information <math>p(x,y)</math> carries over its factorization. In such a case, the excess information that the full distribution <math>p(x,y)</math> carries over the matrix factorization is given by the Kullback-Leibler divergence

:<math>\operatorname{I}_{LRMA} = \sum_{y \in \mathcal{Y}} \sum_{x \in \mathcal{X}} p(x,y) \log \left( \frac{p(x,y)}{\sum_w p^\prime (x,w) p^{\prime\prime}(w,y)} \right)</math>
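Continuing the sketch above (again with hypothetical inputs), <math>\operatorname{I}_{LRMA}</math> is straightforward to evaluate numerically; terms with <math>p(x,y)=0</math> are skipped, following the usual convention that <math>0 \log 0 = 0</math>.

<syntaxhighlight lang="python">
import numpy as np

def i_lrma(p, p1, p2):
    """KL divergence (in nats) of the joint p(x, y) from the
    low-rank factorization sum_w p'(x, w) p''(w, y)."""
    approx = p1 @ p2
    mask = p > 0                     # 0 log 0 terms contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / approx[mask])))

# Hypothetical joint distribution, with the factors from the previous sketch.
p = np.array([[0.25, 0.25],
              [0.10, 0.40]])
p1 = np.array([[0.35, 0.15],
               [0.05, 0.45]])
p2 = np.array([[0.60, 0.40],
               [0.20, 0.80]])
print(i_lrma(p, p1, p2))   # >= 0 here; 0 iff p equals its factorization
</syntaxhighlight>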
The conventional definition of the mutual information is recovered in the extreme case that the process <math>W</math> has only one value for <math>w</math>.

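As a quick check of this degenerate case (using the same hypothetical joint distribution as above), taking <math>p^\prime(x,w_0)=p(x)</math> and <math>p^{\prime\prime}(w_0,y)=p(y)</math> turns the factorization into the outer product of the marginals, so <math>\operatorname{I}_{LRMA}</math> reduces to the ordinary mutual information <math>\operatorname{I}(X;Y)</math>:

<syntaxhighlight lang="python">
import numpy as np

p = np.array([[0.25, 0.25],          # hypothetical joint p(x, y), as above
              [0.10, 0.40]])
px = p.sum(axis=1, keepdims=True)    # marginal p(x), shape (|X|, 1)
py = p.sum(axis=0, keepdims=True)    # marginal p(y), shape (1, |Y|)

# With a single value w0, sum_w p'(x, w) p''(w, y) = p(x) p(y): the fully
# factorized outer product, so I_LRMA is the usual mutual information.
approx = px @ py
mask = p > 0
print(np.sum(p[mask] * np.log(p[mask] / approx[mask])))   # I(X;Y) in nats
</syntaxhighlight>
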
== Variations ==