{{Information theory}}

[[File:VennInfo3Var.svg|thumb|right|Venn diagram of information-theoretic measures for three variables <math>x</math>, <math>y</math>, and <math>z</math>, represented by the lower-left, lower-right, and upper circles, respectively. The conditional mutual informations <math>I(x;z|y)</math>, <math>I(y;z|x)</math> and <math>I(x;y|z)</math> are represented by the yellow, cyan, and magenta regions, respectively ('''note: the colour labels in this figure are incorrect and need to be corrected''').]]
In [[probability theory]], particularly [[information theory]], the '''conditional mutual information'''<ref name = Wyner1978>{{cite journal|last=Wyner|first=A. D. |title=A definition of conditional mutual information for arbitrary ensembles|url=|journal=Information and Control|year=1978|volume=38|issue=1|pages=51–59|doi=10.1016/s0019-9958(78)90026-8|doi-access=free}}</ref><ref name = Dobrushin1959>{{cite journal|last=Dobrushin|first=R. L. |title=General formulation of Shannon's main theorem in information theory|journal=Uspekhi Mat. Nauk|year=1959|volume=14|pages=3–104}}</ref> is, in its most basic form, the [[expected value]] of the [[mutual information]] of two random variables given the value of a third.
== Definition ==
For random variables <math>X</math>, <math>Y</math>, and <math>Z</math> with [[Support (mathematics)|support sets]] <math>\mathcal{X}</math>, <math>\mathcal{Y}</math> and <math>\mathcal{Z}</math>, we define the conditional mutual information as
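:<math>I(X;Y|Z) = \int_{\mathcal{Z}} D_{\mathrm{KL}}\bigl(P_{(X,Y)|Z} \,\|\, P_{X|Z} \otimes P_{Y|Z}\bigr) \, dP_{Z}</math>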
Thus <math>I(X;Y|Z)</math> is the expected (with respect to <math>Z</math>) [[Kullback–Leibler divergence]] from the conditional joint distribution <math>P_{(X,Y)|Z}</math> to the product of the conditional marginals <math>P_{X|Z}</math> and <math>P_{Y|Z}</math>. Compare with the definition of [[mutual information]].
== In terms of pmf's for discrete distributions ==
For discrete random variables <math>X</math>, <math>Y</math>, and <math>Z</math> with [[Support (mathematics)|support sets]] <math>\mathcal{X}</math>, <math>\mathcal{Y}</math> and <math>\mathcal{Z}</math>, the conditional mutual information <math>I(X;Y|Z)</math> is as follows
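:<math>I(X;Y|Z) = \sum_{z\in \mathcal{Z}} p_Z(z) \sum_{y\in \mathcal{Y}} \sum_{x\in \mathcal{X}} p_{X,Y|Z}(x,y|z) \log \frac{p_{X,Y|Z}(x,y|z)}{p_{X|Z}(x|z)\,p_{Y|Z}(y|z)}</math>

For concreteness, this sum can be evaluated directly from a joint pmf stored as a 3-D array, using the algebraically equivalent joint form <math>p_Z\,p_{X,Y,Z}/(p_{X,Z}\,p_{Y,Z})</math> inside the logarithm. The following is a minimal sketch, not from the source; the array layout and function name are our own choices:

<syntaxhighlight lang="python">
import numpy as np

def conditional_mutual_information(p):
    """I(X;Y|Z) in bits, for a joint pmf given as an array p[x, y, z]."""
    p_z = p.sum(axis=(0, 1))   # p(z)
    p_xz = p.sum(axis=1)       # p(x, z)
    p_yz = p.sum(axis=0)       # p(y, z)
    cmi = 0.0
    for (x, y, z), pxyz in np.ndenumerate(p):
        if pxyz > 0:  # skip zero-probability terms (0 log 0 := 0)
            cmi += pxyz * np.log2(p_z[z] * pxyz / (p_xz[x, z] * p_yz[y, z]))
    return cmi

# Markov chain X -> Z -> Y (here Z and Y are exact copies of a fair bit X):
# X and Y are conditionally independent given Z, so I(X;Y|Z) = 0,
# even though I(X;Y) = 1 bit.
p = np.zeros((2, 2, 2))
p[0, 0, 0] = p[1, 1, 1] = 0.5
print(conditional_mutual_information(p))  # 0.0
</syntaxhighlight>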
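== In terms of pdf's for continuous distributions ==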
For (absolutely) continuous random variables <math>X</math>, <math>Y</math>, and <math>Z</math> with [[Support (mathematics)|support sets]] <math>\mathcal{X}</math>, <math>\mathcal{Y}</math> and <math>\mathcal{Z}</math>, the conditional mutual information <math>I(X;Y|Z)</math> is as follows
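:<math>I(X;Y|Z) = \int_{\mathcal{Z}} \left( \int_{\mathcal{Y}} \int_{\mathcal{X}} \log \left(\frac{p_{X,Y|Z}(x,y|z)}{p_{X|Z}(x|z)\,p_{Y|Z}(y|z)}\right) p_{X,Y|Z}(x,y|z) \, dx \, dy \right) p_Z(z) \, dz</math>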
This can be rewritten to show its relationship to mutual information
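:<math>I(X;Y|Z) = I(X;Y,Z) - I(X;Z)</math>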
usually rearranged as '''the chain rule for mutual information'''
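:<math>I(X;Y,Z) = I(X;Z) + I(X;Y|Z)</math>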
Like mutual information, conditional mutual information can be expressed as a [[Kullback–Leibler divergence]]:
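:<math>I(X;Y|Z) = D_{\mathrm{KL}}\bigl[\, p(X,Y,Z) \,\big\|\, p(X|Z)\,p(Y|Z)\,p(Z) \,\bigr]</math>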
A more general definition of conditional mutual information, applicable to random variables with continuous or other arbitrary distributions, will depend on the concept of '''[[regular conditional probability]]'''. (See also<ref>[http://planetmath.org/encyclopedia/ConditionalProbabilityMeasure.html Regular Conditional Probability] on [http://planetmath.org/ PlanetMath]</ref><ref>D. Leao, Jr. et al. ''Regular conditional probability, disintegration of probability and Radon spaces.'' Proyecciones. Vol. 23, No. 1, pp. 15–29, May 2004, Universidad Católica del Norte, Antofagasta, Chile [http://www.scielo.cl/pdf/proy/v23n1/art02.pdf PDF]</ref>.)

Finally we can define the conditional mutual information via [[Lebesgue integration]]:
In an expression such as <math>I(A;B|C),</math> <math>A,</math> <math>B,</math> and <math>C</math> need not necessarily be restricted to representing individual random variables, but could also represent the joint distribution of any collection of random variables defined on the same [[probability space]]. As is common in [[probability theory]], we may use the comma to denote such a joint distribution, e.g. <math>I(A_0,A_1;B_1,B_2,B_3|C_0,C_1).</math> Hence the use of the semicolon (or occasionally a colon or even a wedge <math>\wedge</math>) to separate the principal arguments of the mutual information symbol. (No such distinction is necessary in the symbol for [[joint entropy]], since the joint entropy of any number of random variables is the same as the entropy of their joint distribution.)
== Properties ==
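=== Nonnegativity ===
It is always true that <math>I(X;Y|Z) \ge 0</math> for discrete, jointly distributed random variables <math>X</math>, <math>Y</math> and <math>Z</math>.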
This result has been used as a basic building block for proving other inequalities in information theory, in particular those known as Shannon-type inequalities. Conditional mutual information is also non-negative for continuous random variables under certain regularity conditions.
=== Interaction information ===
For example, let <math>X</math> and <math>Y</math> be independent fair binary random variables and let <math>Z = X \oplus Y</math> be their exclusive or. Then <math>X</math>, <math>Y</math> and <math>Z</math> are pairwise independent; in particular <math>I(X;Y)=0</math>, but <math>I(X;Y|Z)=1.</math>
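A quick numerical check of this example (a sketch under the XOR construction above; the names and array layout are our own, mirroring the helper in the discrete section):

<syntaxhighlight lang="python">
import numpy as np

# Joint pmf p[x, y, z] for independent fair bits X, Y and Z = X XOR Y:
# each consistent outcome (x, y, x^y) has probability 1/4.
p = np.zeros((2, 2, 2))
for x in range(2):
    for y in range(2):
        p[x, y, x ^ y] = 0.25

# I(X;Y): marginalize out Z and compare p(x,y) with p(x)p(y).
p_xy = p.sum(axis=2)
p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
mi = sum(p_xy[x, y] * np.log2(p_xy[x, y] / (p_x[x] * p_y[y]))
         for x in range(2) for y in range(2) if p_xy[x, y] > 0)

# I(X;Y|Z): same computation as in the discrete pmf formula above.
p_z, p_xz, p_yz = p.sum(axis=(0, 1)), p.sum(axis=1), p.sum(axis=0)
cmi = sum(p[x, y, z] * np.log2(p_z[z] * p[x, y, z] / (p_xz[x, z] * p_yz[y, z]))
          for x in range(2) for y in range(2) for z in range(2)
          if p[x, y, z] > 0)

print(mi, cmi)  # 0.0 1.0 -- conditioning on Z increases the mutual information
</syntaxhighlight>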
=== Chain rule for mutual information ===
:<math>I(X;Y,Z) = I(X;Z) + I(X;Y|Z)</math>