In [[probability theory]], particularly [[information theory]], the '''conditional mutual information'''<ref name = Wyner1978>{{cite journal|last=Wyner|first=A. D. |title=A definition of conditional mutual information for arbitrary ensembles|journal=Information and Control|year=1978|volume=38|issue=1|pages=51–59|doi=10.1016/s0019-9958(78)90026-8|doi-access=free}}</ref><ref name = Dobrushin1959>{{cite journal|last=Dobrushin|first=R. L. |title=General formulation of Shannon's main theorem in information theory|journal=Uspekhi Mat. Nauk|year=1959|volume=14|pages=3–104}}</ref> is, in its most basic form, the [[expected value]] of the [[mutual information]] of two random variables given the value of a third.
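A compact way to restate the sentence above (standard identities supplied here for orientation; they are not quoted from the elided parts of the page):

:<math>I(X;Y|Z) \;=\; \mathbb{E}_{z \sim P_Z}\!\big[\, I(X;Y \mid Z=z) \,\big] \;=\; H(X|Z) - H(X|Y,Z),</math>

where <math>H(\cdot|\cdot)</math> denotes [[conditional entropy]].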
For random variables <math>X</math>, <math>Y</math>, and <math>Z</math> with [[Support (mathematics)|support sets]] <math>\mathcal{X}</math>, <math>\mathcal{Y}</math> and <math>\mathcal{Z}</math>, we define the conditional mutual information as
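The defining display that follows this sentence in the full article lies outside this excerpt. One standard way to write the definition (a reconstruction under that assumption, not a quotation) is as an expected [[Kullback–Leibler divergence]] between the conditional joint distribution and the product of the conditional marginals:

:<math>I(X;Y|Z) \;=\; \int_{\mathcal Z} D_{\mathrm{KL}}\!\big( P_{(X,Y)|Z=z} \,\big\|\, P_{X|Z=z} \otimes P_{Y|Z=z} \big)\, dP_{Z}(z).</math>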
where the marginal, joint, and/or conditional [[probability mass function]]s are denoted by <math>p</math> with the appropriate subscript. This can be simplified as
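The simplified sum itself is not part of this excerpt, but for discrete variables it amounts to summing <math>p(x,y,z)\log\frac{p(z)\,p(x,y,z)}{p(x,z)\,p(y,z)}</math> over the joint support. The following is a minimal sketch of that computation, assuming the joint pmf is supplied as a three-dimensional NumPy array indexed as <code>p[x, y, z]</code> (the function name and array layout are illustrative, not taken from the article):

<syntaxhighlight lang="python">
import numpy as np

def conditional_mutual_information(p_xyz):
    """I(X;Y|Z) in bits, computed from a joint pmf array indexed as p[x, y, z]."""
    p_xyz = np.asarray(p_xyz, dtype=float)
    p_xz = p_xyz.sum(axis=1)       # marginal p(x, z)
    p_yz = p_xyz.sum(axis=0)       # marginal p(y, z)
    p_z = p_xyz.sum(axis=(0, 1))   # marginal p(z)
    cmi = 0.0
    for x, y, z in np.ndindex(p_xyz.shape):
        p = p_xyz[x, y, z]
        if p > 0:
            cmi += p * np.log2(p_z[z] * p / (p_xz[x, z] * p_yz[y, z]))
    return cmi
</syntaxhighlight>

Base-2 logarithms report the result in bits; natural logarithms would give nats.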
where the marginal, joint, and/or conditional [[probability density function]]s are denoted by <math>p</math> with the appropriate subscript. This can be simplified as
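As in the discrete case, the simplified integral is outside this excerpt; the standard continuous analogue (a reconstruction, assuming the indicated joint and marginal densities exist) is:

:<math>I(X;Y|Z) \;=\; \int_{\mathcal Z}\int_{\mathcal Y}\int_{\mathcal X} p_{X,Y,Z}(x,y,z)\,\log \frac{p_{Z}(z)\, p_{X,Y,Z}(x,y,z)}{p_{X,Z}(x,z)\, p_{Y,Z}(y,z)}\; dx\, dy\, dz.</math>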
A more general definition of conditional mutual information, applicable to random variables with continuous or other arbitrary distributions, will depend on the concept of '''[[regular conditional probability]]'''. (See also<ref>[http://planetmath.org/encyclopedia/ConditionalProbabilityMeasure.html Regular Conditional Probability] on [http://planetmath.org/ PlanetMath]</ref><ref>D. Leao, Jr. et al. ''Regular conditional probability, disintegration of probability and Radon spaces.'' Proyecciones. Vol. 23, No. 1, pp. 15–29, May 2004, Universidad Católica del Norte, Antofagasta, Chile [http://www.scielo.cl/pdf/proy/v23n1/art02.pdf PDF]</ref>.)
Let <math>(\Omega, \mathcal F, \mathfrak P)</math> be a [[probability space]], and let the random variables <math>X</math>, <math>Y</math>, and <math>Z</math> each be defined as a Borel-measurable function from <math>\Omega</math> to some state space endowed with a topological structure.
Consider the Borel measure (on the σ-algebra generated by the open sets) in the state space of each random variable defined by assigning each Borel set the <math>\mathfrak P</math>-measure of its preimage in <math>\mathcal F</math>.  This is called the [[pushforward measure]] <math>X _* \mathfrak P = \mathfrak P\big(X^{-1}(\cdot)\big).</math>  The '''support of a random variable''' is defined to be the [[Support (measure theory)|topological support]] of this measure, i.e. <math>\mathrm{supp}\,X = \mathrm{supp}\,X _* \mathfrak P.</math>
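As a concrete illustration (an added example, not drawn from the article): if <math>X</math> is a fair coin flip taking values in <math>\{0,1\}\subset\mathbb R</math>, then the pushforward measure is <math>X_*\mathfrak P = \tfrac12\delta_0 + \tfrac12\delta_1</math>, and <math>\mathrm{supp}\,X = \{0,1\}</math>, the smallest closed set of full measure.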
Now we can formally define the [[conditional probability distribution|conditional probability measure]] given the value of one (or, via the [[product topology]], more) of the random variables.  Let <math>M</math> be a measurable subset of <math>\Omega,</math> (i.e. <math>M \in \mathcal F,</math>) and let <math>x \in \mathrm{supp}\,X.</math>  Then, using the [[disintegration theorem]]:
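The displayed formula that the colon above introduces is not included in this excerpt; given the description in the next sentence, it takes the familiar limiting form (a reconstruction under that reading):

:<math>\mathfrak P(M \mid X=x) \;=\; \lim_{U \ni x} \frac{\mathfrak P\big(M \cap X^{-1}(U)\big)}{\mathfrak P\big(X^{-1}(U)\big)},</math>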
where the limit is taken over the open neighborhoods <math>U</math> of <math>x</math>, as they are allowed to become arbitrarily smaller with respect to [[Subset|set inclusion]].
Finally we can define the conditional mutual information via [[Lebesgue integration]]:
where the integrand is the logarithm of a [[Radon–Nikodym derivative]] involving some of the conditional probability measures we have just defined.
In an expression such as <math>I(A;B|C),</math> <math>A,</math> <math>B,</math> and <math>C</math> need not necessarily be restricted to representing individual random variables, but could also represent the joint distribution of any collection of random variables defined on the same [[probability space]].  As is common in [[probability theory]], we may use the comma to denote such a joint distribution, e.g. <math>I(A_0,A_1;B_1,B_2,B_3|C_0,C_1).</math>  Hence the use of the semicolon (or occasionally a colon or even a wedge <math>\wedge</math>) to separate the principal arguments of the mutual information symbol.  (No such distinction is necessary in the symbol for [[joint entropy]], since the joint entropy of any number of random variables is the same as the entropy of their joint distribution.)
Conditioning on a third random variable may either increase or decrease the [[mutual information]]: for example, the difference <math>I(X;Y) - I(X;Y|Z)</math>, called the [[interaction information]] (not to be confused with mutual information), may be positive, negative, or zero. This is the case even when the random variables are pairwise independent: in the construction sketched below, <math>X</math>, <math>Y</math> and <math>Z</math> are pairwise independent and in particular <math>I(X;Y)=0</math>, but <math>I(X;Y|Z)=1.</math>
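The specific construction that the text above points to is not part of this excerpt. The canonical choice consistent with the stated values is to take <math>X</math> and <math>Z</math> as independent fair coin flips and set <math>Y = X \oplus Z</math> (exclusive or); this is an assumed example, not quoted from the article. A self-contained sketch verifying <math>I(X;Y)=0</math> and <math>I(X;Y|Z)=1</math> bit:

<syntaxhighlight lang="python">
import numpy as np

# Assumed example: X, Z ~ Bernoulli(1/2) independent, Y = X XOR Z.
# Joint pmf stored as p[x, y, z].
p = np.zeros((2, 2, 2))
for x in range(2):
    for z in range(2):
        p[x, x ^ z, z] = 0.25

p_xy, p_xz, p_yz = p.sum(axis=2), p.sum(axis=1), p.sum(axis=0)
p_x, p_y, p_z = p_xy.sum(axis=1), p_xy.sum(axis=0), p_xz.sum(axis=0)

# I(X;Y) is zero: p(x, y) factorises (all four cells equal 1/4).
mi = sum(p_xy[x, y] * np.log2(p_xy[x, y] / (p_x[x] * p_y[y]))
         for x in range(2) for y in range(2) if p_xy[x, y] > 0)

# I(X;Y|Z) is one bit: given Z, the pair (X, Y) is perfectly coupled.
cmi = sum(p[x, y, z] * np.log2(p_z[z] * p[x, y, z] / (p_xz[x, z] * p_yz[y, z]))
          for x, y, z in np.ndindex(p.shape) if p[x, y, z] > 0)

print(round(mi, 6), round(cmi, 6))   # 0.0 1.0
</syntaxhighlight>

Knowing <math>Z</math> makes <math>Y</math> a deterministic function of <math>X</math>, so conditioning raises the mutual information from zero to one bit even though every pair among the three variables is independent.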