第1行: |
第1行: |
− | 此词条由Jie翻译。
| + | 此词条由Jie翻译。由Lincent审校。 |
| | | |
| {{Short description|Measure of relative information in probability theory}} | | {{Short description|Measure of relative information in probability theory}} |
第8行: |
第8行: |
| In [[information theory]], the '''conditional entropy''' quantifies the amount of information needed to describe the outcome of a [[random variable]] <math>Y</math> given that the value of another random variable <math>X</math> is known. Here, information is measured in [[Shannon (unit)|shannon]]s, [[Nat (unit)|nat]]s, or [[Hartley (unit)|hartley]]s. The ''entropy of <math>Y</math> conditioned on <math>X</math>'' is written as H(Y ǀ X). | | In [[information theory]], the '''conditional entropy''' quantifies the amount of information needed to describe the outcome of a [[random variable]] <math>Y</math> given that the value of another random variable <math>X</math> is known. Here, information is measured in [[Shannon (unit)|shannon]]s, [[Nat (unit)|nat]]s, or [[Hartley (unit)|hartley]]s. The ''entropy of <math>Y</math> conditioned on <math>X</math>'' is written as H(Y ǀ X). |
| | | |
− | 在'''<font color="#ff8000"> 信息论Information theory</font>'''中,假设随机变量<math>X</math>的值已知,那么'''<font color="#ff8000"> 条件熵Conditional entropy</font>'''则用于去量化描述随机变量<math>Y</math>结果所需的信息量。此时,信息以'''<font color="#ff8000"> 香农Shannon </font>''','''<font color="#ff8000"> 奈特nat</font>'''或'''<font color="#ff8000"> 哈特莱hartley</font>'''来衡量。以<math>X</math>为条件的<math>Y</math>熵写为<math>H(X ǀ Y)</math>。 | + | 在'''<font color="#ff8000"> 信息论Information theory</font>'''中,'''<font color="#ff8000"> 条件熵Conditional entropy</font>'''定量描述在已知另一随机变量<math>X</math>取值的条件下,描述随机变量<math>Y</math>的结果所需的信息量。此时,信息以'''<font color="#ff8000"> 香农Shannon </font>''','''<font color="#ff8000"> 奈特nat</font>'''或'''<font color="#ff8000"> 哈特莱hartley</font>'''来衡量。已知<math>X</math>的条件下<math>Y</math>的熵记为<math>H(Y ǀ X)</math>。 |
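As a side note on units (an illustrative sketch of my own, not part of the article): shannons, nats and hartleys measure the same quantity with logarithm bases 2, e and 10, so converting between them is a single rescaling.

<syntaxhighlight lang="python">
import math

def convert_information(value, from_base, to_base):
    """Rescale an information quantity between logarithm bases.

    Shannons use base 2, nats use base e, hartleys use base 10;
    the same entropy differs only by the factor log(from)/log(to).
    """
    return value * math.log(from_base) / math.log(to_base)

# 1 shannon (bit) expressed in nats and in hartleys
print(convert_information(1.0, 2, math.e))  # ~0.6931 nats
print(convert_information(1.0, 2, 10))      # ~0.3010 hartleys
</syntaxhighlight>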
| | | |
| | | |
第32行: |
第32行: |
| where <math>\mathcal X</math> and <math>\mathcal Y</math> denote the [[Support (mathematics)|support sets]] of <math>X</math> and <math>Y</math>. | | where <math>\mathcal X</math> and <math>\mathcal Y</math> denote the [[Support (mathematics)|support sets]] of <math>X</math> and <math>Y</math>. |
| | | |
− | 其中<math>\mathcal X</math>和<math>\mathcal Y</math>表示<math>X</math>和<math>Y</math>的支撑集。 | + | 其中<math>\mathcal X</math>和<math>\mathcal Y</math>表示<math>X</math>和<math>Y</math>的<font color="#32cd32">支撑集</font>。 |
| | | |
| | | |
第38行: |
第38行: |
| ''Note:'' It is conventional that the expressions <math>0 \log 0</math> and <math>0 \log c/0</math> for fixed <math>c > 0</math> should be treated as being equal to zero. This is because <math>\lim_{\theta\to0^+} \theta\, \log \,c/\theta = 0</math> and <math>\lim_{\theta\to0^+} \theta\, \log \theta = 0</math><ref>{{Cite web|url=http://www.inference.org.uk/mackay/itprnn/book.html|title=David MacKay: Information Theory, Pattern Recognition and Neural Networks: The Book|website=www.inference.org.uk|access-date=2019-10-25}}</ref> <!-- because p(x,y) could still equal 0 even if p(x) != 0 and p(y) != 0. What about p(x,y)=p(x)=0? --> | | ''Note:'' It is conventional that the expressions <math>0 \log 0</math> and <math>0 \log c/0</math> for fixed <math>c > 0</math> should be treated as being equal to zero. This is because <math>\lim_{\theta\to0^+} \theta\, \log \,c/\theta = 0</math> and <math>\lim_{\theta\to0^+} \theta\, \log \theta = 0</math><ref>{{Cite web|url=http://www.inference.org.uk/mackay/itprnn/book.html|title=David MacKay: Information Theory, Pattern Recognition and Neural Networks: The Book|website=www.inference.org.uk|access-date=2019-10-25}}</ref> <!-- because p(x,y) could still equal 0 even if p(x) != 0 and p(y) != 0. What about p(x,y)=p(x)=0? --> |
| | | |
− | 注意:在约定<math>c > 0</math>始终成立时,表达式<math>0 \log 0</math>和<math>0 \log c/0</math>视为等于零。这是因为<math>\lim_{\theta\to0^+} \theta\, \log \,c/\theta = 0</math>,而且<math>\lim_{\theta\to0^+} \theta\, \log \theta = 0</math>><ref>{{Cite web|url=http://www.inference.org.uk/mackay/itprnn/book.html|title=David MacKay: Information Theory, Pattern Recognition and Neural Networks: The Book|website=www.inference.org.uk|access-date=2019-10-25}}</ref> <!-- because p(x,y) could still equal 0 even if p(x) != 0 and p(y) != 0. What about p(x,y)=p(x)=0? -->
| + | 注意:按照约定,对于固定的<math>c > 0</math>,表达式<math>0 \log 0</math>和<math>0 \log c/0</math>均视为零。这是因为<math>\lim_{\theta\to0^+} \theta\, \log \,c/\theta = 0</math>,而且<math>\lim_{\theta\to0^+} \theta\, \log \theta = 0</math><ref>{{Cite web|url=http://www.inference.org.uk/mackay/itprnn/book.html|title=David MacKay: Information Theory, Pattern Recognition and Neural Networks: The Book|website=www.inference.org.uk|access-date=2019-10-25}}</ref> <!-- because p(x,y) could still equal 0 even if p(x) != 0 and p(y) != 0. What about p(x,y)=p(x)=0? --> |
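A quick numerical check of this convention (an illustrative sketch of my own): both limits can be verified directly, and entropy code usually encodes the convention in a small helper.

<syntaxhighlight lang="python">
import math

def plogp(p):
    """p * log2(p), with the convention 0 * log 0 = 0."""
    return 0.0 if p == 0.0 else p * math.log2(p)

# theta * log2(c / theta) -> 0 as theta -> 0+ (here c = 3, an arbitrary constant)
c = 3.0
for theta in (1e-1, 1e-3, 1e-6, 1e-9):
    print(theta, theta * math.log2(c / theta), plogp(theta))
</syntaxhighlight>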
| | | |
| | | |
第44行: |
第44行: |
| Intuitive explanation of the definition : According to the definition, <math>\displaystyle H( Y|X) =\mathbb{E}( \ f( X,Y) \ )</math> where <math>\displaystyle f:( x,y) \ \rightarrow -\log( \ p( y|x) \ ) .</math> <math>\displaystyle f</math> associates to <math>\displaystyle ( x,y)</math> the information content of <math>\displaystyle ( Y=y)</math> given <math>\displaystyle (X=x)</math>, which is the amount of information needed to describe the event <math>\displaystyle (Y=y)</math> given <math>(X=x)</math>. According to the law of large numbers, <math>\displaystyle H(Y|X)</math> is the arithmetic mean of a large number of independent realizations of <math>\displaystyle f(X,Y)</math>. | | Intuitive explanation of the definition : According to the definition, <math>\displaystyle H( Y|X) =\mathbb{E}( \ f( X,Y) \ )</math> where <math>\displaystyle f:( x,y) \ \rightarrow -\log( \ p( y|x) \ ) .</math> <math>\displaystyle f</math> associates to <math>\displaystyle ( x,y)</math> the information content of <math>\displaystyle ( Y=y)</math> given <math>\displaystyle (X=x)</math>, which is the amount of information needed to describe the event <math>\displaystyle (Y=y)</math> given <math>(X=x)</math>. According to the law of large numbers, <math>\displaystyle H(Y|X)</math> is the arithmetic mean of a large number of independent realizations of <math>\displaystyle f(X,Y)</math>. |
| | | |
− | 对该定义的直观解释是:根据定义<math>\displaystyle H( Y|X) =\mathbb{E}( \ f( X,Y) \ )</math>,其中<math>\displaystyle f:( x,y) \ \rightarrow -\log( \ p( y|x) \ ) </math>. <math>\displaystyle f</math>将给定<math>\displaystyle (X=x)</math>的<math>\displaystyle ( Y=y)</math>的信息内容与<math>\displaystyle ( x,y)</math>相关联,这是描述在给定<math>(X=x)</math>条件下的事件<math>\displaystyle (Y=y)</math>所需的信息量。根据大数定律,<math>H(Y ǀ X)</math>是<math>\displaystyle f(X,Y)</math>的大量独立实现的算术平均值。 | + | 对该定义的直观解释是:根据定义<math>\displaystyle H( Y|X) =\mathbb{E}( \ f( X,Y) \ )</math>,其中<math>\displaystyle f:( x,y) \ \rightarrow -\log( \ p( y|x) \ ) </math>. <math>\displaystyle f</math>将给定<math>\displaystyle (X=x)</math>时的<math>\displaystyle ( Y=y)</math>的信息内容与<math>\displaystyle ( x,y)</math>相关联,这是描述在给定<math>(X=x)</math>条件下的事件<math>\displaystyle (Y=y)</math>所需的信息量。根据大数定律,<math>H(Y ǀ X)</math>是大量<math>\displaystyle f(X,Y)</math>独立实验结果的算术平均值。 |
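To make the law-of-large-numbers reading concrete, here is a small simulation sketch with an invented joint pmf (the distribution and names are my own): the empirical mean of <math>f(X,Y) = -\log_2 p(Y|X)</math> over many independent samples approaches <math>H(Y|X)</math> computed directly from the definition.

<syntaxhighlight lang="python">
import math
import random

# An invented joint pmf p(x, y) over {0,1} x {0,1}
p_xy = {(0, 0): 0.30, (0, 1): 0.20, (1, 0): 0.10, (1, 1): 0.40}
p_x = {x: sum(v for (xx, _), v in p_xy.items() if xx == x) for x in (0, 1)}

def f(x, y):
    """Information content -log2 p(y | x) of the event Y=y given X=x."""
    return -math.log2(p_xy[(x, y)] / p_x[x])

# Direct computation: H(Y|X) = sum_{x,y} p(x,y) * (-log2 p(y|x))
h_direct = sum(p * f(x, y) for (x, y), p in p_xy.items())

# Monte Carlo: arithmetic mean of f over many independent realizations
random.seed(0)
pairs, weights = zip(*p_xy.items())
samples = random.choices(pairs, weights=weights, k=200_000)
h_mc = sum(f(x, y) for x, y in samples) / len(samples)

print(h_direct, h_mc)  # the two values agree to roughly two decimal places
</syntaxhighlight>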
| | | |
| | | |
第51行: |
第51行: |
| Let <math>H(Y ǀ X = x)</math> be the [[Shannon Entropy|entropy]] of the discrete random variable <math>Y</math> conditioned on the discrete random variable <math>X</math> taking a certain value <math>x</math>. Denote the support sets of <math>X</math> and <math>Y</math> by <math>\mathcal X</math> and <math>\mathcal Y</math>. Let <math>Y</math> have [[probability mass function]] <math>p_Y{(y)}</math>. The unconditional entropy of <math>Y</math> is calculated as <math>H(Y):=E[I(Y)]</math>, i.e. | | Let <math>H(Y ǀ X = x)</math> be the [[Shannon Entropy|entropy]] of the discrete random variable <math>Y</math> conditioned on the discrete random variable <math>X</math> taking a certain value <math>x</math>. Denote the support sets of <math>X</math> and <math>Y</math> by <math>\mathcal X</math> and <math>\mathcal Y</math>. Let <math>Y</math> have [[probability mass function]] <math>p_Y{(y)}</math>. The unconditional entropy of <math>Y</math> is calculated as <math>H(Y):=E[I(Y)]</math>, i.e. |
| | | |
− | 设<math>H(Y ǀ X = x)</math>为离散随机变量<math>Y</math>的熵,条件是离散随机变量<math>X</math>取一定值<math>x</math>。用<math>\mathcal X</math>和<math>\mathcal Y</math>表示<math>X</math>和<math>Y</math>的支撑集。令<math>Y</math>具有概率质量函数<math>p_Y{(y)}</math>。<math>Y</math>的无条件熵计算为<math>H(Y):=E[I(Y)</math>。 | + | 设<math>H(Y ǀ X = x)</math>为离散随机变量<math>Y</math>在离散随机变量<math>X</math>取定值<math>x</math>时的熵。用<math>\mathcal X</math>和<math>\mathcal Y</math>表示<math>X</math>和<math>Y</math>的支撑集。令<math>Y</math>的概率质量函数为<math>p_Y{(y)}</math>。<math>Y</math>的无条件熵计算为<math>H(Y):=E[I(Y)]</math>,即: |
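A minimal sketch of the unconditional entropy <math>H(Y)=E[I(Y)]=-\sum_y p_Y(y)\log_2 p_Y(y)</math>, using an invented pmf of my own:

<syntaxhighlight lang="python">
import math

def entropy(pmf):
    """Shannon entropy in bits: -sum_y p(y) * log2 p(y), skipping zero terms."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)

p_y = {"a": 0.5, "b": 0.25, "c": 0.25}  # invented example pmf, not from the article
print(entropy(p_y))  # 1.5 bits
</syntaxhighlight>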
| | | |
| | | |
第60行: |
第60行: |
| where <math>\operatorname{I}(y_i)</math> is the [[information content]] of the [[Outcome (probability)|outcome]] of <math>Y</math> taking the value <math>y_i</math>. The entropy of <math>Y</math> conditioned on <math>X</math> taking the value <math>x</math> is defined analogously by [[conditional expectation]]: | | where <math>\operatorname{I}(y_i)</math> is the [[information content]] of the [[Outcome (probability)|outcome]] of <math>Y</math> taking the value <math>y_i</math>. The entropy of <math>Y</math> conditioned on <math>X</math> taking the value <math>x</math> is defined analogously by [[conditional expectation]]: |
| | | |
− | 这里当取值为<math>y_i</math>时,<math>\operatorname{I}(y_i)</math>是其结果<math>Y</math>的信息内容。类似地以<math>X</math>为条件的<math>Y</math>的熵,当值为<math>x</math>时,也可以通过条件期望来定义:
| + | 其中<math>\operatorname{I}(y_i)</math>是<math>Y</math>取值为<math>y_i</math>这一结果的信息量。在<math>X</math>取值为<math>x</math>的条件下,<math>Y</math>的熵也可以通过条件期望类似地定义: |
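The quantity <math>H(Y|X=x)</math> is simply the entropy of the conditional distribution <math>p(y|x)</math> for one fixed <math>x</math>. A sketch with an invented joint pmf:

<syntaxhighlight lang="python">
import math

# Invented joint pmf p(x, y) for illustration only
p_xy = {(0, 0): 0.30, (0, 1): 0.20, (1, 0): 0.10, (1, 1): 0.40}

def cond_entropy_given_x(p_xy, x):
    """H(Y | X = x): entropy in bits of the conditional pmf p(y | x)."""
    p_x = sum(v for (xx, _), v in p_xy.items() if xx == x)
    cond = [v / p_x for (xx, _), v in p_xy.items() if xx == x]
    return -sum(q * math.log2(q) for q in cond if q > 0)

print(cond_entropy_given_x(p_xy, 0))  # entropy of (0.6, 0.4), ~0.971 bits
print(cond_entropy_given_x(p_xy, 1))  # entropy of (0.2, 0.8), ~0.722 bits
</syntaxhighlight>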
| | | |
| | | |
第69行: |
第69行: |
| Note that<math> H(Y ǀ X)</math> is the result of averaging <math>H(Y ǀ X = x)</math> over all possible values <math>x</math> that <math>X</math> may take. Also, if the above sum is taken over a sample <math>y_1, \dots, y_n</math>, the expected value <math>E_X[ H(y_1, \dots, y_n \mid X = x)]</math> is known in some domains as '''equivocation'''.<ref>{{cite journal|author1=Hellman, M.|author2=Raviv, J.|year=1970|title=Probability of error, equivocation, and the Chernoff bound|journal=IEEE Transactions on Information Theory|volume=16|issue=4|pp=368-372}}</ref> | | Note that<math> H(Y ǀ X)</math> is the result of averaging <math>H(Y ǀ X = x)</math> over all possible values <math>x</math> that <math>X</math> may take. Also, if the above sum is taken over a sample <math>y_1, \dots, y_n</math>, the expected value <math>E_X[ H(y_1, \dots, y_n \mid X = x)]</math> is known in some domains as '''equivocation'''.<ref>{{cite journal|author1=Hellman, M.|author2=Raviv, J.|year=1970|title=Probability of error, equivocation, and the Chernoff bound|journal=IEEE Transactions on Information Theory|volume=16|issue=4|pp=368-372}}</ref> |
| | | |
− | 注意,<math> H(Y ǀ X)</math>是在<math>X</math>可能取的所有可能值<math>x</math>上对<math>H(Y ǀ X = x)</math>求平均值的结果。同样,如果将上述总和接管到样本<math>y_1, \dots, y_n</math>上,则预期值<math>E_X[ H(y_1, \dots, y_n \mid X = x)]</math>在某些领域中会变得模糊。<ref>{{cite journal|author1=Hellman, M.|author2=Raviv, J.|year=1970|title=Probability of error, equivocation, and the Chernoff bound|journal=IEEE Transactions on Information Theory|volume=16|issue=4|pp=368-372}}</ref> | + | 注意,<math> H(Y ǀ X)</math>是对<math>X</math>所有可能取值<math>x</math>的<math>H(Y ǀ X = x)</math>求平均的结果。同样,如果对样本<math>y_1, \dots, y_n</math>求上述和,则期望值<math>E_X[ H(y_1, \dots, y_n \mid X = x)]</math>在某些领域中被称为'''疑义度equivocation'''。<ref>{{cite journal|author1=Hellman, M.|author2=Raviv, J.|year=1970|title=Probability of error, equivocation, and the Chernoff bound|journal=IEEE Transactions on Information Theory|volume=16|issue=4|pp=368-372}}</ref> |
| | | |
| | | |
第75行: |
第75行: |
| Given [[Discrete random variable|discrete random variables]] <math>X</math> with image <math>\mathcal X</math> and <math>Y</math> with image <math>\mathcal Y</math>, the conditional entropy of <math>Y</math> given <math>X</math> is defined as the weighted sum of <math>H(Y|X=x)</math> for each possible value of <math>x</math>, using <math>p(x)</math> as the weights:<ref name=cover1991>{{cite book|isbn=0-471-06259-6|year=1991|authorlink1=Thomas M. Cover|author1=T. Cover|author2=J. Thomas|title=Elements of Information Theory|url=https://archive.org/details/elementsofinform0000cove|url-access=registration}}</ref>{{rp|15}} | | Given [[Discrete random variable|discrete random variables]] <math>X</math> with image <math>\mathcal X</math> and <math>Y</math> with image <math>\mathcal Y</math>, the conditional entropy of <math>Y</math> given <math>X</math> is defined as the weighted sum of <math>H(Y|X=x)</math> for each possible value of <math>x</math>, using <math>p(x)</math> as the weights:<ref name=cover1991>{{cite book|isbn=0-471-06259-6|year=1991|authorlink1=Thomas M. Cover|author1=T. Cover|author2=J. Thomas|title=Elements of Information Theory|url=https://archive.org/details/elementsofinform0000cove|url-access=registration}}</ref>{{rp|15}} |
| | | |
− | 给定具有像<math>\mathcal X</math>的离散随机变量<math>X</math>和具有像<math>\mathcal Y</math>的离散随机变量<math>Y</math>,将给定<math>X</math>的<math>Y</math>的条件熵定义为<math>H(Y|X=x)</math>的权重之和,以<math>x</math>的每个可能值为准,并使用<math>p(x)</math>作为权重,其表达式如下:<ref name=cover1991>{{cite book|isbn=0-471-06259-6|year=1991|authorlink1=Thomas M. Cover|author1=T. Cover|author2=J. Thomas|title=Elements of Information Theory|url=https://archive.org/details/elementsofinform0000cove|url-access=registration}}</ref>{{rp|15}} | + | 给定具有像<math>\mathcal X</math>的离散随机变量<math>X</math>和具有像<math>\mathcal Y</math>的离散随机变量<math>Y</math>,将给定<math>X</math>的<math>Y</math>的条件熵定义为以<math>p(x)</math>作为权重,对<math>x</math>的每个可能取值得到的<math>H(Y|X=x)</math>的加权和。其表达式如下:<ref name=cover1991>{{cite book|isbn=0-471-06259-6|year=1991|authorlink1=Thomas M. Cover|author1=T. Cover|author2=J. Thomas|title=Elements of Information Theory|url=https://archive.org/details/elementsofinform0000cove|url-access=registration}}</ref>{{rp|15}} |
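A sketch (again with an invented joint pmf) checking that the weighted-sum definition <math>\sum_x p(x)\,H(Y|X=x)</math> agrees with the equivalent double-sum form <math>-\sum_{x,y} p(x,y)\log_2\frac{p(x,y)}{p(x)}</math>:

<syntaxhighlight lang="python">
import math

p_xy = {(0, 0): 0.30, (0, 1): 0.20, (1, 0): 0.10, (1, 1): 0.40}  # invented example
xs = {x for x, _ in p_xy}
p_x = {x: sum(v for (xx, _), v in p_xy.items() if xx == x) for x in xs}

def h_y_given_x_eq(x):
    """H(Y | X = x) in bits."""
    cond = [v / p_x[x] for (xx, _), v in p_xy.items() if xx == x]
    return -sum(q * math.log2(q) for q in cond if q > 0)

# Weighted sum over x with weights p(x)
h1 = sum(p_x[x] * h_y_given_x_eq(x) for x in xs)
# Equivalent joint form: -sum_{x,y} p(x,y) log2( p(x,y) / p(x) )
h2 = -sum(v * math.log2(v / p_x[x]) for (x, _), v in p_xy.items())

print(h1, h2)  # both ~0.8464 bits
</syntaxhighlight>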
| | | |
| | | |
第97行: |
第97行: |
| <math>H(Y|X)=0</math> if and only if the value of <math>Y</math> is completely determined by the value of <math>X</math>. | | <math>H(Y|X)=0</math> if and only if the value of <math>Y</math> is completely determined by the value of <math>X</math>. |
| | | |
− | 当且仅当<math>Y</math>的值完全由<math>X</math>的值确定时,才为<math>H(Y|X)=0</math>。 | + | 当且仅当<math>Y</math>的值完全由<math>X</math>的值确定时,<math>H(Y|X)=0</math>。 |
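A quick check of this property on a toy deterministic relation Y = X mod 2 (my own example, not from the article):

<syntaxhighlight lang="python">
import math

# X uniform on {0,1,2,3}; Y = X mod 2 is fully determined by X
p_xy = {(x, x % 2): 0.25 for x in range(4)}
p_x = {x: 0.25 for x in range(4)}

h = sum(-v * math.log2(v / p_x[x]) for (x, _), v in p_xy.items())
print(h)  # 0.0: knowing X leaves no uncertainty about Y
</syntaxhighlight>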
| | | |
| | | |
第104行: |
第104行: |
| Conversely, <math>H(Y|X) = H(Y)</math> if and only if <math>Y</math> and <math>X</math> are [[independent random variables]]. | | Conversely, <math>H(Y|X) = H(Y)</math> if and only if <math>Y</math> and <math>X</math> are [[independent random variables]]. |
| | | |
− | 相反,当且仅当<math>Y</math>和<math>X</math>是独立随机变量时,则为<math>H(Y|X) =H(Y)</math>。 | + | 相反,当且仅当<math>Y</math>和<math>X</math>是互相独立的随机变量时,则<math>H(Y|X) =H(Y)</math>。 |
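And a check of the converse direction on a toy product distribution, where X and Y are independent by construction:

<syntaxhighlight lang="python">
import math

p_x = {0: 0.3, 1: 0.7}
p_y = {0: 0.6, 1: 0.4}
p_xy = {(x, y): p_x[x] * p_y[y] for x in p_x for y in p_y}  # independence

h_y = -sum(p * math.log2(p) for p in p_y.values())
h_y_given_x = sum(-v * math.log2(v / p_x[x]) for (x, _), v in p_xy.items())
print(h_y, h_y_given_x)  # both ~0.9710 bits
</syntaxhighlight>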
| | | |
| | | |
第111行: |
第111行: |
| Assume that the combined system determined by two random variables <math>X</math> and <math>Y</math> has [[joint entropy]] <math>H(X,Y)</math>, that is, we need <math>H(X,Y)</math> bits of information on average to describe its exact state. Now if we first learn the value of <math>X</math>, we have gained <math>H(X)</math> bits of information. Once <math>X</math> is known, we only need <math>H(X,Y)-H(X)</math> bits to describe the state of the whole system. This quantity is exactly <math>H(Y|X)</math>, which gives the ''chain rule'' of conditional entropy: | | Assume that the combined system determined by two random variables <math>X</math> and <math>Y</math> has [[joint entropy]] <math>H(X,Y)</math>, that is, we need <math>H(X,Y)</math> bits of information on average to describe its exact state. Now if we first learn the value of <math>X</math>, we have gained <math>H(X)</math> bits of information. Once <math>X</math> is known, we only need <math>H(X,Y)-H(X)</math> bits to describe the state of the whole system. This quantity is exactly <math>H(Y|X)</math>, which gives the ''chain rule'' of conditional entropy: |
| | | |
− | 假设由两个随机变量<math>X</math>和<math>Y</math>确定的组合系统具有联合熵<math>H(X,Y)</math>,也就是说,我们通常需要<math>H(X,Y)</math>位信息来描述其确切状态。现在,如果我们首先获得<math>X</math>的值,我们将知晓<math>H(X)</math>位信息。一旦知道了<math>X</math>的值,我们就可以通过<math>H(X,Y)</math>-<math>H(X)</math>位来描述整个系统的状态。这个数量恰好是<math>H(Y|X)</math>,它给出了条件熵的链式法则: | + | 假设由两个随机变量<math>X</math>和<math>Y</math>确定的组合系统具有联合熵<math>H(X,Y)</math>,也就是说,我们平均需要<math>H(X,Y)</math>位信息来描述其确切状态。现在,如果我们首先得知<math>X</math>的值,就获得了<math>H(X)</math>位信息。一旦<math>X</math>的值确定,我们只需要<math>H(X,Y)-H(X)</math>位就能描述整个系统的状态。这个数量恰好是<math>H(Y|X)</math>,它给出了条件熵的链式法则: |
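A numerical check of the chain rule <math>H(Y|X)=H(X,Y)-H(X)</math> on an invented joint pmf:

<syntaxhighlight lang="python">
import math

p_xy = {(0, 0): 0.30, (0, 1): 0.20, (1, 0): 0.10, (1, 1): 0.40}  # invented example
p_x = {0: 0.50, 1: 0.50}

h_joint = -sum(v * math.log2(v) for v in p_xy.values())
h_x = -sum(p * math.log2(p) for p in p_x.values())
h_y_given_x = sum(-v * math.log2(v / p_x[x]) for (x, _), v in p_xy.items())

print(h_joint - h_x, h_y_given_x)  # the two numbers coincide (~0.8464 bits)
</syntaxhighlight>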
| | | |
| | | |
第185行: |
第185行: |
| where <math>\operatorname{I}(X;Y)</math> is the [[mutual information]] between <math>X</math> and <math>Y</math>. | | where <math>\operatorname{I}(X;Y)</math> is the [[mutual information]] between <math>X</math> and <math>Y</math>. |
| | | |
− | 其中<math>\operatorname{I}(X;Y)</math>是<math>X</math>和<math>Y</math>之间的相互信息。 | + | 其中<math>\operatorname{I}(X;Y)</math>是<math>X</math>和<math>Y</math>之间的<font color="#ff8000"> 互信息</font>。 |
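The standard identity relating the two quantities, <math>H(Y|X)=H(Y)-\operatorname{I}(X;Y)</math>, can be verified numerically (invented example distribution, my own sketch):

<syntaxhighlight lang="python">
import math

p_xy = {(0, 0): 0.30, (0, 1): 0.20, (1, 0): 0.10, (1, 1): 0.40}  # invented example
p_x = {0: 0.50, 1: 0.50}
p_y = {0: 0.40, 1: 0.60}

# Mutual information I(X;Y) = sum p(x,y) log2( p(x,y) / (p(x) p(y)) )
mi = sum(v * math.log2(v / (p_x[x] * p_y[y])) for (x, y), v in p_xy.items())
h_y = -sum(p * math.log2(p) for p in p_y.values())
h_y_given_x = sum(-v * math.log2(v / p_x[x]) for (x, _), v in p_xy.items())

print(h_y - mi, h_y_given_x)  # H(Y) - I(X;Y) equals H(Y|X)
</syntaxhighlight>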
| | | |
| | | |
第199行: |
第199行: |
| Although the specific-conditional entropy <math>H(X|Y=y)</math> can be either less or greater than <math>H(X)</math> for a given [[random variate]] <math>y</math> of <math>Y</math>, <math>H(X|Y)</math> can never exceed <math>H(X)</math>. | | Although the specific-conditional entropy <math>H(X|Y=y)</math> can be either less or greater than <math>H(X)</math> for a given [[random variate]] <math>y</math> of <math>Y</math>, <math>H(X|Y)</math> can never exceed <math>H(X)</math>. |
| | | |
− | 尽管对于给定的<math>Y</math>随机变量<math>y</math>,特定条件熵<math>H(X|Y=y)</math>可以小于或大于<math>H(X)</math>,但<math>H(X|Y)</math>永远不会超过<math>H(X)</math>。
| + | 对于给定随机变量<math>Y</math>的值<math>y</math>,尽管特定条件熵<math>H(X|Y=y)</math>可以小于或大于<math>H(X)</math>,但<math>H(X|Y)</math>永远不会超过<math>H(X)</math>。 |
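An invented example illustrating the distinction: observing one particular value y can increase the uncertainty about X, yet the average H(X|Y) still does not exceed H(X).

<syntaxhighlight lang="python">
import math

def h(probs):
    """Entropy in bits of a collection of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Invented joint pmf: X is heavily skewed a priori, but Y=1 makes X uniform
p_xy = {(0, 0): 0.05, (1, 0): 0.85, (0, 1): 0.05, (1, 1): 0.05}
p_x = {0: 0.10, 1: 0.90}
p_y = {0: 0.90, 1: 0.10}

h_x = h(p_x.values())                                       # ~0.469 bits
h_x_given_y1 = h([p_xy[(x, 1)] / p_y[1] for x in (0, 1)])   # 1.0 bit  > H(X)
h_x_given_y0 = h([p_xy[(x, 0)] / p_y[0] for x in (0, 1)])   # ~0.310 bits
h_x_given_y = p_y[0] * h_x_given_y0 + p_y[1] * h_x_given_y1 # ~0.379 bits < H(X)

print(h_x, h_x_given_y1, h_x_given_y)
</syntaxhighlight>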
| | | |
| | | |
第237行: |
第237行: |
| Notice however that this rule may not be true if the involved differential entropies do not exist or are infinite. | | Notice however that this rule may not be true if the involved differential entropies do not exist or are infinite. |
| | | |
− | 但是请注意,如果所涉及的微分熵不存在或无限,则此规则可能不成立。
| + | 但是请注意,如果所涉及的微分熵不存在或无限,则此法则可能不成立。 |
| | | |
| | | |
第243行: |
第243行: |
| Joint differential entropy is also used in the definition of the [[mutual information]] between continuous random variables: | | Joint differential entropy is also used in the definition of the [[mutual information]] between continuous random variables: |
| | | |
− | 联合微分熵也用于定义连续随机变量之间的交互信息:
| + | 联合微分熵也用于定义连续随机变量之间的互信息: |
| | | |
| | | |
第251行: |
第251行: |
| <math>h(X|Y) \le h(X)</math> with equality if and only if <math>X</math> and <math>Y</math> are independent.<ref name=cover1991 />{{rp|253}} | | <math>h(X|Y) \le h(X)</math> with equality if and only if <math>X</math> and <math>Y</math> are independent.<ref name=cover1991 />{{rp|253}} |
| | | |
− | 当且仅当X和Y是独立的时,<math>h(X|Y) \le h(X)</math>才相等。
| + | <math>h(X|Y) \le h(X)</math>,当且仅当<math>X</math>和<math>Y</math>相互独立时等号成立。 |
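A closed-form illustration for jointly Gaussian variables (my own example, using the standard Gaussian differential-entropy formula): conditioning on Y shrinks the conditional variance by a factor (1 − ρ²), so h(X|Y) = h(X) + ½log₂(1 − ρ²) ≤ h(X), with equality exactly when ρ = 0, which for Gaussians means independence.

<syntaxhighlight lang="python">
import math

def gaussian_h(var):
    """Differential entropy in bits of a Gaussian with variance var."""
    return 0.5 * math.log2(2 * math.pi * math.e * var)

sigma_x2 = 2.0  # assumed variance of X (arbitrary choice)
for rho in (0.0, 0.5, 0.9):
    h_x = gaussian_h(sigma_x2)
    h_x_given_y = gaussian_h(sigma_x2 * (1 - rho ** 2))  # conditional variance
    print(rho, round(h_x, 4), round(h_x_given_y, 4))     # h(X|Y) <= h(X)
</syntaxhighlight>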
| | | |
| | | |
| | | |
− | === Relation to estimator error 与预估误差的关系 === | + | === Relation to estimator error 与估计量误差的关系 === |
| The conditional differential entropy yields a lower bound on the expected squared error of an [[estimator]]. For any random variable <math>X</math>, observation <math>Y</math> and estimator <math>\widehat{X}</math> the following holds:<ref name=cover1991 />{{rp|255}} | | The conditional differential entropy yields a lower bound on the expected squared error of an [[estimator]]. For any random variable <math>X</math>, observation <math>Y</math> and estimator <math>\widehat{X}</math> the following holds:<ref name=cover1991 />{{rp|255}} |
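A sketch of the bound in the Cover and Thomas form, E[(X − X̂(Y))²] ≥ e^{2h(X|Y)}/(2πe) with h measured in nats, on an assumed Gaussian toy model where the MMSE estimator attains it with equality (the model and numbers are my own illustration):

<syntaxhighlight lang="python">
import math

# Assumed toy model: X ~ N(0, s2x), Y = X + N(0, s2n) with independent noise
s2x, s2n = 2.0, 0.5

# MMSE estimator Xhat = E[X|Y] has error variance s2x*s2n / (s2x + s2n)
mmse = s2x * s2n / (s2x + s2n)

# Conditional differential entropy in nats: h(X|Y) = 0.5 * ln(2*pi*e*mmse)
h_x_given_y = 0.5 * math.log(2 * math.pi * math.e * mmse)

bound = math.exp(2 * h_x_given_y) / (2 * math.pi * math.e)
print(mmse, bound)  # equal in the Gaussian case; in general mmse >= bound
</syntaxhighlight>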
| | | |
第287行: |
第287行: |
| | | |
| * '''<font color="#ff8000"> 熵(信息论)Entropy (information theory)</font>''' | | * '''<font color="#ff8000"> 熵(信息论)Entropy (information theory)</font>''' |
− | * '''<font color="#ff8000"> 交互信息Mutual information</font>''' | + | * '''<font color="#ff8000"> 互信息Mutual information</font>''' |
| * '''<font color="#ff8000"> 条件量子熵Conditional quantum entropy</font>''' | | * '''<font color="#ff8000"> 条件量子熵Conditional quantum entropy</font>''' |
− | * '''<font color="#ff8000"> 信息变差Variation of information</font>''' | + | * '''<font color="#ff8000"> 信息差异Variation of information</font>''' |
| * '''<font color="#ff8000"> 熵幂不等式Entropy power inequality</font>''' | | * '''<font color="#ff8000"> 熵幂不等式Entropy power inequality</font>''' |
| * '''<font color="#ff8000"> 似然函数Likelihood function</font>''' | | * '''<font color="#ff8000"> 似然函数Likelihood function</font>''' |