Changes

2,345 bytes added, 23:40, 12 February 2021 (Fri)
==Properties of differential entropy==

* For probability densities <math>f</math> and <math>g</math>, the [[Kullback–Leibler divergence]] <math>D_{KL}(f || g)</math> is greater than or equal to 0 with equality only if <math>f=g</math> [[almost everywhere]]. Similarly, for two random variables <math>X</math> and <math>Y</math>, <math>I(X;Y) \ge 0</math> and <math>h(X|Y) \le h(X)</math> with equality [[if and only if]] <math>X</math> and <math>Y</math> are [[Statistical independence|independent]].
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: For probability densities <math>f</math> and <math>g</math>, the [[Kullback–Leibler divergence]] <math>D_{KL}(f || g)</math> is greater than or equal to 0, with equality only if <math>f=g</math> [[almost everywhere]]. Similarly, for two random variables <math>X</math> and <math>Y</math>, <math>I(X;Y) \ge 0</math> and <math>h(X|Y) \le h(X)</math>, with equality [[if and only if]] <math>X</math> and <math>Y</math> are [[Statistical independence|independent]].
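The nonnegativity of the divergence can be checked numerically; a minimal sketch in Python (assuming NumPy is available; the grid and function names here are ours, not from the article), integrating <math>f \log(f/g)</math> for two Gaussian densities:

```python
import numpy as np

# Check D_KL(f || g) >= 0, with equality when f = g, by a Riemann sum
# of f(x) * log(f(x)/g(x)) for two Gaussian densities on a fine grid.
def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def kl_divergence(mu_f, sigma_f, mu_g, sigma_g, lo=-12.0, hi=12.0, n=400001):
    x, dx = np.linspace(lo, hi, n, retstep=True)
    f = gaussian_pdf(x, mu_f, sigma_f)
    g = gaussian_pdf(x, mu_g, sigma_g)
    return float(np.sum(f * np.log(f / g)) * dx)

kl = kl_divergence(0.0, 1.0, 1.0, 2.0)
# Known closed form for Gaussians: log(s_g/s_f) + (s_f^2 + (mu_f-mu_g)^2)/(2 s_g^2) - 1/2
closed_form = np.log(2.0) + (1.0 + 1.0) / (2.0 * 4.0) - 0.5
assert kl >= 0.0
assert abs(kl - closed_form) < 1e-6
assert abs(kl_divergence(0.0, 1.0, 0.0, 1.0)) < 1e-12   # f = g gives 0
```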
    
* The chain rule for differential entropy holds as in the discrete case<ref name="cover_thomas" />{{rp|253}}
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: The chain rule for differential entropy holds as in the discrete case.
    
::<math>h(X_1, \ldots, X_n) = \sum_{i=1}^{n} h(X_i|X_1, \ldots, X_{i-1}) \leq \sum_{i=1}^{n} h(X_i)</math>.
 
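For a jointly Gaussian vector the subadditivity bound above reduces to Hadamard's inequality <math>\det K \le \prod_i K_{ii}</math>, since <math>h(\mathbf{X}) = \tfrac{1}{2}\log\left((2\pi e)^n \det K\right)</math> and <math>h(X_i) = \tfrac{1}{2}\log(2\pi e K_{ii})</math>. A quick numerical check (the covariance matrix here is an arbitrary example of ours, assuming NumPy):

```python
import numpy as np

# Subadditivity h(X_1,...,X_n) <= sum h(X_i) for a Gaussian vector with
# covariance K: joint entropy uses det K, marginal entropies use diag(K).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
K = A @ A.T + 4.0 * np.eye(4)        # symmetric positive definite covariance

n = K.shape[0]
h_joint = 0.5 * np.log((2 * np.pi * np.e) ** n * np.linalg.det(K))
h_marginals = 0.5 * np.log(2 * np.pi * np.e * np.diag(K)).sum()
assert h_joint <= h_marginals + 1e-12
```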
    
* Differential entropy is translation invariant, i.e. for a constant <math>c</math>.<ref name="cover_thomas" />{{rp|253}}
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: Differential entropy is translation invariant, i.e., for a constant <math>c</math>:
    
::<math>h(X+c) = h(X)</math>
 
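Translation invariance can be verified directly with SciPy, whose frozen distributions expose the differential entropy in nats and whose `loc` parameter is exactly a shift by <math>c</math> (the parameter values below are arbitrary examples):

```python
from scipy.stats import norm, expon

# h(X + c) = h(X): shifting the location parameter leaves entropy() unchanged.
h0 = norm(loc=0.0, scale=1.5).entropy()
hc = norm(loc=7.3, scale=1.5).entropy()
assert abs(h0 - hc) < 1e-12

# Same for a non-symmetric density (shifted exponential).
assert abs(expon(loc=0.0).entropy() - expon(loc=4.2).entropy()) < 1e-12
```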
    
* Differential entropy is in general not invariant under arbitrary invertible maps.
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: Differential entropy is in general not invariant under arbitrary invertible maps.
    
:: In particular, for a constant <math>a</math>
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: In particular, for a constant <math>a</math>:
    
:::<math>h(aX) = h(X)+ \log |a|</math>
 
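The scaling identity is easy to check for a Gaussian, since if <math>X \sim N(0,\sigma^2)</math> then <math>aX \sim N(0, a^2\sigma^2)</math>; a minimal sketch using SciPy's `entropy()` (in nats; the values of <math>a</math> and <math>\sigma</math> are arbitrary):

```python
import numpy as np
from scipy.stats import norm

# h(aX) = h(X) + log|a| for X ~ N(0, sigma^2), since aX ~ N(0, a^2 sigma^2).
sigma, a = 1.5, -3.0
h_X = norm(scale=sigma).entropy()
h_aX = norm(scale=abs(a) * sigma).entropy()
assert abs(h_aX - (h_X + np.log(abs(a)))) < 1e-12
```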
    
:: For a vector valued random variable <math>\mathbf{X}</math> and an invertible (square) [[matrix (mathematics)|matrix]] <math>\mathbf{A}</math>
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: For a vector-valued random variable <math>\mathbf{X}</math> and an invertible (square) matrix <math>\mathbf{A}</math>:
    
:::<math>h(\mathbf{A}\mathbf{X})=h(\mathbf{X})+\log \left( |\det \mathbf{A}| \right)</math><ref name="cover_thomas" />{{rp|253}}
 
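For a zero-mean Gaussian vector with covariance <math>K</math>, <math>\mathbf{A}\mathbf{X}</math> has covariance <math>\mathbf{A}K\mathbf{A}^T</math>, and the Gaussian entropy formula turns the identity above into <math>\det(\mathbf{A}K\mathbf{A}^T) = (\det\mathbf{A})^2 \det K</math>, which we can verify numerically (the matrices below are arbitrary examples of ours, assuming NumPy):

```python
import numpy as np

# h(AX) = h(X) + log|det A| for Gaussian X, via h = 0.5*log((2*pi*e)^n det K).
def gaussian_entropy(K):
    n = K.shape[0]
    return 0.5 * np.log((2 * np.pi * np.e) ** n * np.linalg.det(K))

K = np.diag([1.0, 2.0, 3.0])                       # covariance of X
A = np.array([[2.0, 1.0, 0.0],
              [0.5, -3.0, 1.0],
              [0.0, 1.0, 1.5]])                    # invertible, det(A) = -11.75

lhs = gaussian_entropy(A @ K @ A.T)                # h(AX)
rhs = gaussian_entropy(K) + np.log(abs(np.linalg.det(A)))
assert abs(lhs - rhs) < 1e-9
```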
    
* In general, for a transformation from a random vector to another random vector with same dimension <math>\mathbf{Y}=m \left(\mathbf{X}\right)</math>, the corresponding entropies are related via
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: In general, for a transformation from a random vector to another random vector of the same dimension, <math>\mathbf{Y}=m \left(\mathbf{X}\right)</math>, the corresponding entropies are related via
    
::<math>h(\mathbf{Y}) \leq h(\mathbf{X}) + \int f(x) \log \left\vert \frac{\partial m}{\partial x} \right\vert dx</math>
 
    
:where <math>\left\vert \frac{\partial m}{\partial x} \right\vert</math> is the [[Jacobian matrix and determinant|Jacobian]] of the transformation <math>m</math>.<ref>{{cite web |title=proof of upper bound on differential entropy of f(X) |work=[[Stack Exchange]] |date=April 16, 2016 |url=https://math.stackexchange.com/q/1745670 }}</ref> The above inequality becomes an equality if the transform is a bijection. Furthermore, when <math>m</math> is a rigid rotation, translation, or combination thereof, the Jacobian determinant is always 1, and <math>h(Y)=h(X)</math>.
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: where <math>\left\vert \frac{\partial m}{\partial x} \right\vert</math> is the [[Jacobian matrix and determinant|Jacobian]] of the transformation <math>m</math>. The above inequality becomes an equality if the transform is a bijection. Furthermore, when <math>m</math> is a rigid rotation, translation, or combination thereof, the Jacobian determinant is always 1, and <math>h(Y)=h(X)</math>.
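The rigid-rotation case can be checked for a Gaussian vector: an orthogonal rotation <math>Q</math> has <math>\det Q = 1</math>, and rotating the covariance to <math>QKQ^T</math> leaves its determinant, hence the entropy, unchanged (a sketch with an arbitrary 2-D example, assuming NumPy):

```python
import numpy as np

# A rigid rotation has Jacobian determinant 1, so h(QX) = h(X).
# For Gaussian X with covariance K, QX has covariance Q K Q^T with det unchanged.
def gaussian_entropy(K):
    n = K.shape[0]
    return 0.5 * np.log((2 * np.pi * np.e) ** n * np.linalg.det(K))

theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])    # 2-D rotation, det Q = 1
K = np.array([[2.0, 0.5],
              [0.5, 1.0]])

h_rot = gaussian_entropy(Q @ K @ Q.T)
h_orig = gaussian_entropy(K)
assert abs(np.linalg.det(Q) - 1.0) < 1e-12
assert abs(h_rot - h_orig) < 1e-9
```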
    
* If a random vector <math>X \in \mathbb{R}^n</math> has mean zero and [[covariance]] matrix <math>K</math>, <math>h(\mathbf{X}) \leq \frac{1}{2} \log(\det{2 \pi e K}) = \frac{1}{2} \log[(2\pi e)^n \det{K}]</math> with equality if and only if <math>X</math> is [[Multivariate normal distribution#Joint normality|jointly gaussian]] (see [[#Maximization in the normal distribution|below]]).<ref name="cover_thomas" />{{rp|254}}
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: If a random vector <math>X \in \mathbb{R}^n</math> has mean zero and [[covariance]] matrix <math>K</math>, then <math>h(\mathbf{X}) \leq \frac{1}{2} \log(\det{2 \pi e K}) = \frac{1}{2} \log[(2\pi e)^n \det{K}]</math>, with equality if and only if <math>X</math> is [[Multivariate normal distribution#Joint normality|jointly Gaussian]] (see [[#Maximization in the normal distribution|below]]).
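In one dimension this bound says that among all densities with variance <math>\sigma^2</math>, entropy is at most <math>\tfrac{1}{2}\log(2\pi e \sigma^2)</math>, attained by the Gaussian. A sketch comparing a uniform and a Laplace law scaled to the same variance (SciPy assumed; parameter values are arbitrary):

```python
import numpy as np
from scipy.stats import norm, uniform, laplace

# Gaussian maximizes differential entropy for a fixed variance sigma^2:
# compare entropy() of variance-matched uniform and Laplace laws to the bound.
sigma = 1.3
bound = 0.5 * np.log(2 * np.pi * np.e * sigma**2)

width = sigma * np.sqrt(12.0)      # Uniform(0, w) has variance w^2 / 12
b = sigma / np.sqrt(2.0)           # Laplace(scale=b) has variance 2 * b^2

h_unif = float(uniform(scale=width).entropy())
h_lap = float(laplace(scale=b).entropy())
assert h_unif <= bound
assert h_lap <= bound
assert abs(float(norm(scale=sigma).entropy()) - bound) < 1e-12  # equality case
```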
    
* It is not invariant under [[change of variables]], and is therefore most useful with dimensionless variables.
 
A modification of differential entropy that addresses these drawbacks is the '''relative information entropy''', also known as the Kullback–Leibler divergence, which includes an [[invariant measure]] factor (see [[limiting density of discrete points]]).
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: A modification of differential entropy that addresses these drawbacks is the "relative information entropy", also known as the Kullback–Leibler divergence, which includes an [[invariant measure]] factor (see [[limiting density of discrete points]]).
    
==Maximization in the normal distribution==
 