Changes

2,345 bytes added, 23:40, 12 February 2021 (Fri)
==Properties of differential entropy==

* For probability densities <math>f</math> and <math>g</math>, the [[Kullback–Leibler divergence]] <math>D_{KL}(f || g)</math> is greater than or equal to 0 with equality only if <math>f=g</math> [[almost everywhere]]. Similarly, for two random variables <math>X</math> and <math>Y</math>, <math>I(X;Y) \ge 0</math> and <math>h(X|Y) \le h(X)</math> with equality [[if and only if]] <math>X</math> and <math>Y</math> are [[Statistical independence|independent]].
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: For probability densities <math>f</math> and <math>g</math>, the [[Kullback–Leibler divergence]] <math>D_{KL}(f || g)</math> is greater than or equal to 0, with equality only if <math>f=g</math> [[almost everywhere]]. Similarly, for two random variables <math>X</math> and <math>Y</math>, <math>I(X;Y) \ge 0</math> and <math>h(X|Y) \le h(X)</math>, with equality [[if and only if]] <math>X</math> and <math>Y</math> are [[Statistical independence|independent]].
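The nonnegativity of the divergence can be checked numerically; a minimal sketch in Python (assuming NumPy is available; the grid and function names here are ours, not from the article), integrating <math>f \log(f/g)</math> for two Gaussian densities:

```python
import numpy as np

# Check D_KL(f || g) >= 0, with equality when f = g, by a Riemann sum
# of f(x) * log(f(x)/g(x)) for two Gaussian densities on a fine grid.
def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))

def kl_divergence(mu_f, sigma_f, mu_g, sigma_g, lo=-12.0, hi=12.0, n=400001):
    x, dx = np.linspace(lo, hi, n, retstep=True)
    f = gaussian_pdf(x, mu_f, sigma_f)
    g = gaussian_pdf(x, mu_g, sigma_g)
    return float(np.sum(f * np.log(f / g)) * dx)

kl = kl_divergence(0.0, 1.0, 1.0, 2.0)
# Known closed form for Gaussians: log(s_g/s_f) + (s_f^2 + (mu_f-mu_g)^2)/(2 s_g^2) - 1/2
closed_form = np.log(2.0) + (1.0 + 1.0) / (2.0 * 4.0) - 0.5
assert kl >= 0.0
assert abs(kl - closed_form) < 1e-6
assert abs(kl_divergence(0.0, 1.0, 0.0, 1.0)) < 1e-12   # f = g gives 0
```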
    
* The chain rule for differential entropy holds as in the discrete case<ref name="cover_thomas" />{{rp|253}}
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: The chain rule for differential entropy holds as in the discrete case.
    
::<math>h(X_1, \ldots, X_n) = \sum_{i=1}^{n} h(X_i|X_1, \ldots, X_{i-1}) \leq \sum_{i=1}^{n} h(X_i)</math>.
 
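For a jointly Gaussian vector the subadditivity bound above reduces to Hadamard's inequality <math>\det K \le \prod_i K_{ii}</math>, since <math>h(\mathbf{X}) = \tfrac{1}{2}\log\left((2\pi e)^n \det K\right)</math> and <math>h(X_i) = \tfrac{1}{2}\log(2\pi e K_{ii})</math>. A quick numerical check (the covariance matrix here is an arbitrary example of ours, assuming NumPy):

```python
import numpy as np

# Subadditivity h(X_1,...,X_n) <= sum h(X_i) for a Gaussian vector with
# covariance K: joint entropy uses det K, marginal entropies use diag(K).
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
K = A @ A.T + 4.0 * np.eye(4)        # symmetric positive definite covariance

n = K.shape[0]
h_joint = 0.5 * np.log((2 * np.pi * np.e) ** n * np.linalg.det(K))
h_marginals = 0.5 * np.log(2 * np.pi * np.e * np.diag(K)).sum()
assert h_joint <= h_marginals + 1e-12
```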
    
* Differential entropy is translation invariant, i.e. for a constant <math>c</math>.<ref name="cover_thomas" />{{rp|253}}
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: Differential entropy is translation invariant, i.e., for a constant <math>c</math>:
    
::<math>h(X+c) = h(X)</math>
 
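Translation invariance can be verified directly with SciPy, whose frozen distributions expose the differential entropy in nats and whose `loc` parameter is exactly a shift by <math>c</math> (the parameter values below are arbitrary examples):

```python
from scipy.stats import norm, expon

# h(X + c) = h(X): shifting the location parameter leaves entropy() unchanged.
h0 = norm(loc=0.0, scale=1.5).entropy()
hc = norm(loc=7.3, scale=1.5).entropy()
assert abs(h0 - hc) < 1e-12

# Same for a non-symmetric density (shifted exponential).
assert abs(expon(loc=0.0).entropy() - expon(loc=4.2).entropy()) < 1e-12
```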
    
* Differential entropy is in general not invariant under arbitrary invertible maps.
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: Differential entropy is in general not invariant under arbitrary invertible maps.
    
:: In particular, for a constant <math>a</math>
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: In particular, for a constant <math>a</math>:
    
:::<math>h(aX) = h(X)+ \log |a|</math>
 
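The scaling identity is easy to check for a Gaussian, since if <math>X \sim N(0,\sigma^2)</math> then <math>aX \sim N(0, a^2\sigma^2)</math>; a minimal sketch using SciPy's `entropy()` (in nats; the values of <math>a</math> and <math>\sigma</math> are arbitrary):

```python
import numpy as np
from scipy.stats import norm

# h(aX) = h(X) + log|a| for X ~ N(0, sigma^2), since aX ~ N(0, a^2 sigma^2).
sigma, a = 1.5, -3.0
h_X = norm(scale=sigma).entropy()
h_aX = norm(scale=abs(a) * sigma).entropy()
assert abs(h_aX - (h_X + np.log(abs(a)))) < 1e-12
```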
    
:: For a vector valued random variable <math>\mathbf{X}</math> and an invertible (square) [[matrix (mathematics)|matrix]] <math>\mathbf{A}</math>
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: For a vector-valued random variable <math>\mathbf{X}</math> and an invertible (square) matrix <math>\mathbf{A}</math>:
    
:::<math>h(\mathbf{A}\mathbf{X})=h(\mathbf{X})+\log \left( |\det \mathbf{A}| \right)</math><ref name="cover_thomas" />{{rp|253}}
 
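For a zero-mean Gaussian vector with covariance <math>K</math>, <math>\mathbf{A}\mathbf{X}</math> has covariance <math>\mathbf{A}K\mathbf{A}^T</math>, and the Gaussian entropy formula turns the identity above into <math>\det(\mathbf{A}K\mathbf{A}^T) = (\det\mathbf{A})^2 \det K</math>, which we can verify numerically (the matrices below are arbitrary examples of ours, assuming NumPy):

```python
import numpy as np

# h(AX) = h(X) + log|det A| for Gaussian X, via h = 0.5*log((2*pi*e)^n det K).
def gaussian_entropy(K):
    n = K.shape[0]
    return 0.5 * np.log((2 * np.pi * np.e) ** n * np.linalg.det(K))

K = np.diag([1.0, 2.0, 3.0])                       # covariance of X
A = np.array([[2.0, 1.0, 0.0],
              [0.5, -3.0, 1.0],
              [0.0, 1.0, 1.5]])                    # invertible, det(A) = -11.75

lhs = gaussian_entropy(A @ K @ A.T)                # h(AX)
rhs = gaussian_entropy(K) + np.log(abs(np.linalg.det(A)))
assert abs(lhs - rhs) < 1e-9
```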
    
* In general, for a transformation from a random vector to another random vector with same dimension <math>\mathbf{Y}=m \left(\mathbf{X}\right)</math>, the corresponding entropies are related via
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: In general, for a transformation from a random vector to another random vector of the same dimension, <math>\mathbf{Y}=m \left(\mathbf{X}\right)</math>, the corresponding entropies are related via
    
::<math>h(\mathbf{Y}) \leq h(\mathbf{X}) + \int f(x) \log \left\vert \frac{\partial m}{\partial x} \right\vert dx</math>
 
    
:where <math>\left\vert \frac{\partial m}{\partial x} \right\vert</math> is the [[Jacobian matrix and determinant|Jacobian]] of the transformation <math>m</math>.<ref>{{cite web |title=proof of upper bound on differential entropy of f(X) |work=[[Stack Exchange]] |date=April 16, 2016 |url=https://math.stackexchange.com/q/1745670 }}</ref> The above inequality becomes an equality if the transform is a bijection. Furthermore, when <math>m</math> is a rigid rotation, translation, or combination thereof, the Jacobian determinant is always 1, and <math>h(Y)=h(X)</math>.
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: where <math>\left\vert \frac{\partial m}{\partial x} \right\vert</math> is the [[Jacobian matrix and determinant|Jacobian]] of the transformation <math>m</math>. The above inequality becomes an equality if the transform is a bijection. Furthermore, when <math>m</math> is a rigid rotation, translation, or combination thereof, the Jacobian determinant is always 1, and <math>h(Y)=h(X)</math>.
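The rigid-rotation case can be checked for a Gaussian vector: an orthogonal rotation <math>Q</math> has <math>\det Q = 1</math>, and rotating the covariance to <math>QKQ^T</math> leaves its determinant, hence the entropy, unchanged (a sketch with an arbitrary 2-D example, assuming NumPy):

```python
import numpy as np

# A rigid rotation has Jacobian determinant 1, so h(QX) = h(X).
# For Gaussian X with covariance K, QX has covariance Q K Q^T with det unchanged.
def gaussian_entropy(K):
    n = K.shape[0]
    return 0.5 * np.log((2 * np.pi * np.e) ** n * np.linalg.det(K))

theta = 0.7
Q = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])    # 2-D rotation, det Q = 1
K = np.array([[2.0, 0.5],
              [0.5, 1.0]])

h_rot = gaussian_entropy(Q @ K @ Q.T)
h_orig = gaussian_entropy(K)
assert abs(np.linalg.det(Q) - 1.0) < 1e-12
assert abs(h_rot - h_orig) < 1e-9
```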
    
* If a random vector <math>X \in \mathbb{R}^n</math> has mean zero and [[covariance]] matrix <math>K</math>, <math>h(\mathbf{X}) \leq \frac{1}{2} \log(\det{2 \pi e K}) = \frac{1}{2} \log[(2\pi e)^n \det{K}]</math> with equality if and only if <math>X</math> is [[Multivariate normal distribution#Joint normality|jointly gaussian]] (see [[#Maximization in the normal distribution|below]]).<ref name="cover_thomas" />{{rp|254}}
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: If a random vector <math>X \in \mathbb{R}^n</math> has mean zero and [[covariance]] matrix <math>K</math>, then <math>h(\mathbf{X}) \leq \frac{1}{2} \log(\det{2 \pi e K}) = \frac{1}{2} \log[(2\pi e)^n \det{K}]</math>, with equality if and only if <math>X</math> is [[Multivariate normal distribution#Joint normality|jointly Gaussian]] (see [[#Maximization in the normal distribution|below]]).
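In one dimension this bound says that among all densities with variance <math>\sigma^2</math>, entropy is at most <math>\tfrac{1}{2}\log(2\pi e \sigma^2)</math>, attained by the Gaussian. A sketch comparing a uniform and a Laplace law scaled to the same variance (SciPy assumed; parameter values are arbitrary):

```python
import numpy as np
from scipy.stats import norm, uniform, laplace

# Gaussian maximizes differential entropy for a fixed variance sigma^2:
# compare entropy() of variance-matched uniform and Laplace laws to the bound.
sigma = 1.3
bound = 0.5 * np.log(2 * np.pi * np.e * sigma**2)

width = sigma * np.sqrt(12.0)      # Uniform(0, w) has variance w^2 / 12
b = sigma / np.sqrt(2.0)           # Laplace(scale=b) has variance 2 * b^2

h_unif = float(uniform(scale=width).entropy())
h_lap = float(laplace(scale=b).entropy())
assert h_unif <= bound
assert h_lap <= bound
assert abs(float(norm(scale=sigma).entropy()) - bound) < 1e-12  # equality case
```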
    
* It is not invariant under [[change of variables]], and is therefore most useful with dimensionless variables.
 
A modification of differential entropy that addresses these drawbacks is the '''relative information entropy''', also known as the Kullback–Leibler divergence, which includes an [[invariant measure]] factor (see [[limiting density of discrete points]]).
 
--[[用户:CecileLi|CecileLi]]([[用户讨论:CecileLi|讨论]]) [Review] Supplementary translation: A modification of differential entropy that addresses these drawbacks is the "relative information entropy", also known as the Kullback–Leibler divergence, which includes an [[invariant measure]] factor (see [[limiting density of discrete points]]).
    
==Maximization in the normal distribution==
 