== Properties of differential entropy ==
* For probability densities <math>f</math> and <math>g</math>, the [[Kullback–Leibler divergence]] <math>D_{KL}(f || g)</math> is greater than or equal to 0 with equality only if <math>f=g</math> [[almost everywhere]]. Similarly, for two random variables <math>X</math> and <math>Y</math>, <math>I(X;Y) \ge 0</math> and <math>h(X|Y) \le h(X)</math> with equality [[if and only if]] <math>X</math> and <math>Y</math> are [[Statistical independence|independent]].
--[[用户:CecileLi|CecileLi]] ([[用户讨论:CecileLi|talk]]) [Review] Supplementary translation: For probability densities <math>f</math> and <math>g</math>, the [[Kullback–Leibler散度|Kullback–Leibler divergence]] <math>D_{KL}(f || g)</math> is greater than or equal to 0, with equality only if <math>f=g</math> [[几乎处处|almost everywhere]]. Similarly, for two random variables <math>X</math> and <math>Y</math>, <math>I(X;Y) \ge 0</math> and <math>h(X|Y) \le h(X)</math>, with equality if and only if <math>X</math> and <math>Y</math> are [[统计独立性|independent]].
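: A quick illustrative check of this property (an example added here, using natural logarithms): for two zero-mean normal densities <math>f = \mathcal{N}(0,\sigma_1^2)</math> and <math>g = \mathcal{N}(0,\sigma_2^2)</math>, a standard computation gives
::<math>D_{KL}(f || g) = \log\frac{\sigma_2}{\sigma_1} + \frac{\sigma_1^2}{2\sigma_2^2} - \frac{1}{2} = \frac{1}{2}\left(t - 1 - \log t\right), \qquad t = \frac{\sigma_1^2}{\sigma_2^2},</math>
: which is nonnegative because <math>\log t \le t - 1</math>, and vanishes exactly when <math>t = 1</math>, i.e. when <math>f = g</math>.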
* The chain rule for differential entropy holds as in the discrete case<ref name="cover_thomas" />{{rp|253}}
--[[用户:CecileLi|CecileLi]] ([[用户讨论:CecileLi|talk]]) [Review] Supplementary translation: The chain rule for differential entropy holds just as in the discrete case:
::<math>h(X_1, \ldots, X_n) = \sum_{i=1}^{n} h(X_i|X_1, \ldots, X_{i-1}) \leq \sum_{i=1}^{n} h(X_i)</math>.
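: For instance (an illustrative case, using the Gaussian entropy formula of the next section): for a bivariate normal pair with unit variances and correlation coefficient <math>\rho</math>, the chain-rule bound reads
::<math>h(X_1, X_2) = \frac{1}{2} \log\left[(2\pi e)^2 (1-\rho^2)\right] = h(X_1) + h(X_2) + \frac{1}{2}\log(1-\rho^2) \leq h(X_1) + h(X_2),</math>
: with equality if and only if <math>\rho = 0</math>, i.e. when the two components are independent.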
* Differential entropy is translation invariant, i.e. for a constant <math>c</math>.<ref name="cover_thomas" />{{rp|253}}
--[[用户:CecileLi|CecileLi]] ([[用户讨论:CecileLi|talk]]) [Review] Supplementary translation: Differential entropy is translation invariant, i.e. for a constant <math>c</math>:
::<math>h(X+c) = h(X)</math>
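: This follows from a change of variable in the defining integral: if <math>Y = X + c</math>, then <math>f_Y(y) = f_X(y-c)</math>, so
::<math>h(X+c) = -\int f_X(y-c) \log f_X(y-c) \, dy = -\int f_X(x) \log f_X(x) \, dx = h(X).</math>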
* Differential entropy is in general not invariant under arbitrary invertible maps.
--[[用户:CecileLi|CecileLi]] ([[用户讨论:CecileLi|talk]]) [Review] Supplementary translation: In general, differential entropy is not invariant under arbitrary invertible maps.
:: In particular, for a constant <math>a</math>
--[[用户:CecileLi|CecileLi]] ([[用户讨论:CecileLi|talk]]) [Review] Supplementary translation: In particular, for a constant <math>a</math>:
:::<math>h(aX) = h(X)+ \log |a|</math>
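: A concrete instance: if <math>X</math> is uniform on <math>[0,1]</math>, then <math>h(X) = 0</math>, while for <math>a > 0</math> the variable <math>aX</math> is uniform on <math>[0,a]</math> with density <math>1/a</math>, so
::<math>h(aX) = -\int_0^a \frac{1}{a} \log\frac{1}{a} \, dx = \log a = h(X) + \log|a|.</math>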
:: For a vector valued random variable <math>\mathbf{X}</math> and an invertible (square) [[matrix (mathematics)|matrix]] <math>\mathbf{A}</math>
--[[用户:CecileLi|CecileLi]] ([[用户讨论:CecileLi|talk]]) [Review] Supplementary translation: For a vector-valued random variable <math>\mathbf{X}</math> and an invertible (square) matrix <math>\mathbf{A}</math>:
:::<math>h(\mathbf{A}\mathbf{X})=h(\mathbf{X})+\log \left( |\det \mathbf{A}| \right)</math><ref name="cover_thomas" />{{rp|253}}
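: This is the change-of-variables formula for densities at work: <math>f_{\mathbf{A}\mathbf{X}}(\mathbf{y}) = f_{\mathbf{X}}(\mathbf{A}^{-1}\mathbf{y}) / |\det \mathbf{A}|</math>, so substituting <math>\mathbf{x} = \mathbf{A}^{-1}\mathbf{y}</math> (with <math>d\mathbf{y} = |\det \mathbf{A}| \, d\mathbf{x}</math>) in the defining integral gives
::<math>h(\mathbf{A}\mathbf{X}) = -\int f_{\mathbf{X}}(\mathbf{x}) \left[\log f_{\mathbf{X}}(\mathbf{x}) - \log|\det \mathbf{A}|\right] d\mathbf{x} = h(\mathbf{X}) + \log|\det \mathbf{A}|.</math>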
* In general, for a transformation from a random vector to another random vector with same dimension <math>\mathbf{Y}=m \left(\mathbf{X}\right)</math>, the corresponding entropies are related via
--[[用户:CecileLi|CecileLi]] ([[用户讨论:CecileLi|talk]]) [Review] Supplementary translation: In general, for a transformation from a random vector to another random vector of the same dimension, <math>\mathbf{Y}=m \left(\mathbf{X}\right)</math>, the corresponding entropies are related via:
::<math>h(\mathbf{Y}) \leq h(\mathbf{X}) + \int f(x) \log \left\vert \frac{\partial m}{\partial x} \right\vert dx</math>
:where <math>\left\vert \frac{\partial m}{\partial x} \right\vert</math> is the [[Jacobian matrix and determinant|Jacobian]] of the transformation <math>m</math>.<ref>{{cite web |title=proof of upper bound on differential entropy of f(X) |work=[[Stack Exchange]] |date=April 16, 2016 |url=https://math.stackexchange.com/q/1745670 }}</ref> The above inequality becomes an equality if the transform is a bijection. Furthermore, when <math>m</math> is a rigid rotation, translation, or combination thereof, the Jacobian determinant is always 1, and <math>h(Y)=h(X)</math>.
--[[用户:CecileLi|CecileLi]] ([[用户讨论:CecileLi|talk]]) [Review] Supplementary translation: where <math>\left\vert \frac{\partial m}{\partial x} \right\vert</math> is the [[Jacobian矩阵和行列式|Jacobian]] of the transformation <math>m</math>. The above inequality becomes an equality if the transform is a bijection. Furthermore, when <math>m</math> is a rigid rotation, a translation, or a combination of the two, the Jacobian determinant is always 1, and <math>h(Y)=h(X)</math>.
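: In the bijective case the bound is attained because the density transforms exactly, giving the equality form
::<math>h(\mathbf{Y}) = h(\mathbf{X}) + \int f(x) \log \left\vert \frac{\partial m}{\partial x} \right\vert dx,</math>
: of which the scaling and matrix identities above are special cases (<math>m(x) = ax</math> and <math>m(\mathbf{x}) = \mathbf{A}\mathbf{x}</math>, respectively, each with a constant Jacobian).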
* If a random vector <math>X \in \mathbb{R}^n</math> has mean zero and [[covariance]] matrix <math>K</math>, <math>h(\mathbf{X}) \leq \frac{1}{2} \log(\det{2 \pi e K}) = \frac{1}{2} \log[(2\pi e)^n \det{K}]</math> with equality if and only if <math>X</math> is [[Multivariate normal distribution#Joint normality|jointly gaussian]] (see [[#Maximization in the normal distribution|below]]).<ref name="cover_thomas" />{{rp|254}}
--[[用户:CecileLi|CecileLi]] ([[用户讨论:CecileLi|talk]]) [Review] Supplementary translation: If a random vector <math>X \in \mathbb{R}^n</math> has mean zero and covariance matrix <math>K</math>, then <math>h(\mathbf{X}) \leq \frac{1}{2} \log(\det{2 \pi e K}) = \frac{1}{2} \log[(2\pi e)^n \det{K}]</math>, with equality if and only if <math>X</math> is jointly Gaussian (see [[#正态分布中的最大化|below]]).
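: In the scalar case <math>n = 1</math> with variance <math>K = \sigma^2</math>, this bound reduces to
::<math>h(X) \leq \frac{1}{2} \log(2\pi e \sigma^2),</math>
: attained by the <math>\mathcal{N}(0, \sigma^2)</math> density; this is the one-dimensional form of the maximization result in the next section.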
* It is not invariant under [[change of variables]], and is therefore most useful with dimensionless variables.
A modification of differential entropy that addresses these drawbacks is the '''relative information entropy''', also known as the Kullback–Leibler divergence, which includes an [[invariant measure]] factor (see [[limiting density of discrete points]]).
--[[用户:CecileLi|CecileLi]] ([[用户讨论:CecileLi|talk]]) [Review] Supplementary translation: A modification of differential entropy that addresses these drawbacks is the '''relative information entropy''', also known as the Kullback–Leibler divergence, which includes an [[invariant measure]] factor (see [[limiting density of discrete points]]).
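: A sketch of the invariant form, following the convention of the linked [[limiting density of discrete points]] article: with an invariant measure (reference density) <math>m(x)</math>, the modified quantity is
::<math>-\int f(x) \log\frac{f(x)}{m(x)} \, dx = -D_{KL}(f || m),</math>
: which, unlike <math>h</math>, is unchanged under an invertible change of variables, since <math>f</math> and <math>m</math> pick up the same Jacobian factor.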
| ==Maximization in the normal distribution== | | ==Maximization in the normal distribution== |