# Joint entropy

## Definition

The joint Shannon entropy (in bits) of a pair of discrete random variables $X$ and $Y$ with joint probability mass function $P(x,y)$ is

$\displaystyle{ H(X,Y) = -\sum_{x\in\mathcal X} \sum_{y\in\mathcal Y} P(x,y) \log_2[P(x,y)] }$

(Eq.1)

For more than two random variables $X_1, \ldots, X_n$ this expands to

$\displaystyle{ H(X_1, \ldots, X_n) = -\sum_{x_1 \in\mathcal X_1} \ldots \sum_{x_n \in\mathcal X_n} P(x_1, \ldots, x_n) \log_2[P(x_1, \ldots, x_n)] }$

(Eq.2)
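As a worked instance of Eq. 1, the following minimal Python sketch (the function name and the example pmf are illustrative) computes the joint entropy of a joint pmf stored as a dictionary, skipping zero-probability outcomes by the usual convention $0 \log 0 = 0$:

```python
import math

def joint_entropy(pxy):
    """Joint entropy H(X,Y) in bits, from a joint pmf given as {(x, y): p}."""
    return -sum(p * math.log2(p) for p in pxy.values() if p > 0)

# Example: X uniform on {0, 1} and Y = X, so the pair carries only 1 bit.
pxy = {(0, 0): 0.5, (1, 1): 0.5}
print(joint_entropy(pxy))  # 1.0
```

The same function handles any finite alphabet, since Eq. 1 only ever touches the probabilities themselves, not the outcome labels.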

## Properties

### Nonnegativity

$\displaystyle{ H(X,Y) \geq 0 }$
$\displaystyle{ H(X_1,\ldots, X_n) \geq 0 }$

### Greater than or equal to the individual entropies

$\displaystyle{ H(X,Y) \geq \max \left[H(X),H(Y) \right] }$
$\displaystyle{ H \bigl(X_1,\ldots, X_n \bigr) \geq \max_{1 \le i \le n} \Bigl\{H\bigl(X_i\bigr) \Bigr\} }$

### Less than or equal to the sum of the individual entropies (subadditivity)

$\displaystyle{ H(X,Y) \leq H(X) + H(Y) }$
$\displaystyle{ H(X_1,\ldots, X_n) \leq H(X_1) + \ldots + H(X_n) }$
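The two bounds above can be checked numerically on any concrete joint pmf; a short sketch with an illustrative $2 \times 2$ table (the numbers are arbitrary):

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Joint pmf of (X, Y) as a 2x2 table: rows index X, columns index Y.
pxy = [[0.4, 0.1],
       [0.1, 0.4]]
px = [sum(row) for row in pxy]        # marginal of X
py = [sum(col) for col in zip(*pxy)]  # marginal of Y
hxy = entropy([p for row in pxy for p in row])

# max[H(X), H(Y)] <= H(X,Y) <= H(X) + H(Y)
assert max(entropy(px), entropy(py)) <= hxy <= entropy(px) + entropy(py)
```

The upper bound holds with equality exactly when $X$ and $Y$ are independent; here they are correlated, so $H(X,Y)$ falls strictly between the two bounds.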

## Relations to other entropy measures

Joint entropy is used in the definition of conditional entropy:

$\displaystyle{ H(X|Y) = H(X,Y) - H(Y) }$

and in the chain rule for joint entropy:

$\displaystyle{ H(X_1,\dots,X_n) = \sum_{k=1}^n H(X_k|X_{k-1},\dots, X_1) }$

It is also used in the definition of mutual information:

$\displaystyle{ \operatorname{I}(X;Y) = H(X) + H(Y) - H(X,Y) }$
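Both identities can be verified directly on a small example: computing $H(X|Y)$ from its own definition, $\sum_y P(y)\, H(X|Y{=}y)$, and comparing it against $H(X,Y) - H(Y)$ (the joint pmf below is illustrative):

```python
import math

def entropy(probs):
    """Shannon entropy in bits of a probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Joint pmf of (X, Y) over {0,1} x {0,1} (illustrative numbers)
pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
hxy = entropy(pxy.values())
hx = entropy([pxy[0, 0] + pxy[0, 1], pxy[1, 0] + pxy[1, 1]])
hy = entropy([pxy[0, 0] + pxy[1, 0], pxy[0, 1] + pxy[1, 1]])

# H(X|Y) computed directly as sum_y P(y) * H(X | Y = y) ...
py = {y: pxy[0, y] + pxy[1, y] for y in (0, 1)}
h_x_given_y = sum(py[y] * entropy([pxy[x, y] / py[y] for x in (0, 1)])
                  for y in (0, 1))

# ... agrees with H(X,Y) - H(Y), and I(X;Y) = H(X) + H(Y) - H(X,Y).
assert abs(h_x_given_y - (hxy - hy)) < 1e-9
mutual_information = hx + hy - hxy
```

Since $\operatorname{I}(X;Y) = H(X) - H(X|Y)$ follows by substituting the first identity into the second, the same numbers confirm both relations at once.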

## Joint differential entropy

### Definition

$\displaystyle{ h(X,Y) = -\int_{\mathcal X, \mathcal Y} f(x,y)\log f(x,y)\,dx\,dy }$

(Eq.3)

$\displaystyle{ h(X_1, \ldots,X_n) = -\int f(x_1, \ldots,x_n)\log f(x_1, \ldots,x_n)\,dx_1 \ldots dx_n }$

(Eq.4)
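As a sanity check on Eq. 3, the joint differential entropy of two independent standard normal variables can be approximated by a Riemann sum and compared against the known closed form for a multivariate normal, $\tfrac{1}{2}\log_2\!\left[(2\pi e)^n \det \Sigma\right]$, here with $n = 2$ and $\Sigma = I$ (the grid range and step size are illustrative choices):

```python
import math

# Joint density of two independent standard normal variables
def f(x, y):
    return math.exp(-(x * x + y * y) / 2) / (2 * math.pi)

# Riemann-sum approximation of Eq. 3 (base-2 log, so the result is in bits)
step = 0.05
grid = [i * step for i in range(-160, 161)]  # truncate the integral at +/- 8
h_numeric = -sum(f(x, y) * math.log2(f(x, y)) * step * step
                 for x in grid for y in grid)

# Closed form: (1/2) * log2((2*pi*e)^2 * det(I)) = log2(2*pi*e)
h_exact = math.log2(2 * math.pi * math.e)
assert abs(h_numeric - h_exact) < 1e-3
```

Note that, unlike Eq. 1, the result depends on the scale of the variables: rescaling $X$ and $Y$ shifts $h(X,Y)$ by the log of the Jacobian, and differential entropy can even be negative.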

### Properties

$\displaystyle{ h(X_1,X_2, \ldots,X_n) \le \sum_{i=1}^n h(X_i) }$[2]:253

$\displaystyle{ h(X,Y) = h(X|Y) + h(Y) }$

$\displaystyle{ h(X_1,X_2, \ldots,X_n) = \sum_{i=1}^n h(X_i|X_1,X_2, \ldots,X_{i-1}) }$

$\displaystyle{ \operatorname{I}(X;Y) = h(X) + h(Y) - h(X,Y) }$

## References

1. Korn, Granino Arthur; Korn, Theresa M. *Mathematical Handbook for Scientists and Engineers: Definitions, Theorems, and Formulas for Reference and Review*. New York: Dover Publications. ISBN 0-486-41147-8.
2. Cover, Thomas M.; Thomas, Joy A. *Elements of Information Theory*. Hoboken, New Jersey: Wiley. ISBN 0-471-24195-4.
3. "InfoTopo: Topological Information Data Analysis. Deep statistical unsupervised and supervised learning". github.com/pierrebaudot/infotopopy/. Retrieved 26 September 2020.