Given the [[statistical model]] which generates a set <math>\mathbf{X}</math> of observed data, a set of unobserved latent data or [[missing values]] <math>\mathbf{Z}</math>, and a vector of unknown parameters <math>\boldsymbol\theta</math>, along with a [[likelihood function]] <math>L(\boldsymbol\theta; \mathbf{X}, \mathbf{Z}) = p(\mathbf{X}, \mathbf{Z}\mid\boldsymbol\theta)</math>, the [[maximum likelihood estimate]] (MLE) of the unknown parameters is determined by maximizing the [[marginal likelihood]] of the observed data.
:<math>L(\boldsymbol\theta; \mathbf{X}) = p(\mathbf{X}\mid\boldsymbol\theta) = \int  p(\mathbf{X},\mathbf{Z} \mid \boldsymbol\theta) \, d\mathbf{Z} </math>
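When the latent variable <math>\mathbf{Z}</math> is discrete, the integral becomes a sum over its values. A minimal sketch (the two-component Bernoulli mixture and all parameter values are hypothetical, for illustration only):

```python
# Hypothetical two-component Bernoulli mixture (all numbers made up):
# z in {0, 1} picks a coin, theta = (mixing weight, p0, p1).
weight, p0, p1 = 0.6, 0.3, 0.8

def joint(x, z):
    """p(x, z | theta) for a single flip x in {0, 1}."""
    prior = weight if z == 0 else 1.0 - weight
    p = p0 if z == 0 else p1
    return prior * (p if x == 1 else 1.0 - p)

def marginal(x):
    """L(theta; x) = p(x | theta): here the integral is a sum over z."""
    return sum(joint(x, z) for z in (0, 1))

print(marginal(1))  # 0.6*0.3 + 0.4*0.8 = 0.5
```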
However, this quantity is often intractable since <math>\mathbf{Z}</math> is unobserved and the distribution of <math>\mathbf{Z}</math> is unknown before attaining <math>\boldsymbol\theta</math>.
The EM algorithm seeks to find the MLE of the marginal likelihood by iteratively applying these two steps:
:''Expectation step (E step)'': Define <math>Q(\boldsymbol\theta\mid\boldsymbol\theta^{(t)})</math> as the [[expected value]] of the log [[likelihood function]] of <math>\boldsymbol\theta</math>, with respect to the current [[conditional probability distribution|conditional distribution]] of <math>\mathbf{Z}</math> given <math>\mathbf{X}</math> and the current estimates of the parameters <math>\boldsymbol\theta^{(t)}</math>:
::<math>Q(\boldsymbol\theta\mid\boldsymbol\theta^{(t)}) = \operatorname{E}_{\mathbf{Z}\mid\mathbf{X},\boldsymbol\theta^{(t)}}\left[ \log L (\boldsymbol\theta; \mathbf{X},\mathbf{Z})  \right] \,</math>
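In practice the expectation reduces to weighting the complete-data log-likelihood by the posterior responsibilities <math>p(\mathbf{Z}\mid\mathbf{X},\boldsymbol\theta^{(t)})</math>. A sketch for a hypothetical two-component Bernoulli mixture (all names and values illustrative):

```python
import math

# Hypothetical current estimate theta^(t): (mixing weight, p0, p1).
weight, p0, p1 = 0.6, 0.3, 0.8

def joint(x, z, theta):
    """p(x, z | theta) for a single flip x in {0, 1}."""
    w, q0, q1 = theta
    prior = w if z == 0 else 1.0 - w
    p = q0 if z == 0 else q1
    return prior * (p if x == 1 else 1.0 - p)

def responsibility(x, z):
    """Posterior p(z | x, theta^(t)) -- the weight inside the expectation."""
    cur = (weight, p0, p1)
    total = joint(x, 0, cur) + joint(x, 1, cur)
    return joint(x, z, cur) / total

def Q(theta, data):
    """Q(theta | theta^(t)) = sum_i sum_z p(z|x_i,theta^(t)) log p(x_i,z|theta)."""
    return sum(responsibility(x, z) * math.log(joint(x, z, theta))
               for x in data for z in (0, 1))
```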
    
:''Maximization step (M step)'': Find the parameters that maximize this quantity:
::<math>\boldsymbol\theta^{(t+1)} = \underset{\boldsymbol\theta}{\operatorname{arg\,max}} \ Q(\boldsymbol\theta\mid\boldsymbol\theta^{(t)}) \, </math>
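For many models this argmax has a closed form. A sketch for a hypothetical two-component Bernoulli mixture, where the maximizer is a set of responsibility-weighted averages (function and variable names are illustrative):

```python
def m_step(data, resp):
    """Closed-form M step for a two-component Bernoulli mixture.
    resp[i] = (p(z=0 | x_i, theta^(t)), p(z=1 | x_i, theta^(t))).
    Returns theta^(t+1) = (mixing weight, p0, p1)."""
    n = len(data)
    r0 = sum(r[0] for r in resp)          # expected count of component 0
    r1 = sum(r[1] for r in resp)          # expected count of component 1
    weight = r0 / n                       # new mixing weight
    p0 = sum(r[0] * x for r, x in zip(resp, data)) / r0
    p1 = sum(r[1] * x for r, x in zip(resp, data)) / r1
    return weight, p0, p1
```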
==== Termination ====
Conclude the iterative process if <math>E_{\mathbf{Z}\mid\boldsymbol\theta^{(t)},\mathbf{x}}[\log L(\boldsymbol\theta^{(t)};\mathbf{x},\mathbf{Z})] \leq E_{\mathbf{Z}\mid\boldsymbol\theta^{(t-1)},\mathbf{x}}[\log L(\boldsymbol\theta^{(t-1)};\mathbf{x},\mathbf{Z})]+\varepsilon</math> for <math>\varepsilon</math> below some preset threshold.
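This stopping rule can be sketched as a generic driver loop (the `e_step`/`m_step` callables and all names are illustrative, not a fixed API):

```python
def em(data, theta, e_step, m_step, eps=1e-6, max_iter=1000):
    """Generic EM loop (a sketch; e_step and m_step are caller-supplied).
    e_step returns (expected complete-data log-likelihood, statistics);
    iteration stops once that expectation improves by less than eps."""
    prev = float("-inf")
    for _ in range(max_iter):
        expected_ll, stats = e_step(data, theta)   # E step
        if expected_ll <= prev + eps:              # termination criterion
            break
        prev = expected_ll
        theta = m_step(data, stats)                # M step
    return theta
```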
    
'''Generalization'''
The algorithm illustrated above can be generalized for mixtures of more than two multivariate normal distributions.
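One way such a generalization might look for <math>K</math> components, sketched with NumPy and SciPy (the function and parameter names are illustrative, and no numerical-stability safeguards such as log-space computation are included):

```python
import numpy as np
from scipy.stats import multivariate_normal

def em_gmm(X, means, covs, weights, n_iter=50):
    """EM sketch for a K-component multivariate normal mixture."""
    n, d = X.shape
    K = len(weights)
    for _ in range(n_iter):
        # E step: responsibilities r[i, k] = p(z_i = k | x_i, theta^(t))
        dens = np.column_stack([
            weights[k] * multivariate_normal.pdf(X, means[k], covs[k])
            for k in range(K)
        ])
        r = dens / dens.sum(axis=1, keepdims=True)
        # M step: weighted mixing proportions, means, and covariances
        nk = r.sum(axis=0)
        weights = nk / n
        means = (r.T @ X) / nk[:, None]
        covs = []
        for k in range(K):
            diff = X - means[k]
            covs.append((r[:, k, None] * diff).T @ diff / nk[k])
    return means, covs, weights
```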
    
'''Truncated and censored regression'''
The EM algorithm has been implemented in the case where an underlying linear regression model exists explaining the variation of some quantity, but where the values actually observed are censored or truncated versions of those represented in the model. Special cases of this model include censored or truncated observations from one normal distribution.
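For the simplest special case, right-censored observations from one normal distribution with known variance, the E step imputes each censored value with the truncated-normal mean via the inverse Mills ratio. A sketch (function, names, and the known-sigma assumption are illustrative):

```python
import math

def norm_pdf(t):
    return math.exp(-t * t / 2) / math.sqrt(2 * math.pi)

def norm_cdf(t):
    return 0.5 * (1 + math.erf(t / math.sqrt(2)))

def em_censored_mean(obs, censored, c, sigma, mu=0.0, n_iter=100):
    """EM sketch for the mean of N(mu, sigma^2) with known sigma,
    right-censored at c; censored[i] marks values recorded as c."""
    for _ in range(n_iter):
        # E step: E[Z | Z > c, mu] = mu + sigma * phi(a) / (1 - Phi(a)),
        # with a = (c - mu) / sigma (inverse Mills ratio).
        a = (c - mu) / sigma
        ez = mu + sigma * norm_pdf(a) / (1 - norm_cdf(a))
        filled = [ez if cen else x for x, cen in zip(obs, censored)]
        # M step: the complete-data MLE of mu is the sample mean.
        mu = sum(filled) / len(filled)
    return mu
```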
    
== Alternatives ==