Revision of 10 September 2024 (Tuesday): 697 bytes removed (no edit summary)
</math>
|{{EquationRef|1}}}}
</math>
|{{EquationRef|2}}}}
      
By averaging the row vectors of the matrix, we obtain the average transition vector [math]\bar{P}=\sum_{k=1}^{N}P_k/N[/math]. [math]D_{KL}[/math] denotes the KL divergence between two distributions. Therefore, EI is the average KL divergence between each row transition vector [math]P_i[/math] and the average transition vector [math]\bar{P}[/math].
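This discrete form of EI can be sketched directly in code. The following is a minimal illustration; the 4-state permutation matrix and the choice of log base 2 (EI in bits) are assumptions for the example, not taken from the text:

```python
import math

def effective_information(tpm):
    """EI of a transition probability matrix: the average KL divergence
    between each row transition vector P_i and the average row P_bar."""
    n = len(tpm)
    # Average transition vector: P_bar = sum_k P_k / N (average of the rows)
    p_bar = [sum(row[j] for row in tpm) / n for j in range(n)]
    ei = 0.0
    for row in tpm:
        # D_KL(P_i || P_bar) in bits, with the convention 0 * log(0/q) = 0
        ei += sum(p * math.log2(p / p_bar[j])
                  for j, p in enumerate(row) if p > 0) / n
    return ei

# A deterministic permutation matrix: every row differs maximally
# from the uniform average row, giving the maximal EI of log2(N)
tpm = [[0, 1, 0, 0],
       [0, 0, 1, 0],
       [0, 0, 0, 1],
       [1, 0, 0, 0]]
print(effective_information(tpm))  # → 2.0
```

A fully noisy matrix (all rows identical) gives EI = 0, since every row already equals the average row.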
</math>
|{{EquationRef|4}}}}
      
Here, [math]p(y|x)=\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(y-f(x))^2}{2\sigma^2}\right)[/math] is the conditional probability density of y given x. Since [math]\varepsilon[/math] follows a normal distribution with mean 0 and variance [math]\sigma^2[/math], [math]y=f(x)+\varepsilon[/math] follows a normal distribution with mean [math]f(x)[/math] and variance [math]\sigma^2[/math].
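This conditional density can be sanity-checked by Monte Carlo. The values f(x) = 2x, σ = 0.5, and x = 0.3 below are arbitrary choices for the sketch, not from the text:

```python
import math
import random

# Check that y = f(x) + eps with eps ~ N(0, sigma^2) has conditional density
# p(y|x) = 1/(sigma*sqrt(2*pi)) * exp(-(y - f(x))^2 / (2*sigma^2)).
# f(x) = 2x, sigma = 0.5, x = 0.3 are assumed example values.
random.seed(0)
f = lambda x: 2 * x
sigma = 0.5
x = 0.3

samples = [f(x) + random.gauss(0, sigma) for _ in range(200_000)]

# Empirical density of y in a small window around y0, versus the closed form
y0, h = f(x) + 0.2, 0.05
empirical = sum(1 for y in samples if abs(y - y0) < h) / (len(samples) * 2 * h)
theoretical = (math.exp(-(y0 - f(x)) ** 2 / (2 * sigma ** 2))
               / (sigma * math.sqrt(2 * math.pi)))
print(empirical, theoretical)  # the two values agree to within a few percent
```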
\end{aligned}
</math>
Here, e is the base of the natural logarithm, and the last equality follows from the Shannon entropy formula for a Gaussian distribution.

However, even using the condition that the integration interval is infinite, the second term is still difficult to evaluate. To proceed, we take a first-order Taylor expansion of the function [math]f(x_0)[/math]:
p(y)=\int_{-\frac{L}{2}}^{\frac{L}{2}}\frac{1}{L}\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(y-f(x_0))^2}{2\sigma^2}\right)dx_0\approx \int_{-\infty}^{\infty}\frac{1}{L}\frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(y-f(x)-f'(x)(x_0-x))^2}{2\sigma^2}\right)dx_0\approx \frac{1}{L}\cdot\frac{1}{f'(x)}
</math>
It is important to note that in this step we not only approximate [math]f(x_0)[/math] as a linear function, but also introduce the assumption that p(y) is independent of y while depending on [math]x[/math]. Since the second term of the EI calculation involves an integration over x, this approximation means that the approximate value of p(y) differs at different values of x.

Thus, the second term in {{EquationNote|4}} can be approximated as: