Changes

Effective Information (view source)

Revision as of 21:26, 7 June 2024 (Fri)

92 bytes added, 7 June 2024 (Friday)

→High-dimensional case

Line 702: Line 702:

===High-dimensional case===

−

~~The expression for the EI of a continuous mapping can be extended to higher dimensions. Suppose <math>\mathbf{x}\in[-L/2，L/2]^n\subset\mathcal{R}^n~~

+

The EI computation above for a one-dimensional variable can be generalized to the n-dimensional case, namely:

−

~~</math> and ~~<math>\mathbf{y}\in\mathcal{R}^m

+

<math>

+

\mathbf{y}=f(\mathbf{x})+\xi, \xi\sim \mathcal{N}(0,\Sigma)

+

</math>

−

</math>, where <math>n

+

where <math>\Sigma</math> is the covariance matrix of the Gaussian noise <math>\xi</math>.

+

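First, we intervene on <math>\mathbf{x}</math> so that it follows the uniform distribution on <math>[-L/2,L/2]^n\subset\mathcal{R}^n</math>, where <math>[-L/2,L/2]^n</math> denotes a hypercube in n-dimensional space. We assume <math>\mathbf{y}\in\mathcal{R}^m</math>, where <math>n</math> and <math>m</math> are positive integers. When only observational noise is present, EI generalizes to the following form: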

−

~~</math> and <math>m~~

+

<math>EI\approx \ln\left(\frac{L^n}{(2\pi e)^{m/2}}\right)+\frac{1}{2L^n}\int_{[-\frac{L}{2},\frac{L}{2}]^n}\ln\left|\det\left(\frac{\partial_\mathbf{x} f(\mathbf{x})}{\Sigma^{1/2}}\right)\right|^2 d\mathbf{x},

−

~~</math> are positive integers. When only observational noise is present, EI generalizes to the following form:~~

−

<math>EI\approx \ln\left(\frac{L^n}{(2\pi e)^{m/2}}\right)+\frac{1}{2}\~~mathbb{E}_~~{~~\mathbf{x}\sim U ([~~-\frac{L}{2},\frac{L}{2}~~]^n)~~}\ln\left|\det\left(\frac{\partial_\mathbf{x} f(\mathbf{x})}{\Sigma^{1/2}}\right)\right|^2,

</math>

−

~~where <math>\Sigma~~

+

where <math>|\cdot|</math> denotes the absolute value and <math>\det</math> the determinant.
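The approximation can be checked numerically. The following is a minimal sketch rather than a reference implementation. It assumes <math>n=m</math> (so that the Jacobian is square and its determinant is defined), a positive-definite <math>\Sigma</math>, and reads the integral term as an average over the uniform distribution on the hypercube. The function name `ei_gaussian_map` and its parameters are hypothetical and do not come from the article.

```python
import numpy as np


def ei_gaussian_map(jacobian, sigma, L, n, num_samples=2000, seed=0):
    """Monte Carlo sketch of
    EI ~ ln(L^n / (2*pi*e)^(m/2)) + 1/2 * E_x ln|det(Sigma^(-1/2) J(x))|^2,
    with x uniform on [-L/2, L/2]^n and J(x) the Jacobian of f at x.

    Assumptions (not from the article): n == m, sigma positive definite,
    and jacobian(x) returns an (n, n) NumPy array.
    """
    sigma = np.asarray(sigma, dtype=float)
    m = sigma.shape[0]
    if m != n:
        raise ValueError("this sketch requires n == m so that det(J) is defined")

    rng = np.random.default_rng(seed)
    # Intervene on x: sample it uniformly from the hypercube [-L/2, L/2]^n.
    xs = rng.uniform(-L / 2, L / 2, size=(num_samples, n))

    # ln|det J(x)| for every sample; slogdet accepts a stack of matrices.
    jacobians = np.stack([np.asarray(jacobian(x), dtype=float) for x in xs])
    _, log_abs_det_j = np.linalg.slogdet(jacobians)

    # ln det(Sigma); the sign is +1 for a positive-definite matrix.
    _, log_det_sigma = np.linalg.slogdet(sigma)

    # 1/2 * ln|det(Sigma^(-1/2) J)|^2 = ln|det J| - 1/2 * ln det(Sigma)
    expectation = log_abs_det_j.mean() - 0.5 * log_det_sigma

    return n * np.log(L) - 0.5 * m * np.log(2 * np.pi * np.e) + expectation


if __name__ == "__main__":
    # Linear map f(x) = A x: the Jacobian is A everywhere, so the estimate
    # should match the closed form ln(L^n/(2*pi*e)^(m/2)) + ln|det A| - 1/2 ln det(Sigma).
    A = np.array([[2.0, 0.0], [0.0, 3.0]])
    sigma = 0.01 * np.eye(2)
    print(ei_gaussian_map(lambda x: A, sigma, L=1.0, n=2))
```

For a linear map the Jacobian is constant, so the sample average is exact and the result can be compared directly with the closed form. For a nonlinear <math>f</math> the estimate depends on `num_samples`, and on how often the sampled Jacobian is close to singular.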

−

+

<!--

−

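~~</math> is the covariance matrix of the Gaussian noise <math>\varepsilon~~

−

~~</math>, <math>U([-L，L]^n)~~

−

~~</math> denotes the uniform distribution on the hypercube <math>[-L，L]^n~~

−

~~</math>, ~~<math>|\cdot|

−

</math> denotes the absolute value and <math>\det

−

</math> the determinant.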

−

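~~To extend information geometry to the case with both intervention noise and observational noise, we introduce a new ~~<~~math>l~~

−

~~</math>-dimensional intermediate variable <math>\theta\subset\mathcal{R}^l~~

−

~~</math>, so that intervening directly on <math>\mathbf{x}~~

−

~~</math> no longer controls <math>\mathbf{y}~~

−

~~</math>. Instead, we can intervene on <math>\mathbf{x}~~

−

~~</math> to influence <math>\theta~~

−

~~</math> and thereby indirectly influence <math>\mathbf{y}~~

−

~~</math>. The three variables therefore form a Markov chain: <math>\mathbf{x}\to\theta\to\mathbf{y}~~

−

</math>.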

+

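To extend information geometry to the case with both intervention noise and observational noise, we introduce a new <math>l</math>-dimensional intermediate variable <math>\theta\in\mathcal{R}^l</math>, so that <math>\mathbf{y}</math> can no longer be controlled by intervening directly on <math>\mathbf{x}</math>. Instead, we intervene on <math>\mathbf{x}</math> to influence <math>\theta</math>, which in turn influences <math>\mathbf{y}</math>. The three variables therefore form a Markov chain: <math>\mathbf{x}\to\theta\to\mathbf{y}</math>.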

−

In this case we obtain two manifolds: the effect manifold <math>\mathcal{M}_E=\{p(\mathbf{y}|\theta)\}_{\theta}

+

In this case we obtain two manifolds: the effect manifold <math>\mathcal{M}_E=\{p(\mathbf{y}|\theta)\}_{\theta}</math>, with metric <math>g_{\mu\nu}=-\mathbb{E}_{p(\mathbf{y}|\theta)}\partial_{\mu}\partial_{\nu}\ln p(\mathbf{y}|\theta)</math>; and the intervention manifold <math>\mathcal{M}_I=\{\tilde{q}(\mathbf{x}|\theta)\}_{\theta\in \Theta}</math>, with metric <math>h_{\mu\nu}=-\mathbb{E}_{\tilde{q}(\mathbf{x}|\theta)}\partial_{\mu}\partial_{\nu}\ln \tilde{q}(\mathbf{x}|\theta)</math>. Here <math>\tilde{q}\equiv \frac{q(\theta|\mathbf{x})}{\int q(\theta|\mathbf{x})d\mathbf{x}}</math> and <math>\partial_{\mu}=\partial/\partial \theta_{\mu}</math>. Together, the effect and intervention manifolds are called the causal geometry.

−

</math>$$, with metric <math>g_{\mu\nu}=-\mathbb{E}_{p(\mathbf{y}|\theta)}\partial_{\mu}\partial_{\nu}\ln p(\mathbf{y}|\theta)

−

</math>; and the intervention manifold <math>\mathcal{M}_I=\{\tilde{q}(\mathbf{x}|\theta)\}_{\theta\in \Theta}

−

</math>, with metric <math>h_{\mu\nu}=-\mathbb{E}_{\tilde{q}(\mathbf{x}|\theta)}\partial_{\mu}\partial_{\nu}\ln \tilde{q}(\mathbf{x}|\theta)

−

</math>. Here <math>\tilde{q}\equiv \frac{q(\theta|\mathbf{x})}{\int q(\theta|\mathbf{x})d\mathbf{x}}

−

</math> and <math>\partial_{\mu}=\partial/\partial \theta_{\mu}

−

</math>. Together, the effect and intervention manifolds are called the causal geometry.

The EI of the causal geometry is computed as:

−

<math>EI_g=\ln\frac{V_I}{(2\pi e)^{n/2}}-\frac{1}{2V_I}\int_\Theta\sqrt{|\det(h_{\mu\nu})|} \ln\left|\det\left( I_n+\frac{h_{\mu\nu}}{g_{\mu\nu}}\right)\right|d^l\theta,

+

<math>

+

EI_g=\ln\frac{V_I}{(2\pi e)^{n/2}}-\frac{1}{2V_I}\int_\Theta\sqrt{|\det(h_{\mu\nu})|} \ln\left|\det\left( I_n+\frac{h_{\mu\nu}}{g_{\mu\nu}}\right)\right|d^l\theta,

</math>

+

-->

==Dimension-averaged EI and causal emergence==

Jake

786

edits
