Changes

No edit summary
Line 724: Line 724:
<math>EI\approx \ln\left(\frac{L^n}{(2\pi e)^{m/2}}\right)+\frac{1}{L^n}\int_{[-\frac{L}{2},\frac{L}{2}]^n}\ln\left|\det\left(\frac{\partial_\mathbf{x} f(\mathbf{x})}{\Sigma^{1/2}}\right)\right| d\mathbf{x},
 
<math>EI\approx \ln\left(\frac{L^n}{(2\pi e)^{m/2}}\right)+\frac{1}{L^n}\int_{[-\frac{L}{2},\frac{L}{2}]^n}\ln\left|\det\left(\frac{\partial_\mathbf{x} f(\mathbf{x})}{\Sigma^{1/2}}\right)\right| d\mathbf{x},
 
</math>
 
</math>
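|{{EquationRef|6}}}}Where <math>|\cdot|</math> denotes the absolute value, and <math>\det</math> refers to the determinant.<!--To generalize information geometry to the case with both intervention noise and observation noise, a new intermediate variable <math>\theta\subset\mathcal{R}^l</math> of dimension <math>l</math> must be introduced, so that <math>\mathbf{y}</math> cannot be controlled by intervening on <math>\mathbf{x}</math> directly. Instead, we can intervene on <math>\mathbf{x}</math> to affect <math>\theta</math> and thereby indirectly affect <math>\mathbf{y}</math>. The three variables therefore form a Markov chain: <math>\mathbf{x}\to\theta\to\mathbf{y}</math>.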
+
|{{EquationRef|6}}}}Where <math>|\cdot|</math> denotes the absolute value, and <math>\det</math> refers to the determinant.
 +
<!--
 +
 
 +
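To generalize information geometry to the case with both intervention noise and observation noise, a new intermediate variable <math>\theta\subset\mathcal{R}^l</math> of dimension <math>l</math> must be introduced, so that <math>\mathbf{y}</math> cannot be controlled by intervening on <math>\mathbf{x}</math> directly. Instead, we can intervene on <math>\mathbf{x}</math> to affect <math>\theta</math> and thereby indirectly affect <math>\mathbf{y}</math>. The three variables therefore form a Markov chain: <math>\mathbf{x}\to\theta\to\mathbf{y}</math>.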
    
In this case, two manifolds can be obtained: the effect manifold <math>\mathcal{M}_E=\{p(\mathbf{y}|\theta)\}_{\theta}</math>, with metric <math>g_{\mu\nu}=-\mathbb{E}_{p(\mathbf{y}|\theta)}\partial_{\mu}\partial_{\nu}\ln p(\mathbf{y}|\theta)</math>; and the intervention manifold <math>\mathcal{M}_I=\{\tilde{q}(\mathbf{x}|\theta)\}_{\theta\in \Theta}</math>, with metric <math>h_{\mu\nu}=-\mathbb{E}_{\tilde{q}(\mathbf{x}|\theta)}\partial_{\mu}\partial_{\nu}\ln \tilde{q}(\mathbf{x}|\theta)</math>. Here <math>\tilde{q}\equiv \frac{q(\theta|\mathbf{x})}{\int q(\theta|\mathbf{x})d\mathbf{x}}</math> and <math>\partial_{\mu}=\partial/\partial \theta_{\mu}</math>. Together, the effect and intervention manifolds are called the causal geometry.
 
In this case, two manifolds can be obtained: the effect manifold <math>\mathcal{M}_E=\{p(\mathbf{y}|\theta)\}_{\theta}</math>, with metric <math>g_{\mu\nu}=-\mathbb{E}_{p(\mathbf{y}|\theta)}\partial_{\mu}\partial_{\nu}\ln p(\mathbf{y}|\theta)</math>; and the intervention manifold <math>\mathcal{M}_I=\{\tilde{q}(\mathbf{x}|\theta)\}_{\theta\in \Theta}</math>, with metric <math>h_{\mu\nu}=-\mathbb{E}_{\tilde{q}(\mathbf{x}|\theta)}\partial_{\mu}\partial_{\nu}\ln \tilde{q}(\mathbf{x}|\theta)</math>. Here <math>\tilde{q}\equiv \frac{q(\theta|\mathbf{x})}{\int q(\theta|\mathbf{x})d\mathbf{x}}</math> and <math>\partial_{\mu}=\partial/\partial \theta_{\mu}</math>. Together, the effect and intervention manifolds are called the causal geometry.
Line 732: Line 735:
<math>
 
<math>
 
EI_g=\ln\frac{V_I}{(2\pi e)^{n/2}}-\frac{1}{2V_I}\int_\Theta\sqrt{|\det(h_{\mu\nu})|} \ln\left|\det\left( I_n+\frac{h_{\mu\nu}}{g_{\mu\nu}}\right)\right|d^l\theta,
 
EI_g=\ln\frac{V_I}{(2\pi e)^{n/2}}-\frac{1}{2V_I}\int_\Theta\sqrt{|\det(h_{\mu\nu})|} \ln\left|\det\left( I_n+\frac{h_{\mu\nu}}{g_{\mu\nu}}\right)\right|d^l\theta,
</math>-->
+
</math>
 +
-->
 +
 
 
==Dimension-Averaged EI==
 
==Dimension-Averaged EI==
 +
 
In discrete-state systems, when comparing systems of different scales, we can compute either the direct EI difference or the normalized EI difference. Normalized EI is EI divided by [math]\log N[/math], where [math]N=\#(\mathcal{X})[/math] represents the number of elements in the discrete state space [math]\mathcal{X}[/math].
 
In discrete-state systems, when comparing systems of different scales, we can compute either the direct EI difference or the normalized EI difference. Normalized EI is EI divided by [math]\log N[/math], where [math]N=\#(\mathcal{X})[/math] represents the number of elements in the discrete state space [math]\mathcal{X}[/math].
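The discrete case described above can be illustrated with a minimal sketch. It assumes the standard definition of EI for a row-stochastic transition probability matrix under a uniform intervention on the input states, measured in bits; the function names <code>effective_information</code> and <code>normalized_ei</code> are hypothetical and chosen only for this example.

```python
import numpy as np

def effective_information(tpm):
    """EI in bits of a row-stochastic matrix under a uniform intervention on inputs."""
    tpm = np.asarray(tpm, dtype=float)
    n_states = tpm.shape[0]
    # Average effect distribution when every input state is equally likely.
    effect = tpm.mean(axis=0)
    # Row-wise KL divergence from the average effect, with 0 * log(0) taken as 0.
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = np.where(tpm > 0, tpm * np.log2(tpm / effect), 0.0)
    return terms.sum() / n_states

def normalized_ei(tpm):
    """Eff = EI / log2(N), where N is the number of discrete states."""
    n_states = np.asarray(tpm).shape[0]
    return effective_information(tpm) / np.log2(n_states)

# A deterministic, non-degenerate 4-state system reaches the maximum:
# EI = log2(4) = 2 bits, so Eff = 1.
identity = np.eye(4)
print(effective_information(identity), normalized_ei(identity))
```

Because Eff is bounded by 1 regardless of [math]N[/math], it allows systems with different numbers of states to be compared on the same footing.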
   −
However, applying the original EI to continuous variables can give unreasonable results. First, as shown in equation (6), the EI formula contains the term [math]\ln L^n[/math]; since [math]L[/math] is a large positive number, the value of EI is strongly affected by [math]L[/math]. Second, normalized EI (Eff) cannot be computed as in the discrete case, because a continuous state space has infinitely many elements. One potential solution is to treat the volume of the space, [math]L^n[/math], as the number [math]N[/math], and thus normalize by [math]n \ln L[/math], which is proportional to both [math]n[/math] and [math]\ln L[/math]:
+
However, applying the original EI to continuous variables can give unreasonable results. First, as shown in equation {{EquationNote|6}}, the EI formula contains the term [math]\ln L^n[/math]; since [math]L[/math] is a large positive number, the value of EI is strongly affected by [math]L[/math]. Second, normalized EI (Eff) cannot be computed as in the discrete case, because a continuous state space has infinitely many elements. One potential solution is to treat the volume of the space, [math]L^n[/math], as the number [math]N[/math], and thus normalize by [math]n \ln L[/math], which is proportional to both [math]n[/math] and [math]\ln L[/math]:
    
<math>
 
<math>
Line 742: Line 748:
</math>
 
</math>
   −
However, this still contains the L term, affecting Eff significantly. Moreover, when comparing microscopic (n-dimensional) and macroscopic (m-dimensional, where m < n) Eff, the L term cannot be eliminated. This suggests that the normalization issue in continuous variable systems cannot simply be transferred from the discrete case.
+
However, this still contains the L term, affecting Eff significantly. Moreover, when comparing microscopic (n-dimensional) and macroscopic (m-dimensional, where m < n) Eff, the L term cannot be eliminated.  
 +
 
 +
This suggests that the normalization issue in continuous variable systems cannot simply be transferred from the discrete case.
   −
When the Neural Information Squeezer (NIS) framework was proposed, the authors invented another way to normalize EI for continuous variables using state space dimensions. This approach solves the EI comparison problem for continuous state variables, introducing the Dimension-Averaged Effective Information (dEI):
+
When the [[Neural Information Squeezer]] (NIS) framework was proposed <ref name=zhang_nis />, the authors invented another way to normalize EI for continuous variables using state space dimensions. This approach solves the EI comparison problem for continuous state variables, introducing the '''Dimension Averaged Effective Information''' (dEI):
    
<math>
 
<math>
Line 750: Line 758:
</math>
 
</math>
   −
Where n is the dimension of the state space. It can be proven that in the discrete state space, dimension-averaged EI and Eff are actually equivalent. The EI for continuous variables will be further discussed below.
+
Where [math]n[/math] is the dimension of the state space. It can be proven that in the discrete state space, '''Dimension-averaged EI''' and '''Eff''' are actually equivalent. The EI for continuous variables will be further discussed below.
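As a rough illustration of why averaging over dimensions helps, the sketch below evaluates equation 6 for a special case that is an assumption of this example and is not taken from the article: a linear map [math]f(\mathbf{x})=A\mathbf{x}[/math] with [math]m=n[/math] and constant noise covariance [math]\Sigma[/math], so the Jacobian is [math]A[/math] everywhere and the integral reduces to [math]\ln|\det A|-\tfrac{1}{2}\ln\det\Sigma[/math]. It also assumes that dimension averaging means dividing EI by [math]n[/math], as the name suggests. The function names are hypothetical.

```python
import numpy as np

def ei_linear_gaussian(A, Sigma, L):
    """Equation 6 for y = A x + Gaussian noise (assumed special case, m = n)."""
    A = np.asarray(A, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    n = A.shape[0]
    # slogdet returns (sign, log|det|); only the log-magnitude is needed.
    _, logabsdet_A = np.linalg.slogdet(A)
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    # ln(L^n / (2*pi*e)^(n/2)) + ln|det(A Sigma^(-1/2))|
    return (n * np.log(L) - 0.5 * n * np.log(2 * np.pi * np.e)
            + logabsdet_A - 0.5 * logdet_Sigma)

def dimension_averaged(ei, n):
    """Assumed form of dEI: EI divided by the state-space dimension n."""
    return ei / n

def dei_gap(L):
    """dEI difference between a 2-dimensional and a 4-dimensional toy system."""
    micro = dimension_averaged(ei_linear_gaussian(0.5 * np.eye(4), np.eye(4), L), 4)
    macro = dimension_averaged(ei_linear_gaussian(np.eye(2), 0.1 * np.eye(2), L), 2)
    return macro - micro

# Under these assumptions the ln L term cancels in the difference,
# so both calls print the same value.
print(dei_gap(10.0), dei_gap(1000.0))
```

In this toy setting each dEI value still contains [math]\ln L[/math], but the difference between two systems of different dimension does not, which is the property that normalizing by [math]n \ln L[/math] fails to provide.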
   −
For n-dimensional iterative dynamical systems, [math]\mathbf{y}[/math] and [math]\mathbf{x}[/math] are variables of the same dimension, meaning [math]m=n[/math]. Substituting equation (6) into the dimension-averaged EI gives:
+
For n-dimensional iterative dynamical systems, [math]\mathbf{y}[/math] and [math]\mathbf{x}[/math] are variables of the same dimension, meaning [math]m=n[/math]. Substituting equation {{EquationNote|6}} into the dimension-averaged EI gives:
    
<math>
 
<math>