In summary, determinism refers to how much confidence we have in predicting future states based on the current state's probability distribution, while degeneracy refers to how much certainty we have in inferring previous states from the current state. Systems with high determinism and low degeneracy exhibit strong causal dynamics.

==Properties of the EI Function==
From Equation {{EquationNote|2}} we can see that, on a transition probability matrix P, EI is a function of every element of the matrix, i.e., of each conditional probability of transitioning from one state to another. A natural question then arises: what mathematical properties does this function have? Does it have extreme points, and if so, where are they? Is it convex? What are its maximum and minimum values?

===Domain===
For a Markov chain with discrete states and discrete time, the domain of EI is clearly the transition probability matrix P. P is a matrix composed of [math]N\times N[/math] elements, where each element [math]p_{ij}\in[0,1][/math] represents a probability value. In addition, each row must satisfy the normalization condition, that is, for every [math]i\in[1,N][/math]:{{NumBlk|:|
<math>
||P_i||_1=\sum_{j=1}^N p_{ij}=1
</math>
|{{EquationRef|3}}}}Thus the domain of EI, that is, the space of possible matrices P, is not the entire [math]N\times N[/math]-dimensional real space [math]\mathcal{R}^{N^2}[/math]. Because of the normalization condition {{EquationNote|3}}, the domain is a subspace of [math]\mathcal{R}^{N^2}[/math]. How can we express this subspace?

First, for any row vector [math]P_i[/math], its range is a regular simplex in N-dimensional real space. For example, when [math]N=2[/math], this space is a line segment: [math]p_{i,1}+p_{i,2}=1, \forall i\in\{1,2\}[/math]. When [math]N=3[/math], it is a plane in three-dimensional space: [math]p_{i,1}+p_{i,2}+p_{i,3}=1, \forall i\in\{1,2,3\}[/math]. These two spaces are illustrated below:

[[文件:P1+p2=1.png|301x301像素|替代=|链接=https://wiki.swarma.org/index.php/%E6%96%87%E4%BB%B6:P1+p2=1.png]][[文件:P1+p2+p3=1.png|380x380像素|替代=|链接=https://wiki.swarma.org/index.php/%E6%96%87%E4%BB%B6:P1+p2+p3=1.png]]

In the general case, we define the range of the row vector [math]P_i[/math] in N-dimensional space as [math]\Delta=\{p_{j}|\sum_{j=1}^Np_{j}=1,p_{j}\in[0,1]\}[/math]. The Cartesian product of N such spaces is then the domain of EI:

<math>
Dom(EI)=\Delta\times \Delta\cdots\times\Delta=\Delta^N
</math>
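
To make the domain and the function itself concrete, the following sketch samples a point of [math]\Delta^N[/math] by drawing each row of P from a Dirichlet distribution and evaluates EI numerically. This is a minimal illustration rather than part of the original derivation: it assumes the averaged reading of Equation {{EquationNote|2}}, [math]EI=\frac{1}{N}\sum_{i=1}^N\sum_{j=1}^N p_{ij}\log_2\frac{p_{ij}}{\bar{p}_{\cdot j}}[/math], and the function name <code>effective_information</code> is our own.

<syntaxhighlight lang="python">
import numpy as np

def effective_information(P, base=2.0):
    """EI of a row-stochastic matrix under a uniform intervention.

    Assumes the averaged reading of Equation 2:
        EI = (1/N) * sum_i KL(P_i || average row), in bits when base=2.
    """
    P = np.asarray(P, dtype=float)
    N = P.shape[0]
    p_bar = P.mean(axis=0)                   # the average row (column means)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = P * (np.log(P / p_bar) / np.log(base))
    return np.nansum(terms) / N              # 0 * log(0/x) contributes nothing

# A point of Dom(EI) = Delta^N: each of the N rows is drawn from the simplex.
rng = np.random.default_rng(0)
N = 3
P = rng.dirichlet(np.ones(N), size=N)
assert np.allclose(P.sum(axis=1), 1.0)       # normalization condition (Equation 3)
print(effective_information(P))

# A permutation matrix (fully deterministic, non-degenerate) attains log2(N).
print(effective_information(np.eye(N)))      # -> 1.584... = log2(3)
</syntaxhighlight>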
===First Derivative and Extreme Points===
Taking the first derivative of Equation {{EquationNote|2}} with respect to [math]p_{ij}[/math], and taking the normalization condition [math]\sum_{j=1}^Np_{ij}=1[/math] into account, we obtain:{{NumBlk|:|
<math>
\frac{\partial EI}{\partial p_{ij}}=\log\left(\frac{p_{ij}}{p_{iN}}\right)-\log\left(\frac{\bar{p}_{\cdot j}}{\bar{p}_{\cdot N}}\right),
</math>
|{{EquationRef|4}}}}

Here, [math]p_{ij}[/math] denotes the conditional probability in the i-th row and j-th column of P. Because each row of P is subject to the normalization constraint {{EquationNote|3}}, the EI function has [math]N(N-1)[/math] free variables, so we may take [math]1\leq i\leq N, 1\leq j\leq N-1[/math]. Here [math]p_{iN}[/math] denotes the conditional probability in the i-th row and N-th column, while [math]\bar{p}_{\cdot j}\equiv \frac{\sum_{k=1}^Np_{kj}}{N}[/math] and [math]\bar{p}_{\cdot N}\equiv \frac{\sum_{k=1}^Np_{kN}}{N}[/math] denote the mean conditional probabilities of the j-th and N-th columns, respectively.

It is easy to see that this derivative is defined only when, for the chosen [math]i,j\in[1,N][/math], the quantities [math]p_{ij},p_{iN},\bar{p}_{\cdot j},\bar{p}_{\cdot N}[/math] are all greater than 0; only then is EI differentiable in a neighborhood of [math]p_{ij}[/math]. Otherwise, if any of these terms is 0, the derivative does not exist.
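
As a quick sanity check on Equation {{EquationNote|4}}, one can compare the analytic derivative against a finite-difference estimate, perturbing [math]p_{ij}[/math] and [math]p_{iN}[/math] in opposite directions so that the row stays on the simplex. The sketch below assumes the same averaged, base-2 form of EI as in the earlier example; under that form the analytic gradient carries an extra factor of [math]1/N[/math].

<syntaxhighlight lang="python">
import numpy as np

def ei(P):
    """Averaged EI in bits (assumed form; see the earlier sketch)."""
    N = P.shape[0]
    p_bar = P.mean(axis=0)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = P * np.log2(P / p_bar)
    return np.nansum(terms) / N

rng = np.random.default_rng(1)
N = 4
i, j = 1, 2                             # 0-based indices; j may range over 0..N-2
P = rng.dirichlet(np.ones(N), size=N)   # a random interior point of Delta^N

# Constrained perturbation: raise p_ij by eps and lower p_iN by eps,
# so the i-th row keeps summing to 1.
eps = 1e-6
Q = P.copy()
Q[i, j] += eps
Q[i, N - 1] -= eps
numeric = (ei(Q) - ei(P)) / eps

# Analytic derivative from Equation 4, with the 1/N factor introduced
# by the averaged form of EI assumed here.
p_bar = P.mean(axis=0)
analytic = (np.log2(P[i, j] / P[i, N - 1])
            - np.log2(p_bar[j] / p_bar[N - 1])) / N

print(numeric, analytic)                # the two should agree to roughly 1e-5
</syntaxhighlight>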
| + | |
| + | Setting Equation 3 to 0 yields the extreme points. For any 1≤i≤N,1≤j≤N−1, the following equation holds: |
| | | |
<math>
\frac{p_{ij}}{p_{iN}}=\frac{\bar{p}_{\cdot j}}{\bar{p}_{\cdot N}}
</math>

It is straightforward to verify that in this case [math]EI=0[/math], i.e., EI reaches an extreme point, and from the second derivative of EI it is easy to see that this is a minimum. Viewed another way, this formula implies that EI has many minimum points: as long as all the row vectors of the transition probability matrix are identical, EI equals 0, no matter what distribution that common row vector follows.
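
This family of minima is easy to check numerically. The sketch below (again assuming the averaged, base-2 form of EI used above) stacks N copies of an arbitrary row vector; the resulting matrix always yields EI = 0.

<syntaxhighlight lang="python">
import numpy as np

rng = np.random.default_rng(2)
N = 5
r = rng.dirichlet(np.ones(N))       # an arbitrary distribution over N states
P = np.tile(r, (N, 1))              # N identical rows: a candidate minimum

p_bar = P.mean(axis=0)              # the average row is r itself
with np.errstate(divide="ignore", invalid="ignore"):
    terms = P * np.log2(P / p_bar)
print(np.nansum(terms) / N)         # -> 0.0, whatever r is
</syntaxhighlight>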
===Second Derivative and Convexity===
Going further, to determine the convexity of the EI function, we can compute its second derivative <math>\frac{\partial^2 EI}{\partial p_{ij}\partial p_{st}}</math>, where <math>1\leq s \leq N, 1\leq t \leq N-1</math>. First, we introduce the function symbol <math>\delta_{i,j}</math>, and then proceed to derive the second derivative. When <math>i=s</math>:

<math>