For these three Markov chains, the state space is X={1,2,3,4}, so the size of their TPM is 4×4.
==EI of Markov Chains==
In a Markov chain, the state variable at any time, <math>X_t</math>, can be regarded as the cause, and the state variable at the next time, <math>X_{t+1}</math>, as the effect; the state transition matrix of the Markov chain is then its causal mechanism. We can therefore apply the definition of Effective Information (EI) to Markov chains.

<math>
\begin{aligned}
EI &= I(\tilde{X}_t;\tilde{X}_{t+1}) \\
&= \frac{1}{N}\sum^N_{i=1}\sum^N_{j=1} p_{ij}\log_2\frac{N\,p_{ij}}{\sum_{k=1}^N p_{kj}}
\end{aligned}
</math>

Here, <math>\tilde{X}_t,\tilde{X}_{t+1}</math> denote the states at the two successive times after intervening to make <math>X_t</math> follow a uniform distribution, and <math>p_{ij}</math> is the probability of transitioning from state i to state j. From this expression it is clear that EI is a function of the transition probability matrix <math>P</math> alone.
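As a concrete sketch of this definition, the helper below (plain Python; the function name `effective_information` is our own, not from the text) computes EI for an arbitrary row-stochastic transition matrix:

```python
import math

def effective_information(P):
    """EI (in bits) of an N x N row-stochastic transition matrix P.

    Under the intervention do(X_t ~ U), every row of P is equally likely,
    so the marginal of X_{t+1} is the column average of P.
    """
    N = len(P)
    p_bar = [sum(P[i][j] for i in range(N)) / N for j in range(N)]
    ei = 0.0
    for i in range(N):
        for j in range(N):
            if P[i][j] > 0:  # 0 * log 0 = 0 convention
                ei += (P[i][j] / N) * math.log2(P[i][j] / p_bar[j])
    return ei

# A deterministic cycle over 4 states attains the maximum log2(4) = 2 bits.
P = [[0, 1, 0, 0],
     [0, 0, 1, 0],
     [0, 0, 0, 1],
     [1, 0, 0, 0]]
print(effective_information(P))  # 2.0
```

For an N-state chain, EI is bounded above by <math>\log_2 N</math>, attained by deterministic permutation matrices such as the one in the example.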
==Vector Form of EI in Markov Chains==
We can also write the transition probability matrix <math>P</math> as a concatenation of <math>N</math> row vectors:

<math>
P = \begin{pmatrix} P_1 \\ P_2 \\ \vdots \\ P_N \end{pmatrix}
</math>

Here, <math>P_i</math> is the i-th row vector of matrix <math>P</math>, satisfying the normalization condition for conditional probabilities, <math>||P_i||_1=1</math>, where <math>||\cdot||_1</math> denotes the 1-norm of a vector. EI can then be written as:{{NumBlk|:|
<math>
\begin{aligned}
EI &= \frac{1}{N}\sum_{i=1}^N D_{KL}\left(P_i\,\big|\big|\,\overline{P}\right)
\end{aligned}
</math>
|{{EquationRef|2}}}}

By averaging the columns of the matrix, we obtain the average transition vector <math>\overline{P}=\sum_{k=1}^N P_k/N</math>; <math>D_{KL}</math> denotes the KL divergence between two distributions. EI is therefore the mean KL divergence between each row transition vector <math>P_i</math> and the average transition vector <math>\overline{P}</math>.
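The equivalence between the double-sum definition of EI and this average-KL form can be checked numerically. A minimal sketch (the helper names `ei_direct` and `ei_kl_form` are ours), assuming a row-stochastic input matrix:

```python
import math

def ei_direct(P):
    """EI in bits via the double sum over p_ij (uniform intervention on X_t)."""
    N = len(P)
    p_bar = [sum(P[i][j] for i in range(N)) / N for j in range(N)]  # column means
    return sum((P[i][j] / N) * math.log2(P[i][j] / p_bar[j])
               for i in range(N) for j in range(N) if P[i][j] > 0)

def ei_kl_form(P):
    """EI as the mean KL divergence D_KL(P_i || P_bar) over the N rows."""
    N = len(P)
    p_bar = [sum(P[i][j] for i in range(N)) / N for j in range(N)]
    kl = lambda row: sum(p * math.log2(p / q)
                         for p, q in zip(row, p_bar) if p > 0)
    return sum(kl(P[i]) for i in range(N)) / N

# The two forms agree on any row-stochastic matrix, e.g.:
P = [[0.7, 0.1, 0.1, 0.1],
     [0.1, 0.7, 0.1, 0.1],
     [0.1, 0.1, 0.7, 0.1],
     [0.1, 0.1, 0.1, 0.7]]
assert abs(ei_direct(P) - ei_kl_form(P)) < 1e-12
```

The agreement follows algebraically: expanding each KL term gives exactly the summand of the double-sum formula, since <math>\overline{P}</math> collects the column averages <math>\sum_k p_{kj}/N</math>.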

For the three state transition matrices listed above, the EI values are 2 bits, 1 bit, and 0 bits, respectively. This shows that the more 0s and 1s appear in the transition probability matrix (i.e., the more of its row vectors are one-hot vectors, with a 1 in one position and 0s elsewhere), the higher the EI. In other words, the more deterministic the jump from one time step to the next, the higher the EI tends to be. This observation is not entirely precise, however; more exact conclusions are given in the following sections.
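These three values can be reproduced numerically. The matrices below are illustrative stand-ins with the stated EI values (not necessarily the exact matrices shown earlier); the helper `ei` (our own name) implements the average-KL form:

```python
import math

def ei(P):
    """EI in bits: mean KL divergence between each row and the average row."""
    N = len(P)
    p_bar = [sum(row[j] for row in P) / N for j in range(N)]
    kl = lambda row: sum(p * math.log2(p / q)
                         for p, q in zip(row, p_bar) if p > 0)
    return sum(kl(row) for row in P) / N

# Deterministic permutation: every row is one-hot -> EI = log2(4) = 2 bits.
P2 = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
# Each state jumps into one of two halves -> 1 bit.
P1 = [[.5, .5, 0, 0], [.5, .5, 0, 0], [0, 0, .5, .5], [0, 0, .5, .5]]
# Fully uniform rows: the cause says nothing about the effect -> 0 bits.
P0 = [[.25] * 4 for _ in range(4)]

print(ei(P2), ei(P1), ei(P0))  # 2.0 1.0 0.0
```

Note that `P1` is not fully deterministic yet still has nonzero EI: its rows differ from the average row, just less sharply than one-hot rows do.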
==Normalization==
Clearly, the magnitude of EI depends on the size of the state space. This property is inconvenient when comparing Markov chains at different scales: we need a measure of causal effect that is as free of such scale effects as possible. We therefore normalize EI to obtain a quantitative index that is independent of system size.