Line 123: |
Line 123: |
| | | |
| === Several causal emergence theories === | | === Several causal emergence theories === |
− | How to define causal emergence is a key issue. There are several representative works, namely the method based on effective information proposed by Hoel et al. <ref name=":0" /><ref name=":1" />, the method based on information decomposition proposed by Rosas et al. <ref name=":5">Rosas F E, Mediano P A, Jensen H J, et al. Reconciling emergences: An information-theoretic approach to identify causal emergence in multivariate data[J]. PLoS computational biology, 2020, 16(12): e1008289.</ref>, a new causal emergence theory based on singular value decomposition proposed by Zhang Jiang et al. <ref name=":2" />, and some other theories. | + | How to define causal emergence is a key issue. There are several representative works, namely the method based on [[effective information]] proposed by Hoel et al. <ref name=":0" /><ref name=":1" />, the method based on [[information decomposition]] proposed by Rosas et al. <ref name=":5">Rosas F E, Mediano P A, Jensen H J, et al. Reconciling emergences: An information-theoretic approach to identify causal emergence in multivariate data[J]. PLoS computational biology, 2020, 16(12): e1008289.</ref>, a new causal emergence theory based on [[singular value decomposition]] proposed by Zhang Jiang et al. <ref name=":2" />, and some other theories. |
| | | |
| | | |
| ==== Erik Hoel's causal emergence theory ==== | | ==== Erik Hoel's causal emergence theory ==== |
− | In 2013, Hoel et al. <ref name=":0" /><ref name=":1" /> proposed the causal emergence theory. The following figure is an abstract framework for this theory. The horizontal axis represents time and the vertical axis represents scale. This framework can be regarded as a description of the same dynamical system on both microscopic and macroscopic scales. Among them, [math]f_m[/math] represents microscopic dynamics, [math]f_M[/math] represents macroscopic dynamics, and the two are connected by a coarse-graining function [math]\phi[/math]. In a discrete-state Markov dynamical system, both [math]f_m[/math] and [math]f_M[/math] are Markov chains. By performing coarse-graining of the Markov chain on [math]f_m[/math], [math]f_M[/math] can be obtained. <math> EI </math> is a measure of effective information. Since the microscopic state may have greater randomness, which leads to relatively weak causality of microscopic dynamics, by performing reasonable coarse-graining on the microscopic state at each moment, it is possible to obtain a macroscopic state with stronger causality. The so-called causal emergence refers to the phenomenon that when we perform coarse-graining on the microscopic state, the effective information of macroscopic dynamics will increase, and the difference in effective information between the macroscopic state and the microscopic state is defined as the intensity of causal emergence. | + | In 2013, Hoel et al. <ref name=":0" /><ref name=":1" /> proposed the causal emergence theory. The following figure is an abstract framework for this theory. The horizontal axis represents time and the vertical axis represents scale. This framework can be regarded as a description of the same dynamical system on both microscopic and macroscopic scales. Among them, [math]f_m[/math] represents microscopic dynamics, [math]f_M[/math] represents macroscopic dynamics, and the two are connected by a coarse-graining function [math]\phi[/math]. 
In a discrete-state Markov dynamical system, both [math]f_m[/math] and [math]f_M[/math] are Markov chains. By performing [[coarse-graining of the Markov chain]] on [math]f_m[/math], [math]f_M[/math] can be obtained. <math> EI </math> is a measure of [[effective information]]. Since the microscopic state may have greater randomness, which leads to relatively weak [[causality]] of microscopic dynamics, by performing reasonable coarse-graining on the microscopic state at each moment, it is possible to obtain a macroscopic state with stronger causality. The so-called causal emergence refers to the phenomenon that when we perform coarse-graining on the microscopic state, the [[effective information]] of macroscopic dynamics will increase, and the difference in [[effective information]] between the macroscopic state and the microscopic state is defined as the intensity of causal emergence. |
| | | |
| | | |
Line 134: |
Line 134: |
| | | |
| ===== Effective Information ===== | | ===== Effective Information ===== |
− | Effective Information (<math> EI </math>) was first proposed by Tononi et al. in the study of integrated information theory <ref>Tononi G, Sporns O. Measuring information integration[J]. BMC neuroscience, 2003, 41-20.</ref>. In causal emergence research, Erik Hoel and others use this causal effect measure index to quantify the strength of causality of a causal mechanism. | + | Effective Information (<math> EI </math>) was first proposed by [[Tononi]] et al. in the study of [[integrated information theory]] <ref>Tononi G, Sporns O. Measuring information integration[J]. BMC neuroscience, 2003, 41-20.</ref>. In causal emergence research, [[Erik Hoel]] and others use this [[causal effect measure]] index to quantify the strength of causality of a [[causal mechanism]]. |
| | | |
| | | |
− | Specifically, the calculation of <math> EI </math> is as follows: use an intervention operation to intervene on the independent variable and examine the mutual information between the cause and effect variables under this intervention. This mutual information is effective information, that is, the causal effect measure of the causal mechanism. | + | Specifically, the calculation of <math> EI </math> is as follows: use an [[intervention]] operation to intervene on the independent variable and examine the [[mutual information]] between the cause and effect variables under this intervention. This mutual information is [[Effective Information]], that is, the causal effect measure of the causal mechanism. |
| | | |
| | | |
− | In a Markov chain, the state variable [math]X_t[/math] at any time can be regarded as the cause, and the state variable [math]X_{t + 1}[/math] at the next time can be regarded as the result. Thus, the state transition matrix of the Markov chain is its causal mechanism. Therefore, the calculation formula for <math>EI</math> for a Markov chain is as follows: | + | In a [[Markov chain]], the state variable [math]X_t[/math] at any time can be regarded as the cause, and the state variable [math]X_{t + 1}[/math] at the next time can be regarded as the result. Thus, the [[state transition matrix]] of the [[Markov chain]] is its [[causal mechanism]]. Therefore, the calculation formula for <math>EI</math> for a [[Markov chain]] is as follows: |
| | | |
| | | |
Line 151: |
Line 151: |
| | | |
| | | |
− | Here <math>f</math> represents the state transition matrix of a Markov chain, [math]U(\mathcal{X})[/math] represents the uniform distribution on the value space [math]\mathcal{X}[/math] of the state variable [math]X_t[/math]. <math>\tilde{X}_t,\tilde{X}_{t+1}</math> are the states at two consecutive moments after intervening [math]X_t[/math] at time <math>t</math> into a uniform distribution. <math>p_{ij}</math> is the transition probability from the <math>i</math>-th state to the <math>j</math>-th state. From this formula, it is not difficult to see that <math> EI </math> is only a function of the probability transition matrix [math]f[/math]. The intervention operation is performed to make the effective information objectively measure the causal characteristics of the dynamics without being affected by the distribution of the original input data. | + | Here <math>f</math> represents the state transition matrix of a Markov chain, [math]U(\mathcal{X})[/math] represents the uniform distribution on the value space [math]\mathcal{X}[/math] of the state variable [math]X_t[/math]. <math>\tilde{X}_t,\tilde{X}_{t+1}</math> are the states at two consecutive moments after [[intervening]] [math]X_t[/math] at time <math>t</math> into a [[uniform distribution]]. <math>p_{ij}</math> is the transition probability from the <math>i</math>-th state to the <math>j</math>-th state. From this formula, it is not difficult to see that <math> EI </math> is only a function of the probability transition matrix [math]f[/math]. The intervention operation is performed to make the effective information objectively measure the causal characteristics of the dynamics without being affected by the distribution of the original input data. |
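The calculation described here can be illustrated with a minimal sketch. It assumes the standard form of <math>EI</math> for a Markov chain, namely the average over rows of the KL divergence between each row of the transition matrix and the mean row (the distribution of <math>\tilde{X}_{t+1}</math> under the uniform intervention). The function name <code>effective_information</code> and the matrices <code>micro</code> and <code>macro</code> are hypothetical, chosen only for illustration.

```python
import numpy as np

def effective_information(P, base=2.0):
    """EI of a row-stochastic matrix P under a uniform intervention on X_t."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    # Distribution of X_{t+1} when X_t is forced to be uniform: the mean row.
    avg = P.mean(axis=0)
    # Ratio p_ij / avg_j on non-zero entries; zero entries contribute 0 * log(1) = 0.
    mask = P > 0
    ratio = np.ones_like(P)
    ratio[mask] = P[mask] / np.broadcast_to(avg, P.shape)[mask]
    return float((P * np.log(ratio)).sum() / (n * np.log(base)))

# Hypothetical 4-state micro chain: three states mix uniformly, one is fixed.
micro = np.array([[1/3, 1/3, 1/3, 0],
                  [1/3, 1/3, 1/3, 0],
                  [1/3, 1/3, 1/3, 0],
                  [0,   0,   0,   1]])
# Hypothetical macro chain after merging the first three states into one.
macro = np.eye(2)

print(effective_information(micro), effective_information(macro))
```

Because the function takes only the matrix as input, the sketch also makes explicit that <math>EI</math> depends on [math]f[/math] alone and not on the distribution of the observed data.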
| | | |
| | | |
− | Effective information can be decomposed into two parts: '''determinism''' and '''degeneracy'''. Normalization can also be introduced to eliminate the influence of the size of the state space. For more detailed information about effective information, please refer to the entry: Effective Information. | + | Effective information can be decomposed into two parts: '''determinism''' and '''degeneracy'''. For more detailed information about [[Effective Information]], please refer to the entry: Effective Information. |
| | | |
| | | |
Line 179: |
Line 179: |
| | | |
| | | |
− | However, for more general Markov chains and more general state groupings, this simple operation of averaging probabilities is not always feasible. This is because the merged probability transition matrix may not satisfy the conditions of a Markov chain (such as the rows of the matrix not satisfying the normalization condition, or the element values exceeding the range of [0, 1]). For what kind of Markov chains and state groupings can a feasible macroscopic Markov chain be obtained, please refer to the section “Reduction of Markov Chains” later in this entry, or see the entry “Coarse-graining of Markov Chains”. | + | However, for more general Markov chains and more general state groupings, this simple operation of averaging probabilities is not always feasible. This is because the merged probability transition matrix may not satisfy the conditions of a Markov chain (such as the rows of the matrix not satisfying the normalization condition, or the element values exceeding the range of [0, 1]). For what kind of Markov chains and state groupings can a feasible macroscopic Markov chain be obtained, please refer to the section “Reduction of Markov Chains” later in this entry, or see the entry [[Coarse-graining of Markov Chains]]. |
| | | |
| | | |
| =====Boolean Network Example===== | | =====Boolean Network Example===== |
− | Another example in the literature <ref name=":0"/> is an example of causal emergence in a Boolean network. As shown in the figure, this is a Boolean network with 4 nodes. Each node has two states, 0 and 1. Each node is connected to two other nodes and follows the same microscopic dynamics mechanism (figure a). Therefore, this system contains a total of sixteen microscopic states, and its dynamics can be represented by a state transition matrix (figure c). | + | Another example in the literature <ref name=":0"/> is an example of causal emergence in a [[Boolean network]]. As shown in the figure, this is a Boolean network with 4 nodes. Each node has two states, 0 and 1. Each node is connected to two other nodes and follows the same microscopic dynamics mechanism (figure a). Therefore, this system contains a total of sixteen microscopic states, and its dynamics can be represented by a state transition matrix (figure c). |
| | | |
| | | |
Line 189: |
Line 189: |
| | | |
| | | |
− | Through comparison, we find that the effective information of macroscopic dynamics is greater than that of microscopic dynamics <math>EI(f_M\ )>EI(f_m\ ) </math>. Causal emergence occurs in this system. | + | Through comparison, we find that the [[effective information]] of macroscopic dynamics is greater than that of microscopic dynamics <math>EI(f_M\ )>EI(f_m\ ) </math>. Causal emergence occurs in this system. |
| | | |
| | | |
Line 196: |
Line 196: |
| | | |
| =====Causal Emergence in Continuous Variables===== | | =====Causal Emergence in Continuous Variables===== |
− | Furthermore, in the paper <ref name="Chvykov_causal_geometry">{{cite journal|author1=Chvykov P|author2=Hoel E.|title=Causal Geometry|journal=Entropy|year=2021|volume=23|issue=1|page=24|url=https://doi.org/10.3390/e2}}</ref>, Hoel et al. proposed the theoretical framework of causal geometry, trying to generalize the causal emergence theory to function mappings and dynamical systems with continuous states. This article defines <math>EI</math> for random function mapping, and also introduces the concepts of intervention noise and causal geometry, and compares and analogizes this concept with information geometry. Liu Kaiwei et al.<ref name="An_exact_theory_of_causal_emergence">{{cite journal|author1=Liu K|author2=Yuan B|author3=Zhang J|title=An Exact Theory of Causal Emergence for Linear Stochastic Iteration Systems|journal=Entropy|year=2024|volume=26|issue=8|page=618|url=https://arxiv.org/abs/2405.09207}}</ref> further gave an exact analytical causal emergence theory for random iterative dynamical systems. | + | Furthermore, in the paper <ref name="Chvykov_causal_geometry">{{cite journal|author1=Chvykov P|author2=Hoel E.|title=Causal Geometry|journal=Entropy|year=2021|volume=23|issue=1|page=24|url=https://doi.org/10.3390/e2}}</ref>, Hoel et al. proposed the theoretical framework of [[causal geometry]], trying to generalize the causal emergence theory to function mappings and dynamical systems with continuous states. This article defines <math>EI</math> for [[random function mapping]], and also introduces the concepts of intervention noise and [[causal geometry]], and compares and analogizes this concept with [[information geometry]]. 
[[Liu Kaiwei]] et al.<ref name="An_exact_theory_of_causal_emergence">{{cite journal|author1=Liu K|author2=Yuan B|author3=Zhang J|title=An Exact Theory of Causal Emergence for Linear Stochastic Iteration Systems|journal=Entropy|year=2024|volume=26|issue=8|page=618|url=https://arxiv.org/pdf/2405.09207}}</ref> further gave an exact analytical causal emergence theory for [[random iterative dynamical systems]]. |
| | | |
| | | |
Line 207: |
Line 207: |
| | | |
| =====Partial Information Decomposition===== | | =====Partial Information Decomposition===== |
− | This method is based on the nonnegative decomposition of multivariate information theory proposed by Williams and Beer et al <ref name=":16" />. This paper uses partial information decomposition (PID) to decompose the mutual information between microstates and macrostates. | + | This method is based on the nonnegative decomposition of multivariate information theory proposed by Williams and Beer et al <ref name=":16" />. This paper uses [[partial information decomposition]] (PID) to decompose the [[mutual information]] between microstates and macrostates. |
| | | |
| | | |
− | Without loss of generality, assume that our microstate is <math>X(X^1,X^2)</math>, that is, it is a two-dimensional variable, and the macrostate is <math>V</math>. Then the mutual information between the two can be decomposed into four parts: | + | Without loss of generality, assume that our microstate is <math>X(X^1,X^2)</math>, that is, it is a two-dimensional variable, and the macrostate is <math>V</math>. Then the [[mutual information]] between the two can be decomposed into four parts: |
| | | |
| | | |
Line 216: |
Line 216: |
| | | |
| | | |
− | Among them, <math>Red(X^1,X^2;V)</math> represents redundant information, which refers to the information repeatedly provided by two microstates <math>X^1</math> and <math>X^2</math> to the macrostate <math>V</math>; <math>Un(X^1;V│X^2)</math> and <math>Un(X^2;V│X^1)</math> represent unique information, which refers to the information provided by each microstate variable alone to the macrostate; <math>Syn(X^1,X^2;V)</math> represents synergistic information, which refers to the information provided by all microstates <math>X</math> jointly to the macrostate <math>V</math>. | + | Among them, <math>Red(X^1,X^2;V)</math> represents [[redundant information]], which refers to the information repeatedly provided by two microstates <math>X^1</math> and <math>X^2</math> to the macrostate <math>V</math>; <math>Un(X^1;V│X^2)</math> and <math>Un(X^2;V│X^1)</math> represent [[unique information]], which refers to the information provided by each microstate variable alone to the macrostate; <math>Syn(X^1,X^2;V)</math> represents [[synergistic information]], which refers to the information provided by all microstates <math>X</math> jointly to the macrostate <math>V</math>. |
| | | |
| | | |
Line 223: |
Line 223: |
| | | |
| | | |
− | 1) When the unique information <math>Un(V_t;X_{t+1}| X_t^1,\ldots,X_t^n\ )>0 </math>, it means that the macroscopic state <math>V_t</math> at the current moment can provide more information to the overall system <math>X_{t + 1}</math> at the next moment than the microscopic state <math>X_t</math> at the current moment. At this time, there is causal emergence in the system; | + | 1) When the [[unique information]] <math>Un(V_t;X_{t+1}| X_t^1,\ldots,X_t^n\ )>0 </math>, it means that the macroscopic state <math>V_t</math> at the current moment can provide more information to the overall system <math>X_{t + 1}</math> at the next moment than the microscopic state <math>X_t</math> at the current moment. At this time, there is causal emergence in the system; |
| | | |
| | | |
− | 2) The second method bypasses the selection of a specific macroscopic state <math>V_t</math>, and defines causal emergence only based on the synergistic information between the microscopic state <math>X_t</math> and the microscopic state <math>X_{t + 1}</math> at the next moment of the system. When the synergistic information <math>Syn(X_t^1,…,X_t^n;X_{t + 1}^1,…,X_{t + 1}^n)>0</math>, causal emergence occurs in the system. | + | 2) The second method bypasses the selection of a specific macroscopic state <math>V_t</math>, and defines causal emergence only based on the [[synergistic information]] between the microscopic state <math>X_t</math> and the microscopic state <math>X_{t + 1}</math> at the next moment of the system. When the synergistic information <math>Syn(X_t^1,…,X_t^n;X_{t + 1}^1,…,X_{t + 1}^n)>0</math>, causal emergence occurs in the system. |
| | | |
| | | |
− | It should be noted that for the first method to judge the occurrence of causal emergence, it depends on the selection of the macroscopic state <math>V_t</math>. The first method is the lower bound of the second method. This is because <math>Syn(X_t;X_{t+1}\ ) ≥ Un(V_t;X_{t+1}| X_t\ )</math> always holds. So, if <math>Un(V_t;X_{t + 1}|X_t)</math> is greater than 0, then causal emergence occurs in the system. However, the selection of <math>V_t</math> often requires predefining a coarse-graining function, so the limitations of the Erik Hoel causal emergence theory cannot be avoided. Another natural idea is to use the second method to judge the occurrence of causal emergence with the help of synergistic information. However, the calculation of synergistic information is very difficult and there is a combinatorial explosion problem. Therefore, the calculation based on synergistic information in the second method is often infeasible. In short, both quantitative characterization methods of causal emergence have some weaknesses, so a more reasonable quantification method needs to be proposed. | + | It should be noted that for the first method to judge the occurrence of causal emergence, it depends on the selection of the macroscopic state <math>V_t</math>. The first method is the lower bound of the second method. This is because <math>Syn(X_t;X_{t+1}\ ) ≥ Un(V_t;X_{t+1}| X_t\ )</math> always holds. So, if <math>Un(V_t;X_{t + 1}|X_t)</math> is greater than 0, then causal emergence occurs in the system. However, the selection of <math>V_t</math> often requires predefining a coarse-graining function, so the limitations of the [[Erik Hoel causal emergence theory]] cannot be avoided. Another natural idea is to use the second method to judge the occurrence of causal emergence with the help of synergistic information. However, the calculation of synergistic information is very difficult and there is a combinatorial explosion problem. 
Therefore, the calculation based on synergistic information in the second method is often infeasible. In short, both quantitative characterization methods of causal emergence have some weaknesses, so a more reasonable quantification method needs to be proposed. |
| | | |
| | | |
| =====Specific Example===== | | =====Specific Example===== |
− |
| |
− |
| |
| [[File:因果解耦以及向下因果例子1.png|500x500px|left|Example of causal decoupling and downward causation]] | | [[File:因果解耦以及向下因果例子1.png|500x500px|left|Example of causal decoupling and downward causation]] |
| | | |
| | | |
− | The author of the paper <ref name=":5" /> lists a specific example (as above), to illustrate when causal decoupling, downward causation and causal emergence occur. This example is a special Markov process. Here, <math>p_{X_{t + 1}|X_t}(x_{t + 1}|x_t)</math> represents the dynamic relationship, and <math>X_t=(x_t^1,…,x_t^n)\in\{0,1\}^n</math> is the microstate. The definition of this process is to determine the probability of taking different values of the state <math>x_{t + 1}</math> at the next moment by checking the values of the variables <math>x_t</math> and <math>x_{t + 1}</math> at two consecutive moments, that is, judging whether the sum modulo 2 of all dimensions of <math>x_t</math> is the same as the first dimension of <math>x_{t + 1}</math>: if they are different, the probability is 0; otherwise, judge whether <math>x_t,x_{t + 1}</math> have the same sum modulo 2 value in all dimensions. If both conditions are satisfied, the value probability is <math>\gamma/2^{n - 2}</math>, otherwise the value probability is <math>(1-\gamma)/2^{n - 2}</math>. Here <math>\gamma</math> is a parameter and <math>n</math> is the total dimension of x. | + | The author of the paper <ref name=":5" /> lists a specific example (as above), to illustrate when [[causal decoupling]], [[downward causation]] and [[causal emergence]] occur. This example is a special Markov process. Here, <math>p_{X_{t + 1}|X_t}(x_{t + 1}|x_t)</math> represents the dynamic relationship, and <math>X_t=(x_t^1,…,x_t^n)\in\{0,1\}^n</math> is the microstate. 
The definition of this process is to determine the probability of taking different values of the state <math>x_{t + 1}</math> at the next moment by checking the values of the variables <math>x_t</math> and <math>x_{t + 1}</math> at two consecutive moments, that is, judging whether the sum modulo 2 of all dimensions of <math>x_t</math> is the same as the first dimension of <math>x_{t + 1}</math>: if they are different, the probability is 0; otherwise, judge whether <math>x_t,x_{t + 1}</math> have the same sum modulo 2 value in all dimensions. If both conditions are satisfied, the value probability is <math>\gamma/2^{n - 2}</math>, otherwise the value probability is <math>(1-\gamma)/2^{n - 2}</math>. Here <math>\gamma</math> is a parameter and <math>n</math> is the total dimension of x. |
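As an illustration, the transition rule described in this paragraph can be written out explicitly. This is a sketch of one reading of the prose above, not code from the paper: the function name <code>parity_process_matrix</code> is hypothetical, and states are enumerated in the order produced by <code>itertools.product</code>.

```python
import itertools
import numpy as np

def parity_process_matrix(n, gamma):
    """Transition matrix of the parity-based Markov process on {0,1}^n (n >= 2)."""
    states = list(itertools.product([0, 1], repeat=n))
    P = np.zeros((len(states), len(states)))
    for i, x in enumerate(states):
        parity_x = sum(x) % 2
        for j, y in enumerate(states):
            # First condition: the first dimension of x_{t+1} must equal the parity of x_t.
            if y[0] != parity_x:
                continue  # probability 0
            # Second condition: x_t and x_{t+1} have the same parity.
            same_parity = (sum(y) % 2) == parity_x
            P[i, j] = (gamma if same_parity else 1.0 - gamma) / 2 ** (n - 2)
    return states, P

states, P = parity_process_matrix(3, 0.9)
print(P.sum(axis=1))  # each row sums to 1
```

Under this reading every row is normalized: of the <math>2^{n-1}</math> next states whose first dimension matches the parity of <math>x_t</math>, half preserve the parity and half do not, so the total is <math>\gamma + (1-\gamma) = 1</math>.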
| | | |
| | | |
Line 248: |
Line 246: |
| | | |
| ====Causal Emergence Theory Based on Singular Value Decomposition==== | | ====Causal Emergence Theory Based on Singular Value Decomposition==== |
− | Erik Hoel's causal emergence theory has the problem of needing to specify a coarse-graining strategy in advance. Rosas' information decomposition theory does not completely solve this problem. Therefore, Zhang Jiang et al.<ref name=":2">Zhang J, Tao R, Yuan B. Dynamical Reversibility and A New Theory of Causal Emergence. arXiv preprint arXiv:2402.15054. 2024 Feb 23.</ref> further proposed the causal emergence theory based on singular value decomposition. | + | [[Erik Hoel's causal emergence theory]] has the problem of needing to specify a coarse-graining strategy in advance. Rosas' information decomposition theory does not completely solve this problem. Therefore, [[Zhang Jiang]] et al.<ref name=":2">Zhang J, Tao R, Yuan B. Dynamical Reversibility and A New Theory of Causal Emergence. arXiv preprint arXiv:2402.15054. 2024 Feb 23.</ref> further proposed the [[causal emergence theory based on singular value decomposition]]. |
| | | |
| | | |
| =====Singular Value Decomposition of Markov Chains===== | | =====Singular Value Decomposition of Markov Chains===== |
− | Given the Markov transition matrix <math>P</math> of a system, we can perform singular value decomposition on it to obtain two orthogonal and normalized matrices <math>U</math> and <math>V</math>, and a diagonal matrix <math>\Sigma</math>: <math>P = U\Sigma V^T</math>, where [math]\Sigma = diag(\sigma_1,\sigma_2,\cdots,\sigma_N)[/math], and [math]\sigma_1\geq\sigma_2\geq\cdots\geq\sigma_N[/math] are the singular values of <math>P</math> and are arranged in descending order. <math>N</math> is the number of states of <math>P</math>. | + | Given the [[Markov transition matrix]] <math>P</math> of a system, we can perform [[singular value decomposition]] on it to obtain two orthogonal and normalized matrices <math>U</math> and <math>V</math>, and a diagonal matrix <math>\Sigma</math>: <math>P = U\Sigma V^T</math>, where [math]\Sigma = diag(\sigma_1,\sigma_2,\cdots,\sigma_N)[/math], and [math]\sigma_1\geq\sigma_2\geq\cdots\geq\sigma_N[/math] are the singular values of <math>P</math> and are arranged in descending order. <math>N</math> is the number of states of <math>P</math>. |
| | | |
| | | |
| =====Approximate Dynamical Reversibility and Effective Information===== | | =====Approximate Dynamical Reversibility and Effective Information===== |
− | We can define the sum of the <math>\alpha</math> powers of the singular values (also known as the [math]\alpha[/math]-order Schatten norm of the matrix) as a measure of the approximate dynamical reversibility of the Markov chain, that is: | + | We can define the sum of the <math>\alpha</math> powers of the singular values (also known as the [math]\alpha[/math]-order [[Schatten norm]] of the matrix) as a measure of the [[approximate dynamical reversibility]] of the Markov chain, that is: |
| | | |
| | | |
Line 264: |
Line 262: |
| | | |
| | | |
− | Here, [math]\alpha\in(0,2)[/math] is a specified parameter that acts as a weight or tendency to make [math]\Gamma_{\alpha}[/math] reflect determinism or degeneracy more. Under normal circumstances, we take [math]\alpha = 1[/math], which can make [math]\Gamma_{\alpha}[/math] achieve a balance between determinism and degeneracy. | + | Here, [math]\alpha\in(0,2)[/math] is a specified parameter that acts as a weight or tendency to make [math]\Gamma_{\alpha}[/math] reflect [[determinism]] or [[degeneracy]] more. Under normal circumstances, we take [math]\alpha = 1[/math], which can make [math]\Gamma_{\alpha}[/math] achieve a balance between determinism and degeneracy. |
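The measure [math]\Gamma_{\alpha}[/math], defined above as the sum of the [math]\alpha[/math] powers of the singular values of <math>P</math>, can be computed directly from the SVD. The following is a minimal sketch: the function name <code>gamma_alpha</code> is hypothetical, and only the singular values are requested from <code>numpy.linalg.svd</code>.

```python
import numpy as np

def gamma_alpha(P, alpha=1.0):
    """Sum of the alpha-th powers of the singular values of P."""
    sigma = np.linalg.svd(np.asarray(P, dtype=float), compute_uv=False)
    return float(np.sum(sigma ** alpha))

# A permutation (here: identity) matrix is fully reversible: Gamma equals N.
print(gamma_alpha(np.eye(4)))
# A matrix whose rows are all uniform has a single non-zero singular value.
print(gamma_alpha(np.full((4, 4), 0.25)))
```

Dividing the result by <math>N</math> gives a normalized value, which reaches 1 for a permutation matrix.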
| | | |
| | | |
Line 293: |
Line 291: |
| | | |
| | | |
− | In summary, the advantage of this method for quantifying causal emergence is that it can quantify causal emergence more objectively without relying on a specific coarse-graining strategy. The disadvantage of this method is that to calculate [math]\Gamma_{\alpha}[/math], it is necessary to perform SVD decomposition on <math>P</math> in advance, so the computational complexity is [math]O(N^3)[/math], which is higher than the computational complexity of <math>EI</math>. Moreover, [math]\Gamma_{\alpha}[/math] cannot be explicitly decomposed into two components: determinism and degeneracy. | + | In summary, the advantage of this method for quantifying causal emergence is that it can quantify causal emergence more objectively without relying on a specific coarse-graining strategy. The disadvantage of this method is that to calculate [math]\Gamma_{\alpha}[/math], it is necessary to perform [[SVD decomposition]] on <math>P</math> in advance, so the computational complexity is [math]O(N^3)[/math], which is higher than the computational complexity of <math>EI</math>. Moreover, [math]\Gamma_{\alpha}[/math] cannot be explicitly decomposed into two components: determinism and degeneracy. |
| | | |
| | | |
Line 302: |
Line 300: |
| | | |
| | | |
− | The author gives four specific examples of Markov chains. The state transition matrix of this Markov chain is shown in the figure. We can compare the <math>EI</math> and approximate dynamical reversibility (the <math>\Gamma</math> in the figure, that is, <math>\Gamma_{\alpha = 1}</math>) of this Markov chain. Comparing figures a and b, we find that for different state transition matrices, when <math>EI</math> decreases, <math>\Gamma</math> also decreases simultaneously. Further, figures c and d are comparisons of the effects before and after coarse-graining. Among them, figure d is the coarse-graining of the state transition matrix of figure c (merging the first three states into a macroscopic state). Since the macroscopic state transition matrix in figure d is a deterministic system, the normalized <math>EI</math>, <math>eff\equiv EI/\log N</math> and the normalized [math]\Gamma[/math]: <math>\gamma\equiv \Gamma/N</math> all reach the maximum value of 1. | + | The author gives four specific examples of Markov chains. The state transition matrix of this Markov chain is shown in the figure. We can compare the <math>EI</math> and [[approximate dynamical reversibility]] (the <math>\Gamma</math> in the figure, that is, <math>\Gamma_{\alpha = 1}</math>) of this Markov chain. Comparing figures a and b, we find that for different state transition matrices, when <math>EI</math> decreases, <math>\Gamma</math> also decreases simultaneously. Further, figures c and d are comparisons of the effects before and after coarse-graining. Among them, figure d is the coarse-graining of the state transition matrix of figure c (merging the first three states into a macroscopic state). Since the macroscopic state transition matrix in figure d is a [[deterministic system]], the normalized <math>EI</math>, <math>eff\equiv EI/\log N</math> and the normalized [math]\Gamma[/math]: <math>\gamma\equiv \Gamma/N</math> all reach the maximum value of 1. |
| | | |
| | | |
| ====Dynamic independence==== | | ====Dynamic independence==== |
− | Dynamic independence is a method to characterize the macroscopic dynamical state after coarse-graining being independent of the microscopic dynamical state <ref name=":6">Barnett L, Seth AK. Dynamical independence: discovering emergent macroscopic processes in complex dynamical systems. Physical Review E. 2023 Jul;108(1):014304.</ref>. The core idea is that although macroscopic variables are composed of microscopic variables, when predicting the future state of macroscopic variables, only the historical information of macroscopic variables is needed, and no additional information from microscopic history is needed. This phenomenon is called dynamic independence by the author. It is another means of quantifying emergence. The macroscopic dynamics at this time is called emergent dynamics. The independence, causal dependence, etc. in the concept of dynamic independence can be quantified by transfer entropy. | + | [[Dynamic independence]] is a method to characterize the macroscopic dynamical state after coarse-graining being independent of the microscopic dynamical state <ref name=":6">Barnett L, Seth AK. Dynamical independence: discovering emergent macroscopic processes in complex dynamical systems. Physical Review E. 2023 Jul;108(1):014304.</ref>. The core idea is that although macroscopic variables are composed of microscopic variables, when predicting the future state of macroscopic variables, only the historical information of macroscopic variables is needed, and no additional information from microscopic history is needed. This phenomenon is called [[dynamic independence]] by the author. It is another means of quantifying emergence. The macroscopic dynamics at this time is called emergent dynamics. The independence, causal dependence, etc. in the concept of dynamic independence can be quantified by [[transfer entropy]]. |
| | | |
| | | |
| =====Quantification of dynamic independence===== | | =====Quantification of dynamic independence===== |
− | Transfer entropy is a non-parametric statistic that measures the amount of directed (time-asymmetric) information transfer between two stochastic processes. The transfer entropy from process <math>X</math> to another process <math>Y</math> can be defined as the degree to which knowing the past values of <math>X</math> can reduce the uncertainty about the future value of <math>Y</math> given the past values of <math>Y</math>. The formula is as follows: | + | [[Transfer entropy]] is a non-parametric statistic that measures the amount of directed (time-asymmetric) information transfer between two stochastic processes. The transfer entropy from process <math>X</math> to another process <math>Y</math> can be defined as the degree to which knowing the past values of <math>X</math> can reduce the uncertainty about the future value of <math>Y</math> given the past values of <math>Y</math>. The formula is as follows: |
| | | |
| | | |
Line 322: |
Line 320: |
| | | |
| | | |
− | In the paper, the author conducts experimental verification in a linear system. The experimental process is: 1) Use the linear system to generate parameters and laws; 2) Set the coarse-graining function; 3) Obtain the expression of transfer entropy; 4) Optimize and solve the coarse-graining method of maximum decoupling (corresponding to minimum transfer entropy). Here, the optimization algorithm can use transfer entropy as the optimization goal, and then use the gradient descent algorithm to solve the coarse-graining function, or use the genetic algorithm for optimization. | + | In the paper, the author conducts experimental verification in a [[linear system]]. The experimental process is: 1) Use the linear system to generate parameters and laws; 2) Set the coarse-graining function; 3) Obtain the expression of transfer entropy; 4) Optimize and solve the coarse-graining method of maximum decoupling (corresponding to minimum transfer entropy). Here, the optimization algorithm can use transfer entropy as the optimization goal, and then use the [[gradient descent algorithm]] to solve the coarse-graining function, or use the [[genetic algorithm]] for optimization. |
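The quantity being minimized in this procedure can be made concrete with a small Python sketch. It is not the estimator used in the paper, which works with linear systems; instead it assumes discrete-valued sequences, a history length of 1, and a simple plug-in (frequency-count) estimate of transfer entropy. The function name and the toy sequences are hypothetical.

```python
import math
import random
from collections import Counter

def transfer_entropy(x, y):
    """Plug-in estimate in bits of the transfer entropy from x to y,
    assuming discrete symbols and a history length of 1."""
    n = len(y) - 1
    triples = Counter(zip(y[1:], y[:-1], x[:-1]))  # (y_next, y_past, x_past)
    past_yx = Counter(zip(y[:-1], x[:-1]))         # (y_past, x_past)
    next_past_y = Counter(zip(y[1:], y[:-1]))      # (y_next, y_past)
    past_y = Counter(y[:-1])                       # y_past
    te = 0.0
    for (y1, y0, x0), count in triples.items():
        p_joint = count / n
        p_given_both = count / past_yx[(y0, x0)]           # p(y_next | y_past, x_past)
        p_given_self = next_past_y[(y1, y0)] / past_y[y0]  # p(y_next | y_past)
        te += p_joint * math.log2(p_given_both / p_given_self)
    return te

# Toy data: y copies x with a one-step delay, so x drives y but not the reverse.
rng = random.Random(0)
x = [rng.randint(0, 1) for _ in range(5000)]
y = [0] + x[:-1]

te_xy = transfer_entropy(x, y)
te_yx = transfer_entropy(y, x)
print(te_xy, te_yx)
```

In the dynamic independence setting, <math>X</math> would play the role of the microscopic process and <math>Y</math> the coarse-grained macroscopic process, and a coarse-graining is sought for which this quantity is as close to zero as possible.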
| | | |
| | | |
| =====Example===== | | =====Example===== |
− | The paper gives an example of a linear dynamical system whose dynamics follow a vector autoregressive model. When a genetic algorithm is run iteratively from different initial conditions, the degree of dynamical decoupling of the system gradually increases. The coarse-graining scale also affects how closely the optimization approaches dynamic independence: in the experiment, dynamic decoupling can be achieved only at certain scales and not at others, so the choice of scale is also very important. | + | The paper gives an example of a linear dynamical system whose dynamics follow a vector autoregressive model. When a genetic algorithm is run iteratively from different initial conditions, the degree of dynamical decoupling of the system gradually increases. The coarse-graining scale also affects how closely the optimization approaches dynamic independence: in the experiment, [[dynamic decoupling]] can be achieved only at certain scales and not at others, so the choice of scale is also very important. |
| | | |
| | | |
第346行: |
第344行: |
| |Dynamic independence <ref name=":6"/>||Granger causality||Requires specifying a coarse-graining method||Arbitrary dynamics||Dynamic independence: transfer entropy | | |Dynamic independence <ref name=":6"/>||Granger causality||Requires specifying a coarse-graining method||Arbitrary dynamics||Dynamic independence: transfer entropy |
| |} | | |} |
− |
| |
| | | |
| ==Identification of Causal Emergence== | | ==Identification of Causal Emergence== |