Changes

The concept of Effective Information (EI) was first introduced by Giulio Tononi in 2003 as a key measure in Integrated Information Theory[1]. A system is said to have a high degree of integration when there is a strong causal connection among its components, and EI is the metric used to quantify this degree of causal connection.

−

In 2013, Giulio Tononi's student, Erik Hoel, further refined the concept of EI to quantitatively characterize emergence, leading to the development of the theory of Causal Emergence[2]. In this theory, Hoel used Judea Pearl’s "do" operator to modify the general mutual information metric, which made EI fundamentally different from mutual information. While mutual information measures correlation, EI, due to the use of the "do" operator, measures causality. The article also introduced a normalized version of EI, referred to as Eff.

+

In 2013, Giulio Tononi's student, Erik Hoel, further refined the concept of EI to quantitatively characterize emergence, leading to the development of the theory of Causal Emergence[2]. In this theory, Hoel used Judea Pearl's "do" operator to modify the general mutual information metric, which made EI fundamentally different from mutual information. While mutual information measures correlation, EI, due to the use of the "do" operator, measures causality. The article also introduced a normalized version of EI, referred to as Eff.

Traditionally, EI was applied primarily to discrete-state Markov chains. To extend it to continuous domains, P. Chvykov and E. Hoel proposed the theory of Causal Geometry in 2020, expanding EI's definition to function mappings with continuous state variables. Drawing on Information Geometry, they explored a perturbative form of EI and compared it with Fisher Information. However, this method of calculating EI for continuous variables assumes that the normally distributed variables have infinitesimal variance, which is an overly stringent condition.

Line 83: Line 83:

The original definition of effective information (EI) was based on discrete Markov chains. However, to expand its applicability, we explore a more general form of EI here.

==Formal Definition==

−

Consider two random variables, X and Y, representing the cause variable and the effect variable, respectively. Let their value ranges be X and Y. The effective information (EI) from X to Y is defined as:

+

Consider two random variables, [math]X[/math] and [math]Y[/math], representing the cause variable and the effect variable, respectively. Let their value ranges be [math]\mathcal{X}[/math] and [math]\mathcal{Y}[/math]. The effective information (EI) from [math]X[/math] to [math]Y[/math] is defined as:

−

EI≡I(X:Y∣do(X∼U(X)))≡I(X~:Y~)

+

<math>

+

EI\equiv I(X:Y|do(X\sim U(\mathcal{X})))\equiv I(\tilde{X}:\tilde{Y})

+

</math>

−

Here, do(X∼U(X)) represents applying a do-intervention (or do-operator) on X, making it follow a uniform distribution U(X) over X, which corresponds to a maximum entropy distribution. X~ and Y~ represent the variables X and Y after the do-intervention on X, where:

+

Here, [math]do(X\sim U(\mathcal{X}))[/math] represents applying a do-intervention (or do-operator) on [math]X[/math], making it follow a uniform distribution [math]U(\mathcal{X})[/math] over [math]\mathcal{X}[/math], which corresponds to a maximum entropy distribution. [math]\tilde{X}[/math] and [math]\tilde{Y}[/math] represent the variables [math]X[/math] and [math]Y[/math] after the [math]do[/math]-intervention on [math]X[/math], where:

−

Pr(X~=x)=1/#(X)

+

<math>

+

Pr(\tilde{X}=x)=\frac{1}{\#(\mathcal{X})},

+

</math>

−

This means that the main difference between X~ after the intervention and X before the intervention is their distributions: X~ follows a uniform distribution over X, while X may follow any arbitrary distribution. #(X) represents the cardinality of the set X, or the number of elements in the set if it is finite.

+

This means that the main difference between [math]\tilde{X}[/math] after the intervention and [math]X[/math] before the intervention is their distributions: [math]\tilde{X}[/math] follows a uniform distribution over [math]\mathcal{X}[/math], while [math]X[/math] may follow any arbitrary distribution. [math]\#(\mathcal{X})[/math] represents the cardinality of the set [math]\mathcal{X}[/math], or the number of elements in the set if it is finite.

−

According to Judea Pearl’s theory, the do-operator cuts off all causal arrows pointing to variable X, while keeping other factors unchanged, particularly the causal mechanism from X to Y. The causal mechanism is defined as the conditional probability of Y taking any value y∈Y given X takes a value x∈X:

+

According to Judea Pearl's theory, the do-operator cuts off all causal arrows pointing to variable [math]X[/math], while keeping other factors unchanged, particularly the causal mechanism from [math]X[/math] to [math]Y[/math]. The causal mechanism is defined as the conditional probability of [math]Y[/math] taking any value [math]y\in \mathcal{Y}[/math] given [math]X[/math] takes a value [math]x\in \mathcal{X}[/math]:

−

f≡Pr(Y=y∣X=x)

+

<math>

+

f\equiv Pr(Y=y|X=x)

+

</math>

−

In the intervention, this causal mechanism f remains constant. When no other variables are influencing the system, this leads to a change in the distribution of Y, which is indirectly intervened upon and becomes:

+

In the intervention, this causal mechanism [math]f[/math] remains constant. When no other variables are influencing the system, this leads to a change in the distribution of [math]Y[/math], which is indirectly intervened upon and becomes:

−

Pr(Y~=y)=∑_{x∈X} Pr(X~=x)Pr(Y=y∣X=x)=∑_{x∈X} Pr(Y=y∣X=x)/#(X)

+

<math>

+

Pr(\tilde{Y}=y)=\sum_{x\in \mathcal{X}}Pr(\tilde{X}=x) Pr(Y=y|X=x)=\sum_{x\in \mathcal{X}} \frac{Pr(Y=y|X=x)}{\#(\mathcal{X})}.

+

</math>
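As a quick check of this formula, consider a hypothetical mechanism with two cause states and two effect states (the numbers are assumed purely for illustration and do not come from the source): [math]Pr(Y=0|X=0)=0.9[/math], [math]Pr(Y=1|X=0)=0.1[/math], [math]Pr(Y=0|X=1)=0.3[/math] and [math]Pr(Y=1|X=1)=0.7[/math]. Since [math]\#(\mathcal{X})=2[/math], averaging the conditional probabilities over the two cause states gives:

<math>
Pr(\tilde{Y}=0)=\frac{0.9+0.3}{2}=0.6,\qquad Pr(\tilde{Y}=1)=\frac{0.1+0.7}{2}=0.4.
</math>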

−

Here, Y~ represents the modified distribution of Y after the do-intervention on X, reflecting how the distribution of Y changes indirectly due to the intervention on X.

+

Here, [math]\tilde{Y}[/math] denotes the variable [math]Y[/math] as indirectly changed by the do-intervention on [math]X[/math], with the causal mechanism [math]f[/math] held fixed; the change shows up in the probability distribution of [math]Y[/math].

−

Therefore, the effective information (EI) of a causal mechanism f is the mutual information between the intervened cause variable X~ and the intervened effect variable Y~.

+

Therefore, the effective information (EI) of a causal mechanism [math]f[/math] is the mutual information between the intervened cause variable [math]\tilde{X}[/math] and the intervened effect variable [math]\tilde{Y}[/math].
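The definition above can be sketched in a few lines of Python for the finite, discrete case. This is a minimal illustration rather than a reference implementation: the function names `intervened_effect_distribution` and `effective_information` are hypothetical, the mechanism is assumed to be given as a list of rows with `mechanism[i][j]` holding Pr(Y = y_j | X = x_i), and the result is reported in bits (base-2 logarithm), which is an assumption since the text does not fix a logarithm base.

```python
import math


def intervened_effect_distribution(mechanism):
    """Return Pr(Y~ = y) under do(X ~ U): the plain average of the rows.

    `mechanism[i][j]` is assumed to hold Pr(Y = y_j | X = x_i).
    """
    n = len(mechanism)       # #(X), the number of cause states
    m = len(mechanism[0])    # number of effect states
    return [sum(row[j] for row in mechanism) / n for j in range(m)]


def effective_information(mechanism):
    """Mutual information I(X~ : Y~) in bits for a discrete mechanism."""
    for row in mechanism:
        if abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row must sum to 1")
    n = len(mechanism)
    p_x = 1.0 / n            # uniform intervention distribution on X
    p_y = intervened_effect_distribution(mechanism)
    ei = 0.0
    for row in mechanism:
        for j, p in enumerate(row):
            if p > 0:        # terms with Pr(y|x) = 0 contribute nothing
                ei += p_x * p * math.log2(p / p_y[j])
    return ei
```

Under these assumptions, a deterministic one-to-one mechanism on two states, `[[1.0, 0.0], [0.0, 1.0]]`, yields an EI of 1 bit, while a mechanism whose rows are all identical yields 0, because the effect then carries no information about the intervened cause.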

==Why Use the Do-Operator?==

Effective Information: revision as of 09:58, 10 September 2024 (Tuesday), by 相信未来
