The original definition of effective information (EI) was based on discrete Markov chains. However, to expand its applicability, we explore a more general form of EI here.

==Formal Definition==
Consider two random variables, [math]X[/math] and [math]Y[/math], representing the cause variable and the effect variable, respectively. Let their value ranges be [math]\mathcal{X}[/math] and [math]\mathcal{Y}[/math]. The effective information (EI) from [math]X[/math] to [math]Y[/math] is defined as:

<math>
EI \equiv I(X;Y|do(X\sim U(\mathcal{X})))\equiv I(\tilde{X};\tilde{Y}),
</math>

Here, [math]do(X\sim U(\mathcal{X}))[/math] denotes applying a [[do-operator]] to [math]X[/math] so that it follows the uniform distribution [math]U(\mathcal{X})[/math] over [math]\mathcal{X}[/math], which is the [[Maximum Entropy Distribution]] on [math]\mathcal{X}[/math]. [math]\tilde{X}[/math] and [math]\tilde{Y}[/math] denote [math]X[/math] and [math]Y[/math] after this [math]do[/math]-intervention is applied to [math]X[/math], where:

<math>
Pr(\tilde{X}=x)=\frac{1}{\#(\mathcal{X})}.
</math>

This means that the main difference between [math]\tilde{X}[/math] after the intervention and [math]X[/math] before the intervention lies in their distributions: [math]\tilde{X}[/math] follows a uniform distribution over [math]\mathcal{X}[/math], while [math]X[/math] may follow an arbitrary distribution. [math]\#(\mathcal{X})[/math] denotes the cardinality of the set [math]\mathcal{X}[/math], i.e., the number of elements in the set when it is finite.
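As a simple illustration (an assumed example, not taken from the original text), suppose [math]\mathcal{X}=\{1,2,3,4\}[/math], so that [math]\#(\mathcal{X})=4[/math]. The do-intervention then assigns

<math>
Pr(\tilde{X}=1)=Pr(\tilde{X}=2)=Pr(\tilde{X}=3)=Pr(\tilde{X}=4)=\frac{1}{4},
</math>

no matter what distribution [math]X[/math] originally followed.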
According to [[Judea Pearl]]'s theory, the do-operator cuts off all causal arrows pointing to the variable [math]X[/math], while keeping everything else unchanged, in particular the causal mechanism from [math]X[/math] to [math]Y[/math]. The [[Causal Mechanism]] is defined as the conditional probability that [math]Y[/math] takes any value [math]y\in \mathcal{Y}[/math] given that [math]X[/math] takes the value [math]x\in \mathcal{X}[/math]:

<math>
f \equiv Pr(Y=y|X=x),\quad \forall x\in \mathcal{X},\ y\in \mathcal{Y}.
</math>

Under the intervention, this [[Causal Mechanism]] [math]f[/math] remains unchanged. When no other variables influence the system, the distribution of [math]Y[/math], which is thereby intervened upon indirectly, becomes:

<math>
Pr(\tilde{Y}=y)=\sum_{x\in \mathcal{X}}Pr(\tilde{X}=x) Pr(Y=y|X=x)=\sum_{x\in \mathcal{X}} \frac{Pr(Y=y|X=x)}{\#(\mathcal{X})}.
</math>

Here, [math]\tilde{Y}[/math] denotes the effect variable [math]Y[/math] as it is indirectly changed by the do-intervention on [math]X[/math] while the causal mechanism [math]f[/math] is held fixed; the change is reflected in its probability distribution.

Therefore, the effective information (EI) of a causal mechanism [math]f[/math] is the [[Mutual Information]] between the intervened cause variable [math]\tilde{X}[/math] and the intervened effect variable [math]\tilde{Y}[/math].
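To make the definition concrete for a finite state space, the following Python sketch (an illustrative implementation; the function name, the example matrices, and the choice of base-2 logarithms are assumptions not fixed by the text above) represents the causal mechanism [math]f[/math] as a conditional probability matrix, applies the uniform do-intervention to [math]X[/math], derives the distribution of [math]\tilde{Y}[/math] by the formula above, and returns the mutual information [math]I(\tilde{X};\tilde{Y})[/math]:

<syntaxhighlight lang="python">
import numpy as np

def effective_information(f: np.ndarray) -> float:
    """EI of a discrete causal mechanism f, where f[i, j] = Pr(Y = y_j | X = x_i).

    X is intervened to follow the uniform (maximum-entropy) distribution over
    its #(X) states, and EI is the mutual information between the intervened
    cause X~ and the intervened effect Y~, measured here in bits (log base 2).
    """
    n = f.shape[0]                      # number of states of X, i.e. #(X)
    p_x = np.full(n, 1.0 / n)           # Pr(X~ = x) = 1 / #(X)  (do-intervention)
    p_y = p_x @ f                       # Pr(Y~ = y) = sum_x Pr(X~ = x) Pr(Y = y | X = x)

    ei = 0.0
    for i in range(n):
        for j in range(f.shape[1]):
            if f[i, j] > 0:             # skip zero-probability terms (0 log 0 = 0)
                ei += p_x[i] * f[i, j] * np.log2(f[i, j] / p_y[j])
    return ei

# A hypothetical deterministic mechanism: X is flipped into Y.
f_det = np.array([[0.0, 1.0],
                  [1.0, 0.0]])
print(effective_information(f_det))     # 1.0 bit

# A hypothetical fully noisy mechanism: Y is independent of X.
f_noisy = np.array([[0.5, 0.5],
                    [0.5, 0.5]])
print(effective_information(f_noisy))   # 0.0 bit
</syntaxhighlight>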
==Why Use the Do-Operator?==