The original definition of effective information (EI) was based on discrete Markov chains. However, to expand its applicability, we explore a more general form of EI here.
 
==Formal Definition ==
Consider two random variables, [math]X[/math] and [math]Y[/math], representing the cause variable and the effect variable, respectively, with value ranges [math]\mathcal{X}[/math] and [math]\mathcal{Y}[/math]. The effective information (EI) from [math]X[/math] to [math]Y[/math] is defined as:
    
<math>
EI \equiv I(X;Y|do(X\sim U(\mathcal{X}))) = I(\tilde{X};\tilde{Y}).
</math>
Here, [math]do(X\sim U(\mathcal{X}))[/math] represents applying a [[do-operator]] to [math]X[/math], making it follow the uniform distribution [math]U(\mathcal{X})[/math] over [math]\mathcal{X}[/math], which is the [[Maximum Entropy Distribution]] on that set. [math]\tilde{X}[/math] and [math]\tilde{Y}[/math] denote [math]X[/math] and [math]Y[/math], respectively, after the [math]do[/math]-intervention on [math]X[/math], where:
    
<math>
Pr(\tilde{X}=x)=\frac{1}{\#(\mathcal{X})}.
</math>
    
This means that [math]\tilde{X}[/math] after the intervention differs from [math]X[/math] before the intervention mainly in its distribution: [math]\tilde{X}[/math] follows a uniform distribution over [math]\mathcal{X}[/math], while [math]X[/math] may follow an arbitrary distribution. [math]\#(\mathcal{X})[/math] denotes the cardinality of the set [math]\mathcal{X}[/math], that is, the number of its elements when the set is finite.
According to [[Judea Pearl]]'s theory, the do-operator cuts off all causal arrows pointing to variable [math]X[/math] while keeping everything else unchanged, in particular the causal mechanism from [math]X[/math] to [math]Y[/math]. The [[Causal Mechanism]] is defined as the conditional probability that [math]Y[/math] takes any value [math]y\in \mathcal{Y}[/math] given that [math]X[/math] takes a value [math]x\in \mathcal{X}[/math]:
    
<math>
f\equiv Pr(Y=y|X=x).
</math>
Under the intervention, this [[Causal Mechanism]] [math]f[/math] remains unchanged. When no other variables influence the system, the intervention on [math]X[/math] indirectly changes the distribution of [math]Y[/math], which becomes:
    
<math>
Pr(\tilde{Y}=y)=\sum_{x\in \mathcal{X}}Pr(\tilde{X}=x) Pr(Y=y|X=x)=\sum_{x\in \mathcal{X}} \frac{Pr(Y=y|X=x)}{\#(\mathcal{X})}.
</math>
    
Here, [math]\tilde{Y}[/math] denotes the variable [math]Y[/math] as indirectly changed by the do-intervention on [math]X[/math], with the causal mechanism [math]f[/math] held fixed; the change is reflected mainly in its probability distribution.
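
As a small illustration of the intervened effect distribution, the sketch below averages the rows of a conditional probability table, which is what the uniform weighting [math]1/\#(\mathcal{X})[/math] amounts to for a finite [math]\mathcal{X}[/math]. The function name <code>intervened_effect_distribution</code>, the nested-list representation, and the two-state mechanism are assumptions made for this example and do not come from the text.

```python
def intervened_effect_distribution(mechanism):
    """Return Pr(Y~ = y) when X is forced to be uniform over its values.

    mechanism[x][y] holds Pr(Y=y | X=x); each row is assumed to sum to 1.
    """
    n_causes = len(mechanism)
    n_effects = len(mechanism[0])
    # Each cause value carries weight 1 / #(X), so the result is the row average.
    return [
        sum(row[y] for row in mechanism) / n_causes
        for y in range(n_effects)
    ]


# Hypothetical two-state mechanism, used only for illustration.
mechanism = [
    [0.9, 0.1],  # Pr(Y | X = 0)
    [0.2, 0.8],  # Pr(Y | X = 1)
]
effect = intervened_effect_distribution(mechanism)
```

Note that the result depends only on the mechanism, not on whatever distribution [math]X[/math] had before the intervention.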
Therefore, the effective information (EI) of a causal mechanism [math]f[/math] is the [[Mutual Information]] between the intervened cause variable [math]\tilde{X}[/math] and the intervened effect variable [math]\tilde{Y}[/math].
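
For a finite mechanism, this mutual information can be computed directly from the conditional probability table. The following is a minimal sketch under stated assumptions: the logarithm is taken to base 2 (so EI is in bits, a choice the text does not fix), the mechanism is given as nested lists with <code>mechanism[x][y]</code> equal to [math]Pr(Y=y|X=x)[/math], and the name <code>effective_information</code> and both example mechanisms are hypothetical.

```python
from math import log2


def effective_information(mechanism):
    """EI (in bits) of a discrete mechanism given as rows Pr(Y=y | X=x)."""
    n_causes = len(mechanism)
    n_effects = len(mechanism[0])
    # Distribution of the effect under the uniform do-intervention on X.
    effect = [
        sum(row[y] for row in mechanism) / n_causes
        for y in range(n_effects)
    ]
    ei = 0.0
    for row in mechanism:
        for y, p in enumerate(row):
            if p > 0:  # zero-probability terms contribute nothing
                ei += p * log2(p / effect[y]) / n_causes
    return ei


# Hypothetical mechanisms, used only for illustration.
deterministic = [[1.0, 0.0], [0.0, 1.0]]  # each cause fixes a distinct effect
uninformative = [[0.5, 0.5], [0.5, 0.5]]  # the effect ignores the cause
```

Under these assumptions, the deterministic mechanism reaches the two-state maximum of 1 bit, while the mechanism whose rows are identical has an EI of 0, since the effect carries no information about the cause.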
    
==Why Use the Do-Operator?==