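Difference between revisions of "Causal measures"

Revision as of 19:53, 17 November 2024 (Sunday)

Causal measures are methods that use scientific procedures and statistical models to infer causal relationships between variables and to quantify the size of the causal effect between them. Scientists in different fields may have subjective preferences when choosing a causal measure, and judgments about whether a causal relationship exists can likewise be subjective. Under many conditions, however, these measures have very similar mathematical descriptions and share the same basic properties, which can be called "causal primitives". Researchers therefore no longer need to find a single measure of causation on which everyone agrees; they can instead study causal phenomena by focusing on these shared basic properties.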

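Historical background

In An Essay Concerning Human Understanding (1690), John Locke gave the first formal statement of the concepts of cause and effect: that which produces an idea is called the cause, and that which is produced is called the effect. In the 18th century David Hume developed the concept further^[1], arguing that causation is not a relation between facts but a habitual association between experiences. He stressed three criteria for judging causation: spatial contiguity, temporal succession, and constant conjunction. In the 1970s David Lewis extended Hume's definition of causation^[2] and proposed a counterfactual test for it: "if the cause occurs, the effect occurs; if the cause does not occur, the effect does not occur." Patrick Suppes, and later Ellery Eells, defined causation from the standpoint of probability theory^[3]^[4]: a condition for c to be a cause of e is that the probability of e when c is present must be higher than the probability of e when c is absent. At the end of the 20th century Judea Pearl, building on probability theory and counterfactuals, proposed the structural causal model, which also accommodates potential-outcome reasoning, and divided causal reasoning into three levels (association, intervention, and counterfactuals), making causal inference more precise and practical^[5]. In the early 2000s Giulio Tononi and Olaf Sporns proposed the concept of effective information (EI)^[6], which can be used to measure the strength of causal effects in Markov dynamics. Most recently, a 2022 paper by Comolatti and Hoel^[7] summarized the basic properties shared by the various causal measures.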

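Formalizing causation

To summarize the similarities among causal measures, we first need a common formal description of causation. Let [math]\displaystyle{ \Omega }[/math] be a given space, namely the set of all possible occurrences. Within this space a single cause is written [math]\displaystyle{ c }[/math], a single effect [math]\displaystyle{ e }[/math], a set of causes [math]\displaystyle{ C }[/math], and a set of effects [math]\displaystyle{ E }[/math], where [math]\displaystyle{ c }[/math] is assumed to precede [math]\displaystyle{ e }[/math] and [math]\displaystyle{ c\in\Omega,\ e\in\Omega,\ C\subseteq\Omega,\ E\subseteq\Omega }[/math]. To measure causation, the probability of obtaining [math]\displaystyle{ e }[/math] when [math]\displaystyle{ c }[/math] does not occur is written [math]\displaystyle{ P(e\mid C\backslash c) }[/math], where [math]\displaystyle{ P }[/math] denotes probability and [math]\displaystyle{ C\backslash c }[/math] denotes the complement of [math]\displaystyle{ c }[/math] in [math]\displaystyle{ C }[/math]. It is the probability of [math]\displaystyle{ e }[/math] given that any cause in [math]\displaystyle{ C }[/math] other than [math]\displaystyle{ c }[/math] may have produced it, obtained by averaging [math]\displaystyle{ P(e\mid c') }[/math] over the causes [math]\displaystyle{ c'\in C\backslash c }[/math] with the weights [math]\displaystyle{ P(c') }[/math] renormalized to sum to 1. The probability of [math]\displaystyle{ e }[/math] over the whole set of causes, [math]\displaystyle{ P(e\mid C) }[/math], is given by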

[math]\displaystyle{ P(e\mid C)=\sum_{c\in C}P(c)P(e\mid c) }[/math]
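The two probabilities defined above can be computed from a table of transition probabilities. The following is a minimal sketch, not taken from the source: the 3-state matrix `TPM`, the uniform weights `P_CAUSE`, and both function names are hypothetical, and P(e | C\c) is assumed to be the weighted average over the remaining causes with renormalized weights.

```python
def p_effect_given_all(tpm, p_cause, e):
    """P(e | C) = sum over c in C of P(c) * P(e | c)."""
    return sum(p_c * row[e] for p_c, row in zip(p_cause, tpm))


def p_effect_without(tpm, p_cause, e, c):
    """P(e | C\\c): the same average over every cause except c,
    with the remaining weights renormalized to sum to 1 (assumption)."""
    others = [k for k in range(len(tpm)) if k != c]
    mass = sum(p_cause[k] for k in others)
    return sum(p_cause[k] * tpm[k][e] for k in others) / mass


# Hypothetical 3-state system; tpm[c][e] holds P(e | c).
TPM = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]
P_CAUSE = [1 / 3, 1 / 3, 1 / 3]  # assumed uniform weights over causes

print(p_effect_given_all(TPM, P_CAUSE, e=0))     # roughly 1/3
print(p_effect_without(TPM, P_CAUSE, e=0, c=0))  # roughly 0.1
```

With these definitions P(e | C) = P(c) P(e | c) + P(C\c) P(e | C\c), which is the decomposition used in the covariance derivation for the Galton measure.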

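Main causal measures

David Hume's constant conjunction

David Hume defined a cause as "an object, followed by another, and where all the objects similar to the first are followed by objects similar to the second"^[1]. In other words, causation arises from this pattern of regular succession between events^[8]. Overall, the "constant conjunction" of event c being followed by event e leads us to expect e once c is observed, and hence to infer that c is the cause of e. Here we follow Judea Pearl, who interprets Hume's notion of regularity of succession as what we today call correlation between events^[5]. This can be formalized as the observed statistical covariance between a candidate cause c and an effect e: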

[math]\displaystyle{ \operatorname{Cov}(X, Y)=E(X Y)-E(X) E(Y) }[/math]

[math]\displaystyle{ E(X) }[/math] and [math]\displaystyle{ E(Y) }[/math] are the expected values of the random variables [math]\displaystyle{ X }[/math] and [math]\displaystyle{ Y }[/math], i.e. the average value of each variable taken on its own, and [math]\displaystyle{ E(XY) }[/math] is the expected value of their product, i.e. the average of [math]\displaystyle{ XY }[/math] over many trials. If we substitute the indicator functions [math]\displaystyle{ X_{c} }[/math] and [math]\displaystyle{ Y_{e} }[/math] for the variables in the equation above, where [math]\displaystyle{ X_{c} }[/math] (respectively [math]\displaystyle{ Y_{e} }[/math]) takes the value 1 when [math]\displaystyle{ c }[/math] (respectively [math]\displaystyle{ e }[/math]) occurs and 0 otherwise, we obtain a new equation:

[math]\displaystyle{ \begin{aligned} Cov(X_{c},Y_{e})& =P(c,e)-P(c)P(e) \\ &=P(c)P(e\mid c)-P(c)[P(c)P(e\mid c)+P(C\backslash c)P(e\mid C\backslash c)] \\ &=P(e\mid c)P(c)[1-P(c)]-P(c)P(C\backslash c)P(e\mid C\backslash c) \\ &=P(e\mid c)P(c)P(C\backslash c)-P(c)P(C\backslash c)P(e\mid C\backslash c) \\ &=P(c)P(C\backslash c)[P(e\mid c)-P(e\mid C\backslash c)] \end{aligned} }[/math]

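Here we used the fact that [math]\displaystyle{ P(e) }[/math] decomposes into two weighted terms, one for [math]\displaystyle{ c }[/math] and one for [math]\displaystyle{ C\backslash c }[/math], together with [math]\displaystyle{ P(C\backslash c)=1-P(c) }[/math]. Following the nomenclature of others^[9], we call this the "Galton measure" of causal strength, because it closely resembles the formalism for the inheritance of traits in biology and is itself a form of statistical covariance. This gives the formal expression of Hume's constant conjunction: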

[math]\displaystyle{ CS_{Galton}(e,c)=P(c)P(C\backslash c)[P(e\mid c)-P(e\mid C\backslash c)] }[/math]
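As a numerical sanity check, the Galton measure can be compared with the covariance of the two indicator variables computed directly from the joint probability. This is only an illustrative sketch: the function names and the probabilities used in the example are hypothetical.

```python
def galton(p_c, p_e_given_c, p_e_given_not_c):
    """CS_Galton = P(c) * P(C\\c) * [P(e|c) - P(e|C\\c)], with P(C\\c) = 1 - P(c)."""
    return p_c * (1 - p_c) * (p_e_given_c - p_e_given_not_c)


def indicator_covariance(p_c, p_e_given_c, p_e_given_not_c):
    """Cov(X_c, Y_e) = P(c, e) - P(c) * P(e), computed without the closed form."""
    p_joint = p_c * p_e_given_c
    p_e = p_c * p_e_given_c + (1 - p_c) * p_e_given_not_c
    return p_joint - p_c * p_e


# Hypothetical numbers: P(c) = 0.25, P(e|c) = 0.9, P(e|C\c) = 0.2.
print(galton(0.25, 0.9, 0.2))
print(indicator_covariance(0.25, 0.9, 0.2))
```

Both calls should agree up to floating-point rounding, since the covariance of the indicators reduces algebraically to the Galton expression.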

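Eells's measure of causation as probability raising

Ellery Eells proposed^[3] that a condition for [math]\displaystyle{ c }[/math] to be a cause of [math]\displaystyle{ e }[/math] is that the probability of [math]\displaystyle{ e }[/math] when [math]\displaystyle{ c }[/math] is present must be higher than the probability of [math]\displaystyle{ e }[/math] when [math]\displaystyle{ c }[/math] is absent. As a measure of causal strength this can be formalized as the difference between the two quantities: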

[math]\displaystyle{ CS_{Eells}=P(e\mid c)-P(e\mid C\backslash c) }[/math]

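Suppes's measure of causation as probability raising

The philosopher and scientist Patrick Suppes defined causation as an increase in probability^[4]. In our formalism this can be written as: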

[math]\displaystyle{ CS_{Suppes}(c,e)=P(e\mid c)-P(e\mid C) }[/math]

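The difference between the [math]\displaystyle{ CS_{Eells} }[/math] and [math]\displaystyle{ CS_{Suppes} }[/math] measures is a shift from measuring how causally necessary [math]\displaystyle{ c }[/math] is for [math]\displaystyle{ e }[/math] (that is, whether [math]\displaystyle{ e }[/math] can be produced by causes other than [math]\displaystyle{ c }[/math]) to assessing the diversity of ways of producing [math]\displaystyle{ e }[/math] (that is, in how many different ways [math]\displaystyle{ e }[/math] can come about). Both are valid measures, and in some circumstances they are in fact equivalent^[10].

Note that we can extend the conditional probability [math]\displaystyle{ P(e\mid C\backslash c) }[/math] to [math]\displaystyle{ P(e\mid C) }[/math], so that it includes [math]\displaystyle{ c }[/math] itself. In that case we consider not only whether [math]\displaystyle{ e }[/math] can be produced in the absence of [math]\displaystyle{ c }[/math], but all the ways in which [math]\displaystyle{ e }[/math] can arise, including through [math]\displaystyle{ c }[/math] itself. A further, ratio-based version can therefore be defined as: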

[math]\displaystyle{ C S_{\text {Suppes }_{I I}}(c, e)=\frac{P(e \mid c)}{P(e \mid C)} }[/math]
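The three probability-raising measures can be compared side by side in a short sketch. The function names and the example probabilities are hypothetical, and P(e | C) is assumed to be the weighted sum P(c) P(e | c) + P(C\c) P(e | C\c). Under that assumption the Suppes difference is the Eells difference scaled by P(C\c), which the last line checks.

```python
def cs_eells(p_e_c, p_e_not_c):
    """P(e|c) - P(e|C\\c): compares c with every cause other than c."""
    return p_e_c - p_e_not_c


def cs_suppes(p_e_c, p_e_all):
    """P(e|c) - P(e|C): compares c with all causes, c included."""
    return p_e_c - p_e_all


def cs_suppes_ii(p_e_c, p_e_all):
    """Ratio version: P(e|c) / P(e|C)."""
    return p_e_c / p_e_all


# Hypothetical numbers: P(c) = 0.5, P(e|c) = 0.8, P(e|C\c) = 0.2.
p_c, p_e_c, p_e_not_c = 0.5, 0.8, 0.2
p_e_all = p_c * p_e_c + (1 - p_c) * p_e_not_c  # P(e|C) by total probability

print(cs_eells(p_e_c, p_e_not_c))
print(cs_suppes(p_e_c, p_e_all))
print(cs_suppes_ii(p_e_c, p_e_all))
# Suppes equals Eells scaled by P(C\c) = 1 - P(c):
print(abs(cs_suppes(p_e_c, p_e_all) - (1 - p_c) * cs_eells(p_e_c, p_e_not_c)) < 1e-12)
```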

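Cheng's causal attribution

Patricia Cheng proposed a popular psychological model of causal attribution in which the reasoner goes beyond assessing pure covariation between events (whether two events occur or vary together) and estimates the "causal power" of a candidate cause to produce (or prevent) an effect^[11], which measures how strongly [math]\displaystyle{ c }[/math] influences [math]\displaystyle{ e }[/math]. In this model the causal power of [math]\displaystyle{ c }[/math] to produce [math]\displaystyle{ e }[/math] is given by: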

[math]\displaystyle{ CS_{Cheng}(c,e)=\frac{P(e\mid c)-P(e\mid C\backslash c)}{1-P(e\mid C\backslash c)} }[/math]

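Lewis's counterfactual theory of causation

David Lewis gave another substantive and influential account of causation based on counterfactuals^[2]. Lewis's definition is: given that events [math]\displaystyle{ c }[/math] and [math]\displaystyle{ e }[/math] both occurred, [math]\displaystyle{ c }[/math] is a cause of [math]\displaystyle{ e }[/math] if and only if it holds that "had [math]\displaystyle{ c }[/math] not occurred, [math]\displaystyle{ e }[/math] would not have occurred". Lewis also extended his theory to "indeterministic worlds"^[12], in which [math]\displaystyle{ e }[/math] may follow [math]\displaystyle{ c }[/math] only with some probability. In this case [math]\displaystyle{ c }[/math] can still be regarded as a cause of [math]\displaystyle{ e }[/math], but the causal relation is probabilistic rather than deterministic. Following a proposal by Fitelson and Hitchcock for measuring causal strength with probabilities^[13], we formalize Lewis's causal strength as the ratio [math]\displaystyle{ \frac{P(e\mid c)}{P(e\mid C\setminus c)} }[/math]. This quantity is also known as the "relative risk": "it is the risk of experiencing e in the presence of c, relative to the risk of e in the absence of c"^[13]. Applying the mapping [math]\displaystyle{ p/q\to(p-q)/p }[/math] normalizes this ratio into a measure that equals 1 when [math]\displaystyle{ e }[/math] never occurs without [math]\displaystyle{ c }[/math], equals 0 when [math]\displaystyle{ c }[/math] makes no difference to [math]\displaystyle{ e }[/math], and is negative when [math]\displaystyle{ c }[/math] lowers the probability of [math]\displaystyle{ e }[/math]: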

[math]\displaystyle{ CS_{Lewis}(c,e)=\frac{P(e\mid c)-P(e\mid C\backslash c)}{P(e\mid c)} }[/math]
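Cheng's and Lewis's measures share the numerator P(e | c) - P(e | C\c) and differ only in how it is normalized. The sketch below illustrates this with hypothetical function names and hypothetical probabilities; the error handling for zero denominators is a design choice of this example, not something specified in the source.

```python
def cs_cheng(p_e_c, p_e_not_c):
    """(P(e|c) - P(e|C\\c)) / (1 - P(e|C\\c)): the gain relative to the
    room left for improvement when c is absent."""
    if p_e_not_c == 1:
        raise ValueError("undefined when e is certain without c")
    return (p_e_c - p_e_not_c) / (1 - p_e_not_c)


def cs_lewis(p_e_c, p_e_not_c):
    """(P(e|c) - P(e|C\\c)) / P(e|c): the gain relative to P(e|c)."""
    if p_e_c == 0:
        raise ValueError("undefined when e never follows c")
    return (p_e_c - p_e_not_c) / p_e_c


# Hypothetical numbers: P(e|c) = 0.9, P(e|C\c) = 0.3.
print(cs_cheng(0.9, 0.3))  # 0.6 / 0.7
print(cs_lewis(0.9, 0.3))  # 0.6 / 0.9
```

Both measures reach 1 in the limiting cases they emphasize: Cheng's when P(e | c) = 1, Lewis's when P(e | C\c) = 0.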

Judea Pearl's measures of causation

In his work on causation, Judea Pearl derived counterparts of the earlier measures [math]\displaystyle{ CS_{Eells} }[/math], [math]\displaystyle{ CS_{Lewis} }[/math], and [math]\displaystyle{ CS_{Cheng} }[/math]^[14], which he named the probability of necessity and sufficiency (PNS), the probability of necessity (PN), and the probability of sufficiency (PS). Under his assumptions of exogeneity and monotonicity they can be written as:

[math]\displaystyle{ \mathrm{PNS}=P(e\mid c)-P(e\mid C\backslash c) }[/math],

[math]\displaystyle{ \mathrm{PN}=\frac{P(e\mid c)-P(e\mid C\backslash c)}{P(e\mid c)} }[/math],

[math]\displaystyle{ \mathrm{PS}=\frac{P(e\mid c)-P(e\mid C\backslash c)}{1-P(e\mid C\backslash c)} }[/math]

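Closest-possible-world causation

Lewis's counterfactual theory of causation traditionally specifies the counterfactual as the closest possible world in which [math]\displaystyle{ c }[/math] did not occur^[15]. In other words, to find out whether an event c caused another event e, we consider a hypothetical world in which c did not occur and ask whether e would still occur. To formalize this idea we need further structure beyond the transition probabilities alone: the measure requires a notion of distance between possible states (or "worlds"). A simple choice is to label states with binary strings and use the Hamming distance^[16] (the number of bit flips needed to turn one binary string into another), which turns the state space into a metric space. We can then define Lewis's "closest possible world" as the state closest in Hamming distance to the actual one, [math]\displaystyle{ \bar{c}_{CPW}=\arg\min_{c'\neq c}D_H(c,c') }[/math], where [math]\displaystyle{ D_H(c,c') }[/math] is the Hamming distance between [math]\displaystyle{ c }[/math] and [math]\displaystyle{ c' }[/math]. With this in place we can define another measure based on Lewis's account of causation, reasoning from the counterfactual of the closest possible world: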

[math]\displaystyle{ CS_{Lewis CPW}=\frac{P(e\mid c)-P(e\mid\bar{c}_{CPW})}{P(e\mid c)} }[/math]
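The closest-possible-world measure can be sketched for states labelled by bit strings. Everything named here (`hamming`, `closest_worlds`, `cs_lewis_cpw`, the 2-bit `TPM`) is hypothetical. When several states tie for the minimal Hamming distance, this sketch averages P(e | c') uniformly over them, which is an assumption of the example rather than something the text specifies.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def closest_worlds(c, states):
    """All states other than c at minimal Hamming distance from c."""
    others = [s for s in states if s != c]
    d_min = min(hamming(c, s) for s in others)
    return [s for s in others if hamming(c, s) == d_min]


def cs_lewis_cpw(tpm, c, e):
    """(P(e|c) - P(e|c_CPW)) / P(e|c), averaging over tied closest worlds."""
    worlds = closest_worlds(c, list(tpm))
    p_e_cpw = sum(tpm[w].get(e, 0.0) for w in worlds) / len(worlds)
    return (tpm[c][e] - p_e_cpw) / tpm[c][e]


# Hypothetical 2-bit system; TPM[c][e] holds P(e | c).
TPM = {
    "00": {"00": 0.7, "01": 0.1, "10": 0.1, "11": 0.1},
    "01": {"00": 0.2, "01": 0.6, "10": 0.1, "11": 0.1},
    "10": {"00": 0.4, "01": 0.1, "10": 0.4, "11": 0.1},
    "11": {"00": 0.0, "01": 0.0, "10": 0.0, "11": 1.0},
}

print(closest_worlds("00", list(TPM)))  # ['01', '10']
print(cs_lewis_cpw(TPM, "00", "00"))    # (0.7 - 0.3) / 0.7
```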

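Bit-flip measure

Another measure that relies on a notion of distance between states quantifies how much difference a minimal change to the system makes, for example the outcome of a bit flip caused by some local perturbation. Such a measure is given in^[17]: "the average Hamming distance between the perturbed and unperturbed state at time t + 1 when a random bit is flipped at time t". Although it was originally introduced under the assumption of determinism, we extend it here to non-deterministic systems:

[math]\displaystyle{ CS_{bit-flip}(e,c)=\frac{1}{N}\sum_{i=1}^{N}\sum_{e^{\prime}\in E}P(e^{\prime}\mid c_{[i]})D_{H}(e,e^{\prime}) }[/math]

where [math]\displaystyle{ c_{[i]} }[/math] is the state [math]\displaystyle{ c }[/math] with its [math]\displaystyle{ i }[/math]-th bit flipped (for example, if [math]\displaystyle{ c=000 }[/math] then [math]\displaystyle{ c_{[3]}=001 }[/math]), [math]\displaystyle{ N }[/math] is the number of bits, and [math]\displaystyle{ D_H(e,e') }[/math] is the Hamming distance between [math]\displaystyle{ e }[/math] and [math]\displaystyle{ e' }[/math].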
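The bit-flip measure translates almost directly into code. The sketch below is illustrative only: the helper names and the two toy systems (`COPY`, which copies its state forward, and `RESET`, which always moves to "00") are hypothetical, and bit positions are 0-based in the code while the text counts from 1.

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def flip(state, i):
    """Return state with the bit at 0-based position i flipped."""
    flipped = "1" if state[i] == "0" else "0"
    return state[:i] + flipped + state[i + 1:]


def cs_bit_flip(tpm, c, e):
    """Average over single-bit flips of c of the expected Hamming
    distance between e and the effect reached from the flipped state."""
    n_bits = len(c)
    total = 0.0
    for i in range(n_bits):
        outcomes = tpm[flip(c, i)]  # P(e' | c_[i]) for every e'
        total += sum(p * hamming(e, e2) for e2, p in outcomes.items())
    return total / n_bits


STATES = ("00", "01", "10", "11")
COPY = {s: {s: 1.0} for s in STATES}      # hypothetical: next state = current state
RESET = {s: {"00": 1.0} for s in STATES}  # hypothetical: next state is always "00"

print(flip("000", 2))                  # 001, matching c_[3] in the text
print(cs_bit_flip(COPY, "00", "00"))   # 1.0: every flip changes the outcome by one bit
print(cs_bit_flip(RESET, "00", "00"))  # 0.0: flips never change the outcome
```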

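Actual causation and effect information

Recently, a framework was proposed that uses information theory to assess actual causation in dynamical causal networks [34]. In this framework a candidate cause must raise the probability of its effect compared with the probability obtained when the cause is left unspecified (again we see a similarity with the earlier measures). The central quantity is the effect information:

[math]\displaystyle{ ei(c,e)=\log_2\frac{P(e\mid c)}{P(e\mid C)}=\log_2n[det(e,c)-deg(e)] }[/math]

Note that the effect information is simply the logarithm of [math]\displaystyle{ CS_{Suppes II} }[/math], which again shows that later authors rediscovering measures of causation arrive at consistent results. It is also the contribution of an individual transition to the "effectiveness" defined earlier in^[19]. Effect information is thus, on the one hand, a version of the probabilistic Suppes measure expressed in bits and, on the other, the unnormalized difference between determinism and degeneracy.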

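Effective information (EI)

Effective information (EI) was first proposed by Giulio Tononi and Olaf Sporns as a measure of causal interaction that uses random perturbations of the system in order to go beyond statistical dependence^[6]. The concept was later rediscovered without reference to this earlier usage and called "causal specificity"^[18]. Effective information is the expected value of the effect information over all possible cause-effect relationships of the system: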

[math]\displaystyle{ EI=\sum_{e\in E,c\in C}P(e,c)ei(c,e) }[/math]

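As a measure of causation, effective information captures how effectively (deterministically and uniquely) the causes in a system produce their effects, and how selectively causes can be identified from effects^[19]. It assesses the causal power of c to produce e, as measured by the effect information, across all possible transitions between causes and effects, under a maximum-entropy intervention distribution over the causes. Put more simply, it is the unnormalized difference between the determinism and the degeneracy of the system.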
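Effect information and effective information can be sketched for a small transition matrix as follows. The function names and the two 2-state matrices are hypothetical, and the maximum-entropy intervention distribution is taken to be uniform over the causes.

```python
import math


def effect_information(tpm, p_cause, c, e):
    """ei(c, e) = log2( P(e|c) / P(e|C) ), in bits."""
    p_e_all = sum(p_k * row[e] for p_k, row in zip(p_cause, tpm))
    return math.log2(tpm[c][e] / p_e_all)


def effective_information(tpm):
    """EI = sum over (c, e) of P(c) P(e|c) ei(c, e), with P(c) uniform.
    Transitions with P(e|c) = 0 contribute nothing and are skipped."""
    n = len(tpm)
    p_cause = [1 / n] * n
    return sum(
        p_cause[c] * p * effect_information(tpm, p_cause, c, e)
        for c, row in enumerate(tpm)
        for e, p in enumerate(row)
        if p > 0
    )


COPY = [[1.0, 0.0], [0.0, 1.0]]      # hypothetical: each cause has its own effect
COLLAPSE = [[1.0, 0.0], [1.0, 0.0]]  # hypothetical: every cause leads to effect 0

print(effective_information(COPY))      # 1.0 bit
print(effective_information(COLLAPSE))  # 0.0 bits
```

The copy system is fully deterministic and non-degenerate, so it reaches the maximum of log2(2) = 1 bit; the collapsing system is deterministic but fully degenerate, so knowing the cause adds no information about the effect.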

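Formalizing the causal primitives

Causation should not be thought of as a single, simple relation in which a cause brings about an effect. It can be viewed from two distinct angles: sufficiency and necessity. Sufficiency asks whether a cause always brings about a particular effect; necessity asks whether that particular cause is required in order to obtain the effect. These two notions can be regarded as the basic elements of causation and are called causal primitives. In a broader sense, sufficiency and necessity respectively reflect the determinism and the degeneracy of causal relationships.

1. Sufficiency: the degree to which the cause [math]\displaystyle{ c }[/math] is sufficient to produce the effect [math]\displaystyle{ e }[/math]. If [math]\displaystyle{ e }[/math] always follows whenever [math]\displaystyle{ c }[/math] occurs, then [math]\displaystyle{ c }[/math] is a sufficient condition for [math]\displaystyle{ e }[/math]; in other words, the presence of [math]\displaystyle{ c }[/math] is enough to guarantee [math]\displaystyle{ e }[/math]. As a formula: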

[math]\displaystyle{ suff (e, c) = P (e | c) }[/math]

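2. Necessity: the degree to which the cause [math]\displaystyle{ c }[/math] is necessary to produce the effect [math]\displaystyle{ e }[/math]. If [math]\displaystyle{ e }[/math] can be produced only through [math]\displaystyle{ c }[/math], then [math]\displaystyle{ c }[/math] is a necessary condition for [math]\displaystyle{ e }[/math]: without [math]\displaystyle{ c }[/math], [math]\displaystyle{ e }[/math] does not occur. As a formula:

[math]\displaystyle{ nec(e, c) = 1 - P(e\mid C\backslash c) }[/math]

3. Determinism: determinism is defined through [math]\displaystyle{ H(e\mid c) }[/math], the entropy of the probability distribution over effects given a cause. If a cause has a single effect, occurring with probability [math]\displaystyle{ P=1 }[/math], this entropy is zero; if a cause has completely random effects, the entropy is maximal, [math]\displaystyle{ \log_2 n }[/math], where [math]\displaystyle{ n }[/math] is the number of possible effects. The entropy is given by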

[math]\displaystyle{ \begin{aligned}H(e\mid c)=\sum_{e\in E}P(e\mid c)\log_2\frac{1}{P(e\mid c)}\end{aligned} }[/math]

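We therefore define the determinism of the cause [math]\displaystyle{ c }[/math] as [math]\displaystyle{ \log_2 n - H(e\mid c) }[/math]. Normalizing it yields a determinism coefficient [math]\displaystyle{ det }[/math] which, for a given cause, ranges like sufficiency between 0 (completely random) and 1 (completely deterministic):

[math]\displaystyle{ det(c)=1-\frac{H(e\mid c)}{\log_2n} }[/math]

From this formula we can define a determinism coefficient for a single cause-effect transition

[math]\displaystyle{ det(e,c)=1-\frac{\log_2\frac{1}{P(e|c)}}{\log_2n} }[/math]

and a system-level determinism coefficient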

[math]\displaystyle{ det=\sum\limits_{c\in C}P(c) det(c)=\sum\limits_{e\in E, c\in C}P(e,c) det(e,c)=1-\frac{\sum\limits_{c\in C}P(c) H(e\mid c)}{\log_2n} }[/math]

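4. Degeneracy: degeneracy generalizes necessity. If all possible effects are equally probable, so that no effect is more likely than any other, the degeneracy is zero. If certain effects are produced by more causes than others, those effects become more likely and the degeneracy increases. Degeneracy is quantified through the entropy of the probability distribution over effects given the set of causes [math]\displaystyle{ C }[/math]: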

[math]\displaystyle{ \begin{aligned}H(e\mid C)=\sum_{e\in E}P(e\mid C)\log_2\frac{1}{P(e\mid C)}\end{aligned} }[/math]

From this formula we can define a degeneracy coefficient for a single effect

[math]\displaystyle{ deg(e)=1-\frac{\log_2\frac{1}{P(e|C)}}{\log_2n} }[/math]

and a system-level degeneracy coefficient

[math]\displaystyle{ deg=\sum_{e\in E}P(e\mid C) deg(e)=1-\frac{H(e\mid C)}{\log_{2}n} }[/math]
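The system-level coefficients can be checked numerically against effective information, since the definitions above imply EI = log2(n) (det - deg). The sketch below uses a hypothetical 4-state matrix and hypothetical function names, and assumes uniform weights over the causes.

```python
import math


def entropy(dist):
    """Shannon entropy in bits; zero-probability entries are skipped."""
    return sum(p * math.log2(1 / p) for p in dist if p > 0)


def marginal_effects(tpm, p_cause):
    """P(e | C) for every effect e."""
    return [sum(p_c * row[e] for p_c, row in zip(p_cause, tpm))
            for e in range(len(tpm[0]))]


def determinism(tpm, p_cause):
    """det = 1 - sum_c P(c) H(e|c) / log2(n)."""
    avg_h = sum(p_c * entropy(row) for p_c, row in zip(p_cause, tpm))
    return 1 - avg_h / math.log2(len(tpm[0]))


def degeneracy(tpm, p_cause):
    """deg = 1 - H(e|C) / log2(n)."""
    return 1 - entropy(marginal_effects(tpm, p_cause)) / math.log2(len(tpm[0]))


def ei_direct(tpm, p_cause):
    """EI computed directly as sum of P(c) P(e|c) log2(P(e|c) / P(e|C))."""
    p_e_all = marginal_effects(tpm, p_cause)
    return sum(p_c * p * math.log2(p / p_e_all[e])
               for p_c, row in zip(p_cause, tpm)
               for e, p in enumerate(row) if p > 0)


# Hypothetical 4-state system: two noisy causes, two causes sharing one effect.
TPM = [
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
]
P_CAUSE = [0.25] * 4

det = determinism(TPM, P_CAUSE)  # 0.75
deg = degeneracy(TPM, P_CAUSE)   # 0.25
print(det, deg)
print(math.log2(4) * (det - deg), ei_direct(TPM, P_CAUSE))  # both 1.0
```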

Causal primitives in the causal measures

Causal measures and their formal expressions
No.	Name	Formal expression and its relation to the causal primitives	Notes
1	David Hume's constant conjunction	[math]\displaystyle{ CS_{Galton}(e,c)=P(c)P(C\backslash c)[P(e\mid c)-P(e\mid C\backslash c)]=P(c)P(C\backslash c)[suff(e,c)+nec(e,c)-1] }[/math]
2	Eells's measure of causation as probability raising	[math]\displaystyle{ CS_{Eells}=P(e\mid c)-P(e\mid C\backslash c)=suff(e,c)+nec(e,c)-1 }[/math]
3	Suppes's measure of causation as probability raising	[math]\displaystyle{ CS_{Suppes}(c,e)=P(e\mid c)-P(e\mid C)=suff(e,c)+nec^{\dagger}(e)-1 }[/math]	where [math]\displaystyle{ nec^{\dagger}(e)=1-P(e\mid C) }[/math]
4	Cheng's causal attribution	[math]\displaystyle{ CS_{Cheng}(c,e)=\frac{P(e\mid c)-P(e\mid C\backslash c)}{1-P(e\mid C\backslash c)}=\frac{suff(e,c)+nec(e,c)-1}{nec(e,c)} }[/math]
5	Lewis's counterfactual theory of causation	[math]\displaystyle{ CS_{Lewis}(c,e)=\frac{P(e\mid c)-P(e\mid C\backslash c)}{P(e\mid c)}=\frac{suff(e,c)+nec(e,c)-1}{suff(e,c)} }[/math]
6	Judea Pearl's measures of causation	[math]\displaystyle{ \mathrm{PNS}=P(e\mid c)-P(e\mid C\backslash c)=suff(e,c)+nec(e,c)-1 }[/math], [math]\displaystyle{ \mathrm{PN}=\frac{P(e\mid c)-P(e\mid C\backslash c)}{P(e\mid c)}=\frac{suff(e,c)+nec(e,c)-1}{suff(e,c)} }[/math], [math]\displaystyle{ \mathrm{PS}=\frac{P(e\mid c)-P(e\mid C\backslash c)}{1-P(e\mid C\backslash c)}=\frac{suff(e,c)+nec(e,c)-1}{nec(e,c)} }[/math]	PNS (probability of necessity and sufficiency) is equivalent to [math]\displaystyle{ CS_{Eells} }[/math]; PN (probability of necessity) is equivalent to [math]\displaystyle{ CS_{Lewis} }[/math]; PS (probability of sufficiency) is equivalent to [math]\displaystyle{ CS_{Cheng} }[/math]
7	Closest-possible-world causation	[math]\displaystyle{ CS_{Lewis CPW}=\frac{P(e\mid c)-P(e\mid\bar{c}_{CPW})}{P(e\mid c)} }[/math]	where [math]\displaystyle{ \bar{c}_{CPW}=\arg\min_{c'\neq c}D_H(c,c') }[/math] and [math]\displaystyle{ D_H(c,c') }[/math] is the Hamming distance between [math]\displaystyle{ c }[/math] and [math]\displaystyle{ c' }[/math]
8	Bit-flip measure	[math]\displaystyle{ CS_{bit-flip}(e,c)=\frac{1}{N}\sum_{i=1}^{N}\sum_{e^{\prime}\in E}P(e^{\prime}\mid c_{[i]})D_{H}(e,e^{\prime}) }[/math]	where [math]\displaystyle{ c_{[i]} }[/math] is the state [math]\displaystyle{ c }[/math] with its [math]\displaystyle{ i }[/math]-th bit flipped (for example, if [math]\displaystyle{ c=000 }[/math] then [math]\displaystyle{ c_{[3]}=001 }[/math]) and [math]\displaystyle{ D_H(e,e') }[/math] is the Hamming distance between [math]\displaystyle{ e }[/math] and [math]\displaystyle{ e' }[/math]
9	Actual causation and effect information	[math]\displaystyle{ ei(c,e)=\log_2\frac{P(e\mid c)}{P(e\mid C)}=\log_2n[det(e,c)-deg(e)] }[/math]
10	Effective information (EI)	[math]\displaystyle{ EI=\sum_{e\in E,c\in C}P(e,c)ei(c,e)=\log_{2}n[det-deg] }[/math]
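The decompositions in rows 2, 4, 5, and 6 of the table can be verified numerically by computing each measure twice, once from the conditional probabilities and once from the two primitives. This is a sketch with hypothetical function names and hypothetical probabilities; it illustrates the algebra and is not part of the source.

```python
import math


def suff(p_e_c):
    """suff(e, c) = P(e|c)."""
    return p_e_c


def nec(p_e_not_c):
    """nec(e, c) = 1 - P(e|C\\c)."""
    return 1 - p_e_not_c


def from_primitives(s, n):
    """Measures written in terms of sufficiency s and necessity n."""
    core = s + n - 1
    return {"Eells / PNS": core, "Cheng / PS": core / n, "Lewis / PN": core / s}


def from_probabilities(p_e_c, p_e_not_c):
    """The same measures written directly with conditional probabilities."""
    diff = p_e_c - p_e_not_c
    return {
        "Eells / PNS": diff,
        "Cheng / PS": diff / (1 - p_e_not_c),
        "Lewis / PN": diff / p_e_c,
    }


# Hypothetical numbers: P(e|c) = 0.9, P(e|C\c) = 0.3.
a = from_primitives(suff(0.9), nec(0.3))
b = from_probabilities(0.9, 0.3)
for name in a:
    print(name, math.isclose(a[name], b[name]))
```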

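Summary

In every causal measure in the table above, the two primitives (sufficiency and necessity) or their generalized forms (determinism and degeneracy) are explicitly placed in some relationship, usually a difference, a ratio, or a trade-off. The only measure lacking an explicit basis in the causal primitives is the bit-flip measure, although as a measure of sensitivity to perturbation it seems likely that some such basis or relationship exists (no decomposition was sought here). We are not the first to point out that causation has two dimensions. Judea Pearl, for example, wrote: "Clearly, some balance must be made between the necessary and the sufficient components of causal explanation". J. L. Mackie, although he did not propose a quantitative measure of causal strength, took both necessity and sufficiency into account in his INUS condition for causes: a cause is an (i)nsufficient but (n)ecessary part of a condition that is itself (u)nnecessary but (s)ufficient for the occurrence of the effect^[20]. To our knowledge, however, this is the first time a whole set of popular measures has been assessed from this perspective, and we therefore state explicitly: the substantial agreement among measures of causation suggests that measures of causal strength should be expected to be based on these two causal primitives.

An interesting finding of this survey is the similarity and consistency of the causal measures themselves. In summary, causation itself is not a primitive notion but can be decomposed along two dimensions. These dimensions are known in the philosophical literature as sufficiency and necessity; as we have shown, they are specific cases of determinism and degeneracy, respectively. Successful measures of causation are sensitive to both dimensions. Indeed, it is this sensitivity to both dimensions, and to the uncertainty they capture, that makes causal emergence possible for such measures.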

[1] David Hume. An Enquiry concerning Human Understanding. 1748.
[2] David Lewis. Causation. Journal of Philosophy, 70(17):556–567, 1973.
[3] Ellery Eells. Probabilistic Causality. Cambridge University Press, 1991.
[4] Patrick Suppes. A Probabilistic Theory of Causality. Amsterdam: North-Holland Pub. Co., 1968.
[5] Judea Pearl. Causality. Cambridge University Press, Cambridge, 2nd edition, 2009.
[6] Giulio Tononi and Olaf Sporns. Measuring information integration. BMC Neuroscience, page 20, 2003.
[7] R. Comolatti and E. Hoel. Causal emergence is widespread across measures of causation. arXiv:2202.01854 [physics.soc-ph], 2022. https://doi.org/10.48550/arXiv.2202.01854
[8] Phyllis Illari and Federica Russo. Causality: Philosophical Theory meets Scientific Practice. Oxford University Press, Oxford, New York, December 2014.
[9] Branden Fitelson and Christopher Hitchcock. Probabilistic Measures of Causal Strength. Causality in the Sciences, January 2010.
[10] Christopher Hitchcock. Probabilistic Causation. In Edward N. Zalta, editor, The Stanford Encyclopedia of Philosophy. Metaphysics Research Lab, Stanford University, spring 2021 edition, 2018.
[11] Patricia W. Cheng and Laura R. Novick. Causes versus enabling conditions. Cognition, 40(1):83–120, August 1991.
[12] David Lewis. Postscripts to 'Causation'. Philosophical Papers Vol. II, 1986.
[13] Branden Fitelson and Christopher Hitchcock. Probabilistic Measures of Causal Strength. Causality in the Sciences, January 2010.
[14] Judea Pearl. Causality. Cambridge University Press, Cambridge, 2nd edition, 2009.
[15] David Lewis. Causation. Journal of Philosophy, 70(17):556–567, 1973.
[16] Luciano Floridi. Information, possible worlds and the cooptation of scepticism. Synthese, 175:63–88, 2010. Publisher: Springer.
[17] Bryan C. Daniels, Hyunju Kim, Douglas Moore, Siyu Zhou, Harrison B. Smith, Bradley Karas, Stuart A. Kauffman, and Sara I. Walker. Criticality Distinguishes the Ensemble of Biological Regulatory Networks. Physical Review Letters, 121(13):138102, September 2018. Publisher: American Physical Society.
[18] Paul E. Griffiths, Arnaud Pocheville, Brett Calcott, Karola Stotz, Hyunju Kim, and Rob Knight. Measuring Causal Specificity. Philosophy of Science, 82(4):529–555, 2015. Publisher: The University of Chicago Press.
[19] Erik P. Hoel, L. Albantakis, and G. Tononi. Quantifying causal emergence shows that macro can beat micro. Proceedings of the National Academy of Sciences, 110(49):19790–19795, December 2013.
[20] J. L. Mackie. Causes and Conditions. American Philosophical Quarterly, 2(4):245–264, 1965. Publisher: University of Illinois Press.


@@ 第1行： / 第1行： @@
-因果度量是用于识别和量化变量之间因果联系的方法和技术。不同领域的科学家在选择因果度量方法时可能存在主观偏好，对是否存在因果关系的判定存在主观性。但这些因果度量方法在许多条件下表现却非常相似，对相同的基本属性也非常敏感，并数学上存在相似性或一致性，这些相同的基本属性可以称作 “因果基元”。研究者们不再需要找到一个必须达成普遍共识的唯一因果关系衡量标准，而是可以通过关注这些相同的基本属性继续理解其他因果现象。
+因果度量是通过科学方法和统计模型推断变量之间的因果关系，并衡量变量之间[[因果效应]]大小的方法。不同领域的科学家在选择因果度量方法时可能存在主观偏好，对是否存在因果关系的判定存在主观性。但这些因果度量方法在许多条件下在数学描述上表现却非常相似，具有相同的基本属性，这些相同的基本属性可以称作 “因果基元”。研究者们不再需要找到一个必须达成普遍共识的唯一因果关系衡量标准，而是可以通过关注这些相同的基本属性继续理解其他因果现象。
 ==历史渊源==
-John Locke在他1690年发表的著作《人类理解论》中首次正式提出了因和果的概念：把产生观念的事物叫做原因，把所产生的东西叫做结果。在18世纪David Hume进一步发展了这个概念，提出因果不是事实之间的概念，而是经验之间的习惯性联想。他强调判断因果关系的三条准则：空间邻近性、时间连续性、恒常连结性。20世纪70年代David Lewis推广了David Hume对因果关系的定义，提出了判断因果关系的反事实推理法：“如果原因发生了，结果就会发生；如果原因不发生，结果就不会发生。”和这差不多的时间Ellery Eells和Patrick Suppes等人从概率论的角度给出了因果关系的定义，原因c成为结果e的原因的一个条件是，在c存在的情况下e的概率必须高于在c不存在的情况下e的概率。20世纪末Judea Pearl基于概率论和反事实的概念提出了结构因果模型和潜在结果模型，将因果关系划分为关联、干预、反事实三个层级，使得因果推理更加精确和实用。进入21世纪初Giulio Tononi 和 Olaf Sporns 提出有效信息 (EI)的概念，它可以用来衡量一个马尔科夫动力学的因果效应强度。最近的2022年Erik hoel发表的一篇论文中总结了各类因果度量方法中存在的相同基本属性，发现在大多数因果度量方法中都存在因果涌现。
+John Locke在他1690年发表的著作《人类理解论》中首次正式提出了因和果的概念：把产生观念的事物叫做原因，把所产生的东西叫做结果。在18世纪David Hume进一步发展了这个概念<ref name=":1" />，提出因果不是事实之间的概念，而是经验之间的习惯性联想。他强调判断因果关系的三条准则：空间邻近性、时间连续性、恒常连结性。20世纪70年代David Lewis推广了David Hume对因果关系的定义<ref name=":2" />，提出了判断因果关系的反事实推理法：“如果原因发生了，结果就会发生；如果原因不发生，结果就不会发生。”和这差不多的时间Ellery Eells和Patrick Suppes等人从概率论<ref name=":3" /><ref name=":4" />的角度给出了因果关系的定义，原因c成为结果e的原因的一个条件是，在c存在的情况下e的概率必须高于在c不存在的情况下e的概率。20世纪末Judea Pearl基于概率论和反事实的概念提出了结构因果模型和潜在结果模型，将因果关系划分为关联、干预、反事实三个层级，使得因果推理更加精确和实用<ref name=":5" />。进入21世纪初Giulio Tononi 和 Olaf Sporns 提出有效信息 (EI)的概念<ref name=":6" />，它可以用来衡量一个马尔科夫动力学的因果效应强度。最近的2022年Erik hoel发表的一篇论文<ref>Comolatti, R., & Hoel, E. (2022). Causal emergence is widespread across measures of causation. ''arXiv:2202.01854 [physics.soc-ph]''. <nowiki>https://doi.org/10.48550/arXiv.2202.01854</nowiki></ref>中总结了各类因果度量方法中存在的相同基本属性。
 ==因果关系的形式化==
-在一个给定的空间<math>Ω</math>，即所有可能发生的情况的集合，在这个空间中，事件的单个原因记作<math>c</math>，单个结果记作<math>e</math>，，一组原因记作<math>C</math> ，一组结果记作<math>E</math>，其中假定<math>c</math>在<math>e</math>之前，并满足<math>c∈Ω 、 e∈Ω 、C ⊆ Ω 、 E ⊆ Ω</math> 。为了衡量因果关系，把没有发生<math>c</math>的情况下获得<math>e</math>的概率写成<math>P (e|C\c)</math>，其中<math>P</math>代表概率，<math>C\c</math>代表<math>c</math>的补集，指的是在<math>C</math>中的任何原因都可能产生<math>e</math>的情况下，除了<math>c</math>之外，<math>e</math>的概率，用公式表示为
+为了归纳各个因果度量方法之间的相似性，需要用一套形式化的方法描述它们，所以我们需要先给出因果关系的形式化的方法：在一个给定的空间<math>Ω</math>，即所有可能发生的情况的集合，在这个空间中，事件的单个原因记作<math>c</math>，单个结果记作<math>e</math>，，一组原因记作<math>C</math> ，一组结果记作<math>E</math>，其中假定<math>c</math>在<math>e</math>之前，并满足<math>c∈Ω 、 e∈Ω 、C ⊆ Ω 、 E ⊆ Ω</math> 。为了衡量因果关系，把没有发生<math>c</math>的情况下获得<math>e</math>的概率写成<math>P (e|C\c)</math>，其中<math>P</math>代表概率，<math>C\c</math>代表<math>c</math>的补集，指的是在<math>C</math>中的任何原因都可能产生<math>e</math>的情况下，除了<math>c</math>之外，<math>e</math>的概率，用公式表示为
 <math>P(e\mid C)=\sum_{c\in C}P(c)P(e\mid c)</math>
@@ 第10行： / 第10行： @@
 === David Hume的恒常连结 ===
-David Hume将因果定义为“一个对象，后面跟着另一个对象，并且所有与第一个对象相似的对象后面跟着与第二个对象相似的对象”<ref>David Hume. ''An Enquiry concerning Human Understanding''. 1748.</ref>。换句话说，因果关系源于事件之间的这种连续规律性模式<ref>Phyllis Illari and Federica Russo. ''Causality: Philosophical Theory meets Scientific Practice''. Oxford University
+David Hume将因果定义为“一个对象，后面跟着另一个对象，并且所有与第一个对象相似的对象后面跟着与第二个对象相似的对象”<ref name=":1">David Hume. ''An Enquiry concerning Human Understanding''. 1748.</ref>。换句话说，因果关系源于事件之间的这种连续规律性模式<ref>Phyllis Illari and Federica Russo. ''Causality: Philosophical Theory meets Scientific Practice''. Oxford University
-Press, Oxford, New York, December 2014.</ref>。 总体而言，事件 c 后面跟着事件 e 的“恒常连结”会让我们预期一旦观察到 c，就会发生 e，因此推断 c 是 e 的原因。在这里，我们遵循 Judea Pearl 的观点，他将David Hume的连续规律性概念解释为我们今天所说的事件之间的相关性<ref>Judea Pearl. ''Causality.'' Cambridge University Press, Cambridge, 2 edition, 2009.</ref>。这可以形式化为候选原因 c 和结果 e 之间观察到的统计协方差：
+Press, Oxford, New York, December 2014.</ref>。 总体而言，事件 c 后面跟着事件 e 的“恒常连结”会让我们预期一旦观察到 c，就会发生 e，因此推断 c 是 e 的原因。在这里，我们遵循 Judea Pearl 的观点，他将David Hume的连续规律性概念解释为我们今天所说的事件之间的相关性<ref name=":5">Judea Pearl. ''Causality.'' Cambridge University Press, Cambridge, 2 edition, 2009.</ref>。这可以形式化为候选原因 c 和结果 e 之间观察到的统计协方差：
 <math>\operatorname{Cov}(X, Y)=E(X Y)-E(X) E(Y)</math>
@@ 第29行： / 第29行： @@
 === Eells 的因果关系度量是概率提升 ===
-Ellery Eells提出<ref>Ellery Eells. ''Probabilistic Causality''. Cambridge University Press, 1991.</ref>，<math>c</math>成为<math>e</math>的原因的一个条件是，<math>c</math>存在时<math>e</math>发生的概率必须高于其不存在时<math>e</math>发生的概率，这可以用因果强度的度量形式化为两个量之间的差：
+Ellery Eells提出<ref name=":3">Ellery Eells. ''Probabilistic Causality''. Cambridge University Press, 1991.</ref>，<math>c</math>成为<math>e</math>的原因的一个条件是，<math>c</math>存在时<math>e</math>发生的概率必须高于其不存在时<math>e</math>发生的概率，这可以用因果强度的度量形式化为两个量之间的差：
 <math>CS_{Eells}=P(e\mid c)-P(e\mid C\backslash c)</math>
 === Suppes将因果关系度量为概率提升 ===
-哲学家和科学家Patrick Suppes将因果关系定义为概率增加<ref>Patrick Suppes. ''A Probabilistic Theory of Causality''. Amsterdam: North-Holland Pub. Co., 1968.</ref>。用我们的形式化方法可以表示为：
+哲学家和科学家Patrick Suppes将因果关系定义为概率增加<ref name=":4">Patrick Suppes. ''A Probabilistic Theory of Causality''. Amsterdam: North-Holland Pub. Co., 1968.</ref>。用我们的形式化方法可以表示为：
 <math>CS_{Suppes}(c,e)=P(e\mid c)-P(e\mid C)</math>
@@ 第50行： / 第50行： @@
 === Lewis的反事实因果理论 ===
-David Lewis基于反事实（counterfactuals）对因果关系进行了另一种实质性的、有影响力的解释<ref>David Lewis. Causation. ''Journal of Philosophy'', 70(17):556–567, 1973.</ref>。Lewis给因果关系下的定义是：如果给定事件的<math>c</math>和<math>e</math>都发生了，当且仅当“<math>c</math>没有发生，那么<math>e</math>就不会发生”这一情况成立时，<math>c</math>才是<math>e</math>的原因。刘易斯还把他的理论扩展到了 “不确定的世界”，在这种世界里<ref>David Lewis. Postscripts to ’Causation’. ''Philosophical Papers Vol. Ii'', 1986.</ref>，<math>e</math>可能只是以一定的概率跟随<math>c</math>发生。在这种情况下，<math>c</math>仍然可以被视为<math>e</math>的原因，但这种因果关系是概率性的，而不是确定性的。按照Fitelson和Hitchcock提出的一种使用概率来度量因果强度的方法<ref name=":0">Branden Fitelson and Christopher Hitchcock. Probabilistic Measures of Causal Strength. ''Causality in the Sciences'',January 2010.</ref>，，我们将Lewis的因果强度正式表述为比率：<math>\frac{P(e\mid c)}{P(e\mid C\setminus c)}</math>。这个定义也被称为 “相对风险”：“它是指有 c 时发生 e 的风险与没有 c 时发生 e 的风险的比较”<ref name=":0" />。利用<math>p/q\to(p-q)/p</math>映射，可以对这一指标进行归一化处理，得到一个在-1到1范围内的度量：
+David Lewis基于反事实（counterfactuals）对因果关系进行了另一种实质性的、有影响力的解释<ref name=":2">David Lewis. Causation. ''Journal of Philosophy'', 70(17):556–567, 1973.</ref>。Lewis给因果关系下的定义是：如果给定事件的<math>c</math>和<math>e</math>都发生了，当且仅当“<math>c</math>没有发生，那么<math>e</math>就不会发生”这一情况成立时，<math>c</math>才是<math>e</math>的原因。刘易斯还把他的理论扩展到了 “不确定的世界”，在这种世界里<ref>David Lewis. Postscripts to ’Causation’. ''Philosophical Papers Vol. Ii'', 1986.</ref>，<math>e</math>可能只是以一定的概率跟随<math>c</math>发生。在这种情况下，<math>c</math>仍然可以被视为<math>e</math>的原因，但这种因果关系是概率性的，而不是确定性的。按照Fitelson和Hitchcock提出的一种使用概率来度量因果强度的方法<ref name=":0">Branden Fitelson and Christopher Hitchcock. Probabilistic Measures of Causal Strength. ''Causality in the Sciences'',January 2010.</ref>，，我们将Lewis的因果强度正式表述为比率：<math>\frac{P(e\mid c)}{P(e\mid C\setminus c)}</math>。这个定义也被称为 “相对风险”：“它是指有 c 时发生 e 的风险与没有 c 时发生 e 的风险的比较”<ref name=":0" />。利用<math>p/q\to(p-q)/p</math>映射，可以对这一指标进行归一化处理，得到一个在-1到1范围内的度量：
 <math>CS_{Lewis}(c,e)=\frac{P(e\mid c)-P(e\mid C\backslash c)}{P(e\mid c)}</math>
@@ 第83行： / 第83行： @@
 === 有效信息（EI） ===
-有效信息（EI）最早由 Giulio Tononi 和 Olaf Sporns 提出，作为因果相互作用的一种度量，其中使用了系统的随机扰动，以超越统计依赖性<ref>Giulio Tononi and Olaf Sporns. Measuring information integration. ''BMC Neuroscience'', page 20, 2003.</ref>。人们在没有参考先前用法的情况下重新发现了这一概念，并将其称为 “因果特异性”<ref>Paul E. Griffiths, Arnaud Pocheville, Brett Calcott, Karola Stotz, Hyunju Kim, and Rob Knight. Measuring Causal Specificity. ''Philosophy of Science'', 82(4):529–555, 2015. Publisher: The University of Chicago Press.</ref>。有效信息是系统所有可能因果关系中效应信息的期望值：
+有效信息（EI）最早由 Giulio Tononi 和 Olaf Sporns 提出，作为因果相互作用的一种度量，其中使用了系统的随机扰动，以超越统计依赖性<ref name=":6">Giulio Tononi and Olaf Sporns. Measuring information integration. ''BMC Neuroscience'', page 20, 2003.</ref>。人们在没有参考先前用法的情况下重新发现了这一概念，并将其称为 “因果特异性”<ref>Paul E. Griffiths, Arnaud Pocheville, Brett Calcott, Karola Stotz, Hyunju Kim, and Rob Knight. Measuring Causal Specificity. ''Philosophy of Science'', 82(4):529–555, 2015. Publisher: The University of Chicago Press.</ref>。有效信息是系统所有可能因果关系中效应信息的期望值：
 <math>EI=\sum_{e\in E,c\in C}P(e,c)ei(c,e)</math>