Causality: Models, Reasoning, and Inference

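[Maintainer] 周浩杰. Questions, discussion, and suggestions are welcome.

[Note] These notes are based on the English second edition; no Chinese edition of the book was available, so the section titles were translated by the note author.

[Remark] The material is hard. Some parts I did not understand deeply, so the summaries are rough. Treat this as a first draft for more experienced readers to iterate on.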

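About the book

This book is by Judea Pearl, one of the best-known scholars in causal science. It discusses contemporary methods of causal analysis in depth, turns causality from a vague notion into a quantifiable theory, and applies broadly to mathematical statistics, artificial intelligence, economics, cognitive science, and other fields.

Basic information

Title: Causality: Models, Reasoning, and Inference, 2nd edition
Author: Judea Pearl
Publisher: Cambridge University Press
Year: 2009
Companion website: provides exercises, errata, discussion of readers' questions, and other resources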

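Contents and summaries

1 Introduction to Probabilities, Graphs, and Causal Models

1.1 Introduction to Probability Theory

1.1.1 Why Probabilities?

Causal claims are uncertain. In "careless driving causes accidents", for example, the cause makes the effect more likely but does not guarantee it.
Compared with categorical logical assertions, probabilistic statements are easier to work with; a categorical assertion would otherwise have to account for the many exceptions under which it fails.

1.1.2 Basic Concepts in Probability Theory

Introduces the basics of probability theory for discrete variables, focusing mainly on Bayesian inference.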
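
As a small illustration of the Bayesian inference this section focuses on, the sketch below updates belief in a binary hypothesis H after observing evidence e. The function name `posterior` and all numbers are assumptions made up for this note; they do not come from the book.

```python
def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Return P(H | e) for a binary hypothesis H using Bayes' rule.

    All three arguments are illustrative inputs, not values from the book.
    """
    # Total probability of the evidence: P(e) = P(e|H)P(H) + P(e|not H)P(not H)
    p_e = p_e_given_h * prior_h + p_e_given_not_h * (1.0 - prior_h)
    # Bayes' rule: P(H|e) = P(e|H) P(H) / P(e)
    return p_e_given_h * prior_h / p_e


# Hypothetical numbers: a rare hypothesis (1%) and fairly reliable evidence.
p = posterior(prior_h=0.01, p_e_given_h=0.9, p_e_given_not_h=0.05)
print(round(p, 3))  # 0.154
```

Even with evidence that is 18 times more likely under H than under its negation, the posterior stays modest because the prior is small.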

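1.1.3 Combining Predictive and Diagnostic Supports

1.1.4 Random Variables and Expectations

Introduces the notation for random variables and, for a single discrete variable, the expectation, conditional expectation, expectation of a function, and variance.
For two variables it covers the expectation, the correlation coefficient, and the conditional correlation coefficient; for a single continuous variable it covers the probability density function.

1.1.5 Conditional Independence and Graphoids

Introduces the definition of conditional independence and five of its properties: symmetry, decomposition, weak union, contraction, and intersection.
These properties are called the graphoid axioms, and the book gives an intuitive interpretation of each.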
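
For reference, the five properties can be written out as follows. This is a sketch in one common notation, where (X ⊥ Y | Z) means "X is independent of Y given Z" and YW denotes the union of Y and W; the book's own typography may differ.

```latex
\begin{align*}
\text{Symmetry:}      &\quad (X \perp\!\!\!\perp Y \mid Z) \Rightarrow (Y \perp\!\!\!\perp X \mid Z) \\
\text{Decomposition:} &\quad (X \perp\!\!\!\perp YW \mid Z) \Rightarrow (X \perp\!\!\!\perp Y \mid Z) \\
\text{Weak union:}    &\quad (X \perp\!\!\!\perp YW \mid Z) \Rightarrow (X \perp\!\!\!\perp Y \mid ZW) \\
\text{Contraction:}   &\quad (X \perp\!\!\!\perp Y \mid Z) \;\&\; (X \perp\!\!\!\perp W \mid ZY) \Rightarrow (X \perp\!\!\!\perp YW \mid Z) \\
\text{Intersection:}  &\quad (X \perp\!\!\!\perp W \mid ZY) \;\&\; (X \perp\!\!\!\perp Y \mid ZW) \Rightarrow (X \perp\!\!\!\perp YW \mid Z)
\end{align*}
```

The first four hold for every probability distribution; intersection is guaranteed only for strictly positive distributions.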

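1.2 Graphs and Probabilities

1.2.1 Graphical Notation and Terminology

Nodes, edges, adjacency, paths, DAGs, roots, trees, chains, and complete graphs.

1.2.2 Bayesian Networks

Introduces the definition of Markovian parents, which reduces the amount of information needed to specify a Bayesian model, and the definition of Markov compatibility.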
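
The saving in input information can be seen by comparing the chain rule with the parent-based factorization. The lines below are a sketch under the usual assumptions: the variables are ordered X_1, ..., X_n, and PA_j is a minimal set of predecessors of X_j that renders X_j independent of its other predecessors.

```latex
\begin{align*}
\text{Chain rule:}          &\quad P(x_1, \ldots, x_n) = \prod_{j} P(x_j \mid x_1, \ldots, x_{j-1}) \\
\text{Markovian parents:}   &\quad P(x_j \mid pa_j) = P(x_j \mid x_1, \ldots, x_{j-1}) \\
\text{Markov compatibility:} &\quad P(x_1, \ldots, x_n) = \prod_{i} P(x_i \mid pa_i)
\end{align*}
```

Each factor now conditions only on the parents of a variable rather than on all of its predecessors, so far fewer conditional probabilities have to be specified.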

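1.2.3 The d-Separation Criterion

Covers the definition of d-separation, together with the theorems on the probabilistic implications of d-separation, the ordered Markov condition, the parental Markov condition, and observational equivalence.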
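
A d-separation query can be checked mechanically. The sketch below does not walk paths as the book's definition does; it uses the equivalent moralization test instead (restrict the DAG to the ancestors of the queried nodes, connect parents that share a child, drop edge directions, then test ordinary graph separation). The function names and the five-node graph, modeled on the familiar season/rain/sprinkler example, are assumptions made for this note, and the three node sets are assumed to be disjoint.

```python
from collections import deque


def ancestors(parents, nodes):
    """Return the given nodes together with all of their ancestors."""
    seen = set(nodes)
    stack = list(nodes)
    while stack:
        node = stack.pop()
        for p in parents.get(node, ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(parents, xs, ys, zs):
    """True if zs d-separates xs from ys in the DAG given as child -> parents."""
    xs, ys, zs = set(xs), set(ys), set(zs)
    keep = ancestors(parents, xs | ys | zs)
    # Build the moral graph of the ancestral subgraph (undirected adjacency).
    adj = {v: set() for v in keep}
    for child in keep:
        pas = list(parents.get(child, ()))
        for p in pas:  # parent-child edges, direction dropped
            adj[child].add(p)
            adj[p].add(child)
        for i, p in enumerate(pas):  # "marry" parents of a common child
            for q in pas[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)
    # Breadth-first search from xs that never enters a node in zs.
    frontier = deque(xs)
    visited = set(xs)
    while frontier:
        node = frontier.popleft()
        if node in ys:
            return False
        for nb in adj[node]:
            if nb not in visited and nb not in zs:
                visited.add(nb)
                frontier.append(nb)
    return True


# Hypothetical example graph: season -> rain, season -> sprinkler,
# rain -> wet, sprinkler -> wet, wet -> slippery.
sprinkler_parents = {
    "rain": ["season"],
    "sprinkler": ["season"],
    "wet": ["rain", "sprinkler"],
    "slippery": ["wet"],
}
print(d_separated(sprinkler_parents, {"rain"}, {"sprinkler"}, {"season"}))  # True
print(d_separated(sprinkler_parents, {"rain"}, {"sprinkler"}, {"season", "slippery"}))  # False
```

The second query fails because conditioning on a descendant of the collider `wet` opens the path between its two parents.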

1.2.4 Inference with Bayesian Networks

1.3 Causal Bayesian Networks

1.3.1 Causal Networks as Oracles for Interventions

The definition of a causal Bayesian network and two of its properties.
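
As a reminder of what these statements look like, here is a sketch in the usual notation, where P_x(v) is the distribution that results from the intervention do(X = x) and pa_i is a value of the parents of V_i. It is written from memory of the standard formulation, so check the exact wording against the book.

```latex
\begin{align*}
\text{Truncated factorization:} &\quad P_x(v) = \prod_{i \,:\, V_i \notin X} P(v_i \mid pa_i)
    \quad \text{for all } v \text{ consistent with } x \\
\text{Property 1:} &\quad P(v_i \mid pa_i) = P_{pa_i}(v_i) \\
\text{Property 2:} &\quad P_{pa_i,\, s}(v_i) = P_{pa_i}(v_i)
    \quad \text{for every } S \text{ disjoint from } \{V_i\} \cup PA_i
\end{align*}
```

Property 1 says that conditioning on the parents coincides with setting them by intervention; Property 2 says that once the parents are fixed, intervening on any other variables does not change V_i.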

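1.3.2 Causal Relationships and Their Stability

Explains why causal relationships are more stable than probabilistic ones, and why causal relationships matter.

1.4 Functional Causal Models

1.4.1 Structural Equations

1.4.2 Probabilistic Predictions in Causal Models

Introduces the causal Markov condition, which links causality and probability through the parental Markov condition.

1.4.3 Interventions and Causal Effects in Functional Models

Explains why interventions are represented more flexibly and generally in functional models than in stochastic models.

1.4.4 Counterfactuals in Functional Models

Stresses that counterfactual questions are hard to answer, explains the relationship between counterfactuals and structural equations, and gives the real reason why stochastic causal models are insufficient for computing probabilities of counterfactuals.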
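
The link between counterfactuals and structural equations can be made concrete with the three-step recipe of abduction, action, and prediction. The sketch below applies it to a deliberately tiny linear model Y = beta * X + U; the model, the function name `counterfactual_y`, and the numbers are assumptions chosen for this note.

```python
def counterfactual_y(x_obs, y_obs, x_new, beta):
    """Value Y would have taken had X been x_new, in the model Y = beta * X + U.

    The linear model and its coefficient are illustrative assumptions.
    """
    # Step 1 (abduction): use the evidence to recover this unit's disturbance U.
    u = y_obs - beta * x_obs
    # Step 2 (action): replace the mechanism for X by the constant x_new.
    x = x_new
    # Step 3 (prediction): recompute Y with the modified X and the same U.
    return beta * x + u


# Observed X = 1 and Y = 3.5 with beta = 2, so U = 1.5 for this unit.
print(counterfactual_y(x_obs=1.0, y_obs=3.5, x_new=0.0, beta=2.0))  # 1.5
```

The disturbance U carries the unit-specific information from the observed world into the hypothetical one. A purely stochastic model specifies only conditional probabilities and has no such term to hold fixed, which is why it cannot answer the query.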

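1.5 Causal versus Statistical Terminology

Introduces probabilistic parameters, statistical parameters, causal parameters, statistical assumptions, and causal assumptions.
Contrasts the terminology of statistics with that of causal science.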

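2 A Theory of Inferred Causation

2.1 Introduction – The Basic Intuitions

2.2 The Causal Discovery Framework

Definitions of causal model and causal structure.

2.3 Model Preference (Occam's Razor)

Because many models are consistent with the observations, the minimality principle is used to choose among them.
Introduces the definitions of inferred causation (preliminary), latent structure, structure preference, minimality, consistency, and inferred causation.

2.4 Stable Distributions

Explains why the notion of stability is needed: the minimality principle guarantees neither that the structure of the actual data-generating model is minimal nor that a search through the space of minimal structures is computationally practical. [I am not entirely sure what "minimal" means here. The summary paraphrases this sentence from the book: "Although the minimality principle is sufficient for forming a normative theory of inferred causation, it does not guarantee that the structure of the actual data-generating model would be minimal, or that the search through the vast space of minimal structures would be computationally practical".]
Introduces the definition of stability and explains its relationship to minimality.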

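2.5 Recovering DAG Structures

The IC algorithm: takes a stable probability distribution as input and outputs a pattern representing the class of equivalent DAG structures.

2.6 Recovering Latent Structures

The definition of a projection; Verma's theorem that every latent structure has at least one projection; and the IC* algorithm, which distinguishes several types of edges.

2.7 Local Criteria for Inferring Causal Relations

Definitions of potential cause, genuine cause, spurious association, genuine causation with temporal information, and spurious association with temporal information.

2.8 Nontemporal Causation and Statistical Time

The definition of statistical time and the temporal bias conjecture.

2.9 Conclusions

2.9.1 On Minimality, Markov, and Stability

Defends some of the assumptions behind minimality and stability.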

3 Causal Diagrams and the Identification of Causal Effects

3.1 Introduction

3.2 Intervention in Markovian Models

3.2.1 Graphs as Models of Interventions

3.2.2 Interventions as Variables

3.2.3 Computing the Effect of Interventions

3.2.4 Identification of Causal Quantities

3.3 Controlling Confounding Bias

3.3.1 The Back-Door Criterion

3.3.2 The Front-Door Criterion

3.3.3 Example: Smoking and the Genotype Theory

3.4 A Calculus of Intervention

3.4.1 Preliminary Notation

3.4.2 Inference Rules

3.4.3 Symbolic Derivation of Causal Effects: An Example

3.4.4 Causal Inference by Surrogate Experiments

For reasons such as cost or ethics, it is sometimes impossible to run an experiment that controls a given variable, so a surrogate variable is controlled instead.
Introduces how to compute causal effects using such a surrogate variable.

3.5 Graphical Tests of Identifiability

3.5.1 Identifying Models

3.5.2 Nonidentifying Models

3.6 Discussion

3.6.1 Qualifications and Extensions

3.6.2 Diagrams as a Mathematical Language

3.6.3 Translation from Graphs to Potential Outcomes

3.6.4 Relations to Robins's G-Estimation

4 Actions, Plans, and Direct Effects

4.1 Introduction

4.1.1 Actions, Acts, and Probabilities

4.1.2 Actions in Decision Analysis

4.1.3 Actions and Counterfactuals

4.2 Conditional Actions and Stochastic Policies

4.3 When Is the Effect of an Action Identifiable?

4.3.1 Graphical Conditions for Identification

4.3.2 Remarks on Efficiency

4.3.3 Deriving a Closed-Form Expression for Control Queries

4.3.4 Summary

4.4 The Identification of Dynamic Plans

4.4.1 Motivation

4.4.2 Plan Identification: Notation and Assumptions

4.4.3 Plan Identification: The Sequential Back-Door Criterion

4.4.4 Plan Identification: A Procedure

4.5 Direct and Indirect Effects

4.5.1 Direct versus Total Effects

4.5.2 Direct Effects, Definition, and Identification

4.5.3 Example: Sex Discrimination in College Admission

4.5.4 Natural Direct Effects

4.5.5 Indirect Effects and the Mediation Formula

5 Causality and Structural Models in Social Science and Economics

5.1 Introduction

5.1.1 Causality in Search of a Language

5.1.2 SEM: How Its Meaning Became Obscured

5.1.3 Graphs as a Mathematical Language

5.2 Graphs and Model Testing

5.2.1 The Testable Implications of Structural Models

5.2.2 Testing the Testable

5.2.3 Model Equivalence

5.3 Graphs and Identifiability

5.3.1 Parameter Identification in Linear Models

5.3.2 Comparison to Nonparametric Identification

5.3.3 Causal Effects: The Interventional Interpretation of Structural Equation Models

5.4 Some Conceptual Underpinnings

5.4.1 What Do Structural Parameters Really Mean?

5.4.2 Interpretation of Effect Decomposition

5.4.3 Exogeneity, Superexogeneity, and Other Frills

5.5 Conclusion

5.6 Postscript for the Second Edition

5.6.1 An Econometric Awakening?

5.6.2 Identification in Linear Models

5.6.3 Robustness of Causal Claims

6 Simpson's Paradox, Confounding, and Collapsibility

6.1 Simpson's Paradox: An Anatomy

6.1.1 A Tale of a Non-Paradox

6.1.2 A Tale of Statistical Agony

6.1.3 Causality versus Exchangeability

6.1.4 A Paradox Resolved (Or: What Kind of Machine Is Man?)

6.2 Why There Is No Statistical Test for Confounding, Why Many Think There Is, and Why They Are Almost Right

6.2.1 Introduction

6.2.2 Causal and Associational Definitions

6.3 How the Associational Criterion Fails

6.3.1 Failing Sufficiency via Marginality

6.3.2 Failing Sufficiency via Closed-World Assumptions

6.3.3 Failing Necessity via Barren Proxies

6.3.4 Failing Necessity via Incidental Cancellations

6.4 Stable versus Incidental Unbiasedness

6.4.1 Motivation

6.4.2 Formal Definitions

6.4.3 Operational Test for Stable No-Confounding

6.5 Confounding, Collapsibility, and Exchangeability

6.5.1 Confounding and Collapsibility

6.5.2 Confounding versus Confounders

6.5.3 Exchangeability versus Structural Analysis of Confounding

6.6 Conclusions

7 The Logic of Structure-Based Counterfactuals

7.1 Structural Model Semantics

7.1.1 Definitions: Causal Models, Actions, and Counterfactuals

7.1.2 Evaluating Counterfactuals: Deterministic Analysis

7.1.3 Evaluating Counterfactuals: Probabilistic Analysis

7.1.4 The Twin Network Method

7.2 Applications and Interpretation of Structural Models

7.2.1 Policy Analysis in Linear Econometric Models: An Example

7.2.2 The Empirical Content of Counterfactuals

7.2.3 Causal Explanations, Utterances, and Their Interpretation

7.2.4 From Mechanisms to Actions to Causation

7.2.5 Simon's Causal Ordering

7.3 Axiomatic Characterization

7.3.1 The Axioms of Structural Counterfactuals

7.3.2 Causal Effects from Counterfactual Logic: An Example

7.3.3 Axioms of Causal Relevance

7.4 Structural and Similarity-Based Counterfactuals

7.4.1 Relations to Lewis's Counterfactuals

7.4.2 Axiomatic Comparison

7.4.3 Imaging versus Conditioning

7.4.4 Relations to the Neyman–Rubin Framework

7.4.5 Exogeneity and Instruments: Counterfactual and Graphical Definitions

7.5 Structural versus Probabilistic Causality

7.5.1 The Reliance on Temporal Ordering

7.5.2 The Perils of Circularity

7.5.3 Challenging the Closed-World Assumption, with Children

7.5.4 Singular versus General Causes

7.5.5 Summary

8 Imperfect Experiments: Bounding Effects and Counterfactuals

8.1 Introduction

8.1.1 Imperfect and Indirect Experiments

8.1.2 Noncompliance and Intent to Treat

8.2 Bounding Causal Effects with Instrumental Variables

8.2.1 Problem Formulation: Constrained Optimization

8.2.2 Canonical Partitions: The Evolution of Finite-Response Variables

8.2.3 Linear Programming Formulation

8.2.4 The Natural Bounds

8.2.5 Effect of Treatment on the Treated (ETT)

8.2.6 Example: The Effect of Cholestyramine

8.3 Counterfactuals and Legal Responsibility

8.4 A Test for Instruments

8.5 A Bayesian Approach to Noncompliance

8.5.1 Bayesian Methods and Gibbs Sampling

8.5.2 The Effects of Sample Size and Prior Distribution

8.5.3 Causal Effects from Clinical Data with Imperfect Compliance

8.5.4 Bayesian Estimate of Single-Event Causation

8.6 Conclusion

9 Probability of Causation: Interpretation and Identification

9.1 Introduction

9.2 Necessary and Sufficient Causes: Conditions of Identification

9.2.1 Definitions, Notation, and Basic Relationships

9.2.2 Bounds and Basic Relationships under Exogeneity

9.2.3 Identifiability under Monotonicity and Exogeneity

9.2.4 Identifiability under Monotonicity and Nonexogeneity

9.3 Examples and Applications

9.3.1 Example 1: Betting against a Fair Coin

9.3.2 Example 2: The Firing Squad

9.3.3 Example 3: The Effect of Radiation on Leukemia

9.3.4 Example 4: Legal Responsibility from Experimental and Nonexperimental Data

9.3.5 Summary of Results

9.4 Identification in Nonmonotonic Models

9.5 Conclusions

10 The Actual Cause

10.1 Introduction: The Insufficiency of Necessary Causation

10.1.1 Singular Causes Revisited

10.1.2 Preemption and the Role of Structural Information

10.1.3 Overdetermination and Quasi-Dependence

10.1.4 Mackie's INUS Condition

10.2 Production, Dependence, and Sustenance

10.3 Causal Beams and Sustenance-Based Causation

10.3.1 Causal Beams: Definitions and Implications

10.3.2 Examples: From Disjunction to General Formulas

10.3.3 Beams, Preemption, and the Probability of Single-Event Causation

10.3.4 Path-Switching Causation

10.3.5 Temporal Preemption

10.4 Conclusions

11 Reflections, Elaborations, and Discussions with Readers

11.1 Causal, Statistical, and Graphical Vocabulary

11.1.1 Is the Causal-Statistical Dichotomy Necessary?

11.1.2 d-Separation without Tears (Chapter 1, pp. 16–18)

11.2 Reversing Statistical Time (Chapter 2, pp. 58–59)

11.3 Estimating Causal Effects

11.3.1 The Intuition behind the Back-Door Criterion (Chapter 3, p. 79)

11.3.2 Demystifying “Strong Ignorability”

11.3.3 Alternative Proof of the Back-Door Criterion

11.3.4 Data vs. Knowledge in Covariate Selection

11.3.5 Understanding Propensity Scores

11.3.6 The Intuition behind do-Calculus

11.3.7 The Validity of G-Estimation

11.4 Policy Evaluation and the do-Operator

11.4.1 Identifying Conditional Plans (Section 4.2, p. 113)

11.4.2 The Meaning of Indirect Effects

11.4.3 Can do(x) Represent Practical Experiments?

11.4.4 Is the do(x) Operator Universal?

11.4.5 Causation without Manipulation!!!

11.4.6 Hunting Causes with Cartwright

11.4.7 The Illusion of Nonmodularity

11.5 Causal Analysis in Linear Structural Models

11.5.1 General Criterion for Parameter Identification (Chapter 5, pp. 149–54)

11.5.2 The Causal Interpretation of Structural Coefficients

11.5.3 Defending the Causal Interpretation of SEM (or, SEM Survival Kit)

11.5.4 Where Is Economic Modeling Today? – Courting Causes with Heckman

11.5.5 External Variation versus Surgery

11.6 Decisions and Confounding (Chapter 6)

11.6.1 Simpson's Paradox and Decision Trees

11.6.2 Is Chronological Information Sufficient for Decision Trees?

11.6.3 Lindley on Causality, Decision Trees, and Bayesianism

11.6.4 Why Isn't Confounding a Statistical Concept?

11.7 The Calculus of Counterfactuals

11.7.1 Counterfactuals in Linear Systems

11.7.2 The Meaning of Counterfactuals

11.7.3 d-Separation of Counterfactuals

11.8 Instrumental Variables and Noncompliance

11.8.1 Tight Bounds under Noncompliance

11.9 More on Probabilities of Causation

11.9.1 Is "Guilty with Probability One" Ever Possible?

11.9.2 Tightening the Bounds on Probabilities of Causation