Line 1: |
Line 1: |
− | [[文件:《统计因果推理入门》.jpg|缩略图|《统计因果推理入门》]] | + | [[文件:《Elements of Causal Inference: Foundations and Learning Algorithms》.jpg|缩略图|《Elements of Causal Inference: Foundations and Learning Algorithms》]] |
| =Summary (English)= | | =Summary (English)= |
| The mathematization of causality is a relatively recent development, and has become increasingly important in data science and machine learning. This book offers a self-contained and concise introduction to causal models and how to learn them from data. After explaining the need for causal models and discussing some of the principles underlying causal inference, the book teaches readers how to use causal models: how to compute intervention distributions, how to infer causal models from observational and interventional data, and how causal ideas could be exploited for classical machine learning problems. All of these topics are discussed first in terms of two variables and then in the more general multivariate case. The bivariate case turns out to be a particularly hard problem for causal learning because there are no conditional independences as used by classical methods for solving multivariate cases. The authors consider analyzing statistical asymmetries between cause and effect to be highly instructive, and they report on their decade of intensive research into this problem. | | The mathematization of causality is a relatively recent development, and has become increasingly important in data science and machine learning. This book offers a self-contained and concise introduction to causal models and how to learn them from data. After explaining the need for causal models and discussing some of the principles underlying causal inference, the book teaches readers how to use causal models: how to compute intervention distributions, how to infer causal models from observational and interventional data, and how causal ideas could be exploited for classical machine learning problems. All of these topics are discussed first in terms of two variables and then in the more general multivariate case. 
The bivariate case turns out to be a particularly hard problem for causal learning because there are no conditional independences as used by classical methods for solving multivariate cases. The authors consider analyzing statistical asymmetries between cause and effect to be highly instructive, and they report on their decade of intensive research into this problem. |
Line 18: |
Line 18: |
| | | |
| =About the Authors= | | =About the Authors= |
− | '''Jonas Peters'''Jonas Peters is Professor of Statistics at the University of Copenhagen. | + | '''Jonas Peters''' |
| | | |
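| + | Jonas Peters is a professor of statistics in the Department of Mathematical Sciences at the University of Copenhagen. Previously, he was a group leader at the Max Planck Institute for Intelligent Systems in Tübingen and a Marie Curie fellow at the Seminar for Statistics, ETH Zurich. He studied mathematics at the University of Heidelberg and the University of Cambridge and obtained his PhD jointly from MPI and ETH. He is interested in inferring causal relationships from different types of data and in building statistical methods that are robust to distributional shifts. In his research, Jonas seeks to combine theory, methodology, and applications. His work relates to areas such as computational statistics, causal inference, graphical models, independence testing, and high-dimensional statistics. |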
| | | |
− | '''Dominik Janzing'''Dominik Janzing is a Senior Research Scientist at the Max Planck Institute for Intelligent Systems in Tübingen, Germany.
| |
| | | |
| | | |
− | '''Bernhard Schölkopf'''Bernhard Schölkopf is Director at the Max Planck Institute for Intelligent Systems in Tübingen, Germany. He is coauthor of Learning with Kernels (2002) and is a coeditor of Advances in Kernel Methods: Support Vector Learning (1998), Advances in Large-Margin Classifiers (2000), and Kernel Methods in Computational Biology (2004), all published by the MIT Press. | + | '''Dominik Janzing''' |
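| + | Dominik Janzing is a senior research scientist at the Max Planck Institute for Intelligent Systems in Tübingen, Germany. He currently works on causal inference from statistical data and on the foundations of new causal inference rules. |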
| + | |
| + | '''Bernhard Schölkopf''' |
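| + | Bernhard Schölkopf is Director at the Max Planck Institute for Intelligent Systems in Tübingen, Germany. He is coauthor of Learning with Kernels (2002) and a coeditor of Advances in Kernel Methods: Support Vector Learning (1998), Advances in Large-Margin Classifiers (2000), and Kernel Methods in Computational Biology (2004). He is also an affiliated professor at ETH Zurich, an honorary professor at the University of Tübingen and the Technical University of Berlin, and president of the European Laboratory for Learning and Intelligent Systems (ELLIS). |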
| | | |
| =Table of Contents= | | =Table of Contents= |
− | Preface xi
| + | Preface |
− | | + | Notation and Terminology |
− | 1 Preliminaries: Statistical and Causal Models 1 | + | Chapter 1 Statistical and Causal Models |
− | | + | 1.1 Probability Theory and Statistics |
− | 1.1 Why Study Causation 1 | + | 1.2 Learning Theory |
− | | + | 1.3 Causal Modeling and Learning |
− | 1.2 Simpson’s Paradox 2 | + | 1.4 Examples |
− | | + | 1.4.1 Pattern Recognition |
− | 1.3 Probability and Statistics 9
| + | 1.4.2 Gene Perturbation |
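− | | + | Chapter 2 Assumptions for Causal Inference |
− | 1.3.1 Variables 10
| + | 2.1 The Principle of Independent Mechanisms |
− | | + | 2.2 Historical Notes |
− | 1.3.2 Events 11
| + | 2.3 Physical Structure Underlying Causal Models |
− | | + | 2.3.1 The Role of Time |
− | 1.3.3 Conditional probability 11
| + | 2.3.2 Physical Laws |
− | | + | 2.3.3 Cyclic Assignments |
− | 1.3.4 Independence 13
| + | 2.3.4 Feasibility of Interventions |
− | | + | 2.3.5 Independence of Cause and Mechanism and the Thermodynamic Arrow of Time |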
− | 1.3.5 Probability distributions 14
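| + | Chapter 3 Cause-Effect Models |
− | | + | 3.1 Structural Causal Models |
− | 1.3.6 The law of total probability 15 | + | 3.2 Interventions |
− | | + | 3.3 Counterfactuals |
− | 1.3.7 Using Bayes’ rule 18
| + | 3.4 Canonical Representation of Structural Causal Models |
− | | + | 3.5 Problems |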
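− | 1.3.8 Expected values 22 | + | Chapter 4 Learning Cause-Effect Models |
− | | + | 4.1 Structure Identifiability |
− | 1.3.9 Variance and covariance 24 | + | 4.1.1 Why Additional Assumptions Are Required |
− | | + | 4.1.2 Overview of the Types of Assumptions |
− | 1.3.10 Regression 27 | + | 4.1.3 Linear Models with Non-Gaussian Additive Noise |
− | | + | 4.1.4 Nonlinear Additive Noise Models |
− | 1.3.11 Multiple regression 31 | + | 4.1.5 Discrete Additive Noise Models |
− | | + | 4.1.6 Post-Nonlinear Models |
− | 1.4 Graphs 33
| + | 4.1.7 Information-Geometric Causal Inference |
− | | + | 4.1.8 Trace Method |
− | 1.5 Structural Causal Models 36
| + | 4.1.9 Algorithmic Information Theory as Possible Foundation |
− | | + | 4.2 Methods for Structure Identification |
− | 1.5.1 Modeling causal assumptions 36 | + | 4.2.1 Additive Noise Models |
− | | + | 4.2.2 Information-Geometric Causal Inference |
− | 1.5.2 Product decomposition 40 | + | 4.2.3 Trace Method |
− | | + | 4.2.4 Supervised Learning Methods |
− | 2 Graphical Models and Their Applications 47
| + | 4.3 Problems |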
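− | | + | Chapter 5 Connections to Machine Learning, I |
− | 2.1 Connecting Models to Data 47
| + | 5.1 Semi-Supervised Learning |
− | | + | 5.1.1 Semi-Supervised Learning and Causal Direction |
− | 2.2 Chains and Forks 48
| + | 5.1.2 Notes on Semi-Supervised Learning in the Causal Direction |
− | | + | 5.2 Covariate Shift |
− | 2.3 Colliders 55
| + | 5.3 Problems |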
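− | | + | Chapter 6 Multivariate Causal Models |
− | 2.4 ''d''-Separation 62
| + | 6.1 Graph Terminology |
− | | + | 6.2 Structural Causal Models |
− | 2.5 Model Testing and Causal Search 66 | + | 6.3 Interventions |
− | | + | 6.4 Counterfactuals |
− | 3 The Effects of Interventions 71
| + | 6.5 Markov Property, Faithfulness, and Causal Minimality |
− | | + | 6.5.1 Markov Property |
− | 3.1 Interventions 71
| + | 6.5.2 Causal Graphical Models |
− | | + | 6.5.3 Faithfulness and Causal Minimality |
− | 3.2 The Adjustment Formula 74
| + | 6.6 Calculating Intervention Distributions by Covariate Adjustment |
− | | + | 6.7 Do-Calculus |
− | 3.2.1 To adjust or not to adjust? 79
| + | 6.8 Equivalence and Falsifiability of Causal Models |
− | | + | 6.9 Potential Outcomes |
− | 3.2.2 Multiple interventions and the truncated product rule 81
| + | 6.9.1 Definition and Example |
− | | + | 6.9.2 Relation between Potential Outcomes and Structural Causal Models |
− | 3.3 The Back-Door Criterion 82
| + | 6.10 Generalized Structural Causal Models Relating Single Objects |
− | | + | 6.11 Algorithmic Independence of Conditionals |
− | 3.4 The Front-Door Criterion 89
| + | 6.12 Problems |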
− | | + | Chapter 7 Learning Multivariate Causal Models |
− | 3.5 Conditional Interventions and Covariate-Specific Effects 95
| + | 7.1 Structure Identifiability |
− | | + | 7.1.1 Faithfulness |
− | 3.6 Inverse Probability Weighing 98
| + | 7.1.2 Additive Noise Models |
− | | + | 7.1.3 Linear Gaussian Models with Equal Error Variances |
− | 3.7 Mediation 103
| + | 7.1.4 Linear Non-Gaussian Acyclic Models |
− | | + | 7.1.5 Nonlinear Gaussian Additive Noise Models |
− | 3.8 Causal Inference in Linear Systems 107
| + | 7.1.6 Observational and Experimental Data |
− | | + | 7.2 Methods for Structure Identification |
− | 3.8.1 Structural vs. regression coefficients 110 | + | 7.2.1 Independence-Based Methods |
− | | + | 7.2.2 Score-Based Methods |
− | 3.8.2 The causal interpretation of structural coefficients 111
| + | 7.2.3 Additive Noise Models |
− | | + | 7.2.4 Known Causal Ordering |
− | 3.8.3 Identifying structural coefficients and causal effect 113 | + | 7.2.5 Observational and Experimental Data |
− | | + | 7.3 Problems |
− | 3.8.4 Mediation in linear systems 119
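| + | Chapter 8 Connections to Machine Learning, II |
− | | + | 8.1 Half-Sibling Regression |
− | 4 Counterfactuals and their Applications 123 | + | 8.2 Causal Inference and Episodic Reinforcement Learning |
− | | + | 8.2.1 Inverse Probability Weighting |
− | 4.1 Counterfactuals 123 | + | 8.2.2 Episodic Reinforcement Learning |
− | | + | 8.2.3 State Simplification in Blackjack |
− | 4.2 Defining and Computing Counterfactuals 126 | + | 8.2.4 Improved Weighting for Advertisement Placement |
− | | + | 8.3 Domain Adaptation |
− | 4.2.1 The structural interpretation of counterfactuals 126
| + | 8.4 Problems |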
− | | + | Chapter 9 Hidden Variables |
− | 4.2.2 The fundamental law of counterfactuals 130 | + | 9.1 Interventional Sufficiency |
− | | + | 9.2 Simpson's Paradox |
− | 4.2.3 From population data to individual behavior – an illustration 131
| + | 9.3 Instrumental Variables |
− | | + | 9.4 Conditional Independences and Graphical Representations |
− | 4.2.4 The three steps in computing counterfactuals 133
| + | 9.4.1 Graphs |
− | | + | 9.4.2 Fast Causal Inference |
− | 4.3 Non-Deterministic Counterfactuals 136
| + | 9.5 Constraints beyond Conditional Independence |
− | | + | 9.5.1 Verma Constraints |
− | 4.3.1 Probabilities of counterfactuals 136
| + | 9.5.2 Inequality Constraints |
− | | + | 9.5.3 Covariance-Based Constraints |
− | 4.3.2 The Graphical representation of counterfactuals 141
| + | 9.5.4 Additive Noise Models |
− | | + | 9.5.5 Detecting Low-Complexity Confounders |
− | 4.3.3 Counterfactuals in experimental settings 144
| + | 9.5.6 Different Environments |
− | | + | 9.6 Problems |
− | 4.3.4 Counterfactuals in linear models 147
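| + | Chapter 10 Time Series |
− | | + | 10.1 Preliminaries and Terminology |
− | 4.4 Practical uses of counterfactuals 149 | + | 10.2 Structural Causal Models and Interventions |
− | | + | 10.2.1 Subsampling |
− | 4.4.1 Recruitment to a program 149
| + | 10.3 Learning Causal Time Series Models |
− | | + | 10.3.1 Markov Condition and Faithfulness |
− | 4.4.2 Additive interventions 152
| + | 10.3.2 Some Causal Conclusions Do Not Require Faithfulness |
− | | + | 10.3.3 Granger Causality |
− | 4.4.3 Personal decision making 155
| + | 10.3.4 Models with Restricted Function Classes |
− | | + | 10.3.5 Spectral Independence Criterion |
− | 4.4.4 Sex discrimination in hiring 158
| + | 10.4 Dynamic Causal Modeling |
− | | + | 10.5 Problems |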
− | 4.4.5 Mediation and path-disabling interventions 159
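| + | Appendices |
− | | + | Appendix A Some Probability and Statistics |
− | 4.5 Mathematical Tool Kits for Attribution and Mediation 161
| + | A.1 Basic Definitions |
− | | + | A.2 Independence and Conditional Independence Testing |
− | 4.5.1 A tool kit for attribution and probabilities of causation 162
| + | A.3 Capacity of Function Classes |
− | | + | Appendix B Causal Orderings and Adjacency Matrices |
− | 4.5.2 A tool kit for mediation 167
| + | Appendix C Proofs |
− | | + | C.1 Proof of Theorem 4.2 |
− | References 176
| + | C.2 Proof of Proposition 6.3 |
| + | C.3 Proof of Remark 6.6 |
| + | C.4 Proof of Proposition 6.13 |
| + | C.5 Proof of Proposition 6.14 |
| + | C.6 Proof of Proposition 6.36 |
| + | C.7 Proof of Proposition 6.48 |
| + | C.8 Proof of Proposition 6.49 |
| + | C.9 Proof of Proposition 7.1 |
| + | C.10 Proof of Proposition 7.4 |
| + | C.11 Proof of Proposition 8.1 |
| + | C.12 Proof of Proposition 8.2 |
| + | C.13 Proof of Proposition 9.3 |
| + | C.14 Proof of Proposition 10.3 |
| + | C.15 Proof of Theorem 10.4 |
| + | References |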
| =Resources= | | =Resources= |
− | *[http://bayes.cs.ucla.edu/PRIMER/pearl-etal-2016-primer-errata-pages-april2021.pdf 《Causal Inference in Statistics: A Primer》 (original edition with errata)] | + | *[https://library.oapen.org/bitstream/id/056a11be-ce3a-44b9-8987-a6c68fce8d9b/11283.pdf 《Elements of Causal Inference: Foundations and Learning Algorithms》] |
| =Related Wiki Pages= | | =Related Wiki Pages= |
| *[[因果推断 Causal inference]] | | *[[因果推断 Causal inference]] |