Judea Pearl

Basic information

Judea Pearl (朱迪亚·珀尔)


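Category	Information
Born	September 4, 1936
Birthplace	Tel Aviv, Israel (then Mandatory Palestine)
Nationality	United States
Residence	Los Angeles, California, United States
Position	Professor of Computer Science, University of California, Los Angeles (UCLA)
Known for	Bayesian networks; do-calculus; the Pearl Causal Hierarchy (PCH)
Main research areas	Artificial intelligence; causal inference; philosophy of science
Education	Technion (Israel Institute of Technology); Newark College of Engineering (now the New Jersey Institute of Technology); Rutgers University, New Brunswick, New Jersey; Polytechnic Institute of Brooklyn, New York (now the engineering school of New York University)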

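Judea Pearl is an Israeli-American computer scientist and philosopher, best known for championing the probabilistic approach to artificial intelligence and for developing Bayesian networks. He is credited with inventing Bayesian networks, a mathematical formalism for defining complex probability models, together with the principal algorithms used for inference in those models. This work not only revolutionized the field of artificial intelligence but also became an important tool in many other branches of engineering and the natural sciences. He later created a mathematical framework for causal inference that has had a major impact on the social sciences. The ACM named Judea Pearl the recipient of the 2011 Turing Award “for fundamental contributions to artificial intelligence through the development of a calculus for probabilistic and causal reasoning.”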

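More than 40 years ago, Pearl's design of Bayesian networks, which let machines carry out probabilistic reasoning, made him famous in artificial intelligence and earned him the title “father of Bayesian networks.” In recent years, however, he has publicly described himself as a “traitor” to the AI community: he left probabilistic reasoning, the mainstream pursuit whose theoretical and methodological foundations he himself helped lay, to take on a more challenging task, causal reasoning. Pearl argues that all the impressive achievements of today's deep learning amount to nothing more than “curve fitting.” In his view, this leaves deep learning researchers stuck on the lowest rung of the ladder of causation, the level of “association.” Pearl hopes to set off a “causal revolution” in which artificial intelligence is studied through causal models, from the standpoint of cause and effect rather than mere associations in data.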

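Early life and career

Education

Judea Pearl received a bachelor's degree in electrical engineering from the Technion in Haifa in 1960, and a master's degree in electrical engineering from Newark College of Engineering (now the New Jersey Institute of Technology) in 1961. In 1965 he received a master's degree in physics from Rutgers University in New Brunswick, New Jersey, and, in the same year, a PhD in electrical engineering from the Polytechnic Institute of Brooklyn (now the engineering school of New York University). His doctoral dissertation was titled “Vortex Theory of Superconductive Memories”; the term “Pearl vortex,” which describes the type of superconducting current he studied, is widely used among physicists.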

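Industry work

Pearl worked at RCA Research Laboratories in Princeton, New Jersey, on superconductive parametric amplifiers and storage devices, and at Electronic Memory, Inc. in Hawthorne, California, on advanced memory systems. Although his work at the time focused on physical devices, Pearl has said that even then he was drawn to the potential applications of intelligent systems.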

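Academia

When industrial research on magnetic and superconductive memories was cut back following the arrival of large-scale semiconductor memories, Pearl decided to move into academia to pursue his long-standing interest in logic and reasoning. He joined UCLA in 1969, initially teaching in the Engineering Systems department, and in 1970 received tenure in the newly formed Computer Science Department. He was promoted to full professor in 1976. In 1978 he founded the Cognitive Systems Laboratory, a name that underlines his desire to understand human cognition.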

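Research areas

Excerpted from the ACM Turing Award laureate profile page.

Search and heuristics

Pearl's reputation in computer science was first built not on probabilistic reasoning (a controversial topic at the time) but on combinatorial search. A series of journal papers beginning in 1980 culminated in his 1984 book Heuristics: Intelligent Search Strategies for Computer Problem Solving. This work included many new results on traditional search algorithms such as A* (pronounced “A star”) and on game-playing algorithms, raising AI research to a new level of rigor and depth. It also set out new ideas on how admissible heuristics can be derived automatically from relaxed problem definitions, an approach that led to dramatic advances in planning systems.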
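The idea of deriving an admissible heuristic from a relaxed problem can be illustrated with a minimal sketch. It is not taken from Pearl's book: the grid, the function names, and the unit step cost are all assumptions made for illustration. On a grid with walls, dropping the "no walking through walls" constraint gives a relaxed problem whose exact solution cost is the Manhattan distance, so that distance never overestimates the true cost and A* using it returns a shortest path length.

```python
import heapq

def manhattan(a, b):
    # Exact cost of the relaxed problem (walls ignored), hence an
    # admissible lower bound on the cost of the real problem.
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def a_star(grid, start, goal):
    """Return the length of a shortest 4-connected path, or None.

    grid[r][c] == 0 means free, 1 means wall (illustrative convention).
    """
    rows, cols = len(grid), len(grid[0])
    frontier = [(manhattan(start, goal), 0, start)]  # (f = g + h, g, cell)
    best_g = {start: 0}
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if node == goal:
            return g
        if g > best_g.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was found already
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    heapq.heappush(frontier, (ng + manhattan(nxt, goal), ng, nxt))
    return None

# Hypothetical example grid: the only gap in the wall row is at column 2.
GRID = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]

print(a_star(GRID, (0, 0), (3, 0)))  # prints 7; the relaxed estimate is 3
```

The heuristic says 3 while the true cost is 7, which shows the sense in which a relaxed problem yields a lower bound rather than an exact answer.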

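Bayesian networks

Pearl argued that a sound probabilistic analysis of a problem gives intuitively correct results even in cases where rule-based systems behave incorrectly. One such case concerns the ability to reason both causally (from cause to effect) and diagnostically (from effect to cause). “If you used diagnostic rules, you could not do prediction, and if you used predictive rules you could not reason diagnostically, and if you used both, you ran into positive-feedback instabilities, something we never encountered in probability theory.” Another case concerns the “explaining away” phenomenon: observing an effect raises one's degree of belief in each of its possible causes, but the belief in one cause drops again once some other cause is found to account for the observed effect. Rule-based systems cannot exhibit explaining away, whereas it arises automatically in probabilistic analysis.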
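Explaining away can be checked numerically with a small sketch. The model below is an assumption made for illustration, not an example from Pearl's work: two independent causes A and B, one effect E, and made-up probabilities. The posterior for A is computed by brute-force enumeration of the joint distribution.

```python
from itertools import product

# Hypothetical numbers, chosen only for illustration.
P_A = 0.1                      # prior probability of cause A
P_B = 0.1                      # prior probability of cause B
P_E_GIVEN = {                  # P(E = 1 | A, B)
    (0, 0): 0.01,
    (1, 0): 0.90,
    (0, 1): 0.80,
    (1, 1): 0.95,
}

def joint(a, b, e):
    # A and B are independent a priori; E depends on both.
    pa = P_A if a else 1 - P_A
    pb = P_B if b else 1 - P_B
    pe = P_E_GIVEN[(a, b)] if e else 1 - P_E_GIVEN[(a, b)]
    return pa * pb * pe

def posterior_a(**evidence):
    """P(A = 1 | evidence), with evidence given as keywords a, b, or e."""
    num = den = 0.0
    for a, b, e in product((0, 1), repeat=3):
        world = {"a": a, "b": b, "e": e}
        if any(world[name] != value for name, value in evidence.items()):
            continue
        p = joint(a, b, e)
        den += p
        if a == 1:
            num += p
    return num / den

prior = posterior_a()                # no evidence
after_effect = posterior_a(e=1)      # effect observed
after_both = posterior_a(e=1, b=1)   # effect observed and cause B confirmed

print(prior, after_effect, after_both)
```

With these numbers the belief in A rises from 0.1 to roughly 0.53 once E is observed, then falls to roughly 0.12 once B is also known to be present. No special rule produces the drop; it follows from conditioning alone.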

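Pearl realized that the concept of conditional independence would be the key to constructing complex probability models with polynomially many parameters and to organizing distributed probability computations. His paper “Reverend Bayes on Inference Engines: A Distributed Hierarchical Approach” [2] introduced probability models defined by directed acyclic graphs and derived an exact, distributed, asynchronous, linear-time inference algorithm for trees, the algorithm now known as belief propagation, which is also the basis of turbo codes. A period of remarkable creative output followed: Pearl published more than 50 papers covering exact inference for general graphs, approximate inference algorithms using Markov chain Monte Carlo, conditional independence properties, learning algorithms, and more, leading up to the publication of Probabilistic Reasoning in Intelligent Systems in 1988. This landmark work combined Pearl's philosophy, his theories of human cognition, and all of his technical material into a persuasive whole, and it sparked a revolution in artificial intelligence. Within just a few years, leading researchers from both the logical and the neural-network camps within AI had adopted a probabilistic approach to AI, often called simply the modern approach.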
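The linear-time claim can be made concrete with a minimal sketch of message passing on a chain, the simplest kind of tree. This is not Pearl's full algorithm, which passes messages in both directions on general trees; the function names and the example numbers are assumptions. Evidence on the last variable is propagated backward one link at a time, and the result is compared against brute-force enumeration, whose cost grows exponentially with the chain length.

```python
from itertools import product

def chain_posterior_bp(prior, transitions, observed_last):
    """P(X0 | X_last = observed_last) on a binary chain X0 -> X1 -> ...

    prior[x] = P(X0 = x); transitions[i][x][y] = P(X_{i+1} = y | X_i = x).
    One backward pass of messages, so the cost is linear in chain length.
    """
    # Message at the observed node: an indicator of the evidence.
    message = [1.0 if x == observed_last else 0.0 for x in (0, 1)]
    for table in reversed(transitions):
        # Each node sums out its child using only the local table.
        message = [sum(table[x][y] * message[y] for y in (0, 1)) for x in (0, 1)]
    unnormalized = [prior[x] * message[x] for x in (0, 1)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

def chain_posterior_brute(prior, transitions, observed_last):
    """Same quantity by enumerating every joint assignment (exponential)."""
    n = len(transitions) + 1
    unnormalized = [0.0, 0.0]
    for xs in product((0, 1), repeat=n):
        if xs[-1] != observed_last:
            continue
        p = prior[xs[0]]
        for table, x, y in zip(transitions, xs, xs[1:]):
            p *= table[x][y]
        unnormalized[xs[0]] += p
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

# Hypothetical three-variable chain.
PRIOR = [0.6, 0.4]
TRANSITIONS = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.7, 0.3], [0.4, 0.6]],
]

print(chain_posterior_bp(PRIOR, TRANSITIONS, 1))
print(chain_posterior_brute(PRIOR, TRANSITIONS, 1))
```

Both functions should print the same distribution. The chain also shows the parameter saving that conditional independence buys: each link needs only a small local table, instead of one entry per joint assignment.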

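Pearl's Bayesian networks provided a syntax and a calculus for multivariate probability models, much as George Boole provided a syntax and a calculus for logical models. Theoretical and algorithmic questions associated with Bayesian networks form a significant part of the modern research agenda in machine learning and statistics, and their use has spread into other fields such as natural language processing, computer vision, robotics, computational biology, and cognitive science. As of 2012, some 50,000 publications had appeared with Bayesian networks as a primary focus.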

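Causality

Even while developing the theory and technology of Bayesian networks, Pearl suspected that a different approach was needed to address the problem of causality, which had concerned him for many years. In his 2000 book Causality: Models, Reasoning, and Inference, he described his early interest as follows: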

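I got my first hint of the dark world of causality during my junior year of high school. My science teacher, Dr. Feuchtwanger, introduced us to the study of logic by discussing the 19th-century finding that more people died from smallpox inoculations than from smallpox itself. Some people used this information to argue that inoculation was harmful when, in fact, the data proved the opposite: inoculation was saving lives by eradicating smallpox.
“And here is where logic comes in,” concluded Dr. Feuchtwanger, “to protect us from cause-effect fallacies of this sort.” We were all enchanted by the marvels of logic, even though Dr. Feuchtwanger never actually showed us how logic protects us from such fallacies.
It doesn't, I realized years later as an artificial intelligence researcher. Neither logic nor any branch of mathematics had developed adequate tools for managing problems, such as the smallpox inoculations, that involve cause-effect relationships.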

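In fact, Bayesian networks by themselves cannot capture causal information: the network “Smoking” --> “Lung cancer,” for example, is mathematically equivalent to the network “Lung cancer” --> “Smoking.” The key characteristic of a causal network is the way it captures the potential effect of exogenous interventions. In a causal network X --> Y, intervening to set the value of Y should leave one's prior belief about X unchanged; the intervention simply cuts the link from X to Y. Thus “Smoking” --> “Lung cancer” reflects our beliefs about how the world works (forcing a subject to smoke does change one's belief that the subject will develop cancer), whereas “Lung cancer” --> “Smoking” does not (inducing cancer in a subject does not change one's belief about whether the subject smokes). This simple analysis, which Pearl calls the do-calculus, led to a complete mathematical framework for formalizing causal models and for analyzing data to determine causal relationships. This work overturned the long-held belief in statistics that causality can be determined only from randomized controlled trials, which in most cases are impossible to carry out in fields such as the biological and social sciences.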
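The difference between observing and intervening in the two-node network above can be shown with a minimal sketch. It illustrates intervention as "cutting the incoming link" of the forced variable, not the inference rules of the do-calculus themselves, and the probabilities and function names are assumptions chosen for illustration.

```python
# Hypothetical numbers for the causal network Smoking --> Lung cancer.
P_SMOKING = 0.3
P_CANCER_GIVEN_SMOKING = {0: 0.05, 1: 0.20}

def bernoulli(p, value):
    return p if value else 1.0 - p

def joint_observed(s, c):
    # Ordinary factorization: P(s) * P(c | s).
    return bernoulli(P_SMOKING, s) * bernoulli(P_CANCER_GIVEN_SMOKING[s], c)

def joint_do_cancer(s, c, forced_c):
    # Intervening on cancer replaces P(c | s) with a constant, which cuts
    # the link from smoking to cancer and leaves P(s) untouched.
    return bernoulli(P_SMOKING, s) * (1.0 if c == forced_c else 0.0)

def joint_do_smoking(s, c, forced_s):
    # Intervening on smoking replaces P(s) but keeps the mechanism P(c | s).
    return (1.0 if s == forced_s else 0.0) * bernoulli(P_CANCER_GIVEN_SMOKING[s], c)

def conditional(joint, target, given):
    """P(target | given); target and given are predicates on (s, c)."""
    worlds = [(s, c) for s in (0, 1) for c in (0, 1)]
    den = sum(joint(s, c) for s, c in worlds if given(s, c))
    num = sum(joint(s, c) for s, c in worlds if given(s, c) and target(s, c))
    return num / den

# Seeing cancer raises the belief that the subject smokes ...
p_s_given_c = conditional(joint_observed, lambda s, c: s == 1, lambda s, c: c == 1)
# ... but inducing cancer leaves that belief at its prior value.
p_s_do_c = conditional(lambda s, c: joint_do_cancer(s, c, 1),
                       lambda s, c: s == 1, lambda s, c: True)
# Forcing the subject to smoke does change the belief about cancer.
p_c_do_s = conditional(lambda s, c: joint_do_smoking(s, c, 1),
                       lambda s, c: c == 1, lambda s, c: True)

print(p_s_given_c, p_s_do_c, p_c_do_s)
```

Under these assumed numbers, observing cancer raises the probability of smoking from 0.3 to about 0.63, while inducing cancer leaves it at 0.3. A network with the arrow reversed would encode the same observational joint distribution but would predict the opposite pattern under intervention.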

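Awards and honors

Over the years Pearl has gained international recognition for major contributions to artificial intelligence, human reasoning, and the philosophy of science, and has received nearly 50 awards (http://bayes.cs.ucla.edu/jp_home.html).

The major awards are listed below:

In 2001 he received the Lakatos Award from the London School of Economics for the best book in the philosophy of science.

In 2003 he received the ACM Allen Newell Award.

In 2006 he received the inaugural Purpose Prize from Civic Ventures, which honors individuals aged 60 and over who have shown extraordinary vision in addressing community and national problems.

In 2008 the Franklin Institute awarded him the Benjamin Franklin Medal in Computer and Cognitive Science.

In 2011 he received the David E. Rumelhart Prize for his contributions to the theoretical foundations of human cognition. His alma mater, the Technion, awarded him the Harvey Prize in Science and Technology.

In 2011 he received the ACM Turing Award, the highest honor in computer science, “for fundamental contributions to artificial intelligence through the development of a calculus for probabilistic and causal reasoning.”

In 2015 he was named an ACM Fellow.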

Selected papers and books

He has published nearly 500 scientific papers on a range of topics in artificial intelligence (http://bayes.cs.ucla.edu/jp_home.html). In addition, he has published five books in the areas described above:

1984: Heuristics: Intelligent Search Strategies for Computer Problem Solving, Addison-Wesley, 1984.

1988: Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, 1988.

2000: Causality: Models, Reasoning, and Inference, Cambridge University Press, 2000; 2nd edition, 2009.

2016: Causal Inference in Statistics: A Primer (with Madelyn Glymour and Nicholas P. Jewell), Wiley, 2016.

2018: The Book of Why: The New Science of Cause and Effect (with Dana Mackenzie), New York: Basic Books, May 2018.

References

Pearl, J., “Asymptotic properties of minimax trees and game-searching procedures,” Artificial Intelligence, 14, pp. 113–138, September 1980. One of the first papers to establish “phase transition” properties for a combinatorial problem; introduced new mathematical techniques into the AI literature.
Pearl, J., “Knowledge versus search: A quantitative analysis using A*,” Artificial Intelligence, Vol. 20, pp. 1–13, 1983. Proved the first results relating heuristic accuracy to search algorithm complexity.
Pearl, J., “On the nature of pathology in game searching,” Artificial Intelligence, Vol. 20, pp. 427–453, 1983. Proved that, under the standard model of game trees, deeper search does not necessarily improve play; and showed that this paradox is resolved by correct probabilistic updating of beliefs.
Karp, R. and J. Pearl, “Searching for an optimal path in a tree with random costs,” Artificial Intelligence, Vol. 21, pp. 99–116, 1983. Identified a phase transition property for a very simple path-finding problem, with significant complexity implications.
Pearl, J., “On the discovery and generation of certain heuristics,” AI Magazine, Winter/Spring, pp. 23–33, 1983. The first paper on the systematic generation of admissible heuristics (lower bounds on optimal solution costs) by relaxing formally represented problem definitions; this idea led to dramatic advances in automated planning systems.
Pearl, J., Heuristics: Intelligent Search Strategies for Computer Problem Solving, Addison-Wesley, 1984. Synthesized essentially everything known up to that point about intelligent methods for search and game playing, much of it Pearl’s own work; also the first textbook to treat AI topics formally at a technically advanced level.
Dechter, R. and J. Pearl, “Generalized best-first search strategies and the optimality of A*,” Journal of the Association for Computing Machinery, Vol. 32, pp. 505–536, 1985. Proved that A* is the most efficient member of a very broad class of problem-solving algorithms.
Pearl, J., “Reverend Bayes on inference engines: A distributed hierarchical approach,” Proceedings, AAAI-82, 1982. The paper that began the probabilistic revolution in AI by showing how several desirable properties of reasoning systems can be obtained through sound probabilistic inference. It introduced tree-structured networks as concise representations of complex probability models, identified conditional independence relationships as the key organizing principle for uncertain knowledge, and described an efficient, distributed, exact inference algorithm.
Kim, J. and J. Pearl, “A computational model for combined causal and diagnostic reasoning in inference systems,” Proceedings, IJCAI-83, 1983. Generalized the tree-structured network to allow for multiple parents, or causal influences, on any given variable.
Pearl, J., “Learning hidden causes from empirical data,” Proceedings, IJCAI-85, 1985. Initiated the study of methods for learning the structure of probabilistic causal models.
Pearl, J., “On the logic of probabilistic dependencies,” Proceedings, AAAI-86, 1986. One of several papers establishing the connection between graphical models and conditional independence relationships.
Pearl, J., “Fusion, propagation and structuring in belief networks,” Artificial Intelligence, Vol. 29, pp. 241–288, 1986. The key technical paper on representation and exact inference in general Bayesian networks; by 1991 this had become the most cited paper in the history of the Artificial Intelligence journal.
Pearl J. and A. Paz, “Graphoids: A graph-based logic for reasoning about relevance relations,” In B. du Boulay et al. (Eds.), Advances in Artificial Intelligence II, North-Holland, 1987. Establishes an axiomatic characterization of the properties that enable probabilities and other relational systems to be represented by graphs.
Pearl, J., “Evidential reasoning using stochastic simulation of causal models,” Artificial Intelligence, Vol. 32, pp. 245–258, 1987. Derived a general approximation algorithm for Bayesian network inference using Markov chain Monte Carlo (MCMC); this was the first significant use of MCMC in mainstream AI.
Pearl, J., Probabilistic Reasoning in Intelligent Systems, Morgan Kaufmann, 1988. Explained the philosophical, cognitive, and technical basis for a probabilistic view of knowledge, reasoning, and decision making. One of the most cited works in the history of computer science, this book initiated the modern era in AI and converted many researchers who had previously worked in the logical and neural-network communities.
Pearl J. and T.S. Verma, “A theory of inferred causation,” Proceedings, KR-91, 1991. Introduces minimal-model semantics as a basis for causal discovery, and shows that causal directionality can be inferred from patterns of correlations without resorting to temporal information.
Pearl, J., “Graphical models, causality, and intervention,” Statistical Science, Vol. 8, pp. 266–269, 1993. Introduces the back-door criterion for covariate selection, the first to guarantee bias-free estimation of causal effects.
Pearl, J., “Causal diagrams for empirical research,” Biometrika, Vol. 82, Num. 4, pp. 669–710, 1995. Introduces the theory of causal diagrams and its associated do-calculus; the first (and still the only) mathematical method to enable a systematic removal of confounding bias in observations.
Pearl, J., “The Art and Science of Cause and Effect,” UCLA Cognitive Systems Laboratory, Technical Report R-248, 1996. Transcript of lecture given Thursday, October 29, 1996, as part of the UCLA 81st Faculty Research Lecture Series. Used later as epilogue to the book Causality (2000). Provides a panoramic view of the historical development of causal thoughts from antiquity to modern days.
Pearl, J., Causality: Models, Reasoning, and Inference, Cambridge University Press, 2000. Building on theoretical results from 1987 to 2000, lays out a complete framework for causal discovery, interventional analysis and counterfactual reasoning, bringing mathematical rigor and conceptual clarity to an area previously considered off-limits for statistics. Winner of the 2001 Lakatos Prize for the most significant new work in the philosophy of science.
Pearl, J., “The logic of counterfactuals in causal inference (Discussion of ‘Causal inference without counterfactuals’ by A.P. Dawid),” Journal of the American Statistical Association, Vol. 95, pp. 428–435, 2000. Demonstrates how counterfactual reasoning underlies scientific thought and argues against its exclusion from statistical analysis.
Tian, J. and J. Pearl, “Probabilities of causation: Bounds and identification,” Annals of Mathematics and Artificial Intelligence, Vol. 28, pp. 287–313, 2000. Derives tight bounds on the probability that one observed event was the cause of another, in the legal sense of "but for," thus providing a principled way of substantiating guilt and innocence from data.
Pearl, J., “Robustness of causal claims,” Proceedings, UAI-04, 2004. Offers a formal definition of robustness and develops a method for assessing the degree to which causal claims are robust to model misspecification.
Pearl, J., “Direct and indirect effects,” Proceedings, UAI-01, 2001. Establishes the theoretical basis of modern mediation analysis. Derives the "Mediation Formula" and provides graphical conditions for the identification of direct and indirect effect.
Tian, J. and J. Pearl, “A general identification condition for causal effects,” Proceedings, AAAI-02, 2002. Uses the do-calculus to derive a general graphical condition for identifying causal effects from a combination of data and assumptions.
Halpern, J. and J. Pearl, “Causes and explanations: A structural-model approach—Parts I and II,” British Journal for the Philosophy of Science, Vol. 56, pp. 843–887 and 889–911, 2005. Establishes counterfactual conditions for one event to be perceived as the “actual cause” of another and for one event to provide an “explanation” of another.
Pearl, J., “Causal inference in statistics: An overview,” Statistics Surveys, Vol. 3, pp. 96–146, 2009. Describes a unified methodology for causal inference based on a symbiosis between graphs and counterfactual logic.
Pearl, J., “The algorithmization of counterfactuals,” Annals of Mathematics and Artificial Intelligence, Vol. 61, pp. 29–39, 2011. Describes a computational model that explains how humans generate, evaluate and distinguish counterfactual statements so swiftly and consistently.
Pearl J. and E. Bareinboim, “Transportability of causal and statistical relations: A formal approach,” Proceedings, AAAI-11, 2011. Reduces the classical problem of external validity to mathematical transformations in the do-calculus, and establishes conditions under which experimental results can be generalized to new environments in which only passive observation can be conducted.

