Structural causal model
In philosophy of science, a causal model (or structural causal model) is a conceptual model that describes the causal mechanisms of a system. Causal models can improve study designs by providing clear rules for deciding which independent variables need to be included/controlled for.
They can allow some questions to be answered from existing observational data without the need for an interventional study such as a randomized controlled trial. Some interventional studies are inappropriate for ethical or practical reasons, meaning that without a causal model, some hypotheses cannot be tested.
Causal models can help with the question of external validity (whether results from one study apply to unstudied populations). Causal models can allow data from multiple studies to be merged (in certain circumstances) to answer questions that cannot be answered by any individual data set.
Causal models are falsifiable, in that if they do not match data, they must be rejected as invalid. They must also be credible to those close to the phenomena the model intends to explain.[2]
Causal models have found applications in signal processing, epidemiology and machine learning.[3]
Definition
Causal models are mathematical models representing causal relationships within an individual system or population. They facilitate inferences about causal relationships from statistical data. They can teach us a good deal about the epistemology of causation, and about the relationship between causation and probability. They have also been applied to topics of interest to philosophers, such as the logic of counterfactuals, decision theory, and the analysis of actual causation.[4]
Judea Pearl defines a causal model as an ordered triple [math]\displaystyle{ \langle U, V, E\rangle }[/math], where U is a set of exogenous variables whose values are determined by factors outside the model; V is a set of endogenous variables whose values are determined by factors within the model; and E is a set of structural equations that express the value of each endogenous variable as a function of the values of the other variables in U and V.[3]
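To make the triple concrete, here is a minimal Python sketch: U is represented as samplers for the exogenous variables, V as the keys of the structural equations, and E as ordinary functions. The StructuralCausalModel class and the demand/toothpaste/floss example are illustrative assumptions, not notation from Pearl.

```python
import random

class StructuralCausalModel:
    """Minimal sketch of Pearl's <U, V, E> triple.

    exogenous maps each variable in U to a sampler (its value is set
    outside the model); equations maps each endogenous variable in V to
    a structural equation in E, written as a function of the values
    computed so far. Endogenous variables must be listed parents-first.
    """

    def __init__(self, exogenous, equations):
        self.exogenous = exogenous  # U
        self.equations = equations  # E, keyed by the endogenous set V

    def sample(self):
        values = {u: draw() for u, draw in self.exogenous.items()}
        for v, f in self.equations.items():  # dicts keep insertion order
            values[v] = f(values)
        return values

# Illustrative use: one exogenous "demand" factor drives two purchases.
scm = StructuralCausalModel(
    exogenous={"demand": lambda: random.random()},
    equations={
        "toothpaste": lambda vals: vals["demand"] > 0.5,
        "floss": lambda vals: vals["demand"] > 0.7,
    },
)
print(scm.sample())
```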
History
Aristotle defined a taxonomy of causality, including material, formal, efficient and final causes. Hume rejected Aristotle's taxonomy in favor of counterfactuals. At one point, he denied that objects have "powers" that make one a cause and another an effect.[5] Later he adopted "if the first object had not been, the second had never existed" ("but-for" causation).[5]
In the late 19th century, the discipline of statistics began to form. After a years-long effort to identify causal rules for domains such as biological inheritance, Galton introduced the concept of mean regression (epitomized by the sophomore slump in sports) which later led him to the non-causal concept of correlation.[5]
As a positivist, Pearson expunged the notion of causality from much of science as an unprovable special case of association and introduced the correlation coefficient as the metric of association. He wrote, "Force as a cause of motion is exactly the same as a tree god as a cause of growth" and that causation was only a "fetish among the inscrutable arcana of modern science". Pearson founded Biometrika and the Biometrics Lab at University College London, which became the world leader in statistics.[5]
In 1908 Hardy and Weinberg solved the problem of trait stability that had led Galton to abandon causality, by resurrecting Mendelian inheritance.[5]
In 1921 Wright's path analysis became the theoretical ancestor of causal modeling and causal graphs.[6] He developed this approach while attempting to untangle the relative impacts of heredity, development and environment on guinea pig coat patterns. He backed up his then-heretical claims by showing how such analyses could explain the relationship between guinea pig birth weight, in utero time and litter size. Opposition to these ideas by prominent statisticians led them to be ignored for the following 40 years (except among animal breeders). Instead scientists relied on correlations, partly at the behest of Wright's critic (and leading statistician), Fisher.[5] One exception was Burks, a student who in 1926 was the first to apply path diagrams to represent a mediating influence (mediator) and to assert that holding a mediator constant induces errors. She may have invented path diagrams independently.[5]
In 1923, Neyman introduced the concept of a potential outcome, but his paper was not translated from Polish to English until 1990.[5]
In 1958 Cox warned that controlling for a variable Z is valid only if it is highly unlikely to be affected by independent variables.[5]
In the 1960s, Duncan, Blalock, Goldberger and others rediscovered path analysis. While reading Blalock's work on path diagrams, Duncan remembered a lecture by Ogburn twenty years earlier that mentioned a paper by Wright that in turn mentioned Burks.[5]
Sociologists originally called causal models structural equation modeling, but once it became a rote method, it lost its utility, leading some practitioners to reject any relationship to causality. Economists adopted the algebraic part of path analysis, calling it simultaneous equation modeling. However, economists still avoided attributing causal meaning to their equations.[5]
Sixty years after publishing his first paper, Wright published a piece recapitulating it, prompted by a critique from Karlin et al., who objected that path analysis handled only linear relationships and that robust, model-free presentations of data were more revealing.[5]
In 1973 Lewis advocated replacing correlation with but-for causality (counterfactuals). He referred to humans' ability to envision alternative worlds in which a cause did or did not occur, and in which an effect appeared only after its cause.[5] In 1974 Rubin introduced the notion of "potential outcomes" as a language for asking causal questions.[5]
In 1983 Cartwright proposed that any factor that is "causally relevant" to an effect be conditioned on, moving beyond simple probability as the only guide.[5]
In 1986 Baron and Kenny introduced principles for detecting and evaluating mediation in a system of linear equations. As of 2014 their paper was the 33rd most-cited of all time.[5] Also in 1986, Greenland and Robins introduced the "exchangeability" approach to handling confounding by considering a counterfactual. They proposed assessing what would have happened to the treatment group if they had not received the treatment and comparing that outcome to that of the control group. If they matched, confounding was said to be absent.[5]
Columbia University operates the Causal Artificial Intelligence Lab which is attempting to connect causal modeling theory to artificial neural networks.[7]
Ladder of causation
Pearl's causal metamodel involves a three-level abstraction he calls the ladder of causation. The lowest level, Association (seeing/observing), entails the sensing of regularities or patterns in the input data, expressed as correlations. The middle level, Intervention (doing), predicts the effects of deliberate actions, expressed as causal relationships. The highest level, Counterfactuals (imagining), involves constructing a theory of (part of) the world that explains why specific actions have specific effects and what happens in the absence of such actions.[5]
Association
One object is associated with another if observing one changes the probability of observing the other. Example: shoppers who buy toothpaste are more likely to also buy dental floss. Mathematically:
- [math]\displaystyle{ P (floss | toothpaste) }[/math]
or the probability of (purchasing) floss given (the purchase of) toothpaste. Associations can also be measured via computing the correlation of the two events. Associations have no causal implications. One event could cause the other, the reverse could be true, or both events could be caused by some third event (an unhappy hygienist shames the shopper into treating their mouth better).[5]
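As a minimal sketch, this conditional probability can be estimated by simple counting over observational records; the transaction list below is invented for illustration.

```python
# Invented observational records: (bought_toothpaste, bought_floss).
transactions = [
    (True, True), (True, True), (True, False),
    (False, False), (False, True), (False, False),
]

toothpaste_rows = [row for row in transactions if row[0]]
p_floss_given_toothpaste = (
    sum(1 for _, floss in toothpaste_rows if floss) / len(toothpaste_rows)
)
print(p_floss_given_toothpaste)  # empirical P(floss | toothpaste) = 2/3
```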
Intervention
This level asserts specific causal relationships between events. Causality is assessed by experimentally performing some action that affects one of the events. Example: if we doubled the price of toothpaste, what would be the new probability of purchasing? Causality cannot be established by examining history (of price changes) because the price change may have been for some other reason that could itself affect the second event (a tariff that increases the price of both goods). Mathematically:
- [math]\displaystyle{ P (floss | do(toothpaste)) }[/math]
where do is an operator that signals the experimental intervention (doubling the price).[5] The operator indicates performing the minimal change in the world necessary to create the intended effect, a "mini-surgery" on the model with as little change from reality as possible.[8]
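A rough sketch of this "mini-surgery" in code: intervening means setting a variable by fiat instead of letting its usual causes decide, so conditioning and intervening give different answers when a hidden common cause (here, the tariff mentioned above) is at work. All probabilities and structural equations below are invented for illustration.

```python
import random

def run(n=100_000, do_toothpaste=None):
    """Simulate shoppers. Passing do_toothpaste forces the purchase by
    fiat, severing it from its normal cause (the tariff)."""
    rows = []
    for _ in range(n):
        tariff = random.random() < 0.5                   # hidden common cause
        toothpaste = (do_toothpaste if do_toothpaste is not None
                      else random.random() < (0.3 if tariff else 0.8))
        p_floss = 0.2 + 0.4 * toothpaste - 0.1 * tariff  # structural equation
        rows.append((toothpaste, random.random() < p_floss))
    return rows

obs = run()
seen = [floss for tp, floss in obs if tp]
print(sum(seen) / len(seen))                 # P(floss | toothpaste): seeing

done = run(do_toothpaste=True)
print(sum(f for _, f in done) / len(done))   # P(floss | do(toothpaste)): doing
```

Conditioning merely filters the observed runs, while do() changes the data-generating process itself, which is why the two printed numbers differ.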
Counterfactuals
The highest level, counterfactual, involves consideration of an alternate version of a past event, or what would happen under different circumstances for the same experimental unit. For example, what is the probability that, if a store had doubled the price of floss, the toothpaste-purchasing shopper would still have bought it?
- [math]\displaystyle{ P (floss | toothpaste, price*2) }[/math]
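Pearl evaluates such queries in three steps: abduction (update beliefs about the exogenous variables from the observation), action (intervene), and prediction. A toy sketch of that recipe, with an invented prior over shopper budgets and a deterministic purchase rule assumed purely for illustration:

```python
# Toy structural equation (illustrative): a shopper buys floss when
# their budget is at least the price.
def buys_floss(budget, price):
    return budget >= price

# Factual observation: the price was 2 and the shopper bought floss.
price = 2
observed_purchase = True

# Step 1 (abduction): keep only the exogenous budgets consistent with
# the observation, starting from a uniform prior.
budgets = [1, 2, 3, 4]  # invented prior
consistent = [b for b in budgets if buys_floss(b, price) == observed_purchase]

# Step 2 (action): intervene, doubling the price.
new_price = price * 2

# Step 3 (prediction): evaluate the counterfactual under the updated
# exogenous distribution.
p_still_buys = sum(buys_floss(b, new_price) for b in consistent) / len(consistent)
print(p_still_buys)  # 1/3: only the budget-4 shopper still buys
```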
Counterfactuals can indicate the existence of a causal relationship. Models that can answer counterfactuals allow precise interventions whose consequences can be predicted. At the extreme, such models are accepted as physical laws (as in the laws of physics, e.g., inertia, which says that if force is not applied to a stationary object, it will not move).[5]
Types of causes

For x to be a necessary cause of y, the presence of y must imply the prior occurrence of x. The presence of x, however, does not imply that y will occur. Necessary causes are also known as "but-for" causes, as in y would not have occurred but for the occurrence of x.

Causal diagram

A causal diagram is a directed graph that displays causal relationships between variables in a causal model. A causal diagram includes a set of variables (or nodes). Each node is connected by an arrow to one or more other nodes upon which it has a causal influence. An arrowhead delineates the direction of causality, e.g., an arrow connecting variables A and B with the arrowhead at B indicates that a change in A causes a change in B (with an associated probability). A path is a traversal of the graph between two nodes following causal arrows.
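A minimal sketch of a causal diagram as a plain adjacency mapping, with a helper that enumerates directed paths between two nodes; the tariff/price/purchase graph is invented for illustration.

```python
# A causal diagram as an adjacency mapping: node -> nodes it points to.
graph = {
    "tariff": ["toothpaste_price", "floss_price"],
    "toothpaste_price": ["purchase"],
    "floss_price": ["purchase"],
    "purchase": [],
}

def directed_paths(graph, start, goal, path=None):
    """Yield every path from start to goal that follows causal arrows."""
    path = (path or []) + [start]
    if start == goal:
        yield path
        return
    for child in graph[start]:
        yield from directed_paths(graph, child, goal, path)

for p in directed_paths(graph, "tariff", "purchase"):
    print(" -> ".join(p))
```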
Back-door adjustment

The following converts a do expression into a do-free expression by conditioning on the variables along the back-door path:

- [math]\displaystyle{ P(Y|do(X)) = \textstyle \sum_{z} \displaystyle P(Y|X, Z=z) P(Z=z) }[/math]
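A sketch of this adjustment computed from an invented joint distribution over binary X, Y and a confounder Z; the probability table is arbitrary (it just sums to 1) and is assumed purely for illustration.

```python
# Invented joint distribution P(X, Y, Z) over binary variables,
# indexed as (x, y, z); the eight probabilities sum to 1.
joint = {
    (0, 0, 0): 0.20, (0, 1, 0): 0.05, (1, 0, 0): 0.05, (1, 1, 0): 0.10,
    (0, 0, 1): 0.05, (0, 1, 1): 0.10, (1, 0, 1): 0.10, (1, 1, 1): 0.35,
}

def p(predicate):
    """Total probability of the event defined by predicate(x, y, z)."""
    return sum(pr for (x, y, z), pr in joint.items() if predicate(x, y, z))

def p_y_do_x(x_val, y_val=1):
    """Back-door adjustment: sum over z of P(y | x, z) * P(z)."""
    total = 0.0
    for z_val in (0, 1):
        p_z = p(lambda x, y, z: z == z_val)
        p_xz = p(lambda x, y, z: x == x_val and z == z_val)
        p_yxz = p(lambda x, y, z: x == x_val and y == y_val and z == z_val)
        total += (p_yxz / p_xz) * p_z
    return total

print(p_y_do_x(1))  # P(Y=1 | do(X=1)) via adjustment over Z
```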
Instrumental variables

Because genes vary randomly across populations, presence of a gene typically qualifies as an instrumental variable, implying that in many cases, causality can be quantified using regression on an observational study.
This page was moved from wikipedia:en:Causal model. Its edit history can be viewed at 结构因果模型/edithistory
- ↑ Friston, Karl (2009). "Causal Modelling and Brain Connectivity in Functional Magnetic Resonance Imaging". PLOS Biology. 7 (2): e1000033. doi:10.1371/journal.pbio.1000033. PMC 2642881. PMID 19226186.
- ↑ Barlas, Yaman; Carpenter, Stanley (1990). "Philosophical roots of model validation: Two paradigms". System Dynamics Review (in English). 6 (2): 148–166. doi:10.1002/sdr.4260060203.
- ↑ Pearl 2009
- ↑ Hitchcock, Christopher (2018), "Causal Models", in Zalta, Edward N. (ed.), The Stanford Encyclopedia of Philosophy (Fall 2018 ed.), Metaphysics Research Lab, Stanford University, retrieved 2018-09-08
- ↑ Pearl, Judea; Mackenzie, Dana (2018). The Book of Why: The New Science of Cause and Effect. Basic Books. ISBN 9780465097616.
- ↑ Okasha, Samir (2012). "Causation in Biology". In Beebee, Helen (ed.). The Oxford Handbook of Causation. OUP Oxford. doi:10.1093/oxfordhb/9780199279739.001.0001. ISBN 9780191629464. http://www.oxfordhandbooks.com/view/10.1093/oxfordhb/9780199279739.001.0001/oxfordhb-9780199279739-e-0036.
- ↑ Bergstein, Brian. "What AI still can't do". MIT Technology Review (in English). Retrieved 2020-02-20.
- ↑ Pearl, Judea (29 Oct 2019). "Causal and Counterfactual Inference" (PDF). Retrieved 14 December 2020.